Term Selection for Query Expansion in Medical Cross-lingual Information Retrieval
published by Springer
in 2019
in Informatics Engineering
English
Abstract in English
We present a method for automatic query expansion in cross-lingual information retrieval in the medical domain. The method uses machine translation to translate source-language queries into the document language, and linear regression to predict the retrieval performance of each translated query when expanded with a candidate term.
Candidate terms (in the document language) come from multiple sources: query translation hypotheses obtained from the machine translation system, Wikipedia articles, and PubMed abstracts. Query expansion is applied only when the model predicts a score for a candidate term that exceeds a tuned threshold, so that queries are expanded with strongly related terms only.
Our experiments are conducted using the CLEF eHealth 2013–2015 test collection and show significant improvements in both cross-lingual and monolingual settings.
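
The thresholded selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the feature vectors, the weights, the bias, and the threshold value are all hypothetical, and the linear model is assumed to have been fitted beforehand by regression on retrieval-performance targets.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Candidate:
    """A candidate expansion term with a hypothetical feature vector."""
    term: str
    features: Sequence[float]


def predict_score(weights: Sequence[float], bias: float,
                  features: Sequence[float]) -> float:
    """Linear model: predicted retrieval score if the term is added."""
    return bias + sum(w * x for w, x in zip(weights, features))


def expand_query(query: List[str], candidates: List[Candidate],
                 weights: Sequence[float], bias: float,
                 threshold: float) -> List[str]:
    """Append only candidates whose predicted score exceeds the threshold."""
    selected = [c.term for c in candidates
                if predict_score(weights, bias, c.features) > threshold]
    return query + selected


# Illustrative values only (assumptions, not taken from the paper).
weights = [0.5, 0.25]
bias = 0.0
candidates = [
    Candidate("infarction", [1.0, 1.0]),  # predicted score 0.75
    Candidate("weather", [0.0, 0.5]),     # predicted score 0.125
]
expanded = expand_query(["heart", "attack"], candidates,
                        weights, bias, threshold=0.5)
```

With these illustrative numbers, only the first candidate passes the threshold; a query for which no candidate passes is left unexpanded.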
