TY - GEN
T1 - Identifying novel information using latent semantic analysis in the WiQA task at CLEF 2006
AU - Sutcliffe, Richard F.E.
AU - Steinberger, Josef
AU - Kruschwitz, Udo
AU - Alexandrov-Kabadjov, Mijail
AU - Poesio, Massimo
PY - 2007
Y1 - 2007
N2 - In our two-stage system for the English monolingual WiQA Task, snippets were first retrieved if they contained an exact match with the title. Candidates were then passed to the Latent Semantic Analysis component which judged them Novel if their match with the article text was less than a threshold. In Run 1, the ten best snippets were returned and in Run 2 the twenty best. Run 1 was superior, with Average Yield per Topic 2.46 and Precision 0.37. Compared to other groups, our performance was in the middle of the range except for Precision where our system was the best. We attribute this to our use of exact title matches in the IR stage. In future work we will vary the approach used depending on the topic type, exploit co-references in conjunction with exact matches and make use of the elaborate hyperlink structure which is a unique and most interesting aspect of the Wikipedia.
AB - In our two-stage system for the English monolingual WiQA Task, snippets were first retrieved if they contained an exact match with the title. Candidates were then passed to the Latent Semantic Analysis component which judged them Novel if their match with the article text was less than a threshold. In Run 1, the ten best snippets were returned and in Run 2 the twenty best. Run 1 was superior, with Average Yield per Topic 2.46 and Precision 0.37. Compared to other groups, our performance was in the middle of the range except for Precision where our system was the best. We attribute this to our use of exact title matches in the IR stage. In future work we will vary the approach used depending on the topic type, exploit co-references in conjunction with exact matches and make use of the elaborate hyperlink structure which is a unique and most interesting aspect of the Wikipedia.
UR - http://www.scopus.com/inward/record.url?scp=38049185091&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-74999-8_66
DO - 10.1007/978-3-540-74999-8_66
M3 - Conference contribution
AN - SCOPUS:38049185091
SN - 9783540749981
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 541
EP - 549
BT - Evaluation of Multilingual and Multi-modal Information Retrieval - 7th Workshop of the Cross-Language Evaluation Forum, CLEF 2006, Revised Selected Papers
PB - Springer Verlag
T2 - 7th Workshop of the Cross-Language Evaluation Forum, CLEF 2006
Y2 - 20 September 2006 through 22 September 2006
ER -