Identifying novel information using latent semantic analysis in the WiQA task at CLEF 2006

Richard F.E. Sutcliffe, Josef Steinberger, Udo Kruschwitz, Mijail Alexandrov-Kabadjov, Massimo Poesio

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

In our two-stage system for the English monolingual WiQA Task, snippets were first retrieved if they contained an exact match with the title. Candidates were then passed to the Latent Semantic Analysis component, which judged them Novel if their match with the article text was less than a threshold. In Run 1, the ten best snippets were returned, and in Run 2 the twenty best. Run 1 was superior, with Average Yield per Topic 2.46 and Precision 0.37. Compared to other groups, our performance was in the middle of the range, except for Precision, where our system was the best. We attribute this to our use of exact title matches in the IR stage. In future work we will vary the approach used depending on the topic type, exploit co-references in conjunction with exact matches, and make use of the elaborate hyperlink structure, which is a unique and most interesting aspect of the Wikipedia.
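
The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the whitespace tokenization, the raw term counts, the rank k, the threshold value of 0.8, the case-insensitive substring test for an "exact match", and the reading of "best" as "least similar to the article" are all assumptions made for illustration only.

```python
import numpy as np


def build_lsa(docs, k=2):
    """Build a term-document count matrix and return (vocabulary index, U_k).

    U_k holds the top-k left singular vectors and is used to project
    term-count vectors into the latent space.
    """
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1.0
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))  # never ask for more dimensions than exist
    return index, U[:, :k]


def embed(text, index, Uk):
    """Project a text into the latent space; unknown words are ignored."""
    v = np.zeros(len(index))
    for w in text.lower().split():
        if w in index:
            v[index[w]] += 1.0
    return Uk.T @ v


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is zero."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def novel_snippets(title, article, snippets, threshold=0.8, top_n=10):
    """Hypothetical two-stage filter: title match, then LSA novelty test."""
    # Stage 1 (IR): keep only snippets that contain the title verbatim
    # (assumed here to mean a case-insensitive substring match).
    candidates = [s for s in snippets if title.lower() in s.lower()]
    if not candidates:
        return []

    # Stage 2 (LSA): a candidate is judged novel when its similarity to the
    # article text falls below the threshold.
    index, Uk = build_lsa([article] + candidates)
    art_vec = embed(article, index, Uk)
    scored = [(cosine(embed(s, index, Uk), art_vec), s) for s in candidates]
    novel = sorted((p for p in scored if p[0] < threshold), key=lambda p: p[0])

    # Assumption: "best" means least similar to the article.
    return [s for _, s in novel[:top_n]]
```

Under these assumptions, calling `novel_snippets` with `top_n=10` corresponds to the Run 1 setting and `top_n=20` to the Run 2 setting. A snippet that merely repeats the article text scores a similarity near 1 and is rejected, while a snippet lacking the title never reaches the LSA stage.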

Original language: English
Title of host publication: Evaluation of Multilingual and Multi-modal Information Retrieval - 7th Workshop of the Cross-Language Evaluation Forum, CLEF 2006, Revised Selected Papers
Publisher: Springer Verlag
Pages: 541-549
Number of pages: 9
ISBN (Print): 9783540749981
Publication status: Published - 2007
Event: 7th Workshop of the Cross-Language Evaluation Forum, CLEF 2006 - Alicante, Spain
Duration: 20 Sep 2006 to 22 Sep 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4730 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 7th Workshop of the Cross-Language Evaluation Forum, CLEF 2006
Country/Territory: Spain
City: Alicante
Period: 20/09/06 to 22/09/06
