TY - GEN
T1 - Overview of the CLEF 2006 multilingual question answering track
AU - Magnini, Bernardo
AU - Giampiccolo, Danilo
AU - Forner, Pamela
AU - Ayache, Christelle
AU - Jijkoun, Valentin
AU - Osenova, Petya
AU - Peñas, Anselmo
AU - Rocha, Paulo
AU - Sacaleanu, Bogdan
AU - Sutcliffe, Richard
PY - 2007
Y1 - 2007
N2 - Proposed for the fourth time, the QA at CLEF track has confirmed a still rising interest from the research community, recording a constant increase in both the number of participants and the number of submissions. In 2006, two pilot tasks, WiQA and AVE, were proposed alongside the main task, representing two promising experiments for the future of QA. Some significant innovations were also introduced in the main task, namely list questions and the requirement to provide text snippet(s) supporting the exact answers. Although this increased the workload of the organizers, both in preparing the question sets and especially in evaluating the submitted runs, it had no significant influence on the performance of the systems, which registered a higher Best accuracy than in the previous campaign, in both monolingual and bilingual tasks. In this paper the preparation of the test set and the evaluation process are described, together with a detailed presentation of the results for each of the languages. The pilot tasks WiQA and AVE are presented in dedicated articles.
AB - Proposed for the fourth time, the QA at CLEF track has confirmed a still rising interest from the research community, recording a constant increase in both the number of participants and the number of submissions. In 2006, two pilot tasks, WiQA and AVE, were proposed alongside the main task, representing two promising experiments for the future of QA. Some significant innovations were also introduced in the main task, namely list questions and the requirement to provide text snippet(s) supporting the exact answers. Although this increased the workload of the organizers, both in preparing the question sets and especially in evaluating the submitted runs, it had no significant influence on the performance of the systems, which registered a higher Best accuracy than in the previous campaign, in both monolingual and bilingual tasks. In this paper the preparation of the test set and the evaluation process are described, together with a detailed presentation of the results for each of the languages. The pilot tasks WiQA and AVE are presented in dedicated articles.
UR - http://www.scopus.com/inward/record.url?scp=38049129785&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-74999-8_31
DO - 10.1007/978-3-540-74999-8_31
M3 - Conference contribution
AN - SCOPUS:38049129785
SN - 9783540749981
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 223
EP - 256
BT - Evaluation of Multilingual and Multi-modal Information Retrieval - 7th Workshop of the Cross-Language Evaluation Forum, CLEF 2006, Revised Selected Papers
PB - Springer Verlag
T2 - 7th Workshop of the Cross-Language Evaluation Forum, CLEF 2006
Y2 - 20 September 2006 through 22 September 2006
ER -