TY - JOUR
T1 - Overview of the CLEF 2006 multilingual question answering track
AU - Magnini, Bernardo
AU - Giampiccolo, Danilo
AU - Forner, Pamela
AU - Ayache, Christelle
AU - Jijkoun, Valentin
AU - Osenova, Petya
AU - Peñas, Anselmo
AU - Rocha, Paulo
AU - Sacaleanu, Bogdan
AU - Sutcliffe, Richard
PY - 2006
Y1 - 2006
N2 - Having been proposed for the fourth time, the QA at CLEF track has confirmed still growing interest from the research community, recording a constant increase in both the number of participants and the number of submissions. In 2006, two pilot tasks, WiQA and AVE, were proposed alongside the main tasks, representing two promising experiments for the future of QA. Significant innovations were also introduced in the main task, namely list questions and the requirement that exact answers be supported by text snippet(s). Although this increased the workload of the organizers, both in preparing the question sets and especially in evaluating the submitted runs, it had no significant influence on the performance of the systems, which registered a higher best accuracy than in the previous campaign in both monolingual and bilingual tasks. In this paper the preparation of the test set and the evaluation process are described, together with a detailed presentation of the results for each of the languages. The pilot tasks WiQA and AVE are presented in dedicated articles.
UR - http://www.scopus.com/inward/record.url?scp=84922022265&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:84922022265
SN - 1613-0073
VL - 1172
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
T2 - 2006 Cross Language Evaluation Forum Workshop, CLEF 2006, co-located with the 10th European Conference on Digital Libraries, ECDL 2006
Y2 - 20 September 2006 through 22 September 2006
ER -