TY - JOUR
T1 - Overview of QA4MRE at CLEF 2011
T2 - 2011 Cross Language Evaluation Forum Conference, CLEF 2011
AU - Peñas, Anselmo
AU - Hovy, Eduard
AU - Forner, Pamela
AU - Rodrigo, Álvaro
AU - Sutcliffe, Richard
AU - Forascu, Corina
AU - Sporleder, Caroline
PY - 2011
Y1 - 2011
AB - This paper describes the first steps towards developing a methodology for testing and evaluating the performance of Machine Reading systems through Question Answering and Reading Comprehension Tests. This was the aim of the QA4MRE challenge, which was run as a Lab at CLEF 2011. This year a major innovation was introduced: the traditional QA task was replaced by a new Machine Reading task that asked questions requiring a deep knowledge of individual short texts. Systems were required to choose one answer by analysing the corresponding test document in conjunction with the background collections provided by the organizers. Besides the main task, one pilot task was also offered, namely Processing Modality and Negation for Machine Reading. This task aimed to evaluate whether systems were able to understand extra-propositional aspects of meaning such as modality and negation. This paper describes the preparation of the data sets, the creation of the background collections that allow systems to acquire the required knowledge, the metric used to evaluate the systems' submissions, and the results of this first attempt. Twelve groups participated in the task, submitting a total of 62 runs in three languages: English, German and Romanian.
UR - http://www.scopus.com/inward/record.url?scp=84922032477&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:84922032477
SN - 1613-0073
VL - 1177
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
Y2 - 19 September 2011 through 22 September 2011
ER -