TY - JOUR
T1 - DriVQA
T2 - A gaze-based dataset for visual question answering in driving scenarios
AU - Rekanar, Kaavya
AU - Joyce, John M.
AU - Hayes, Martin
AU - Eising, Ciarán
N1 - Publisher Copyright:
© 2025 The Author(s)
PY - 2025/4
Y1 - 2025/4
N2 - This paper presents DriVQA, a novel dataset that combines gaze plots and heatmaps with visual question-answering (VQA) data from participants who were presented with driving scenarios. VQA has been proposed as part of a solution for trustworthiness and interpretability in autonomous-vehicle decision-making. Collected using the Tobii Pro X3-120 eye-tracking device, the DriVQA dataset provides a comprehensive mapping of where participants direct their gaze when presented with images of driving scenes, together with related questions and answers from every participant. For each scenario, the dataset contains images of driving situations, associated questions, participant answers, gaze plots, and heatmaps. It is being used to study the subjectivity inherent in VQA. Its detailed gaze-tracking data offers a unique perspective on how individuals perceive and interpret visual scenes, making it an essential resource for training VQA models that rely on human-like attention. The dataset is a valuable tool for investigating human cognition and behaviour in dynamic, real-world scenarios. DriVQA is highly relevant for VQA models, as it allows systems trained on it to learn from human-like attention behaviour when making decisions based on visual input. The dataset has the potential to drive advancements in VQA research and development by improving the safety and intelligence of driving systems through enhanced visual understanding and interaction. DriVQA has significant potential for reuse in various research areas, including the development of advanced VQA models, attention analysis, and human-computer interaction studies. Its comprehensive gaze plots and heatmaps can also be leveraged to improve applications in autonomous driving, driver assistance systems, and cognitive science research, making it a versatile resource for both academic and industrial purposes.
AB - This paper presents DriVQA, a novel dataset that combines gaze plots and heatmaps with visual question-answering (VQA) data from participants who were presented with driving scenarios. VQA has been proposed as part of a solution for trustworthiness and interpretability in autonomous-vehicle decision-making. Collected using the Tobii Pro X3-120 eye-tracking device, the DriVQA dataset provides a comprehensive mapping of where participants direct their gaze when presented with images of driving scenes, together with related questions and answers from every participant. For each scenario, the dataset contains images of driving situations, associated questions, participant answers, gaze plots, and heatmaps. It is being used to study the subjectivity inherent in VQA. Its detailed gaze-tracking data offers a unique perspective on how individuals perceive and interpret visual scenes, making it an essential resource for training VQA models that rely on human-like attention. The dataset is a valuable tool for investigating human cognition and behaviour in dynamic, real-world scenarios. DriVQA is highly relevant for VQA models, as it allows systems trained on it to learn from human-like attention behaviour when making decisions based on visual input. The dataset has the potential to drive advancements in VQA research and development by improving the safety and intelligence of driving systems through enhanced visual understanding and interaction. DriVQA has significant potential for reuse in various research areas, including the development of advanced VQA models, attention analysis, and human-computer interaction studies. Its comprehensive gaze plots and heatmaps can also be leveraged to improve applications in autonomous driving, driver assistance systems, and cognitive science research, making it a versatile resource for both academic and industrial purposes.
KW - Attention mapping
KW - Autonomous driving
KW - Computer vision
KW - Eye-tracking
KW - Human attention patterns
KW - Object tracking
KW - Scene analysis
UR - http://www.scopus.com/inward/record.url?scp=85217282340&partnerID=8YFLogxK
U2 - 10.1016/j.dib.2025.111367
DO - 10.1016/j.dib.2025.111367
M3 - Article
AN - SCOPUS:85217282340
SN - 2352-3409
VL - 59
JO - Data in Brief
JF - Data in Brief
M1 - 111367
ER -