TY - JOUR
T1 - FIN
T2 - Fast Inference Network for Map Segmentation
AU - Bispo, Ruan
AU - Brophy, Tim
AU - Mohandas, Reenu
AU - Scanlan, Anthony
AU - Eising, Ciarán
N1 - Publisher Copyright:
© 2026 IEEE.
PY - 2026
Y1 - 2026
N2 - Bird's-Eye View (BEV) maps provide a top-down semantic representation of the driving environment, which is critical for downstream tasks such as planning and navigation in autonomous driving systems. However, generating accurate BEV maps from on-board sensors remains challenging due to perspective distortion, occlusions, and the high computational cost of dense 3D reasoning, particularly under real-time constraints. Existing approaches often trade inference speed for accuracy or rely heavily on camera-based perception, limiting deployability. This work therefore presents a novel and efficient camera-radar map segmentation architecture in the BEV space that combines rich semantic information from cameras with accurate distance measurements from radar, without incurring excessive financial cost or overwhelming data-processing requirements. The model is designed for real-time map segmentation, with attention to high accuracy, per-class balance, and inference time. To this end, we pair an advanced set of loss functions with a new lightweight head to improve perception results. With these modifications, our approach achieves results comparable to large models, reaching 53.5 mIoU, while setting a new benchmark for inference time with a 260% improvement over the strongest baseline models.
AB - Bird's-Eye View (BEV) maps provide a top-down semantic representation of the driving environment, which is critical for downstream tasks such as planning and navigation in autonomous driving systems. However, generating accurate BEV maps from on-board sensors remains challenging due to perspective distortion, occlusions, and the high computational cost of dense 3D reasoning, particularly under real-time constraints. Existing approaches often trade inference speed for accuracy or rely heavily on camera-based perception, limiting deployability. This work therefore presents a novel and efficient camera-radar map segmentation architecture in the BEV space that combines rich semantic information from cameras with accurate distance measurements from radar, without incurring excessive financial cost or overwhelming data-processing requirements. The model is designed for real-time map segmentation, with attention to high accuracy, per-class balance, and inference time. To this end, we pair an advanced set of loss functions with a new lightweight head to improve perception results. With these modifications, our approach achieves results comparable to large models, reaching 53.5 mIoU, while setting a new benchmark for inference time with a 260% improvement over the strongest baseline models.
KW - BEV
KW - camera-radar
KW - map segmentation
KW - nuScenes
KW - real-time
KW - sensor fusion
UR - https://www.scopus.com/pages/publications/105027945519
U2 - 10.1109/OJVT.2026.3652797
DO - 10.1109/OJVT.2026.3652797
M3 - Article
AN - SCOPUS:105027945519
SN - 2644-1330
JO - IEEE Open Journal of Vehicular Technology
JF - IEEE Open Journal of Vehicular Technology
ER -