Abstract
Camera-radar fusion offers a robust, low-cost alternative to camera-lidar fusion for real-time 3D object detection under adverse weather and lighting conditions. However, few works in the literature focus on this modality and, most importantly, few develop new architectures that exploit the advantages of the radar point cloud, such as accurate distance estimation and speed information. This work therefore presents a novel and efficient 3D object detection algorithm that uses cameras and radars in the bird's-eye view (BEV). Our algorithm exploits the advantages of radar before fusing the features into a detection head. A new backbone maps the radar pillar features into an embedded dimension, and a self-attention mechanism allows the backbone to model the dependencies between the radar points. Mainly to reduce inference time, we replace the FPN-based convolutional layers used in PointPillars-based architectures with a simplified convolutional layer. Experimental results show that our approach achieves strong performance on the nuScenes dataset, reaching an NDS of 58.2 with a ResNet-50 backbone, while maintaining real-time inference.
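The radar branch described in the abstract (pillar features mapped into an embedding, followed by self-attention across radar points) can be illustrated with a minimal sketch. This is not the paper's implementation: the single attention head, the random weights standing in for learned parameters, the feature sizes, and every function name (`pillar_self_attention`, `random_matrix`, and so on) are assumptions chosen only to show the data flow in plain Python.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for learned parameters (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def pillar_self_attention(pillars, w_embed, w_q, w_k, w_v):
    """Embed pillar features, then apply single-head scaled dot-product attention."""
    x = matmul(pillars, w_embed)  # (N, d_embed): pillar features in the embedding
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)  # (N, N): pairwise similarity between pillars
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v), weights  # attended features and attention map


# Hypothetical sizes: 5 non-empty radar pillars, 4 input features, 8-dim embedding.
rng = random.Random(0)
n_pillars, d_in, d_embed = 5, 4, 8
pillars = random_matrix(n_pillars, d_in, rng)
w_embed = random_matrix(d_in, d_embed, rng)
w_q, w_k, w_v = (random_matrix(d_embed, d_embed, rng) for _ in range(3))

out, attn = pillar_self_attention(pillars, w_embed, w_q, w_k, w_v)
print(len(out), len(out[0]))
```

Each row of `attn` is a probability distribution over all pillars, which is how one pillar's output can depend on every other radar point in the scene. A real model would use a deep-learning framework, learned weights, and likely multiple heads.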
| Field | Value |
|---|---|
| Original language | English |
| Journal | IEEE Open Journal of Vehicular Technology |
| DOIs | |
| Publication status | Accepted/In press - 2026 |
Keywords
- 3D object detection
- camera-radar
- nuScenes
- perception
- sensor fusion
Fingerprint
Dive into the research topics of 'PAN: Pillars-Attention-Based Network for 3D Object Detection'. Together they form a unique fingerprint.