TY - GEN
T1 - Reflective Teacher
T2 - 2025 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025
AU - Hazra, Saheli
AU - Das, Sudip
AU - Choudhary, Rohit
AU - Das, Arindam
AU - Sistu, Ganesh
AU - Eising, Ciarán
AU - Bhattacharya, Ujjwal
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Applying pseudo labeling techniques has been found to be advantageous in semi-supervised 3D object detection (SSOD) in Bird's-Eye-View (BEV) for autonomous driving, particularly where labeled data is limited. In the literature, Exponential Moving Average (EMA) has been used for adjustments of the weights of teacher network by the student network. However, the same induces catastrophic forgetting in the teacher network. In this work, we address this issue by introducing a novel concept of Reflective Teacher where the student is trained by both labeled and pseudo labeled data while its knowledge is progressively passed to the teacher through a regularizer to ensure retention of previous knowledge. Additionally, we propose Geometry Aware BEV Fusion (GA-BEVFusion) for efficient alignment of multi-modal BEV features, thus reducing the disparity between the modalities, camera and LiDAR. This helps to map the precise geometric information embedded among LiDAR points reliably with the spatial priors for extraction of semantic information from camera images. Our experiments on the nuScenes and Waymo datasets demonstrate: 1) improved performance over state-of-the-art methods in both fully supervised and semi-supervised settings; 2) Reflective Teacher achieves equivalent performance with only 25% and 22% of labeled data for nuScenes and Waymo datasets respectively, in contrast to other fully supervised methods that utilize the full labeled dataset.
AB - Applying pseudo labeling techniques has been found to be advantageous in semi-supervised 3D object detection (SSOD) in Bird's-Eye-View (BEV) for autonomous driving, particularly where labeled data is limited. In the literature, Exponential Moving Average (EMA) has been used for adjustments of the weights of teacher network by the student network. However, the same induces catastrophic forgetting in the teacher network. In this work, we address this issue by introducing a novel concept of Reflective Teacher where the student is trained by both labeled and pseudo labeled data while its knowledge is progressively passed to the teacher through a regularizer to ensure retention of previous knowledge. Additionally, we propose Geometry Aware BEV Fusion (GA-BEVFusion) for efficient alignment of multi-modal BEV features, thus reducing the disparity between the modalities, camera and LiDAR. This helps to map the precise geometric information embedded among LiDAR points reliably with the spatial priors for extraction of semantic information from camera images. Our experiments on the nuScenes and Waymo datasets demonstrate: 1) improved performance over state-of-the-art methods in both fully supervised and semi-supervised settings; 2) Reflective Teacher achieves equivalent performance with only 25% and 22% of labeled data for nuScenes and Waymo datasets respectively, in contrast to other fully supervised methods that utilize the full labeled dataset.
KW - birds-eye-view
KW - lidar
KW - multimodal learning
KW - semi supervised learning
UR - https://www.scopus.com/pages/publications/105003623528
U2 - 10.1109/WACV61041.2025.00168
DO - 10.1109/WACV61041.2025.00168
M3 - Conference contribution
AN - SCOPUS:105003623528
T3 - Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025
SP - 1649
EP - 1659
BT - Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 28 February 2025 through 4 March 2025
ER -