Abstract
In real-world scenarios, many objects in view are often partially occluded, making the ability to handle occlusion essential for everyday activities. While human vision exhibits robustness to extreme occlusion, Convolutional Neural Networks struggle in this regard. Current regional masking strategies effectively improve generalization to occlusion. However, these methods typically eliminate informative pixels in training images by overlaying a patch of either random pixels or Gaussian noise, leading to a loss of image features. This limitation can be addressed by employing a more informative label. Rather than augmenting occluded images during training and assigning identical labels to both clean and occluded images, we differentiate between their labels. Essentially, we treat occlusion as a virtual class and assign it a virtual label. During training with occluded images, we merge the ground truth labels with these virtual labels, thereby informing the model about the presence of occlusion. Our findings indicate a 49.26% improvement in generalization for occlusion scenarios and an 8.22% enhancement on the common corruptions benchmark.
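The label-merging idea described above can be sketched as a soft target over `C + 1` outputs, where the extra index is the virtual occlusion class. The mixing weight `alpha` and the helper name below are illustrative assumptions, not the paper's exact scheme:

```python
import numpy as np

def virtual_occlusion_target(y, num_classes, occluded, alpha=0.3):
    """Build a soft target over num_classes + 1 outputs, where the
    last index is a virtual 'occlusion' class.

    Clean images keep a one-hot ground-truth label; occluded images
    split probability mass between the true class and the virtual
    class (the mixing weight alpha is an illustrative assumption).
    """
    target = np.zeros(num_classes + 1)
    if occluded:
        target[y] = 1.0 - alpha          # ground-truth class
        target[num_classes] = alpha      # virtual occlusion class
    else:
        target[y] = 1.0                  # standard one-hot label
    return target

# Clean image of class 2 in a 5-way problem: pure one-hot.
print(virtual_occlusion_target(2, 5, occluded=False))
# Occluded image of class 2: mass shared with the virtual class.
print(virtual_occlusion_target(2, 5, occluded=True))
```

Such soft targets can then be used with a standard cross-entropy loss over the enlarged output layer, so the model is explicitly told when occlusion is present.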
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 103-109 |
| Number of pages | 7 |
| Journal | IET Conference Proceedings |
| Volume | 2024 |
| Issue number | 10 |
| DOIs | |
| Publication status | Published - 2024 |
| Event | 26th Irish Machine Vision and Image Processing Conference, IMVIP 2024, Limerick, Ireland, 21 Aug 2024 → 23 Aug 2024 |
Keywords
- Augmentations
- Deep Neural Networks
- Generalization
- Occlusion