TY - JOUR
T1 - Pretraining instance segmentation models with bounding box annotations
AU - Agnew, Cathaoir
AU - Grua, Eoin M.
AU - Van de Ven, Pepijn
AU - Denny, Patrick
AU - Eising, Ciarán
AU - Scanlan, Anthony
N1 - Publisher Copyright: © 2024
PY - 2024/12
Y1 - 2024/12
AB - Annotating datasets for fully supervised instance segmentation can be arduous and time-consuming, requiring significant effort and cost. Producing bounding box annotations instead substantially reduces this investment, but bounding-box-annotated data alone are not suitable for instance segmentation. This work uses ground truth bounding boxes to define coarsely annotated polygon masks, which we refer to as weak annotations, on which the models are pretrained. We investigate the effect of pretraining on data with weak annotations and then fine-tuning on data with strong annotations, that is, finely annotated polygon masks for instance segmentation. The COCO 2017 detection dataset and three model architectures, SOLOv2, Mask R-CNN, and Mask2Former, were used to conduct experiments investigating the effect of pretraining on weak annotations. The Cityscapes and Pascal VOC 2012 datasets were used to validate this approach. The empirical results suggest two key outcomes. Firstly, a sequential approach to annotating large-scale instance segmentation datasets would be beneficial, enabling higher-performance models in shorter timeframes. This is accomplished by first labeling bounding boxes on the data and then labeling polygon masks. Secondly, it is possible to leverage object detection datasets for pretraining instance segmentation models while maintaining competitive results in the downstream task. For the best-performing model, Mask2Former with a Swin-L backbone, 97.5%, 100.4%, and 101.3% of the fully supervised performance is achieved using just 1%, 5%, and 10%, respectively, of the instance segmentation annotations of the COCO training dataset.
KW - Computer vision
KW - Instance segmentation
KW - Object detection
KW - Supervised learning
KW - Weak annotations
UR - http://www.scopus.com/inward/record.url?scp=85207934829&partnerID=8YFLogxK
U2 - 10.1016/j.iswa.2024.200454
DO - 10.1016/j.iswa.2024.200454
M3 - Article
AN - SCOPUS:85207934829
SN - 2667-3053
VL - 24
JO - Intelligent Systems with Applications
JF - Intelligent Systems with Applications
M1 - 200454
ER -