TY - JOUR
T1 - TokenCutSeg
T2 - Label-Efficient Skin Lesion Segmentation by Powering Supervised Refinement With Self-Learned Visual Vocabularies
AU - Singh, Aryan
AU - van de Ven, Pepijn
AU - Eising, Ciarán
AU - Denny, Patrick
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2025
Y1 - 2025
AB - Accurate skin lesion segmentation is crucial for early melanoma diagnosis, yet deep learning models are often hindered by two key challenges: the scarcity of expert-annotated data and the difficulty of precisely delineating complex lesion boundaries. Existing methods either require extensive annotated data or, in the case of graph-based refinement techniques, suffer from a feature representation bottleneck, where performance is limited by the quality of initial, often supervised, features. This paper introduces TokenCutSeg, a novel hybrid framework designed to overcome these limitations by strategically separating representation learning from supervised refinement. Our approach first leverages all available images, both labelled and unlabelled, in a self-supervised stage. A Vector Quantised Generative Adversarial Network (VQGAN) learns a rich, discrete visual vocabulary, tokenising images into a compact and meaningful representation. Subsequently, a Transformer, trained with a masked token prediction task, models the deep contextual relationships between these visual tokens. This self-supervised pipeline produces powerful, context-aware features. Finally, these high-quality features are fed into a lightweight, supervised graph-based module that performs the final segmentation refinement, effectively solving the traditional feature bottleneck. Evaluated on the ISIC 2016, 2017, and 2018 datasets, TokenCutSeg establishes a new state of the art, achieving mIoU scores of 86.18%, 81.37%, and 82.76%, respectively, demonstrating superior boundary accuracy and generalisation. Our results validate that this label-efficient, hybrid approach leads to more robust and practical segmentation models. Our code is available at GitHub.
KW - computer-aided diagnosis
KW - dermoscopy
KW - graph attention networks
KW - graph neural networks
KW - image segmentation
KW - medical image analysis
KW - semantic segmentation
KW - skin lesion
KW - transformers
KW - vision transformer
UR - https://www.scopus.com/pages/publications/105014458815
U2 - 10.1109/ACCESS.2025.3603401
DO - 10.1109/ACCESS.2025.3603401
M3 - Article
AN - SCOPUS:105014458815
SN - 2169-3536
VL - 13
SP - 150671
EP - 150683
JO - IEEE Access
JF - IEEE Access
ER -