TokenCutSeg: Label-Efficient Skin Lesion Segmentation by Powering Supervised Refinement With Self-Learned Visual Vocabularies

Research output: Contribution to journal › Article › peer-review

Abstract

Accurate skin lesion segmentation is crucial for early melanoma diagnosis, yet deep learning models are often hindered by two key challenges: the scarcity of expert-annotated data and the difficulty of precisely delineating complex lesion boundaries. Existing methods either require extensive annotated data or, in the case of graph-based refinement techniques, suffer from a feature representation bottleneck, in which performance is limited by the quality of the initial, often supervised, features. This paper introduces TokenCutSeg, a novel hybrid framework designed to overcome these limitations by strategically separating representation learning from supervised refinement. Our approach first leverages all available images, both labelled and unlabelled, in a self-supervised stage. A Vector Quantised Generative Adversarial Network (VQGAN) learns a rich, discrete visual vocabulary, tokenising images into a compact and meaningful representation. Subsequently, a Transformer, trained with a masked token prediction task, models the deep contextual relationships between these visual tokens. This self-supervised pipeline produces powerful, context-aware features. Finally, these high-quality features are fed into a lightweight, supervised graph-based module that performs the final segmentation refinement, effectively solving the traditional feature bottleneck. Evaluated on the ISIC 2016, 2017, and 2018 datasets, TokenCutSeg establishes a new state of the art, achieving mIoU scores of 86.18%, 81.37%, and 82.76%, respectively, demonstrating superior boundary accuracy and generalisation. Our results validate that this label-efficient, hybrid approach leads to more robust and practical segmentation models. Our code is available on GitHub.
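
The self-supervised stage described above rests on two generic operations: mapping continuous features to discrete tokens via a codebook, and hiding some tokens so a model can be trained to predict them. The following is a minimal sketch of those two ideas only, not the authors' implementation. The codebook, feature vectors, masking ratio, and the names `quantise`, `mask_tokens`, and `MASK_ID` are all hypothetical; in a real VQGAN the features come from a learned encoder and the codebook is learned jointly with it.

```python
import random

MASK_ID = -1  # hypothetical sentinel marking a hidden token position


def quantise(features, codebook):
    """Map each feature vector to the index of its nearest codebook entry."""
    tokens = []
    for vec in features:
        # Squared Euclidean distance from this vector to every codebook vector.
        dists = [sum((a - b) ** 2 for a, b in zip(vec, code)) for code in codebook]
        tokens.append(dists.index(min(dists)))
    return tokens


def mask_tokens(tokens, ratio, rng):
    """Hide a fraction of the tokens; return the masked sequence and the targets."""
    n_mask = max(1, round(ratio * len(tokens)))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    masked = list(tokens)
    for p in positions:
        masked[p] = MASK_ID
    # The prediction targets are the original tokens at the hidden positions.
    targets = {p: tokens[p] for p in positions}
    return masked, targets


# Toy values standing in for a learned codebook and encoder outputs.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
features = [(0.1, 0.1), (0.9, -0.1), (0.2, 0.8), (1.1, 0.2)]

tokens = quantise(features, codebook)  # [0, 1, 2, 1]
masked, targets = mask_tokens(tokens, ratio=0.5, rng=random.Random(0))
```

In a masked token prediction setup, a model would receive `masked` and be trained to recover the entries of `targets`; the contextual features it learns along the way are what a downstream refinement module would consume.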

Original language: English
Pages (from-to): 150671-150683
Number of pages: 13
Journal: IEEE Access
Volume: 13
Publication status: Published - 2025

Keywords

  • computer-aided diagnosis
  • dermoscopy
  • graph attention networks
  • graph neural networks
  • image segmentation
  • medical image analysis
  • semantic segmentation
  • skin lesion
  • transformers
  • vision transformer
