TY - GEN
T1 - Document Image Classification with Intra-Domain Transfer Learning and Stacked Generalization of Deep Convolutional Neural Networks
AU - Das, Arindam
AU - Roy, Saikat
AU - Bhattacharya, Ujjwal
AU - Parui, Swapan K.
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/11/26
Y1 - 2018/11/26
AB - In this article, a region-based Deep Convolutional Neural Network framework is presented for document structure learning. The contribution of this work involves efficient training of region-based classifiers and effective ensembling for document image classification. A primary level of 'inter-domain' transfer learning is used by exporting weights from a VGG16 architecture pre-trained on the ImageNet dataset to train a document classifier on whole document images. Exploiting the nature of region-based influence modelling, a secondary level of 'intra-domain' transfer learning is used for rapid training of deep learning models for image segments. Finally, stacked-generalization-based ensembling is utilized for combining the predictions of the base deep neural network models. The proposed method achieves state-of-the-art accuracy of 92.21% on the popular RVL-CDIP document image dataset, exceeding the benchmarks set by existing algorithms.
KW - deep convolutional neural network
KW - deep learning
KW - document recognition
KW - document structure learning
KW - intra-domain
KW - neural network
KW - transfer learning
UR - https://www.scopus.com/pages/publications/85059774330
U2 - 10.1109/ICPR.2018.8545630
DO - 10.1109/ICPR.2018.8545630
M3 - Conference contribution
AN - SCOPUS:85059774330
T3 - Proceedings - International Conference on Pattern Recognition
SP - 3180
EP - 3185
BT - 2018 24th International Conference on Pattern Recognition, ICPR 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 24th International Conference on Pattern Recognition, ICPR 2018
Y2 - 20 August 2018 through 24 August 2018
ER -