CNN-Based Approaches for Various Types of Tabular Data

Research output: Contribution to journal › Article › peer-review

Abstract

Deep learning (DL) includes various architectures, such as deep neural networks (DNNs) and convolutional neural networks (CNNs). DL is powerful and flexible for non-tabular (unstructured) data (e.g. images, text). However, on tabular data, standard DNNs often do not outperform traditional machine learning (ML) methods such as tree-based models (e.g. random forest, XGBoost). CNNs carry out dimensionality reduction for non-tabular (especially image) data, but may be useful for tabular data too. In this paper, we present a unified framework of one-dimensional CNN (1D-CNN)-based approaches for various types of tabular data, which provides an end-to-end learning framework. We also propose two novel 1D-CNN-based models, i.e. a negative binomial CNN (NB-CNN) model for over-dispersed count data and a Cox-based CNN Self-Attention model for high-dimensional survival data. The predictive performance of the proposed methods is evaluated by comparing them with existing ML/DL methods using four types of real tabular data, i.e. binary response data with high-dimensional features, over-dispersed count data, high-dimensional survival data, and time-series data with substantial variability. The experimental results show that the proposed methods overall outperform existing ML/DL models. In particular, the NB-CNN achieves lower root mean squared error (RMSE) and a higher coefficient of determination (R²) on over-dispersed count data than tree-based methods. Similarly, the Cox-based CNN Self-Attention model yields higher C-index values on high-dimensional survival tasks than state-of-the-art approaches.
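
As a rough illustration of two quantities the abstract refers to, the sketch below computes a negative binomial negative log-likelihood (the kind of loss a count model such as NB-CNN could minimize) and a Harrell-style C-index (the survival metric reported). It is not taken from the paper: the mean/dispersion parameterization, with variance mu + mu²/r, and the function names `nb_log_pmf`, `nb_nll` and `c_index` are assumptions made for this example only.

```python
import math


def nb_log_pmf(y, mu, r):
    """Log-probability of count y under a negative binomial with mean mu
    and dispersion r (assumed parameterization: variance = mu + mu**2 / r)."""
    return (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )


def nb_nll(ys, mus, r):
    """Mean negative log-likelihood over observed counts ys, given
    predicted means mus (e.g. network outputs) and a shared dispersion r."""
    return -sum(nb_log_pmf(y, mu, r) for y, mu in zip(ys, mus)) / len(ys)


def c_index(times, events, risks):
    """Harrell-style concordance index.

    A pair (i, j) is comparable when subject i had an observed event
    strictly before time j. It is concordant when the subject with the
    earlier event also has the higher predicted risk; ties in risk
    count as one half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable


if __name__ == "__main__":
    # Toy values chosen purely for illustration.
    print(nb_nll([0, 3, 7], [1.0, 2.5, 6.0], r=1.5))
    print(c_index([1, 2, 3], [1, 1, 0], [0.9, 0.5, 0.1]))
```

A smaller dispersion `r` corresponds to heavier over-dispersion, and as `r` grows the distribution approaches a Poisson with the same mean. In a full model both the predicted means and `r` would be learned. Here they are plain numbers so that the formulas can be checked in isolation.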

Original language: English
Pages (from-to): 200537-200554
Number of pages: 18
Journal: IEEE Access
Volume: 13
Publication status: Published - 2025

Keywords

  • CNN
  • deep learning
  • DNN
  • high-dimensional survival data
  • machine learning
