TY - JOUR
T1 - GlassBoost
T2 - A Lightweight and Explainable Classification Framework for Tabular Datasets
AU - Namjoo, Ehsan
AU - O’Connor, Alison N.
AU - Buckley, Jim
AU - Ryan, Conor
N1 - Publisher Copyright:
© 2025 by the authors.
PY - 2025/6
Y1 - 2025/6
N2 - Explainable artificial intelligence (XAI) is essential for fostering trust, transparency, and accountability in machine learning systems, particularly when applied in high-stakes domains. This paper introduces a novel XAI system designed for classification tasks on tabular data, which offers a balance between performance and interpretability. The proposed method, GlassBoost, first trains an XGBoost model on a given dataset and then computes gain scores, quantifying the average improvement in the model’s loss function contributed by each feature during tree splits. Based on these scores, a subset of significant features is selected. A shallow decision tree is then trained using the top d features with the highest gain scores, where d is significantly smaller than the total number of original features. This model compression yields a transparent, IF–THEN rule-based decision process that remains faithful to the original high-performing model. To evaluate the system, we apply it to an anomaly detection task in the context of intrusion detection systems (IDSs), using a dataset containing traffic features from both malicious and normal activities. Results show that our method achieves high accuracy, precision, and recall while providing a clear and interpretable explanation of its decision-making. We further validate its explainability using SHAP, a well-established approach in the field of XAI. Comparative analysis demonstrates that GlassBoost outperforms SHAP in terms of precision, recall, and accuracy, with more balanced performance across the three metrics. Likewise, our review of the literature indicates that GlassBoost outperforms many other XAI models while retaining computational efficiency. In one of our configurations, GlassBoost achieved an accuracy of 0.9868, a recall of 0.9792, and a precision of 0.9843 using only eight features within a tree of maximum depth four.
AB - Explainable artificial intelligence (XAI) is essential for fostering trust, transparency, and accountability in machine learning systems, particularly when applied in high-stakes domains. This paper introduces a novel XAI system designed for classification tasks on tabular data, which offers a balance between performance and interpretability. The proposed method, GlassBoost, first trains an XGBoost model on a given dataset and then computes gain scores, quantifying the average improvement in the model’s loss function contributed by each feature during tree splits. Based on these scores, a subset of significant features is selected. A shallow decision tree is then trained using the top d features with the highest gain scores, where d is significantly smaller than the total number of original features. This model compression yields a transparent, IF–THEN rule-based decision process that remains faithful to the original high-performing model. To evaluate the system, we apply it to an anomaly detection task in the context of intrusion detection systems (IDSs), using a dataset containing traffic features from both malicious and normal activities. Results show that our method achieves high accuracy, precision, and recall while providing a clear and interpretable explanation of its decision-making. We further validate its explainability using SHAP, a well-established approach in the field of XAI. Comparative analysis demonstrates that GlassBoost outperforms SHAP in terms of precision, recall, and accuracy, with more balanced performance across the three metrics. Likewise, our review of the literature indicates that GlassBoost outperforms many other XAI models while retaining computational efficiency. In one of our configurations, GlassBoost achieved an accuracy of 0.9868, a recall of 0.9792, and a precision of 0.9843 using only eight features within a tree of maximum depth four.
KW - anomaly detection
KW - cybersecurity
KW - explainability
KW - feature importance score
KW - gradient-boosting machine (GBM)
KW - model compression
UR - https://www.scopus.com/pages/publications/105008989789
U2 - 10.3390/app15126931
DO - 10.3390/app15126931
M3 - Article
AN - SCOPUS:105008989789
SN - 2076-3417
VL - 15
JO - Applied Sciences (Switzerland)
JF - Applied Sciences (Switzerland)
IS - 12
M1 - 6931
ER -