TY - JOUR
T1 - Detecting Process Duration Drift Using Gamma Mixture Models in a Left-Truncated and Right-Censored Environment
AU - Yang, Lingkai
AU - McClean, Sally
AU - Donnelly, Mark
AU - Khan, Kashaf
AU - Burke, Kevin
N1 - Publisher Copyright:
© 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.
PY - 2024/8/21
Y1 - 2024/8/21
AB - In a business context, process duration is the time customers spend between successive activities. This temporal perspective offers important insight into customer behavior, highlights potential bottlenecks, and influences business management decisions. The distribution of these process durations often changes over time due to factors such as seasonality, emerging legislation, changes to supply chains, and customer demand. Referred to as concept drift, these variations pose challenges for robust process modeling, understanding, and refinement. Gamma mixture models are widely employed to model durations. These source data can, however, become left-truncated and right-censored within any specific observation window, thereby necessitating a (well-known) modification to the likelihood function. The approach reported in this article leveraged this adapted likelihood across a series of observation windows, applying the likelihood ratio test to identify duration changes/concept drift. Due to its flexibility in modeling any duration distribution, the gamma mixture model was used with Nelder-Mead optimized likelihood for the left-truncated and right-censored data. The number of gamma components was determined by the Bayesian information criterion. The proposed framework underwent validation through simulated exponential samples, leading to recommendations for its practical application. Subsequently, we applied the methodology to three real-life event logs exhibiting diverse characteristics. Experimental results showcase the effectiveness of our approach in terms of data fitting, as compared to Kaplan-Meier curves, and in detecting instances of drift. This comprehensive validation underscores the practical utility and reliability of our framework for dynamic business scenarios.
KW - Business process duration
KW - concept drift detection
KW - gamma mixture models
KW - left-truncated and right-censored
KW - likelihood ratio test
KW - Nelder-Mead optimization
UR - http://www.scopus.com/inward/record.url?scp=85202435239&partnerID=8YFLogxK
U2 - 10.1145/3669942
DO - 10.1145/3669942
M3 - Article
AN - SCOPUS:85202435239
SN - 1556-4681
VL - 18
JO - ACM Transactions on Knowledge Discovery from Data
JF - ACM Transactions on Knowledge Discovery from Data
IS - 8
M1 - 195
ER -