TY - GEN
T1 - The Monad Platform - Temporal Aspects in Behavioral Modeling
AU - Rychalska, Barbara
AU - Sieradzki, Igor
AU - Dąbrowski, Jacek
N1 - Publisher Copyright: © 2023 The Authors.
PY - 2023/9/28
Y1 - 2023/9/28
AB - Behavioral modeling is an emerging machine learning area which aims to predict user actions, especially in commercial settings. Companies actively gather data such as clicks, likes, page views, card transactions, add-to-basket, or purchase events. However, the large size of the data, combined with the difficulty of applying sophisticated graph-based machine learning, often leads to data not being used for modeling at all. In response to this, we propose the Monad platform. The primary focus of Monad is to train a large, private behavioral foundation model for each client company. The foundation model is then fine-tuned to any user behavioral prediction task, such as recommendations or churn. The Monad foundation models are based on our algorithms - Cleora and EMDE - which are award-winning solutions (KDD Cup and other anonymized AI contests). Cleora and EMDE make it possible to process real-life datasets composed of billions of events in record time. In this paper, we present and analyze the temporal side of Monad. Time is a crucial aspect of behavioral training, because many target tasks, such as propensity to buy, rely on seasonal aspects, and predictions are usually required to include a time frame (e.g. 'Will the user buy brand X within a week from now?'). To tackle this problem, we present the concept of time-based data splits in training, as well as approaches to time-based feature encoding which do not require normalization or any feature statistics.
UR - http://www.scopus.com/inward/record.url?scp=85175858223&partnerID=8YFLogxK
U2 - 10.3233/FAIA230645
DO - 10.3233/FAIA230645
M3 - Conference contribution
AN - SCOPUS:85175858223
T3 - Frontiers in Artificial Intelligence and Applications
SP - 3226
EP - 3232
BT - ECAI 2023 - 26th European Conference on Artificial Intelligence, including 12th Conference on Prestigious Applications of Intelligent Systems, PAIS 2023 - Proceedings
A2 - Gal, Kobi
A2 - Nowe, Ann
A2 - Nalepa, Grzegorz J.
A2 - Fairstein, Roy
A2 - Radulescu, Roxana
PB - IOS Press BV
T2 - 26th European Conference on Artificial Intelligence, ECAI 2023
Y2 - 30 September 2023 through 4 October 2023
ER -