TY - GEN
T1 - Synerise at RecSys 2021
T2 - 15th ACM Recommender Systems Challenge Workshop, RecSysChallenge 2021
AU - Daniluk, Michal
AU - Dabrowski, Jacek
AU - Rychalska, Barbara
AU - Goluchowski, Konrad
N1 - Publisher Copyright: © 2021 Owner/Author.
PY - 2021/10/1
Y1 - 2021/10/1
N2 - In this paper we present our 2nd place solution to the ACM RecSys 2021 Challenge organized by Twitter. The challenge aims to predict user engagement for a set of tweets, offering an exceptionally large data set of 1 billion data points sampled from over four weeks of real Twitter interactions. Each data point contains multiple sources of information, such as tweet text along with engagement features, user features, and tweet features. The challenge brings the problem close to a real production environment by introducing strict latency constraints in the model evaluation phase: the average inference time for single tweet engagement prediction is limited to 6 ms on a single CPU core with 64 GB memory. Our proposed model relies on extensive feature engineering performed with methods such as the Efficient Manifold Density Estimator (EMDE), our previously introduced algorithm based on the Locality Sensitive Hashing method, and a novel Fourier Feature Encoding, among others. In total, we create numerous features describing a user's Twitter account status and the content of a tweet. In order to adhere to the strict latency constraints, the underlying model is a simple residual feed-forward neural network. The system is a variation of our previous methods which proved successful in KDD Cup 2021, WSDM Challenge 2021, and SIGIR eCom Challenge 2020. We release the source code at: https://github.com/Synerise/recsys-challenge-2021.
AB - In this paper we present our 2nd place solution to the ACM RecSys 2021 Challenge organized by Twitter. The challenge aims to predict user engagement for a set of tweets, offering an exceptionally large data set of 1 billion data points sampled from over four weeks of real Twitter interactions. Each data point contains multiple sources of information, such as tweet text along with engagement features, user features, and tweet features. The challenge brings the problem close to a real production environment by introducing strict latency constraints in the model evaluation phase: the average inference time for single tweet engagement prediction is limited to 6 ms on a single CPU core with 64 GB memory. Our proposed model relies on extensive feature engineering performed with methods such as the Efficient Manifold Density Estimator (EMDE), our previously introduced algorithm based on the Locality Sensitive Hashing method, and a novel Fourier Feature Encoding, among others. In total, we create numerous features describing a user's Twitter account status and the content of a tweet. In order to adhere to the strict latency constraints, the underlying model is a simple residual feed-forward neural network. The system is a variation of our previous methods which proved successful in KDD Cup 2021, WSDM Challenge 2021, and SIGIR eCom Challenge 2020. We release the source code at: https://github.com/Synerise/recsys-challenge-2021.
KW - deep learning
KW - neural networks
KW - recommendation systems
KW - RecSys Twitter Challenge
UR - http://www.scopus.com/inward/record.url?scp=85140239428&partnerID=8YFLogxK
U2 - 10.1145/3487572.3487599
DO - 10.1145/3487572.3487599
M3 - Conference contribution
AN - SCOPUS:85140239428
T3 - ACM International Conference Proceeding Series
SP - 15
EP - 21
BT - Proceedings of Workshop on the RecSys Challenge 2021, RecSysChallenge 2021
PB - Association for Computing Machinery
Y2 - 1 October 2021
ER -