The RecSys Challenge 2020 will be organized by Politecnico di Bari, Free University of Bozen-Bolzano, TU Wien, University of Colorado, Boulder, and Universidade Federal de Campina Grande, and sponsored by Twitter. The challenge focuses on a real-world task of tweet engagement prediction in a dynamic environment. The goal is to predict the probability for different types of engagement (Like, Reply, Retweet, and Retweet with Comment) of a target user for a set of tweets, based on heterogeneous input data.
Luca Belli, Sofia Ira Ktena, Alykhan Tejani, Alexandre Lung-Yut-Fon, Frank Portman, Xiao Zhu, Yuanpu Xie, Akshay Gupta, Michael Bronstein, Amra Delić, Gabriele Sottocornola, Walter Anelli, Nazareno Andrade, Jessie Smith, and Wenzhe Shi. 2020. Privacy-Preserving Recommender Systems Challenge on Twitter's Home Timeline, arXiv:2004.13715.
Twitter has released a large public dataset of 160M public tweets, subsampled over a period of roughly two weeks, that contains engagement features, user features, and tweet features.
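The task described above, one probability per engagement type for each (user, tweet) pair, can be illustrated with a deliberately naive baseline. This is only a sketch under stated assumptions, not the organizers' method or the official data format: the field names (`like`, `reply`, `retweet`, `retweet_with_comment`) and the convention that `None` means "no engagement" are hypothetical, chosen for illustration. The baseline predicts the same probability for every pair, namely the fraction of training rows in which that engagement occurred.

```python
from typing import Dict, List, Optional

# Hypothetical label names for the four engagement types named in the task.
ENGAGEMENT_TYPES = ["like", "reply", "retweet", "retweet_with_comment"]

# Hypothetical row layout: one dict per (user, tweet) pair, where each
# engagement field holds a timestamp if the engagement happened, else None.
Row = Dict[str, Optional[int]]


def base_rates(rows: List[Row]) -> Dict[str, float]:
    """Return the fraction of rows with a positive label per engagement type."""
    n = len(rows)
    if n == 0:
        return {t: 0.0 for t in ENGAGEMENT_TYPES}
    return {
        t: sum(1 for r in rows if r.get(t) is not None) / n
        for t in ENGAGEMENT_TYPES
    }


def predict(rows: List[Row], rates: Dict[str, float]) -> List[Dict[str, float]]:
    """Predict the training base rate for every row (a constant baseline)."""
    return [dict(rates) for _ in rows]


# Tiny made-up training sample, for illustration only.
train = [
    {"like": 1583000000, "reply": None, "retweet": None, "retweet_with_comment": None},
    {"like": None, "reply": None, "retweet": 1583000100, "retweet_with_comment": None},
    {"like": 1583000200, "reply": None, "retweet": None, "retweet_with_comment": None},
    {"like": None, "reply": None, "retweet": None, "retweet_with_comment": None},
]

rates = base_rates(train)
predictions = predict(train[:2], rates)
```

A constant predictor like this ignores the user and tweet features entirely; it is useful only as a floor against which a real model can be compared.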
If you are experiencing any issues, please notify the organizers.
Participation in this challenge is subject to acceptance of these Terms & Conditions.
IMPORTANT
Following reports of possible data leakage, we want to emphasize and expand on some of the existing competition rules in order to discourage and prevent participants from taking advantage of it.
Please note that no actual data leakage happened: the Twitter data provided are all publicly available via the API, and de-anonymization of the dataset was always possible and expected. The following measures are intended only to ensure a fair competition.
The Terms & Conditions already require that all submissions are accompanied by reproducible code, so that we can inspect winning solutions in detail: “Your Submission must include the source code and any related information used to derive the results contained in your Submission. The source code must be released under an open-source license (Apache 2.0). A third party should be able to use your submitted source to regenerate your results.” Furthermore, we explicitly state in our rules that NO de-anonymization or access to data from Twitter users and user behavior from the Twitter API other than that in the challenge dataset is allowed. Enriching the data with other data sources remains possible.
If the organizers discover in a code submission that any of the rules in the Terms & Conditions (as explained further above) have been broken, the participant(s) to whom the submission belongs will be disqualified from the competition. As stated in the Terms & Conditions: “Organizers and Sponsor reserve the right, in their sole discretion, to disqualify any participant who makes a Submission that does not meet the Requirements or is in violation of these Terms.”
Note: the timeline is subject to slight modifications.
When? | What? |
---|---|
March 2, 2020 | Dataset Release & RecSys Challenge Starts: training and validation datasets released |
June 1, 2020 | Test Dataset Release |
June 15, 2020 | RecSys Challenge ends |
June 22, 2020 | Announcement of the final leaderboard and winners; paper submission for RecSys Challenge Workshop |
July 8, 2020 | Paper Submission Due |
August 5, 2020 | Paper Acceptance Notifications |
August 19, 2020 | Camera-ready Papers Due |
September 22-26, 2020 | Workshop takes place virtually as part of the ACM RecSys conference |
Submission website: EasyChair
Time | Session |
---|---|
2:00 PM - 2:10 PM | Workshop Opening: Vito Walter Anelli, Amra Delić, Gabriele Sottocornola, Jessie Smith, Nazareno Andrade |
2:10 PM - 2:40 PM | Introduction to the Challenge: Twitter Team |
2:40 PM - 3:10 PM | 3rd place - A Stacking Ensemble Model for Prediction of Multi-type Tweet Engagements: Shuhei Goda, Naomichi Agata and Yuya Matsumura (25 min Presentation + 5 min Q&A) |
3:10 PM - 3:30 PM | A combination of classification based methods for recommending tweets: Sumit Sidana (15 min Presentation + 5 min Q&A) |
3:30 PM - 4:30 PM | Break |
4:30 PM - 4:50 PM | Engaging with Tweets: The Missing Dataset On Social Media: Seyed Ali Alhosseini, Raad Bin Tareaf and Christoph Meinel (15 min Presentation + 5 min Q&A) |
4:50 PM - 5:20 PM | 2nd place - Predicting Twitter Engagement With Deep Language Models: Maksims Volkovs, Zhaoyue Cheng, Mathieu Ravaut, Hojin Yang, Kevin Shen, Jin Peng Zhou, Anson Wong, Saba Zuberi and Aidan Gomez (25 min Presentation + 5 min Q&A) |
5:20 PM - 5:40 PM | Leveraging User Embeddings and Text to Improve CTR Predictions With Deep Recommender Systems: Carlos Patiño, Camilo Velásquez, Juan Muñoz, Juan Gutiérrez, David Valencia and Cristian Bartolome Aramburu (15 min Presentation + 5 min Q&A) |
5:40 PM - 6:00 PM | Multi-Objective Blended Ensemble For Highly Imbalanced Sequence Aware Tweet Engagement Prediction: Nicolò Felicioni, Andrea Donati, Luca Conterio, Luca Bartoccioni, Davide Yi Xian Hu, Cesare Bernardis and Maurizio Ferrari Dacrema (15 min Presentation + 5 min Q&A) |
6:00 PM - 7:00 PM | Break |
7:00 PM - 7:30 PM | 1st place - GPU Accelerated Feature Engineering and Training for Recommender Systems: Benedikt Schifferer, Gilberto Titericz, Chris Deotte, Christof Henkel, Kazuki Onodera, Jiwei Liu, Bojan Tunguz, Even Oldridge, Gabriel De Souza Pereira Moreira and Ahmet Erdem (25 min Presentation + 5 min Q&A) |
7:30 PM - 7:50 PM | Gradient Boosting and Language Model Ensemble for Tweet Recommendation: Pere Gilabert and Santi Seguí (15 min Presentation + 5 min Q&A) |
7:50 PM - 8:10 PM | Why Are Deep Learning Models Not Consistently Winning Recommender Systems Competitions Yet?: Dietmar Jannach, Gabriel de Souza Pereira Moreira and Even Oldridge (15 min Presentation + 5 min Q&A) |
8:10 PM - 8:30 PM | Final Remarks & Panel Discussion |