Overview of the SIGIR 2018 eCom Rakuten Data Challenge

Authors: Yiu-Chang Lin, Pradipto Das, Ankur Datta


This paper presents an overview of the SIGIR 2018 eCom Rakuten Data Challenge. In this data challenge, Rakuten has released a sample of one million product titles and the corresponding
anonymized category paths from their entire product catalog. Of these, 0.8 million product titles and their corresponding category paths are released as training data, and 0.2 million product titles
are released as test data. The task is to predict the category of each product title in the test set, where a category is defined as a full path in the taxonomy tree provided in the training set. The evaluation is divided into two
stages, which measure system performance on a part of the test data and on the entire test data, respectively. The systems are evaluated using weighted precision, recall, and F1. In total, 26 teams submitted
28 systems, with a top weighted F1 score of 0.8513.
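
The weighted metrics named above can be illustrated with a minimal sketch. It assumes that "weighted" means averaging per-category precision, recall, and F1 with each category weighted by its share of the true labels, and that a prediction counts as correct only when the full category path matches exactly. The abstract does not spell out either point, so both are assumptions. The function name `weighted_prf` and the path strings in the example are hypothetical placeholders, not the challenge's official scorer or data.

```python
from collections import Counter


def weighted_prf(y_true, y_pred):
    """Support-weighted precision, recall and F1 over full category paths.

    Assumption: each category's score is weighted by its frequency in
    y_true, and a path is correct only on an exact string match.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    support = Counter(y_true)    # true items per category path
    predicted = Counter(y_pred)  # predicted items per category path
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    total = len(y_true)
    precision = recall = f1 = 0.0
    for label, n in support.items():
        # Per-category scores; precision is 0 if the label was never predicted.
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / n
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        # Weight by the category's share of the true labels.
        w = n / total
        precision += w * p
        recall += w * r
        f1 += w * f
    return precision, recall, f1


# Hypothetical category paths, for illustration only.
gold = ["A>B", "A>B", "A>C", "D"]
pred = ["A>B", "A>C", "A>C", "D"]
p, r, f = weighted_prf(gold, pred)
print(f"precision={p:.4f} recall={r:.4f} f1={f:.4f}")
```

Under this definition, weighted recall equals plain accuracy, since each category's recall is weighted by the number of true items it contains.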

