
The 2021 Rakuten Data Challenge


Rakuten Institute of Technology’s Paris office (RIT Paris) has launched its latest data challenge, “Rakuten Multi-modal Colour Extraction”, with École Normale Supérieure and Collège de France. RIT Paris has run numerous data challenges in past years, including the Rakuten Multi-modal Classification and Retrieval challenge, held as part of the 2020 SIGIR Workshop on Ecommerce. This year’s challenge focuses on predicting the colours of products from their image, title, and description. A product can have several colours, which makes this a multi-label classification problem: more than one label can be assigned to each observation. For this challenge, Rakuten has released a multimodal dataset of approximately 250k listings from the Rakuten Ichiba marketplace, consisting of product titles, product descriptions, product images, and their corresponding colour tags. There are 19 unique colour tags in the dataset.
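
For readers new to the multi-label framing, the short Python sketch below shows one common way to represent such targets: each product gets a multi-hot vector with one slot per colour, and predictions are read back by thresholding a score per colour. The colour names, the `encode` and `decode` helpers, and the 0.5 threshold are all illustrative assumptions; they are not the challenge's actual tag list, data format, or evaluation method.

```python
# Hypothetical colour vocabulary for illustration only; the real dataset
# has 19 colour tags, which are not listed in this article.
COLOURS = ["black", "white", "red", "blue", "green"]


def encode(tags, vocabulary=COLOURS):
    """Return a multi-hot vector: 1 where the colour is tagged, else 0."""
    tagset = set(tags)
    unknown = tagset - set(vocabulary)
    if unknown:
        raise ValueError(f"unknown colour tags: {sorted(unknown)}")
    return [1 if colour in tagset else 0 for colour in vocabulary]


def decode(scores, vocabulary=COLOURS, threshold=0.5):
    """Keep every colour whose score reaches the (assumed) threshold."""
    return [colour for colour, score in zip(vocabulary, scores) if score >= threshold]


# A product tagged with two colours has two positive labels at once.
target = encode(["red", "white"])

# Per-colour scores, as a model might output them (made-up numbers).
predicted = decode([0.1, 0.8, 0.7, 0.2, 0.05])
```

Because each colour is decided independently, a product can end up with zero, one, or several predicted colours, which is what distinguishes this setting from ordinary single-label classification.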

The challenge presents several interesting research opportunities in an area where promising advances and breakthroughs are still to be made. It is especially pertinent for e-commerce companies looking to build scalable categorization, with applications ranging from personalized search and recommendations to query understanding.

Ahead of the Data Challenge, Laurent Ach, RIT Paris’s European Director, and Stéphane Mallat, mathematician and professor at Collège de France, sat down for a joint interview about the origins of the challenge and some of the key ingredients for success. (The interview is conducted in French.)

Rakuten Paris welcomes researchers and students who are passionate about AI, problem-solving and unlocking the power of open data, and people at the intersection of research and practice. To learn more about the challenge and registration, please check the website or contact the Paris office team at 
