Unconstrained Product Categorization with Sequence-to-Sequence Models

Authors: Maggie Yundi Li, Liling Tan, Stanley Kok, Ewa Szymanska


Product categorization is a critical component of e-commerce platforms that enables the organization and retrieval of relevant products. Instead of following conventional classification approaches, we treat category prediction as a sequence generation task, which allows product categorization beyond the hierarchical definition of the full taxonomy. This paper presents our submissions to the Rakuten Data Challenge at SIGIR eCom’18. The goal of the challenge is to predict multi-level hierarchical product categories from e-commerce product titles. We ensembled several attentional sequence-to-sequence models to generate product category labels without supervised constraints. Such unconstrained product categorization suggests possible additions to the existing category hierarchy and reveals ambiguous and repetitive category leaves. Our system achieved a balanced F-score of 0.8256, while the organizers’ baseline system scored 0.8142 and the best-performing system scored 0.8513.
