Extracting Semantic Information for E-Commerce
Rakuten Ichiba uses a taxonomy to organize the items it sells. Currently, the taxonomy classes that are relevant in terms of profit generation and difficulty of exploration are being manually extended with data properties deemed helpful to create pages that improve the user search experience and ultimately the conversion rate. In this paper we present a scalable approach that aims to automate this process, automatically selecting the relevant and semantically homogenous subtrees in the taxonomy, extracting from semi-structured text in items descriptions a core set of properties and a popular subset of their ranges, then extending the covered range using relational similarities in free text. Additionally, our process automatically tags the items with the new semantic information and exposes them as RDF triples. We present a set of experiments showing the effectiveness of our approach in this business context.