PUBLICATIONS
Finding food entity relationships using user-generated data in recipe service
Rakuten Recipe is a recipe site where users can submit their recipes and share them with others. Because the recipe content is user-generated, it usually contains many misspellings, abbreviations, synonyms, hypernyms and hyponyms. Identifying and normalizing these words is essential for retrieving recipes relevant to a user’s request. In this paper, we introduce a new approach to finding related words in the recipe domain that exploits the structure of the data. Based on the observation that people usually write the main ingredient in the first position of each recipe's ingredient list, and that such an ingredient is strongly related to the categories to which the recipe belongs, we calculate relation scores of word pairs using real service data containing 790 categories and 405,519 recipes. Experimental results show that the approach finds semantically related word pairs with an F-score of 0.93.
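The idea of scoring word pairs from first-listed ingredients and recipe categories can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so the cosine similarity over category counts used here is an assumption, not the paper's method; the function names (`category_profiles`, `relation_scores`) and the toy recipes are likewise hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import sqrt


def category_profiles(recipes):
    """Map each first-listed ingredient to a Counter of the categories it appears in."""
    profiles = defaultdict(Counter)
    for category, ingredients in recipes:
        if ingredients:  # skip recipes with an empty ingredient list
            profiles[ingredients[0]][category] += 1
    return profiles


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def relation_scores(recipes):
    """Score every pair of first-listed ingredients by category-profile similarity.

    ASSUMPTION: cosine similarity stands in for the paper's relation score,
    which the abstract does not specify.
    """
    profiles = category_profiles(recipes)
    return {
        (w1, w2): cosine(profiles[w1], profiles[w2])
        for w1, w2 in combinations(sorted(profiles), 2)
    }


# Hypothetical toy data: (category, ingredient list in the order the user wrote it).
recipes = [
    ("tomato dishes", ["tomato", "salt"]),
    ("tomato dishes", ["tomatoe", "basil"]),  # misspelling of "tomato"
    ("tomato dishes", ["tomato", "olive oil"]),
    ("chicken dishes", ["chicken", "tomato"]),
]

scores = relation_scores(recipes)
```

In this toy data the misspelling "tomatoe" only ever leads recipes in the same category as "tomato", so the pair receives a high score, while "chicken" and "tomato" share no category as main ingredients and score zero. A real system would need a threshold on the score and far more data than this example uses.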