Natural Language Processing

New Data Release on Hotel Reviews using Aspect-Based Sentiment Analysis


RIT released new annotated data entitled “Rakuten Travel Review aspects and sentiment-tagged corpus.” Produced over two months, the corpus supports Aspect-Based Sentiment Analysis (ABSA) in Japanese, a task that identifies the sentiment polarity expressed toward each aspect mentioned in a text. The data draws on 12,000 hotel reviews from Rakuten Travel, which were distributed among multiple annotators and labeled with positive and negative sentiment for each of seven aspect categories: Meal, Dinner, Breakfast, Location, Facilities, Room, and Bath.

The dataset contains over 72,000 review sentences, each of which can carry multiple aspects and sentiments. In one sample review sentence, translated here, a hotel guest reflects on their stay: “The room was large, the food was delicious, and the stars spread out like a planetarium from the open-air bath in the room, which was amazing.”
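
To make the labeling scheme concrete, here is a minimal Python sketch of how one sentence with several aspect and sentiment labels might be represented. The class name `LabeledSentence`, its fields, and the 14-slot vector layout are assumptions made for illustration and are not the released file format; the three labels attached to the sample sentence are likewise an illustrative reading, not the corpus annotations.

```python
from dataclasses import dataclass, field

# The seven aspect categories and two polarities named in the announcement.
ASPECTS = ("Meal", "Dinner", "Breakfast", "Location", "Facilities", "Room", "Bath")
POLARITIES = ("positive", "negative")


@dataclass
class LabeledSentence:
    """One review sentence with zero or more (aspect, polarity) labels.

    Hypothetical layout for illustration only.
    """
    text: str
    labels: set = field(default_factory=set)

    def add(self, aspect: str, polarity: str) -> None:
        # Reject anything outside the seven aspects and two polarities.
        if aspect not in ASPECTS:
            raise ValueError(f"unknown aspect: {aspect}")
        if polarity not in POLARITIES:
            raise ValueError(f"unknown polarity: {polarity}")
        self.labels.add((aspect, polarity))

    def to_vector(self) -> list:
        # Multi-hot vector with one slot per (aspect, polarity) pair,
        # ordered aspect by aspect: 7 aspects x 2 polarities = 14 slots.
        return [int((a, p) in self.labels) for a in ASPECTS for p in POLARITIES]


# Illustrative labels for the translated sample sentence (assumed, not official).
example = LabeledSentence(
    "The room was large, the food was delicious, and the stars spread out "
    "like a planetarium from the open-air bath in the room, which was amazing."
)
example.add("Room", "positive")
example.add("Meal", "positive")
example.add("Bath", "positive")

vector = example.to_vector()
```

A flat multi-hot vector like this is one common way to feed multi-label targets to a classifier; other encodings, such as one three-way label per aspect, would serve equally well.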

When the dataset is publicly released, it will be the first and largest standard dataset for ABSA in Japanese.
