Even though most online review systems offer a star rating in addition to free-text reviews, the rating applies only to the review as a whole. However, different users may have different preferences regarding different aspects of a product or service, and may struggle to extract relevant information from the massive number of consumer reviews available online. In this paper, we present a framework for extracting prevalent topics from online reviews and automatically rating them on a 5-star scale. It consists of five modules: linguistic pre-processing, topic modelling, text classification, sentiment analysis, and rating. Topic modelling is used to extract prevalent topics, and individual sentences are then classified against these topics. A state-of-the-art word embedding method is used to measure the sentiment of each sentence. The two types of information associated with each sentence, its topic and its sentiment, are combined to aggregate the sentiment associated with each topic. The overall topic sentiment is then projected onto the 5-star rating scale. We use a dataset of Airbnb online reviews to demonstrate a proof of concept. The proposed framework is simple and fully unsupervised. It is also domain independent and therefore applicable to other domains of products and services.
We present a framework for rating online reviews that automatically extracts the underlying topics and rates each review against them. The framework consists of five modules:
- linguistic pre-processing,
- topic modelling,
- text classification,
- sentiment analysis, and
- rating.
The following subsections provide details about each module.
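As a rough orientation, the flow through the first four modules can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the paper: the keyword sets in `TOPICS` stand in for the output of topic modelling, the tiny `LEXICON` stands in for the word-embedding-based sentiment method, and all names (`preprocess`, `classify`, `sentiment`, `tag_sentences`) are hypothetical.

```python
import re
from typing import Dict, List, Optional, Set, Tuple

# ASSUMPTION: hand-written keyword sets standing in for topics that the
# framework would extract automatically with topic modelling.
TOPICS: Dict[str, Set[str]] = {
    "location": {"location", "walk", "station", "neighbourhood", "central"},
    "amenities": {"kitchen", "wifi", "shower", "bed", "towels"},
}

# ASSUMPTION: a toy lexicon standing in for the embedding-based sentiment
# scorer; scores are taken to lie in [-1, 1].
LEXICON: Dict[str, float] = {
    "great": 1.0, "clean": 0.5, "noisy": -0.5, "broken": -1.0,
}


def preprocess(review: str) -> List[List[str]]:
    """Module 1: split a review into sentences, then into lowercase tokens."""
    sentences = re.split(r"(?<=[.!?])\s+", review.strip())
    return [re.findall(r"[a-z']+", s.lower()) for s in sentences if s]


def classify(tokens: List[str]) -> Optional[str]:
    """Module 3: pick the topic with the largest keyword overlap, if any."""
    best, best_overlap = None, 0
    for topic, keywords in TOPICS.items():
        overlap = len(keywords.intersection(tokens))
        if overlap > best_overlap:
            best, best_overlap = topic, overlap
    return best


def sentiment(tokens: List[str]) -> float:
    """Module 4: average the lexicon scores of the tokens (0.0 if none match)."""
    scores = [LEXICON[t] for t in tokens if t in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


def tag_sentences(review: str) -> List[Tuple[Optional[str], float]]:
    """Return one (topic, sentiment) pair per sentence of the review."""
    return [(classify(t), sentiment(t)) for t in preprocess(review)]
```

Calling `tag_sentences("The location was great. The wifi was broken.")` yields one pair per sentence, here a positive score tagged `location` and a negative score tagged `amenities`. Sentences that match no topic are tagged `None` so that the rating step can skip them.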
The following picture provides an example of topic-related ratings for a given review. Different topics are highlighted in different colors. Each sentence is tagged at the end with its sentiment score and topic classification. The overall ratings of the review for location and amenities were calculated as 4 stars and 3 stars, respectively.
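The rating step in this example, combining per-sentence topic and sentiment into per-topic stars, can be sketched as follows. This is only an illustration under stated assumptions: sentence sentiment is assumed to lie in [-1, 1], topic sentiment is taken as the plain mean of its sentences, and the projection onto stars is a simple linear map. The actual aggregation and projection used by the framework may differ, the function names `aggregate` and `to_stars` are hypothetical, and the sample scores are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple


def aggregate(tagged: Iterable[Tuple[Optional[str], float]]) -> Dict[str, float]:
    """Mean sentiment per topic; sentences with no topic (None) are skipped."""
    by_topic: Dict[str, List[float]] = defaultdict(list)
    for topic, score in tagged:
        if topic is not None:
            by_topic[topic].append(score)
    return {topic: sum(s) / len(s) for topic, s in by_topic.items()}


def to_stars(mean_sentiment: float) -> int:
    """ASSUMPTION: linear map from [-1, 1] to [1, 5], rounded half up."""
    clipped = max(-1.0, min(1.0, mean_sentiment))
    return int(1 + 2 * (clipped + 1) + 0.5)


# Illustrative (topic, sentiment) pairs for the sentences of one review.
tagged = [
    ("location", 0.8),
    ("location", 0.4),
    ("amenities", 0.1),
    (None, 0.9),          # no topic assigned, ignored by aggregate()
    ("amenities", -0.1),
]
stars = {topic: to_stars(score) for topic, score in aggregate(tagged).items()}
```

With these made-up scores, `stars` maps `location` to 4 and `amenities` to 3. A linear map is the simplest choice; a calibrated or quantile-based projection would be an equally plausible alternative.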
The original PDF can be downloaded from ACM.