Data from: Lexicon-enhanced sentiment analysis framework using rule-based classification scheme

Asghar, Muhammad Zubair; Khan, Aurangzeb; Ahmad, Shakeel; Qasim, Maria; Khan, Imran Ali

Published Feb 28, 2017 on Dryad. https://doi.org/10.5061/dryad.p1j71

Data files

Feb 28, 2017 version files 5.31 KB

pyhton-based-source code.rar
5.31 KB

Abstract

With the rapid growth of social networks and blogs, online communities increasingly use social media services to share their views and experiences about particular products, policies and events. Because of the economic importance of these reviews, there is a growing trend of writing user reviews to promote a product. Users increasingly rely on online blogs and review sites when purchasing products. User reviews are therefore considered an important source of information in Sentiment Analysis (SA) applications for decision making. In this work, we exploit the wealth of user reviews available through online forums to analyze the semantic orientation of words by categorizing them into positive and negative classes, identifying and classifying the emoticons, modifiers, general-purpose words and domain-specific words expressed in the public's feedback about products. The unsupervised learning approach employed in previous studies is becoming less effective because of data sparseness and because of low accuracy caused by ignoring emoticons, modifiers and domain-specific words, all of which may lead to inaccurate classification of users' reviews. Lexicon-enhanced sentiment analysis based on a rule-based classification scheme is an alternative approach for improving the sentiment classification of users' reviews in online communities. In addition to the sentiment terms used in general-purpose sentiment analysis, we integrate emoticons, modifiers and domain-specific terms to analyze the reviews posted in online communities. To test the effectiveness of the proposed method, we considered user reviews in three domains. The results obtained from different experiments demonstrate that the proposed method overcomes limitations of previous methods, and that sentiment analysis performance improves over baseline methods once emoticons, modifiers, negations and domain-specific terms are considered.
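The general idea of combining a sentiment lexicon with rules for emoticons, modifiers, negations and domain-specific terms can be illustrated with a minimal Python sketch. This is not the code distributed in the data file. The lexicon entries, weights, rule order and function names (`tokenize`, `score_review`, `classify`) are all hypothetical and chosen only to make the example runnable.

```python
import re

# Hypothetical toy resources (assumptions for illustration only); the
# lexicons and weights used in the actual framework are not reproduced here.
GENERAL_LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
DOMAIN_LEXICON = {"blurry": -1.5, "lightweight": 1.0}  # e.g. camera reviews
EMOTICONS = {":)": 1.0, ":-)": 1.0, ":(": -1.0, ":-(": -1.0}
MODIFIERS = {"very": 1.5, "extremely": 2.0, "slightly": 0.5}
NEGATIONS = {"not", "no", "never"}

# Match simple emoticons first, then plain lowercase words.
TOKEN_RE = re.compile(r"[:;]-?[()]|[a-z']+")


def tokenize(text):
    """Lowercase the text and split it into emoticon and word tokens."""
    return TOKEN_RE.findall(text.lower())


def score_review(text):
    """Sum rule-adjusted polarity scores over the tokens of one review."""
    score = 0.0
    weight = 1.0    # pending modifier strength
    negate = False  # pending negation flag
    for token in tokenize(text):
        if token in NEGATIONS:
            negate = True
        elif token in MODIFIERS:
            weight *= MODIFIERS[token]
        elif token in EMOTICONS:
            # Emoticons contribute directly, untouched by modifiers/negation.
            score += EMOTICONS[token]
        else:
            # Domain-specific entries take precedence over general ones.
            polarity = DOMAIN_LEXICON.get(token, GENERAL_LEXICON.get(token))
            if polarity is not None:
                value = polarity * weight
                score += -value if negate else value
                # Reset pending rules once a sentiment word consumes them.
                weight, negate = 1.0, False
    return score


def classify(text):
    """Map the summed score to a positive, negative or neutral label."""
    score = score_review(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for review in ("The pictures are very good :)", "not good", "blurry :("):
        print(review, "->", classify(review))
```

In this sketch a pending negation or modifier carries forward until the next sentiment word, which is a simplification; a fuller rule set would probably limit that window.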

Usage notes

Python and NLTK-based Source Code