Year : 2018  |  Volume : 3  |  Issue : 2  |  Page : 41-47

Semi-supervised sentiment analysis of consumer reviews

Department of Computer Science, College of Information and Computer Science, Imam Mohammad Bin Saud Islamic University, Riyadh, Saudi Arabia

Date of Submission: 16-Jul-2018
Date of Acceptance: 06-Sep-2018
Date of Web Publication: 12-Dec-2018

Correspondence Address:
Dr. Sarah Omar Alhumoud
Department of Computer Science, College of Information and Computer Science, Imam Mohammad Bin Saud Islamic University, Riyadh
Saudi Arabia

Source of Support: None, Conflict of Interest: None

DOI: 10.4103/ijas.ijas_8_18


Background: Consumer transactions and individuals' online information exchange streams hold an enormous amount of data that is large in volume, velocity of generation, and variety.
Motivation: Extracting information and trends from these data is valuable for understanding consumers' activities and preferences and for guiding future decision-making. Consequently, interest in analyzing customer reviews through sentiment classification is growing. However, the resources and lexicons available to aid classification learning are still scarce.
Aim: The present research presents a domain-specific lexicon that enhances the analysis of customer reviews on services. The Lexicon for Sentiment Analysis for Reviews (LSAR) is applied using semi-supervised SVM classification.
Results: Results were encouraging, showing that the classifier based on the proposed lexicon, LSAR, achieved an accuracy of 0.94, compared with 0.72 for the AFINN-based classifier.

Keywords: Computational linguistics, natural language processing, sentiment analysis

How to cite this article:
Alhumoud SO. Semi-supervised sentiment analysis of consumer reviews. Imam J Appl Sci 2018;3:41-7

How to cite this URL:
Alhumoud SO. Semi-supervised sentiment analysis of consumer reviews. Imam J Appl Sci [serial online] 2018 [cited 2021 Dec 6];3:41-7. Available from:

Introduction

Analyzing large amounts of data using data mining, text mining, machine learning, and natural language processing (NLP) is of great value in revealing meaning and patterns in unstructured text.[1] Data mining is the exploration and analysis of large amounts of data.[2] Sentiment Analysis (SA) is one of the NLP concepts, which is also called “opinion mining,” “subjectivity analysis,” “analysis of stance,” and “evidentiality.”[2],[3],[4] According to Cambria,[1] we are gradually shifting into an era in which people's opinions will dictate the shape of the final products and services. Opinion mining, or SA, is concerned with quantifying content from subjective accounts. This is done by extracting “sentiment” from a text written in a natural language, providing useful information about the communicator's views and tendencies about a specific item or event by classifying text into positive and negative classes. SA could be applied to different domains, such as customer reviews, business intelligence, service ratings, political opinions, and social media transactions and interactions, to extract tendencies, polarity of opinion, and trends.

Turning large amounts of data into valuable knowledge and learning from that data is a necessity for improving data intelligence. Data mining is “the process of discovering interesting patterns and knowledge from large amounts of data.”[2] Data mining is a major step in knowledge discovery. Other steps include data cleaning, data integration, data selection, data transformation, pattern evaluation, and knowledge presentation.[2] The definition of “machine learning” could be considered similar to that of “data mining,” with a noticeable difference in the kind of inputs and outputs of a machine learning algorithm. Machine learning is a good candidate when the mining goals cannot be directly described.[5] An example is analyzing a sample of movie reviews to classify movies as likable or not, and then using what was learned to predict the likability of other movies.

SA can be implemented using two approaches. The first is the supervised learning approach, known as the “corpus-based” approach, where both the inputs and the desired outputs of the system are known and the system learns to map inputs to outputs.[6] This is done using machine learning algorithms such as the support vector machine (SVM), Naïve Bayes (NB), decision tree (D-Tree), and K-nearest neighbors algorithms. The second approach is the unsupervised approach, also known as “knowledge-based,” “rule-based,” or “lexicon-based.”[6] In this latter approach, a word is compared to a “look up” dictionary of terms, where each word is associated with a polarity value of +1 or −1, for positive or negative, respectively. Most previous studies do not use the neutral class (0), making the classification easier and more striking, but possibly biased.[3] In the present study, data are classified into three classes: positive, negative, and neutral.

This research presents a new lexicon for review analysis. It was tested with reviews from the Feed-Finder application, which helps mothers locate places amenable to breastfeeding outside of the home. Feed-Finder was developed in the Open Lab, Newcastle University, UK. It promotes breastfeeding, as opposed to bottle feeding, by providing a location-based social network to interested mothers. The application allows mothers to tag, review, or search for breastfeeding-friendly venues. Utilizing the available 1956 reviews about different venues, the present research aims to design a model to classify the reviews into positive, negative, and neutral using a semi-supervised approach, or what is called the “Bag-of-Words” method.[7],[8],[9] This is a mixed approach that combines supervised learning, using machine learning classification, with an unsupervised approach, using lexicon-based learning. This method proved to be much faster than the pure supervised approach.[7] Using this method, two different classifiers are implemented: the first is built on the Lexicon for Sentiment Analysis for Reviews (LSAR) and the other is built on the AFINN lexicon.[10] The next section presents a review of the related work, Section 3 describes the methodology, and Section 4 shows the results and discussion, which is followed by the conclusion and acknowledgment.

Literature review

The body of knowledge about SA is large; however, the present research focuses mainly on works that classify reviews, listed chronologically in [Table 1]. Additionally, I take into consideration papers that represent a marked methodology in classifying unstructured data entries. Reviewing the literature, a general methodology emerges that comprises several steps: preprocessing and data cleaning, data normalization and tokenization, data annotation, and data classification and analysis. The details of these steps depend on three factors: first, the source of the data under consideration; second, how noisy the data are; and third, the aim of the classification. Classification could be done using different algorithms. A few of the notable algorithms used are SVM,[11],[12],[13] NB,[14],[15] and Maximum Entropy (MaxEnt).[14] The SVM algorithm was first used to classify reviews by Pang et al.,[16] who used SVM with supervised learning and the Bag-of-Words lexicon method for text classification, which proved its superiority over other algorithms like NB.[3],[16],[17] Pang et al.[16] used a scoring system in combination with the SVM classifier. That is, each word of a review is classified as 1, −1, or 0, that is, positive, negative, or neutral, respectively. Then, the sum of the words' scores is calculated. One of the most-cited articles about SA, by Sindhwani and Melville, used both a sentiment lexicon and a sentiment pattern database.[17] The classification was applied to online product-review articles and to more general documents including general web pages and news articles.
Moreover, the sentiment analyzer was developed using NLP techniques to extract topic-specific features and the key sentiment from each sentiment-bearing phrase, and to make topic and feature sentiment associations.[17] Another interesting study, by Yi et al.,[18] used a framework of adjectives, verbs, and adverbs to intensify the weight of sentiments in a sentence. For example, the sentence “the movie was good” is positive, but “the movie was very good” is extra positive. Asur and Huberman[19] studied the microblogging application Twitter specifically, analyzing sentiments with the aim of predicting a new movie's revenues. They were trying to answer this question specifically: “Using the tweets referring to movies before their release, can we accurately predict the box-office revenue generated by the movie in its opening weekend?” The classifier was built using the LingPipe linguistic analysis package,[20] in particular, the DynamicLMClassifier. The study achieved higher accuracy in predicting revenues than the Hollywood Stock Exchange.
Table 1: Related work highlights


Shi and Li[12] performed sentiment analysis on 4000 hotel reviews, half of which were positive and the other half negative, using SVM. They also segmented the documents using the Chinese lexical analysis system ICTCLAS (Institute of Computing Technology, Chinese Lexical Analysis System).[21] The main focus was on unigram features, with 12745 unigrams across all documents. The evaluation was measured using recall, precision, and F-score for two approaches: frequency and term frequency-inverse document frequency. The results are comparable, 86% and 87% for the former and the latter, respectively, with a slight improvement for the latter. Colbaugh and Glass[22] proposed a sentiment orientation (SO) algorithm to classify a number of documents n utilizing a labeled subset n1. The algorithm was studied using the publicly available dataset from the Cornell NLP Group[23] with 2000 reviews that are equally divided into positive and negative. Furthermore, they built a lexical vector of 1400 sentimental domain-independent words from Ramakrishnan et al.[24] and compared the proposed SO algorithm to three other schemes: lexicon only, NB classifier, and RLS classifier. The results show that the proposed algorithm's accuracy outperforms the other schemes for all dataset sizes, reaching 1000 data entries. Zheng and Ye[11] used SVM to classify Chinese hotel reviews extracted from Ctrip. The results were then compared to the results of English SA of reviews on a different dataset carried out by one of the authors. Although the comparison carries questionable value, due to the difference between the compared languages and hence the classification features, the results of the proposed classifier are quite high, 94.87% and 91.15% for recall and accuracy, respectively. De Albornoz et al.[13] analyzed 60 hotel reviews. Each review contained hotel information, a numerical score from 0 to 10 for five different aspects, and a textual comment.
The authors noticed that the polarity of the review score had no relation to the polarity of the reviewer's comment, as people tend to be tactful when writing comments. The study was carried out using three Weka classifiers, a logistic regression model, SVM, and a functional tree, to classify data into three classes: good, fair, and poor. Using movie reviews to predict sales, Yu et al.[25] developed two algorithms to model the prediction. The first is the sentiment analyzer, Sentiment Probabilistic Latent Semantic Analysis (PLSA), based on the traditional PLSA algorithm.[26] The second is a product sales prediction model, the Autoregressive Sentiment Aware model, based on the autoregressive model presented by Enders.[27] In the study, 45,046 blog entries concerning 30 movies were collected using Apache Lucene. Furthermore, the daily gross revenues for those movies were collected from IMDB.

To measure the prediction accuracy in a fitted time series value, mean absolute percentage error (MAPE)[28] was used. The study shows that using SA for prediction gives a better indicator than volume alone.
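For reference, MAPE is commonly defined as below. The symbols are assumptions introduced here for illustration and are not taken from the cited study: y_t is the actual value at time t, \hat{y}_t is the predicted value, and n is the number of predictions. Some authors omit the factor of 100%.

```latex
\[
\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right|
\]
```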

Sixto et al.[14] aimed at classifying hotel reviews using the MaxEnt Model and the NB Model. Feature extraction had three components: tokenizing the text, Stanford lemmatization, and a Part-of-Speech (POS) tagger.[29] Lemmas, their POS tags, and their raw frequency are used as classification features. Words are annotated using Q-WordNet, which is a lexical resource based on WordNet, to find words' polarities. Classifying 1000 reviews, the F-score with the features is comparable to that of the basic classifier without them, scoring 0.82 and 0.79, respectively.

Dalal and Zaveri[30] proposed a feature-based sentiment analyzer for product reviews implementing fuzzy linguistic hedges, taking into account the product features, descriptors, and modifying hedges. Hedges, or “intensifiers,”[3],[31],[32] are words like “very,” “highly,” and “extremely.” Specifying products' features is manual work that needs to be done for each product beforehand; thus, the study considered only four products: tablets, e-book readers, smartphones, and laptops of different brands. The reviews were classified into one of five classes: very positive, positive, neutral, negative, and very negative. Compared with two other approaches, the valence points adjustment approach[33] and Vo and Ock's fuzzy adjustment approach,[34] the proposed approach achieved higher accuracy.

Rajput et al.[35] used a sentiment dictionary, the MPQA corpus,[36] containing 8221 records, where each record consists of six features that describe the word in the record. Students' reviews are analyzed using this corpus. The sentiment of a word is calculated by multiplying its sentiment by the frequency of the word, and the overall sentiment score of a review is the sum of the sentiments of each word in that review. The analysis is done using the KNIME open source data analytics platform.[37] The proposed approach achieved an accuracy of 91.2%. The paper suggests that the SA of reviews gives more insight than a Likert-based score, in which the student selects from five predefined options.
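The frequency-weighted scoring described above can be sketched as follows. This is a minimal illustration, not the cited authors' implementation: the function name and the toy polarity dictionary are hypothetical, and the frequency is assumed to be the word's count within the review being scored.

```python
from collections import Counter


def frequency_weighted_score(tokens, polarity):
    """Sum of (word polarity x word frequency) over the distinct words of a review.

    `polarity` is a hypothetical word -> sentiment mapping; words that are
    missing from it contribute nothing to the score.
    """
    counts = Counter(tokens)
    return sum(polarity.get(word, 0) * freq for word, freq in counts.items())


# Toy dictionary for illustration only; these are not MPQA entries.
toy_polarity = {"good": 1, "boring": -1}
example_score = frequency_weighted_score(["good", "good", "boring", "lecture"], toy_polarity)
```

Here "good" contributes 1 × 2 and "boring" contributes −1 × 1, so the review receives an overall score of 1.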

Paredes-Valverde et al.[38] implemented a Spanish sentiment analyzer to improve products and services using a deep learning approach, specifically a Convolutional Neural Network (CNN) and word2Vec[39],[40] classification model with Tensorflow. More than 130k Spanish tweets were collected, of which only 50k positive and 50k negative tweets were used to create the classifier, with 80% of the tweets used for training and the rest for testing. To evaluate the proposed approach, precision, recall, and F-measure metrics were used. The study reports that the CNN algorithm outperformed SVM and NB by around 5%, although without a clear justification for the gain.

Martins et al.[15] analyzed hotel reviews in Portuguese available on TripAdvisor, a website for reviews of tourist attractions. The dataset contained 69,075 reviews, each comprising a 10-word review title, a review with an average of 60 words, and a star rating. For the sake of simplicity, the rating was mapped into three classes: positive, neutral, and negative. The corpus was divided into 69% for training and the rest for testing. The normalization process comprised lemmatization, polarity inversion for negation, and creating vibe tables, that is, word frequency in each class. For classification, Naïve Bayes with Laplace smoothing was used. The F-measure, 87%, was comparable to other related work.

Methodology

As explained in the previous sections, classification of reviews could be done using SA with supervised classification or unsupervised learning using a lexicon. In this research, we implement a hybrid classification using “Bag-of-Words,” taking advantage of both techniques by using SVM classification and a lexicon.[41] The problem could be described as follows. Given a set of reviews R and a set of classes PX = {positive, neutral, negative}, the aim is to map each review to its appropriate class, R → PX, that describes the review polarity. The process of analyzing the Feed-Finder reviews involved different stages, including data extraction, data annotation, data normalization, classification, and results analysis, as in [Figure 1]. Moreover, in this research, two classifiers are trained using two different lexicons, LSAR and AFINN. The LSAR lexicon is created manually by observing the top sentimental terms used in location reviews of several location-based applications such as Foursquare, Yelp, and Google Maps. Incorporating the Bag-of-Words method,[7] each word of the review is regarded as a feature. The overall sentiment of a review is decided based on the sum of the numerical values of each word, where a positive word is 1, a negative word is −1, and any other word is 0. If the sum is >0, the review is regarded as positive; if it is <0, the review is negative; and if it is equal to 0, the review is neutral.
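The sum-and-threshold rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's R implementation: the function name is hypothetical, and the toy lexicon is made up for the example and does not reproduce LSAR.

```python
import re

# Hypothetical miniature lexicon for illustration; not the actual LSAR entries.
TOY_LEXICON = {
    "welcoming": 1,
    "clean": 1,
    "exceptional": 1,
    "expensive": -1,
    "dirty": -1,
}


def review_polarity(review, lexicon):
    """Map a review to positive, negative, or neutral by summing word scores.

    Each word scores +1 or -1 if it is in the lexicon and 0 otherwise.
    """
    words = re.findall(r"[a-z]+", review.lower())
    total = sum(lexicon.get(word, 0) for word in words)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"
```

With this toy lexicon, a review such as "Clean but expensive" sums to 0 and is therefore labeled neutral, which shows how opposing terms cancel under this rule.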
Figure 1: Procedures for reviews' sentiment analysis


The lexicons are incorporated in the learning process as an annotated corpus to train the classifier. The LSAR lexicon was created by the present author and contains 768 keywords, labeled as positive, negative, or neutral, that are likely to appear in reviews of services. Those words were inspired by a subset of the available reviews. The AFINN lexicon contains 2477 words and phrases[10] with ordinal integer scores between −5 for negative words and 5 for positive words, giving ten different classes. In both cases, the overall polarity of a review is calculated as the average weight of the sum of each word's score in the review.
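One possible reading of the averaging step is sketched below. It assumes that the average is taken over the words of a review that appear in the lexicon, which the text does not state explicitly. The function name is hypothetical, and the toy scores only illustrate the −5 to 5 scale; they are not verified AFINN entries.

```python
def average_polarity(tokens, lexicon):
    """Average the lexicon scores of the tokens that appear in the lexicon.

    Returns 0.0 when no token is found, which is treated here as neutral.
    """
    scores = [lexicon[token] for token in tokens if token in lexicon]
    if not scores:
        return 0.0
    return sum(scores) / len(scores)


# Illustrative scores on a -5..5 scale; assumptions, not actual lexicon values.
TOY_SCORES = {"good": 3, "bad": -3, "outstanding": 5}
```

Under this reading, a review tokenized as ["good", "outstanding", "venue"] averages to 4.0, while a review with no lexicon words averages to 0.0.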

Review analysis started with data extraction, recovering data from the log file of the Feed-Finder application. Then each review was annotated with a label: positive, negative, or neutral. After that, reviews were normalized: all emoticons were replaced with words, for example, “:)” or “:-)” were replaced with the word “Happy.” In addition, all numbers and punctuation marks were deleted. Finally, reviews were classified using the SVM classifier programmed in R.
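The normalization step can be sketched as follows. This is a hedged illustration rather than the study's R code: the function name is hypothetical, and the emoticon table contains only the two mappings mentioned in the text.

```python
import re

# Only the two emoticons named in the text; a fuller table is an open assumption.
EMOTICONS = {":)": "Happy", ":-)": "Happy"}


def normalize_review(text):
    """Replace emoticons with words, then delete numbers and punctuation."""
    # Emoticons must be replaced before punctuation is stripped.
    for emoticon, word in EMOTICONS.items():
        text = text.replace(emoticon, " " + word + " ")
    text = re.sub(r"\d+", " ", text)      # delete numbers
    text = re.sub(r"[^\w\s]", " ", text)  # delete punctuation marks
    return " ".join(text.split())         # collapse repeated whitespace
```

For example, normalize_review("Great cafe :-) open 24/7!") yields "Great cafe Happy open".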

The resultant data contain a classification of each review as positive, negative, or neutral. Positive reviews are those with positive comments and connotations describing the venue, as opposed to negative reviews. Neutral reviews are reviews with no sentiment or reviews outside of the scope of Feed-Finder data. The latter includes offensive reviews and “null” and “test” reviews. Removing offensive reviews that include swear words and have no substance is necessary for classification accuracy.[25]

The classification phase mentioned earlier is composed of two parts: the training phase and the testing phase. The training phase trained two classifiers using the mentioned lexicons. The testing phase used the two trained classifiers to classify the list of unlabeled reviews. The results of the classifiers are then compared to the labeled reviews. The results are analyzed in the next section.

Results and Discussion

The number of reviews analyzed was 1951, classified using the SVM classifier and trained using two lexicons: LSAR and AFINN. A description of the two lexicons is given in the previous section. The number of positive, negative, and neutral reviews was 1639, 138, and 174 reviews, respectively.

To measure the performance of both classifiers, LSAR-based and AFINN-based, the confusion matrix of both techniques is depicted in [Table 2]. True positive (TP) and true negative represent the correctly classified positive and negative reviews, respectively, while false positive and false negative represent reviews wrongly classified as positive and negative, respectively. A third class, neutral, is also considered in the classification, and true neutral (TE) represents the correctly classified neutral reviews. With the LSAR-based classifier, 1834 reviews are classified correctly, [Table 3], an accuracy of 0.94, while with the AFINN-based classifier, 1412 reviews are classified correctly, [Table 3], an accuracy of 0.72. One reason for the superiority of the LSAR-based classifier is that, although the AFINN lexicon has 2477 entries and 10 degrees of sentiment, they represent general terms that describe positive and negative emotions and states but are not necessarily used by consumers. For example, a word like “agog” in the AFINN lexicon expresses positive tendencies but is almost never used to express sentimental reactions to a service. Moreover, some of the keywords that are used frequently to describe services or feelings toward those services are missing from the AFINN lexicon, for example, “exceptional,” “welcoming,” “expensive,” and “suitable.” LSAR is specifically created to analyze consumer reviews of services, incorporating 768 weighted terms. As the classes are of different sizes, the F1 score gives a better indication of classifier performance than accuracy, as it is the harmonic mean of precision and recall. The value of the F1 score is calculated based on the precision (P) and recall (R) metrics, as shown in equation 1.
Precision, equation 2, calculates the ratio of correctly predicted positive reviews to the total predicted positive reviews, answering the question: among all reviews that are classified as positive, how many were actually positive?
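The accuracy figures quoted above can be checked from the reported counts. The short sketch below assumes that accuracy is the share of correctly classified reviews out of all 1951 reviews and that the error rate is its complement; the function and variable names are introduced here for illustration.

```python
def accuracy(correct, total):
    """Share of reviews whose predicted class matches the annotated class."""
    return correct / total


TOTAL_REVIEWS = 1951  # 1639 positive + 138 negative + 174 neutral

lsar_accuracy = accuracy(1834, TOTAL_REVIEWS)
afinn_accuracy = accuracy(1412, TOTAL_REVIEWS)

# Error rate taken as the complement of accuracy.
lsar_error = 1 - lsar_accuracy
afinn_error = 1 - afinn_accuracy

print(round(lsar_accuracy, 2), round(afinn_accuracy, 2))
```

Rounded to two decimals, the two values are 0.94 and 0.72, matching the reported accuracies.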
Table 2: Confusion matrix for classifier using LSAR lexicon and AFINN lexicon (n=1951)

Table 3: Accuracy and error rate for LSAR-based and AFINN-based classifiers


Recall, or sensitivity, equation 3, calculates the ratio of correctly predicted positive reviews to all reviews in the actual positive class, answering the question: of all the reviews that are positive, how many were labeled as positive? Calculating precision and recall, the F1 measure for both classifiers is depicted in [Table 4]. As the results show, the F1 score of the classification based on LSAR is better than that of the AFINN-based classification, scoring 0.96 and 0.82, respectively.
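In their standard form, which equations 1 to 3 are assumed to follow, the three metrics are written as below, where TP, FP, and FN denote the numbers of true positive, false positive, and false negative reviews for the class under consideration.

```latex
\begin{align}
F_1 &= \frac{2 \cdot P \cdot R}{P + R} \tag{1} \\
P &= \frac{TP}{TP + FP} \tag{2} \\
R &= \frac{TP}{TP + FN} \tag{3}
\end{align}
```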
Table 4: F1 score for LSAR-based and AFINN-based classifiers


Conclusions

This article presented the analysis of a review dataset from a social application, Feed-Finder, by modeling a classifier that automatically generates a numerical value, or polarity, for each venue based on the reviews of that venue. The classifier incorporates a new lexicon for SA of consumer reviews, LSAR. The work is compared against a classifier based on an already available lexicon, AFINN. Using SVM implemented in R, results show that the classifier based on the LSAR lexicon achieved an accuracy of 0.94, compared with 0.72 for the AFINN-based classifier. This is attributed to the domain-specific LSAR lexicon, which incorporates consumer review terms as opposed to general sentimental terms. As a continuation of this work, a key aim is to implement a visualization system that plots review polarity on a map, showing location pleasantness in a fraction of the time needed to read the textual output.


This research is supported by a grant provided by King Abdulaziz City for Science and Technology (KACST), Riyadh, Saudi Arabia, with letter number 3213/15. The research was hosted by Professor Patrick Olivier, head of the Open Lab in the School of Computing Science at the University of Newcastle, United Kingdom.

Conflicts of interest

There are no conflicts of interest.

References

1. Cambria E. Affective computing and sentiment analysis. IEEE Intell Syst 2016;31:102-7.
2. Han J, Kamber M, Pei J. Data Mining. Waltham: Morgan Kaufmann, 2012.
3. Pang B, Lee L. Opinion mining and sentiment analysis. Hanover, MA: Now Publishers, 2008.
4. Chen H, Zimbra D. AI and opinion mining. IEEE Intell Syst 2010;25:74-80.
5. Leskovec J, Rajaraman A, Ullman JD. Mining of Massive Datasets. 2nd ed. Cambridge: Cambridge University Press; 2014.
6. L'Heureux A, Grolinger K, ElYamany HF, Capretz MA. Machine learning with big data: Challenges and approaches. IEEE Access 2017;5:7776-97.
7. Augustyniak L, Kajdanowicz T, Szymański P, Tuligłowicz W, Kazienko P, Alhajj R, et al. Simpler is Better? Lexicon-Based Ensemble Sentiment Classification Beats Supervised Methods. In: 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2014); Beijing, 2014. p. 924-9.
8. Hamouda A, Marei M, Rohaim M. Building machine learning based senti-word lexicon for sentiment analysis. J Adv Inf Technol 2011;2:199-203.
9. Alhumoud S, Albuhairi T, Alohaideb W. Hybrid sentiment analyser for Arabic tweets using R, 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K), Lisbon, 2015, p. 417-24.
10. Nielsen F. A new ANEW: Evaluation of a word list for Sentiment Analysis in Microblogs. Proceedings of the ESWC2011 Workshop on ‘Making Sense of Microposts’: Big things come in small packages, Heraklion, Crete, 2011;93-8.
11. Zheng W, Ye Q. Sentiment Classification of Chinese Traveler Reviews by Support Vector Machine Algorithm. In: 2009 Third International Symposium on Intelligent Information Technology Application. Vol. 3. Shanghai: 2009. p. 335-8.
12. Shi H, Li X. A sentiment analysis model for hotel reviews based on supervised learning. In: 2011 International Conference on Machine Learning and Cybernetics. Vol. 3. Guilin: 2011. p. 950-4.
13. de Albornoz JC, Plaza L, Gervás P, Díaz A. A Joint Model of Feature Mining and Sentiment Analysis for Product Review Rating. In: Clough P, Foley C, Gurrin C, Jones GJ, Kraaij W, Lee H, et al., editors. Advances in Information Retrieval: 33rd European Conference on IR Research, ECIR, Proceedings. Dublin, Ireland, Berlin, Heidelberg: Springer Berlin Heidelberg; 2011. p. 55-66.
14. Sixto J, Almeida A, López-de-Ipiña D. Analysing Customers Sentiments: An Approach to Opinion Mining and Classification of Online Hotel Reviews. In: Métais E, Meziane F, Saraee M, Sugumaran V, Vadera S, editors. Proceedings, Natural Language Processing and Information Systems: 18th International Conference on Applications of Natural Language to Information Systems, NLDB. Salford, UK, Berlin, Heidelberg: Springer Berlin Heidelberg, 2013. p. 359-62.
15. Martins GS, de Paiva Oliveira A, Moreira A. Sentiment Analysis Applied to Hotels Evaluation. 17th International Conference on Computational Science and Its Applications – ICCSA 2017. Trieste, Italy; 2017.
16. Pang B, Lee L, Vaithyanathan S. Thumbs Up? Sentiment Classification Using Machine Learning Techniques. In: Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing. Vol. 10. Stroudsburg, PA, USA; 2002. p. 79-86.
17. Sindhwani V, Melville P. Document-Word Co-Regularization for Semi-Supervised Sentiment Analysis. In: 2008 Eighth IEEE International Conference on Data Mining. Pisa; 2008. p. 1025-30.
18. Yi J, Nasukawa T, Bunescu R, Niblack W. Sentiment Analyzer: Extracting Sentiments About a Given Topic Using Natural Language Processing Techniques. In: Third IEEE International Conference on Data Mining. Melbourne, FL, USA; 2003. p. 427-34.
19. Asur S, Huberman BA. Predicting the Future with Social Media. In: 2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology. Vol. 1. Toronto, ON; 2010. p. 492-9.
20. LingPipe, “LingPipe.” Available from: [Last accessed on 2017 Dec 11].
21. ICTCLAS. “Chinese Lexical Analysis System ICTCLAS.” Available from: [Last accessed on 2017 Dec 04].
22. Colbaugh R, Glass K. Estimating Sentiment Orientation in Social Media for Intelligence Monitoring and Analysis. In: 2010 IEEE International Conference on Intelligence and Security Informatics; 2010. p. 135-7.
23. Data, Movie Review Data. Available from: [Last accessed on 2017 Dec 04].
24. Ramakrishnan G, Jadhav A, Joshi A, Chakrabarti S, Bhattacharyya P. Question Answering via Bayesian Inference on Lexical Relations. In: Proceedings of the ACL 2003 Workshop on Multilingual Summarization and Question Answering. Vol. 12. Stroudsburg, PA, USA: 2003. p. 1-10.
25. Yu X, Liu Y, Huang X, An A. Mining online reviews for predicting sales performance: A case study in the movie domain. IEEE Trans Knowl Data Eng 2012;24:720-34.
26. Hofmann T. Probabilistic Latent Semantic Analysis. In: Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence. San Francisco, CA, USA: 1999. p. 289-96.
27. Enders W. Applied Econometric Time Series, 2nd ed. Hoboken, New Jersey, USA, Wiley, 2004.
28. Jank W, Shmueli G, Wang S. Dynamic, Real-time Forecasting of Online Auctions via Functional Models. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, NY, USA: 2006. p. 580-5.
29. About WordNet. Available from: [Last accessed on 2017 Jan 03].
30. Dalal MK, Zaveri MA. Opinion mining from online user reviews using fuzzy linguistic hedges. Appl Comp Intell Soft Comput 2014;2014:2-2.
31. El-Din DM. Enhancement Bag-of-Words Model for Solving the Challenges of Sentiment Analysis. International Journal of Advanced Computer Science and Applications, IJACSA, 2016;7:244-52.
32. Khoo CS, Johnkhan SB. Lexicon-based sentiment analysis: Comparative evaluation of six sentiment lexicons. J Inf Sci 2017;44:491-511.
33. Kennedy A, Inkpen D. Sentiment classification of movie reviews using contextual valence shifters. Comput Intell 2006;22:110-25.
34. Vo AD, Ock CY. Sentiment Classification: A Combination of PMI, Senti Word Net and Fuzzy Function. In: Computational Collective Intelligence. Technologies and Applications, Berlin, Heidelberg: 2012. p. 373-82.
35. Rajput Q, Haider S, Ghani S. Lexicon-based sentiment analysis of teachers' evaluation. Appl Comp Intell Soft Comput 2016;2016:1-12.
36. Wiebe J, Wilson T, Cardie C. Annotating expressions of opinions and emotions in language. Lang Resour Eval 2005;39:165-210.
37. KNIME Analytics Platform KNIME. Available from: Y. [Last accessed on 2018 Feb 10].
38. Paredes-Valverde M, Colomo-Palacios R, Salas-Zárate M, Valencia-García R. Sentiment analysis in Spanish for improvement of products and services: A deep learning approach. Sci Program 2017;2017:1-6.
39. Word2vec – Google Code Archive. Available from: [Last accessed on 2018 Feb 12].
40. Mikolov T, Sutskever I, Chen K, Corrado G, Dean J. Distributed representations of words and phrases and their compositionality. In: Proceedings of the 26th International Conference on Neural Information Processing Systems. Vol. 2. USA: 2013. p. 3111-9.
41. Alhumoud S, Albuhairi T, Altuwaijri M. Arabic sentiment analysis using WEKA a hybrid learning approach. In: Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2015). Vol. 1. Lisbon, Portugal; 2015.

