From the introductory post we know that the Naive Bayes Classifier is based on the bag-of-words model. With the bag-of-words model we check whether each word of the text document appears in a positive-words list or a negative-words list. If the word appears in the positive-words list, the total score of the text is increased by 1; if it appears in the negative-words list, the score is decreased by 1. … More Sentiment Analysis with the Naive Bayes Classifier
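The scoring rule described in this excerpt can be sketched in a few lines of Python. This is only a minimal illustration, not the code from the post itself: the two word lists, the regular-expression tokenizer and the function name `score_text` are assumptions made for the example.

```python
import re

# Hypothetical, tiny word lists; a real analysis would use a full sentiment lexicon.
POSITIVE_WORDS = {"good", "great", "excellent", "love"}
NEGATIVE_WORDS = {"bad", "boring", "terrible", "hate"}


def score_text(text):
    """Return a word-list sentiment score: +1 per positive word, -1 per negative word."""
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for token in tokens:
        if token in POSITIVE_WORDS:
            score += 1
        elif token in NEGATIVE_WORDS:
            score -= 1
    return score
```

A text with a score above zero would then be read as positive and one below zero as negative; words that appear in neither list leave the score unchanged.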
In the previous post we learned how to do basic Sentiment Analysis with the bag-of-words technique. Here is a short summary: To keep track of the number of occurrences of each word, we tokenize the text and add each word to a single list. Then by using a Counter element we can keep track … More Sentiment Analysis with bag-of-words (part 2)
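The tokenize-and-count step summarized here might look roughly like the sketch below. It assumes Python's standard `collections.Counter` is the Counter the excerpt refers to; the tokenizer and the helper name `count_words` are hypothetical choices for illustration.

```python
import re
from collections import Counter


def count_words(documents):
    """Tokenize every document, collect all words in one list, and count them."""
    all_words = []
    for doc in documents:
        # Lowercase and split into simple word tokens (an assumed tokenizer).
        all_words.extend(re.findall(r"[a-z']+", doc.lower()))
    # Counter maps each distinct word to its number of occurrences.
    return Counter(all_words)


counts = count_words(["The book is good", "the plot is thin"])
```

Looking up a word that never occurred, such as `counts["awful"]`, returns 0 instead of raising an error, which is convenient when comparing word frequencies across texts.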
Introduction: In my previous post I explained the theory behind three of the most popular Text Classification methods (Naive Bayes, Maximum Entropy and Support Vector Machines) and said that I would use these classifiers for the automatic classification of the subjectivity of Amazon.com book reviews. The purpose is to get a better understanding of … More Sentiment Analysis with bag-of-words
We all know that visualizing data is an important part of Data Science. If it is done wrong, it can be boring and fail to grab the attention of the readers, or even worse, convey the wrong message. If it is done correctly, it can intrigue even the most indifferent reader (some people can even turn Data Visualizations into … More Visualizing Data
Introduction: Natural Language Processing (NLP) is a vast area of Computer Science that is concerned with the interaction between Computers and Human Language. Within NLP many tasks are – or can be reformulated as – classification tasks. In classification tasks we are trying to produce a classification function which can give the correlation between a … More Text Classification and Sentiment Analysis
For most people, the most interesting part of the previous post will be the final results. But for those who would like to try something similar, or who are curious about the technical part, I will explain the methods and techniques I used (mostly webscraping with Beautifulsoup4) to collect a few million … More Collecting Data from Twitter
As promised, here is the post-election analysis. Although my predicted voting percentage for AKP was much closer to the actual result than that of most traditional polls, my predicted value for MHP was far off, making the overall prediction error bigger than that of most conventional polls (see table below). AKP CHP … More Post-Election Analysis: Twitter vs traditional polls
Short summary: According to an analysis of Twitter data, the upcoming elections of 1 November will result in a victory for AKP. AKP: 47.13%, CHP: 22.35%, MHP: 18.84%, HDP: 11.68%. I will keep updating these numbers as more Twitter data is collected. Not-so-short summary: Since the demographics of the Twitter users and the electorate … More Four more years of AKP?
Social Media and Automated Sentiment Analysis: Social Media monitoring and analysis has become increasingly popular since the advent of Web 2.0 because it provides an easy and effective way to directly measure the effect of a campaign. This can be done with KPIs such as the number of followers, likes, shares, comments and comments per post. Besides these hard and easy-to-measure KPIs … More Predicting the Turkish General Elections with Twitter Data
“I have no doubt that in reality the future will be vastly more surprising than anything I can imagine. Now my own suspicion is that the Universe is not only stranger than we imagine, but queerer than we can imagine.” Like most geeks, I am interested in future technologies and frequently read up on them. I find … More Hello world!