Bloody Valentine’s Day: Basic Sentiment Analysis

A harsh day, indeed! It all began on February 14, 1929, when the Saint Valentine’s Day Massacre took place in Chicago.

Seven mob associates were murdered as part of a Prohibition-era conflict between two powerful criminal gangs in Chicago: the South Side Italian gang led by Al Capone and the North Side Irish gang led by Bugs Moran. Former members of the Egan’s Rats gang were also suspected of having played a significant role in the incident, assisting Capone (source: Wikipedia).

Perhaps we should update the definition: the famous South African runner Pistorius allegedly shot his own girlfriend last night. What really happened is still unclear.

And, yes, today is Valentine’s day, before I forget. It is supposed to be a day devoted to love and wine, not to death. But everybody knows that love and death get along quite well. Let alone wine…

So, what does everybody think about today? Actually, we have a means to estimate the global “sentiment” by analyzing the tweets that are posted every minute. This is in fact called “sentiment analysis”, a technique that has become very popular recently. To cut a long story short, we want to tag each tweet as “positive” or “negative” by interpreting its content, literally!

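We are lucky: we don’t have to reinvent the wheel. We can set up a quite basic but still effective analyzer with a couple of tools:

  • a Python wrapper for the Twitter API (the twitter module imported in the code below)
  • Basic_sentiment_analysis, a dictionary-based sentiment tagger (the sentiment_analyzer module in the code below)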

The idea is quite simple: every 30 seconds we get the first 100 tweets with a certain hashtag. Then we analyze them, assigning a sentiment score to each, and plot the score as a function of time. The numerical value is simply:

  • Negative score = -1
  • Positive score = +1
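
The scoring rule above can be sketched in a few lines. This is only an illustration, not the code of Basic_sentiment_analysis: the word lists are hypothetical stand-ins for dicts/positive.yml and dicts/negative.yml, the helper names tweet_score and batch_score are made up for this example, and I assume that every positive word counts +1 and every negative word -1. The inc/dec/inv dictionaries loaded by the real script are ignored here.

```python
import re

# Hypothetical word lists, for illustration only; the real dictionaries
# live in dicts/positive.yml and dicts/negative.yml.
POSITIVE = {"love", "happy", "sweet", "wine"}
NEGATIVE = {"death", "shot", "sad", "massacre"}


def tweet_score(text):
    """Sum +1 for every positive word and -1 for every negative word."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)


def batch_score(tweets):
    """Total score of a batch of tweets, skipping retweets ("RT @...")."""
    return sum(tweet_score(t) for t in tweets if not t.startswith("RT @"))
```

With these toy lists, tweet_score("Love and wine, not death") gives +1: two positive words and one negative word. Note that the "not" is simply ignored, which is exactly the kind of limitation a dictionary-only approach has.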

I added some words to the dictionaries shipped with Basic_sentiment_analysis. I decided not to modify the code; further improvements are possible but beyond the scope of this post.

At the same time, I analyzed the #pistorius and #valentine hashtags, fetching 100 tweets every 30 seconds. Now take a look at the results (over a few minutes only, though): which is which? Name it!


This is the code I used:

#!/usr/bin/env python
import time
import sys
import twitter
from sentiment_analyzer import *

# Twitter API credentials (placeholders)
consumer_key = "xx"
consumer_secret = "xx"
access_token_key = "xx"
access_token_secret = "xx"
api = twitter.Api(consumer_key=consumer_key,
                  consumer_secret=consumer_secret,
                  access_token_key=access_token_key,
                  access_token_secret=access_token_secret)
#print api.VerifyCredentials()
#statuses = api.GetUserTimeline()
#print [ for s in statuses]

# Defaults, overridable from the command line: <hashtag> <max results>
pattern = "#lhc"
res_max = 1000
if len(sys.argv) > 1:
    pattern = "#" + sys.argv[1]
if len(sys.argv) > 2:
    res_max = int(sys.argv[2])

# Tagging pipeline provided by the sentiment analyzer
splitter = Splitter()
postagger = POSTagger()
dicttagger = DictionaryTagger(['dicts/positive.yml', 'dicts/negative.yml',
                               'dicts/inc.yml', 'dicts/dec.yml', 'dicts/inv.yml'])

if __name__ == "__main__":
    while True:
        results = api.GetSearch(pattern, per_page=res_max)
        #print "Found", len(results), "tweets about", pattern
        alltexts = [res.AsDict()['text'] for res in results]
        tot_score = 0
        for tweet in alltexts:
            # Skip retweets
            if tweet.startswith("RT @"):
                continue
            #print ">>>", tweet
            splitted_sentences = splitter.split(tweet)
            pos_tagged_sentences = postagger.pos_tag(splitted_sentences)
            dict_tagged_sentences = dicttagger.tag(pos_tagged_sentences)
            score = sentiment_score(dict_tagged_sentences)
            tot_score += score
        #print "Total sentiment:", tot_score
        print(pattern + ":" + str(tot_score))
        # Wait 30 seconds before the next query, as described in the text
        time.sleep(30)
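
The script prints one "hashtag:score" line per query, so getting to a plot of the score as a function of time only takes a small parsing step. Here is a rough sketch of that step; parse_scores is a hypothetical helper, not part of the original setup, and it assumes that consecutive lines for the same hashtag are 30 seconds apart, as described in the text.

```python
def parse_scores(lines, interval=30):
    """Turn 'hashtag:score' lines into {hashtag: [(seconds, score), ...]}.

    Assumes consecutive lines for the same hashtag are `interval`
    seconds apart (an assumption based on the 30-second polling).
    """
    series = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the last colon, so the score is always the final field
        tag, _, value = line.rpartition(":")
        points = series.setdefault(tag, [])
        points.append((len(points) * interval, int(value)))
    return series
```

The output of several runs (one per hashtag) can be concatenated into one file and fed to this function line by line; the resulting lists of (seconds, score) pairs can then be handed to any plotting tool.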
