Twitter Streaming with Spring XD

At EMC World, Chris Harrold of EMC’s Big Data division demoed Pivotal’s Spring XD. Spring XD (eXtreme Data) is a runtime environment for big data that is part of the Spring IO platform. Chris took the class through a quick tutorial during the 60-minute presentation. The tutorial used Spring XD to stream Twitter data into a prebuilt dashboard that sorted tweets live by native language and hashtag. After the first 10 minutes, I was astonished. I HAD to try this. At the end of the session, he mentioned that all of this is open source and easily deployed on your own laptop by following an easy guide. I knew what I would be doing when I got back to Nashville.

After my trip to Vegas, I dug through Spring XD’s documentation and found the tutorial on their site. Unfortunately, it stopped at piping the Twitter stream out to a file. Big whoop. Don’t get me wrong, the technology behind that is incredible! But I feel like we’re only telling half the story with the data. The huge win comes from visualizing the data. I dug up another tutorial on how to visualize the Twitter stream data and followed that walkthrough. After some tweaking and adjusting to get it up and running on my system, I was able to visualize live Twitter data on my laptop. Grinning from ear to ear, I screen-capped my visualization during the day.
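To give a rough idea of what "sorting tweets by language and hashtag" means once the stream has landed in a file, here is a minimal Python sketch. It is not what Spring XD or the dashboard does internally. It assumes the file holds one tweet per line as JSON, and that each tweet carries a `lang` field and an `entities.hashtags` list with a `text` key, as in Twitter's tweet JSON. The function name `count_languages_and_hashtags` is made up for illustration.

```python
import json
from collections import Counter


def count_languages_and_hashtags(lines):
    """Tally tweets per language code and per hashtag.

    `lines` is any iterable of strings, assumed to hold one JSON tweet each.
    """
    languages = Counter()
    hashtags = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            tweet = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or malformed lines
        if not isinstance(tweet, dict):
            continue
        lang = tweet.get("lang")
        if lang:
            languages[lang] += 1
        # Assumed layout: {"entities": {"hashtags": [{"text": "..."}]}}
        for tag in (tweet.get("entities") or {}).get("hashtags") or []:
            text = tag.get("text")
            if text:
                hashtags[text.lower()] += 1  # treat #BigData and #bigdata alike
    return languages, hashtags
```

To try it on a real capture, open whatever file your stream writes to and pass the file object straight in, for example `count_languages_and_hashtags(open(path, encoding="utf-8"))`, then call `.most_common(10)` on either counter to see the top entries.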

While I’m beyond impressed with the platform, I want more. I’ve made a short list of things that I want out of this and will be playing with in the near future.

  • NFS share via Isilon – I want to run this Twitter stream for days. Let’s get these files big; huge even. I want to see what this visualization will look like after a week or a month of streaming. I think we can ask a lot of questions with this amount of data: How long does a hashtag trend for? Where do hashtags trend geographically?
  • Sentiment analysis with Hadoop – With Hadoop on Isilon, we will be able to run sentiment analysis on our Twitter stream and see which hashtags correlate with positivity. We can see where positive tweets are coming from geographically. We can see if negative tweets correlate with trending events in America and around the world.
  • Geographic dashboard – I want to take the geographic data of these tweets and map it out on a Google Maps interface. Let’s see tweets emerge across a map on your screen.

I will also be posting updates on my progress with these ideas. Please explore Spring XD on your own and let me know if you have any questions or comments.
