
Outcome


The goals

The intent of this project is to analyze tweets posted throughout Pittsburgh to get a sense of where Pittsburgh Steelers fans are most vocal. In theory, the entire city loves the Steelers, but my assumption is that the tweets will reveal areas of the city where fans are less vocal in expressing their love for the team. By heat mapping the tweets and animating the aggregated data over game day, we can use the tweet data to better interpret where and when fans are most passionate about the home team.

Approach and process

A dataset of selected adjectives was scraped from PGH Twitter last week. This data has been refined to tweets that contain the word ‘Steelers’. From here, the data can be used to build a heat map that spans the length of a week (between two separate games) and can then be isolated down to game day to show the pre-game, in-game, and post-game tweet activity.

Dataset

The dataset is roughly a week and a half of scraped Twitter data, filtered to tweets and hashtags containing the word ‘Steelers’. These tweets all carry geocoding information that gives a rough approximation of where they originated. This, in turn, is overlaid on a map of City neighborhoods and County municipalities to understand which parts of the city and greater area tweet about the Steelers.
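As a rough illustration of the filtering step, here is a minimal Python sketch. It assumes each tweet is a dictionary with a `text` field plus `lat` and `lon` fields; those field names and the sample tweets are made up for the example and are not the actual layout of the scraped data.

```python
import re

# Case-insensitive match so "Steelers", "steelers" and "#STEELERS" all count.
STEELERS_PATTERN = re.compile(r"steelers", re.IGNORECASE)


def mentions_steelers(text):
    """Return True if the tweet text or one of its hashtags contains 'Steelers'."""
    return bool(STEELERS_PATTERN.search(text))


def filter_steelers_tweets(tweets):
    """Keep only the tweets whose text mentions the Steelers."""
    return [t for t in tweets if mentions_steelers(t.get("text", ""))]


# Hypothetical sample records, only for demonstration.
sample = [
    {"text": "Go #Steelers!", "lat": 40.4468, "lon": -80.0158},
    {"text": "Traffic on the parkway again", "lat": 40.44, "lon": -79.99},
    {"text": "steelers defense looks sharp", "lat": 40.45, "lon": -80.0},
]

kept = filter_steelers_tweets(sample)
print(len(kept))
```

A plain substring match like this will also pick up retweets and news accounts, so the result is best read as "tweets that mention the team" rather than "tweets by fans".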


Limitations when sorting the data:

Some of the latitude and longitude information was incorrect. These errors ranged from the latitude and longitude being flipped to a series of tweets that skewed the location data because they were recorded as originating from a data metrics firm.
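One way to catch both problems is sketched below in Python. The bounding box is a loose, approximate box around the greater Pittsburgh area that I am assuming for the example, and the `max_repeats` threshold is arbitrary; neither value comes from the project data.

```python
from collections import Counter

# Assumed, approximate bounding box around the greater Pittsburgh area.
LAT_RANGE = (40.0, 41.0)
LON_RANGE = (-80.5, -79.5)


def in_box(lat, lon):
    """Return True if the point falls inside the assumed bounding box."""
    return (LAT_RANGE[0] <= lat <= LAT_RANGE[1]
            and LON_RANGE[0] <= lon <= LON_RANGE[1])


def repair_coordinates(lat, lon):
    """Return a usable (lat, lon) pair, swapping the values if they were flipped.

    Returns None when the point is outside the box in either order.
    """
    if in_box(lat, lon):
        return (lat, lon)
    if in_box(lon, lat):
        return (lon, lat)
    return None


def find_repeated_points(points, max_repeats=50):
    """Return coordinates shared by more than max_repeats tweets.

    A large pile of tweets on one exact point suggests a default location,
    such as a data firm's address, rather than real fan activity.
    """
    counts = Counter(points)
    return {point for point, n in counts.items() if n > max_repeats}


print(repair_coordinates(-80.0, 40.44))
```

Points flagged by `find_repeated_points` still need a manual look, since the stadium itself could legitimately collect a lot of tweets at one spot.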

A City Tweets

The overall mapping of the city


Tweets across the week

The Twitter data described above is shown over the course of a 10-day period in which the Steelers played twice.

Game Day Mapping: Steelers at Chiefs

Here the tweets right before, during, and after the game are measured not only for their mention of the Steelers, but also for the sentiment they express.
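The split into before, during, and after the game can be sketched in Python as follows. The kickoff time, the three-hour game length, and the tweet times are all placeholders I made up for the example, not the real schedule or data, and sentiment scoring is left out.

```python
from collections import Counter
from datetime import datetime, timedelta


def classify_phase(tweet_time, kickoff, game_length=timedelta(hours=3)):
    """Label a tweet as pre-game, during, or post-game relative to kickoff."""
    if tweet_time < kickoff:
        return "pre-game"
    if tweet_time < kickoff + game_length:
        return "during"
    return "post-game"


# Placeholder kickoff and tweet times, only for demonstration.
kickoff = datetime(2000, 1, 1, 13, 0)
tweet_times = [
    datetime(2000, 1, 1, 12, 30),
    datetime(2000, 1, 1, 13, 45),
    datetime(2000, 1, 1, 15, 10),
    datetime(2000, 1, 1, 16, 30),
]

# Count how many tweets fall into each phase of game day.
counts = Counter(classify_phase(t, kickoff) for t in tweet_times)
print(counts)
```

The same labels can then be used to split the heat map into three frames per game.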


Limitations when sorting the data:

An additional issue discovered during the process was that the time-of-day information was off by 4 hours, most likely because Twitter, or the Twitter data collector, records timestamps in GMT (Greenwich Mean Time).
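If the timestamps really are in GMT, the correction is a fixed shift to local time. The Python sketch below assumes that is the case, assumes a `YYYY-MM-DD HH:MM:SS` timestamp format, and uses a made-up timestamp. The 4-hour offset matches Eastern Daylight Time; once daylight saving ends the offset becomes 5 hours, so a time-zone-aware conversion (for example Python's `zoneinfo` with `America/New_York`) would be safer for data that crosses that date.

```python
from datetime import datetime, timedelta, timezone

# Eastern Daylight Time is 4 hours behind GMT/UTC (assumed to apply here).
EDT = timezone(timedelta(hours=-4), name="EDT")


def utc_to_local(timestamp):
    """Parse a GMT/UTC timestamp string and convert it to Eastern Daylight Time."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc).astimezone(EDT)


# Hypothetical timestamp, only for demonstration.
local = utc_to_local("2000-06-15 17:00:00")
print(local.strftime("%Y-%m-%d %H:%M:%S %Z"))
```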

Game Day Mapping: Bengals at Steelers

Here the tweets right before, during, and after the game are measured not only for their mention of the Steelers, but also for the sentiment they express.


Conclusions

Twitter has some limiting factors: it doesn't have as many typical daily users as more popular apps, so some information is lost in this analysis.

Additionally, most of the users captured are outside the city limits. Within the city, there is very little activity beyond the Stadium and downtown, so the rest of Pittsburgh is barely represented.

Moving forward, it would be nice to see this information overlaid with data from other social media services to get the full spectrum of people watching a game while also expressing their opinions of the play-by-play.

A lot of business ideas could spin off from this experiment, especially ones focused on understanding a market or group of people before developing something for them.
