Taste of Pittsburgh

Made by Ankita Patel

Using K-means clustering analysis, this project aims to discover the taste of Pittsburgh through data scraped from Foursquare.

Created: October 30th, 2017

MOTIVATIONS : 

During the past year living in Pittsburgh, I noticed that places like Squirrel Hill, which has long had a predominantly Jewish population, have more Chinese restaurants, and that many Indians prefer eating Mexican food on a routine basis. Similarly, studying at an American university, I see many of my American colleagues eating Indian food more often than I do as an Indian. These variations in people's choices and preferences raised a question in my mind: living in such a diverse city, I wanted to know which type of cuisine dominates Pittsburgh and which one most people prefer.

DATASET USED : 

For this project, I used a dataset scraped from Foursquare. I chose to work with the portion of the data categorized specifically as food venues in Pittsburgh.

PROCESS :

To explore the idea of finding the taste of Pittsburgh, I chose the K-means clustering method, as a cartographic representation of a large number of data points would best answer my question.
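
The write-up does not say which tool performed the clustering, so the following is only a minimal sketch of the K-means idea in plain Python. The function name `kmeans`, the choice of two clusters, and the sample (check-in count, tip count) pairs are all assumptions made for illustration; none of the numbers come from the Foursquare data.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain K-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the input points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

# Made-up (check-in count, tip count) pairs, for illustration only.
venues = [(120, 4), (135, 6), (110, 5), (15, 30), (20, 28), (18, 35)]
labels, centroids = kmeans(venues, k=2)
```

Venues that share a label fall in the same cluster; in this toy input the three busy, low-tip venues end up apart from the three quiet, high-tip ones.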

I filtered the .csv file down to the data required for the analysis.
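
As a rough sketch of what such a filtering step could look like, the snippet below keeps only the food venues in Pittsburgh. The column names (`name`, `city`, `category`, `checkins`, `tips`), the helper `filter_food`, and the sample rows are hypothetical; the actual layout of the scraped file is not described here.

```python
import csv
import io

# Hypothetical extract of the scraped file; columns and rows are assumptions.
RAW = """name,city,category,checkins,tips
Venue A,Pittsburgh,Food,120,4
Venue B,Pittsburgh,Shop,80,2
Venue C,Cleveland,Food,60,9
Venue D,Pittsburgh,Food,15,30
"""

def filter_food(rows, city="Pittsburgh", category="Food"):
    """Keep only rows for the given city and top-level category."""
    return [
        {"name": r["name"], "checkins": int(r["checkins"]), "tips": int(r["tips"])}
        for r in rows
        if r["city"] == city and r["category"] == category
    ]

# csv.DictReader yields one dict per data row, keyed by the header line.
food_rows = filter_food(csv.DictReader(io.StringIO(RAW)))
```

With a real file, `io.StringIO(RAW)` would be replaced by an open file handle.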

VISUALISATION : 


Each circle represents a broad category of food venue, such as restaurants, coffee shops, or fast food, while the size of each circle indicates its check-in count.

The second clustering was done using the tip count. It was interesting to learn that the tip count did not align with the check-in count or the user count. The cluster sizes change according to the tip counts, and many restaurants that appear as small points become bigger here.

After studying the broad categories, I chose to study just the restaurants in detail, categorized by the type of cuisine they serve and sized by check-in count.

It can be inferred from the data that what we see is not always true. The most preferred food is American, followed by Mexican and then Chinese. I was surprised to learn that, in spite of the large number of Indians, there is not a single check-in at an Indian restaurant.

To further check whether the user count and the tip count are correlated, I tried clustering with a smaller number, using the counts both as a sum and as an average. Both clusterings show that American restaurants have a lower tip count but a higher user count.
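
The sum and average views can be sketched as a simple group-by over cuisine. The helper `summarise_by_cuisine`, the record layout `(cuisine, user count, tip count)`, and the placeholder cuisines and numbers below are assumptions for illustration, not values from the dataset.

```python
from collections import defaultdict

def summarise_by_cuisine(records):
    """Return the sum and average of user and tip counts per cuisine."""
    groups = defaultdict(list)
    for cuisine, users, tips in records:
        groups[cuisine].append((users, tips))

    summary = {}
    for cuisine, rows in groups.items():
        users_sum = sum(u for u, _ in rows)
        tips_sum = sum(t for _, t in rows)
        summary[cuisine] = {
            "users_sum": users_sum,
            "tips_sum": tips_sum,
            "users_avg": users_sum / len(rows),
            "tips_avg": tips_sum / len(rows),
        }
    return summary

# Placeholder records: (cuisine, user count, tip count). Not real data.
records = [("Cuisine X", 100, 4), ("Cuisine X", 200, 6), ("Cuisine Y", 30, 20)]
summary = summarise_by_cuisine(records)
```

Comparing the sums shows which cuisine draws the most activity overall, while the averages show what a typical venue of that cuisine looks like.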

I further plotted the data on a scatter plot to check for various clusters. Most of the clusters have a lower tip count as the user base grows, while some have a higher tip count along with a high user base. This raises new questions: is it because the restaurants are cheap, so more people go but the tip count is low, or because high-end restaurants generally have fewer visitors but a consistently high tip count?

This mapping definitely helps to find the trend.
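
One way to put a number on the trend seen in a scatter plot is the Pearson correlation coefficient between user count and tip count. This measure is not part of the original analysis; it is offered as a possible follow-up, and the sample values below are invented for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Invented values: the tip count falls steadily as the user count rises.
users = [10, 20, 30, 40]
tips = [8, 6, 4, 2]
r = pearson(users, tips)
```

A value near -1 means the tip count drops as the user base grows, a value near +1 means both rise together, and a value near 0 means there is no linear relationship.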

LESSONS LEARNED : 

Clustering is a great method for representing quantitative data, and it makes inferences easy to draw.

It would be better if there were a way to show one or more additional dimensions in a single diagram.
