Good professor = Easy professor?
Made by Mingquan Chen
I want to select data on purpose to lie about the truth.
Created: November 2nd, 2015
I made a database of ratings for some CMU professors using data from Rate My Professors. Because I don't know how to use other, more powerful software and did not have time to master it, I just used a Google spreadsheet. In the process, I created a sequence of graphs that first confuse the audience and then show them the truth. I think the comparison makes my lie more intuitive.
data: https://docs.google.com/spreadsheets/d/18KAPWrO30o_Q8amKGlysCxG1Ry3qwCnrqls72CKAcHQ/edit#gid=0
I had one motivation for each of the graphs above. The first one aims to tell the lie that professors' easiness generally increases with their quality; I think the graph conveys that. The second one aims to say that every subject has similar easiness (judging from the height of the polygons), so there is no difficult subject. Just take the course you want!
But the truth is told by the third graph. First of all, the trend that easiness rises with quality is a lie. As people can see from this graph, the trend is nearly flat. The reason is that I used the rank of professors within their college as the x-axis in the previous graphs, but I use each professor's actual quality rating as the x-axis now. Furthermore, I left out the Math professors' line, which is really flat, so the rest of the data could 'support' my words.
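To make the rank-versus-rating point concrete, here is a minimal sketch. It assumes the rank is taken by quality rating within one college. The function names and the three quality values are invented for illustration and are not from my spreadsheet. Ranks are always one step apart, so plotting against rank hides how close or far apart the actual ratings are.

```python
def rank_by_quality(qualities):
    """Return the rank (1 = highest quality) of each professor."""
    order = sorted(range(len(qualities)), key=lambda i: qualities[i], reverse=True)
    ranks = [0] * len(qualities)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def gaps(values):
    """Differences between neighbouring values after sorting in descending order."""
    ordered = sorted(values, reverse=True)
    return [round(a - b, 2) for a, b in zip(ordered, ordered[1:])]


# Made-up quality ratings for one college (hypothetical, not real data).
qualities = [4.8, 4.7, 3.0]
ranks = rank_by_quality(qualities)

print("ranks:", ranks)
print("gaps between ratings:", gaps(qualities))
print("gaps between ranks:", gaps(ranks))
```

On a rank axis the three professors are evenly spaced, even though the first two are almost tied in rating and the third is far behind.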
Also, the professors I picked are the popular ones (those with the most ratings), who are to some extent the best in students' minds. As shown in the graph, the best professors in Math and SCS have the highest quality, while science and econ professors have comparatively lower quality. (This is not aimed at any particular professor or college; it is only an analysis of the data.)
I have always thought that the selection of data is the trickiest part of analysis. People can choose the data that support their ideas and overlook the rest. Data visualization, as a way to magnify details, can be used to emphasize the 'twisted' part, or other parts, so that people ignore the distortion. That is what I wanted to achieve. I also wanted to give my friends more of a challenge.
Data selection is something I always worry about. Whenever I see an analysis of a controversial topic, I wonder whether the editor selected the data on purpose. There are so many fake analyses in China, and I want to know the truth. I wanted to test how an audience reacts when I apply this idea to them.
For the data, I only picked professors with more than 10 ratings to ensure the validity of the data. Quality is calculated as the average of three categories on Rate My Professors: helpfulness, clarity, and easiness. I drew several graphs using different selections of data (for example, calculating quality by averaging only helpfulness and clarity) and found that they were not deceptive enough.
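The filter and the quality calculation can be sketched roughly as follows. I actually did this in the spreadsheet, so the code is only an illustration. The rows, the names ROWS, MIN_RATINGS, quality, and build_table, and the rounding to two decimals are all assumptions made for the example. The numbers are invented, not taken from my data.

```python
from statistics import mean

# Hypothetical rows: (professor, number_of_ratings, helpfulness, clarity, easiness).
# These values are invented for illustration and are not from the spreadsheet.
ROWS = [
    ("Prof A", 25, 4.5, 4.3, 3.2),
    ("Prof B", 12, 3.8, 3.6, 2.9),
    ("Prof C", 8, 4.9, 4.8, 4.0),  # dropped by the filter: not more than 10 ratings
    ("Prof D", 40, 4.1, 4.0, 3.7),
]

MIN_RATINGS = 10  # keep only professors with more than this many ratings


def quality(helpfulness, clarity, easiness, include_easiness=True):
    """Average the category scores, optionally leaving easiness out."""
    scores = [helpfulness, clarity]
    if include_easiness:
        scores.append(easiness)
    return round(mean(scores), 2)


def build_table(rows, include_easiness=True):
    """Return (name, quality, easiness) for professors that pass the filter."""
    return [
        (name, quality(h, c, e, include_easiness), e)
        for name, n, h, c, e in rows
        if n > MIN_RATINGS
    ]


for name, q, e in build_table(ROWS):
    print(name, "quality:", q, "easiness:", e)
```

Because easiness is one of the three inputs, a higher easiness score raises quality by construction. Calling build_table(ROWS, include_easiness=False) gives the helpfulness-and-clarity-only version I also tried.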
I also tried bar charts, column charts, pie charts, and so on. In the end I found that a line chart shows trends well, and an area chart is usually misleading about size. That is why I chose them.
Another thing is the selection of data. I should have collected the same amount of data for each college so that the trend would be more convincing (but not many professors have been rated).
Still, I am happy with what I've done. Through the process I also learned that there is no such thing as a 'high rating for easiness'. The trends do not show it, but if you look closely at the y-axis, you will notice that the highest rating for easiness is nearly 4 out of 5.