Here are two word clouds I made to illustrate similarities and differences between information on Twitter and on news website front pages. I scraped one day of headlines from the New York Times, Fox News, the Wall Street Journal, the Washington Post, the Raw Story and Breitbart. I also collected tweets containing the phrase “fake news” for three hours. Although I filtered 2,000 tweets, their vocabulary was far less varied than just one day's worth of headlines from the six news sites. The words from the news sites are also much longer and appear more sophisticated. The first image is a Venn diagram of the two data sets, and the second is a word cloud of all the overlapping words, minus some low-information words that were crowding out more interesting ones.
It is interesting that the overlapping words carry a lot of intensity despite their brevity. For example, “bad,” “right” and “cartel” feature prominently.
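For anyone curious how an overlap like this can be computed, here is a minimal sketch in Python. It is not the exact code behind the images: the function names, the tiny stop-word list, and the sample headline and tweets are all made up for illustration, and the post does not say which low-information words were actually removed.

```python
import re
from collections import Counter

# Hypothetical stop-word list standing in for the "low-information" words;
# the real list used for the word cloud is not given in the post.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "on", "for"}


def tokenize(text):
    """Lowercase the text and return its runs of letters/apostrophes as words."""
    return re.findall(r"[a-z']+", text.lower())


def word_counts(documents):
    """Count how often each word appears across a list of strings."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return counts


def overlap(headlines, tweets, stop_words=STOP_WORDS):
    """Return words found in both sources (minus stop words) with combined counts."""
    headline_counts = word_counts(headlines)
    tweet_counts = word_counts(tweets)
    shared = (set(headline_counts) & set(tweet_counts)) - set(stop_words)
    return {word: headline_counts[word] + tweet_counts[word] for word in shared}


# Made-up sample data, only to show the shape of the inputs.
headlines = ["Cartel violence is bad for the border economy"]
tweets = ["fake news is bad", "the cartel story is fake news, right"]

shared = overlap(headlines, tweets)
print(shared)
```

The resulting dictionary of word frequencies is the kind of input most word cloud tools accept, and the two sets of words before intersecting are what a Venn diagram of the data sets would show.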
I am expanding this project, and expect updates in the future.

