Tweet content compared to news headlines

Here are two word clouds I made to illustrate similarities and differences between information on Twitter and on news website front pages. I scraped one day of headlines from the New York Times, Fox News, the Wall Street Journal, the Washington Post, the Raw Story and Breitbart. I also collected tweets containing the phrase “fake news” for three hours. Although I filtered 2,000 tweets, their word diversity was very small compared to just one day's worth of headlines on six news sites. The news site words are also much longer, and appear more sophisticated. The first image is a Venn diagram of the two data sets, and the second is a word cloud of all the overlapping words, minus some low-information words that were crowding out more interesting ones.
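For anyone who wants to try a similar comparison, here is a minimal sketch of the overlap step in Python. It is not the code behind the images in this post. The function names, the LOW_INFO stop list and the toy sentences are all made up for illustration, and the tokenizer is a simple regular expression.

```python
import re
from collections import Counter

# Hypothetical stop list: the post does not say which low-information
# words were removed, so this set is only an assumption.
LOW_INFO = {"the", "a", "an", "and", "of", "to", "in", "is", "on", "for"}


def word_counts(texts):
    """Lowercase each text, split it into alphabetic tokens, and count them."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


def compare_vocab(tweets, headlines, low_info=LOW_INFO):
    """Split the two vocabularies into the three regions of a Venn diagram."""
    tweet_counts = word_counts(tweets)
    news_counts = word_counts(headlines)
    tweet_vocab = set(tweet_counts) - low_info
    news_vocab = set(news_counts) - low_info
    shared = tweet_vocab & news_vocab
    return {
        "tweets_only": tweet_vocab - shared,
        "news_only": news_vocab - shared,
        # Combined frequency of each shared word, usable as word cloud weights.
        "shared": {w: tweet_counts[w] + news_counts[w] for w in shared},
    }


def mean_word_length(vocab):
    """Average number of characters per distinct word (0.0 for an empty set)."""
    return sum(len(w) for w in vocab) / len(vocab) if vocab else 0.0


# Toy data, invented for this example (not the scraped tweets or headlines).
tweets = ["Fake news is bad", "The cartel story is fake news"]
headlines = ["Officials investigate cartel violence",
             "Bad weather disrupts negotiations"]

result = compare_vocab(tweets, headlines)
print(sorted(result["shared"]))  # ['bad', 'cartel']
print(mean_word_length(result["tweets_only"]),
      mean_word_length(result["news_only"]))
```

The "shared" dictionary maps each overlapping word to a combined count, which is the kind of word-to-frequency input a word cloud tool typically expects. Comparing mean_word_length for the two "only" sets is one rough way to check the impression that headline words run longer.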

It is interesting that the overlapping words carry a lot of intensity, despite their brevity. For example, “bad,” “right” and “cartel” feature prominently.

I am expanding this project, so expect updates in the future.

[Image: captioned Venn diagram word cloud of the tweet and news headline words]

[Image: word cloud of the overlapping words]

July 23, 2018

