
Data Science Reveals Trump Tweets are Written by Two People

By David Robinson, a data scientist at Stack Overflow. Parts of his article were re-posted in the Washington Post. This short version summarizes his analysis; the details and source code can be found on David’s website.

In short, David found that Donald Trump’s tweets appear to be authored by two different people: someone on his campaign staff tweets from an iPhone, and the billionaire himself tweets from his Android.

According to David, the Android and iPhone tweets are clearly from different people, who post at different times of day and use hashtags, links, and retweets in distinct ways. The Android tweets are also angrier and more negative, while the iPhone tweets tend to be benign announcements and pictures.

Overall, the analysis is based on 628 tweets from the iPhone and 762 tweets from the Android.

David’s analysis features the following highlights:

Tweets from the iPhone were 38 times as likely to contain either a picture or a link. This fits the narrative: the iPhone (presumably run by the campaign) tends to post “announcement” tweets about events.
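
A comparison of this kind comes down to the share of tweets per device that contain a picture or link. The following is a minimal Python sketch of that idea, not David's R code. The sample tweets are invented, the `link_share` helper is hypothetical, and treating any `t.co` substring as "picture or link" is an assumption made for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: (device, tweet text). These are not real tweets.
tweets = [
    ("iPhone", "Join me in Ohio tomorrow! https://t.co/abc123"),
    ("iPhone", "Thank you Florida! https://t.co/def456"),
    ("iPhone", "Great crowd tonight."),
    ("Android", "The media is so dishonest."),
    ("Android", "Such a sad situation."),
    ("Android", "Watch this: https://t.co/ghi789"),
    ("Android", "Many people are saying it."),
]

def link_share(rows):
    """Return {device: fraction of that device's tweets containing a t.co link}."""
    total = defaultdict(int)
    with_link = defaultdict(int)
    for device, text in rows:
        total[device] += 1
        # Assumption: a shortened t.co URL marks a picture or link.
        if "t.co" in text:
            with_link[device] += 1
    return {device: with_link[device] / total[device] for device in total}

shares = link_share(tweets)
# How many times as likely an iPhone tweet is to carry a link.
ratio = shares["iPhone"] / shares["Android"]
print(shares, ratio)
```

On real data the ratio would be computed the same way, after checking that neither device has a zero share.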

Trump on the Android does a lot more tweeting in the morning, while the campaign posts from the iPhone more in the afternoon and early evening.

Another difference is Trump’s anachronistic habit of “manually retweeting” people by copy-pasting their tweets and surrounding them with quotation marks. Almost all of these quoted tweets come from the Android.
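
Both observations can be checked with simple counts. The sketch below is a hedged Python illustration with invented data. The helper names `hour_share` and `is_manual_retweet` are hypothetical, timestamps are assumed to already be in the poster's local time, and "starts with a quotation mark" is an assumed heuristic for a manual retweet.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample data: (device, ISO timestamp in local time, text).
tweets = [
    ("Android", "2016-07-01T06:45:00", "The media is so dishonest."),
    ("Android", "2016-07-01T07:10:00", '"@fan1: You will win big!" Thank you.'),
    ("Android", "2016-07-02T06:05:00", "Such a sad situation."),
    ("iPhone", "2016-07-01T15:30:00", "Join me in Ohio tomorrow!"),
    ("iPhone", "2016-07-01T18:20:00", "Thank you Florida!"),
]

def hour_share(rows, device):
    """Fraction of one device's tweets posted in each hour of the day."""
    hours = [datetime.fromisoformat(ts).hour for d, ts, _ in rows if d == device]
    counts = Counter(hours)
    return {hour: n / len(hours) for hour, n in sorted(counts.items())}

def is_manual_retweet(text):
    """Assumed heuristic: a manual retweet opens with a quotation mark."""
    return text.startswith('"')

android_hours = hour_share(tweets, "Android")
iphone_hours = hour_share(tweets, "iPhone")
# Number of quoted ("manually retweeted") tweets per device.
quoted = Counter(d for d, _, text in tweets if is_manual_retweet(text))
print(android_hours, iphone_hours, quoted)
```

Comparing shares rather than raw counts keeps the two devices comparable even though they posted different numbers of tweets.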

Which words are most likely to come from the Android, and which from the iPhone? See the chart below.

[Chart: words most likely to come from the Android versus the iPhone]
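
One common way to rank words by device is a smoothed log ratio of their relative frequencies. The Python sketch below illustrates that technique under stated assumptions: the tweets are invented, the helper names are hypothetical, the tokenizer is a simple regular expression, and the add-one smoothing shown here is not necessarily identical to the formula in David's article.

```python
import math
import re
from collections import Counter

# Hypothetical sample data: (device, tweet text). These are not real tweets.
tweets = [
    ("Android", "So sad and so weak"),
    ("Android", "Crooked and dishonest"),
    ("iPhone", "Join us tomorrow #MAGA"),
    ("iPhone", "Thank you #MAGA"),
]

def word_counts(rows, device):
    """Count lowercase tokens (letters, #, @, apostrophes) for one device."""
    counts = Counter()
    for d, text in rows:
        if d == device:
            counts.update(re.findall(r"[a-z#@']+", text.lower()))
    return counts

def log_ratios(rows):
    """log2 of smoothed Android frequency over smoothed iPhone frequency."""
    android = word_counts(rows, "Android")
    iphone = word_counts(rows, "iPhone")
    vocab = set(android) | set(iphone)
    # Add-one smoothing so words unseen on one device do not divide by zero.
    android_total = sum(android[w] + 1 for w in vocab)
    iphone_total = sum(iphone[w] + 1 for w in vocab)
    return {
        w: math.log2(
            ((android[w] + 1) / android_total) / ((iphone[w] + 1) / iphone_total)
        )
        for w in vocab
    }

ratios = log_ratios(tweets)
# Positive values lean Android, negative values lean iPhone.
for word, value in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{word:12s} {value:+.2f}")
```

With a real corpus it is usually worth dropping very rare words before ranking, since their ratios are dominated by the smoothing term.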

Most hashtags come from the iPhone. Indeed, with rare exceptions, almost no tweets from Trump’s Android contain hashtags.

Sentiment analysis: Trump’s tweets are much more negative than his campaign’s.
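
A lexicon-based comparison of this kind can be sketched as follows. This is a simplified Python illustration, not the method from the original article: the tiny `NEGATIVE` and `POSITIVE` word sets are hand-made stand-ins for a published sentiment lexicon, the tweets are invented, and `lexicon_share` is a hypothetical helper.

```python
import re

# Hypothetical, hand-made lexicons; a real analysis would use a published one.
NEGATIVE = {"sad", "weak", "dishonest", "crooked", "bad", "failing"}
POSITIVE = {"great", "thank", "win", "join", "love"}

# Hypothetical sample data: (device, tweet text). These are not real tweets.
tweets = [
    ("Android", "So sad and so weak"),
    ("Android", "Crooked and dishonest but we will win"),
    ("iPhone", "Thank you Ohio, great crowd"),
    ("iPhone", "Join us tomorrow"),
]

def lexicon_share(rows, device, lexicon):
    """Fraction of a device's words that appear in the given lexicon."""
    words = [
        w
        for d, text in rows
        if d == device
        for w in re.findall(r"[a-z']+", text.lower())
    ]
    return sum(w in lexicon for w in words) / len(words)

for device in ("Android", "iPhone"):
    neg = lexicon_share(tweets, device, NEGATIVE)
    pos = lexicon_share(tweets, device, POSITIVE)
    print(f"{device}: negative={neg:.2f} positive={pos:.2f}")
```

Word-level lexicon matching ignores negation and sarcasm, so it is best read as a rough comparison between the two devices rather than a verdict on any single tweet.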

To read more, see the original article. Its section on sentiment analysis is especially detailed, and the article is peppered with source code, sample tweets, and examples that use several R libraries.

Note from the editor: If you check my own tweets, you will see that they are also a blend of multiple sources: tweets entered manually by myself, manual re-tweets, fully automated tweets using Hootsuite (some are about my articles, most are about other DSC authors, and they come non-stop every 30 minutes), and semi-automated tweets using BufferApp (many come with pictures, are scheduled for daytime, and typically feature new or very popular content). It would be a good exercise for a data scientist to test some machine learning or pattern detection techniques on my tweets, to see whether they properly capture the architecture behind my somewhat sophisticated tweeting system. And write a paper about it! There are more than 88,000 tweets to analyze, and that is just for the main account.
