
An NLP Approach to Analyzing Twitter, Trump, and Profanity

This article was written by Stephanie Kim. Stephanie has professional experience in data mining and data processing, including natural language processing, along with some machine learning and script automation.


Who swears more? Do Twitter users who mention Donald Trump swear more than those who mention Hillary Clinton? Let’s find out by taking a natural language processing (NLP) approach to analyzing tweets.

This walkthrough provides a basic introduction to help developers of all backgrounds and abilities get started with the NLP microservices available on Algorithmia. We’ll show you how to chain them together to perform light analysis on unstructured text. Unfamiliar with NLP? Our gentle introduction to NLP will help you get started.

We know that getting started with a new platform or developer tool is an investment in time and energy. Sometimes it can be hard to find the information you need in order to start exploring on your own. That’s why we’ve centralized all our information in the Algorithmia Developer Center and API Docs, where users will find helpful hints, code snippets, and getting started guides. These guides are designed to help developers integrate algorithms into applications and projects, learn how to host their trained machine learning models, or build their own algorithms for others to use via an API endpoint.

Now, let’s tackle a project that uses some algorithms to retrieve content and analyze it with NLP. What better place to start than Twitter and our favorite presidential candidates?

What you will find in this article:

  • Twitter, Trump, and Profanity: An NLP Approach
  • Step One: Retrieve Tweets by Keyword
  • Step Two: Collecting Data
  • Step Three: Data Preprocessing
  • Step Four: Checking Tweets for Profanity
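As a rough illustration of the preprocessing and profanity-check steps listed above, here is a minimal local sketch in plain Python. It does not use the Algorithmia microservices the article walks through; the word list, the function names (`preprocess`, `profanity_rate`), and the sample tweets are all hypothetical placeholders chosen for illustration, and a real analysis would use a proper profanity lexicon and tweets retrieved by keyword.

```python
import re

# Hypothetical placeholder word list (assumption); swap in a real lexicon.
PROFANITY = {"darn", "heck"}

# Patterns for noise that is common in tweets, and for simple word tokens.
URL_OR_MENTION = re.compile(r"https?://\S+|@\w+")
TOKEN = re.compile(r"[a-z']+")


def preprocess(tweet):
    """Lowercase a tweet, drop URLs and @mentions, and return word tokens."""
    cleaned = URL_OR_MENTION.sub(" ", tweet.lower())
    return TOKEN.findall(cleaned)


def profanity_rate(tweets):
    """Return the fraction of tweets containing at least one listed word."""
    if not tweets:
        return 0.0
    flagged = sum(1 for t in tweets if PROFANITY.intersection(preprocess(t)))
    return flagged / len(tweets)


# Made-up sample tweets keyed by search keyword (not real data).
samples = {
    "trump": ["Well heck, @someone was at the rally http://example.com",
              "Debate night"],
    "clinton": ["Watching the debate tonight",
                "Darn good speech"],
}

# Compare the share of flagged tweets for each keyword.
rates = {keyword: profanity_rate(tweets) for keyword, tweets in samples.items()}
for keyword, rate in rates.items():
    print(f"{keyword}: {rate:.0%} of sample tweets contain a listed word")
```

Counting tweets that contain at least one listed word, rather than counting every listed word, keeps a single very long tweet from skewing the comparison between the two keyword groups.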

To check out all this information, click here.
