This guide was originally posted on the AYLIEN Blog. It was written as a how-to guide for using RapidMiner and AYLIEN to scrape and analyze online content.
One of the major challenges with mining the Web and Social Media for insights is trying to get all of your data into one place. To do this, you need to extract information from multiple sources in order to gain an accurate and holistic view.
Combining multiple data sources and analyzing their content can be a daunting task, but thankfully data mining frameworks such as RapidMiner and Weka make it easy to extract information from multiple sources in a quick and straightforward manner.
In this blog post, we're going to show you how to use AYLIEN's Text Analysis API from within RapidMiner to analyze text gathered from sources on the web.
The Web Mining extension for RapidMiner provides access to internet sources like web pages, RSS feeds, and web services. In this tutorial, we're going to use it to make HTTP requests to the Text Analysis API. In part 2 we will use it to scrape information from web pages such as Rotten Tomatoes.
The Web Mining package provides you with an operator for invoking external web services. This operator is called "Enrich Data by Webservice" and can be found in the Operators panel under Web Mining > Services > Enrich Data by Webservice.
Configure the operator with the following parameters:
url: "https://api.aylien.com/api/v1/sentiment?mode=tweet&text=<%text%>" or, if you're using Mashape: "https://aylien-text.p.mashape.com/sentiment?mode=tweet&text=<%text%>"
request method: POST
body: "text=<%title%>"
request properties:
    Accept: "text/xml"
    X-AYLIEN-TextAPI-Application-Key: "YOUR_API_KEY"
    X-AYLIEN-TextAPI-Application-ID: "YOUR_APPLICATION_ID"
    X-Mashape-Key: "YOUR_API_KEY"
query type: XPath
xpath queries:
    polarity: "//polarity/text()"
Here we are basically calling the /sentiment endpoint of the Text Analysis API to analyze the sentiment of some text in order to find out if it's positive, negative or neutral.
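For readers who want to see what the operator settings amount to outside RapidMiner, here is a minimal Python sketch of the same call using only the standard library. It builds the POST request with the headers listed above and pulls the polarity value out of an XML response. The function names and the sample XML are illustrative assumptions, not part of RapidMiner or the AYLIEN documentation; the only thing taken from the XPath query is that the response contains a polarity element.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint taken from the operator's url parameter above.
SENTIMENT_URL = "https://api.aylien.com/api/v1/sentiment"


def build_sentiment_request(text, app_id, api_key, mode="tweet"):
    """Build (but do not send) a POST request mirroring the operator settings."""
    headers = {
        "Accept": "text/xml",
        "X-AYLIEN-TextAPI-Application-ID": app_id,
        "X-AYLIEN-TextAPI-Application-Key": api_key,
    }
    url = SENTIMENT_URL + "?" + urlencode({"mode": mode})
    body = urlencode({"text": text}).encode("utf-8")
    return Request(url, data=body, headers=headers, method="POST")


def extract_polarity(xml_text):
    """Rough equivalent of the XPath query //polarity/text()."""
    root = ET.fromstring(xml_text)
    node = next(root.iter("polarity"), None)  # first <polarity> at any depth
    return node.text if node is not None else None


# Made-up response body for illustration only; the real response
# may contain additional elements or a different root element.
SAMPLE_XML = "<result><polarity>positive</polarity></result>"

req = build_sentiment_request("I love puppies!", "YOUR_APPLICATION_ID", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
print(extract_polarity(SAMPLE_XML))  # prints: positive
```

Passing req to urllib.request.urlopen would actually send the request; the sketch stops short of that so it can run without credentials.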
Now that our API call is set up, we need to provide the operator with some input text.
Set the text attribute parameter to "text".
Set the url attribute parameter to "text".
Now that we have everything set up, it's time to run our process by clicking the Run button.
As you can see, "I love puppies!" was deemed to be positive and the result is now accessible in RapidMiner for further analysis and reporting. You could use one of the many other methods provided in the Text Processing package to generate any number of documents and analyze their sentiment in the same fashion. Also, by changing the url parameter in the API call you can access any other endpoint from the Text API (Concept Extraction, Classification, Summarization and so on).
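To make the idea of running many documents through the same call more concrete, here is a small Python sketch of how a <%text%> placeholder in the url parameter could be filled in once per row of input. This is only an approximation of what the operator does, written under the assumption that each placeholder names an attribute of the current row; the helper name fill_template and the second sample sentence are made up for illustration.

```python
import re
from urllib.parse import quote_plus

# url parameter from the operator settings, placeholder included.
URL_TEMPLATE = "https://api.aylien.com/api/v1/sentiment?mode=tweet&text=<%text%>"

# Matches placeholders of the form <%attribute_name%>.
PLACEHOLDER = re.compile(r"<%(\w+)%>")


def fill_template(template, row):
    """Replace each <%name%> placeholder with the URL-encoded value of row[name]."""
    return PLACEHOLDER.sub(lambda m: quote_plus(str(row[m.group(1)])), template)


# Each dict stands in for one row of the example set (sample data, not real input).
rows = [
    {"text": "I love puppies!"},
    {"text": "Mondays are the worst."},
]

urls = [fill_template(URL_TEMPLATE, row) for row in rows]
for url in urls:
    print(url)
```

Pointing the same rows at a different endpoint only requires changing the template string, which is the code-level counterpart of editing the url parameter.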
In the 2nd part of this series, we're going to crawl Rotten Tomatoes with RapidMiner to extract movie reviews and analyze their sentiment to gain some interesting insights.
Posted 1 March 2021