A key strength of NLP (natural language processing) is the ability to process large amounts of text and then summarise it to extract meaningful insights.
In this example, a selection of economic bulletins in PDF format from 2018 to 2019 is analysed in order to gauge economic sentiment. The bulletins in question are sourced from the European Central Bank website. tf-idf is used to rank…Continue
Added by Michael Grogan on July 11, 2019 at 12:28pm — No Comments
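The tf-idf weighting mentioned in the teaser above can be illustrated with a minimal sketch. This is not the author's code: the three sample sentences are invented stand-ins for bulletin text, and the helper name `tf_idf` and the plain `log(N / df)` weighting are assumptions chosen for brevity.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document (hypothetical helper)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            # Term frequency times inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Invented sample sentences, not real bulletin text.
docs = [
    "growth remains solid and inflation is rising",
    "inflation is subdued and growth is slowing",
    "trade tensions weigh on growth",
]
scores = tf_idf(docs)

# Print each document's terms from highest to lowest score.
for doc_scores in scores:
    print(sorted(doc_scores, key=doc_scores.get, reverse=True))
```

A term that occurs in every document, such as "growth" here, scores zero, while terms specific to one document rank highest. That is what makes the weighting useful for picking out the distinctive vocabulary of each text.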
I propose a new word for data science to spark new thinking. Signuology is defined as the study of the characteristic predictive signals contained within data: combinations of features that are characteristic of an observation of interest.
The terms data mining and data structure imply rigid and discrete characteristics. A signal has more flexibility, borrowing from ideas contained in the superposition principle in physics. One can take the…Continue
Added by Steve Bowling on March 15, 2019 at 6:11am — No Comments
The key to performing any text mining operation, such as topic detection or sentiment analysis, is to transform words into numbers and sequences of words into sequences of numbers. Once we have numbers, we are back in the well-known game of data analytics, where machine learning algorithms can help us with classifying and clustering.
We will focus here exactly on that part of the analysis that transforms words…Continue
Added by Rosaria Silipo on February 11, 2019 at 3:09pm — No Comments
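One simple way to turn words into numbers, as the teaser above describes, is index encoding: each distinct word gets an integer, and a sentence becomes a sequence of integers. The sketch below is an illustration only, not the method used in the post; the function names and the choice of 0 for unknown words are assumptions.

```python
def build_vocabulary(sentences):
    """Map each distinct word to an integer, in order of first appearance."""
    vocab = {}
    for sentence in sentences:
        for word in sentence.lower().split():
            if word not in vocab:
                # Index 0 is reserved for words not seen when building the vocabulary.
                vocab[word] = len(vocab) + 1
    return vocab


def encode(sentence, vocab):
    """Turn a sequence of words into a sequence of numbers."""
    return [vocab.get(word, 0) for word in sentence.lower().split()]


vocab = build_vocabulary(["the cat sat", "the dog sat"])
encoded = encode("the dog ran", vocab)
print(vocab)
print(encoded)
```

Once text is in this form, the integer sequences can be fed to any standard classification or clustering routine, or expanded further into one-hot or embedding vectors.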
Artificial Intelligence has grown at a rapid pace over the last decade. You have seen it all unfold before your eyes. From self-driving cars to Google Brain, artificial intelligence has been at the centre of these amazing, high-impact projects.
Artificial Intelligence (AI) made headlines recently when people started reporting that Alexa was laughing unexpectedly. Those news reports led to the usual…Continue
Summary: Advanced analytics and AI are the fourth great lever available to create organic improvement in corporations. We’ll describe why this one is different from the first three and why the CEO needs the direct help of data scientists to make this happen.
If you’re a CEO or any other flavor of top executive leading a…Continue
Python and R are the two most commonly used languages for data science today. Both are open source and free to use and modify: R under the GNU General Public License, and Python under the Python Software Foundation License.
But which one is better? And, more importantly, which one should you learn?
Both are widely used and are standard tools in the hands of every data scientist.
The answer may surprise you – because as a professional data scientist, you should be ready to deal with…Continue
One interesting way to check the usefulness of Everipedia as a desk reference for data mining is to compare the number of relevant articles. Go to Everipedia (https://everipedia.org/) and search for "data mining". You will get 7 articles. Then go to Wikipedia and search for "data mining". You will see 4 articles (overlapping with similar Everipedia articles).
Another example. Try the word "smoothing" which is a popular topic in data analysis.…Continue
Added by jwork.ORG on August 2, 2018 at 1:34pm — No Comments
R has spread deep into the private sector and can be found in the production pipelines at some of the most advanced and successful enterprises.
Learn the fundamentals of data analysis in the second edition of Data Analysis with R, authored by data scientist…Continue
Added by Packt Publishing on May 8, 2018 at 10:30pm — No Comments
A long, long time ago (maybe 10 years) the data analytics industry was fairly easy to define and track. Back in that prehistoric era, SAS was considered the gold standard of analytics companies, with a comprehensive range of solutions addressing the demands of many industries. Given the relative paucity of data, analytics tended to focus on those industries that generated usable data. Companies that were part of the analytics universe back then would have included:
Added by Gregory Thompson on August 8, 2017 at 12:30pm — No Comments
Let’s start with the bottom line: there is no excuse for virtually any company today, regardless of size or manpower (and within reason), not to be making data analytics a part of its normal business routines. Traditional objections such as cost, resources and expertise no longer cut the mustard. As many observers have noted, a company’s internally generated data is a key asset that needs to be leveraged in the same way as any other corporate asset if the…Continue
Added by Gregory Thompson on July 30, 2017 at 4:30pm — No Comments
One of the best ways to learn about any topic is to start with very fundamental questions such as What? and Why? This is the good old Socratic method. In this series of articles on data mining, I plan to approach the topic in a similar fashion.
Simply put, Data mining is the process of sifting through large data sets to identify…Continue
Data is almost everywhere. The amount of digital data in existence is growing at a rapid pace: it is doubling every two years, and it is completely transforming our basic mode of existence. According to a paper from IBM, about 2.5 billion gigabytes of data were generated every day in 2012. Another article from Forbes informs us…Continue
This post covers the following tasks using R programming:
There’s a lot of buzz around the term “Sentiment Analysis” and the various ways of doing it. Great! So you report, with reasonable accuracy, what the sentiment about a particular brand or product is.
After publishing this report, your client comes back to you and…Continue
Added by Vivek Kalyanarangan on November 4, 2016 at 5:00am — No Comments
Unlike traditional application programming, where API functions are changing every day, database programming basically remains the same. The first version of Microsoft Visual Studio .NET was released in February 2002, with a new version released about every two years, not including Service Pack releases. This rapid pace of change forces IT personnel to evaluate their corporation’s applications every couple of years, leaving the functionality of their application intact but with a completely…Continue
Added by Irina Papuc on July 21, 2016 at 3:00pm — No Comments
This post highlights a number of important applications found for deep learning so far. It is well known that 80% of data is unstructured. Unstructured data is the messy stuff that every quantitative analyst traditionally tries to stay away from. It can include images of accidents, text notes from loss adjusters, social media comments, claim documents, reviews by medical doctors and so on. Unstructured data has massive potential but has not traditionally been considered a source of insight.…Continue
Added by Syed Danish Ali on June 26, 2016 at 5:00am — No Comments
Data mining (sometimes called knowledge discovery) is the process of analyzing and summarizing data into useful information that can be used to understand common features and the origin of data, and to extract hidden predictive information. Data mining is used in science, engineering, modeling and the analysis of financial markets.
In this article we will discuss a free data-analysis framework called DMelt (The DataMelt project,…
Added by jwork.ORG on June 13, 2016 at 5:00pm — No Comments
For a number of months, I have been generating codified narrative from films, fairytales, paintings, court cases, and news events. Codified narrative might be described as a tokenized rendition of the underlying content. There are many ways to do a rendering. Imagine asking 100,000 people to write a story based on the same general details such as scenery, major events, and specific outcomes. To the extent there are commonalities in the resulting storylines, I would say that "social…Continue
Added by Don Philip Faithful on June 11, 2016 at 10:22am — No Comments
Did you know that Big Data can help us predict who the future football stars will be? This is precisely what happened in the case of Riyad Mahrez, the PFA Player of the Year, who has conquered the English Premier League with Leicester City.
During last year’s Big Data Week Conference in London, Malta-based entrepreneur Valery Bollier correctly foresaw…Continue
Added by Jure Rejec on May 11, 2016 at 5:00am — No Comments