Summary: 99% of our applications of NLP have to do with chatbots or translation. This is a very interesting story about expanding the bounds of NLP and feature creation to predict bestselling novels. The authors created over 20,000 NLP features, about 2,700 of which proved to be predictive, yielding a 90% accuracy rate in predicting NYT bestsellers.
Added by William Vorhies on September 3, 2019 at 7:35am — No Comments
There are many good and sophisticated feature selection algorithms available in R. Feature selection refers to the machine learning case where we have a set of predictor variables for a given dependent variable, but we don’t know a priori which predictors are most important or whether the model can be improved by eliminating some of them. In linear regression, many students are taught to fit a data set to find the best model using so-called “least squares”. In most…Continue
Added by Blaine Bateman on April 30, 2018 at 7:30am — No Comments
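As a rough illustration of the feature selection idea in the excerpt above, here is a minimal Python sketch that ranks predictors by the R² of a one-variable least-squares fit. This is a simple univariate filter chosen for illustration, not the method used in the post, and the function names and synthetic data are assumptions.

```python
import random


def r_squared(x, y):
    """R^2 of the one-variable least-squares fit y ~ a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column explains nothing
    return (sxy * sxy) / (sxx * syy)


def rank_predictors(columns, y):
    """Return (name, R^2) pairs sorted from most to least explanatory."""
    scores = {name: r_squared(values, y) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Synthetic data (an assumption for illustration): y depends on x1 only.
rng = random.Random(0)
n = 200
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [rng.gauss(0, 1) for _ in range(n)]
y = [3.0 * a + rng.gauss(0, 0.5) for a in x1]

ranking = rank_predictors({"x1": x1, "x2": x2}, y)
print(ranking)
```

A univariate filter like this scores each predictor in isolation, so it can miss predictors that only matter in combination with others.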
In this blog post, I will discuss feature engineering using the Tidyverse collection of libraries. Feature engineering is crucial for a variety of reasons, and it requires some care to produce any useful outcome. In this post, I will consider a dataset that contains descriptions of crimes in San Francisco between…Continue
Added by Burak Himmetoglu on April 10, 2017 at 7:30am — No Comments
The ability to recognize objects and their relationships is at the core of intelligent behavior. This, in turn, depends on one’s ability to perceive similarity or dissimilarity between objects, be they physical or abstract. Hence, if we want computers to behave with any degree of intelligence, we have to write programs that can work with relevant representations of objects and with means to compute their similarity or lack thereof, i.e., dissimilarity (obviously, they are…Continue
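To make the notion of computing similarity and dissimilarity concrete, here is a minimal Python sketch with two common measures over numeric feature vectors. Both the choice of measures and the function names are assumptions for illustration. The excerpt does not say which measures the post covers.

```python
import math


def euclidean_distance(a, b):
    """Dissimilarity: straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Similarity: cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


print(euclidean_distance([0, 0], [3, 4]))
print(cosine_similarity([1, 2], [2, 4]))
```

Euclidean distance grows as objects become less alike, while cosine similarity ignores vector length and compares direction only, so the two can rank the same pairs of objects differently.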
A data science-based solution needs to address problems at multiple levels. While it addresses a business problem, computationally it comprises a pipeline of algorithms which, in turn, operate on relevant data presented in a proper format. Thus, to understand them, we need to focus at least at the
Contrary to popular belief, almost all non-trivial data science solutions need to be…Continue
Added by Arijit Laha on October 3, 2016 at 10:30pm — No Comments
We frequently get questions about whether we have chosen all the right parameters to build a machine learning model. There are two scenarios: either we have sufficient attributes (or variables) and need to select the best ones, or we have only a handful of attributes and need to know whether they are impactful. Both are classic examples of feature engineering challenges.
Most of the…Continue
Added by BR Deshpande on April 16, 2016 at 9:00am — No Comments
There are many ways to choose features from given data, and it is always a challenge to pick the ones with which a particular algorithm will work best. Here I will consider data from monitoring the performance of physical exercises with wearable accelerometers, for example, wristbands.
The data for this project come from this source: http://groupware.les.inf.puc-rio.br/har.
In this project, researchers used data from…Continue