All Blog Posts Tagged 'feature' (9)

Building an Intelligent QA System With NLP and Milvus

Milvus Project: github.com/milvus-io/milvus

Question answering systems are widely used in natural language processing. They answer questions posed in natural language and have a wide range of applications. Typical applications include intelligent voice interaction, online customer service,…

Added by Kate Shao on July 13, 2020 at 11:07pm — No Comments

Targeting Hate Speech

Summary: Objectively identifying hateful or abusive speech on social media platforms would allow those platforms to better control it. However, to be objective and unbiased, that identification would have to be independent of the author, especially where elected officials are involved.

What could be more…

Added by William Vorhies on June 8, 2020 at 2:25pm — No Comments

NLP Picks Bestsellers – A Lesson in Using NLP for Hidden Feature Extraction

Summary: 99% of our application of NLP has to do with chatbots or translation. This is a very interesting story about expanding the bounds of NLP and feature creation to predict bestselling novels. The authors created over 20,000 NLP features, about 2,700 of which proved predictive, yielding a 90% accuracy rate in predicting NYT bestsellers.

Added by William Vorhies on September 3, 2019 at 7:35am — 1 Comment

Simple automated feature selection using lm() in R

There are many good and sophisticated feature selection algorithms available in R. Feature selection refers to the machine learning case where we have a set of predictor variables for a given dependent variable, but we don’t know a priori which predictors are most important or whether the model can be improved by eliminating some of them. In linear regression, many students are taught to fit a data set to find the best model using so-called “least squares”. In most…

Added by Blaine Bateman on April 30, 2018 at 7:30am — No Comments

Feature Engineering with Tidyverse

In this blog post, I will discuss feature engineering using the Tidyverse collection of libraries. Feature engineering is crucial for a variety of reasons, and it requires some care to produce any useful outcome. In this post, I will consider a dataset that contains descriptions of crimes in San Francisco between…

Added by Burak Himmetoglu on April 10, 2017 at 7:30am — No Comments

Measuring Similarity between Objects

The ability to recognize objects and their relationships is at the core of intelligent behavior. This, in turn, depends on one’s ability to perceive similarity or dissimilarity between objects, be they physical or abstract. Hence, if we want computers to behave with any degree of intelligence, we have to write programs that can work with relevant representations of objects and with means of computing their similarity or lack thereof, i.e., dissimilarity (obviously, they are…

Added by Arijit Laha on October 12, 2016 at 11:00pm — 1 Comment

Context Levels in Data Science Solutioning in real-world

A data science-based solution needs to address problems at multiple levels. While it addresses a business problem, computationally it comprises a pipeline of algorithms which, in turn, operate on relevant data presented in the proper format. Thus, to understand them, we need to focus on at least the

  • Business level;
  • Algorithm level; and
  • Data level.

Contrary to popular belief, almost all non-trivial data science solutions are needed to be…

Added by Arijit Laha on October 3, 2016 at 10:30pm — No Comments

Feature engineering for building clustering models

We frequently get questions about whether we have chosen all the right parameters to build a machine learning model. There are two scenarios: either we have sufficient attributes (or variables) and we need to select the best ones, or we have only a handful of attributes and we need to know whether these are impactful. Both are classic examples of feature engineering challenges.

Most of the…

Added by BR Deshpande on April 16, 2016 at 9:00am — No Comments

Choosing features for random forests algorithm

There are many ways to choose features from given data, and it is always a challenge to pick the ones with which a particular algorithm will work best. Here I will consider data from monitoring the performance of physical exercises with wearable accelerometers, for example, wrist bands.

The data for this project come from this source: http://groupware.les.inf.puc-rio.br/har.

In this project, researchers used data from…

Added by Maiia Bakhova on February 18, 2016 at 11:00am — 5 Comments

© 2020 Data Science Central ®
