All Blog Posts Tagged 'selection' (8)

Simple automated feature selection using lm() in R

There are many good and sophisticated feature selection algorithms available in R. Feature selection refers to the machine learning case where we have a set of predictor variables for a given dependent variable, but we don’t know a priori which predictors are most important or whether the model can be improved by eliminating some of them. In linear regression, many students are taught to find the best model for a data set by fitting it with so-called “least squares”. In most…

Added by Blaine Bateman on April 30, 2018 at 7:30am — No Comments

Using Selection to Find Superman - More on Demand and Capacity

During my childhood, our school librarian said that I was invited to attend a conference of writers.  I felt honoured and privileged.  I asked what the writers intended to ask me.  She smiled and said that actually I would be asking the writers questions.  Not quite sure why I would ask these people anything and why their thoughts would matter, I nodded anyways and at some point attended the most boring event imaginable for a young child.  I thought I had died, I really did.  I sat there…

Added by Don Philip Faithful on May 7, 2017 at 6:00am — No Comments

Feature engineering for building clustering models

We frequently get questions about whether we have chosen all the right parameters to build a machine learning model. There are two scenarios: either we have sufficient attributes (or variables) and we need to select the best ones, or we have only a handful of attributes and we need to know whether they are impactful. Both are classic examples of feature engineering challenges.

Most of the…

Added by BR Deshpande on April 16, 2016 at 9:00am — No Comments

The Awkward Road

This blog is about the peculiar way in which software sometimes gets developed. I hope that many readers will recognize the relevance of data science in the examples taken from my own projects. I propose that development is the product of creativity more than accreditation. Creativity is something complicated that interacts with a person over his or her life circumstances. Many people know how to write . . . sentences and paragraphs. However, the ability to write well does not necessarily…

Added by Don Philip Faithful on August 30, 2014 at 8:59am — No Comments

Origins of the Species

In this blog, I will explain how an approach for handling small amounts of data can be reconstructed to handle much larger amounts. This reconstruction is the product of an anomalous perspective or mutation relating to the attribution of performance.

Fig. 1 - 6-fingered handprint spotted near my truck

Many businesses share certain common features.…

Added by Don Philip Faithful on August 1, 2014 at 6:21am — No Comments

Geography of Data - Restoring the Transpositional

Above is a distribution of price differentials for the Dow Jones Industrial Average from the 1930s. The image was generated by one of my programs called Storm. I posted a few images from the same application in other blogs. If I recall correctly, the more volatile differentials (closer to the action) are at top; the more stable differentials (further from the…

Added by Don Philip Faithful on May 24, 2014 at 6:51am — No Comments

Strategic Placement for Big Data in Organizations

I tend to examine the different roles played by data. For instance, when I work on computer code, I often ask myself what the presence of data is meant to accomplish. Sometimes the analysis is not at all straightforward or simple. In society and organizations, people exist and persist in the records as data. The data survives even as employees come and go. I therefore consider it important to regard the data and its environment as a system in itself, something that has a life all of its own.…

Added by Don Philip Faithful on May 3, 2014 at 6:30am — No Comments

Fooled by Twitter Data

Data scientists must always remember that data sets are not objective: they are selected, collected, filtered, structured and analyzed by human design. Naked and hidden biases in selecting,…

Added by Michael Walker on October 7, 2013 at 9:14pm — No Comments


© 2019 Data Science Central ®
