All Blog Posts Tagged 'predictive modeling' (140)

The 5 Essential Skills Any Data Scientist Needs

Guest blog post by Bernard Marr.

In my last post, I explained the difference between what I consider the two core types of data scientist – strategic and operational.

Broadly speaking, they require many of the same skillsets – but the distribution of your expertise and experience within these skillsets will vary, depending on whether…

Added by Vincent Granville on January 18, 2015 at 2:39pm — 3 Comments

Data science without statistics is possible, even desirable

The purpose of this article is to clarify a few misconceptions about data and statistical science.

I will start with a controversial statement: data science barely uses statistical science and techniques. The truth is actually more nuanced, as explained below.

1. Data science heavily uses new statistical…

Added by Vincent Granville on December 8, 2014 at 5:00pm — 15 Comments

13 New Trends in Big Data and Data Science

Based on requests from clients (vendors of data processing platforms and products), as well as trends in popular blogs, job postings, and my own reading, here are a few topics recently gaining strong traction (items beyond #13 were recently added):

  1. The rise of data plumbing, to make big data run smoothly, safely, reliably, and fast through all "data…

Added by Vincent Granville on November 11, 2014 at 10:30am — 3 Comments

The growth of data science over the last two years: 300%

A few websites catering to analytics and data science professionals have experienced tremendous growth recently. Organizations such as INFORMS or AMSTAT have seen their traffic explode, targeting high school students to join the ranks of data scientists. Niche publishers providing high quality, actionable content - and run by true data scientists rather than journalists - have also seen spectacular growth.…

Added by Vincent Granville on November 8, 2014 at 3:00pm — 1 Comment

Choosing a classifier for predictions

One of the biggest decisions that a data scientist needs to make during a predictive modeling exercise is choosing the right classifier. There is no best classifier for all problems. The accuracy of a classifier varies with the data set, and the correlation between the predictor variables and the outcome is a key influence. The choice needs to be made through experimentation. There are two main selection criteria here.

Accuracy:  While accuracy of the…

Added by Kumaran Ponnambalam on November 4, 2014 at 6:08pm — No Comments

Weekly Digest - November 3

The full version is always published Monday. Starred articles or sections are new additions or updated content, posted between Thursday and Sunday. 

Featured

Added by Vincent Granville on October 29, 2014 at 4:00pm — No Comments

Data science versus statistics, to solve problems: case study

In this article, I compare two approaches (with their advantages and drawbacks) to compute a simple metric: the number of unique visitors ("uniques") per year for a website. I use the words user and visitor interchangeably.

Source for picture: …

Added by Vincent Granville on October 27, 2014 at 9:30am — 7 Comments

Weekly Digest - October 27

The full version is always published Monday. Starred articles or sections are new additions or updated content, posted between Thursday and Sunday. 

Featured

Added by Vincent Granville on October 22, 2014 at 3:00pm — No Comments

200 Top Bloggers on Data Science Central

Lists of top data science bloggers, authors, websites, or Twitter profiles worth following are now a popular topic, sure to attract lots of attention. We've published our share, including

Added by Vincent Granville on October 7, 2014 at 5:00pm — 2 Comments

Top 30 DSC blogs, based on new scoring technology

Most of you will read this article to discover the most popular blogs, but the real purpose here is to show what goes wrong with many data science projects as simple as this one, and how it can easily be fixed. In the process, we created a new popularity score, much more robust than any ranking used in similar articles (top bloggers, popular books, best websites etc.) This scoring, based on a decay function, could be incorporated in recommendation engines.…

Added by Vincent Granville on October 4, 2014 at 9:00am — No Comments

Optimizing Disease Management Programs Using Predictive Modeling

Summary: Here’s an easy-to-understand example of how predictive analytics can reduce cost while increasing the efficacy of disease management programs.

Healthcare providers have made major breakthroughs over the last two decades by creating and implementing increasingly sophisticated disease management programs (DMPs). At their core there are always two motives: improve the human condition by…

Added by William Vorhies on October 2, 2014 at 11:05am — No Comments

Weekly Digest, October 6

The full version is always published Monday. Starred articles or sections are new additions or updated content, posted between Thursday and Sunday. 

Featured

Added by Vincent Granville on October 1, 2014 at 3:00pm — No Comments

Top 2,500 Websites - mentioned a few times (Page 2)

Click here for explanations.

  • macrorisk.com (2012) - analytics 
  • datamining.togaware.com (2012) - statistics, machine learning, analytics, data mining, database 
  • kavaii.com (2013) - statistics, big data, analytics 
  • data-mining-blog.com (2012) - text mining, analytics, data mining, business…

Added by Vincent Granville on September 20, 2014 at 12:35pm — No Comments

Top 2,500 Websites - mentioned a few times (Page 1)

Click here for explanations.

  • indiana.edu (2012) - analytics 
  • businessintelligence.ittoolbox.com (2011) - text mining, analytics, data mining, database, business intelligence 
  • plot.ly (2014) - analytics 
  • dssresources.com (2012) - business intelligence 
  • powerbi.com (2014) - analytics, business…

Added by Vincent Granville on September 20, 2014 at 12:32pm — 1 Comment

Top 2,500 Websites - top of the top

For explanations about the methodology, including source code and possible improvements, read our main article on this subject. It also provides links to our other three listings.

The field between parentheses represents the year when the website in question was first mentioned - it does not represent when the website was created,…

Added by Vincent Granville on September 20, 2014 at 12:00pm — 2 Comments

Top 2,500 Data Science, Big Data and Analytics Websites

The following comprehensive listings were produced by analyzing our large member database, extracting websites that our members mentioned or liked, and for each web site, identifying

  • When it is first mentioned by one of our members
  • The number of times it was mentioned
  • Keywords found when visiting the front page with a web crawler, using a pre-selected list of seed keywords

The design of the member database (non-mandatory sign-up questions and…

Added by Vincent Granville on September 20, 2014 at 10:00am — No Comments

Weekly Digest - September 22

The full version is always published Monday. Starred articles or sections are new additions or updated content, posted between Thursday and Sunday. 

Sponsored Announcement…

Added by Vincent Granville on September 17, 2014 at 5:30pm — 1 Comment

Announcing the winner of our second competition - Jackknife regression

The winner of our second data science competition is Tom De Smedt, a biostatistician completing a Ph.D. program at the University of Leuven, Belgium. His special interests are in spatial statistics, environmental epidemiology, novel regression techniques and data visualization.

The competition consisted of simulating data and testing the …

Added by Vincent Granville on September 3, 2014 at 7:00pm — 1 Comment

Weekly Digest - September 1

The full version is always published Monday. Starred articles or sections are new additions or updated content, posted between Thursday and Sunday. 

Sponsored Announcements…

Added by Vincent Granville on August 27, 2014 at 2:30pm — No Comments

Why Zipf's law explains so many big data and physics phenomenons

Zipf's law states that in many settings (which we are going to explore), the volume or size of entities is inversely proportional to a power s (s > 0) of their rank. This has important implications in predictive modeling, discussed below. The processes that create this type of dynamic are not well understood. It is the purpose of this article to explain the underlying mechanics. The traditional example for the Zipf distribution is the distribution of Internet…
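
To make the rank-size relationship concrete, here is a minimal Python sketch of sizes that follow a power of the rank. It is only an illustration, not code from the article: the function name `zipf_sizes` and the parameter `top` (the assumed size of the rank-1 entity) are hypothetical, and the values are made up.

```python
def zipf_sizes(n, s=1.0, top=1000.0):
    """Return sizes for ranks 1..n, with size proportional to 1 / rank**s.

    `top` is an assumed size for the rank-1 entity (illustrative only).
    """
    if s <= 0:
        raise ValueError("s must be positive")
    return [top / rank ** s for rank in range(1, n + 1)]


# Print rank and size for the first five ranks with s = 1.
for rank, size in enumerate(zipf_sizes(5, s=1.0, top=1000.0), start=1):
    print(rank, round(size, 1))
```

With s = 1, the second-ranked entity is half the size of the first and the third is a third of it. A larger s makes the drop-off steeper.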

Added by Vincent Granville on August 21, 2014 at 3:00pm — 12 Comments
