October 2016 Blog Posts (87)

Designing better algorithms: 5 case studies

In this article, using a few examples and solutions, I show that the "best" algorithm is often not what data scientists or management think it is. As a result, misfit algorithms are implemented too often. Not that they are bad or simplistic; on the contrary, they are usually too complicated, but their biggest drawback is that they do not address the key problems. Sometimes they lack robustness, sometimes they are not properly maintained (for instance they rely on outdated lookup…

Added by Vincent Granville on October 31, 2016 at 8:30pm — 1 Comment

Price Optimisation Using Decision Tree (Regression Tree) - Machine Learning

INTRODUCTION TO THE RESEARCH QUESTION

The research was conducted to find the price that maximises profit, without sacrificing demand for the product by pricing too high or sacrificing margin by pricing too low.

The goal is to experiment with different price levels for the same product in one marketplace and country to see how sales volumes change with price and which volume level of…

Added by Bernard Antwi Adabankah on October 29, 2016 at 10:30pm — 3 Comments
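
The trade-off described in the excerpt above (a high price hurts volume, a low price hurts margin) can be illustrated with a minimal sketch. It is not the regression-tree method of the article: it only averages observed profit per tested price level and picks the best one. The function name `best_price`, the constant `unit_cost`, and the trial figures are all made-up assumptions for illustration.

```python
from collections import defaultdict


def best_price(observations, unit_cost):
    """Return (price, mean_profit) for the price level with the highest mean profit.

    observations: iterable of (price, units_sold) pairs, one per trial period.
    unit_cost: assumed constant cost per unit (hypothetical simplification).
    """
    profits = defaultdict(list)
    for price, units in observations:
        # Profit for one trial period: margin per unit times units sold.
        profits[price].append((price - unit_cost) * units)
    # Average the trial profits observed at each price level.
    mean_profit = {p: sum(v) / len(v) for p, v in profits.items()}
    price = max(mean_profit, key=mean_profit.get)
    return price, mean_profit[price]


# Made-up trial data: (price, units sold in the period).
trials = [(8.0, 120), (8.0, 110), (10.0, 90), (10.0, 94), (12.0, 50), (12.0, 58)]
price, profit = best_price(trials, unit_cost=5.0)
print(price, profit)  # 10.0 460.0
```

In this toy data the middle price wins: the lowest price sells the most units but at a thin margin, while the highest price has the best margin but too few sales. A regression tree, as named in the article title, would instead learn the price-to-volume relationship from the data.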

The Case for a New “Final Frontier” in Data Analytics

This article has been contributed by Alain Louchez (Georgia Tech Research Institute)

The Internet of Things already integrates a new phase beyond prescriptive analytics. 

There is no shortage of attention lately on the “Internet of Things”. As a case in point, see the “Developing Innovation and Growing the Internet of Things Act” or “DIGIT Act”, i.e.,…

Added by Shay Pal on October 29, 2016 at 4:00pm — No Comments

Comparison Between Global Vs Local Normalization of Tweets, and Various Distances

In the previous example we used clustering to see if an apparent pattern exists within Brexit tweets. We found that we have three distinct patterns: leave, referendum, and Brexit. This in itself suggests that we may even be able to create a classifier that automatically identifies whether the tweet writer is for or against an issue, with no human intervention.



Let's get back to the issues related to clustering.  To use the clustering algorithm we had to…

Added by Dalila Benachenhou on October 29, 2016 at 2:30pm — No Comments

Weekly Digest, October 31

Monday newsletter published by Data Science Central. Previous editions can be found here.  The contribution flagged with a + is our selection for the picture of the week.

Featured Resources and Technical Contributions

Added by Vincent Granville on October 28, 2016 at 4:30pm — No Comments

Machine Learning vs. Traditional Statistics: Different philosophies, Different Approaches

"Machine Learning (ML)" and "Traditional Statistics (TS)" have different philosophies in their approaches. With "Data Science" at the forefront and getting lots of attention and interest, I would like to dedicate this blog post to discussing the differences between the two. I often see discussions and arguments between statisticians and data miners/machine learning practitioners about the definition of "data science", its coverage, and the required skill sets. All that is needed is just paying attention to the…

Added by Khosrow Hassibi on October 28, 2016 at 7:00am — 5 Comments

Sentiment Analysis of Movie Reviews (3): doc2vec

This is the last – for now – installment of my mini-series on sentiment analysis of the Stanford collection of IMDB reviews (originally published on recurrentnull.wordpress.com).

So far, we’ve had a look at classical bag-of-words models and…

Added by Sigrid Keydana on October 27, 2016 at 10:30pm — No Comments

Context Matters When Text Mining

Many times the most followed approach can result in failure. The reason has more to do with thinking that one approach works in all cases. This is especially true in text mining. For instance, a common approach to clustering documents is to create a tf-idf matrix for all documents, apply SVD or another dimension reduction algorithm, and then run a clustering algorithm. In most cases this will work; however, as I will present here, there are instances…

Added by Dalila Benachenhou on October 27, 2016 at 5:30pm — 2 Comments
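
The first step of the document-clustering pipeline mentioned in the excerpt above (building a tf-idf matrix) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the author's code: it uses whitespace tokenization, one common idf variant (`log(N / df) + 1`), and made-up example documents, and it leaves out the SVD and clustering steps. Cosine similarity is added only to show how the resulting vectors compare.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocab, vectors): one tf-idf vector per document, in vocab order."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vocab = sorted(df)
    # Assumed idf variant; other smoothing choices are equally valid.
    idf = {t: math.log(n_docs / df[t]) + 1.0 for t in vocab}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        length = max(len(tokens), 1)  # guard against empty documents
        vectors.append([tf[t] / length * idf[t] for t in vocab])
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Made-up documents: two on the same topic, one unrelated.
docs = [
    "vote leave brexit referendum",
    "brexit referendum vote remain",
    "neural nets deep learning",
]
vocab, vecs = tfidf_vectors(docs)
sim_01 = cosine(vecs[0], vecs[1])  # documents sharing three terms
sim_02 = cosine(vecs[0], vecs[2])  # documents sharing no terms
```

Documents that share terms get a higher similarity than documents that share none, which is the property a later clustering step relies on. In practice a library implementation would typically handle tokenization, dimension reduction, and clustering.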

Sentiment Analysis of Movie Reviews (2): word2vec

This is the continuation of my mini-series on sentiment analysis of movie reviews, which originally appeared on recurrentnull.wordpress.com. Last time, we had a look at how well classical bag-of-words models worked for classification of the Stanford collection of IMDB reviews. As it turned out, the “winner” was Logistic…

Added by Sigrid Keydana on October 27, 2016 at 3:00pm — No Comments

Thursday News: Data Science, Python, Spark, NLP, Deep Learning, Neural Nets, ML

Here is our new selection of featured articles and resources for this Thursday, covering data science, python, spark, natural language processing (sentiment analysis), deep learning, neural nets, and machine learning. The first four articles are more technical.

Thursday News

Added by Vincent Granville on October 27, 2016 at 8:33am — No Comments

24 Neural Network Adjustments

Added by Rubens Zimbres on October 27, 2016 at 7:30am — No Comments

Why The Aviation Industry Needs to Hurry Up With IoT Implementation

With their billions of annual captive customers, one would think that airports, and by logical extension, airlines, were prime candidates for the implementation of the Internet of Things (IoT) technology to improve passenger experience, yet, there’s not been much progress…

Added by Raj Dalal on October 27, 2016 at 3:30am — No Comments

10 Required Non-technical Skills for a Data Scientist

"Data Science (DS)" is nothing new except for the term itself and the recent level of interest in it. As a practice it has existed commercially (not academically) for more than 25 years, mainly under the names "Data Mining (DM)" and "Predictive Analytics (PA)," since the early 1990s. DM and PA originally gained a lot of traction in the financial, telco, and retail industries, which had a lot of granular historical data. Like anything that gets sudden attention and interest, DS has been misused and abused in a variety of…

Added by Khosrow Hassibi on October 26, 2016 at 7:00pm — 1 Comment

47 New External Data Science / Machine Learning Resources and Articles

Starred articles are candidates for the picture of the week. A comprehensive list of all past resources is found here. We are in the process of automatically categorizing them using indexation and automated tagging…

Added by Vincent Granville on October 26, 2016 at 7:30am — No Comments

Accelerated Computing and Deep Learning

Guest blog post by Jen-Hsun Huang, Founder, President and CEO at NVIDIA, originally entitled "The Intelligent Industrial Revolution".

A New Era of Computing

Intelligent machines powered by AI computers that can learn, reason and interact with people are no…

Added by Vincent Granville on October 25, 2016 at 5:00pm — No Comments

Catching up on Big Data & Healthcare

The health area is characterized by the management of huge data volumes. What if those data are processed and provided to health professionals and their patients, or even to the health system at large? Not only one, two, three…

Added by Ernesto Mislej on October 25, 2016 at 7:30am — No Comments

How to Intelligently Apply Data Integration and Visual Analytics Tools

Data integration requires merging data from different sources, stored using different technologies. Companies build a “data warehouse” where aggregated data can be stored and retrieved. This is particularly useful for researchers looking to big data to aid in their investigation and corporations usually during…

Added by Dante Munnis on October 25, 2016 at 7:00am — No Comments

[Cheat Sheet] Python Basics For Data Science

The use of Python as a data science tool has been on the rise over the past few years: 54% of the respondents of the latest O'Reilly Data Science Salary Survey indicated that they used Python. The results of the 2015 survey indicated that 51% of the respondents used Python. 

Nobody can deny that Python has been on the rise in the data science industry and it certainly seems that it's here to stay.

This rise in popularity in the industry, the long gone infancy of Python packages…

Added by karlijn willems on October 25, 2016 at 12:30am — 4 Comments

22 Great Blogs Posted in the last 12 Months

This is part of a new series of articles: once or twice a month, we post previous articles that were very popular when first published. These articles are at least 6 months old but no more than 12 months old. The previous digest in this series was posted here a while back. Below is our fourth edition.…

Added by Vincent Granville on October 24, 2016 at 5:00pm — No Comments

Recurrent Neural Nets – The Third and Least Appreciated Leg of the AI Stool

Summary:  Convolutional Neural Nets are getting all the press but it’s Recurrent Neural Nets that are the real workhorse of this generation of AI.

 We’ve paid a lot of attention lately to…

Added by William Vorhies on October 24, 2016 at 3:53pm — No Comments
