December 2018 Blog Posts (90)

Weekly Digest, December 31

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week. To subscribe, follow this link.  

Upcoming DSC Webinars and…

Continue

Added by Vincent Granville on December 30, 2018 at 8:30am — No Comments

Forward and Reverse Containment Logic – Introduction to Relational Data Logistics

The question of how to structure or arrange data in order to gain worthwhile insights is quite different from the issue of what data to include or how it should be analyzed.  I often find myself preoccupied with structure and organization.  In part, this is because of my belief that unique logistics can lead to different types of insights.  Popular these days to the point of gaining industry dominance is holding data as large tables.  If we peer into the code for the application…

Continue

Added by Don Philip Faithful on December 30, 2018 at 6:49am — No Comments

1 Rule Prediction

I originally started writing this notebook to serve as an introduction to decision trees. It's a description of the "1-rule" algorithm, which I think is worth studying for the following reasons:

  • It’s arguably the simplest and most useful machine learning algorithm you can learn
  • It’s a simple introduction to “decision…
Continue

Added by John Smethurst on December 29, 2018 at 11:00pm — 3 Comments
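
For readers skimming this listing, the "1-rule" idea named in the post above can be sketched in a few lines. This is a minimal illustration under assumptions, not the notebook author's code: the function name `one_rule`, the dictionary-per-row layout, and the toy data are all made up here. For each feature it maps every value to the most common class, then keeps the feature whose rule makes the fewest training errors.

```python
from collections import Counter, defaultdict


def one_rule(rows, target):
    """Return (feature, rule, errors) for the best single-feature rule.

    Hypothetical helper for illustration; `rows` is a list of dicts.
    """
    best = None
    features = [name for name in rows[0] if name != target]
    for feature in features:
        # Count how often each class occurs for each value of this feature.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[feature]][row[target]] += 1
        # The rule predicts the majority class for each feature value.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Score the rule by its number of mistakes on the training rows.
        errors = sum(row[target] != rule[row[feature]] for row in rows)
        if best is None or errors < best[2]:
            best = (feature, rule, errors)
    return best


# Toy, made-up data for illustration only.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]

feature, rule, errors = one_rule(rows, "play")
print(feature, errors)
```

On this toy data the rule built on `outlook` misclassifies one row, while the rule built on `windy` misclassifies two, so `outlook` is selected.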

The "lxml" Package and xpath Expressions for Web Scraping

The Jupyter notebook at this link contains a tutorial that explains how to use the lxml package and xpath expressions to extract data from a webpage.

The tutorial consists of two sections:

  • A basic example to demonstrate the process of downloading a webpage, extracting data with lxml and xpath and analysing it with pandas
  • A comprehensive…
Continue

Added by John Smethurst on December 29, 2018 at 10:30pm — No Comments
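
The extraction step described in the post above can be sketched roughly as follows. This is an assumption-laden illustration rather than the tutorial's own code: the HTML snippet, the table id `prices`, and the column names are invented, and an inline string stands in for a downloaded page so the sketch runs without network access.

```python
from lxml import html

# Made-up page content standing in for a downloaded webpage.
PAGE = """
<html><body>
  <table id="prices">
    <tr><th>item</th><th>price</th></tr>
    <tr><td>apple</td><td>1.20</td></tr>
    <tr><td>pear</td><td>0.80</td></tr>
  </table>
</body></html>
"""

# Parse the HTML into an element tree.
tree = html.fromstring(PAGE)

# Select only the rows that contain data cells (this skips the header row).
data_rows = tree.xpath('//table[@id="prices"]//tr[td]')

# Pull the text of the first and second cell out of each row.
records = [
    {
        "item": row.xpath("string(td[1])"),
        "price": float(row.xpath("string(td[2])")),
    }
    for row in data_rows
]
print(records)
```

A list of dictionaries like `records` can be passed to `pandas.DataFrame(records)` for the analysis step the post mentions.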

Announcement: Winner of the Data Science Central Competition

Back in 2017, we posted a problem related to stochastic processes and controlled random walks, offering a $2,000 award for a sound solution (see here for full details). The problem, which had a FinTech flavor, was only solved recently (December 2018) by Victor Zurkowski.

About the problem:

Let's start…

Continue

Added by Vincent Granville on December 28, 2018 at 9:30am — No Comments

So You Think You Can Be A Data Scientist?

So, you think you can be a data scientist. But are you sure you have what it takes to excel in the data science field? Be careful. It’s a very complicated field, and getting competitive…

Continue

Added by Tanmoy Ray on December 28, 2018 at 8:59am — No Comments

In a World of AI Delirium, Data is the Source of Business Value

A report from Accenture clearly highlights that companies that don’t want to capitalize on AI will not survive the future (see Figure 1).  Adopt or die.  Yippee Ki Yay, Mr. Falcon.…

Continue

Added by Bill Schmarzo on December 28, 2018 at 6:06am — No Comments

Google Data Studio in 10 minutes: Step-by-Step Guide

Google Data Studio is a new data visualization tool that lets you transform your clear, dry data into visually appealing and understandable reports that can then be shared with your colleagues and clients. Using bar graphs, geographical maps, line charts, etc., you can represent your data and - the most…

Continue

Added by Igor Bobriakov on December 28, 2018 at 5:05am — No Comments

Comparison of the Top Speech Processing APIs

Speech processing is a very popular area of machine learning. There is significant demand for transforming human speech into text and text into speech. It is especially important for the development of self-service in different places: shops, transport, hotels, etc. Machines replace more and more human labor…

Continue

Added by Igor Bobriakov on December 28, 2018 at 4:54am — No Comments

How Airlines & Hotels Profit From Your Data

The past few years have seen a wave of compromised member accounts at travel companies, which has rocked the industry.

Cathay Pacific had 9.4M accounts compromised.…

Continue

Added by Mark Ross-Smith on December 27, 2018 at 9:30pm — No Comments

Data Science, Common Stocks and V&V

I thought I would follow up, in three ways, on a claim in my first blog posting that going returns followed a truncated Cauchy distribution. The first way was to describe a proof and empirical evidence to support it in a population study. The second was to discuss the consequences by performing simulations so that financial modelers using things such as the Fama-French, CAPM or APT would understand the full consequences of that decision. The third was to discuss…

Continue

Added by David Harris on December 27, 2018 at 7:32pm — No Comments

Predicting the demise of retail bookstores: a time series forecasting

“The internet is killing retail. Bookstores are just the first to go.” -- quoted in this NYT article. Retail bookstores are on death row; it looks like it's just a matter of time before they end up in a museum. eBooks are partly to blame, but with eBook sales leveling off recently, the remaining effect seems to be online book sales, dominated by, no surprise, Amazon. So, exactly how long retail bookstores are going to…

Continue

Added by Mab Alam on December 27, 2018 at 5:00pm — 1 Comment

Why I agree with Geoff Hinton: I believe that Explainable AI is over-hyped by media

Geoffrey Hinton dismissed the need for explainable AI. A range of experts have explained why he is wrong.

I actually tend to agree with…

Continue

Added by ajit jaokar on December 27, 2018 at 12:30pm — 2 Comments

Thursday News: NLP, Visu, Python, DS Generalist, 2019 Predictions, Random Forests

Happy Holidays! Here is our list of featured articles and resources posted since Monday.

Resources

Continue

Added by Vincent Granville on December 27, 2018 at 9:00am — No Comments

30 Things I learned Organizing South East Asia's Largest Datathon

Continue

Added by Pedro URIA RECIO on December 27, 2018 at 7:00am — No Comments

The Truth about Artificial Intelligence

How will AI evolve and what major innovations are on the horizon? What will its impact be on the job market, economy, and society? What is the path toward human-level machine intelligence? What should we be concerned about as artificial intelligence advances?…

Continue

Added by Packt Publishing on December 26, 2018 at 9:43pm — No Comments

Why You Should be a Data Science Generalist - and How to Become One

The new advice today for data scientists is not to become a generalist. You can read recent articles on this topic, for instance here.  In this blog, I explain why I believe it should be the opposite. I wrote about this…

Continue

Added by Vincent Granville on December 26, 2018 at 4:00pm — 7 Comments

What will make "Data" work in 2019?

Speed of Adoption will matter more than ever

Businesses have been debating investment in data and analytics for a while. Some have already invested, and it is working out. We are past the apprehension about whether data investments work; now the question is how soon you can make them work. It has to be strategy first, with a top-down push to get data investments to execution and results. It is a herculean task but by now…

Continue

Added by Gaurav Kumar on December 25, 2018 at 7:30pm — No Comments

Developing a Big Data / Data Science / Design College Curriculum with Infographics

For my final class this year at the University of San Francisco School of Management, I taught the students using nothing but infographics.  Not only was it fun for me, but I think the students enjoyed being able to summarize their learnings from the semester through group discussions centered around the infographics.  The infographics provide a visual opportunity to meld the three fundamental concepts that I believe every business leader needs to understand to be…

Continue

Added by Bill Schmarzo on December 25, 2018 at 4:30am — 2 Comments

Text Summarization and Sentiment Analysis: Novel Approach

A huge amount of data is generated every day on various platforms such as Wikipedia, technical and non-technical blogs, social media, online news articles, etc. Around five million articles are present in Wikipedia alone, and every day thousands of new articles are added to it. Due to the huge amount of data gathered every day, users are bombarded with a large volume of data. For a human being, it is difficult to assimilate this huge amount of data. So, effective techniques are…

Continue

Added by Siddhaling Urolagin on December 24, 2018 at 9:24pm — No Comments
