Featured Blog Posts – April 2019 Archive (83)

The Mathematics of Forward and Back Propagation

Understanding the maths behind forward and back propagation is not very easy.

There are some very good, but also very technical, explanations.

For example, The Matrix Calculus You Need For Deep Learning by Terence Parr and Jeremy Howard is an excellent resource, but it is still too complex for beginners.

I found a much simpler explanation in the ML Cheatsheet.

The…

Added by ajit jaokar on April 30, 2019 at 9:00pm — No Comments

The Anatomy of K-means

A complete guide to the K-means clustering algorithm

Let’s say…

Added by Diego Lopez Yse on April 30, 2019 at 2:17pm — No Comments

The Complete Guide to Decision Trees

Everything you need to know about a top algorithm in Machine Learning

In the…

Added by Diego Lopez Yse on April 30, 2019 at 2:00pm — 3 Comments

The Data Science Method (DSM) - Pre-processing and Training Data Development

The Data Science Method (DSM) - Pre-processing and Training Data Development…

Added by Aiden Johnson on April 29, 2019 at 10:30am — No Comments

What to Tell Your Board About AI/ML

Summary: Communicating with your Board of Directors about AI/ML is different from conversations with top operating executives. It's increasingly likely that your Board will want to know more, and planning that communication in advance will make your presentation more successful.

As AI/ML becomes increasingly…

Added by William Vorhies on April 29, 2019 at 9:35am — No Comments

The Role of Data Curation in Big Data

Introduction

Good data management practices are essential for ensuring that research data are of high quality, findable, accessible and highly valid. You can then share data, ensuring their sustainability and accessibility in the long term, for new research and policy or to replicate and validate existing research and policy. It is important that researchers extend these practices to their work with all types of data, be it big (large…

Added by Divya Singh on April 28, 2019 at 8:00pm — No Comments

Technology Use as a Function of Device Type

I’m frustrated! Being a Technophile poses several problems that seem to go unaddressed by leaders of the “Big Tech” firms. As the big firms such as Apple, Google and Samsung continue to develop impressively beautiful, technologically capable, faster and yes, addictive technologies,…

Added by Richard Charles, PhD on April 28, 2019 at 12:30pm — No Comments

Interweaving Design Thinking and Data Science to Unleash Economic Value of Data

Did you ever have a concept that you knew was right, but just couldn’t find the right words to articulate that concept?  Okay, well welcome to my nightmare.  I know that Data Science and Design Thinking share many common characteristics including the power of “might” (i.e., that “might” be a better predictor of performance), “learning through failing” (which is the only way to determine where the edges of the solution really reside), and the innovation liberation…

Added by Bill Schmarzo on April 28, 2019 at 11:54am — No Comments

Data Science Central Weekly Digest, April 29

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week. To subscribe, follow this link.  

Announcement…

Added by Vincent Granville on April 28, 2019 at 8:30am — No Comments

Determining Number of Clusters in One Picture

If you want to determine the optimal number of clusters in your analysis, you're faced with an overwhelming number of (mostly subjective) choices. Note that there's no "best" method, no "correct" k, and there isn't even a consensus as to the definition of what a "cluster" is. With that said, this picture focuses on three popular methods that should fit almost every need: Silhouette, Elbow, and Gap Statistic.…

Added by Stephanie Glen on April 28, 2019 at 12:30am — No Comments

Optimizing price, maximizing revenue

Problem statement

Price and quantity sold are the two determinants of business revenue and profit. At a higher price, revenue is expected to be higher, but this is not always the case. We know from everyday experience that as the price of something goes up, people are less inclined to buy it.

The reverse is also true: as the price goes down, sales go up (think of what happens in a blockbuster sales event in a nearby shopping…

Added by Mab Alam on April 27, 2019 at 6:47am — No Comments
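
The trade-off described in the pricing teaser above can be illustrated with a small sketch. It is not taken from the post: it assumes a hypothetical linear demand curve (the `base_demand` and `slope` values are made up) and simply scans a grid of candidate prices to find the one with the highest revenue, where revenue is price times quantity sold.

```python
def quantity_sold(price, base_demand=100.0, slope=2.0):
    """Hypothetical linear demand: units sold fall as the price rises."""
    return max(base_demand - slope * price, 0.0)


def revenue(price, base_demand=100.0, slope=2.0):
    """Revenue is price multiplied by the quantity sold at that price."""
    return price * quantity_sold(price, base_demand, slope)


# Candidate prices from 0.0 to 50.0 in steps of 0.5 (an arbitrary grid).
prices = [step * 0.5 for step in range(0, 101)]

# Pick the candidate price with the highest revenue.
best_price = max(prices, key=revenue)

print(f"best price: {best_price}, revenue: {revenue(best_price)}")
```

Under these assumed numbers, raising the price beyond the best candidate lowers revenue because the drop in quantity outweighs the higher price per unit. A real analysis would estimate the demand curve from sales data instead of assuming it.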

Common MapReduce Patterns

This article by Chanchal Singh and Manish Kumar will delve into some of the common MapReduce patterns that will help you work with Hadoop. Chanchal Singh has more than five years of experience in product development and architect design, and Manish Kumar is a technical architect with more than ten years of experience in data management, working as a data architect and product architect.

Design patterns are solution templates for solving specific problems. Developers…

Added by Packt Publishing on April 26, 2019 at 7:27pm — No Comments

Top 5 Books on AI and ML to Grab Today

It has been popularly noted that artificial intelligence would be like the ultimate version of Google. With recent advancements in research and technology, Artificial Intelligence (AI) and Machine Learning (ML) are slowly becoming a part of our routine.

The pace at which technology is growing is unfathomable. As these smart technologies engulf our lives, staying updated with them is the need of the day. So, here's Packt's selection of the finest books in artificial intelligence and machine…

Added by Packt Publishing on April 26, 2019 at 7:13pm — No Comments

How will the Data Scientist’s job change through automated machine learning?

Introduction

Automated machine learning is a fundamental shift in machine learning and data science. Data science, as it stands today, is resource-intensive, expensive and challenging. It requires skills that are in high demand. Automated machine learning may not quite lead to the beach lifestyle for the data…

Added by ajit jaokar on April 26, 2019 at 10:51am — No Comments

Naive Bayes in One Picture

Naive Bayes is a deceptively simple way to find answers to probability questions that involve many inputs. For example, if you're a website owner, you might be interested to know the probability that a visitor will make a purchase. That question has a lot of "what-ifs", including time on page, pages visited, and prior visits. Naive Bayes essentially allows you to take the raw inputs (i.e. historical data), sort the data into more meaningful chunks, and input them into a formula. …

Added by Stephanie Glen on April 25, 2019 at 10:00am — No Comments
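
As a rough illustration of the Naive Bayes teaser above, the sketch below estimates the probability that a website visitor makes a purchase. It is not the picture or formula from the post: the visitor history is invented, the feature names are hypothetical, and add-one (Laplace) smoothing is an assumption added here so that unseen values do not produce a zero probability.

```python
from collections import Counter, defaultdict

# Hypothetical history: (time_on_page, prior_visit, purchased).
history = [
    ("long", "yes", True),
    ("long", "yes", True),
    ("long", "no", True),
    ("short", "yes", False),
    ("short", "no", False),
    ("long", "no", False),
    ("short", "no", False),
    ("short", "yes", True),
]


def train(rows):
    """Count how often each class and each feature value per class occurs."""
    class_counts = Counter()
    feature_counts = defaultdict(Counter)  # (label, feature index) -> value counts
    for *features, label in rows:
        class_counts[label] += 1
        for index, value in enumerate(features):
            feature_counts[(label, index)][value] += 1
    return class_counts, feature_counts


def predict_proba(features, class_counts, feature_counts, alpha=1.0, n_values=2):
    """Return P(label | features), treating features as independent given the label.

    alpha is the add-one smoothing constant; n_values is the number of
    distinct values each feature can take (two in this toy data set).
    """
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(label)
        for index, value in enumerate(features):
            seen = feature_counts[(label, index)][value]
            score *= (seen + alpha) / (count + alpha * n_values)
        scores[label] = score
    norm = sum(scores.values())
    return {label: score / norm for label, score in scores.items()}


class_counts, feature_counts = train(history)
probs = predict_proba(("long", "yes"), class_counts, feature_counts)
print(f"P(purchase | long visit, returning visitor) = {probs[True]:.2f}")
```

The "naive" part is the product inside `predict_proba`: each input contributes its own factor, as if time on page and prior visits were unrelated once the outcome is known.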

Funny: Medical Diagnostic and Treatment Algorithm - IBM Watson

Interesting cartoon featuring the decision tree used in medical diagnosis. To see other cartoons about data science, follow this link.…

Added by Capri Granville on April 25, 2019 at 9:00am — 1 Comment

Free Book: Lecture Notes on Machine Learning

Lecture notes for the Statistical Machine Learning course taught at the Department of Information Technology, University of Uppsala (Sweden). Updated in March 2019. Authors: Andreas Lindholm, Niklas Wahlström, Fredrik Lindsten, and Thomas B. Schön.

Source: page 61 in these lecture notes…

Added by Capri Granville on April 25, 2019 at 9:00am — No Comments

Frequencies in Pandas Redux

A little less than a year ago, I posted a blog on generating multivariate frequencies with the Python Pandas data management library, at the same time showcasing Python/R graphics interoperability. For my…

Added by steve miller on April 25, 2019 at 5:33am — No Comments

What is the Difference Between Hadoop and Spark?

Hadoop and Spark are software frameworks from the Apache Software Foundation that are used to manage "Big Data". There is no particular threshold size that classifies data as "big data", but in simple terms, it is a data set so high in volume, velocity or variety that it cannot be stored and processed by a single computing…

Added by Divya Singh on April 24, 2019 at 8:30pm — No Comments


© 2019 Data Science Central ®
