Featured Blog Posts – April 2018 Archive (75)

Why Are Data Science Leaders Running for the Exit?

Guest blog post by Edward Chenard, Contributor at DataScience.com.

I've had several conversations recently with people I know in the data science space that always start out about business and then drift to the state of data science as a whole. One theme constantly comes up in these conversations: There are a lot of people currently running data…


Added by Vincent Granville on April 16, 2018 at 5:30pm — 17 Comments

Elements of Modern Data Science, AI, Big Data and ML

Guest blog post by Michael Li, Head of Analytics and Data Science at LinkedIn.

I’m sure everyone who has been following tech industry news knows about “big data” and “AI.” Although there is no industry-consistent definition for either term, most people tend to agree…


Added by Vincent Granville on April 16, 2018 at 5:00pm — 4 Comments

Unsupervised Learning an Angle for Unlabelled Data World

This is our second post in this sub series “Machine Learning Types”. Our master series for this sub series is “Machine Learning Explained”.

Unsupervised Learning is one of three types of machine learning, i.e. Supervised Machine Learning, Unsupervised…


Added by Vinod Sharma on April 16, 2018 at 8:00am — No Comments

Data Dictionary to Meta Data II -- Simple Text Wrangling and Factor Creation in R

My blog last week articulated a first shot at automating the creation of meta data…


Added by steve miller on April 16, 2018 at 6:30am — No Comments

44 Original Data Science and Machine Learning Articles

Written exclusively for Data Science Central, by Vincent Granville. These articles are intended for non-experts, written in simple English, and particularly suited for professionals managing a data science team, or for practitioners interested in the field of data science and machine learning. These articles (and more) will soon be combined in several booklets available exclusively for…


Added by Vincent Granville on April 14, 2018 at 11:00am — No Comments

Weekly Digest, April 16

Monday newsletter published by Data Science Central. Previous editions can be found here.  The contribution flagged with a + is our selection for the picture of the week.

  • Optimize your career with UVA’s M.S. in Business Analytics. In just twelve months, you’ll develop the skills you need to leverage analytics and drive…

Added by Vincent Granville on April 14, 2018 at 9:00am — No Comments

Technical Boundary Analysis

About a month ago, I posted a blog on “Technical Deconstruction.” I described this as a technique to break down aggregate data to distinguish between its contributing parts: these parts might contain unique characteristics compared to the aggregate.  For instance, I suggested that it can be helpful to break down data by workday - that is to say, maintaining separate data for each day of the week.  I said that the data could be further deconstructed perhaps by time period and employee: the…


Added by Don Philip Faithful on April 14, 2018 at 8:00am — No Comments

Simple Trick to Prevent Cambridge Analytica and Others to Hack into Facebook Data

Cambridge Analytica was caught tampering with elections by exploiting Facebook, but chances are that this is the tip of the iceberg, and that many others, including scammers and ID thieves, are also exploiting Facebook and other social networks. One way that they do this is as follows.

Cambridge Analytica website (front page) -…


Added by Vincent Granville on April 14, 2018 at 7:30am — No Comments

Tutorial: Multistep Forecasting with Seasonal ARIMA in Python

When trend and seasonality are present in a time series, instead of decomposing it manually to fit an ARMA model using the Box-Jenkins method, another very popular method is to use the seasonal autoregressive integrated moving average (SARIMA) model, which is a generalization of an ARMA model. SARIMA models are denoted SARIMA(p,d,q)(P,D,Q)[S], where S refers to the number of periods in each season, d is the degree of differencing (the number of times the…
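To make the differencing part of the notation concrete, here is a minimal sketch in plain Python. It is not taken from the tutorial: the helper names `difference` and `sarima_differencing` are hypothetical, and it assumes the standard reading of the notation in which d counts ordinary (lag-1) differences and D counts seasonal (lag-S) differences.

```python
def difference(series, lag=1):
    """Return the lag-differenced series: x[t] - x[t - lag]."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def sarima_differencing(series, d=0, D=0, S=1):
    """Apply D seasonal differences at lag S, then d ordinary differences.

    Illustrative helper only (hypothetical name); it covers just the
    differencing orders d and D, not the AR or MA terms of a SARIMA model.
    """
    out = list(series)
    for _ in range(D):
        out = difference(out, lag=S)
    for _ in range(d):
        out = difference(out, lag=1)
    return out


# Toy series: a linear trend plus a bump every 4th period (so S = 4).
toy = [t + (10 if t % 4 == 0 else 0) for t in range(12)]
seasonally_differenced = sarima_differencing(toy, D=1, S=4)
fully_differenced = sarima_differencing(toy, d=1, D=1, S=4)
```

On this toy series, one seasonal difference removes the repeating bump and leaves a constant, and one further ordinary difference removes that remaining level.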


Added by Kostas Hatalis on April 12, 2018 at 10:30am — 1 Comment

Book: Blockchain Basics: A Non-Technical Introduction in 25 Steps

In 25 concise steps, you will learn the basics of blockchain technology. No mathematical formulas, program code, or computer science jargon are used. No previous knowledge in computer science, mathematics, programming, or cryptography is required. Terminology is explained through pictures, analogies, and metaphors.

This book bridges the gap that exists between purely technical books about the blockchain and purely business-focused books. It does so by explaining both the technical…


Added by Capri Granville on April 12, 2018 at 6:30am — 1 Comment

Machine Learning with Signal Processing Techniques

Stochastic Signal Analysis is a field of science concerned with the processing, modification and analysis of (stochastic) signals.

Anyone with a background in Physics or Engineering knows to some degree about signal analysis techniques, what these techniques are and how they can be used to analyze, model and classify signals.

Data Scientists coming from different fields, like Computer Science or Statistics, might not be aware of the analytical power these techniques bring with…


Added by Ahmet Taspinar on April 12, 2018 at 6:00am — No Comments

From Data Dictionary to Meta Data with Simple Text Wrangling in Python

My last DSC blog left me a bit disappointed. While the loads of the beefy household and population files for the American Community Survey worked well, the data, just about entirely integer, represents categorical attributes whose meta info is not…


Added by steve miller on April 11, 2018 at 12:00pm — No Comments

Learning Rules in Neural Network

What are the Learning Rules in Neural Network?

A learning rule, or learning process, is a method or mathematical logic that improves an Artificial Neural Network's performance when applied over the network. Learning rules update the weights and bias levels of a network when the network is simulated in a specific data environment.

Applying a learning rule is an iterative process. It helps a…


Added by Sheetal Sharma on April 10, 2018 at 7:00pm — No Comments

Natural Language Understanding (NLU) in Fraud Risk Management – a case study

I.  Introduction

This is a continuation of my previous blog, “Natural Language Understanding – Application Notes with Context Discriminant”. 


Natural Language Understanding (NLU) is a subtopic of Natural Language Processing (NLP). Successful implementations of NLU are difficult because of limitations in prevailing technology. SiteFocus solved these limitations with a new approach to NLU. This approach has been successfully…


Added by Sing Koo on April 10, 2018 at 1:30pm — 1 Comment

Is Your Organization Ready for Data Science?

Summary:  To take advantage of data science, an organization needs to consider its data quality and accessibility, and the willingness of its staff to use the results of data analysis.  Most importantly, an organization must have a clear understanding of how it expects to benefit from data science.


Can data science benefit your organization?  If so, is your organization ready to take advantage of it?


“Data science” has…


Added by Stephen R Poulin on April 10, 2018 at 9:00am — 1 Comment

Automated Deep Learning – So Simple Anyone Can Do It

Summary:  There are several things holding back our use of deep learning methods and chief among them is that they are complicated and hard.  Now there are three platforms that offer Automated Deep Learning (ADL) so simple that almost anyone can do it.


There are several things holding back our use of deep learning methods…


Added by William Vorhies on April 10, 2018 at 8:18am — 1 Comment

Researching and introducing kinetic energies

Kinetic energy (also called Information Energy) for random vectors (features) is basically the probabilistic analogue of kinetic energy from physics. Some people say it is an entropy, just like the Shannon entropy that measures bits of information to determine uncertainty. It is also an entropy, but the correct way to think about it is as the 1/2 * m * v^2 of a random vector.

It was discovered by Octav Onicescu and is described as a simple sum of squared probabilities. For a trivial…
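The "sum of squared probabilities" description can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the author's code: the function name `informational_energy` is hypothetical, and the probabilities are taken to be the empirical frequencies of the observed values of a discrete feature.

```python
from collections import Counter


def informational_energy(values):
    """Sum of squared empirical probabilities of the observed values.

    Hypothetical helper: probabilities are estimated as relative
    frequencies, which is an assumption made for this sketch.
    """
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")
    counts = Counter(values)
    return sum((count / n) ** 2 for count in counts.values())


constant_feature = informational_energy(["a", "a", "a", "a"])
balanced_feature = informational_energy(["a", "a", "b", "b"])
```

A feature that always takes the same value gets the maximum energy of 1, while spreading the probability evenly over more values lowers it, which is the opposite direction to Shannon entropy.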


Added by Daia Alexandru on April 10, 2018 at 7:30am — No Comments

A New Way to Compare and Reorganize Data

In real-world daily business routines, it is common for data that comes from different sources to have the same structure. Sometimes each set of data is independent and there is no overlap, like the sales data each branch office exports from its own database. Other times data overlaps heavily. In a common complete business process, it is most probable that all systems and sections input data based on their store of data. To compare the overlapped data and find and…


Added by JIANG Buxing on April 10, 2018 at 12:30am — No Comments

Two Beautiful Mathematical Results - Part 2

In Part 1 of this article (see here) we featured the two results below, as well as a simple way to prove these formulas.

Here, we continue on the same topic, featuring and proving the formulas below, which are just the tip of the…


Added by Vincent Granville on April 9, 2018 at 8:00pm — No Comments

Is it ‘always’ necessary to treat outliers in a machine learning model?

Outliers are one of those issues we come across almost every day in machine learning modelling. Wikipedia defines outliers as “an observation point that is distant from other observations.” That means some minority cases in the data set are different from the majority of the data. I would like to classify outlier data into two main categories: Non-Natural and Natural.

The non-natural outliers are those which are caused by measurement errors,…


Added by Rohit Walimbe on April 9, 2018 at 2:30am — 1 Comment
