
February 2017 Blog Posts (85)

Book: Data for Business Performance

Long title: The Goal-Question-Metric (GQM) Model to Transform Business Data into an Enterprise Asset.

Today, digitization is dramatically changing the business landscape, and many progressive organizations have started to treat data as a valuable business asset. While many enterprises are investing in improved data management, only a…


Added by Emmanuelle Rieuf on February 22, 2017 at 12:30pm — No Comments

How and Why: Decorrelate Time Series

When dealing with time series, the first step consists of isolating trends and periodicities. Once this is done, we are left with a normalized time series, and studying the auto-correlation structure is the next step, called model fitting. The purpose is to check whether the underlying data follows some well-known stochastic process with a similar auto-correlation structure, such as ARMA processes, using tools such as…
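
As a rough illustration of the workflow this teaser describes (not the author's code), the sketch below removes a linear trend by first differencing and then computes the sample auto-correlation at a few lags. The function names `difference` and `autocorrelation` and the toy series are assumptions made for this example only.

```python
def difference(series):
    """Return first differences x[t] - x[t-1], which removes a linear trend."""
    return [b - a for a, b in zip(series, series[1:])]


def autocorrelation(series, lag):
    """Sample auto-correlation of `series` at the given non-negative lag."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    centered = [x - mean for x in series]
    variance = sum(c * c for c in centered)
    if variance == 0:
        raise ValueError("series is constant; auto-correlation is undefined")
    # Sum of lagged products, normalized by the lag-0 sum of squares.
    covariance = sum(centered[t] * centered[t + lag] for t in range(n - lag))
    return covariance / variance


if __name__ == "__main__":
    # Toy series (illustrative data only): linear trend plus an alternating term.
    raw = [0.5 * t + (1 if t % 2 == 0 else -1) for t in range(20)]
    detrended = difference(raw)
    for lag in range(4):
        print(lag, round(autocorrelation(detrended, lag), 3))
```

Auto-correlations that stay large in absolute value at many lags suggest remaining structure that a model such as an ARMA process might capture.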


Added by Vincent Granville on February 21, 2017 at 11:00pm — No Comments

For tax purposes, how do you define a robot?

More and more people are talking about the new economy, and in particular, the role played by robots. As jobs are being eliminated and replaced by robots, governments are losing tax money. There are discussions as to whether robots should be taxed. …


Added by Vincent Granville on February 21, 2017 at 5:00pm — 1 Comment

In Search of Artificial General Intelligence (AGI)

Summary:  Looking beyond today’s commercial applications of AI, where and how far will we progress toward an Artificial Intelligence with truly human-like reasoning and capability?  This is about the pursuit of Artificial General Intelligence (AGI).


There is no question that we’re making a lot of progress in artificial intelligence (AI).  So much so that we are rapidly approaching or have already arrived at a plateau in development where more effort is…


Added by William Vorhies on February 21, 2017 at 8:30am — No Comments

Data Matching – Entity Identification, Resolution & Linkage

Data matching is the task of identifying, matching, and merging records that correspond to the same entities from several source systems. The entities under consideration most commonly refer to people, places, publications or citations, consumer products, or businesses. Besides data matching, the names most prominently used are record or data linkage, entity resolution, object identification, or field matching.
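
A minimal sketch of the matching task described above, assuming each record is a dictionary with hypothetical `id` and `name` fields, and that plain string similarity from Python's standard `difflib.SequenceMatcher` with an arbitrary threshold of 0.85 is good enough for illustration. It is not the method from the original post.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(name.lower().split())


def similarity(a, b):
    """Similarity ratio in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_records(source_a, source_b, threshold=0.85):
    """Pair each record in source_a with its best-scoring record in source_b,
    keeping the pair only if the score reaches the (assumed) threshold."""
    matches = []
    for rec_a in source_a:
        best, best_score = None, 0.0
        for rec_b in source_b:
            score = similarity(rec_a["name"], rec_b["name"])
            if score > best_score:
                best, best_score = rec_b, score
        if best is not None and best_score >= threshold:
            matches.append((rec_a["id"], best["id"], round(best_score, 2)))
    return matches
```

Comparing every pair of records is quadratic in the number of records, so this only suits small inputs.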

A major challenge in data matching is the lack of common entity…


Added by Raghavan Madabusi on February 20, 2017 at 2:30pm — No Comments

Python vs R: 4 Implementations of Same Machine Learning Technique

Actually, this is about two R versions (standard and improved), a Python version, and a Perl version of a new machine learning technique recently published here. We asked for help to translate the original Perl script to Python and R, and finally decided to work with …


Added by Vincent Granville on February 20, 2017 at 1:30pm — 4 Comments

Storytelling And Bot Making

Posted with permission from author: Vaisagh Viswanathan

Chatbot frameworks, toolkits and the rest make it easy to build bots these days. And everyone seems to be building some kind of bot or other. How do you design a good chatbot? What does that have to do with storytelling?…


Added by Sudhanshu Ahuja on February 20, 2017 at 4:30am — No Comments

Top Hadoop Interview Questions & Answers

Q1. What exactly is Hadoop?

A1. Hadoop is a Big Data framework that processes huge amounts of data of different types in parallel to achieve performance benefits.

Q2. What are the 5 Vs of Big Data?

A2. Volume – Size of the data

Velocity – Speed of change of data

Variety – Different types of data: structured, semi-structured, and unstructured.

Q3. Give me examples of unstructured data.

A3. Images, videos, audio, etc.

Q4. Tell me about Hadoop file system…


Added by Sarvesh Kumar on February 20, 2017 at 1:30am — No Comments

Executive Guide to Artificial Intelligence

Only Homo sapiens, of all the descendants of Homo erectus, survived on Earth, whereas other species such as Homo soloensis, Homo denisova, Homo neanderthalensis, and Homo floresiensis faded away more than 40,000 years ago. What advantages did Homo sapiens possess that helped them to flourish while other species are extinct? Apparently a cognitive revolution (according to Prof. Yuval Harari in his famous book Sapiens) triggered by some kind of genetic mutation provided Homo Species with more…


Added by Amith Parameshwara on February 19, 2017 at 7:30am — No Comments

Timestamp Data Visualization by Matplotlib

Large volumes of timestamp data are a reality, especially when dealing with networked devices. Typically, a network of devices generates a large number of alerts. Mining the alert dataset provides insights about the network.

Recently, I came across a situation where a business user was looking for a multidimensional visualization of timestamp data. The data was about a network of more than a thousand devices and the alarms generated by those devices about the status of the network - …


Added by Jishnu Bhattacharya on February 19, 2017 at 7:00am — 3 Comments

Weekly Digest, February 20

Monday newsletter published by Data Science Central. Previous editions can be found here.  The contribution flagged with a + is our selection for the picture of the week.

Upcoming DSC Webinar


Added by Vincent Granville on February 18, 2017 at 10:30am — No Comments

Analytics as Value lever in Oil and Gas industry

Over the decades, oil and gas companies have built their core skills in many areas such as engineering innovation, project execution, process management, and risk management. These core capabilities have traditionally served as the key value levers for companies in this sector. As the benefits from these levers reach a plateau, along with pressure from oil prices, policy risks, political risks, etc., these companies are looking at fortifying these levers with big data analytics as well as…


Added by Amith Parameshwara on February 18, 2017 at 9:30am — No Comments

Internal Capacity, External Demand, and the Metrics of Consumption

In my blogs, I often distinguish between event data and metrics.  I usually say something to the effect that events help to explain the metrics - or events “provide the story behind the metrics.”  In this blog, I will be discussing two competing lines of thought behind events:  internal capacity and external demand.  Why do sales appear much lower for the month of June compared to July?  Some explanations relating to internal capacity are as follows:  “There weren’t enough agents in June to…


Added by Don Philip Faithful on February 18, 2017 at 6:30am — No Comments

Big Data analytics in India - an opportunity worth choosing

Big Data, in other terms, is known as IoT (Internet of Things). Technically, Big Data can be defined as a huge amount of data, but it has a broader meaning. If you consider IoT, Big Data refers to devices, data and connectivity. Big Data analytics has a growing market to fit into, and its career scope varies as well.…


Added by Great Learning on February 16, 2017 at 6:30pm — 2 Comments

The Twilight Zone Between True and False

Recently we have read a lot about fake news, alternative facts and journalistic lies. Companies like Facebook develop data science algorithms to detect these postings, based among other things on crowdsourcing (collective intelligence).

But can the data scientist, with her inquisitive mind and strong sense of numbers and probabilities, use her brain to assess how true a piece…


Added by Vincent Granville on February 16, 2017 at 4:30pm — No Comments

Thursday News: ML, Data Engineering, Python, Model Selection, AI

Here is our selection of featured articles and resources posted since Monday:


Added by Vincent Granville on February 16, 2017 at 9:00am — No Comments

The Mathematics of Machine Learning

Guest blog post by Wale Akinfaderin, PhD Candidate in Physics. 

In the last few months, I have had several people contact me about their enthusiasm for venturing into the world of data science and using Machine Learning (ML) techniques to probe statistical regularities and build impeccable data-driven products. However, I've observed that some actually lack the necessary mathematical intuition and…


Added by Vincent Granville on February 15, 2017 at 8:00pm — 7 Comments

How Uber Depends on Data Analytics to Deliver Extreme Customer Service – Face To Face With Uber’s Chief Data Architect

From a simple limo-hailing app for friends to the world’s go-to taxi app: Uber’s growth in its approximately 7 years of existence can be described in one word, “phenomenal”.

But there’s another way to define Uber, one that not many have given thought to. Uber is a Big Data company, on a par with the likes of Google and Amazon. It not only uses existing…


Added by Raj Dalal on February 14, 2017 at 7:00pm — No Comments

Indicator Based Recommenders – The One We Missed

Summary:  In our recent article on “5 Types of Recommenders” we failed to mention Indicator-Based Recommenders.  These have some unique features and ease of implementation that may be important in your selection of a recommender strategy.


A few weeks ago in the midst of our series on recommenders we published an article “5 Types of Recommenders” in which…


Added by William Vorhies on February 14, 2017 at 9:38am — 1 Comment

Data Engineering vs. Data Science Infographic

If you're interested in the field of analytics, you've probably heard the terms Data Engineering and Data Science, but do you know the difference? Although there has historically been considerable overlap between the two professions, they are becoming more distinct. Here is an infographic to help you understand the skills and responsibilities of each role. You'll also get a chance to compare salaries, popular software and tools used by each, and some educational resources to…


Added by Jake Moody on February 14, 2017 at 8:30am — 2 Comments


© 2020   Data Science Central ®