All Blog Posts (6,143)

Seduced by the Big Data meme: Hadoop vs the Public Cloud


Currently, Cloudera is in the news for all the wrong reasons (Cloudera stock down 42%).

Since Cloudera now also incorporates Hortonworks, the current issues are just the latest in the Big Data woes. Apparently, the third vendor…


Added by ajit jaokar on June 10, 2019 at 10:30am — No Comments

Big Ideas in AI for the Next 10 Years

Summary: Despite our concerns about China taking the lead in AI, our own government's efforts, mostly through DARPA, continue to provide powerful leadership and funding to maintain our lead. Here’s their plan to maintain that lead over the next decade.

Think all those great ideas that have powered AI/ML for the…


Added by William Vorhies on June 10, 2019 at 8:28am — No Comments

Alternatives to R-squared (with pluses and minuses)

R-squared can help you answer the question "How does my model perform, compared to a naive model?". However, R-squared is far from a perfect tool. Probably the main issue is that every data set contains a certain amount of unexplainable data. R-squared can't tell the difference between the explainable and the…
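To make the comparison with a naive model concrete, here is a minimal sketch in plain Python. It assumes the usual definition, R-squared = 1 - SS_res / SS_tot, where the naive model always predicts the mean of the observed values. The function name `r_squared` and the sample numbers are illustrative only and do not come from the post.

```python
def r_squared(y_true, y_pred):
    """Return 1 - SS_res / SS_tot for paired observed and predicted values.

    The naive baseline is a model that always predicts the mean of y_true,
    so a score of 0 means "no better than the mean" and 1 means a perfect fit.
    """
    mean_y = sum(y_true) / len(y_true)
    # Squared error of the model under evaluation.
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # Squared error of the naive mean-only model.
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative data (made up for this sketch).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
naive_predictions = [3.0] * len(observed)      # always predict the mean
model_predictions = [1.5, 2.5, 2.5, 3.5, 4.5]  # a hypothetical model

naive_score = r_squared(observed, naive_predictions)
model_score = r_squared(observed, model_predictions)
print(naive_score, model_score)
```

The naive predictions score exactly zero by construction, which is why R-squared reads as "improvement over the mean". It says nothing about how much of the remaining error is explainable at all.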


Added by Stephanie Glen on June 10, 2019 at 5:30am — No Comments

Interesting Type of Chart: Hexagonal Binning

This chart communicates the same insights as a contour plot. What is interesting is the choice of hexagonal buckets (rather than squares) to aggregate data. In fact, any tessellation would work, in particular Voronoi tessellations.…


Added by Capri Granville on June 9, 2019 at 8:00am — 1 Comment

Hiring the right data scientist for the organisation

Any organisation needs talented, hardworking and skilled employees, irrespective of department, business unit or team. But finding and nurturing such talent can be challenging. In the data science field, with rapid change and growing demand for the technology, many organisations have set up data science teams. A successful data science team has three major strengths: (A) availability of data, (B) infrastructure and, most importantly, (C) the “right” data scientists.



Added by Rohit Walimbe on June 9, 2019 at 6:03am — No Comments

Data Science Central Monday Digest, June 10

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week. To subscribe, follow this link.  



Added by Vincent Granville on June 9, 2019 at 5:30am — No Comments

Growth Modeling for Business Managers and Executives

You don't need a sophisticated model or advanced machine learning techniques to quickly get a high-level picture and trends for bottom-line business metrics. Not only are the concepts explained here easy to grasp, but the approach, while high level, nevertheless includes granular effects. The methodology presented here was used in business contexts in the past, when I was working with enterprise executives, particularly finance people, to assess the overall health of their business, and the…


Added by Vincent Granville on June 8, 2019 at 4:49pm — No Comments

Free Book: Azure Machine Learning in a Weekend

By Ajit Jaokar and Ayse Mutlu. 

Exclusively for Data Science Central members, with free access. You can download this book (PDF) here

This tutorial is the second book in the ‘in a weekend’ series – after Classification and Regression…


Added by Vincent Granville on June 8, 2019 at 5:22am — No Comments

What is DataOps and Why It’s Critical to the Data Monetization Value Chain

In my previous blog “How DevOps Drives Analytics Operationalization and Monetization”, I discussed the critical and complementary role of DevOps in operationalizing and monetizing the analytics that come out of the Data Science development process. While the combination of Design Thinking and Data Science accelerates the creation of more effective, more…


Added by Bill Schmarzo on June 6, 2019 at 11:14pm — No Comments

Data Science Central Thursday Digest, June 6

Here is our selection of featured articles and technical resources posted since Monday:



Added by Vincent Granville on June 6, 2019 at 9:30am — No Comments

Hunting for Data: a Few Words on Data Scraping

No matter how intelligent and sophisticated your technology is, what you ultimately need for Big Data Analysis is data. Lots of data. Versatile and coming from many sources in different formats. In many cases, your data will come in a machine-readable format…


Added by Max Ved on June 6, 2019 at 1:29am — No Comments

How to Make Machine Learning Models for Beginners


Data science is one of the hottest topics of the 21st century because we are generating data at a much higher rate than we can actually process. Many business and tech firms are now gaining key benefits by harnessing data science. As a result, data science is booming right now.

In this blog, we will take a deep dive into the world of machine learning. We will walk you…


Added by Divya Singh on June 4, 2019 at 8:30pm — No Comments

Neuromorphic Chips and the Future of Your Cell Phone

Summary:  The ability to train large scale CNNs directly on your cell phone without sending the data round trip to the cloud is the key to next gen AI applications like real time computer vision and safe self-driving cars.  Problem is our current GPU AI chips won’t get us there.  But neuromorphic chips look like they will.



Added by William Vorhies on June 4, 2019 at 9:00am — No Comments

Unleashing Artificial Intelligence in Government Services and Operations

Government is a significant sector with direct influence on our lives, through both the services it offers to citizens and its own operations.

Given how far the data-driven approach has penetrated all businesses, there is a clear need to adopt a data-driven approach to government services and operations. Hence the penetration of AI can have a positive influence on government products and services.

Some key areas where government operations and services can…


Added by Mahesh Kumar CV on June 4, 2019 at 6:23am — No Comments

Why Every Hadoop Professional Needs Data Science Skills?

Value of adopting Data Science Skills

Data Science is responsible for providing meaning to the large amounts of complex data called big data. It involves different fields of work in statistics and computation to interpret data for decision-making.

Advances in the internet and social media are increasing access to big data. Extraction of meaningful information requires the use of AI and ML by data science. Big data is used in every…


Added by Yoey Thamas on June 4, 2019 at 2:33am — No Comments

NLP vs. NLU: from Understanding a Language to Its Processing

As artificial intelligence progresses and technology becomes more sophisticated, we expect existing concepts to embrace this change — or change…


Added by Max Ved on June 4, 2019 at 12:30am — No Comments

Discrimination, Data Science and a Call to Action

Four years ago, the software engineer Jacky Alciné caused a storm by pointing out to Google that their algorithm had the unsavoury tendency to classify his black friends as gorillas. Following a public outcry over blatant racism, the giant apologised and diligently ‘fixed’ the problem. Last year Amazon got into hot water when it found that its advanced AI hiring software heavily favoured men for technical positions. Again, retraction followed the outcry. In a more newsworthy style, an unfortunate…


Added by Dany Majard on June 3, 2019 at 3:10am — 2 Comments

Simple Trick to Normalize Correlations, R-squared, and so on

Many statistics, such as correlations or R-squared, depend on the sample size, making it difficult to compare values computed on two data sets of different sizes. Here, we address this issue.

Below is an example with 20 observations. The last 10 observations (the second half of the data set) are a mirror of the first 10, and the two correlations, computed on each subset, are identical and equal to 0.30. The full correlation computed on the 20 observations is 0.85.…


Added by Vincent Granville on June 2, 2019 at 7:30am — No Comments

The Call for a New Device for Data Scientists

My first computer was a Commodore VIC-20 in 1981. I bought the device because of this incredible urge to program in BASIC as a result of Mr. Ted Becker’s course on computer programming. I vaguely remember the leap from the painstaking process of programming using punch cards to writing code and watching your program run immediately, once you resolved all of the syntax errors of course. Nonetheless, it was thrilling and addictive! In hindsight, a…


Added by Richard Charles, PhD on June 2, 2019 at 12:00am — No Comments

The Homogeneity and Location Index: An open-source Statistical Framework for the classification of ordinal categorical data

The analysis and classification of ordinal categorical data are central in most scientific domains and ubiquitous in governments and businesses.

Examples of ordinal data are found in questionnaires that measure opinions or self-reported health status. A well-known example of ordinal data is the Likert Scale [1].



Added by Ludovico Pinzari on June 1, 2019 at 3:35pm — No Comments
