All Blog Posts (6,652)

Statistics for Data Science in One Picture

There's no doubt about it, probability and statistics is an enormous field, encompassing topics from the familiar (like the average) to the complex (…

Added by Stephanie Glen on December 9, 2019 at 7:48am — No Comments

Data Cleaning and Wrangling With R

Originally posted by Michael Grogan.

One of the big issues when working with data in any context is cleaning and merging datasets. You will often find yourself having to collate data across multiple files, and will need to rely on R to carry out operations that you would normally handle with functions like VLOOKUP in Excel.
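
To make the VLOOKUP comparison concrete, here is a minimal sketch of that kind of lookup-style merge. The original post works in R; this sketch is in Python using only the standard library. The `orders` and `customers` records, the field names, and the `left_join` helper are all hypothetical and made up for illustration.

```python
# Hypothetical data: two "files" already loaded as lists of records.
orders = [
    {"order_id": 1, "customer_id": "C1", "amount": 120.0},
    {"order_id": 2, "customer_id": "C2", "amount": 75.5},
    {"order_id": 3, "customer_id": "C9", "amount": 40.0},  # no matching customer
]
customers = [
    {"customer_id": "C1", "name": "Alice"},
    {"customer_id": "C2", "name": "Bob"},
]


def left_join(left, right, key):
    """Attach fields from `right` to each row of `left` that shares `key`.

    Rows of `left` without a match are kept, with the extra fields set
    to None (similar to a failed VLOOKUP returning a blank or #N/A).
    """
    # Build the lookup table once, keyed on the join column.
    lookup = {row[key]: row for row in right}
    # Every non-key column that appears in the right-hand table.
    extra_fields = sorted({f for row in right for f in row if f != key})

    merged = []
    for row in left:
        match = lookup.get(row[key], {})
        combined = dict(row)
        for field in extra_fields:
            combined[field] = match.get(field)
        merged.append(combined)
    return merged


merged = left_join(orders, customers, "customer_id")
for row in merged:
    print(row)
```

The design choice here is a left join: every row from the first table survives, so unmatched keys show up as missing values that can be inspected during cleaning rather than silently disappearing.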

The…

Added by Vincent Granville on December 8, 2019 at 7:50pm — No Comments

Weekly Digest, December 9

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week. To subscribe, follow this link.  …

Added by Vincent Granville on December 8, 2019 at 3:30pm — No Comments

List of Time Series Methods in One Picture

The picture below was found in some tweets posted by top data science influencers, though its origin is somewhat obscure. 

Many of these methods are described in Wikipedia. Many are also described on Data Science Central; see for instance…

Added by Capri Granville on December 8, 2019 at 12:30pm — No Comments

Regression analysis using Python

This article was written by Stuart Reid.

This tutorial covers regression analysis using the Python StatsModels package with Quandl integration. For motivational purposes, here is what we are working towards: a regression analysis program which receives multiple data-set names from Quandl.com, automatically downloads the data, analyses it, and plots the results in a new window.…

Added by Andrea Manero-Bastin on December 8, 2019 at 6:30am — No Comments

Event Distribution as a Subject of Ontological Recognition Criteria

The “Diagnostic and Statistical Manual of Mental Disorders” produced by the American Psychiatric Association is an interesting document from a conceptual standpoint. In order to count the number of individuals with a particular disorder and to make the numbers comparable regardless of source, there have to be clear criteria guiding the ontology. This document therefore serves an important ontological purpose. By the way, given that some dictionaries don’t…

Added by Don Philip Faithful on December 8, 2019 at 6:18am — No Comments

Python: Implementing a k-means algorithm with sklearn

Originally posted by Michael Grogan. 

Below is an example of how sklearn in Python can be used to develop a k-means clustering algorithm.

The purpose of k-means clustering is to partition the observations in a dataset into a specified number of clusters in order to aid analysis of the data. It is particularly valuable for data visualisation.
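
As a rough illustration of the partitioning idea, here is a minimal sketch of the standard k-means iteration (Lloyd's algorithm) in plain Python. It is not the code from the post, which uses sklearn; with scikit-learn the equivalent step would go through the `sklearn.cluster.KMeans` class and its `n_clusters` parameter. The sample `points`, the `kmeans` helper, and its `iterations` and `seed` parameters are assumptions made for illustration.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    distances = [squared_distance(point, c) for c in centroids]
    return distances.index(min(distances))


def kmeans(points, k, iterations=20, seed=0):
    """Partition `points` into `k` clusters; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k distinct observations chosen at random.
    centroids = rng.sample(points, k)

    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)

        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                mean = tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid

        if new_centroids == centroids:
            break  # assignments have stopped changing
        centroids = new_centroids

    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical 2-D observations forming two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(centroids)
print(labels)
```

The resulting `labels` list gives one cluster index per observation, which is what would typically be used to colour points in a scatter plot.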

This post explains how…

Added by Vincent Granville on December 6, 2019 at 12:22pm — No Comments

Visualizing New York City WiFi Access with K-Means Clustering

Visualization has become a key application of data science in the telecommunications industry.

Specifically, telecommunication analysis is highly dependent on the use of geospatial data. This is because telecommunication networks in themselves are geographically dispersed, and analysis of such dispersions can yield valuable insights regarding network…

Added by Vincent Granville on December 6, 2019 at 12:12pm — No Comments

Predicting Hotel Cancellations with Support Vector Machines and SARIMA

This is Part 1 of a three-part study on predicting hotel cancellations with machine learning. Originally posted by Michael Grogan.

Logistic Regression and SVM

Hotel cancellations can cause issues for many businesses in the industry. Not only is there the…

Added by Vincent Granville on December 6, 2019 at 11:30am — No Comments

Fighting Overfitting in Deep Learning

Problem

While training a model, we want to get the best possible result according to the chosen metric, and at the same time we want to keep a similar result on new data. The cruel truth is that we can’t get 100% accuracy. And even if we did, the result is still not…

Added by Igor Bobriakov on December 6, 2019 at 9:00am — No Comments

2020 Forecast: Turning IT into a Profit Center

Data is a unique economic asset; it never depletes, never wears out and can be used across an unlimited number of use cases at near zero marginal cost. Data in the hands of management and operational leadership can be used to drive material, financial, operational and customer impact.  And maybe the best part of this winning data equation? You already own the data! But unfortunately, data is the Rodney Dangerfield of corporate assets – it gets no respect!

To exploit the…

Added by Bill Schmarzo on December 6, 2019 at 5:30am — No Comments

IoT Trends: Know Anticipation of IoT Development & How It Shapes the Future

The Internet of Things (IoT) is one of the advanced technologies that have completely transformed the world. According to the latest report by Gartner, approximately 20 billion devices will be connected to the IoT by 2020, and this IoT-based product or…

Added by Manoj Rupareliya on December 5, 2019 at 11:46pm — No Comments

Thursday News, December 5

Here is our selection of featured articles and technical resources posted since Monday:

Upcoming Webinar

Technical Resources

Added by Vincent Granville on December 5, 2019 at 12:30pm — No Comments

Visually Explained: How Can Executives Grasp What Programming Is All About?

Quite often, non-technical executives have difficulty understanding what programming, on a very fundamental level, is all about. Because of that knowledge gap, they tend to hire experienced data professionals and overburden them with tasks for which they are hopelessly overqualified, such as ad-hoc SQL queries on CRM data: "You're the go-to guy for all things data, and we need the results for the board meeting tomorrow." That's a quite humbling and frustrating…

Added by Rafael Knuth on December 5, 2019 at 6:30am — No Comments

No Matter What You Call It, It’s all the Same Thing

Summary: A little history lesson about all the different names by which the field of data science has been known, and why, whatever you call it, it’s all the same thing.

A little reminiscence, or for those of you who have only recently become data scientists, a little history lesson.

Our profession of…

Added by William Vorhies on December 4, 2019 at 3:12pm — No Comments

Free open access book on Industry 4.0, factory automation and Edge

The Digital Shopfloor: Industrial Automation in the Industry 4.0 Era looks like a great free open access book by John Soldatos, Oscar Lazaro and Franco Antonio Cavadini.

The book deals with the transformation of the shop floor and the wider supply chain through the deployment of Industrial IoT.

Some of…

Added by ajit jaokar on December 3, 2019 at 12:20pm — No Comments

World’s Best Countries in the Big Data Industry You Must Watch Out For in 2020

By 2027, the big data market is estimated to grow to USD 103 billion, and by 2022 the global big data and analytics market is predicted to grow to USD 274 billion, according to Statista.

Scarce talent in the big data industry is being wooed with hefty pay packages, but these go only to those with extensive knowledge of big data tools and technologies.…

Added by Yoey Thamas on December 3, 2019 at 1:41am — No Comments

How to Discover and Classify Metadata using Apache Atlas on Amazon EMR

Introduction

The boundaries of the enterprise are becoming diffused. You have data on the network, on the endpoint, and on the cloud. Enabling visibility into your data flows is a critical first step to understanding which data is at risk for theft or misuse. You need to know what data you have, where it’s located, and why that data exists in order to properly protect it. This is where data discovery and data classification come into…

Added by Divya Singh on December 2, 2019 at 1:30am — 1 Comment

Reinforcement Learning to Reduce Building Energy Consumption

Heating, ventilation, and air conditioning of buildings alone accounts for nearly 40% of global energy demand [1].

The need for …

Added by Enrico Busto on December 1, 2019 at 10:30pm — No Comments

Challenges Faced by a Data Scientist and How to Overcome Them?

Data Scientist is regarded as the sexiest job of the 21st century. It is a high-paying, lucrative job that comes with a lot of responsibility and commitment. Any professional needs to master state-of-the-art skills and technologies to become a Data Scientist in the modern world. It is a profession where people from different disciplines can fit in, as there are a plethora of specialties embedded in a Data Scientist role.

Data Science is not a present-day…

Added by Divya Singh on December 1, 2019 at 8:00pm — No Comments
