Added by Stephanie Glen on December 9, 2019 at 7:48am — No Comments
Originally posted by Michael Grogan.
One of the big issues when working with data in any context is cleaning and merging datasets. You will often have to collate data across multiple files, and you will need to rely on R to carry out operations that you would normally perform with commands like VLOOKUP in Excel.
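The original post works in R; purely as an illustration of the VLOOKUP-style lookup it describes, here is a minimal Python sketch using only the standard library. The table contents, field names, and the `lookup_join` helper are all hypothetical and are not taken from the post.

```python
# Hypothetical lookup table: product code -> unit price (like the VLOOKUP range).
prices = {"A101": 9.99, "B202": 14.50}

# Hypothetical rows from a second file that need a price attached.
orders = [
    {"order_id": 1, "sku": "A101"},
    {"order_id": 2, "sku": "B202"},
    {"order_id": 3, "sku": "C303"},  # no match in the lookup table
]


def lookup_join(rows, table, key, new_field, default=None):
    """Return copies of rows with table[row[key]] added under new_field.

    Rows without a match keep all their fields and get `default`,
    which mirrors a left join (or a VLOOKUP that finds nothing).
    """
    return [{**row, new_field: table.get(row[key], default)} for row in rows]


merged = lookup_join(orders, prices, key="sku", new_field="price")
```

Unmatched rows are kept rather than dropped, so missing keys stay visible during data cleaning instead of disappearing silently.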
Added by Vincent Granville on December 8, 2019 at 7:50pm — No Comments
Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week. To subscribe, follow this link. …
Added by Vincent Granville on December 8, 2019 at 3:30pm — No Comments
The picture below was found in some tweets posted by top data science influencers, though its origin is somewhat obscure.
Many of these methods are described in Wikipedia. Many are also described on Data Science Central, see for instance…Continue
Added by Capri Granville on December 8, 2019 at 12:30pm — No Comments
This article was written by Stuart Reid.
This tutorial covers regression analysis using the Python StatsModels package with Quandl integration. For motivational purposes, here is what we are working towards: a regression analysis program which receives multiple data-set names from Quandl.com, automatically downloads the data, analyses it, and plots the results in a new window.…Continue
Added by Andrea Manero-Bastin on December 8, 2019 at 6:30am — No Comments
The “Diagnostic and Statistical Manual of Mental Disorders” produced by the American Psychiatric Association is an interesting document from a conceptual standpoint. In order to count the number of individuals with a particular disorder and to make the numbers comparable regardless of source, there have to be clear criteria guiding the ontology. This document therefore serves an important ontological purpose. By the way, given that some dictionaries don’t…Continue
Added by Don Philip Faithful on December 8, 2019 at 6:18am — No Comments
Originally posted by Michael Grogan.
The below is an example of how sklearn in Python can be used to develop a k-means clustering algorithm.
The purpose of k-means clustering is to partition the observations in a dataset into a specified number of clusters in order to aid analysis of the data. This makes it particularly valuable for data visualisation.
This post explains how…Continue
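As a rough illustration of the partitioning described above, the following is a minimal sketch using scikit-learn's `KMeans`. It is not the original post's code: the toy data, the choice of two clusters, and the variable names are assumptions made for this example.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical toy data: two well-separated groups of 2-D observations.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
group_b = rng.normal(loc=[5.0, 5.0], scale=0.5, size=(50, 2))
X = np.vstack([group_a, group_b])

# Partition the observations into a fixed number of clusters (k = 2 here).
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(X)

# Every observation now has a cluster label, and every cluster a centroid.
centers = model.cluster_centers_
```

The labels can then be used to colour a scatter plot of `X`, which is where the visualisation value mentioned above comes from.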
Added by Vincent Granville on December 6, 2019 at 12:22pm — No Comments
Visualization has become a key application of data science in the telecommunications industry.
Specifically, telecommunication analysis is highly dependent on the use of geospatial data. This is because telecommunication networks in themselves are geographically dispersed, and analysis of such dispersions can yield valuable insights regarding network…Continue
Added by Vincent Granville on December 6, 2019 at 12:12pm — No Comments
This is Part 1 of a three-part study on predicting hotel cancellations with machine learning. Originally posted by Michael Grogan.
Hotel cancellations can cause issues for many businesses in the industry. Not only is there the…Continue
Added by Vincent Granville on December 6, 2019 at 11:30am — No Comments
While training a model, we want the best possible result according to the chosen metric, and at the same time we want a similar result on new data. The cruel truth is that we can’t get 100% accuracy. And even if we did, the result is still not…Continue
Added by Igor Bobriakov on December 6, 2019 at 9:00am — No Comments
Data is a unique economic asset; it never depletes, never wears out and can be used across an unlimited number of use cases at near zero marginal cost. Data in the hands of management and operational leadership can be used to drive material, financial, operational and customer impact. And maybe the best part of this winning data equation? You already own the data! But unfortunately, data is the Rodney Dangerfield of corporate assets – it gets no respect!
To exploit the…Continue
Added by Bill Schmarzo on December 6, 2019 at 5:30am — No Comments
The Internet of Things (IoT) is one of the advanced technologies that have completely transformed the world. As per the latest report by Gartner, approximately 20 billion devices will be connected to the IoT by 2020 and this IoT based product or…Continue
Added by Manoj Rupareliya on December 5, 2019 at 11:46pm — No Comments
Here is our selection of featured articles and technical resources posted since Monday:
Added by Vincent Granville on December 5, 2019 at 12:30pm — No Comments
Quite often, non-technical executives have difficulty understanding what programming, on a very fundamental level, is all about. Because of that knowledge gap, they tend to hire experienced data professionals and overburden them with tasks for which they are hopelessly overqualified, such as ad-hoc SQL queries on CRM data: "You're the go-to guy for all things data, and we need the results for the board meeting tomorrow." That's quite a humbling and frustrating…Continue
Added by Rafael Knuth on December 5, 2019 at 6:30am — No Comments
Summary: A little history lesson about all the different names by which the field of data science has been called, and why, whatever you call it, it’s all the same thing.
Our profession of…Continue
Added by William Vorhies on December 4, 2019 at 3:12pm — No Comments
The Digital Shopfloor: Industrial Automation in the Industry 4.0 Era looks like a great free, open access book by John Soldatos, Oscar Lazaro and Franco Antonio Cavadini.
The book deals with the transformation of the shop floor and the wider supply chain by the deployment of Industrial IoT.
Added by ajit jaokar on December 3, 2019 at 12:20pm — No Comments
By 2027, the big data market is estimated to grow to USD 103 billion. And by 2022, the global big data and analytics market is predicted to grow to USD 274 billion, according to statistics from Statista.
Scarce talent in the big data industry is being wooed with hefty pay packages, but these go only to those with extensive knowledge of big data tools and technologies.…Continue
Added by Yoey Thamas on December 3, 2019 at 1:41am — No Comments
The boundaries of the enterprise are becoming diffused. You have data on the network, on the endpoint, and on the cloud. Enabling visibility into your data flows is a critical first step to understanding which data is at risk for theft or misuse. You need to know what data you have, where it’s located, and why that data exists in order to properly protect it. This is where data discovery and data classification come into…Continue
Heating, ventilation, and air conditioning of buildings alone accounts for nearly 40% of the global energy demand.
The need for …
Added by Enrico Busto on December 1, 2019 at 10:30pm — No Comments
Data Scientist is regarded as the sexiest job of the 21st century. It is a high-paying, lucrative job that comes with a lot of responsibility and commitment. Any professional needs to master state-of-the-art skills and technologies to become a Data Scientist in the modern world. It is a profession where people from different disciplines can fit in, as there are a plethora of specialties embedded in a Data Scientist role.
Data Science is not a present-day…Continue
Added by Divya Singh on December 1, 2019 at 8:00pm — No Comments