I have just finished and released a free new R video lecture demonstrating how to use the “Bizarro pipe” to debug magrittr pipelines. I think R dplyr users will really enjoy it.
In this video lecture I use the “Bizarro pipe” to debug the example pipeline from RStudio’s purrr announcement.
TLDnW (too long, did…Continue
Added by John Mount on January 31, 2017 at 10:30pm
Big Data can be intimidating! If you are new to Big Data, please read ‘What is Big Data’, ‘…Continue
Today Trump met with leaders of pharmaceutical companies, to discuss “astronomical” drug prices and reduce regulations, so that drug companies can still make hefty profits while charging less for drugs. The motivation could be to keep the costs of healthcare down to facilitate the…Continue
This is part of a new series of articles: once or twice a month, we post previous articles that were very popular when first published. These articles are at least 6 months old but no more than 12 months old. The previous digest in this series was posted here a while back.
Upcoming DSC Webinar
How to Keep Your R Code Simple While…Continue
Added by Vincent Granville on January 31, 2017 at 10:30am
Summary: In this last article in our series on recommenders we look to the future to see how the rapidly emerging capabilities of Deep Learning can be used to enhance recommender performance.
In our first article, “Understanding…Continue
Added by William Vorhies on January 31, 2017 at 9:30am
Hot topics like “big data”, “machine learning” and “data science” now dominate the scientific community. In the past 10 years alone, data availability has increased exponentially (and not even in a squared, or cubed sort of way… we are talking on the order of 10^10 if not more). Exabytes (10^18, or one QUINTILLION bytes!!?) of information are being passed, stored, saved and analyzed on a monthly…Continue
Added by Grant Humphries on January 31, 2017 at 6:00am
At first I liked tinkering with computers and learning programming languages. After graduating from high school I started to develop a concept for data processing work, and I have completed it. More recently the term Deep Learning (DL) has spread through the IT world: a number of campuses and institutions have been developing the concept, and many data processing experts have begun to talk about it.
I do not know whether the concept I worked on actually resembles Deep…Continue
The past few years have been like a dream come true for those who work in…Continue
In this article, we discuss a general framework to drastically reduce the influence of outliers in most contexts. It applies to problems such as clustering (finding centroids), regression, measuring correlation or R-Squared, and many more. We will focus on the centroid problem here, as it is very similar to, and generalizes easily to, solving a linear regression. The correlation / R-Squared issue was discussed…Continue
Added by Vincent Granville on January 29, 2017 at 10:30pm
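To make the centroid problem in the outlier teaser above concrete, here is a minimal Python sketch. It does not reproduce the article's framework, which the excerpt does not show. It only contrasts the ordinary mean centroid with a coordinate-wise median, a standard robust alternative swapped in for illustration. The function names and the sample points are hypothetical.

```python
from statistics import mean, median

def mean_centroid(points):
    """Ordinary centroid: average each coordinate (sensitive to outliers)."""
    return tuple(mean(coord) for coord in zip(*points))

def median_centroid(points):
    """Coordinate-wise median: an illustrative robust alternative,
    not the method from the article."""
    return tuple(median(coord) for coord in zip(*points))

# Hypothetical sample: three nearby points plus one extreme outlier.
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (100.0, 100.0)]

print("mean centroid:  ", mean_centroid(points))
print("median centroid:", median_centroid(points))
```

The single outlier drags the mean centroid far from the cluster, while the median centroid stays among the three nearby points.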
I periodically use charts containing a crosswave “differential spectrum” or “event horizon.” In this blog, I will explain the nature of the spectrum and the relevance of any apparent bias.
I once mentioned purchasing a machine designed to monitor and reduce sleep apnea. Sleep apnea is when a person stops breathing while sleeping. During a sleep study, I was found to have moderate sleep apnea. Apart from its medical implications, sleep apnea is also a metric. The machine…Continue
I routinely study differences in production between years by charting the data on the same graph. I consider this a popular approach. It makes sense since there is often interest in how the year is shaping up compared to previous years. Moreover, seasonality would be less relevant given that the same seasons are compared between years (assuming the seasons recur at around the same time). Below I present some real data from an organization in 1983 comparing production to 1982. I think many…Continue
Added by Don Philip Faithful on January 28, 2017 at 10:00am
Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week.
Upcoming DSC Webinar…Continue
Added by Vincent Granville on January 28, 2017 at 8:00am
The Art and Science of Encrypting, Embedding and Hiding Messages in Pictures and Videos.
This is related to data encryption and security. Imagine that you need to transmit the details of a patent or a confidential financial transaction over the Internet. There are three critical issues:…Continue
Essentially, good hypotheses lead decision-makers like you to new and better ways to achieve your business goals. When you need to make decisions such as how much you should spend on advertising or what effect a price increase will have on your customer base,…Continue
Accurate multichannel campaign attribution has stumped the online marketing industry for years. But what if the solution is to stop worrying about attribution, and move to an optimization-driven approach?
You know those photo mosaic images, which suddenly became terribly popular a few years back? They cleverly use lots of individual tiny images to make up one large image. If you look closely you can make out the…Continue
Added by Ian Thomas on January 27, 2017 at 9:30am
By David Robinson. David Robinson is a data scientist at Stack Overflow. His article (parts of it) was re-posted in the Washington Post, here. This is also a short version that summarizes his analysis. The details and source code can be found on David's website,…Continue
Added by Vincent Granville on January 26, 2017 at 7:30pm
By Rubens Zimbres. Rubens is a Data Scientist with a PhD in Business Administration, developing Machine Learning, Deep Learning, NLP and AI models using R, Python and Wolfram Mathematica. Click here to check his GitHub page.…Continue
This article was written by Stephanie and Tony on R2D3.
In machine learning, computers apply statistical learning techniques to automatically identify patterns in data. These techniques can be used to make highly accurate predictions. Using a data set about homes, we will create a machine learning model to distinguish homes in New York from homes in San Francisco.…Continue
As per the largest market research firm, MarketsandMarkets, the speech analytics industry will grow to USD 1.60 billion by 2020 at a Compound Annual Growth Rate (CAGR) of 22% from 2015 to 2020. Today the omnichannel world consists of voice, email, chat, social channels, and surveys, and each channel has its own importance.
Therefore, no customer-centric organization can afford to ignore the information that can be glean…Continue
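The growth figures quoted in the speech analytics teaser above can be sanity-checked with the standard CAGR formula, end = start × (1 + rate)^years. The Python sketch below works backwards from the quoted USD 1.60 billion in 2020 and the 22% rate over five years. The implied 2015 base it prints is derived arithmetic, not a figure stated in the excerpt, and the function names are hypothetical.

```python
def implied_start_value(end_value, cagr, years):
    """Invert the CAGR formula: start = end / (1 + cagr) ** years."""
    return end_value / (1.0 + cagr) ** years

def project(start_value, cagr, years):
    """Forward CAGR formula: end = start * (1 + cagr) ** years."""
    return start_value * (1.0 + cagr) ** years

# Figures quoted in the teaser: USD 1.60 billion by 2020, 22% CAGR, 2015 to 2020.
end_2020 = 1.60   # USD billions
rate = 0.22
years = 5

start_2015 = implied_start_value(end_2020, rate, years)
print(f"Implied 2015 market size: USD {start_2015:.2f} billion")
print(f"Projected back to 2020:   USD {project(start_2015, rate, years):.2f} billion")
```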
One more opportunity to apply data mining techniques in the health care industry is helping healthcare insurers detect fraudulent transactions, so that other patients can receive better and more affordable healthcare services. Fraud occurs when individuals deceive an insurance company to try to obtain money to which they are not entitled. It happens when someone puts false information on an insurance application and when false or misleading information is given or…Continue