Here I list my most interesting contributions published on Data Science Central. My plan is to categorize and aggregate this content to produce a few self-published books. The material below will always be available for free (from this webpage), but the books won't, or if they are, they will be free for members only. So you might want to bookmark this page.
A popular phrase tossed around when we talk about statistical data is “there is correlation between variables”. However, many people wrongly consider this to be the equivalent of “there is causation between variables”. It’s important to explain the distinction: correlation means that once we know how one variable changes, we can make reasonable deductions about how other variables change. There are several variants of correlation:
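To make the distinction concrete, here is a minimal sketch in Python. It is not taken from the original post: the data is invented for illustration, and the names `pearson`, `temperature`, `ice_cream_sales` and `sunburn_cases` are hypothetical. Two series that are both driven by a third variable end up strongly correlated, even though neither causes the other.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Invented data: temperature is the hidden common driver of both series.
temperature = [10, 15, 20, 25, 30]
ice_cream_sales = [2 * t + 5 for t in temperature]
sunburn_cases = [3 * t - 10 for t in temperature]

# The two series move together, yet neither one causes the other.
r = pearson(ice_cream_sales, sunburn_cases)
print(f"correlation between sales and sunburn: {r:.3f}")
```

Knowing ice cream sales lets us make a good guess about sunburn cases, which is all that correlation promises. The causal story (temperature drives both) is only visible because we constructed the data ourselves.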
People change jobs, get promoted and move home. Companies go out of business, expand and relocate. Every one of these changes contributes to data decay. It’s been said that business databases degrade by around 30% per year, but why?
A report by IDG states that companies with effective data grow 35% faster year-on-year. However, for this to…Continue
Added by Martin Doyle on December 13, 2016 at 2:00am — No Comments
Enterprise AI insights from the AI Europe event in London
Added by ajit jaokar on December 11, 2016 at 8:30pm — No Comments
The two situations discussed here also apply to marketing (not just advertising), and not just using social networks, but other channels such as Google. The insights provided here are based on careful data analysis, and applicable to websites and blogs with a decent amount of content, trying to build or maintain momentum. The problem discussed here is sometimes referred to as marketing mix optimization, with attribution modeling.
1. To Grow Subscriber or Member…Continue
Added by Vincent Granville on December 11, 2016 at 8:30pm — No Comments
Guest blog by David Stephenson, Ph.D. David is a data science and big data analytics speaker and thought leader. For over 15 years, David has been delivering analytic and risk management tools that have guided $10+ billion in business decisions. Prior to returning to consulting, David led global analytics for eBay Classifieds Group, reaching 30 countries operating under a dozen consumer facing brands and spread over…Continue
Guest blog by Ajit Jaokar. Ajit's work spans research, entrepreneurship and academia relating to IoT, predictive analytics and mobility. His current research focus is on applying data science algorithms to IoT applications. This includes time series, sensor fusion and deep learning (mostly in R/Apache Spark). This research underpins his teaching at Oxford University (Data Science for IoT)…Continue
Added by Vincent Granville on December 11, 2016 at 1:30pm — No Comments
Added by Sandeep Raut on December 10, 2016 at 7:00pm — No Comments
Probably like most people, I tend to recognize data as a stream of values. Notice that I use the term values rather than numbers, although in practice I guess that values are usually numerical. A data-logger gathering one type of data would result in data all of a particular type. Perhaps the concept of “big data” surrounds this preconception of data of a single type, except that there are much larger amounts. Consider an element of value in symbolic terms, which I present below: there is an index such…Continue
Added by Don Philip Faithful on December 10, 2016 at 9:30am — No Comments
Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week.
Added by Vincent Granville on December 9, 2016 at 8:30pm — No Comments
Contributed by Jiaxu Luo
The Shiny app can be found at https://joshualuo.shinyapps.io/Top_500_Websites/
When Tim Berners-Lee, who could have applied for a patent as the inventor of the World Wide Web, appeared at the opening ceremony of the London Olympics in 2012 and tweeted “This is for everyone”, it struck me that our lives would not be as easy and…
Added by NYC Data Science Academy on December 9, 2016 at 1:30pm — No Comments
Contributed by Diego De Lazzari. He is currently in the NYC Data Science Academy 12-week full-time Data Science Bootcamp program taking place from July 5th to September 23rd, 2016. This post is based on his second project - R Shiny (due in the 4th week of the program). The R code can be found on GitHub while the App is stored on…Continue
You’re late for work and hungry too. You quickly enter a fast food joint and wait in the long queue for your turn. You wish there was a quicker service where your choice would pop up automatically on a system and you could shop based on your past purchase data.
Well, it may not be at your nearest fast food joint right now, but companies like Amazon are tracking your past purchase data and…Continue
Added by Mohammad Farooq on December 8, 2016 at 10:30pm — No Comments
Over the last five years or so, e-commerce has grown hugely around the world, as consumers take advantage of great online pricing, the convenience of shopping from anywhere at any time of the day or night, and the ability to discover a whole raft of products that otherwise would be beyond reach.
However, although it has been getting continually cheaper and easier for…Continue
Added by Melissa Thompson on December 8, 2016 at 12:00pm — No Comments
About the book
You’re convinced that you want to enter into a data science career. You’ve done your research and even started to learn some of the skills needed. But how do you go from a data science enthusiast to a data scientist at your dream company?
What does a data science interview look like? What do recruiters really think of your resume? Where are the data science jobs? Can you improve your odds of getting an interview by employing a few clever…Continue
Added by Emmanuelle Rieuf on December 7, 2016 at 3:00pm — No Comments
Guest blog by Francesco Gadaleta. Francesco is a Data Scientist at Janssen Pharmaceutical Companies of Johnson & Johnson and a science writer. He is committed to the “A World Without Disease” paradigm shift in healthcare, leveraging Artificial Intelligence and Data Science to predict risk and intercept diseases. He is focused on putting machine learning at the service of human beings.
Do you know why you can’t hear the…Continue
Added by Vincent Granville on December 7, 2016 at 9:15am — No Comments
This tutorial was written by Manish Saraswat.
"The road to machine learning starts with Regression. Are you ready?"
If you are aspiring to become a data scientist, regression is the first algorithm you need to master. Not just to clear job interviews, but to solve real-world problems. To this day, a lot of consultancy firms continue to use regression techniques at a larger scale to…
Added by Emmanuelle Rieuf on December 7, 2016 at 8:30am — No Comments
In five short years the world of analytics has changed immeasurably. Five years ago Hadoop was reaching the peak of the hype cycle and it acted as a wake-up call to businesses that perhaps there is value in their data.
This evolved more recently into big data analytics. Although Hadoop opened a door to big data, it wasn’t actually the right tool for what most businesses needed – fast analytics, interactive experimentation with data and exploratory analysis of…Continue
Added by Aaron Auld on December 7, 2016 at 8:00am — No Comments
Summary: In 2017, AI and analytics M&A activity will accelerate, data lakes will finally become useful, and data monetization strategies will mature. These are some of the predictions Ramon Chen, CMO of data management innovator Reltio, has for the coming year.
1. AI and…Continue
Added by Vincent Granville on December 6, 2016 at 7:30pm — No Comments
One of the major selling points of colocating a business’s servers in a data center is the reduced energy consumption. Small businesses have long been sold on the idea of reducing the in-house power resources necessary to operate the network, which in many cases had moved well beyond a simple server on someone’s desk to a dedicated space requiring a specialized HVAC system…Continue