This article explains how to select important variables using the Boruta package in R. Variable selection, also called feature selection, is an important step in a predictive modeling project. Every private and public agency has started tracking data and collecting information on various attributes. This gives modelers access to too many predictors for a predictive model, but not every variable is important for a particular prediction task. Hence it is essential to…Continue
It’s not easy for a retailer to face ongoing economic challenges. The power of the customers is rising. They have the right to choose the best, and they are not happy with anything less. The competition in every single industry is brutal. We’re not exaggerating when we say that every business battles for survival.
In this war of competitors, data analytics is the most effective weapon. In February 2017, JDA Software Group and PwC (PricewaterhouseCoopers) released…
Added by Robert Morris on May 29, 2017 at 5:30am — No Comments
Summary: Someone had to say it. In my opinion R is not the best way to learn data science and not the best way to practice it either. More and more large employers agree.
(Photo credit: Rob Lavinsky, iRocks.com – CC-BY-SA-3.0)
In 1945, Count Richard Taaffe*, a Dublin gem collector, was sorting through a set of spinel gems that he had bought, and found one…Continue
Added by Peter Bruce on March 30, 2017 at 2:30pm — No Comments
This is a project I've been working on for some time to help reduce the missed opportunity rate (no-show rate) at all medical centers. It demonstrates how to extract datasets from an SQL server and load them directly into an R environment. It also demonstrates the entire machine learning process, from engineering new features to tuning and training the model and, finally, measuring the model's performance. I would like to share my results and methodology as a guide to help…Continue
Added by James Marquez, MBA, PMP on March 21, 2017 at 8:30am — No Comments
Summary: Count yourself lucky if you’re not in one of the regulated industries where regulation requires you to value interpretability over accuracy. This has been a serious financial weight on the economy but innovations in Deep Learning point a way out.
As Data Scientists we tend to take as gospel that more accuracy is better. There…Continue
Added by William Vorhies on February 28, 2017 at 9:21am — No Comments
Accurate multichannel campaign attribution has stumped the online marketing industry for years. But what if the solution is to stop worrying about attribution, and move to an optimization-driven approach?
You know those photo mosaic images, which suddenly became terribly popular a few years back? They cleverly use lots of individual tiny images to make up one large image. If you look closely you can make out the…Continue
Added by Ian Thomas on January 27, 2017 at 9:30am — No Comments
R is a free programming language for data analysis, statistical modeling and visualization. It is one of the most popular tools in the predictive modeling world, and its popularity is growing day by day. In the 2016 data science salary survey conducted by O'Reilly, R ranked second among programming languages for data science (SQL ranked first). In another popular poll, the KDnuggets Analytics software survey, R took the top rank with 49% of the vote. These polls answer the question about…Continue
Added by Deepanshu Bhalla on January 1, 2017 at 9:30am — No Comments
Data analytics is a mature discipline at this point, and even those outside the data science world generally understand what it’s all about. Modern data science, however, is still new enough to spur questions. Vincent Granville, Executive Data Scientist at Data Science Central, spoke with Roy Wilds, Chief Data Scientist at PHEMI, a Vancouver-based big data startup, about the best way to educate people…Continue
Added by Roy Wilds, PhD, PHEMI Systems on December 6, 2016 at 8:00am — No Comments
What is Data Mining?
Data mining is an integrated application in the data warehouse. It describes a systematic process of pattern recognition in large data sets, used to identify relationships and draw conclusions. Using statistical methods or genetic algorithms, data files can be searched automatically for statistical anomalies, patterns or rules.
Wikipedia defines Data Mining as “Data mining is an interdisciplinary subfield of computer science. It is the computational process…Continue
This article on going deeper into regression analysis, with assumptions, plots and solutions, was posted by Manish Saraswat. Manish, who works in marketing and data science at Analytics Vidhya, believes that education can change this world. R, data science and machine learning keep him busy.
Regression analysis marks the first step in predictive modeling. No doubt, it’s fairly easy to implement. Neither its syntax nor its parameters create any kind of confusion. But,…Continue
Added by Emmanuelle Rieuf on September 8, 2016 at 12:00pm — No Comments
In this day and age, you can’t go a day without hearing terms such as “data science,” “big data,” or “analytics.” These terms have been thrown around to apply to so many situations that the original meaning of these words is lost.
So, what does it take for any organization to be successfully data-driven? Although analytics may seem complicated, the solution comes from simplicity.
I believe it comes down to four things, as I’ve illustrated below: business need, clean data…Continue
Return on Investment (ROI) is defined as the ratio of a return (benefit or net profit) over the investment of resources that generated this return. Both the return and the investment are typically expressed in monetary units, whereas the ROI is calculated as a percentage.
ROI formula: (Return – Investment)/Investment
It’s typically expressed as a percentage, so multiply the result by 100.…Continue
Added by Amy Porras on August 12, 2016 at 2:30am — No Comments
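The ROI formula quoted in the excerpt above, (Return – Investment)/Investment multiplied by 100, can be sketched in a few lines of Python. This is a minimal illustration only: the function name `roi_percent` and the sample figures are hypothetical and do not come from the original post.

```python
def roi_percent(return_amount: float, investment: float) -> float:
    """Return ROI as a percentage: (return - investment) / investment * 100."""
    if investment == 0:
        # The ratio is undefined when nothing was invested.
        raise ValueError("investment must be non-zero")
    return (return_amount - investment) / investment * 100


# Hypothetical figures: 1,000 invested, 1,250 returned.
print(f"ROI: {roi_percent(1250, 1000):.1f}%")  # ROI: 25.0%
```

A negative value means the return did not cover the investment; for example, getting back 500 on 1,000 invested gives an ROI of -50%.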
“Alone we can do so little; together we can do so much.” This phrase from Helen Keller, dating from the 1950s, reflects decades of real-life achievements and success stories. The same applies to most high-impact innovations and to the world of advanced technologies. The machine learning domain is in the same race: it aims to make predictions and classifications more accurate using the so-called ensemble method, and it is…Continue
Added by Valiance Solutions on August 11, 2016 at 12:00am — No Comments
Summary: To ensure quality in your data science group, make sure you’re enforcing a standard methodology. This includes not only traditional data analytic projects but also our most advanced recommenders, text, image, and language processing, deep learning, and AI projects.
A Little History…Continue
Can Pre-hire Talent Assessments Be a Part of a Predictive Talent Acquisition Strategy?
Over the past 30+ years, businesses have spent billions on talent assessments. Many of these are now being used to understand job candidates. Increasingly, businesses are asking how (or if) a predictive talent acquisition strategy can include the use of pre-hire…
Added by PIYASHI BHATTACHARYYA on July 21, 2016 at 5:00am — No Comments
This post highlights a number of important applications found for deep learning so far. It is well known that 80% of data is unstructured. Unstructured data is the messy stuff every quantitative analyst traditionally tries to stay away from. It can include images of accidents, text notes from loss adjusters, social media comments, claim documents, reviews by medical doctors and so on. Unstructured data has massive potential but has not traditionally been considered a source of insight.…Continue
Added by Syed Danish Ali on June 26, 2016 at 5:00am — No Comments
There has been a lot of activity recently around revenue attribution - marketers want to develop a better understanding of their customer acquisition funnel and be able to measure progress against it. Most of this attention has been focused on the B2C space. However, less work has been done measuring the performance of B2B marketing activities.
Certainly the marketing automation segment is very vibrant with a large number of vendors (both big and small) providing solutions that…Continue
Added by Gregory Thompson on May 23, 2016 at 4:33pm — No Comments
Marketing measurement has long been an arcane field - companies interested in understanding how their marketing programs impacted revenue (or brand value) would hire expensive consultants who labored long and hard to deliver complex models at great cost to help their clients set high level marketing strategies and advertising budgets.
This worked well until the internet came along and changed the game - new digital channels and online marketing techniques were embraced by…Continue
Added by Gregory Thompson on May 19, 2016 at 11:00am — No Comments