Sunil Kappal's Blog (12)

R Squared Value Demystified

In today’s world of quick results and insights, few people want to spend time understanding the core concepts behind the statistical terms they use in routine analysis. One term that is talked about a lot but poorly understood in terms of its mechanics is the R Squared statistic, a.k.a. the coefficient of determination. This statistic measures how close the data are to the fitted regression line.
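To make the mechanics concrete, here is a minimal sketch in plain Python that fits a straight line by ordinary least squares and computes R Squared as one minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` and the sample numbers are illustrative assumptions, not taken from the original post.

```python
def r_squared(xs, ys):
    """Fit y = intercept + slope * x by least squares and return R Squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope and intercept of the least-squares line.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares: scatter of the data around the fitted line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Total sum of squares: scatter of the data around its own mean.
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
perfect = r_squared(xs, [2, 4, 6, 8, 10])  # points lie exactly on a line
noisy = r_squared(xs, [2, 5, 5, 9, 9])     # points scatter around a line
print(perfect, noisy)
```

The closer the points sit to the fitted line, the smaller the residual sum of squares and the closer the value gets to 1.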

It is also worth mentioning that by squaring the…

Added by Sunil Kappal on July 11, 2018 at 3:35am — No Comments

Sentiment Analytics Symposium 2017

Added by Sunil Kappal on May 30, 2017 at 8:27am — No Comments

“Multicollinearity” a Problem or an Opportunity?

Multicollinearity (collinearity) is not a new term, especially for anyone dealing with multiple regression models. This phenomenon, in which the predictor variables of a model are strongly related to one another, also affects models like classification and regression trees as well as neural networks. Collinearity is notorious for inflating the variance of at least one estimated regression coefficient, which can cause the model to predict erroneously, and in a business setup it can have an…

Added by Sunil Kappal on March 6, 2017 at 10:00am — No Comments

Deducer Tutorial: Creating Linear Model using R Deducer Package

The linear model, better known as linear regression, is one of the most common and flexible analysis frameworks for identifying the relationship between two or more variables. This widely used model is represented by drawing the best-fit line through a series of data points on a scatter plot.

For any budding business analyst, this should be the starting point for understanding how a model works at the very core of its design.

Selecting the Variables in Deducer…

Added by Sunil Kappal on February 28, 2017 at 7:00am — No Comments

Useful R Packages that Align with The CRISP DM Methodology

CRISP DM, which stands for Cross Industry Standard Process for Data Mining, is a process model that outlines the most common approach to tackling data-driven problems. Per a poll conducted by KDnuggets in 2014, this was and “is” one of the most popular and widely used methodologies. This method of gleaning insights from data is very dear to industry experts and data miners.

As the title suggests, I will align some of the most useful R packages with this most popular and…

Added by Sunil Kappal on February 6, 2017 at 8:00am — 1 Comment

Fraud analysis using speech analytics and Monte Carlo

According to MarketsandMarkets, the largest market research firm, the speech analytics industry will grow to USD 1.60 billion by 2020, at a Compound Annual Growth Rate (CAGR) of 22% from 2015 to 2020. Today the omnichannel world consists of voice, email, chat, social channels, and surveys, and each channel has its own importance.

Therefore, it becomes impossible for any customer-centric organization to ignore the information that can be glean…

Added by Sunil Kappal on January 25, 2017 at 8:00am — 3 Comments

HealthCare Industry’s Savior: Using the Benford's Law (The Law of First Digit) to Debunk the Fraudsters

As the world gets more tech savvy, advancements in information technology, especially in the healthcare industry, have opened up new areas in data mining and machine learning. Within data mining, one technique that has gained a lot of popularity, as well as skepticism, among auditors and fraud detectives is Benford’s Law, or “The Law of First Digit”.
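For readers who want to see what the law actually predicts, here is a minimal sketch in plain Python. Benford’s Law gives the expected share of leading digit d as log10(1 + 1/d), and the sketch compares that with the shares observed in a list of amounts. The helper names and the claim amounts are hypothetical and made up purely for illustration; they are not data from the original post.

```python
import math
from collections import Counter

def benford_expected():
    """Expected share of each leading digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit(amount):
    """Leading digit of a positive amount of at least 1."""
    return int(str(int(amount))[0])

def observed_share(amounts):
    """Observed share of each leading digit in a list of amounts."""
    counts = Counter(first_digit(a) for a in amounts)
    return {d: counts.get(d, 0) / len(amounts) for d in range(1, 10)}

# Hypothetical claim amounts, invented for this example.
claims = [123, 187, 210, 95, 1450, 1320, 310, 118]
expected = benford_expected()
observed = observed_share(claims)
for d in range(1, 10):
    print(d, round(expected[d], 3), round(observed[d], 3))
```

A sample this small proves nothing on its own. In practice the comparison only becomes meaningful on a large set of amounts, and a large gap between the two columns is a prompt for a closer look rather than proof of fraud.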

In the past some researchers in Canada used the Benford’s Law distribution to detect anomalies within the claims…

Added by Sunil Kappal on January 16, 2017 at 5:12am — 5 Comments

Simple Guide for Selecting Statistical Tests When Comparing Groups

Selecting the right statistical test can prove to be a daunting task for anyone. This infographic presents a step-by-step approach to the test selection process. Looking at the various conditions in this way to pick the appropriate tests will allow the audience to visualize and remember the process easily.

However, it is also very important to have a basic understanding of statistics and its related terms and concepts. It would not be wrong to say that the correct statistical…

Added by Sunil Kappal on January 4, 2017 at 5:30am — 4 Comments

The Most Common Analytical and Statistical Mistakes

It is not only about understanding statistics; it is also about implementing the correct statistical approach or method. In this brief article, I will showcase some common statistical blunders that we generally make and how to avoid them.

To make this information simple and consumable I have divided these errors into two parts:

  1. Data Visualization Errors (Erroneous Graphs)
  2. Statistical Blunders Galore (pun intended)

 

Data…

Added by Sunil Kappal on January 3, 2017 at 7:00am — 4 Comments

A Statistical Approach to Pick the Most Impactful Advertising Channel for Product Marketing

In the wake of ZMOT (Zero Moment of Truth), it becomes pivotal for any product company to choose the most appropriate advertising channel for promoting its products. This not only helps organizations maximize their chances of creating the best first impression but also helps them be discovered by today’s tech-savvy consumers.

Today we will talk about a very…

Added by Sunil Kappal on December 30, 2016 at 8:30am — No Comments

How to create a Best-Fitting regression model?

The Best Subset Regression method can be used to create a best-fitting regression model. This model-building technique helps identify which predictor (independent) variables should be included in a multiple regression model (MLR).

This method consists of scrutinizing all of the models created from every possible combination of predictor variables. It uses the R Squared value to check for the best model. Considering the level of complexity involved in creating…
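As a rough illustration of the idea, the sketch below fits an ordinary least squares model for every non-empty subset of a few candidate predictors and keeps the best one. It is written in plain Python under stated assumptions: the data are made up, the helper names (`solve`, `fit_r2`, `best_subset`) are hypothetical, and subsets are ranked by adjusted R Squared rather than plain R Squared, because plain R Squared never decreases when a predictor is added and would always favor the largest model.

```python
from itertools import combinations

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest entry in this column up.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_r2(columns, y):
    """OLS with an intercept; returns (R Squared, adjusted R Squared)."""
    n, k = len(y), len(columns)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k + 1)]
           for p in range(k + 1)]
    xty = [sum(r[p] * y[i] for i, r in enumerate(rows)) for p in range(k + 1)]
    beta = solve(xtx, xty)
    mean_y = sum(y) / n
    ss_res = sum((y[i] - sum(b * v for b, v in zip(beta, rows[i]))) ** 2
                 for i in range(n))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    r2 = 1 - ss_res / ss_tot
    adjusted = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return r2, adjusted

def best_subset(predictors, y):
    """Fit every non-empty subset; return (adjusted R2, R2, subset) of the best."""
    names = list(predictors)
    results = []
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            r2, adjusted = fit_r2([predictors[name] for name in subset], y)
            results.append((adjusted, r2, subset))
    return max(results)

# Made-up data: y depends on x1 and x2 plus small noise; x3 is unrelated.
predictors = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [2, 1, 4, 3, 6, 5, 8, 7],
    "x3": [4, 4, 1, 1, 7, 7, 3, 3],
}
y = [8.1, 6.8, 18.1, 17.0, 27.9, 27.2, 37.9, 37.0]
adjusted, r2, subset = best_subset(predictors, y)
print(subset, round(r2, 4), round(adjusted, 4))
```

With p candidate predictors there are 2^p - 1 subsets to fit, which is why this exhaustive search is only practical for a modest number of variables.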

Added by Sunil Kappal on December 29, 2016 at 8:00am — 7 Comments

Naive Bayes Process At a Glance

Added by Sunil Kappal on December 27, 2016 at 7:00am — No Comments

© 2019 Data Science Central ®
