We will use an R package called rvest, which was created by Hadley Wickham. This package simplifies the process of scraping web pages.…
Added by Deepanshu Bhalla on February 26, 2018 at 9:15am — No Comments
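As a rough illustration of the kind of scraping rvest simplifies, here is a minimal sketch. It is not taken from the post: the HTML snippet, the `li.post a` selector and the variable names are all hypothetical, and the page is built in memory with `minimal_html()` so that the example runs without network access.

```r
library(rvest)

# Hypothetical in-memory page (an assumption for illustration only),
# so the sketch runs without fetching anything from the web.
page <- minimal_html('
  <h1>Sample listing</h1>
  <ul>
    <li class="post"><a href="/post-1">First post</a></li>
    <li class="post"><a href="/post-2">Second post</a></li>
  </ul>
')

# Select the link nodes with a CSS selector, then extract text and href.
links  <- html_elements(page, "li.post a")
titles <- html_text2(links)
urls   <- html_attr(links, "href")

# Collect the scraped fields into a data frame.
posts <- data.frame(title = titles, url = urls, stringsAsFactors = FALSE)
print(posts)
```

For a live page, `minimal_html(...)` would typically be replaced by `read_html()` called with the page's URL, and the selector adapted to that page's markup.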
The following links point to a set of free SAS tutorials that help you learn SAS programming online on your own. They include tutorials on data exploration and manipulation, predictive modeling, and some scenario-based examples.
SAS (Statistical Analysis System) is one of the most popular software packages for data analysis. It is widely used for various purposes such as data management, data mining, report writing, statistical analysis, business modeling, applications development and data…
Added by Deepanshu Bhalla on June 27, 2017 at 9:00am — No Comments
R is the world's most widely used programming language for statistical analysis, predictive modeling and data science, according to many recent surveys and studies. The language grows more powerful as the number of supported packages increases. Some big IT companies, such as Microsoft and IBM, have also started developing R packages and offering enterprise versions of R.
Added by Deepanshu Bhalla on June 12, 2017 at 12:30am — No Comments
This article explains how to select important variables using the Boruta package in R. Variable selection, also called 'feature selection', is an important step in a predictive modeling project. Private and public agencies alike now track data and collect information on a wide range of attributes. As a result, a predictive model often has access to too many predictors, but not every variable is important for a particular prediction task. Hence it is essential to…
This is a complete tutorial on data wrangling (manipulation) with R. It covers dplyr, one of the most powerful R packages for data wrangling. The package was written by the popular R programmer Hadley Wickham, who has written many other useful R packages such as ggplot2 and tidyr. It is one of the most popular R packages to date. This post includes several examples and tips on how to use the dplyr package for cleaning and transforming data.…
Added by Deepanshu Bhalla on February 6, 2017 at 8:00am — No Comments
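To give a flavour of the cleaning and transforming steps dplyr supports, here is a small sketch. It is not from the tutorial itself: the choice of the built-in `mtcars` dataset, the `hp > 100` filter and the derived `kpl` column are assumptions made purely for illustration.

```r
library(dplyr)

# mtcars ships with R; the filter threshold and the derived column
# below are illustrative choices, not taken from the tutorial.
summary_by_cyl <- mtcars %>%
  filter(hp > 100) %>%                      # keep the more powerful cars
  mutate(kpl = mpg * 0.425) %>%             # approximate km per litre
  group_by(cyl) %>%                         # one group per cylinder count
  summarise(
    n_cars   = n(),
    mean_kpl = mean(kpl),
    .groups  = "drop"
  ) %>%
  arrange(desc(mean_kpl))                   # most efficient group first

print(summary_by_cyl)
```

Each verb takes a data frame and returns a data frame, which is what lets the steps chain with the pipe operator.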
This tutorial describes the theory and practical application of Support Vector Machines (SVM) with R code. SVM is a popular supervised learning algorithm (i.e. it classifies or predicts a target variable). It works for both classification and regression problems. It is one of the most sought-after machine learning algorithms and is widely used in data science competitions.
What is a Support Vector Machine?
The main idea of a support vector machine is to…
Added by Deepanshu Bhalla on January 16, 2017 at 7:30am — No Comments
R is a free programming language for data analysis, statistical modeling and visualization. It is one of the most popular tools in the predictive modeling world, and its popularity is growing day by day. In the 2016 Data Science Salary Survey conducted by O'Reilly, R ranked second among programming languages for data science (SQL ranked first). In another popular poll, the KDnuggets Analytics software survey, R took the top rank with 49% of the vote. These survey polls answer the question about…
Added by Deepanshu Bhalla on January 1, 2017 at 9:45am — No Comments