I created an R package for exploratory data analysis. You can read about it and install it here.
The package contains several tools for initial exploratory analysis of any input dataset. It includes custom functions for plotting the data and for univariate, bivariate and multivariate analysis, which is the first step of any predictive modeling pipeline. Use it to get a good sense of a dataset before you start building predictive models.
The package is under active development and more functionality will be added soon. Pull requests to add more functions are welcome!
The functions currently included in the package are listed below:
numSummary(mydata): automatically detects all numeric columns in the dataframe mydata and provides their summary statistics.
charSummary(mydata): automatically detects all character columns in the dataframe mydata and provides their summary statistics.
Plot(mydata, dep.var): plots all independent variables in the dataframe mydata against the dependent variable specified by the dep.var parameter.
removeSpecial(mydata, vec): replaces all special characters (specified by the vector vec) in the dataframe mydata with NA.
bivariate(mydata, dep.var, indep.var): performs bivariate analysis between the dependent variable dep.var and the independent variable indep.var in the dataframe mydata.
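To make the descriptions above concrete, here is a minimal base-R sketch of the ideas behind the summary and cleaning functions. It is not the package's code and does not call the package: the toy data frame, the helper name replace_special, and the statistics chosen are all assumptions made for illustration, so the package's actual output may differ.

```r
# Toy data frame, invented for illustration only.
mydata <- data.frame(
  age    = c(23, 35, NA, 41),
  income = c(52000, 61000, 58000, 75000),
  city   = c("Pune", "?", "Delhi", "Pune"),
  stringsAsFactors = FALSE
)

# Idea behind a numeric summary: detect numeric columns, then summarise each.
num_cols <- names(mydata)[vapply(mydata, is.numeric, logical(1))]
num_summary <- t(vapply(
  mydata[num_cols],
  function(x) c(
    n       = sum(!is.na(x)),
    mean    = mean(x, na.rm = TRUE),
    sd      = sd(x, na.rm = TRUE),
    min     = min(x, na.rm = TRUE),
    max     = max(x, na.rm = TRUE),
    missing = sum(is.na(x))
  ),
  numeric(6)
))

# Idea behind a character summary: detect character columns, then count values.
chr_cols <- names(mydata)[vapply(mydata, is.character, logical(1))]
chr_summary <- data.frame(
  n       = vapply(mydata[chr_cols], function(x) sum(!is.na(x)), integer(1)),
  unique  = vapply(mydata[chr_cols], function(x) length(unique(x[!is.na(x)])), integer(1)),
  missing = vapply(mydata[chr_cols], function(x) sum(is.na(x)), integer(1))
)

# Hypothetical helper: replace every value listed in `vec` with NA,
# column by column, keeping each column's type.
replace_special <- function(df, vec) {
  df[] <- lapply(df, function(x) {
    x[x %in% vec] <- NA
    x
  })
  df
}

cleaned <- replace_special(mydata, c("?"))

print(num_summary)
print(chr_summary)
print(cleaned)
```

Detecting columns by type with vapply keeps the sketch independent of column names, which is the same convenience the functions above are described as offering for any input dataframe.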
More functions will be added soon. Any feedback on improving the package is welcome!
Posted 12 April 2021