Building on my last post, I want to use the same healthcare data to demonstrate the use of R packages. Packages in R are stored in libraries, and many come pre-installed, but reaching the next level of skill means knowing when to reach for a new package and what it contains. With that, let's get to our example.
Useful function: gsub
When working with vectors and strings, gsub makes cleaning data much simpler: it replaces every match of a pattern with a replacement string. In my healthcare data, I wanted to convert dollar values to integers (i.e. $21,000 to 21000), and I used gsub to strip out the dollar signs and commas.
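As a minimal sketch of that cleanup, the snippet below strips dollar signs and commas and converts the result to integers. The sample values and the variable names `payment` and `payment_int` are made up for illustration; they are not taken from the actual healthcare data.

```r
# Hypothetical sample values; the real data set is not reproduced here.
payment <- c("$21,000", "$18,500", "$7,250")

# Remove every dollar sign and comma, then convert to integer.
# Inside a character class, "$" is a literal, not an end-of-line anchor.
payment_int <- as.integer(gsub("[$,]", "", payment))

print(payment_int)
# [1] 21000 18500  7250
```

Using gsub rather than sub matters here, because sub would only replace the first match in each string and leave any later commas in place.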
Package: reshape2
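Looking at the data, I wanted to focus on the Payment estimate, so I used the melt() function from reshape2. Melt reshapes data from wide to long format: the identifier columns are kept, and each measured column becomes a variable/value pair, so the data can be restructured pivot-table style without losing values.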
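A rough sketch of how melt() can be used for this is shown below. The data frame `hospitals` and its column names (`State`, `Measure`, `Payment`, `Lower.Estimate`) are assumptions made for illustration, not the columns of the original data set.

```r
library(reshape2)

# Hypothetical wide-format data; column names are assumed for illustration.
hospitals <- data.frame(
  State          = c("CA", "CA", "NY"),
  Measure        = c("Heart Attack", "Pneumonia", "Heart Attack"),
  Payment        = c(21000L, 14000L, 23000L),
  Lower.Estimate = c(20000L, 13000L, 22000L),
  stringsAsFactors = FALSE
)

# Keep State and Measure as identifiers and melt only the Payment column.
# The result has one row per identifier combination, with the column name
# stored in "variable" and the number stored in "value".
melted <- melt(hospitals,
               id.vars      = c("State", "Measure"),
               measure.vars = "Payment")

print(melted)
```

Listing `measure.vars` explicitly is a design choice in this sketch: it drops the other numeric columns from the result, which keeps the focus on the Payment estimate.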
Package: sqldf
With my data melted, I wanted to get the average estimate for heart attack patients by state. This is a classic SQL query, and sqldf lets you run SQL statements directly against R data frames.
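The query might look something like the sketch below. The `melted` data frame is built by hand here so the example runs on its own, and its column names and values are assumptions for illustration rather than the real data.

```r
library(sqldf)

# Hypothetical long-format data, shaped like the output of melt().
melted <- data.frame(
  State    = c("CA", "CA", "NY", "NY"),
  Measure  = c("Heart Attack", "Heart Attack", "Heart Attack", "Pneumonia"),
  variable = "Payment",
  value    = c(21000, 22000, 23000, 14000),
  stringsAsFactors = FALSE
)

# Average payment estimate for heart attack patients, grouped by state.
# sqldf treats the data frame name as a table name in the query.
avg_by_state <- sqldf("
  SELECT State, AVG(value) AS avg_payment
  FROM melted
  WHERE Measure = 'Heart Attack'
  GROUP BY State
  ORDER BY State
")

print(avg_by_state)
```

The same result could be produced with base R's aggregate(), but for anyone already comfortable with SQL, the query form states the filter and grouping in one readable step.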
Now that my data is in good shape to visualize with a map overlay, ggplot2 and maps are two other R packages that would be useful. I'll look to discuss those in a future post.
About: Divya Parmar is a recent college graduate working in IT consulting. He can also be found on LinkedIn and Twitter.
Posted 1 March 2021