Last time I wrote about using Python/Pandas as an adjunct for loading PostgreSQL tables. In this sequel, I demonstrate how R can collaborate with the database in a similar way.
The strategy remains the same as in Part I: 1) use R data.table to load a small subset of the data and determine column datatypes; 2) leverage that data.table metadata to create a relational table in PostgreSQL; 3) generate bulk-load SQL copy commands, along with shell scripts, from the metadata and CSV files; 4) execute the shell scripts with a system command to load the data using efficient copy statements.
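The four steps above can be sketched roughly as follows. This is only an illustrative outline, not the author's actual script: the file name `sensor.csv`, database `testdb`, table name `sensor`, and the simple type mapping are all assumptions made for the example.

```r
# Sketch of the four-step strategy; all file, database, and table names
# here are hypothetical placeholders.
library(data.table)
library(RPostgreSQL)

# 1) Read a small sample so data.table can infer column types.
sample_dt <- fread("sensor.csv", nrows = 1000)

# 2) Map the inferred R classes to PostgreSQL types (a deliberately
#    minimal mapping) and create the target table.
con <- dbConnect(PostgreSQL(), dbname = "testdb")
pg_types <- sapply(sample_dt, function(col)
  switch(class(col)[1],
         integer = "integer",
         numeric = "double precision",
         "text"))
ddl <- sprintf("create table if not exists sensor (%s)",
               paste(names(pg_types), pg_types, collapse = ", "))
dbSendQuery(con, ddl)

# 3) Generate a psql \copy command and wrap it in a shell script.
copy_sql <- "\\copy sensor from 'sensor.csv' with (format csv, header)"
writeLines(sprintf('psql -d testdb -c "%s"', copy_sql), "load_sensor.sh")

# 4) Execute the script from R to bulk-load the data.
system("bash load_sensor.sh")
dbDisconnect(con)
```

The point of step 3/4 is that `\copy` streams the CSV directly into PostgreSQL, which is far faster for large files than row-by-row inserts issued through a database driver.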
The R version is presented below. It is a proof of concept only; there is no exception or error handling in the code. Hopefully, the ideas presented resonate.
The technology used is Windows 10 with PostgreSQL 10.9, JupyterLab 0.35.4, and R 3.6.0, along with R packages data.table 1.12.2, tidyverse 1.2.1, and RPostgreSQL 0.6-2.
© 2019 Data Science Central ®