# Visualizing Natural Gas Withdrawals, Consumption and Prices

Contributed by Ben Townson. He enrolled in the NYC Data Science Academy 12-week full time Data Science Bootcamp program taking place between July 5th to September 23rd, 2016. This post is based on their second project – R Shiny, due on 4th week of the program. The original article can be found here.

Visualizing Natural Gas Production and Consumption

Background

Natural Gas is used in the United States for electricity generation, residential and commercial heating and cooking, fueling industrial processes, powering vehicles, and for energy production operations.  The United States is one of the largest consumers of Natural Gas, in large part due to the available natural gas reserves in the country, the developed production industry, and the difficulties in storing and transporting the volatile substance.  Most of the gas consumed in the US is produced in the US, with only about 11% of the consumed gas being imported, primarily from Canada via pipelines, and only 8% being exported to Canada or Mexico (first 5 months of 2016).  Because of the difficulties in transporting the fuel, one might expect that natural gas would be primarily consumed near the production source.  We will be able to obtain some evidence for this via this visualization.

Production sources vary as well, and while the majority of gas is produced from traditional wells specifically designed to produce natural gas, it is also produced from oil wells, coal beds, and more recently an increasing number of shale wells.  While production from shale wells has suffered due to the economics of these wells, the amount of gas obtained from shale wells has grown impressively in the last decade, and we may be able to view some interesting trends in shale production via this visualization.

The Data

All data used for this visualization was obtained from the US Energy Information Administration (www.eia.gov).  An immense amount of energy data is available via the EIA website.  To access the data quickly and easily in R, I utilized the EIAdata package for R, developed by Matthew Brigida (EIAdata Link).  The package allows for easy navigation of the numerous categories of data made available by the EIA and a simple mechanism for requesting and receiving the data.  This Shiny application uses time-series data to allow the user to view changes over time; the EIAdata package makes time-series data available in a R xts format, which allows for easy subsetting of the data. Most of the data series were available in monthly increments, but was easier to visualize when grouped into annual totals, which was easily achievable using the xts built-in functions.  Unfortunately, much of the data on withdrawals was only complete through the end of 2014, so the visualization only includes data up through 2014.  Code used to obtain and sort through the data will be available via links below.

All volumes are reported in Billion Cubic Feet (Bcf).  For a detailed explanation of the size of this unit, visit the EIA site, here.

The App

The application is comprised of three parts.  Withdrawals (production) and consumption are viewable in  US states maps in the first two tabs of the application, which allows both visual and anecdotal comparisons of withdrawal and consumption statistics.  Within the Withdrawals Map, a user can view Withdrawals from the various production sources, by year, and can also compare withdrawals between pairs of years.  Similarly, consumption can be viewed by year and changes in consumption between pairs of years can be visualized.  Consumption can be filtered by the various end users.

Monthly time-series data can be visualized in the Charts tab.  Within this tab, the user can view, side-by-side, production of gas from a chosen source-type with consumption of gas by a chosen end-user-type.  This data can be viewed at the US-total level, or by state.  Additionally, a representative price is displayed, utilizing “city-gate” prices provided for each state by the EIA.

Analysis

The application has a lot of data that can be grouped and viewed in many ways, which allows the user to explore trends on a very granular level or on a much higher-level.  To illustrate the power of the application, we will focus on two specific scenarios which can be visualized in the app.  The user is encouraged to search for other trends and draw conclusions.

Example 1: Shale Gas Production Growth

In the Withdrawals Map Tab, select “Shale” from the Withdrawals drop-down.  The user will notice that there is no data prior to 2007 in this category.  This is likely due to the minimal amount of gas withdrawn from shale wells, but I suspect that shale gas was not a reportable category before 2007.  By setting the time scale to 2007, and pressing “Play” on the time scale, one can watch the growth of shale gas production, most specifically in Pennsylvania, where shale gas production grew from zero to over 4.4 million Bcf in 2014.  Total shale production grew from just under 2 million Bcf reported in 2007 (viewable in the information box in the bottom right) to almost 14 million Bcf in 2014.

The change alone can be viewed by changing the “View Statistic” dropdown to “diff” and setting the time scales to 2007 at top and 2014 below.  However, since there was no production reported in 2007, this “change” is equivalent to the total amount produced in 2014.  Alternatively, one can look at the change in total production by changing the Withdrawal Source back to “All Sources” and viewing the map again.  The growth of shale in Pennsylvania becomes obvious here.  The next-closest “increase” was in Texas (rankings can be viewed in the graphic below the map, which will show the 5 biggest increases and 5 biggest decreases), which only had 2 million Bcf of new production in 2014 vs 2007 while Pennsylvania increased total production by 4 million Bcf.

One might wonder if consumption of Natural Gas increased in Pennsylvania in step with production increases.  To view this, the user can switch to the consumption map, change the statistic to view to “diff” and select 2007 and 2014 as the comparison years.  The largest increase in consumption between those years is in Pennsylvania.  By flipping through the consumer types, one can see that most of this increase can be attributed to electricity production, and make a guess that locally-produced gas is displacing coal as the fuel source for electricity production.

Finally, the user can view the month-over-month changes by switching to the charts tab.  Select PA in the state selection at the bottom, and one can immediately see the increase in total withdrawals, and can possibly detect an increase in consumption.  Note the seasonality of demand for gas.  IN the US Northeast, natural gas is used for heating in winter, hence the increased demand in the cold winter period.  As we noted earlier, the increase is most obvious in electricity production, which interestingly does not show as much seasonality.

Finally, one can view the time-series of price in the same period by looking to the bottom of this page.  The local price of gas has decreased in step with this growth in production, and despite an increase in consumption.  However, the decrease in price corresponds to a price decrease in gas across the US, and we require further statistical analysis before attempting to explain this price trend.

Example 2: Hurricanes in the Gulf

As anyone familiar with the natural gas industry knows, hurricanes in the US Gulf cause problems for production and lead to price spikes for natural gas.  As a second example, we will visualize the phenomenon.  The year 2005 was known for the two hurricanes, Katrina and Rita, which caused widespread damage and loss-of-life.  Natural gas production and distribution was disrupted as well.

To visualize the effects, one can start by looking at the withdrawals map, and viewing the change in withdrawals between 2014 and 2015.  Louisiana, which took a direct hit from Katrina, shows a decrease in total production of 67,000 Bcf.  Total US withdrawal decreased by 500,000 Bcf from 2014-2015.

The consumption map shows a decrease in consumption in Texas of over 400,000 Bcf.  One might be able to conclude that this decrease was related to decreases in production in the area due to hurricanes.

The Charts provide even more insight.  By selecting Total production and setting both time scales to 2005 (which allows us to visualize start-end of 2005), the user will see a sharp decrease in production in September of 2005.  By expanding the timeframe in either direction, one can see that production in September of 2005 was the lowest monthly production since 1993, and that low level of production has not been seen again.  The decrease corresponds with a massive spike in the price for natural gas.

Filtering by state, one can see that production in Louisiana declined sharply in September of 2005.  A similar decrease in withdrawals can be viewed in the “Federal Offshore” area by choosing that area in the state-dropdown.

Conclusion

This app provides the interested user a number of ways to view the production and consumption of natural gas in the United States.  A user can investigate the impact of specific events or trends and can develop theories for further research and statistical analysis.  Other things to explore include the changing trends in gas production from coal beds, which might inspire the user to further research coal production and how changing trends in coal production could affect natural gas production.  A user might also be interested in exploring the relationships between production, consumption and prices in their home state over specific time frames.  The monthly granularity of the charts could allow an interested user to investigate how annual changes might be affected by spikes or valleys in specific months relating to individual events.  Or, one might be interested in simply looking at the growth of natural gas production in the US since 1991.  The possibilities are seemingly endless.

Server.R code:

library(shinydashboard)library(googleVis)library(xts)  st.codes=readRDS('st.codes.rds') other.codes=readRDS('other.codes.rds') cn.codes=readRDS('cn.codes.rds') other.con.codes=readRDS('other.con.codes.rds') total.codes=readRDS('total.codes.rds') total.con.codes=readRDS('total.con.codes.rds') cnsm.codes=readRDS('cnsm.codes.rds') total.cnsm.codes=readRDS('total.cnsm.codes.rds') price.codes=readRDS('price.codes.rds') total.price.codes=readRDS('total.price.codes.rds') source('helpers.R')  shinyServer(function(input,output){   #Render Information Page            # Reactive Functions for Data Selection      dateInput <- reactive({     return(as.character(input$daterange)) }) dateInput2 <- reactive({ return(as.character(input$daterange2))   })      texasInput <- reactive({     if (input$texas==TRUE) { return(TRUE) } else{ return(FALSE) } }) second_Cond <-reactive({ if (input$data_type=='sum'){       return(1900)     } else {       return(as.character(input$daterange3)) } }) color_codes <-reactive({ if (input$menu=="map_con") {       if (input$data_type=='sum'){ return(as.character("{values:[0,1000000,4000000],colors:['white','lightgreen','green']}")) } else { return(as.character("{values:[-500000,0,500000],colors:['red','white','green']}")) } } else { if (input$data_type=='sum'){         return(as.character("{values:[0,1000000,7500000],colors:['white','lightgreen','green']}"))       } else {         return(as.character("{values:[-5000000,0,5000000],colors:['red','white','green']}"))       }     }   })      timerangeInput <-reactive({     return(paste0(input$daterange,'/',input$daterange2))   })    timerangeInputAnn <-reactive({     return(paste0(input$daterange-1,'/',input$daterange2))   })      #Render Input/Selections      output$filter <- renderUI({ if (input$menu!="info") {       if (input$menu=="map_con") { selectInput("category","View Consumption By:",c('Total','Residential', 'Industrial','Commercial', 'Electricity')) } else { selectInput("category","View Withdrawals From:",c('All Sources','Gas Wells', 'Oil Wells','Shale', 'Coal Beds')) } } }) output$filter1 <- renderUI({     if (input$menu!="info") { if (input$menu=="charts") {         selectInput("category2","View Consumption By:",c('Total','Residential',                                                          'Industrial','Commercial',                                                          'Electricity'))       } else {         selectInput("data_type","View Statistic:",c('sum','diff'))       }       }   })      output$filter2 <- renderUI({ if (input$menu!="info") {       sliderInput("daterange", "Date range:",                   min = 1991,                   max = 2014,                   value = 2005,animate=TRUE)     }   })      output$filter3 <- renderUI({ if (input$menu!="info") {       if (input$menu=="charts") { sliderInput("daterange2", "Date range:", min = 1992, max = 2014, value = 2014) } else { if (input$data_type == 'sum') {           checkboxInput("texas","Exclude Texas")         } else if (input$data_type == 'diff') { sliderInput("daterange3", "Date range:", min = 1992, max = 2014, value = 2014) } } } }) output$filter4 <- renderUI({     if (input$menu!="info") { if (input$menu=="charts") {         selectInput("chart_filters","Select State:",c('All','Offshore',levels(st.codes$state))) } } }) #Render Withdrawals Maps output$map <- renderGvis({     colorStr=color_codes()     gvisGeoChart(render_data(st.codes,input$category,dateInput(),texasInput(),second_Cond()),"full",input$data_type,                  options=list(region="US",displaymode="regions",                               resolution="provinces",width="auto",                               height=600,defaultColor="#808080",colorAxis=colorStr))   })      output$mapData <- renderGvis({ if (input$data_type=='sum') {       gvisColumnChart(sortdata(as.data.frame(render_data(st.codes,input$category,dateInput(),texasInput(),second_Cond())),10),"state",input$data_type,                       options=list(dataOpacity=0.5))     } else {       gvisColumnChart(sortdiffdata(as.data.frame(render_data(st.codes,input$category,dateInput(),texasInput(),second_Cond())),5),"state",input$data_type,                       options=list(dataOpacity=0.5))     }   })      output$other_regions <- renderPlot({ plot_data=sortdata(as.data.frame(render_data(other.codes,input$category,dateInput(),FALSE,second_Cond())),4)     if (input$data_type=='sum'){ barplot(height=plot_data$sum,names.arg=plot_data$state,width=60,space=0.5,col=c('darkgreen','lightgreen'),options(scipen=9),legend=FALSE) } else { barplot(height=plot_data$sum,names.arg=plot_data$state,width=60,space=0.5,col=c('darkgreen','lightgreen'),options(scipen=7),legend=FALSE) } }) output$total_US <- renderText({     if (input$data_type=='sum') { paste("Total US Withdrawals: ",format(render_data(total.codes,input$category,dateInput(),texasInput(),second_Cond())[3]))     } else{       paste("Total US Withdrawals: ",format(render_data(total.codes,input$category,dateInput(),texasInput(),second_Cond())[4])) } }) #Render Consumption Map output$map_cons <- renderGvis({     colorStr=color_codes()     gvisGeoChart(render_data(cnsm.codes,input$category,dateInput(),texasInput(),second_Cond()),"full",input$data_type,                  options=list(region="US",displaymode="regions",                               resolution="provinces",width="auto",                               height=600,defaultColor="#808080",colorAxis=colorStr))   })      output$other_con_regions <- renderPlot({ plot_data=sortdata(as.data.frame(render_data(other.con.codes,'Consumption',dateInput(),FALSE,second_Cond())),4) if (input$data_type=='sum'){       barplot(height=plot_data$sum,names.arg=plot_data$state,width=60,space=0.5,col=c('darkgreen','lightgreen'),options(scipen=9),legend=FALSE)     } else {       barplot(height=plot_data$sum,names.arg=plot_data$state,width=60,space=0.5,col=c('darkgreen','lightgreen'),options(scipen=7),legend=FALSE)     }   })      output$map_consData <- renderGvis({ if (input$data_type=='sum') {       gvisColumnChart(sortdata(as.data.frame(render_data(cnsm.codes,input$category,dateInput(),texasInput(),second_Cond())),10),"state",input$data_type,                       options=list(dataOpacity=0.5))     } else {       gvisColumnChart(sortdiffdata(as.data.frame(render_data(cnsm.codes,input$category,dateInput(),texasInput(),second_Cond())),5),"state",input$data_type,                       options=list(dataOpacity=0.5))     }   })      output$total_con_US <- renderText({ if (input$data_type=='sum') {       paste("Total US Consumption: ",format(render_data(total.cnsm.codes,input$category,dateInput(),texasInput(),second_Cond())[3])) } else{ paste("Total US Consumption: ",format(render_data(total.cnsm.codes,input$category,dateInput(),texasInput(),second_Cond())[4]))     }   })      #Render Charts      output$chart <- renderPlot({ if (input$chart_filters=='All') {       plot.xts(chart_data(total.codes,input$category,timerangeInput(),'US'),main=paste('Total',input$category,'US Natural Gas Withdrawals, Bcf'))       } else if (input$chart_filters=='Offshore') { plot.xts(chart_data(other.codes,input$category,timerangeInput(),'FX'),main='Federal Offshore Gas Withdrawals, Bcf')     } else {       plot.xts(chart_data(st.codes,input$category,timerangeInput(),input$chart_filters),main=paste(input$chart_filters,input$category,'Natural Gas Withdrawals, Bcf'))     }   })      output$chart2 <- renderPlot({ if (input$chart_filters=='All') {       plot.xts(chart_data(total.cnsm.codes,input$category2,timerangeInput(),'US'),main='Total US Natural Gas Consumption, BCf') } else if (input$chart_filters=='Offshore') {       plot.xts(chart_data(other.con.codes,'Consumption',timerangeInputAnn(),'3F'),main='Federal Offshore Natural Gas Consumption, Bcf')     } else {       plot.xts(chart_data(cnsm.codes,input$category2,timerangeInput(),input$chart_filters),main=paste(input$chart_filters,input$category2,'Natural Gas Consumption, BCf'))     }   })      output$chart3 <- renderPlot({ if (input$chart_filters=='All') {       plot.xts(chart_data(total.price.codes,'Price',timerangeInput(),'US'),main='Total US Natural Gas Price')       } else if (input$chart_filters=='Offshore') { plot.xts(chart_data(total.price.codes,'Price',timerangeInput(),'US'),main='Total US Natural Gas Price') } else { plot.xts(chart_data(price.codes,'Price',timerangeInput(),input$chart_filters),main=paste(input$chart_filters,'Natural Gas Price')) } }) })  Ui.R Code: library(shinydashboard)library(googleVis)library(xts) shinyUI(dashboardPage( dashboardHeader(title = "Natural Gas"), dashboardSidebar( sidebarUserPanel("Production and Consumption"), sidebarMenu(id="menu", menuItem("Info", tabName = "info",icon=icon("info")), menuItem("Natural Gas Withdrawals Map", tabName = "map_tot",icon=icon("map")), menuItem("Natural Gas Consumption Map", tabName = "map_con", icon=icon("map-o")), menuItem("Charts", tabName = "charts",icon=icon("line-chart")), uiOutput('filter'), uiOutput('filter1'), uiOutput('filter2'), uiOutput('filter3'), uiOutput('filter4') ) ), dashboardBody( tabItems( tabItem(tabName = "info",fluidRow( box(width=12, HTML("<h2>Natural Gas Production and Consumption Mapping</h2>"), HTML("<h3>Purpose</h3>"), HTML("<p>This application is designed to allow a user to visualize the production and consumption of Natural Gas in the United States, on a state-by-state basis.</p>"), HTML("<h3>Data</h3>"), HTML("<p>All data is from the US EIA (Energy Information Administration).</p>"), HTML("<h3>Source</h3>"), HTML("<a href='http://www.eia.gov/beta/MER/'>EIA Data Links</a></br>") ))), tabItem(tabName="map_tot", verticalLayout( box(htmlOutput("map"),height=600,width=1000), box(htmlOutput("mapData"),height=200, width=800)), absolutePanel(id = "controls", class = "panel panel-default", fixed = TRUE, draggable = TRUE, top = "auto", left = "auto", right = 20, bottom = 220, width = 220, height = "auto", textOutput("total_US"), plotOutput("other_regions",height=300))), tabItem(tabName="map_con", verticalLayout( box(htmlOutput("map_cons"),height=600,width=1000), box(htmlOutput("map_consData"),height=200, width=800)), absolutePanel(id = "controls2", class = "panel panel-default", fixed = TRUE, draggable = TRUE, top = "auto", left = "auto", right = 20, bottom = 220, width = 220, height = "auto", textOutput("total_con_US"), plotOutput("other_con_regions",height=300))), tabItem(tabName="charts", verticalLayout( box(plotOutput("chart"),height=400,width=1000), box(plotOutput("chart2"),height=400,width=1000), box(plotOutput("chart3"),height=400,width=1000) )) ) ) ))  Helper.R Code: render_data <- function(DT,cat,Cond,texas,Cond1) { if (Cond1==1900) { Cond1=as.character(as.numeric(Cond)+1) } return_data = NULL data_cat=NULL data_cat_prev=NULL for (i in 1:nrow(DT)) { return_data$full[i]=levels(DT$full)[DT$full[i]]     return_data$state[i]=levels(DT$state)[DT$state[i]] if (cat=="All Sources"){ data_cat = DT$NG_Gross_Withdrawals_All[[i]][Cond]       data_cat_prev = DT$NG_Gross_Withdrawals_All[[i]][Cond1] } if (cat=="Shale"){ data_cat = DT$NG_Gross_Withdrawals_From_Shale[[i]][Cond]       data_cat_prev = DT$NG_Gross_Withdrawals_From_Shale[[i]][Cond1] } if (cat=="Coal Beds"){ data_cat = DT$NG_Gross_Withdrawals_From_Coal_Beds[[i]][Cond]       data_cat_prev = DT$NG_Gross_Withdrawals_From_Coal_Beds[[i]][Cond1] } if (cat=="Oil Wells"){ data_cat = DT$NG_Gross_Withdrawals_From_Oil_Wells[[i]][Cond]       data_cat_prev = DT$NG_Gross_Withdrawals_From_Oil_Wells[[i]][Cond1] } if (cat=="Gas Wells"){ data_cat = DT$NG_Gross_Withdrawals_From_Gas_Wells[[i]][Cond]       data_cat_prev = DT$NG_Gross_Withdrawals_From_Gas_Wells[[i]][Cond1] } if (cat=="Residential"){ data_cat = DT$residential[[i]][Cond]       data_cat_prev = DT$residential[[i]][Cond1] } if (cat=="Industrial"){ data_cat = DT$industrial[[i]][Cond]       data_cat_prev = DT$industrial[[i]][Cond1] } if (cat=="Commercial"){ data_cat = DT$commercial[[i]][Cond]       data_cat_prev = DT$commercial[[i]][Cond1] } if (cat=="Electricity"){ data_cat = DT$electricity[[i]][Cond]       data_cat_prev = DT$electricity[[i]][Cond1] } if (cat=="Vehicles"){ data_cat = DT$vehicle[[i]][Cond]       data_cat_prev = DT$vehicle[[i]][Cond1] } if (cat=="Total"){ data_cat = DT$consumers[[i]][Cond]       data_cat_prev = DT$consumers[[i]][Cond1] } if (cat=="Consumption"){ data_cat = DT$Total_Consumption[[i]][Cond]       data_cat_prev = DT$Total_Consumption[[i]][Cond1] } return_data$sum[i]=sum(data_cat)     return_data$diff[i]=sum(data_cat_prev)-sum(data_cat) } return_data$sum[which(return_data$sum==0)]=NA return_data$diff[which(return_data$diff==0)]=NA if (texas==TRUE) { return_data$sum[return_data$state=='TX']=NA return_data$diff[return_data$state=='TX']=NA } return(return_data) } sortdata <- function(D,retnum) { D = D[with(D, order(D$sum,decreasing=TRUE)), ]   return(D[1:retnum,]) } sortdiffdata <- function(D,retnum) {   D = D[with(D, order(D$diff,decreasing=TRUE)), ] D1 = D[with(D, order(D$diff,decreasing=FALSE)), ]   D = D[1:retnum,]   D1 = D1[retnum:1,]   return(rbind(D,D1)) }  chart_data <- function(DT,cat,cond,state) {   data_cat=NULL   for (i in 1:nrow(DT)) {     if (state==DT$state[i]) { if (cat=="All Sources"){ data_cat = DT$NG_Gross_Withdrawals_All[[i]][cond]       }       if (cat=="Shale"){         data_cat = DT$NG_Gross_Withdrawals_From_Shale[[i]][cond] } if (cat=="Coal Beds"){ data_cat = DT$NG_Gross_Withdrawals_From_Coal_Beds[[i]][cond]       }       if (cat=="Oil Wells"){         data_cat = DT$NG_Gross_Withdrawals_From_Oil_Wells[[i]][cond] } if (cat=="Gas Wells"){ data_cat = DT$NG_Gross_Withdrawals_From_Gas_Wells[[i]][cond]       }       if (cat=="Total"){         data_cat = DT$consumers[[i]][cond] } if (cat=="Commercial"){ data_cat = DT$commercial[[i]][cond]       }       if (cat=="Residential"){         data_cat = DT$residential[[i]][cond] } if (cat=="Industrial"){ data_cat = DT$industrial[[i]][cond]       }       if (cat=="Electricity"){         data_cat = DT$electricity[[i]][cond] } if (cat=="Vehicles"){ data_cat = DT$vehicle[[i]][cond]       }       if (cat=="Consumption"){         data_cat = DT$Total_Consumption[[i]][cond] } if (cat=="Price"){ data_cat = DT$price[[i]][cond]       }       return(data_cat)     }   } }