There are a lot of great ways to use the R packages we are learning about from DataCamp.
But first, a couple of other options for importing from a URL if your data is zipped or not in csv format.
Zipped files: download to a temp file, then unzip the csv
# uses read_csv from the readr package, loaded with tidyverse
temp = tempfile()
download.file("https://aqsdr1.epa.gov/aqsweb/aqstmp/airdata/annual_all_2011.zip", temp)
monitr = read_csv(unz(temp, "annual_all_2011.csv")) # unz requires the name of the file to be opened
unlink(temp)
remove(temp)
Excel from a URL
I kept running into issues using different packages to read in Excel files (readxl, xls, and xlsx): they would throw an error because they don’t really like the URL path. Sometimes just dropping the ‘s’ from ‘https’ will allow the download to go through. (The gdata command read.xls works, but you first have to download and activate Perl on your computer.)
Downloading the xls/xlsx to a temp file was the best solution I could find:
library(readxl)
library(httr)
url = "https://www.epa.gov/sites/production/files/2016-12/2015_o3_naaqs_preliminary_transport_assessment_design_values_contributions.xlsx"
# when downloading from a URL, save to a temp file and append the file extension so that read_excel recognizes the file
GET(url, write_disk(tf <- tempfile(fileext = ".xlsx")))
noda = read_excel(tf, sheet = 2) # uses library(readxl), which is not attached by library(tidyverse)
remove(tf); remove(url) # clean up
I found a really interesting dataset that breaks down the public’s belief in global warming (it is mapped at http://climatecommunication.yale.edu/visualizations-data/ycom-us-2016/ ). The paper, with the backing statistical breakdown of how the results were computed, can be downloaded in full from the TU Library.
I combined the global warming data with the provided election data (party % voting), ran a simple linear regression, and found nothing surprising: the more a county leaned toward the Democratic Party, the more likely it was to agree that global warming was occurring, that there was a scientific consensus, and that global warming could harm the US.
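For anyone who wants to try the regression step, here is a minimal sketch of what a simple linear regression looks like in R. The column names dem_pct_2016 and x65_happening are borrowed from the code elsewhere in this post, but the data frame toy and every number in it are simulated stand-ins, not the real opinion or election data, and this is not necessarily the exact model I ran.

```r
# Toy data only: these values are simulated, NOT the Yale or election data.
set.seed(42)
toy = data.frame(dem_pct_2016 = runif(50, min = 10, max = 90))
toy$x65_happening = 50 + 0.3 * toy$dem_pct_2016 + rnorm(50, sd = 3)

# Simple linear regression: "global warming is happening" % as a function of Dem vote %
fit = lm(x65_happening ~ dem_pct_2016, data = toy)

summary(fit)  # coefficients, standard errors, R-squared
coef(fit)     # just the intercept and slope
```

A positive slope on dem_pct_2016 is the pattern described above; with the real data you would swap toy for your own joined table.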
Here is a scatterplot matrix of the data; the bottom two rows and rightmost two columns contain the dem% and gop% data. You can see the Dem data roughly following the ‘agree about global warming’ data, while the GOP data does the opposite. The scatterplot matrix is a good way to take a quick first look at your data. It is made with the pairs() function, which is in the standard packages that load when you start up R.
Not to be deterred by the weight of evidence, I kept working with the data. Here is an example of the output from the ggplot2 tutorial using the Climate Change/Election data. (County Victor is a new column created using an ifelse() command that returns Dem or GOP depending on which was the larger value.)
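If you have not used pairs() before, a rough sketch of the call looks like this. Again, toy is a simulated stand-in that I made up for illustration (only the column names come from the real table), so the picture will not match the one above.

```r
# Toy stand-in for the real table: simulated values, only the column names are real.
set.seed(1)
toy = data.frame(dem_pct_2016 = runif(100, min = 10, max = 90))
toy$gop_pct_2016 = 100 - toy$dem_pct_2016 - runif(100, min = 0, max = 5)
toy$x65_happening = 50 + 0.3 * toy$dem_pct_2016 + rnorm(100, sd = 3)

# One scatterplot panel for every pair of columns in the data frame
pairs(toy)

# Or pick the columns (and their order) with a one-sided formula
pairs(~ x65_happening + dem_pct_2016 + gop_pct_2016, data = toy)
```

The formula version is handy when the table has many columns and you only want a few of them in the matrix.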
tblAnalysisCC$County_victor = ifelse(tblAnalysisCC$dem_pct_2016 > tblAnalysisCC$gop_pct_2016, "Dem", "GOP")
Here is a group of choropleths comparing the election data to the global warming opinion data (using the choropleth lab as a framework).
QTM EXAMPLE (tmap)
The tmap package demonstrated in the Geospatial DataCamp course has this amazing function called qtm() which creates a choropleth in a single line. It is great for getting a rough idea of what your data will look like – if you like the results of the qtm() call you can then expand the map by recalling it with tm_shape() + tm_*** similar to the methodology of the ggplot2 package.
Assuming that you have a spatial dataframe, the choropleth below can be made with the following line of code (where shp = is the spatial dataframe and fill = is set to the variable to map):
qtm(shp = usmapjoin, fill = "x65_happening")
By wrapping the previous code in save_tmap() the package makes a web map, which is pretty cool. (Newer versions of tmap have renamed this function to tmap_save().)
save_tmap( qtm(shp = usmapjoin, fill = "x65_happening"), filename = "blog.html")
I can't embed the webmap but here is a link: http://42arthurdent.github.io/blog.html
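To show what expanding a qtm() call into tm_shape() + tm_*** looks like, here is a rough sketch. It assumes tmap is installed and uses the World example data that ships with the package as a stand-in for usmapjoin; the HPI column is just a convenient numeric variable in that example data, not anything from the global warming dataset.

```r
library(tmap)  # assumes the tmap package is installed
data(World)    # example country polygons shipped with tmap (stand-in for usmapjoin)

# Quick look, same pattern as the qtm() call above
qtm(World, fill = "HPI")

# The same choropleth built up layer by layer, ggplot2 style
tm_shape(World) +
  tm_polygons("HPI") +
  tm_layout(frame = FALSE)
```

Once the map is built from layers like this, each piece (legend, palette, borders) can be adjusted on its own, which is the main reason to move on from qtm().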
Postscript: while looking into the global warming opinion data I was reminded of this gem:
But that then led me to the horrible revelation that this R package exists:
Which is how this was created (it uses ggplot2 so it is kind of relevant):