Hi all! I wanted to share the data that I downloaded into R, how I cleaned it, and a quick map that I made, based on the choropleth lab, to test everything. As a quick overview, I am using an EPA dataset that reports the amounts of specific chemicals in the air by county in the US. It also includes a significant amount of population data, which I felt made it a well-rounded set to use. As a note, the state map portion came from the ACS data that we imported earlier in class.
Download directly to R:
urlEQuality = "https://edg.epa.gov/data/Public/ORD/NHEERL/EQI/Eqidata_all_domains_2014March11.csv"
cEQualityColClasses = c("stfips"="character")
dfEQuality = read.csv(urlEQuality, colClasses=cEQualityColClasses)
row.names(dfEQuality) = dfEQuality$stfips
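In case the colClasses part looks odd: FIPS codes can start with a zero, and read.csv drops the leading zero if it reads the column as a number. Here is a small sketch of the difference, assuming stfips holds codes like that. The csvText name and the two rows are made up for illustration and are not from the EPA file:

```r
# Toy CSV text standing in for the EPA file; the two rows are made up
csvText = "stfips,pct_unemp\n01001,5.2\n36061,7.1"

# Default behaviour: stfips is read as a number, so the leading zero is lost
dfDefault = read.csv(text=csvText)

# With colClasses: stfips stays a character string, zero included
dfChar = read.csv(text=csvText, colClasses=c("stfips"="character"))

print(dfDefault$stfips)
print(dfChar$stfips)
```

Keeping the code as text matters later, because it is what lets the table line up with the map polygons.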
Cut down the columns (the original download had well over 200, covering all kinds of chemicals and contaminants that I didn’t know well enough to use):
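library(dplyr)  # select() and rename() come from dplyr
# The start of this call was garbled in the post. stfips and pct_unemp are
# assumed to be among the kept columns because the later steps use them.
dfEQuality = select(dfEQuality, stfips, pct_unemp,
pct_no_eng, med_hh_inc, pct_vac_units, pct_rent_occ, a_pb_ln, a_so2_mean_ln,
a_pm10_mean_ln, a_pm25_mean, a_no2_mean_ln, a_o3_mean_ln, a_co_mean_ln,
radon_zone, herbicides_ln, fungicides_ln, insecticides_ln)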
Rename columns to something less annoying and faster to type:
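# The start of this call was garbled in the post; only the renames that survived are shown
dfEQuality = rename(dfEQuality, pm25_ln=a_pm25_mean, no2_ln=a_no2_mean_ln, o3_ln=a_o3_mean_ln, co_ln=a_co_mean_ln)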
At this point I started taking the data out of its natural log (ln) form, but decided to hold off for the time being so I didn’t end up with a series of extremely small numbers. If I do that at some point, the base code would be:
# mutate() computes new columns; select() only picks existing ones
mutate(dfEQuality, a_pb = exp(a_pb_ln), a_so2_mean = exp(a_so2_mean_ln),...)
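A minimal sketch of that back-transform on a toy data frame, assuming the _ln columns are natural logs. The dfToy name and its values are made up for illustration, and it uses base R so it doesn't depend on dplyr:

```r
# Made-up values; the two column names are copied from the EPA set
dfToy = data.frame(
  a_pb_ln = log(c(0.002, 0.010)),
  a_o3_mean_ln = log(c(0.040, 0.050))
)

# Find every column ending in _ln and undo the natural log with exp()
lnCols = grep("_ln$", names(dfToy), value=TRUE)
dfToy[lnCols] = lapply(dfToy[lnCols], exp)

# Drop the _ln suffix so the names match what the values now are
names(dfToy) = sub("_ln$", "", names(dfToy))
print(dfToy)
```

Doing it by name pattern avoids typing out every column, but it only makes sense if every column ending in _ln really is a natural log.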
Create vectors that can be used to map some preliminary images and make sure that the data loaded correctly:
o3 = dfEQuality$o3_ln
unemp = dfEQuality$pct_unemp
co = dfEQuality$co_ln
Create choropleth maps (because this is from the choropleth lab, I won’t go too in-depth on the steps):
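library(rgdal)         # make_EPSG(); also attaches sp for spTransform() and CRS()
library(classInt)      # classIntervals(), findColours()
library(RColorBrewer)  # brewer.pal()
dfEpsg = make_EPSG()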
prj4 = dfEpsg[which(dfEpsg$code == 2260),"prj4"]
spdfCounty = spTransform(spdfCounty, CRS(prj4))
intClasses = 6
ciFisher = classIntervals(o3, n=intClasses, style="fisher")
ciEqual = classIntervals(o3, n=intClasses, style="equal")
ciQuantile = classIntervals(o3, n=intClasses, style="quantile")
colRamp = brewer.pal(intClasses, "YlGnBu")
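# Sanity check: each ozone value next to its class number and colour
# (breaks rounded to 2 decimals, since rounding to thousands only made sense for the lab's income data)
cbind(o3, findInterval(o3, ciFisher$brks), findInterval(o3, round(ciFisher$brks, 2)), findColours(ciFisher, colRamp))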
options(scipen=10)
plot(ciFisher, colRamp, main="Fisher-Jenks Classification")
plot(ciEqual, colRamp, main="Equal Interval Classification")
plot(ciQuantile, colRamp, main="Quantile Classification")
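For anyone who hasn't done the lab, here is a rough base-R sketch of what the equal and quantile styles are doing, using made-up numbers (x, n and the break names are just for illustration). Fisher-Jenks is more involved, so I leave that one to classIntervals:

```r
x = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)  # made-up values with one outlier
n = 3                                   # number of classes

# Equal interval: split the range into n equally wide bins
brkEqual = seq(min(x), max(x), length.out=n + 1)

# Quantile: put roughly the same number of values in each bin
brkQuantile = quantile(x, probs=seq(0, 1, length.out=n + 1), names=FALSE)

# Assign each value to a class (rightmost.closed keeps max(x) in the top class)
clsEqual = findInterval(x, brkEqual, rightmost.closed=TRUE)
clsQuantile = findInterval(x, brkQuantile, rightmost.closed=TRUE)

print(table(clsEqual))
print(table(clsQuantile))
```

With an outlier like this, equal intervals put almost everything in the first class, while quantile breaks spread the values out. That is why it is worth comparing the three plots before picking one. The map itself uses the Fisher breaks: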
plot(spdfCounty, bg="white", col=findColours(ciFisher, colRamp))
title("Amount of Ozone Present During Air Quality Assessments")
# Legend labels: no "$" or thousands separator here, since these are ln ozone values, not dollars
strLegend = paste(
format(round(ciFisher$brks[-(intClasses + 1)], 2), nsmall=2), " to ",
format(round(ciFisher$brks[-1], 2), nsmall=2), sep=""
)
legMain = legend(
"topleft", legend=strLegend,
title="Ozone (O3), 2014", bg="white", inset=0, cex=0.5,
fill=colRamp
)
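One thing worth double checking: plot() colours the polygons in the order they sit in spdfCounty, while findColours() returns colours in the order of the rows in dfEQuality, so the two need to line up by FIPS code. A sketch of the idea with made-up vectors (polyFips, dataFips and dataCols are hypothetical stand-ins, since the name of the FIPS field in spdfCounty isn't shown here):

```r
# Hypothetical FIPS codes in polygon order and in data-frame order
polyFips = c("36061", "01001", "06037")
dataFips = c("01001", "06037", "36061")
dataCols = c("#FFFFCC", "#41B6C4", "#0C2C84")  # one colour per data row

# match() gives, for each polygon, the row of the data with the same FIPS
idx = match(polyFips, dataFips)

# Reorder the colours into polygon order; polygons with no match get NA
polyCols = dataCols[idx]
print(polyCols)
```

The reordered vector is what would go into col= in the plot() call, in place of the raw findColours() output.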