library(sp)
library(raster)
library(maps)
Day 20 favorite
Code source inspriation
A course I took on Species Distribution modeling (sdm). I wrote up my course project into a bookdown book. This was my first foray into making maps in R. https://eeholmes.github.io/Species-Dist-Modeling—Trillium/
Set-up Vermont and New Hampshire boundaries
There is surely an easier way to do this.
First I need to define an raster::extent
object for a box bounding NH and VT.
<- raster::extent(-73.61056, -70.60205, 42.48873, 45.37969) NHVT
I download the shapefile for the NH and VT state borders using getData()
which gives polygons for countries. Level 1 will be the state boundaries (I assume). The shape file has all the states. Then I use subset()
to get the two states that I want. path
says where to save the downloaded file.
<- raster::getData('GADM', country='USA', level=1, path="data")
usashp <- subset(usashp, NAME_1 %in% c("New Hampshire", "Vermont")) nhvtshp
Check the projection for this shapefile:
crs(nhvtshp)
I can plot the shapes.
plot(nhvtshp, border="blue", axes=TRUE)
I save the shapefile data to a file so I can use it later.
save(nhvtshp, NHVT, file=here::here("content", "data",
"shapefiles.RData"))
Download Trillium records
This map will use the following libraries:
library(dismo) # as sdm package
library(sp)
library(here)
Load the shapefiles created earlier.
load(file=here::here("content", "data",
"shapefiles.RData"))
Download records
I will download occurrence data for Trillium grandiflorum and Trillium undulatum in my NHVT bounding box from the Global Biodiversity Information Facility. nrecs
seems to be ignored. geo
means only points with longitude and latitude. removeZeros
means get rid of NA in location. ext
is the bounding box to use.
First I set where I will save the file and check if it is already there. I do this because if I rerun this script, I don’t want to re-download. Note that GBIF data is updated weekly so using a time-stamp on your file might be good, but I am not doing that for this example.
<- here::here("content", "data", "trillium_presences.RData") filePath
Now I download if I haven’t downloaded already because this takes awhile. The downloaded data has many columns that I don’t need. I will subset the following columns. select
in the subset()
call says what columns to use.
if (!file.exists(filePath)) {
# Download
<- dismo::gbif("Trillium",
grandiflorum species = "grandiflorum",
nrecs = 300, geo = TRUE,
removeZeros = TRUE, ext = NHVT
)<- dismo::gbif("Trillium",
undulatum species = "undulatum",
nrecs = 300, geo = TRUE,
removeZeros = TRUE, ext = NHVT
)# select columns
<- c("species", "lat", "lon", "locality", "year", "coordinateUncertaintyInMeters", "occurrenceID", "occurrenceRemarks", "geodeticDatum")
colsWeNeed <- subset(grandiflorum, select = colsWeNeed)
grandiflorum <- subset(undulatum, select = colsWeNeed)
undulatum <- rbind(grandiflorum, undulatum)
trillium.raw save(trillium.raw, file = filePath)
}
Load in the presences data (saved from code above).
load(filePath)
Check the projection to make sure it makes sense and there is only one value. Check that it is the same projection as my other layers.
unique(trillium.raw$geodeticDatum) # "WGS84"
[1] "WGS84"
trillium.raw
is just a data frame. I make it a sp
object (specifically a SpatialPointsDataFrame
) using sp::coordinates()
to specify which columns are the longitude and latitude.
<- trillium.raw
trillium ::coordinates(trillium) <- c("lon", "lat") sp
Check that it looks ok and there are no NAs.
summary(trillium$lon)
Min. 1st Qu. Median Mean 3rd Qu. Max.
-73.61 -73.01 -72.57 -72.38 -71.73 -70.60
summary(trillium$lat)
Min. 1st Qu. Median Mean 3rd Qu. Max.
42.49 43.61 44.18 44.00 44.47 45.38
The coordinateUncertaintyInMeters
column give the uncertainty of the observation location. Some of the uncertainties are huge and I don’t want those.
table(cut(trillium$coordinateUncertaintyInMeters, c(0, 200, 500, 1000, 2000, 5000)))
(0,200] (200,500] (500,1e+03] (1e+03,2e+03] (2e+03,5e+03]
3131 142 92 87 118
I am going to keep only those locations with a location accuracy within 200m.
<- which(trillium$coordinateUncertaintyInMeters < 200)
good <- trillium[good, ] trillium
Plot the locations
Now I can plot the occurrences points and add the NH and VT state boundaries. Trillium undulatum is much more common.
plot(nhvtshp, border = "blue", axes = TRUE)
plot(subset(trillium, species == "Trillium grandiflorum"), pch = 19, cex = 0.25, add = TRUE)
plot(subset(trillium, species == "Trillium undulatum"), pch = 19, cex = 0.25, col = "red", add = TRUE)
title("Trillium grandiflorum (black) and Trillium undulatum (red) records", cex.main=0.75)
Save
png(file = "day20_trillium.png", bg = "white")
plot(nhvtshp, border = "blue", axes = TRUE)
plot(subset(trillium, species == "Trillium grandiflorum"), pch = 19, cex = 0.25, add = TRUE)
plot(subset(trillium, species == "Trillium undulatum"), pch = 19, cex = 0.25, col = "red", add = TRUE)
title("Trillium grandiflorum (black) and Trillium undulatum (red) records", cex.main=1)
dev.off()