Overview

This project is for learning how to do species distribution modeling (SDM) in R. Ultimately, I’ll probably use the biomod2 or Wallace packages, but in the learning phase I am building and running the models manually.

0.1 Prerequisite knowledge

Before going through the code, you should have a basic understanding of spatial data and of working with that data in R. The spatial manipulations done here are super simple, but if you have little or no exposure to spatial data terminology or the raster package in R, go through this chapter on spatial data in R first.

Second, you’ll need a basic understanding of generalized linear models (GLMs) and generalized additive models (GAMs). Go through this introduction to GLMs and either this introduction to GAMs or this introduction on GAMs. For the purpose of the material here, a skim of these introductions is enough to give you a basic understanding of which models are being used.
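As a quick orientation, and not a substitute for the introductions, the sketch below fits a binomial GLM to simulated presence/absence data using only base R. Everything in it is invented for illustration: the predictor name temperature, the sample size, and the true coefficients are assumptions, not values from this project.

```r
# Simulate presence/absence data (1 = species present, 0 = absent).
# All numbers here are made up for illustration.
set.seed(42)
n <- 200
temperature <- rnorm(n)                         # a standardized predictor
p_present <- plogis(-0.5 + 1.5 * temperature)   # true probability of presence
presence <- rbinom(n, size = 1, prob = p_present)
sim <- data.frame(presence, temperature)

# Fit a logistic regression, i.e. a GLM with a binomial family and logit link.
fit <- glm(presence ~ temperature, family = binomial, data = sim)
print(summary(fit)$coefficients)

# Predicted probability of presence at three predictor values.
pred <- predict(fit,
                newdata = data.frame(temperature = c(-1, 0, 1)),
                type = "response")
print(pred)
```

A GAM replaces the straight-line term for temperature with a smooth curve, which is the main difference to keep in mind while skimming.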

0.2 Set-up - R and RStudio

If you have not updated R recently (in the last 6 months), go ahead and do that. Also update RStudio if you haven’t done that recently.
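If you are not sure how old your R installation is, a minimal check along these lines can help. It relies only on the built-in R.version list; the 183-day cutoff is simply an approximation of the 6 months mentioned above.

```r
# Report the running R version and roughly how old that release is.
release_date <- as.Date(
  paste(R.version$year, R.version$month, R.version$day, sep = "-")
)
age_days <- as.numeric(Sys.Date() - release_date)

cat(R.version.string, "\n")
cat("Released about", round(age_days / 30), "months ago\n")

# 183 days is an approximate stand-in for "6 months".
if (age_days > 183) {
  message("This R release is more than 6 months old; consider updating.")
}
```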

0.3 Get the shapefiles for Hubbard Brook

Create a project in RStudio for the SDM building. Within that project, create a folder called data and one called code. Within data create a folder called hbef_boundary. Go to the Species-Dist-Modeling—Trillium repository hbef_boundary folder and download all the files there into your hbef_boundary folder.
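If you prefer to create the folders from the R console rather than by hand, a sketch like the following produces the same layout. It assumes your working directory is the RStudio project folder, which is the default when the project is open.

```r
# Create the folder layout described above, relative to the project root.
# Assumes the working directory is the RStudio project folder.
project_dirs <- c("code", file.path("data", "hbef_boundary"))

for (d in project_dirs) {
  # recursive = TRUE also creates the parent "data" folder;
  # showWarnings = FALSE keeps this quiet if the folder already exists.
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
}

# Confirm that each folder now exists.
print(dir.exists(project_dirs))
```

The shapefile components still need to be downloaded from the repository into data/hbef_boundary afterwards.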

0.4 Set-up - R packages

The code uses the following R packages, which you will need to install. Open RStudio and go to the Packages tab on the right. Then click Install and search for each package.

library(biomod2)
library(dismo)
library(sp)
library(raster)
library(ggplot2)
library(maps)
library(usdm)
library(ecospat)
library(corrplot)
library(MASS)
library(gam)
library(stringr)  # for easy string manipulation
library(tidyr)  # for data wrangling for ggplot
library(knitr)  # for R Markdown
library(here)  # for intelligent file directory navigation
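As an alternative to the Packages tab, the packages can be installed from the console. The sketch below checks which of the packages listed above are missing and installs only those. It assumes they are available from the repository your R session is configured to use, and the interactive() guard is my own addition so that sourcing or knitting the file never triggers an install.

```r
# The packages loaded by the library() calls above.
pkgs <- c("biomod2", "dismo", "sp", "raster", "ggplot2", "maps", "usdm",
          "ecospat", "corrplot", "MASS", "gam", "stringr", "tidyr",
          "knitr", "here")

# Which of them are not yet installed?
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))
print(missing_pkgs)

# Install only the missing ones, and only in an interactive session.
if (interactive() && length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}
```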

0.5 Data downloads

When you run the Rmd files, they will download a lot of data into your project. The next time you run them, the code will look for the downloaded files and will not rerun the downloads.
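The general pattern behind this behavior can be sketched as below. This is an illustration of the idea, not the code used in the Rmd files: the helper name download_if_missing and the example URL are hypothetical.

```r
# Download a file only if it is not already on disk.
# `download_if_missing` is a hypothetical helper; the project's Rmd files
# may organize their downloads differently.
download_if_missing <- function(url, destfile) {
  if (file.exists(destfile)) {
    message("Found ", destfile, "; skipping download.")
    return(invisible(FALSE))
  }
  # Make sure the destination folder exists before downloading.
  dir.create(dirname(destfile), recursive = TRUE, showWarnings = FALSE)
  utils::download.file(url, destfile, mode = "wb")
  invisible(TRUE)
}

# Demonstration with a file that already exists, so no network call is made.
# The URL is a placeholder and is never contacted here.
cached <- tempfile(fileext = ".csv")
writeLines("already here", cached)
downloaded <- download_if_missing("https://example.com/placeholder.csv", cached)
```

To force a fresh download with a pattern like this, delete the cached file and run the code again.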