**Bio793: Plant Ecology / Niche modeling / Modeling in R**

Spring 2010

Jason Fridley

When: Mondays 12-2

Where: 156 LSC

Our basic objectives will be to get comfortable with the R environment, learn to implement different modeling approaches common to ecologists (and particularly those relevant to species distribution modeling) in R, and understand hierarchical models and why they are so useful in ecology. Note that this is not a "statistics" course per se; I focus on the implementation and leave the nitty-gritty stats questions to statisticians (or more often textbooks written by statisticians).

I'd like each student to be assigned to one class period as our official record keeper for that day. Recorders will keep detailed notes on our discussion, including R code, for posting as a text document on the class website to serve as subsequent reference material.

In class we'll use the following datasets (csv files, with metadata). We will probably add to them from time to time as we require different sorts of data.

treedata.csv

tree_metadata.txt

Tf2.RData (for hierarchical model exercises; after unzipping, import via the *load* function)

Tf3.RData

treedbh.csv
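As a rough sketch of how files like these can be read into R: csv files go through *read.csv* and .RData files through *load*. The toy data frame and temporary files below are hypothetical stand-ins so the snippet runs without the class data.

```r
# Stand-in data so the example runs without the class files (hypothetical values)
toy <- data.frame(species = c("Acer rubrum", "Tsuga canadensis"),
                  cover = c(3, 5))

# csv files are read with read.csv(); here we round-trip through a temp file
csv_path <- tempfile(fileext = ".csv")
write.csv(toy, csv_path, row.names = FALSE)
dat <- read.csv(csv_path)

# .RData files are restored with load(), which recreates the saved objects
# under their original names and returns those names as a character vector
rdata_path <- tempfile(fileext = ".RData")
save(toy, file = rdata_path)
rm(toy)
restored <- load(rdata_path)
```

With the class files downloaded into the working directory, the same calls would presumably look like `read.csv("treedata.csv")` and `load("Tf2.RData")`.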

Data for Maxent:

treedata2.csv (samples file; can copy and rename as treedata3.csv for environmental layers in 1st example)

grids.zip (ARC ASCII grids for environmental layers directory)

treedataxy.txt

A rough syllabus follows. This is subject to change depending on our progress and your interests; if there is something unmentioned here that you think would be of general interest please let me know. I also indicate relevant chapters from the Crawley [The R Book: 2007] and Bolker texts [Ecological Models and Data in R: 2008], and will update these with additional references as needed.

**Jan 25**: R in a nutshell [*class notes (Alyssa Pontes)*]

- installing

- basic commands

- data input and output

- object manipulation, vector processing vs. for loops

- helpful packages

- graphing basics

*Text chapters*: Crawley Chaps. 1-2
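To give a flavor of the "vector processing vs. for loops" item, here is a minimal sketch with made-up numbers; both versions compute the same squares, but the vectorized form is shorter and usually faster in R.

```r
x <- c(1.5, 2.0, 3.5, 4.0)   # hypothetical values

# for loop: fill a preallocated vector one element at a time
sq_loop <- numeric(length(x))
for (i in seq_along(x)) {
  sq_loop[i] <- x[i]^2
}

# vectorized: the operation is applied to the whole vector at once
sq_vec <- x^2
```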

**Feb 1**: General linear models (least squares) in R [Jason's lecture notes] [notes in R code from Lee Davis] [**lm** examples from Lee Davis]

- old-school: normal residuals, independent observations, homoscedasticity

- *lm* function: regression (simple, multiple), ANOVA, ANCOVA

- stepwise fitting procedures (step or stepAIC)

- what ANOVA contrasts mean, post-hoc testing

- output summaries (R², getting AICs, confidence intervals, coefficients)

- nonlinear (least squares) models: *nls*, nonlinear ANCOVA

*Text chapters:* Bolker Chap. 9; Crawley Chap. 9 (modeling basics)
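A minimal sketch of the *lm* workflow outlined above, using R's built-in `cars` and `mtcars` datasets rather than the class data (the choice of variables is purely illustrative):

```r
# Simple regression on the built-in 'cars' data: stopping distance vs. speed
fit <- lm(dist ~ speed, data = cars)

coefs <- coef(fit)               # intercept and slope
r2    <- summary(fit)$r.squared  # R-squared
aic   <- AIC(fit)                # Akaike information criterion
ci    <- confint(fit)            # 95% confidence intervals for the coefficients

# Multiple regression followed by stepwise selection on AIC;
# step() starts from the full model and drops terms if AIC improves
fit_full <- lm(mpg ~ wt + hp + qsec, data = mtcars)
fit_step <- step(fit_full, trace = 0)
```

`summary(fit)` prints most of these quantities at once; the individual extractor functions are handy when the values are needed in later code.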

**Feb 8**: General*ized* linear models in R [*recorder: Portia Osborne*] [Jason's lecture notes] [code from P. Osborne]

- non-normality accommodated via transformations of the mean response (*link functions*)

- error distributions: the exponential family (normal, Poisson, binomial, Gamma)

- *glm* function (iteratively re-weighted least squares)

- examples: logistic regression, Poisson error (count data), overdispersed data (quasi-likelihood, negative binomial)

- model comparisons (likelihood ratio tests)

*Text chapters:* Bolker Chap. 9.4; Crawley Chap. 13
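The *glm* items above might look something like the following sketch, again on the built-in `mtcars` data rather than the class data (the response and predictor choices are illustrative assumptions only):

```r
# Logistic regression: transmission type (am, coded 0/1) as a function of weight
fit0 <- glm(am ~ 1,  family = binomial, data = mtcars)  # null model
fit1 <- glm(am ~ wt, family = binomial, data = mtcars)  # logit link by default

# Likelihood ratio test comparing the two nested models
lrt <- anova(fit0, fit1, test = "Chisq")

# Poisson regression for count data (log link by default)
fit_pois <- glm(carb ~ wt, family = poisson, data = mtcars)
```

For overdispersed counts, `family = quasipoisson` is the quasi-likelihood counterpart in base R.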

**Feb 15**: Generalized additive models in R [Jason's lecture notes] [GAM paper: Bio et al. 1998] [code from John Wiley]

- non-parametric smoothers, *loess*

- *gam* in the mgcv library; *gam* in the gam library

*Text chapters:* Crawley Chap. 18

**Feb 22**: Classification and Regression Trees [Jason's lecture notes] [CART techniques: Prasad et al. 2006] [model fitting exercise by Andrew Siefert]

- Classification

- Regression tree analysis (rpart package)

- Bagging Trees (ipred package)

- Random Forests (randomForest package)

- Boosted regression trees (model averaging; gbm package)

*Text chapters:* Crawley Chap. 21

**Mar 1**: Likelihood I [*recorder: Chris Duke*] [Jason's lecture notes] [exercises: answer code]

- writing likelihood functions

- optimization and the negative log-likelihood

- more on error distributions

*Text chapters:* Bolker Chap. 6
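As one hedged illustration of writing a likelihood function and minimizing the negative log-likelihood, the sketch below fits a normal distribution to simulated data with *optim*. The data, starting values, and log-scale parameterization of the standard deviation are all assumptions made for the example.

```r
set.seed(1)
y <- rnorm(200, mean = 5, sd = 2)   # simulated data (hypothetical values)

# Negative log-likelihood for a normal model; the sd is fit on the log scale
# so the optimizer cannot propose a negative standard deviation
nll <- function(par, y) {
  mu    <- par[1]
  sigma <- exp(par[2])
  -sum(dnorm(y, mean = mu, sd = sigma, log = TRUE))
}

# optim() minimizes by default (Nelder-Mead); extra arguments go to nll()
fit <- optim(par = c(0, 0), fn = nll, y = y)
mu_hat    <- fit$par[1]
sigma_hat <- exp(fit$par[2])
```

For this simple model the estimates should land close to the sample mean and the maximum likelihood standard deviation, which makes it a convenient check before moving to models without closed-form answers.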

**Mar 8**: Bayesian methods [Jason's lecture notes] [exercise answers by Megan Skrip]

- Bayesian vs. frequentist methods

- Bayes' rule, priors and posteriors

- sampling posterior distributions, MCMC

- WinBUGS

**Mar 15**: Spring break

**Mar 22**: Hierarchical modeling I [*recorder: Giancarlo Sadoti*] [Jason's lecture notes] [attempt at exercise answer by Jason]

- introduction to modeling hierarchically

- fixed effects vs. random effects, parameter models

- block models

- ML vs. REML fitting

- zero-truncated and zero-inflated models

- GLMMs (generalized linear mixed models), *lme4*

- GAMMs (generalized additive mixed models), *mgcv*

*Text chapters:* Crawley Chap. 19

**Mar 29**: Hierarchical modeling II [Jason's lecture notes] [notes from Anthony C.] [spatial stats notes from Catherine R.]

- models of correlated observations

- built-in R correlation structures (corClasses)

- basic time series models, repeated measures

- spatial autocorrelation

*Text chapters:* Crawley Chap. 24, Bolker Chap. 10

**Apr 5**: Hierarchical modeling III [*recorders: Patrick Raney*, *Adam Willis*] [Clark 2005 reading] [Jason's lecture notes] [notes from Patrick R.]

- hierarchical Bayes

- observation error, latent variables, process and parameter models

- multiple dependent variable data sources

- more WinBUGS and R

**Apr 12**: Non-R methods for niche modeling [*recorder: Joe Vineis*] [Phillips & Dudik 2008 reading] [Jason's lecture notes]

- Maximum entropy (Maxent)

**Apr 19**: R and GRASS GIS I [*recorder: Liz Droge*] [Jason's lecture notes] [GRASS on Macs from Liz D.]

- GRASS GIS: what it is, how you get it

- landscape metrics in GRASS

**Apr 26**: R and GRASS GIS II [recorder: *Aditya Rao*] [Jason's lecture notes]

- the R-GRASS interface

**May 3**: Catch-up, final HB example [lecture notes]