Lecture notes from Bio793: Plant Ecology / Niche modeling / Modeling in R
Spring 2010
Jason Fridley, Syracuse University
These are lecture notes from the Spring 2010 version of Bio793 at Syracuse. The objectives of the course were for students to learn R, implement modeling approaches common in ecology (particularly species distribution modeling), and be introduced to the utility and construction of hierarchical models. The course focused on implementation rather than statistical theory. I welcome comments (fridley -at- syr.edu). The original course syllabus is here.
Recommendations on texts: We did not use a formal text for the course, but the following books proved helpful:
Crawley MJ (2007). The R Book (Wiley). The standard 'off the shelf' R handbook; covers all the basics and most common built-in functions for standard statistical tests used by ecologists.
Bolker BM (2008). Ecological Models and Data in R (Princeton). Superb introduction to likelihood, statistical distributions, optimization.
McCarthy MA (2007). Bayesian Methods for Ecology (Cambridge). Gentle and lucid introduction to Bayes via BUGS.
Royle JA, Dorazio RM (2008). Hierarchical modeling and inference in ecology (Academic). Good introduction to hierarchical models using R and BUGS, incorporation of observation error, etc.
Pinheiro JC, Bates DM (2004). Mixed-effects Models in S and S-PLUS (Springer). Invaluable resource for using nlme package, esp. autocorrelation structures.
Zuur AF, Ieno EN, Walker NJ, Saveliev AA, Smith GM (2009). Mixed Effects Models and Extensions in Ecology with R (Springer). Includes hard-to-find instructions for techniques like zero-truncated and zero-inflated models.
Most of the earlier topics below use the treedata.csv dataset (see tree_metadata.txt). Other example datasets are available as links in the appropriate sections.
Course topics
General linear models (least squares) in R
- lm function: regression (simple, multiple), ANOVA, ANCOVA
- stepwise fitting procedures (step or stepAIC)
- what ANOVA contrasts mean, post-hoc testing
- output summaries (R², AIC, confidence intervals, coefficients)
- nonlinear (least squares) models: nls, nonlinear ANCOVA
Generalized linear models in R
- non-normality accommodated via transformation (link functions)
- error distributions: the exponential family (normal, Poisson, binomial, Gamma)
- glm function (iteratively re-weighted least squares)
- examples: logistic regression, Poisson error (count data), overdispersed data (quasi-likelihood, negative binomial)
Generalized additive models in R
- non-parametric smoothers, loess
- gam in the mgcv library
- gam in the gam library
Classification and regression trees in R
- classification
- regression tree analysis (rpart package)
- bagging trees (ipred package)
- random forests (randomForest package)
Likelihood in R
- writing likelihood functions
- optimization and the negative log-likelihood
- likelihood profiles
- model comparison and information criteria
Hierarchical models I: parameter models and random effects in R
- fixed effects vs. random effects
- block models
- ML vs. REML fitting
- zero-truncated and zero-inflated models (VGAM, pscl packages)
- GLMMs (generalized linear mixed models), lme4 package
- GAMMs (generalized additive mixed models), mgcv package
Hierarchical models II: correlated observations in R
- R correlation structures in nlme (corClasses)
- basic time series models, repeated measures
- spatial autocorrelation
Bayesian methods in R and BUGS
- Bayesian vs. frequentist methods
- Bayes' rule, priors and posteriors
- sampling posterior distributions via BUGS
- hierarchical Bayes
- observation error, latent variables, process and parameter models
- BUGS and R
Maximum entropy with Maxent
- installation and data formatting for Maxent
- interpreting model output
- integration with GIS data
R and GRASS GIS
- GRASS GIS: what it is, how you get it
- landscape metrics in GRASS
- the R-GRASS interface
Last updated on May 20, 2010 by Jason Fridley (fridley - at - syr.edu).