May 3: A final hierarchical Bayesian example

First I create a set of predictor variables, including elevation, TCI, plot size (10, 100, or 1000 m2), and project number (there were three different research projects in the area providing data for this study).  A total of 200 plots were sampled.  The sample function draws uniformly, with replacement here, from an integer vector.
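As a quick standalone illustration (a toy call, not part of the simulation), sample draws each element with equal probability, and replace=TRUE allows repeats; the seed below is arbitrary:

```r
# toy illustration of sample(); the seed is an arbitrary choice
set.seed(42)
draws = sample(200:2000, 5, replace = TRUE)
draws          # five integer draws, each between 200 and 2000
draws / 1000   # rescaled to km, as done for elevation below
```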

#simulated predictor variables

N = 200       #number of plots

elev = sample(200:2000,N,replace=T)/1000        #elevation in km

tci = runif(N,min=1,max=25)                     #tci

size = sample(c(10,100,1000),N,replace=T)       #plot sizes

project = sample(c(1,2,3),N,replace=T,prob=c(.2,.2,.6)) #vegetation project

Next I use this set of predictors to simulate a species occurrence dataset, based on 1) an environmental deterministic function (elevation and tci); 2) detection probability, which increases with plot size; and 3) a random offset for project, which describes some unknown way that species occurrence varies by project.

#simulated response variable

#deterministic env process parameters

b1 = .5     #overall intercept

b2 = 1      #elev coefficient

b3 = -.01   #elev^2 coefficient

b4 = -.3    #tci coefficient

#detection process parameters

d1 = 0      #detection intercept

d2 = .7     #plot size coefficient

tau = 1     #precision (1/variance) of random effect

#actual occurrence is a function of env variables and a random effect for project

re = rnorm(3,0,sqrt(1/tau)) #random effect varies by project; sd is sqrt(1/precision)

lp = b1 + b2*elev + b3*(elev^2) + b4*tci + re[project]

env.occ = 1 / (1+exp(-lp))     #inverse-logit transformation gives occurrence probability
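A small aside: R's built-in plogis computes exactly this inverse logit, so the line above could equally be written env.occ = plogis(lp). A quick check with an arbitrary toy value:

```r
# plogis(x) is R's inverse logit, 1/(1+exp(-x))
x = 0.3  # arbitrary toy value
all.equal(1 / (1 + exp(-x)), plogis(x))  # TRUE
```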

#real occurrence includes Bernoulli process error

real.occ = rbinom(N,size=1,prob=env.occ)

#probability of detection is a function of log plot size

dp = d1 + d2*log(size)

detect.prob = 1 / (1+exp(-dp)) #inverse-logit transformation gives detection probability

#observation of occurrence depends on actual occurrence and detection

obs.occ = rbinom(N,size=1,prob=detect.prob*real.occ)
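With these parameters the three plot sizes map to three fixed detection probabilities. A standalone check (re-using d1 = 0 and d2 = .7 from above) confirms that detectability increases with plot size:

```r
# detection probability for each plot size, using d1 = 0, d2 = .7
sizes = c(10, 100, 1000)
dp.check = 0 + .7 * log(sizes)        # linear predictor on the logit scale
round(1 / (1 + exp(-dp.check)), 3)    # one detection probability per size
```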

#input data frame

df = data.frame(elev,tci,size,obs.occ,project)

Now we have a dataset of known properties.  We fit an HB model via BUGS in R:

library(BRugs)

library(R2WinBUGS)

J = nrow(df)       #total observations

M = 3              #number of groups (projects)

obs = df$obs.occ   #response variable

elev = df$elev     #predictor variable

tci = df$tci       #predictor variable

size = log(df$size) #predictor variable

project = df$project    #projects

bugsdata = list("J","M","obs","elev","tci","size","project")

parameters = c("beta1","beta2","beta3","beta4","delta1","delta2","tau")

#initialization of parameters via random functions

inits = function() list(beta1=rnorm(1,.5,10),beta2=rnorm(1,1,10),

beta3=rnorm(1,-.01,10),beta4=rnorm(1,-.3,10),

delta1=rnorm(1,0,10),delta2=rnorm(1,.7,10),tau=rlnorm(1,1,1))

We can specify the BUGS model in R and write it to a text file for BUGS input:

sink("bugsmod1.txt")

cat("model {

for (i in 1:J) {   #loop over J observations

#occurrence model, including random effect u

logit(p[i]) <- beta1+beta2*elev[i]+beta3*elev[i]*elev[i]+beta4*tci[i]+u[project[i]]

#detection model

logit(d[i]) <- delta1 + delta2*size[i]

#observation depends on independent probabilities of detection and occurrence

op[i] <- p[i]*d[i]

obs[i] ~ dbern(op[i])

}

# random effects for M groups

for (j in 1:M) {

u[j] ~ dnorm(0,tau) #tau is precision (1/variance) between projects

}

# priors

delta1 ~ dnorm(0.0,1.0E-6)

delta2 ~ dnorm(0.0,1.0E-6)

beta1 ~ dnorm(0.0,1.0E-6)

beta2 ~ dnorm(0.0,1.0E-6)

beta3 ~ dnorm(0.0,1.0E-6)

beta4 ~ dnorm(0.0,1.0E-6)

# 'hyperprior'

tau ~ dgamma(0.001,0.001)

}

",fill=TRUE)

sink()

We're now ready to evaluate the model in OpenBUGS and examine the result. You can of course tweak the chaining parameters below.

#send to OpenBUGS

sim1 = bugs(bugsdata, inits, parameters,"bugsmod1.txt",n.chains=3,

n.iter=1000,n.burnin=500,program="openbugs")

print(sim1)

attach.bugs(sim1)

sim1$summary
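Since the data were simulated, the obvious diagnostic is to compare posterior means with the generating values. Here is a hedged sketch: the recovery helper is my own convenience function, and the commented usage line assumes the R2WinBUGS layout, in which sim1$summary is a matrix with parameter names as row names and a "mean" column. The truth vector simply restates the simulation settings above.

```r
# convenience helper (not from the original post): tabulate posterior
# means against the known generating values
recovery = function(post, truth) round(cbind(truth, post, diff = post - truth), 3)

# generating values from the simulation above
truth = c(beta1 = .5, beta2 = 1, beta3 = -.01, beta4 = -.3,
          delta1 = 0, delta2 = .7, tau = 1)

# usage with the fitted object (not run here; requires OpenBUGS output):
# recovery(sim1$summary[names(truth), "mean"], truth)
```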

There is some reason for concern that the above model is not entirely correct.  It involves only one Bernoulli process, whose probability is the product of the occurrence and detection probabilities, rather than two: one for the occurrence model (whether the species is actually there or not), and one for detection conditional on presence (with probability equal to the detection probability when the species is present, and zero otherwise).  Consider the following BUGS model:

sink("bugsmod2.txt")

cat("model {

for (i in 1:J) {   #loop over J observations

#model of occurrences as a Bernoulli trial

logit(p[i]) <- beta1+beta2*elev[i]+beta3*elev[i]*elev[i]+beta4*tci[i]+u[project[i]]

actual[i] ~ dbern(p[i])  #actual presences

#detection model, with the inverse logit written out by hand

d[i] <- delta1 + delta2*size[i]   #linear predictor for detectability

dd[i] <- exp(d[i])/(1+exp(d[i]))  #inverse-logit (logistic) transformation

#observation depends on detectability and actual occurrence

detect[i] <- dd[i]*actual[i] #detectability for presences, zero otherwise

obs[i] ~ dbern(detect[i]) #actual detections

}   #close loop over J observations

# random effects for M groups

for (j in 1:M) {

u[j] ~ dnorm(0,tau) #tau is precision (1/variance) between projects

}

# priors

delta1 ~ dnorm(0.0,1.0E-6)

delta2 ~ dnorm(0.0,1.0E-6)

beta1 ~ dnorm(0.0,1.0E-6)

beta2 ~ dnorm(0.0,1.0E-6)

beta3 ~ dnorm(0.0,1.0E-6)

beta4 ~ dnorm(0.0,1.0E-6)

# 'hyperprior'

tau ~ dgamma(0.001,0.001)

}

",fill=TRUE)

sink()

Aside from writing the inverse logit out by hand, the only difference in bugsmod2.txt is the extra Bernoulli trial for the latent occurrence state.  The problem is that BUGS won't fit this model, or at least it wouldn't fit it for me, regardless of my simulated-data parameterization or initial values.
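A workaround commonly suggested for latent-state occupancy models of this kind is to supply initial values for the latent actual[i], typically all ones, so the chain never starts at a state with zero likelihood (obs[i] = 1 with actual[i] = 0 is impossible under the model). I can't promise this rescues bugsmod2.txt in OpenBUGS, but it is the standard first thing to try. The sketch below is self-contained, using a toy obs vector standing in for df$obs.occ; inits2 and the toy obs are my own names:

```r
# toy stand-in for the real observed data (df$obs.occ)
obs = rbinom(200, size = 1, prob = 0.5)

# inits that also initialize the latent occupancy state at 'present',
# so the sampler starts at a configuration with nonzero likelihood
inits2 = function() list(
  beta1 = rnorm(1, 0, 1), beta2 = rnorm(1, 0, 1),
  beta3 = rnorm(1, 0, 1), beta4 = rnorm(1, 0, 1),
  delta1 = rnorm(1, 0, 1), delta2 = rnorm(1, 0, 1),
  tau = rlnorm(1, 0, 1),
  actual = rep(1, length(obs))  #every latent state starts at 1
)
str(inits2())
```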