###############################################################
# #
# Solutions to the exercises in the book #
# Kery, M. 2010. Introduction to WinBUGS for ecologists. #
# Academic Press, Burlington. #
# #
###############################################################
# This document contains the solutions to the exercises in the book. Since this is mostly R and WinBUGS code,
# I chose to format this document as a simple text file, so you can open it directly in R or
# in Tinn-R or similar text editors from which many people run R.
# The document also contains some additional exercises, along with their solutions, which are not
# contained in the first edition of the book.
# Some exercises deal with the Swiss hare data set, which you must first download from the book website
# at www.mbr-pwrc.usgs.gov/pubanalysis/kerybook.
# This is a tab-delimited text file, which you can read into your R workspace using code similar to this:
hares <- read.table("F:/a_solutions WB book/hares.txt", header = TRUE)
# To execute the code, you can copy-paste parts of it into an open R window. Alternatively, you can open
# the text file in Tinn-R (though beware of possible problems with sink(); check the book appendix for a
# workaround) or any other R editor.
# All code was written and tried out with R 2.10.1 and WinBUGS 1.4.3.
# Note that you will have to set the following things appropriately; they are not documented in this code:
# 1. set an appropriate R working directory
# 2. change the bugs.directory setting in the bugs() function
# when using a non-English computer (e.g., to "C:/Programme/WinBUGS14/" on German locales)
# 3. load the package R2WinBUGS and possibly other packages also
# Created in September 2010 by Marc Kery
# Last changes: 20 November 2012
###############################################################
# #
# Chapter 2: Introduction to Bayesian inference #
# #
###############################################################
1. A classical application of Bayes' rule
-------------------------------------
Exercise:
Use Bayes' rule to compute the probability that you actually have the disease for which you have just
tested positive. What you know is this: 5% of the population to which you belong have the
disease and the test has a sensitivity of 99% and a specificity of 95%. Denote D = 'have the disease'
and N = 'not have the disease' and + = 'get a positive test result' and - = 'get a negative test result'.
| means 'given'. Also note that the test sensitivity is p(+|D) and specificity is p(-|N).
Solution:
Start with p(D|+) = p(+|D) * p(D) / p(+)
You can get a positive test result when you have the disease or when you don't have it. Hence, the
denominator can be expressed as follows:
p(+) = p(+|D) * p(D) + (1-p(-|N)) * p(N)
which is this:
0.99 * 0.05 + (1-0.95) * (1-0.05)
Altogether, we get this for the probability to have the disease, given a positive test result:
p(D|+) = (0.99 * 0.05) / (0.99 * 0.05 + (1-0.95) * (1-0.05)) = 0.51
So there is only about a 51% chance of actually having the disease when you were just given a positive
test result. Most people find this very surprising, given the very small error rates of the testing
procedure (1 minus sensitivity and specificity, respectively). Note that this analysis assumes that
what is effectively the prior of your analysis, p(D) = 0.05, is appropriate also for you. If there is a
history in your family of having the disease in question, then the proportion of the general population
having the disease will NOT be a suitable prior for you in this analysis.
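The whole calculation is easily checked in R; here is a minimal sketch (the variable names are my own):

```r
# Known quantities
p.D  <- 0.05    # Prevalence of the disease: p(D)
sens <- 0.99    # Sensitivity of the test: p(+|D)
spec <- 0.95    # Specificity of the test: p(-|N)

# Total probability of a positive test result: p(+)
p.pos <- sens * p.D + (1 - spec) * (1 - p.D)

# Bayes' rule: p(D|+)
sens * p.D / p.pos    # Gives about 0.51
```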
###############################################################
# #
# Chapter 4: First session in WinBUGS #
# #
###############################################################
1. Experiment with a few trivial errors in WinBUGS
-----------------------------------------------
A colleague commented to me that she did not find this exercise very enlightening, since there are so
many ways to make an error in coding an analysis in WinBUGS. The latter may well be true, but it is
still very important that you get a feel for what may have gone wrong when WinBUGS crashes or reports
an error. So here you can experiment with a few such causes for errors and see what WinBUGS tells you
about the kind of error you made.
The part of the exercise with missing values is important for practical analyses, since those will
always include missing values, and you have to know how to deal with them. Therefore, this is probably
the most important exercise in this section to solve.
###############################################################
# #
# Chapter 5: Running WinBUGS from R #
# #
###############################################################
1. Informative priors
------------------
Retain one sample from each of the two data-generating mechanisms, i.e., one pair of data sets (with 10 and
1000 birds measured). Then define a model with prior information included. You also have to choose the
initial values carefully, because WinBUGS will crash when supplied with initial values outside of the range
over which the prior of a parameter is defined.
# Save model
sink("model.txt")
cat("
model {
  # Priors
  population.mean ~ dunif(500, 590)    # Average believed to lie between 500 and 590 g
  precision <- 1 / population.variance
  population.variance <- population.sd * population.sd
  population.sd ~ dunif(0, 100)

  # Likelihood
  for(i in 1:nobs){
    mass[i] ~ dnorm(population.mean, precision)
  }
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(mass = y1000, nobs = length(y1000))
# Function to generate starting values: have to adapt that, too
inits <- function()
list (population.mean = rnorm(1,550), population.sd = runif(1, 1, 30))
# Parameters to be monitored (= to estimate)
params <- c("population.mean", "population.sd", "population.variance")
# MCMC settings
nc <- 3 # Number of chains
ni <- 1000 # Number of draws from posterior (for each chain)
nb <- 1 # Number of draws to discard as burn-in
nt <- 1 # Thinning rate
# Start Gibbs sampler: Run model in WinBUGS and save results in object called out
out <- bugs(data = win.data, inits = inits, parameters.to.save = params, model.file = "model.txt",
n.thin = nt, n.chains=nc, n.burnin=nb, n.iter=ni, debug=TRUE, DIC=TRUE, working.directory=getwd())
print(out, dig = 3)
# Here's the solution
> print(out, dig = 3)
Inference for Bugs model at "model.txt", fit using WinBUGS,
3 chains, each with 1000 iterations (first 1 discarded)
n.sims = 2997 iterations saved
mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
population.mean 589.912 0.101 589.600 589.90 589.90 590.00 590.000 1.003 1500
population.sd 31.158 0.714 29.789 30.67 31.14 31.65 32.541 1.001 3000
population.variance 971.318 44.531 887.360 940.90 969.70 1002.00 1059.100 1.001 3000
deviance 9714.048 2.675 9711.000 9712.00 9713.00 9715.00 9721.000 1.021 550
So you see that the population mean is heavily affected by your choice of prior. This is not a bad
thing, but rather, part of the essence of Bayesian inference: that your conclusion about a parameter
(embodied by the posterior distribution) is the result of both the information in the data AND of what
you knew or assumed about the magnitude of that parameter before getting your data.
2. Comparison of the inference about the parameters from the large and the small data sets
---------------------------------------------------------------------------------------
You have to first rewrite the model file to contain the non-informative priors. Then, you can simply
change the 'data statements' before running the analysis.
win.data <- list(mass = y1000, nobs = length(y1000)) # To analyse the large data set
win.data <- list(mass = y10, nobs = length(y10)) # To analyse the small data set
Here's the summary of the analysis of one particular large data set:
> print( out, dig = 3)# Produces a summary of the object
Inference for Bugs model at "model.txt", fit using WinBUGS,
3 chains, each with 1000 iterations (first 1 discarded)
n.sims = 2997 iterations saved
mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
population.mean 600.140 0.937 598.30 599.5 600.10 600.80 601.900 1.001 3000
population.sd 29.457 0.671 28.17 29.0 29.45 29.88 30.801 1.001 3000
population.variance 868.148 39.647 793.66 840.9 867.10 892.90 948.490 1.001 3000
deviance 9601.074 2.225 9599.00 9600.0 9600.00 9602.00 9607.000 1.012 2600
[...]
And here's the summary of the analysis of one particular small data set:
> print( out, dig = 3)# Produces a summary of the object
Inference for Bugs model at "model.txt", fit using WinBUGS,
3 chains, each with 1000 iterations (first 1 discarded)
n.sims = 2997 iterations saved
mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
population.mean 587.722 8.996 569.980 582.10 587.70 593.30 605.310 1.002 1200
population.sd 26.788 8.416 16.269 21.21 25.12 30.13 48.412 1.004 1300
population.variance 788.383 611.895 264.750 449.70 630.80 907.60 2344.200 1.004 1300
deviance 92.241 2.729 89.640 90.33 91.43 93.20 99.840 1.009 1100
[...]
The large data set contains much more information about the population parameters, which are the aim
of the estimation: the mean and the standard deviation of the body mass of the population of male
peregrines. Therefore, the posterior means from analysing the large data set are much closer to the
known, true values. In addition, the estimates are much more precise, i.e., the posterior sd for both
parameters is much smaller in the large than in the small data set.
3. Some effects of the length of the Markov chains
-----------------------------------------------
For this analysis you can either take the native WinBUGS file of the analysis (i.e., the file with the
.odc extension), or run WinBUGS from R and ask for only 1 or so iterations to be run. With bugs(..., debug
= TRUE) WinBUGS then will stay open and you can run the chains for an additional 10, 100 etc. iterations
and inspect the estimates after the required number of iterations.
# Bundle data
win.data <- list(mass = y1000, nobs = length(y1000))
# Function to generate starting values: have to adapt that, too
inits <- function()
list (population.mean = rnorm(1, 600, 30), population.sd = runif(1, 0, 30))
# Parameters to be monitored (= to estimate)
params <- c("population.mean", "population.sd", "population.variance")
# MCMC settings
nc <- 3
ni <- 3
nb <- 1
nt <- 1
# Start Gibbs sampler: Run model in WinBUGS and save results in object called out
out <- bugs(data = win.data, inits = inits, parameters.to.save = params, model.file = "model.txt",
n.thin=nt, n.chains=nc, n.burnin=nb, n.iter=ni, debug=TRUE, DIC=TRUE, working.directory=getwd())
Summaries of the node statistics:
After approx. 10 iterations:
Node statistics
node mean sd MC error 2.5% median 97.5% start sample
deviance 9608.0 21.93 6.663 9599.0 9601.0 9720.0 2 30
population.mean 600.1 1.188 0.2873 597.3 600.3 602.2 2 30
population.sd 30.09 1.995 0.6734 27.82 29.54 38.43 2 30
population.variance 909.2 130.5 43.74 774.2 872.3 1477.0 2 30
After approx. 100 iterations:
Node statistics
node mean sd MC error 2.5% median 97.5% start sample
deviance 9602.0 7.548 0.6984 9599.0 9600.0 9609.0 2 297
population.mean 600.2 0.9721 0.07288 598.1 600.2 602.0 2 297
population.sd 29.47 0.8875 0.0687 28.13 29.39 31.13 2 297
population.variance 869.4 55.53 4.339 791.1 863.5 968.8 2 297
After approx. 1,000 iterations:
node mean sd MC error 2.5% median 97.5% start sample
deviance 9601.0 3.032 0.0779 9599.0 9600.0 9607.0 2 2997
population.mean 600.1 0.9501 0.01912 598.2 600.1 602.0 2 2997
population.sd 29.46 0.6804 0.01407 28.24 29.41 30.8 2 2997
population.variance 868.2 40.58 0.8468 797.5 865.1 948.7 2 2997
After approx. 10,000 iterations:
Node statistics
node mean sd MC error 2.5% median 97.5% start sample
deviance 9601.0 2.13 0.01487 9599.0 9600.0 9606.0 2 29997
population.mean 600.1 0.9381 0.005661 598.3 600.1 602.0 2 29997
population.sd 29.44 0.6642 0.003665 28.18 29.43 30.77 2 29997
population.variance 867.1 39.21 0.2173 794.1 865.9 946.9 2 29997
You see that with increasing number of MCMC samples from the posterior distributions of the parameters,
you decrease the Monte Carlo error, but that the estimates of the posterior distribution are hardly
changed. In particular, the posterior standard deviation (sd) does not go down with increasing MCMC
sample size. The posterior sd is a measure of the precision of the estimate of a parameter, i.e., of
how concentrated ('narrow') the posterior distribution is. This is only a function of the data and
of the priors (and of the model), but not of the length of the Markov chains with which you explore that
posterior distribution.
To look at the values of the Brooks-Gelman-Rubin statistics, you can repeatedly run the analysis with
different chain lengths and then look at the summary of the analysis produced by the bugs() function.
You will see that with increasing chain length, the value of this convergence criterion will come down
and become essentially equal to 1 with the largest chain length.
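Assuming the fitted object out from the bugs() call above is still in your workspace, a quick way to scan the convergence criterion for all monitored nodes is via the summary matrix, where R2WinBUGS stores the Brooks-Gelman-Rubin statistic in a column named 'Rhat':

```r
# Brooks-Gelman-Rubin statistic for all monitored nodes
out$summary[, "Rhat"]

# Nodes that have not converged yet by the common rule of thumb (Rhat > 1.1)
which(out$summary[, "Rhat"] > 1.1)
```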
4. Posterior standard deviation and Monte Carlo (MC) error
-------------------------------------------------------
Look at the summaries of the node statistics above. You will see that the posterior sd does not
systematically change as you increase chain length, but that the MC error becomes smaller and smaller.
According to a rule of thumb, the MC error should be smaller than about 1-5% of the posterior sd for
acceptable chain length.
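To illustrate this rule of thumb, take the values for population.mean from the node statistics tabulated above and express the MC error as a percentage of the posterior sd (the numbers are simply copied from those tables):

```r
# Posterior sd and MC error of population.mean after ~10, 100, 1000 and 10000 iterations
posterior.sd <- c(1.188, 0.9721, 0.9501, 0.9381)
mc.error     <- c(0.2873, 0.07288, 0.01912, 0.005661)

# MC error as a percentage of the posterior sd
round(100 * mc.error / posterior.sd, 1)    # About 24.2, 7.5, 2.0 and 0.6 percent
```

Only the two longest runs satisfy the 1-5% guideline.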
5. Derived quantities (compute CV); and also compute the residuals
---------------------------------------------------------------
Exercise:
We compute the CV and, in addition to what is asked for in the book, also add a line in the BUGS code to
compute the residuals. Note that in the book I ask for a "point estimate (with SE)". This is not
strictly correct, but what I mean is that you should report the posterior mean as a point estimate and
the posterior standard deviation as a measure of the precision of that estimate. The latter is analogous
to a standard error (SE) in ML inference.
Solution:
To compute the CV (in %) and the residuals, we simply add two lines that define these two estimands
(called 'nodes' by WinBUGS). When you change the name of the BUGS model file (as I do below), remember to
make the appropriate change also when you call the bugs function.
# Save BUGS description of the model to working directory
sink("model_cv_resi.txt")
cat("
model {
  # Priors
  population.mean ~ dunif(0, 5000)
  precision <- 1 / population.variance
  population.variance <- population.sd * population.sd
  population.sd ~ dunif(0, 100)
  cv <- 100 * (population.sd / population.mean)    # Line to compute the CV in %

  # Likelihood
  for(i in 1:nobs){
    mass[i] ~ dnorm(population.mean, precision)
    resi[i] <- mass[i] - population.mean           # Line to compute the residuals
  }
}
",fill=TRUE)
sink()
# Package all the stuff to be handed over to WinBUGS
# Bundle data
win.data <- list(mass = y10, nobs = length(y10))
# Function to generate starting values
inits <- function()
list (population.mean = rnorm(1,600), population.sd = runif(1, 1, 30))
# Parameters to be monitored (= to estimate)
params <- c("population.mean", "population.sd", "population.variance", "precision", "cv", "resi")
# MCMC settings
nc <- 3
ni <- 1200
nb <- 200
nt <- 2
# Start Gibbs sampler: Run model in WinBUGS and save results in object called out
out <- bugs(data = win.data, inits = inits, parameters.to.save = params, model.file = "model_cv_resi.txt",
n.thin=nt, n.chains=nc, n.burnin=nb, n.iter=ni, debug=TRUE, DIC=TRUE, working.directory=getwd())
# Look at the results
print(out, dig = 3)
> print(out, dig = 3)
Inference for Bugs model at "model_cv_resi.txt", fit using WinBUGS,
3 chains, each with 1200 iterations (first 200 discarded), n.thin = 2
n.sims = 1500 iterations saved
mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
population.mean 615.411 12.934 589.795 607.200 615.600 623.700 642.005 1.000 1500
population.sd 37.841 10.796 23.044 30.165 35.890 43.105 64.310 1.002 830
population.variance 1548.392 991.761 531.112 910.050 1288.000 1857.750 4136.287 1.002 830
precision 0.001 0.000 0.000 0.001 0.001 0.001 0.002 1.002 830
cv 6.153 1.770 3.738 4.890 5.829 7.014 10.557 1.002 820
resi[1] 3.738 12.933 -22.831 -4.577 3.512 11.970 29.357 1.000 1500
resi[2] 25.865 12.933 -0.706 17.550 25.640 34.093 51.482 1.000 1500
resi[3] 13.933 12.933 -12.641 5.618 13.705 22.163 39.552 1.000 1500
resi[4] 25.859 12.933 -0.712 17.547 25.635 34.090 51.477 1.000 1500
resi[5] 15.590 12.933 -10.981 7.275 15.365 23.820 41.212 1.000 1500
resi[6] -40.275 12.933 -66.846 -48.590 -40.500 -32.047 -14.658 1.000 1500
resi[7] 21.267 12.933 -5.304 12.950 21.040 29.492 46.887 1.000 1500
resi[8] -68.865 12.933 -95.435 -77.180 -69.090 -60.638 -43.248 1.000 1500
resi[9] -13.444 12.933 -40.011 -21.760 -13.670 -5.215 12.172 1.000 1500
resi[10] 21.156 12.933 -5.415 12.840 20.930 29.383 46.772 1.000 1500
deviance 99.291 2.485 96.730 97.460 98.535 100.400 106.152 1.002 1400
[...]
# So you might report the results of this analysis as follows:
"The CV (in %, posterior mean and sd) of body mass was estimated at 6.15 (1.77)."
# To draw a picture of the posterior distribution of the CV of body mass, you could produce a histogram:
hist(out$sims.list$cv, xlab = 'Coefficient of variation (%) of body mass', ylab = 'Frequency',
breaks = 40, las = 1, col = 'grey')
# You could also draw a smoothed version of this using the function density():
plot(density(out$sims.list$cv), xlab = 'Coefficient of variation (%) of body mass',
ylab = 'Density', las = 1, col = 'grey', lwd = 2, main = '')
# To look at residuals you could do this:
plot(out$mean$resi, xlab = 'Observation number', ylab = '', las = 1, lwd = 2, main = '')
abline(h = 0, lwd = 1)
# Remember that in Bayesian inference, every unknown has a (posterior) distribution. So to summarize
# (graphically) the distribution of each residual, you can do this:
index <- matrix(rep(1:10, 1500), ncol = 10, byrow = TRUE) # Each row holds the observation numbers 1 to 10
boxplot(out$sims.list$resi ~ index, col = 'gray', xlab = 'Observation number')
abline(h = 0, lwd = 1)
6. Swiss hare data: Mean density
-----------------------------
Exercise:
Summarize what the data tell you about the mean hare density. You can use the same BUGS model description as
before in this chapter and just fit the 'model of the mean' to the variable named 'mean.density'. To get an
impression of what might be useful ranges for the uniform distributions used to convey our ignorance about
the mean density, we first summarize the contents of that variable.
summary(hares$mean.density)
Solution:
First, don't forget to read in the data, for instance, like this:
hares <- read.table("F:/a_solutions WB book/hares.txt", header = TRUE)
# Save BUGS description of the model to working directory
sink("model.txt")
cat("
model {
  # Priors
  population.mean ~ dunif(0, 100)
  precision <- 1 / population.variance
  population.variance <- population.sd * population.sd
  population.sd ~ dunif(0, 100)

  # Likelihood
  for(i in 1:nobs){
    y[i] ~ dnorm(population.mean, precision)
  }
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(y = hares$mean.density, nobs = length(hares$mean.density))
# Function to generate starting values
inits <- function()
list (population.mean = runif(1, 0,100), population.sd = runif(1, 1, 30))
# Parameters to be monitored (= to estimate)
params <- c("population.mean")
# MCMC settings
nc <- 3 # Number of chains
ni <- 6000 # Number of draws from posterior (for each chain)
nb <- 2000 # Number of draws to discard as burn-in
nt <- 4 # Thinning rate
# Start Gibbs sampler: Run model in WinBUGS and save results in object called out
out <- bugs(data = win.data, inits = inits, parameters.to.save = params, model.file = "model.txt",
n.thin = nt, n.chains = nc, n.burnin = nb, n.iter = ni, debug = TRUE, DIC = TRUE, working.directory = getwd())
# Look at the results
print(out, dig = 3)
> print(out, dig = 3)
Inference for Bugs model at "model.txt", fit using WinBUGS,
3 chains, each with 1200 iterations (first 200 discarded), n.thin = 2
n.sims = 1500 iterations saved
mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
population.mean 4.744 0.145 4.459 4.646 4.748 4.845 5.015 1.001 1500
deviance 3735.790 2.063 3734.000 3734.000 3735.000 3736.000 3741.000 1.002 1500
So, we might report the results of this analysis in the Results section of our paper as follows:
"Mean density of European Brown hares in Switzerland (posterior mean and sd) was estimated at
4.74 (0.15), with a 95% Bayesian credible interval of (4.46, 5.02). "
###############################################################
# #
# Chapter 6: Basics of linear models #
# #
###############################################################
1. Fitting a custom-built design matrix
------------------------------------
Exercise:
Using R (lm() function), we want to fit a 'semi-interactive' ANCOVA model, using a custom-built design
matrix, where the slope of mass on length is identical in population 1 and 3. For your convenience, here
are part of the data again:
mass <- c(6, 8, 5, 7, 9, 11)
pop <- factor(c(1,1,2,2,3,3))
svl <- c(40, 45, 39, 50, 52, 57)
fm <- lm(mass ~ pop*svl) # Fully interactive model which uses up all d.f.
summary(fm)
Solution:
Here is a quick solution to build the custom design matrix:
DM <- model.matrix(~ svl * pop - 1 - svl) # Build a design matrix to start with
DM[5:6,4] <- DM[5:6,6] # Write the last two elements in col. 6 into col 4
DM <- DM[,-6] # Delete column 6
# Fit that custom design matrix (have to subtract intercept to override R's treatment parameterisation):
fm <- lm(mass ~ DM-1)
summary(fm)
This shows again that if you understand design matrices, you have full freedom to fit any
kind of linear model you want, in R, WinBUGS or other useful stats packages.
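As a cross-check (this alternative is my own suggestion, not from the book), the same constraint can be imposed without hand-editing the design matrix, by defining a second factor that ties the slopes of populations 1 and 3 together:

```r
# Factor that assigns populations 1 and 3 to the same slope group
slope.group <- factor(c(1, 1, 2, 2, 1, 1))

# Three intercepts (one per population), but only two slopes
fm2 <- lm(mass ~ pop + svl:slope.group - 1)
summary(fm2)    # Fitted values should match those from the custom design matrix fit
```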
###############################################################
# #
# Chapter 7: t-tests #
# #
###############################################################
1. t-test with heterogeneous variances
------------------------------------
Exercise:
Adapt the code to directly test for equal variances of wingspan.
Solution:
We simply add a line that computes the difference between the variances of male and female wingspan and then
monitor that node, called delta.sigma2 below. The following assumes that you have the data from chapter 7.2.2
in your R workspace.
sink("h.ttest.2.txt")
cat("
model {
  # Priors
  mu1 ~ dnorm(0, 0.001)
  mu2 ~ dnorm(0, 0.001)
  sigma1 ~ dunif(0, 1000)
  sigma2 ~ dunif(0, 1000)

  # Likelihood
  for (i in 1:n1) {
    y1[i] ~ dnorm(mu1, tau1)
  }
  for (i in 1:n2) {
    y2[i] ~ dnorm(mu2, tau2)
  }

  # Derived quantities
  delta.mu <- mu2 - mu1
  sigma2.1 <- sigma1 * sigma1
  sigma2.2 <- sigma2 * sigma2
  tau1 <- 1 / (sigma1 * sigma1)
  tau2 <- 1 / (sigma2 * sigma2)
  delta.sigma2 <- sigma2.2 - sigma2.1
}
",fill=TRUE)
sink()
data <- list("y1", "y2", "n1", "n2")
inits <- function(){ list(mu1=rnorm(1), mu2=rnorm(1), sigma1 = rlnorm(1), sigma2 = rlnorm(1))}
parameters <- c("mu1","mu2", "delta.mu", "delta.sigma2", "sigma1", "sigma2", "sigma2.1", "sigma2.2")
# MCMC settings
nc <- 3
ni <- 2000
nb <- 500
nt <- 1
out <- bugs (data, inits, parameters, "h.ttest.2.txt", n.thin=nt,n.chains=nc, n.burnin=nb, n.iter=ni,
debug = TRUE)
print(out, dig = 3)
# Look at posterior for some of the variance parameters
par(mfrow = c(3,1))
hist(out$sims.list$sigma2.1)
hist(out$sims.list$sigma2.2)
hist(out$sims.list$delta.sigma2)
Since the posterior distribution of the difference between the variances of male and female wingspan
overlaps zero (i.e., a difference of zero is a plausible value), we don't have much evidence for a
sex-related difference in the variability of the wingspan of peregrine falcons.
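To attach a number to that visual impression (assuming the fitted object out from the analysis above), one can compute a posterior probability directly from the MCMC sample:

```r
# Posterior probability that the wingspan variance in group 2 exceeds that in group 1,
# i.e., Pr(delta.sigma2 > 0), where delta.sigma2 = sigma2.2 - sigma2.1
mean(out$sims.list$delta.sigma2 > 0)
```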
2. Effects of assumption violations on the t-test with homogeneous variances
-------------------------------------------------------------------------
Exercise:
Simulate data that differ between two groups in the variance, but not in the mean, and see whether this
screws up a t-test that assumes constant variance.
Solution:
We will conduct a little simulation exercise wherein we repeatedly generate and then analyse data sets.
For this, it is convenient to define functions for data generation and analysis.
We first define a function to simulate the data. To make everything as comparable as possible, we
assume equal sample sizes in the two groups.
data.fn <- function(n1 = 50, n2 = 50, mu = 100, sigma1 = 5, sigma2 = 3){
n1 <- n1 # Number of females
n2 <- n2 # Number of males
mu <- mu # Population mean for both sexes
sigma1 <- sigma1 # Population SD for females
sigma2 <- sigma2 # Population SD for males
n <- n1+n2 # Total sample size
y1 <- rnorm(n1, mu, sigma1) # Data for females
y2 <- rnorm(n2, mu, sigma2) # Data for males
y <- c(y1, y2) # Aggregate both data sets
x <- rep(c(0,1), c(n1, n2)) # Indicator for male
boxplot(y ~ x, col = "grey", xlab = "Male", ylab = "Wingspan (cm)", las = 1)
return(list(x=x, y=y))
}
To define this function, you execute everything between the line starting with data.fn and the closing
curly bracket. To execute it, you then type 'data.fn()'.
Second, we define another function to use WinBUGS to fit the linear model underlying a t-test. Depending
on what magnitude of sigma you choose in the data generation, you will have to adapt the upper limit of
the uniform prior for that parameter:
fit.fn <- function(x = x, y = y) {
# Define BUGS model
sink("ttest.txt")
cat("
model {
  # Priors
  mu1 ~ dnorm(0, 0.001)
  delta ~ dnorm(0, 0.001)
  tau <- 1 / (sigma * sigma)
  sigma ~ dunif(0, 50)

  # Likelihood
  for (i in 1:n) {
    y[i] ~ dnorm(mu[i], tau)
    mu[i] <- mu1 + delta * x[i]
  }

  # Derived quantities: one of the greatest things about a Bayesian analysis
  mu2 <- mu1 + delta # Mean of the second group
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(x=x, y=y, n = length(x))
# Inits function
inits <- function(){list(mu1=rnorm(1), delta=rnorm(1), sigma = runif(1))}
# Parameters to estimate
params <- c("mu1","mu2", "delta", "sigma")
# MCMC settings
nc <- 3
ni <- 600
nb <- 100
nt <- 1
# Start Gibbs sampler
out <- bugs(data = win.data, inits = inits, parameters = params, model = "ttest.txt",
n.thin = nt, n.chains = nc, n.burnin = nb, n.iter = ni, debug = FALSE, working.directory = getwd())
out
}
And third, we set up structures to contain the estimates from n.iter executions of these two functions and
then run the simulation for n.iter replicates. We will save the summary table from each analysis in a 3D
array.
n.iter <- 50 # The number of simreps
esti.table <- array(NA, dim = c(5,9,n.iter))
Assuming we have run this model once already and have the results in an object called out, we can do the
following to name the first two dimensions of the array.
rownames(esti.table) <- rownames(out$summary)
colnames(esti.table) <- colnames(out$summary)
start.time <- Sys.time()
for(i in 1:n.iter){
print(i)
data <- data.fn(n1 = 50, n2 = 50, mu = 100, sigma1 = 20, sigma2 = 10)
fm <- fit.fn(data$x, data$y)
esti.table[,,i] <- fm$summary
}
end.time = Sys.time()
cat('\nProcessing time: ',round(difftime(end.time,start.time,units='mins'),dig=2),' minutes\n',sep='')
We look at the n.iter simulations:
esti.table
Then we summarize the results of the simulation:
apply(esti.table, c(1,2), mean)
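One additional summary I find useful (my own addition, not asked for in the book) is the proportion of replicates in which the 95% CRI for delta covers the true value of 0. This assumes the row and column order of the bugs() summary table: delta is the third monitored parameter, and columns 3 and 7 hold the 2.5% and 97.5% percentiles.

```r
# Does the 95% CRI for delta (row 3 of each summary table) contain the true value 0?
# Columns 3 and 7 of the bugs() summary are the 2.5% and 97.5% percentiles
covered <- (esti.table[3, 3, ] < 0) & (esti.table[3, 7, ] > 0)
mean(covered)   # Proportion of the n.iter replicates with coverage
```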
From this there does not appear to be a great sensitivity to heterogeneous variances. I am not sure whether this is generally true, though.
3. Homogeneous and heterogeneous-variance test for landuse effects on hare density (Swiss hare data)
------------------------------------------------------------------------------------------------
Exercise:
Fit the two t-test models to mean.density in the Swiss hare data set.
Solution:
To avoid committing pseudo-replication or having to use a mixed model (with a site random effect) we
first average mean.density over years within a site.
MMD <- aggregate(hares$mean.density, by = list(hares$site), FUN = mean, na.rm = TRUE)$x
LANDUSE <- aggregate(as.numeric(hares$landuse), by = list(hares$site), FUN = mean, na.rm = TRUE)$x -1
# LANDUSE = 1 is grass and LANDUSE = 0 is arable
We fit the t-test model with homogeneous variance:
# Define BUGS model
sink("ttest.txt")
cat("
model {
  # Priors
  mu1 ~ dnorm(0, 0.001)
  delta ~ dnorm(0, 0.001)
  tau <- 1 / (sigma * sigma)
  sigma ~ dunif(0, 10)

  # Likelihood
  for (i in 1:n) {
    y[i] ~ dnorm(mu[i], tau)
    mu[i] <- mu1 + delta * x[i]
  }

  # Derived quantities
  mu2 <- mu1 + delta # Mean density in grassland areas
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(x = LANDUSE, y = MMD, n = length(MMD))
# Inits function
inits <- function(){list(mu1=rnorm(1), delta=rnorm(1), sigma = rlnorm(1))}
# Parameters to estimate
params <- c("mu1","mu2", "delta", "sigma")
# MCMC settings
nc <- 3
ni <- 1200
nb <- 200
nt <- 2
# Start Gibbs sampler
out <- bugs(data = win.data, inits = inits, parameters = params, model = "ttest.txt", n.thin = nt,
n.chains = nc, n.burnin = nb, n.iter = ni, debug = TRUE, working.directory = getwd())
print(out, dig = 3)
> print(out, dig = 3)
Inference for Bugs model at "ttest.txt", fit using WinBUGS,
3 chains, each with 1200 iterations (first 200 discarded), n.thin = 2
n.sims = 1500 iterations saved
mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
mu1 4.933 0.493 3.928 4.630 4.933 5.258 5.852 1.003 1500
mu2 2.954 0.786 1.480 2.399 2.949 3.489 4.512 1.003 830
delta -1.980 0.937 -3.742 -2.627 -2.040 -1.334 -0.115 1.002 1000
sigma 3.165 0.305 2.629 2.942 3.145 3.345 3.836 1.000 1500
deviance 286.239 2.538 283.400 284.400 285.500 287.300 292.352 1.006 1200
For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
DIC info (using the rule, pD = Dbar-Dhat)
pD = 2.9 and DIC = 289.1
DIC is an estimate of expected predictive error (lower deviance is better).
So the mean density in arable areas (mu1) is estimated to be higher than that in grassland areas (mu2),
with the 95% CRI for the difference in mean density (delta) being (-3.742, -0.115). This analysis
assumes constant variance between the two types of landuse.
Next, we fit the model that relaxes this assumption. In addition, we add a line that computes the
difference in the variance of mean density between the two types of landuse:
# Define BUGS model
sink("h.ttest.txt")
cat("
model {
  # Priors
  mu1 ~ dnorm(0, 0.001)
  mu2 ~ dnorm(0, 0.001)
  tau1 <- 1 / (sigma1 * sigma1)
  sigma1 ~ dunif(0, 100)
  tau2 <- 1 / (sigma2 * sigma2)
  sigma2 ~ dunif(0, 100)

  # Likelihood
  for (i in 1:n1) {
    y1[i] ~ dnorm(mu1, tau1)
  }
  for (i in 1:n2) {
    y2[i] ~ dnorm(mu2, tau2)
  }

  # Derived quantities
  delta.mean <- mu2 - mu1
  delta.variance <- (1/tau2) - (1/tau1)
}
",fill=TRUE)
sink()
# Bundle data: have to select the right data
win.data <- list(y1 = MMD[LANDUSE == 0], y2 = MMD[LANDUSE == 1], n1 = length(MMD[LANDUSE == 0]),
n2 = length(MMD[LANDUSE == 1]))
# Inits function
inits <- function(){ list(mu1=rnorm(1), mu2=rnorm(1), sigma1 = rlnorm(1), sigma2 = rlnorm(1))}
# Parameters to estimate
params <- c("mu1","mu2", "delta.mean", "delta.variance", "sigma1", "sigma2")
# MCMC settings
nc <- 3
ni <- 2200
nb <- 200
nt <- 2
# Start Gibbs sampler
out <- bugs(data = win.data, inits = inits, parameters = params, model = "h.ttest.txt",
n.thin = nt, n.chains = nc, n.burnin = nb, n.iter = ni, debug = TRUE, working.directory = getwd())
print(out, dig = 3)
> print(out, dig = 3)
Inference for Bugs model at "h.ttest.txt", fit using WinBUGS,
3 chains, each with 2200 iterations (first 200 discarded), n.thin = 2
n.sims = 3000 iterations saved
mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
mu1 4.921 0.559 3.770 4.559 4.930 5.289 6.036 1.002 1500
mu2 2.968 0.568 1.855 2.599 2.976 3.343 4.059 1.001 3000
delta.mean -1.953 0.809 -3.529 -2.475 -1.958 -1.427 -0.386 1.001 3000
delta.variance -7.102 3.935 -15.180 -9.444 -7.005 -4.705 0.545 1.001 3000
sigma1 3.510 0.423 2.797 3.208 3.466 3.764 4.487 1.002 3000
sigma2 2.274 0.475 1.586 1.941 2.196 2.520 3.427 1.001 2400
deviance 282.517 3.167 278.600 280.200 281.800 284.100 290.302 1.003 1100
So there is quite strong evidence for heterogeneous variances of counts in arable and grassland
sites. The posterior probability that the variance is smaller in grassland than in arable sites is (for my MCMC sample):
mean(out$sims.list$delta.variance<0)
> mean(out$sims.list$delta.variance<0)
[1] 0.968
4. Modeling the variance
---------------------
Exercise:
We model the variance in the counts as a function of the elevation of a site.
Solution:
For this, we also need the elevation of each site. We then model a linear effect of elevation on the log
of the variance; that is, just as for the mean in a linear model, we specify a linear model for a
transformation of the variance.
MMD <- aggregate(hares$mean.density, by = list(hares$site), FUN = mean, na.rm = TRUE)$x
LANDUSE <- aggregate(as.numeric(hares$landuse), by = list(hares$site), FUN = mean, na.rm = TRUE)$x
# LANDUSE = 1 is arable
ELEV <- aggregate(as.numeric(hares$elevation), by = list(hares$site), FUN = mean, na.rm = TRUE)$x
Missing values in the explanatory variables of a model need special attention. Either we must specify
a prior distribution for them in the model, or we have to 'impute' them. Here, we simply replace NA's
with the mean elevation.
ELEV[is.na(ELEV)] <- mean(ELEV, na.rm = TRUE)
We change the notation somewhat to clarify that a linear model is used to explain variation both in the
mean AND in the variance of the response (mean density). Note that we also normalise the elevation
covariate.
# Define model
sink("model.txt")
cat("
model {
# Priors
alpha1 ~ dnorm(0,0.001)
beta1 ~ dnorm(0,0.001)
alpha2 ~ dnorm(0,0.001)
beta2 ~ dnorm(0,0.001)
# Likelihood
for (i in 1:n) {
y[i] ~ dnorm(mu[i], tau[i]) # Note that both the mean and the variance are indexed by individual
mu[i] <- alpha1 + beta1 *GRASS[i]
tau[i] <- 1 / sigma2[i]
log(sigma2[i]) <- alpha2 + beta2 *ELEV[i]
}
}
",fill=TRUE)
sink()
# Bundle data
mean.elev <- mean(ELEV)
sd.elev <- sd(ELEV)
st.ELEV <- (ELEV - mean.elev) / sd.elev
win.data <- list(y = MMD, GRASS = LANDUSE-1, ELEV = st.ELEV, n = length(MMD))
# Inits function
inits <- function(){list(alpha1=rnorm(1), beta1=rnorm(1), alpha2=rnorm(1), beta2=rnorm(1))}
# Parameters to estimate
params <- c("alpha1","beta1","alpha2","beta2")
# MCMC settings
nc <- 3
ni <- 6000
nb <- 2000
nt <- 2
# Start Gibbs sampler
bd <- "c:/Program files/WinBUGS14/" # WinBUGS location; adapt this to your system
out <- bugs(data = win.data, inits = inits, parameters = params, model = "model.txt", n.thin = nt,
n.chains = nc, n.burnin = nb, n.iter = ni, debug = TRUE, working.directory = getwd(), bugs.directory = bd)
print(out, dig = 3)
> print(out, dig = 3)
Inference for Bugs model at "model.txt", fit using WinBUGS,
3 chains, each with 6000 iterations (first 2000 discarded), n.thin = 2
n.sims = 6000 iterations saved
mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
alpha1 4.889 0.478 3.953 4.567 4.884 5.211 5.825 1.002 1700
beta1 -2.097 0.839 -3.728 -2.653 -2.107 -1.545 -0.416 1.001 6000
alpha2 2.231 0.196 1.876 2.093 2.222 2.361 2.637 1.002 1900
beta2 -0.355 0.211 -0.754 -0.499 -0.368 -0.222 0.089 1.001 5800
deviance 284.032 2.866 280.500 281.900 283.400 285.500 291.200 1.001 6000
So we estimate the mean density in arable areas to be 4.889 and that in grassland areas to be 2.097 less.
In addition, there is a hint of a negative relationship between the variance of the counts and elevation: sites
at greater elevations do appear to be less heterogeneous, as can be seen by the negative sign of beta2.
We can look at the 95% CRI for that parameter and see what proportion of the mass of the
posterior is on either side of zero:
mean(out$sims.list$beta2 < 0)
> mean(out$sims.list$beta2 < 0)
[1] 0.9436667
So actually, there is quite strong evidence for such a variance-elevation relationship in these data. In
biological terms, this might suggest that higher-elevation sites vary less among one another in the
suitability of their habitat (as expressed by observed hare density).
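To see what the fitted log-linear variance model implies on the natural scale, we can plug the posterior means of alpha2 and beta2 from the output above into the inverse link. The sketch below does this in Python rather than R, purely as an illustration that can be checked independently; the elevation values are the standardised ones used in the model.

```python
import math

# Posterior means from the WinBUGS output above
alpha2 = 2.231
beta2 = -0.355

def predicted_sd(st_elev):
    """Residual sd implied by the log-linear variance model at a
    standardised elevation: log(sigma2) = alpha2 + beta2 * st_elev."""
    return math.sqrt(math.exp(alpha2 + beta2 * st_elev))

# Residual sd of the counts at low (-1), average (0) and high (+1) elevation
sds = [predicted_sd(e) for e in (-1.0, 0.0, 1.0)]
```

So at an average-elevation site the implied residual sd is about 3.05, dropping to roughly 2.55 one sd of elevation higher, consistent with the negative sign of beta2.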
###############################################################
# #
# Chapter 8: Linear regression #
# #
###############################################################
1. Toy frog problem
----------------
Exercise:
Fit the linear regression model to a toy data set composed of mass-length data of 5 frogs (exercise
courtesy of Oscar Ramos).
Solution:
# The data
mass <- c(10, 20, 23, 32, 35)
length <- c(5,7,10, 12, 15)
n <- length(mass)
plot(length, mass)
# Solution with R
fm <- lm(mass ~ length)
fm
abline(fm)
# Solution with WinBUGS
# Write model
sink("linreg.txt")
cat("
model {
# Priors
alpha ~ dnorm(0,0.001)
beta ~ dnorm(0,0.001)
sigma ~ dunif(0, 100)
# Likelihood
for (i in 1:n) {
y[i] ~ dnorm(mu[i], tau)
mu[i] <- alpha + beta*x[i]
}
# Derived quantities
tau <- 1/ (sigma * sigma)
# Assess model fit using a sums-of-squares-type discrepancy
for (i in 1:n) {
residual[i] <- y[i]-mu[i] # Residuals for observed data
predicted[i] <- mu[i] # Predicted values
sq[i] <- pow(residual[i], 2) # Squared residuals for observed data
# Generate replicate data and compute fit stats for them
y.new[i] ~ dnorm(mu[i], tau) # one new data set at each MCMC iteration
sq.new[i] <- pow(y.new[i]-predicted[i], 2) # Squared residuals for new data
}
fit <- sum(sq[]) # Sum of squared residuals for actual data set
fit.new <- sum(sq.new[]) # Sum of squared residuals for new data set
test <- step(fit.new - fit) # Test whether new data set more extreme, mean is Bayesian p-value
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(x=length, y=mass, n=length(length))
# Inits function
inits <- function(){ list(alpha=rnorm(1), beta=rnorm(1), sigma = rlnorm(1))}
# Parameters to estimate
params <- c("alpha", "beta", "sigma", "fit", "fit.new", "test", "predicted")
# MCMC settings
nc = 3 ; ni=1200 ; nb=200 ; nt=1
# Start Gibbs sampler
out <- bugs(data = win.data, inits = inits, parameters = params, model = "linreg.txt", n.thin = nt,
n.chains = nc, n.burnin = nb, n.iter = ni, debug = TRUE, bugs.directory="c:/Program files/WinBUGS14/")
print(out, dig = 3)
> print(out, dig = 3)
Inference for Bugs model at "linreg.txt", fit using WinBUGS,
3 chains, each with 1200 iterations (first 200 discarded)
n.sims = 3000 iterations saved
mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
alpha 0.105 8.496 -17.551 -3.495 0.207 3.907 17.592 1.007 1500
beta 2.435 0.872 0.640 2.076 2.421 2.785 4.144 1.012 3000
sigma 5.978 6.222 1.819 2.956 4.191 6.520 20.843 1.003 2300
fit 158.776 1001.987 25.829 33.367 49.140 95.080 772.495 1.005 1700
fit.new 379.442 2145.839 6.411 32.122 75.090 198.525 2374.349 1.001 2500
test 0.608 0.488 0.000 0.000 1.000 1.000 1.000 1.001 3000
predicted[1] 12.277 5.087 1.726 10.258 12.320 14.430 22.250 1.007 810
predicted[2] 17.147 4.242 9.296 15.630 17.190 18.812 24.691 1.014 620
predicted[3] 24.450 4.191 17.854 23.240 24.490 25.780 30.872 1.029 620
predicted[4] 29.319 4.988 21.258 27.887 29.360 30.853 37.000 1.028 820
predicted[5] 36.623 6.874 24.709 34.400 36.580 38.960 48.220 1.022 1300
deviance 28.546 5.149 22.760 24.760 27.125 30.832 41.840 1.003 1800
[...]
# Add line to plot
lines(length, out$mean$predicted, lwd = 3, col = "red")
Again, ML and Bayesian estimates are virtually identical numerically.
2. Get predictions for frogs with length 16 to 20 length units
-----------------------------------------------------------
One simple way to get such predictions is to add the five lengths 16:20 to the length data and fill in
the corresponding mass values with NA's. These NA's are then updated automatically as part of the
Bayesian model fitting, so we only change the 'data statement' of the analysis and repeat the rest.
Specifying the response, y, as a node to be monitored makes WinBUGS report posteriors for the missing
elements of y only, since the observed elements are treated as data.
# Bundle data
x <- c(5,7,10,12,15, 16, 17, 18, 19, 20)
y <- c(10,20,23,32,35, NA, NA, NA, NA, NA)
n <- length(x)
win.data <- list("x","y", "n")
# Inits function
inits <- function(){ list(alpha=rnorm(1), beta=rnorm(1), sigma = rlnorm(1))}
# Parameters to estimate
params <- c("alpha","beta", "sigma", "fit", "fit.new", "predicted", "y")
# MCMC settings
nc = 3 ; ni=1200 ; nb=200 ; nt=1
# Start Gibbs sampler
out <- bugs(data = win.data, inits = inits, parameters = params, model = "linreg.txt",
n.thin = nt, n.chains = nc, n.burnin = nb, n.iter = ni, debug = TRUE,
bugs.directory="c:/Program files/WinBUGS14/")
print(out, dig = 3)
> print(out, dig = 3)
Inference for Bugs model at "linreg.txt", fit using WinBUGS,
3 chains, each with 1200 iterations (first 200 discarded)
n.sims = 3000 iterations saved
mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
alpha 0.105 7.827 -16.713 -3.485 0.261 4.030 15.761 1.002 1200
beta 2.438 0.775 0.866 2.068 2.418 2.798 4.083 1.002 1200
sigma 5.674 4.494 1.840 2.993 4.252 6.676 18.020 1.002 1000
fit 372.312 832.299 38.129 75.060 139.050 317.150 2414.349 1.002 1200
fit.new 506.481 1218.433 22.458 77.260 167.800 421.400 3463.299 1.002 1100
predicted[1] 12.295 4.596 2.729 10.200 12.310 14.460 21.860 1.003 1600
predicted[2] 17.171 3.672 9.746 15.540 17.155 18.793 24.921 1.003 2500
predicted[3] 24.486 3.323 17.929 23.210 24.480 25.790 30.770 1.001 3000
predicted[4] 29.362 3.927 21.409 27.777 29.325 30.850 37.190 1.002 3000
predicted[5] 36.676 5.568 25.495 34.388 36.650 38.972 48.291 1.003 2400
predicted[6] 39.114 6.213 26.359 36.500 39.080 41.690 52.030 1.003 2100
predicted[7] 41.552 6.884 27.339 38.638 41.485 44.460 55.982 1.003 1900
predicted[8] 43.990 7.576 28.295 40.785 43.900 47.270 59.830 1.003 1800
predicted[9] 46.428 8.282 29.248 42.850 46.325 50.073 63.655 1.003 1700
predicted[10] 48.866 9.000 30.308 44.895 48.745 52.840 67.902 1.003 1600
y[6] 39.307 9.089 20.129 35.278 39.270 43.133 58.311 1.001 3000
y[7] 41.713 9.129 23.748 37.267 41.450 45.790 61.744 1.003 2400
y[8] 43.857 10.453 22.055 39.545 43.800 48.140 64.451 1.001 3000
y[9] 46.578 11.401 25.187 41.818 46.245 51.103 69.991 1.001 3000
y[10] 48.920 11.661 25.336 43.977 48.795 53.700 73.092 1.002 2200
deviance 28.528 4.766 22.750 24.867 27.340 31.070 40.420 1.004 620
We see that the means of predicted[6:10] are the same as those of y[6:10], at least up to MCMC accuracy,
but that the posterior sd's differ. This is because the predicted values refer to the expected mass of a
frog of a given length, while the y values refer to the predicted mass of an individual frog, with which
more uncertainty is associated.
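The difference between the two posterior sd's can be mimicked outside WinBUGS with a small Monte Carlo sketch (in Python, just for illustration; all numbers below are invented and only loosely mimic predicted[6] above). Draws for an individual frog combine the uncertainty in the expected value with the residual scatter, so their spread always exceeds that of the expected value alone.

```python
import random
import statistics

random.seed(1)

# Hypothetical posterior draws of mu, the EXPECTED mass of a frog of a given
# length (invented numbers, not taken from the WinBUGS run)
mu_draws = [random.gauss(39.0, 6.0) for _ in range(20000)]
sigma = 5.7  # residual sd, held fixed here for simplicity

# Posterior-predictive draws of the mass of an INDIVIDUAL frog:
# the expected value plus residual scatter around it
y_draws = [mu + random.gauss(0.0, sigma) for mu in mu_draws]

sd_mu = statistics.stdev(mu_draws)
sd_y = statistics.stdev(y_draws)
# sd_y combines both sources of uncertainty: roughly sqrt(sd_mu^2 + sigma^2)
```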
3. Regression of mean density on year in grassland areas (Swiss hare data)
-----------------------------------------------------------------------
hares <- read.table("F:\\a_solutions WB book\\hares.txt", header = TRUE)
First we have to aggregate the data over years and restrict them to the grassland sites.
grass.data <- hares[hares$landuse == "grass",]
MMD <- aggregate(grass.data$mean.density, by = list(grass.data$year), FUN = mean, na.rm = TRUE)$x
year <- 1992:2008
Then we plot the data:
plot(year, MMD, type = "b", lwd = 3, col = "blue", main = "Mean observed density in grassland areas")
We first fit a linear and then a quadratic regression model using WinBUGS:
# Write model
sink("linreg.txt")
cat("
model {
# Priors
alpha ~ dnorm(0,0.001)
beta ~ dnorm(0,0.001)
sigma ~ dunif(0, 100)
tau <- 1/ (sigma * sigma)
# Likelihood
for (i in 1:n) {
y[i] ~ dnorm(mu[i], tau)
mu[i] <- alpha + beta*x[i]
predicted.linear[i] <- mu[i]
}
}
",fill=TRUE)
sink()
# Bundle data
# win.data <- list(x = year, y=MMD, n=length(MMD)) # Failure to standardise year leads to spurious convergence !
win.data <- list(x = ((year-mean(year)) / sd(year)), y=MMD, n=length(MMD))
# Inits function
inits <- function(){ list(alpha=rnorm(1), beta=rnorm(1), sigma = rlnorm(1))}
# Parameters to estimate
params <- c("alpha", "beta", "sigma", "predicted.linear")
# MCMC settings
nc = 3 ; ni=1200 ; nb=200 ; nt=1
# Start Gibbs sampler
out <- bugs(data = win.data, inits = inits, parameters = params, model = "linreg.txt", n.thin = nt,
n.chains = nc, n.burnin = nb, n.iter = ni, debug = TRUE, bugs.directory="c:/Program files/WinBUGS14/")
print(out, dig = 3)
We then plot the predicted mean density under the linear (straight-line) model:
lines(year, out$mean$predicted.linear, lwd = 3, col = "red")
Next, the quadratic (curvilinear) model:
# Write model
sink("quadreg.txt")
cat("
model {
# Priors
alpha ~ dnorm(0,0.001)
beta ~ dnorm(0,0.001)
gamma ~ dnorm(0, 0.001)
sigma ~ dunif(0, 100)
tau <- 1/ (sigma * sigma)
# Likelihood
for (i in 1:n) {
y[i] ~ dnorm(mu[i], tau)
mu[i] <- alpha + beta*x[i] + gamma * pow(x[i], 2)
predicted.quadratic[i] <- mu[i]
}
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(x=((year-mean(year)) / sd(year)), y=MMD, n=length(MMD))
# Inits function
inits <- function(){ list(alpha=rnorm(1), beta=rnorm(1), sigma = rlnorm(1))}
# Parameters to estimate
params <- c("alpha", "beta", "gamma", "sigma", "predicted.quadratic")
# MCMC settings
nc = 3 ; ni=1200 ; nb=200 ; nt=1
# Start Gibbs sampler
out <- bugs(data = win.data, inits = inits, parameters = params, model = "quadreg.txt", n.thin = nt,
n.chains = nc, n.burnin = nb, n.iter = ni, debug = TRUE, bugs.directory="c:/Program files/WinBUGS14/")
print(out, dig = 3)
We then plot the predicted mean density under the quadratic model:
lines(year, out$mean$predicted.quadratic, lwd = 3, col = "brown")
There is a strong visual impression that the population decline of the Brown hare in grassland sites
did not evolve in a straight-line fashion, but that there was a decline, followed by a stabilisation
or even slight recovery. We can test for the quadratic effect in the trend by looking at the
posterior distribution of the gamma parameter:
hist(out$sims.list$gamma, breaks = 40, col = "grey")
The graph and the summary (above) show that zero is a very untypical value for gamma, so that we can
assume that the population decline did not evolve linearly.
Note: This exercise provides another example of WinBUGS erroneously indicating convergence: this
first happened to me when I forgot to standardise year in the analysis of the model with a linear
effect of year, and yet all the Rhat values were close to one. Only when I plotted the estimated
regression line over the observed data did I notice that something was wrong with the solution
that WinBUGS gave me.
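Standardising the covariate is a one-line transformation, but it matters for the sampler. The following sketch (in Python, only to make the arithmetic explicit) reproduces the transformation used in the data bundle above and confirms that the standardised years have mean 0 and sd 1, which keeps the intercept and slope on comparable, well-behaved scales.

```python
import statistics

year = list(range(1992, 2009))           # the 17 study years
m = statistics.mean(year)                # 2000.0
s = statistics.stdev(year)               # sample sd, as R's sd() computes it
st_year = [(y - m) / s for y in year]    # the covariate handed to WinBUGS
```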
###############################################################
# #
# Chapter 9: One-way ANOVA #
# #
###############################################################
1. Convert the ANOVA to an ANCOVA
------------------------------
We want to add a single continuous covariate to the (fixed-effects) one-way ANOVA. This turns our
model into what is often called an ANCOVA model. We won't bother with rebuilding an effect of that
covariate into the data; we just invent the covariate and fit it to show how this is done.
I assume that you already have the data set in your R workspace (otherwise, run the following code to
recreate it).
ngroups <- 5 # Number of populations
nsample <- 10 # Number of snakes in each
pop.means <- c(50, 40, 45, 55, 60) # Population mean SVL
sigma <- 3 # Residual sd
eps <- rnorm(ngroups * nsample, 0, sigma) # Residuals
x <- rep(1:5, rep(nsample, ngroups)) # Indicator for population
X <- as.matrix(model.matrix(~ as.factor(x)-1)) # Create design matrix
y <- as.numeric(X %*% as.matrix(pop.means) + eps)
boxplot(y~x, col="grey", xlab="Population", ylab="SVL", main="", las = 1)
hab <- rnorm(50) # This is the covariate
# Write model
sink("ancova.txt") # Change model name (optional)
cat("
model {
# Priors
for (i in 1:5){
alpha[i] ~ dnorm(0, 0.001)
}
beta ~ dnorm(0, 0.01) # Add prior for coefficient of hab
sigma ~ dunif(0, 100)
# Likelihood
for (i in 1:50) {
y[i] ~ dnorm(mean[i], tau)
mean[i] <- alpha[x[i]] + beta * hab[i] # Adapt linear predictor
}
tau <- 1/(sigma*sigma)
}
",fill=TRUE)
sink()
# Package data
win.data <- list("y", "x", "hab") # Need to add hab to packaged data
# Define function to randomly choose inits # Add hab covariate
inits <- function(){ list(alpha = rnorm(5, mean = mean(y)), beta = rnorm(1), sigma = rlnorm(1) )}
# Select parameters you want to estimate
params <- c("alpha", "sigma", "beta")
# MCMC settings
ni <- 1200
nb <- 200
nt <- 2
nc <- 3
# Start Gibbs sampling
out <- bugs (win.data, inits, params, "ancova.txt", n.thin=nt, n.chains=nc, n.burnin=nb, n.iter=ni,
debug = TRUE)
print(out, dig = 3)
2. Watch shrinkage happen
----------------------
Exercise:
Fit both a fixed- and a random-effects ANOVA to the same data set and compare the group mean estimates.
Solution:
We take the data just generated in exercise 1 and fit the following model:
# Write fixed-effects ANOVA model
sink("fixefanova.txt")
cat("
model {
# Priors
for (i in 1:5){
alpha[i] ~ dnorm(0, 0.001)
}
sigma ~ dunif(0, 100)
# Likelihood
for (i in 1:50) {
y[i] ~ dnorm(mean[i], tau)
mean[i] <- alpha[x[i]]
}
tau <- 1/(sigma*sigma)
}
",fill=TRUE)
sink()
# Package data
win.data <- list("y", "x")
# Inits function
inits <- function(){ list(alpha = rnorm(5, mean = mean(y)), sigma = rlnorm(1) )}
# Select parameters you want to estimate
params <- c("alpha", "sigma")
# MCMC settings
ni <- 1200 ; nb <- 200 ; nt <- 2 ; nc <- 3
# Start Gibbs sampling
out.fix <- bugs (win.data, inits, params, "fixefanova.txt", n.thin=nt, n.chains=nc, n.burnin=nb,
n.iter=ni, debug = TRUE)
print(out.fix, dig = 3)
# Add fixed-effects group mean estimates into the plot
points(1:5, out.fix$mean$alpha, col = 'blue', cex = 2, lwd = 3)
abline(h = 50)
# Write random-effects ANOVA model
sink("ranefanova.txt")
cat("
model {
for (i in 1:5){
pop.mean[i] ~ dnorm(mu, tau.group) # Prior for population means
effe[i] <- pop.mean[i] - mu # Population effects as derived quant's
}
mu ~ dnorm(0,0.001) # Hyperprior for grand mean svl
sigma.group ~ dunif(0, 10) # Hyperprior for sd of population effects
sigma.res ~ dunif(0, 15) # Prior for residual sd
# Likelihood
for (i in 1:50) {
y[i] ~ dnorm(mean[i], tau.res)
mean[i] <- pop.mean[x[i]]
}
tau.group <- 1 / (sigma.group * sigma.group)
tau.res <- 1 / (sigma.res * sigma.res)
}
",fill=TRUE)
sink()
# Package data
win.data <- list("y", "x")
# Inits function
inits <- function(){ list(pop.mean = rnorm(5, mean = mean(y)), sigma.res = rlnorm(1) )}
# Select parameters you want to estimate
params <- c("pop.mean", "sigma.res", "sigma.group")
# MCMC settings
ni <- 1200 ; nb <- 200 ; nt <- 2 ; nc <- 3
# Start Gibbs sampling
out.ran <- bugs (win.data, inits, params, "ranefanova.txt", n.thin=nt, n.chains=nc, n.burnin=nb,
n.iter=ni, debug = TRUE)
print(out.ran, dig = 3)
# Add random-effects group mean estimates into the plot
points(1:5, out.ran$mean$pop.mean, col = 'red', cex = 2, lwd = 3, pch = 20)
So we see that the random-effects estimates are very slightly pulled in towards the grand mean; that is,
they are 'shrunk' towards the grand mean of about 50.
We can induce much more shrinkage in one group by discarding most of the information in it. We turn
the measurements of 9 of the 10 snakes in population 5 into NA's. Note that you won't see much
shrinkage if you do this for population 1, whose mean is very close to the grand mean anyway.
yy <- y
yy[42:50] <- NA
boxplot(y~x, col="grey", xlab="Population", ylab="SVL", main="", las = 1)
# Fixed-effects ANOVA
# Package data
win.data <- list(y = yy, x = x)
# Inits function
inits <- function(){ list(alpha = rnorm(5, mean = mean(y)), sigma = rlnorm(1) )}
# Select parameters you want to estimate
params <- c("alpha", "sigma")
# MCMC settings
ni <- 1200 ; nb <- 200 ; nt <- 2 ; nc <- 3
# Start Gibbs sampling
out.fix <- bugs (win.data, inits, params, "fixefanova.txt", n.thin=nt, n.chains=nc, n.burnin=nb,
n.iter=ni, debug = FALSE)
print(out.fix, dig = 3)
# Add fixed-effects group mean estimates into the plot
points(1:5, out.fix$mean$alpha, col = 'blue', cex = 2, lwd = 3)
abline(h=50)
# Random-effects ANOVA
# Inits function
inits <- function(){ list(pop.mean = rnorm(5, mean = mean(y)), sigma.res = rlnorm(1) )}
# Select parameters you want to estimate
params <- c("pop.mean", "sigma.res", "sigma.group")
# MCMC settings
ni <- 1200 ; nb <- 200 ; nt <- 2 ; nc <- 3
# Start Gibbs sampling
out.ran <- bugs (win.data, inits, params, "ranefanova.txt", n.thin=nt, n.chains=nc, n.burnin=nb,
n.iter=ni, debug = FALSE)
print(out.ran, dig = 3)
# Add random-effects group mean estimates into the plot
points(1:5, out.ran$mean$pop.mean, col = 'red', cex = 2, lwd = 3, pch = 20)
In random-effects models, there is more shrinkage for groups about which we have less information.
So in this case, the estimate of the mean of the last group is shrunk more strongly than those of the other groups.
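The amount of shrinkage can be understood as a precision-weighted compromise between a group's own data and the among-group distribution. The sketch below (in Python, with the variances treated as known, which is a simplification of the full model where they are estimated jointly; the helper `shrunken_mean` is purely illustrative) shows that the same sample mean is pulled in more when it is based on fewer snakes.

```python
def shrunken_mean(ybar_j, n_j, sigma2_res, mu_grand, sigma2_group):
    """Precision-weighted compromise between a group's sample mean and the
    grand mean, assuming known variances."""
    w_data = n_j / sigma2_res      # precision of the group's own data
    w_prior = 1.0 / sigma2_group   # precision of the among-group distribution
    return (w_data * ybar_j + w_prior * mu_grand) / (w_data + w_prior)

# A data-rich group (10 snakes) is barely shrunk ...
full = shrunken_mean(ybar_j=60.0, n_j=10, sigma2_res=9.0,
                     mu_grand=50.0, sigma2_group=25.0)
# ... while the same group mean based on only 2 snakes is pulled in more
sparse = shrunken_mean(ybar_j=60.0, n_j=2, sigma2_res=9.0,
                       mu_grand=50.0, sigma2_group=25.0)
```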
3. One-way ANOVA for mean density (Swiss hares)
-------------------------------------------
Exercise:
Fit both a fixed- and a random-effects ANOVA to the mean observed counts, treating years as
replicates. Note: there are 17 years.
Solution:
boxplot(hares$mean.density~hares$year, col="grey", xlab="Year", ylab="Mean counts", main="", las = 1)
abline(h = mean(hares$mean.density, na.rm = TRUE))
# Write fixed-effects ANOVA model
sink("fixefanova.txt")
cat("
model {
# Priors
for (i in 1:17){
alpha[i] ~ dnorm(0, 0.001)
}
sigma ~ dunif(0, 100)
# Likelihood
for (i in 1:n) {
y[i] ~ dnorm(mean[i], tau)
mean[i] <- alpha[x[i]]
}
tau <- 1/(sigma*sigma)
}
",fill=TRUE)
sink()
# Package data
win.data <- list(y=hares$mean.density , x= as.numeric(factor(hares$year)),
n = length(hares$mean.density))
# Inits function
inits <- function(){ list(alpha = rnorm(17, mean = mean(hares$mean.density, na.rm = TRUE)), sigma = rlnorm(1) )}
# Select parameters you want to estimate
params <- c("alpha", "sigma")
# MCMC settings
ni <- 1200 ; nb <- 200 ; nt <- 2 ; nc <- 3
# Start Gibbs sampling
out.fix <- bugs (win.data, inits, params, "fixefanova.txt", n.thin=nt, n.chains=nc, n.burnin=nb,
n.iter=ni, debug = TRUE)
print(out.fix, dig = 3)
# Add fixed-effects group mean estimates into the plot
points(1:17, out.fix$mean$alpha, col = 'blue', cex = 2, lwd = 3)
# Note: the estimated means are always greater than the medians shown by the boxplots. This is
# because the counts are right-skewed.
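That mean-above-median pattern is a generic property of right-skewed data, which a quick simulation can confirm (in Python, with lognormal draws standing in for skewed density counts; the parameters are invented for illustration).

```python
import random
import statistics

random.seed(2)
# Right-skewed 'counts', loosely mimicking the skew of the hare densities
x = [random.lognormvariate(1.0, 0.8) for _ in range(5000)]

m = statistics.mean(x)
med = statistics.median(x)
# For a lognormal, E[X] = exp(mu + s^2/2) exceeds the median exp(mu)
```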
# Write random-effects ANOVA model
sink("ranefanova.txt")
cat("
model {
for (i in 1:17){
pop.mean[i] ~ dnorm(mu, tau.group) # Prior for population means
effe[i] <- pop.mean[i] - mu # Population effects as derived quant's
}
mu ~ dnorm(0,0.001) # Hyperprior for grand mean density
sigma.group ~ dunif(0, 10) # Hyperprior for sd of population effects
sigma.res ~ dunif(0, 15) # Prior for residual sd
# Likelihood
for (i in 1:n) {
y[i] ~ dnorm(mean[i], tau.res)
mean[i] <- pop.mean[x[i]]
}
tau.group <- 1 / (sigma.group * sigma.group)
tau.res <- 1 / (sigma.res * sigma.res)
}
",fill=TRUE)
sink()
# Package data
win.data <- list(y=hares$mean.density , x= as.numeric(factor(hares$year)),
n = length(hares$mean.density))
# Inits function
inits <- function(){ list(pop.mean = rnorm(17, mean = mean(hares$mean.density, na.rm = TRUE)), sigma.res = rlnorm(1) )}
# Select parameters you want to estimate
params <- c("pop.mean", "sigma.res", "sigma.group")
# MCMC settings
ni <- 1200 ; nb <- 200 ; nt <- 2 ; nc <- 3
# Start Gibbs sampling
out.ran <- bugs (win.data, inits, params, "ranefanova.txt", n.thin=nt, n.chains=nc, n.burnin=nb,
n.iter=ni, debug = TRUE)
print(out.ran, dig = 3)
# Add random-effects group mean estimates into the plot
points(1:17, out.ran$mean$pop.mean, col = 'red', cex = 2, lwd = 3, pch = 20)
Again, we clearly see the shrinkage towards the grand mean, shown by the thin line.
###############################################################
# #
# Chapter 10: Two-way ANOVA #
# #
###############################################################
1. Fit the two-way ANOVA with interaction (snake toy data set)
-----------------------------------------------------------
We fit a two-way ANOVA with interaction to the snake toy data set (chapter 6). Here are the data again:
mass <- c(6, 8, 5, 7, 9, 11)
pop <- factor(c(1,1,2,2,3,3))
region <- factor(c(1,1,1,1,2,2))
hab <- factor(c(1,2,3,1,2,3))
svl <- c(40, 45, 39, 50, 52, 57)
We want to fit a model like the one specified in R by typing lm(mass ~ region * hab)
We may start with the code in chapter 10.5.2.
# Write model
sink("model.txt")
cat("
model {
# Priors
for (i in 1:2){
for(j in 1:3) {
group.mean[i,j] ~ dnorm(0, 0.0001)
}
}
sigma ~ dunif(0, 100)
# Likelihood
for (i in 1:6) {
mass[i] ~ dnorm(mean[i], tau)
mean[i] <- group.mean[region[i], hab[i]]
}
# Derived quantities
tau <- 1 / ( sigma * sigma)
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(mass=mass, hab = as.numeric(hab), region = as.numeric(region))
# Inits function
inits <- function(){list(sigma = rlnorm(1) )}
# Parameters to estimate
params <- c("group.mean", "sigma")
# MCMC settings
ni <- 6000
nb <- 2000
nt <- 2
nc <- 3
# Start Gibbs sampling
out <- bugs(win.data, inits, params, "model.txt", n.thin=nt, n.chains=nc, n.burnin=nb, n.iter=ni,
debug = TRUE)
# Print estimates
print(out, dig = 3)
> print(out, dig = 3)
Inference for Bugs model at "model.txt", fit using WinBUGS,
3 chains, each with 6000 iterations (first 2000 discarded), n.thin = 2
n.sims = 6000 iterations saved
mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
group.mean[1,1] 6.262 14.632 -27.362 4.089 6.466 8.663 38.055 1.009 5300
group.mean[1,2] 7.666 19.216 -37.341 4.618 7.992 11.060 49.904 1.002 4400
group.mean[1,3] 4.986 19.469 -40.871 1.823 5.037 8.178 47.974 1.003 6000
group.mean[2,1] 2.014 100.338 -187.507 -66.692 0.953 70.670 199.107 1.001 5000
group.mean[2,2] 8.528 19.195 -35.532 5.553 8.994 12.102 48.761 1.004 6000
group.mean[2,3] 10.497 20.405 -34.370 7.439 10.920 13.832 54.461 1.005 6000
sigma 13.282 17.854 0.520 1.941 5.667 16.552 67.890 1.008 280
deviance 37.076 16.149 9.026 24.250 36.745 49.740 66.010 1.006 360
For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
DIC info (using the rule, pD = Dbar-Dhat)
pD = -5.0 and DIC = 32.1
DIC is an estimate of expected predictive error (lower deviance is better).
Non-identifiability is perhaps slightly less easily diagnosed in a Bayesian analysis than in a
classical one, since there is always some information about a parameter: if there is none in
the data, then it comes from the prior. In our analysis, we have no information on snakes in habitat
1 in region 2, hence the posterior distribution for the associated parameter is determined by the
prior. We recognise this by the huge spread of the posterior of group.mean[2,1].
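Why an empty cell's posterior equals its prior can be made explicit with the conjugate update for a normal mean with known variance (a simplification, since WinBUGS also estimates sigma; the function below is a hypothetical sketch, not part of the book code).

```python
def posterior_normal_mean(n, ybar, sigma2, prior_mean, prior_var):
    """Conjugate normal update for a mean with known residual variance.
    With n = 0 the data contribute nothing and the prior comes back."""
    prec = 1.0 / prior_var + n / sigma2
    mean = (prior_mean / prior_var + n * ybar / sigma2) / prec
    return mean, 1.0 / prec

# An empty cell (like habitat 1 in region 2): posterior == prior
m0, v0 = posterior_normal_mean(n=0, ybar=0.0, sigma2=25.0,
                               prior_mean=0.0, prior_var=1.0e4)
# A cell with data: posterior dominated by the data, variance collapses
m1, v1 = posterior_normal_mean(n=2, ybar=7.0, sigma2=25.0,
                               prior_mean=0.0, prior_var=1.0e4)
```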
2. Fit the two-way ANOVA with interaction (Swiss hare data)
--------------------------------------------------------
Exercise:
Fit the model corresponding to what we would specify in R as lm(mean.density ~ landuse * region)
Solution:
We may again start with the code in chapter 10.5.2. We note that we have 8 regions and two landuse
classes. We look at the data first:
attach(hares)
aggregate(mean.density, by = list(region, landuse), FUN = mean, na.rm = TRUE)
table(region, landuse)
So there are no grassland sites in three of the regions: CH.N, CH.SW and CH.W.
As a simple way to check whether the landuse effects are the same in all regions, we compute a
derived variable called diff.density and monitor that.
# Write model
sink("model.txt")
cat("
model {
# Priors
for (i in 1:2){
for(j in 1:8) {
group.mean[i,j] ~ dnorm(0, 0.0001)
}
}
sigma ~ dunif(0, 100)
# Likelihood
for (i in 1:n) {
mean.density[i] ~ dnorm(mean[i], tau)
mean[i] <- group.mean[landuse[i], region[i]]
}
# Derived quantities
tau <- 1 / ( sigma * sigma)
for(j in 1:8){
diff.density[j] <- group.mean[1,j] - group.mean[2,j]
}
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(mean.density=mean.density, region = as.numeric(region), landuse = as.numeric(landuse),
n = length(mean.density))
# Inits function
inits <- function(){list(sigma = rlnorm(1) )}
# Parameters to estimate
params <- c("group.mean", "sigma", "diff.density")
# MCMC settings
ni <- 6000
nb <- 2000
nt <- 2
nc <- 3
# Start Gibbs sampling
out <- bugs(win.data, inits, params, "model.txt", n.thin=nt, n.chains=nc, n.burnin=nb, n.iter=ni,
debug = TRUE)
# Print estimates
print(out, dig = 3)
> print(out, dig = 3)
Inference for Bugs model at "model.txt", fit using WinBUGS,
3 chains, each with 6000 iterations (first 2000 discarded), n.thin = 2
n.sims = 6000 iterations saved
mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
group.mean[1,1] 5.188 0.293 4.611 4.992 5.189 5.384 5.749 1.002 1800
group.mean[1,2] 3.136 0.908 1.366 2.533 3.128 3.749 4.949 1.001 6000
group.mean[1,3] 2.600 0.448 1.737 2.297 2.591 2.898 3.503 1.001 6000
group.mean[1,4] 3.232 0.720 1.792 2.750 3.226 3.724 4.628 1.001 2900
group.mean[1,5] 5.251 0.385 4.501 4.990 5.251 5.508 6.009 1.001 6000
group.mean[1,6] 9.625 0.406 8.839 9.349 9.623 9.901 10.420 1.001 6000
group.mean[1,7] 5.127 0.325 4.496 4.907 5.128 5.338 5.791 1.001 6000
group.mean[1,8] 4.464 0.673 3.180 3.999 4.451 4.917 5.787 1.001 6000
group.mean[2,1] 2.446 0.789 0.869 1.924 2.446 2.973 4.000 1.001 6000
group.mean[2,2] 2.863 0.653 1.572 2.430 2.864 3.286 4.136 1.001 6000
group.mean[2,3] 2.627 0.530 1.600 2.271 2.627 2.982 3.655 1.001 6000
group.mean[2,4] 3.819 0.335 3.159 3.592 3.818 4.050 4.477 1.001 6000
group.mean[2,5] 1.545 100.131 -196.305 -64.917 3.191 68.343 195.602 1.001 4200
group.mean[2,6] -0.034 99.123 -191.400 -67.010 -0.792 67.582 194.020 1.001 3700
group.mean[2,7] 1.294 100.715 -198.205 -66.855 0.496 67.867 195.005 1.001 6000
group.mean[2,8] 0.557 1.263 -1.938 -0.310 0.548 1.395 3.025 1.001 3900
sigma 3.322 0.091 3.149 3.260 3.320 3.382 3.508 1.001 3200
diff.density[1] 2.742 0.839 1.101 2.174 2.756 3.309 4.398 1.001 6000
diff.density[2] 0.274 1.112 -1.912 -0.452 0.278 1.029 2.449 1.001 6000
diff.density[3] -0.027 0.691 -1.371 -0.496 -0.034 0.444 1.323 1.001 6000
diff.density[4] -0.587 0.797 -2.149 -1.123 -0.599 -0.051 0.966 1.001 3500
diff.density[5] 3.706 100.138 -190.305 -63.150 2.153 70.190 201.310 1.001 4200
diff.density[6] 9.659 99.138 -184.700 -57.932 10.255 76.825 201.105 1.001 3700
diff.density[7] 3.832 100.706 -189.905 -62.553 4.543 71.830 203.405 1.001 6000
diff.density[8] 3.907 1.447 1.080 2.936 3.918 4.872 6.762 1.001 4300
deviance 3545.099 5.291 3537.000 3541.000 3545.000 3548.000 3557.000 1.001 3200
For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
DIC info (using the rule, pD = Dbar-Dhat)
pD = 14.0 and DIC = 3559.1
DIC is an estimate of expected predictive error (lower deviance is better).
We see that the group means [2,5], [2,6] and [2,7], the primary parameters for which there is no
information in the data, are determined entirely by their priors (note the huge posterior sd's and
95% credible intervals). As an aside, we also see that the effective number of parameters, pD, is
estimated correctly: we estimate 13 informed mean parameters plus a variance.
In addition, the difference in mean density between the two landuse types seems to differ quite a bit
among the eight regions, as we can see from the estimates of the diff.density nodes. (Those for
regions without grassland sites again have huge posterior sd's and 95% CRIs.)
###############################################################
# #
# Chapter 11: General linear model (ANCOVA) #
# #
###############################################################
1. Probability of a parameter
--------------------------
Exercise:
What is the probability that the mass-length slope is smaller in the Jura than in the Massif Central?
Solution:
The derived quantity named "test1" provides the answer to this question, since it is exactly the
contrast between the slope in the Jura mountains and that in the Massif Central. So the posterior
distribution of that quantity gives us the answer. We plot the posterior distribution first:
hist(out$sims.list$test1, col = "gray")
abline(v = 0, lwd = 3, col = "red")
A fairly large part of the area of the posterior distribution lies to the left of the red line. This
part corresponds to the probability that the slope is smaller in the Jura than in the Massif Central.
To get a numerical estimate of that probability, we can do the following. Note how useful it is in R
to be able to do calculations on logical values, where TRUE is interpreted as 1 and FALSE as 0.
mean(out$sims.list$test1 < 0)
> mean(out$sims.list$test1 < 0)
[1] 0.976
Another possibility would be to use the step function within WinBUGS.
NOTE: As always, your solutions will be numerically different, because you have a slightly different
data set and also, your MCMC samples are different from mine.
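As a stand-alone illustration of this trick (the draws below are made-up numbers, not output from the model run), R coerces TRUE to 1 and FALSE to 0, so mean() of a logical vector directly yields a proportion:

```r
# Made-up posterior draws of a slope difference, just to show the mechanics
draws <- c(-0.8, -0.3, -1.2, 0.4, -0.6, -0.9, 0.1, -0.5)
draws < 0         # Logical vector: TRUE where the difference is negative
mean(draws < 0)   # TRUE counts as 1, FALSE as 0, so this is a proportion: 0.75
```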
2. 'Neighbouring' models
-------------------
It is very insightful to practice "going left and right" around a given model, i.e., to go one step
up or down in complexity. Here, we first reduce the ANCOVA with heterogeneous slopes to an ANCOVA
with homogeneous slopes and then to a simple linear regression. Try to do this without looking at
the code in the book.
(a) First, we make the slopes the same in all groups. This is done by removing the index of beta.
# Define model
sink("lm.txt")
cat("
model {
# Priors
for (i in 1:n.group){
alpha[i] ~ dnorm(0, 0.001) # Intercepts
}
beta ~ dnorm(0, 0.001) # There is now only a single slope, so take it out of the loop
sigma ~ dunif(0, 100) # Residual standard deviation
tau <- 1 / ( sigma * sigma)
# Likelihood
for (i in 1:n) {
mass[i] ~ dnorm(mu[i], tau)
mu[i] <- alpha[pop[i]] + beta * length[i] # remove the index of beta
}
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(mass = as.numeric(mass), pop = as.numeric(pop), length = as.numeric(scale(length)),
n.group = max(as.numeric(pop)), n = n)
# Inits function # Also note that only a single init for beta now
inits <- function(){ list(alpha = rnorm(n.groups, 0, 2), beta = rnorm(1, 1, 1), sigma = rlnorm(1))}
# Parameters to estimate
parameters <- c("alpha", "beta", "sigma") # No change here, since WB interprets beta as a scalar now
# MCMC settings
ni <- 1200
nb <- 200
nt <- 2
nc <- 3
# Start Markov chains
out <- bugs(win.data, inits, parameters, "lm.txt", n.thin=nt, n.chains=nc, n.burnin=nb, n.iter=ni,
debug = TRUE)
(b) Next, we also make the intercepts identical among groups. This brings us back to the simple
linear regression.
# Define model
sink("lm.txt")
cat("
model {
# Priors
alpha ~ dnorm(0, 0.001) # Intercept is also made a scalar now, no loop anymore
beta ~ dnorm(0, 0.001)
sigma ~ dunif(0, 100)
tau <- 1 / ( sigma * sigma)
# Likelihood
for (i in 1:n) {
mass[i] ~ dnorm(mu[i], tau)
mu[i] <- alpha + beta* length[i] # no index on either alpha or beta
}
}
",fill=TRUE)
sink()
# Bundle data # pop not needed anymore; WB will crash when provided with data for pop. Also, n.group is obsolete.
win.data <- list(mass = as.numeric(mass), length = as.numeric(scale(length)), n = n)
# Inits function # Single init for alpha and for beta now
inits <- function(){ list(alpha = rnorm(1, 0, 2), beta = rnorm(1, 1, 1), sigma = rlnorm(1))}
# Parameters to estimate
parameters <- c("alpha", "beta", "sigma") # No change here
# MCMC settings
ni <- 1200
nb <- 200
nt <- 2
nc <- 3
# Start Markov chains
out <- bugs(win.data, inits, parameters, "lm.txt", n.thin=nt, n.chains=nc, n.burnin=nb, n.iter=ni,
debug = TRUE)
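As a quick frequentist cross-check, the same "going left and right" can be done per maximum likelihood with lm(). This is only a sketch on simulated toy data (the effect sizes and three groups are arbitrary assumptions, not the book's data set), kept under separate names so the workspace objects pop, length and mass are left untouched:

```r
# Simulated toy data with 3 groups, just to compare the three model formulas
set.seed(42)
pop.toy <- factor(rep(1:3, each = 20))
len.toy <- rnorm(60)
mass.toy <- as.numeric(pop.toy) + 2 * len.toy + rnorm(60, 0, 0.5)

fm.full <- lm(mass.toy ~ pop.toy * len.toy)   # ANCOVA, heterogeneous slopes: 6 coefficients
fm.add  <- lm(mass.toy ~ pop.toy + len.toy)   # ANCOVA, homogeneous slopes:   4 coefficients
fm.slr  <- lm(mass.toy ~ len.toy)             # Simple linear regression:     2 coefficients
c(length(coef(fm.full)), length(coef(fm.add)), length(coef(fm.slr)))
```

Dropping the interaction removes two slope coefficients, and dropping the factor removes two intercept coefficients, mirroring the two WinBUGS model reductions above.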
3. Add a quadratic term to the ANCOVA
----------------------------------
We have to decide whether we assume that the quadratic effect of length is to be the same in all three
populations or not. Here we assume that it is different in all three populations.
# Define model
sink("lm.txt")
cat("
model {
# Priors
for (i in 1:n.group){
alpha[i] ~ dnorm(0, 0.001) # Intercepts
beta1[i] ~ dnorm(0, 0.001) # Linear effects
beta2[i] ~ dnorm(0, 0.001) # Quadratic effects of length
}
sigma ~ dunif(0, 100) # Residual standard deviation
tau <- 1 / ( sigma * sigma)
# Likelihood
for (i in 1:n) {
mass[i] ~ dnorm(mu[i], tau)
mu[i] <- alpha[pop[i]] + beta1[pop[i]]* length[i] + beta2[pop[i]]* pow(length[i],2)
}
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(mass = as.numeric(mass), pop = as.numeric(pop), length = as.numeric(scale(length)),
n.group = max(as.numeric(pop)), n = n)
# Inits function
inits <- function(){ list(alpha = rnorm(n.groups, 0, 2), beta1 = rnorm(n.groups, 1, 1),
beta2 = rnorm(n.groups, 1, 1), sigma = rlnorm(1))}
# Parameters to estimate
parameters <- c("alpha", "beta1", "beta2", "sigma")
# MCMC settings
ni <- 1200
nb <- 200
nt <- 2
nc <- 3
# Start Markov chains
out <- bugs(win.data, inits, parameters, "lm.txt", n.thin=nt, n.chains=nc, n.burnin=nb, n.iter=ni,
debug = TRUE)
print(out, dig = 3)
4. ANCOVA pop*year for mean.density (Swiss hare data)
--------------------------------------------------
Exercise:
Fit a pop * year model to mean density
Solution:
# Define model
sink("lm.txt")
cat("
model {
# Priors
for (i in 1:n.group){
alpha[i] ~ dnorm(0, 0.001) # Intercepts
beta1[i] ~ dnorm(0, 0.001) # Linear effects
}
sigma ~ dunif(0, 100) # Residual standard deviation
tau <- 1 / ( sigma * sigma)
# Likelihood
for (i in 1:n) {
mean.density[i] ~ dnorm(mu[i], tau)
mu[i] <- alpha[pop[i]] + beta1[pop[i]]* year[i]
residual[i] <- mean.density[i] - mu[i]
}
}
",fill=TRUE)
sink()
# Bundle data
attach(hares)
win.data <- list(mean.density = mean.density, pop = as.numeric(site), year = as.numeric(scale(year)),
n.group = max(as.numeric(site)), n = length(mean.density))
# Inits function
inits <- function(){ list(alpha = rnorm(56, 0, 2), beta1 = rnorm(56, 1, 1), sigma = rlnorm(1))}
# Parameters to estimate
parameters <- c("alpha", "beta1", "sigma", "residual")
# MCMC settings
ni <- 1200
nb <- 200
nt <- 2
nc <- 3
# Start Markov chains
out <- bugs(win.data, inits, parameters, "lm.txt", n.thin=nt, n.chains=nc, n.burnin=nb, n.iter=ni,
debug = TRUE, DIC = FALSE)
print(out, dig = 3)
NOTE:
This was a really tricky one: without the DIC=FALSE argument in the function bugs(), WinBUGS crashed
and said something about
-------------------------------
NIL dereference (read)
DeviancePlugin.Monitor.Update [00000080H]
.chain INTEGER 0
.i INTEGER 0
.monitor DeviancePlugin.Monitor [01003E90H]
UpdaterActions.UpdateMonitors [0000011CH]
.chain INTEGER 0
[ MUCH more down here ]
-------------------------------
Because this cryptic error message said something about the deviance, and because in a similar case
switching off the calculation of the DIC had helped, I tried that, and it DID help. That's one of the
things that one just has to know in order to succeed with some analyses in WinBUGS !
So to inspect the residuals, we can plot them:
plot(out$mean$residual)
abline(h = c(-4,-2, 0, 2, 4))
sum(abs(out$mean$residual) > 4)
We see that there are some outliers (defined here as units with absolute residuals greater than 4).
However, given that they make up only 13/952, or about 1.4%, of the data, this is probably
not alarming.
###############################################################
# #
# Chapter 12: Linear mixed model #
# #
###############################################################
1. Turning the LMM into an ANCOVA
------------------------------
Exercise:
Turn the LMM back into an ANCOVA.
Solution:
The LMM has explanatory variables pop*length, and both sets of coefficients (alpha for pop and
beta for length) are assumed to come from one normal distribution each. To turn this model back
into an ANCOVA, we simply remove these two distributional assumptions. I take my data set generated
in 12.2. and the code in 12.4.2. and do the following:
# Define model
sink("model.txt")
cat("
model {
# Priors
for (i in 1:ngroups){
alpha[i] ~ dnorm(0, 0.00001) # Change the priors (don't choose them too tight !)
beta[i] ~ dnorm(0, 0.00001) # No hyperparams anymore
}
# Delete the code for the hyperparams
#mu.int ~ dnorm(0, 0.001) # Mean hyperparameter for random intercepts
#tau.int <- 1 / (sigma.int * sigma.int)
#sigma.int ~ dunif(0, 100) # SD hyperparameter for random intercepts
#mu.slope ~ dnorm(0, 0.001) # Mean hyperparameter for random slopes
#tau.slope <- 1 / (sigma.slope * sigma.slope)
#sigma.slope ~ dunif(0, 100) # SD hyperparameter for slopes
tau <- 1 / ( sigma * sigma) # Residual precision
sigma ~ dunif(0, 100) # Residual standard deviation
# Likelihood # Note no change at all here
for (i in 1:n) {
mass[i] ~ dnorm(mu[i], tau)
mu[i] <- alpha[pop[i]] + beta[pop[i]]* length[i]
}
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(mass = as.numeric(mass), pop = as.numeric(pop), length = length,
ngroups = max(as.numeric(pop)), n = n)
# Inits function # No inits anymore for hyperparams (they're not in the model anymore)
inits <- function(){ list(alpha = rnorm(n.groups, 0, 2), beta = rnorm(n.groups, 10, 2),
sigma = rlnorm(1))}
# Parameters to estimate
parameters <- c("alpha", "beta", "sigma")
# MCMC settings
ni <- 2000
nb <- 500
nt <- 2
nc <- 3
# Start Gibbs sampling
out <- bugs(win.data, inits, parameters, "model.txt", n.thin=nt, n.chains=nc, n.burnin=nb,
n.iter=ni, debug = TRUE)
print(out, dig = 2)
For fun, we can plot the data and the regression lines:
plot(length, mass, col= "grey", cex = 1.5)
for(i in 1:max(as.numeric(pop))){
abline(out$mean$alpha[i], out$mean$beta[i], col = i)
}
2. Fit a random-coefficients regression to mean density (Swiss hares)
------------------------------------------------------------------
Exercise:
Fit the model without intercept-slope correlation (mean.density ~ pop * year)
Solution:
# Define model
sink("model.txt")
cat("
model {
# Priors
for (i in 1:ngroups){
alpha[i] ~ dnorm(mu.int, tau.int) # Random intercepts
beta[i] ~ dnorm(mu.slope, tau.slope) # Random slopes
}
mu.int ~ dnorm(0, 0.001) # Mean hyperparameter for random intercepts
tau.int <- 1 / (sigma.int * sigma.int)
sigma.int ~ dunif(0, 100) # SD hyperparameter for random intercepts
mu.slope ~ dnorm(0, 0.001) # Mean hyperparameter for random slopes
tau.slope <- 1 / (sigma.slope * sigma.slope)
sigma.slope ~ dunif(0, 100) # SD hyperparameter for slopes
tau <- 1 / ( sigma * sigma) # Residual precision
sigma ~ dunif(0, 100) # Residual standard deviation
# Likelihood
for (i in 1:n) {
mean.density[i] ~ dnorm(mu[i], tau)
mu[i] <- alpha[pop[i]] + beta[pop[i]]* year[i]
}
}
",fill=TRUE)
sink()
# Bundle data
attach(hares)
win.data <- list(mean.density = mean.density, pop = as.numeric(site), year = as.numeric(scale(year)),
ngroups = max(as.numeric(site)), n = length(mean.density))
# Inits function
inits <- function(){ list(alpha = rnorm(n.groups, 0, 2), beta = rnorm(n.groups, 10, 2),
mu.int = rnorm(1, 0, 1), sigma.int = rlnorm(1), mu.slope = rnorm(1, 0, 1), sigma.slope = rlnorm(1),
sigma = rlnorm(1))}
# Parameters to estimate
parameters <- c("alpha", "beta", "mu.int", "sigma.int", "mu.slope", "sigma.slope", "sigma")
# MCMC settings
ni <- 2000
nb <- 500
nt <- 2
nc <- 3
# Start Gibbs sampling
out <- bugs(win.data, inits, parameters, "model.txt", n.thin=nt, n.chains=nc, n.burnin=nb,
n.iter=ni, debug = TRUE)
print(out, dig = 2)
# compare with REML solutions
library('lme4')
lme.fit2 <- lmer(mean.density ~ scale(year) + (1 | site) + ( 0+ scale(year) | site))
lme.fit2
###############################################################
# #
# Chapter 13: Introduction to the GLM #
# #
###############################################################
1. Predict mean hare density in arable areas
-----------------------------------------
Exercise:
Add a code line that computes mean observed density in arable areas.
Solution:
We simply add a line that adds alpha and beta and exponentiate the result.
# Define model
sink("Poisson.t.test.txt")
cat("
model {
# Priors
alpha ~ dnorm(0,0.001)
beta ~ dnorm(0,0.001)
# Likelihood
for (i in 1:n) {
C[i] ~ dpois(lambda[i])
log(lambda[i]) <- alpha + beta *x[i]
# Fit assessments
Presi[i] <- (C[i] - lambda[i]) / sqrt(lambda[i]) # Pearson residuals
C.new[i] ~ dpois(lambda[i]) # Replicate data set
Presi.new[i] <- (C.new[i] - lambda[i]) / sqrt(lambda[i]) # Pearson resi
D[i] <- pow(Presi[i], 2)
D.new[i] <- pow(Presi.new[i], 2)
}
# Compute mean hare density in arable areas
mean.arable <- exp(alpha + beta)
# Add up discrepancy measures
fit <- sum(D[])
fit.new <- sum(D.new[])
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(C = C, x = as.numeric(x)-1, n = length(x))
# Inits function
inits <- function(){ list(alpha=rlnorm(1), beta=rlnorm(1))}
# Parameters to estimate
params <- c("lambda","alpha", "beta", "mean.arable", "Presi", "fit", "fit.new")
# MCMC settings
nc <- 3
ni <- 3000
nb <- 1000
nt <- 2
# Start Gibbs sampler
out <- bugs(data=win.data, inits=inits, parameters.to.save=params, model.file="Poisson.t.test.txt", n.thin=nt, n.chains=nc,
n.burnin=nb, n.iter=ni, debug = TRUE)
print(out, dig = 3)
2. Density difference as a derived quantity
----------------------------------------
Exercise:
Estimate the difference in density between arable and grassland areas
Solution:
We add another line of code
# Define model
sink("Poisson.t.test.txt")
cat("
model {
# Priors
alpha ~ dnorm(0,0.001)
beta ~ dnorm(0,0.001)
# Likelihood
for (i in 1:n) {
C[i] ~ dpois(lambda[i])
log(lambda[i]) <- alpha + beta *x[i]
# Fit assessments
Presi[i] <- (C[i] - lambda[i]) / sqrt(lambda[i]) # Pearson residuals
C.new[i] ~ dpois(lambda[i]) # Replicate data set
Presi.new[i] <- (C.new[i] - lambda[i]) / sqrt(lambda[i]) # Pearson resi
D[i] <- pow(Presi[i], 2)
D.new[i] <- pow(Presi.new[i], 2)
}
# Compute difference in mean hare density
difference <- exp(alpha) - exp(alpha + beta)
# Add up discrepancy measures
fit <- sum(D[])
fit.new <- sum(D.new[])
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(C = C, x = as.numeric(x)-1, n = length(x))
# Inits function
inits <- function(){ list(alpha=rlnorm(1), beta=rlnorm(1))}
# Parameters to estimate
params <- c("lambda","alpha", "beta", "difference", "Presi", "fit", "fit.new")
# MCMC settings
nc <- 3
ni <- 3000
nb <- 1000
nt <- 2
# Start Gibbs sampler
out <- bugs(data=win.data, inits=inits, parameters.to.save=params, model.file="Poisson.t.test.txt",
n.thin=nt, n.chains=nc, n.burnin=nb, n.iter=ni, debug = TRUE)
print(out, dig = 3)
To summarize the posterior distribution of the density, we plot it.
hist(out$sims.list$difference, breaks = 50, col = "grey", xlim = c(-10, 5),
xlab = "Density difference grassland - arable land")
abline(v = 0, col = "red", lwd = 2)
3. Migrating raptors
-----------------
Exercise:
Assume that we are counting raptors at some point where bird migration concentrates, for instance
at an isthmus like Gibraltar or Panama, at a mountain pass like Col de Bretolet or along a ridge like
Hawk Mountain. We have seen not a single individual of some special raptor species A during 10 consecutive
days. What is the probability to see two or more individuals of species A on day 11 ? Make explicit your
reasoning and list the assumptions of your model used to answer the question (this exercise is inspired by
Bernardo 2003).
Solution:
We assume that raptors travel independently and that there is a constant probability that one is seen
at the observatory. If this is so, then the data (ten zeroes) can be assumed to represent 10 observations
from a Poisson distribution, whose mean we want to estimate. Using the Bayesian posterior predictive
distribution, it is possible to estimate the probability that on the 11th day, or indeed on any other
particular day, at least one bird of species A is seen.
# Define the data, i.e., the counts of raptor species A over 10 days
C <-c(rep(0,10))
# Write model
sink("raptors.txt")
cat("
model {
# Prior
lambda ~ dunif(0,5)
# Likelihood
for (i in 1:n) { # Loop over 10 observations
C[i] ~ dpois(lambda) # They are assumed to come from a Poisson
}
y ~ dpois(lambda) # To get a prediction for a new day
}
",fill=TRUE)
sink()
# Package data
win.data <- list(C = C, n = length(C))
# Inits
inits <- function(){ list(lambda=rlnorm(1))}
# Stuff to get estimates from
params <- c("lambda", "y")
# MCMC settings
nc <- 3
ni <- 15000
nb <- 3000
nt <- 2
out <- bugs(data=win.data, inits=inits, parameters.to.save=params, model.file="raptors.txt",
n.thin=nt,n.chains=nc,n.burnin=nb, n.iter=ni, debug = TRUE, bugs.directory = "C:/Program files/WinBUGS14/")
print(out, dig = 5)
> print(out, dig = 5)
Inference for Bugs model at "raptors.txt", fit using WinBUGS,
3 chains, each with 15000 iterations (first 3000 discarded), n.thin = 2
n.sims = 18000 iterations saved
mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
lambda 0.10034 0.10100 0.00270 0.0283 0.0696 0.1387 0.37001 1.00100 18000
y 0.10128 0.33354 0.00000 0.0000 0.0000 0.0000 1.00000 1.00103 18000
deviance 2.00672 2.01991 0.05401 0.5660 1.3920 2.7750 7.40022 1.00100 18000
For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
DIC info (using the rule, pD = Dbar-Dhat)
pD = 0.0 and DIC = 2.0
DIC is an estimate of expected predictive error (lower deviance is better).
First, don't be confused by the numerical similarity of lambda and the variable y. Lambda is the average
number of raptors per day (i.e., the Poisson parameter), while y is a random draw, or actually, a large
number of random draws, from the Poisson distribution characterised by that mean.
Then note what may appear to be a paradox to some: we can get a fairly large number of zero
observations from a Poisson process and yet the mean of that process is not 0. To answer the
question about the probability of seeing at least one raptor of species A on a given day, we use
Bayesian posterior inference: unlike frequentist inference, which estimates a single point using,
e.g., maximum likelihood, the Bayesian mode of inference gives us a whole distribution describing
the likely magnitude of an unobservable. That unobservable here is the number of raptors per day in a
Poisson process with mean equal to lambda. By drawing a single observation from that Poisson process at every
step of the Gibbs sampler for lambda, we furthermore fully incorporate in our assessment the entire
uncertainty about lambda. Let's plot the posterior distribution of the number of raptors A per day in this
Poisson process:
hist(out$sims.list$y, col = "grey", xlab = "Number of raptors A per day")
It seems that under a Poisson(0.10034) process the most likely observation is indeed zero, but that
sometimes we also observe ones, and rarely twos or even higher counts:
table(out$sims.list$y)
> table(out$sims.list$y)
0 1 2 3
16352 1480 161 7
To estimate the probability that we see at least one raptor of species A during any given day, we express
the proportion of cases with one or more individuals seen among all random draws from the modeled process.
more.than.0 <- out$sims.list$y >0 # Note calculation with logical funct'n
prob.one.or.more <- sum(more.than.0) / length(more.than.0)
prob.one.or.more
> prob.one.or.more
[1] 0.09155556
Hence, there is a 9% probability to see one or more individuals per day.
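Note that the exercise actually asks about two or more individuals on day 11, while the calculation above gives the probability of one or more. Both can be read off the posterior predictive draws, and (as a rough sketch that ignores the uncertainty in lambda by plugging in its posterior mean) checked analytically with ppois():

```r
# From the draws, the exercise's actual question would be answered by:
# mean(out$sims.list$y >= 2)
# Analytic sketch for a fixed lambda (the posterior mean from the run above);
# plugging in a point estimate slightly understates the tail probabilities
lambda.hat <- 0.1
1 - ppois(0, lambda.hat)   # P(y >= 1), about 0.095
1 - ppois(1, lambda.hat)   # P(y >= 2), about 0.005
```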
4. Modeling mean density when only zeroes are observed in one group
----------------------------------------------------------------
Exercise:
Take the simulated hare data from this chapter and turn all counts in one of the landuse types to zero.
Solution:
We may get another sample of hare counts and turn the first 10 to zero:
n.sites <- 10
x <- gl(n = 2, k = n.sites, labels = c("grassland", "arable"))
n <- 2*n.sites
lambda <- exp(0.69 + 0.92*(as.numeric(x)-1)) # x has levels 1 and 2, not 0 and 1
C <- rpois(n = n, lambda = lambda) # Add Poisson noise
C[1:10] <- 0
C
First, the analysis using maximum likelihood:
poisson.t.test <- glm(C ~ x, family = poisson) # Fit the model
summary(poisson.t.test) # t-Test
anova(poisson.t.test, test = "Chisq") # Likelihood ratio test (LRT)
We see that something went wrong with the algorithm, since the standard errors of the estimates are huge,
and furthermore, the estimates look nothing like the numbers we put into the data set.
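What goes wrong is a case of complete separation on the log scale: with only zeroes in one group, the MLE of that group's log-mean is minus infinity, so glm() wanders off to a huge negative estimate with an enormous standard error. A minimal self-contained sketch (made-up counts under separate names, so the hare objects in the workspace stay intact):

```r
# Toy counts with one group all zeroes
C.toy <- c(rep(0, 10), rep(5, 10))
x.toy <- gl(n = 2, k = 10, labels = c("grassland", "arable"))
fit.toy <- glm(C.toy ~ x.toy, family = poisson)
summary(fit.toy)$coefficients   # Intercept drifts to a huge negative value, SE is enormous
```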
Next, the analysis using Bayesian posterior inference:
# Define model
sink("Poisson.t.test.txt")
cat("
model {
# Priors
alpha ~ dnorm(0,0.001)
beta ~ dnorm(0,0.001)
# Likelihood
for (i in 1:n) {
C[i] ~ dpois(lambda[i])
log(lambda[i]) <- alpha + beta *x[i]
}
# Compute difference in mean hare density
mean.grass <- exp(alpha)
mean.arable <- exp(alpha + beta)
difference <- mean.grass - mean.arable
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(C = C, x = as.numeric(x)-1, n = length(x))
# Inits function
inits <- function(){ list(alpha=rlnorm(1), beta=rlnorm(1))}
# Parameters to estimate
params <- c("alpha", "beta", "mean.grass", "mean.arable", "difference")
# MCMC settings
nc <- 3
ni <- 20000
nb <- 10000
nt <- 10
# Start Gibbs sampler
out <- bugs(data=win.data, inits=inits, parameters.to.save=params, model.file="Poisson.t.test.txt",
n.thin=nt, n.chains=nc, n.burnin=nb, n.iter=ni, debug = TRUE)
print(out, dig = 3)
We see that Bayesian inference is more robust here.
5. Swiss hare counts in arable and grassland areas
-----------------------------------------------
Exercise:
Fit a 'Poisson t-test' to the Swiss hare counts (not the density) for year 2000.
Solution:
Get min counts at each site and landuse indicators for the year 2000
C <- apply(cbind(hares$count1[hares$year == 2000], hares$count2[hares$year == 2000]), 1, min, na.rm = TRUE)
C[C == Inf] <- NA # These are sites without counts in 2000
grass <- as.numeric(hares$landuse[hares$year == 2000]) - 1
# Define model
sink("Poisson.t.test.txt")
cat("
model {
# Priors
alpha ~ dnorm(0,0.001)
beta ~ dnorm(0,0.001)
# Likelihood
for (i in 1:n) {
C[i] ~ dpois(lambda[i])
log(lambda[i]) <- alpha + beta *x[i]
}
# Compute difference in mean hare density
mean.arable <- exp(alpha)
mean.grass <- exp(alpha + beta)
difference <- mean.arable - mean.grass
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(C = C, x = grass, n = length(C))
# Inits function
inits <- function(){ list(alpha=rlnorm(1), beta=rlnorm(1))}
# Parameters to estimate
params <- c("alpha", "beta", "mean.grass", "mean.arable", "difference")
# MCMC settings
nc <- 3
ni <- 3000
nb <- 1000
nt <- 2
# Start Gibbs sampler
out <- bugs(data=win.data, inits=inits, parameters.to.save=params, model.file="Poisson.t.test.txt",
n.thin=nt, n.chains=nc, n.burnin=nb, n.iter=ni, debug = TRUE)
print(out, dig = 3)
It may be inadequate to assume that all the variability among counts is accounted for by the Poisson
variance. Sites vary in size and other things, and this will introduce much overdispersion. We can
see this simply by comparing the residual deviance with the residual d.f. when fitting the GLM by
maximum likelihood.
poisson.t.test <- glm(C ~ grass, family = poisson) # Fit the model
summary(poisson.t.test) # t-Test
The residual deviance is about 20 times greater than the residual d.f., whereas it should be about the same
magnitude under the Poisson assumption.
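The same check can be sketched on simulated data: counts drawn from a negative binomial (extra-Poisson variation built in; the values of mu, size and the seed are arbitrary choices) give a deviance/d.f. ratio far above 1 when fitted as a Poisson GLM:

```r
# Simulate overdispersed counts from a negative binomial
set.seed(1)
C.sim <- rnbinom(n = 100, mu = 5, size = 0.5)   # Variance = mu + mu^2/size = 55, far above the mean
fit.sim <- glm(C.sim ~ 1, family = poisson)
deviance(fit.sim) / df.residual(fit.sim)        # Far above 1, flagging overdispersion
```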
###############################################################
# #
# Chapter 14: GLM specialities #
# #
###############################################################
1. Estimating a coefficient for an offset variable
-----------------------------------------------
Exercise:
Declaring log(A) as an offset means that log(A) enters the linear predictor of a GLM with a
coefficient fixed at 1. The response then has a precise interpretation as a density. In this sense,
it may be confusing to say 'we estimate a coefficient for an offset', since some might say that the
variable then ceases to be an offset. In any case, an offset is equivalent to the assumption that
there is a constant proportionality between y and A in a model such as y = g(theta),
log(theta) = log(A) + stuff. This may well not be true, and we may want to test this assumption.
Solution:
We generate a data set or take the old one again.
n.site <- 10
A <- runif(n = 2*n.site, 2,5) # Areas range in size from 2 to 5 km2
x <- gl(n = 2, k = n.site, labels = c("grassland", "arable"))
linear.predictor <- log(A) + 0.69 +(0.92*(as.numeric(x)-1))
lambda <- exp(linear.predictor)
C <- rpois(n = 2*n.site, lambda = lambda) # Add Poisson noise
Then we specify a coefficient, alpha0, for the explanatory variable logA.
# Define model
sink("model.txt")
cat("
model {
# Priors
alpha0 ~ dnorm(0,0.001)
alpha ~ dnorm(0,0.001)
beta ~ dnorm(0,0.001)
# Likelihood
for (i in 1:n) {
C[i] ~ dpois(lambda[i])
log(lambda[i]) <- alpha0 * logA[i] + alpha + beta *x[i] # Note former 'offset'
}
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(C = C, x = as.numeric(x)-1, logA = log(A), n = length(x))
# Inits function
inits <- function(){ list(alpha0 = rnorm(1), alpha = rlnorm(1), beta = rlnorm(1))} # Also init alpha0
# Parameters to estimate
params <- c("lambda", "alpha0", "alpha", "beta")
# MCMC settings
nc <- 3
ni <- 11000
nb <- 1000
nt <- 2
# Start Gibbs sampling
out <- bugs(data=win.data, inits=inits, parameters.to.save=params, model.file="model.txt", n.thin=nt,
n.chains=nc,n.burnin=nb, n.iter=ni, debug = TRUE)
print(out, dig = 3)
To test whether alpha0 could be assumed equal to 1, we can plot the posterior distribution and see
how likely values other than 1 are.
hist(out$sims.list$alpha0, col = "grey", breaks = 50)
abline(v = 1, col = "red", lwd = 2)
We see that 1 is not an unusual value, so we can just as well take log(A) as an offset.
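The same check can be done per maximum likelihood, comparing a model with log(A) as a free covariate to one with log(A) as a true offset. A sketch on data simulated exactly as above (under separate names, so the workspace objects A, x and C are left intact):

```r
# Simulate counts proportional to area, as in the data-generation code above
set.seed(5)
A.ml <- runif(n = 20, 2, 5)
x.ml <- gl(n = 2, k = 10, labels = c("grassland", "arable"))
C.ml <- rpois(n = 20, lambda = exp(log(A.ml) + 0.69 + 0.92 * (as.numeric(x.ml) - 1)))
fm.free <- glm(C.ml ~ x.ml + log(A.ml), family = poisson)          # log(A) coefficient estimated freely
fm.off  <- glm(C.ml ~ x.ml + offset(log(A.ml)), family = poisson)  # log(A) coefficient fixed at 1
confint.default(fm.free)["log(A.ml)", ]   # Wald CI: does it contain 1 ?
```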
2. Swiss hare counts in arable and grassland areas
-----------------------------------------------
Exercise:
Fit a 'Poisson t-test' to the Swiss hare counts for year 2000 accounting for variable area size by an offset
and overdispersion by a random site effect. Also test for a constant proportionality between counts and area.
Solution:
Get min counts at each site and landuse indicators for the year 2000
C <- apply(cbind(hares$count1[hares$year == 2000], hares$count2[hares$year == 2000]), 1, min, na.rm = TRUE)
C[C == Inf] <- NA # These are sites without counts in 2000
GRASS <- as.numeric(hares$landuse[hares$year == 2000]) - 1
AREA <- as.numeric(hares$area[hares$year == 2000])
# Define model
sink("model.txt")
cat("
model {
# Priors
alpha ~ dnorm(0,0.001)
beta ~ dnorm(0,0.001)
tau <- 1 / (sd * sd)
sd ~ dunif(0, 3)
# Likelihood
for (i in 1:n) {
C[i] ~ dpois(lambda[i])
log(lambda[i]) <- logA[i] + alpha + beta * x[i] + eps[i]
eps[i] ~ dnorm(0, tau)
}
# Compute difference in mean hare density
mean.arable <- exp(alpha)
mean.grass <- exp(alpha + beta)
difference <- mean.arable - mean.grass
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(C = C, x = GRASS, n = length(C), logA = log(AREA))
# Inits function
inits <- function(){ list(alpha=rlnorm(1), beta=rlnorm(1))}
# Parameters to estimate
params <- c("alpha", "beta", "mean.grass", "mean.arable", "difference", "sd")
# MCMC settings
nc <- 3
ni <- 3000
nb <- 1000
nt <- 2
# Start Gibbs sampler
out <- bugs(data=win.data, inits=inits, parameters.to.save=params, model.file="model.txt",
n.thin=nt, n.chains=nc, n.burnin=nb, n.iter=ni, debug = TRUE)
print(out, dig = 3)
> print(out, dig = 3)
Inference for Bugs model at "model.txt", fit using WinBUGS,
3 chains, each with 3000 iterations (first 1000 discarded), n.thin = 2
n.sims = 3000 iterations saved
mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
alpha 1.295 0.168 0.969 1.185 1.298 1.408 1.627 1.046 49
beta -0.771 0.290 -1.311 -0.971 -0.770 -0.582 -0.201 1.017 130
mean.grass 1.737 0.418 1.062 1.434 1.695 1.980 2.680 1.004 1000
mean.arable 3.704 0.623 2.635 3.270 3.664 4.090 5.091 1.050 46
difference 1.967 0.755 0.510 1.473 1.962 2.455 3.525 1.036 66
sd 0.845 0.109 0.652 0.768 0.837 0.914 1.074 1.001 3000
deviance 272.773 9.967 255.700 265.600 272.000 278.925 294.000 1.001 3000
For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
DIC info (using the rule, pD = Dbar-Dhat)
pD = 43.3 and DIC = 316.1
DIC is an estimate of expected predictive error (lower deviance is better).
Note that now mean.grass and mean.arable are densities (i.e., counts per unit area defined by the variable A).
Next, we fit a coefficient for log(A) to see whether density is constant over all area sizes.
# Define model
sink("model.txt")
cat("
model {
# Priors
alpha0 ~ dnorm(0,0.01) # Reduced uncertainty of priors a bit
alpha ~ dnorm(0,0.01)
beta ~ dnorm(0,0.01)
tau <- 1 / (sd * sd)
sd ~ dunif(0, 2)
# Likelihood
for (i in 1:n) {
C[i] ~ dpois(lambda[i])
log(lambda[i]) <- alpha0 * logA[i] + alpha + beta * x[i] + eps[i]
eps[i] ~ dnorm(0, tau)
}
# Compute difference in mean hare density
mean.arable <- exp(alpha)
mean.grass <- exp(alpha + beta)
difference <- mean.arable - mean.grass
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(C = C, x = GRASS, n = length(C), logA = log(AREA))
# Inits function
inits <- function(){ list(alpha0 = rnorm(1), alpha = rlnorm(1), beta = rlnorm(1))} # Also init alpha0
# Parameters to estimate
params <- c("alpha0", "alpha", "beta", "mean.grass", "mean.arable", "difference", "sd")
# MCMC settings # Note longer chains
nc <- 3
ni <- 30000
nb <- 10000
nt <- 20
# Start Gibbs sampler
out <- bugs(data=win.data, inits=inits, parameters.to.save=params, model.file="model.txt",
n.thin=nt, n.chains=nc, n.burnin=nb, n.iter=ni, debug = TRUE)
print(out, dig = 3)
> print(out, dig = 3)
Inference for Bugs model at "model.txt", fit using WinBUGS,
3 chains, each with 30000 iterations (first 10000 discarded), n.thin = 20
n.sims = 3000 iterations saved
mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
alpha0 1.358 0.241 0.885 1.197 1.349 1.518 1.842 1.036 74
alpha 0.663 0.449 -0.253 0.378 0.691 0.963 1.538 1.025 94
beta -0.795 0.284 -1.377 -0.982 -0.785 -0.599 -0.249 1.002 1100
mean.grass 0.993 0.524 0.331 0.625 0.886 1.242 2.269 1.026 88
mean.arable 2.144 0.998 0.776 1.459 1.996 2.621 4.653 1.025 94
difference 1.151 0.662 0.282 0.716 1.021 1.431 2.869 1.030 160
sd 0.844 0.112 0.654 0.765 0.833 0.911 1.105 1.002 1000
deviance 271.934 9.939 254.297 265.100 271.200 278.000 293.100 1.001 3000
For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
DIC info (using the rule, pD = Dbar-Dhat)
pD = 43.2 and DIC = 315.1
DIC is an estimate of expected predictive error (lower deviance is better).
Note that now, mean.grass and mean.arable are no longer the mean density in the two landuse types.
mean(out$sims.list$alpha0 > 1)
So there is quite a bit of evidence for higher densities in larger sites.
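As a rough cross-check of that probability, a normal approximation to the posterior of alpha0 (plugging in the posterior mean 1.358 and sd 0.241 from the table above) gives about the same answer:

```r
# Normal approximation to the posterior of alpha0
1 - pnorm(1, mean = 1.358, sd = 0.241)   # Approximate P(alpha0 > 1), roughly 0.93
```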
###############################################################
# #
# Chapter 15: Poisson ANCOVA #
# #
###############################################################
1. Multiple Poisson regression
---------------------------
Exercise:
Add another explanatory variable into the model.
Solution:
We create 300 values for our new explanatory variable.
nonsense <- runif(300, -2, 2)
# Define model
sink("glm.txt")
cat("
model {
# Priors
for (i in 1:n.groups){
alpha[i] ~ dnorm(0, 0.01) # Intercepts
beta[i] ~ dnorm(0, 0.01) # Slopes
}
delta ~ dnorm(0, 0.01) # Slope of nonsense
# Likelihood
for (i in 1:n) {
C[i] ~ dpois(lambda[i]) # The random variable
lambda[i] <- exp(alpha[pop[i]] + beta[pop[i]] * length[i] + delta * nonsense[i])
} # Note double-indexing: alpha[pop[i]]
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(C = C, pop = as.numeric(pop), n.groups = n.groups, length = length, n = n,
nonsense = nonsense)
# Inits function
inits <- function(){list(alpha=rlnorm(n.groups,3,1), beta=rlnorm(n.groups,2,1))}
# Parameters to estimate
params <- c("alpha", "beta", "delta")
# MCMC settings
ni <- 4500
nb <- 1500
nt <- 5
nc <- 3
# Start Gibbs sampling
out <- bugs(data = win.data, inits = inits, parameters.to.save = params, model.file = "glm.txt",
n.thin = nt, n.chains = nc, n.burnin = nb, n.iter = ni, debug = TRUE)
print(out, dig = 3)
The posterior distribution of the coefficient for nonsense is centered on zero; therefore, we have
not detected any nonsense effects on the mite counts.
2. Polynomial Poisson regression
-----------------------------
Exercise:
We check for an effect of wing length squared on the mite counts, without building this effect into
the data set.
Solution:
We fit a model with population-specific quadratic terms of length
# Define model
sink("glm.txt")
cat("
model {
# Priors
for (i in 1:n.groups){
alpha[i] ~ dnorm(0, 0.01) # Intercepts
beta1[i] ~ dnorm(0, 0.01) # Slopes of linear terms
beta2[i] ~ dnorm(0, 0.01) # Slopes of quadratic terms
}
# Likelihood
for (i in 1:n) {
C[i] ~ dpois(lambda[i]) # The random variable
lambda[i] <- exp(alpha[pop[i]] + beta1[pop[i]] * length[i] + beta2[pop[i]] * pow(length[i], 2))
}
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(C = C, pop = as.numeric(pop), n.groups = n.groups, length = length, n = n)
# Inits function
inits <- function(){list(alpha=rlnorm(n.groups,3,1), beta1=rlnorm(n.groups,2,1))}
# Parameters to estimate
params <- c("alpha", "beta1", "beta2")
# MCMC settings
ni <- 10000
nb <- 5000
nt <- 5
nc <- 3
# Start Gibbs sampling
out <- bugs(data = win.data, inits = inits, parameters.to.save = params, model.file = "glm.txt",
n.thin = nt, n.chains = nc, n.burnin = nb, n.iter = ni, debug = TRUE)
print(out, dig = 3)
Note: I had to use quite long chains to achieve convergence.
3. Poisson ANCOVA for Swiss hare counts
------------------------------------
Exercise:
Fit a Poisson regression that expresses counts as a function of site and year (note: this is essentially the
model that people use to analyse bird counts all over Europe in monitoring programs). You can account for
variable area by fitting log(area) as an offset.
Solution:
# Define model
sink("glm.txt")
cat("
model {
# Priors
for (i in 1:n.groups){
alpha[i] ~ dnorm(0, 0.01) # Intercepts
beta1[i] ~ dnorm(0, 0.01) # Slopes of linear terms
}
# Likelihood
for (i in 1:n) {
C[i] ~ dpois(lambda[i]) # The random variable
lambda[i] <- exp(logA[i] + alpha[pop[i]] + beta1[pop[i]] * year[i])
}
}
",fill=TRUE)
sink()
# Bundle data
attach(hares)
npop <- max(as.numeric(site))
win.data <- list(C = count1, pop = as.numeric(site), n.groups = npop,
year = as.numeric(scale(year)), n = length(count1), logA = log(area))
# Inits function
inits <- function(){list(alpha=rnorm(npop), beta1=rnorm(npop))}
# Parameters to estimate
params <- c("alpha", "beta1", "lambda")
# MCMC settings
ni <- 1000
nb <- 500
nt <- 5
nc <- 3
# Start Gibbs sampling
out <- bugs(data = win.data, inits = inits, parameters.to.save = params, model.file = "glm.txt",
n.thin = nt, n.chains = nc, n.burnin = nb, n.iter = ni, debug = TRUE, DIC = FALSE)
# Note: got NIL dereference error, so switched DIC off
print(out, dig = 3)
# Quick plot of expected count (note lambda contains logA)
plot(1992:2008, out$mean$lambda[1:17], ylim = c(0, 200), main = "Expected counts (not density)",
ylab = "Exp. count", xlab = "Year", lwd = 1, type = "l")
for(i in 2:56){
start.index <- (i - 1) * 17 + 1
end.index <- i * 17
lines(1992:2008, out$mean$lambda[start.index:end.index], col = i)
}
# Compare with ML solutions
yscale <- scale(year)
fm <- glm(count1 ~ site * yscale - 1 -yscale, family = poisson, offset = log(area))
summary(fm)
predict.glm(fm)
# Compare estimates of intercepts and slopes: numerically about the same, as usual
par(mfrow = c(2,1))
plot(fm$coef[1:56], out$mean$alpha)
abline(0,1)
plot(fm$coef[57:112], out$mean$beta1)
abline(0,1)
###############################################################
# #
# Chapter 16: Poisson GLMM #
# #
###############################################################
1. Random-coefficients Poisson regression (Swiss hare data)
--------------------------------------------------------
Exercise:
Fit the random-coefficients model (site * year) to the counts. Take a single count per year and site and
also include log(area) as an offset.
Solution:
# ML (Laplace) solution for fun (note: glmer() fits GLMMs by approximate ML, not REML)
library('lme4')
glmm.fit <- glmer(count1 ~ year + (1 | site) + ( 0 + year | site), family = poisson, offset = log(area))
# Does not converge. Try scaling year
yr <- as.numeric(scale(year))
glmm.fit <- glmer(count1 ~ yr + (1 | site) + ( 0 + yr | site), family = poisson, offset = log(area))
glmm.fit
So we see that classical numerical solutions can go astray.
Next comes the solution in BUGS.
# Define model
sink("glmm.txt")
cat("
model {
# Priors
for (i in 1:n.groups){
alpha[i] ~ dnorm(mu.int, tau.int) # Intercepts
beta[i] ~ dnorm(mu.beta, tau.beta) # Slopes
}
mu.int ~ dnorm(0, 0.00001) # Hyperparameter for random intercepts
tau.int <- 1 / (sigma.int * sigma.int)
sigma.int ~ dunif(0, 5)
mu.beta ~ dnorm(0, 0.00001) # Hyperparameter for random slopes
tau.beta <- 1 / (sigma.beta * sigma.beta)
sigma.beta ~ dunif(0, 5)
# Poisson likelihood
for (i in 1:n) {
C[i] ~ dpois(lambda[i])
lambda[i] <- A[i] * exp(alpha[pop[i]] + beta[pop[i]]* year[i])
# lambda[i] <- exp(log(A[i]) + alpha[pop[i]] + beta[pop[i]]* year[i]) # Alternative formulation for offset
}
}
",fill=TRUE)
sink()
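A reminder on the parameterization: WinBUGS' dnorm(mu, tau) uses the precision tau = 1 / sigma^2, whereas
R's rnorm() uses the standard deviation, which is why the model derives tau.int and tau.beta from the sigmas.
The conversion in R:

```r
# WinBUGS: x ~ dnorm(mu, tau) with precision tau = 1 / sigma^2
# R:       rnorm(n, mu, sigma) with the standard deviation sigma
sigma <- 2.5
tau <- 1 / (sigma * sigma)     # precision, as computed in the model code
sd.back <- 1 / sqrt(tau)       # convert a precision back to a standard deviation
stopifnot(all.equal(sd.back, sigma))
# Draws equivalent to a dnorm(mu, tau) node:
set.seed(1)
x <- rnorm(10000, mean = 0, sd = 1 / sqrt(tau))
round(sd(x), 1)                # close to 2.5
```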
# Bundle data
n.groups = max(as.numeric(site))
win.data <- list(C = count1, pop = as.numeric(site), year = yr, n.groups = n.groups, n = length(count1),
A = area)
# Inits function
inits <- function(){ list(alpha = rnorm(n.groups, 0, 2), beta = rnorm(n.groups, 0, 2), mu.int = rnorm(1, 0, 1),
sigma.int = rlnorm(1), mu.beta = rnorm(1, 0, 1), sigma.beta = rlnorm(1))}
# Parameters to estimate
parameters <- c("alpha", "beta", "mu.int", "sigma.int", "mu.beta", "sigma.beta")
# MCMC settings
ni <- 2000
nb <- 500
nt <- 2
nc <- 3
# Start Gibbs sampling
out <- bugs(win.data, inits, parameters, "glmm.txt", n.thin=nt, n.chains=nc, n.burnin=nb, n.iter=ni,
debug = TRUE)
print(out, dig = 3)
glmm.fit
We get the usual, numerically comparable estimates.
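The two formulations of the offset shown in the likelihood (the active line and the commented alternative)
are algebraically identical, since A * exp(eta) = exp(log(A) + eta). A quick check with made-up numbers:

```r
A <- c(1.5, 2, 0.7)            # hypothetical site areas
eta <- c(0.3, -1.2, 0.8)       # hypothetical linear predictors
lambda1 <- A * exp(eta)        # multiplicative form, as in the model
lambda2 <- exp(log(A) + eta)   # offset form, as in the commented alternative
stopifnot(all.equal(lambda1, lambda2))
```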
2. Random-coefficients Poisson regression with landuse-dependence of hyperparams (Swiss hares)
-------------------------------------------------------------------------------------------
Exercise:
Now make the hyperparams dependent on landuse.
Solution:
Perhaps this could also be done in a frequentist analysis using glmer(), but I don't know how. In BUGS we simply define the
hyperparameters as vectors of length two, rather than as scalars as before, and make them dependent on landuse
as an explanatory variable at the level of the sites. That is, we introduce LU as an explanatory variable of
length 56.
attach(hares)
yr <- as.numeric(scale(year))
LU <- as.numeric(tapply(as.numeric(landuse), as.numeric(site), mean))
# without the leading as.numeric() WinBUGS would crash
# Define model
sink("glmm.txt")
cat("
model {
# Priors
for (i in 1:n.groups){
alpha[i] ~ dnorm(mu.int[LU[i]], tau.int[LU[i]]) # Intercepts
beta[i] ~ dnorm(mu.beta[LU[i]], tau.beta[LU[i]]) # Slopes
}
for(j in 1:2){
mu.int[j] ~ dnorm(0, 0.00001) # Hyperparameter for random intercepts
tau.int[j] <- 1 / (sigma.int[j] * sigma.int[j])
sigma.int[j] ~ dunif(0, 5)
mu.beta[j] ~ dnorm(0, 0.00001) # Hyperparameter for random slopes
tau.beta[j] <- 1 / (sigma.beta[j] * sigma.beta[j])
sigma.beta[j] ~ dunif(0, 5)
}
# Poisson likelihood
for (i in 1:n) {
C[i] ~ dpois(lambda[i])
lambda[i] <- A[i] * exp(alpha[pop[i]] + beta[pop[i]]* year[i])
# lambda[i] <- exp(log(A[i]) + alpha[pop[i]] + beta[pop[i]]* year[i]) # Alternative formulation for offset
}
}
",fill=TRUE)
sink()
# Bundle data
n.groups = max(as.numeric(site))
win.data <- list(C = count1, pop = as.numeric(site), year = yr, n.groups = n.groups, n = length(count1),
A = area, LU = LU)
# Inits function # Must get two values for the four hyperparams
inits <- function(){ list(alpha = rnorm(n.groups, 0, 2), beta = rnorm(n.groups, 0, 2), mu.int = rnorm(2, 0, 1),
sigma.int = runif(2), mu.beta = rnorm(2, 0, 1), sigma.beta = runif(2))}
# Parameters to estimate
parameters <- c("alpha", "beta", "mu.int", "sigma.int", "mu.beta", "sigma.beta")
# MCMC settings
ni <- 2000
nb <- 500
nt <- 2
nc <- 3
# Start Gibbs sampling
out <- bugs(win.data, inits, parameters, "glmm.txt", n.thin=nt, n.chains=nc, n.burnin=nb, n.iter=ni,
debug = TRUE)
print(out, dig = 3)
From looking at the posterior summaries, we conclude that, indeed, the hyperparameters for describing the
inter-site variation in intercepts and slopes of the trends in density seem to vary between the two
landuse types. We could easily test this formally by looking at the posterior distributions for
derived quantities.
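As an illustration of such a derived-quantity test, one could add a node such as
diff.mu <- mu.int[1] - mu.int[2] to the model and monitor it, or compute the contrast directly from the
saved draws. A self-contained sketch, with hypothetical draws standing in for out$sims.list$mu.int:

```r
set.seed(2)
# Hypothetical posterior draws for mu.int of the two landuse types
mu.int <- cbind(rnorm(1500, -1.0, 0.2), rnorm(1500, -1.8, 0.2))
diff.mu <- mu.int[, 1] - mu.int[, 2]   # derived quantity: contrast of hyperparams
quantile(diff.mu, c(0.025, 0.975))     # 95% CRI of the contrast
mean(diff.mu > 0)                      # posterior Pr(contrast > 0)
```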
This is an excellent example of how WinBUGS frees the ecological modeller in you. Fitting such a custom model,
which tests a very specific ecological hypothesis, would be very challenging in other software, I believe.
###############################################################
# #
# Chapter 17: Binomial GLM for two groups #
# #
###############################################################
1. Bernoulli data as 'disaggregated' Binomial data
-----------------------------------------------
Exercise:
Simulate and analyse the data in the chapter at a site-by-site level.
Solution:
We will treat the surveys for the two species at the same site as independent.
Generate the data:
n <- 50 # Number of sites
p.cr <- 13/50 # Occurrence.AND.detection probability Cross-leaved
p.ch <- 29/50 # Occurrence.AND.detection probability Chiltern gentian
C.cr <- rbinom(n, 1, prob = p.cr) ; C.cr # Add Binomial noise
C.ch <- rbinom(n, 1, prob = p.ch) ; C.ch # Add Binomial noise
C <- c(C.cr, C.ch)
species <- gl(n = 2, k = 50, length = 100, labels = c("Cross-leaved gentian","Chiltern gentian"))
tapply(C, species, mean)
And here is the analysis using WinBUGS:
# Define model
sink("model.txt")
cat("
model {
# Priors
alpha ~ dnorm(0,0.01)
beta ~ dnorm(0,0.01)
# Likelihood
for (i in 1:n) {
C[i] ~ dbern(p[i]) # Bernoulli has a single param
logit(p[i]) <- alpha + beta * species[i]
}
# Derived quantities
Occ.cross <- exp(alpha) / (1 + exp(alpha))
Occ.chiltern <- exp(alpha + beta) / (1 + exp(alpha + beta))
Occ.Diff <- Occ.chiltern - Occ.cross # Test quantity
}
",fill=TRUE)
sink()
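The inverse-logit expressions exp(a) / (1 + exp(a)) in the derived quantities are exactly what R computes
with plogis(), which is handy when post-processing the output:

```r
a <- 0.47
stopifnot(all.equal(exp(a) / (1 + exp(a)), plogis(a)))  # inverse logit
stopifnot(all.equal(qlogis(plogis(a)), a))              # qlogis() is the logit
```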
# Bundle data
win.data <- list(C = C, n = 100, species = as.numeric(species)-1)
# Inits function
inits <- function(){ list(alpha=rlnorm(1), beta=rlnorm(1))}
# Parameters to estimate
params <- c("alpha", "beta", "Occ.cross", "Occ.chiltern", "Occ.Diff")
# MCMC settings
nc <- 3
ni <- 1200
nb <- 200
nt <- 2
# Start Gibbs sampling
out <- bugs(data=win.data, inits=inits, parameters.to.save=params, model.file="model.txt", n.thin=nt,
n.chains=nc, n.burnin=nb, n.iter=ni, debug = TRUE)
print(out, dig = 3)
This illustrates the fact that a binomial count is simply the sum of N independent Bernoulli trials with a common success probability.
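This equivalence is easy to verify by simulation: the sum of N Bernoulli(p) draws has the same distribution
as a single Binomial(N, p) draw.

```r
set.seed(3)
N <- 50; p <- 13/50
bern.sums <- replicate(10000, sum(rbinom(N, 1, p)))  # sums of N Bernoullis
binom <- rbinom(10000, N, p)                         # single Binomial draws
c(mean(bern.sums), mean(binom))   # both close to N * p = 13
c(var(bern.sums), var(binom))     # both close to N * p * (1 - p)
```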
2. Probability of occurrence of a 'large' count (Swiss hares)
----------------------------------------------------------
Exercise:
Model the landuse-dependence of the occurrence of a 'large' count. Define 'large' appropriately.
Solution:
We will analyse the first count only, and first look at the distribution of those counts to decide on what
'large' could mean. We will treat counts from different years in the same area as independent and not bother
about site effects.
attach(hares)
hist(count1, breaks = 40, col = "grey")
One might choose 100 as a threshold. Check how many counts are greater than 100.
sum(count1 > 100, na.rm = TRUE)
> sum(count1 > 100, na.rm = TRUE)
[1] 22
That's not too many, so we try a lower threshold, 50 say.
sum(count1 > 50, na.rm = TRUE)
It turns out there are 106 such counts, and we will use that as a threshold to designate a count as large.
large <- as.numeric(count1 > 50)
To analyse the landuse-dependence of the occurrence of a large count, we can simply adapt the code from
exercise 1 above.
# Define model
sink("model.txt")
cat("
model {
# Priors
alpha ~ dnorm(0,0.01)
beta ~ dnorm(0,0.01)
# Likelihood
for (i in 1:n) {
large[i] ~ dbern(p[i])
logit(p[i]) <- alpha + beta * grass[i]
}
# Derived quantities
mean.arable <- exp(alpha) / (1 + exp(alpha))
mean.grass <- exp(alpha + beta) / (1 + exp(alpha + beta))
difference <- mean.arable - mean.grass # Test quantity
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(large = large, n = length(large), grass = as.numeric(landuse)-1)
# Inits function
inits <- function(){ list(alpha=rlnorm(1), beta=rlnorm(1))}
# Parameters to estimate
params <- c("alpha", "beta", "mean.arable", "mean.grass", "difference")
# MCMC settings
nc <- 3
ni <- 1200
nb <- 200
nt <- 2
# Start Gibbs sampling
out <- bugs(data=win.data, inits=inits, parameters.to.save=params, model.file="model.txt", n.thin=nt,
n.chains=nc, n.burnin=nb, n.iter=ni, debug = TRUE)
print(out, dig = 3)
> print(out, dig = 3)
Inference for Bugs model at "model.txt", fit using WinBUGS,
3 chains, each with 1200 iterations (first 200 discarded), n.thin = 2
n.sims = 1500 iterations saved
mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
alpha -1.522 0.117 -1.750 -1.599 -1.518 -1.441 -1.300 1.001 1500
beta -0.651 0.265 -1.181 -0.825 -0.647 -0.480 -0.137 1.001 1500
mean.arable 0.180 0.017 0.148 0.168 0.180 0.191 0.214 1.001 1500
mean.grass 0.104 0.023 0.066 0.088 0.103 0.119 0.152 1.001 1500
difference 0.075 0.028 0.019 0.058 0.076 0.094 0.126 1.001 1500
deviance 579.717 1.986 577.800 578.300 579.100 580.500 584.900 1.002 1500
For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
DIC info (using the rule, pD = Dbar-Dhat)
pD = 2.0 and DIC = 581.7
DIC is an estimate of expected predictive error (lower deviance is better).
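The reported 'n.sims = 1500 iterations saved' follows directly from the MCMC settings: each chain keeps
(ni - nb) / nt draws, and the chains are pooled.

```r
ni <- 1200; nb <- 200; nt <- 2; nc <- 3
n.sims <- nc * (ni - nb) / nt   # chains x (post-burnin iterations / thinning)
n.sims                          # 1500, matching the printed summary
stopifnot(n.sims == 1500)
```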
Conclusion: If we define a count of 51 or greater as 'large', then there is an 18% chance that a count in an
arable site is large, compared to a 10.4% chance in grassland sites. This difference is 'significant', because
the 95% credible interval for the difference (0.019-0.126) does not cover zero.
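A complementary summary is the posterior probability that the difference is positive, i.e., the proportion
of draws of the derived quantity that exceed zero (in practice, mean(out$sims.list$difference > 0)). A
self-contained sketch with hypothetical draws matching the posterior mean and sd above:

```r
set.seed(4)
# Hypothetical draws standing in for out$sims.list$difference
difference <- rnorm(1500, 0.075, 0.028)
mean(difference > 0)                    # Pr(arable > grassland), near 1
quantile(difference, c(0.025, 0.975))   # approximately reproduces the 95% CRI
```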
###############################################################
# #
# Chapter 18: Binomial ANCOVA #
# #
###############################################################
1. Add a further explanatory variable X
------------------------------------
Exercise:
Modify the model in this chapter by adding to it the main effect of a further explanatory variable, say X,
along with its interaction with the explanatory variable named wet.
Solution:
We will not reassemble the data but use the data set we got by executing the code in the chapter. We simply
generate a new random explanatory variable called X and then build its product with wetness.
X <- runif(length(wetness))
Xwet <- X * wetness
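Building the product Xwet by hand is equivalent to letting R's formula interface construct the interaction
column. A quick check with model.matrix(), using toy vectors (named *.demo so as not to overwrite the
analysis objects):

```r
set.seed(5)
w.demo <- runif(10)                 # stand-in for the wetness data
X.demo <- runif(10)                 # stand-in for X
prod.demo <- X.demo * w.demo        # the hand-built interaction
mm <- model.matrix(~ X.demo * w.demo)  # (Intercept), X.demo, w.demo, X.demo:w.demo
stopifnot(all.equal(unname(mm[, "X.demo:w.demo"]), prod.demo))
```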
Here is the analysis using WinBUGS:
# Define model
sink("glm.txt")
cat("
model {
# Priors
for (i in 1:n.groups){
alpha[i] ~ dnorm(0, 0.01) # Intercepts
beta1[i] ~ dnorm(0, 0.01) # Slopes of wetness
}
beta2 ~ dnorm(0, 0.01) # Slope of X
gamma ~ dnorm(0, 0.01) # Slope of interaction X with wet
# Likelihood
for (i in 1:n) {
C[i] ~ dbin(p[i], N[i])
logit(p[i]) <- alpha[pop[i]] + beta1[pop[i]]* wetness[i] + beta2 * X[i] + gamma * Xwet[i]
}
# Derived quantities
# Recover the effects relative to baseline level (no. 1)
a.effe2 <- alpha[2] - alpha[1] # Intercept Black Forest vs. Jura
a.effe3 <- alpha[3] - alpha[1] # Intercept Alps vs. Jura
b.effe2 <- beta1[2] - beta1[1] # Slope Black Forest vs. Jura
b.effe3 <- beta1[3] - beta1[1] # Slope Alps vs. Jura
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(C = C, N = N, pop = as.numeric(pop), n.groups = n.groups, wetness = wetness, n = n,
X = X, Xwet = Xwet)
# Inits function
inits <- function(){ list(alpha = rlnorm(n.groups, 3, 1), beta1 = rlnorm(n.groups, 2, 1))}
# Parameters to estimate
params <- c("alpha", "beta1", "beta2", "gamma", "a.effe2", "a.effe3", "b.effe2", "b.effe3")
# MCMC settings
ni <- 5000
nb <- 1000
nt <- 4
nc <- 3
# Start Gibbs sampler
out <- bugs(data = win.data, inits = inits, parameters.to.save = params, model.file = "glm.txt", n.thin = nt,
n.chains = nc, n.burnin = nb, n.iter = ni, debug = FALSE)
print(out, dig = 3)
# Compare with MLE's
fm <- glm(C/N ~ pop * wetness + X + Xwet, weights = N, family = binomial)
summary(fm)
We note slight differences that may be due to the effects of the priors on the Bayesian estimates or to
small-sample bias in the MLE's.
2. Fit a binomial ANCOVA to the occurrence of large counts (Swiss hares)
---------------------------------------------------------------------
Exercise:
Fit a model with interacting effects of landuse and year (continuous). Also check for non-linear effects of year
by fitting a quadratic year effect in addition to the linear one.
Solution:
We choose 5 as a threshold for a large count this time.
attach(hares)
large <- as.numeric(count1 > 5)
Here are the MLE's.
yr <- as.numeric(scale(year))
fm <- glm(large ~ landuse * yr - 1 - yr, family = binomial)
summary(fm)
Here is the Bayesian analysis using WinBUGS. Note that we use the numerical trick of 'stabilizing' the logit
to avoid numerical over- or underflow. Absent that, WinBUGS will often crash. Also, flat normal priors did not
lead to well-mixing chains, so I used uniform distributions instead.
# Define model
sink("glm.txt")
cat("
model {
# Priors
for (i in 1:2){
alpha[i] ~ dunif(-5, 5) # Intercepts for landuse
beta[i] ~ dunif(-5, 5) # Annual trend estimate
}
# Likelihood
for (i in 1:n) {
large[i] ~ dbern(plim[i])
logit(plim[i]) <- lplim[i]
lplim[i] <- min(999, max(-999, logitp[i]))
logitp[i] <- alpha[landuse[i]] + beta[landuse[i]]* yr[i]
}
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(large = large, landuse = as.numeric(landuse), yr = yr, n = length(count1))
# Inits function
inits <- function(){ list(alpha = rnorm(2), beta = rnorm(2))}
# Parameters to estimate
params <- c("alpha", "beta")
# MCMC settings
ni <- 5000
nb <- 1000
nt <- 4
nc <- 3
# Start Gibbs sampler
out <- bugs(data = win.data, inits = inits, parameters.to.save = params, model.file = "glm.txt", n.thin = nt,
n.chains = nc, n.burnin = nb, n.iter = ni, debug = TRUE)
print(out, dig = 5)
summary(fm)
We see the usual numerical similarity of Bayesian estimates and those obtained by the method of maximum
likelihood.
We next check for non-linear annual trends in the occurrence of large counts. To do this, we simply add a
quadratic effect of year.
Here are the MLE's.
yr <- as.numeric(scale(year))
yr2 <- yr^2
fm <- glm(large ~ landuse * (yr + yr2) - 1 - yr -yr2, family = binomial)
summary(fm)
Here is the Bayesian analysis. The vague, flat normal priors did not lead to very good mixing of the Markov
chains, so as an alternative I tried uniform priors, and they worked better.
# Define model
sink("glm.txt")
cat("
model {
# Priors
for (i in 1:2){
# alpha[i] ~ dnorm(0, 0.001) # Intercepts for landuse
# beta1[i] ~ dnorm(0, 0.001) # Annual trend estimate (linear)
# beta2[i] ~ dnorm(0, 0.001) # Annual trend estimate (quadratic)
alpha[i] ~ dunif(-5, 5) # Intercepts for landuse
beta1[i] ~ dunif(-1, 1) # Annual trend estimate (linear)
beta2[i] ~ dunif(-1, 1) # Annual trend estimate (quadratic)
}
# Likelihood
for (i in 1:n) {
large[i] ~ dbern(plim[i])
logit(plim[i]) <- lplim[i]
lplim[i] <- min(999, max(-999, logitp[i]))
logitp[i] <- alpha[landuse[i]] + beta1[landuse[i]]* yr[i] + beta2[landuse[i]]* yr2[i]
}
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(large = large, landuse = as.numeric(landuse), yr = yr, yr2 = yr2, n = length(count1))
# Inits function
inits <- function(){ list(alpha = rnorm(2), beta1 = rnorm(2))}
# Parameters to estimate
params <- c("alpha", "beta1", "beta2")
# MCMC settings
ni <- 5000
nb <- 1000
nt <- 4
nc <- 3
# Start Gibbs sampler
out <- bugs(data = win.data, inits = inits, parameters.to.save = params, model.file = "glm.txt", n.thin = nt,
n.chains = nc, n.burnin = nb, n.iter = ni, debug = TRUE)
print(out, dig = 5)
summary(fm)
Now the similarity between Bayesian and ML estimates is somewhat more moderate, although the uncertainty intervals
overlap easily. In any case, there is no evidence for a quadratic effect of year.
###############################################################
# #
# Chapter 19: Binomial GLMM #
# #
###############################################################
0. Extra exercise: do a series of GLM 'conversions'
------------------------------------------------
Exercise:
(a) Take the binomial GLMM and turn it back into a binomial GLM. Then turn that into (b) a Poisson GLM
and finally (c) into a normal GLM.
(a) illustrates the difference between a random-effects model and a fixed-effects model.
(b) illustrates the difference between a binomial and a Poisson GLM.
(c) illustrates the difference between a (Poisson) GLM and a simple normal linear model.
You'll have to (re-) create the data set from chapter 19 first.
Solutions:
(a) Conversion Binomial GLMM into a binomial GLM
------------------------------------------------
Binomial GLMM (from the book chapter)
-------------------------------------
# Define model
sink("bin_glmm.txt")
cat("
model {
# Priors
for (i in 1:n.groups){
alpha[i] ~ dnorm(mu.int, tau.int) # Intercepts
beta[i] ~ dnorm(mu.beta, tau.beta) # Slopes
}
mu.int ~ dnorm(0, 0.001) # Hyperparameter for random intercepts
tau.int <- 1 / (sigma.int * sigma.int)
sigma.int ~ dunif(0, 10)
mu.beta ~ dnorm(0, 0.001) # Hyperparameter for random slopes
tau.beta <- 1 / (sigma.beta * sigma.beta)
sigma.beta ~ dunif(0, 10)
# Binomial likelihood
for (i in 1:n) {
C[i] ~ dbin(p[i], N[i])
logit(p[i]) <- alpha[pop[i]] + beta[pop[i]]* prec[i]
}
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(C = C, N = N, pop = as.numeric(pop), prec = prec, n.groups = n.groups, n = n)
# Inits function
inits <- function(){ list(alpha = rnorm(n.groups, 0, 2), beta = rnorm(n.groups, 1, 1),
mu.int = rnorm(1, 0, 1), mu.beta = rnorm(1, 0, 1))}
# Parameters to estimate
params <- c("alpha", "beta", "mu.int", "sigma.int", "mu.beta", "sigma.beta")
# MCMC settings
ni <- 2000
nb <- 500
nt <- 2
nc <- 3
# Start Gibbs sampling
out.bin.glmm <- bugs(win.data, inits, params, "bin_glmm.txt", n.thin=nt, n.chains=nc, n.burnin=nb,
n.iter=ni, debug = TRUE)
print(out.bin.glmm, dig = 2)
Binomial GLM
------------
# Define model
sink("bin_glm.txt")
cat("
model {
# Priors
for (i in 1:n.groups){
alpha[i] ~ dnorm(0, 0.0001) # Intercepts
beta[i] ~ dnorm(0, 0.0001) # Slopes
}
# Binomial likelihood
for (i in 1:n) {
C[i] ~ dbin(p[i], N[i])
logit(p[i]) <- alpha[pop[i]] + beta[pop[i]]* prec[i]
}
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(C = C, N = N, pop = as.numeric(pop), prec = prec, n.groups = n.groups, n = n)
# Inits function
inits <- function(){ list(alpha = rnorm(n.groups, 0, 2), beta = rnorm(n.groups, 1, 1))}
# Parameters to estimate
params <- c("alpha", "beta")
# Start Gibbs sampling
out.bin.glm <- bugs(win.data, inits, params, "bin_glm.txt", n.thin=nt, n.chains=nc, n.burnin=nb,
n.iter=ni, debug = TRUE)
print(out.bin.glm, dig = 2)
(b) Conversion Binomial GLM into a Poisson GLM
------------------------------------------------
# Define model
sink("Poisson_glm.txt")
cat("
model {
# Priors
for (i in 1:n.groups){
alpha[i] ~ dnorm(0, 0.0001) # Intercepts
beta[i] ~ dnorm(0, 0.0001) # Slopes
}
# Poisson likelihood
for (i in 1:n) {
C[i] ~ dpois(lambda[i])
log(lambda[i]) <- alpha[pop[i]] + beta[pop[i]]* prec[i]
# lambda[i] <- exp(alpha[pop[i]] + beta[pop[i]]* prec[i])
}
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(C = C, pop = as.numeric(pop), prec = prec, n.groups = n.groups, n = n)
# Inits function
inits <- function(){ list(alpha = rnorm(n.groups, 0, 2), beta = rnorm(n.groups, 1, 1))}
# Parameters to estimate
params <- c("alpha", "beta", "lambda")
# Start Gibbs sampling
out.Poisson.glm <- bugs(win.data, inits, params, "Poisson_glm.txt", n.thin=nt, n.chains=nc, n.burnin=nb, n.iter=ni, debug = TRUE)
print(out.Poisson.glm, dig = 2)
(c) Conversion Poisson GLM into a normal GLM (= LM)
---------------------------------------------------
# Define model
sink("lm.txt")
cat("
model {
# Priors
for (i in 1:n.groups){
alpha[i] ~ dnorm(0, 0.0001) # Intercepts
beta[i] ~ dnorm(0, 0.0001) # Slopes
}
tau <- 1/(sigma*sigma)
sigma ~ dunif(0, 100)
# Normal likelihood
for (i in 1:n) {
C[i] ~ dnorm(mu[i], tau)
mu[i] <- alpha[pop[i]] + beta[pop[i]]* prec[i]
}
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(C = C, pop = as.numeric(pop), prec = prec, n.groups = n.groups, n = n)
# Inits function
inits <- function(){ list(alpha = rnorm(n.groups, 0, 2), beta = rnorm(n.groups, 1, 1), sigma = rlnorm(1))}
# Parameters to estimate
params <- c("alpha", "beta", "sigma")
# Start Gibbs sampling
out.lm <- bugs(win.data, inits, params, "lm.txt", n.thin=nt, n.chains=nc, n.burnin=nb, n.iter=ni, debug = TRUE)
print(out.lm, dig = 2)
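For reference, the frequentist one-liners corresponding to the three conversions, shown here on simulated
stand-in data (the names C, N, pop, prec merely mimic the chapter 19 objects):

```r
set.seed(6)
n.groups <- 4; n <- 40
pop <- factor(rep(1:n.groups, each = n / n.groups))  # stand-in populations
prec <- runif(n)                                     # stand-in precipitation
N <- rep(20, n)                                      # binomial totals
C <- rbinom(n, N, plogis(1 - 2 * prec))              # simulated counts
fm.bin  <- glm(cbind(C, N - C) ~ pop * prec, family = binomial)  # binomial GLM
fm.pois <- glm(C ~ pop * prec, family = poisson)                 # Poisson GLM
fm.norm <- lm(C ~ pop * prec)                                    # normal LM
length(coef(fm.bin))  # 8: intercept, 3 pop effects, slope, 3 interaction terms
```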
1. Get predictions of expected breeding success
--------------------------------------------
Exercise:
For the woodchat shrike example, form predictions of the expected breeding success and overlay them on a
plot of the observed breeding success.
Solution:
Though this may be possible using lattice graphics, we try a simpler solution 'by hand' instead. We redraw
a grid of plots, one for each population. Since a 4 by 4 panel plot may appear too cramped, we can choose
other configurations via the par() command; for instance, mfrow = c(2,2) produces four 2 by 2 panel plots.
par(mfrow = c(4,4), mar = c(4, 4, 1, 2) + 0.1)
for(i in 1:16){
print(i)
realised.breeding.success <- C[((i-1)*10+1):(i*10)] / N[((i-1)*10+1):(i*10)]
precipitation <- prec[((i-1)*10+1):(i*10)]
plot(precipitation, realised.breeding.success, ylim = c(0,1), xlim = c(0,1), las = 1,
ylab = "Prop. successful", xlab = "Precipitation", cex = 1.5, col = "blue")
browser()
}
All we have to do in WinBUGS to obtain estimates of the expected breeding success is to monitor p!
# Define model
sink("glmm.txt")
cat("
model {
# Priors
for (i in 1:n.groups){
alpha[i] ~ dnorm(mu.int, tau.int) # Intercepts
beta[i] ~ dnorm(mu.beta, tau.beta) # Slopes
}
mu.int ~ dnorm(0, 0.001) # Hyperparameter for random intercepts
tau.int <- 1 / (sigma.int * sigma.int)
sigma.int ~ dunif(0, 10)
mu.beta ~ dnorm(0, 0.001) # Hyperparameter for random slopes
tau.beta <- 1 / (sigma.beta * sigma.beta)
sigma.beta ~ dunif(0, 10)
# Binomial likelihood
for (i in 1:n) {
C[i] ~ dbin(p[i], N[i])
logit(p[i]) <- alpha[pop[i]] + beta[pop[i]]* prec[i]
}
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(C = C, N = N, pop = as.numeric(pop), prec = prec, n.groups = n.groups, n = n)
# Inits function
inits <- function(){ list(alpha = rnorm(n.groups, 0, 2), beta = rnorm(n.groups, 1, 1), mu.int = rnorm(1, 0, 1), mu.beta = rnorm(1, 0, 1))}
# Parameters to estimate
params <- c("alpha", "beta", "mu.int", "sigma.int", "mu.beta", "sigma.beta", "p")
# MCMC settings
ni <- 2000
nb <- 500
nt <- 2
nc <- 3
# Start Gibbs sampling
out <- bugs(win.data, inits, params, "glmm.txt", n.thin=nt, n.chains=nc, n.burnin=nb, n.iter=ni, debug = TRUE)
print(out, dig = 3)
We now adapt the graphics code to also draw the lines representing the expected breeding success. As a
graphical trick we will order all three vectors in this graph by ascending values of precipitation.
par(mfrow = c(4,4), mar = c(4, 4, 1, 2) + 0.1)
for(i in 1:16){
print(i)
# Get values for each population
precipitation <- prec[((i-1)*10+1):(i*10)]
realised.breeding.success <- C[((i-1)*10+1):(i*10)] / N[((i-1)*10+1):(i*10)]
expected.breeding.success <- out$mean$p[((i-1)*10+1):(i*10)]
# Determine order of prec
ooo <- order(precipitation)
# Plot stuff
plot(precipitation[ooo], realised.breeding.success[ooo], ylim = c(0,1), xlim = c(0,1), las = 1,
ylab = "Prop. successful", xlab = "Precipitation", cex = 1.5, col = "blue")
lines(precipitation[ooo], expected.breeding.success[ooo], col = "red", lwd = 3)
browser()
}
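The ordering step matters because lines() connects the points in the order supplied; without it, the red
line would zig-zag back and forth across the panel. In miniature:

```r
x <- c(0.7, 0.1, 0.4)
ooo <- order(x)                         # indices that sort x ascending
stopifnot(identical(x[ooo], sort(x)))   # x[ooo] is x in ascending order
# Applying the same ooo to y keeps the (x, y) pairs aligned for lines()
```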
2. Fixed and random (shrinkage)
----------------------------
Exercise:
Refit the model by dropping the distributional assumptions about the intercepts and slopes of the relationship
between breeding success and precipitation, i.e., fit the fixed-effects counterpart of the model in this
chapter, and compare the estimates.
Solution:
We first save the output of the GLMM under a different name.
out.glmm <- out
Then we refit the fixed-effects counterpart of the model.
# Define model
sink("glm.txt")
cat("
model {
# Priors
for (i in 1:n.groups){
alpha[i] ~ dnorm(0, 0.001) # Intercepts
beta[i] ~ dnorm(0, 0.001) # Slopes
}
# Binomial likelihood
for (i in 1:n) {
C[i] ~ dbin(p[i], N[i])
logit(p[i]) <- alpha[pop[i]] + beta[pop[i]]* prec[i]
}
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(C = C, N = N, pop = as.numeric(pop), prec = prec, n.groups = n.groups, n = n)
# Inits function
inits <- function(){ list(alpha = rnorm(n.groups), beta = rnorm(n.groups))}
# Parameters to estimate
params <- c("alpha", "beta")
# MCMC settings
ni <- 2000
nb <- 500
nt <- 2
nc <- 3
# Start Gibbs sampling
out.glm <- bugs(win.data, inits, params, "glm.txt", n.thin=nt, n.chains=nc, n.burnin=nb, n.iter=ni, debug = TRUE)
print(out.glm, dig = 3)
Now we plot both sets of estimates in the same graph and inspect them first in a table.
alpha.fix <- out.glm$mean$alpha
beta.fix <- out.glm$mean$beta
alpha.random <- out.glmm$mean$alpha
beta.random <- out.glmm$mean$beta
cbind(alpha.fix, alpha.random, beta.fix, beta.random)
plot(alpha.fix)
points(alpha.random)
par(mfrow = c(2,1), mar = c(4, 4, 1, 2) + 0.1)
plot(alpha.fix, ylim = c(-3,3), las = 1, ylab = "Intercept estimates", xlab = "Population", cex = 1.5,
col = "blue")
abline(h = mean(alpha.fix), col = "blue", lwd = 2)
points(alpha.random, col = "red", lwd = 2, cex = 1.5)
segments(1:16, alpha.fix, 1:16, alpha.random, lwd = 2, col = "green")
plot(beta.fix, ylim = c(-5,1), las = 1, ylab = "Slope estimates", xlab = "Population", cex = 1.5,
col = "blue")
abline(h = mean(beta.fix), col = "blue", lwd = 2)
points(beta.random, col = "red", lwd = 2, cex = 1.5)
segments(1:16, beta.fix, 1:16, beta.random, lwd = 2, col = "green")
We clearly see that, on average, under the random-effects model, more extreme group means (population means),
or those estimated with less precision, are pulled in more towards the overall mean. Here, the blue line
shows the mean of the fixed-effects estimates.
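The amount of pulling-in can be made explicit. Under a normal approximation, the random-effects estimate
for a group is roughly a precision-weighted average of its fixed-effects estimate and the overall mean,
with weight tau^2 / (tau^2 + se^2) on the group, so imprecise groups shrink more. A sketch with
hypothetical numbers (not the shrike data):

```r
grand.mean <- 0.5   # overall mean of the random effects (hypothetical)
tau2 <- 0.4^2       # between-group variance (hypothetical)
group.est <- 2.0    # fixed-effects estimate for one group (hypothetical)
se2 <- 0.8^2        # its squared standard error (hypothetical)
w <- tau2 / (tau2 + se2)                        # weight on the group estimate
shrunk <- w * group.est + (1 - w) * grand.mean  # shrunken estimate
w        # 0.2: imprecise group, strong shrinkage
shrunk   # 0.8: pulled most of the way toward the overall mean
```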
###############################################################
# #
# Chapter 20: Site-occ model #
# #
###############################################################
1. Prior sensitivity
-----------------
Exercise:
Compare the uniform and the normal priors in terms of the posterior means of the primary model params.
Solution:
As usual, I assume that the generated data are still in your R workspace. We then fit the model using
first the uniform(-10, 10) priors and then the flat normal priors.
# Define model
sink("model1.txt")
cat("
model {
# Priors
alpha.occ ~ dunif(-10, 10) # Set A of priors
beta.occ ~ dunif(-10, 10)
alpha.p ~ dunif(-10, 10)
beta.p ~ dunif(-10, 10)
# Likelihood
for (i in 1:R) { #start initial loop over the R sites
# True state model for the partially observed true state
z[i] ~ dbern(psi[i]) # True occupancy z at site i
logit(psi[i]) <- alpha.occ + beta.occ * humidity[i]
for (j in 1:T) { # start a second loop over the T replicates
# Observation model for the actual observations
y[i,j] ~ dbern(eff.p[i,j]) # Detection-nondetection at i and j
eff.p[i,j] <- z[i] * p[i,j]
logit(p[i,j]) <- alpha.p + beta.p * humidity[i]
}
}
# Derived quantities
occ.fs <- sum(z[])
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(y = y, humidity = humidity, R = dim(y)[1], T = dim(y)[2])
# Inits function
zst <- apply(y, 1, max) # Good starting values for the latent states are essential!
inits <- function(){list(z = zst, alpha.occ=runif(1, -5, 5), beta.occ = runif(1, -5, 5),
alpha.p = runif(1, -5, 5), beta.p = runif(1, -5, 5))}
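A note on why zst = apply(y, 1, max) makes good inits for z: a site with at least one detection must be
occupied, and starting a chain at z[i] = 0 for such a site puts the sampler at a state of zero likelihood,
which makes WinBUGS crash. A quick consistency check on hypothetical data:

```r
set.seed(7)
y <- matrix(rbinom(15, 1, 0.3), nrow = 5)  # hypothetical 5 sites x 3 surveys
zst <- apply(y, 1, max)                    # 1 wherever a site had any detection
# No y[i,j] = 1 at a site with zst[i] = 0
# (zst recycles over the survey columns, matching sites by row)
stopifnot(all(y <= zst))
```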
# Parameters to estimate
params <- c("alpha.occ","beta.occ", "alpha.p", "beta.p", "occ.fs")
# MCMC settings
nc <- 3
nb <- 2000
ni <- 12000
nt <- 5
# Start Gibbs sampler
out1 <- bugs(win.data, inits, params, "model1.txt", n.chains=nc, n.iter=ni, n.burnin = nb,
n.thin=nt, debug = TRUE)
print(out1, dig = 3)
Next, we fit the model using the flat normal priors. (If you get the dreaded Trap 66 (postcondition violation),
try to stabilize the logit; see the book appendix. Or just start the Markov chains again and hope that
the trap does not recur.)
# Define model
sink("model2.txt")
cat("
model {
# Priors
alpha.occ ~ dnorm(0, 0.01) # Set B of priors
beta.occ ~ dnorm(0, 0.01)
alpha.p ~ dnorm(0, 0.01)
beta.p ~ dnorm(0, 0.01)
# Likelihood
for (i in 1:R) { #start initial loop over the R sites
# True state model for the partially observed true state
z[i] ~ dbern(psi[i]) # True occupancy z at site i
logit(psi[i]) <- alpha.occ + beta.occ * humidity[i]
for (j in 1:T) { # start a second loop over the T replicates
# Observation model for the actual observations
y[i,j] ~ dbern(eff.p[i,j]) # Detection-nondetection at i and j
eff.p[i,j] <- z[i] * p[i,j]
logit(p[i,j]) <- alpha.p + beta.p * humidity[i]
}
}
# Derived quantities
occ.fs <- sum(z[])
}
",fill=TRUE)
sink()
# Start Gibbs sampler
out2 <- bugs(win.data, inits, params, "model2.txt", n.chains=nc, n.iter=ni, n.burnin = nb,
n.thin=nt, debug = TRUE)
Now we compare the estimates under both prior specifications.
print(out1, dig = 3)
print(out2, dig = 3)
We conclude that there is hardly any sensitivity to these choices of priors.
And finally, we check whether the uniform(-10, 10) priors 'squeezed in' the marginal posterior distributions
of the primary parameters, by plotting them.
par(mfrow = c(2,2))
hist(out1$sims.list$alpha.occ, breaks = 50, col = "grey", main = "alpha.occ", xlab = "", ylab = "",
xlim = c(-10, 10))
abline(v = c(-10, 10), col = "red")
hist(out1$sims.list$beta.occ, breaks = 50, col = "grey", main = "beta.occ", xlab = "", ylab = "",
xlim = c(-10, 10))
abline(v = c(-10, 10), col = "red")
hist(out1$sims.list$alpha.p, breaks = 50, col = "grey", main = "alpha.p", xlab = "", ylab = "",
xlim = c(-10, 10))
abline(v = c(-10, 10), col = "red")
hist(out1$sims.list$beta.p, breaks = 50, col = "grey", main = "beta.p", xlab = "", ylab = "",
xlim = c(-10, 10))
abline(v = c(-10, 10), col = "red")
From looking at the range over which each marginal posterior distribution has support, it appears obvious
that the chosen bounds of the uniform priors for the primary model parameters were sufficiently wide so
as to be uninformative.
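As a numeric complement to the plots (a sketch, assuming the out1 object from the analysis above), we can
compute the fraction of posterior draws lying within, say, 0.5 of the prior bounds; values near zero confirm
that the bounds are not truncating the posteriors.

```r
# Fraction of posterior draws within 0.5 of the uniform(-10, 10) bounds;
# values near 0 mean the priors are effectively uninformative
sapply(out1$sims.list[c("alpha.occ", "beta.occ", "alpha.p", "beta.p")],
       function(x) mean(abs(x) > 9.5))
```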
2. Site and survey covariates
--------------------------
Exercise:
Usually, in the analysis of a site-occ model, you will have covariates that vary by site only (site covariates),
but also covariates that vary by site and individual survey (here called survey covariates). You then need to
know how to specify such an analysis.
As an extra, we see what happens when there are missing values in the covariates (i.e., when the design
is not balanced, that is, when not all sites were surveyed an equal number of times).
Solution:
Survey covariates have the same dimension as the response array, i.e., the matrix containing the
detection-nondetection data y.
We generate a survey covariate and fill it with random numbers. Then, we randomly turn 50 values into NA's.
X <- matrix(round(runif(450), dig = 2), ncol = 3)
NA.data <- sort(sample((1:450), size = 50))
X[NA.data] <- NA
# Define model
sink("model.txt")
cat("
model {
# Priors
alpha.occ ~ dunif(-10, 10) # Set A of priors
beta.occ ~ dunif(-10, 10)
alpha.p ~ dunif(-10, 10)
beta.p ~ dunif(-10, 10)
beta.X ~ dunif(-10, 10)
# Likelihood
for (i in 1:R) {
# True state model for the partially observed true state
z[i] ~ dbern(psi[i]) # True occupancy z at site i
logit(psi[i]) <- alpha.occ + beta.occ * humidity[i]
for (j in 1:T) {
# Observation model for the actual observations
y[i,j] ~ dbern(eff.p[i,j]) # Detection-nondetection at i and j
eff.p[i,j] <- z[i] * p[i,j]
logit(p[i,j]) <- alpha.p + beta.p * humidity[i] + beta.X * X[i,j]
}
}
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(y = y, humidity = humidity, R = dim(y)[1], T = dim(y)[2], X = X)
# Inits function
zst <- apply(y, 1, max)
inits <- function(){list(z = zst, alpha.occ=runif(1, -5, 5), beta.occ = runif(1, -5, 5),
alpha.p = runif(1, -5, 5), beta.p = runif(1, -5, 5))}
# Parameters to estimate
params <- c("alpha.occ","beta.occ", "alpha.p", "beta.p", "beta.X")
# MCMC settings
nc <- 3
nb <- 200
ni <- 1200
nt <- 5
# Start Gibbs sampler
out <- bugs(win.data, inits, params, "model.txt", n.chains=nc, n.iter=ni, n.burn = nb, n.thin=nt,
debug = TRUE)
Executing this code, WinBUGS will crash, saying 'made use of undefined node X'. The reason for this is that
we have missing values in an explanatory variable; they must be dealt with either by specifying a model
for them (e.g., defining a prior for X in our case) or by eliminating them before we start the analysis.
The simplest way to get rid of the missing covariate values is to 'mean-impute' them, i.e., to replace
them with the mean of the observed covariate values. Unless the number of missing values is very large,
this should not affect the results much.
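The first option, placing a prior on the missing covariate values, needs only one extra block in the model
file (a sketch; since X was generated as uniform(0, 1), a matching prior is a natural choice — observed X
values are passed in as data, and only the NA entries are then estimated by WinBUGS):

```
for (i in 1:R){
   for (j in 1:T){
      X[i,j] ~ dunif(0, 1)    # prior for the partially observed covariate
   }
}
```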
So here we first mean-impute the missing X's.
X[is.na(X)] <- mean(X, na.rm = TRUE)
Then, we repeat the analysis. We need not execute all of the above R code:
# Bundle data
win.data <- list(y = y, humidity = humidity, R = dim(y)[1], T = dim(y)[2], X = X)
# Start Gibbs sampler
out <- bugs(win.data, inits, params, "model.txt", n.chains=nc, n.iter=ni, n.burn = nb, n.thin=nt,
debug = TRUE)
print(out, dig = 3)
Note that missing values in the response, i.e., in the array containing the detection-nondetection data,
are no problem at all: WinBUGS simply treats them as additional unknowns and estimates them.
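For instance (a minimal sketch with made-up data), when some y values are NA, the only thing to watch is
the inits for the latent states z: use na.rm = TRUE, and give sites with no data at all an arbitrary init of 1.

```r
y.demo <- matrix(rbinom(30, 1, 0.5), ncol = 3)   # made-up detection-nondetection data
y.demo[sample(length(y.demo), 5)] <- NA          # knock out some observations
zst.demo <- apply(y.demo, 1, max, na.rm = TRUE)  # good inits despite the NA's
zst.demo[zst.demo == -Inf] <- 1                  # rows that are all NA get z init of 1
```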
3. Occurrence of 'large populations' (Swiss hare data)
---------------------------------------------------
Exercise:
Collapse the hare counts of one particular year (e.g., 2000) to binary data where a 1 indicates the
observation of a 'large population' (say, a count greater or equal 10). Estimate the proportion of sites
inhabited by a 'large population', i.e., one that is capable of producing a count greater or equal 10. Also
fit a site covariate into the occupancy parameter (psi).
Solution:
We extract the count and elevation data from 2000 and then turn the counts into an indicator variable for
counts greater than 9.
attach(hares)
counts2000 <- cbind(count1[year == 2000], count2[year == 2000])
counts2000[counts2000 < 10] <- 0
counts2000[counts2000 > 9] <- 1
y <- counts2000
We directly fit a site-occupancy model with elevation as a covariate on the probability that a site harbours
a 'large population'. For this, we also extract the elevation data.
ELEV <- elevation[year == 2000]
We standardise the covariate and then mean-impute the missing elevations, which on the standardised scale simply means setting them to 0.
ELEV <- as.numeric(scale(ELEV))
ELEV[is.na(ELEV)] <- 0
# Define model
sink("model.txt")
cat("
model {
# Priors
alpha.occ ~ dunif(-10, 10)
beta.occ ~ dunif(-10, 10)
p ~ dunif(0, 1)
# Likelihood
for (i in 1:R) { #start initial loop over the R sites
# True state model for the partially observed true state
z[i] ~ dbern(psi[i]) # True occupancy z at site i
logit(psi[i]) <- alpha.occ + beta.occ * ELEV[i]
for (j in 1:T) { # start a second loop over the T replicates
# Observation model for the actual observations
y[i,j] ~ dbern(eff.p[i,j]) # Detection-nondetection at i and j
eff.p[i,j] <- z[i] * p
}
}
# Derived quantities
occ.fs <- sum(z[]) # Number of 'large populations' among 56
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(y = y, ELEV = ELEV, R = dim(y)[1], T = dim(y)[2])
# Inits function
zst <- apply(y, 1, max) #Good starting values for latent states essential !
zst[is.na(zst)] <- 1
inits <- function(){list(z = zst, alpha.occ=runif(1, -5, 5), beta.occ = runif(1, -5, 5))}
# Parameters to estimate
params <- c("alpha.occ","beta.occ", "p", "occ.fs")
# MCMC settings
nc <- 3
nb <- 2000
ni <- 12000
nt <- 5
# Start Gibbs sampler
out <- bugs(win.data, inits, params, "model.txt", n.chains=nc, n.iter=ni, n.burn = nb, n.thin=nt, debug = TRUE)
print(out, dig = 3)
> print(out, dig = 3)
Inference for Bugs model at "model.txt", fit using WinBUGS,
3 chains, each with 12000 iterations (first 2000 discarded), n.thin = 5
n.sims = 6000 iterations saved
mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
alpha.occ 1.213 0.390 0.523 0.953 1.188 1.442 2.019 1.001 4000
beta.occ 0.024 0.438 -0.735 -0.265 -0.005 0.274 0.987 1.002 1400
p 0.901 0.044 0.798 0.875 0.907 0.933 0.967 1.001 5200
occ.fs 42.388 1.806 39.000 41.000 42.000 43.000 46.000 1.001 6000
deviance 38.073 8.834 29.670 30.500 35.380 42.232 60.800 1.002 1700
For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
DIC info (using the rule, pD = var(deviance)/2)
pD = 39.0 and DIC = 77.1
DIC is an estimate of expected predictive error (lower deviance is better).
Hence, we estimate that about 42 sites are occupied by populations that can produce counts greater than 9.
The detection probability is estimated at about 90%. This means that a 'large population' will produce a count
of 9 or fewer about 10% of the time. There does not seem to be a relationship between the occurrence of
'large populations' (as defined here) and the elevation of a site.
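To spell out that last probability statement (a sketch, using the posterior mean of p from the table above):

```r
p.hat <- 0.901     # posterior mean detection probability from the table above
1 - p.hat          # a 'large population' yields a count < 10 on a single survey about 10% of the time
(1 - p.hat)^2      # ... and is missed on both surveys only about 1% of the time
```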
4. Simulation study
----------------
This is a bigger project that may well be part of a thesis.
###############################################################
# #
# Chapter 21: N-mix model #
# #
###############################################################
1. Survey covariates
-----------------
Exercise:
Generate a variation of the lizard data, wherein you build in a negative effect of the temperature during a
survey on the detection probability of an individual lizard. Then analyse these data using an N-mix model.
We assume that surveys are conducted only during favourable temperatures (e.g., between 15 and 25 degrees)
and that lizards expose themselves more during colder weather. During surveys on hot days, they simply
stay concealed in the vegetation.
Solution:
We will analyse the data in the horizontal format, i.e., retain the matrix format of the data.
R <- 200 # Number of surveyed sites
T <- 3 # Number of replicate counts at each site
vege <- sort(runif(n = R, min = -1.5, max =1.5))
alpha.lam <- 2 # Intercept
beta1.lam <- 2 # Linear effect of vegetation
beta2.lam <- -2 # Quadratic effect of vegetation
lam <- exp(alpha.lam + beta1.lam * vege + beta2.lam * (vege^2))
N <- rpois(n = R, lambda = lam)
table(N) # Distribution of abundances across sites
sum(N > 0) / R # Empirical occupancy probability
alpha.p <- 1 # Intercept
beta1.p <- -4 # Linear effect of vegetation
beta2.p <- -2 # Linear effect of temperature
y <- p <- array(NA, dim=(c(R, T))) # Create arrays for y and p
temp <- array(runif(R*T, 15, 25), dim = c(R, T)) # Generate the temperature survey covariate
mean.temp <- mean(temp) # Then we standardise the covariate
sd.temp <- sd(temp)
TEMP <- (temp - mean.temp) / sd.temp
for(j in 1:T){
p[,j] <- plogis(alpha.p + beta1.p * vege + beta2.p * TEMP[,j])
y[,j] <- rbinom(n = R, size = N, prob = p[,j])
}
y
sum(apply(y, 1, sum) > 0) # Apparent number of occupied sites
sum(N > 0) # True number of occupied sites
Now the analysis using WinBUGS, and without stacking the data on top of each other. That is, we analyse the model
in the same array format that we saw in the site-occ examples.
# Define model
sink("Nmix.txt")
cat("
model {
# Priors
alpha.lam ~ dunif(-10, 10)
beta1.lam ~ dunif(-10, 10)
beta2.lam ~ dunif(-10, 10)
alpha.p ~ dunif(-10, 10)
beta1.p ~ dunif(-10, 10)
beta2.p ~ dunif(-10, 10)
# Likelihood
# Biological model for true abundance
for (i in 1:R) { # Loop over R sites
N[i] ~ dpois(lambda[i])
log(lambda[i]) <- alpha.lam + beta1.lam * vege[i] + beta2.lam * vege2[i]
# Observation model for replicated counts
for (j in 1:T) { # Loop over all n observations
y[i,j] ~ dbin(p[i,j], N[i])
logit(p[i,j]) <- alpha.p + beta1.p * vege[i] + beta2.p * TEMP[i,j]
}
}
# Derived quantities
totalN <- sum(N[]) # Estimate total population size across all sites
}
",fill=TRUE)
sink()
# Bundle data
vege2 <- vege * vege
win.data <- list(R = R, T = T, y = y, vege = vege, vege2 = vege2, TEMP = TEMP)
# Inits function
Nst <- apply(y, 1, max) + 1
inits <- function(){list(N = Nst, alpha.lam=rnorm(1, 0, 1), beta1.lam=rnorm(1, 0, 1),
beta2.lam=rnorm(1, 0, 1), alpha.p=rnorm(1, 0, 1), beta1.p=rnorm(1, 0, 1), beta2.p=rnorm(1, 0, 1))}
# Parameters to estimate
params <- c("N", "totalN", "alpha.lam", "beta1.lam", "beta2.lam", "alpha.p", "beta1.p", "beta2.p")
# MCMC settings
nc <- 3
ni <- 1200
nb <- 200
nt <- 5
# Start Gibbs sampler
out <- bugs(win.data, inits, params, "Nmix.txt", n.chains=nc, n.iter=ni, n.burn = nb,
n.thin=nt, debug = TRUE)
print(out, dig = 3)
> print(out, dig = 3)
Inference for Bugs model at "Nmix.txt", fit using WinBUGS,
3 chains, each with 1200 iterations (first 200 discarded), n.thin = 5
n.sims = 600 iterations saved
mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
N[1] 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 1
N[2] 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 1
N[3] 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 1
N[4] 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 1
N[5] 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 1
[ ... ]
N[199] 2.160 1.737 0.000 1.000 2.000 3.000 6.000 1.004 600
N[200] 2.108 1.816 0.000 1.000 2.000 3.000 7.000 1.002 530
totalN 894.682 89.817 769.000 829.000 877.000 939.250 1120.050 1.015 120
alpha.lam 1.948 0.058 1.828 1.908 1.951 1.987 2.053 1.004 510
beta1.lam 1.931 0.140 1.654 1.836 1.941 2.027 2.185 1.001 600
beta2.lam -1.871 0.197 -2.240 -2.008 -1.882 -1.733 -1.501 1.020 170
alpha.p 0.961 0.128 0.724 0.876 0.961 1.053 1.209 1.007 250
beta1.p -4.016 0.281 -4.596 -4.207 -4.011 -3.829 -3.479 0.999 600
beta2.p -2.112 0.131 -2.368 -2.190 -2.111 -2.019 -1.855 1.020 96
deviance 659.568 15.873 630.092 647.875 659.250 671.100 691.112 1.011 200
We compare that with the values we used to fake our data:
alpha.lam ; beta1.lam ; beta2.lam ; alpha.p ; beta1.p ; beta2.p
> alpha.lam ; beta1.lam ; beta2.lam ; alpha.p ; beta1.p ; beta2.p
[1] 2
[1] 2
[1] -2
[1] 1
[1] -4
[1] -2
And we are quite happy with these estimates -- and with this great model in general; thanks, Andy ;)
2. Prior sensitivity
-----------------
Exercise:
Play around with different priors and see whether and how this affects the estimates. This is not shown.
3. Hare abundance in 56 Swiss study sites in 2000
----------------------------------------------
Exercise:
Fit the binomial mixture model to the hare data from a single year (e.g., 2000) and see whether there is a
difference in the probability of detection in grassland and arable areas. By what proportion will mean or
max counts underestimate true population size ?
We do three things in addition to what is asked for in the book:
1. We also fit an offset to account for variable size of the study sites.
2. Since we strongly suspect density to be different in grassland and arable sites, we also fit that effect
into the abundance part of the model.
3. We compute a derived quantity that tests whether p is different in arable and grassland sites.
This example will then illustrate nicely one of the great advantages of the N-mix model: namely, that the
effects of a single covariate on both abundance and on detection can be teased apart.
Solution:
attach(hares)
y <- cbind(count1[year == 2000], count2[year == 2000]) # These are the counts
LU <- as.numeric(landuse[year == 2000])
A <- as.numeric(area[year == 2000])
# Define model
sink("Nmix.txt")
cat("
model {
# Priors
for(j in 1:2){
alpha.lam[j] ~ dunif(-10, 10)
alpha.p[j] ~ dunif(-10, 10)
}
# Likelihood
# Biological model for true abundance
for (i in 1:R) { # Loop over R sites
N[i] ~ dpois(lambda[i])
log(lambda[i]) <- log(A[i]) + alpha.lam[LU[i]]
# Observation model for replicated counts
for (j in 1:T) { # Loop over all n observations
y[i,j] ~ dbin(p[i,j], N[i])
logit(p[i,j]) <- alpha.p[LU[i]]
}
}
# Derived quantities
totalN <- sum(N[]) # Estimate total population size across all sites
for(j in 1:2){
mean.density[j] <- exp(alpha.lam[j])
mean.p[j] <- exp(alpha.p[j])/(1+exp(alpha.p[j]))
}
diff.p <- mean.p[1] - mean.p[2]
}
",fill=TRUE)
sink()
# Bundle data
win.data <- list(R = dim(y)[1], T = dim(y)[2], y = y, LU = LU, A = A)
# Inits function
Nst <- apply(y, 1, max, na.rm = TRUE)+1
Nst[Nst == -Inf] <- 2
inits <- function(){list(N = Nst, alpha.lam=rnorm(2, 0, 1), alpha.p=rnorm(2, 0, 1))}
# Parameters to estimate
params <- c("totalN", "alpha.lam", "alpha.p", "mean.density", "mean.p", "diff.p")
# MCMC settings
nc <- 3
ni <- 12000
nb <- 2000
nt <- 10
# Start Gibbs sampler
out <- bugs(win.data, inits, params, "Nmix.txt", n.chains=nc, n.iter=ni, n.burn = nb,
n.thin=nt, debug = TRUE)
print(out, dig = 3)
> print(out, dig = 3)
Inference for Bugs model at "Nmix.txt", fit using WinBUGS,
3 chains, each with 12000 iterations (first 2000 discarded), n.thin = 10
n.sims = 3000 iterations saved
mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
totalN 2351.642 147.300 2112.000 2247.000 2332.000 2444.000 2678.025 1.016 140
alpha.lam[1] 2.070 0.076 1.933 2.016 2.064 2.121 2.227 1.017 150
alpha.lam[2] 1.240 0.078 1.097 1.186 1.237 1.290 1.401 1.004 2700
alpha.p[1] 0.651 0.207 0.249 0.505 0.655 0.786 1.054 1.011 260
alpha.p[2] 1.378 0.277 0.813 1.198 1.386 1.573 1.906 1.008 910
mean.density[1] 7.951 0.615 6.912 7.507 7.877 8.344 9.268 1.016 160
mean.density[2] 3.466 0.273 2.997 3.273 3.445 3.634 4.058 1.004 3000
mean.p[1] 0.656 0.046 0.562 0.624 0.658 0.687 0.742 1.014 160
mean.p[2] 0.795 0.046 0.693 0.768 0.800 0.828 0.871 1.006 770
diff.p -0.139 0.065 -0.263 -0.182 -0.142 -0.098 -0.006 1.006 420
deviance 596.245 29.338 542.497 575.700 594.300 615.400 657.410 1.008 250
Hence, not only density but also the detection probability of individual hares differed by landuse type
in the 56 Swiss study sites with predominantly arable or grassland use. The detection probability in grassland
sites in 2000 was estimated at about 80%, while that in arable sites was about 66%. This difference was
'significant' as judged by the 95% credible interval for the difference, which does not cover 0.
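Equivalently (a sketch, assuming the out object from the analysis above), one can compute the posterior
probability that detection is lower in arable than in grassland sites directly from the saved draws:

```r
# Posterior probability that p(arable) < p(grassland)
mean(out$sims.list$diff.p < 0)
```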
The total number of hares associated with all 56 study sites in 2000 over the 2-week survey period was
estimated at 2352 hares. Let's compare this with two naive estimators of population size, the sum
(over sites) of the mean or the max observed counts:
mean.counts <- apply(y, 1, mean, na.rm = TRUE)
mean.counts[is.nan(mean.counts)] <- NA
max.counts <- apply(y, 1, max, na.rm = TRUE)
max.counts[max.counts == -Inf] <- NA
sum.mean <- sum(mean.counts, na.rm = TRUE)
sum.max <- sum(max.counts, na.rm = TRUE)
We note that no counts were available in 2000 for 7 of the 56 sites. The N-mix model can elegantly estimate
abundance also for these sites, but the naive estimators cannot and will therefore be biased low. As a quick
and dirty correction, we inflate the two naive estimators of total abundance proportionally, i.e., by a factor of (1 + 7/56).
sum.mean <- (1 + 7/56) * sum.mean
sum.max <- (1 + 7/56) * sum.max
sum.mean
sum.max
> sum.mean
[1] 1549.125
> sum.max
[1] 1647
sum.mean / out$mean$totalN
sum.max / out$mean$totalN
> sum.mean / out$mean$totalN
[1] 0.6630308
> sum.max / out$mean$totalN
[1] 0.7049216
Under the N-mix model, we estimate a total abundance of 2352 hares. Hence, the two naive estimators of abundance
underestimate true abundance by about one third. (Interestingly, it does not make a big difference whether
one takes the mean count or the max count!)
4. Simulation exercise
-------------------
This is a bigger project that may well be part of a thesis.