Online companion for
Marc Kéry (2010) Introduction to WinBUGS for Ecologists. Academic Press, Burlington.
A creed for models, algebra and WinBUGS
To make sense of an observation, everybody needs a model ...
whether he or she knows it or not.
It is difficult to imagine another method
that so effectively fosters clear thinking about a system
than the use of a model written in the language of algebra.
One of the most transparent ways
of building a model
is by describing it in the BUGS language.
Book description
This book is a very gentle introduction for ecologists to Bayesian analysis
using WinBUGS. It covers the linear model and its extensions to the
generalised linear model (GLM) and to the linear and generalised linear mixed models
by way of extensive and fully documented examples with all code shown. All
data sets are simulated in program R; simulation allows a much greater insight
into data and their analysis than the use of real data sets where truth is
unknown. I believe that when you are able to simulate a data set for a
particular model, then you understand that model. Indeed, one of the nicest
things that somebody has said to me about my book was this: "Your book made me
finally understand the linear model and the GLM."
You can get your copy here:
www.amazon.com
For more information about the book, you can read its Preface
and two sample chapters, chapter 4 and
chapter 15.
Website description
This website contains additional information on the book, R and WinBUGS code
for all examples in the book as well as solutions to the exercises and bonus
material. Later, there will also be a list of errors.
Table of contents
Below is a concise table of contents of the book. You can find the complete
TOC here: Table of Contents
- Foreword by Jim Nichols
- Preface
- Acknowledgements
- Introduction
- Introduction to the Bayesian analysis of a statistical model
- WinBUGS
- A first session in WinBUGS: The "model of the mean"
- Running WinBUGS from R via R2WinBUGS
- Key components of (generalised) linear models: Statistical distributions and the linear predictor
- t-Test: Equal and unequal variances
- Normal linear regression
- Normal one-way ANOVA
- Normal two-way ANOVA
- General linear model (ANCOVA)
- Linear mixed-effects model
- Introduction to the Generalised linear model (GLM): Poisson t-test
- Overdispersion and offsets in the GLM
- Poisson ANCOVA
- Poisson mixed-effects model (Poisson GLMM)
- Binomial t-test
- Binomial ANCOVA
- Binomial mixed-effects model (Binomial GLMM)
- Non-standard GLMMs 1: Site-occupancy species distribution model
- Non-standard GLMMs 2: Binomial mixture model for the modeling of abundance
- Conclusions
Why are there no seeds in the simulation of data sets?
I am a big fan of the use of simulated data sets. Two reasons are the
following (for more see the book intro): first, they enable one to check a
solution against known truth and, second, the act of simulating the data
fosters a much better understanding of the model that is used to analyze a
data set afterwards. Because the simulation code draws random numbers, every
reader will generate a different data set, and you will get a different data
set each time you execute the code. Consequently, in a sense, there is
no single 'correct' answer from the ensuing analysis either.
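The idea of checking a solution against known truth can be sketched in a few lines. The book does this in R and WinBUGS; the example below is only a Python stand-in that uses ordinary least squares instead of a Bayesian analysis, and the function names, parameter values and sample size are all hypothetical choices made for illustration.

```python
import random

def simulate_regression(n, alpha, beta, sigma, rng):
    """Simulate n (x, y) pairs from y = alpha + beta * x + Normal(0, sigma) noise."""
    x = [rng.uniform(0, 10) for _ in range(n)]
    y = [alpha + beta * xi + rng.gauss(0, sigma) for xi in x]
    return x, y

def least_squares(x, y):
    """Return the ordinary least-squares (intercept, slope) estimates."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical "truth" that the analysis should recover.
TRUE_ALPHA, TRUE_BETA, SIGMA = 2.0, 0.5, 1.0
rng = random.Random()  # deliberately unseeded: every run yields new data sets

# Three realizations of the same stochastic process, each analyzed separately.
estimates = []
for _ in range(3):
    x, y = simulate_regression(100, TRUE_ALPHA, TRUE_BETA, SIGMA, rng)
    estimates.append(least_squares(x, y))

for a_hat, b_hat in estimates:
    print(f"intercept = {a_hat:.2f} (truth {TRUE_ALPHA}), "
          f"slope = {b_hat:.2f} (truth {TRUE_BETA})")
```

Each run prints different estimates, but they should all lie close to the values used to generate the data, which is exactly the comparison that real data sets do not allow.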
Some people have suggested to me that I could have used 'seeds' in the
random-number generators, e.g., the R function rnorm(). This would have
guaranteed that everybody executing the simulation code always ends up with
the same data set, i.e., with the same realization of the random process that
we imagine has generated our data. Similarly, up to the chance element
inherent in Markov chain Monte Carlo, everybody would have gotten more similar
results in the BUGS analysis as well. This might be comforting to some
readers, because you could be even more sure that your solution is indeed the
'correct' one when it matches the one in the book. I could have achieved
the same goal by providing on this website the data sets that I used for
producing the results in the book. Indeed, I have saved them and originally
had intended to make them available to you.
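For readers unfamiliar with seeds, here is a minimal sketch of what they do. It is written in Python rather than R (in R, the analogous step would be a call to set.seed() before rnorm()), and the function name, seed value and parameter values are hypothetical.

```python
import random

def simulate_data(n, mean, sd, seed=None):
    """Draw n values from Normal(mean, sd); a seed fixes the realization."""
    rng = random.Random(seed)  # seed=None means: seed from system entropy
    return [rng.gauss(mean, sd) for _ in range(n)]

# Same seed: the same realization of the random process every time.
seeded_a = simulate_data(5, 0.0, 1.0, seed=42)
seeded_b = simulate_data(5, 0.0, 1.0, seed=42)

# No seed: a new realization on every call.
unseeded_a = simulate_data(5, 0.0, 1.0)
unseeded_b = simulate_data(5, 0.0, 1.0)

print("seeded data sets identical:  ", seeded_a == seeded_b)
print("unseeded data sets identical:", unseeded_a == unseeded_b)
```

The seeded calls return identical data sets, whereas the unseeded calls return two different realizations of the same process.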
However, I finally decided against the use of seeds as well as against making
available my own data sets. My reason for this is that people greatly
overestimate the importance of a specific data set at hand.
In real life, the data set at hand, however hard-won it may be (e.g., the
result of four years of data collection during my PhD), is nothing more than a
single realization of the stochastic process about which I want to learn
something. Most of the time, the data set has no particular importance, except
for being our link to that random process, which is what we are really
interested in. It is important to think more clearly about the relationship
between the data set at hand and that random process, and about the relative
importance of the two. By not singling out one particular realization from
that process, I emphasize the (usually) secondary role of the particular data
set and the primary role of the underlying stochastic process.
R/WinBUGS code
You will use WinBUGS called from R throughout the book with the exception of
chapter 4, where we use WinBUGS as a standalone application for the simplest
example of a linear model: the estimation of the mean of a normal population.
Here, you can download that WinBUGS document (The_model_of_the_(normal)_mean.odc).
Text file with all R/WinBUGS code: R_WB_code.txt
Solutions to exercises
There is a series of exercises at the end of most chapters; here are the solutions:
Solutions.txt
The Swiss hare data
This is the only real-world data set you will meet in the book. We deal with
it extensively in the exercises. The Swiss hare data contain replicated counts
of Brown hares (Lepus europaeus: see chapter 13) conducted over 17 years
(1992-2008) at 56 sites in 8 regions of Switzerland. Replicated means that
each year two counts were conducted during a two-week period. Sites vary in
area and elevation and belong to one of two types of habitat (arable and grassland);
hence, there are both continuous and discrete explanatory variables. Unbounded
counts may be modeled as Poisson random variables with log(area) as an offset,
but we can also treat the observed density (i.e., the ratio of a count to
area) as a normal or the incidence of a density exceeding some threshold as a
binomial random variable. Hence, you can practice with all models shown in
this book and meet features of genuine data sets such as missing values and
other nuisances of real life.
Swiss_hare_data.Rdata
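The Poisson model with an offset that is mentioned in the description of the hare data can be written out as follows. This is only a sketch, and the notation is assumed here rather than taken from the book: C_i is the count at site i, A_i is the area of that site, x_i is a single explanatory variable (e.g., elevation), and alpha and beta are regression parameters.

```latex
\begin{align}
C_i &\sim \mathrm{Poisson}(\lambda_i) \\
\log(\lambda_i) &= \log(A_i) + \alpha + \beta x_i
\end{align}
```

Because the coefficient of log(A_i) is fixed at 1, the expected density is lambda_i / A_i = exp(alpha + beta * x_i), so the regression parameters describe density rather than the raw count.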
List of errors and replies to comments
Thank you for pointing out errors to me at marc.kery@vogelwarte.ch.
I am also grateful for any comments you might have on the book.
Erratum: Errata_and_tips.html
Updated list of WinBUGS survival tips
The book contains a list of WinBUGS survival tips as an appendix.
I have updated this list for our new book (Kéry & Schaub, Bayesian
Population Analysis using WinBUGS, AP, due December 2011) and you
can download it here:
AppendixA_list_of_WinBUGS_tricks.pdf
Information on a similar, but more advanced book due in December 2011:
Bayesian Population Analysis using WinBUGS by Kéry & Schaub
The current book is targeted at ecologists, but the statistical models are simply regression models with their ordinary extensions: the generalized linear model and random-effects models. Only the last two chapters cover the kinds of models that ecologists use for more specialized inference about populations. My colleague Michael Schaub and I have just written a more advanced sequel to this book, which is scheduled to be available from Academic Press in December 2011. We call it the BPA book. It covers a fairly comprehensive selection of specialized ecological statistical models for the analysis of populations using WinBUGS, not unlike the landmark book by Royle and Dorazio (2008), but in a format similar to that of the current book. For more information about our new book, see its full table of contents:
BPA_TOC.pdf
Workshops
Both the WinBUGS introduction and the more advanced Bayesian population
analysis (BPA) book grew out of workshops that I teach (BPA together with
Michael Schaub). Both workshops last about 5 days. We have taught them
repeatedly in Europe and in the United States. Send me (Marc) an email if
you're interested in hosting or attending one.
Acknowledgements
I thank Andy Royle for helping me to fledge as a statistical ecologist and
BUGS programmer, Jim Nichols for teaching me how to be a scientist and for
writing the foreword, Jim Hines for creating and maintaining this website, and
my family (Susana and Gabriel) for their love and their patience.
This page last revised: 4 Feb 2011