Online companion for
Marc Kéry (2010) Introduction to WinBUGS for Ecologists. Academic Press, Burlington.

A creed for models, algebra and WinBUGS
To make sense of an observation, everybody needs a model ...
whether he or she knows it or not.

It is difficult to imagine another method
that so effectively fosters clear thinking about a system
than the use of a model written in the language of algebra.

One of the most transparent ways
of building a model
is by describing it in the BUGS language.

Book description

This book is a very gentle introduction for ecologists to Bayesian analysis using WinBUGS. It covers the linear model and its extensions to the generalised linear model (GLM) and to linear and generalised linear mixed models, by way of extensive and fully documented examples with all code shown. All data sets are simulated in program R; simulation gives much greater insight into data and their analysis than real data sets, where the truth is unknown. I believe that when you are able to simulate a data set for a particular model, then you understand that model. Indeed, one of the nicest things that somebody has said to me about my book was this: "Your book made me finally understand the linear model and the GLM." You can get your copy here: www.amazon.com

For more information about the book, you can read its Preface and two sample chapters, chapter 4 and chapter 15.

Website description

This website contains additional information on the book, R and WinBUGS code for all examples in the book as well as solutions to the exercises and bonus material. Later, there will also be a list of errors.

Table of contents

Below is a concise table of contents of the book. You can find the complete TOC here: Table of Contents
  1. Introduction
  2. Introduction to the Bayesian analysis of a statistical model
  3. WinBUGS
  4. A first session in WinBUGS: The "model of the mean"
  5. Running WinBUGS from R via R2WinBUGS
  6. Key components of (generalised) linear models: Statistical distributions and the linear predictor
  7. t-Test: Equal and unequal variances
  8. Normal linear regression
  9. Normal one-way ANOVA
  10. Normal two-way ANOVA
  11. General linear model (ANCOVA)
  12. Linear mixed-effects model
  13. Introduction to the Generalised linear model (GLM): Poisson t-test
  14. Overdispersion and offsets in the GLM
  15. Poisson ANCOVA
  16. Poisson mixed-effects model (Poisson GLMM)
  17. Binomial t-test
  18. Binomial ANCOVA
  19. Binomial mixed-effects model (Binomial GLMM)
  20. Non-standard GLMMs 1: Site-occupancy species distribution model
  21. Non-standard GLMMs 2: Binomial mixture model for the modeling of abundance
  22. Conclusions

Why are there no seeds in the simulation of data sets?

I am a big fan of the use of simulated data sets. Two reasons are the following (for more, see the book intro): first, they enable one to check a solution against known truth and, second, the act of simulating the data enforces a much better understanding of the model that is used to analyze a data set afterwards. Everybody who executes the simulation code will generate a different data set, and you will get a different data set each time you execute it yourself. Consequently, in a sense, there is no single 'correct' answer from the ensuing analysis either.

Some people have suggested to me that I could have used 'seeds' in the random-number generators, e.g., the R function rnorm(). This would have guaranteed that everybody executing the simulation code always ends up with the same data set, i.e., with the same realization of the random process that we imagine has generated our data. Similarly, up to the chance element inherent in Markov chain Monte Carlo, everybody would have gotten more similar results in the BUGS analysis as well. This might be comforting to some readers, because you could be even more sure that your solution is indeed the 'correct' one when it matches the one in the book. I could have achieved the same goal by providing on this website the data sets that I used for producing the results in the book. Indeed, I have saved them and originally had intended to make them available to you.

However, I finally decided against the use of seeds as well as against making available my own data sets. My reason for this is that people greatly overestimate the importance of a specific data set at hand.

In real life, the data set at hand, however hard-won it may be (e.g., the result of four years of data collection during my PhD), is nothing more than a single realization of the stochastic process about which I want to learn something. Most of the time, the data set has no particular importance, except for being our link to that random process, which is what we are really interested in. It is important to think more clearly about the relationship between the data set at hand and that random process, and about the relative importance of the two. By not singling out one particular realization from that process, I emphasize the (usually) secondary role of the particular data set and the primary role of the underlying stochastic process.
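
The difference between a seeded and an unseeded simulation can be sketched in a few lines. The example below is only an illustration and is not taken from the book: the book simulates its data in R, whereas this sketch uses Python's standard library, and the function simulate_data as well as the values for the true mean, standard deviation and sample size are made up for the purpose. It shows that a fixed seed reproduces one realization exactly, while unseeded runs give a different data set, and hence a different estimate of the mean, every time.

```python
import random
import statistics


def simulate_data(n, true_mean, true_sd, seed=None):
    """Return one realization: n draws from Normal(true_mean, true_sd).

    seed=None lets Python seed the generator itself, so every call
    yields a different data set; an integer seed fixes the data set.
    """
    rng = random.Random(seed)
    return [rng.gauss(true_mean, true_sd) for _ in range(n)]


# Hypothetical truth for the stochastic process (illustrative values only).
TRUE_MEAN, TRUE_SD, N = 600.0, 30.0, 20

# With a seed, everybody gets the same realization of the process.
seeded_1 = simulate_data(N, TRUE_MEAN, TRUE_SD, seed=42)
seeded_2 = simulate_data(N, TRUE_MEAN, TRUE_SD, seed=42)
same_with_seed = seeded_1 == seeded_2

# Without a seed, each run is a new realization of the same process,
# and each realization gives its own estimate of the mean.
estimates = [
    statistics.fmean(simulate_data(N, TRUE_MEAN, TRUE_SD))
    for _ in range(1000)
]

print("identical data with the same seed:", same_with_seed)
print("range of estimated means: %.1f to %.1f" % (min(estimates), max(estimates)))
print("average of the estimates: %.1f" % statistics.fmean(estimates))
```

The individual estimates scatter around the true mean, and no single one of them is 'the' answer; what they have in common is the process that generated them.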

R/WinBUGS code

You will use WinBUGS called from R throughout the book with the exception of chapter 4, where we use WinBUGS as a standalone application for the simplest example of a linear model: the estimation of the mean of a normal population. Here, you can download that WinBUGS document: The_model_of_the_(normal)_mean.odc

Text file with all R/WinBUGS code: R_WB_code.txt

Solutions to exercises

There is a series of exercises at the end of most chapters; here are the solutions: Solutions.txt

The Swiss hare data

This is the only real-world data set you will meet in the book. We deal with it extensively in the exercises. The Swiss hare data contain replicated counts of Brown hares (Lepus europaeus; see chapter 13) conducted over 17 years (1992-2008) at 56 sites in 8 regions of Switzerland. Replicated means that two counts were conducted each year during a two-week period. Sites vary in area and elevation and belong to one of two habitat types (arable and grassland); hence, there are both continuous and discrete explanatory variables. Unbounded counts may be modeled as Poisson random variables with log(area) as an offset, but we can also treat the observed density (i.e., the ratio of a count to area) as a normal random variable, or the incidence of a density exceeding some threshold as a binomial random variable. Hence, you can practice with all models shown in this book and meet features of genuine data sets such as missing values and other nuisances of real life. You can download the data here: Swiss_hare_data.Rdata
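
To make the role of the offset concrete, here is a minimal sketch of the three response types mentioned above. It does not read Swiss_hare_data.Rdata and is not the book's R code: the areas, the coefficients ALPHA and BETA, the threshold and the helper functions draw_poisson and expected_count are all hypothetical, and only the number of sites (56) and the two habitat types are taken from the description. An offset is a covariate whose coefficient is fixed at 1, so with log(area) as the offset the linear predictor alpha + beta * habitat describes the log of the expected density rather than of the expected count.

```python
import math
import random


def draw_poisson(rng, lam):
    """Draw one Poisson(lam) value with Knuth's multiplication method.

    Adequate for the small-to-moderate expected counts used here.
    """
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def expected_count(area, grassland, alpha, beta):
    """Poisson mean with log(area) as an offset:
    log(lambda) = log(area) + alpha + beta * grassland
    """
    return math.exp(math.log(area) + alpha + beta * grassland)


rng = random.Random()      # deliberately unseeded
ALPHA, BETA = 1.0, -0.5    # hypothetical log-density intercept and habitat effect
THRESHOLD = 2.0            # hypothetical density threshold

sites = []
for _ in range(56):                      # 56 sites, as in the description
    area = rng.uniform(1.0, 10.0)        # made-up areas, arbitrary units
    grassland = rng.randint(0, 1)        # 0 = arable, 1 = grassland
    count = draw_poisson(rng, expected_count(area, grassland, ALPHA, BETA))
    density = count / area               # candidate normal response
    sites.append({
        "area": area,
        "grassland": grassland,
        "count": count,                  # candidate Poisson response
        "density": density,
        "exceeds": int(density > THRESHOLD),  # candidate binomial response
    })

print(sites[0])
```

Because the offset enters on the log scale, the expected count is simply the area multiplied by exp(alpha + beta * grassland), so two sites in the same habitat with different areas share the same expected density.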

List of errors and replies to comments

Thank you for pointing out errors to me at marc.kery@vogelwarte.ch.
I am also grateful for any comments you might have on the book.

Erratum: Errata_and_tips.html

Updated list of WinBUGS survival tips

The book contains a list of WinBUGS survival tips as an appendix. I have updated this list for our new book (Kéry & Schaub, Bayesian Population Analysis using WinBUGS, AP, due December 2011) and you can download it here: AppendixA_list_of_WinBUGS_tricks.pdf

Information on a similar, but more advanced book due in December 2011:
Bayesian Population Analysis using WinBUGS by Kéry & Schaub

The current book is targeted at ecologists, but the statistical models are simply regression models with their ordinary extensions: the generalized linear model and random-effects models. Only the last two chapters deal with the kinds of models that ecologists use for more specialized inference about populations. My colleague Michael Schaub and I have just written a more advanced sequel to this book, which is scheduled to be available from Academic Press in December 2011. We call it the BPA book. It covers a fairly comprehensive selection of specialized ecological statistical models for the analysis of populations using WinBUGS, not unlike the landmark book by Royle and Dorazio (2008), but in a format similar to that of the current book. For more information about our new book, see its full table of contents: BPA_TOC.pdf

Workshops

Both the WinBUGS introduction and the more advanced Bayesian population analysis (BPA) book grew out of workshops that I teach (BPA together with Michael Schaub). Both workshops last about 5 days. We have taught them repeatedly in Europe and in the United States. Send me (Marc) an email if you're interested in hosting or attending one.

Acknowledgements

I thank Andy Royle for helping me to fledge as a statistical ecologist and BUGS programmer, Jim Nichols for teaching me how to be a scientist and for writing the foreword, Jim Hines for creating and maintaining this website, and my family (Susana and Gabriel) for their love and their patience.

This page last revised: 4 Feb 2011