AHM1 (2016) AHM2 (2021)

Applied Hierarchical Modeling in Ecology

by Marc Kéry & Andy Royle,
with big contributions to code by Mike Meredith

This is the permanent book web site of the Applied hierarchical modeling (AHM) project. AHM primarily comprises two volumes of a book with the same main title (but different subtitles) and the R package AHMbook, which can be downloaded from CRAN.

On this website you will find a short introduction to hierarchical modeling, especially to what we call explicit hierarchical models, and to the philosophy of applied statistical modeling espoused in the AHM project, followed by a brief overview of the contents of the two books. We also give pointers to important associated resources, including websites, software, and other material. If you're in a hurry, please just scroll down.

For a more dynamic (i.e., more frequently updated) AHM page, see our Google page here [https://sites.google.com/site/appliedhierarchicalmodeling/home] (although right now it is blocked due to an automatic migration process).

Get your copies of AHM1 and AHM2 from Amazon:

AHM1: amazon.com

AHM2: amazon.com

Key features of the AHM books

To quote from the Preface of AHM1, both AHM books share a number of unifying themes:
  1. hierarchical modeling
  2. data simulation
  3. measurement error models
  4. dual inference paradigm approach (Bayesianism and frequentism)
  5. accessible and gentle style (including hierarchical likelihood construction and data simulation)
  6. “cookbook recipes”
  7. predictions, e.g., to produce species distribution maps

1. Hierarchical models

Hierarchical statistical models are parametric models for statistical inference about unknown quantities. In a hierarchical model (HM), we factorize a more complex joint probability distribution such as p(y, x, z) into a series of simpler probability expressions that are linked by conditional probability. For example, this joint distribution might be decomposed into the product p(y | x, z) * p(z | x) * p(x), where the vertical bar denotes conditioning and can be read as '(the thing on the left) given (the thing on the right)'. Thus, in a sense an HM does not really belong to any particular class of model; rather, a vast number of classes of statistical models can be represented as HMs.
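
To make this concrete, here is a minimal sketch in R (our own toy example, not code from the books) that simulates data under perhaps the simplest explicit HM, a static occupancy model, written exactly as such a product of conditional distributions: the latent state z is drawn first from p(z), and the observations y are then drawn conditional on z from p(y | z).

# Minimal sketch of a hierarchical (conditional) factorization:
# latent occupancy state z ~ p(z), observations y | z ~ p(y | z)
set.seed(1)
M <- 100                       # number of sites
J <- 3                         # number of replicate visits per site
psi <- 0.6                     # occupancy probability, Pr(z = 1)
p <- 0.4                       # detection probability given presence
z <- rbinom(M, 1, psi)         # state process: draw z from p(z)
y <- matrix(rbinom(M * J, 1, z * p), nrow = M)  # observation process: y from p(y | z)
str(y)                         # the detection/nondetection data we would observe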

HMs have many advantages. A major benefit is simply that a hierarchical representation of a statistical model often makes model fitting much easier in practice, and sometimes makes it feasible in the first place. Perhaps even more importantly, the way in which a complex joint probability is factorized into conditional probability expressions is typically guided by our understanding of the processes that underlie a particular data set. The act of hierarchical modeling thus naturally enforces a focus on processes. As a result, we (and others, too) like to claim that hierarchical modeling leads to more scientific statistical models, or at least to scientifically more interesting and relevant models that naturally represent the hypothesized mechanisms in our science or management problem. A hallmark of HMs is a rich latent structure, i.e., hidden or partially hidden stochastic processes that produce latent variables, or random effects.

HMs have a fairly long history in statistics going back at least to the 1970s. They gained a lot of traction in the 1990s, catalyzed by the computational revolution that led to widespread adoption of MCMC methods and to the great revival of Bayesian methods in statistics.

In ecology and the environmental sciences, this was evidenced by a short but highly influential 1996 paper by Mark Berliner, who was a mentor to Andy Royle and Chris Wikle. However, there is a vast variety of HMs, and the term 'hierarchical model' by itself conveys hardly any information about the actual type of model being fitted. 'HM' says more about how a statistical model is constructed than about the particular model itself.

Implicit and explicit hierarchical models, and 'scientific statistical models'

Often, HMs are used to address some sort of 'overdispersion'. There is a sense in which overdispersion modeling can be thought of merely as the introduction of a statistical 'fudge factor', to improve the fit of a model to a particular data set by increasing the uncertainty around the main estimates of interest.

Very commonly, HMs are used to accommodate unexplained dependencies in the data in space or in time, which can conveniently be captured by correlated latent variables. The random effects in this sort of HM are usually purely hypothetical constructs that have no real-world meaning whatsoever. Andy Royle and Bob Dorazio call such HMs 'implicit HMs.'

Such implicit hierarchical models can be very useful on many occasions, and they have been particularly influential in spatio-temporal modeling. We also use them in our own work, but our usual preference is for what Andy Royle and Bob Dorazio have called 'explicit HMs' in their influential 2008 book 'Hierarchical modeling and inference in ecology'.

Explicit HMs contain an explicit description of how the data at hand relate to scientifically meaningful quantities that are latent, owing to the vagaries of sampling and data collection. The latter may include a variety of types of measurement error, aggregation, censoring and truncation of the data, as well as non-random sampling of the population about which inferences are sought.

Key examples of such latent quantities with eminent scientific meaning are distribution, abundance, species richness, as well as the parameters in the dynamic processes that govern spatio-temporal variation in all of these, such as survival and recruitment rates for abundance, or colonization and extinction rates for species distributions. To us, explicit hierarchical modeling typically leads to what we like to think of as more scientific statistical models.

2. Data simulation

By data simulation we mean the development of generative code in R (or other software) to create some data set of interest. Very often this will be a data set that resembles the one we have, although sometimes the goal is to learn about study design, so that we can obtain a data set that meets our needs, e.g., in terms of the precision of the resulting estimates when a model is later fit to it.

There are many reasons why data simulation should be part of the habitual toolkit of an applied statistical modeller. In fact, we believe so strongly in the importance of data simulation for applied statistical analysis that we have written an entire chapter about the topic (Chapter 4 in AHM1).

We use data simulation all the time in our own work. To mention just a few of the crucial benefits (see Chapter 4 for a full list), simulated data let you check whether the parameters in your model are really estimable, whether their estimates are biased, and whether your model-fitting code is correct.

And perhaps the most important benefit is to prove to yourself that you have understood a statistical model: we claim that if you have really understood a statistical model then you are able to simulate a data set under that model.

Conversely, we can use data simulation to explain a statistical model. Indeed, throughout our books we use R code for assembling (i.e., simulating) a data set not only to create a data set to play with, but, crucially, also to provide you with one more explanation of a particular model. This use of R data simulation code to explain a statistical model is one of the hallmarks of both books (and also of other books listed at www.hierarchicalmodels.com).
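
To illustrate this use of simulation (a small sketch of our own, in the spirit of the book code but not taken verbatim from it), the following R snippet assembles a data set of replicated counts under a binomial N-mixture model with an elevation covariate on abundance; because we know the truth, we can later check whether a fitted model recovers it.

# Simulate replicated counts with imperfect detection (binomial N-mixture model)
set.seed(24)
M <- 150                                  # number of sites
J <- 2                                    # replicate counts per site
elev <- runif(M, -1, 1)                   # scaled elevation covariate
beta0 <- 1; beta1 <- -1                   # intercept and slope on log(lambda)
lambda <- exp(beta0 + beta1 * elev)       # expected abundance per site
N <- rpois(M, lambda)                     # latent true abundance (state process)
p <- 0.5                                  # per-individual detection probability
C <- matrix(rbinom(M * J, rep(N, J), p), nrow = M)  # observed counts (observation process)
# Compare the observations with the truth, which we know because we simulated it
cbind(trueN = N, maxCount = apply(C, 1, max))[1:5, ]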

3. Measurement error models

The third theme is a key to what makes our HMs explicit rather than implicit: by explicitly describing the false-negative and sometimes also the false-positive error processes that produce our data, we gain inferential access to those biologically meaningful quantities mentioned above. Thus, if our modeling assumptions are not violated too badly, then the models in our books let you learn about true abundance, occurrence, species richness, survival, colonization, or extinction rates, rather than about some mere indices of these, which are biased to an unknown degree by imperfect detection and false-positive errors and possibly other features of the data-collection process.
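
As a small illustration (again a sketch of our own; the occu() function belongs to the unmarked package introduced further below), compare the naive occupancy estimate, which ignores detection error, with the estimate from a simple site-occupancy model fitted to simulated detection/nondetection data:

# A measurement error model recovers true occupancy from error-prone detection data
library(unmarked)
set.seed(2)
M <- 200; J <- 3; psi <- 0.6; p <- 0.4
z <- rbinom(M, 1, psi)                          # true presence/absence
y <- matrix(rbinom(M * J, 1, z * p), nrow = M)  # detection/nondetection data
mean(apply(y, 1, max))                          # naive estimate: biased low
umf <- unmarkedFrameOccu(y = y)                 # package data format
fm <- occu(~ 1 ~ 1, data = umf)                 # constant detection and occupancy
backTransform(fm, type = "state")               # estimate of psi, corrected for p < 1

With these settings, roughly one in five occupied sites is never detected, so the naive estimate is biased low, whereas the model-based estimate targets the true psi of 0.6.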

4. Frequentist and Bayesian approaches

Are all HMs Bayesian models? No, not at all! Although in practice HMs are perhaps more often fit using Bayesian than non-Bayesian methods, there is nothing intrinsically Bayesian about an HM (Cressie and Wikle, 2011). In our books we use both likelihood and Bayesian methods to fit HMs. Indeed, we are a little weary of what we perceive as completely unnecessary trench wars between anti-Bayesians and Bayesians, and we are very glad to note that these have died down quite a bit during the last two decades. In such discussions, from the perspective of an applied statistical modeller, minute differences between the two approaches tend to be blown completely out of proportion. We think that in practice the commonalities between non-Bayesian maximum likelihood and Bayesian inference are far, far greater than their differences. Most of all, both likelihood and Bayesian inference are firmly rooted in the concept of a parametric statistical model, and in most practical applications the two will yield inferences that are numerically nearly indistinguishable. Of course, in saying this we gloss over major differences in the foundations of the two methods, but we think that these need not bother a non-statistician in most actual analyses. As a consequence, we firmly believe that both maximum likelihood and Bayesian methods are essential for applied statistical analysis, or applied data science, and the AHM project thoroughly reflects this view.

5. Accessible and gentle style

Although both AHM books aim to be a comprehensive synthesis of the field, and also contain a lot of novel material, a major goal is to serve as a gentle introduction to this vast field, in the same spirit as what we consider to be landmark publications in the popularization of modern applied statistical modeling in ecology: first the books by Mick Crawley on GLIM and later R, and second the hugely influential "Gentle Introduction to MARK" by Evan Cooch and his co-authors.

We have striven hard to make the material in the AHM books, and especially the practical implementation of the models, as widely understandable and intuitive as possible. This includes a hierarchical construction of the likelihood, rather than the presentation of an integrated likelihood, which may often leave non-statisticians baffled, and the frequent use of data simulation to explain a statistical model (see above).

A large part of the material in both books has been developed for, during, and in the aftermath of countless AHM and related workshops that we have taught all over the world. Thus, we believe that we know what material is asked for by ecologists and related scientists, and we like to think that we also have a lot of experience in how to explain that material to them.

6. “Cookbook recipes”

We are big believers in the importance of providing non-statisticians with fully executable cookbook recipes for all the major model classes presented. We think that for many readers this is a necessary first step towards understanding a model. Starting from such a recipe, they can then go on to adapt an analysis to their own situation, rather than getting stuck with some minor coding issue during the attempted implementation of a complex model.

7. Predictions

We emphasize predictions from our HMs, typically of quantities such as the expected abundance or occupancy probability. Thus, in a sense the two AHM books are textbooks on a specialized form of species distribution models (SDMs).

However, we show that we can go far beyond the mere depiction of static quantities such as the above (and not forgetting the associated maps of standard errors or confidence/credible intervals!). For instance, in AHM2 we also produce predictive maps of dynamic quantities such as the survival rate of a bird species all over Britain, or of colonization and extinction rates, along with detection probability, of another avian species in Switzerland.

Brief history of the AHM project, and 'Hierarchical modeling in ecology'

We had our first plans to write a book on HMs around 2010 and then launched this project in about 2012. The main idea behind AHM was originally to write a continuation and extension of the foundational 2008 book by Andy and Bob and to make the material in that book more widely understandable among non-statisticians.

Alas, it has taken us almost a decade to bring our baby into the world. A couple of years into the project, we were so far behind schedule, but at the same time had written so much material already, that our publisher suggested we split the project into two volumes. Thus, we published AHM1 in late 2015 (but with an official publication year of 2016!). We then thought that it would take us at most 1–2 more years to produce volume 2. But this took another 5 years, and so AHM2 was published in the fall of 2020 (but again with an official 2021 publication date).

The two AHM volumes are part of what might be called a distributed research program on hierarchical modeling in ecology, which includes a number of other, related books and web resources, including Google group email lists; see the landing page for that meta-project at www.hierarchicalmodels.com. This collection of books, email lists, and software is likely to be augmented constantly in the future. For instance, a 2nd edition of the Bayesian Population Analysis book is on the horizon, as are additional R packages for fitting explicit HMs (e.g., spOccupancy; Doser et al. 2022).

Statistical software for explicit hierarchical models

For fitting statistical models in the AHM volumes, we use R programs to obtain maximum likelihood estimates (MLEs) and provide code in the BUGS statistical modeling language for Bayesian inference. In particular, we explain and illustrate the use of our own R package unmarked (Fiske and Chandler, 2011) for fitting a large range of HMs for abundance and distribution/occurrence.

In recent years, Ken Kellner (SUNY-ESF) has maintained unmarked and greatly expanded its capabilities, also beyond what we document in the two AHM books. This now includes the possibility of specifying additional sets of random effects in some models, which unmarked handles internally by using the TMB software (Kristensen et al., 2016) as its computational engine.
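
As a hedged sketch of what this looks like (our own example; the covariate names and the data object umf are hypothetical, so please check ?occu in your installed version of unmarked for the exact syntax and the models that support random effects):

# A hedged sketch of the lme4-style random-effects syntax now available in unmarked
# for some models: a hypothetical observer effect on detection and an elevation
# covariate on occupancy; 'umf' is assumed to hold these covariates
# fm.re <- occu(~ (1 | observer) ~ elev, data = umf, engine = "TMB")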

On the Bayesian side we use good old WinBUGS (Lunn et al. 2000) and JAGS (Plummer 2003), both of which use for model specification what is essentially the original BUGS model definition language (Gilks et al. 1994). BUGS is a very simple language that nevertheless enables users to build almost arbitrarily complex models by combining very simple basic modules, exactly in the manner of a hierarchical statistical model. Thus, a complex stochastic system can be built up from a number of conditionally linked, much simpler elemental pieces.
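
To give a flavor of this (a minimal sketch of our own, in the style used throughout the books), here is the simple occupancy HM from above written in the BUGS language and saved to a file from R; the state process and the observation process appear as two conditionally linked sets of very simple statements:

# Sketch of a simple occupancy HM in the BUGS language, written to file from R
cat(file = "occu.txt", "
model {
  psi ~ dunif(0, 1)              # prior for occupancy probability
  p ~ dunif(0, 1)                # prior for detection probability
  for (i in 1:M) {
    z[i] ~ dbern(psi)            # state process: true presence/absence
    for (j in 1:J) {
      y[i,j] ~ dbern(z[i] * p)   # observation process: detection given presence
    }
  }
}
")

This model file could then be handed, together with the data and the parameters to be monitored, to WinBUGS, JAGS (e.g., via the jagsUI package), or Nimble.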

Importantly, the BUGS language has proven to be supremely understandable to non-statisticians. Arguably, this has been the main reason for the very wide adoption of WinBUGS, later OpenBUGS and JAGS, and now also the new Nimble software (de Valpine et al. 2017). All the BUGS code in the AHM volumes should work in Nimble with only very minor modifications, and indeed the BUGS code in AHM1 has been translated into Nimble; see https://github.com/nimble-dev/AHMnimble.

The well-known Stan software (Carpenter et al. 2017) is another probabilistic programming language (PPL) with a similar scope to JAGS and Nimble, and it also implements a dialect of the BUGS model definition language. Owing to its gradient-based MCMC variant known as Hamiltonian Monte Carlo (HMC), it may be able to explore complex, multidimensional posterior "clouds" more swiftly or more reliably than do JAGS or Nimble.

However, unfortunately, HMC is unable to deal with discrete random effects directly and most models in the AHM volumes do contain such discrete random effects. What you must do when fitting these models in Stan is to define the marginal, or integrated, likelihood, in which these random effects are summed out; see the tutorial by Joseph (2020) as well as Turek et al. (2016) and Yackulic et al. (2020).

In fact, fitting these models with an integrated likelihood is exactly how they are implemented in unmarked. Thus, for more numerate quantitative ecologists, learning how to implement these models in their marginalized-likelihood form may have substantial payoffs. However, we are convinced that for most of the audience of our workshops and of the AHM books, marginalizing discrete random effects out of the likelihood without a statistician looking over their shoulder is beyond their level of understanding. Thus, we believe that use of Stan would often severely curtail the great modeling freedom that the other BUGS-based PPLs (especially JAGS and Nimble) now confer on them. As Mike Meredith (pers. comm.) has aptly put it: "In JAGS people can write their models as they see the world". By this he means that you can write a hierarchical likelihood with all individual processes in sequence, rather than having to squash that neat scheme into a marginal likelihood. This is the reason why we haven't really done much exploration of Stan for the fitting of explicit hierarchical models.
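
For readers who want to see what this 'summing out' amounts to, here is a hedged sketch (our own code, not from the books) of the marginal likelihood of the simple occupancy model above, with the discrete latent state z removed by summing over its two possible values; this is essentially the likelihood form that unmarked maximizes and that Stan requires:

# Negative log of the marginal (integrated) occupancy likelihood: z is summed out,
# L_i = psi * p^d_i * (1 - p)^(J - d_i) + (1 - psi) * I(d_i = 0), with d_i detections at site i
negLogLik <- function(par, y) {
  psi <- plogis(par[1]); p <- plogis(par[2])   # parameters on the logit scale
  J <- ncol(y); d <- rowSums(y)                # number of detections per site
  L <- psi * p^d * (1 - p)^(J - d) + (1 - psi) * (d == 0)
  -sum(log(L))
}
# fit <- optim(c(0, 0), negLogLik, y = y)      # y: detection data as simulated above
# plogis(fit$par)                              # MLEs of psi and p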

We note, though, that there are interesting initiatives to use Stan as the computing engine in canned R packages. For instance, Ken Kellner has recently developed the ubms R package (Kellner et al. 2022; see also cran.r-project.org/web/packages/ubms/index.html). The name stands for 'unmarked Bayesian models using Stan', and the package can be described in short as a variant of unmarked that enables additional random effects to be specified in abundance and occupancy models. Examples of such random-effects factors might be regions within which sites are nested, or observers in models of detection probability. The data formatting for input, the model specification, and the output are extremely similar to those of unmarked, and thus ubms can be viewed as the Bayesian sister package of unmarked.
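
As a hedged sketch of what a ubms call might look like (our own example; the data object, covariate names, and settings are hypothetical, so please check the ubms help pages for the exact syntax), an occupancy model with an elevation covariate and a random region effect could be specified roughly as follows:

# Sketch of a ubms fit: unmarked-style input, Stan under the hood
library(ubms)
# 'umf' is a hypothetical unmarkedFrameOccu with 'elev' and a 'region' factor in siteCovs
# fm <- stan_occu(~ 1 ~ elev + (1 | region), data = umf, chains = 3, iter = 2000)
# summary(fm, "state")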

Clearly, over time we expect to see yet other, and even better, software with which we can fit explicit hierarchical models. Our big plea to the current and prospective developers of such software is just this: PLEASE give serious consideration to adopting the hugely successful BUGS modeling language for the specification of the models in your software. We believe that the ease with which BUGS code can be understood, even for very complex hierarchical models, makes it a serious candidate as a true lingua franca for hierarchical models.

Overview of AHM contents

So, after this introduction, what's inside the AHM volumes? Here is the table of contents of the two. In short, as the subtitles of the two volumes say, AHM1 is about static models of abundance and distribution, plus some introductory topics such as statistical inference, data simulation, and parametric statistical models such as GLMs and random-effects models. AHM2 then goes on to present dynamic, or open, versions of all these models and a range of more advanced topics.

AHM Volume 1: Prelude and static models

Preface
Part 1: Prelude
  1. Distribution, abundance and species richness in ecology
  2. What are hierarchical models and how do we analyze them?
  3. Linear models, generalized linear models (GLMs), and random effects models: the components of hierarchical models
  4. Introduction to data simulation
  5. Fitting models using the Bayesian modeling software BUGS and JAGS
Part 2: Models for static systems
  6. Modeling abundance with counts of unmarked individuals in closed populations: Binomial N-mixture models
  7. Modeling abundance using multinomial N-mixture models
  8. Modeling abundance using hierarchical distance sampling
  9. Advanced hierarchical distance sampling
  10. Modeling static occurrence and species distributions using site-occupancy models
  11. Hierarchical models for communities
Summary and conclusion

AHM Volume 2: Dynamic and Advanced models

Preface
Introduction
Part 1: Models for dynamic systems
  1. Relative abundance models for population dynamics
  2. Modeling population dynamics with count data
  3. Hierarchical models of survival
  4. Modeling species distribution and range dynamics, and population dynamics using dynamic occupancy models
  5. Modeling metacommunity dynamics using dynamic community models
Part 2: Advanced models
  6. Multi-state occupancy models
  7. Modeling false positives
  8. Modeling interactions among species
  9. Spatial models of distribution and abundance
  10. Integrated models for multiple types of data
  11. Spatially explicit distance sampling along transects
Conclusions

The AHMbook R package

This package contains functions and data to accompany the two volumes of AHM. Since AHM1 was published, the AHMbook package has been submitted to CRAN, with a few small changes to comply with CRAN policies. You can download it from CRAN in the usual way, e.g., by typing in your R console

install.packages("AHMbook")
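
Once installed, you can load the package and explore its many data simulation functions; for instance (a small sketch, with argument names as we recall them from the package help, so please consult the help pages):

library(AHMbook)
# Simulate a site-occupancy data set with 100 sites and 3 visits
# (see ?simOcc for the full set of arguments and their defaults)
str(simOcc(M = 100, J = 3))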

Commented code for the package and any new developments or bug fixes can be found on GitHub at https://github.com/mikemeredith/AHMbook.

Updated R and JAGS code

Changes in base R and the packages used in the book (notably unmarked and jagsUI) mean that some of the original printed code no longer works with current versions. Code that works – and is regularly tested and updated – can be found on GitHub at https://github.com/mikemeredith/AHM_code. (NOTE: /AHM_code, not /AHMbook, as just above) That repository has all the code not included in the book, but described as "available on the website", plus other bonus code to plot the figures and carry out simulations.

List of detected errors:

https://sites.google.com/site/appliedhierarchicalmodeling/errata

Further resources

For backwards compatibility with the printed AHM1 book, we provide the original four extra data files and the original R/BUGS code here.

Four AHM extra data files:
We used to supply four additional data files that are used in AHM1: the Swiss tits (SwissTits_mhb_2004_2013.csv), Dutch wagtails (wagtail.Rdata), Swiss squirrels (SwissSquirrels.txt), and the Swiss MHB 2014 data (MHB_2014.csv). All four are now part of the AHMbook package (SwissTits, wagtail, SwissSquirrels, MHB2014). Mike Meredith has undertaken some data cleaning and reformatting, and the help files for each data set show how to use these instead of the .csv files with the code in the book.

However, in case you want to replicate the analyses with the exact book code, here are the four files in the original format for download:

  1. Swiss tit data from the MHB 2004-2013 (chapter 6): download SwissTits_mhb_2004_2013.csv
  2. Dutch wagtail data (chapter 9): download wagtail.Rdata
  3. Swiss squirrel data from the MHB (chapter 10): download SwissSquirrels.txt
  4. Swiss MHB data from 2014 (chapter 11): download MHB_2014.csv

Download OLD text file with all R and BUGS code (in the printed AHM1 book):
download R_BUGS_code_AHM_Vol_1OLD.txt

Solutions to exercises
At the end of most chapters in the AHM1 book you will find a series of exercises. We had originally planned to publish their solutions later. Alas, we now see that this is unlikely to happen anytime soon, and we apologize for this. However, perhaps some of you may be interested in solving some or even all of them; if you do, we would be delighted to receive your solutions and post them on this website.

Marc, Andy, and Mike; last changes: 15 December 2021

Literature cited

Berliner, L.M. 1996. Hierarchical Bayesian time series models. Pages 15–22 in K. Hanson & R. Silver (eds.) Maximum entropy and Bayesian methods. Kluwer Academic Publishers, Dordrecht, The Netherlands.

Carpenter, B., Gelman, A., Hoffman, M.D., Lee, D., Goodrich, B., Betancourt, M., Brubaker, M., Guo, J., Li, P. & Riddell, A. 2017. Stan: A probabilistic programming language. Journal of Statistical Software, 76, 1–32.

Cooch, E. & White, G. 2019. Program MARK: a gentle introduction. Available in pdf format for free download at http://www.phidot.org/software/mark/docs/book.

Crawley, M.J. 2005. Statistics. An introduction using R. Wiley, Chichester, West Sussex.

Cressie, N. & Wikle, C.K. 2011. Statistics for spatio-temporal data. Wiley.

Doser, J.W., Finley, A.O., Kéry, M., & Zipkin, E.F. 2022. spOccupancy: An R package for single species, multispecies, and integrated spatial occupancy models. Submitted.

Fiske, I. & Chandler, R. 2011. unmarked: an R package for fitting hierarchical models of wildlife occurrence and abundance. Journal of Statistical Software, 43, 1–23.

Gilks, W.R., Thomas, A. & Spiegelhalter, D.J. 1994. A language and program for complex Bayesian modelling. Statistician, 43, 169–177.

Joseph, M.B. 2020. A step-by-step guide to marginalizing over discrete parameters for ecologists using Stan. Accessed on 1 August 2020. https://mbjoseph.github.io/posts/2020-04-28-a-step-by-step-guide-to-marginalizing-over-discrete-parameters-for-ecologists-using-stan/.

Kellner, K.F., Fowler, N.L., Petroelje, T.R., Kautz, T.M., Beyer, D.E., & Belant, J.L. 2022. ubms: An R package for fitting hierarchical occupancy and N-mixture abundance models in a Bayesian framework. Methods in Ecology and Evolution, 13, 577–584.

Kéry, M. & Royle, J.A. 2016. Applied hierarchical modeling in ecology—Modeling distribution, abundance and species richness using R and BUGS. Volume 1: Prelude and Static Models. Elsevier, Academic Press.

Kéry, M. & Royle, J.A. 2021. Applied hierarchical modeling in ecology—Modeling distribution, abundance and species richness using R and BUGS. Volume 2: Dynamic and Advanced Models. Elsevier, Academic Press.

Kéry, M. & Schaub, M. 2012. Bayesian population analysis using WinBUGS – A hierarchical perspective. Academic Press, Waltham, MA.

Kristensen, K., Nielsen, A., Berg, C., Skaug, H., & Bell, B. 2016. TMB: Automatic differentiation and Laplace approximation. Journal of Statistical Software, 70, 1–21.

Lunn, D.J., Thomas, A., Best, N. & Spiegelhalter, D. 2000. WinBUGS - A Bayesian modelling framework: concepts, structure, and extensibility. Statistics and Computing, 10, 325–337.

Plummer, M. 2003. JAGS: a program for analysis of Bayesian graphical models using Gibbs sampling. Proceedings of the 3rd International Workshop on Distributed Statistical Computing (DSC 2003), March 20–22 (eds K. Hornik, F. Leisch & A. Zeileis), pp. 1–10, Technische Universität, Vienna, Austria.

Royle, J.A. & Dorazio, R.M. 2008. Hierarchical modeling and inference in ecology. The analysis of data from populations, metapopulations and communities. Academic Press, New York.

Turek, D., de Valpine, P., & Paciorek, C.J. 2016. Efficient Markov chain Monte Carlo sampling for hierarchical hidden Markov models. Environmental and Ecological Statistics, 23, 549–564.

de Valpine, P., Turek, D., Paciorek, C.J., Anderson-Bergman, C., Lang, D.T. & Bodik, R. 2017. Programming with models: writing statistical algorithms for general model structures with NIMBLE. Journal of Computational and Graphical Statistics, 26, 403–413.

Yackulic, C.B., Dodrill, M., Dzul, M., Sanderlin, J.S., & Reid, J.A. 2020. A need for speed in Bayesian population models: a practical guide to marginalizing and recovering discrete latent states. Ecological Applications, 30, e02112.