Last change: 6 September 2012
The reason why WinBUGS is unable to recover parameter estimates that resemble the input values used for simulating the data is NOT that we did not standardize the covariate length!
Instead, my choice of prior distribution for alpha, a normal with variance 1000, was highly informative. I did not realize this because I had not conducted a prior sensitivity analysis. Look at what a normal distribution with variance 1000 (and hence a standard deviation of about 32) looks like:
plot(x = -300:300, dnorm(-300:300, mean = 0, sd = sqrt(1000)), type = "l", col = "blue", lwd = 3, xlab = "x", ylab = "Density", main = "'Uninformative' priors for intercepts alpha ??")

Now let's plot the true values of the intercepts of the three regression lines into that figure:
abline(v = c(beta.vec[1], beta.vec[1]+beta.vec[2], beta.vec[1]+beta.vec[3]), col = "red", lwd = 3)

We see that my choice was extremely informative for the intercepts.
We can obtain Bayesian estimates with the usual numerical similarity to the MLEs by using these priors in the BUGS code:
# Priors
for (i in 1:n.group){
   alpha[i] ~ dnorm(0, 0.000001)   # Intercepts
   beta[i] ~ dnorm(0, 0.000001)    # Slopes
}

Such priors are no longer informative for the parameter values chosen in the simulated data set (or only weakly so):
plot(x = -500:500, dnorm(-500:500, mean = 0, sd = sqrt(1/0.000001)), type = "l", col = "blue", lwd = 3, xlab = "x", ylab = "Density", main = "'Uninformative' priors for intercepts alpha")
abline(v = c(beta.vec[1], beta.vec[1]+beta.vec[2], beta.vec[1]+beta.vec[3]), col = "red", lwd = 3)

TAKE-HOME MESSAGE: Always do some sort of prior sensitivity analysis in your Bayesian modeling to make sure that priors intended to be vague indeed do not affect the posteriors (or only very weakly).
And, of course, priors that are uninformative for one data set may not be so for another. Whether a prior is uninformative needs to be checked anew for each analysis.
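The effect of an insufficiently flat prior can be illustrated without WinBUGS. Here is a minimal sketch, assuming a normal likelihood with known sigma and a conjugate normal prior; the function name and all numbers are mine for illustration, not from the book:

```r
# Posterior mean for a normal likelihood (known sigma) with a normal prior:
# a precision-weighted average of the prior mean and the sample mean.
post.mean <- function(ybar, n, sigma, prior.mean, prior.var) {
  prior.prec <- 1 / prior.var      # precision = 1 / variance
  data.prec  <- n / sigma^2        # precision of the sample mean
  (prior.prec * prior.mean + data.prec * ybar) / (prior.prec + data.prec)
}

# True mean far from 0 and little data: the "vague" variance-1000 prior
# shrinks the estimate noticeably toward 0; a genuinely flat prior barely does.
post.mean(ybar = 250, n = 5, sigma = 30, prior.mean = 0, prior.var = 1000)
post.mean(ybar = 250, n = 5, sigma = 30, prior.mean = 0, prior.var = 1e6)
```

The same mechanism is at work in the regression above: with true intercepts of a few hundred, a N(0, 1000) prior pulls the posteriors toward zero.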
Previous change: 2 February 2012
mu[i] <- mu1 + delta * x[i]

This meant that arable sites have a mean of mu1 + 1*delta and grassland sites one of mu1 + 2*delta. Interestingly, delta still had the interpretation of the difference in density between the two landuse categories.
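One line of arithmetic shows why delta kept its interpretation under the 1/2 coding (the values of mu1 and delta below are arbitrary illustrative numbers, not from the book):

```r
# With landuse coded 1 (arable) / 2 (grassland), mu = mu1 + delta * x
# gives group means mu1 + delta and mu1 + 2*delta; their difference
# is still delta, which is why the estimate kept its interpretation.
mu1 <- 3; delta <- 2                  # arbitrary illustrative values
mean.arable    <- mu1 + delta * 1    # 5
mean.grassland <- mu1 + delta * 2    # 7
mean.grassland - mean.arable         # 2, i.e. delta
```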
This error can be easily corrected by defining LANDUSE as an indicator for grassland sites:
LANDUSE <- aggregate(as.numeric(hares$landuse), by = list(hares$site), FUN = mean, na.rm = TRUE)$x - 1

This error is now corrected and, consistent with the new definition of LANDUSE, I also made a change to the code in the subsequent exercise, where the data statement is now this:
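To see what this recoding does, here is a toy version with made-up data (hares.toy is my own illustrative data frame, not the book's hares):

```r
# Two observations per site; sites 1-2 arable, sites 3-4 grassland.
hares.toy <- data.frame(site    = rep(1:4, each = 2),
                        landuse = factor(rep(c("arable", "grassland"), each = 4)))

# as.numeric() on the factor yields 1 (arable) / 2 (grassland);
# averaging within site gives one value per site, and subtracting 1
# turns that into a 0/1 indicator for grassland sites.
lu <- aggregate(as.numeric(hares.toy$landuse), by = list(hares.toy$site), FUN = mean)$x
lu        # 1 1 2 2
lu - 1    # 0 0 1 1  -> grassland indicator
```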
win.data <- list(y1 = MMD[LANDUSE == 0], y2 = MMD[LANDUSE == 1], n1 = length(MMD[LANDUSE == 0]), n2 = length(MMD[LANDUSE == 1]))

(Thanks to Sergio Vignali)
sigma.group ~ dunif(0, 30)   # Hyperprior for sd of population effects

You can check whether the posterior is bouncing against the upper bound of the uniform prior by doing this:
hist(out$sims.list$sigma.group, xlim = c(0,30), breaks = 100, col = "grey")
> all.effects
 [1] 40 -10  -5   5  10   5  10  -2   3   0   4   4   0   3  -2

(Thanks to Jean Richardson)
Site-occupancy models have a binary response. According to McCullagh & Nelder (1989), the deviance is uninformative about the fit of a generalized linear model for such data. The deviance is just another discrepancy measure, and so the one in chap. 20.3 (sum of absolute residuals) is likely no better.
model.matrix(lm(mass~(pop*svl - 1 - svl))) # Interactive model
The next line should be this:
> model.matrix(lm(mass ~ pop * svl - 1 - svl))
(Thanks to Duncan Mackay)
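To see the design matrix the corrected formula produces, here is a sketch with made-up data (pop, svl, and the dummy response below are my own illustrative values, not the book's data):

```r
# Toy data: a 3-level population factor and a continuous covariate.
set.seed(1)
pop  <- factor(rep(c("A", "B", "C"), each = 2))
svl  <- round(runif(6, 40, 60), 1)
mass <- rnorm(6)                 # dummy response; only the formula matters here

# pop * svl - 1 - svl expands to pop + pop:svl - 1: three separate
# intercepts (popA, popB, popC) and three separate slopes (popA:svl, ...),
# i.e. the means parameterization of an interactive ANCOVA.
model.matrix(lm(mass ~ pop * svl - 1 - svl))
```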