## Computation

The process of incorporating these spatial structures into the statistical models is computationally very intensive. These equations cannot be numerically solved. Some common computational approaches are to use Markov Chain Monte Carlo (MCMC), which uses a method of random sampling from simulated probability distributions, or approximation methods such as integrated nested Laplace approximation (INLA). For the Australian Cancer Atlas, we have used the MCMC approach, primarily because it provides greater information about the uncertainty of the estimates.

There are different forms of MCMC samplers [Roberts, Rosenthal et al. 1998]. The simplest form is a Gibbs sampler, and the diagnosis models used a combination of Gibbs and Metropolis steps (via the Metropolis adjusted Langevin algorithm [Neal. 2011]). The excess death models also used Gibbs samplers. Both models had a burn-in of 50,000 iterations, then monitored another 100,000, keeping every 10th iteration. The diagnosis models were implemented in R using the CARBayes package, and the excess death models were implemented in WinBUGS.

### Assessing convergence

Convergence was assessed through visual examination of trace and density plots for a subset of parameters and areas. The Geweke diagnostic for the modelled SIR/EHR was examined for all areas. Areas with a Geweke p-value <0.01 were flagged and further examined visually.

### Sensitivity analyses

Examining the influence of the priors is an essential part of any Bayesian analysis. Of greatest concern in these models are the priors on the variance component of the prior (i.e. the hyperpriors).

Three alternative inverse-Gamma (shape, scale) formulations of the hyperprior on the variance component were examined for the diagnosis rate models:

- \(\mathcal{IG}\left( 1, 0.01\right)\), the CARBayes default;
- \(\mathcal{IG}\left( 0.1, 0.01\right)\);
- \(\mathcal{IG}\left( 0.5, 0.0005\right)\), as used by Johnson [Johnson. 2004].

Only inverse-Gamma distributions were examined due to restrictions within CARBayes.

For the excess death models, the three hyperprior comparisons were:

- \(\mathcal{IG}\left( 0.1, 0.01\right)\);
- Half-Normal, \(\mathcal{N}\left( 0, 5\right)^+\);
- Half-Normal, \(\mathcal{N}\left( 0, 10\right)^+\).

### Software used

Stata, R, GeoDa and WinBUGS software were used.

Initial data manipulation and calculation of national summary measures was performed using Stata v14.0 (StataCorp, Texas), as follows:

- National age-standardised rate: ‘distrate’ command
- Relative survival: ‘strs’ command

The adjacency matrix was initially created using GeoDa v1.8.16.4 , then manually modified. All other analyses used R v3.4.1 , as follows:

- Bayesian diagnosis models: CARBayes package. Burn-in 50,000 iterations, monitored 100,000 iterations, keeping every 10th.
- Bayesian excess death models: R2WinBUGS. Burn-in 50,000 iterations, monitored 100,000 iterations, keeping every 10th.
- Tango’s MEET: R code obtained from Toshiro Tango

WinBUGS v1.4.3 was interfaced with R to run the Bayesian excess death models.

# R code for small area models

##### Diagnosis model

library(CARBayes) library(spdep) W <- nb2mat(aust, style = "B") # where aust is a neighbourhood structure (i.e. class = nb) file.input = data.frame(obs = file$count, expect = file$expect, id = file$grid) formula <- obs ~ offset(log(expect)) model.ler <- S.CARleroux( formula = formula, data = file.input, family = "poisson", W = W, burnin = 50000, n.sample = 150000, thin = 10)

##### Excess deaths model

library(R2WinBUGS) # Fixed values bugs.dat <- list( N = N, # Number of areas (SA2s) T = T, # Number of risk years N.d = N.d, # Number of data rows (N * T * # of covariates) d = d, # Number of deaths d.star = d.star, # Expected number of deaths due to causes other than cancer of interest y = y, # Person-time at risk offset RiskYear = RiskYear, Area = Area, adj = adj$adj, num = adj$num, cum = c(cumsum(adj$num) - adj$num, sum(adj$num)), sumnum = sum(adj$num), agegp2 = x1, agegp3 = x2, agegp4 = x3) # Initial values inits <- function() {list( alpha = rep(-3, 5), u = rep(0, N), sigma.u2 = 0.2, rho = 0.5, beta = rep(0, 3))} # Parameters to monitor in WinBUGS parameters <- c("alpha", "u", "sigma.u2", "beta", "rho") # Run WinBUGS bugs( data = bugs.dat, inits = inits, parameters.to.save = parameters, model.file = "Leroux.bug", n.chains = 1, n.iter = 150000, n.burnin = 50000, n.thin = 10, debug = FALSE, bugs.directory = "C:/WinBUGS/", program = "WinBUGS", DIC = FALSE)

## WinBUGS code

##### Excess deaths model

model{ for(i in 1:N.d){ d[i] ~ dpois(mu[i]) mu[i] <- d.star[i] + d.excess[i] log(d.excess[i]) <- log(y[i]) + alpha[RiskYear[i]] + beta[1] * agegp2[i] + beta[2] * agegp3[i] + beta[3] * agegp4[i] + u[Area[i]]} # Leroux prior for spatial random effects for(j in 1:N){ u[j] ~ dnorm(mean.u[j], prec.u[j]) A[j] <- (rho * num[j] + 1 - rho) prec.u[j] <- A[j] / sigma.u2 mean.u[j] <- rho * sum(W.u[cum[j] + 1:cum[j+1]]) / A[j]} for(h in 1:sumnum){ W.u[h] <- u[adj[h]]} # Other priors sigma.u2 ~ dnorm(0, 0.2)I(0,) rho ~ dunif(0, 1) for(t in 1:T){ alpha[t] ~ dnorm(0, 0.01)} for(k in 1:3){ beta[k] ~ dnorm(0, 0.01)}}

### References

Johnson GD. Small area mapping of prostate cancer incidence in New York State (USA) using fully Bayesian hierarchical modelling. International Journal of Health Geographics. 2004; 3(1):29.

Neal R. MCMC using Hamiltonian dynamics. 2011. In Brooks S, Gelman A, Jones G, Meng X (Eds). Handbook of Markov Chain Monte Carlo. Chapman & Hall/CRC.

Roberts GO, Rosenthal JS. Optimal scaling of discrete approximations to Langevin diffusions. Journal of the Royal Statistical Society: Series B (Statistical Methodology). 1998; 60(1):255-268.