library(jagshelper)
The goal of jagshelper
is to streamline Bayesian
analysis in JAGS using the jagsUI
package.
Functions are provided for extracting output in a simpler form,
assessing model convergence, and plotting model output. Also included is
a function giving a template model in JAGS syntax with the associated
jagsUI
code.
Some of the jagshelper
functionality is illustrated
below, with steps approximately corresponding to those of a typical
Bayesian analysis using JAGS.
When starting a Bayesian analysis in JAGS from scratch, I can never remember the exact structure for doing so, and sometimes need reminding of basic JAGS model syntax.
The skeleton()
function prints a JAGS model template to
the screen, along with code to simulate a corresponding dataset.
library(jagshelper)
skeleton("EXAMPLE")
#>
#> library(jagsUI)
#>
#> # specify model, which is written to an external file
#> cat('model {
#> for(i in 1:n) {
#> y[i] ~ dnorm(mu[i], tau)
#> mu[i] <- b0 + b1*x[i] + a[grp[i]]
#> }
#>
#> for(j in 1:ngrp) {
#> a[j] ~ dnorm(0, tau_a)
#> }
#>
#> tau <- pow(sig, -2)
#> sig ~ dunif(0, 10)
#> b0 ~ dnorm(0, 0.001)
#> b1 ~ dnorm(0, 0.001)
#>
#> tau_a <- pow(sig_a, -2)
#> sig_a ~ dunif(0, 10)
#> }', file="EXAMPLE_jags")
#>
#>
#> # simulate data to go with the example model
#> n <- 60
#> x <- rnorm(n, sd=3)
#> grp <- sample(1:3, n, replace=T)
#> y <- rnorm(n, mean=grp-x)
#>
#> # bundle data to pass into JAGS
#> EXAMPLE_data <- list(x=x,
#> y=y,
#> n=length(x),
#> grp=as.numeric(as.factor(grp)),
#> ngrp=length(unique(grp)))
#>
#> # JAGS controls
#> niter <- 10000
#> ncores <- 3
#>
#> {
#> tstart <- Sys.time()
#> print(tstart)
#> EXAMPLE_jags_out <- jagsUI::jags(model.file="EXAMPLE_jags", data=EXAMPLE_data,
#> parameters.to.save=c("b0","b1","sig","a","sig_a"),
#> n.chains=ncores, parallel=T, n.iter=niter,
#> n.burnin=niter/2, n.thin=niter/2000)
#> print(Sys.time() - tstart)
#> }
Having run a model for the first time, it can be useful to see how many parameter nodes have been saved in total, or how many nodes exist for each named parameter. This can aid in deciding the appropriate strategy for assessing convergence: for example, whether trace plots can be feasibly assessed for all parameter nodes in the model, or just a subset.
In the example below, there are relatively few parameters saved and all trace plots can be examined.
nparam(asdf_jags_out) # how many parameters in total
#> [1] 8
nbyname(asdf_jags_out) # how many parameters (or dimensions) per parameter name
#> $b0
#> [1] 1
#>
#> $b1
#> [1] 1
#>
#> $sig
#> [1] 1
#>
#> $a
#> [1] 3
#>
#> $sig_a
#> [1] 1
#>
#> $deviance
#> [1] 1
tracedens_jags(asdf_jags_out, parmfrow=c(3,3)) # trace plots for all parameters
check_Rhat(asdf_jags_out) # proportion of Rhats below a threshold of 1.1
#> b0 b1 sig a sig_a deviance
#> 1 1 1 1 1 1
$Rhat # Rhat values
asdf_jags_out#> $b0
#> [1] 0.999954
#>
#> $b1
#> [1] 1.00248
#>
#> $sig
#> [1] 1.00394
#>
#> $a
#> [1] 1.0002087 0.9995671 1.0000089
#>
#> $sig_a
#> [1] 1.000986
#>
#> $deviance
#> [1] 1.001197
In the example below, there are many more parameters saved, and it is
perhaps more illustrative to examine the trace plots associated with the
least- converged parameter nodes, as measured by Rhat
value
(Gelman & Rubin 1992).
nparam(SS_out) # how many parameters in total
#> [1] 334
nbyname(SS_out) # how many parameters (or dimensions) per parameter name
#> $trend
#> [1] 41
#>
#> $rate
#> [1] 41
#>
#> $ypp
#> [1] 41
#>
#> $fit
#> [1] 41
#>
#> $sig_eps
#> [1] 1
#>
#> $sig_xi
#> [1] 1
#>
#> $sig_omega
#> [1] 2
#>
#> $cycle
#> [1] 41
#>
#> $cycle_s
#> [1] 41 2
#>
#> $ar1
#> [1] 41
#>
#> $phi
#> [1] 1
#>
#> $deviance
#> [1] 1
traceworstRhat(SS_out, parmfrow=c(3,2)) # trace plots for least-converged nodes
check_Rhat(SS_out) # proportion of Rhats below a threshold of 1.1
#> trend rate ypp fit sig_eps sig_xi sig_omega cycle
#> 1.0000000 1.0000000 1.0000000 0.9268293 1.0000000 1.0000000 0.5000000 0.8780488
#> cycle_s ar1 phi deviance
#> 0.5121951 0.7692308 1.0000000 1.0000000
plotRhats(SS_out) # plotting Rhat values
The function check_neff()
behaves very similarly to
check_Rhat()
, but makes comparisons based on
n.eff
(a crude measure of effective sample size) rather
than Gelman-Rubin convergence diagnostic Rhat
.
The functions traceworstRhat()
and
check_Rhat()
both contain an optional n.eff=
argument. When set to TRUE
, the functions will compare or
plot based on the value of n.eff
rather than
Rhat
.
The function tracedens_jags()
is likely to be the
most useful and concise trace plot version. However, if only line trace
plots or by-chain kernel density plots are desired (rather than both),
older versions trace_jags()
and
chaindens_jags()
are preserved.
The function pairstrace_jags()
gives methods for
plotting two-dimensional trace plots, scatter plots, or contour plots,
in which each possible pairing of parameter nodes are plotted with
respect to one another. In addition to convergence, this may provide a
graphical check for correlation between parameter nodes, or problematic
posterior surface shapes. An example is shown below.
pairstrace_jags(asdf_jags_out, p=c("a","sig_a"), points=TRUE, parmfrow=c(3,2))
cor_jags()
and
plotcor_jags()
respectively return and plot correlation
matrices of all or a subset of parameters, which may be useful in
directly assessing correlation between parameters. In the case of
multiple nodes per parameter (as in a vector or array of nodes), the
full collection of nodes per parameter name is given one axis tick. This
is intended to reduce graphical clutter as well as giving greater visual
weight to single parameters. An example is given below.plotcor_jags(SS_out, p=c("trend","rate","sig"))