Friday, March 29, 2013

First steps into Bayesian method for confirmatory path analysis: changing WinBUGS implementation to Jags

After a reasonably productive week where my understanding of some issues has improved a lot, I find I've got that Friday feeling...I'm thinking slowly and can't figure out simple problems (such as how to do a simple Poisson regression model in BUGS with my pine data). 

Here, then, is a small part of my work this week: converting model and code intended for WinBUGS into something that'll run in Jags. And hopefully, getting the same answer. I'm at the very start of really understanding how this method for d-sep confirmatory path analysis works, so, baby steps!

Required reading: 

Shipley, B. 2009. Confirmatory path analysis in a generalized multilevel context. Ecology 90:363–368. http://dx.doi.org/10.1890/08-1034.1

Clough, Y. 2012. A generalized approach to modeling and estimating indirect effects in ecology. Ecology 93:1809–1815. http://dx.doi.org/10.1890/11-1899.1

Some of the following code and all the data to run the example is available online in the supplemental information in Clough (2012). I have modified the WinBUGS section to include R scripts that specify and write the model file (although the appropriate model file is also supplied in the supplemental information of the article).

WinBUGS version:


# Load library
  library(R2WinBUGS)

#Set working directory
  setwd("C:\\Users\\kirsty.mcgregor\\wd")

# Data preparation
data.plot=read.table("data.plot.txt",h=T,dec=".")
data.subplot=read.table("data.subplot.txt",h=T,dec=".")
data.tree=read.table("data.tree.txt",h=T,dec=".")

# Defining tree-level variable vectors
Pc=data.tree$Pc
Hs=data.tree$Hs
Cc=data.tree$Cc
Npodh=data.tree$npodh
subplot=data.tree$subplot

# Defining subplot-level variable vectors
Nfert=data.subplot$Nfert
plot=data.subplot$plot

# Defining plot-level variable vectors
temp=data.plot$temp
age=data.plot$age

# Number of observations at each level
n.tree=430
n.subplot=86
n.plot=43

# Specify the model in BUGS language
mod1 <- " model{

for (i in 1:n.plot){

temp[i] ~ dnorm(temp.hat[i],tau)
temp.hat[i] <- a.temp+b.age_temp*age[i]

}

tau <- 1/sigma2
sigma2 <- pow(sigma,2)
sigma ~ dunif(0, 100)
a.temp ~ dnorm(0,1.0E-6)
b.age_temp ~ dnorm(0,1.0E-6)
}"

# Write model to working directory
  write(mod1, "claim1_bugs.txt")



# Model to test independence claim 1
m1.data <-list("n.plot","temp","age")

# Defining the initial values for the MCMC chains
m1.inits <- function (){list(
a.temp=runif(1,-1,1),
b.age_temp=runif(1,-1,1),
sigma=runif(1,1,2))
}

# Defining the parameters for which I want to save the posterior samples
m1.parameters <- c(names(m1.inits()))


m1 <- bugs (m1.data, m1.inits, m1.parameters,  n.chains=3, 
 "claim1_bugs.txt", bugs.directory="c:/Programme/WinBUGS14",
 working.directory=getwd(), clearWD=FALSE, n.iter=50000, n.sims=1000,codaPkg=F,debug=TRUE)

print(m1)

## Define the function to get the largest CI that does not include zero

fun.CI=function(x){
w=sum(x<0)
w2=(1002-(abs(501-w)*2))/1002
return(w2)
}

# Apply the function to the posterior distribution of b.age_temp
  fun.CI(m1$sims.list$b.age_temp)

# This shows the independence claim 1 to be substantiated.


And here is the version I wrote in Jags. Note that the specification of sigma is now sigma2 <-sigma^2:


# Load libraries
  library(rjags)
  library(R2jags)
  load.module("glm")          # loads the glm module for JAGS 
                              # which may help with convergence
  library(mcmcplots)


# Specify the model in BUGS language
mod1 <- " model{

for (i in 1:n.plot){

temp[i]~dnorm(temp.hat[i],tau)
temp.hat[i]<-a.temp+b.age_temp*age[i]
}

tau <-1/sigma2
sigma2 <-sigma^2
sigma ~ dunif(0, 100)
a.temp~dnorm(0,1.0E-6)
b.age_temp~dnorm(0,1.0E-6)

}"

# Write model to working directory
  write(mod1, "claim1_bugs.txt")



# Model to test independence claim 1
m1.inits <- function (){list(
a.temp=runif(1,-1,1),
b.age_temp=runif(1,-1,1),
sigma=runif(1,1,2))
}

# Run model in jags
  set.seed(rnbinom(1, mu=200, size=1))
  mod1 <- jags(model = "claim1_bugs.txt",
              data = list("n.plot","temp","age"),
              inits = m1.inits, 
              param = c(names(m1.inits())),
              n.chains = 3,
              n.iter =50000,
              n.burnin = 25000,
 n.thin=75)
  mod1

# Put output into mcmc form and plot chains and view
  out <- as.mcmc(mod1)
  mcmcplot(out, parms=c(names(m1.inits())))

# Define the function to get the largest CI that does not include
# zero
  fun.CI=function(x){
  w=sum(x<0)
  w2=(1002-(abs(501-w)*2))/1002
  return(w2)
  }

# Summary 
  all.sum <- mod1$BUGSoutput$summary
  all.sum

# Apply the function to the posterior distribution of b.age_temp
  fun.CI(mod1$BUGSoutput$sims.list$b.age_temp)

It seems to work for me and produce a similar result...but is it right?!

Wednesday, March 27, 2013

R2jags and rjags for Bayesian modelling in R

Directly after handing in my PhD I did what most people probably do: have a little rest from data analysis. The 'rest' continued into the first few months of my postdoc, because my main focus was on data collection to expand the database I'll be working on.

Consequently, the world of Bayesian modelling using R and BUGS has moved on without me. Or, perhaps I was always behind the times! Either way, I now find myself free from data collection and getting back into modelling. 

Previously, I was using BRugs to run models in OpenBUGS from R. My PhD supervisor, Richard Duncan, encouraged me to switch over to using R2jags as the R package of choice. The other package I've looked at is rjags. Switching my code over was relatively straightforward and R2jags seems to run models more quickly than through BRugs. If anyone is using R2WinBUGS or BRugs as the R interface for BUGS, I would encourage you to try switching to a jags package.

The other good package I've discovered is mcmcplots, which has a function that plots the model output in an HTML format using your web browser to display it. This is more convenient than dealing with plots in R itself.