Saturday, September 22, 2007

Auto OpenBUGS fitting and checking for convergence

Fitting a BUGS model can be frustrating. Sometimes it takes thousands of iterations to get a model converged. Reparametrize your parameters can speed up the converging. Professor Andrew Gelman has a paper talking about this issue.

I am lucky enough to have both training on fitting BUGS models using R2WinBUGS and WinBUGS/OpenBUGS. I like to use R2WinBUGS when the model is easy to converge. I have the summary statistics and convergency statistics ready to use when using R2WinBUGS.

However, if the model converges slowly (and you don't want to do the reparametrization), R2WinBUGS is not a good choice. Because R2WinBUGS does not save the last state of the MCMC chains. The last.values slot saves the last values of the saved parameters. The parameter-to-save is not necessarily the parameters that needs initial values for priors. So this mean when you figure out the model does not covergerged, you might have to re-run the MCMC chains from the low density area again based on the newly genereated priors. On the other hands, you can always stop and check and start again when you fit BUGS models using WinBUGS/OpenBUGS. But the checking procedure is time-consuming.

One alternative is to fit BUGS models using BRugs. BRugs is an R package that links to OpenBUGS. BRugs can save the last state of the parameters that needs priors. After you fit a
OpenBUGS model, just type:

modelSaveState("*")

You will have inital values saved as the txt files in your working directory. The other nice feature of BRugs is that if you find the model does not converged, you just type:

modelUpdate(numUpdates=1000)

You can restart the BUGS model from the last state and update for another 1000 iterations.

But...But...But...You still need to save the coda files:

samplesCoda("*")

Read in the code files. The file name has to be save as "CODAchain1.txt" and "CODAindex.txt".

output <- read.openbugs(stem=working.directory)

And do the convergence check.

gelman.diag(output)
geweke.diag(output)

This is annoying and time-consuming. So I wrote a function that fits BUGS models using BRugs and checks the converegence. If the model does not converged, it will update another 1000 (default) iterations. The loop of fitting and updating will stop only when either the model is converged or n.sims: (how many updating) is reached. The function uses codes from openbugs() in R2WinBUGS and modelUpdate() in BRugs. I uses Gelman and Rubin convergence statistics: Rhat to decide whether the model is converged or not.

The function is here.

Last words: If the model converges slowly, you might want to think about doing reparametrization first!!

34 comments:

Anonymous said...

Hi, thanks, that was very helpfull. Just to be sure though, I am kind of new at this.

What is the order of specifying "modelSaveState" in your code.

modelCheck("model.txt")
modelData("data.txt")
modelCompile(numChains=2)
modelInits("set1.txt", chainNum=1)
modelInits("set2.txt", chainNum=2)
??modelSaveState("*")
modelUpdate(5000)
samplesSet(c("A","B"))
modelUpdate(15000)

...if you then find out that your model hasnt converged you can run your model again by
modelUpdate(15000)
...it will then automatically use the states you have saved earlier?

Thanks again.
Frans

Yu-Sung Su said...

Hi Frans,

If you do not quit R, you don't need to do modelSaveState("*"). modelUpdate() will use the state of the previous update.

You only use modelSaveState("*") when you want to quit R and run your model again later. In this case, you load the saved inits.txt files and contiune the update.

Anonymous said...

Hi again!
I have a question from somewhat of a different order, but since you are the only one I have asked a question to on the web (and that only time was successful), I thought I'd start here (if you have a better suggestion to put questions like these, please let me know). I am trying to build a model in Winbugs and everything goes fine until the generation of initials. I think it is a very simple model, but as soon as I try to generarte inits, it gives me the very uninformative message:
"error for node T[1,1] of type GraphBern.Node first argument must be a proportion"

model{
for( j in 1 : 4) {
for( i in 1 : N ) {
T[i , j] ~ dbern(Tp[i , j])
Tp[i , j] <- SE[j] * D[i] + (1 - SP[j]) * (1 - D[i])
}
}
for( i in 1 : N ) {
D[i] ~ dbern(P)
}
for( j in 1 : 4 ) {
SE[j] ~ dnorm(0.0, 1.0E-6)
SP[j] ~ dnorm(0.0, 1.0E-6)
}
P ~ dbeta(1, 1)
}

the data looks like this:
list(N=66,T= structure(.Data= c(1,1,0,1....), .Dim=c(66,4)))

In case I use a proportion for the first value of T (which doesn't make sence, cause this should always be either 0 or 1) then the following error message shows up:
"error for node T[1,1] of type GraphBern.Node first argument must be integer valued"

These messages are completely useless and makes you go in circles, very frustrating. Maybe you have either an answer or an idea where to post a question like this.

Thanks again in advance!
Frans

Yu-Sung Su said...

Hi Frank,

I agree with you that error messages of BUGS is often useless. But this time, it is actually very informative. The parameter for dbern(P) is a proportion. So "Tp" should be value between 0 and 1 in your model. But your the linear combination of your SE, SP and D yields value over 1 for Tp.

So if you change both SE[j] and SP[j] ~ dunif(0,1). I believe the model can run. However, since I don't know the context of the model, so I don't know if it is right or not.

BTW, there is a mailinglist for BUGS. You can post your question there. And I am happy to help you, too.

Frans said...

I see what you mean, this was actually an informative message, though a bit cryptic (for me). I needed non-informative priors for the sensitivity and specificity of a clinical observation; a value between 0 and 1. I think dunif is fine for that as well. Thanks for helping again and the suggested mailing list. However, I don't think it will be my first choice next time, since the succesrate on your blog is 2/2.

Frans said...

I have actually now used a beta distribution, posteriors are very much alike compared to unif.

Anonymous said...

Hi Yu-Sung Su
This time a question on inclusion of 2 factors as random effects in a regression model on the prevalence of a disease (DD) that is observed (ObsDD) in several counties(50), by several doctors (4). The doctors cover several counties each. I would have included doctor as a fixed effect, but I would like to know the variation caused be doctor. In the way that I can describe the probability of a person having DD, including the uncertainty caused by both doctor and county.

model{
for( j in 1 : N ) {
ObsDD[j] ~ dbern(PopDD[j])
PopDD[j] <- PDD[j]*SE + (1 - PDD[j])*(1 - SP)
logit(PDD[j]) <- beta0DD + RF[j] * betaRF + cDD[COUNTY[j]] + dDD[DOC[j]]
}
for( k in 1 : M ) {
cDD[k]~dnorm(0,taucDD)
}
for( l in 1 : L ) {
bDD[l]~dnorm(0,taudDD)
}

betaRF~dnorm(0.001,0.001)
beta0DD~dnorm(0.001, 0.001)
tauDD~dgamma(0.001,0.001)
tauDDt~dgamma(0.001,0.001)
sigmacDD <- 1 / sqrt(taucDD)
sigmadDD <- 1 / sqrt(taudDD)
}

The model runs fine like this, it only seems to have a hard time estimating the variance components. I am not sure whether the clustering is represented well; county within doctor.

In GENMOD in SAS I specified it like:
repeated subject=COUNTY(DOC)/type=exch;

Can you tell me whether I am including DOC and COUNTY as random effects and whether this is the correct way to estimate their variance components?
Thanks again for your help.
Frans

Anonymous said...

just a detail:

tauDD~dgamma(0.001,0.001)
tauDDt~dgamma(0.001,0.001)

should be

taucDD~dgamma(0.001,0.001)
taudDD~dgamma(0.001,0.001)

Frans

Yu-Sung Su said...

model{
for( j in 1 : N ) {
ObsDD[j] ~ dbern(PopDD[j])
PopDD[j] <- PDD[j]*SE + (1 - PDD[j])*(1 - SP)
PDD[j] <- max(0, min(pDD[j], 1))
logit(pDD[j]) <- beta0DD + RF[j] * betaRF + cDD[COUNTY[j]] + dDD[DOC[j]]
}
for( k in 1 : M ) {
cDD[k] ~ dnorm(0,taucDD)
}
for( l in 1 : L ) {
bDD[l] ~ dnorm(0,taudDD)
}
beta0DD ~ dnorm(0, 0.001)
betaRF ~ dnorm(0, 0.001)
taucDD ~ dgamma(0.001,0.001)
taudDD ~ dgamma(0.001,0.001)
sigmacDD <- 1 / sqrt(taucDD)
sigmadDD <- 1 / sqrt(taudDD)
}

Hi Fran,

1. Yeah, it is a random effect model now. Or varying intercept model (see Gelman and Hill 2007)

2. The priors for beta's are too informative. You should use flat ones. I know this will blow the model. So the trick is to have a line that constrain PDD.

3. You might want to rescale your predictors if they are continous. Recale them by substracting means and divided by 2 sd's. This help model converges.

Yu-Sung

Frans said...

Hi Yu-Sung
Thanks for your answer.
My covariates are all dummys, not a lot of rescaling possible there.
I am not sure whether this restrain is effective, since logit(pDD) would never itself exceed 1 or get below 0, regardless of how high or low the beta estimates get. However, it gave me an idea to limit the beta's, since some of them (the model is a lot larger then displayed above) converge or seem to converge to extreme values:+20 e.g. I think I have read about that being done. Or else give them uniform prior from -3 to +3 e.g.
Could a conclusion else be that this model with tho random effects does not converge and that you can drop DOC and consider there variation to be included in county?
Thanks,
Frans

Yu-Sung Su said...

Hi Fran,

1. The constraint I used is right because BUGS will overshoot the value of pDD even though it should be between 0 and 1. So using the contraint is a way to prevent BUGS from crash and allow betas to go freely.

2. So no, you cannot constrain your betas unless you can justify that.

3. Another 'no'. If the model does not coverge, it just means you have to run it longer or reparametrize your model (See Gelman and Hill 2007).

Frans said...

Very clear, thanks!

Frans said...

Hi Yu-Sung,
A short question for your opinion. I have had some discussion about flat priors on the logit scale (as I have done), which result in priors with 2 peaks on the P-scale around 1 and 0. That should be avoided, since the model could get stuck in all 1 or all 0 solutions. However, in case the model clearly does not get stuck and converges fine, should you then worry about using dnorm(0,0.001) for your intercept? The example "Seeds" in Winbugs clearly does not worry about it. An option could be to draw the std of dnorm from a uniform distribution with the relevant range. That however, results in different results for the seeds example: at least a much larger value for the std of alpha 0. Any views or comments?
Jehan

Yu-Sung Su said...

Hi Frans,

dnorm(0,.001) and unif(-100, 100) should be very similar to each other. The sensitivity of priors is depended on your N. Also, if you really want a better prior, please check this paper.

http://www.stat.columbia.edu/~gelman/research/unpublished/priors7.pdf

Anonymous said...

Hi Yu-Sung
That was a very informative paper.
Some follow up questions
1)Since my N is very large and I don't have any indication of separation, a flat dnorm can be used for the intercept as well as all coefficients? Isn't that also what is done in the fourth collumn of figure 2 on page 9?
2)Since I dont have any separation, is there another downside / risk in using dnorm as priors?
3)In winbugs it does not seem possible to specify a dcauchy. I am interested to see what the results are, is there a way for me to do that? dt is as dcauchy with s=1 and df=1, could a modified dt be used? are a specified dnorm?
4)If so, I can do that without centering my inputs (all dummies), as you have also done for simplicity in the example on page 12?

Basically I need to know whether my analysis can be justified the way I did it. I hope to address the aspects of the priors or even trying some different methods.
I hope you can answer the questions above.
Thanks again
Frans

Yu-Sung Su said...

Hi Frans,
(1) Yes, a flat Normal prior is reasonable.

(2) Priors help resolve the separation problem through data augmentation. So any reasonable prior is ok.

(3) t-dist with df=1 is equivalent to cauchy, with df=30 is equivalent to Normal.

(4) No, you don't have to center your variables.

I don't think you have to change anything here. The code is cosher now. I would complain about using gamma for the precisions. But that's different story. Professor Gelman has a paper that discusses this. But most of time, gamma is a cosher and is a common, conjugate prior to use for the precisions.

Anonymous said...

Thanks for your help. I think I am ready to call this my final model and start writing my paper.
Maybe you have come across this report from a colleague of mine.

http://gbi.agrsci.dk/%7Eejo/publications/preprints/2003.03.i.pdf

Just for your information/interest.

Frans said...

Hi Yu-Sung,
just when you thought you got rid of me ;-). I have a practical case which goes back to the first discussion I saw on your blog:
1) I have been doing an analysis on 5000 people in 50 counties (which took extremely long!), and now I want to analyze another new county and use the information from the first 50 counties. I do not want to analyze all 51 counties in order to get an estimate on my 51st county. Is this where modelSaveState("*") can be applied?
-after the big analysis of my 50 counties, I save all states.
-check my model again with the original non-informative priors (?)
-load the data (just of this 1 extra county)
-load inits (which are the saved states for each chain)
Is that the way to go? Because I have been trying to use samples from the posterior distributions from the analysis of the 50 counties, to base the priors on for the analysis of this 51st county. It resulted in the same mean and median estimates for the 51st county, but the sd's were different as compared to analyzing all 51 counties together. I guess specifying a prior distribution based on samplesStats from your posterior distribution is hard unless they are totally normally distributed.

Just a practical problem
2) modelSaveState("*"). Whenever I give this command, all R says is "chains written out". I cannot find a text file anywhere in my working directory. Is a command missing? The manual informes poorly on this function.

I hope you can confirm the first question and see what the problem is in the second.
Thanks again Yu-Sung!
Frans

Yu-Sung Su said...

Yes, modelSaveState() is the way to go. And the parameter inside the function should be a stem. So it is modelSaveState("c:/"). You will find 1.txt, 2.txt or something similar to that under "c:/".

Anonymous said...

Hi Yu-Sung
you gave me the link to that article on prior distributions, the one that was not published. Can I refer to that as un-published and use the link you gave me in my reference list? or is it submitted somewhere now?

Yu-Sung Su said...

The paper has been submitted for publication. However, we still don't know the publication information. So in the meantime, you can use the link I gave to use as the reference.

Anonymous said...

Hi Yu-Sung
I don't know whether you have read the article of which I gave you the link, but wouldn't you expect that the way parameters are given priors in Gelman's article (dcauchy) also result in a non-uniform distribution on the p-scale? with a peak at both 0 and 1? My analysis runs fine, no separation what so ever (which should be an argument on its own) but I was hoping to use Gelman's paper as an argument that a non-uniform distribution on the p-scale is not a problem (since I expect that their cauchy results in a non-uniform distribution for p). Do you think this is a justified argument?

It is somewhat of a discussion, in case you didnt get to read that article, don't use any time you can't lose.
Frans,

Yu-Sung Su said...

Our argument is that with proper rescaling, the coefficient of a logit model should be t-distributed with mean=0, scale=2.5. This is an informative empirical prior. In any event that you want to use an uninformative one, just set df=30 or bigger, which is mathematically equivalent to a flat normal prior.

If you have further question about whether your informative priors, you can also email Professor Gelman. He would be happy to answer you, too.

shali623 said...

Hi Yu-Sung,

I am new to WinBUGS/OpenBUGS and learning to use it on my own. To me, WinBUGS/OpenBUGs is just like a black box and oftentimes I don't know how to fix the problem shown in the error message.

Right now, I am trying to see if the parameters of interest in my model change over time and my model is defined as follows:

model {
for(i in 1:k) {
y[i] ~ dbern (p[i])
logit(p[i]) <- beta0[time[i]] + beta1[time[i]] * x2[i] + beta2[time[i]] * x3[i] + b[country[i]]
}

for (i in 1:156) {
b[i] ~ dnorm (0, tau)
}
# Priors
for (j in 1:55) {
beta0[j] ~ dnorm (0.0, 1.0E-6)
beta1[j] ~ dnorm (0.0, 1.0E-6)
beta2[j] ~ dnorm (0.0, 1.0E-6)
}
tau ~ dgamma (0.01, 0.01)
}


I first tried to run the above model using R2Winbugs and then within Winbugs, but in either case I got the trap 66 message. So I rewrote the model in an inverse logit format instead of a logit function (i.e., everything else the same, I write p[i] <- 1/(1+ exp(-(beta0[time[i]] + beta1[time[i]] * x2[i] + beta2[time[i]] * x3[i] + b[country[i]]))) and ran it in Winbugs. This time I got the error message of undefined results. Later I ran this same model written in the inverse logit format in OpenBUGS, it worked. However, when I tried to run the original model (i.e., the logit one),OpenBUGS gave me the error message:"sorry something went wrong in procedure Updater. Sample in module UpdaterRejection." I really feel confused because according to Winbugs/OpenBUGS, a logit model is supposed to be written as the first one.

Do you know what is going wrong? I don't know if I can trust the results I got by using the inverse logit function in OpenBUGS and conduct further analysis based on that. Because it seems to me the results are so sensitive.

If you need the data and initial values to figure out the prolem, please let me know.

Many thanks for your kind help.

Shali

Yu-Sung Su said...

Hi Shali,

This is a well known problem in WinBUGS. The sampler for the logit model sometimes samples p out of the 0-1 bounds. So in the code, try this:

y[i] ~ dbern (p.bound[i])
p.bound[i] <- max(0, min(1, p[i]))

This should be able to pass the problem you are encountering now!

shali623 said...

Hi Yu-Sung,

Thanks for your reply. Yes, after I revise the model as you suggested, I can run it in WinBUGS now.

But it seems that I am having some new problems. After 200,000 simulations, the model still does not converge. Previously when I wrote the model as an inverse logit function, the majority of the parameters converged pretty quickly.

I also checked the autocorrelation plots which showed high correlation for all the parameters. Even if I chose thin to be 100, the high correlation between samples did not decline.

I am not sure if the nonconvergence and high autocorrelation have something to do with the way the samples are drawn. Do you have any suggestion as to how to fix those problems?

Thanks again for your help. Really appreciate it!

Shali

Yu-Sung Su said...

Try to rescale the predictors by standardizing it. This will help convergence.

student said...

Hi,

could someone help me with this one?
I made a model where I would like to test the influence of temperature on the number of flowers that open and I wanted tot add a random factor(area). The model is correct but when I want to update I get an error "Sample in module UpdaterRejection". What does it mean, and what do I need to do?
model{

#specificatie model (likelihood gedeelte)
for (i in 1:625){
numb[i]~dbin(p[i],tot[i])
p.bound[i]<-max(0,min(100, p[i]))
logit(p[i])<-beta_0+beta_1*temp[i]+beta_2[area[i]]+beta_3[area[i]]*(temp[i])
}

#random gedeelte

for(j in 1:25){
beta_2[j]~dnorm(0,tau_alfa)
beta_3[j]~dnorm(0,tau_inter)
}

#prior (weak prior)
beta_0~dnorm(0,1.0E-9)
beta_1~dnorm(0,1.0E-9)

sd~dunif(0.01,1000)
var<-pow(sd,2)
tau<-1/var
sd_alfa~dunif(0.01,1000)
var_alfa<-pow(sd_alfa,2)
tau_alfa<-1/var_alfa
sd_inter~dunif(0.01,1000)
var_inter<-pow(sd_inter,2)
tau_inter<-1/var_inter
}

Dani said...

Hi Yu-Sung Su,

Thanks very much for providing this code. I am working on a Linux machine with OpenBUGS and BRugs only (I don't have WinBUGS or WINE, and I don't have admin privileges to install them), and I was wondering if there's a way to modify your code for my environment. I think the only WinBUGS function you use is as.bugs.array. Do you know if there's an equivalent in the BRugs package, which I can easily substitute? I am particularly concerned about R-hat -- I don't see it in the BRugs output, and your syntax depends on its value.

Many thanks,
Dani

Unknown said...

Greetings! I was hoping you could assist me with a message in Wbugs when I'm trying to compile the data...the message states "made use of undefined node beta0". I'm trying to develop a spatiotemporal model to predict future cases of West Nile Virus using prior cases, temperature and precipitation as predictors...

My Bugs code is the following:

model
{
for(i in 1:n){
for(j in 1:8){
tempstar[i,j] <- (temp[i,j] - mean(temp[,j]))/sd(temp[,j])
mosqstar[i,j] <- (mosq[i,j] - mean(mosq[,j]))/sd(mosq[,j])
y[i,j] ~ dpois(mu[i,j])
log(mu[i,j]) <- beta0 + b[i] + beta1 * mosqstar[i,j] + beta2 * prec[i,j] + beta3 * tempstar[i,j] + beta4 * equals(j,1) + beta5 * equals(j,2) + beta6 * equals(j,3) + beta7 * equals(j,4) + beta8 * equals(j,5) + beta9 * equals(j,6) + beta10 * equals(j,7)

}
}
b[1:n] ~ car.normal(adj[], weights[], num[], taub)
for(k in 1:sumNumNeigh){
weights[k] <- 1
}
}
beta0 ~ dnorm(0,0.01)
beta1 ~ dnorm(0,0.01)
beta2 ~ dnorm(0,0.01)
beta3 ~ dnorm(0,0.01)
beta4 ~ dnorm(0,0.01)
beta5 ~ dnorm(0,0.01)
beta6 ~ dnorm(0,0.01)
beta7 ~ dnorm(0,0.01)
beta8 ~ dnorm(0,0.01)
beta9 ~ dnorm(0,0.01)
beta10 ~ dnorm(0,0.01)
taub ~ dgamma(0.5,0.0005)
}


Any assistance is greatly appreciated. Thanks in advance.

Wu Zhang said...

Hi, Yu-Sung,

It is nice to meet you here; I come across this error message :

****** Sorry something went wrong in procedure Sqrt in module Math ******

The code I use is following:
model
{
for (i in 1 : regions) {
y[i]<- ma[i]/100
y[i]~dlogis(mu[i],1)
logit(mu[i]) <- beta0 +beta1*log(a[i]) + beta2* log(b[i]) +beta3*d[i] + c[i]
}
c[1:regions] ~ car.normal(adj[], weights[], num[], tau.c)

beta0 ~ dflat()
beta1 ~ dflat()
beta2 ~ dflat()
beta3 ~ dflat()
zhang ~ dgamma(2,1.0E+1)
tau.c<-1/zhang

}

Thank you very much!

Wu Zhang

Dileka P said...

Hi, can you please help me to identify this error, 'Sorry something went wrong in procedure LUDecomp in module MathMatrix'

Dileka P said...

Hi again, this is the model related to my previous comment,
model
{

for (i in 1:N) {
LRF[i] ~ dnorm(mu[i], nugget.prec)
mu[i] <- beta0 + beta1*X[i] +beta2*Y[i] + beta3*Z[i]+ beta4*XY[i] + beta5*YZ[i] + beta6*ZZ[i] + W[i]
muW[i] <- 0.0
}

nugget.prec~ dgamma(0.10,0.10)
beta0 ~ dnorm(0,0.01)
beta1 ~ dnorm(0,0.01)
beta2 ~ dnorm(0,0.01)
beta3 ~ dnorm(0,0.01)
beta4 ~ dnorm(0,0.01)
beta5 ~ dnorm(0,0.01)
beta6 ~ dnorm(0,0.01)
tausq <- 1/nugget.prec

W[1:N] ~ spatial.exp(muW[ ], X[ ], Y[ ], spatial.prec, phi, 1)

phi ~ dunif(0,150000)

spatial.prec ~ dgamma(0.10, 0.10)
sigmasq <- 1/spatial.prec

}

razaur rahman said...

Hello,
I was trying to build a poisson model in openbugs with the following codes:
model{
for(i in 1:N){

Y[i]~dpois(mu[i])
log(mu[i]) <- b[1]+b[2]*SU_WI[i]+b[3]*VMT[i]
}
for(j in 1:3){
b[j]~dnorm(0,0.001)
}
}
When i run these codes in OpenBUGS, the log file showing the following:
***Sorry something went wrong in procedure Updater.Mode in module UpdaterRejection****
Could you help me to solve this problem?
I need one more help. I need to build a multivariate crash model(3 dependent variables) in openbugs with 5 independent variable where crashes are discrete event (poisson distribution). Can you provide me with some sample codes to build such model?