Sunday, January 14, 2007

Using WinBUGS from Stata

Stata outperforms R in its ability to handle large datasets. So when we want to run WinBUGS through R to deal with a large dataset, we have to reply on Stata to help us get coda statistics. Of course, if you know how to run WinBUGS through other software that can also deal with large datasets, you don't need to learn this. Skip this post.

I will use 8 schools as an example from Andrew Gelman et al. 2004. Bayesian Data Analysis. Here are the steps: (The data, the do file and the BUGS model)


1. Install WinBUGS from Stata package.

Type the following commands in the Stata command windows:

net from http://www.hs.le.ac.uk/research/HCG/winbugsfromstata
net describe winbugsfromstata
net install winbugsfromstata

Type help winbugs in the command windows. If you install the package successfully, you will see the help window of WinBUGS from Stata.

2. Read the data into Stata.

use d:\temp\schools.dta, clear

3. Check summary statistics of the data and some data management.

sum
scalar J=8
gen y=estime


4. Convert stata data to BUGS data

There are 5 commands to do this: wbarray, wbdata, wbscalar, wbstructure, wbvector.

wbdata, c(wbscalar, sc(J) f(%3.1f) + wbvector y sd, f(%3.1f)) noprint saving(d:\temp\data.txt,replace)

(NOTE: There is no line break if you type in the above commands)

You will see a file called data.txt in your working directory.

5. Prepare the initial values for 3 chains

* gen first inits
set seed 339487731
gen theta=invnormal(uniform())
scalar mu=invnormal(uniform())
scalar sigma=uniform()
wbdata, c(wbscalar, sc(mu sigma) f(%3.1f) + wbvector theta, f(%3.1f)) nop sav(d:\temp\inits1.txt, replace)

* gen second inits
set seed 3241341
replace theta=invnormal(uniform())
scalar mu=invnormal(uniform())
scalar sigma=uniform()
wbdata, c(wbscalar, sc(mu sigma) f(%3.1f) + wbvector theta, f(%3.1f)) nop sav(d:\temp\inits2.txt, replace)

* gen third inits
set seed 12341324
replace theta=invnormal(uniform())
scalar mu=invnormal(uniform())
scalar sigma=uniform()
wbdata, c(wbscalar, sc(mu sigma) f(%3.1f) + wbvector theta, f(%3.1f)) nop sav(d:\temp\inits3.txt, replace)

You will have three files contains initial values in your working directory: inits1.txt, inits2.txt, inits3.txt.

6. Prepare a WinBUGS script.

wbscript, sav(d:\temp\script.txt, replace) model(d:\temp\model.txt) data(d:\temp\data.txt) inits(d:\temp\inits.txt) coda(d:\temp\coda) burn(1000) update(5000) chain(3) set(mu sigma theta) dic log(d:\temp\log.txt) quit thin.updater(5) stats(*)

You will have a file called script.txt in your working directory. You may want to add thin.updater(5) stats(*) into your script.txt. Please check script.txt for the right position to put these two commands. Adding these two commands will help you do the analysis in R.

7. Run WinBUGS from Stata.

wbrun, sc(d:\temp\script.txt) winbugs(c:\Program Files\WinBUGS14\winbugs14.exe)

You will have coda1.txt, coda2.txt, coda3.txt and codaIndex.txt in your working directory. You can do the analysis in R using these coda statistics. Or do the simple analysis in Stata.

8. Read in the coda statistics and do the analysis.

wbcoda, root("d:/temp/coda") chain(3) multichain clear
wbstats mu sigma theta_1 theta_2 theta_3 theta_4 theta_5 theta_7 theta_8

There are other commands you can use to do the analysis. Type help winbugs in the command windows for details.

You can convert the coda files into bugs object in R. And do all the post analysis in R.

1 comment:

Taabu said...

Thanks Yu for the steps well articulated. But would you please complete the steps and tell us what to expect. Mine stopped when the Bugs lincence windo popped up. thanks again.