This is a summary of a workshop Robert gave at the Italian Stata Users’ Group in Milan in September 2025.

[Photo: sightseeing in the Galleria Vittorio Emanuele, with its ornate architecture and high glass ceilings]

Stata version 19, released in 2025, introduced additional flexibility in its Bayesian capabilities, and Stata can now fit all the BMA models that we discuss in the book. This is a really important development for Stata users, although in general those who are heavily into Bayes move towards probabilistic programming languages like JAGS, Stan or PyMC, accessed through R, Python, or Julia.

Having said that, if you know Stata (and many people use it for non-Bayesian MA) and are interested in BMA, then you definitely need to look into version 19.

There are three ways to fit a Bayesian model to your data in Stata:

  • the bayes: prefix
  • the bayesmh command
  • the evaluator() or llevaluator() option to bayesmh

For many estimation commands like logit, you can go Bayesian by just putting the bayes: prefix in front. However, there is no bayes: meta capability… yet. Here is a multilevel logistic regression on the Bangladesh contraceptive dataset, which you may know from Stata manuals and examples:

// load data:
webuse bangladesh.dta, clear

// non-Bayesian (maximum likelihood) estimate:
melogit c_use age i.urban || district:

// adaptive Metropolis-Hastings: 
bayes, nchains(2): melogit c_use age i.urban || district:

Throughout this post, we want to show you the syntax, so we will skip most specification of priors, initial values, or RNG seeds, as well as any diagnostic checking, thinning, etc. Needless to say, you should proceed carefully and thoughtfully with your analysis, and not just copy and paste this for serious applications.

bayesmh expects a regression-style model. The first variable named after the command will be the dependent variable (you can also make a multivariate or multivariable (depending on your side of the Atlantic) model by collecting the variables in parentheses), and the others are independent variables. For example:

bayesmh c_use age i.urban U[district], ///
     likelihood(binomial(1)) ///
     prior({c_use:}, normal(0, 100)) ///
     prior({var_U}, igamma(0.01, 0.2)) ///
     nchains(2)

The U[district] requests a random intercept on the districts. Although most uses of bayesmh will involve choosing an off-the-shelf likelihood and priors, you can also create a bespoke likelihood with the llevaluator() option; both the likelihood and the priors can be tailored if you use the evaluator() option. Let’s begin with llevaluator().

You create a program that bayesmh will call with arguments on every iteration of its sampling algorithm. bayesmh sends a tempvar name where it wants the log-likelihood contribution of each observation to be stored; this goes into the argument called lnfj. It also sends, as xb, a tempvar name containing the linear predictor values (including any random effects), and the levels of the random effect (here, U).

Our program must use these, and other useful things that bayesmh puts in global macros, to calculate the log likelihood, and store it in lnfj. Then, we just name the program inside the llevaluator() option.

capture program drop myloglik2
program define myloglik2

  version 19.5
  args lnfj xb U
  
  // retrieve regression coefficients for covariates
  tempname mb
  matrix `mb' = $MH_b
  
  // reconstruct linear predictor manually
  tempvar xb2
  qui gen double `xb2' = $MH_x1 * `=`mb'[1,1]' + ///
                         $MH_x3 * `=`mb'[1,2]' + ///
                         `=`mb'[1,3]' + ///
                         `U' if $MH_touse

  // predicted probability of outcome
  tempvar predprob
  qui gen double `predprob' = invlogit(`xb2')
  
  // log likelihood
  qui replace `lnfj' = $MH_y * ln(`predprob') + ///
                       (1 - $MH_y) * ln(1 - `predprob') ///
                       if $MH_touse

end


bayesmh c_use age i.urban U[district], ///
   llevaluator(myloglik2, reparameters({U})) ///
   prior({U[district]}, normal(0, {var_U})) ///
   prior({c_use:_cons}, normal(0, 16)) ///
   prior({c_use:age}, normal(0, 16)) ///
   prior({c_use:1.urban}, normal(0, 16)) ///
   prior({var_U}, igamma(0.01, 0.2)) block({var_U}) ///
   init({c_use:} 0 {var_U} 1) rseed(19) ///
   dots

You can see above that we also tell llevaluator() that U is a random effect (the reparameters() suboption). Stata’s adaptive M-H algorithm has ways of speeding up sampling when it knows about random effects.

The evaluator() option is similar, but the program must also compute the total log prior density over all parameters and return it in a scalar called lnp.
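To make that concrete, here is a rough, untested sketch of what an evaluator version of the Bernoulli program might look like. The argument list, the way the variance parameter arrives (as a local `var_U', as the table below suggests), and the return mechanism via a scalar lnp are all assumptions drawn from the description above; check [BAYES] bayesmh evaluators for the exact contract before relying on this.

```stata
// SKETCH ONLY -- argument list and return mechanism are assumptions,
// not verified against the [BAYES] manual
capture program drop mylogpost
program define mylogpost

  version 19.5
  args lnfj xb U

  // 1) observation-level log likelihood: exactly as in myloglik2 above
  //    (reconstruct the linear predictor from $MH_b, then write the
  //    Bernoulli log likelihood into `lnfj')

  // 2) total log prior over all parameters, here matching the priors we
  //    specified earlier: normal(0, 16) on each coefficient (variance 16,
  //    so sd 4) and igamma(0.01, 0.2) on the random-effect variance
  tempname mb
  matrix `mb' = $MH_b
  scalar lnp = lnnormalden(`mb'[1,1], 0, 4) + ///
               lnnormalden(`mb'[1,2], 0, 4) + ///
               lnnormalden(`mb'[1,3], 0, 4) + ///
               lnigammaden(0.01, 0.2, `var_U')  // `var_U' assumed passed in

end
```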

Green tea random-effects MA example

Here is bayesmh at work on the green tea meta-analysis. First, run the non-Bayesian version and store useful variables.

import delimited ///
"cochrane_green_tea_weight_loss.csv", clear

// non-Bayesian estimator: 

meta esize n_tea mean_tea sd_tea ///
  n_control mean_control sd_control, ///
  esize(mdiff) random ///
  studylabel(study) eslabel("BMI change (kg/m²)")
  
meta summarize
meta forestplot, name(forest, replace) ///
     nullrefline(lpattern(dash) ///
                 favorsleft("Favours tea  ") ///
                 favorsright("  Favours control"))

// get variables ready for bayesmh:

gen md = _meta_es 
gen se_md = _meta_se
gen se2 = se_md^2
gen id = _n
gen df = n_tea + n_control - 2

Now, as shown in other posts, and in Chapter 4 of the book, we can use a multilevel model in bayesmh to fit the RE meta-analysis with a t-distribution likelihood:

bayesmh md U[id], likelihood(t(se2, df)) ///
          prior({md:_cons}, normal(0,25)) ///
          prior({U}, normal(0,{tau2})) ///
          prior({tau2}, igamma(0.01, 0.2)) ///
          nchains(2) ///
          rseed(1234) ///
          burnin(10000) ///
          mcmcsize(10000) ///
          showreffects ///
          saving("stata_chains.dta", replace) ///
          init1({md:_cons} -0.2 {tau2}  0.5) ///
          init2({md:_cons} 0.2 {tau2} 0.3)

How do we do this with llevaluator()?

capture program drop myloglik
program define myloglik

  version 19.5
  args lnfj xb U

  tokenize "$MH_extravars"
  local df = "`1'"
  local se_md = "`2'"
  
  // retrieve regression coefficients
  tempname mb
  matrix `mb' = $MH_b
  
  // reconstruct linear predictor manually
  tempvar xb2
  qui gen double `xb2' = `=`mb'[1,1]' + `U' ///
     if $MH_touse
  
  // log likelihood
  qui replace `lnfj' = ln(tden(`df', ($MH_y-`xb2') / `se_md')) if $MH_touse

end


bayesmh md U[id], ///
   llevaluator(myloglik, ///
               extravars(df se_md) ///
               reparameters({U})) ///
   prior({U[id]}, normal(0, {var_U})) ///
   prior({md:_cons}, normal(0, 16)) ///
   prior({var_U}, igamma(0.01, 0.2)) block({var_U}) ///
   nchains(2) ///
   rseed(1234) ///
   burnin(10000) ///
   mcmcsize(10000) ///
   showreffects ///
   dots ///
   init1({md:_cons} 0.2 {var_U} 1) ///
   init2({md:_cons} -0.2 {var_U} 0.5) 

One important change from the Bangladesh example is that we use the extravars() suboption to llevaluator(). This allows us to send the names of any variables so they can be accessed inside the program, even if they do not appear in the dependent and independent variable lists.

But here is where it gets interesting: in Stata 19 there are now many other ways to send things in and out of the (ll)evaluator program. Mostly, these appear as specially named global macros. You can see above how some of these macros can be turned into matrices and their elements extracted (as we did with the vector of current proposed parameter values, $MH_b), or tokenized into constituent words that are tempnames and tempvars (as we did with $MH_extravars).

Some of this to-and-fro is documented in the 911-page Stata [BAYES] manual; we discovered the rest by detective work. The table below summarises it as best we can, in two dimensions.

Stata frame, macro, matrix        | bayesmh                 | (ll)evaluator
----------------------------------|-------------------------|------------------------------------------------
depvar                            | depvar                  | $MH_y (contains a varname)
indepvar                          | indepvar                | $MH_x1 (contains a varname)
coefficient (NB: 0b.indepvar too) |                         | $MH_b (1 x p matrix)
                                  | parameters()            | `phi' (for example); also $MH_p
                                  | (calculated internally) | `xb' (tempvar, linear predictor)
varlist                           | extravars()             | $MH_extravars
                                  | reparameters()          | `U' (for example), `var_U' (for example)
                                  | (used internally)       | `lnfj' (tempvar name for observationwise loglik)
                                  | (used internally)       | `lnprior' (scalar name)
Missing values                    | if, in                  | $MH_touse (tempvar)
Macros, loops, matrices           | passthruopts()          | $MH_passthruopts (string)
                                  |                         | $MH_predict_y1, $MH_predict_mu1

Unreported statistics

The fact that we can evaluate individual studies’ contributions to the log likelihood means that they do not all have to share the same likelihood formula. This is handy for dichotomised outcomes, for example (Chapter 14 of the book): studies reporting event counts might have a binomial likelihood, while those reporting means and SDs might have a normal or t likelihood.
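As a sketch of how that branching might look inside an llevaluator program: the indicator dich and the count variables r and n here are hypothetical (they are not in the green tea dataset) and would need to be supplied via extravars().

```stata
// Hypothetical sketch: two likelihood formulas in one evaluator program.
// Assumes extravars(dich r n df se_md), tokenized into the locals
// `dich' `r' `n' `df' `se_md', and `xb2' reconstructed as in the
// programs above (here on the logit scale for the dichotomised studies).
// Studies reporting means and SDs get a t likelihood:
qui replace `lnfj' = ln(tden(`df', ($MH_y - `xb2') / `se_md')) ///
     if $MH_touse & `dich' == 0
// Dichotomised studies get a binomial likelihood
// (binomialp() is the probability mass function):
qui replace `lnfj' = ln(binomialp(`n', `r', invlogit(`xb2'))) ///
     if $MH_touse & `dich' == 1
```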

Missing statistics can also be imputed this way. Here is an example where we pretend that we don’t know the SD from the first green tea study (Kozuma et al), and decide to use some empirical or opinion prior in its place:

/* bayesmh evaluator code for RE MA
   with the Kozuma study's unreported SD */

import delimited "cochrane_green_tea_weight_loss.csv", clear

// visualise the stats:
/*
local kz0 = mean_control[1]
local kz1 = mean_tea[1]

twoway (scatter sd_control mean_control in 2/12, ///
           msymbol(Oh) mcolor(black)) ///
       (scatter sd_tea mean_tea in 2/12, ///
           msymbol(O) mcolor(forest_green)) ///
       , xtitle("Mean BMI change") ///
         ytitle("Standard deviation") ///
	 xline(`kz0', lpattern(dash) ///
                      lcolor(black)) ///
         xline(`kz1', lpattern(dash) ///
                      lcolor(forest_green)) ///
		 xscale(range(-4 1)) ///
		 legend(order(1 "Control" 2 "Tea"))
*/

// non-Bayesian MA:

meta esize n_tea mean_tea sd_tea ///
           n_control mean_control sd_control, ///
   esize(mdiff) ///
   random ///
   studylabel(study) ///
   eslabel("BMI change (kg/m²)")

// get useful variables:

gen md = _meta_es 
gen se_md = _meta_se
gen se2 = se_md^2
gen id = _n
gen df = n_tea + n_control - 2

// bayesmh evaluator model:

capture program drop myloglik
program define myloglik

  version 19.5
  args lnfj xb mvar U

  tokenize "$MH_extravars"
  local df = "`1'"
  local se_md = "`2'"
  
  // retrieve regression coefficients
  tempname mb
  matrix `mb' = $MH_b
  
  // reconstruct linear predictor manually
  tempvar xb2
  qui gen double `xb2' = `=`mb'[1,1]' + `U' ///
     if $MH_touse
  
  local msemd = sqrt((`mvar' / 107) + (`mvar' / 119)) // hard coding alert!
  
  // log likelihood
  qui replace `lnfj' = ln(tden(`df', ($MH_y-`xb2') / `se_md')) ///
       if $MH_touse in 2/12
  qui replace `lnfj' = ln(tden(`df', ($MH_y-`xb2') / `msemd')) ///
       if $MH_touse in 1
end


bayesmh md U[id], ///
   llevaluator(myloglik, parameters({mvar}) ///
                         extravars(df se_md) ///
		         reparameters({U})) ///
   prior({U[id]}, normal(0, {var_U})) ///
   prior({md:_cons}, normal(0, 16)) ///
                     block({md:_cons} {U}) ///
   prior({var_U}, igamma(0.01, 0.2)) block({var_U}) ///
   prior({mvar}, igamma(1, 1)) block({mvar}) ///
   nchains(2) ///
   rseed(1234) ///
   burnin(20000) ///
   mcmcsize(5000) ///
   thin(10) ///
   adaptation(every(300)) ///
   showreffects ///
   dots ///
   init1({md:_cons} 0.2 ///
         {mvar} 0.7 ///
         {var_U} 1 ///
         {U} 0.1) ///
   init2({md:_cons} -0.2 ///
         {mvar} 0.8 ///
         {var_U} 0.5 ///
         {U} -0.1) 
   
bayesstats grubin
bayesgraph diagnostics {md:_cons}, ///
   title("Mean difference") ///
   name(diag_cons, replace)
bayesgraph diagnostics {var_U}, ///
   title("Heterogeneity variance") ///
   name(diag_tau2, replace)
bayesgraph diagnostics {U[1]}, ///
   title("Kozuma random effect") ///
   name(diag_U1, replace)
bayesgraph diagnostics {U[3]}, ///
   title("Nagao random effect") ///
   name(diag_U3, replace)
bayesgraph diagnostics {mvar}, ///
   title("Missing variance") ///
   name(diag_mvar, replace)


   

We specified an extra parameter called {mvar}, the missing variance, then converted it to the standard error of the mean (with some hard-coding of the sample sizes, which is generally ill-advised). You can see in the bayesmh options that we have to use thinning, extra burn-in, and extra adaptation to get the algorithm to behave itself. (In fact, this is needed even just with the Bangladesh dataset above… more on algorithmic fragility below.)

Network meta-analysis

We can also use evaluators to disregard the linear predictor and instead create the classic simple NMA formula (see Chapter 9 of our book and also the excellent specialised book by Dias et al).

/*
  net from https://raw.githubusercontent.com/UCL/network/master/package/
  net install network
*/

network setup d n, studyvar(stud) trtvar(trt)
network map
network meta c

// bayesmh llevaluator model:

use nma_smoking.dta, clear
sort study trt
egen base = min(trt), by(study)

capture program drop myloglik
program define myloglik

  version 19.5
  args lnfj xb U

  // extract extravars
  tokenize "$MH_extravars"
  local v_n = "`1'"
  local v_trt = "`2'"
  local v_base = "`3'"
  
  // extract treatment coefficients
  tempname mb
  matrix `mb' = (0, $MH_b )
  tempvar theta_trt theta_base
  qui gen double `theta_trt' = 0
  qui gen double `theta_base' = 0
  forvalues i=2/4 {
  	// NB hard-coding number of treatments
     qui replace `theta_trt' = `=`mb'[1,`i']' ///
         if `v_trt'==`i'
     qui replace `theta_base' = `=`mb'[1,`i']' ///
         if `v_base'==`i'
  }
  
  // reconstruct NMA predictor manually
  tempvar xb2
  qui gen double `xb2' = `U' + `theta_trt' - `theta_base'
  
  // predicted probability of outcome
  tempvar predprob  
  qui gen double `predprob' = invlogit(`xb2')

  // log likelihood
  // binomialp() is the pmf; binomial() would give the cumulative probability
  qui replace `lnfj' = ln(binomialp(`v_n', $MH_y, `predprob')) if $MH_touse

end


bayesmh d i.trt U[study], nocons /// 
   llevaluator(myloglik, extravars(n trt base) ///
                         reparameters({U})) ///
   prior({U[study]}, normal(0, {var_U})) block({U}) ///
   prior({d:i.trt}, normal(0, 25)) block({d:i.trt}) ///
   prior({var_U}, igamma(0.1, 1)) block({var_U}) ///
   nchains(2) ///
   rseed(1234) ///
   burnin(10000) ///
   mcmcsize(10000) ///
   showreffects ///
   dots ///
   init1({d:} 0.2 {U} -0.3 {var_U} 10) ///
   init2({d:} 0.25 {U} -0.25 {var_U} 11) 
   
bayesstats grubin
bayesgraph diagnostics {d:2.trt}, ///
   title("Treatment 2 vs 1") ///
   name(diag_trt2, replace)
bayesgraph diagnostics {var_U}, ///
   title("Interstudy variance") ///
   name(diag_var_U, replace)
bayesgraph diagnostics {U[1]}, ///
   title("Fixed effect 1") ///
   name(diag_U1, replace)

We send the treatment number in trt and the baseline for that trial in base.

If you run this, you will see that it does not behave itself: the chains do not yield a posterior sample from which we can make reliable inferences. Let’s reflect on the best next steps Stata could take to let these boundary-pushing models move forward.

Limitations (at present)

No less an NMA authority than Dr Ian White of the UK’s Medical Research Council Clinical Trials Unit tells us that he also bumped up against algorithmic fragility when trying Bayesian NMA in Stata. Fundamentally, the adaptive Metropolis-Hastings algorithm is not going to be able to cope. In some limited circumstances, one can switch to Gibbs sampling in Stata, but not with evaluator() options.

We are expressing some personal opinions and preferences here, but we feel that including Hamiltonian Monte Carlo in Stata would be the best way forward. This algorithm has some very well-tested open-source implementations, so including it should not be too hard. It also has (at least in the NUTS variant) excellent diagnostic and warning measures. HMC algorithms, by their nature, have a huge benefit for highly correlated posterior distributions, which are omnipresent in multilevel models like RE meta-analysis. You can look up the paper by Hoffman and Gelman if you want to learn more and see a head-to-head comparison of NUTS, Gibbs and M-H. These scatterplots are the take-away:

[Figure from Hoffman and Gelman: samples from a highly correlated 250-dimensional distribution (first two dimensions shown), generated by random-walk Metropolis, Gibbs sampling, NUTS, and independent draws.]

Nevertheless, we realise that developers’ time and energy are not infinite. The progress that StataCorp has made since introducing Bayesian functionality in 2014 is impressive, and we feel confident that they could rise to this next challenge. While we are on the topic of wishes and grumbles, bayes: meta would really help a lot of beginners.

A final note: some Stata users have long been in the habit of using user-written commands to call external Bayesian software. Professor John Thompson started this for WinBUGS, then wrote experimental versions for OpenBUGS and JAGS before he retired. I (Robert) wrote a stan command which does something similar with CmdStan, but it has always been tricky for some users to get working, especially on Windows and/or under corporate cybersecurity constraints. When Stata introduced its tight integration with Python, I decided that the best advice for Stata users was to use cmdstanpy and go from Stata to Python to Stan and back again. The Stan dev team followed my advice to downgrade the stan command (“StataStan”) from a frontline interface to a side project on their GitHub account (and on SSC, thanks to Professor Kit Baum). However, in the years that followed, I met a lot of Stata users who simply do not want to learn or use any Python: they want to stay inside Stata. That is why I think this approach using evaluator() will potentially be useful to many of them.

