Probabilistic programming language for Bayesian inference
Stan is a probabilistic programming language for statistical inference, implemented in C++.[2] The Stan language is used to specify a (Bayesian) statistical model with an imperative program that calculates the log probability density function.[2]
Stan is licensed under the New BSD License. Stan is named in honour of Stanislaw Ulam, pioneer of the Monte Carlo method.[2]
Stan was created by a development team of 52 members[3] that includes Andrew Gelman, Bob Carpenter, Daniel Lee, Ben Goodrich, and others.
A simple linear regression model can be described as $y_n = \alpha + \beta x_n + \epsilon_n$, where $\epsilon_n \sim \text{normal}(0, \sigma)$. This can also be expressed as $y_n \sim \text{normal}(\alpha + \beta x_n, \sigma)$. The latter form can be written in Stan as the following:
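data {
  int<lower=0> N;        // number of observations
  vector[N] x;           // predictor
  vector[N] y;           // outcome
}
parameters {
  real alpha;            // intercept
  real beta;             // slope
  real<lower=0> sigma;   // residual standard deviation
}
model {
  y ~ normal(alpha + beta * x, sigma);  // vectorized likelihood
}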
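To illustrate what "calculating the log probability density function" means for this model, the following Python sketch evaluates the same log density by hand. It is an illustration only, not how Stan is implemented: the function names `normal_lpdf` and `linear_regression_log_density` are hypothetical helpers written for this example, and the sketch works directly on the constrained value of sigma, whereas Stan samples on an unconstrained scale and adds the corresponding Jacobian term.

```python
import math


def normal_lpdf(y, mu, sigma):
    """Log density of a normal distribution with mean mu and std. dev. sigma."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def linear_regression_log_density(alpha, beta, sigma, x, y):
    """Sum of normal log densities for y[n] ~ normal(alpha + beta * x[n], sigma).

    Hypothetical helper mirroring the model block; returns -inf when the
    constraint sigma > 0 is violated.
    """
    if sigma <= 0:
        return -math.inf
    return sum(normal_lpdf(yn, alpha + beta * xn, sigma) for xn, yn in zip(x, y))


# Toy data lying exactly on the line y = 1 + 2 x.
x = [0.0, 1.0, 2.0]
y = [1.0, 3.0, 5.0]

on_line = linear_regression_log_density(1.0, 2.0, 1.0, x, y)
off_line = linear_regression_log_density(0.0, 0.0, 1.0, x, y)
```

Parameter values that fit the data well (`on_line`) receive a higher log density than values that fit poorly (`off_line`); inference algorithms explore or maximize this function over the parameters.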
The Stan language itself can be accessed through several interfaces:
In addition, higher-level interfaces are provided by packages that use Stan as a backend, primarily in the R language:[4]
rstanarm provides a drop-in replacement for frequentist models provided by base R and lme4 using the R formula syntax;
brms[5] provides a wide array of linear and nonlinear models using the R formula syntax;
prophet provides automated procedures for time series forecasting.
Stan implements gradient-based Markov chain Monte Carlo (MCMC) algorithms for Bayesian inference, stochastic, gradient-based variational Bayesian methods for approximate Bayesian inference, and gradient-based optimization for penalized maximum likelihood estimation.
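As a rough illustration of what "gradient-based MCMC" involves, the sketch below implements a basic Hamiltonian Monte Carlo update for a one-dimensional standard normal target in Python. It is a textbook-style simplification under stated assumptions, not Stan's sampler: the step size and number of leapfrog steps are fixed by hand, there is no adaptation, and all names (`leapfrog`, `hmc_step`, `std_normal_lpdf`, `std_normal_grad`) are hypothetical.

```python
import math
import random


def std_normal_lpdf(q):
    """Log density of the target (standard normal, constant dropped)."""
    return -0.5 * q * q


def std_normal_grad(q):
    """Gradient of the target log density."""
    return -q


def leapfrog(q, p, grad_log_density, step_size, n_steps):
    """Simulate Hamiltonian dynamics with the leapfrog integrator."""
    p = p + 0.5 * step_size * grad_log_density(q)      # initial half step
    for i in range(n_steps):
        q = q + step_size * p                          # full position step
        if i < n_steps - 1:
            p = p + step_size * grad_log_density(q)    # full momentum step
    p = p + 0.5 * step_size * grad_log_density(q)      # final half step
    return q, p


def hmc_step(q, log_density, grad_log_density, rng, step_size=0.2, n_steps=10):
    """One HMC transition: resample momentum, integrate, accept or reject."""
    p0 = rng.gauss(0.0, 1.0)
    q_new, p_new = leapfrog(q, p0, grad_log_density, step_size, n_steps)
    # Hamiltonian = negative log density + kinetic energy.
    h_old = -log_density(q) + 0.5 * p0 * p0
    h_new = -log_density(q_new) + 0.5 * p_new * p_new
    accept_prob = math.exp(min(0.0, h_old - h_new))
    return q_new if rng.random() < accept_prob else q


rng = random.Random(0)
q = 0.0
draws = []
for _ in range(2000):
    q = hmc_step(q, std_normal_lpdf, std_normal_grad, rng)
    draws.append(q)
```

The gradient steers each proposal along the target density, which is why the sampler needs derivatives of the log density with respect to every parameter.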
MCMC algorithms:
Hamiltonian Monte Carlo (HMC)
No-U-Turn Sampler (NUTS),[6] a variant of HMC and Stan's default MCMC engine
Variational inference algorithms:
Automatic Differentiation Variational Inference[7]
Pathfinder: Parallel quasi-Newton variational inference[8]
Optimization algorithms:
Limited-memory BFGS (L-BFGS)
Broyden–Fletcher–Goldfarb–Shanno algorithm (BFGS)
Automatic differentiation
Stan implements reverse-mode automatic differentiation to calculate gradients of the model's log density, which are required by HMC, NUTS, L-BFGS, BFGS, and variational inference.[2] The automatic differentiation within Stan can be used outside of the probabilistic programming language.
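The idea behind reverse-mode automatic differentiation can be sketched in a few lines of Python. This is a minimal teaching example, not Stan's C++ implementation: the class `Var` and the functions `log` and `backward` are hypothetical names, and only the handful of operations needed for a normal log density are supported. Each operation records its inputs together with local partial derivatives; a single backward sweep then applies the chain rule from the output to every input.

```python
import math


class Var:
    """A scalar node that remembers its parents and local partial derivatives."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (parent Var, d(self)/d(parent))
        self.grad = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    __rmul__ = __mul__

    def __neg__(self):
        return Var(-self.value, ((self, -1.0),))

    def __sub__(self, other):
        return self + (-_lift(other))

    def __rsub__(self, other):
        return _lift(other) + (-self)

    def __truediv__(self, other):
        other = _lift(other)
        return Var(self.value / other.value,
                   ((self, 1.0 / other.value),
                    (other, -self.value / other.value ** 2)))


def _lift(x):
    """Wrap plain numbers as constant nodes."""
    return x if isinstance(x, Var) else Var(float(x))


def log(x):
    return Var(math.log(x.value), ((x, 1.0 / x.value),))


def backward(output):
    """Accumulate d(output)/d(node) into node.grad for every node in the graph."""
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)  # post-order: parents come before children

    visit(output)
    output.grad = 1.0
    for node in reversed(order):  # children first, so node.grad is complete
        for parent, local in node.parents:
            parent.grad += local * node.grad


# Normal log density (constant dropped) for one observation y.
y_obs = 2.0
mu, sigma = Var(1.0), Var(2.0)
z = (y_obs - mu) / sigma
lp = -0.5 * z * z - log(sigma)
backward(lp)
```

After the backward sweep, `mu.grad` and `sigma.grad` hold the partial derivatives of `lp`, which can be checked against the analytic values $(y - \mu)/\sigma^2$ and $(y - \mu)^2/\sigma^3 - 1/\sigma$. One sweep yields all partial derivatives at once, which is what makes the reverse mode attractive for models with many parameters.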
Stan is used in fields including social science,[9] pharmaceutical statistics,[10] market research,[11] and medical imaging.[12]
PyMC, a probabilistic programming library for Python
ArviZ, a Python library for exploratory analysis of Bayesian models
[1] "Release 2.35.0". 3 June 2024. Retrieved 26 June 2024.
[2] Stan Development Team (2015). Stan Modeling Language User's Guide and Reference Manual, Version 2.9.0.
[3] "Development Team". stan-dev.github.io. Retrieved 21 November 2024.
[4] Gabry, Jonah. "The current state of the Stan ecosystem in R". Statistical Modeling, Causal Inference, and Social Science. Retrieved 25 August 2020.
[5] "BRMS: Bayesian Regression Models using 'Stan'". 23 August 2021.
[6] Hoffman, Matthew D.; Gelman, Andrew (April 2014). "The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo". Journal of Machine Learning Research. 15: 1593–1623.
[7] Kucukelbir, Alp; Ranganath, Rajesh; Blei, David M. (June 2015). "Automatic Variational Inference in Stan". arXiv:1506.03431. Bibcode:2015arXiv150603431K.
[8] Zhang, Lu; Carpenter, Bob; Gelman, Andrew; Vehtari, Aki (2022). "Pathfinder: Parallel quasi-Newton variational inference". Journal of Machine Learning Research. 23 (306): 1–49.
[9] Goodrich, Benjamin King; Wawro, Gregory; Katznelson, Ira (2012). "Designing Quantitative Historical Social Inquiry: An Introduction to Stan". APSA 2012 Annual Meeting Paper. SSRN 2105531.
[10] Natanegara, Fanni; Neuenschwander, Beat; Seaman, John W.; Kinnersley, Nelson; Heilmann, Cory R.; Ohlssen, David; Rochester, George (2013). "The current state of Bayesian methods in medical product development: survey results and recommendations from the DIA Bayesian Scientific Working Group". Pharmaceutical Statistics. 13 (1): 3–12. doi:10.1002/pst.1595. ISSN 1539-1612. PMID 24027093. S2CID 19738522.
[11] Feit, Elea (15 May 2017). "Using Stan to Estimate Hierarchical Bayes Models". Retrieved 19 March 2019.
[12] Gordon, GSD; Joseph, J; Alcolea, MP; Sawyer, T; Macfaden, AJ; Williams, C; Fitzpatrick, CRM; Jones, PH; di Pietro, M; Fitzgerald, RC; Wilkinson, TD; Bohndiek, SE (2019). "Quantitative phase and polarization imaging through an optical fiber applied to detection of early esophageal tumorigenesis". Journal of Biomedical Optics. 24 (12): 1–13. arXiv:1811.03977. Bibcode:2019JBO....24l6004G. doi:10.1117/1.JBO.24.12.126004. PMC 7006047. PMID 31840442.
Carpenter, Bob; Gelman, Andrew; Hoffman, Matthew; Lee, Daniel; Goodrich, Ben; Betancourt, Michael; Brubaker, Marcus; Guo, Jiqiang; Li, Peter; Riddell, Allen (2017). "Stan: A Probabilistic Programming Language". Journal of Statistical Software. 76 (1): 1–32. doi:10.18637/jss.v076.i01. ISSN 1548-7660. PMC 9788645. PMID 36568334.
Gelman, Andrew; Lee, Daniel; Guo, Jiqiang (2015). "Stan: A probabilistic programming language for Bayesian inference and optimization". Journal of Educational and Behavioral Statistics.
Hoffman, Matthew D.; Carpenter, Bob; Gelman, Andrew (2012). "Stan, scalable software for Bayesian modeling". Archived 2015-01-21 at the Wayback Machine. Proceedings of the NIPS Workshop on Probabilistic Programming.