Continuous Bernoulli distribution

From HandWiki



In probability theory, statistics, and machine learning, the continuous Bernoulli distribution[1][2][3] is a family of continuous probability distributions parameterized by a single shape parameter λ ∈ (0, 1), defined on the unit interval x ∈ [0, 1] by:

p(x \mid \lambda) \propto \lambda^x (1-\lambda)^{1-x}.

The continuous Bernoulli distribution arises in deep learning and computer vision, specifically in the context of variational autoencoders,[4][5] for modeling the pixel intensities of natural images. As such, it defines a proper probabilistic counterpart for the commonly used binary cross entropy loss, which is often applied to continuous, [0,1]-valued data.[6][7][8][9] This practice amounts to ignoring the normalizing constant of the continuous Bernoulli distribution, since the binary cross entropy loss only defines a true log-likelihood for discrete, {0,1}-valued data.
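To make this relationship concrete, the normalized continuous Bernoulli log-density is the negative binary cross entropy plus a log normalizing constant, log C(λ), with C(λ) = 2 tanh⁻¹(1−2λ)/(1−2λ) for λ ≠ 1/2 and C(1/2) = 2. A minimal Python sketch (the function names are mine, not from any library):

```python
import math

def log_norm_const(lam, eps=1e-6):
    """Log normalizing constant log C(lam) of the continuous Bernoulli.

    C(lam) = 2 * atanh(1 - 2*lam) / (1 - 2*lam) for lam != 1/2;
    near lam = 1/2 we use the limiting value C = 2 to avoid 0/0."""
    if abs(lam - 0.5) < eps:
        return math.log(2.0)
    return math.log(2.0 * math.atanh(1.0 - 2.0 * lam) / (1.0 - 2.0 * lam))

def log_pdf(x, lam):
    """Normalized log-density: log C(lam) + x*log(lam) + (1-x)*log(1-lam).

    The last two terms are exactly the negative binary cross entropy;
    the log C(lam) term is what plain BCE training omits."""
    return log_norm_const(lam) + x * math.log(lam) + (1 - x) * math.log(1 - lam)
```

Note that at λ = 1/2 the density is identically 1 (the uniform case), so the log-density vanishes for every x.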

The continuous Bernoulli also defines an exponential family of distributions. Writing θ = log(λ/(1−λ)) for the natural parameter, the density can be rewritten in canonical form: p(x|θ) ∝ exp(θx).[10]

Statistical inference

Given an independent sample of n points x_1, …, x_n with x_i ∈ [0, 1] for all i from a continuous Bernoulli distribution, the log-likelihood of the natural parameter θ is

\ell(\theta) = \theta \sum_{i=1}^n x_i - n \log\left( \frac{e^\theta - 1}{\theta} \right)

and the maximum likelihood estimator of the natural parameter θ is the solution of ℓ′(θ) = 0, that is, θ̂ satisfies

\frac{e^{\hat\theta}}{e^{\hat\theta} - 1} - \frac{1}{\hat\theta} = \frac{1}{n} \sum_{i=1}^n x_i

where the left-hand side, e^{θ̂}/(e^{θ̂} − 1) − 1/θ̂, is the expected value of a continuous Bernoulli distribution with natural parameter θ̂. Although θ̂ does not admit a closed-form expression, it can easily be calculated by numerical inversion.
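Since the mean is strictly increasing in θ (its derivative is the variance of the distribution, which is positive), the numerical inversion can be done with simple bisection. A sketch, assuming natural logarithms throughout (the function names are mine):

```python
import math

def cb_mean(theta):
    """E[X] for the continuous Bernoulli with natural parameter theta.

    Uses a Taylor expansion near theta = 0, where the closed form
    1/(1 - e^{-theta}) - 1/theta suffers catastrophic cancellation."""
    if abs(theta) < 1e-3:
        return 0.5 + theta / 12.0 - theta**3 / 720.0
    # -1/expm1(-theta) equals e^theta / (e^theta - 1), computed stably
    return -1.0 / math.expm1(-theta) - 1.0 / theta

def cb_mle_theta(xs):
    """MLE of theta: invert the (strictly increasing) mean map by bisection."""
    target = sum(xs) / len(xs)
    lo, hi = -50.0, 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cb_mean(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, a sample with mean 1/2 yields θ̂ ≈ 0, the uniform case.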


Further properties

The entropy of a continuous Bernoulli distribution is

H[X] = \begin{cases} 0 & \text{if } \lambda = \tfrac{1}{2} \\[4pt] \dfrac{\lambda \log \lambda - (1-\lambda)\log(1-\lambda)}{1 - 2\lambda} - \log\left( \dfrac{2\tanh^{-1}(1-2\lambda)}{e\,(1-2\lambda)} \right) & \text{otherwise} \end{cases}
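The entropy expression can be evaluated directly; note that it is symmetric about λ = 1/2 and equals 0 there, since that case is the uniform distribution on [0, 1]. A sketch assuming natural logarithms (the function name is mine):

```python
import math

def cb_entropy(lam, eps=1e-6):
    """Differential entropy of the continuous Bernoulli.

    H = 0 at lam = 1/2 (the uniform case); otherwise
    H = (lam*log(lam) - (1-lam)*log(1-lam)) / (1 - 2*lam)
        - log(2*atanh(1 - 2*lam) / (e*(1 - 2*lam)))."""
    if abs(lam - 0.5) < eps:
        return 0.0
    u = 1.0 - 2.0 * lam
    return ((lam * math.log(lam) - (1.0 - lam) * math.log(1.0 - lam)) / u
            - math.log(2.0 * math.atanh(u) / (math.e * u)))
```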

Bernoulli distribution

The continuous Bernoulli can be thought of as a continuous relaxation of the Bernoulli distribution, which is defined on the discrete set {0,1} by the probability mass function:

p(x) = p^x (1-p)^{1-x},

where p is a scalar parameter between 0 and 1. Applying this same functional form on the continuous interval [0,1] results in the continuous Bernoulli probability density function, up to a normalizing constant.

Uniform distribution

The uniform distribution on the unit interval [0, 1] is the special case of the continuous Bernoulli obtained when λ = 1/2, or equivalently θ = 0.
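This special case can be checked directly: at λ = 1/2 the normalizing constant is 2 and λ^x (1−λ)^{1−x} = 1/2 for every x, so the density is identically 1 on [0, 1]. A quick numerical sketch:

```python
# At lam = 1/2, C(lam) = 2 and (1/2)**x * (1/2)**(1 - x) = 1/2 for all x,
# so the continuous Bernoulli density is constant at 1 on [0, 1].
lam = 0.5
for x in [0.0, 0.25, 0.5, 0.9]:
    density = 2.0 * lam**x * (1 - lam)**(1 - x)
    assert abs(density - 1.0) < 1e-12
```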

Exponential distribution

An exponential distribution with rate Λ, restricted to the unit interval [0, 1], corresponds to a continuous Bernoulli distribution with natural parameter θ = −Λ < 0.
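This correspondence is an exact identity: the truncated exponential density Λe^{−Λx}/(1−e^{−Λ}) on [0, 1] coincides with the continuous Bernoulli density in canonical form, C(θ)e^{θx} with C(θ) = θ/(e^θ − 1), at θ = −Λ. A pointwise numerical check (the specific rate value is an arbitrary choice for illustration):

```python
import math

Lam = 2.3          # an arbitrary positive rate for the exponential distribution
theta = -Lam       # corresponding continuous Bernoulli natural parameter

for x in [0.0, 0.1, 0.5, 0.99]:
    # Exponential density renormalized to the unit interval [0, 1]
    trunc_exp = Lam * math.exp(-Lam * x) / (1.0 - math.exp(-Lam))
    # Continuous Bernoulli density in canonical form: C(theta) * exp(theta * x),
    # with normalizer C(theta) = theta / (exp(theta) - 1)
    cb = theta / (math.exp(theta) - 1.0) * math.exp(theta * x)
    assert abs(trunc_exp - cb) < 1e-12
```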

Continuous categorical distribution

The multivariate generalization of the continuous Bernoulli is called the continuous-categorical distribution.[11]

References

  1. Loaiza-Ganem, G., & Cunningham, J. P. (2019). The continuous Bernoulli: fixing a pervasive error in variational autoencoders. In Advances in Neural Information Processing Systems (pp. 13266-13276).
  2. PyTorch Distributions. https://pytorch.org/docs/stable/distributions.html#continuousbernoulli
  3. Tensorflow Probability. https://www.tensorflow.org/probability/api_docs/python/tfp/edward2/ContinuousBernoulli
  4. Kingma, D. P., & Welling, M. (2013). Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114.
  5. Kingma, D. P., & Welling, M. (2014, April). Stochastic gradient VB and the variational auto-encoder. In Second International Conference on Learning Representations, ICLR (Vol. 19).
  6. Larsen, A. B. L., Sønderby, S. K., Larochelle, H., & Winther, O. (2016, June). Autoencoding beyond pixels using a learned similarity metric. In International conference on machine learning (pp. 1558-1566).
  7. Jiang, Z., Zheng, Y., Tan, H., Tang, B., & Zhou, H. (2017, August). Variational deep embedding: an unsupervised and generative approach to clustering. In Proceedings of the 26th International Joint Conference on Artificial Intelligence (pp. 1965-1972).
  8. PyTorch VAE tutorial: https://github.com/pytorch/examples/tree/master/vae.
  9. Keras VAE tutorial: https://blog.keras.io/building-autoencoders-in-keras.html.
  10. Lee, C. J.; Dahl, B. K.; Ovaskainen, O.; Dunson, D. B. (2025). Scalable and robust regression models for continuous proportional data. arXiv preprint arXiv:2504.15269. https://arxiv.org/abs/2504.15269
  11. Gordon-Rodriguez, E., Loaiza-Ganem, G., & Cunningham, J. P. (2020). The continuous categorical: a novel simplex-valued exponential family. In 36th International Conference on Machine Learning, ICML 2020. International Machine Learning Society (IMLS).




Licensed under CC BY-SA 3.0 | Source: https://handwiki.org/wiki/Continuous_Bernoulli_distribution