A foundation model is a large artificial intelligence model trained on a vast quantity of unlabeled data at scale (usually by self-supervised learning), resulting in a model that can be adapted to a wide range of downstream tasks.[1] Since their introduction in 2018, foundation models have driven a major transformation in how AI systems are built. Early examples of foundation models were large pre-trained language models such as BERT[2] and GPT-3. Subsequently, several multimodal foundation models have been produced, including DALL-E, Flamingo,[3] and Florence.[4] The term was popularized by the Stanford Institute for Human-Centered Artificial Intelligence's (HAI) Center for Research on Foundation Models (CRFM).[1]
CRFM described a foundation model as a "paradigm for building AI systems" in which a model trained on a large amount of unlabeled data can be adapted to many applications.[5][6]
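The pretrain-then-adapt workflow can be illustrated with a minimal sketch. The example below assumes the Hugging Face transformers library and the publicly released bert-base-uncased checkpoint; the two-example dataset and the hyperparameters are purely illustrative, not a recipe from any of the cited works. It adapts a pre-trained language model to a downstream sentiment-classification task by attaching a new classification head and fine-tuning on labeled data:

```python
# Minimal sketch of adapting a foundation model to a downstream task.
# Assumes the Hugging Face "transformers" library; dataset and
# hyperparameters below are illustrative placeholders.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load a model pre-trained on large unlabeled corpora via self-supervision.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=2,  # adaptation: a new two-class head for the downstream task
)

# A tiny, hypothetical labeled dataset for the downstream task.
texts = ["a delightful film", "a tedious mess"]
labels = torch.tensor([1, 0])  # 1 = positive, 0 = negative
batch = tokenizer(texts, padding=True, return_tensors="pt")

# Fine-tune: a few gradient steps stand in for a full training loop.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

In practice, adaptation ranges from full fine-tuning as sketched here to lighter-weight approaches such as prompting or training only a small number of added parameters.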
History
An early concept of the foundation model is found in I. J. Good's 1965 treatise "Speculations Concerning the First Ultraintelligent Machine".[7][8] Stanley Kubrick's HAL 9000 supercomputer in his 1968 film 2001: A Space Odyssey was modeled after Good's ultraintelligent machine.[9]
Opportunities and risks
A 2021 arXiv report listed foundation models' capabilities with regard to "language, vision, robotics, reasoning, and human interaction"; technical principles, such as "model architectures, training procedures, data, systems, security, evaluation, and theory"; their applications, for example in law, healthcare, and education; and their potential impact on society, including "inequity, misuse, economic and environmental impact, legal and ethical considerations".[10]