Alphabet (formal languages)

Short description: Base set of symbols with which a language is formed

In formal language theory, an alphabet, sometimes called a vocabulary, is a non-empty set of indivisible symbols or glyphs, typically thought of as representing letters, characters, digits, phonemes, or even words.^[1]^[2] Alphabets in this technical sense of a set are used in a diverse range of fields including logic, mathematics, computer science, and linguistics. An alphabet may have any cardinality ("size") and, depending on its purpose, may be finite (e.g., the alphabet of letters "a" through "z"), countably infinite (e.g., $\{v_{1}, v_{2}, \dots\}$), or even uncountable (e.g., $\{v_{x} : x \in ℝ\}$).

Strings, also known as "words" or "sentences", over an alphabet are defined as finite sequences of symbols from the alphabet.^[3] For example, the alphabet of lowercase letters "a" through "z" can be used to form English words like "iceberg", while the alphabet of both upper and lower case letters can also be used to form proper names like "Wikipedia". A common alphabet is {0,1}, the binary alphabet, and "00101111" is an example of a binary string. Infinite sequences of symbols may be considered as well (see Omega language).
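The definition above can be checked mechanically. The following is a minimal sketch, assuming each symbol is a single character; the helper name is_string_over and the constants LOWER and LETTERS are illustrative and do not come from the cited sources.

```python
import string

def is_string_over(text, alphabet):
    """Return True if every symbol of `text` belongs to `alphabet`.

    Assumption: each symbol is exactly one character of `text`.
    The empty string is a string over every alphabet.
    """
    return all(symbol in alphabet for symbol in text)

LOWER = set(string.ascii_lowercase)   # "a" through "z"
LETTERS = set(string.ascii_letters)   # upper and lower case letters
BINARY = {"0", "1"}

print(is_string_over("iceberg", LOWER))
print(is_string_over("Wikipedia", LOWER))
print(is_string_over("Wikipedia", LETTERS))
print(is_string_over("00101111", BINARY))
```

Under these assumptions, "Wikipedia" is rejected by the lowercase alphabet only because of its capital "W", which is why the larger alphabet is needed for proper names.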

It is often necessary for practical purposes to restrict the symbols in an alphabet so that they are unambiguous when interpreted. For instance, if the two-member alphabet is {00,0}, a string written on paper as "000" is ambiguous because it is unclear whether it is a sequence of three "0" symbols, a "00" followed by a "0", or a "0" followed by a "00".
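The ambiguity can be made concrete by enumerating every way a written string splits into symbols. This is an illustrative sketch rather than a standard algorithm from the sources; the function name segmentations is hypothetical, and symbols are assumed to be non-empty Python strings.

```python
def segmentations(text, symbols):
    """Return every way to read `text` as a sequence of symbols.

    Assumption: every symbol is a non-empty string, otherwise the
    recursion would never terminate.
    """
    if text == "":
        return [[]]  # the empty text has exactly one reading: no symbols
    results = []
    for symbol in sorted(symbols):
        if text.startswith(symbol):
            # Consume one symbol, then parse the remainder recursively.
            for rest in segmentations(text[len(symbol):], symbols):
                results.append([symbol] + rest)
    return results

for parse in segmentations("000", {"00", "0"}):
    print(parse)
```

For the alphabet {00,0} this lists the three readings described above, whereas an alphabet such as {0,1} yields exactly one reading for any written string.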

Notation

If L is a formal language, i.e. a (possibly infinite) set of finite-length strings, the alphabet of L is the set of all symbols that may occur in any string in L. For example, if L is the set of all variable identifiers in the programming language C, L's alphabet is the set { a, b, c, ..., x, y, z, A, B, C, ..., X, Y, Z, 0, 1, 2, ..., 7, 8, 9, _ }.
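For a finite sample of a language, its alphabet can be computed directly from this definition. The sketch below assumes single-character symbols and uses a small, made-up set of identifiers, since the full set of C identifiers is infinite; the name alphabet_of is illustrative.

```python
def alphabet_of(language):
    """Collect every symbol that occurs in at least one string of `language`.

    Assumption: `language` is a finite collection of strings and each
    character counts as one symbol.
    """
    return {symbol for word in language for symbol in word}

# A hypothetical finite sample of identifiers, not the whole language.
sample_identifiers = {"x", "count_1", "MAX"}
print(sorted(alphabet_of(sample_identifiers)))
```

The result for a sample is only a subset of the alphabet of the full language; here it contains 11 of the 63 symbols listed above.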

Given an alphabet $Σ$, the set of all strings of length $n$ over the alphabet $Σ$ is indicated by $Σ^{n}$. The set $⋃_{i \in ℕ} Σ^{i}$ of all finite strings (regardless of their length) is indicated by the Kleene star operator as $Σ^{*}$, and is also called the Kleene closure of $Σ$. The notation $Σ^{ω}$ indicates the set of all infinite sequences over the alphabet $Σ$, and $Σ^{\infty}$ indicates the set $Σ^{*} \cup Σ^{ω}$ of all finite or infinite sequences.

For example, using the binary alphabet {0,1}, the strings ε, 0, 1, 00, 01, 10, 11, 000, etc. are all in the Kleene closure of the alphabet (where ε represents the empty string).
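The sets $Σ^{n}$ and $Σ^{*}$ can be enumerated lazily for a finite alphabet. The following is a minimal sketch, assuming a finite, non-empty alphabet of single-character symbols; the function names are illustrative.

```python
from itertools import count, islice, product

def strings_of_length(sigma, n):
    """Yield every string in Σ^n; for n = 0 this is just the empty string."""
    for symbols in product(sorted(sigma), repeat=n):
        yield "".join(symbols)

def kleene_star(sigma):
    """Lazily yield Σ* as the union of Σ^0, Σ^1, Σ^2, ... in that order.

    Assumption: `sigma` is finite and non-empty, as an alphabet must be.
    The generator never terminates on its own, because Σ* is infinite.
    """
    for n in count(0):
        yield from strings_of_length(sigma, n)

first_eight = list(islice(kleene_star({"0", "1"}), 8))
print(first_eight)  # ['', '0', '1', '00', '01', '10', '11', '000']
```

The empty string ε appears as '' in the output. Enumerating by increasing length ensures that every finite string is eventually reached, which a depth-first enumeration would not guarantee.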

Applications

Alphabets are important in the use of formal languages, automata and semiautomata. Defining an instance of an automaton, such as a deterministic finite automaton (DFA), usually requires specifying the alphabet from which the automaton's input strings are built. In these applications, an alphabet is usually required to be a finite set, but is not otherwise restricted.
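The role of the alphabet in a DFA definition can be sketched as follows. The automaton here is a hypothetical example chosen for illustration (it accepts binary strings containing an even number of 1s), and the names run_dfa, SIGMA, and DELTA are assumptions, not taken from the sources.

```python
def run_dfa(alphabet, transitions, start, accepting, text):
    """Run a DFA on `text` and report whether it ends in an accepting state.

    `transitions` maps (state, symbol) pairs to the next state and is
    assumed to be total over the states and the alphabet.
    """
    state = start
    for symbol in text:
        if symbol not in alphabet:
            # Inputs must be strings over the declared alphabet.
            raise ValueError(f"symbol {symbol!r} is not in the alphabet")
        state = transitions[(state, symbol)]
    return state in accepting

SIGMA = {"0", "1"}  # the binary alphabet
DELTA = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}

print(run_dfa(SIGMA, DELTA, "even", {"even"}, "00101111"))
print(run_dfa(SIGMA, DELTA, "even", {"even"}, "0110"))
```

Because the transition table is indexed by (state, symbol) pairs, the alphabet must be fixed before the automaton is defined; a symbol outside it has no transition and is rejected here with an error.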

When using automata, regular expressions, or formal grammars as part of string-processing algorithms, the alphabet may be assumed to be the character set of the text to be processed by these algorithms, or a subset of allowable characters from the character set.

References

[1] Ebbinghaus, H.-D.; Flum, J.; Thomas, W. (1994). Mathematical Logic (2nd ed.). New York City: Springer. p. 11. ISBN 0-387-94258-0. https://www.springer.com/mathematics/book/978-0-387-94258-2. "By an alphabet $𝒜$ we mean a nonempty set of symbols."
[2] Rosen, Kenneth H. (2012). Discrete Mathematics and Its Applications (7th ed.). McGraw-Hill. pp. 847-851. From p. 849: "A vocabulary (or alphabet) V is a finite, nonempty set of elements called symbols. A word (or sentence) over V is a string of finite length of elements of V."
[3] Rautenberg, Wolfgang (2010). A Concise Introduction to Mathematical Logic (3rd ed.). Springer. p. xx. ISBN 978-1-4419-1220-6. https://link.springer.com/content/pdf/bfm%3A978-1-4419-1221-3%2F1.pdf. "If 𝗔 is an alphabet, i.e., if the elements 𝐬 ∈ 𝗔 are symbols or at least named symbols, then the sequence (𝐬₁,...,𝐬_n)∈𝗔ⁿ is written as 𝐬₁···𝐬_n and called a string or a word over 𝗔."

Literature

John E. Hopcroft and Jeffrey D. Ullman, Introduction to Automata Theory, Languages, and Computation, Addison-Wesley Publishing, Reading Massachusetts, 1979. ISBN 0-201-02988-X.


Original source: https://en.wikipedia.org/wiki/Alphabet (formal languages).


