In mathematical logic, a term denotes a mathematical object while a formula denotes a mathematical fact. In particular, terms appear as components of a formula. This is analogous to natural language, where a noun phrase refers to an object and a whole sentence refers to a fact.
A first-order term is recursively constructed from constant symbols, variables and function symbols. An expression formed by applying a predicate symbol to an appropriate number of terms is called an atomic formula, which evaluates to true or false in bivalent logics, given an interpretation. For example, [math]\displaystyle{ (x+1)*(x+1) }[/math] is a term built from the constant 1, the variable x, and the binary function symbols [math]\displaystyle{ + }[/math] and [math]\displaystyle{ * }[/math]; it is part of the atomic formula [math]\displaystyle{ (x+1)*(x+1) \ge 0 }[/math] which evaluates to true for each real-numbered value of x.
Besides logic, terms play important roles in universal algebra and rewriting systems.
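Given a set V of variable symbols, a set C of constant symbols and sets Fn of n-ary function symbols, also called operator symbols, for each natural number n ≥ 1, the set of (unsorted first-order) terms T is recursively defined to be the smallest set with the following properties:[1]
- every variable symbol is a term: V ⊆ T,
- every constant symbol is a term: C ⊆ T,
- from every n terms t1,...,tn, and every n-ary function symbol f ∈ Fn, a larger term f(t1, ..., tn) can be built.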
Using an intuitive, pseudo-grammatical notation, this is sometimes written as: t ::= x | c | f(t1, ..., tn).
The signature of the term language describes which function symbol sets Fn are inhabited. Well-known examples are the unary function symbols sin, cos ∈ F1, and the binary function symbols +, −, ⋅, / ∈ F2. Ternary operations and higher-arity functions are possible but uncommon in practice. Many authors consider constant symbols as 0-ary function symbols F0, thus needing no special syntactic class for them.
A term denotes a mathematical object from the domain of discourse. A constant c denotes a named object from that domain, a variable x ranges over the objects in that domain, and an n-ary function f maps n-tuples of objects to objects. For example, if n ∈ V is a variable symbol, 1 ∈ C is a constant symbol, and add ∈ F2 is a binary function symbol, then n ∈ T, 1 ∈ T, and (hence) add(n, 1) ∈ T by the first, second, and third term building rule, respectively. The last term is usually written as n+1, using infix notation and the more common operator symbol + for convenience.
Originally, logicians defined a term to be a character string adhering to certain building rules.[2] However, since the concept of a tree became popular in computer science, it has turned out to be more convenient to think of a term as a tree. For example, several distinct character strings, like "(n⋅(n+1))/2", "((n⋅(n+1)))/2", and "[math]\displaystyle{ \frac{n(n+1)}{2} }[/math]", denote the same term and correspond to the same tree, viz. the left tree in the above picture. Once the tree structure of a term is separated from its graphical representation on paper, it is also easy to account for parentheses (being only representation, not structure) and invisible multiplication operators (existing only in structure, not in representation).
Two terms are said to be structurally, literally, or syntactically equal if they correspond to the same tree. For example, the left and the right tree in the above picture are structurally unequal terms, although they might be considered "semantically equal" as they always evaluate to the same value in rational arithmetic. While structural equality can be checked without any knowledge about the meaning of the symbols, semantic equality cannot. If the function / is, for example, interpreted not as rational but as truncating integer division, then at n=2 the left and right terms evaluate to 3 and 2, respectively. Structurally equal terms need to agree in their variable names.
In contrast, a term t is called a renaming, or a variant, of a term u if the latter resulted from consistently renaming all variables of the former, i.e. if u = tσ for some renaming substitution σ. In that case, u is a renaming of t, too, since a renaming substitution σ has an inverse σ−1, and t = uσ−1. Both terms are then also said to be equal modulo renaming. In many contexts, the particular variable names in a term don't matter, e.g. the commutativity axiom for addition can be stated as x+y=y+x or as a+b=b+a; in such cases the whole formula may be renamed, while an arbitrary subterm usually may not, e.g. x+y=b+a is not a valid version of the commutativity axiom.[note 1] [note 2]
The set of variables of a term t is denoted by vars(t). A term that doesn't contain any variables is called a ground term; a term that doesn't contain multiple occurrences of a variable is called a linear term. For example, 2+2 is a ground term and hence also a linear term, x⋅(n+1) is a linear term, n⋅(n+1) is a non-linear term. These properties are important in, for example, term rewriting.
Given a signature for the function symbols, the set of all terms forms the free term algebra. The set of all ground terms forms the initial term algebra.
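Abbreviating the number of constants as f0, and the number of i-ary function symbols as fi, the number θh of distinct ground terms of a height up to h can be computed by the following recursion formula:
- θ0 = f0, since a ground term of height 0 can only be a constant,
- [math]\displaystyle{ \theta_{h+1} = \sum_{i=0}^{\infty} f_i \cdot \theta_h^i }[/math], since a ground term of height up to h+1 can be obtained by composing any i ground terms of height up to h, using an i-ary root function symbol. The sum is finite if there are only finitely many constants and function symbols, which is usually the case.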
Given a set Rn of n-ary relation symbols for each natural number n ≥ 1, an (unsorted first-order) atomic formula is obtained by applying an n-ary relation symbol to n terms. As for function symbols, a relation symbol set Rn is usually non-empty only for small n. In mathematical logic, more complex formulas are built from atomic formulas using logical connectives and quantifiers. For example, letting [math]\displaystyle{ \mathbb{R} }[/math] denote the set of real numbers, ∀x: x ∈ [math]\displaystyle{ \mathbb{R} }[/math] ⇒ (x+1)⋅(x+1) ≥ 0 is a mathematical formula evaluating to true in the algebra of complex numbers. An atomic formula is called ground if it is built entirely from ground terms; all ground atomic formulas composable from a given set of function and predicate symbols make up the Herbrand base for these symbol sets.
When the domain of discourse contains elements of basically different kinds, it is useful to split the set of all terms accordingly. To this end, a sort (sometimes also called type) is assigned to each variable and each constant symbol, and a declaration [note 3] of domain sorts and range sort to each function symbol. A sorted term f(t1,...,tn) may be composed from sorted subterms t1,...,tn only if the ith subterm's sort matches the declared ith domain sort of f. Such a term is also called well-sorted; any other term (i.e. obeying the unsorted rules only) is called ill-sorted.
For example, a vector space comes with an associated field of scalar numbers. Let W and N denote the sort of vectors and numbers, respectively, let VW and VN be the set of vector and number variables, respectively, and CW and CN the set of vector and number constants, respectively. Then e.g. [math]\displaystyle{ \vec{0} \in C_W }[/math] and 0 ∈ CN, and the vector addition, the scalar multiplication, and the inner product are declared as [math]\displaystyle{ +:W \times W \to W, *:W \times N \to W }[/math], and [math]\displaystyle{ \langle .,. \rangle: W \times W \to N }[/math], respectively. Assuming variable symbols [math]\displaystyle{ \vec{v},\vec{w} \in V_W }[/math] and a,b ∈ VN, the term [math]\displaystyle{ \langle (\vec{v}+\vec{0})*a,\vec{w}*b \rangle }[/math] is well-sorted, while [math]\displaystyle{ \vec{v}+a }[/math] is not (since + doesn't accept a term of sort N as its second argument). In order to make [math]\displaystyle{ a*\vec{v} }[/math] a well-sorted term, an additional declaration [math]\displaystyle{ *:N \times W \to W }[/math] is required. Function symbols having several declarations are called overloaded.
See many-sorted logic for more information, including extensions of the many-sorted framework described here.
Notation example | Bound variables | Free variables | Written as lambda-term |
---|---|---|---|
[math]\displaystyle{ \lim_{n \to \infty} x/n }[/math] | n | x | limit(λn. div(x,n)) |
[math]\displaystyle{ \sum_{i=1}^n i^2 }[/math] | i | n | sum(1,n,λi. power(i,2)) |
[math]\displaystyle{ \int_a^b \sin(k \cdot t) dt }[/math] | t | a, b, k | integral(a,b,λt. sin(k⋅t)) |
Mathematical notations as shown in the table do not fit into the scheme of a first-order term as defined above, as they all introduce their own local, or bound, variable that may not appear outside the notation's scope, e.g. [math]\displaystyle{ t \cdot \int_a^b \sin(k \cdot t) \; dt }[/math] doesn't make sense. In contrast, the other variables, referred to as free, behave like ordinary first-order term variables, e.g. [math]\displaystyle{ k \cdot \int_a^b \sin(k \cdot t) \; dt }[/math] does make sense.
All these operators can be viewed as taking a function rather than a value term as one of their arguments. For example, the lim operator is applied to a sequence, i.e. to a mapping from the positive integers to, for example, the real numbers. As another example, a C function to implement the second example from the table, Σ, would have a function pointer argument (see box below).
Lambda terms can be used to denote anonymous functions to be supplied as arguments to lim, Σ, ∫, etc.
For example, the function square from the C program below can be written anonymously as a lambda term λi. i². The general sum operator Σ can then be considered as a ternary function symbol taking a lower bound value, an upper bound value and a function to be summed up. Due to its last argument, the Σ operator is called a second-order function symbol. As another example, the lambda term λn. x/n denotes a function that maps 1, 2, 3, ... to x/1, x/2, x/3, ..., respectively, that is, it denotes the sequence (x/1, x/2, x/3, ...). The lim operator takes such a sequence and returns its limit (if defined).
The rightmost column of the table indicates how each mathematical notation example can be represented by a lambda term, also converting common infix operators into prefix form.
    #include <stdio.h>

    /* Implements the general sum operator: adds fct(i) for i = lwb, ..., upb. */
    int sum(int lwb, int upb, int fct(int)) {
        int res = 0;
        for (int i = lwb; i <= upb; ++i)
            res += fct(i);
        return res;
    }

    /* Implements the anonymous function (lambda i. i*i); however, C requires a name for it. */
    int square(int i) { return i * i; }

    int main(void) {
        int n;
        if (scanf(" %d", &n) != 1)            /* stop on malformed input */
            return 1;
        printf("%d\n", sum(1, n, square));    /* applies the sum operator to sum up squares */
        return 0;
    }
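Given a set V of variable symbols, the set of lambda terms is defined recursively as follows:
- every variable symbol x ∈ V is a lambda term;
- if x ∈ V is a variable symbol and t is a lambda term, then λx.t is also a lambda term (abstraction);
- if t1 and t2 are lambda terms, then ( t1 t2 ) is also a lambda term (application).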
The above motivating examples also used some constants like div, power, etc. which are, however, not admitted in pure lambda calculus.
Intuitively, the abstraction λx.t denotes a unary function that returns t when given x, while the application ( t1 t2 ) denotes the result of calling the function t1 with the input t2. For example, the abstraction λx.x denotes the identity function, while λx.y denotes the constant function always returning y. The lambda term λx.(x x) takes a function x and returns the result of applying x to itself.