(in mathematics)
An abstract device realizing information processing. The terms "abstract machine" and "automaton" are also used. Abstract machines are particular cases of control systems (cf. Control system). They arose in connection with the analysis of the concept of an algorithm, starting in the middle of the 1930s, with the development of computers and with the construction of mathematical models of biological systems. Machines processing discrete information became most widespread. Typical representatives of these are finite automata (cf. Automaton, finite) and Turing machines (cf. Turing machine). Abstract machines are easily visualized, can be easily combined in various ways, and have simple operating steps. Machines are studied within the framework of the theory of algorithms (cf. Algorithms, theory of) and mathematical cybernetics; their study aims at the analysis and formalization of the concept of an algorithm and at the mathematical modelling of real devices and processes. There is a fruitful relation between abstract machines and real computers. The ideas of constructing computers and of programming them rely, to a large extent, on ideas from the theory of algorithms and mathematical cybernetics. In turn, practical work with computers poses new problems and suggests models of machines in order to solve these problems.
Common to all abstract machines is the presence of a finite control device that may be in one of the states $ q _ {1} \dots q _ {k} $. A machine has a potentially unbounded external memory and has a means for reading and writing information in the external memory. Reading (writing) of information is, as a rule, carried out locally. The operation of a machine is determined by a program, consisting of commands of the form $ q _ {i} a \rightarrow q _ {j} b $, where $ q _ {i} , q _ {j} $ are states of the control device, $ a $ stands for information of local character read from the external memory, and $ b $ denotes both the change in the contents of the external memory and the new position of the reading device in it. The machine operates in discrete time. At each step $ t $ the machine uses the reading device to transfer information $ a $ from the external memory to the control device. If at step $ t $ the control device of the machine is in state $ q _ {i} $ and the program contains the command $ q _ {i} a \rightarrow q _ {j} b $, then at the next step $ t+ 1 $ the control device is in state $ q _ {j} $, and the contents of the external memory and the position of the reading device are changed according to $ b $. Usually one distinguishes one or several initial and final states among the states of the control device. Before starting to operate, the control device is in an initial state, and the end of the operation is determined by the final state(s).
In many cases an abstract machine is constructed to compute numerical (word) functions and predicates. The computation of a function $ f( x _ {1} \dots x _ {n} ) $ on a machine $ M $ is understood as follows. For an arbitrary choice of the arguments $ ( x _ {1} \dots x _ {n} ) $ an initial configuration $ K _ {1} ( x _ {1} \dots x _ {n} ) $ of $ M $ is indicated, this being a filling of the external memory, together with a position of the reading device with respect to it, corresponding to $ x _ {1} \dots x _ {n} $. It is specified which configurations of $ M $ are final, i.e. when the computation of $ f $ on $ M $ is terminated, and what value of the function to be computed corresponds to each final configuration. The computation of the value of $ f( x _ {1} \dots x _ {n} ) $ consists of a series of transitions from the initial configuration $ K _ {1} ( x _ {1} \dots x _ {n} ) $ to the next configuration $ K _ {2} ( x _ {1} \dots x _ {n} ) $, etc., in correspondence with the program of the machine $ M $, until a final configuration is reached. If the value of $ f( x _ {1} \dots x _ {n} ) $ is not defined, $ M $ never reaches a final configuration when starting from $ K _ {1} ( x _ {1} \dots x _ {n} ) $; in this case the computing process may continue indefinitely.
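For illustration, the scheme above can be realized in a few lines of Python; the following sketch (the names, the representation of the tape and the particular program are illustrative choices, not part of any standard definition) implements a deterministic Turing machine with commands of the form $ q _ {i} a \rightarrow q _ {j} b $ and runs it from an initial configuration to a final one, computing the successor function in unary notation.

```python
BLANK = "_"

# Program: commands (q, a) -> (q', b, move), i.e. q a -> q' b in the
# notation above, with the head shift made explicit.
program = {
    ("q0", "1"): ("q0", "1", +1),      # scan right over the unary input
    ("q0", BLANK): ("halt", "1", 0),   # append one more 1 and stop
}

def run(tape, state="q0", head=0, final=("halt",)):
    cells = dict(enumerate(tape))               # sparse representation of the tape
    while state not in final:                   # stop in a final configuration
        a = cells.get(head, BLANK)              # read the symbol under the head
        state, b, move = program[(state, a)]    # apply the command q a -> q' b
        cells[head] = b                         # write, as prescribed by b
        head += move                            # shift the reading device
    return "".join(cells[i] for i in sorted(cells))

print(run("111"))  # unary 3 -> '1111', i.e. unary 4: the successor function
```

If the program never leads to a final state for some input, the loop runs forever, matching the convention above for undefined function values.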
Often an ordinary Turing machine is taken as the point of departure in the definition of an abstract machine; the differences then consist in adding new capabilities to, or imposing restrictions on, the Turing machine. Such modifications usually proceed along the following three lines.
The introduction of several tapes is most common (multi-tape Turing machines). One also studies machines with one-sided or multi-dimensional tapes, or with an external memory in the form of an infinite graph (e.g. an infinite binary tree). Another approach consists in replacing the tapes by registers or calculating frames, suitable for containing natural numbers (integers) or words of arbitrary length. Such are the Shepherdson–Sturgis machines [6], the Minsky machine [3] and the random access machine.
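A register machine in this spirit can be sketched as follows; the two-instruction repertoire (increment, and decrement-or-jump-on-zero) is in the style of Minsky's counter machines, and the concrete instruction names and the addition program are illustrative assumptions.

```python
def run(program, registers):
    pc = 0                                     # program counter
    while pc < len(program):
        op, r, *jump = program[pc]
        if op == "INC":                        # increment register r
            registers[r] += 1
            pc += 1
        elif op == "DECJZ":                    # decrement r, or jump if r == 0
            if registers[r] == 0:
                pc = jump[0]
            else:
                registers[r] -= 1
                pc += 1
    return registers

# Addition: move the contents of register 1 into register 0.
add = [
    ("DECJZ", 1, 3),   # if r1 == 0, jump past the end (address 3) and halt
    ("INC", 0),
    ("DECJZ", 2, 0),   # r2 stays 0, so this acts as an unconditional jump
]
print(run(add, {0: 2, 1: 3, 2: 0}))  # -> {0: 5, 1: 0, 2: 0}
```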
One also considers machines for which a part of the information in the external memory to which the control device has had no access for some time (depending on the input) becomes totally inaccessible to the reading device from then on. The definition of a Turing machine with tapes of storage type, i.e. of a pushdown automaton, is based on a closely related idea.
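The storage discipline just described can be illustrated by a small sketch: a symbol pushed onto the store buries everything beneath it, and buried symbols remain inaccessible until all symbols above them have been removed. The bracket-matching example below is an illustration of this pushdown idea, not a formal definition of a pushdown automaton.

```python
def balanced(word):
    stack = []                   # external memory of storage (pushdown) type
    for c in word:
        if c == "(":
            stack.append(c)      # push: the symbols below become inaccessible
        elif c == ")":
            if not stack:
                return False     # nothing left to match
            stack.pop()          # only the topmost symbol can be read
    return not stack

print(balanced("(()())"))  # True
print(balanced("(()"))     # False
```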
For machines taking and transforming information from the external memory symbol-by-symbol, a typical restriction is the prohibition to print certain symbols next to certain others. Such are, for example, the non-erasing Turing machines and the two-way automata, which are equivalent to finite automata. The important class of finite automata also belongs here. More usual, however, are restrictions on the size of the memory, the operating time, etc. For example, a linear bounded automaton is a Turing machine for which the length of the tape used in the computations is bounded by a linear function of the length of the input.
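A deterministic finite automaton, the extreme case in which no external memory is used at all, can be sketched as follows; the particular automaton (accepting the binary words containing an even number of 1s) and its state names are illustrative assumptions.

```python
def accepts(word, delta, start="even", final=("even",)):
    state = start
    for symbol in word:                # one step per input symbol
        state = delta[(state, symbol)]
    return state in final              # accept iff a final state is reached

# Transition function of a two-state automaton tracking the parity of 1s.
delta = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd",   ("odd", "1"): "even",
}
print(accepts("1001", delta))  # True: the word contains two 1s
print(accepts("111", delta))   # False: three 1s
```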
Often, the information from the external memory is obtained by using several reading heads. A more general approach is one in which general recursive predicates are defined on the external memory; the set of these predicates determines the information arriving at the control device of the machine. This idea is used in the Shepherdson–Sturgis machines, where the addition of information to the external memory is realized by arbitrary general recursive functions.
Here one distinguishes between deterministic, non-deterministic and probabilistic machines. For a deterministic machine every operating step is uniquely determined by the state of the control device and the information from the external memory obtained by the reading device. The program of a non-deterministic machine may contain different commands with the same left-hand side. Therefore, for a non-deterministic machine one considers not one but the set of all computations compatible with the program for a given input. A probabilistic machine is a machine that is provided with a random-number generator or one that has a program in which the transition from one command to another is realized with a given probability.
In the non-deterministic case one usually considers the computation of predicates. If a non-deterministic machine $ M $ computes a predicate $ p( x _ {1} \dots x _ {n} ) $, then $ p( x _ {1} \dots x _ {n} ) $ is true if and only if among all possible computations of $ M $ starting at the configuration $ K _ {1} ( x _ {1} \dots x _ {n} ) $ there is a computation containing a final configuration.
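In principle, such acceptance can be decided by searching the set of all computations compatible with the program; the following sketch does this by breadth-first search over configurations. The transition relation is a toy example, and all names are illustrative assumptions.

```python
from collections import deque

def accepts(start, final, transitions):
    """True iff some computation starting at `start` reaches a final
    configuration; `transitions` maps a configuration to the set of its
    possible successors (several commands with the same left-hand side)."""
    seen, queue = {start}, deque([start])
    while queue:
        conf = queue.popleft()
        if conf in final:
            return True                  # an accepting computation exists
        for nxt in transitions.get(conf, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False                         # no computation accepts

# From q0 the machine may move to q1 or q2; only the branch through q2
# reaches the final configuration qf.
transitions = {"q0": {"q1", "q2"}, "q2": {"qf"}}
print(accepts("q0", {"qf"}, transitions))  # True
```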
In general, the computing possibilities of deterministic, non-deterministic and probabilistic machines are the same. However, in certain restricted classes of machines, non-deterministic and probabilistic machines may prove to be more powerful than deterministic ones. For many types of abstract machines with restrictions either on the volume of the external memory or on the operating time, the problem of "determinization" of non-deterministic machines is an interesting open (1989) problem.
See also Computer, abstract.
[1] | C.E. Shannon (ed.) , J. McCarthy (ed.) , Automata studies , Princeton Univ. Press (1956) |
[2] | A.I. Mal'tsev, "Algorithms and recursive functions" , Wolters-Noordhoff (1970) (Translated from Russian) |
[3] | M. Minsky, "Computation: finite and infinite machines" , Prentice-Hall (1967) |
[4] | Problems of mathematical logic. Complexity of algorithms and of classes of computable functions , Moscow (1970) (In Russian; translated from English) |
[5] | Complexity of computations and algorithms , Moscow (1974) (In Russian; translated from English) (Collection of translations) |
[6] | J.C. Shepherdson, H.E. Sturgis, "Computability and recursive functions" J. Assoc. Comp. Mach. , 10 : 2 (1963) pp. 217–255 |
The basic distinction to be made among contemporary machine models is that between sequential and parallel machines.
In a sequential machine model there is a single control unit which operates on some potentially infinite storage structure. For the case of the Turing machine, the storage structure consists of linearly ordered tapes which are accessed by read-write heads. The various Turing machine models differ with respect to the number of tapes, the number of heads on a tape and the dimensionality of the worktapes. Some special tapes may have restrictions on their use, like being a read-only input tape, a write-only output tape, etc.
In the random access machine (RAM) the memory consists of an infinite collection of computer words, each with an unbounded word-length, thus capable of storing arbitrary integers. The memory is accessed by means of direct and indirect load and store instructions. Arguments fetched from memory into the central processor can be subjected to arithmetic instructions like additions, subtractions and comparisons. More powerful models also include multiplicative instructions and parallel bitwise operations based on the binary representation of the numerical values of the arguments.
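A minimal RAM interpreter along these lines might look as follows; the instruction set (direct and indirect load, store, addition, a conditional jump) and all names are illustrative assumptions rather than a standard definition.

```python
def run(program, memory):
    acc, pc = 0, 0                     # accumulator and program counter
    while pc < len(program):
        op, arg = program[pc]
        if op == "LOAD":               # direct load: acc := memory[arg]
            acc = memory.get(arg, 0)
        elif op == "ILOAD":            # indirect load: acc := memory[memory[arg]]
            acc = memory.get(memory.get(arg, 0), 0)
        elif op == "STORE":            # direct store: memory[arg] := acc
            memory[arg] = acc
        elif op == "ADD":              # arithmetic on unbounded integers
            acc += memory.get(arg, 0)
        elif op == "JGTZ":             # comparison: jump if acc > 0
            if acc > 0:
                pc = arg
                continue
        pc += 1
    return memory

# Compute memory[2] := memory[0] + memory[1].
program = [("LOAD", 0), ("ADD", 1), ("STORE", 2)]
print(run(program, {0: 20, 1: 22}))  # {0: 20, 1: 22, 2: 42}
```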
A third class of sequential models operates on directed or undirected graphs by creating nodes and redirecting pointers between nodes. The oldest of these models is the Kolmogorov–Uspenskii machine (which operates on undirected graphs); machines operating on directed graphs are frequently called pointer machines or storage modification machines.
Parallel machine models have a potentially infinite number of processing units which interact either by means of communication or through a shared memory. The diversity of parallel models even exceeds that of the sequential models. The cellular array model consists of finite-state processors connected in a network; even though the components are finite-state devices, the network as a whole obtains universal computational power. Other models give each processor a private storage structure; the processors communicate by exchanging messages. In a contrasting model a growing number of processors share a single (potentially infinite) storage structure; a crucial parameter in such a model is the formal treatment of write-conflicts (disallow, arbitrary choice of winner, priority, $ \dots $). Other models combine local and global storage.
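The treatment of write-conflicts can be made concrete in a small sketch: several processors attempt to write to the same shared cell during one step, and the model's rule decides the outcome. The three modes below mirror the options listed above; all names are illustrative assumptions.

```python
def parallel_write(memory, writes, mode="priority"):
    """Resolve one step of concurrent writes; `writes` is a list of
    (processor_id, address, value) triples issued in the same step."""
    by_addr = {}
    for pid, addr, val in writes:
        by_addr.setdefault(addr, []).append((pid, val))
    for addr, attempts in by_addr.items():
        if len(attempts) == 1:
            memory[addr] = attempts[0][1]          # no conflict at this cell
        elif mode == "disallow":
            raise RuntimeError(f"write conflict at cell {addr}")
        elif mode == "priority":
            memory[addr] = min(attempts)[1]        # lowest processor id wins
        elif mode == "arbitrary":
            memory[addr] = attempts[0][1]          # any attempt may win
    return memory

# Processors 1 and 2 both write to cell 0; under "priority", 1 wins.
print(parallel_write({}, [(2, 0, "b"), (1, 0, "a")]))  # {0: 'a'}
```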
During a computation only a finite number of processors become active. The different models have various mechanisms for controlling process activation. Some message-passing models create processes dynamically in a graph-shaped interconnection pattern. Central-memory based machines specify for each instruction the number of processors which are activated during this instruction. A third control mechanism is based on recursive subroutines, where several recursive branches are executed in parallel on processors created at call time; these processors are deleted again after having returned their answer to the calling processor.
Machine models play a crucial role in defining basic notions like running time and space consumption in computational complexity, since these quantities ultimately have to be expressed in terms of computations on some idealized machine. It is therefore crucial to investigate to what extent the complexity notions thus obtained are model dependent. This issue is closely related to the question of the efficiency of the mutual simulations between the different models; since these models all have universal computational power, such mutual simulations exist.
It can be shown that the basic sequential models all simulate each other with polynomial overhead in time and constant factor overhead in space (provided space measures are defined correctly). As a consequence, the notions of logspace and/or polynomial time computability become mathematically well-defined machine-independent notions, whereas the notion of computability within quadratic time is inherently model dependent. This invariance property should be considered as a criterion for determining the standard class of sequential machine models.
For a large variety of parallel models it has been established that polynomial time on the parallel machine is equivalent to polynomial space in the sequential world. This equivalence does not hold for all parallel models proposed in the literature: some degenerate parallel models have the power of computing arbitrarily complex functions in constant time, while, on the other hand, there exist sequential models which turn out to be equivalent to the parallel ones, due to the fact that they can operate on exponentially large objects in unit time. The above equivalence, traditionally referred to as the parallel computation thesis, should therefore be considered as the characterization of some basic mode of parallelism.
A basic open question about sequential models is the power of non-determinism. The notorious $ {\mathcal P} = {\mathcal N} {\mathcal P} $? problem (which is machine independent as long as one considers the basic sequential models only) can be understood as asking for the overhead required for making a non-deterministic sequential device deterministic. Similarly, the unsolved $ {\mathcal P} = {\mathcal P} {\mathcal S} {\mathcal P} {\mathcal A} {\mathcal C} {\mathcal E} $? problem is related to the relation between time and space as a computational resource, and, in the light of the parallel computation thesis, to the power of parallelism.
An alternative mode of using machine computations, called alternation, was discovered in 1976 by A.K. Chandra, D.C. Kozen and L.J. Stockmeyer. Sequential alternating machines turn out to be equivalent to the standard parallel machine models, but the alternating models have the additional feature that logarithmic space in the alternating model is equivalent to polynomial deterministic time in the sequential world; similarly, polynomial alternating space is equivalent to deterministic sequential exponential time.
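The acceptance condition of an alternating machine generalizes the non-deterministic one: existential configurations accept if some successor accepts, universal configurations accept if all successors do. The following sketch evaluates this condition on a toy computation tree; the tree and all names are illustrative assumptions.

```python
def accepts(conf, kind, succ):
    """Evaluate an alternating computation tree: `kind` labels each
    configuration 'E' (existential), 'A' (universal), or, for leaves,
    'accept'/'reject'; `succ` maps a configuration to its successors."""
    children = succ.get(conf, ())
    if not children:                       # leaf: a final configuration
        return kind.get(conf) == "accept"
    results = (accepts(c, kind, succ) for c in children)
    return any(results) if kind[conf] == "E" else all(results)

succ = {"c0": ("c1", "c2"), "c1": ("a", "r"), "c2": ("a", "a")}
kind = {"c0": "A", "c1": "E", "c2": "E", "a": "accept", "r": "reject"}
print(accepts("c0", kind, succ))  # True: both universal branches accept
```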
[a1] | A.V. Aho, J.E. Hopcroft, J.D. Ullman, "The design and analysis of computer algorithms" , Addison-Wesley (1974) |
[a2] | A.K. Chandra, D.C. Kozen, L.J. Stockmeyer, "Alternation" J. Assoc. Comp. Mach. , 28 (1981) pp. 114–133 |
[a3] | J.E. Hopcroft, J.D. Ullman, "Introduction to automata theory, languages and computation" , Addison-Wesley (1979) |
[a4] | A.N. Kolmogorov, V.A. Uspenskii, "On the definition of an algorithm" Transl. Amer. Math. Soc. (2) , 29 (1963) pp. 217–245 (translated from Uspehi Mat. Nauk , 13 (1958) pp. 3–28) |
[a5] | K. Wagner, G. Wechsung, "Computational complexity" , Reidel (1986) |
[a6] | P. van Emde Boas, "Machine models and simulations" J. van Leeuwen (ed.) , Handbook of Theoretical Computer Science , North-Holland (1990) |
[a7] | M.A. Harrison, "Introduction to switching and automata theory" , McGraw-Hill (1965) |
[a8] | M.A. Arbib, "Theories of abstract automata" , Prentice-Hall (1969) |
[a9] | R.E. Kalman, P.L. Falb, M.A. Arbib, "Topics in mathematical system theory" , McGraw-Hill (1969) |
[a10] | J.E. Hopcroft, J.D. Ullman, "Formal languages and their relation to automata" , Addison-Wesley (1969) (Revised version: "Introduction to automata theory, languages and computation" , Addison-Wesley, 1979) |