Contents |
The term computation often indicates some sort of information processing implemented by human-designed machines or by natural processes. In computer science, Turing machines have been used to provide a formal definition of computation (in discrete domains). A Turing machine receives an input that is then iteratively transformed until the machine stops and the output is obtained. Usually, the intermediate transformations of the input are not recorded but only the mapping input / output is considered of interest. Computing by observing, on the other hand, investigates exactly such intermediate transformations. Specifically, the paradigm is based on to the composition of two distinct machines, called observed system and observer. The observer associates to each configuration of the observed system a unique symbol (observation). Then, to each sequence of configurations (behaviour) of the observed system, one associates a sequence of symbols: Each symbol is the observation of exactly one configuration in the behaviour. Such sequence of symbols constitutes the output produced by the composed system.
The paradigm, in the most general case, is presented in Figure 1. The behaviour of the observed system is the sequence of configurations 'Config.1', 'Config.2',.. The observer associates the symbol 'a' to the first configuration, the symbol 'b' to the second configuration, etc..The concatenation of these symbols, 'abc..' (a string) is the output of the composed system.
The motivations for computing by observing come from experimental science: there, the usual procedure is to conduct an experiment, observe the entire behaviour of a system and then take the result of this observation as the final output. A momentary picture of the observed system is irrelevant, and rather the entire behaviour of the system is evaluated. Computing by observing introduces an explicit separation of the observer from the observed system, stressing in this way the role of the observer in computation. This allows to define the computation by an opportune computational balance between the observed system and the observer. Moreover, for systems where configurations are not finite or that are characterized by infinite behaviours, e.g., cellular automata, this approach seems the natural way to investigate them as computing devices. In several applications of the idea, it has been shown that the composition of observed system and observer can form computationally powerful devices (such as Turing machines), even when the observed system and the observer are computationally simple, e.g., equivalent to finite state automata (see, e.g., Alhazov and Cavaliere(2004) and Cavaliere and Leupold(2004)). Moreover, as shown by Cavaliere et al.(2006) if the observed system has sufficient computational power one can obtain every possible computational device without changing the observed system but only by defining the appropriate observer. These results seem to be relevant especially in the area of natural and molecular computing where one could take advantage of the fact that simple experiments observed in a clever manner by simple tools could compute the same functions that much more complex experiments can compute.
Computing by observing has been applied in several contexts, by considering different kinds of observers and of observed systems. The idea was originally introduced by Cavaliere and Leupold(2004b) where the observed system was a P system, an abstract model of cellular processes. Following the same idea, new bio-inspired models of computation have been obtained in Alhazov and Cavaliere(2004) (observing sticker systems) and in Cavaliere et al.(2009) (observing DNA splicing). The generalization to formal languages theory has been proposed in Cavaliere and Leupold(2004), to string-rewriting in Cavaliere and Leupold(2006). Most of these results are obtained by designing an opportune observed system and an opportune observer, such that their composition produces a specified computing device. However this contrasts with the fact that, often, natural systems cannot be designed or modified, but their behaviour can be monitored by using different observers. In this respect, as shown in Cavaliere et al.(2006), if the observed system has sufficient computational power, every computational device can be obtained by only designing an appropriate observer, without modifying the observed system (with both - observers and observed system of limited computational power). A recent application concerns the computational power of the observer in the de-quantisation of quantum algorithms Calude et al.(2010).
One can often abstract a discrete dynamical system by representing the configurations of the system as sequences of objects/symbols (strings) and the rules that determine the transitions between different configurations by means of opportune rewriting rules (e.g., productions of a formal grammar). For this reason, many results on computing by observing can be understood in the framework of formal languages theory Salomaa(1987). In "Grammar/Observer (G/O) systems", introduced in Cavaliere and Leupold(2004), a formal grammar plays the role of the observed system and a finite state automaton plays the role of the observer. The grammar's configurations are the sentential forms of its derivations. The behaviours of the grammar are its derivations. The observer is a device mapping the sentential forms into one single symbol. Precisely, the observer is an appropriate variant of finite state automata where the set of states is labeled with the symbols of a specified output alphabet: Any computation of the observer produces as output the label of the state it halts in (one can assume as observers only deterministic and complete automata). As we do in this article, an observer can always be described by opportune regular expressions (we assume the basic notions of formal grammars and regular expressions, Salomaa(1987)).
A Grammar/Observer (G/O) system is a pair \( \Omega = (G,A) \) constituted by a generative grammar \( G =(N, T, S, P) \) and an observer \( A \) with output alphabet \(\Sigma \cup \{\bot\}\) which is also the output alphabet of the entire system \(\Omega\ .\) The observer's input alphabet is the union of \(N\) and \(T\ .\) Several modes of generation can be defined by considering different restrictions on the observer. The most general observer can write an empty and non-empty symbol in an arbitrary manner (free G/O systems).
A free G/O system generates a language in the following manner\[L_f(\Omega) = \{A(w_0,w_1,\dots ,w_n) \mid S=w_0 \Rightarrow w_1 \Rightarrow \dots \Rightarrow w_n,\ w_n\in T^* \}.\]
Here \( A(w_0,w_1,\dots ,w_n) \) is used as a more compact way of writing the catenation \( A(w_0)A(w_1)\cdots A(w_n) \ .\) In other words, the language contains all those strings corresponding to all the possible terminating derivations of the observed grammar. Derivations which do not terminate do not produce a valid output. One can consider the variant where the language produced by \( \Omega \) is defined as \( L_{\bot,f}(\Omega) = {L}_f(\Omega) \cap \Sigma^* . \) In this way the strings in \( L_f(\Omega) \) containing \( \bot \) are filtered out and they are not present in \( L_{\bot,f}(\Omega) \ .\) Thus the observer has the ability to reject a computation, when configurations of a certain type appear.
The functioning of the free G/O system presented in Figure 2 is the following one.
To each sentential form produced by the grammar \( G \) the observer \( A \) associates a symbol that can be \( a,b,c,\bot \) or the empty symbol \( \lambda \ .\) Thus every computation of the observer produces a unique symbol and the chronological catenation of these symbols is the output string. In Figure 2 the output string is \(\lambda a \cdots a b\cdots b c \cdots c\ .\) The mapping defined by the observer is specified by the corresponding regular expressions. The language \(L_f(\Omega)\) is obtained by considering all possible halting derivations of \(G\) and collecting all the output strings. In this case \(L_{\bot,f} (\Omega)\) is \(\{a^n b^n c^n \mid n >0\}\ .\)
We assume that \( \mathcal REG \) and \( \mathcal CF\) denote the classes of regular and context-free grammars, respectively and that \(REG\ ,\) \(CF\) and \(RE\) denote the classes of languages generated by regular, context-free, and type-0 grammars, respectively. Commutative context-free grammar, as defined in Cavaliere and Leupold(2004) is a class of grammars weaker than \(\mathcal CF\) and are denoted by \(LCCF\ .\) In Cavaliere and Leupold(2004) it has been shown that a G/O system composed of computationally simple components, namely a locally commutative context-free grammar (\(LCCF\)) as observed system and a finite state automaton as observer, is computationally complete.
Theorem : For each \(L \in RE \) there exists a G/O system \(\Omega = (G,A)\ ,\) with \(G\) an \(LCCF\ ,\) such that \( L_{\bot,f}(\Omega)=L. \)
An interesting fact is that the observer's ability to produce \( \bot \ ,\) i.e., to eliminate certain observed derivations, seems intuitively to be a powerful and essential feature to get high computational power. However, one can obtain all recursively enumerable languages over \( \Sigma \) simply by intersection of a language over \(\Sigma \cup \{\bot\}\) with the regular language \(\Sigma^*\ .\) Notice that recursive languages are closed under intersection with regular sets ( Salomaa(1987)). Therefore, there must exist some G/O system \(\Omega\) generating a non-recursive \(L_{f}(\Omega)\) (i.e., without using the filtering with \(\bot\)).
An interesting restriction of the general idea is to allow only changes in the observer, keeping unchanged the observed system (computing by only observing Cavaliere et al.(2006)). The following simple example shows that the changes of the observer can be important. Let us consider the context-free grammar \( G = (\{S,A,B,C\},\{t,p\},S,\{S \rightarrow pS, S \rightarrow p, S\rightarrow A, A\rightarrow AB, A\rightarrow C, B\rightarrow C, C\rightarrow t\})\ .\)
If \(G\) is coupled with the observer \(A'\) such that \(A'(w) = a \) if \( w\in \{S,A,B,C,t,p\}^+ \ ,\) then \(\Omega=(G,A')\) defines the language \(L_{\bot,f}(\Omega)= \{a^i \mid i \geq 2\}\ ,\) a regular language. In fact, the derivation \(S\Rightarrow pS \stackrel{n-2}\Rightarrow p^{n-1}S \Rightarrow p^n\) produces (when observed) the string \(a^{n+1}\ .\) Keeping the same grammar \(G\) we change the observer into \(A''\) such that: \[A''(w) = \left\{ \begin{array}{ll} \lambda & \textrm{ if } w=S,\\ a & \textrm{ if } w\in AB^*,\\ b & \textrm{ if } w\in C^+B^*,\\ c & \textrm{ if } w\in t^+C^*,\\ \bot & \textrm{ else} \end{array}\right.\] One can verify that \(\Omega=(G,A'')\) generates the language \(L_{\bot,f}(\Omega)= \{a^nb^nc^n \mid n > 0 \}\ ,\) a context-sensitive language. This example suffices to underline that part of the computation can be done by only defining the observer, keeping unchanged the observed system. This part can be important: As shown by Cavaliere et al.(2006) one can construct a (fixed) universal context-free grammar that can generate all recursively enumerable languages, by designing, for every language, an appropriate observer.
Computing by observing has been applied in the area of molecular computing to define bio-inspired accepting machines. This has been generally achieved by considering a bio-inspired computational device, introducing an input to such computing device, and observing its behaviour. If the behaviour is of an "expected" type (for example, it follows a predetermined pattern), the input is accepted, otherwise it is rejected.
In this article, we present the application (Cavaliere et al.(2009)) of the idea to splicing systems, a bio-inspired model of the recombination of double stranded DNA sequences (for simplicity, one refers to them as DNA strands) under the action of a ligase and restriction enzymes (endonucleases). An abstract accepting device, “splicing recognizer” (in short, SR), constructed by joining a decider, an observer, and a splicing system is depicted in Figure 3.
The intuitive functioning of the SR is the following one. An input "marked" DNA strand (represented by a string \(w\)) is inserted in a test tube. Due to the presence of restriction enzymes, the input strand changes, as it starts to recombine with other DNA strands present in the test tube. A sequence of intermediate marked DNA strands is then produced: Each new marked strand is obtained by splicing the previous one with other strands present in the test tube. This constitutes the behaviour of the input marked DNA strand. Schematically this is presented with the sequence of \(w, w', w'', w'''\) in Figure 3. The observer associates to each intermediate marked strand a certain symbol taken from a finite set of possible symbols. It writes these symbols onto an output tape in their chronological order. In Figure 3 this corresponds to the string \(a_1a_2a_3a_4\ .\) When the marked strand becomes a defined target strand, \( w''' \ ,\) then the observation stops. At this point the decider checks if the entire observed behaviour of the input marked DNA strand described by the string \(a_1a_2a_3a_4\) has followed a certain expected pattern, i.e., if it is in a certain language. If this is true, the input string \(w\) is accepted by the SR; otherwise it is rejected. Using this idea it is possible to obtain powerful accepting systems even when computationally simple components are used. For instance, observing a finite splicing system (i.e., finite set of rules) with observer and decider as finite state automata is enough to simulate an accepting Turing machine. This is a remarkable jump in acceptance power since it is known (e.g., Paun et al.(1998)) that a finite splicing system by itself can generate only a subclass of the class of regular languages.
These ideas can be formalized in the following way (we assume the notation of splicing systems, more precisely their formal H scheme, as presented in Paun et al.(1998)).
Consider an alphabet \(V\) (splicing alphabet) and two symbols \(\# \) and \(\$\) not in \(V\ .\) For a splicing rule \(r=u_1 \# u_2 \$ u_3 \# u_4\) and strings \(x,y,z_1,z_2 \in V^*\) one writes \((x,y) \Longrightarrow_{r} (z_1,z_2)\) iff \(x=x_1u_1u_2x_2, y=y_1u_3u_4y_2,z_1=x_1u_1u_4y_2, z_2=y_1u_3u_2x_2\ .\) One refers to \(z_1\) (\(z_2\)) as the first (second) string obtained by applying the splicing rule \(r\ .\) An H scheme is a pair \(\nu=(V,R)\) where \(V\) is an alphabet, and \(R\subseteq V^* \#V^*\$V^*\#V^*\) is a set of splicing rules.
For a given H scheme \(\nu=(V,R)\) and a language \(L \subseteq V^*\) one defines \(\nu(L)=\{z_1,z_2 \in V^* \mid (x,y) \Longrightarrow_{r} (z_1,z_2), \) for some \( x, y \in L,r \in R \}\ .\) When restriction enzymes (and a ligase) are present in a test tube, they do not stop acting after one cut and paste operation, but they act iteratively. Then, given an “initial language” \(L\subseteq V^*\ ,\) and an H scheme \(\nu=(V,R)\) one defines the iterated splicing as\[\nu^0(L)=L, \nu^{i+1}(L)= \nu^{i}(L) \cup \nu(\nu^{i}(L)), i \geq 0.\]
We are interested in observing the behaviour of a specific marked string introduced, at the beginning, in the initial language \(L\) and called “input marked string”.
Given an initial language \(L\ ,\) an “input marked string” \(w \in L\ ,\) a “target marked language” \(L_t\) and an H scheme \(\nu\ ,\) the scheme defines a sequence of marked strings that represents the behaviour of the input marked string \(w\ ,\) according to the splicing rules defined in \(\nu\ .\)
The sequence of marked strings, \(\langle w_0=w,w_1, \cdots, w_k \rangle\ ,\) for \(k \geq 1\) and \(w_k \in L_t\ ,\) is constructed in the following iterative way (\(w_i\) is the marked string associated to the set \(\nu^{i}(L), 0 \leq i \leq k\)). Each new marked string is obtained by splicing the old marked string, until a marked string \(w_k\) from the target marked language \(L_t\) is reached or the marked string cannot be spliced. The first string of the sequence is the input marked string, \(w_0=w\ .\) If \(w_i \in L_t\ ,\) \(i \geq 1\ ,\) then the sequence stops (the marked string is among the ones of the target marked language). If there is no \(x \in \nu^{i}(L)\ ,\) \(i \geq 0\ ,\) such that \((w_i,x) \Longrightarrow_{r} (z_1,z_2)\) or \((x,w_i) \Longrightarrow_{r} (z_1,z_2)\) for some \(r\in R\ ,\) then the sequence stops (the marked string cannot be spliced). If \(x,y \in \nu^{i}(L)\ ,\) \(i\geq 0\ ,\) with \(w_i=x\) (or \(w_i=y\)) and there exists a rule \( r \in R\) such that \((x,y) \Longrightarrow_{r} (z_1,z_2)\ ,\) then \(w_{i+1}=z_1\ .\) In this case, if the marked string can be subject to more than one splicing rule, producing different strings, the choice of the next marked string is done in a non-deterministic way (notice that it is assumed that the first string produced as the new marked one). Because the update of a marked string is made in a non-deterministic way, given an input marked string \(w\ ,\) an initial language \(L\ ,\) a target marked language \(L_t\ ,\) and an H scheme \(\nu\ ,\) it is possible to get different sequences of intermediate marked strings. The collection of all these sequences is denoted by \(\nu(w,L,L_t)\ .\)
As observer we use a finite state automaton with output (as defined for G/O systems).
As deciders we require devices accepting a certain language over the output alphabet of the corresponding observer and for this we use conventional finite automata. The output of a decider \(D\ ,\) for a word \(w\) in input, is denoted by \(D(w)\ .\) It consists of a simple “yes” or “no”.
These components can be put together to constitute a “splicing recognizer” that formalizes the machine presented in Figure 3.
A splicing recognizer (SR) is a quintuple \(\Omega = (\nu, A, D, L, L_t)\ ;\) \(\nu =(V,R)\) is an H scheme, \(A\) is an observer with input alphabet \(V\) and output alphabet \(\Sigma\ ,\) \(D\) is a decider with input alphabet \(\Sigma\ ,\) \(L\) and \(L_t\) are finite languages, respectively, the initial and the target marked language for \(\nu\ .\)
The “language accepted” by the SR \(\Omega \) is the set of all words \(w\in V^*\) provided there exists a sequence of marked strings accepted by the decider. Formally, \[ L(\Omega) = \{w \in V^*\mid \] there is \( s\in \nu(w,L,L_t) \) such that \( D(A(s)) = \)” yes” }.
It is known (e.g.,Paun et al.(1998) ) that the family of languages generated by splicing systems using only a finite set of splicing rules and a finite initial language is strictly included in the family of regular languages. However we can show that an SR composed by an H scheme with a finite set of rules, finite initial language, finite target marked language and finite state automata as observer and decider, can recognize non-regular languages. This example is an hint towards the fact that the combination splicing system-observer-decider can be computationally powerful even when the single components are computationally simple. In particular, we show how to construct an SR accepting the non-regular language \( \{o_l a^nb^n o_r \mid n \geq 0\}\ .\)
The SR \(\Omega=(\nu, A, D, L, L_t)\) is defined as follows: the H scheme is \(\nu=(V,R)\ ,\) with \(V=\{o_l,o_r,a,b,a',b',X_1,Y_1,X_2,Y_2\}\) and \(R=\{r_1:\#bo_r \$ X_2\#b'o_r, r_2: o_la' \# Y_2\$ o_l a \#, r_3: \#b'o_r \$ X_1\#o_r, r_4:o_l \# Y_1 \$ o_la'\# \}\ .\)
The initial language is \(L=\{X_2b'o_r,o_la'Y_2,X_1o_r,Y_1o_l\}\ .\) The target marked language is \(L_t=\{o_lo_r\}\ .\)
The observer \(A \) has input alphabet \(V\) and output alphabet \(\Sigma= \{l_0,l_1,l_2,l_3,\bot\}\ .\)
The mapping it implements is\[A(w) = \left\{ \begin{array}{ll} l_0 & w\in o_l(a^*b^*)o_r,\\ l_1 & w\in o_l(a^*b^*b')o_r,\\ l_2 & w\in o_l(a'a^*b^*b')o_r,\\ l_3 & w\in o_l(a'a^*b^*)o_r,\\ \bot & else. \end{array}\right.\]
The decider \(D\) is a finite state automaton, with input alphabet \(\Sigma\ ,\) that gives a positive answer exactly if a string belongs to the regular language \(l_0(l_1l_2l_3l_0)^*\ .\)
The functioning of \(\Omega\) can be described as follows. The observer can "check" that the splicing rules are applied in the order \(r_1,r_2,r_3,r_4\ ,\) and this corresponds to remove, in an alternating way, a \(b\) from the right and an \(a\) from the left of the input marked string. In this way, at least one of the behaviours of the input marked string will be of the kind accepted by the decider if, and only if, the input marked string is in the language \(\{o_l a^nb^n o_r \mid n \geq 0\}\ .\) Notice that, at each step, the marked string present is spliced only with one of the strings present in the initial language. To clarify the working of the SR \(\Omega\) we present the acceptance of the input marked string \(w_0=o_l aabbo_r\ .\) For simplicity, we only show the marked string and the output of the observer, step by step.
Step \(0\ :\) input marked string \(w_0= o_l aa bb o_r\ ;\) \(A(w_0)= l_0\ ;\)
Step \(1\ :\) apply rule \(r_1\ ;\) new marked string \(w_1=o_l aabb'o_r\ ;\) \(A(w_1)=l_1\ ;\)
Step \(2\ :\) apply rule \(r_2\ ;\) new marked string \(w_2=o_la'abb'o_r\ ;\) \(A(w_2)=l_2\ ;\)
Step \(3\ :\) apply rule \(r_3\ ;\) new marked string \(w_3=o_la'abo_r\ ;\) \(A(w_3)=l_3\ ;\)
Step \(4\ :\) apply rule \(r_4\ ;\) new marked string \(w_4=o_labo_r\ ;\) \(A(w_4)=l_0\ ;\)
Step \(5\ :\) apply rule \(r_1\ ;\) new marked string \(w_5=o_l ab'o_r\);\(A(w_5)=l_1\ ;\)
Step \(6\ :\) apply rule \(r_2\ ;\) new marked string \(w_6=o_la'b'o_r\ ;\) \(A(w_6)=l_2\ ;\)
Step \(7\ :\) apply rule \(r_3\ ;\) new marked string \(w_7=o_la'o_r\ ;\) \(A(w_7)=l_3\ ;\)
Step \(8\ :\) apply rule \(r_4\ ;\) new marked string (in the target marked language) \(w_8=o_lo_r\ ;\) \(A(w_8) =l_0\ .\)
Obviously, the entire observed behaviour \(l_0l_1l_2l_3l_0l_1l_2l_3l_0\) is of the kind accepted by the decider \(D\ ,\) so the string \(w_0\) is accepted by the SR \(\Omega\ .\) Using a similar reasoning one can see tha the defined SR \(\Omega\) accepts exactly the language \( \{o_l a^nb^n o_r \mid n \geq 0\}\ .\)
Using similar approaches, it is possible to prove that SRs are universal. Informally, this means that it is possible to simulate an accepting Turing machine by observing, with a computationally simple observer, a computationally simple splicing system (this has been shown in Cavaliere et al.(2009)).
Formally: Theorem : For each \(RE\) language \(L\) over the alphabet \(T\) there exists an SR \(\Omega\) using a splicing scheme \(\nu \) with finite splicing rules and finite initial language, such that \(\Omega\) accepts the language \(\{o_l' w o'_r \mid w \in L\}\ ,\) with \(o'_l, o'_r \notin T\ .\)
See also the introductory paper by P. Leupold:
http://doc.gold.ac.uk/aisb50/AISB50-S03/AISB50-S3-Leupold-paper.pdf