a local graph-rewriting system (aka artificial chemistry) inspired by Kruszewski and Mikolov term-rewriting Combinatory Chemistry
a confluent graph-rewriting system where all rewrites are local, which moreover can be used to do SKI calculus reductions. This is true in the same sense that Lafont Interaction Combinators can be used to do lambda calculus reductions.
"Define a molecular computer as one molecule which transforms, by random chemical reactions mediated by a collection of enzymes, into a predictable other molecule, such that the output molecule can be conceived as the result of a computation encoded in the initial molecule. "
Graphs are called "molecules" and they are formed by nodes connected by edges. They follow the same conventions as chemlambda v2.
The edge names are arbitrary strings (with letters a-z, A-Z, 0-9, without space character) and different edges of the graph have different names. To each node port there is only one edge connected.
FRIN and FROUT are used to cap the free half-edges of the graph, if any.
The Arrow node was introduced to allow simultaneous application of rewrites, followed by the deletion of the introduced Arrow nodes by the rewrite COMB (which "combs" the graph). Together with FRIN and FROUT, they are used in all chemlambda rewrite systems.
It is possible to have edges connected to the same node, but at different ports. For example "S b b a" describes a node of type "S", with an edge named "b" connecting port 1 and port 2 of that node.
The rewrites concerning the node S are different, according to the existence of an edge between S node ports 1 and 2 (in this case S represents the combinator S) or not (in this case S is a fanout node).
Any molecule (ie chemSKI graph) is described by a mol file. As an example, the molecule which is the translation of the combinator ((S K) K) I is described by the mol file
FROUT 1
S 2 2 3
A 3 4 5
K 4
A 5 6 7
K 6
A 7 8 1
I 8
with space " " as field separator and new line as a line separator. In the simulation section Play with chemSKI we use "^" as the line separator, for example the same molecule appears as the string
Mind that the edge names are different, which is not important, because they are arbitrary.
The line separator is not important, because we can deduce the number of ports (ie the number of edge names on a node line) from the node type, therefore we shall also use a notation like
FROUT 6 S 10 10 1 A 1 3 2 K 3 A 2 5 4 K 5 A 4 7 6 I 7
for the same molecule, where the line and field separator are the same. Thus a mol file appears as a string of node types and edge names. This convention is useful in the section about rewrites.
Each chemSKI rewrite is described by a left hand side (LHS) and a right hand side (RHS) pattern, which are mol files. We see each rewrite as a chemical reaction, which transforms the LHS into RHS. In order to be conservative in nodes and edges, we need to add tokens, ie a finite number (up to isomorphism) of small graphs.
Borrowing from the Project Hapax, we see that the addition of tokens solves the problem of unique name edges, even in a decentralized graphical reduction. It is enough to have "minted" tokens (ie tokens with unique edge names) and then to use them in the rewrites. The minting is of course an independent part from the reduction rewrites.
The tokens have 1 (for Arrow token), 2 (for S-K and I-A tokens) or 3 (for A-A, S-A, S-S tokens) edge names. Minting such a token is a process where the new, unique name edges of the token are produced. In a sense, each such token carries 1, 2 or 3 unique names.
Suppose that each token has a scalar cost, per token type, in our case there are numbers cost[A-A], cost[I-A], cost[S-A], cost [S-K], cost[S-S], cost[Arrow]. In the programs the cost per token is in the TokensCost vector [nodes.js line 23]. Then the cost of a collection of tokens is the sum of costs of tokens.
A reasonable choice is to take the cost of the token as the number of unique names it carries. In an economy application the cost per token would be decided by trading. In a world building application the cost per token would be tuned to reflect its scarcity.
As the Arrow token enters in many "COMB" rewrites, which just eliminate the Arrow nodes, it is also reasonable to assign a cost equal to 0 for this token, because Arrow nodes in graphs have a fleeting existence.
The cost of a rewrite is obtained from the InCost and OutCost. For economy applications we may use InCost[rewrite] to bill a rewrite. We would use perhaps only "In Tokens" from the customer.
InCost[rewrite] = cost[In Tokens]
The OutCost[rewrite] would measure the value of the tokens we obtain because we performed the rewrite.
OutCost[rewrite] = cost[Out Tokens]
From here, we may have the cost[rewrite] as the profit.
The cost of a graph reduction would be then the sum of the costs of rewrites performed.
For example, in the case when the cost measures the number of new names, this is reasonable. Also, this cost[rewrite] would be interesting in relation to world building applications, where we see graphs as artificial life, arXiv:2005.06060 section 3, or the experiment page How to test a quine. For such an application cost[rewrite] is related to the metabolism.
For economy or general computation applications, such cost would supplement other costs considered, like the number of computational steps performed by a virtual machine to do the rewrite. Indeed, for some quine graphs, like in Play with chemSKI menu the graph associated to (S I I) (S I I), the cost of the graph reduction oscilates in time around 0. The computational cost of running the reductions for such a quine should be something which increases approximately linear in time. For other graphs, like in Play with chemSKI menu the graph associated to (S I I) (S (K (S I I)) (S (K (S I)) (S (K K) I))), they may evolve into a "dirty" quine graph, which is an ever increasing graph which has a component which is a quine, but also more and more "waste" composed of small graphs which are inert. In such a case the cost proposed here is linear in time.
The existence of "waste" implies that, whatever cost we consider, it should also extend to the cost[Initial graph] and cost[Final Graph], such that it satisfies:
is the graphical rewrite version of the SKI combinators rewrite
S a b c -> (a c) (b c)
In chemSKI it is the chemical reaction
with no tokens needed.
The difference between the chemSKI rewrite and the SKI rewrite is in the treatment of duplication. Indeed, in Combinatory Chemistry, the chemical reaction proposed is the following:
S a b c + c -> (a c) (b c) + S
In order for the rewrite to happen, an already existing term "c" is needed. Therefore, in Combinatory Chemistry, this reaction is based on delegated duplication: somehow, a copy of "c" was already made and it is available for this reaction.
In chemSKI, the chemical reaction is conservative and also neutral, there are no in or out tokens. Instead of duplicating "c", the reaction just rewires the LHS where the combinator S appears as an S node with ports 1 and 2 connected, into the RHS where the node S has not the ports 1 and 2 connected, thus it plays now the role of a fanout. The duplication of the graph conected at port "c" will happen later, and progressively, due to the duplication rewrites. The chemSKI reaction does not need the previous existence of a copy of a term (or graph) "c".
The kind of this rewrite is "DIST", because it does not decrease the number of nodes, therefore it is favored by the rewrite strategy "GROW". It is not a DIST0 to DIST7 rewrite, because the action is "SA".In mol notation, the rewrite is described as:
LHS = S 1 1 2, A 2 a 3, A 3 b 4, A 4 c d
RHS = S c 1 2, A a 1 3, A b 2 4, A 3 4 d
The LHS and RHS patterns with the tokens, are the same:
LHS with tokens = S 1 1 2, A 2 a 3, A 3 b 4, A 4 c d
RHS with tokens = S c 1 2, A a 1 3, A b 2 4, A 3 4 d
We can encode the passage from (LHS with tokens) to (RHS with tokens) as the permutation
(0 1 2 3 4 5 6 7 8 9 a b c d e f)
(0 e 1 3 4 6 2 7 8 a 5 b c 9 d f)
used to duplicate a I combinator node by the S node which plays the role of a fanout in this rewrite. It can also be seen as a rewrite which eliminates a S node.
The kind "DIST" takes the name from "distributivity", but I used the name as a generic one for duplication rewrites or even for rewrites which do not increase the number of nodes. The rewrite. However the action field is "DIST1", which is a precise description because there are 8 types of truly inspired from "distributivity" rewrites (DIST0 to DIST7), as explained in [chemistry.js from line 820]. In the case of DIST0 to DIST7 rewrites there are new fields "t1", ..., "t4" which are used for the types of nodes in the RHS of the rewrite.
is a neutral rewrite (ie no tokens needed). It uses a S node in the role of an S combinator, connected to another S node in the role of fanout. It duplicates the S combinator, but this is done by rewiring of the two S nodes available.
Mind that even if the kind is "DIST", the action is not one of DIST0 to DIST7, therefore here is a case where I used the kind "DIST" as a placeholder for duplication. This is also justified by the fact that the rewrite does not decrease the number of nodes, it is therefore put in the category of rewrites which is favored by the "GROW" strategy.
has two LHS patterns, in both cases the K node, connected at port 2 or port 3 of a S node (when it has the role of a fanout), prunes the respective port.
Tokens rewrites are rewrites involving only tokens. They are needed in case we have tokens which do not appear as in tokens.
Synthesis rewrites are needed to produce new graphs from old ones, outside of the chemSKI reduction. For example we may want to copy an existing graph, to erase one, or to produce a new graph from existing ones.
Waste rewrites convert waste to tokens or useful graphs.
In this version of the programs the synthesis rewrites are not yet implemented.
The justification is that the tokens A-A and S-S never appear as In Tokens in the chemSKI rewrites, so there should be a way to convert these tokens into ones which can be used further.
In the programs, those rewrites are used in the part of the programs which updates the balance of tokens. This is because the tokens are not represented graphically. This choice was made in order to moderater the graphical complexity of the simulations.
In mol notation, the rewrite is described as:
LHS = S a b a, S c b c, A d e d, A f e f
RHS = S a b a, S c e c, A d b d, A f e f
In chemSKI it is the chemical reaction
We can encode the passage from LHS to RHS as the permutation
(0 1 2 3 4 5 6 7 8 9 a b c d e f)
(0 1 2 3 4 5 a 7 8 9 6 b c d e f)
is a rewrite which rewires two FROUT nodes, with the help of a S-K token.
In mol notation, the rewrite is described as:
LHS with tokens = FROUT a, FROUT b, S c e c, K e
RHS with tokens = FROUT e, FROUT c, S a e c, K b
In chemSKI it is the chemical reaction
We can encode the passage from LHS with tokens to RHS with tokens as the permutation
(0 1 2 3 4 5 6 7 8 9)
(0 6 2 5 4 1 9 7 8 3)
This rewrite is used to duplicate a graph and erase another. Indeed, suppose FROUT a and FROUT b are the free out nodes corresponding to the roots of two combinators, denoted Ta and Tb (converted to chemSKI). After the rewrite is done, the combinator Ta will be duplicated and the combinator Tb will be erased, by using only the chemSKI rewrites.
These duplication and erasure rewrites will have costs which, in particular, can be used to define the cost of a graph (here cost[Ta]) or to define the cost of erasure of a graph (here the cost of reductions of the Tb graph connected to K). This can be used in section Cost of rewrites.
In artificial life applications, such a rewrite, applied randomly, will trigger a duplication of a molecule and the erasure of another.
is a rewrite which rewires two FROUT nodes, with the help of a S-K token. In mol notation, the rewrite is described as:
LHS with tokens = FROUT a, FROUT b, S c e c, A d e d
RHS with tokens = FROUT e, FROUT c, S a e d, A d b c
In chemSKI it is the chemical reaction
We can encode the passage from LHS to RHS as the permutation
(0 1 2 3 4 5 6 7 8 9 a b)
(0 6 2 5 4 1 a 9 8 b 3 7)
This rewrite is used to duplicate a graph and connect one of the copies to another, via an A node. Indeed, as before suppose FROUT a and FROUT b are the free out nodes corresponding to the roots of two combinators, denoted Ta and Tb (converted to chemSKI). After the rewrite is done, the combinator Ta will be duplicated (by the node S) into Ta1 and Ta2 and Ta2 will be applied to the combinator Tb, by using only the chemSKI rewrites.
This rewrite may play the role of a rewrite from Combinatory Chemistry: Towards a Simple Model of Emergent Evolution arXiv:2003.07916:
Ta + Tb = Ta Tb
which does not have a correspondent in chemSKI rewrites.
There is a variant of chemSKI which is fully compatible with chemlambda v2 rewrites, called "chemSKI+λ" where the occurences of the node S when there is no edge between ports 1 and 2 are replaced with a chemlambda node FOE. You can toggle between chemSKI and chemSKI+λ by using the "change" button.
You can explore the differences between chemSKI and chemSKI+λ by using the parser window and button "λSKI -> mol".
The parser transforms any mix of SKI with lambda calculus (where the letters "S", "K", and "I" are always interpreted as combinators, not lambda calculus variables) into a mol file (i.e. into a graph). This graph can be reduced with chemSKI or chemSKI+λ (ie chemSKI plus chemlambda).
In the case of chemSKI the chemlambda nodes do not interpret the node S as a fanout, nor the pure chemSKI nodes (I,S,K) do not see FO and FOE as fanouts. Differently, if you change to the chemistry chemSKI+λ then the reductions work better, with the price of mixing the FO and FOE nodes into the pure chemSKI formalism.
Not implemented, but chemSKI is compatible with dirIC (directed Interaction Combinators). Indeed, chemlambda and dirIC chemistries have a common part, called in the programs "CHEMLAMBDABARE", and different parts, "CHEMLAMBDAEND" and "DICMOD" respectively. In order to obtain chemSKI + dirIC, we can just replace "CHEMLAMBDAEND" with "DICMOD" at lines 21, 23 in the file [pagelook-skilambda.js]. Some experiments were done. not described here. It is an interesting problem to consider.
All the programs are abundantly commented, for the convenience of the reader. The way rewrites are done by the programs is described shortly here.
The recognition of the LHS pattern is defined in [chemistry.js from line 293] in the function
findTransform(n1), where "n1" is the node where the recognition of the LHS pattern algorithm starts (it should have the type ot the "right" field of the transform). The rewrite is identified by the "action" field, for our example, the rewrite I-S, we have action:"terminIS", and the recognition of the LHS pattern for that transform is done at
[chemistry.js lines 448-456], which starts with:
"case "terminIS": case "terminIFOE": "
(Thus we may have the same algorithm of recognizing LHS patterns for different action fields, which is not surprising, because the algorithm describes an abstract way to visit a pattern and to perform checks.)
If the LHS pattern of the transform trans (defined by the transform "action" field and the node n1 seen as a node of the type defined by the "right" field) is found then the function returns the transform trans. Then the graph rewrite is done by the function doTransform(n1, trans), defined in [chemistry.js from line 602]. At the level of mol files, doing the rewrite means: eliminate the lines in the mol file which correspond to the nodes from the identified LHS pattern, add to the mol file the new lines corresponding to the RHS pattern.
There is no canonical way, no particular order of the lines in the mol file associated to a graph. Any permutation of the lines in the mol file describes the same graph. Any renaming of the edges describes the same graph. That is why we randomly shuffle the mol file before looking for rewrites and we also randomize the choice of rewrites which will be applied. The probability of application of rewrites depends on their "kind", but roughly there is a parameter which is controlled by the "rewrites weight slider" which favors either the rewrites of the kind which increases (or does not decrease)the number of nodes (ie "GROW"), typically of kind "DIST", or the kind which decreases the number of nodes (ie "SLIM"), such as kinds "TERMINATION" or "BETA". Another control we have is to associate to the edges of the graph an age (how many time steps since that edge appeared) and then to favor the rewrites performed on older edges first. As the programs don't do rewrites in parallel (although the first version, in awk, chemlambda-gui, does this for chemlambda v2), this age-based choice of rewrites is a good approximation of the choice to do as much as possible independent rewrites, as soon as possible. This age-based strategy is not used in this version of chemSKI. See Alife properties of directed interaction combinators vs. chemlambda. Marius Buliga (2020), arXiv:2005.06060.
The chemSKI molecule associated to a SKI combinator term is simply the syntax tree of the term, which has as leaves the S,K,I combinator nodes. Go to Play with chemSKI to use the parser. Use the window "λSKI>mol" to parse a SKI combinator to chemSKI. Actually this is a parser from lambda calculus AND SKI combinators to chemlambda AND chemSKI, where the letters "S", "K" and "I" are treated as in SKI. The source is at [0parser.js].
Examples:
SII will not work, use a space for application: S I I will be understood as the term (S I) I.
(S I I) (S I I) will work
Also, the parser works for lambda terms, like (\x.x x) (\x.x x) or (\x.\y.x) z.