Advertisement

Self-replicating, self-modifiying scripting language

Started by March 31, 2009 03:51 AM
10 comments, last by polyfrag 15 years, 10 months ago
Hello, a long time ago I posted some thoughts I had about making an evolving A-life simulation, although I wasn't very clear in explaining myself, but now I've gotten more interested in the idea again and am wondering what the more knowledge people here have to say. The idea I had was to create a self-replicating, self-modifying A-life. At first I got the idea from "Core Wars" and at the time I was trying to write an assembly-language-like scripting language simulation where the scripts would (hopefully) evolve and (hopefully) create interesting results I could look at. But my first attempts weren't very successful. But anyways, the intriguing idea I had for my A-life simulation was to use a scripting language that could self-modify, and where individual scripts thus had the ability to self-replicate and interact with other scripts. Basically, they would be able to change their own instructions, data, etc., change their own internal workings on-the-fly, replicate, and interact with other scripts..... this was very interesting, and I was trying to figure out how this scripting language would be like, but I had many headaches in dealing with the problem of self-reference and infinity (specifically, how a script would be able to traverse its own structure, replicate and assemble a copy, while not 'going over' anything twice, or some sort of problem occuring as a result of its workings/inter-connectivity changing during self-reading and replication.) The 'scripting language' that I was designing was not even a scripting language but a 3-dimensional mesh of inter-connected instructions/variables.... this was what I had arrived at in my attempts to try to come up with this self-replicating, self-modifying scripting language. Is there anything like this that's already been made by someone? What are your thoughts on this? Is it impossible?
LISP. But the semantics are easily broken by random mutations.

I've personally worked on a mutation-resistant language based on forth to perform simulated annealing, I guess you could adapt code-manipulation primitives to the stack-based skeleton (perhaps using ideas from here).
Advertisement
Check Tierra. http://life.ou.edu/tierra/
Yes, I got started on this idea of the "3D" scripting language from some flowchart I saw LISP used with genetic algorithms, but LISP isn't self-modifiable or self-replicating.... that was the whole point of what I wanted to create.


And yes, I've seen the classic example of Tierra. This is one of the things that got me interested.

And I am not sure about the "mutation-resistant language". What makes it different from LISP?

And code-manipulation primitives are exactly what I'm interested in. But what would these primitives be like? How would they operate on code?
I'm working on a language that's kind of lispy, in that the code can be treated like data and easily modified at runtime. It is called Circa. I'm also planning on exploring different ways that code can be automatically improved at runtime.

One method that I've had some success with, is backpropogation-style training across a dataflow diagram. The short version is this:

1. Construct a dataflow diagram for the code in question. (for me this is easy, that's the base representation I use)
2. Pick some values at the top that you want to get modified. These can be called the "inputs".
3. Pick some results at the bottom that you want to change. For each result, you specify a "desired" value.
4. Since you know exactly how every result was computed, you can then go backwards up the dataflow diagram, and figure out what you need to change to make the whole code spit out your "desired" values.
5. Eventually you will reach the "inputs" at the top, and you can change them to be what you need.

So this method can automatically adjust values, but it doesn't modify structure in any way.

I haven't started experimenting with structure-changing methods, but I don't think they will be too hard. The key is having a code representation which is easy to modify. An assembly-language representation is pretty bad because it's too easy for a small modification to completely change the effect of the code. A lispy representation is better. A dataflow representation is pretty good, I think.

Dataflow representation isn't Turing-complete, but I think for this domain, you are much better off having a restrictive representation which is easier to reason about. Turing-complete representations are pretty hard to reason about (the obvious example being, you can't always know if a piece of code will terminate).

Anyway the Circa project is here if you're interested: http://github.com/andyfischer/circa. It's totally a work in progress, don't expect it to do anything.
Quote:
Original post by polyfrag
Yes, I got started on this idea of the "3D" scripting language from some flowchart I saw LISP used with genetic algorithms, but LISP isn't self-modifiable or self-replicating....
LISP is modifiable. Any LISP program is by definition also a LISP value.

Quote:
And I am not sure about the "mutation-resistant language". What makes it different from LISP?
LISP represents source as a tree (of S-expressions). Mutating a tree is hard because crossing-overs can easily change the number of arguments to a function and therefore cause runtime errors. The point of using FORTH-like syntax is that adding, removing or swapping program parts never causes actual semantic issues: every instruction is still able to get the correct elements from the stack.

Quote:
And code-manipulation primitives are exactly what I'm interested in. But what would these primitives be like? How would they operate on code?


LISP code is a tree (lists of lists of lists of lists of etc) and so list-manipulation functions allow you to manipulate LISP programs. Forth code is a sequence of atomic expressions, so sequence manipulation operators (swap, splice, insert, remove) allow you to manipulate Forth code.
Advertisement
But you couldn't use LISP to make a self-replicating script, could you?

[edit] In you LISP you have an instruction that basically copies a whole tree of instructions, right? This is not what I'm interested in. I'm interested in a script that can traverse its entire structure and assemble a replicate of itself 'piece by piece'.
Quote:
Original post by polyfrag
But you couldn't use LISP to make a self-replicating script, could you?
Simply write a piece of LISP code that reads itself as input, and generates a new program as output... this is as easy as reading a tree and generating a tree.

But a tree is not able to repeat itself, is it? Unless you have a for/while loop instruction at the very top of the tree.

And how exactly are variables handled in LISP? How could they fit into a tree-hierarchy?


For the scripting language, I was thinking that the tree could 'loop back' at any point in the script, back to any point, creating loops, conditional pathways, etc. And that this looping back could also be self-modified. I.e., the connections could be moved to any nearby connected instruction and connections could also be formed on-the-fly. This would allow the scripts to traverse themselves to replicate, to modify themselves, and to interact with other script/A-life-organisms.

This means that the scripting language would be a giant interconnected mesh without a top-down tree hierarchy. (This is why I wanted to use 3D to represent it, since only that would allow the entire structure and inter-connectivity to be seen.)

[edit]

Quote:
Simply write a piece of LISP code that reads itself as input, and generates a new program as output... this is as easy as reading a tree and generating a tree.


So would it actually read AND parse its own code if it used itself as input? Or does input simply involve the representation of the instructions?

[Edited by - polyfrag on April 1, 2009 2:15:49 AM]
Quote:
Original post by polyfrag
But a tree is not able to repeat itself, is it? Unless you have a for/while loop instruction at the very top of the tree.
Any language can be used to construct a program that outputs its own code—it's called a Quine and it's a natural consequence of being Turing-complete.

Now, any dynamic language is able to interpret source code on the fly at runtime, meaning that you can modify your quine to execute the code it outputs (leading to a self-replicating program). You can do this in many languages, such as &#106avascript or PHP.

The problem is that you cannot easily change the output code in these languages: there's a strict syntax you need to respect or you'll end up with syntax errors. The difference with LISP is that LISP is able to manipulate a LISP program as a tree, whereas languages like &#106avascript or PHP can &#111;nly manipulate their own code as a string (or in some rare cases through clunky reflection mechanisms). <br><br><!--QUOTE--><blockquote><span class=smallfont>Quote:</span><table border=0 cellpadding=4 cellspacing=0 width=95%><tr><td class=quote><!--/QUOTE--><!--STARTQUOTE-->And how exactly are variables handled in LISP? How could they fit into a tree-hierarchy?<!--QUOTE--></td></tr></table></blockquote><!--/QUOTE--><!--ENDQUOTE-->With the exception of Forth (and possibly Prolog, depending &#111;n your religious philosophy), every programming language in the world includes variables in a tree: expressions are trees which have variables and literals as their leaves, and statement blocks are trees which have expressions and declarations as their leaves. <br><br>However, I strongly advise you to stay away from variables. The existence of variables is yet another constraint your modification algorithms will have to take into account. If anything, use De Bruijn indices, they have a natural self-verification structure. <br><br><!--QUOTE--><blockquote><span class=smallfont>Quote:</span><table border=0 cellpadding=4 cellspacing=0 width=95%><tr><td class=quote><!--/QUOTE--><!--STARTQUOTE-->For the scripting language, I was thinking that the tree could 'loop back' at any point in the script, back to any point, creating loops, conditional pathways, etc. And that this looping back could also be self-modified. I.e., the connections could be moved to any nearby connected instruction and connections could also be formed &#111;n-the-fly. This would allow the scripts to traverse themselves to replicate, to modify themselves, and to interact with other script/A-life-organisms.<!--QUOTE--></td></tr></table></blockquote><!--/QUOTE--><!--ENDQUOTE-->Sure. But would that be <i>easy</i>? If self-replicating behavior is part of the organisms, you probably don't want to have a thousand-page program that does nothing but replicate. <br><br><!--QUOTE--><blockquote><span class=smallfont>Quote:</span><table border=0 cellpadding=4 cellspacing=0 width=95%><tr><td class=quote><!--/QUOTE--><!--STARTQUOTE-->So would it actually read AND parse its own code if it used itself as input? Or does input simply involve the representation of the instructions?<!--QUOTE--></td></tr></table></blockquote><!--/QUOTE--><!--ENDQUOTE-->In LISP, both of these mean the same thing (but to be a little bit more helpful, you would probably write a bootstrap program that constructs the initial program from a clean tree representation, because you probably don't want to write quines by hand)

This topic is closed to new replies.

Advertisement