Scripting Language Genesis

umbrae · 2004-12-11T08:40:06

For a language that is (also) called voodoo. Voodoo is a very flexible concept at the moment. Even though I have done quite a bit of work on it, I am willing to drop the whole thing if someone has a better idea. In fact I am probably too willing: I have rewritten different versions of voodoo about 4 times in total, because I always think of a better way to do it. So although I say "is going to be", I really mean "the current idea is".

Voodoo is going to be object oriented with a very flexible source file syntax. I basically have had a look at quite a few languages, and I want to take the best out of them, but still keep it simple. I like object oriented because it is possible to use design patterns, and they are cool. I also have had a look at groovy and if it didn't just run on the java virtual machine I would probably use it.

The purpose of this language is to embed it into a game engine. It would have all the meta level programming, and all the actual algorithms would be implemented in C++. Voodoo would be good for abstract structures and objects, C++ is good for optimised code.

This is a list of cool things I would like to use in Voodoo.

Tabbed based structure (instead of { }, like (I think) python?) i.e. no semi-colons
Closures
Pure Object Oriented (no natives)
Operator Overloading (just for data objects - vectors, other types of numbers)
Native syntax for lists (maps and ranges also possible)

I have written a Lexer and a Parser for a version of voodoo that had C like code syntax, and I have written tree manipulation code (using the visitor design pattern). The aim of the language is to have a strongly typed (think - compile time errors instead of runtime errors), object oriented, easy to write, easily embedded language.

Any really cool things you would like to see in a language? Any Bad Programmer no Twinkie things I should look out for?

The current ideas / examples.

[Edited by - umbrae on November 1, 2004 9:16:50 AM]


Started by umbrae November 01, 2004 07:28 AM

116 comments, last by cmp 20 years, 2 months ago

umbrae

Author

308

November 21, 2004 06:33 PM

Quote:
Original post by cmp
no, you would simply say that a divide by zero throws a System.DivideByZero exception, which then would have a throw or rethrow method.

So divide by zeros are rather unrecoverable then...

Quote:
btw.: it would be at least methodically interesting if every method, since it is an object, would have a method named install_handler or catch, which would accept a closure. this way you would save another keyword ;)
but it would be a bit hard to use, since you would have to define every handler in front of the code.
*** Source Snippet Removed ***

I was thinking about the reasons to have try blocks, and they seem mostly to be able to have a safe (exceptions caught) method call. So yes it would be cool to have a method call also be able to accept a closure. The closure could be called to clean up after some file opening / reading, or the closure could be exception handling for that method only. This way there are no try blocks, just method wide exceptions, and method call exceptions.

object.nastymethodthatcausesexceptions(boo).catch_exceptions
    catch (exception.someexception)
        system.println("Object caused an exception again...")
        e.cancel_method()
system.println("Called the nasty method")

Even if the method caused an exception, code flow will just continue from where it was called.
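One way to picture a method call that accepts a handler closure, as a rough Python sketch rather than voodoo syntax. The helper `call_with_handler` and the names `nasty_method` and `log` are made up for illustration; the only idea taken from the post is that a matching handler cancels the call and flow continues after it.

```python
def call_with_handler(method, args, handlers):
    """Call method(*args); if it raises, run the first matching handler closure.

    handlers is a list of (exception_type, closure) pairs. A handled exception
    cancels the call (None is returned) and control goes back to the caller.
    """
    try:
        return method(*args)
    except Exception as exc:
        for exc_type, handler in handlers:
            if isinstance(exc, exc_type):
                handler(exc)
                return None
        raise  # nothing matched, so the exception keeps propagating


log = []


def nasty_method(value):
    # Hypothetical method that always fails.
    raise ValueError(f"bad value: {value}")


call_with_handler(
    nasty_method,
    ("boo",),
    [(ValueError, lambda exc: log.append("Object caused an exception again..."))],
)
log.append("Called the nasty method")
```

Both messages end up in `log`, in that order, which matches the "code flow will just continue" behaviour described above.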

Quote:
Original post by Joakim_ar
Here's my language ideas, close to the ones of GameMonkey, but tweaked to something I like better.

I like the syntax for member variables, no need for the "self" or "this" keyword.

Multiple variables able to be returned from a function, interesting. I don't think the language I am building needs that, but it is cool none the less.

Quote:
Assignments (= += -= *= /= ^= &= |=) to variables can not be made inside expressions

I agree. I think that only optimised and fast languages really need this - and even then it is arguable. Why should testing for something, an expression, have a side effect?

What do you think about expression computation? Should the expression "boolean_method() && another_method()" return false if boolean_method() returns false, or should the whole expression be calculated (similar with or)?

umbrae

Author

308

November 21, 2004 08:58 PM

So I was thinking about how the bytecode will work and I'm having a few thoughts. Since I only have one return type, would it be bad to have one register type thing, that always contains the return value?

push 3
push 9
call <add method>    // the parameters used get popped off the stack
// $r now contains the return value

Also I'm not sure how to have the object on the stack that the method gets called on. Either I have it at the start of the parameters, the end, or have the object passed with the call.

For the code

a.method(2, 4, 5)

// object at the start
push a
push 2
push 4
push 5
call method

// object at the end
push 2
push 4
push 5
push a
call method

// object with method call
push 2
push 4
push 5
call a method

// object as addition to call
push 2
push 4
push 5
obj a
call method

If the object is at the start, the vm needs to know the number of parameters the method takes - so I think that's out of there.

If the object is at the end something like

a.method(4).method(2)

// looks like this
push 4
push a
call method
// if using a stack based return need a temp register
pop $1
push 2
push $1
call method

Does anyone know how C++ does it? I had a bit of a look, but it was very nasty code (powerpc code is very nasty, possibly more nasty than x86 but not sure)

I'm thinking about having two stacks, one for variables and one for parameters, what do people think? The variable stack is expanded as it is needed.

I have uploaded a few files with fairly random code in them. Just playing with ideas. I think I will have two files, one that is completely byte code, and another xml file that says which bytes go with which method, and contains the object tree and a few other things.

cmp

138

November 22, 2004 07:23 AM

have a look at Lightweight C++ (the page seems to be down currently). it translates most c++ source into c, which is basically like assembler.

for your vm, i would simply use a return stack and a 'real' stack: if you call a method, the return address goes on the return stack, and everything else (parameters and return values) gets laid on the 'real' stack. additionally you would need sth. like a heap, where you could allocate memory whose lifetime is not limited to a method.
to call a method of an object you would call the method with the object as the 1st parameter, if the method is not virtual. if it is virtual, you would have to do a lookup in the virtual function table and call the method specified in it, again with the object as the 1st parameter.
if you don't want the vm to be able to jump to any address in memory, you would have to supply an opcode which does the whole lookup and jump process. you could for example give every method a unique name (a simple 32bit number) and use it as a parameter of the opcode:
vcall objects_address method_name

if you would then do for example a simple print operation with the object io, having a method named print, which accepts a string as the parameter, it would look like this:
push io
push hello
vcall io print

edit: i had a quick look at your file and i wouldn't use inc to increment the stack pointer, since you would usually expect that inc would actually increment a number.
edit2: it is also quite important that my whole vm is stack based: instead of add a b you would write:
push a
push b
add
so you would actually have to write:
push io
push hello
push io
push print
vcall
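A minimal Python sketch of the two-stack layout described in this post: return addresses live on their own stack, while parameters and return values share the data stack. The opcode names (`push`, `add`, `call`, `ret`, `halt`) and the tuple encoding are assumptions for illustration, and `call` here takes a fixed address rather than doing the vcall lookup.

```python
def run(code):
    """Execute a list of (opcode, argument) pairs and return the data stack."""
    data = []      # the 'real' stack: parameters and return values
    returns = []   # the return stack: only return addresses
    pc = 0
    while True:
        op, arg = code[pc]
        pc += 1
        if op == "push":
            data.append(arg)
        elif op == "add":
            right = data.pop()
            left = data.pop()
            data.append(left + right)
        elif op == "call":
            returns.append(pc)   # remember where to continue
            pc = arg             # jump to the method body
        elif op == "ret":
            pc = returns.pop()
        elif op == "halt":
            return data
        else:
            raise ValueError(f"unknown opcode: {op}")


program = [
    ("push", 3),
    ("push", 9),
    ("call", 4),     # call the add method that starts at address 4
    ("halt", None),
    ("add", None),   # address 4: body of the add method
    ("ret", None),
]
result = run(program)
```

Because the return address never sits between the arguments and the result, the method body can pop its parameters and push its return value without stepping around anything.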

Mayrel

348

November 22, 2004 08:01 PM

Quote:

Quote:
Original post by Joakim_ar
Here's my language ideas, close to the ones of GameMonkey, but tweaked to something I like better.

I like the syntax for member variables, no need for the "self" or "this" keyword.

Well, C++ manages to use member variables without, for the most part, needing to use "this". It uses "this" when you have to refer to the object itself, whilst ".x" doesn't appear to suggest an intuitive way to refer to that.

If you were referring to Python, note that "self" isn't a keyword, it's just a parameter.

Quote:

Quote:
Assignments (= += -= *= /= ^= &= |=) to variables can not be made inside expressions

I agree. I think that only optimised and fast languages really need this - and even then it is arguable. Why should testing for something, an expression, have a side effect?

Why shouldn't it?

Quote:

What do you think about expression computation? Should the expression "boolean_method() && another_method()" return false if boolean_method() returns false, or should the whole expression be calculated (similar with or)?

Apply the principle of least astonishment. It's likely that most of your users will be familiar with short-circuiting behaviour for "and" and "or". Unless you have a good reason for the language to behave contrary to how people would expect, don't.
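To make the two behaviours concrete, here is a small Python sketch. Python's `and` short-circuits, so the eager variant is emulated by calling both methods first; the function names are borrowed from the question and `calls` is just a made-up trace list.

```python
calls = []


def boolean_method():
    calls.append("boolean_method")
    return False


def another_method():
    calls.append("another_method")
    return True


# Short-circuiting: the right-hand side is skipped once the left is False.
short_result = boolean_method() and another_method()
short_calls = list(calls)

# Eager evaluation, emulated by calling both operands before combining them.
calls.clear()
left = boolean_method()
right = another_method()
eager_result = left and right
eager_calls = list(calls)
```

Both variants produce `False`; they differ only in whether `another_method` runs, which matters exactly when that method has a side effect.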

Quote:
Original post by umbrae
So I was thinking about how the bytecode will work and I'm having a few thoughts. Since I only have one return type, would it be bad to have one register type thing, that always contains the return value?
push 3
push 9
call <add method>    // the parameters used get popped off the stack
// $r now contains the return value

Why bother? Unless there's a good reason not to store the return value on the stack, why make extra work for yourself?

Quote:
Also I'm not sure how to have the object on the stack that the method gets called on. Either I have it at the start of the parameters, the end, or have the object passed with the call.
For the code
a.method(2, 4, 5)
// object at the start
push a
push 2
push 4
push 5
call method
If the object is at the start, the vm needs to know the number of parameters the method takes - so I think that's out of there.

But don't you think it should know the number of parameters the method takes? In fact, don't you think it will? Unless you're planning to have variable length argument lists.

Quote:
If the object is at the end something like
a.method(4).method(2)
// looks like this
push 4
push a
call method
// if using a stack based return need a temp register
pop $1
push 2
push $1
call method
Does anyone know how C++ does it? I had a bit of a look, but it was very nasty code (powerpc code is very nasty, possibly more nasty than x86 but not sure)

No temporary register needed.

push 2
push 4
push a
call method
call method

When possible, C++ will do what I did there. But note that asking 'what C++ does' is meaningless, because C++ compilers are free to do anything.
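The five instructions above can be traced with a toy interpreter. This is only a sketch under assumptions: the object sits on top of the stack with one parameter beneath it, return values are pushed back onto the same stack, and the `Digits` class is invented so that the call order is visible in the result.

```python
class Digits:
    """Toy object: method(n) appends the digit n to the current value."""

    def __init__(self, value):
        self.value = value

    def method(self, n):
        return Digits(self.value * 10 + n)


def run(program):
    stack = []
    for op, arg in program:
        if op == "push":
            stack.append(arg)
        elif op == "call":
            obj = stack.pop()      # object is on top of the stack
            param = stack.pop()    # its single parameter is beneath it
            stack.append(getattr(obj, arg)(param))  # result replaces both
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack


a = Digits(1)
program = [
    ("push", 2),
    ("push", 4),
    ("push", a),
    ("call", "method"),   # a.method(4)
    ("call", "method"),   # (result).method(2)
]
final_stack = run(program)
```

The first call consumes `a` and `4` and leaves its result on top of the waiting `2`, so the second call finds its object and parameter already in place.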

Quote:

I'm thinking about having two stacks, one for variables and one for parameters, what do people think? The variable stack is expanded as it is needed.

That seems unnecessary. Once called, a function never receives any more parameters, so you could just put the parameters at the bottom of the variable stack.

CoV

umbrae

Author

308

November 23, 2004 03:58 AM

Quote:
Quote:
Quote:
Original post by Joakim_ar
Here's my language ideas, close to the ones of GameMonkey, but tweaked to something I like better.

I like the syntax for member variables, no need for the "self" or "this" keyword.

Well, C++ manages to use member variables without, for the most part, needing to use "this". It uses "this" when you have to refer to the object itself, whilst ".x" doesn't appear to suggest an intuitive way to refer to that.

If you were referring to Python, note that "self" isn't a keyword, it's just a parameter.

I was referring to when you have an ambiguity, a local variable with the same name as a member variable. In C++ you can use the 'this' pointer to bypass the default 'local variables are first' rule. As I remember other languages use the 'self' keyword (I've used it in REALBasic for one). As you probably know, I dislike keywords and having just a dot in front of a variable name seemed like a good idea. I think I was taking Joakim_ar a little differently, perhaps he accesses all member variables this way.

Quote:
Quote:
Quote:
Assignments (= += -= *= /= ^= &= |=) to variables can not be made inside expressions

I agree. I think that only optimised and fast languages really need this - and even then it is arguable. Why should testing for something, an expression, have a side effect?

Why shouldn't it?

Why should it? It's a condition not an execution, it goes against logic itself. Even if you have function calls in a conditional, they themselves shouldn't have any side effects. Code such as this:

int errorValue;
if (errorValue = doSomethingThatCouldFail())
{
    cout << "Something failed, Error value:" << errorValue << endl;
}

Is horrid and outdated. It may be the nicest and fastest way to do things in C / C++, but shouldn't be used in other languages that have better constructs. It is these things that make learning and using C / C++ such a nightmare, you are expected to know that an integer can count as a boolean, and that assignments can happen in conditionals.

If you read the conditional in english it says "if errorValue equals the result of doSomethingThatCouldFail() then execute the next line" which goes against what it actually does.

Quote:
Quote:
What do you think about expression computation? Should the expression "boolean_method() && another_method()" return false if boolean_method() returns false, or should the whole expression be calculated (similar with or)?

Apply the principle of least astonishment. It's likely that most of your users will be familiar with short-circuiting behaviour for "and" and "or". Unless you have a good reason for the language to behave contrary to how people would expect, don't.

When you say that most of my users will be familiar with the short-circuiting behaviour, are you sure? This is a scripting language after all, not a tool for C programmers. In fact I hope most users of this language won't need to know how to program in C, to be able to use it. I even hope that people who use my language don't need to know how to program to find it useful.

So when I apply the principle of least astonishment, I think where the users of this language are coming from. Someone at least experienced in logic will understand the concept of a conditional and probably find the short-circuiting behaviour unusual, but this would only be if they used the concept of methods returning booleans. I personally even found this concept unusual. If I was to only allow const methods in an expression the short-circuiting argument would be moot. The more important idea here is whether to allow conditionals to have side effects.

I think conditionals shouldn't have side effects, and I know that I don't use them when I want to code well. I think they are a cause of bugs, but more importantly the code is harder to read.

Perhaps one of the overall themes here is whether the language should force good habits by limiting the things that can be done, or should the good habits be left to the user. I would say that if a feature could lead to bad habits, and can safely be removed then it should.

Quote:
Quote:
Original post by umbrae
So I was thinking about how the bytecode will work and I'm having a few thoughts. Since I only have one return type, would it be bad to have one register type thing, that always contains the return value?
push 3
push 9
call <add method> // the parameters used get popped off the stack
// $r now contains the return value

Why bother? Unless there's a good reason not to store the return value on the stack, why make extra work for yourself?

Simplicity and beauty. I crave perfection and hate code that just works. I think I have a bit of a problem, everything that I want to code myself has to be perfect (not including assignments, (deadly) group work and plain work), otherwise I rewrite it. The problem with this obsession is that a lot of things don't get finished (or sometimes even started). The other thing is that once I have solved a problem, there is no point programming it - it is solved, and so some projects are left.

I estimate that I have written about 25,000 lines of code on this language project alone - and at the moment I am going to start again (new syntax etc). That's not saying the code that has been written is useless, often I will sit there with the old code on one side of the screen, and the new code on the other and just copy large sections of it.

I did the personality test and found I was INTP, which sounds like it suits me well.

Quote:
Quote:
If the object is at the start, the vm needs to know the number of parameters the method takes - so I think that's out of there.

But don't you think it should know the number of parameters the method takes? In fact, don't you think it will? Unless you're planning to have variable length argument lists.

The compiler will, for sure. But the virtual machine just runs the code that it sees. The methods themselves pop off the parameters, and pop on the return value, the vm itself just executes. Some part of it will know how many parameters, but that part shouldn't have to deal with the grindstone, the executing core.

That is, unless the way parameters are passed around is changed. If the virtual machine automatically, upon a method call, popped the parameters off and set them up as local variables - which would actually be quite cool. Is this a good idea?

Quote:
Quote:
Does anyone know how C++ does it? I had a bit of a look, but it was very nasty code (powerpc code is very nasty, possibly more nasty than x86 but not sure)

No temporary register needed.

push 2
push 4
push a
call method
call method

When possible, C++ will do what I did there. But note that asking 'what C++ does' is meaningless, because C++ compilers are free to do anything.

Thanks for the code, I hadn't thought of doing it that way. But wouldn't coding it backwards like that use a lot more stack than normal? And also that would mean that code like this

a.method1(method2()).method3(method4())

would be executed in this order:

method4()
method2()
method1()
method3()

When you would think it would execute in this order:

method2()
method1()
method4()
method3()

To me that would cause a bit of confusion when trying to debug things.

Quote:
Quote:
I'm thinking about having two stacks, one for variables and one for parameters, what do people think? The variable stack is expanded as it is needed.

That seems unnecessary. Once called, a function never receives any more parameters, so you could just put the parameters at the bottom of the variable stack.

Again, for simplicity. But you are right, just I was going to reference variables by their offset along the stack, and creating another variable would mean that all the parameter offsets would shift down one. This would make compiling one bit more tricky, and if anyone wanted to read the code - it would be harder to understand.

Unless you referenced variables from an offset that counts up towards the top of the stack, from the first parameter. but then there needs to be a way to change this stack pointer from method to method.

umbrae

Author

308

November 23, 2004 04:18 AM

Quote:
Original post by cmp
have a look at Lightweight C++ (the page seems to be down currently). it translates most c++ source into c, which is basically like assembler.

I checked it again, and it was up. Looks promising, I will have a look how they translate C++ later (stupid 56k and phone). I see a few void *'s in there :P.

Quote:
for your vm, i would simply use a return stack and a 'real' stack: if you call a method, the return address goes on the return stack, and everything else (parameters and return values) gets laid on the 'real' stack. additionally you would need sth. like a heap, where you could allocate memory whose lifetime is not limited to a method.

I was thinking about having a different method for creating objects. Since they are the only things going to be created, and each one has a certain structure (n number of properties etc.), there doesn't need to be a heap. I was just going to use structs, and pointers (dynamic).

Quote:
to call a method of an object you would either call the method, with the object as the 1st parameter, if the method is not virtual. if it is virtual, you would have to do a lookup in the virtual function table and call the method specified in it, again with the object as the 1st parameter.

In voodoo there are no non-virtual methods. Just like java I believe, every method can be overridden (unless it is private).
I was planning on having two opcode method calls. One for normal methods (just offsets from the 'object' structure) and another for interface methods. When the vm encounters an interface opcode it grabs the objects class number (every object has it) and uses that number to jump to the right location in the interface. Then it jumps to the offset of the method, then jumps to the method.
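A rough Python sketch of the interface lookup described above, assuming the interface is a table indexed first by class number and then by method offset. Every name here (`VObject`, `SHAPE_INTERFACE`, `interface_call`, the two classes) is hypothetical and only illustrates the two-step jump.

```python
class VObject:
    """Runtime object: a class number plus a flat list of property slots."""

    def __init__(self, class_number, fields):
        self.class_number = class_number
        self.fields = fields


# Class numbers and method offsets, as a compiler might assign them.
SQUARE, RECTANGLE = 0, 1
AREA, PERIMETER = 0, 1


def square_area(obj):
    return obj.fields[0] ** 2


def square_perimeter(obj):
    return 4 * obj.fields[0]


def rectangle_area(obj):
    return obj.fields[0] * obj.fields[1]


def rectangle_perimeter(obj):
    return 2 * (obj.fields[0] + obj.fields[1])


# One row per class number, one column per method offset.
SHAPE_INTERFACE = [
    [square_area, square_perimeter],
    [rectangle_area, rectangle_perimeter],
]


def interface_call(interface, obj, method_offset):
    # Step 1: the class number selects the row. Step 2: the offset selects the method.
    return interface[obj.class_number][method_offset](obj)


square = VObject(SQUARE, [4])
rectangle = VObject(RECTANGLE, [2, 5])
```

For example, `interface_call(SHAPE_INTERFACE, rectangle, AREA)` reaches `rectangle_area` without the caller knowing the concrete class.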

Just on an aside, what would be wrong with using pointers on the code and link everything together? A method call byte code would be translated by the virtual machine linker and an object would be created with a pointer to the actual method object. All the vm would have to do would be to jump around on these pointers. Some of them would also point to the classes, chunks of code etc. It would eliminate the need for ID values, and possibly make things faster (no addition, just pointer dereferencing). Would it work? What is wrong with this idea?

cmp

138

November 23, 2004 10:44 AM

Quote:
Quote:
for your vm, i would simply use a return stack and a 'real' stack: if you call a method, the return address goes on the return stack, and everything else (parameters and return values) gets laid on the 'real' stack. additionally you would need sth. like a heap, where you could allocate memory whose lifetime is not limited to a method.

I was thinking about having a different method for creating objects. Since they are the only things going to be created, and each one has a certain structure (n number of properties etc.), there doesn't need to be a heap. I was just going to use structs, and pointers (dynamic).

with heap i mean dynamic memory, and there aren't only 'objects' in the sense of data allocated on the heap. i really hope that at least pointers to these objects, and maybe also primitive types, are just plain data whose lifetime is limited to the method's lifetime.
i think with structs and pointers you mean the real implementation, but when i was talking about a 'heap', i didn't say anything about the implementation; with heap i just meant anything which could dynamically allocate memory and deallocate it.
the way you implement it is totally up to you. you could use one big array or c's malloc (i think i would go this way).

Quote:
In voodoo there are no non-virtual methods. Just like java I believe, every method can be overridden (unless it is private).

no, in your language every method is 'virtual', if you think of virtual as just meaning that the address of a method is unknown at compile time, so it has to be determined at run time.

Quote:

I was planning on having two opcode method calls. One for normal methods (just offsets from the 'object' structure) and another for interface methods. When the vm encounters an interface opcode it grabs the objects class number (every object has it) and uses that number to jump to the right location in the interface. Then it jumps to the offset of the method, then jumps to the method.

what do you mean with an offset to the object's structure? if the offset and the address of the object's structure are known at compile time, you can simply use the real address of the method.

Quote:
Just on an aside, what would be wrong with using pointers on the code and link everything together?

what do you mean, i don't understand your explanation.

Mayrel

348

November 23, 2004 02:45 PM

Quote:
Original post by umbrae
I was referring to when you have an ambiguity, a local variable with the same name as a member variable. In C++ you can use the 'this' pointer to bypass the default 'local variables are first' rule. As I remember other languages use the 'self' keyword (I've used it in REALBasic for one). As you probably know, I dislike keywords and having just a dot in front of a variable name seemed like a good idea. I think I was taking Joakim_ar a little differently, perhaps he accesses all member variables this way.

Yes, I see. I prefer the Python approach. I think that object.field is easier to parse (for a human) than just .field. And if the object is just an argument, then it isn't a keyword.

Quote:

Quote:
Quote:
Quote:
Assignments (= += -= *= /= ^= &= |=) to variables can not be made inside expressions

I agree. I think that only optimised and fast languages really need this - and even then it is arguable. Why should testing for something, an expression, have a side effect?

Why shouldn't it?

Why should it? It's a condition not an execution, it goes against logic itself. Even if you have function calls in a conditional, they themselves shouldn't have any side effects.

Why not?

Which do you think is better,

stream.prepare_to_test_for_waiting_data();
if (stream.has_waiting_data())
  ...

or

if (stream.has_waiting_data())
  ...

Note that to check the stream it is not sufficient to merely check a flag that says extra data is available. Data may be available in a system buffer external to the program. In my opinion, I think this shows that the general rule "conditionals must not have any side effects" is not applicable.
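A small Python sketch of why such a check tends to carry a side effect. The `Stream` class and its `system_buffer` list are invented stand-ins for a real stream and an OS-level buffer; the point is only that answering "is data waiting?" means pulling data across first.

```python
class Stream:
    def __init__(self, system_buffer):
        # Stand-in for a buffer that lives outside the program.
        self._system_buffer = system_buffer
        self._pending = []

    def has_waiting_data(self):
        # Side effect: drain whatever the system buffer holds into our own queue.
        while self._system_buffer:
            self._pending.append(self._system_buffer.pop(0))
        return bool(self._pending)

    def read(self):
        return self._pending.pop(0)


external = ["hello"]
stream = Stream(external)

first_check = stream.has_waiting_data()
buffer_after_check = list(external)
item = stream.read()
second_check = stream.has_waiting_data()
```

After the first check the external buffer is empty, so the query changed state even though its name reads like a pure test.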

Quote:
Code such as this:
int errorValue;
if (errorValue = doSomethingThatCouldFail())
{
    cout << "Something failed, Error value:" << errorValue << endl;
}
Is horrid and outdated. It may be the nicest and fastest way to do things in C / C++, but shouldn't be used in other languages that have better constructs.

Quote:

It is these things that make learning and using C / C++ such a nightmare, you are expected to know that an integer can count as a boolean, and that assignments can happen in conditionals.

If you read the conditional in english it says "if errorValue equals the result of doSomethingThatCouldFail() then execute the next line" which goes against what it actually does.

That's not because (1) integers can be converted to Booleans or (2) assignments can happen in conditionals. It's because of (a) C++'s notation and (b) the stupidity of functions that return error status codes.

In addition, picking an example which no programmer familiar with modern C++ would choose to use -- he'd almost certainly use exceptions in this case -- is bad practice. It's a straw man argument.

Quote:

Quote:
Quote:
What do you think about expression computation? Should the expression "boolean_method() && another_method()" return false if boolean_method() returns false, or should the whole expression be calculated (similar with or)?

Apply the principle of least astonishment. It's likely that most of your users will be familiar with short-circuiting behaviour for "and" and "or". Unless you have a good reason for the language to behave contrary to how people would expect, don't.

When you say that most of my users will be familiar with the short-circuiting behaviour, are you sure? This is a scripting language after all, not a tool for C programmers.

If you expect them to be familiar with scripting languages, bear in mind that almost all widely used scripting languages -- inc. Python, Ruby, perl and bash -- have short-circuiting behaviour.

Quote:

In fact I hope most users of this language won't need to know how to program in C, to be able to use it. I even hope that people who use my language don't need to know how to program to find it useful.

Quote:

So when I apply the principle of least astonishment, I think where the users of this language are coming from. Someone at least experienced in logic will understand the concept of a conditional and probably find the short-circuiting behaviour unusual, but this would only be if they used the concept of methods returning booleans. I personally even found this concept unusual. If I was to only allow const methods in an expression the short-circuiting argument would be moot. The more important idea here is whether to allow conditionals to have side effects.

Well, no it isn't. Conditionals must be allowed to have side effects. Otherwise there are a lot of very useful things you cannot do inside a conditional. People will always be astonished if "bool b = foo(); if (b)" is allowed, but "if (foo())" is not.

Quote:

Perhaps one of the overall themes here is whether the language should force good habits by limiting the things that can be done, or should the good habits be left to the user. I would say that if a feature could lead to bad habits, and can safely be removed then it should.

Ah, so you want to design a language that nobody will want to use. It's a common enough design goal, but I feel that Intercal and BrainF**K rather have the market cornered there.

Quote:

Quote:

Why bother? Unless there's a good reason not to store the return value on the stack, why make extra work for yourself?

Simplicity and beauty.

How is having two incompatible places to store data 'simple' or 'beautiful'?

Quote:

I crave perfection and hate code that just works.

You want code that works first, and perfection later.

Quote:

The compiler will, for sure. But the virtual machine just runs the code that it sees. The methods themselves pop off the parameters, and pop on the return value, the vm itself just executes. Some part of it will know how many parameters, but that part shouldn't have to deal with the grindstone, the executing core.

Why shouldn't the virtual machine be allowed to know how many arguments a function takes?

Quote:

That is, unless the way parameters are passed around is changed. If the virtual machine automatically, upon a method call, popped the parameters off and set them up as local variables - which would actually be quite cool. Is this a good idea?

No. It's neither simple nor beautiful.

Quote:

Quote:
Quote:
Does anyone know how C++ does it? I had a bit of a look, but it was very nasty code (powerpc code is very nasty, possibly more nasty than x86 but not sure)

No temporary register needed.

push 2
push 4
push a
call method
call method

When possible, C++ will do what I did there. But note that asking 'what C++ does' is meaningless, because C++ compilers are free to do anything.

Thanks for the code, I hadn't thought of doing it that way. But wouldn't coding it backwards like that use a lot more stack than normal?

Not a "lot", no. And if variables are stored on the stack, then it wouldn't use up any more stack.

Quote:
And also that would mean that code like this
a.method1(method2()).method3(method4())
would be executed in this order:
method4()
method2()
method1()
method3()
When you would think it would execute in this order:
method2()
method1()
method4()
method3()
To me that would cause a bit of confusion when trying to debug things.

Three solutions. Firstly, be slightly confused when debugging. Secondly, when debugging, use temporary variables to store intermediate values. Thirdly, do this:

call method2
push a
call method1
call method4
swap
call method3
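The swap approach can be checked with a toy interpreter that records the order in which the methods run. This is a sketch under assumptions: `callf` is a made-up opcode for a free function with no object and no parameters, `call` expects the object on top with one parameter beneath it, and the last instruction is taken to be a call to method3, since the source expression ends in `.method3(method4())`.

```python
trace = []


class Obj:
    def method1(self, value):
        trace.append("method1")
        return Obj()

    def method3(self, value):
        trace.append("method3")
        return Obj()


def method2():
    trace.append("method2")
    return 2


def method4():
    trace.append("method4")
    return 4


def run(program):
    stack = []
    for op, arg in program:
        if op == "push":
            stack.append(arg)
        elif op == "callf":    # hypothetical: free function, nothing popped
            stack.append(arg())
        elif op == "call":     # method call: object on top, parameter beneath
            obj = stack.pop()
            param = stack.pop()
            stack.append(getattr(obj, arg)(param))
        elif op == "swap":     # exchange the two topmost stack entries
            stack[-1], stack[-2] = stack[-2], stack[-1]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack


a = Obj()
program = [
    ("callf", method2),
    ("push", a),
    ("call", "method1"),
    ("callf", method4),
    ("swap", None),
    ("call", "method3"),
]
final_stack = run(program)
```

The recorded order is method2, method1, method4, method3, which is the order a reader of the source expression would expect.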

Quote:

Quote:
Quote:
I'm thinking about having two stacks, one for variables and one for parameters, what do people think? The variable stack is expanded as it is needed.

That seems unnecessary. Once called, a function never receives any more parameters, so you could just put the parameters at the bottom of the variable stack.

Again, for simplicity.

I think a system with one stack is more simple than a system with two.

Quote:

But you are right, just I was going to reference variables by their offset along the stack, and creating another variable would mean that all the parameter offsets would shift down one. This would make compiling one bit more tricky, and if anyone wanted to read the code - it would be harder to understand.

What code? The compiled code? Isn't that going to be hard to understand anyway?

Quote:

Unless you referenced variables from an offset that counts up towards the top of the stack, from the first parameter. But then there needs to be a way to change this stack pointer from method to method.

On x86 architectures, C/C++ compilers copy the stack pointer into a temporary register upon entering a function. Then they use that copy to reference parameters and variables, and the offsets don't change when data is pushed onto the stack for a function call.

CoV

umbrae · November 23, 2004 06:04 PM

Quote:
Yes, I see. I prefer the Python approach. I think that object.field is easier to parse (for a human) than just .field. And if the object is just an argument, then it isn't a keyword.

It is still another 'global word', but I understand your preference.

Quote:
Why not?

Which do you think is better,
stream.prepare_to_test_for_waiting_data();
if (stream.has_waiting_data())
  ...

or

if (stream.has_waiting_data())
  ...
Note that to check the stream it is not sufficient to merely check a flag that says extra data is available. Data may be available in a system buffer external to the program. In my opinion, I think this shows that the general rule "conditionals must not have any side effects" is not applicable.

How about one of those closure type things:

stream.data_waiting()
    ...

Makes more sense.

Also your example doesn't "show that the general rule 'conditionals must not have any side effects' is not applicable", all it does is show an example where there is less code if conditionals can have side effects. I would code it like this:

stream.query_status();
if (stream.has_waiting_data())
    ...

And I would be quite happy with that. The method name "has_waiting_data()" does not hint that it has a side effect, and in your second example it does.

Quote:
Quote:
It is these things that make learning and using C / C++ such a nightmare, you are expected to know that an integer can count as a boolean, and that assignments can happen in conditionals.

If you read the conditional in English it says "if errorValue equals the result of doSomethingThatCouldFail() then execute the next line" which goes against what it actually does.

That's not because (1) integers can be converted to Booleans or (2) assignments can happen in conditionals. It's because of (a) C++'s notation and (b) the stupidity of functions that return error status codes.

In addition, picking an example which no programmer familiar with modern C++ would choose to use -- he'd almost certainly use exceptions in this case -- is bad practice. It's a straw man argument.

I see you know your fallacies :).

That was one example, the only good reason I could think of to have assignments in conditionals. Do you have a better example for having assignments in conditionals?

Quote:
Quote:
When you say that most of my users will be familiar with the short-circuiting behaviour, are you sure? This is a scripting language after all, not a tool for C programmers.

If you expect them to be familiar with scripting languages, bear in mind that almost all widely used scripting languages -- including Python, Ruby, Perl and bash -- have short-circuiting behaviour.

Short-circuiting behaviour is only noticeable if the conditional has side effects.
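A small illustration of that point, using Python's `and` as a stand-in for any short-circuiting operator. The `check` helper is hypothetical; it exists only to give each operand a visible side effect (recording that it was evaluated).

```python
calls = []

def check(name, value):
    # Side effect: record that this operand was evaluated.
    calls.append(name)
    return value

# Short-circuit: the right operand is skipped once the left one is False.
short_circuit = check("left", False) and check("right", True)
calls_short = list(calls)

# Eager: both operands are evaluated before they are combined.
calls.clear()
left = check("left", False)
right = check("right", True)
eager = left and right
calls_eager = list(calls)

print(short_circuit, calls_short)
print(eager, calls_eager)
```

Both forms produce the same boolean result. They differ only in which side effects happened, so if operands were restricted to side-effect-free (const) methods, the two would be indistinguishable apart from speed.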

Quote:
Quote:
So when I apply the principle of least astonishment, I think where the users of this language are coming from. Someone at least experienced in logic will understand the concept of a conditional and probably find the short-circuiting behaviour unusual, but this would only be if they used the concept of methods returning booleans. I personally even found this concept unusual. If I was to only allow const methods in an expression the short-circuiting argument would be moot. The more important idea here is whether to allow conditionals to have side effects.

Well, no it isn't. Conditionals must be allowed to have side effects. Otherwise there are a lot of very useful things you cannot do inside a conditional. People will always be astonished if "bool b = foo(); if (b)" is allowed, but "if (foo())" is not.

"People will always be astonished" is a fallacy.

Quote:
Quote:
Perhaps one of the overall themes here is whether the language should force good habits by limiting the things that can be done, or should the good habits be left to the user. I would say that if a feature could lead to bad habits, and can safely be removed then it should.

Ah, so you want to design a language that nobody will want to use. It's a common enough design goal, but I feel that Intercal and BrainF**K rather have the market cornered there.

While most of your criticisms are good, and you may have the best intentions, a lot of the time your posts are rather negative, and you sound like you are flaming. While forums can be a great place to rip into people (I can hardly stop myself sometimes) please don't do it to me. I'm here trying to do something constructive and you aren't helping. Some positive feedback maybe?

As for that scathing comment that you just made: I was under the impression that Intercal and BrainF**K were designed to be as hard to use as possible, while still being Turing complete. I think there is something called a Turing tarpit, and that is where they are from. Not at all what I want to do. I want a language that is easy to use and easy to read. If you notice, I even say "if" before my statements on removing features that could lead to bad habits. I am asking for opinions on whether this is a good idea.

Let me ask you this: do you think compiler type checking is bad? This is restricting behaviour, shouldn't this be removed?

Your comment is completely false, and I take offence. If you want to argue constructively do so, don't say I am committing fallacies and then you yourself say something so overboard I can hardly even call it a fallacy.

Quote:
Quote:
Quote:
Why bother? Unless there's a good reason not to store the return value on the stack, why make extra work for yourself?

Simplicity and beauty.

How is having two incompatible places to store data 'simple' or 'beautiful'?

Everything in its place, and a place for everything.

If one stack only contains parameters, and the other stack only contains variables, there is a clear order. Compare this to having parameters, return values, objects and variables all on the same stack, and you might have some idea of what I like.

As for them being incompatible, that is the idea. The user never gets to choose anyway, the compiler does it for them. I'm talking about the implementation of the virtual machine, and the structure of the bytecode, not the language.

Quote:
Quote:
I crave perfection and hate code that just works.

You want code that works first, and perfection later.

I think you will find that "I" like perfection. I'm not sure what you are trying to prove, but telling me what I want is a bad way to structure an argument.

Quote:
Quote:
The compiler will, for sure. But the virtual machine just runs the code that it sees. The methods themselves pop off the parameters, and push on the return value; the vm itself just executes. Some part of it will know how many parameters, but that part shouldn't have to deal with the grindstone, the executing core.

Why shouldn't the virtual machine be allowed to know how many arguments a function takes?

If it doesn't need to know, and can function without knowing, then why should it know? If the methods take care of themselves, why does the virtual machine need to know the higher structure of what is going on?

I say again, some part of it knows: the part that code can ask for that type of information about a method (introspection). But the execution part doesn't need to know. That's like saying that a CPU needs to know the number of parameters of the method it is executing, when it can function, and does function, quite happily without that information.
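A rough sketch of that split, assuming a toy VM in which every method is a callable that takes the stack, pops its own parameters and pushes its own return value. The names `execute`, `op_add`, `op_negate` and `PARAM_COUNTS` are made up for illustration.

```python
def op_add(stack):
    # The method itself pops its two parameters and pushes the result.
    right = stack.pop()
    left = stack.pop()
    stack.append(left + right)

def op_negate(stack):
    stack.append(-stack.pop())

# Introspection data lives beside the executing core, not inside it.
PARAM_COUNTS = {op_add: 2, op_negate: 1}

def execute(code):
    # The core loop never looks at parameter counts: it pushes plain
    # values and hands the stack to anything callable.
    stack = []
    for instruction in code:
        if callable(instruction):
            instruction(stack)
        else:
            stack.append(instruction)
    return stack

result = execute([2, 4, op_add, op_negate])
print(result)
```

Here `execute` runs correctly without ever consulting `PARAM_COUNTS`. The table is only there for code that wants to ask about a method. The trade-off is that the core cannot detect a method that pops too many or too few values.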

Quote:
Quote:
That is, unless the way parameters are passed around is changed. If the virtual machine automatically, upon a method call, popped the parameters off and set them up as local variables - which would actually be quite cool. Is this a good idea?

No. It's neither simple nor beautiful.

Why?

Quote:
I think a system with one stack is more simple than a system with two.

I think a system with a stack for each type of thing is more simple than one stack with everything.

Apples and oranges. If you have a bag with both apples and oranges it is more complicated than two bags, one with just apples, and the other with just oranges.

Quote:
Quote:
But you are right. It's just that I was going to reference variables by their offset along the stack, and creating another variable would mean that all the parameter offsets would shift down by one. This would make compiling a bit more tricky, and if anyone wanted to read the code, it would be harder to understand.

What code? The compiled code? Isn't that going to be hard to understand anyway?

I would dislike the knowledge that something that I wrote is messy inside.

Quote:
Quote:
Unless you referenced variables from an offset that counts up towards the top of the stack, from the first parameter. But then there needs to be a way to change this stack pointer from method to method.

On x86 architectures, C/C++ compilers copy the stack pointer into a temporary register upon entering a function. Then they use that copy to reference parameters and variables, and the offsets don't change when data is pushed onto the stack for a function call.

Cool, sounds like a good way. Then, if I am right, the parameters are 'down' and the variables are 'up'?
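Here is one way the idea could look inside a VM, sketched under the assumption of a single stack that grows upward (a Python list) and a frame pointer `fp` saved on method entry. With that layout the parameters sit below `fp` and the locals sit at and above it. On x86 the stack grows toward lower addresses, so the signs are mirrored there. The `VM` class and its method names are hypothetical.

```python
class VM:
    def __init__(self):
        self.stack = []
        self.fp = 0  # frame pointer: index of the first local

    def enter(self, n_locals):
        # Remember the caller's frame, then reserve room for locals.
        saved_fp = self.fp
        self.fp = len(self.stack)
        self.stack.extend([None] * n_locals)
        return saved_fp

    def leave(self, saved_fp, n_params):
        # Drop locals, temporaries and the parameters below them.
        del self.stack[self.fp - n_params:]
        self.fp = saved_fp

    def param(self, index):
        # Parameters are 'down': index 0 is the last one pushed.
        return self.stack[self.fp - 1 - index]

    def get_local(self, index):
        # Locals are 'up' from the frame pointer.
        return self.stack[self.fp + index]

    def set_local(self, index, value):
        self.stack[self.fp + index] = value

vm = VM()
vm.stack.append(10)              # first parameter
vm.stack.append(20)              # second parameter
saved = vm.enter(n_locals=1)
vm.set_local(0, vm.param(0) + vm.param(1))
vm.stack.append("temporary")     # pushing more data moves no offsets
total = vm.get_local(0)
last_param = vm.param(0)
vm.leave(saved, n_params=2)
print(total, last_param, vm.stack)
```

Because every access is relative to `fp` rather than to the top of the stack, pushing temporaries or arguments for a nested call leaves all parameter and local offsets unchanged.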

umbrae · November 23, 2004 06:27 PM

Quote:
Original post by cmp
Quote:
Quote:
for your vm, i would simply use a return stack and a 'real' stack. if you call a method, the return address goes on the return stack and everything else (parameters and return values) gets laid on the 'real' stack. additionally you would need something like a heap, where you could allocate memory whose lifetime is not limited to a method.

I was thinking about having a different method for creating objects. Since they are the only things going to be created, and each one has a certain structure (n properties etc.), there doesn't need to be a heap. I was just going to use structs and (dynamic) pointers.

with heap i mean dynamic memory, and there aren't only 'objects' in the sense of data allocated on the heap. i really hope that at least pointers to these objects, and maybe also primitive types, are just plain data whose lifetime is limited to the method's lifetime.
i think with structs and pointers you mean the real implementation, but when i was talking about a 'heap', i didn't say anything about the implementation. with heap i just meant anything which could dynamically allocate memory and deallocate it.
the way you implement it is totally up to you; you could use one big array or c's malloc (i think i would go this way).

I was thinking about having one big chunk of memory, but then that would lead to fragmentation problems, so I thought why not just leave it to the OS?

Quote:
Quote:
In voodoo there are no non-virtual methods. Just like Java, I believe, every method can be overridden (unless it is private).

no, in your language every method is 'virtual', if you think of virtual as just meaning that the address of a method is unknown at compile time, so it has to be determined at run time.

I was referring to what C++ calls 'virtual', that the method can be overridden. Using your definition, then only interface methods are virtual, the others are known at compile time.

Quote:
Quote:
I was planning on having two opcode method calls. One for normal methods (just offsets from the 'object' structure) and another for interface methods. When the vm encounters an interface opcode it grabs the object's class number (every object has it) and uses that number to jump to the right location in the interface. Then it jumps to the offset of the method, then jumps to the method.

what do you mean with an offset to the object's structure? if the offset and the address of the object's structure are known at compile time, you can simply use the real address of the method.

Because of overriding. This code

class base
    void print()
        print("One")

class step1 : base
    void print()
        print("Two")

class step2 : base
    void print()
        print("Three")

class step3 : step2
    void print()
        print("Four")

base a = base.alloc().init()
base b = step1.alloc().init()
base c = step2.alloc().init()
base d = step3.alloc().init()

a.print()
b.print()
c.print()
d.print()

// should output
One
Two
Three
Four

Yes, you know the offset for the method, but you do not know which one to use. So you use the object to get its class, then you use the offset of the method to jump to the right place in the class, then you jump to the method. Every class that inherits from base will have that method location pointing to a method. Even if they don't implement that method, it will just point to the last implementation of it.
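A minimal sketch of that per-class method table, assuming each class copies its parent's table and overwrites only the slots it overrides. `VClass`, `new`, `call` and the extra `step4` class (which overrides nothing) are hypothetical additions for illustration.

```python
PRINT = 0  # method offset, fixed at compile time for the whole hierarchy

class VClass:
    def __init__(self, name, parent=None, overrides=None):
        self.name = name
        # Start from the parent's table, so slots that are not overridden
        # keep pointing at the last implementation.
        self.methods = list(parent.methods) if parent else []
        for slot, function in (overrides or {}).items():
            while len(self.methods) <= slot:
                self.methods.append(None)
            self.methods[slot] = function

def new(vclass):
    # Every object carries a reference to its class.
    return {"class": vclass}

def call(obj, slot):
    # object -> class -> method at the known offset -> jump
    return obj["class"].methods[slot](obj)

base = VClass("base", overrides={PRINT: lambda self: "One"})
step1 = VClass("step1", base, {PRINT: lambda self: "Two"})
step2 = VClass("step2", base, {PRINT: lambda self: "Three"})
step3 = VClass("step3", step2, {PRINT: lambda self: "Four"})
step4 = VClass("step4", step3)  # inherits step3's print unchanged

outputs = [call(new(c), PRINT) for c in (base, step1, step2, step3, step4)]
print(outputs)
```

The call site only ever uses the constant offset `PRINT`. Which function runs is decided by the table of the object's class, and `step4` shows the "last implementation" rule, because its slot still points at the function from `step3`.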

This works well for normal inheritance, but doesn't work with multiple inheritance. I am planning on not having 'true' multiple inheritance, but just having interfaces. For each class that implements an interface, there is a section in the interface structure that maps the interface's method offsets onto the class's methods. Jumping to that section is a bit harder though. I think I was just going to do a bit of a walk through the sections.
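The interface structure described above might be sketched like this, assuming one section per implementing class and a linear walk to find the right section. The `Interface` class, the `CLASS_METHODS` table and the two example classes are invented for illustration.

```python
class Interface:
    def __init__(self, name, method_names):
        self.name = name
        self.method_names = method_names
        # One section per implementing class:
        # (class_id, [class method slot for each interface offset])
        self.sections = []

    def add_section(self, class_id, class_slots):
        self.sections.append((class_id, class_slots))

    def resolve(self, class_id, interface_offset):
        # Walk the sections until the object's class number is found.
        for section_class_id, class_slots in self.sections:
            if section_class_id == class_id:
                return class_slots[interface_offset]
        raise TypeError(f"class {class_id} does not implement {self.name}")

# Two unrelated classes whose 'describe' lives at different offsets.
CLASS_METHODS = {
    0: [lambda self: "vector.length", lambda self: "vector.describe"],
    1: [lambda self: "monster.describe", lambda self: "monster.attack"],
}

DESCRIBE = 0  # offset within the interface
describable = Interface("describable", ["describe"])
describable.add_section(0, [1])  # class 0: describe is class slot 1
describable.add_section(1, [0])  # class 1: describe is class slot 0

def call_interface(obj, interface, interface_offset):
    class_id = obj["class_id"]
    slot = interface.resolve(class_id, interface_offset)
    return CLASS_METHODS[class_id][slot](obj)

results = [call_interface({"class_id": cid}, describable, DESCRIBE)
           for cid in (0, 1)]
print(results)
```

The walk costs time proportional to the number of implementing classes. A lookup keyed on the class number would avoid that, at the price of a larger structure per interface.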

Quote:
Quote:
Just on an aside, what would be wrong with using pointers on the code and link everything together?

what do you mean? i don't understand your explanation.

Instead of using ID values for a method, just use pointers. If you want to find an object's method, just follow the pointer to it.

// virtual machine implementation
class voodoo_class
{
    char* name;
    voodoo_method* methods[];
};

Instead of plain data. The bytecode could even be linked this way: a reference to a constant could be replaced with a pointer to the constant.
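As a rough illustration of linking, assume bytecode arrives as (opcode, operand) pairs whose operands are numeric IDs, and that a one-off load-time pass swaps each ID for a direct reference. Python object references stand in for pointers here, and `Method`, `link` and the opcode names are hypothetical.

```python
class Method:
    def __init__(self, name, code):
        self.name = name
        self.code = code

def link(code, methods_by_id, constants_by_id):
    # Replace ID operands with direct references, once, at load time.
    linked = []
    for opcode, operand in code:
        if opcode == "call":
            linked.append((opcode, methods_by_id[operand]))
        elif opcode == "const":
            linked.append((opcode, constants_by_id[operand]))
        else:
            linked.append((opcode, operand))
    return linked

greet = Method("greet", [])
methods_by_id = {7: greet}
constants_by_id = {3: "hello"}

unlinked = [("const", 3), ("call", 7), ("return", None)]
linked = link(unlinked, methods_by_id, constants_by_id)
print([(op, getattr(arg, "name", arg)) for op, arg in linked])
```

After linking, the executing core follows the reference directly and never touches the ID tables. The tables are still needed while loading, and for anything that has to save or inspect the bytecode in its unlinked form.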

This topic is closed to new replies.
