Scripting Language Genesis

umbrae · 2004-12-11T08:40:06

For a language that is (also) called Voodoo.

Voodoo is a very flexible concept at the moment. Even though I have done quite a bit of work on it, I am willing to drop the whole thing if someone has a better idea. In fact I am probably too willing: I have rewritten different versions of Voodoo about 4 times in total, because I always think of a better way to do it. So although I say "is going to be", I really mean "the current idea is".

Voodoo is going to be object oriented with a very flexible source file syntax. I have had a look at quite a few languages, and I want to take the best out of them but still keep it simple. I like object orientation because it makes it possible to use design patterns, and they are cool. I have also had a look at Groovy, and if it didn't run only on the Java virtual machine I would probably use it.

The purpose of this language is to be embedded into a game engine. It would have all the meta-level programming, and all the actual algorithms would be implemented in C++. Voodoo would be good for abstract structures and objects; C++ is good for optimised code.

This is a list of cool things I would like to use in Voodoo:

Tab-based structure (instead of { }, like (I think) Python?), i.e. no semicolons
Closures
Pure object orientation (no natives)
Operator overloading (just for data objects: vectors, other types of numbers)
Native syntax for lists (maps and ranges also possible)

I have written a lexer and a parser for a version of Voodoo that had C-like code syntax, and I have written tree manipulation code (using the visitor design pattern).

The aim is a strongly typed (think compile-time errors instead of runtime errors), object oriented, easy to write, easily embedded language.

Any really cool things you would like to see in a language? Any "Bad Programmer, No Twinkie" things I should look out for?

The current ideas / examples.

[Edited by - umbrae on November 1, 2004 9:16:50 AM]

Started by umbrae November 01, 2004 07:28 AM

116 comments, last by cmp 20 years, 2 months ago

umbrae

Author

308

November 03, 2004 06:22 AM

Quote:
1. macros have many, many uses.

Yeah, they do. But I haven't really had many situations that I felt needed macros, and I believe they subtract from readability: it is harder to understand the code.

From the examples that you have given, I think each one would be better implemented in the language.

Quote:
Macro EQ ==
Macro NEQ !=

There is no good reason to ever do an assignment in a conditional expression within a scripting language. C++ maybe for optimisation purposes, but scripting language - no. So why don't we just use "="? It makes more sense than two capital letters.

Quote:
Macro Set =

Why? There is already a perfectly good operator that means assignment. I would even choose Pascal's ":=" over a set keyword.

Quote:
Macro isint(X) (cint(x) eq x)
Macro issquare(x) isint((z set sqr(y)))

integer r = 5;
if (r.isint()) ...
if (r.issquare()) ...

Quote:
Macro msgbox(x) eventsend:Msgbox(cstring(x))

gui.msgbox()

(takes strings, overloaded to take integers, and other datatypes also user overrideable)

Macros Compared to Functions
Generic Programming Using Cee Macros

If you have more arguments for the inclusion of macros I would like to hear them.

Quote:
Also try to make a simple VM first; it's less than an hour's work to make a really good one, and it'll help you later. (When you're trying to code for something which doesn't exist, it is Not Nice.)

Good idea, I have been playing around with parsing for quite some time now. Would be good to have something that works, even if I have to hand code the bytecode.

Quote:
Most of it would happen at compile time.
Sometimes it would happen at run-time (but rarely would you need to). Doing so is quite easy,
Get code for function,
Allocate new memory (for each var in function)
Copy function to new spot on the object array
Change vars in function.

All in all quite a fast job.

I've always been wary when thinking about reallocating whole arrays. It doesn't sound good in my book.

I like static. I like the idea that once some classes have been defined they aren't going to change.

Quote:
Also, what I try to do is keep a smallish array (or vector, in C++) which contains the jump points. These are points where your code should go to.

Something like
jmp GUID_OF_FUNCTION, GUID_OF_OBJECT, Line number

-1 (or NULL) should be used to signify My. (current object).

If you're not using guids, then shame on you!

Shame on me.

I have some idea of what a guid is, and I think I am going to use something similar but I would like some explanation if possible. Globally Unique Identifier?

Each of my methods has a unique offset based on the distance it is from the first method in object. Each sub-class of object increases this offset with the methods it adds.

Object
   0 init()
   1 istype(class type)
Integer : Object
   0 init() (inherited from object - can be overridden)
   1 istype() (inherited from object - can be overridden, not a good idea to though)
       - new methods start here -
   2 issquare()
   3 multiply(integer i)
Decimal : Object
   0 init()
   1 istype()
       - new methods start here -
   2 convert_to_integer()
   3 multiply(decimal d)
NewInteger : Integer : Object
   0 init() (inherited from object - can be overridden)
   1 istype() (inherited from object - can be overridden, not a good idea to though)
       - integer methods start here -
   2 issquare()
   3 multiply(integer i)
       - newinteger methods start here -
   4 do_wacky(string s)
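The offset scheme in the table above can be sketched in a few lines of Python. This is only an illustration, not Voodoo's implementation: `ClassLayout`, `slots` and `offset_of` are hypothetical names, and the sketch assumes that an overriding method reuses the slot it inherited.

```python
class ClassLayout:
    """Assigns each method a slot; subclasses keep inherited slots and append new ones."""

    def __init__(self, name, parent=None, new_methods=()):
        self.name = name
        self.parent = parent
        # Copy the parent's table so inherited offsets never move.
        self.slots = list(parent.slots) if parent else []
        for method in new_methods:
            if method not in self.slots:   # an override keeps the inherited slot
                self.slots.append(method)

    def offset_of(self, method):
        return self.slots.index(method)


obj = ClassLayout("Object", None, ["init", "istype"])
integer = ClassLayout("Integer", obj, ["issquare", "multiply"])
decimal = ClassLayout("Decimal", obj, ["convert_to_integer", "multiply"])
new_integer = ClassLayout("NewInteger", integer, ["do_wacky"])
```

Because every subclass starts from a copy of its parent's table, a call site compiled against Integer can use offset 3 for multiply and still work when handed a NewInteger.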

Also, how do you keep these guids over multiple compile sessions? What I want to do is have the ability to have multiple bytecode files and include them all in the virtual machine.

Quote:
With lists,
Make it a data type, and make the list a list of type datatype.
That would allow you to have lists of lists of lists of lists, etc.

Maybe have a few methods in list, like concatenate, sort, split-at, etc. to make things easier.

I am thinking about using the same syntax that Groovy uses.

integer i[] = [1, 2, 3]
or
integer[] i = [1, 2, 3]

Quote:
Have namespaces, and allow them to be changed at runtime (which would make things a lot easier).

The package system is a sort of namespace convention. I personally don't think namespace changing is good - it means that what you are doing probably isn't object orientated.

Quote:
Function pointers! (You do not know how good these are until you're using a language which doesn't support them!)

Methods are just another object, like classes and objects. They all can be stored in a "type" container. There will be a way to do something like:

integer i = 5
method(multiply(integer)) m = method(i.multiply())
i = i.call(m(i))

Or something similar. Need to work out calling conventions (ie what it returns, the arguments etc).
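As a rough analogy for "methods are just another object", here is how the same idea looks in Python, where a bound method can be stored in a variable and called later. The `Integer` class is a hypothetical stand-in; the Voodoo calling convention is still undecided, so this only shows the general shape.

```python
class Integer:
    """Hypothetical stand-in for a Voodoo integer object."""

    def __init__(self, value):
        self.value = value

    def multiply(self, other):
        return Integer(self.value * other.value)


i = Integer(5)
m = i.multiply   # the bound method is itself an object: it can be stored or passed around
i = m(i)         # calling it later multiplies 5 by 5
```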

Quote:
Also, it would be nice to have a flexible grammar and/or reprogrammable grammar. (Currently my parser, with a bit of extra taggery, will accept C style for loops, BASIC style for loops, and with a bit of persuading, probably a few more types too!)

Maybe something like a grammar define tag?

GRAMMER ; NEWLINE

I like the simplistic approach. Cluttering up a language with extra syntax seems too much. If there are multiple ways to do something then people will use these different ways. I like to be able to look at some code and be able to work out what it does easily. This is a similar argument to the one against macros.

I'm a bit of an advocate for Self Documenting Code.

cmp

138

November 03, 2004 08:43 AM

Quote:

We would have to sort some things out, but it sounds like a good idea. As I think I have said, I've done a bit of work on the parsing side of things, and it sounds like you have done a bit of work on the other side of things (xslt is greek to me) so we could work together.

that's perfect, because i'm not interested in making a parser ;) currently i need someone who translates source code to the xml syntax, which itself isn't that hard.
the xml syntax itself is quite finished, but i have to figure some things out and things may change (for example: i'm still deciding whether i should use "name" or "select" for an attribute name).

Quote:

I also have to implement a few things, but they are mostly tied to the virtual machine (lists / maps / ranges).

i would choose a different approach. in my understanding of a 'good' language, the language itself should be as simple and clean as possible, so arrays, lists, maps, etc. are all implemented in the language, not in the runtime system.
as for your integer array example, in my planned language you would write:
i : Kernel.Array<Integer> = [[1, 2, 3]]
where [[ e1, e2, ..., en ]] defines an array with the elements e1 to en. similar to [[ ]] is << >>, which defines a tuple: <<e1, e2 .. en>>.

function pointers would also be classes, with a call method. for example:
func : Kernel.Function<Boolean, Integer, Integer> = lamda (? == ?)
and then you could call it via: func.call(2, 3). it is also interesting to note that in my current approach there are no 'real' function pointers. to define a function pointer to a function taking an int and returning a string, you would have to say:
array_object : Kernel.Array<String> = [["hello world", "hallo welt"]]
func : Kernel.Function<String, Integer> = lamda (array_object @ ?)
func.call(2)
(btw.: the @ operator on an array returns the object at a position, similar to the c++ [] operator, but if you use @ you have only infix and postfix operators, leading to a clearer language.)
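The "function pointer as a class with a call method" idea can be mimicked in Python as a rough sketch. `Function` here is a hypothetical wrapper, not cmp's Kernel.Function, and Python lambdas with named parameters stand in for the `?` placeholders. Python lists are 0-based, so the sketch looks up index 1; the indexing base of the planned language is not stated in the post.

```python
class Function:
    """Wraps a callable so it is invoked through an explicit call method."""

    def __init__(self, body):
        self._body = body

    def call(self, *args):
        return self._body(*args)


# Roughly: func : Function<Boolean, Integer, Integer> = (? == ?)
equals = Function(lambda a, b: a == b)

# Roughly: func : Function<String, Integer> = (array_object @ ?)
array_object = ["hello world", "hallo welt"]
lookup = Function(lambda index: array_object[index])
```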

i chose to use name : type for the definition of a variable because this way the modifiers of a method's return type cannot conflict with the modifiers of the method itself.
for example:
// a const method does not change its class's contents
const get(index : Integer) : const String {
return array_object @ index
}

you may ask why not use the c++ syntax const String get(Integer index) const? but in c++ the last const is a method modifier, and normally method modifiers like virtual, extern, etc. are written in front of the method. and in my language there are many more modifiers planned than in c++, so this keeps the language logical.

another interesting aspect of my language is that every class inherits Kernel.Any, so every class has methods like same_type(), equal, deep_equal, copy ...
as you may have noticed, many features of 'my' language are borrowed from eiffel, which itself is a very clean language.

from what i have gathered, our projects are very similar, but you target only scripting, while i am trying to make a scripting and native language. as i said, i'm currently writing an xml to c++ 'compiler' which itself should be written in my little xml language, so it's portable (in theory you could even use your interpreter to run the compiler, if you supported loading xml source code).

btw.: if you need vm code, i've written a very simple vm in c and rewrote it to c++, but it is not oop, just pure assembly level (i tried to extend it, but i lost interest).

[Edited by - cmp on November 3, 2004 10:43:29 AM]

umbrae

Author

308

November 03, 2004 12:51 PM

Quote:
Original post by cmp
that's perfect, because i'm not interested in making a parser ;) currently i need someone who translates source code to the xml syntax, which itself isn't that hard.
the xml syntax itself is quite finished, but i have to figure some things out and things may change (for example: i'm still deciding whether i should use "name" or "select" for an attribute name).

If I am getting this correctly - all you want is your language syntax converted into an xml form? If so then that would be fairly simple (when I say fairly simple - I mean easier than parsing and compiling to bytecode). All I really need is a BNF of your language.

What XML library do you use (if you use one)? Does it work well?

Quote:
Quote:

I also have to implement a few things, but they are mostly tied to the virtual machine (lists / maps / ranges).

i would choose a different approach, in my understanding of a 'good' language a language itself should be as simple and clean as possible, so array, lists, maps, etc are all implemented in the language not in the runtime system.

I just realised that I said the completely wrong thing. What I meant to say was that I need to implement the C++ code that runs these objects. They won't be tied to the virtual machine, sorry about the error.

Quote:
as for your integer array example, in my planned language you would write:
i : Kernel.Array<Integer> = [[1, 2, 3]]
where [[ e1, e2, ..., en ]] defines an array with the elements e1 to en. similar to [[ ]] is << >>, which defines a tuple: <<e1, e2 .. en>>.
...

I think I like these ideas. I'm not terribly keen on the syntax, but that is not important.

Quote:
i chose to use name : type for the definition of a variable because this way the modifiers of a method's return type cannot conflict with the modifiers of the method itself.
for example:
// a const method does not change its class's contents
const get(index : Integer) : const String {
return array_object @ index
}

I like the idea of using identifier:type syntax.

Quote:
you may ask why not use the c++ syntax const String get(Integer index) const? but in c++ the last const is a method modifier, and normally method modifiers like virtual, extern, etc. are written in front of the method. and in my language there are many more modifiers planned than in c++, so this keeps the language logical.

I don't really like modifiers. Mostly they are for the compiler - not the code or the functionality. I'm almost against using private, public and protected (package?).

My language is not geared towards mathematical correctness, it is also not designed for efficient algorithms. It can act as the "glue" between C++ code allowing the dynamic inclusion of behaviour and functionality. I'm not sure how this would fit into your language.

Quote:
another interesting aspect of my language is that every class inherits Kernel.Any, so every class has methods like same_type(), equal, deep_equal, copy ...
as you may have noticed, many features of 'my' language are borrowed from eiffel, which itself is a very clean language.

Similar to the built-in classes in my language. Type is all objects, Object is all instance-able objects etc.

Quote:
from what i have gathered, our projects are very similar, but you target only scripting, while i am trying to make a scripting and native language. as i said, i'm currently writing an xml to c++ 'compiler' which itself should be written in my little xml language, so it's portable (in theory you could even use your interpreter to run the compiler, if you supported loading xml source code).

These are the things I want in my language:

Ease of programming
Ease of binding (method is an object)
Pure object orientated
Open package structure
Multiple source files
Ability to have class code spread over multiple files
Lazy class definitions

What I don't like

Cluttered language
Multiple syntaxes to do the same thing
Excessive use of random symbols

Edit: UnrealScript is a cool game scripting language.

[Edited by - umbrae on November 4, 2004 12:51:54 AM]

Nice Coder

366

November 04, 2004 01:26 AM

1. I see you do not like macros...
I like them and I use them; when they are named after what they do, the code is quite readable (although if not, I start agreeing with you on this point). Please note that the examples are actually quite comical, and were not good examples of how macros should be used...

2. Guids, guids guids.
Globally Unique Identifiers.

You have your objects, variables, etc. with a guid. That makes each variable globally unique. This speeds up compile times by quite a bit.

What you would have, is in your object code:

Object Obj1
functions: Function1, Function2, Function3, Obj2:Functions
vars: Obj2:Vars, Var1
Guid: 6421000
Object Obj2
functions: Function7, Function8, Foo, bar, Obj1:Functions
vars: Obj1:Vars, Var2
Guid: 6422001

Simple guids

Type

x-----xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Class ID
Where x is one byte.

You make obj0; it's a class 0 guid.
Therefore the class byte would be zero.
The first x in the id would be random. (It's the id; there are other ways of making it.)

When an object becomes a subobject of another object (let's call it o),

Its class is the previous class + 1.
If its class is > 255 then an error is thrown.
The class - 1 bytes of the id of its parent are copied into its id.
The class byte of its id is made (randomness, or a counter indicating how many subobjects it has).
The object's class is parent.class + 1.

The guid can be hashed (quite quickly, I might add) into an array index. In that array you can store all the information about the object in a static place, and you can also pass the hashed guid from function to function, where it's simple to look it up again. This stops you from having to push pointers or copy memory.

Useful, eh?
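One possible reading of the hierarchical guid scheme, sketched in Python. The details are assumptions: this version uses the counter variant rather than randomness, stores one id byte per nesting level, and uses the raw bytes as a dictionary key in place of the hashed array index. `Guid`, `child` and `key` are hypothetical names.

```python
class Guid:
    """Hierarchical id: a class (depth) byte followed by one id byte per level."""

    MAX_DEPTH = 255

    def __init__(self, path):
        self.path = tuple(path)           # one byte per level, root first
        self.depth = len(self.path) - 1   # the "class" byte; a root object is class 0
        self._children = 0                # counter used to number sub-objects

    def child(self):
        if self.depth + 1 > self.MAX_DEPTH:
            raise ValueError("object nesting deeper than 255 levels")
        if self._children > 255:
            raise ValueError("more than 256 sub-objects under one parent")
        guid = Guid(self.path + (self._children,))   # copy parent's bytes, append own
        self._children += 1
        return guid

    def key(self):
        """Bytes usable as a lookup key: class byte first, then the id bytes."""
        return bytes((self.depth,) + self.path)


root = Guid([0])
a = root.child()
b = root.child()
c = a.child()
table = {g.key(): name for g, name in [(root, "root"), (a, "a"), (b, "b"), (c, "c")]}
```

With a counter instead of random bytes the ids are deterministic within one compile, though keeping them stable across separately compiled files is a question this sketch does not answer.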

(The offset idea is good, but is still lacking. For example, what would be the offset for:
Where o0-9 are objects
o8.o7.o6.o5.o4.o3.o2.o1.add o9
??)

Nice integer syntax, and a nice idea also. I would prefer
integer i = [1,2,3,4,5]
tho.
Also i = [x,y,z,2,3,4] should also be acceptable (and handy).

With the grammar tags, I can see your point. It's just that it would allow for more expansion of the language (see Lisp, and how easily it allows you to do things, simply by changing the grammar).

HTH,
From,
Nice coder


umbrae

Author

308

November 04, 2004 03:27 AM

Quote:
1. I see you do not like macros...
I like them and I use them; when they are named after what they do, the code is quite readable (although if not, I start agreeing with you on this point). Please note that the examples are actually quite comical, and were not good examples of how macros should be used...

I dislike global methods - and that is what macros seem to me. They are not associated with an object. Maybe if they were associated to an object I would like them more. But then there would not be much difference between them and a method, the difference is that the method is 'inlined' to wherever it is called from - a compiler optimisation hint. The other use is when you want to use different syntax for calling or using part of the language - because the language is not elegant or doesn't look right in the first place. I personally have always used this macro:

#define null NULL

because I don't like excessive capital letters. In my mind macros are commonly used where the compiler is lacking - for example, for assertions that can be removed for a particular compile.

Each to their own I guess.

A quote from c2.com "they can be used in contexts where functions cannot; and can accept arguments that functions cannot"
And I think I read somewhere that although some languages have macros a lot of languages don't and they aren't any less used or less powerful.

Quote:
Simple guids
...

Although that would work, and it sounds good - you still can't be certain that it won't clash with another class. That would be the most annoying bug, and having a non-deterministic algorithm is bad in my book.

Quote:
(The offset idea is good, but is still lacking. For example, what would be the offset for:
Where o0-9 are objects
o8.o7.o6.o5.o4.o3.o2.o1.add o9
??)

You need to know what object o1 is so the add method can access its member properties. This means that at some point you need to follow the object trail starting from o8.o7 and ending at o2.o1.

This is some example bytecode that does this. For this example each property in each object has the same offset as its name, e.g. the property o7 is at offset 7 in o8. The add method is, for example, at offset 12 in the object o1. For this example the objects are all the same class, but they could just as easily be different classes.

load $1, o8     // not the 'proper' way to load a variable - just to show that $1 is o8
obj $1, $1, 7   // offset by 7 to get the 'pointer' to the next object then follow that pointer
obj $1, $1, 6
obj $1, $1, 5
obj $1, $1, 4
obj $1, $1, 3
obj $1, $1, 2
obj $1, $1, 1   // now $1 = o1
load $2, o9     // again this is actually different
push $2         // o9 is now on the stack
call $1, 12     // calls the add method which is offset 12 from the object o1
pop $2          // $2 now contains the answer
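To check that the bytecode above does what is claimed, here is a toy interpreter for just those five instructions, written in Python. Everything beyond the instruction names is an assumption made for the sketch: objects are modelled as numbered slots, each object carries a `value`, and the method at offset 12 pops its argument from the stack and pushes the sum.

```python
class Obj:
    """A runtime object: numbered slots hold properties or methods."""

    def __init__(self, name, value):
        self.name = name
        self.value = value
        self.slots = {}   # offset -> another Obj, or a callable method


def add(receiver, stack):
    """Method stored at offset 12: pops one argument, pushes receiver + argument."""
    other = stack.pop()
    stack.append(receiver.value + other.value)


def run(program, variables):
    regs, stack = {}, []
    for op, *args in program:
        if op == "load":      # load $r, name
            regs[args[0]] = variables[args[1]]
        elif op == "obj":     # obj $dst, $src, offset: follow a property pointer
            regs[args[0]] = regs[args[1]].slots[args[2]]
        elif op == "push":    # push $r
            stack.append(regs[args[0]])
        elif op == "call":    # call $obj, offset
            receiver = regs[args[0]]
            receiver.slots[args[1]](receiver, stack)
        elif op == "pop":     # pop $r
            regs[args[0]] = stack.pop()
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return regs


# Build the chain o8 -> o7 -> ... -> o1, with property oN stored at offset N.
objs = {n: Obj(f"o{n}", n) for n in range(1, 10)}
for n in range(8, 1, -1):
    objs[n].slots[n - 1] = objs[n - 1]
objs[1].slots[12] = add

program = (
    [("load", 1, "o8")]
    + [("obj", 1, 1, offset) for offset in range(7, 0, -1)]
    + [("load", 2, "o9"), ("push", 2), ("call", 1, 12), ("pop", 2)]
)
regs = run(program, {"o8": objs[8], "o9": objs[9]})
```

After the run, register 1 holds o1 and register 2 holds the result of o1.add(o9), so no single offset is ever needed for the whole chain: each `obj` step only uses an offset within one object.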

Quote:
Nice integer syntax, and a nice idea also. I would prefer
integer i = [1,2,3,4,5]
tho.
Also i = [x,y,z,2,3,4] should also be acceptable (and handy)

How do you know that you are storing an array in integer i ? Does the compiler do this by inspection (looks at the result of the expression and sees if it is an array)?
Also how do we say that a method takes an array of integers?

Do you like dynamic typing? In what I am envisaging for this language there are only static types. For the last example it would be object[] i = [x,y,z,2,3,4]. The object[] is the type of the variable; nothing else can go in it. If you meant that x, y and z were integer variables then yes, that would be possible. It should also be possible to do this.

integer[] i = [1..5, 8, 9]
integer[] j = [11, 12, 16..20, i]

That is, a list can include both ranges and other lists.
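A small Python sketch of how such a literal might be expanded. Two things are assumed because the post does not say: ranges like 1..5 include both ends, and a list placed inside an integer[] literal is spliced in flat rather than nested. `build_list` and `span` are hypothetical helper names.

```python
def span(first, last):
    """Stand-in for the first..last range syntax, assumed to include both ends."""
    return range(first, last + 1)


def build_list(*items):
    """Expand ranges and splice in other lists, keeping plain values as they are."""
    result = []
    for item in items:
        if isinstance(item, (range, list)):
            result.extend(item)
        else:
            result.append(item)
    return result


i = build_list(span(1, 5), 8, 9)           # integer[] i = [1..5, 8, 9]
j = build_list(11, 12, span(16, 20), i)    # integer[] j = [11, 12, 16..20, i]
```

Splicing keeps j a flat list of integers, which fits the static typing described above; a nested result would need a different element type.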

Do you think dynamic typing is good?

Quote:
With the grammar tags, I can see your point. It's just that it would allow for more expansion of the language (see Lisp, and how easily it allows you to do things, simply by changing the grammar).

True. I suppose I want this language to be more "conventional" and basic. I think it would be cool to make a language with lambdas, macros and a flexible grammar, but I think what this language is for doesn't really need them.

Maybe that could be my next project, a Haskell / Lisp / Prolog type language.

[Edited by - umbrae on November 4, 2004 4:27:30 PM]

cmp

138

November 04, 2004 01:56 PM

concerning your dislike of keywords: to me you're right to some extent. there should be as few keywords as possible, but for some very high level constructs you need keywords.
another way of achieving such high abstraction would be to use external files describing these constructs, but this doesn't seem to be the right way in my eyes.
as an example: remote procedure call. i plan to use the keyword "remote". then you could just write:

remote class HelloWorld {
  void put(){
     std_io.put("hello world!")
  };
}
class Client{
  void run(){
     server : remote HelloWorld = get_root_by_ip("127.0.0.1")
     server.put()
  }
}

all together, i will try to support the following keywords:

separate - for threads; every call on a separate object will execute in another thread, and the calling thread won't wait until the result is used

remote - for remote objects distributed in a network

extern( language/plugin ) - automatic wrapper generation, but the source has to be written in my language (so you could compile the game object to lua or your bytecode and use it directly in your native application without the need to write any wrapper object by hand)

const - for the const'ness of objects. it's a very good feature of c++ in my eyes.

reference - for defining pointers to pointers (else you would have to write wrapper objects, which sucks)

every other keyword, such as inline or virtual, should be guessable by the compiler. but at my current stage i still need the virtual keyword, as the xslt language isn't powerful enough to determine these things.
another case where you would need virtual is the compilation of code into a library, because sometimes you have to guarantee that a method will be virtual.

concerning your array syntax: i think it's very weird that, from your point of view, keywords bloat a language but an extra syntax for defining arrays does not.

btw: i will try to upload some code at the weekend, but right now i'm very busy with real life (school, etc)

umbrae

Author

308

November 04, 2004 04:03 PM

Quote:
Original post by cmp
concerning your dislike of keywords: to me you're right to some extent. there should be as few keywords as possible, but for some very high level constructs you need keywords.
another way of achieving such high abstraction would be to use external files describing these constructs, but this doesn't seem to be the right way in my eyes.

I also thought about the external file idea. Maybe instead of keywords, there is a different syntax to give information about the method / class.

class a : b : c : d  // name is a, super is b, implemented interface is c, viewable package is d
    - void method()
        - const
        - remote

        integer i = 5;
        if ( ...

I think I would like that better, or something similar.

Quote:
as an example: remote procedure call, i plan to use the keyword "remote". then you could just write:
*** Source Snippet Removed ***

I've had a look at UnrealScript, and it is completely full of keywords to give information about the objects. It has similar network savvy keywords and I think that is quite cool.

With your example code - does the keyword mean that the class has to be over the network, or does it mean that it can be? And is it important to use the keyword when you declare a variable (server : remote HelloWorld)? Can't one of these keywords at least be taken out? If you take out the one next to the class it means that it will be possible to remote any class - quite a cool feature. If you take out the one in the declaration it means that you don't have to know that you are talking to a remote object - it is handled transparently.

I was considering having proxy objects with respect to networking. One of the computers "owns" the object - it has the real thing, it is the server. The client part of the code (even the client code that is running on the same computer) deals with a proxy object - not the real thing. This object can get it's data from anywhere - I was thinking another "connection" type object that represented a connection to another computer. Calls to the proxy object interact with this connection object which talks to another connection object on the server which calls and changes data with the real object. Basically the proxy object gives the connection object a pointer and data that it wants to call the real object with (and a method pointer), the connection then sends it to the server. This way is would be possible to have different types of connections also one connection can have multiple objects communicating through it - and this can be defined in code, and changed. In this situation networking is done through objects not language.

// a normal integer created
integer i = integer.alloc().init(5)
// or
integer i = 5
// a networked integer 'subscribed'
integer i = integer.subscribe(connection, "new integer")
// a network integer 'published'
integer i = 5;
connection.publish(i, "new integer")

Of course the connection has to have been setup beforehand. Not sure if I should use a string to identify the published object or something else (or not identify it?).
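To make the proxy idea a bit more concrete, here is a rough sketch in Python rather than Voodoo. Everything in it is an assumption for illustration: the `Connection`, `Proxy` and `Counter` classes are made-up names, and the "network" is just a dictionary inside one process, where a real connection would serialise the request and send it over a socket.

```python
class Connection:
    """Hypothetical in-process stand-in for a link between two machines."""

    def __init__(self):
        self._published = {}  # name -> real object, owned by the "server" side

    def publish(self, obj, name):
        self._published[name] = obj

    def subscribe(self, name):
        return Proxy(self, name)

    def call(self, name, method, *args):
        # A real connection would serialise this request and send it elsewhere.
        target = self._published[name]
        return getattr(target, method)(*args)


class Proxy:
    """Client-side object that forwards every method call through the connection."""

    def __init__(self, connection, name):
        self._connection = connection
        self._name = name

    def __getattr__(self, method):
        # Only reached for names not found on the proxy itself, i.e. remote methods.
        def forward(*args):
            return self._connection.call(self._name, method, *args)
        return forward


class Counter:
    """The 'real thing' that lives on the owning side."""

    def __init__(self, value):
        self.value = value

    def add(self, amount):
        self.value += amount
        return self.value


connection = Connection()
real = Counter(5)
connection.publish(real, "new integer")
remote = connection.subscribe("new integer")
remote.add(3)  # travels through the connection and changes the real object
```

The client only ever holds `remote`, so swapping the dictionary lookup for a different transport would not change the calling code, which is the point of doing networking through objects.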

Quote:
all together i will try to support the following keywords:

seperate - for threads, every call on a seperate object will execute in another thread and the calling thread won't wait until the result is used

remote - for remote objects distributed in a network

extern( language/plugin ) - automatic wrapper generation, but the source has to be written in my language (so you could compile the game object to lua or your bytecode and use it directly in your native application without the need to write any wrapper object by hand)

const - for the const'ness of objects. it's a very good feature of c++ in my eyes.

reference - for defining pointers to pointers (else you would have to write wrapper objects, which sucks)

Again the runtime keywords could be implemented using a similar creation method.

integer i = integer.async().init() // separate thread
integer i = integer.subscribe(connection, "new integer") // remote connection

I don't think the type information needs to know that a certain object is async or remote, that is only important at runtime. You should still be able to pass the object around like a normal object (eg. integer).

I think I will do binding a little differently, basically an object "signs" itself up to implement a method - using simple macros. ( I know, I hate macros, but their strong point is binding / interpreters and I am writing this in C++).

As for const - I haven't used it much, and when I did I had no errors due to changing something that was const. My idea is that they are for compile time only, and I have never needed them - I have never been saved from an error because of them.

I personally don't use pointers to pointers much at all. I have programmed in Java for a while, I don't think you can have them in Java and I got along fine. They are only really useful when you are doing something nasty in C or C++, and even then there are Design Patterns that help you out of those holes.

Quote:
every other keyword such as inline or virtual should be guessable by the compiler. but in my current stage i still need the virtual keyword, as the xslt language isn't powerful enough to determine these things.
another thing, where you would need virtual, is the compilation of code into a library, because sometimes you have to guarantee that a method will be virtual.

It sounds like your language is restricted because of the language you are turning it into. In languages like Java and my language all methods are 'virtual' to a point - compared to C++. Yet each class doesn't need to implement the method (unlike C++). I suppose that is the annoying thing about compiling to a language that already exists and has its quirks. In these forums I was having a bit of a discussion about the Java Virtual Machine - I was saying that not all languages would run well on it, the language has to be similar to Java in the first place. It turns out that it is hard to do multiple inheritance - because the VM only really supports each class having one parent.

If you made every method virtual, and always defined a method - even if it just calls its parent's method (can you do this in C++?) - I think it might work.

Quote:
concerning your array-syntax: i think it's very weird that keywords bloat a language from your point of view, but an extra syntax for defining arrays does not.

It does sound like I'm being hypocritical, doesn't it :P.

I would consider that this:

integer[] i = [1, 2, 3]

is less bloated than:

integer[] i = array.alloc(integer, 1, 2, 3)

Also I consider making a new grammar for one of the most used structures (at least for me) to be more important than making a new grammar for an infrequently used keyword system (again - me personally).

Quote:
btw: i try to upload some code in the weekend, but by now i'm very consumed with real life (school, etc)

Yeah, I have an exam tomorrow, I've had 3 in the last week - real life is annoying sometimes.

umbrae

Author

308

November 05, 2004 04:23 AM

cmp, do you have a bnf of your language? If you have, or can create one then I will be able to write a parser for it. Is flex and bison a good option? - it's what I have used.

cmp

138

November 05, 2004 08:54 AM

Quote:
Original post by umbrae
With your example code - does the keyword mean that the class has to be over the network, or does it mean that it can be?

it can be, but it does not have to.

Quote:
And is it important to use the keyword when you declare a variable (server : remote HelloWorld)? Can't one of these keywords at least be taken out?

this was actually a typo, you just need one remote. if you specify a class as remote, every instance of it may be located somewhere in the network. if it's specified only for a variable, then it means that only this variable may point to a remote instance of this class.

Quote:
If you take out the one in the declaration it means that you don't have to know that you are talking to a remote object - it is handled transparently.

this is one of the ideas, that the compiler knows the object is not located on this machine, but the programmer does not have to worry about it.
@your proxy object idea: i think this is the only way to deal with this problem. i would just subclass the class which should be located in the network, and now simply through polymorphism the user of an object can't distinguish between a remote and a local object.

Quote:
This way it would be possible to have different types of connections, and one connection can have multiple objects communicating through it - and this can be defined in code, and changed. In this situation networking is done through objects, not language.

yeah, but i think this way you have to write the proxy objects every time for every object. my idea was to let the compiler do this unnecessary task and this way the problem is not a runtime, but a compile time problem. another aspect, which is important: if the compiler knows that an object may be remote, it won't let the main thread of execution wait for the method called or use similar techniques.

Quote:

Of course the connection has to have been setup beforehand. Not sure if I should use a string to identify the published object or something else (or not identify it?).

i had the idea to just make the root object of an application visible via the network, but as each object may have subobjects you could reach every object through one, which itself could be specified by the ip address or something else (of course the compiler would have to have at least the source code for the server's superclass when compiling the client.).

Quote:

integer i = integer.async().init() // separate thread
integer i = integer.subscribe(connection, "new integer") // remote connection

i really don't understand if async is now a method or not, and if so init should also be one, but how do you actually allocate an object (or are async and init compile time generated), and if async is a method, what is it returning which itself could generate an object's instance?
from what i get it seems that you are not removing keywords from the language, but just blurring the difference between a keyword and a method.

Quote:
As for const - I haven't used it much, and when I did I had no errors due to changing something that was const. My idea is that they are for compile time only, and I have never needed them - I have never been saved from an error because of them.

you always have to remember that your source code may be reused by someone else, and now it does matter whether an object/method is const/private/public.
as you said you won't make any mistake, but someone else could break your whole application when writing a plugin, just because he has changed something which shouldn't have been changed.

Quote:

I personally don't use pointers to pointers much at all. I have programmed in Java for a while, I don't think you can have them in Java and I got along fine. They are only really useful when you are doing something nasty in C or C++, and even then there are Design Patterns that help you out of those holes.

yes, i have to rethink it, because if you may have a pointer to pointer, why not a pointer to pointer which points to a pointer (wow so many points).

Quote:
It sounds like your language is restricted because of the language you are turning it into. [...]
If you made every method virtual, and always defined a method - even if it just calls its parent's method (can you do this in C++?) - I think it might work.

this is not the problem, but i want a language which is fast and making a function virtual always implies some runtime overhead (memory and speed), which isn't the way i want to restrict the users of my language.

Quote:
I would consider that this:
integer[] i = [1, 2, 3]
is less bloated than:
integer[] i = array.alloc(integer, 1, 2, 3)

no the [1,2,3] is not my problem, this is something i have planned myself. the problem is the integer[] which i would have rewritten as Array<Integer>, where Array is an object implemented in my language itself (actually i'm currently writing this object, but my stylesheet lacks template support at this time, so everything breaks.)

Quote:
Yeah, I have an exam tomorrow, I've had 3 in the last week - real life is annoying sometimes.

normally exams (btw i was really searching for this word when writing my last reply :D) aren't that hard, but maths and physics are special (linear algebra and einstein's theory of relativity).

Quote:
cmp, do you have a bnf of your language? If you have, or can create one then I will be able to write a parser for it. Is flex and bison a good option? - it's what I have used.

actually i had one parser written in c++ (with flex and bison), but then i realised that it would be really cool to write the parser in my language, so currently i will write everything as xml, so i can recompile it later to c++ and my language.

currently this is my bnf (i don't know if this is valid bnf; something enclosed by [] means it is optional, something with a following * means it can be placed as many times as you want)

file            : definition*
definition      : [mod_list] 'class' [ '<' ident_list '>'] [ ':' object_list ] '{' [member*] '}'
                | 'namespace' identifier '{' [definition*] '}'
object_list     : object_list ',' object
                | object
ident_list      : ident_list ',' identifier
                | identifier
member          : attribute | method
attribute       : var_def [assignment]
method          : [mod*] identifier ':' [type] ['(' parameter_list ')'] method_body
var_def         : identifier ':' type
method_body     : '{' [statements] '}'
                | 'deferred'
type            : [mod*] ex_type
ex_type         : ex_type '.' identifier | identifier
statement       : identifier assignment
                | var_def [assignment]
                | expression
                | 'if' '(' expression ')' '{' [statement*] '}' [elseif*] [else]
                | 'while' '(' expression ')' '{' [statement*] '}'
                | 'throw' expression
                | 'try' '{' [statement*] '}' [catch*]
elseif          : 'elseif' '(' expression ')' '{' [statement*] '}'
else            : 'else' '{' [statement*] '}'
catch           : 'catch' '(' var_def ')' '{' [statement*] '}'
expression      : expression infix expression
                | prefix expression
                | '(' expression ')'
                | literal
                | access [ '(' [argument_list] ')']
                | 'new' type '(' [argument_list] ')'
                | 'lamda' expression
                | object
argument_list   : argument_list ',' expression
                | expression
literal         : string | integer | real | boolean | character | array_literal | tuple_literal
array_literal   : '[[' exp_list ']]'
tuple_literal   : '<<' exp_list '>>'
infix           : '+' | '-' | '*' | '/' | '%' | '|' | '&' | '||' | '&&' | '@' | '==' | '!=' | '<=' | '>='
prefix          : '^' | '!'
mod             : 'extern' '(' string ')'    // example: extern("lua") automatically creates wrapper code
                | 'extern'
                | 'public' | 'protected' | 'private'
                | 'virtual'
                | 'const'
                | 'remote'
                | 'seperate'
object          : identifier | "( '?' | '?'( [1-9] | [1-9][0-9]* ))"
access          : access '.' object | object
exp_list        : exp_list ',' expression | expression
parameter_list  : parameter_list ',' var_def | var_def
assignment      : '=' expression

umbrae

Author

308

November 05, 2004 10:13 PM

Quote:
Original post by cmp
Quote:
If you take out the one in the declaration it means that you don't have to know that you are talking to a remote object - it is handled transparently.

this is one of the ideas, that the compiler knows the object is not located on this machine, but the programmer does not have to worry about it.

Wouldn't it be cool if even the compiler doesn't know - just the runtime environment.

Quote:
@your proxy object idea: i think this is the only way to deal with this problem. i would just subclass the class which should be located in the network, and now simply through polymorphism the user of an object can't distinguish between a remote and a local object.

The class would still be flagged as remote though? The compiler would have to know.

Quote:
Quote:
This way it would be possible to have different types of connections, and one connection can have multiple objects communicating through it - and this can be defined in code, and changed. In this situation networking is done through objects, not language.

yeah, but i think this way you have to write the proxy objects every time for every object. my idea was to let the compiler do this unnecessary task and this way the problem is not a runtime, but a compile time problem. another aspect, which is important: if the compiler knows that an object may be remote, it won't let the main thread of execution wait for the method called or use similar techniques.

The proxy objects are generated automatically by the compiler. They are subclass-like things in which every method jumps to the same piece of code, and that code asks the connection to send the call.

Quote:
Quote:
Of course the connection has to have been setup beforehand. Not sure if I should use a string to identify the published object or something else (or not identify it?).

i had the idea to just make the root object of an application visible via the network, but as each object may have subobjects you could reach every object through one, which itself could be specified by the ip address or something else (of course the compiler would have to have at least the source code for the server's superclass when compiling the client.).
Quote:
integer i = integer.async().init() // separate thread
integer i = integer.subscribe(connection, "new integer") // remote connection
i really don't understand if async is now a method or not, and if so init should also be one, but how do you actually allocate an object (or are async and init compile time generated), and if async is a method, what is it returning which itself could generate an object's instance?
from what i get it seems that you are not removing keywords from the language, but just blurring the difference between a keyword and a method.

The cool thing is that .alloc() and .async() are actually both methods. They are both implemented in C++ although they are not 'normal' methods. There are a few methods in my language that are 'special', mostly the base ones in the object object, the type object, the method object and the class object.
The method object can have methods! But the compiler will only allow 1 level deep. Say you want a particular method to be async, just go like this:

unit_group g = goblins.spawn(world).init()
// normal call
g.moveto(waypoint)
// async call
g.moveto.async(waypoint)

The .async() method on the method takes the same parameters as the method - but it will execute asynchronously. I think this is better than a keyword - the functionality is linked to the method, not a global keyword. And it can be called either sync or async.
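As a rough illustration of a method object that has its own method, here is a Python sketch (not Voodoo, and not how the C++ runtime would do it). The names `BoundMethod`, `method` and `UnitGroup` are invented for the example, the async variant is spelled `async_` only because `async` is a reserved word in Python, and a thread pool from the standard library stands in for whatever the real runtime would use.

```python
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=4)


class BoundMethod:
    """A method object that is callable and also has one method of its own."""

    def __init__(self, func, instance):
        self._func = func
        self._instance = instance

    def __call__(self, *args):
        # Normal call: runs in the caller's thread and returns the result.
        return self._func(self._instance, *args)

    def async_(self, *args):
        # Same parameters, but runs on a worker thread and returns a Future.
        return _pool.submit(self._func, self._instance, *args)


class method:
    """Descriptor that hands out BoundMethod objects instead of plain functions."""

    def __init__(self, func):
        self._func = func

    def __get__(self, instance, owner):
        return BoundMethod(self._func, instance)


class UnitGroup:
    def __init__(self):
        self.position = None

    @method
    def moveto(self, waypoint):
        self.position = waypoint
        return waypoint


g = UnitGroup()
g.moveto("bridge")                   # normal call
future = g.moveto.async_("castle")   # async call, returns straight away
future.result()                      # block only when the result is needed
```

The caller chooses sync or async per call, and nothing about `UnitGroup` itself says which one will be used.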

Quote:
Quote:
As for const - I haven't used it much, and when I did I had no errors due to changing something that was const. My idea is that they are for compile time only, and I have never needed them - I have never been saved from an error because of them.

you always have to remember that your source code may be reused by someone else, and now it does matter whether an object/method is const/private/public.
as you said you won't make any mistake, but someone else could break your whole application when writing a plugin, just because he has changed something which shouldn't have been changed.

I think your example works well for access modifiers but not as well for const. The only problem I see is when someone subclasses your class and overrides a method that isn't supposed to modify its parameters, and they do. This is bad.

Okay, the thing I don't like about keywords is that they aren't associated with an object. I would be happier if the keyword was something like object.const, because then you can see its meaning (a const object). If the method was declared something like:

- void do_stuff_const (integer.const amount, string.const bob)

I would like it better.

A little bit more extrapolation on the dynamics of my runtime. Each class has:

a super object (has the name of the class - one instance ever)

the object (a normal instance)

a hooked object (a special instance)

In the hooked object each method points to a method in the super that looks at a special variable in the object. This variable is an enumeration that determines which method it should jump to next. The options are things like async, remote, async+remote etc. Then it jumps to that method, e.g. the method that actually does the thread split. I was thinking about every object method being hooked, but I think that is bad (and slow). I don't think I will let users write their own hooks. The methods are implemented in the class object, which is written in C++. The call is intercepted by the super class, which does something special before calling the real method. Another example would be a const object - all calls that aren't const (if I have const in my language) get thrown away, and if the method returned an object the result would be null.
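Here is one way the hook dispatch could look, sketched in Python as a thought experiment. It is only a guess at the shape: the `Hook` enum, the `Hooked` wrapper and the `Account` class are made up for the example, the separate super object is folded into the wrapper's `__getattr__`, and only the async and const options are shown (remote is left out).

```python
from concurrent.futures import ThreadPoolExecutor
from enum import Enum


class Hook(Enum):
    NONE = 0
    ASYNC = 1
    CONST = 2


_pool = ThreadPoolExecutor(max_workers=2)


class Hooked:
    """Wraps a real object; every method call is intercepted first."""

    def __init__(self, target, hook, const_methods=()):
        self._target = target
        self._hook = hook                        # the "special variable"
        self._const_methods = set(const_methods)

    def __getattr__(self, name):
        real_method = getattr(self._target, name)

        def intercepted(*args):
            if self._hook is Hook.ASYNC:
                # Thread split: run the real method elsewhere, return a Future.
                return _pool.submit(real_method, *args)
            if self._hook is Hook.CONST and name not in self._const_methods:
                # Non-const call on a const object: thrown away, result is null.
                return None
            return real_method(*args)

        return intercepted


class Account:
    def __init__(self):
        self.balance = 10

    def deposit(self, amount):
        self.balance += amount
        return self.balance

    def peek(self):
        return self.balance


account = Account()
frozen = Hooked(account, Hook.CONST, const_methods=["peek"])
frozen.deposit(5)   # dropped, the balance is untouched
frozen.peek()       # allowed, reads the real balance
```

Only the wrapper pays for the extra lookup; plain `Account` objects are called directly, which matches the worry about hooking every method being slow.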

I think I can explain it a bit better if given another go, if you are hazy or don't quite understand I will try again :).

Quote:
yes, i have to rethink it, because if you may have a pointer to pointer, why not a pointer to pointer which points to a pointer (wow so many points).

Yeah. Pointers are cool in C++. I actually realised the other day that I only ever use pointers when dealing with objects. It is always xml_parser* parser = ... rather than xml_parser parser = ... And I always use new and delete. I just feel safer. In fact I only just stopped using malloc and free for my char*'s, I use new and delete now.

Quote:
Quote:
It sounds like your language is restricted because of the language you are turning it into. [...]
If you made every method virtual, and always defined a method - even if it just calls its parent's method (can you do this in C++?) - I think it might work.

this is not the problem, but i want a language which is fast and making a function virtual always implies some runtime overhead (memory and speed), which isn't the way i want to restrict the users of my language.

True, I see your point. Hard to fit a dynamic language into a static-ish one.

Quote:
Quote:
I would consider that this:
integer[] i = [1, 2, 3]
is less bloated than:
integer[] i = array.alloc(integer, 1, 2, 3)
no the [1,2,3] is not my problem, this is something i have planned myself. the problem is the integer[] which i would have rewritten as Array<Integer>, where Array is an object implemented in my language itself (actually i'm currently writing this object, but my stylesheet lacks template support at this time, so everything breaks.)

Yeah, I want a type of generic (template) support, but I'm not sure if it should be in the language or not. The only place I think it is really useful is data structures, and if I can supply the structures (lists etc) then there won't be a need for templates. There will be support for them in the vm so I can always add support later.

Quote:
Quote:
Yeah, I have an exam tomorrow, I've had 3 in the last week - real life is annoying sometimes.
normally exams (btw i was really searching for this word when writing my last reply :D) aren't that hard, but maths and physics are special (linear algebra and einstein's theory of relativity).

Bah! Try this one - Computing Systems, Foundations of Computer Science, Software Engineering Process and Minds and Machines. Foundations of CS is not as fun as it sounds (and if you have done the one before - no, it can get worse).
Computing Systems = OS Architecture, Assembly, Comm
Foundations of CS = Formal methods, Turing Machines, NFSA's
Software Engineering Process = Writing poker in java with software practices shoved down your throat
Minds and Machines = philosophy, mostly what the mind is, quite fun.

Quote:
Quote:
cmp, do you have a bnf of your language? If you have, or can create one then I will be able to write a parser for it. Is flex and bison a good option? - it's what I have used.

actually i had one parser written in c++ (with flex and bison), but then i realised that it would be really cool to write the parser in my language, so currently i will write everything as xml, so i can recompile it later to c++ and my language.

So let me get this straight... you are writing your parser in the same language that you are parsing for. Now I want to do that (seriously).

I think I will go away and write a vm, then write the parser in my own language.

Quote:
currently this is my bnf (i don't know if this is valid bnf; something enclosed by [] means it is optional, something with a following * means it can be placed as many times as you want)
*** Source Snippet Removed ***

Not quite valid but good enough. Normally the way to do tag: other_tag* is
For 1 or more

tag:
    other_tag
    | tag other_tag

For 0 or more

tag:
    /* nothing */
    | tag other_tag

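This left-recursive form is also good for parsers like bison: each other_tag can be reduced into tag as soon as it is read, so the list never piles up on the stack. If it was done right-recursive (as opposed to left), every item has to sit on the stack until the end of the list is reached.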
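The stack argument can be checked with a toy simulation. This is not bison, just a hand-written imitation in Python of the shifts and reductions a shift-reduce parser would perform for the two list shapes; the function names are made up for the example.

```python
def max_stack_left_recursive(tokens):
    """list : item | list item  -- a reduction is possible after every token."""
    stack, deepest = [], 0
    for _ in tokens:
        stack.append("item")                  # shift
        deepest = max(deepest, len(stack))
        if stack == ["item"]:
            stack[:] = ["list"]               # reduce: list -> item
        else:
            stack[-2:] = ["list"]             # reduce: list -> list item
    return deepest


def max_stack_right_recursive(tokens):
    """list : item | item list  -- nothing can be reduced until the end."""
    stack, deepest = [], 0
    for _ in tokens:
        stack.append("item")                  # shift every token first
        deepest = max(deepest, len(stack))
    if stack:
        stack[-1:] = ["list"]                 # reduce: list -> item
    while len(stack) > 1:
        stack[-2:] = ["list"]                 # reduce: list -> item list
    return deepest


tokens = ["a", "b", "c", "d", "e"]
print(max_stack_left_recursive(tokens))    # 2
print(max_stack_right_recursive(tokens))   # 5
```

With left recursion the depth stays at 2 however long the list is, while with right recursion it grows with the number of items.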

Also it is a good idea to define the precedence of your operators - I copied the structure from a C parser I found.
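For what operator ordering buys you, here is a small precedence-climbing sketch in Python. It is a different technique from the layered grammar rules a C parser uses, but it produces the same grouping. The precedence table is an assumption loosely modelled on C and covers only some of the infix operators listed in the bnf, and the parser handles integers and parentheses only.

```python
import re

# Hypothetical precedence table: higher numbers bind tighter.
PRECEDENCE = {"||": 1, "&&": 2, "==": 3, "!=": 3,
              "+": 4, "-": 4, "*": 5, "/": 5, "%": 5}


def tokenize(text):
    return re.findall(r"\d+|\|\||&&|==|!=|[-+*/%()]", text)


def parse(tokens, min_prec=1):
    """Precedence climbing: returns an int or a nested (op, left, right) tuple."""
    left = parse_atom(tokens)
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        # Left-associative: the right side may only hold tighter operators.
        right = parse(tokens, PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left


def parse_atom(tokens):
    token = tokens.pop(0)
    if token == "(":
        inner = parse(tokens)
        tokens.pop(0)  # discard the closing ")"
        return inner
    return int(token)


print(parse(tokenize("1 + 2 * 3")))   # ('+', 1, ('*', 2, 3))
print(parse(tokenize("1 - 2 - 3")))   # ('-', ('-', 1, 2), 3)
```

Without the table, a flat rule like expression infix expression leaves both groupings of 1 + 2 * 3 possible, which is what a parser generator reports as a conflict.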

Edit:

So are you going to write your parser in your own language?
Also if you are interested, this is the bnf of my original language.

[Edited by - umbrae on November 5, 2004 10:13:19 PM]

This topic is closed to new replies.