AngelScript - changes to how string literals are handled

Started by November 15, 2017 03:53 PM
9 comments, last by WitchLord 6 years, 11 months ago

I've changed how string literals are handled in the script engine. Now the compiler evaluates them at compile time and stores a pointer to the application native string type in the byte code. This avoids the need to create new instances of strings every time a script uses a string literal, which in turn translates to a performance improvement.

A new asIStringFactory interface is used to perform the compile time evaluation. This also allows the application to keep track of the used string literals, and store them in a common pool to reduce memory consumption for duplicate string literals. With the factory interface there is also no need for the script engine to keep an internal copy of the string literals thus reducing memory consumption even further.

Since the byte code now stores the pointer to the actual string object, the byte code instruction asBC_STR is no longer needed, and with it goes the limit of 2^16 string literals.

Besides the above benefits, I should be able to take this a step further: the compiler now knows that the lifetime of a string literal is guaranteed, so it can take advantage of that and avoid making unnecessary copies when passing string literals to functions, etc. This improvement has not yet been implemented, though.

For existing scripts there shouldn't be any changes, unless the application currently registers the string factory to return a non-const string object instance. Since string literals are now evaluated at compile time, it is no longer possible to have them treated as non-const. Still, I've made it so that the compiler can, when needed, implicitly convert to a non-const instance by making a local copy of the string literal at run-time.

As the changes have been quite extensive, I decided to make it possible to temporarily turn them off until all bugs have been rooted out. To turn off the changes, look for the #define AS_NEWSTRING in angelscript.h and comment out that line. Before I make the official release (hopefully before the end of the year) I'll remove this option completely.

Let me know if you encounter any problems with this, so I can have it corrected before the official release.

Regards,
Andreas

AngelCode.com - game development and more - Reference DB - game developer references
AngelScript - free scripting library - BMFont - free bitmap font generator - Tower - free puzzle game

Looks good, thanks! Is there any impact for the JIT?

The only impact is that the asBC_STR instruction is no longer used. So far it is still defined, so any JIT compiler that does something with it will still compile. But I will eventually remove the definition altogether, and then the JIT compilers will have to be updated to remove it as well.

Sounds great, thanks!

This is great news! Going to experiment with this a bit this weekend, will report back if there are any problems.

So I'm a bit confused at how to use this. I've looked at the existing ScriptStdString to find how to properly use this for my own string class, but I'm not certain I'm doing it right. I assume I have to allocate a new string and cache it in GetStringConstant, and then free the memory in ReleaseStringConstant, but only if it's cached? (And then uncache it)

However, with the way I currently have this implemented, the script engine ends up trying to construct a string with a freed pointer (which according to my logfile was released in ReleaseStringConstant already). So I shouldn't free the memory in ReleaseStringConstant? If not, then where is the appropriate place to free the memory? I can't really figure this part out by looking at ScriptStdString.

Also, I'm still unsure of the purpose of GetRawStringData.

Here is my string factory class: https://pastebin.com/YpyQP3P8

Edit: I'm an idiot and I only just realized this is a refcount, not an immediate allocate/free thing. I'm still having some troubles though, and I still am not sure what GetRawStringData is used for.
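
For anyone hitting the same confusion, here is a minimal sketch of the refcounted pooling described above. It is not the ScriptStdString implementation. The method names come from this thread, but the parameter types, the 0 / -1 return codes, and the names PooledStringFactory, AppString, and PoolSize are assumptions made for illustration, so check angelscript.h for the real asIStringFactory declaration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the application's native string type.
using AppString = std::string;

// Sketch only: mirrors the method names from the thread, not the real header.
class PooledStringFactory
{
public:
    // Called once per literal at compile time. Identical literals share one
    // pooled object; each call adds one reference to it.
    const void *GetStringConstant(const char *data, unsigned int length)
    {
        AppString key(data, length);
        auto it = m_pool.find(key);
        if (it == m_pool.end())
            it = m_pool.emplace(key, 0).first;
        ++it->second;
        // Hand back the string object itself, not the cache entry around it.
        return &it->first;
    }

    // Drops one reference. The string is freed only when the count hits zero.
    int ReleaseStringConstant(const void *str)
    {
        if (str == nullptr) return -1;
        auto it = m_pool.find(*static_cast<const AppString *>(str));
        if (it == m_pool.end()) return -1;
        assert(it->second > 0);
        if (--it->second == 0)
            m_pool.erase(it);
        return 0;
    }

    // Helper for inspection in this sketch only.
    std::size_t PoolSize() const { return m_pool.size(); }

private:
    // literal -> reference count. Element addresses in std::unordered_map stay
    // valid until that element is erased, so returned pointers remain usable.
    std::unordered_map<AppString, int> m_pool;
};
```

With this shape, two GetStringConstant calls for the same literal return the same pointer, and the memory goes away only after the second ReleaseStringConstant.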

So, a couple things I'm having issues with: https://pastebin.com/PTbBrZqg

It is reporting some weird new unexpected constant strings:


[    ScriptEngine] New string constant: "€¼¦" @ 1EA6BE80
...
[    ScriptEngine] New string constant: "Tooltip: "" @ 1EA6C120
[    ScriptEngine] New string constant: ": "" @ 1EA6C0A0
[    ScriptEngine] New string constant: " (" @ 1EA6C040
[    ScriptEngine] New string constant: " ¾¦" @ 1EA6C100
[    ScriptEngine] New string constant: " ¿" @ 1EA6BF30
[    ScriptEngine] New string constant: " ¿¦" @ 1EA6C2D0
[    ScriptEngine] New string constant: "P¿¦" @ 1EA6C310
[    ScriptEngine] New string constant: "ÀÀ¦" @ 1EA6C230
...
[    ScriptEngine] New string constant: "Game Scene" @ 1EA6C280
[    ScriptEngine] New string constant: "˜ï" @ 1EA6C130
[    ScriptEngine] New string constant: "п" @ 1EA6C290

No string is now being destroyed, but when my scripts start executing code related to strings, it does a string copy constructor and passes an invalid reference to copy from. The (invalid) pointer is not a constant string located in my log, so I'm not sure where it's coming from or why it's invalid. I'm going to try to track this down where in the scripts it's coming from.

Edit: The weird issues were caused by me being tired. Of course. I was returning a pointer to my string cache pair instead of the actual string object itself. Fixing this also made those weird string constants disappear.

I'm sorry, I missed your posts somehow. 

Can I assume you were able to resolve all your problems with the new string factory? 

GetRawStringData is used when the script engine needs to know the original string content, for example when saving the bytecode. Currently it is also used during compilation to make a copy of the string literals when finalizing the bytecode for a recently compiled script function. This is slightly inefficient and I'm looking into removing it, if I can find a better way. If I'm successful, it will only be used when saving the bytecode.
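
As a rough illustration of that role, the sketch below copies the raw bytes of a literal back out of an application string. It assumes the application string type is std::string and that the engine first asks for the length (passing no buffer) and then for the bytes. The signature, the 0 / -1 return codes, and the ReadLiteral helper are assumptions for this example, not the engine's actual code.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Sketch of the callback: reports the literal's length and, if a buffer is
// supplied, copies the raw bytes into it. Signature is an assumption.
int GetRawStringData(const void *str, char *data, unsigned int *length)
{
    if (str == nullptr) return -1;
    const std::string &s = *static_cast<const std::string *>(str);
    if (length) *length = static_cast<unsigned int>(s.size());
    if (data) std::memcpy(data, s.data(), s.size());
    return 0;
}

// Hypothetical caller, e.g. a bytecode writer: query the length first,
// then fetch the bytes into a buffer of that size.
std::vector<char> ReadLiteral(const void *str)
{
    assert(str != nullptr);
    unsigned int length = 0;
    GetRawStringData(str, nullptr, &length);
    std::vector<char> buffer(length);
    if (length > 0)
        GetRawStringData(str, buffer.data(), &length);
    return buffer;
}
```

Because the length is reported separately, literals containing embedded null characters survive the round trip.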

Yes! I managed to resolve my problems. It was mostly just due to a lack of documentation on asIStringFactory, I suppose.. :)

Thanks for the explanation of GetRawStringData, too!

I'll make sure to include a clarification for these things in the documentation for the release. :)
