Question about string addon

Started by June 01, 2015 02:37 AM
1 comment, last by WitchLord 9 years, 5 months ago

I've recently implemented my own string library using Glibmm's ustring, which is UTF-8, and everything is working great. My question is: why did you decide to remove the reference-counted version of the string add-on (I do know that it is in test_feature) and only support the value version?

I decided to go with reference counting as my program will do a lot of string operations. I haven't attempted any optimizations and haven't yet looked into the string pooling you do with the official add-on. I compared my implementation with the default add-on using my version of the Hashids library. For 10,000 encodings it takes on average 1.97 seconds with the default string add-on. With my thin ref-counted ustring wrapper it takes on average 2.54 seconds. The ustring is going to be slower than std::string because of UTF-8 character validation, so that may account for some of the speed difference. Or would I be better off attempting to register it as a value type?
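To make the two designs being compared concrete, here is a minimal standalone sketch, not the actual wrapper or the add-on. RefString, Concat, and ConcatByValue are hypothetical names, and std::string stands in for Glib::ustring so the sketch has no external dependencies. It only illustrates the shape of the two approaches: a ref-counted handle that is heap-allocated and returned by pointer, versus a plain value that is returned by copy or move.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical thin ref-counted wrapper (assumption: std::string stands in
// for Glib::ustring; the real wrapper would hold a ustring instead).
class RefString {
public:
    explicit RefString(std::string s = {}) : value_(std::move(s)), refCount_(1) {}

    // Reference counting in the style a script engine expects for
    // reference types: the object deletes itself when the count hits zero.
    void AddRef() { ++refCount_; }
    void Release() {
        assert(refCount_ > 0);
        if (--refCount_ == 0) delete this;
    }
    int RefCount() const { return refCount_; }

    // Every result is a fresh heap object that the caller must Release().
    RefString* Concat(const RefString& other) const {
        return new RefString(value_ + other.value_);
    }

    const std::string& Value() const { return value_; }

private:
    ~RefString() = default;  // destruction only through Release()

    std::string value_;
    int refCount_;
};

// Value-type counterpart: the result is returned by value, with no
// separate wrapper object or reference count to maintain.
inline std::string ConcatByValue(const std::string& a, const std::string& b) {
    return a + b;
}
```

Whether the extra wrapper allocation and count updates matter in practice depends on the workload, so this is something to measure rather than assume.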

The above was tested on Linux x64 with O3-level optimization; each test was run 4 times and the results averaged. JIT would of course speed things up, and I know it was below a second in past testing. However, it's no longer an option, as I rely on some changes in 2.30.x which the JIT library is not compatible with (and I've failed to get it working correctly). For comparison, the PHP implementation takes on average 0.388 seconds to do the same.
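For anyone wanting to repeat this kind of measurement, the run-several-times-and-average approach can be sketched as below. This is not the harness used for the numbers above; AverageSeconds is a hypothetical helper, and the callable passed to it is assumed to perform one full batch (for example, 10,000 encodings).

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: runs `work` `runs` times and returns the mean
// wall-clock time per run, in seconds. Returns 0.0 when runs <= 0.
inline double AverageSeconds(const std::function<void()>& work, int runs = 4) {
    assert(work && "work must be callable");
    using Clock = std::chrono::steady_clock;  // monotonic, suited to intervals

    double total = 0.0;
    for (int i = 0; i < runs; ++i) {
        const auto start = Clock::now();
        work();
        total += std::chrono::duration<double>(Clock::now() - start).count();
    }
    return runs > 0 ? total / runs : 0.0;
}
```

Calling it as AverageSeconds(runBatch, 4), where runBatch is whatever function executes the script under test, gives one averaged figure per string implementation.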

I put the Hashids implementation on pastebin if you want to take a look (it's mostly a direct port of the Java version). Maybe you have some ideas on how to speed it up.

Thanks,

dkrusu

Right now, given the speed difference, I'm thinking of offering two string types: the add-on that comes with AngelScript as the default, and a Unicode type called ustring for when UTF-8 support is needed.

I removed the reference counted CScriptString from the standard add-ons because I wanted to provide a simple solution. Keeping both CScriptString and std::string made it much harder to make sure the other add-ons that used strings would work easily with either of the two.

Besides, use of std::string in C++ is much more common. And whenever application developers feel they need a different string type, they usually write their own class, so CScriptString wouldn't be of much use anyway.

I won't try to convince you which is better, strings as a value type or as a reference type. There is plenty of material on the internet that discusses the benefits and disadvantages of ref-counted strings.
