Advertisement

std::map registration

Started by July 02, 2012 03:47 PM
6 comments, last by TheAtom 12 years, 5 months ago
Hi,

i wrote a std::map, std::unordered_map, or any associative container register class.

here is the syntax

hashmap<int>::RegisterHashMap("hashmap_int", "int", engine);
hashmap<string>::RegisterHashMap("hashmap_string", "string", engine); // a better dictionary addon
hashmap<float>::RegisterHashMap("hashmap_float", "float", engine);
RegisterIntpair();
hashmap<intpair>::RegisterHashMap("hashmap_intpair", "intpair", engine); // std::pair


script

hashmap_int<float> map;
int size = 100;
for(uint i = 0; i < size; ++i)
{
map = 42.0f;
}
hashmap_int_iterator end = map.End();
for(hashmap_int_iterator it = map.Begin(); it != end; ++it)
{
float x = map.Get(it);
}


intpair pair(10, -10);
hashmap_intpair<HashMapTest@> map;
@map[pair] = HashMapTest(11.0f);

hashmap_intpair_iterator it = map.Begin();
while(it != map.End())
{
float x = map[it].x;
dbg("x: " + x);
it = it.Next();
}


code is only 1 day old. only tested barely.
it is not battle tested, there may be some leaks.

i hope it will be useful to someone.
Looks interesting.

I'll review the code when I get the time. Perhaps you've implemented some things I can incorporate into the dictionary add-on too. :)

AngelCode.com - game development and more - Reference DB - game developer references
AngelScript - free scripting library - BMFont - free bitmap font generator - Tower - free puzzle game

Advertisement
In script environment, one would like to be safe from the possibility of dereferencing bad iterators. Any thoughts on that?

In the code I am doing now, every container keeps track of iterators that point to it to invalidate them whenever the container's contents (anything that affects iterators) are changed. At that point it also updates internally kept end and begin iterators which are to be returned by reference, to save from excessive copying. GC references are kept back and forth. Dereferencing an iterator marked as invalid raises a script exception.

An alternative I considered was to keep the container's "version number" and increment it every time the contents change. The iterator is also keeping the container version number it was created for, and this value is passed over when copying the iterator. Dereferencing, incrementing etc. of an iterator that holds an outdated value would raise an exception. That relieves both the container and GC from extra iterators "herding" but is potentially unsafe in case of some heavy duty containers(then again it's not likely for the version number to go over 64 bit integer limit, is it).

Apart from the iterators I exposed all the basic stl templates. I still haven't decided on the manner I'm going to do the iterators, I would welcome any suggestions. Perhaps I am missing some obvious solution.
why do you even need stored iterators. just store keys
they are needed most in linked lists where access is N. in every other case you have o(1) access.
linked list do not even invalidate iterators, so it is unnecessary to implement some sort of reference counting for iterators.
i use iterators only when i need to delete more than one element in one iteration.

that how i use them anyway. might be some other cases i dont know
It's easy to produce, accidentally or on purpose, a code that leads to an undefined behaviour.


container.push_back(50);
iterator it = container.begin();
container.clear();
Log(it->value);


My goal was to eliminate the possibility. That's again the issue of scripts supposed to be a sandbox environment. Much like disabling/enabling unsafe references, it's the matter of having a possible crash making code or not having it.
how about you keep one copy of the container and use it exclusively for iterators.
this maybe the only way to make sure iterators do not crash your code.
java does it this way.
Advertisement
These issues are the main reasons why I haven't implemented iterators or for-each loops in the script yet.

It's definitely not possible to keep the iterators as lightweight as in C++ while still keeping the safety.

Your idea of the container keeping track of all live iterators is a good one. Assuming you don't have a lot of iterators to the same container at the same time this shouldn't add too much of an overhead, and is probably quicker than doing a fresh lookup with every access to the value. How would you determine if the iterator has to be invalidated or not? Perhaps an iterator is only invalidated if the value it is pointing to is deleted?

The version number suggestion is also not bad, even though there is minimal risk of the version number overflowing and restarting from 0. Any updates to the container would invalidate all iterators all times, so this can possibly add more overhead than the first one, but would be easier to implement.

AngelCode.com - game development and more - Reference DB - game developer references
AngelScript - free scripting library - BMFont - free bitmap font generator - Tower - free puzzle game


how about you keep one copy of the container and use it exclusively for iterators.
this maybe the only way to make sure iterators do not crash your code.
java does it this way.


How does it handle erasing by iterators or iterators range? For instance, container.erase(container.begin(), container.begin()+5);


How would you determine if the iterator has to be invalidated or not? Perhaps an iterator is only invalidated if the value it is pointing to is deleted?


This is hard to do in general and it will usually lead to iteration over all iterators to see what must be done about each one. There are some guarantees on the validity of iterators after calling some methods that modify the container. A rather messy document available at http://www.open-std..../2010/n3225.pdf tries to sum them up in the chapter 23. For example, a swap() never invalidates the iterators (though of course after the call they point to different containers, and all GC references have to be updated anyway). An erase() on an associative container invalidates only the erased iterator - however there might still exist other copies of it, so again we have to iterate and check each one. A push_back() or pop_back() on a deque does not invalidate anything. A range erase() on a vector invalidates only the iterators past the new end. In this last case it is possible to verify an iterator at any point - it is sufficient that it lies between [begin(), end()) range since vector never shrinks the underlying array, excepting clear() call, which invalidates everything (in any other container too).

The need for iteration on almost every modification is one of the arguments for the "versioned containers" solution. But perhaps some hybrid approach wouldn't be a bad idea - in some cases we know that we have to invalidate everything (any clear()) and we do that by increasing the version counter, sometimes we handypick the iterators to be invalidated (erase(iterator) in map or set), and in case of a vector, maybe we don't care for invalidation and check the validity runtime (it entails but a pair of address comparisons, so the performance hit should not be severe).

I will take some time to think more about this. I am gone for good until 18.07 so I just wanted to pour all my thoughts here first.

This topic is closed to new replies.

Advertisement