Series introduction
This article is part of a larger series describing implementation details of various systems in Banshee Engine. If you are not familiar with Banshee make sure to check out the previous article:
Banshee Engine Architecture - Introduction
Introduction
The run-time type information (RTTI) system, together with serialization support, was one of the first systems developed for Banshee. It has wide-ranging consequences for many high-level modules and classes, and it is hard to do properly as an afterthought. The system had to be general enough to work for most objects in the engine, from resources and configuration files to entire hierarchies of scene objects and components for saving entire levels.
For those not familiar, run-time type information allows you to query information about types during program execution. For example, you may find out exactly which fields (i.e. variables) a class contains, get the class name, determine the actual type of a polymorphic object or even determine the base class and subtypes of a class. This has many uses but a really useful one is serialization. With enough information you can serialize entire objects or hierarchies of objects into a stream of bytes without needing to write special serialization code for specific object types. As long as the object has RTTI it can be serialized using the same code, while in the past you might have been forced to write save/serialize/write (and their read equivalents) methods for each type.
C++ comes with RTTI support built into the compiler, however it is a very limited form of support, used primarily for finding out if an object is of a certain polymorphic type and if it can be safely cast to another type. This form of RTTI also applies to every single type whether you need it or not (unless the compiler is smart enough to optimize certain cases out, but exact control is lacking regardless). Higher level languages like C# come with more advanced RTTI support, allowing you to query pretty much every detail about classes, fields, methods and most other language constructs.
Banshee needed a lot more control than the C++ RTTI system offers, but without implementing a full reflection-type system like the one present in C#. When creating it I had this set of requirements in mind:
- Ability to determine the exact polymorphic type from an object pointer
- Ability to create a new empty instance of an object with just a class name or RTTI identifier
- Serialization that handles object references so complex structures may be saved (e.g. entire game levels)
- Serialization that supports versioning so that adding new fields or deleting old ones doesn't break previously serialized data (e.g. player has a save file and after you patch the game that save file must still work even if serialized class structure is no longer the same)
- Serialization that works automatically with inheritance hierarchies
- Support for data transformations during serialization/deserialization. Sometimes what you keep in memory is not the same as the thing you want to serialize (e.g. you might want to compress image data before saving it)
- Serialization that natively supports arrays of data
- Serialization needs to play well with external references (e.g. references to resources, which are saved separately)
- Serialization needs to play well with the scripting engine, as the scripts have a more relaxed serialization scheme which is still built on top of the same system
- Ability to write serialized data in various formats (e.g. binary, text, xml, etc.)
- Binary serialization had to produce compact results
To give you a taste here are some examples of what the finalized RTTI and serialization system will allow you to do.
Serialize an in-memory object to a file and then restore it later, while keeping all its references intact. This will even save complex hierarchies like game levels and external references to resources.
MyObjectPtr myObject = ...;
FileSerializer fs;
fs.encode(myObject, "C:\\myObject.obj");
std::shared_ptr<IReflectable> myRestoredObject = fs.decode("C:\\myObject.obj");
The MyObject class can even change between the encode and decode calls without breaking the serialization. If a field was removed, deserialization will throw away the saved data for it, and if a new field was added it will be initialized to its default value. This means you can safely store your levels/resources/save games and not worry about versioning or converting them if your types change (during development or due to a patch, for example).
You may check if an object is of the proper type using the rtti_is_of_type helper method:
IReflectable* myObject = ...;
if(rtti_is_of_type<Texture>(myObject)) // The template argument is the type to test against
...
Check if a class derives from some other using the rtti_is_subclass helper method:
if(rtti_is_subclass<Texture>(myObject)) // The template argument is the base type to test against
...
Create a new instance of an object just from the type name or ID using the rtti_create helper method:
myObject = rtti_create(TID_Texture);
Helper methods for converting many standard types to/from bytes are also provided. Aside from being useful when writing to disk, you may use this for encoding data to send over the network.
Map> myMap;
... fill myMap with data ...
UINT32 size = rttiGetElemSize(myMap);
char* buffer = (char*)bs_alloc(size);
rttiWriteElem(myMap, buffer);
...
It is also immensely useful to be able to iterate over all fields of a class. For example, imagine you wanted to iterate over your entire level hierarchy and create a list of all resources used by the level (or just by a specific game object). A very simplified example where we collect resources only on a single component myComponent would look something like this:
RTTITypeBase* type = myComponent->getRTTI();
UINT32 numFields = type->getNumFields();
for(UINT32 i = 0; i < numFields; i++)
{
    RTTIField* field = type->getField(i);
    if(field->isReflectableType()) // Resource handles are IReflectable value types
    {
        RTTIReflectableFieldBase* reflField = static_cast<RTTIReflectableFieldBase*>(field);
        if(reflField->getType() == ResourceHandle::getRTTIStatic())
        {
            HResource handle = reflField->getValue(myComponent);
            ... // Save the handle to some list of dependencies
        }
    }
}
Now that you have an idea what it can do, let's see how it does it.
RTTI implementation
Banshee uses a manual approach for defining RTTI data. That is, you must manually specify the fields and classes you wish to expose to RTTI in a separate C++ file. Other engines often use a similar approach, although most seem to prefer macros in the source class' header itself. Macros have the advantage that programmers are less likely to forget to add a field to RTTI when a class changes, but they also pollute the header with RTTI information that I would rather keep external. Aside from that, macros are harder to read and often confuse IDEs, especially when attempting automatic refactoring. Banshee also requires you to specify additional information along with each field to handle the complex cases mentioned above, which would further pollute the header. On top of that, data transformations (e.g. compressing a texture before saving) cannot be handled easily with macros; they would need a special case and would introduce even more RTTI data into the source class' files. In the end this is a personal preference more than anything else.
The ideal approach to handling RTTI data is to generate it programmatically. You would have an external tool that preprocesses your files before compilation, parsing C++ code along with some optional parameters you specify, and generating RTTI C++ files from that information. This isn't an approach I have seen in any current C++-based engine, but I believe that is due to the lack of a simple way to parse C++ files and generate the necessary information.
Recently, with the appearance of libraries like libclang, creating a fairly extensive C++ RTTI system should be doable. However, this was out of the scope of what was needed for Banshee. RTTI definitions in C++ are less of a problem with Banshee than with other fully C++-based solutions, as it is expected you will write most of the high level code using C# scripting, which provides fully automatic serialization (built on top of the same system described here). Additionally, even with an automatic system you would still have to manually specify serialization information for data transformations and other special cases. It was therefore not something I felt was worth the effort, but I thought it was still worth mentioning.
In Banshee, to create a class that supports RTTI you must first ensure it implements the IReflectable interface. It is a minimal interface that requires the class to implement a couple of methods that retrieve an RTTIType object. The returned RTTIType object contains all the needed RTTI data.
For a very simple Texture class (with no actual data) the implementation would look like this:
class Texture : public IReflectable
{
    int width, height;

    friend class TextureRTTI; // Lets the RTTI type access private members
public:
    static RTTITypeBase* getRTTIStatic()
    { return TextureRTTI::instance(); }
    virtual RTTITypeBase* getRTTI() const
    { return Texture::getRTTIStatic(); }
};
The getRTTI method allows you to retrieve RTTI data from an object instance, and getRTTIStatic is a static method for when you don't have an object instance.
TextureRTTI is an RTTIType implementation specific to the Texture class. It holds all the RTTI information, keeping the source class clean of RTTI and serialization-specific code aside from the simple interface implementation. We will cover how to create your own RTTIType in the next section.
RTTIType
RTTIType allows you to provide various information about the source type, including its name, place in inheritance hierarchy and field definitions, along with optional logic to trigger during serialization/deserialization.
TextureRTTI class mentioned above might look something like the following.
class TextureRTTI : public RTTIType<Texture, IReflectable, TextureRTTI>
{
    // Getter/setter pairs for every member exposed to RTTI
    int& getWidth(Texture* obj) { return obj->width; }
    void setWidth(Texture* obj, int& value) { obj->width = value; }
    int& getHeight(Texture* obj) { return obj->height; }
    void setHeight(Texture* obj, int& value) { obj->height = value; }

    TextureRTTI ()
    {
        // Register each field under a name and a unique field ID
        addPlainField("width", 0, &TextureRTTI::getWidth, &TextureRTTI::setWidth);
        addPlainField("height", 1, &TextureRTTI::getHeight, &TextureRTTI::setHeight);
    }

    const String& getRTTIName()
    {
        static String name = "Texture";
        return name;
    }

    UINT32 getRTTIId()
    {
        return TID_Texture;
    }

    std::shared_ptr<IReflectable> newRTTIObject()
    {
        return bs_shared_ptr<Texture>();
    }
};
RTTIType is a template whose parameters tell the system which source class the RTTI information describes, along with its base class. In RTTIType<Texture, IReflectable, TextureRTTI> the first parameter is the source type, the second is the base class of the source type (usually IReflectable) and the third is the RTTI type itself. The last one is not strictly needed for RTTI purposes but allows generation of some repetitive code you would otherwise need to write yourself.
All RTTIType implementations get registered with the runtime automatically when the application starts or when the dynamic library containing the type is loaded. This means all you need to do is implement the interface and it will be usable on the next compile.
In the class itself you will find field definitions for the source class members you wish to include in RTTI. In this case we have the width and height members with their getter/setter methods. Those methods need to follow a certain format, and once declared they must be passed to one of the add*Field methods to register them with the type. We will cover field definitions in more detail later.
Following the field definitions are methods for retrieving the unique name and ID of the source class. The class name can usually be the same as the C++ class name, and the ID must be a unique integer, in this case provided in the form of an enum. It is usually good to ensure that all IDs in a library start far away from other libraries' IDs in order to avoid conflicts, although the system will warn you if you accidentally use the same ID for multiple types.
Finally, the newRTTIObject() method is used for creating a new empty object of the source type. This means that you will normally want classes that support RTTI to have a parameterless constructor. The constructor may be private if you make the RTTI type class a friend of the source class, which is usually a good idea anyway as it gives you easier access to its members. If the source class is abstract and should never be constructed, this method may return null.
The returned value is always wrapped in a shared pointer to ensure proper cleanup. With complex hierarchies the serializers will often need to construct dozens or hundreds of objects, and leaving the deallocation as a worry to the user is not practical. Those that find the use of shared pointers too heavyweight (which should only be extreme cases) can use the simpler serialization technique described later.
This concludes the example. The RTTI type shown above is fully functional - more complex types will require additional fields and different field types, along with possibly some data transformation and additional initialization logic, but the basic concept remains as shown here.
The following section covers the creation of field information, which we skipped over in this section.
RTTI fields
In the previous example you saw the use of the addPlainField method for registering a field with the RTTI system. This is just one of three available field types:
- Plain fields - used when you just want to (or need to) use a memory copy for serializing. This is used for all primitive types like int, float, bool and similar, but may also be used for complex types in case you don't want any advanced serialization. We'll touch on how to define serialization for such types later.
- Reflectable fields - used when referencing another object that implements IReflectable. Object will be serialized inline by value (if multiple objects reference it, each will have its own copy).
- Reflectable pointer fields - used when referencing another object that implements IReflectable via a pointer. If multiple objects share pointers to the same IReflectable that connection will be preserved when serializing and restored when deserializing. This allows you to serialize complex hierarchies.
Each method for registering fields follows the same basic format, so let's use addPlainField as an example; we will visit the specific methods later.
void addPlainField(String name, UINT32 uniqueId, CallbackGet getter, CallbackSet setter, UINT64 flags);
First parameter is the field name, normally corresponding to name of the member the field describes. It doesn't have to be unique.
The second parameter is a unique ID for this field. Each field must have its own unique ID and you will be warned at runtime if that is not the case. Field IDs are what make the versioning system work: they allow you to serialize a certain set of data, modify the type, and still be able to deserialize and read the original fields. Each newly added field should have a unique ID, and the IDs of removed fields should never be reused. Additionally, if you change the data type of a field you should update its ID (think of it as adding a brand new field and removing the old one).
Both the name and ID are used with all field types (addReflectableField, addReflectablePtrField, etc.) so I will not be mentioning them again when we cover those fields.
Next two parameters are function callbacks that actually assign and retrieve data from the member variable the field represents. Those callbacks need to follow a certain signature which is different based on field type they're used with so I will describe them in their own separate sections.
And the final parameter is an optional set of flags that you can use for custom data. It is not used by the RTTI system directly, but rather may provide additional information to the serializer or other systems using RTTI.
All field types except for managed data fields also come in array variants. Array variants are similar to normal fields but also allow you to specify an index in getter/setter plus provide a getter/setter for array size.
void addPlainArrayField(String name, UINT32 uniqueId, CallbackGet getter, CallbackGetSize getSize, CallbackSet setter, CallbackSetSize setSize, UINT64 flags);
All array field types accept common setSize and getSize callbacks which follow this format:
UINT32 getSize(SourceType* obj);
void setSize(SourceType* obj, UINT32 size);
Where SourceType is the class of the object we are creating RTTI for (e.g. Texture), and size is the size of the array.
Plain fields
Plain fields contain data types that don't implement IReflectable. This includes primitives like int, bool or float, and complex types that you either cannot modify, like Vector or String, or whose structure you are sure will not change during development.
Data in plain fields gets serialized using memcpy, which means that if you use plain fields for complex types like classes you lose the versioning feature of the RTTI system. That is, if you modify that class later your saved data will most likely be broken. Therefore you should only use plain fields for types that you are sure will be static, like the ones mentioned above, or when the data doesn't have to persist for a long period of time (e.g. network transfer).
Plain field getter and setter callbacks must follow this format:
DataType& getter(SourceType* obj);
void setter(SourceType* obj, DataType& value);
And for arrays:
DataType& getter(SourceType* obj, UINT32 idx);
void setter(SourceType* obj, UINT32 idx, DataType& value);
While the rest of the parameters follow the same format as described in previous section.
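As a rough illustration of the array callback formats, here is a minimal sketch for a hypothetical Mesh class that holds a list of indices. Mesh, MeshRTTI and all member names are made up for this example, UINT32 is aliased locally so the snippet compiles on its own, and the registration call is only shown as a comment because it depends on the engine.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using UINT32 = std::uint32_t; // Local stand-in for the engine typedef

// Hypothetical source class with an array member
struct Mesh
{
    std::vector<UINT32> indices;
};

// Hypothetical holder of the four callbacks an array field needs
struct MeshRTTI
{
    // Element getter/setter take the element index as an extra parameter
    UINT32& getIndex(Mesh* obj, UINT32 idx)
    {
        assert(idx < obj->indices.size());
        return obj->indices[idx];
    }

    void setIndex(Mesh* obj, UINT32 idx, UINT32& value)
    {
        assert(idx < obj->indices.size());
        obj->indices[idx] = value;
    }

    // Size getter/setter let the serializer query and resize the array
    UINT32 getNumIndices(Mesh* obj) { return (UINT32)obj->indices.size(); }
    void setNumIndices(Mesh* obj, UINT32 size) { obj->indices.resize(size); }

    // Inside a real RTTIType constructor the registration might look like:
    // addPlainArrayField("indices", 0, &MeshRTTI::getIndex, &MeshRTTI::getNumIndices,
    //     &MeshRTTI::setIndex, &MeshRTTI::setNumIndices);
};
```

When decoding, a serializer would presumably call the size setter first so that the element setter always writes to a valid index.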
RTTIPlainType must be specialized for the DataType above; the specialization tells the RTTI system how the internal data is to be serialized. The default version uses a memcpy for the entire type, but you might need something more complex (e.g. when serializing std::string). I will talk more about RTTIPlainType in a bit.
Reflectable fields
These fields contain types that implement the IReflectable interface. The objects are stored by value, which means each time the field is serialized and deserialized a brand new copy is made. This is in contrast with the IReflectable pointer fields described in the next section.
You do not need to do anything special with reflectable fields as long as the type properly implements the IReflectable interface as described earlier.
You may use the addReflectableField and addReflectableArrayField methods in RTTIType to register a new reflectable field. Getter and setter callbacks for those methods follow this format:
DataType& getter(SourceType* obj);
void setter(SourceType* obj, DataType& value);
And for arrays:
DataType& getter(SourceType* obj, UINT32 idx);
void setter(SourceType* obj, UINT32 idx, DataType& value);
Essentially these are the same parameters as with plain fields, except that DataType must implement IReflectable, otherwise you will get a compiler error.
Reflectable pointer fields
These fields also contain types that implement the IReflectable interface. However, unlike the previous field type, the reference is held by pointer rather than by value. When such an object is serialized and deserialized it will remain a single object, and the pointer references will be properly saved and restored in all objects that reference it. This allows you to serialize complex hierarchies or webs of objects while ensuring all the connections remain intact.
You may use the addReflectablePtrField and addReflectablePtrArrayField methods in RTTIType to register a new reflectable pointer field. Getter and setter callbacks for those methods follow this format:
std::shared_ptr<DataType> getter(SourceType* obj);
void setter(SourceType* obj, std::shared_ptr<DataType> value);
And for arrays:
SPtr<DataType> getter(SourceType* obj, UINT32 idx);
void setter(SourceType* obj, UINT32 idx, SPtr<DataType> value);
Where DataType must implement the IReflectable interface and SPtr is just a Banshee shorthand for a shared pointer.
This is the final field type, but before continuing I want to focus on the plain type specializations I mentioned earlier.
Plain type specializations
Plain type specializations allow you to control how a plain data type is serialized. They are used for types that cannot implement the IReflectable interface (e.g. primitive types or types from the standard library) or for types you know will not change.
Plain type serialization can also offer you more complete control over the serialization of your object and is more lightweight than using the IReflectable interface. The downside is that the advanced features offered by IReflectable, like field versioning or pointer saving/restoring, will not be available.
Plain types are implemented by specializing the RTTIPlainType template. This template allows you to perform serialization in a more traditional way by using memory copies.
Banshee comes with RTTIPlainType specializations for all basic types and many standard library containers, but you may also create your own. See below for a very basic specialization of RTTIPlainType for the String type.
template<> struct RTTIPlainType<String>
{
    enum { id = 20 }; enum { hasDynamicSize = 1 };

    // Writes a 4-byte total size followed by the raw string characters
    static void toMemory(const String& data, char* memory)
    {
        UINT32 size = getDynamicSize(data);
        memcpy(memory, &size, sizeof(UINT32));
        memory += sizeof(UINT32);
        size -= sizeof(UINT32);
        memcpy(memory, data.data(), size);
    }

    // Reads the size header, then rebuilds the string; returns bytes consumed
    static UINT32 fromMemory(String& data, char* memory)
    {
        UINT32 size;
        memcpy(&size, memory, sizeof(UINT32));
        memory += sizeof(UINT32);
        UINT32 stringSize = size - sizeof(UINT32);
        char* buffer = (char*)bs_alloc(stringSize + 1);
        memcpy(buffer, memory, stringSize);
        buffer[stringSize] = '\0';
        data = String(buffer);
        bs_free(buffer);
        return size;
    }

    // Total serialized size: characters plus the 4-byte size header
    static UINT32 getDynamicSize(const String& data)
    {
        UINT64 dataSize = data.size() * sizeof(String::value_type) + sizeof(UINT32);
        return (UINT32)dataSize;
    }
};
Each specialization must provide a unique type ID, similar to IReflectable. Optionally you can control whether the type's size can be calculated via sizeof or is dynamic per instance, via the hasDynamicSize property. Types without dynamic size take up less space in serialized form as their size doesn't need to be written in a separate block, while types with dynamic size need to implement getDynamicSize, which calculates the number of bytes taken up by that specific instance.
For example, the String type has dynamic size because each instance can have a different size, while float has static size.
The provided methods should be self-explanatory: toMemory writes the object instance into a stream of bytes, fromMemory restores the object from a stream of bytes and getDynamicSize returns the number of bytes the object will require in the memory buffer when serialized.
Once a type has an RTTIPlainType specialization you can use it in plain fields as described earlier. If your type is really simple and you want it to be serialized entirely by a single memcpy (e.g. Vector3, Matrix4) you can use the shorthand macro:
BS_ALLOW_MEMCPY_SERIALIZATION(DataType);
Which will generate the default RTTIPlainType specialization.
Aside from allowing you to use data types for plain fields in RTTIType, this type of specialization also allows you to use a set of helper methods:
char* rttiWriteElem(const DataType& object, char* buffer);
char* rttiReadElem(DataType& object, char* buffer);
UINT32 rttiGetElemSize(const DataType& object);
These methods can be used by other, more complex RTTIPlainType specializations. For example, a specialization for the Vector data type could use rttiWriteElem in its toMemory method to easily write each element of the container to the memory buffer.
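To make the idea of composing specializations more concrete, here is a minimal, self-contained sketch of container serialization built on per-element helpers. It does not use the engine's templates: writeElem and readElem are simplified stand-ins for rttiWriteElem and rttiReadElem that only handle trivially copyable types, std::vector stands in for Vector, and the layout (total size, element count, then elements) is an assumption made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

using UINT32 = std::uint32_t; // Local stand-in for the engine typedef

// Simplified element helpers: copy the value and return the advanced pointer
template<class T> char* writeElem(const T& value, char* buffer)
{
    std::memcpy(buffer, &value, sizeof(T));
    return buffer + sizeof(T);
}

template<class T> char* readElem(T& value, char* buffer)
{
    std::memcpy(&value, buffer, sizeof(T));
    return buffer + sizeof(T);
}

// Assumed layout: [total size in bytes][element count][elements...]
template<class T> UINT32 getDynamicSize(const std::vector<T>& data)
{
    return (UINT32)(sizeof(UINT32) * 2 + data.size() * sizeof(T));
}

template<class T> void toMemory(const std::vector<T>& data, char* memory)
{
    memory = writeElem(getDynamicSize(data), memory);
    memory = writeElem((UINT32)data.size(), memory);
    for (const T& elem : data)
        memory = writeElem(elem, memory); // Each element goes through the helper
}

template<class T> UINT32 fromMemory(std::vector<T>& data, char* memory)
{
    UINT32 size = 0;
    UINT32 count = 0;
    memory = readElem(size, memory);
    memory = readElem(count, memory);
    assert(size == sizeof(UINT32) * 2 + count * sizeof(T)); // Sanity check on the header
    data.resize(count);
    for (T& elem : data)
        memory = readElem(elem, memory);
    return size; // Number of bytes consumed
}
```

In an actual specialization the element helpers would dispatch to each element's own RTTIPlainType, so nested containers would presumably work the same way.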
This type of serialization is very fast and as lightweight as you want it to be (depending on your RTTIPlainType specialization), which makes it perfect for performance-intensive scenarios. It is also very useful for serializing data types for network transfers, as these transfers generally do not require any of the advanced features provided by IReflectable.
As Banshee already comes with many of these specializations built in, you can serialize fairly complex containers with no problem:
Map> myMap;
... fill myMap with data ...
UINT32 size = rttiGetElemSize(myMap);
char* buffer = (char*)bs_alloc(size);
rttiWriteElem(myMap, buffer);
...
Binary serializer implementation
Now that RTTI type is defined we will focus on a serializer class that uses that RTTI information for saving and loading objects. Banshee currently only has a binary serializer but different types of serializer (text, xml, json) can be relatively easily implemented following the same example as the binary one.
When using binary serialization you have the option to output data to memory using MemorySerializer or to a file using FileSerializer. Both of those are just very simple wrappers around the BinarySerializer class.
As an example, if you wanted to save/load a class that implements the necessary RTTI interfaces to/from a file you would do:
FileSerializer fs;
fs.encode(myObject, "C:\\myObject.obj");
std::shared_ptr<IReflectable> myRestoredObject = fs.decode("C:\\myObject.obj");
This will save a previously loaded object into a binary format, and then immediately load that same object. This can be used for saving entire levels, user save games, cloning objects, undo/redo functionality and other uses, and all you need to write is those few lines.
Encoding
Encoding involves parsing an existing object instance and encoding its data to a stream of bytes which can later be output to a memory buffer or a file. It is performed by calling the encode method on BinarySerializer. Its signature looks like this:
void BinarySerializer::encode(IReflectable* object, UINT8* buffer, UINT32 bufferLength, int* bytesWritten, std::function flushBufferCallback);
FileSerializer and MemorySerializer wrap the complexities of BinarySerializer so you usually don't need to worry about most of these parameters, but in this section we will explain them.
First off you have the object you wish to serialize, followed by a pointer to a block of memory where the encoded object data will be written and the size of that block (buffer and bufferLength). bytesWritten is an output parameter that will hold the number of bytes used to encode the entire object once the process is complete.
Finally you have flushBufferCallback, which is called whenever the buffer gets full. In that callback you usually want to save the contents of the buffer and return a new pointer to a free block of memory, or terminate the encoding. Normally you would write the buffer data to a file or a larger block of memory and then return the start of the same buffer to be reused for further encoding.
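The flush pattern can be sketched with a toy encoder that pushes bytes through a small scratch buffer. This is not BinarySerializer itself: the FlushCallback signature, the encodeBytes function and the convention that returning nullptr aborts the encoding are all assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

using UINT8 = std::uint8_t;   // Local stand-ins for the engine typedefs
using UINT32 = std::uint32_t;

// Assumed callback shape: receives the buffer and the number of valid bytes in it,
// returns the buffer to continue writing into, or nullptr to stop encoding
using FlushCallback = std::function<UINT8*(UINT8* buffer, UINT32 bytesWritten, UINT32& newBufferSize)>;

// Toy encoder: copies `length` bytes from `src` through a fixed-size scratch buffer.
// Returns the number of bytes written into buffers before finishing or stopping.
UINT32 encodeBytes(const UINT8* src, UINT32 length, UINT8* buffer, UINT32 bufferLength,
    const FlushCallback& flush)
{
    assert(buffer != nullptr && bufferLength > 0);

    UINT32 used = 0;
    UINT32 total = 0;
    for (UINT32 i = 0; i < length; i++)
    {
        if (used == bufferLength) // Buffer is full, hand it over to the caller
        {
            buffer = flush(buffer, used, bufferLength);
            if (buffer == nullptr)
                return total; // Caller asked to terminate the encoding

            assert(bufferLength > 0);
            used = 0;
        }

        buffer[used++] = src[i];
        total++;
    }

    if (used > 0)
        flush(buffer, used, bufferLength); // Hand over the partially filled remainder

    return total;
}
```

A file-backed wrapper would write the flushed bytes to disk in the callback and return the same scratch buffer, which keeps memory use constant regardless of how large the encoded object is.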
The process of encoding involves these steps:
- The top level object starts getting encoded
  - All RTTI types of an object are retrieved and iterated over (there will be only one if the object is not polymorphic)
    - For each RTTI type we retrieve all fields
      - Plain and IReflectable fields are serialized directly into the output buffer
        - Plain fields are encoded by directly accessing their RTTIPlainType implementation
        - IReflectable fields are encoded by accessing their RTTI types and fields by calling encode recursively
      - IReflectable pointer fields are marked for later and given a unique ID. The ID is encoded in the output buffer.
- After we loop through the top level object and all of its IReflectable children we start serializing IReflectable objects that were referenced by pointers. Their serialization proceeds by recursively calling encode, essentially repeating the whole procedure.
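The deferred handling of pointer fields in the steps above can be sketched with a toy object graph. This is only an illustration under simplifying assumptions: Node is a hypothetical type with one plain field and a list of pointer fields, the output is a flat list of integers rather than the real binary format, and all pointers are assumed to be non-null.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <memory>
#include <unordered_map>
#include <vector>

using UINT32 = std::uint32_t; // Local stand-in for the engine typedef

// Hypothetical object with one plain field and several pointer fields
struct Node
{
    UINT32 value = 0;
    std::vector<std::shared_ptr<Node>> links;
};

// Emits [instanceId, value, numLinks, linkId...] for every reachable object.
// A pointed-to object only receives an ID when first seen; it is encoded later.
std::vector<UINT32> encodeGraph(const std::shared_ptr<Node>& root)
{
    assert(root != nullptr);

    std::vector<UINT32> out;
    std::unordered_map<Node*, UINT32> ids; // Object -> unique instance ID
    std::deque<Node*> pending;             // Objects marked for later encoding

    auto idFor = [&](Node* node) {
        auto it = ids.find(node);
        if (it != ids.end())
            return it->second; // Already known, reuse the same ID

        UINT32 id = (UINT32)ids.size() + 1;
        ids[node] = id;
        pending.push_back(node);
        return id;
    };

    idFor(root.get());
    while (!pending.empty())
    {
        Node* node = pending.front();
        pending.pop_front();

        out.push_back(ids[node]);
        out.push_back(node->value);                // Plain field, written directly
        out.push_back((UINT32)node->links.size());
        for (auto& link : node->links)
            out.push_back(idFor(link.get()));      // Pointer field, only the ID is written
    }

    return out;
}
```

Because every object is keyed by its address, an object shared by several owners is encoded once, and cyclic references do not cause infinite recursion.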
Whenever serialization of a particular sub-type starts and ends we call RTTIType::onSerializationStarted and RTTIType::onSerializationEnded. You may override those in your RTTIType implementations in case you need to prepare some data on serialization start, or perform cleanup once it ends. Objects that implement IReflectable by default also have an mRTTIData field of type Any. As the name implies you can use that field to store any kind of data. Usually it is used for temporary data created in RTTIType::onSerializationStarted and freed in RTTIType::onSerializationEnded.
Most RTTI types will not require these methods, but they can prove useful with complex types that require special handling.
Decoding
The decoding process involves reading the binary data and creating object instances from that data. It is performed by calling the decode method on the BinarySerializer:
std::shared_ptr<IReflectable> BinarySerializer::decode(UINT8* data, UINT32 dataLength)
The decoding algorithm iterates through the provided buffer, detecting objects, their fields and their data. Each time a new field or a new object is reached its type is looked up in the RTTI system. If the field or type cannot be found (we could have removed it since) it is skipped. Otherwise a new instance of that type is created, either by using the IReflectable interface or the RTTIPlainType template.
After all objects are decoded the pointer references will be restored as a final step. The method returns an object instance to the top level object that was decoded, or null if no object was decoded.
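The skipping behaviour that makes versioning work can be sketched in a few lines. The record layout used here (2-byte field ID, 1-byte size, then data) is a simplification invented for this example and is not the engine's actual format; TextureV2, appendField and decodeFields are likewise hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using UINT8 = std::uint8_t;   // Local stand-ins for the engine typedefs
using UINT16 = std::uint16_t;
using UINT32 = std::uint32_t;

// Newer version of a hypothetical type: field 1 was removed, field 2 was added
struct TextureV2
{
    UINT32 width = 0; // Field ID 0
    UINT32 depth = 1; // Field ID 2, keeps its default if missing from the data
};

// Appends one record: [field ID: 2 bytes][size: 1 byte][data]
void appendField(std::vector<UINT8>& out, UINT16 id, const void* src, std::size_t size)
{
    assert(size <= 255); // Size must fit in the single size byte

    UINT8 header[3];
    std::memcpy(header, &id, sizeof(id));
    header[2] = (UINT8)size;

    const UINT8* bytes = static_cast<const UINT8*>(src);
    out.insert(out.end(), header, header + 3);
    out.insert(out.end(), bytes, bytes + size);
}

// Reads records one by one, applying known fields and skipping unknown ones
void decodeFields(TextureV2& obj, const std::vector<UINT8>& data)
{
    std::size_t pos = 0;
    while (pos + 3 <= data.size())
    {
        UINT16 id;
        std::memcpy(&id, &data[pos], sizeof(id));
        UINT8 size = data[pos + 2];
        pos += 3;

        if (pos + size > data.size())
            break; // Truncated record

        if (id == 0 && size == sizeof(UINT32))
            std::memcpy(&obj.width, &data[pos], size);
        else if (id == 2 && size == sizeof(UINT32))
            std::memcpy(&obj.depth, &data[pos], size);
        // Any other ID is unknown to this version of the type and is skipped

        pos += size; // The stored size lets us step over data we don't understand
    }
}
```

Storing the size next to each field ID is what allows the decoder to step over fields it no longer recognizes.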
Similar to encoding, whenever deserialization of a particular sub-type starts and ends we call RTTIType::onDeserializationStarted and RTTIType::onDeserializationEnded. You can use those methods together with IReflectable::mRTTIData for pre- and post-processing operations.
You will get better insight into how decoding works by taking a look at the binary structure of an encoded object.
Binary data structure
The binary structure determines how encoded data is laid out in memory. It tries to be relatively compact while remaining robust enough to handle all the features provided by the RTTI system, like versioning and reference saving.
On the highest level it is laid out as such:
- Top level object
  - List of one or more types (more than one in case the object derives from another RTTI type)
    - Type meta-data
    - Type fields
      - Field meta-data
      - Field data
- Zero or more objects referenced by IReflectable pointers
  - Same structure as top level object
In that list there are three basic components to be aware of: type meta-data, field meta-data and field data, so let's continue by describing each.
Type meta-data
This is an 8-byte structure that describes the type that is to follow. It is also the first block you will encounter for each serialized object. This means it is the first block in the file for the top level object, and there will be at least one for every referenced IReflectable object.
It corresponds to a single RTTIType implementation. A single object can have multiple types if it derives from a type that also has RTTI of its own, but in a lot of cases it will be just one. Types are laid out starting with the most specialized, followed by more general ones.
In its 8 bytes it encodes RTTI type ID, unique instance ID for the object and a couple of flags. Exact encoding looks like this (each character is a bit):
SSSS SSSS SSSS SSSS xxxx xxxx xxxx xxBO
IIII IIII IIII IIII IIII IIII IIII IIII
S - Unique instance identifier for an object instance. This is used when resolving references to objects (IReflectable pointers).
I - Unique RTTI type identifier that was specified when implementing an RTTIType.
B - 0 if the current RTTI type is the actual polymorphic type of the object we're encoding and 1 if it is a base class. Used internally to signify if we have reached a brand new object or are just parsing a base class of the current object.
O - Type of meta-data. 0 for field meta-data and 1 for type meta-data. Lets us know if we have reached a new field or a new object when parsing.
x - Unused
Each type meta-data block is followed by a list of fields, if the type has any. Fields consist of field meta-data and field data, described below.
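As a sketch of how the layout above could be packed and unpacked, the snippet below treats the first row as one 32-bit word whose most significant bit is the leftmost character in the diagram, and the second row as the 32-bit type ID. That bit ordering, along with the TypeMetaData struct and function names, is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>

using UINT16 = std::uint16_t; // Local stand-ins for the engine typedefs
using UINT32 = std::uint32_t;

// Hypothetical in-memory view of the 8-byte block
struct TypeMetaData
{
    UINT16 instanceId; // S bits
    bool isBaseClass;  // B bit
    UINT32 typeId;     // I bits, stored in the second 32-bit word
};

// Packs the first word: instance ID in the upper 16 bits, B in bit 1,
// and O in bit 0 (always 1 here because this is type meta-data)
UINT32 packTypeFlags(const TypeMetaData& meta)
{
    return ((UINT32)meta.instanceId << 16) | (meta.isBaseClass ? 0x2u : 0x0u) | 0x1u;
}

// The O bit tells the parser whether a block describes a type or a field
bool isTypeMetaData(UINT32 flagsWord)
{
    return (flagsWord & 0x1u) != 0;
}

TypeMetaData unpackTypeMetaData(UINT32 flagsWord, UINT32 typeId)
{
    assert(isTypeMetaData(flagsWord));

    TypeMetaData meta;
    meta.instanceId = (UINT16)(flagsWord >> 16);
    meta.isBaseClass = (flagsWord & 0x2u) != 0;
    meta.typeId = typeId;
    return meta;
}
```

Since only 16 bits are reserved for S, this layout caps the number of distinct object instances in a single encoded stream at 65536.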
Field meta-data
A 4-byte structure that describes a field that is to follow. It contains the unique field ID we provided to one of the add*Field methods when implementing RTTIType, the field type (plain, IReflectable, IReflectable pointer), the size of the data to follow and a few other flags.
Exact encoding looks like this:
IIII IIII IIII IIII SSSS SSSS xxYP xCAO
I - Unique field identifier we specified when registering the field with RTTIType. This is used for versioning: if fields are added or removed after an object was encoded, we can detect it using the unique field IDs. We can then skip encoded fields that no longer exist in the type, and avoid touching fields of the type that do not exist in the encoded data.
S - Size of the field data. This is used for plain types smaller than 255 bytes. If you expect the size to be larger than 255 then you must set the dynamic size flag in your RTTIPlainType implementation. Storing the size here is an optimization that avoids the 4-byte size overhead for really small types.
Y - Specified for plain fields that do not fit in 255 bytes. Signifies that an additional 4 bytes holding the size will be allocated at the start of the field data.
A - Signifies that a field contains an array of values. In that case this meta-data structure will immediately be followed by a 4-byte integer containing the number of array elements. Each array element is encoded as a separate field data entry.
C - Field contains an IReflectable value type.
P - Field contains an IReflectable pointer type.
O - Type of meta-data. 0 for field meta-data and 1 for type meta-data.
x - Unused
Field meta-data is followed either by a single field data block (in case of non-array fields) or by a 4-byte array size, which is then followed by a list of field data blocks (for array fields).
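The field meta-data layout can be sketched the same way. The constants and function names below are hypothetical, and the bit positions are read off the diagram under the assumption that the leftmost character is the most significant bit, so the flag byte `xxYP xCAO` maps to Y = 0x20, P = 0x10, C = 0x04, A = 0x02 and O = 0x01.

```cpp
#include <cassert>
#include <cstdint>

// Assumed flag positions within the lowest byte (xxYP xCAO).
constexpr std::uint32_t kFieldArrayBit          = 0x02; // A
constexpr std::uint32_t kFieldReflectableBit    = 0x04; // C
constexpr std::uint32_t kFieldReflectablePtrBit = 0x10; // P
constexpr std::uint32_t kFieldDynamicSizeBit    = 0x20; // Y

// Hypothetical decoded view of the 4-byte field meta-data.
struct FieldMeta
{
    std::uint16_t id;
    std::uint8_t  size;
    bool isArray;
    bool isReflectable;
    bool isReflectablePtr;
    bool hasDynamicSize;
};

// Packs field ID (upper 16 bits), inline size (next 8 bits) and flags.
inline std::uint32_t encodeFieldMeta(std::uint16_t fieldId,
                                     std::uint8_t size,
                                     std::uint32_t flags)
{
    // The O bit must stay 0, otherwise the word would read as type meta-data.
    assert((flags & 0x01) == 0);
    return (static_cast<std::uint32_t>(fieldId) << 16) |
           (static_cast<std::uint32_t>(size) << 8) |
           (flags & 0xFF);
}

// Unpacks a field meta-data word into its components.
inline FieldMeta decodeFieldMeta(std::uint32_t meta)
{
    FieldMeta out;
    out.id               = static_cast<std::uint16_t>(meta >> 16);
    out.size             = static_cast<std::uint8_t>((meta >> 8) & 0xFF);
    out.isArray          = (meta & kFieldArrayBit) != 0;
    out.isReflectable    = (meta & kFieldReflectableBit) != 0;
    out.isReflectablePtr = (meta & kFieldReflectablePtrBit) != 0;
    out.hasDynamicSize   = (meta & kFieldDynamicSizeBit) != 0;
    return out;
}
```

For example, a plain 4-byte array field with ID 3 would be encoded with `encodeFieldMeta(3, 4, kFieldArrayBit)`, and a reader would then expect a 4-byte element count before the first field data block.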
Field data
This block contains actual field data and its layout differs depending on the parent field type:
- For plain types this is just raw data. If the field has dynamic size, the first 4 bytes specify the size.
- For IReflectable value types it is the entire IReflectable object encoded recursively.
- For IReflectable pointer types it will be just an identifier signifying which object we are pointing to. The actual pointed-to object will be encoded after the top level object as mentioned earlier.
This is the final data block included in the binary structure. The entire structure is formed using the three types of blocks described above.
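As an illustration of how field meta-data and field data fit together, the following sketch writes non-array plain fields into a byte buffer and then looks one up by ID, skipping any field it does not recognize (the versioning behaviour described earlier). Everything here is hypothetical rather than Banshee's actual serializer: the function names, the use of the host byte order, and the assumption that the 4-byte dynamic size counts only the payload bytes that follow it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using ByteBuffer = std::vector<std::uint8_t>;

constexpr std::uint32_t kDynamicSizeFlag = 0x20; // assumed position of the Y bit

// Appends a 32-bit value in host byte order (an assumption of this sketch).
inline void writeU32(ByteBuffer& out, std::uint32_t v)
{
    std::uint8_t bytes[4];
    std::memcpy(bytes, &v, 4);
    out.insert(out.end(), bytes, bytes + 4);
}

// Reads a 32-bit value and advances the read position.
inline std::uint32_t readU32(const ByteBuffer& in, std::size_t& pos)
{
    assert(pos + 4 <= in.size());
    std::uint32_t v;
    std::memcpy(&v, in.data() + pos, 4);
    pos += 4;
    return v;
}

// Writes field meta-data followed by the raw bytes of a plain field.
inline void writePlainField(ByteBuffer& out, std::uint16_t fieldId,
                            const void* data, std::uint32_t size,
                            bool hasDynamicSize)
{
    std::uint32_t meta = static_cast<std::uint32_t>(fieldId) << 16;
    if (hasDynamicSize)
    {
        meta |= kDynamicSizeFlag;
        writeU32(out, meta);
        writeU32(out, size); // size prefix at the start of the field data
    }
    else
    {
        assert(size <= 255); // must fit into the 8-bit size in the meta-data
        writeU32(out, meta | (size << 8));
    }
    const std::uint8_t* p = static_cast<const std::uint8_t*>(data);
    out.insert(out.end(), p, p + size);
}

// Finds a plain field by ID, skipping over fields with other IDs.
inline bool readPlainField(const ByteBuffer& in, std::uint16_t wantedId,
                           ByteBuffer& value)
{
    std::size_t pos = 0;
    while (pos < in.size())
    {
        const std::uint32_t meta = readU32(in, pos);
        std::uint32_t size = (meta >> 8) & 0xFF;
        if (meta & kDynamicSizeFlag)
            size = readU32(in, pos);
        assert(pos + size <= in.size());
        if (static_cast<std::uint16_t>(meta >> 16) == wantedId)
        {
            value.assign(in.data() + pos, in.data() + pos + size);
            return true;
        }
        pos += size; // unknown or unwanted field: skip its data
    }
    return false;
}
```

The sketch leaves out arrays and IReflectable fields; in the format described above those would be handled by checking the corresponding flags and either reading an element count or recursing into a nested object.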
Source code
Those interested in the actual source code of the described systems can check out Banshee from https://github.com/BearishSun/BansheeEngine. All the code is contained in the following files:
RTTI
- BsIReflectable - Base interface all types with RTTI support need to implement.
- BsRTTIType - Base class that contains RTTI data for a specific type
- BsRTTIField - Base class used for representing a field in a type
- BsRTTIPlainField - Type field implementation for plain types
- BsRTTIReflectableField - Type field implementation for complex value types
- BsRTTIReflectablePtrField - Type field implementation for pointers to complex types
Serialization
- BinarySerializer - Converts an IReflectable into a stream of bytes, and the other way around
- FileSerializer - Uses BinarySerializer to serialize/deserialize directly to/from a file
- MemorySerializer - Uses BinarySerializer to serialize/deserialize to/from a memory buffer
Conclusion
This concludes the article. It has shown you how to create custom RTTI types, how to use the RTTI data for various needs and how to easily serialize and deserialize objects that supply RTTI data.
One part I haven't touched on is how Banshee handles serialization of script objects, as the scripting system is still in development. However, suffice it to say that internally this same system is used, but scripting classes and fields do not require manual definitions of RTTI data. RTTI data is provided automatically by the scripting runtime, while users may use various attributes to control RTTI information if needed. Compact representation is less of a requirement with the scripting system and the focus is instead placed on simplicity. I will touch more upon this in a separate article.
Hopefully you found the article informative. Join me next time, when I'll talk about the design of multi-threading in Banshee, primarily focusing on multi-threaded rendering.
Interesting, though, what puzzles me most - you've created this huge system that still requires a lot of user input and yet have named essentially only one use for it - iterating over all fields of a class. Would it not be more productive to simply create template functions for serialization / dumping / iteration and let the compiler take care of the rest?