
Problem serializing a small amount of data

Started by March 31, 2018 07:54 PM
15 comments, last by FFA702 6 years, 10 months ago

I have a save/load function in my 3D engine which was working quite well in the past. I've just added terrains (which are very small, no more than 100 triangles big) but I can't get them through my save/load pipeline. They're quite a bit larger than any single model I've had to deal with in the past. I have no problem loading the model in my engine, it takes less than a second, but I have a problem serializing it for my game save/load function, which uses a different format than my model format. The whole terrain model file looks like this:

 


-25,4.48,-25:251,252,250
-24,4.58,-25:250,253,250
-25,4.62,-24:250,252,251
-24,4.58,-25:250,253,250
-24,4.62,-24:250,253,251
-25,4.62,-24:250,252,251
-25,4.62,-24:250,252,251
-24,4.62,-24:250,253,251
...

 

but this is nearly irrelevant, as my save function only serializes the string using:


data += Serializebytes(Encoding.ASCII.GetBytes(entity.model.StringContent));

where Serializebytes() is:


 static string Serializebytes(byte[] bytes)
        {
            char byteSep = ' ';
            string byteString = "";
            foreach (byte byten in bytes)
            {
                byteString += byten.ToString();
                byteString += byteSep;
            }
            return (byteString);
        }

 

The problem really is with my Serializebytes() method: the program hangs there, as it takes forever to process. This worked fine for the very small models that made up my previous scenes, but it does not for the terrain.

The function is simply supposed to take the bytes given by Encoding.ASCII.GetBytes() and put a space between them. I figure this is something that should be relatively fast, even with very large amounts of data. The perfect solution would be a simple function that would replace Serializebytes(). Thanks in advance guys.

Any particular reason you are serializing to a string, rather than using a binary format of some kind? Given that you already seem to have a byte[] array ready to go, you could just spit that out directly, rather than passing it through an intermediate string representation. Writing a byte array to a Stream or file is trivial, and reading it back is just as easy. It will also be many times faster than converting to and from a string.
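To make the "spit that out directly" idea concrete, here is a minimal sketch using File.WriteAllBytes and File.ReadAllBytes. The class name, method names, and the file name "terrain.bin" are made up for illustration; they are not from the original engine.

```csharp
using System;
using System.IO;
using System.Text;

static class RawBytesExample
{
    // Writes the bytes to disk unchanged; no per-byte string conversion.
    public static void Save(string path, byte[] data)
    {
        File.WriteAllBytes(path, data);
    }

    // Reads the whole file back into a byte array.
    public static byte[] Load(string path)
    {
        return File.ReadAllBytes(path);
    }

    static void Main()
    {
        // "terrain.bin" is a hypothetical file name.
        string path = Path.Combine(Path.GetTempPath(), "terrain.bin");
        byte[] original = Encoding.ASCII.GetBytes("-25,4.48,-25:251,252,250");

        Save(path, original);
        byte[] restored = Load(path);

        // Decode only to show that the round trip preserved the content.
        Console.WriteLine(Encoding.ASCII.GetString(restored));
    }
}
```

Both calls handle the whole array in one operation, so the cost grows linearly with the size of the data.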


Alternatively, write each stringified value directly to the output stream, rather than building a large string and then writing that string.

Now that's interesting. The only reason I do it that way is because, like many, I'm a self-taught programmer and I've always done it this way in the past.

The reason I have this byte[] system is because I use String.Split() to unwrap the content of the file and fill my game state, and the file itself contains script that might contain the characters I use as delimiters. So if I wanted to minimally alter my code, could I write the byte[] directly to the file but still put my delimiter characters between each item I pass to my function?

The save file string itself is just a bunch of encoded characters with several delimiters throughout. This is what it looks like:
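One common way around the delimiter-collision problem is to drop delimiters and prefix each item with its length instead. This is a different technique from the String.Split() approach described above, shown only as a sketch; the class and method names are hypothetical, and ReadItems assumes a seekable stream.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class LengthPrefixedExample
{
    // Writes each item as a 4-byte length followed by its raw ASCII bytes.
    public static void WriteItems(Stream output, IEnumerable<string> items)
    {
        using (var writer = new BinaryWriter(output, Encoding.ASCII, leaveOpen: true))
        {
            foreach (string item in items)
            {
                byte[] payload = Encoding.ASCII.GetBytes(item);
                writer.Write(payload.Length);
                writer.Write(payload);
            }
        }
    }

    // Reads items back until the end of the stream (requires a seekable stream).
    public static List<string> ReadItems(Stream input)
    {
        var items = new List<string>();
        using (var reader = new BinaryReader(input, Encoding.ASCII, leaveOpen: true))
        {
            while (input.Position < input.Length)
            {
                int length = reader.ReadInt32();
                byte[] payload = reader.ReadBytes(length);
                items.Add(Encoding.ASCII.GetString(payload));
            }
        }
        return items;
    }

    static void Main()
    {
        var stream = new MemoryStream();
        // The second item contains characters that could clash with text delimiters.
        WriteItems(stream, new[] { "terrain", "script with : and , and spaces" });

        stream.Position = 0;
        foreach (string item in ReadItems(stream))
        {
            Console.WriteLine(item);
        }
    }
}
```

Because the reader knows exactly how many bytes belong to each item, the script text can contain any character without being encoded first. The trade-off is that the file is no longer comfortable to edit by hand.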

https://pastebin.com/ENUgGHrG

(it's also attached in the post if the link stops working)

What I like about this single "game state dump" solution is that my game is essentially a single file. I can open this file in my editor and modify everything from there, and even handle casual end user save/load by dumping the whole game state into a file and essentially creating a new game. It's really elegant, and since I don't expect my game to be more than a few MB (I have my own ultra low poly model format, and use most of what is provided by the C# form classes for the interface), I can't really see why it wouldn't work. I'd really like to keep it that way.

TestCopy2.txt

 

EDIT: I've tried your suggestion (I think) using BinaryWriter, but it's pretty useless in my use case: ".Write(Encoding.ASCII.GetBytes(string i want to encode and write))" just writes the string verbatim, which defeats the whole purpose of encoding it first.

 

EDIT2: I've got it working by using mellinoe's method. Now the loading system has the same problem. This is trickier because it's not reading the file that causes the problem, it's the actual operation to reverse the serialization. Here is the code. I'm having some serious issues with this.


static string Unserializebyte(string bytes)
        {
            string UnserializedBytes = "";
            string[] EncodedBytes = bytes.Split(' ');
            foreach (string charx in EncodedBytes.Take(EncodedBytes.Length-1))
            {

                try
                {

                int chary = Convert.ToInt32(charx);

                UnserializedBytes += (char)chary;

                }
                catch (Exception)
                {

                    //throw;
                }
            }
            return (UnserializedBytes);
        }

 

 

The problem is that constructing a large string is quite deadly for performance, so that's what you should try to avoid. I don't know C#, so I used Java, which I hope is close enough to get the idea across:


            byte[] bytes = {1, 2, 35, 8, 127};

            // Open a text file for writing
            BufferedWriter handle = new BufferedWriter(new FileWriter("output.txt"));

            // Convert each byte to a string, and write the string
            for (byte b : bytes) {
                String s = Byte.toString(b);
                handle.write(s);
                handle.write(' '); // Add a space after each number.
            }
            handle.write('\n'); // Add a newline at the end.
            handle.close(); // Close the file.

The code constructs a string for each value, and writes that string to a text file. This code makes many small strings, which is not a major problem.

 

And the file looks like this (as you'd expect):


1 2 35 8 127

 

I hope you can convert this to something C#-ish if you like the idea.
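A rough C# equivalent of the Java snippet above might look like this. It is only a sketch: the class and method names are invented for illustration, and StreamWriter stands in for Java's BufferedWriter.

```csharp
using System;
using System.IO;

static class StreamWriterExample
{
    // Writes each byte as a decimal number followed by a space, then a newline.
    public static void WriteBytesAsText(string path, byte[] bytes)
    {
        // StreamWriter buffers internally, so many small writes stay cheap.
        using (var writer = new StreamWriter(path))
        {
            foreach (byte b in bytes)
            {
                writer.Write(b.ToString()); // one short-lived string per value
                writer.Write(' ');          // separator after each number
            }
            writer.Write('\n');
        }
    }

    static void Main()
    {
        byte[] bytes = { 1, 2, 35, 8, 127 };
        WriteBytesAsText("output.txt", bytes);
    }
}
```

No large string is ever built here; each value goes to the writer as soon as it is formatted.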

Like @Alberth said, strings are going to be deadly for performance, especially if you are using them this way (to serialize individual bytes of a massive byte array). Your goal should be to have zero strings involved in your entire serialization pipeline (unless you are actually serializing a string...). If you are serializing mesh data (or terrain data, etc.), then there's no reason to have a string at any point. You're dealing with geometric data (presumably), not text.

Question: What does the actual deserialized data look like? Your engine doesn't deal with a string or with a byte array when it's rendering this terrain data, right? That is just another intermediate representation. Your engine must be dealing with something that contains an array of vertices or some similar data structure in order to draw the terrain/mesh. Instead of thinking about how you can read and write your intermediate string, you should be thinking about how you can most efficiently read and write the actual data you're interested in.
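As a sketch of what writing the actual data could look like: the Vertex struct below is purely a guess based on model lines such as "-25,4.48,-25:251,252,250" (three position values, then three colour values). The real in-engine layout is not shown in this thread, so treat every name here as hypothetical.

```csharp
using System;
using System.IO;

// Hypothetical vertex layout, guessed from the text model format.
struct Vertex
{
    public float X, Y, Z; // position
    public byte R, G, B;  // colour
}

static class VertexIoExample
{
    // Writes a vertex count followed by each vertex's fields in binary form.
    public static void Write(Stream output, Vertex[] vertices)
    {
        var writer = new BinaryWriter(output);
        writer.Write(vertices.Length);
        foreach (Vertex v in vertices)
        {
            writer.Write(v.X);
            writer.Write(v.Y);
            writer.Write(v.Z);
            writer.Write(v.R);
            writer.Write(v.G);
            writer.Write(v.B);
        }
        writer.Flush(); // leave the stream open for the caller
    }

    // Reads the fields back in exactly the order they were written.
    public static Vertex[] Read(Stream input)
    {
        var reader = new BinaryReader(input);
        var vertices = new Vertex[reader.ReadInt32()];
        for (int i = 0; i < vertices.Length; i++)
        {
            vertices[i].X = reader.ReadSingle();
            vertices[i].Y = reader.ReadSingle();
            vertices[i].Z = reader.ReadSingle();
            vertices[i].R = reader.ReadByte();
            vertices[i].G = reader.ReadByte();
            vertices[i].B = reader.ReadByte();
        }
        return vertices;
    }

    static void Main()
    {
        var vertices = new[] { new Vertex { X = -25f, Y = 4.48f, Z = -25f, R = 251, G = 252, B = 250 } };
        var stream = new MemoryStream();
        Write(stream, vertices);

        stream.Position = 0;
        Vertex[] restored = Read(stream);
        Console.WriteLine(restored.Length);
    }
}
```

With this layout each vertex takes 15 bytes on disk, and no parsing or string handling is needed on load.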

8 hours ago, FFA702 said:

 



static string Unserializebyte(string bytes)
        {
            string UnserializedBytes = "";
            string[] EncodedBytes = bytes.Split(' ');
            foreach (string charx in EncodedBytes.Take(EncodedBytes.Length-1))
            {

                try
                {

                int chary = Convert.ToInt32(charx);

                UnserializedBytes += (char)chary;

                }
                catch (Exception)
                {

                    //throw;
                }
            }
            return (UnserializedBytes);
        }

 

 

I think I speak for everyone here when I say:  "What the F*** are you even doing?"

  • Never use += with strings like that.  Just don't.  You're performing N^2 work for no reason and that's why it's horrifically slow. Use a StringBuilder.
  • I really doubt casting an int to a char is doing what you think it's doing.
  • Do you know how to use a debugger?  It should be obvious that something's wrong while you step through that.
  • Seriously, why are you using strings?  Do you LIKE your code being slow and broken?
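For reference, here is a minimal sketch of what the StringBuilder suggestion could look like if the text format is kept as it is. The class and method names are hypothetical. Unlike the original Unserializebyte, this version throws on a malformed token instead of silently skipping it.

```csharp
using System;
using System.Globalization;
using System.Text;

static class StringBuilderExample
{
    // Same output format as Serializebytes: each byte as a decimal number plus a space.
    public static string SerializeBytes(byte[] bytes)
    {
        // Reserve room for up to three digits and one separator per byte.
        var builder = new StringBuilder(bytes.Length * 4);
        foreach (byte b in bytes)
        {
            builder.Append(b).Append(' ');
        }
        return builder.ToString();
    }

    // Reverses SerializeBytes; empty tokens (such as the trailing one) are ignored.
    public static string UnserializeBytes(string text)
    {
        string[] tokens = text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var builder = new StringBuilder(tokens.Length);
        foreach (string token in tokens)
        {
            builder.Append((char)byte.Parse(token, CultureInfo.InvariantCulture));
        }
        return builder.ToString();
    }

    static void Main()
    {
        string encoded = SerializeBytes(Encoding.ASCII.GetBytes("Hi"));
        Console.WriteLine(encoded);
        Console.WriteLine(UnserializeBytes(encoded));
    }
}
```

StringBuilder appends into a growing buffer, so the work is linear in the number of bytes, whereas += copies the whole string on every iteration.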

Also, why are you saving out terrain information unless it's being changed by the player? And if it is, just save the deltas, not the whole terrain. Also, unless you're writing out a character name, this should all be done in binary, as everyone else has suggested. And as mellinoe asked, all your model data is held in memory not as strings but as bytes, correct? Then just write them out as bytes; don't do this bytes->(save)string->(load)string->bytes conversion that is horrifically slow.


9 hours ago, Nypyren said:

I think I speak for everyone here when I say:  "What the F*** are you even doing?"

(emphasis added)

You certainly do not speak for me. Especially given that this is the Beginners forum, a bit more respect and leniency would go a long way.

If we take away the hostility/rudeness from your list of points, then I would agree with them a lot more -- even more so if some explanations were added in order to help people understand and learn.


I don't agree that using binary data file formats is better. Yes, the input/output code is simpler to write, you don't have the data -> string -> textfile conversion (and vice versa on loading), and binary data is harder to inspect and modify for the user.

On the other hand, debugging from a binary data file is more complicated, since you cannot simply open the file and read what it says. You also cannot easily edit it, or author new files (e.g. new levels for your game). I think that's a lot to give up for gaining simplicity of writing input/output, especially in non-commercial games.

This topic is closed to new replies.
