
Improving persistent data consistency

Started June 08, 2017 04:29 PM · 8 comments, last by samoth

Hello people. I'll start off by explaining my current persistent data model and then ask for some advice regarding it.

Currently I'm using a "single file per entity" kind of model to store persistent data: a single file for each account, which includes account information, player information, item information belonging to each player, etc. The same goes for guilds and other separate entities. I load an account when someone attempts to log in and unload it when a backup is taken. Guilds and similar entities I load once and never unload.

This works fine for the most part, but when the server crashes for whatever reason I lose all data that changed since the last backup. So my question is: can I adjust my system so that even if the server crashes I can recover unsaved data?

If your data is in memory, it disappears when your process ends, so there's no way to save that memory after that point.

If you set up a crash handler, then you get a chance to run some code when the process crashes, and save the data before the process ends. But this is a bad idea because a crash implies that your data is not coherent in some way. It might mean you save corrupt data, making things worse than just not saving at all.

So the only really robust way to avoid losing data - apart from removing all crash bugs, and installing an uninterruptible power supply - is to ensure that the difference between what's on disk and what's in memory is minimised. This means writing your changes to disk more frequently.

This is less practical if your only method of saving data is to write out a whole file for everything about an entity. That was pretty standard for MUDs back in 1993, but the world has moved on since then. There are various ways to improve on that model:

  • Ensure you're not writing out immutable data, which is often shared across entities and can be stored separately.
  • Write only what changes, when it changes, or as soon afterwards as possible. This is easier if you split the entity data into multiple files (see the sketch after this list), or...
  • Use a database. Databases are designed to give you granular access to data and to let you only write exactly what you want to write.
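
For example, here is one way the second point could look with per-component files. This is only a sketch; the directory layout and component names are made up for illustration:

    #include <stdio.h>

    /* One file per component, so an inventory change rewrites only that
       component's file instead of the whole account. */
    int save_component(const char *account_dir, const char *component,
                       const void *data, size_t len)
    {
        char path[1024];
        snprintf(path, sizeof path, "%s/%s.dat", account_dir, component);
        FILE *f = fopen(path, "wb");   /* real code would use a safe save */
        if (!f) return -1;
        size_t written = fwrite(data, 1, len, f);
        fclose(f);
        return written == len ? 0 : -1;
    }

    /* e.g. save_component("accounts/123", "inventory", buf, buf_len); */
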
File-based persistence is simple, but it has a few limitations. Think through these questions to see whether you can live with them:
Does the Guild entity store the members of the guild?
If so, what if two people join the guild at the same time; will you have two separate processes/threads trying to write the same file?
Also, if your system is ever popular enough that you need more than one server, how do you think file-based persistence will scale?

To move away from file-based persistence easily:
You could change your persistence to a database very simply: create a table that has a "filename" key and a "data" blob, and store/update the contents of a file in a transaction.
The benefit of this is that databases can support locking transactions, and multiple servers can talk to one database.
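
For illustration, here is a minimal sketch of that one-table approach using SQLite's C API (any server database works the same way; the table and function names here are made up):

    #include <sqlite3.h>

    /* One row per former file: the old filename becomes the key, the old
       file contents become the blob. Run once at startup. */
    int init_store(sqlite3 *db)
    {
        return sqlite3_exec(db,
            "CREATE TABLE IF NOT EXISTS files ("
            "  filename TEXT PRIMARY KEY,"
            "  data     BLOB NOT NULL)",
            0, 0, 0);
    }

    /* Store or update one entity's data; a single statement is atomic. */
    int store_file(sqlite3 *db, const char *filename, const void *data, int len)
    {
        sqlite3_stmt *stmt;
        const char *sql =
            "INSERT INTO files(filename, data) VALUES(?1, ?2) "
            "ON CONFLICT(filename) DO UPDATE SET data = excluded.data";
        if (sqlite3_prepare_v2(db, sql, -1, &stmt, 0) != SQLITE_OK)
            return -1;
        sqlite3_bind_text(stmt, 1, filename, -1, SQLITE_STATIC);
        sqlite3_bind_blob(stmt, 2, data, len, SQLITE_STATIC);
        int rc = sqlite3_step(stmt);
        sqlite3_finalize(stmt);
        return rc == SQLITE_DONE ? 0 : -1;
    }
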

The other problem you seem to be having is that you don't commit changes to disk other than "as backup."
When any important change happens (a member joins a guild, players trade, loot is awarded, etc.), those changes need to be committed to disk.
For file based persistence, this means re-storing the file on disk.
Thus, you need some kind of queue of "entities that need storage" and you need some kind of process/thread that takes items from this queue, stores them to disk, and repeats forever.
The art of doing this is to balance "which kinds of changes are important" with "how many writes overall can I actually do on my system?"
Thus, players moving around, or chatting, should not generate storage requests, but more important events should.
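
Here is a minimal sketch of such a writer thread, assuming POSIX threads; the Entity type and save_entity() are placeholders for your real types and your real disk write:

    #include <pthread.h>

    /* Placeholder entity and save function -- real code would do the
       actual (safe) disk write here. */
    typedef struct { const char *name; } Entity;
    static void save_entity(Entity *e) { (void)e; /* write to disk */ }

    #define QUEUE_MAX 1024
    static Entity *queue[QUEUE_MAX];
    static int head, tail;   /* sketch: ignores queue overflow */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;

    /* Game code calls this for important changes only (trades, loot,
       guild joins) -- not for movement or chat. */
    void mark_dirty(Entity *e)
    {
        pthread_mutex_lock(&lock);
        queue[tail] = e;
        tail = (tail + 1) % QUEUE_MAX;
        pthread_cond_signal(&nonempty);
        pthread_mutex_unlock(&lock);
    }

    /* Writer thread: take entities off the queue, store them, repeat forever. */
    void *writer_thread(void *arg)
    {
        (void)arg;
        for (;;) {
            pthread_mutex_lock(&lock);
            while (head == tail)
                pthread_cond_wait(&nonempty, &lock);
            Entity *e = queue[head];
            head = (head + 1) % QUEUE_MAX;
            pthread_mutex_unlock(&lock);
            save_entity(e);   /* do the slow I/O outside the lock */
        }
    }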

Finally, I hope that you use "safe save" for storing new copies on the file system.
This means that, to "overwrite" file A, you create file A.tmp, write all the new version data to A.tmp, then close and flush A.tmp, then run link(A, A.old), rename(A.tmp, A), unlink(A.old) to actually "commit" the change.
This way, if you crash in the middle of storage, you will not be left without a working file (either the old or the new one).
When starting up, you should unlink all files named .tmp or .old, to clean up from any potential previous crash.
Also make sure you understand the difference between fsync() and fdatasync().
(I'm assuming Linux here; if you're using Windows, there are equivalent concepts except rename() doesn't allow overwriting existing files, so you need to add another step in the link/rename safe save algorithm.)
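
A sketch of that sequence in C, assuming POSIX; error handling is abbreviated:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Atomically replace 'path' with new contents ("safe save"). */
    int safe_save(const char *path, const void *data, size_t len)
    {
        char tmp[1024], old[1024];
        snprintf(tmp, sizeof tmp, "%s.tmp", path);
        snprintf(old, sizeof old, "%s.old", path);

        int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) return -1;
        if (write(fd, data, len) != (ssize_t)len || fsync(fd) < 0) {
            close(fd);
            return -1;
        }
        close(fd);

        (void)unlink(old);               /* leftover from an earlier save */
        if (link(path, old) < 0 && access(path, F_OK) == 0)
            return -1;                   /* keep the old version reachable */
        if (rename(tmp, path) < 0)       /* the atomic "commit" */
            return -1;
        (void)unlink(old);
        /* For full durability you would also fsync the containing directory. */
        return 0;
    }
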
enum Bool { True, False, FileNotFound };

Use a database. Databases are designed to give you granular access to data and to let you only write exactly what you want to write.

I've been considering using a database but I'm afraid of the initial workload it could bring. File-based storage is really convenient for me during development, but a database might be inevitable due to the limitations pointed out by hplus0603.

Does the Guild entity store the members of the guild?
If so, what if two people join the guild at the same time; will you have two separate processes/threads trying to write the same file?
Also, if your system is ever popular enough that you need more than one server, how do you think file-based persistence will scale?

As you mentioned, I don't commit anything to disk other than when taking backups, so I don't actually write to a guild entity file when someone joins a guild until the next scheduled backup.

To move away from file-based persistence easily:
You could change your persistence to a database very simply: create a table that has a "filename" key and a "data" blob, and store/update the contents of a file in a transaction.
The benefit of this is that databases can support locking transactions, and multiple servers can talk to one database.

I will see if I can move onto using a database. My last attempt wasn't very successful but that was mainly due to me using them improperly.

The other problem you seem to be having is that you don't commit changes to disk other than "as backup."
When any important change happens (a member joins a guild, players trade, loot is awarded, etc.), those changes need to be committed to disk.
For file based persistence, this means re-storing the file on disk.
Thus, you need some kind of queue of "entities that need storage" and you need some kind of process/thread that takes items from this queue, stores them to disk, and repeats forever.
The art of doing this is to balance "which kinds of changes are important" with "how many writes overall can I actually do on my system?"
Thus, players moving around, or chatting, should not generate storage requests, but more important events should.

This is actually something I'm not sure about. With file-based storage I feel the need to either save ALL game data in one successful attempt or none at all. For example, when 2 players trade an item I need to re-write 2 files; if something goes wrong after re-writing the first file, I'd end up with a duplicate of the same item. That's probably another thing that would be solved automatically by moving to database storage.

Finally, I hope that you use "safe save" for storing new copies on the file system.
This means that, to "overwrite" file A, you create file A.tmp, write all the new version data to A.tmp, then close and flush A.tmp, then run link(A, A.old), rename(A.tmp, A), unlink(A.old) to actually "commit" the change.
This way, if you crash in the middle of storage, you will not be left without a working file (either the old or the new one).
When starting up, you should unlink all files named .tmp or .old, to clean up from any potential previous crash.
Also make sure you understand the difference between fsync() and fdatasync().
(I'm assuming Linux here; if you're using Windows, there are equivalent concepts except rename() doesn't allow overwriting existing files, so you need to add another step in the link/rename safe save algorithm.)

That part I believe I didn't mess up. I take a backup to a separate folder first, then copy that backup to a dummy folder, and finally swap it with the live server folder.

when 2 players trade an item, I would need to re-write 2 files


Yes! This is why database transactions are a thing :-)
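
For illustration, here is a minimal sketch of such a transaction with SQLite's C API. The items table is made up, and real code would bind parameters rather than formatting them into the SQL:

    #include <sqlite3.h>
    #include <stdio.h>

    /* Move one item between players atomically. Hypothetical table:
       items(id INTEGER PRIMARY KEY, owner INTEGER). */
    int trade_item(sqlite3 *db, long item_id, long from, long to)
    {
        char sql[160];
        if (sqlite3_exec(db, "BEGIN IMMEDIATE", 0, 0, 0) != SQLITE_OK)
            return -1;
        snprintf(sql, sizeof sql,
                 "UPDATE items SET owner = %ld WHERE id = %ld AND owner = %ld",
                 to, item_id, from);
        if (sqlite3_exec(db, sql, 0, 0, 0) != SQLITE_OK ||
            sqlite3_changes(db) != 1) {
            sqlite3_exec(db, "ROLLBACK", 0, 0, 0);  /* nothing changed hands */
            return -1;
        }
        /* Both sides of the trade (and anything else done inside the same
           transaction) hit the disk together -- or not at all. */
        return sqlite3_exec(db, "COMMIT", 0, 0, 0) == SQLITE_OK ? 0 : -1;
    }
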
enum Bool { True, False, FileNotFound };

There's one last thing I'd like to ask before I delve into attempting the move to database storage. Should I keep storing data as binary blobs or structured tables (a different table for account information, player information, item information, etc.)? Binary blobs seem simpler, but I would need to serialize the whole account's data every time I want to make a relatively small commit like an item deletion.


This might be a stupid idea, but you can set up a contingency server. The idea here is that the main server sends data to disk periodically and also sends data to its contingency process/server. If we're going with a backup server, then the data drives must be located elsewhere.

The backup server just receives duplicates of the data that the main server gets. If the backup stops receiving packets from the main server, then all users are immediately rerouted to the backup server. Data is immediately saved, and users are safely disconnected after being informed of the problem.

Should I keep storing data as binary blobs or structured tables


That question has no right answer. Some people swear by one option; some people swear by the other; successful games of large scale have been shipped on either.

A third option is to store entities as "documents" in a "document store" that lets you edit parts of the document -- this typically maps to databases like Cassandra / Scylla / Riak / Amazon SimpleDB and so forth.
enum Bool { True, False, FileNotFound };

I want to add my support for the idea of using a document store. It's a good compromise approach, letting you keep a good degree of control over how you load and save things, without losing all the benefits that a database brings you. You can start off with a blob-like approach and migrate to having more explicit fields later, if you like.
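
As a hedged sketch of that migration path, assuming SQLite (table and column names are made up): start blob-first, then promote fields you need to query into real, indexable columns:

    /* v1: every entity is an opaque document. */
    static const char *schema_v1 =
        "CREATE TABLE IF NOT EXISTS entities ("
        "  id  TEXT PRIMARY KEY,"
        "  doc BLOB NOT NULL)";

    /* v2, later: promote a field you now want to query (say, for a
       highscore board) into a real, indexed column, backfilled once
       from the documents. Both strings run via sqlite3_exec(). */
    static const char *schema_v2 =
        "ALTER TABLE entities ADD COLUMN score INTEGER;"
        "CREATE INDEX IF NOT EXISTS entities_score ON entities(score);";
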

If power failure or outright hardware failure is not an issue (and oh boy... believe me, it is!), then you can protect against the "data loss because the process died" problem rather easily.

Note, however, that a runaway process can still very easily trash your entire dataset even without aborting (runaway code could, e.g., just overwrite random memory locations!). If you haven't saved your data somewhere, you're in trouble.

Assuming something POSIX-like here, but you can do this on any other system too. Spawn a "launcher" process that creates a large shared mapping, then have the launcher fork/exec the actual server. Whenever waitpid tells the launcher that the server exited in a non-clean way (WIFEXITED(status) == false), it fork/execs again right away, restarting the server (which reads its data from the same still-existing mapping). Otherwise it's a regular server exit: the launcher writes data to disk and exits. Note that if you make the mapping file-based (not anonymous), you can skip the write-to-disk step; the OS will do it for you (just hope power doesn't fail half-way!).

The launcher/watchdog and the actual server can even be in the same executable. In that case you only need to fork and can skip the execve; just call server_main_loop() if fork returned zero (i.e. you're in the child process). The restart logic should be simple enough that you can guarantee with 100% confidence that no bugs/failures will happen in there.
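
A minimal sketch of that single-executable variant, assuming POSIX (the file name and mapping size are made up, and server_main_loop is a placeholder):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define STATE_SIZE (64 * 1024 * 1024)   /* 64 MiB of shared game state */

    /* Placeholder for the real server -- this one just exits cleanly. */
    static void server_main_loop(void *state) { (void)state; }

    int main(void)
    {
        /* File-backed shared mapping: it survives a crash of the child,
           and the OS writes it back to the file over time. */
        int fd = open("gamestate.bin", O_RDWR | O_CREAT, 0600);
        if (fd < 0 || ftruncate(fd, STATE_SIZE) < 0) return 1;
        void *state = mmap(0, STATE_SIZE, PROT_READ | PROT_WRITE,
                           MAP_SHARED, fd, 0);
        if (state == MAP_FAILED) return 1;

        for (;;) {
            pid_t pid = fork();
            if (pid == 0) {                 /* child: the actual server */
                server_main_loop(state);
                _exit(0);                   /* clean exit */
            }
            int status;
            if (waitpid(pid, &status, 0) < 0) return 1;
            if (WIFEXITED(status)) break;   /* regular server exit */
            fprintf(stderr, "server died, restarting with live state\n");
        }
        msync(state, STATE_SIZE, MS_SYNC);  /* final flush to disk */
        return 0;
    }
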

But remember: there are still power failures, and there are hardware failures. If "all data since yesterday's save point is gone" is not acceptable in such a case, you should really, really, really consider using a database, with transactions where they matter, and with at least eventual consistency.

This depends a lot on the situation. Some things absolutely must be transacted, but not everything needs to be, and usually the entire world doesn't have to be consistent at all times. Some things change 10 times per second (think hit points), others once every few seconds (picking up gold/items, trading with another player), others maybe twice per day, once per week, or less often (think guild membership, or achievements). Not all of them are equally important, or need to be equally consistent over a transaction or within the world state.

Exactly which tool you should use is therefore difficult to say. Most people will use more than a single tool, because no one tool serves everything well enough.

Storing blobs or "documents" in some kind of key-value store is usually much faster, with much higher transaction counts. On the other hand, stock SQL databases are sufficiently fast for some operations, they offer the ability to run analyses (if one day you're inclined to do that!) which are all but impossible otherwise, and they are very well-suited to some tasks, making your life a lot happier.

Decide to make a highscore board a year later? With a stock SQL database, you replicate to a slave (which takes about 3 commands to set up) and there you run a SELECT * FROM characters ORDER BY score DESC LIMIT 20 or something similar on the replicated data, and there you go. That's it. Want to do the same thing with data stored in binary blobs or JSON documents? Here's a rope, go hang yourself.

Need to do 50,000 transactions per second? Well, good luck trying with a SQL database. For a key-value store that's not much of a challenge.

