
Plan for scaling a game server using MongoDB?

Started October 23, 2017 07:45 AM
15 comments, last by hplus0603 7 years ago
2 hours ago, piojo said:

Postgres can run on Aurora

No, Aurora is a separate database entirely. You can run Postgres or Mongo on Amazon Web Services, and Aurora is a third option, one that Amazon created and specialises in.

The main benefit of Postgres is that it's an old and well-known piece of software, with many people who fully understand it and a development team that has worked hard to make it reliable, robust, and standards-compliant. Mongo, by comparison, is a very new piece of software, designed for a typical web use case of frequent reads and infrequent writes, where JSON is the desired output format and where stale data is not a problem. (I have no opinion on Aurora.) I brought up Postgres specifically because it can meet your 'schemaless' use case via JSONB columns, but is also able (when properly configured) to give you strong guarantees against stale data, even across multiple instances. (But it is NOT trivial. https://www.postgresql.org/docs/current/static/high-availability.html)
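As a concrete sketch of the JSONB approach (using psycopg2; the table, column, and field names here are invented for the example, not anything piojo described):

    import json
    import psycopg2

    conn = psycopg2.connect("dbname=game")  # connection string is an assumption
    cur = conn.cursor()

    # One ordinary key column, plus a schemaless JSONB document per player.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS players (
            player_id BIGINT PRIMARY KEY,
            data      JSONB NOT NULL
        )
    """)

    # Write the whole player blob, document-store style.
    cur.execute(
        "INSERT INTO players (player_id, data) VALUES (%s, %s) "
        "ON CONFLICT (player_id) DO UPDATE SET data = EXCLUDED.data",
        (42, json.dumps({"name": "piojo", "level": 21, "inventory": ["chainmail"]})),
    )

    # But you can still query inside the document with JSONB operators.
    cur.execute("SELECT data->>'name' FROM players WHERE (data->>'level')::int >= 20")
    print(cur.fetchall())
    conn.commit()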

The relative speed of SQL and so-called 'NoSQL' databases is not something that can be compared in a vacuum. MongoDB will usually give you better read speed with its defaults than Postgres might for similar information, but Postgres won't return out-of-date values if you update during your speed test! And you can adjust both databases to be more or less lenient on this aspect, so it only makes sense to compare speed once you have set them up for equivalent levels of consistency.
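To make "more or less lenient" concrete, this is the sort of knob involved on the Mongo side (a pymongo sketch; as of this writing Mongo's default write concern is w=1, i.e. acknowledged by the primary only):

    from pymongo import MongoClient, WriteConcern

    client = MongoClient("mongodb://localhost:27017")  # address is an assumption
    db = client.game

    # Fast and lenient: acknowledged by the primary only (the default).
    fast = db.get_collection("players", write_concern=WriteConcern(w=1))

    # Slower and safer: wait until a majority of the replica set has the
    # write, and it has been journaled, before acknowledging.
    safe = db.get_collection(
        "players", write_concern=WriteConcern(w="majority", j=True)
    )

    fast.update_one({"_id": 42}, {"$set": {"gold": 100}}, upsert=True)
    safe.update_one({"_id": 42}, {"$set": {"gold": 100}}, upsert=True)

Benchmarking the first of those against a Postgres instance configured for synchronous replication is exactly the apples-to-oranges comparison being warned about here.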

Aurora has a Postgres-compatible (mostly) API, just like it also has a MySQL-compatible (mostly) API.

The reason I bring up relational databases as an alternative to MongoDB is that I've found that NoSQL solves some particular use cases very well, but runs into a brick wall on other use cases, whereas the relational databases are a lot more flexible, and perform well overall, without excelling at any one thing.

That being said, if your use case is "get blob by key" and "write blob by key," then any database should be able to support that just fine, be it Mongo or Postgres or Scylla. Where they differ is in how you manage them, how you back them up, and how you grow use cases or deal with data that might be written in an older format, as well as which safety-versus-performance trade-offs you can make. Mongo aggressively lets you optimize performance, throwing safety to the wind, and there are use cases where that is fine. The relational databases lean in the other direction.
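The "blob by key" contract really is just two operations, whichever store you pick; a minimal pymongo sketch (collection and field names are made up):

    from pymongo import MongoClient

    players = MongoClient()["game"]["players"]

    def write_blob(key, blob):
        # Replace the whole document for this key, creating it if missing.
        players.replace_one({"_id": key}, {"_id": key, "blob": blob}, upsert=True)

    def read_blob(key):
        doc = players.find_one({"_id": key})
        return doc["blob"] if doc else None

The Postgres equivalent is a two-column table and an INSERT ... ON CONFLICT, as sketched earlier; the interesting differences are the operational ones listed above.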

 

enum Bool { True, False, FileNotFound };

It really is a good idea to start off with your data model first, normalized/denormalized. Run through creating/updating scenarios, and identify critical areas. If strong consistency and complex joins are critical to your game, then you should go the way of RDBMS. To my knowledge, social games that do not require active player-player interactions (e.g. Clash of Clans) can safely use NoSQL. I have seen people put the entire player data into a big JSON object and just send that object back and forth between server/client. MongoDB or other key-value NoSQL databases are okay with this type of usage.

Once you start having more complex relationships between players, or you need to do joins based on certain player data, like "give item X to all players level 20 and above that only have a chainmail in the inventory and nothing better", it sounds like you are going in the direction of RDBMS. Although there are techniques you can use with key-value databases to accomplish the same task, they are typically more like hacks than actual solutions.
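For comparison, a rule like that is a single statement in an RDBMS. A sketch against a normalized schema invented purely for this example (players, inventory, and item_tiers tables):

    import psycopg2

    conn = psycopg2.connect("dbname=game")  # connection string is an assumption
    cur = conn.cursor()

    # Give 'item_x' to every player at level 20+ whose best piece of
    # armour (the highest tier they own) is a chainmail.
    cur.execute("""
        INSERT INTO inventory (player_id, item)
        SELECT p.player_id, 'item_x'
        FROM players p
        WHERE p.level >= 20
          AND 'chainmail' = (
              SELECT i.item
              FROM inventory i
              JOIN item_tiers t ON t.item = i.item
              WHERE i.player_id = p.player_id AND t.slot = 'armour'
              ORDER BY t.tier DESC
              LIMIT 1
          )
    """)
    conn.commit()

Doing the same against one-big-JSON-blob-per-player documents means reading every player, inspecting inventories in application code, and writing each one back.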

7 hours ago, alnite said:

It really is a good idea to start off with your data model first, normalized/denormalized

That's the sort of thing a database analyst would say. ;)

I've never worked on a game that had any idea what its data structure would end up looking like even a few months down the line, never mind at release time. Of course a degree of planning can give you the basic outline and should indicate some key objects but schema and data migrations are going to happen, regularly. Planning for migration is sensible and having worked with both relational and document-style DBs for this, I can confidently say the latter is less hassle as a developer, even if DBAs hate you for it.

9 hours ago, Kylotan said:

That's the sort of thing a database analyst would say.

I've never worked on a game that had any idea what its data structure would end up looking like even a few months down the line, never mind at release time. Of course a degree of planning can give you the basic outline and should indicate some key objects but schema and data migrations are going to happen, regularly. Planning for migration is sensible and having worked with both relational and document-style DBs for this, I can confidently say the latter is less hassle as a developer, even if DBAs hate you for it.

Ha. I am not a DBA, nor do I want to be one. I prefer schemaless document-style DBs personally, and I am also in the middle of researching a good reliable NoSQL database for my own project. Unfortunately, every single one of them seems to be designed for specific cases, and using one for other purposes can lead to headaches down the road. That's why I recommend starting off with the data model and how the data is going to be used, at least some rough idea, because that can narrow it down. Something as simple as knowing the maximum number of items in a player's inventory, or whether you allow trades between players, can determine whether you want to normalize or denormalize your data.

Then hardware requirements can matter. NoSQL databases tend to have higher hardware requirements. They can claim 100M+ reads/writes per minute or whatever, but if they achieve that with a cluster of high-end machines, then it's not really a number to rely on when you are on a budget. Team size can also affect the decision. I don't quite like Cassandra because it seems to be a high-maintenance type of DB, the kind you'd need a designated DBA for. It's highly configurable, so it most likely suits all kinds of needs, but that also means someone needs to know all about it.

I am pretty sure piojo should be okay with MongoDB, or even Postgres if migration isn't much of a concern.

Quote

Planning for migration is sensible

It is more than sensible, it is essential!

If you use a document store, and store player data as a big JSON document, at least put a "version number" in your JSON blob.

That way, you can detect whether data that you read is of an older format, or the current format.

Then write code that converts data from version 1 to version 2, from version 2 to version 3, .... and call the appropriate sequence of conversion functions when you read the player.
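A sketch of that conversion chain (the field changes inside each step are invented; the chaining pattern is the point):

    CURRENT_VERSION = 3

    def v1_to_v2(doc):
        # Example step: "gold" used to live at the top level, now under "wallet".
        doc["wallet"] = {"gold": doc.pop("gold", 0)}
        return doc

    def v2_to_v3(doc):
        # Example step: inventory went from a list of names to a list of dicts.
        doc["inventory"] = [{"item": name} for name in doc.get("inventory", [])]
        return doc

    MIGRATIONS = {1: v1_to_v2, 2: v2_to_v3}

    def upgrade(doc):
        # Documents written before versioning was added count as version 1.
        version = doc.get("version", 1)
        while version < CURRENT_VERSION:
            doc = MIGRATIONS[version](doc)
            version += 1
            doc["version"] = version
        return doc

Run upgrade() in the data access layer, on every read, so the rest of the application only ever sees the current format.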

It sounds easy, and it is, if you do this up front. If you forget about it, you will regret it. You end up hacking in all kinds of special cases in the application for "when this player doesn't have a key named A, do X, else if the key named B is not null then do Y, else ...". Once you've gone down the path of accepting non-canonical data in the application layer (as opposed to the data access layer) then you've essentially already lost, you just don't know it yet.

The other thing to do is to put a write generation count into each document. When you read it, you get generation count C. When you write it back, the data layer can verify that generation is still C, and write it back as generation C+1. If the generation of the data on write is not C, then you return an error that includes the new generation of the data, and let the application level do the merge and re-try writing. This will scale very far, at the expense of suffering unpredictable write latencies for hotly contested keys. (It won't live-lock, because it's guaranteed to actually make progress for at least one writer at all times.)
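A sketch of that generation check, expressed here as a MongoDB conditional update; any store with an atomic compare-on-write can do the same (field names are made up, and the document is assumed to already exist with a "gen" field):

    from pymongo import MongoClient

    players = MongoClient()["game"]["players"]

    def checked_write(key, new_data, expected_gen):
        # Succeeds only if nobody has written since we read generation C.
        # The filter-and-update pair is atomic for a single document.
        result = players.update_one(
            {"_id": key, "gen": expected_gen},
            {"$set": {"data": new_data, "gen": expected_gen + 1}},
        )
        if result.modified_count == 1:
            return None  # success: the document is now generation C+1
        # Conflict: report the current generation so the caller can
        # re-read, merge, and retry.
        current = players.find_one({"_id": key}, {"gen": 1})
        return current["gen"]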

enum Bool { True, False, FileNotFound };

