Advertisement

Database Structure for MMO's

Started by June 20, 2017 05:30 PM
28 comments, last by snacktime 7 years, 1 month ago

SQL like database for online game - is not the best idea. I'd recommend to use NoSQL Schema-less db.

For example, we are using https://rethinkdb.com/. You can try it, or Riak, or something similar. The main advantage of it - you actually don't cate in general about data structure.

If you feel that RDBMSes are not a good idea for an online game, what data and experience has led to that?

There are tons of successful games that use RDBMSes correctly, and get great performance. There are tons of instances where schema-less key/value stores end up costing more than they help, because you can't easily use secondary indices (unless you turn your "schema-less" storage into a poor man's "pseudo-schema-ed" storage.)

And vice versa -- use a RDBMS incorrectly, and you will be slow.

Regarding storing acquired skills as a bitfield (didn't see that two months ago) -- if your table is (characterId not null, skillArchery default null, skillSneaking null, skillSmithing null, ...) then most RDBMS-es will turn the NULL columns into a bit field under the covers, and you still get to marshal/de-marshal as if it were a full struct. You get "null" back for the skills you haven't unlocked, and non-null values for the skills you have a rank in. Seems like a great match to me, especially for a first iteration.

 

enum Bool { True, False, FileNotFound };
Advertisement
1 hour ago, hplus0603 said:

If you feel that RDBMSes are not a good idea for an online game, what data and experience has led to that?

There are tons of successful games that use RDBMSes correctly, and get great performance. There are tons of instances where schema-less key/value stores end up costing more than they help, because you can't easily use secondary indices (unless you turn your "schema-less" storage into a poor man's "pseudo-schema-ed" storage.)

And vice versa -- use a RDBMS incorrectly, and you will be slow.

 

With SQL db you spend a lot of time and resources when your schema changes. That's my point. I'm not sure about performance. It depends on cluster size.

Btw, RethinkDB supports secondary indexes (:

 

With SQL db you spend a lot of time and resources when your schema changes.

Do you have specific experience with this that you can share?

With schemaless implementations, you instead spend a lot of time debugging bugs that happen because some code puts in data in a format different than what some other code reading the data expects. The bigger your game gets, the more likely those bugs are. I'd rather take the small cost of schema updates than the lack of type checking and "have to be able to read any old version of data" problems of schemaless. That's based on specific experience I have with both, so yours may be different :-)

 

.

enum Bool { True, False, FileNotFound };
6 hours ago, hplus0603 said:

 

 

Do you have specific experience with this that you can share?

With schemaless implementations, you instead spend a lot of time debugging bugs that happen because some code puts in data in a format different than what some other code reading the data expects. The bigger your game gets, the more likely those bugs are. I'd rather take the small cost of schema updates than the lack of type checking and "have to be able to read any old version of data" problems of schemaless. That's based on specific experience I have with both, so yours may be different :-)

 

.

I've worked with SQL in CRM field. Usually data structure in this field doesn't change a lot.

Yeh, I understand. For us...We have chosen NoSQL 'cause we can iterate and prototype faster.

You're hitting the nail on the head: Untyped, dynamic systems are great for smaller systems and prototyping, where getting to "can this work" and trying out ideas quickly is more important than operational concerns and long-term maintenance.

Typed, static systems typically require a bit more lifting to get started in, and the benefits don't pay off until the system grows, and you're starting to maintain it and having to live with data and decisions made months and years ago, and having to understand how dozens or hundreds of modules may interact.

This is true for programming languages, for databases, for network protocols, and most other design areas in computer science.

For a start-up with a limited amount of money, looking for product/market fit, the ability to try lots of things quickly is more important than the cost of maintenance, and the code is by definition small. For an established organization, that knows who the customers are, and know what the product is and will continue to grow into, the trade-off is different. (For companies that succeed, there's this hard, harrowing time in the middle, where the company has to transition from soft/dynamic to hard/static methods, which is another topic in and of it's own :-)

enum Bool { True, False, FileNotFound };
Advertisement
On 6/21/2017 at 1:43 AM, hplus0603 said:
Quote

The first solution of column-per-skill seems pretty far from 1st normalized form as I understand it

 

It's totally normal, if you consider a character as being an entity with a well-defined set of skills.

Yes, formally it is all in the definition of the "domain" used in definition of 1NF; however - you will still be pummeled really hard by SQL folks for doing it ;-) (and if, as it was noted somewhere in this thread, there are like 100 skills per player - I won't really blame them too).

Overall, personally I'd rather stay away from your approach #1 (column per skill), as writing SELECT and bindings for 100 columns is ugly, and going above 256 columns can get into strange troubles depending on RDBMS, etc. etc. As for #2 (classical relational JOIN) - it is not bad, but from what I've seen, SQL operations on such items (such as average value of the skill you mentioned) are extremely rare, so it tends to lead to unnecessary overstructuring (which has its own costs, both development-wise and runtime-wise). As a result, my order of preference would probably be #3 (BLOB) - though the same SQL folks will beat me too ;-), then #2 (JOIN), then #1 (column per skill).

It's interesting that this stands even for large organizations (I have seen a migration from JOIN to BLOBs for a 100M-row table in a game with 500K of simultaneous players, not a startup at any rate...).

Just my $0.02 though...

It's great that you point out the limit on number of columns of many databases (typically 255, but check your specific database!)

I don't think SQL people will necessary beat anyone up for having lots of columns. After all, for a customer record, we don't create separate property tables for "name" and "email address" and "mailing address line 1." If you think this table is fine:

CREATE TABLE customer(
customer_id NOT NULL INTEGER,
name VARCHAR(255),
email VARCHAR(255),
address1 VARCHAR(255),
address2 VARCHAR(255),
mailing_code VARCHAR(255),
mailing_city VARCHAR(255),
mailing_country VARCHAR(255),
phone VARCHAR(255),
PRIMARY KEY(customer_id));

Then why is this table bad?

CREATE TABLE character(
player_id NOT NULL INTEGER,
character_id NOT NULL INTEGER,
strength INTEGER,
dexterity INTEGER,
mana INTEGER,
swords INTEGER,
polearms INTEGER,
shields INTEGER,
dodge INTEGER,
swimming INTEGER,
PRIMARY KEY(player_id, character_id));

If you specifically use MySQL/InnoDB, you can even make your tables allow adding more columns without having to update table stores, by pre-creating 255 columns as nullable, named "future_01," "future_02" and so on. Nullable columns end up only taking a bit in the record when they don't have a value, and you wouldn't be referencing them. Then, when you introduce a new skill, ALTER TABLE RENAME some future_ column to the skill you need. That's a very fast data dictionary update!

I also don't understand why serializing out of a single row would be any more or less work than serializing out of a self-join table or out of a BLOB/TEXT field. You still have to look up the string "strength" to some integer variable somewhere, and you very likely want to write that as some kind of IDL that generates code from schema, rather than manually doing it, no matter which of the options you choose.

For most smaller games, the one-row-per-character option is actually surprisingly usable, and surprisingly easy to work with. You only need to bring out the bigger guns if your design and implementation actually exceed what you can do with that initial approach.

 

enum Bool { True, False, FileNotFound };

I've worked on games from small to huge scales, and for mmo type games now days I really prefer document stores.

The thing is that handling stuff like transactions is generally just simpler to do in code.  And what people often forget is you can mess up db transactions easily also, they are not a magic bullet and they only protect you for a small surface area.  One that I don't find as important for most of the stuff we do in game logic.

Add in the fact that at scale, you have significant caching layers which are often in the pipeline anyways, like write behind caching, and the benefits of ACID start to get smaller, because the surface area of the db just gets smaller and smaller.

Note that I do use schema's.  Just because the database doesn't care doesn't mean you have to extend that approach into your code.  That's a choice. 

I'm also not saying relational databases are not good at this, I just think document stores like mongodb are more of the sweet spot for reasons that are specific to this type of game.

This topic is closed to new replies.

Advertisement