Why is scaling an mmo so difficult?
I think the best option is to run the safe code on the server, use databases to store players' game state, and balance the load between servers. Any tips here?
That's exactly it! Now, good luck developing!
The tricky part is in the how - how do we "share load" between servers, in a seamless fashion to the end user? How do we divide the calculations required across multiple servers? Do we split them geographically in the real world (for lower latency) or in world (for faster lookup of local entity data)? How should we handle interactions across servers?
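To make the "in world" option a bit more concrete, here is a minimal sketch, assuming a flat 2D world cut into square zones with each zone owned by one server process. The names `ZONE_SIZE`, `zone_for`, `server_for` and `crosses_servers` are made up for illustration, and the modulo assignment is just a placeholder for whatever mapping a real game would use.

```python
from dataclasses import dataclass

ZONE_SIZE = 1000.0  # world units per zone edge (assumed value, tune per game)


@dataclass(frozen=True)
class Zone:
    x: int
    y: int


def zone_for(px: float, py: float) -> Zone:
    """Map a world position to the grid zone that contains it."""
    # Floor division keeps negative coordinates in the correct zone.
    return Zone(int(px // ZONE_SIZE), int(py // ZONE_SIZE))


def server_for(zone: Zone, servers: list[str]) -> str:
    """Pick the server that owns a zone (placeholder modulo assignment)."""
    return servers[(zone.x * 31 + zone.y) % len(servers)]


def crosses_servers(a: tuple[float, float], b: tuple[float, float],
                    servers: list[str]) -> bool:
    """True if two positions are owned by different servers."""
    return server_for(zone_for(*a), servers) != server_for(zone_for(*b), servers)
```

Whenever `crosses_servers` is true for two interacting entities, you are in the "interactions across servers" case from the last question, and that is where most of the hard design work lives.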
I'm not an MMO specialist, but there are a number of topics recently posted in this forum which should be of use :)
So how is this massively different from a multiplayer game?
Scale is really the challenge of MMOs, in two big ways:
- you will need to host significantly more players per process than you would in a conventionally-scoped multiplayer game; you will need to account for that through various avenues of optimization, including distribution of the player load across multiple processes and/or machines, which introduces challenges inherent in distributing the processing of what needs to appear as a single integrated world in some form. Conventional multiplayer games may have lots of concurrent players, but only a handful of those tend to be grouped together in the same playable experience. You will need a way to solve and/or alleviate the problems of the scenario whereby everybody in the world tries to go to the same spot and cast their most physically intensive spells or whatever.
- you will actually need to have enough players to stretch the system to something resembling "massive." In many ways this can be more challenging than the first bullet point for hobby developers. The technical aspects you can learn over time, the hardware you can buy with sufficient money. But the players? Players are not so easily acquired. A conventional multiplayer game has a peak concurrency per game of something like 64. Maybe 128. To really be considered "massive" a multiplayer game should probably be capable of handling a peak concurrency in the tens of thousands. Without actual players ('bots' can only do so much) it's hard to even know if you've solved the scalability problem.
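The "everybody goes to the same spot" scenario from the first bullet is at least easy to detect, even if it is hard to fix. A rough sketch, assuming player positions are available as `(x, y)` pairs; `overloaded_cells`, `cell_size` and `max_per_cell` are hypothetical names and values, not anything from a real engine:

```python
from collections import Counter


def overloaded_cells(positions, cell_size=100.0, max_per_cell=200):
    """Return {cell: player_count} for every grid cell over the limit.

    positions: iterable of (x, y) world coordinates.
    cell_size and max_per_cell are assumed tuning values.
    """
    counts = Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in positions
    )
    return {cell: n for cell, n in counts.items() if n > max_per_cell}
```

What you do with an overloaded cell (split it, reduce update rates, cap entry) is a game design decision as much as an engineering one.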
These problems are easy enough to solve if you have enough money.
Scalable hardware and enterprise level equipment and staff to run it 24/7 costs serious money. We aren't talking bargain basement hosting here.
This is why it's hard to scale. Programmatically it's simple enough for someone with experience of networking code and general IT; it's just that indies and newbies can't afford to scale that high, and they get stuck after about 1000 users because all the users are stuck on a couple of cheap HostGator servers...
1) If a developer does not have any experience specifically with MMO-type deployments, said developer will often believe that experience in a nearby area (like 8-player FPS, or web-based enterprise forms applications) will transfer. It turns out, MMOs have a different usage profile. And, more importantly, MMO GAME DESIGN has a different profile -- you need to design a game that is implementable and scalable. This involves collaboration between engineering and game design in a way that's different from lower-end systems. Also, if the engineers don't have the applicable experience, they probably can't give very good guidance to the designers. Thus, many studios find that their MMO is "hard to scale" because it's "hard for them," just like anything new and different is "hard."
2) The sheer amount of dollars involved is bigger than many project managers want to admit to. Developing enough content to keep thousands of users happy for many months on end is a huge, massive undertaking. And every three months you have to do it again, for the expansion packs, to keep the players coming back. This makes it "actually hard," in the sense that you may not have sufficient dollars to generate all the art, code, debugging, deployment, and maintenance needed. (Which extends into ongoing customer service, billing, community management, ...)
This is why it's hard to scale. Programmatically it's simple enough for someone with experience of networking code and general IT; it's just that indies and newbies can't afford to scale that high, and they get stuck after about 1000 users because all the users are stuck on a couple of cheap HostGator servers...
Why can't people use cloud services? Amazon EC2 seems like a pretty solid choice, and it seems to be getting cheaper. You could spin up/down server instances as required if it's designed for it. If a game was just getting started it would be cheap to run a few servers, but if the game really takes off you can just throw more servers at it. Also there are options for putting servers in specific availability zones, and also the communication between servers in the same region is really fast.
Also they take care of all of the server maintenance, so that's cool.
EC2 isn't free, "designing for it" is still a non-trivial challenge, and there's something to be said for having and controlling your own datacenter. Including not having to wait for Amazon technical support when shit hits the fan.
It's not an intractable idea, but it's not without its own risks and challenges (technically, logistically, business-wise).
Amazon auto-scales the servers/services as needed.
Including not having to wait for Amazon technical support when shit hits the fan.
And half the time, when you do get hold of Amazon technical support, the advice boils down to "that instance is hosed, better spin up a replacement." Most of the rest of the time it's a transient problem affecting the entire availability zone or even the entire region, and you just have to wait until it magically gets fixed. Not to mention Amazon will frequently schedule instances to be rebooted or shut down, sometimes with very little warning. And if you ask them why the instances have such poor uptime, they'll tell you up front that they use very cheap hardware that has a relatively high failure rate. None of these are particularly big issues if you're doing something like running a web service which you can stick behind a standard load balancer. Sure, it'll cause your dev ops guys some headaches when they find they need to deal with the fact that Amazon's going to reboot 75% of your servers with only four days' notice (not that I'm bitter), but you can still get good service uptime despite EC2's relatively poor availability at the instance level. However, if you're not doing something where you've got short connections and can treat all of your servers as functionally identical, then you've got some serious design work to do if you don't want constant annoying downtimes.
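As a rough illustration of the kind of design work that means for long-lived game sessions: when some instances are scheduled for a reboot, you need a plan for moving their players somewhere else first. This is only a sketch under simple assumptions (sessions can be moved at all, and load is just a player count); `plan_migration` and its arguments are hypothetical names, and it does not touch any real cloud API.

```python
def plan_migration(sessions, doomed):
    """Plan where to move players off instances scheduled for reboot.

    sessions: dict mapping instance name -> list of player ids on it.
    doomed:   set of instance names that are about to be rebooted.
    Returns a dict mapping player id -> target instance name.
    """
    # Current load on every instance that will stay up.
    load = {inst: len(players) for inst, players in sessions.items()
            if inst not in doomed}
    if not load:
        raise RuntimeError("no healthy instance left to migrate to")

    moves = {}
    for inst in sorted(doomed):
        for player in sessions.get(inst, []):
            # Send each player to the least-loaded survivor (ties by name).
            target = min(load, key=lambda name: (load[name], name))
            moves[player] = target
            load[target] += 1
    return moves
```

The planning is the easy part. Actually handing a live player and their state over to another process without them noticing is where the serious design work goes.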
Why can't people use cloud services?
Sure they can! If the application is already designed and implemented to shard properly across hardware. Which goes back to the know-how part in the first place.
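For what "designed to shard" can look like at the simplest level, here is a hedged sketch of assigning players to shards with rendezvous (highest-random-weight) hashing. This is one common technique, not necessarily what any particular game uses, and `shard_for` plus the shard names are made up for the example.

```python
import hashlib


def shard_for(player_id: str, shards: list[str]) -> str:
    """Pick a shard for a player using rendezvous hashing.

    Every (shard, player) pair gets a pseudo-random score; the player
    lives on the shard with the highest score.
    """
    def score(shard: str) -> int:
        digest = hashlib.sha256(f"{shard}:{player_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(shards, key=score)
```

The useful property is that when you spin up an extra shard, a player either stays where they were or moves to the new shard; nobody gets shuffled between the existing ones.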
Also, Amazon, specifically, is not always the right choice for two reasons:
1) Virtualized servers introduce scheduling jitter, which will end up causing lag for all connected players if your game depends on timely server response times and/or real-time physical simulation.
2) Amazon actually charges a fair bit -- not only for the hardware, but for the network bandwidth. At work, with top-tier bandwidth providers, and running our own co-location facility with our own hardware, we operate our service at less than half of what it would cost on Amazon (operations team included).
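On point 1, scheduling jitter is something you can measure before committing to a host. A minimal sketch, assuming your server loop records a timestamp in integer milliseconds at the start of every tick; `tick_jitter` and `worst_lateness` are hypothetical helper names:

```python
def tick_jitter(timestamps_ms, tick_interval_ms):
    """Per-tick deviation from the intended interval, in milliseconds.

    Positive values mean the tick started late, negative means early.
    """
    return [
        (later - earlier) - tick_interval_ms
        for earlier, later in zip(timestamps_ms, timestamps_ms[1:])
    ]


def worst_lateness(timestamps_ms, tick_interval_ms):
    """Largest late deviation seen, or 0 if there are fewer than two ticks."""
    return max(tick_jitter(timestamps_ms, tick_interval_ms), default=0)
```

Running the same loop on a virtualized instance and on bare metal and comparing `worst_lateness` gives a rough feel for how much of the lag would come from the host rather than your code.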
For some kinds of games, Amazon is great! Zynga credited Amazon with their being able to scale FarmVille to the growth in users. As they said "at one point, we literally could not have manually provisioned new instances at a sufficient rate, so we were happy to have that scripted!"