
30k concurrent players on a (private) MMO server... is this possible?

Started by November 30, 2014 10:46 PM
25 comments, last by Butabee 9 years, 10 months ago
Here's all the game logic you need:


auto newpos = player.pos + player.vel;
if (!map.blocked(newpos)) player.set_pos(newpos);
map.blocked() is a simple grid or hash-table look-up.
player.set_pos() is a variable update plus an update of the map -- either a hash-table delete/insert, or two grid writes (clear the old cell, mark the new one).
That's approximately three cache misses worst case (this includes dereferencing the player, which can be pre-fetched if you loop through all players.)
Three cache misses at the time were about a microsecond total. That leaves a frame rate of 33 Hz for 30,000 players, minus whatever overhead for networking.
(My guess is it ran nowhere near 30 Hz on the server, though.)

On modern machines, there's some chance you can keep the map in L3 cache, which means only one cache miss per player -- the actual player record. And, as I said, that could be pre-fetched.
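To make that concrete, here is a minimal sketch of what such a grid map and player update could look like. The names (GridMap, Player, tick) and sizes are mine, not anything from an actual RO server; it's just the shape of data the cache-miss argument assumes.

#include <cstdint>
#include <vector>

// Hypothetical grid map: one byte per cell, 0 = free, non-zero = occupied/blocked.
struct GridMap {
    int width = 1024, height = 1024;
    std::vector<uint8_t> cells = std::vector<uint8_t>(width * height, 0);

    bool blocked(int x, int y) const {
        if (x < 0 || y < 0 || x >= width || y >= height) return true;
        return cells[y * width + x] != 0;          // one array read -- the potential cache miss
    }
    void set_cell(int x, int y, uint8_t v) { cells[y * width + x] = v; }
};

struct Player {
    int x = 0, y = 0;    // position in cells
    int vx = 0, vy = 0;  // per-tick velocity in cells
};

// One simulation tick of the "all the game logic you need" loop.
void tick(GridMap& map, std::vector<Player>& players) {
    for (auto& p : players) {                      // linear scan -- prefetch friendly
        int nx = p.x + p.vx, ny = p.y + p.vy;
        if (!map.blocked(nx, ny)) {
            map.set_cell(p.x, p.y, 0);             // clear old cell (grid write #1)
            map.set_cell(nx, ny, 1);               // occupy new cell (grid write #2)
            p.x = nx; p.y = ny;
        }
    }
}

A 1024x1024 one-byte grid is 1 MB, which is why it can plausibly sit in L3 cache on a modern CPU.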

For networking, you'd likely want to use epoll, kevent, or I/O Completion Ports (depending on O/S).
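For illustration only, a bare-bones Linux version of that receive loop might look like the sketch below; handle_packet and the setup of the non-blocking socket are placeholders I made up, not anything from the original posts.

#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

// Minimal epoll loop; 'sock_fd' is assumed to be a non-blocking UDP socket
// (or listening TCP socket) created and bound elsewhere.
void run_event_loop(int sock_fd) {
    int ep = epoll_create1(0);
    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = sock_fd;
    epoll_ctl(ep, EPOLL_CTL_ADD, sock_fd, &ev);

    epoll_event events[1024];
    for (;;) {
        // Wait at most 33 ms so the simulation can still tick at ~30 Hz.
        int n = epoll_wait(ep, events, 1024, 33);
        for (int i = 0; i < n; ++i) {
            char buf[1500];
            ssize_t len = recv(events[i].data.fd, buf, sizeof(buf), 0);
            if (len > 0) {
                // handle_packet(events[i].data.fd, buf, len);  // game-specific, omitted
            }
        }
        // run_simulation_tick();  // the movement loop sketched above
    }
    close(ep);
}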

Physical simulation makes for a significantly more CPU hungry implementation, especially if 3D geometry collision is involved.
enum Bool { True, False, FileNotFound };

For something round-based, this is pretty trivial, but for something more like action PvP, I think the claim is a bit unreasonable for a single server (without e.g. doing some P2P trickery -- which is not really "a single server" -- as suggested above).

Here's all the game logic you need
Well yes, almost. Though in reality it needs to be a little bit more complex, and the "little bits" are unfortunately quite expensive. Something like:

for(;;)
{
    while((result = system_poll()) != timer_object_id)
    {
        if(is_connect(result))
            log(add_new_player_blah());
        else
            log(parse_command(decrypt_and_verify_auth_token(recvfrom(...))));
    }

    for(auto& p : players)
    {
        auto newpos = p.pos + p.vel;

        if (!map.blocked(newpos))
            p.set_pos(newpos);
        else          // a non-cheating client should normally
            log(...); // not allow this move in the first place
    }

    for(auto& c : clients)
        send_updates(c);
}

In particular, send_updates must figure out what to send to each client. Sending the complete map is out of the question (death by bandwidth), as is sending a fixed small subsection of the map (which would include tiles with no changes, but possibly exclude tiles outside your area that are nevertheless interesting). Deciding what to send is an O(N²) problem, and N² for large N is not precisely a trivial thing to cope with. Of course you can prune away some of those 30k (zones, spatial hashing, you name it), but the OP makes explicit mention of "PvP", "siege" and "very few small areas", which suggests that no matter what you do, N will be considerably large. If only 1,000 of those 30,000 fight against each other in each siege, then collision/interest queries are 1 million operations (so, 30 million altogether, for 30 such sieges). That's without actually creating a response, encrypting it, copying data to the network stack, or processing it. 30k calls to send are not entirely free either (assuming most people are actually actively fighting, they will want updates every round!).
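To be clear about what that pruning looks like: a common approach is a spatial hash that buckets players by coarse grid cell, so interest/collision queries only touch nearby buckets. The sketch below is mine (names and cell size invented), assuming non-negative tile coordinates.

#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical spatial hash: bucket player indices by coarse cell so a query
// only scans the 3x3 block of buckets around a position instead of all 30k players.
struct SpatialHash {
    int cell_size = 16;  // tiles per bucket -- a tuning parameter
    std::unordered_map<uint64_t, std::vector<int>> buckets;

    uint64_t key(int x, int y) const {
        return (uint64_t(uint32_t(x / cell_size)) << 32) | uint32_t(y / cell_size);
    }
    void insert(int player_index, int x, int y) {
        buckets[key(x, y)].push_back(player_index);
    }
    std::vector<int> query_nearby(int x, int y) const {
        std::vector<int> result;
        for (int dx = -1; dx <= 1; ++dx)
            for (int dy = -1; dy <= 1; ++dy) {
                auto it = buckets.find(key(x + dx * cell_size, y + dy * cell_size));
                if (it != buckets.end())
                    result.insert(result.end(), it->second.begin(), it->second.end());
            }
        return result;
    }
};

The catch, and the whole point of the siege argument: when 1,000 players stand in a handful of cells, the nearby buckets contain most of them anyway, so the pruning buys you very little exactly where you need it most.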

Also, system_poll() (which might be epoll_wait or whatever) and actually reading and processing the readied sockets will have considerable overhead -- more than just a few cache misses. The usual assumption is that having 30k clients connected means most of them are inactive most of the time anyway, and epoll is O(1) with respect to that. However, it is of course still O(N) with respect to the events that actually happen. There is no way around it: if N events happen, you must process N events.

For "PvP in constrained areas" that "most are inactive anyway" assumption does not hold -- most are active most of the time.

A single call to any not-totally-trivial kernel function like epoll_wait or recvfrom costs a minimum of around 5,000 clock cycles (I haven't actually measured this, to be honest, but I think it's a rather conservative estimate for those two functions -- in fact, receiving a datagram may easily cost 4-5 times as much if you include the kernel overhead for the interrupt, going through iptables, reassembly, etc., and copying data on the user end). So doing maybe 5k calls to epoll_wait and 20k calls to recvfrom per second might already consume roughly 40 milliseconds, and there you haven't actually done anything useful yet. Yes, 40 ms does not look like a lot, there are 1,000 ms in a second... but mind you, that's just for receiving stuff, not actually doing anything with it -- plus you may want more than just one update per second; at 10 ticks per second, you only have 100 ms available per tick.
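To make the arithmetic explicit (the ~3 GHz clock rate here is my assumption, not something stated above):

// Back-of-the-envelope cost of the syscalls alone, per second of wall time.
constexpr double cycles_per_call  = 5'000.0;          // conservative guess per epoll_wait/recvfrom
constexpr double calls_per_second = 5'000 + 20'000;   // 5k epoll_wait + 20k recvfrom
constexpr double cpu_hz           = 3.0e9;            // assumed ~3 GHz core
constexpr double seconds_spent    = cycles_per_call * calls_per_second / cpu_hz;
// seconds_spent ~= 0.042 -- i.e. roughly the 40 ms per second quoted above,
// spent before any actual game logic, sending, or logging happens.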

Especially in a PvP game, regular complaints about cheaters will inevitably come, and you will have to deal with them in a manner that satisfies the (never quite satisfied) audience. Which means that, as the most basic thing, you need to log pretty much everything, and logging must be reasonably reliable and failsafe, and in a format your customer service / moderators / bot heuristics can easily and efficiently replay, parse, or query. Which, of course, is possible, but not at all a "free" operation.
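As a rough illustration of what "log everything, in a replayable/queryable form" can mean (this shape is entirely made up, not the format of any real RO server): one fixed-size binary record per event, appended every tick, so tools can later scan, filter, or replay it.

#include <cstdint>
#include <cstdio>

// Hypothetical fixed-size event record: cheap to append, easy to mmap and scan later.
struct LogRecord {
    uint64_t tick;        // simulation tick number
    uint32_t player_id;
    uint16_t event_type;  // move, attack, trade, rejected-move, ...
    int16_t  x, y;        // position involved, if any
    uint32_t extra;       // event-specific payload or index into a side table
};

void append_record(std::FILE* log_file, const LogRecord& r) {
    std::fwrite(&r, sizeof(r), 1, log_file);
    // A real server would buffer records and flush once per tick (and rotate
    // files) so that logging does not become its own bottleneck.
}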


RO players often brag and exaggerate things -- he was probably speaking of 300, not 30,000.

The game uses inefficient TCP networking and IIRC doesn't even set the nodelay option. It's well known that an official server (actually a server cluster) would hold about 2,000-3,000 people, in exceptional cases 8,000. And if there were more than 200 on a single map server in WoE it started lagging; at about 250-300, moves could take a minute until the server reacted, skill commands would be ignored, and people would get disconnected constantly from CPU overload. I remember a company licensing the game apologised for the constant lag and thereby admitted the game was so inefficient that even buying the fastest machines available did not help much (although they had been told their old machines were good enough, which they couldn't believe).

I doubt the server emulator I heard the private servers were using was much more efficient, and what's more, there would never have been that many people on an illegal private server.

hplus0603/samoth,
That is an interesting view on the processing power needed just to update player movement. My sentiment, however, is that the O(N^2) problem must also apply to the server's bandwidth requirement, considering the exceptional number of players in his claims... is this correct?

wintertime,
While I do personally believe that he is exaggerating, I lack the facts to refute his claims. Considering the huge advancements in MMORPGs in recent years, especially from a networking perspective, I thought I might have missed something that would make it a possibility.

Another question though: in order to support such a high number of players, are consumer-level networking devices (e.g. routers) able to cope with the load? Or do I need a special switch/router/etc. for the server?

My sentiment, however, is that the O(N^2) problem must also apply to the server's bandwidth requirement, considering the exceptional number of players in his claims... is this correct?
Not necessarily, no. Actually, not at all.

You usually decide on some maximum bandwidth per client (in the simplest case) that you will allow, and then you figure out what you can send within that bandwidth, ordered by importance. So the per-client bandwidth would be "constant" and the total bandwidth requirement would be O(N). But you can of course do less naive bandwidth management, so the total bandwidth is capped as well (not only limited by what can physically go over the cable).
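A tiny sketch of that budgeting idea, with made-up numbers (20 kB/s per client, ~800 Mbit/s total), just to show why the total stays bounded:

#include <algorithm>
#include <cstddef>

// Per-client byte budget under both a per-client cap and a total outgoing cap.
// With 30,000 clients the total cap dominates, so total bandwidth is O(N) at
// worst and effectively constant once the cap binds.
size_t per_client_budget(size_t num_clients,
                         size_t per_client_cap = 20'000,       // bytes/s per client
                         size_t total_cap      = 100'000'000)  // bytes/s for the whole server
{
    if (num_clients == 0) return per_client_cap;
    return std::min(per_client_cap, total_cap / num_clients);
}

// per_client_budget(30'000) == 3'333 bytes/s -- each client gets a smaller slice,
// but the server's outgoing bandwidth never grows quadratically.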

Your character's position and life status would be among the most important things to transmit, as well as anything that happens very close to your character (another character swinging a sword at yours, etc.). The farther away things are, the less important frequent updates are (usually, that is). Some events are more important than others, too (someone shooting a fireball at you vs. a team member dying vs. someone putting on a silly-looking hat).

One common strategy would be to find the X most important events (final importance being a function of distance and event importance) and transmit those changes, presumably giving events that were not transmitted a little more weight for the next round. That's basically a nearest-neighbour search, which is O(N²) if done naively for every client, although with a structure like a kd-tree each query drops to roughly O(log N), at the expense of extra memory and the cost of building and maintaining the tree (which is a nuisance with moving/changing objects; rebuilding from scratch is most probably easier and faster than maintaining the tree). In any case it is not a totally trivial thing to get right, and not totally trivial on the CPU side.
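A sketch of that "top X by importance" selection for one client (the distance falloff and field names are arbitrary choices of mine):

#include <algorithm>
#include <cstddef>
#include <vector>

struct Event {
    float x, y;          // where it happened
    float base_weight;   // how important the event type is (fireball > silly hat)
    float carried;       // extra weight accumulated from being skipped earlier
};

// Keep the X most important events for a client at (cx, cy).
// importance = f(distance, event weight); done naively this is O(M) per client,
// and O(N*M) across all clients -- which is exactly the problem described above.
std::vector<Event> top_events(std::vector<Event> events, float cx, float cy, size_t X)
{
    auto importance = [&](const Event& e) {
        float dx = e.x - cx, dy = e.y - cy;
        return (e.base_weight + e.carried) / (1.0f + dx * dx + dy * dy);
    };
    if (events.size() > X) {
        std::partial_sort(events.begin(), events.begin() + X, events.end(),
                          [&](const Event& a, const Event& b) {
                              return importance(a) > importance(b);
                          });
        events.resize(X);
    }
    return events;
}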

If you have tiles, a different strategy could be to "walk" the tiles in a spiral-like manner starting from your character's position, and send updates for events that happen on tiles that you touch, until you have X events together. I'm not sure what complexity that is, but it might actually be O(1) with respect to the number of players, since nothing really changes when there are more players. However, it is certainly not the most "deterministic" runtime (you might visit 5 tiles or 50, depending on what happens when and where), nor is it the most cache-friendly way to traverse the data.
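For completeness, here is a sketch of that outward walk, done ring by ring (equivalent to a spiral for this purpose); events_in_cell is a placeholder for whatever per-tile event storage the server keeps, passed in as a callback since nothing above specifies it.

#include <algorithm>
#include <cstdlib>
#include <functional>
#include <vector>

// Walk cells ring by ring outward from (px, py), collecting events until
// 'max_events' have been gathered or 'max_radius' is reached.
std::vector<int> gather_events_outward(
    int px, int py, size_t max_events, int max_radius,
    const std::function<std::vector<int>(int, int)>& events_in_cell)
{
    std::vector<int> gathered;
    for (int r = 0; r <= max_radius && gathered.size() < max_events; ++r) {
        for (int dx = -r; dx <= r; ++dx)
            for (int dy = -r; dy <= r; ++dy) {
                if (std::max(std::abs(dx), std::abs(dy)) != r) continue;  // outer ring only
                for (int ev : events_in_cell(px + dx, py + dy)) {
                    gathered.push_back(ev);
                    if (gathered.size() >= max_events) return gathered;
                }
            }
    }
    return gathered;
}

As noted, the work depends on how densely events cluster around the player rather than on the total player count -- but a crowded siege means many events on the very first rings, and the memory access pattern jumps around the map rather than streaming through it.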

You guys are overcomplicating the problem. The problem isn't to create a "proper" MMO with proper validation, it's to create a Ragnarok Online server capable of having that many players...

AKA a server that takes game updates from the client (unvalidated). Of course it's possible on a P4, but you'll get people teleport hacking / breaking collision.

Now, creating a simulation with that many users where the client isn't trusted... that sounds very iffy. I suppose it depends on the game's physics rules. It might be possible to run a simulation with that many users... It also depends on how important the data is. If there are 50 characters entering a region, do they all need data immediately, or can it be sent gradually?

So if there are 50 players entering a region the player is going to, how much data needs to be sent to keep the simulation accurate (besides things like customizations etc.)?

Wait, did he work on the client or just the server?


My sentiment, however, is that the O(N^2) problem must also apply to the server's bandwidth requirement


Bandwidth really isn't the problem. Gigabit interfaces were standard in 2005, and a gigabit commit of co-lo network traffic is about $2k/month (it was probably somewhat higher back then, but not enough to kill a game).

Of course it's possible on a P4, but you'll get people teleport hacking/breaking collision.


And, for all we know, that's exactly the experience that was delivered.

Btw: if the map is a grid, you just need to send the occupancy data of the W×H cells around the player at some frequency. I'd imagine that the more occupied the grid is, the less frequently you'd get a full refresh.
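Roughly what that per-player snapshot could look like, assuming the one-byte-per-cell grid from earlier in the thread (function and parameter names are mine):

#include <cstdint>
#include <vector>

// Copy the occupancy of the W x H cells centred on (px, py) into a flat buffer
// that can be sent as-is, or run-length encoded first.  Cells outside the map
// are reported as blocked.
std::vector<uint8_t> snapshot_around(const std::vector<uint8_t>& cells,
                                     int map_w, int map_h,
                                     int px, int py, int W, int H)
{
    std::vector<uint8_t> out;
    out.reserve(size_t(W) * H);
    for (int dy = -H / 2; dy < H - H / 2; ++dy)
        for (int dx = -W / 2; dx < W - W / 2; ++dx) {
            int x = px + dx, y = py + dy;
            bool in_map = x >= 0 && y >= 0 && x < map_w && y < map_h;
            out.push_back(in_map ? cells[size_t(y) * map_w + x] : 1);
        }
    return out;
}

At, say, 32x32 cells that's 1 KB per full refresh before any compression, which is why refreshing less often when the area is busy makes sense.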

The real outcome of this discussion is likely "you can push any one axis about as far as you want -- it's pushing on all axes at the same time that's hard."
enum Bool { True, False, FileNotFound };

The one thing I didn't hear anyone discuss is, where did he get 30k players? That many players at once implies a huge user-base.

I played RO for a long, long time; the most players online I have ever seen on a server was ~20k, on iRO near the launch.

If there is a private server with 30k players, I would play in it @.@

On topic.

Some very important information is missing in this topic:

- RO is a zone-based game; to pass from one zone to the next you need to enter a warp. This makes it way easier for the server to spread the players around.

- RO had several big cities; people would spread out mostly based on their levels (I would pick mine based on the music -- search for the aldebaran theme on YouTube, you won't regret it).

- WoE was played across 20 different castles, each with more than one line of defense. I would say that 1k people in a single attack is possible, but very unlikely; heavy attacks would have around 300 players involved (search "how to break an unbreakable defense" on YouTube and you will see how massive the attacks can get).

From my experience I can say I hardly ever lagged, even during the most hardcore invasions. Sometimes it would take a while to be transferred from one zone to another, but I rarely had a laggy experience caused by the server itself (and I played with a 128-256 kb connection at that time).

Currently working on a scene editor for ORX (http://orx-project.org), using kivy (http://kivy.org).

The one thing I didn't hear anyone discuss is, where did he get 30k players? That many players at once implies a huge user-base.

Probably clones and duplicates.

