SpatialOS single shard MMO

Started by June 20, 2017 12:04 AM
29 comments, last by hplus0603 7 years, 4 months ago

Has anyone tried SpatialOS?

https://improbable.io/games

I'd like a different perspective; I don't pick up a lot just by thinking about it myself.

Is this stuff quite similar to Pikkoserver though?

Are workers/services really the way forward? I don't see any detailed explanation or case studies, just presentations.

I know of two members here who may be able to help, and there may be more but I'll tag them. @JWalsh's studio Soulbound Studios has partnered with SpatialOS.

@riuthamus was talking to them at one point as well for their Greenlit game, but I don't know if anything came of it. Perhaps there are others.

For large-scale simulation, SpatialOS probably has a strong future, since it's based on the fundamental concept of distributed simulation architectures. Pikkoserver sounds similar, but to be honest I had never heard of it until your post.

The concept that SpatialOS provides has been around a long time. In the simulation world, distributed simulations are a common architecture. SpatialOS takes that idea to the next level by moving the distributed concept to "the cloud", where computation is relatively cheap and easy to scale. They then take that to the next level with gaming by offering engine integrations and other technology that makes it easier to work with their platform.

As I understand it, in SpatialOS you define behaviors for entities in the world, and these entities run on their servers. From a backend perspective, the number of servers scales with the number of entities. In theory you could simulate every detail in a world by just adding more computational power.

Admin for GameDev.net.

Every five years, some company with expertise from some "adjacent" technology area (finance, mil/sim, telecom, geospatial, etc) believes that they can do games better! They will win with their superior tech!

I have not seen a single one of those actually manage to get anything real and lasting into the marketplace. Not for lack of trying!

Is Improbable different? Perhaps. They got a giant investment from SoftBank, which might mean they have something neat and new. Or it may mean they have good connections to investors who have different investment criteria.

Quote

moving the distributed concept to "the cloud", where computation is relatively cheap and easy to scale

You mean, in the "cloud," where latencies cannot be managed, where noisy neighbors can flood your network, and where two processes that communicate intimately end up being placed on different floors of a mile-long data center, and that charges 10x mark-ups on bulk network capacity? That "cloud"? Or is this some other "cloud" that actually works for non-trivial low-latency real-time use cases?

In the end, though, most games just aren't well-funded and big enough to make a lot of sense for more business-focused companies. And those that are (OverGears of BattleDuty and such) end up distinguishing their games from others by integrating gameplay with networking with infrastructure really tightly, and at that point, a "one size fits all" solution looks more like "one size fits none."

Don't get me wrong. Distributed simulation is a fun area to work in, and there are likely large gains to be had through innovative whole-stack approaches. History just shows that the over/under on any one particular entrant in the market is "not gonna make it."

enum Bool { True, False, FileNotFound };

3 minutes ago, hplus0603 said:

That "cloud"? Or is this some other "cloud" that actually works for non-trivial low-latency real-time use cases?

That's "the cloud". I don't see SpatialOS being ready for low-latency real-time any time soon, and not necessarily through any fault of their own.

FWIW, I did some press time with them at GDC. It was an interesting visit since I have experience in developing the types of distributed entity simulation platforms they're building. The demo was to showcase intelligent behavior across thousands of entities in a "virtual world" setting. It was fairly slow paced, when it worked - true to startup style they didn't account for poor internet at a trade show. So, not much of a demo.

Admin for GameDev.net.

23 minutes ago, khawk said:

That's "the cloud". I don't see SpatialOS being ready for low-latency real-time any time soon, and not necessarily through any fault of their own.

FWIW, I did some press time with them at GDC. It was an interesting visit since I have experience in developing the types of distributed entity simulation platforms they're building. The demo was to showcase intelligent behavior across thousands of entities in a "virtual world" setting. It was fairly slow paced, when it worked - true to startup style they didn't account for poor internet at a trade show. So, not much of a demo.

Ah, pretty disappointing if it can't do low latency. They kind of sold it as that, though.

Also, a variety of multi-entity architectures have been tried. The most famous failure is probably Sun Darkstar, which ended up supporting fewer entities in cluster mode than in single-server mode :-) They used tuple spaces, which ends up being an instance of "shard by ID." The other main approach is "shard by geography."

A "massively multiplayer" kind of server ends up, in the worst case, with every entity wanting to interact with every other entity. For example, everyone tries to pile into the same auction area or GM quest or whatever. (Or all the mages gather in one place and all try to manabolt each other / all soldiers try to grenade each other / etc.)

N-squared, as we know, puts an upper limit on the number of objects that can go into a single server. Designing your game to avoid this helps not just servers, but also gameplay. When there's a single auction area that EVERYBODY wants to be in, it's not actually a great auction experience (too spammy), so there's something to be said for spreading the design out. (Same thing for instanced dungeons/quests, etc.)
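To put rough numbers on the N-squared point, here is a minimal sketch that counts potential pairwise interactions. The function names and the crowd sizes are illustrative assumptions, not figures from any particular game.

```python
def interaction_pairs(n: int) -> int:
    """Distinct pairs among n entities that can all interact with each other."""
    return n * (n - 1) // 2


def pairs_when_split(n: int, areas: int) -> int:
    """Pairs when the same n entities are spread evenly over `areas` separate areas.

    Assumes n divides evenly and that entities in different areas never interact.
    """
    per_area = n // areas
    return areas * interaction_pairs(per_area)


if __name__ == "__main__":
    print(interaction_pairs(1_000))     # everyone in one auction area
    print(pairs_when_split(1_000, 10))  # ten smaller areas of 100 each
```

Under these assumptions, spreading 1,000 entities over ten areas cuts the pair count by roughly a factor of ten, which is the design-level lever described above.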

Anyway, once you need more than one server, you can allocate different servers to different parts of the world (using level files, or quad trees, or voronoi diagrams, or some other spatial index). To support people interacting across borders, you need to duplicate an entity across the border for as far as the "perception range" is. This, in turn, means that you really want the minimum size of the geographic areas to be larger than the perception range, so you don't need to duplicate a single entity across very many servers. If by default you have chessboard distribution, and the view range is two squares, you have to duplicate the entity across 9 servers all the time. That means you need 10 servers just to get up to the capacity range of a single non-sharded server! The drawback then is that you have a maximum density per area, and a minimum area size per server, which means your world has to spread out somewhat evenly. Because the server/server communication is "local" (only neighbors), you can easily scale this to as large an area as you want, as long as players keep under the designated maximum limit. Many games have used methods similar to this (There.com, Asheron's Call, and several others.)
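A minimal sketch of the duplication rule for a chessboard layout, assuming square cells, one server per cell, and a perception circle approximated by its bounding box. The function name and all coordinates are hypothetical. The 9-server figure comes out when the range reaches one full cell in each direction (a view two cells wide in total); that reading of the example is an assumption.

```python
import math


def cells_in_range(x: float, y: float, cell_size: float, perception_range: float):
    """Return the (col, row) cells touched by the bounding box of the perception circle.

    Each returned cell's server would need a copy ("ghost") of the entity.
    """
    min_col = math.floor((x - perception_range) / cell_size)
    max_col = math.floor((x + perception_range) / cell_size)
    min_row = math.floor((y - perception_range) / cell_size)
    max_row = math.floor((y + perception_range) / cell_size)
    return {
        (col, row)
        for col in range(min_col, max_col + 1)
        for row in range(min_row, max_row + 1)
    }


if __name__ == "__main__":
    # Range much smaller than the cell: one server in the middle, four near a corner.
    print(len(cells_in_range(150, 150, cell_size=100, perception_range=30)))
    print(len(cells_in_range(195, 195, cell_size=100, perception_range=30)))
    # Range equal to one full cell: a 3x3 block of servers.
    print(len(cells_in_range(150, 150, cell_size=100, perception_range=100)))
```

Keeping cells larger than the perception range caps the copies at four, which is why the minimum area size matters.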

The other option is to allocate by ID, or just randomly by load on entity instantiation. Each server simulates the entities allocated to it. You have to load the entire static world into each server, which may be expensive if your world is really large, but on modern servers, that's not a problem. Then, to interact across servers, each server broadcasts the state of its entities using something like UDP broadcast, and all other servers decode the packets and forward entities that would "interact with" entities that are in their own memory. This obviously lets you add servers in linear relation to number of players, and instead you are limited by the speed at which servers can process incoming UDP broadcast updates to filter for interactions with their own entities, and you are limited by available bandwidth on the network. 100 Gbps Ethernet starts looking really exciting if you want to run simulation at 60 Hz for hundreds of thousands of entities across a number of servers! (In reality, you might not even get there, depending on a number of factors -- Amdahl's Law ends up being a real opponent.)
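A rough sketch of the shard-by-ID filtering step, with the network left out. The `Update` type, the modulo ownership rule, and the distance test are all assumptions made for illustration; a real server would use a spatial index instead of comparing against every local entity.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    entity_id: int
    x: float
    y: float


def owner(entity_id: int, server_count: int) -> int:
    """Assumed allocation rule: shard by ID using modulo."""
    return entity_id % server_count


def relevant_updates(broadcast, server_index, server_count, interact_range):
    """Filter one broadcast tick down to the remote updates this server cares about."""
    local = [u for u in broadcast if owner(u.entity_id, server_count) == server_index]
    keep = []
    for u in broadcast:  # every server still has to look at every update
        if owner(u.entity_id, server_count) == server_index:
            continue  # already simulated here
        if any(math.hypot(u.x - l.x, u.y - l.y) <= interact_range for l in local):
            keep.append(u)
    return keep


if __name__ == "__main__":
    tick = [Update(0, 0, 0), Update(1, 5, 0), Update(2, 100, 100), Update(3, 3, 4)]
    for server in range(2):
        ids = [u.entity_id for u in relevant_updates(tick, server, 2, 10.0)]
        print(server, ids)
```

The loop over the whole broadcast is the per-server cost described above: it grows with the total entity count, not with the number of entities the server owns.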

None of this is new. The military did it in the '80s on top of DIS. And then again in the late '90s / early '00s on top of HLA. It's just that their scale stops at how many airplanes and boats and tanks they own, and they also end up accepting that they have to buy one computer per ten simulated entities or whatever the salespeople come up with. There's only so many billion-dollar airplanes in the air at one time, anyway.

For games, the challenge is much more around designing your game really tightly around the challenges and opportunities of whatever technology you choose, and then optimizing the constant factor such that you can run a real-size game world on reasonable hardware. (For more on single-hardware versus large-scale, see for example http://www.frankmcsherry.org/assets/COST.pdf )

enum Bool { True, False, FileNotFound };

What if you had just one server with 128 cores and a huge shared memory?

Surely that would support hundreds of thousands of players.

Cores talk to memory through a memory bus. That memory bus has some fixed latency for filling cache misses. Surely, it will be faster than an external network, but on the other hand, all of those users need packets generated to/from themselves, going into/out of the core, too.

It's hard to get more than four-channel memory into a single socket, and price goes up by something like the square of the socket count (dual-socket much more expensive than single-socket; quad-socket much more expensive than dual-socket.)

And, once you have CPUs in different sockets, the different CPUs should be thought of as different network nodes -- a cache miss filled in 400 cycles on a local RAM module may take 4000 cycles when filling from a remote CPU NUMA node. (Usually, the difference is less stark, but it's easily noticeable if you measure it.)

So, let's assume there are four sockets, each with 50 GB/s throughput. Split 25,000 users per socket, at 60 Hz. That gives 33 kilobytes per player per frame, and this is assuming that you will use memory optimally. (Most cache-miss-based algorithms would be happy to get past half the theoretical throughput.)

Can you do all the processing you need to do for a single player for a single frame, touching only 33 kilobytes of RAM? (Physics, AI, rules, network input, network output, interest management, and so on all go into this budget.) It's quite possible, if you know what you're doing, and carefully tune the code for the system that's running it, but it's no sure slam-dunk winner. Write your code in Java or Python or some other language that ends up chasing too many pointers, and you lose your edge very quickly.
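The 33 kilobyte budget can be checked with a few lines of arithmetic. This sketch assumes 25,000 players share one socket's 50 GB/s of memory throughput, which is the split that reproduces the figure; the function name is made up for the example.

```python
def bytes_per_player_per_frame(bandwidth_bytes_per_s: float, players: int, tick_hz: int) -> float:
    """Memory traffic available to each player on each simulation frame."""
    return bandwidth_bytes_per_s / (players * tick_hz)


if __name__ == "__main__":
    budget = bytes_per_player_per_frame(50e9, players=25_000, tick_hz=60)
    print(round(budget))      # bytes at theoretical peak throughput
    print(round(budget / 2))  # bytes if only half the peak is achieved
```

With four such sockets the same budget covers 100,000 players, and halving the achieved throughput halves the per-player budget.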

I just priced a PowerEdge with quad 16 core/32 thread Xeons, four 8-way 16 GB DIMMs per socket, and dual 10 Gb network interfaces; it's about $50k (plus tax and if you need storage disks and such.) You'd also want at least two, because if you have a single server and it dies, your game is as dead as the server. (You'd also need data center rack space/power/cooling and routers/switches/uplink and so on.) Although, still, $100k isn't that bad; the multiplayer networking systems have to compete with this offering and make it worthwhile, which limits their ability to charge upwards, which in turn means they can't solve problems that are too advanced or too fancy, and thus have to simplify their solution to the wider masses of developers. This, coupled with the incredible importance of designing game and network behavior hand-in-hand, probably is one of the explanations why there isn't a plethora of successful game multiplayer hosting/middleware solutions out there, and why each of the ones that are surviving actually fills a different niche -- there simply isn't space to survive for two different competitors in the same niche.

enum Bool { True, False, FileNotFound };

I can't afford $50k, but I am planning on building a Beowulf cluster to support a kind of MMO with 100k+ entities (not all human, but all are persistent).

I'm just going to go for a simple method - each 2 km sq region is handled by one process. Every process sends sync messages to its neighbours via network messaging. Only edges of the regions are kept in sync. I'd like to run this over a VLAN to separate network channels.

Of course this has lots of limitations but if I get this far then I will implement some load balancing e.g. split the regions with heavy loads into 1km sq regions, etc.
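A minimal sketch of that plan, assuming "2 km sq" means a square with 2 km sides and that a split produces four squares of half the side length. The `Region` class, the margin value, and the coordinates are hypothetical; the messaging between processes is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    x0: float    # west edge, metres
    y0: float    # south edge, metres
    size: float  # side length, metres

    def contains(self, x: float, y: float) -> bool:
        return self.x0 <= x < self.x0 + self.size and self.y0 <= y < self.y0 + self.size

    def near_edge(self, x: float, y: float, margin: float) -> bool:
        """True if (x, y) is within `margin` of a border, so neighbours need its state."""
        return (
            x - self.x0 < margin
            or self.x0 + self.size - x <= margin
            or y - self.y0 < margin
            or self.y0 + self.size - y <= margin
        )

    def split(self):
        """Split into four child regions of half the side length (load balancing)."""
        half = self.size / 2
        return [Region(self.x0 + dx, self.y0 + dy, half) for dy in (0, half) for dx in (0, half)]


if __name__ == "__main__":
    region = Region(0, 0, 2000)
    print(region.near_edge(1000, 1000, margin=100))  # interior entity: no sync needed
    print(region.near_edge(50, 1000, margin=100))    # near the west border: sync it
    print([r for r in region.split() if r.contains(1500, 500)])
```

Only entities for which `near_edge` is true would go into the sync messages, and the margin would normally be at least the interaction range.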

So long as not too many objects converge in one spot and do too many interactions, it might be ok.

I worked for an MMO company that had very similar technology up and running back in 2005. We didn't even use that tech in the end because it wasn't a useful selling point and the market had moved away from the single-shared-world model anyway due to World of Warcraft. (Arguably this was because their networking model was so primitive that they had no choice, but turning limitations into opportunity is always a good idea.)

As mentioned above the main draw of this new service is that you don't have to manage your own servers like we did. Cloud services are probably low latency enough for most MMO usages now, but probably not shooter-style ones. Potential downsides include hosting and operation costs, since you can probably get it cheaper via a specialised solution (although you also run the risk of over-specifying and paying for capacity you don't need), and development costs, because you have to write the entire game using their paradigm, which may not suit you.

