
Delta Encoding Snapshots + Scope/Priority

Started February 25, 2016 07:24 PM
9 comments, last by sufficientreason 8 years, 8 months ago

I'm trying to implement Quake-style delta encoding that works as follows:

1) Each client reports to the host the last tick t_last that it has received from the host.

2) Host generates a snapshot for simulation tick t_current and places it in a buffer.

3) For each client, the host compares the snapshot generated at t_current to the one generated at t_last and sends that client only the changed values.

4) Client receives the delta snapshot and recreates the full snapshot by filling in the gaps from the one it received at t_last.
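
In code, the encoding half of that (steps 3 and 4) might look roughly like this; a hypothetical C++ sketch, where EntityState, its fields, and the change-mask layout are my own assumptions rather than anything from a specific engine:

#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical replicated state; a real game has many more fields.
struct EntityState {
    float x = 0, y = 0, z = 0;
    std::uint16_t health = 0;
};

// Step 3 on the wire: a change mask plus only the fields that differ
// from the snapshot the client acked at t_last.
void DeltaEncodeEntity(const EntityState& base, const EntityState& cur,
                       std::vector<std::uint8_t>& out) {
    std::uint8_t mask = 0;
    if (cur.x != base.x)           mask |= 1 << 0;
    if (cur.y != base.y)           mask |= 1 << 1;
    if (cur.z != base.z)           mask |= 1 << 2;
    if (cur.health != base.health) mask |= 1 << 3;
    out.push_back(mask);
    auto write = [&out](const void* p, std::size_t n) {
        const std::uint8_t* b = static_cast<const std::uint8_t*>(p);
        out.insert(out.end(), b, b + n);
    };
    if (mask & (1 << 0)) write(&cur.x, sizeof cur.x);
    if (mask & (1 << 1)) write(&cur.y, sizeof cur.y);
    if (mask & (1 << 2)) write(&cur.z, sizeof cur.z);
    if (mask & (1 << 3)) write(&cur.health, sizeof cur.health);
    // Step 4 is the inverse: the client reads the mask, takes changed
    // fields from the packet, and keeps the rest from its t_last copy.
}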

Now, this algorithm seems to assume that the host always sends every client an update on every object in the world, even if that object is irrelevant to that client. What I'm wondering is, what if I have more objects than I can reasonably encode in a single packet? I can send snapshots to clients that only contain updates on some objects (determined by priority, area of interest, etc.), but then I run into a problem. If I'm sending sparse snapshots, then when a client tells me it has received data up to t_last, that no longer guarantees that every object has been updated to t_last on the client's side.

The naive way I can think of to solve this is to track, individually for every object, when it was last updated on the client. This seems to require keeping a jitter buffer for each individual object, though, which adds a lot of complexity. It also requires the client->host packet to contain a long list of every object and its last received tick. Is there a smarter way to handle this?

Thanks!

What I'm wondering is, what if I have more objects than I can reasonably encode in a single packet?


First: A UDP packet maxes out at 64 kilobytes. That's a LOT of objects!

Second: Yes, for large games, you will do interest management. You will decide, on the server, that player X can't see object Y, and you will tell the client that "object Y is no longer visible" (or you won't tell it about that object in the first place.) Then, you won't need to do "delta compression" on that object, because the client isn't expecting to see any data about it anyway. You keep this "visibility" state on a per-client basis, and you update it frequently (the client<->object relation invalidates each time either moves.)

Once the object becomes visible again, you will have to tell the client about ALL of the state of that object, as if that object was a new, freshly spawned object. If you use hard-core delta compression, this means that you will keep sending a full snapshot of this object until the first ack for that object comes back from the client.
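
As a sketch of that bookkeeping (hypothetical names; the "full until first ack" rule is just set membership here):

#include <cstdint>
#include <unordered_set>

// Hypothetical per-client interest bookkeeping: full state until acked.
struct ClientView {
    std::unordered_set<std::uint32_t> visible;      // objects the client knows
    std::unordered_set<std::uint32_t> awaitingFull; // keep sending full state

    void OnEnteredScope(std::uint32_t id) {
        visible.insert(id);
        awaitingFull.insert(id); // treat as freshly spawned for this client
    }
    void OnLeftScope(std::uint32_t id) {
        visible.erase(id);
        awaitingFull.erase(id);
    }
    // First ack of a packet that carried this object's full state:
    // deltas are safe from here on.
    void OnFullStateAcked(std::uint32_t id) { awaitingFull.erase(id); }
    bool NeedsFull(std::uint32_t id) const { return awaitingFull.count(id) != 0; }
};
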
enum Bool { True, False, FileNotFound };

So, how does this look?

---

Server Process:

Precondition: For each client, the server maintains a ring buffer of the snapshots it has sent.

  • Each tick, the server generates a master snapshot for the world state.
  • For each client:
    • Select n entities from the new master snapshot to send to that client (by distance, visibility, etc.).
    • Fetch the last snapshot that the client acknowledged from the outgoing ring buffer.
    • For each selected entity:
      • If the entity was present in the prior snapshot and not marked as frozen, send a delta encoding.
      • Otherwise send a full encoding.
    • For each entity that was not selected:
      • Mark the entity as frozen in the snapshot when it goes into the outgoing ring buffer.
      • Do not send any data regarding that entity.
    • Store the generated snapshot in the outgoing ring buffer.
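
In code, that server loop might look roughly like this (a hypothetical sketch; SentSnapshot, the frozen-by-absence convention, and the ring size are assumptions, and the actual byte encoding is elided):

#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct EntityState { /* position, orientation, etc. */ };

// What was described to one client at one tick. An entity that is absent
// from `entities` was frozen (not sent) in that snapshot.
struct SentSnapshot {
    std::uint32_t tick = 0;
    std::unordered_map<std::uint32_t, EntityState> entities;
};

// Per-client ring buffer of previously sent snapshots.
struct ClientRing {
    static const std::size_t N = 32;
    SentSnapshot slots[N];
    const SentSnapshot* Find(std::uint32_t tick) const {
        const SentSnapshot& s = slots[tick % N];
        return s.tick == tick ? &s : nullptr;
    }
    void Store(SentSnapshot s) { slots[s.tick % N] = std::move(s); }
};

// `selected` holds the scoped entity ids, all present in `master`.
void SendToClient(ClientRing& ring, std::uint32_t lastAckedTick,
                  std::uint32_t curTick,
                  const std::unordered_map<std::uint32_t, EntityState>& master,
                  const std::vector<std::uint32_t>& selected) {
    const SentSnapshot* baseline = ring.Find(lastAckedTick);
    SentSnapshot sent;
    sent.tick = curTick;
    for (std::uint32_t id : selected) {
        if (baseline && baseline->entities.count(id)) {
            // Present (hence not frozen) at the acked tick: delta-encode
            // master.at(id) against baseline->entities.at(id).
        } else {
            // New, previously frozen, or baseline lost: full encode.
        }
        sent.entities[id] = master.at(id); // record what this packet carried
    }
    // Unselected entities are simply absent from `sent`, i.e. frozen.
    ring.Store(std::move(sent));
}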

Client Process:

Precondition: The client maintains a ring buffer of received snapshots.

  • When it's time to process a snapshot (i.e. it's next in the dejitter buffer):
    • For each entity in the world:
      • If the entity is in the snapshot:
        • Update the entity either from the full state or by recomposing its state from a prior received snapshot.
      • Otherwise, mark the entity as frozen locally. Extrapolate it for a short period of time and then hide it.
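
And a matching client-side sketch (same caveats; the extrapolate/hide policy is only indicated in a comment):

#include <cstdint>
#include <unordered_map>

struct EntityState { /* replicated fields */ };

struct Entity {
    EntityState state;
    bool frozen = false;
    float frozenSeconds = 0;
};

// `snap` holds only the entities this snapshot carried data for.
void ApplySnapshot(std::unordered_map<std::uint32_t, Entity>& world,
                   const std::unordered_map<std::uint32_t, EntityState>& snap,
                   float dt) {
    for (auto& kv : world) {
        Entity& e = kv.second;
        auto it = snap.find(kv.first);
        if (it != snap.end()) {
            e.state = it->second; // full state, or recomposed from a prior one
            e.frozen = false;
            e.frozenSeconds = 0;
        } else {
            e.frozen = true;
            e.frozenSeconds += dt;
            // Extrapolate while frozenSeconds is small; hide past a threshold.
        }
    }
}
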
It's probably more efficient to just have a snapshot of all objects for the last N time steps globally, and point into that from each client connection instance. Assuming you have more clients than you have steps of memory, at least.
100 clients, saving the last 10 steps of state, would be quite a win.
If you also need the old states for backwards-checking (Valve/Source/Counter-Strike networking model style) then you can re-use this!
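
A sketch of that sharing (hypothetical types; the point is that history is stored once and each connection just holds an index into it):

#include <cstddef>
#include <cstdint>

struct Snapshot {
    std::uint32_t tick = 0;
    // ... states for all entities at this tick ...
};

// One shared history; per-client state shrinks to a single acked tick.
struct GlobalHistory {
    static const std::size_t N = 10; // last N steps, shared by every client
    Snapshot steps[N];
    const Snapshot* Find(std::uint32_t tick) const {
        const Snapshot& s = steps[tick % N];
        return s.tick == tick ? &s : nullptr;
    }
    // The same buffer can serve lag-compensated hit checks (Source-style).
};

struct ClientConnection {
    std::uint32_t lastAckedTick = 0; // points into GlobalHistory; no copies
};
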
enum Bool { True, False, FileNotFound };


First: A UDP packet maxes out at 64 kilobytes. That's a LOT of objects!

One thing to keep in mind in this regard is that the larger your UDP datagram is, the more (usually 1.5K-or-so-in-size) fragments it will be split into. And as loss of any fragment leads to effective loss of the whole datagram, this will lead to increasing the datagram loss rate (compared to packet loss rate), IIRC, more or less linearly as your datagram size grows. I'd certainly try to keep away from 64K-sized datagrams (for many reasons, the above one included, but certainly not the only one).
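
To put rough numbers on that (my arithmetic, not from the post above): a maximum-size 64 KB datagram at a 1500-byte MTU splits into roughly 44 fragments. If each fragment is independently lost with probability 1%, the datagram only survives with probability 0.99^44 ≈ 0.64, i.e. an effective loss rate around 36%, versus the 1% you'd see for a datagram that fits in a single fragment. For small loss rates that works out to roughly (fragment count) × (packet loss rate), which is the "more or less linear" growth described above.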

It's probably more efficient to just have a snapshot of all objects for the last N time steps globally, and point into that from each client connection instance. Assuming you have more clients than you have steps of memory, at least.
100 clients, saving the last 10 steps of state, would be quite a win.
If you also need the old states for backwards-checking (Valve/Source/Counter-Strike networking model style) then you can re-use this!

Yep, that's a good idea. I'll just need to maintain a ring buffer of "views" for each client, recording which entities were sent unfrozen updates in each outgoing snapshot.

The whole freezing/unfreezing approach is a little more bandwidth-hungry (in terms of server upload) in bad cases than the individually-updated entities approach I proposed in my original post. If an object dances in and out of scope for a period of time, I'll potentially be sending a lot of unnecessary full updates that could otherwise have been delta-encoded. It also enforces a hard limit on my scope size. If everyone in the entire match decides to concentrate on one point and be on screen at once, I'll have pop-in/pop-out problems and will consume a lot of extra bandwidth.

However, based on what you've suggested, I think I can do a hybrid of my original idea (tracking when each individual entity was last updated for a client) without duplicating snapshots so much: keep this sort of per-client "view" structure while retaining a global master snapshot buffer.

One thing to keep in mind in this regard is that the larger your UDP datagram is, the more (usually 1.5K-or-so-in-size) fragments it will be split into. And as loss of any fragment leads to effective loss of the whole datagram, this will lead to increasing the datagram loss rate (compared to packet loss rate), IIRC, more or less linearly as your datagram size grows. I'd certainly try to keep away from 64K-sized datagrams (for many reasons, the above one included, but certainly not the only one).

I'm trying to stick with Quake-style 1400B packet payloads (with an extra few bytes of protocol overhead). Hence the need for aggressive scoping.

Thanks for the help!


The whole freezing/unfreezing approach is a little more bandwidth-hungry (in terms of server upload) in bad cases than the individually-updated entities approach I proposed in my original post


That's true in theory, but the cost of keeping old states of objects around forever for each client is rather punishing.
And if an object has been invisible for a long time, the "delta update" packet to bring it up to date should be the same size as the "reincarnate" packet to bring it up to date, assuming that you're only sending data that is actually likely to change over time. (If you send static data, you probably want to pre-install that and just send a reference to it once.)

Generally, a server will have some degree of hysteresis.
Let's say that the "visibility" algorithm is a simple "if you're within 500 meters, you can see it."
A server might actually implement that as something like:
If the object is visible AND it is further away than 550 meters AND it has been further away than 550 meters for the last three seconds, THEN remove it from the client.
If the object is invisible AND it is nearer than 500 meters AND it has been nearer than 500 meters for the last half second, THEN add it to the client.

That way, the churn of visible/invisible messages will be significantly reduced.
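
A sketch of those two rules as code (hypothetical field names; the 550 m / 3 s and 500 m / 0.5 s numbers are taken from the example above):

// Hysteresis bookkeeping for one client<->object pair.
struct VisibilityTracker {
    bool visible = false;
    float beyondRemoveRadius = 0; // seconds continuously farther than 550 m
    float withinAddRadius = 0;    // seconds continuously nearer than 500 m

    // Call once per tick; returns whether the object is replicated.
    bool Update(float distance, float dt) {
        beyondRemoveRadius = (distance > 550.0f) ? beyondRemoveRadius + dt : 0.0f;
        withinAddRadius    = (distance < 500.0f) ? withinAddRadius + dt : 0.0f;
        if (visible && beyondRemoveRadius >= 3.0f) visible = false; // remove rule
        if (!visible && withinAddRadius >= 0.5f)   visible = true;  // add rule
        return visible;
    }
};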

Various game engines and servers may use more sophisticated algorithms for visibility. If you support "selecting targets" in the GUI, you may want to make selected targets more sticky on the network. If you have a graphics engine with PVS, you may want to trace line-of-sight as an input. If you have a grouping mechanic, you may want to prioritize other players in your group, and perhaps targets/vehicles/objects of those other players, even if they are further away.

Networked game design is so interesting, because it's not just a question of shuffling the bits around, but it's a question of providing the best possible experience to the user, given some restricted budget of bits, and a healthy dose of uncertainty about those bits! Game design, engine construction, network protocols, and UI all have to be designed together to achieve a state-of-the-art user experience.
enum Bool { True, False, FileNotFound };

That's true in theory, but the cost of keeping old states of objects around forever for each client is rather punishing.

And if an object has been invisible for a long time, the "delta update" packet to bring it up to date should be the same size as the "reincarnate" packet to bring it up to date, assuming that you're only sending data that is actually likely to change over time. (If you send static data, you probably want to pre-install that and just send a reference to it once.)

I think I can do it with one snapshot buffer, as you said. In this approach, the server keeps one outgoing master snapshot buffer and, for each client, a "view table": a lookup table mapping each object id to the latest tick number the client has for that object. So it would look like this:

Every send (on server):

  • The server produces a master snapshot and stores it in the history buffer.
  • The server creates a scoped/prioritized list of entities to update for a given client.
  • For each entity:
    • Look up the last tick the client acknowledged for that entity in the view table.
    • If it's too old (or missing), send a full encoding from the newly-minted master snapshot.
    • Otherwise, retrieve the snapshot from that tick out of the master snapshot history buffer and delta-encode against it.

Every receive (on client):

  • For each state in the snapshot:
    • Add the received state to a per-entity jitter buffer
    • Update the client's view table for that entity to the frame of the received snapshot
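
Put together, a minimal sketch of this design (hypothetical names throughout; the actual bit-level encoding is elided):

#include <cstddef>
#include <cstdint>
#include <map>
#include <unordered_map>
#include <vector>

struct EntityState { /* replicated fields */ };

struct Snapshot {
    std::uint32_t tick = 0;
    std::unordered_map<std::uint32_t, EntityState> entities;
};

// Global master history shared by all clients, as suggested above.
struct MasterHistory {
    static const std::size_t N = 32;
    Snapshot steps[N];
    const Snapshot* Find(std::uint32_t tick) const {
        const Snapshot& s = steps[tick % N];
        return s.tick == tick ? &s : nullptr;
    }
};

// Per-client view table: entity id -> last tick the client reported for it.
using ViewTable = std::unordered_map<std::uint32_t, std::uint32_t>;

// Server side: choose full vs. delta per scoped entity.
void EncodeForClient(const MasterHistory& history, const Snapshot& current,
                     const ViewTable& view,
                     const std::vector<std::uint32_t>& scoped) {
    for (std::uint32_t id : scoped) {
        auto acked = view.find(id);
        const Snapshot* baseline =
            (acked != view.end()) ? history.Find(acked->second) : nullptr;
        if (!baseline || !baseline->entities.count(id)) {
            // Never seen, or the acked tick fell out of the history
            // ("too old"): full encode from `current`.
        } else {
            // Delta-encode current.entities.at(id) against the baseline entry.
        }
    }
}

// Client side: per-entity dejitter buffers plus the view table it reports.
struct ClientWorld {
    std::unordered_map<std::uint32_t, std::map<std::uint32_t, EntityState>> buffers;
    ViewTable viewTable; // included in C2S packets

    void OnStateReceived(std::uint32_t id, std::uint32_t tick,
                         const EntityState& s) {
        buffers[id][tick] = s;            // per-entity jitter buffer
        std::uint32_t& newest = viewTable[id];
        if (tick > newest) newest = tick; // what the server will learn
    }
};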

Now, there are two downsides to this:

1) The client has to report a view table to the server instead of a single frame ack in each send, which makes the C2S packet bigger. However, a view table is only going to be about 4B per entry, and download on the server is cheaper than upload.

2) The client has to keep one dejitter buffer per entity instead of a single snapshot-level dejitter buffer. This is more expensive, but the client isn't doing this at scale; it only has to maintain one view and one simulation. It also somewhat simplifies some code, I think, if I want to do something like spline-based interpolation later on.

Whereas the big advantage is that we don't necessarily need to update every entity every frame. We could de-prioritize smaller or less game-relevant entities so they only go out in every other snapshot, for instance (like in the "I Shot You First" Halo: Reach networking model). If there are more objects in scope than we can fit in the packet (i.e. everyone decides to meet in the center of the zone), we can transparently round-robin the updates and still delta-encode them, rather than having to freeze/unfreeze things intermittently.
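
One way to get that round-robin behavior is a priority accumulator; the thread doesn't pin down a mechanism, so this is a sketch under that assumption, with made-up names:

#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct ScopedEntity {
    std::uint32_t id = 0;
    float basePriority = 1.0f; // larger for bigger / more game-relevant entities
    float accumulated = 0.0f;
};

// Pick as many entities as fit the packet budget. Everything else keeps
// accruing priority and wins a slot in a later snapshot instead of being
// frozen outright, so nothing is starved indefinitely.
std::vector<std::uint32_t> SelectForPacket(std::vector<ScopedEntity>& scoped,
                                           std::size_t budget) {
    for (ScopedEntity& e : scoped) e.accumulated += e.basePriority;
    std::sort(scoped.begin(), scoped.end(),
              [](const ScopedEntity& a, const ScopedEntity& b) {
                  return a.accumulated > b.accumulated;
              });
    std::vector<std::uint32_t> picked;
    for (std::size_t i = 0; i < scoped.size() && picked.size() < budget; ++i) {
        picked.push_back(scoped[i].id);
        scoped[i].accumulated = 0.0f; // just sent; go to the back of the line
    }
    return picked;
}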

Generally, a server will have some degree of hysteresis.

Let's say that the "visibility" algorithm is a simple "if you're within 500 meters, you can see it."
A server might actually implement that as something like:
If the object is visible AND it is further away than 550 meters AND it has been further away than 550 meters for the last three seconds, THEN remove it from the client.
If the object is invisible AND it is nearer than 500 meters AND it has been nearer than 500 meters for the last half second, THEN add it to the client.

That way, the churn of visible/invisible messages will be significantly reduced.

Various game engines and servers may use more sophisticated algorithms for visibility. If you support "selecting targets" in the GUI, you may want to make selected targets more sticky on the network. If you have a graphics engine with PVS, you may want to trace line-of-sight as an input. If you have a grouping mechanic, you may want to prioritize other players in your group, and perhaps targets/vehicles/objects of those other players, even if they are further away.

This is a top-down game, so I'm interested in situations where the number of visible entities exceeds the possible scope. I don't want the scope limit to be a hard number for this reason. I think I can avoid hysteresis and hard bounds entirely if I allow for the "round-robin" update approach I describe above. Ultimately it's the worst case I have to design for.

I think I can avoid hysteresis and hard bounds entirely if I allow for the "round-robin" update approach I describe above.


That's why each game really needs to have the networking and the simulation and the game design evolve together! There is no "one size fits all" and your approach of making them a harmonious whole for exactly your game seems good.
enum Bool { True, False, FileNotFound };
You should think of the 'game state' not as the global game state for all clients, but as a separate game state for each client.

Each client will have a view model of each entity in the game.

Each view model will query its entity for a new state at a different rate (depending on interest).

The delta compression then occurs on each client view model.
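
For instance (my sketch; a "view model" here is just per-client, per-entity bookkeeping that polls at its own interval):

#include <cstdint>

// Hypothetical per-client, per-entity view model with its own poll rate.
struct EntityViewModel {
    std::uint32_t entityId = 0;
    std::uint32_t updateIntervalTicks = 1; // raise for low-interest entities
    std::uint32_t lastQueriedTick = 0;

    bool ShouldQuery(std::uint32_t tick) const {
        return tick - lastQueriedTick >= updateIntervalTicks;
    }
    // Delta compression then happens against the state this view model
    // last captured from its entity.
    void OnQueried(std::uint32_t tick) { lastQueriedTick = tick; }
};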

When a client initially connects or, more generally, requests a full game state, you can treat that as reliable data, because it will be a lot of data compared to frame updates (static objects, dynamic entities, various bits and bobs, but maybe not ephemeral entities like rockets...).

So, send initial data as one big reliable transmission, and start sending deltas once the client has acknowledged that 'full state'.
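
As a sketch of that connect flow (hypothetical phase names; the reliable transport itself is out of scope here):

#include <cstdint>

// Hypothetical per-client replication phases.
enum class ReplicationPhase : std::uint8_t {
    SendingFullState, // big reliable transfer of the initial game state
    AwaitingFullAck,  // repair/resend until the client confirms receipt
    StreamingDeltas,  // normal per-tick (unreliable) delta updates
};

struct ClientReplication {
    ReplicationPhase phase = ReplicationPhase::SendingFullState;

    void OnFullStateSent()  { phase = ReplicationPhase::AwaitingFullAck; }
    void OnFullStateAcked() { phase = ReplicationPhase::StreamingDeltas; }
    bool CanSendDeltas() const {
        return phase == ReplicationPhase::StreamingDeltas;
    }
};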

Furthermore, you can split your entities into groups of different game states, or contexts, which are independent from each other.

For example, a 'lobby' context that manages only lobby data, and a 'game' context that manages entities in the game. Or game 'shards': small islands of the game world sorted by locality. Entering a new game area? Negotiate with the host that you'd like to get a view of that area.

Fragmenting your game data that way can be useful. Generally, though, it's not required (e.g., Quake 3 doesn't do it).

Everything is better with Metal.

