
Accounting for lost packets?

Started January 04, 2018 07:53 PM
6 comments, last by hplus0603 6 years, 10 months ago

I'm trying to make a 2D multiplayer game. So far I've implemented client-side prediction and server reconciliation, and a method that keeps the player objects almost perfectly in sync, with at most a 1-pixel difference. It works like this:

  1. The client samples input, position and other variables 60 times per second and stores them as a snapshot, and stores the input in a packet buffer.
  2. Every 50 milliseconds (30 tickrate), the queued input packets are sent to the server with a timestamp and sequence number.
  3. When the server receives an input packet, it adds half of the round-trip time to the timestamp, as well as an extra 100 milliseconds, and stores the sequence number as the "last acknowledged packet".
  4. The inputs are stored in another buffer and are applied according to the server timer (albeit 100 ms late).
  5. The server also stores snapshots, but 30 times a second, and sends a packet with all the positions and relevant variables, along with the "last acknowledged packet".
  6. When the client receives the packet, it reapplies all the snapshots from the last acknowledged packet onwards (server reconciliation).
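For illustration, a minimal Lua sketch of how step 6 is commonly implemented (re-applying unacknowledged inputs on top of the server's state); the names `pendingInputs`, `applyInput`, and `player` are made up for the example, not taken from the post:

```lua
-- Hypothetical reconciliation sketch. pendingInputs holds inputs that
-- were sent but not yet acknowledged; player and applyInput are assumed.
local pendingInputs = {}

local function onServerState(msg)
    -- Adopt the authoritative position for the acknowledged tick.
    player.x, player.y = msg.x, msg.y

    -- Discard inputs the server has already processed.
    local remaining = {}
    for _, input in ipairs(pendingInputs) do
        if input.seq > msg.lastAckedSeq then
            remaining[#remaining + 1] = input
        end
    end
    pendingInputs = remaining

    -- Re-apply the unacknowledged inputs on top of the server state.
    for _, input in ipairs(pendingInputs) do
        applyInput(player, input)
    end
end
```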

This seems to work pretty well, but I'm concerned about packet loss and any desynchronization that might happen. I've tried faking packet loss and it appears that when the character strays too far away from the true server position, it doesn't "correct" itself and the client remains in the wrong position.

When I play games like Overwatch, sometimes I get terrible lag spikes and everything freezes for a while, characters begin flying to completely random positions until my connection gets better and then everything "jumps" to a correct state, and I'm not sure of how to account for situations like these.

Also, the 100ms delay seems necessary, as it makes sure the inputs are applied in the correct order and properly spaced between each other, but I'm not sure if it's a good idea. Is there anything else I might be forgetting? What am I doing wrong?

There are two ways to do networked games:

1) You (the client) are provided an initial state, and then, in order, each command given by each other player (including when the command was to be executed.) Your code is written to be deterministic; every player will run the exact same simulation with the exact same input, and will derive the exact same end state. This is not particularly common in FPS games, but very common in RTS games, and somewhere in between for RPGs. The main drawback is that debugging de-sync is a pain, and there is significant command latency between giving a command, and seeing the result (because you need all players' inputs for time T to actually show the output at time T.) RTS games hide this latency behind the "yes, sir!" animation.
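A minimal Lua sketch of that lockstep idea; `simulate` and the two-player count are assumptions for the example:

```lua
-- Hypothetical lockstep sketch: advance the simulation only when every
-- player's command for the current tick has arrived, so all peers step
-- through exactly the same deterministic state.
local PLAYER_COUNT = 2       -- assumed for the example
local commandsByTick = {}    -- commandsByTick[tick][playerId] = command
local currentTick = 0

local function haveAllCommands(tick)
    local cmds = commandsByTick[tick]
    if not cmds then return false end
    local n = 0
    for _ in pairs(cmds) do n = n + 1 end
    return n == PLAYER_COUNT
end

local function tryAdvance()
    while haveAllCommands(currentTick) do
        -- Apply in fixed player order so every peer simulates identically
        -- (pairs() iteration order is not deterministic in Lua).
        for playerId = 1, PLAYER_COUNT do
            simulate(playerId, commandsByTick[currentTick][playerId])
        end
        commandsByTick[currentTick] = nil
        currentTick = currentTick + 1
    end
end
```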

2) You (the client) are provided a stream of object events -- "start showing object," "object updated," and "stop showing object." The server figures out which objects are important to you, and tells you about them. It also figures out how wide your network pipe is, and updates changing state about objects every so often. The client then does what it can to display a "good enough" state of the world (this may include speculatively simulating physics for the objects.) However, when the server gives you an update that is different from what you speculated, you need to "fix" the state of the object to be consistent to what the server says. For small changes, this is easy to hide using interpolation and such. For bigger changes -- either in time, or in things like "did I get shot or not" or "did I trigger the bomb or not" -- this may be perceived as "lag" by the player.
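A sketch of what handling that event stream might look like on the client, in Lua; the message fields and the `objects` table are illustrative, not a real protocol:

```lua
-- Hypothetical handler for the three event kinds described above.
local objects = {}

local function onMessage(msg)
    if msg.kind == "create" then          -- "start showing object"
        objects[msg.id] = { x = msg.x, y = msg.y }
    elseif msg.kind == "update" then      -- "object updated"
        local obj = objects[msg.id]
        if obj then
            -- Store the server state as a target; rendering can blend
            -- toward it to hide small corrections.
            obj.targetX, obj.targetY = msg.x, msg.y
        end
    elseif msg.kind == "destroy" then     -- "stop showing object"
        objects[msg.id] = nil
    end
end
```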

Your implementation sounds like it's a variant of 2). Yes, you will de-synch, almost all the time, but usually very little. Your job on the client is to try to hide the small corrections, and at least make the game still possible to play when you get big corrections.

 

enum Bool { True, False, FileNotFound };

Just to add to what was said above:

  • The 100ms server-side delay seems like the wrong thing to do. Just apply the data when it arrives. If you ever receive a message that is out of order - i.e. you have already handled message 10, but now message 9 arrives - just drop it.
  • A client-side delay however is a useful tool to reduce the effects of varying transmission speeds (aka "jitter"). The idea is usually to treat any received data as applying to some future time, so that you will always have 1 or 2 future states to blend smoothly towards. Note that a fixed number of milliseconds after receipt is probably less optimal than a varying time after receipt, where the variance takes transmission time into account. If each message is stamped with the server's sending time, you can get an idea of which messages are arriving 'early' and which are arriving 'late', and also get a feel for how big the delay needs to be in order to cover this variation (see the sketch after this list).
  • If you're sending snapshots, and sending them less often than you collect them, bear in mind there's no point sending the earlier ones - their information is superseded by the newer data.
  • 50 milliseconds doesn't mean '30 tickrate' unless someone changed the duration of a second recently.
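One possible shape for that varying client-side delay, in Lua; the running-maximum statistic and the `schedule` function are assumptions for the sketch, not something prescribed above:

```lua
-- Hypothetical adaptive de-jitter buffer: measure (local receive time -
-- server send time) per message; clock skew cancels out because we only
-- compare offsets against each other. Play each message back late enough
-- to cover the worst recent offset plus a small margin.
local offsets = {}
local MAX_SAMPLES = 100

local function onReceive(msg, localTime)
    offsets[#offsets + 1] = localTime - msg.sendTime
    if #offsets > MAX_SAMPLES then table.remove(offsets, 1) end

    local worst = -math.huge
    for _, o in ipairs(offsets) do worst = math.max(worst, o) end

    -- schedule() is assumed: queue msg for application at this local time.
    schedule(msg, msg.sendTime + worst + 0.01)  -- 10 ms safety margin
end
```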
17 hours ago, hplus0603 said:

There are two ways to do networked games: [...]

I will make sure to keep this in mind. I know that to "fix" the state from the server, there needs to be a maximum amount of difference that is considered acceptable, and if the difference exceeds that, the client will correct itself. Is that correct?

10 hours ago, Kylotan said:

Just to add to what was said above: [...]

1. The reason I did the 100ms delay is that it ensures the inputs are played in the correct order and at the right times, with the precise delay between each key press, albeit 100ms late. It is the only solution I've found to a major problem: when the client presses a lot of keys repeatedly, the server's representation of the character used to desync horribly. This is an issue I've never seen acknowledged in any article, so it makes me wonder if it's my implementation that's wrong.

2. I've read about this but I'm not too sure about it; I know the Source engine does this by delaying rendering about 100ms behind the received snapshots, but wouldn't this cause very noticeable input lag? It seems like it would only work if the client represented the character by interpolating between snapshots, and that's not the way I'm doing it.

3. This makes sense; however, I feel it's good to send them anyway, so that if one of the packets is lost, the server has something to fall back on.

4. 33ms*, my bad.

Some amount of de-jitter delay on the server is often a good thing. 100 ms seems a bit much, but might be OK on your system.

What's important is to make sure that you simulate in discrete "ticks." A "tick" might be 1/100th of a second, 1/60th of a second, 1/30th of a second, or whatever. For each simulation tick on the client, there is some player input (which may just be nothing, or a repeat of what the previous tick was, or may be a new input state.) You need to send these inputs, in that order, to the server, so the server knows the relative spacing of those inputs. The server should then attempt to apply the inputs in the same order, at the same relative tick number. When something's lost, you can ignore those inputs, and the client will get corrected. When something arrives late, that's the same as "lost," although you should have a little bit of de-jitter buffer to account for this. Also, it's common to pack multiple input/simulation ticks into a single network packet -- simulate at 60 Hz, network at 15 Hz, packing 4 inputs in turn per packet. The packet size, transmission delay, and receive buffering will all add to input delay between client and server; this is unavoidable.
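A sketch of the "pack several input ticks per network packet" idea, in Lua; `sendPacket` and the field names are assumptions:

```lua
-- Hypothetical input packing: simulate at 60 Hz, network at 15 Hz, so
-- each packet carries the last 4 ticks of input in order.
local INPUTS_PER_PACKET = 4
local unsentInputs = {}

local function onSimulationTick(tick, input)
    unsentInputs[#unsentInputs + 1] = { tick = tick, input = input }
    if #unsentInputs >= INPUTS_PER_PACKET then
        sendPacket({ inputs = unsentInputs })  -- sendPacket() is assumed
        unsentInputs = {}
    end
end
```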

Then, when the server sends state back to players, the players always accept this state, with the caveat that the state may be "old" for the player character (because the player already simulated ahead since it sent the inputs.) Thus, you typically want to keep a log of old player state, so you can compare what you get from the server, and apply the delta when you receive the updates. On screen, you can choose to smoothly lerp between positions, or just "cut/jump" the character to the new spot, if the difference is too much. But the simulation state, itself, must be derived from what you get from the server at all times.
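A possible shape for that "lerp small errors, cut big ones" display rule, in Lua; the thresholds are made-up numbers to tune for your game, not recommendations from the post:

```lua
-- Hypothetical display correction: blend away small errors, snap on big ones.
local SNAP_THRESHOLD = 32  -- pixels: beyond this, cut/jump to the server spot
local BLEND_RATE = 10      -- roughly the fraction of error removed per second

local function applyCorrection(obj, serverX, serverY, dt)
    local dx, dy = serverX - obj.x, serverY - obj.y
    if math.sqrt(dx * dx + dy * dy) > SNAP_THRESHOLD then
        obj.x, obj.y = serverX, serverY
    else
        local t = math.min(1, BLEND_RATE * dt)
        obj.x, obj.y = obj.x + dx * t, obj.y + dy * t
    end
end
```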

Some additional bits: To support a simulation rate that is not an even multiple of the frame rate, you may wish to support interpolation for rendering. Or you can quantize to the "nearest" simulation step at each render frame. If you simulate at 200 Hz, and render at 60 Hz, that can work OK, for example, but with simulating at, say, 100 Hz, and rendering at 60, there will be jitter that some players will notice during movement.
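For example, render-side interpolation between the two most recent simulation states might look like this in LOVE2D; `prevState`, `currState`, and the `accumulator` (maintained by a fixed-timestep loop like the one sketched near the end of the thread) are assumed:

```lua
-- Hypothetical render interpolation: blend between the previous and
-- current simulation states by how far we are into the next tick.
local TICK = 1 / 60

function love.draw()
    local alpha = accumulator / TICK  -- 0..1: progress toward the next tick
    local x = prevState.x + (currState.x - prevState.x) * alpha
    local y = prevState.y + (currState.y - prevState.y) * alpha
    love.graphics.circle("fill", x, y, 8)
end
```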

Snapshots should never be sent from client to server, only from server to client. The server can also forward the inputs for each other player more often than snapshots, if you want to save bandwidth. Each client can re-simulate the state of each other player based on the snapshot data and subsequent control inputs.

It's not uncommon to simply send snapshots on a rotating basis; spread all entities over, say, a 3 second window, and snapshot them all during that time. If you send 15 Hz packets, that's 45 packets to send snapshots in, so if you only have 5 players, there are 8 packets without a snapshot for each packet with a snapshot. The packets also contain other-player input, as well as particular game events (say, explosions, or somesuch, that can affect gameplay for the player directly.) When there are NPCs or more players, there will be more snapshots to send.
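A sketch of that rotating schedule, in Lua; `makeSnapshot` is assumed, and the even spacing reproduces the 5-players-in-45-packets arithmetic above:

```lua
-- Hypothetical rotating snapshot schedule: each entity is snapshotted
-- once per 45-packet window (3 s at 15 Hz), evenly spaced. With 5
-- entities that is one snapshot every 9th packet.
local WINDOW = 45

local function snapshotsForPacket(packetNumber, entities)
    local out = {}
    local spacing = math.max(1, math.floor(WINDOW / #entities))
    for i, entity in ipairs(entities) do
        if packetNumber % WINDOW == ((i - 1) * spacing) % WINDOW then
            out[#out + 1] = makeSnapshot(entity)  -- makeSnapshot() is assumed
        end
    end
    return out
end
```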

You also typically want to send a snapshot when a player/player interaction happens, even if it's "out of order" with the scheduled updates. When two players collide on the server, you know that they will not have seen the same thing on each client, so it's best to send immediate updates to each of the affected players to make the delta time between collision and adjustment as small as possible.

enum Bool { True, False, FileNotFound };
15 hours ago, hplus0603 said:

Some amount of de-jitter delay on the server is often a good thing. [...]

That was all extremely insightful, thank you! Some of it I already knew about but the rest seem like good practices that I'll make sure to keep in mind.

My biggest concern is with inputs and ticks. I'm using the LOVE2D framework, and its update loop is either locked through vsync (usually 60 Hz), or it can be whatever if vsync is disabled. I think it checks for inputs more often than 60 times per second, so when a new tick happens, there has definitely been more than one input. This is why I get a huge desync on the server side if I press many keys repeatedly, and the buffer with the 100ms delay seemed to be my solution. One thing I've thought of to fix this problem without delaying the packets' times is to limit how many times per second input is sampled on the client. For example, 30 times a second (same as the tickrate), which shouldn't show any significant input lag and would make sure there's only one input sent every tick. Is this an acceptable solution?
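For what it's worth, a per-tick sampling sketch for LOVE2D might look like this; reading the keyboard state once per simulation tick yields exactly one input record per tick:

```lua
-- Per-tick input sampling: read love.keyboard.isDown once per simulation
-- tick instead of reacting to every key event, so each tick produces
-- exactly one input record.
local function sampleInput(tick)
    return {
        tick  = tick,
        left  = love.keyboard.isDown("left"),
        right = love.keyboard.isDown("right"),
        up    = love.keyboard.isDown("up"),
        down  = love.keyboard.isDown("down"),
    }
end
```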

About the tickrate = framerate equivalency: I don't plan to implement variable tickrates, as it's a simple enough game that it won't need anything over 60 Hz, but I'll remember that if I ever have to.

I'm never sending snapshots from client to server, although I'm thinking of implementing P2P between specific clients to send some data, as my game relies on duos of players cooperating with each other, and I'm going to see if P2P connections for some types of data are better than routing everything through the server first.

The rotating snapshots concept is interesting; I've never heard of it. So I would just simulate the other players on the client using their inputs, and correct them every time the server sends a snapshot?

And as for the last part, this is something that makes a lot of sense, thank you!

You can still establish whatever tick rate you want, as long as you realize that some ticks may have zero inputs, or more than one input. Simply keep your own timer, and advance it by 0, 1, or more ticks each time through the main loop, based on what the time actually is, and what your tick rate is.
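A minimal fixed-timestep loop for LOVE2D along these lines; `sampleInput` and `simulateTick` are placeholders:

```lua
-- Fixed-timestep sketch: love.update runs at whatever rate the frame
-- loop gives us, but the simulation always advances in exact 1/60 s
-- ticks: zero, one, or several per frame.
local TICK = 1 / 60
local accumulator = 0
local tickNumber = 0

function love.update(dt)
    accumulator = accumulator + dt
    while accumulator >= TICK do
        local input = sampleInput(tickNumber)  -- one input record per tick
        simulateTick(tickNumber, input)        -- simulateTick() is assumed
        accumulator = accumulator - TICK
        tickNumber = tickNumber + 1
    end
end
```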

Note that you can't send inputs for tick T from the client to the server until the client sees the beginning of tick T+1, because otherwise the next time through the main loop may still be within the time period of tick T.

The other drawback, if the vsync is not well matched to your desired tick rate, is that you will get some jitter. Not all screens are 60 Hz. Some screens are even variable-rate (G-sync comes to mind.)

enum Bool { True, False, FileNotFound };

This topic is closed to new replies.
