Thanks! I recently read this article: https://link.springer.com/article/10.1007/s00530-012-0271-3 and I think it helped me clarify the point I was trying to make.
Summarizing here to see if it makes sense. In 4.3.1 they describe a model for "Time Offsetting Techniques", which seems to match the intent of a lag compensation model coupled with client-side prediction. This solution essentially assumes two frame times: the frame associated with your own character, which is where you've predicted yourself to be when you choose to shoot, and the frame associated with all the remote objects you're seeing, which is the last frame the server sent you.
The reason for having two frames is that the server has all the latest information (relative to you) about your remote objects, and it packaged and sent that to you. However, you have some inputs that the server does not have yet. Therefore, when you render "Frame 5" on your screen, that frame shows all the remote objects as they were at frame 5, but your own character potentially at frame 10. When the server receives the packet saying that you shot, it needs to reconcile two things: where the remote objects were at frame 5, and where you were at frame 10. That lets the server accurately reconstruct what you were seeing when you shot and resolve the hit appropriately.
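To check that I'm picturing this correctly, here's a rough sketch of what I imagine the client would send when it fires. All the names and numbers are mine, not from the article; the point is just that the packet carries both frame numbers.

```typescript
interface Vec3 { x: number; y: number; z: number; }

// What I imagine the client sends when it shoots: not just "I shot", but the
// two frame numbers needed to reconstruct what it was looking at.
interface ShotPacket {
  remoteSnapshotFrame: number; // last server snapshot being rendered (frame 5 here)
  localInputFrame: number;     // frame the local player has predicted itself to (frame 10 here)
  aimDirection: Vec3;
}

function buildShotPacket(lastServerFrame: number, predictedLocalFrame: number, aim: Vec3): ShotPacket {
  return {
    remoteSnapshotFrame: lastServerFrame,  // "where everyone else was on my screen"
    localInputFrame: predictedLocalFrame,  // "where I was when I pulled the trigger"
    aimDirection: aim,
  };
}

// Example: remotes rendered at frame 5 while I've predicted myself to frame 10.
const packet = buildShotPacket(5, 10, { x: 0, y: 0, z: 1 });
```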
Is this what you meant when you said "your input at frame (5 + transmission latency)"? As in, 5 represents the frame all your remote objects were at, and the transmission latency plus any buffer tells you which frame you had predicted yourself to when you chose to shoot? With both pieces of information, the server knows how to reconcile things appropriately. This could be likened to "combining Option 1 and 3" and storing two frame numbers.
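And my mental model of the server-side reconciliation is roughly the sketch below. The history buffer and hit test are invented placeholders, not real APIs; I only want to show which frame number gets used for what.

```typescript
interface Vec3 { x: number; y: number; z: number; }
interface EntityState { id: number; position: Vec3; }

// Per-frame world history the server keeps around so it can "rewind".
const frameHistory = new Map<number, EntityState[]>();

// Crude hit test: does a (normalized) ray from `origin` pass within `radius` of `target`?
function rayHits(origin: Vec3, dir: Vec3, target: Vec3, radius = 0.5): boolean {
  const t =
    (target.x - origin.x) * dir.x +
    (target.y - origin.y) * dir.y +
    (target.z - origin.z) * dir.z;
  if (t < 0) return false;
  const dx = target.x - (origin.x + dir.x * t);
  const dy = target.y - (origin.y + dir.y * t);
  const dz = target.z - (origin.z + dir.z * t);
  return dx * dx + dy * dy + dz * dz <= radius * radius;
}

// Resolve a shot using BOTH frame numbers the client reported.
function resolveShot(
  shooterId: number,
  remoteSnapshotFrame: number, // frame the client was rendering everyone else at (e.g. 5)
  localInputFrame: number,     // frame the client had predicted itself to (e.g. 10)
  aim: Vec3
): number | null {
  // Targets: rewind to the snapshot the client was actually looking at.
  const targets = frameHistory.get(remoteSnapshotFrame) ?? [];
  // Shooter: use its state at the predicted frame (the server can simulate it
  // once the inputs up to that frame arrive alongside the shot).
  const shooter = (frameHistory.get(localInputFrame) ?? []).find(e => e.id === shooterId);
  if (!shooter) return null;

  for (const target of targets) {
    if (target.id !== shooterId && rayHits(shooter.position, aim, target.position)) {
      return target.id; // hit resolved against "what the client saw"
    }
  }
  return null;
}
```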
What I'm seeing here is that maybe my breakdown of the "options" wasn't the best way to put it. At the end of the day, the key is to think about the timeline of each object in any particular player's or the server's game state. At any given time, I render a single frame based on my knowledge of where the remote objects and my own player are. For remote objects I technically hold stale data, but I hold perfect data for my own player's inputs. The server holds perfect data about the world, but stale data for every player's inputs. So as a dev, I'm choosing how to reconcile all of that each time I render one frame.
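Concretely, the asymmetry I mean looks something like this (field names and frame numbers are made up for illustration):

```typescript
// The "timelines" framing as I picture it: every simulation tracks, per object,
// the newest frame it actually has real data for.
interface TimelineEntry {
  objectId: number;
  newestKnownFrame: number;
}

// Client, currently rendering its physical frame 10:
const clientKnowledge: TimelineEntry[] = [
  { objectId: 1, newestKnownFrame: 10 }, // local player: I have my own inputs up to "now"
  { objectId: 2, newestKnownFrame: 5 },  // remote player: only as fresh as the last snapshot
];

// Server, also at frame 10, but inputs arrive late from every client:
const serverKnowledge: TimelineEntry[] = [
  { objectId: 1, newestKnownFrame: 6 }, // my inputs have only arrived up to frame 6
  { objectId: 2, newestKnownFrame: 7 }, // the other client's inputs up to frame 7
];
```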
One option for clients, as described above, is to render the stale data for opponents but perfect input for myself. This gives the server an easier reconciliation job and enables what's known as "lag compensation". Alternatively, I can predict everything, including remote objects, which is some form of full prediction. Or I could delay my own inputs so that everything I render, including my own actions, is equally "stale" and therefore consistent. On the server, I could reconcile the client's stale data in a way that favors the client, or wield absolute power and use my perfect data to make the decisions. None of these choices is strictly correct in the grand scheme of things, because networking is annoying, but depending on the game, any of them can make it feel fine 99% of the time. It all boils down to reconciling the different logical frames (different timelines) of remote and player objects in order to render the actual physical frame on the screen.
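If it helps, here's how I'd summarize those client-side choices in code. The policy names are my own labels, not standard terminology or anything from the article.

```typescript
// Three ways a client can pick which logical frame to draw each kind of object at.
type RenderPolicy =
  | "predict-self-only"   // stale remotes + predicted local player -> server does lag compensation
  | "predict-everything"  // extrapolate remote objects too (full prediction)
  | "delay-own-input";    // buffer my inputs so everything I render is equally "stale"

interface WorldView {
  lastSnapshotFrame: number;   // freshest frame the server has sent me
  localPredictedFrame: number; // freshest frame my unacknowledged inputs reach
}

// For one physical frame on screen, which logical frame does each timeline get drawn at?
function framesToRender(view: WorldView, policy: RenderPolicy) {
  switch (policy) {
    case "predict-self-only":
      return { remotes: view.lastSnapshotFrame, self: view.localPredictedFrame };
    case "predict-everything":
      return { remotes: view.localPredictedFrame, self: view.localPredictedFrame };
    case "delay-own-input":
      return { remotes: view.lastSnapshotFrame, self: view.lastSnapshotFrame };
  }
}
```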
A long wall of text later, does this match expectations?