1 hour ago, d000hg said:
Great post, thanks!
Is it normal/desirable that every client and server works in terms of frame number i.e. a fixed time-step, regardless of rendering? So a client says "at frame 1234 X happened"?
The last multiplayer game I worked on had this, and of course the issue is the server receives updates from clients not always in sync. You don't want to lock-step everything (I think) as one person's slow connection lags everyone. But the server is potentially getting frame 1236 from client B before it has got 1234 from A. What happened here was that the server did two things in parallel (as I remember it):
- Apply the updates in the order they are received
- Apply them in the order of their frames
And then check if the states match, and retrace its steps to the frame things diverged. This is what I'm calling rollback and I hadn't seen it before. So the server would have a state at frame 1236 which was malleable and another at 1234 which was locked in.
You're still confusing things.
In a lockstep environment, the server receives client inputs and must apply them in the order of their frames. Anything else will cause a desync. This means the server can't advance the simulation past a frame until it has everyone's input for that frame, so it is always waiting on the slowest client. That is why lockstep doesn't scale to many users.
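To make the lockstep constraint concrete, here is a minimal sketch of a server-side input buffer that only simulates a frame once every client's input for it has arrived. The class name `LockstepServer`, its fields, and the string inputs are all made up for illustration; a real game would run its simulation step where this sketch just records the inputs.

```python
class LockstepServer:
    """Hypothetical lockstep buffer: a frame runs only when all clients have sent input for it."""

    def __init__(self, client_ids, first_frame=0):
        self.client_ids = set(client_ids)
        if not self.client_ids:
            raise ValueError("need at least one client")
        self.next_frame = first_frame   # next frame waiting to be simulated
        self.pending = {}               # frame -> {client_id: input}
        self.applied = []               # (frame, client_id, input) in simulation order

    def receive(self, client_id, frame, player_input):
        if frame < self.next_frame:
            return  # that frame was already simulated, so this is a late duplicate
        self.pending.setdefault(frame, {})[client_id] = player_input
        self._advance()

    def _advance(self):
        # Simulate every consecutive frame whose full set of inputs is present.
        while self.client_ids.issubset(self.pending.get(self.next_frame, {})):
            inputs = self.pending.pop(self.next_frame)
            for client_id in sorted(inputs):  # fixed order keeps every peer deterministic
                self.applied.append((self.next_frame, client_id, inputs[client_id]))
            self.next_frame += 1


server = LockstepServer(["A", "B"], first_frame=1234)
server.receive("B", 1236, "jump")   # arrives early, gets buffered
server.receive("B", 1234, "left")
print(server.next_frame)            # still 1234: stalled until A's input shows up
server.receive("A", 1234, "fire")
print(server.next_frame)            # 1235
```

Note how client B being two frames ahead buys nothing: the simulation is stuck at 1234 until A delivers.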
In a prediction-based, server-based network model (a.k.a. Quake's multiplayer), client inputs can be applied in any order. For responsiveness, though, you'll typically want to apply them in the order they're received (inputs aren't frame-numbered, but packets are still sequenced) and discard inputs that belong to older packets.
For example, if you receive packet 0, packet 2, and packet 1, in that order, then packet 1 should be ignored (unless you receive all of those packets at the same time, in which case you sort them first and apply them in order).
Discarding stale packets potentially means that if the user hit a button for just one frame and that packet gets lost or reordered, the server will never see that he pushed the button.
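The packet 0 / 2 / 1 example can be sketched as a small sequence filter. This is just an illustration: `PacketSequenceFilter` and `apply_batch` are hypothetical names, and the sketch ignores sequence number wrap-around, which a real protocol with fixed-width sequence numbers has to handle.

```python
class PacketSequenceFilter:
    """Hypothetical filter: accept a packet only if it is newer than anything seen so far."""

    def __init__(self):
        self.latest_seq = -1  # nothing received yet

    def accept(self, seq):
        if seq <= self.latest_seq:
            return False      # stale or duplicate packet: ignore its inputs
        self.latest_seq = seq
        return True


def apply_batch(seq_filter, packets):
    """Packets that arrived together, as (seq, payload) pairs: sort first, then filter."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    return [payload for seq, payload in ordered if seq_filter.accept(seq)]


# Packets arriving one at a time in the order 0, 2, 1: packet 1 is dropped.
one_by_one = PacketSequenceFilter()
print([one_by_one.accept(seq) for seq in (0, 2, 1)])   # [True, True, False]

# The same packets arriving in one batch: sorted first, so all three are applied.
batched = PacketSequenceFilter()
print(apply_batch(batched, [(0, "a"), (2, "c"), (1, "b")]))   # ['a', 'b', 'c']
```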
But that's rarely an issue because:
- Even over UDP, most packets arrive just fine most of the time.
- The user isn't fast enough to push a button for just 16.66ms.
- Button presses that need to be held down (like firing a weapon in a shooter, or moving forward) aren't a problem.
- Worst case, you can repeat the "button pressed" message in several packets, and the server applies a small cooldown so it doesn't act on the same press twice. Instead of a cooldown, the message can also say "I hit this important button 2 frames ago", and the server keeps a record to check whether it already handled that press. If it didn't, it handles it now.
- Alternatively, worst case, the user will push the button again.
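The "I hit this important button N frames ago" variant from the list above could look roughly like this on the server. Everything here is an assumption for the sake of the example: the name `ImportantButtonTracker`, the idea that each packet carries a client tick counter plus a `frames_since_press` field, and the unbounded `set` (a real server would prune old entries).

```python
class ImportantButtonTracker:
    """Hypothetical server-side record of which button presses were already handled."""

    def __init__(self):
        self.handled_press_ticks = set()  # client ticks whose press we already acted on
        self.actions = []                 # presses the server actually performed

    def on_packet(self, client_tick, frames_since_press):
        """frames_since_press is None when the packet reports no recent press."""
        if frames_since_press is None:
            return False
        press_tick = client_tick - frames_since_press
        if press_tick in self.handled_press_ticks:
            return False                  # redundant copy of a press we already handled
        self.handled_press_ticks.add(press_tick)
        self.actions.append(press_tick)   # "if it wasn't done, then we do it now"
        return True


tracker = ImportantButtonTracker()
# The client pressed the button on tick 100, but that packet was lost.
print(tracker.on_packet(101, 1))     # True: first time the server hears about it
print(tracker.on_packet(102, 2))     # False: same press, repeated for safety
print(tracker.on_packet(103, None))  # False: nothing to report
```

The client pays a few extra bytes per packet for a couple of frames, and the server acts on the press exactly once even if only one of the copies makes it through.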
To put it bluntly, a client-server Quake-style model is like a mother and her child. The child has a toy gun, but the toy only makes a sound when the mother pushes a button on a remote control in her hand. The kid fires his toy gun but nothing happens, then suddenly 5 seconds later the toy gun begins making sound. The child says "Why mom!?!? I pressed this button 5 seconds ago! Why is it only reacting now!?" And the mother replies: BECAUSE I SAY SO.
Client/server models are the same. The client says what it wants, but the server ends up doing what it wants. (Have you ever played a shooter where you're clearly shooting at an enemy but he doesn't die, and suddenly you're dead?)
Now, the internet is unreliable, but it isn't that unreliable. It's not chaos. Normally most packets arrive, and they arrive in order. When they don't, it's hard to notice, either because nothing relevant was happening or because the difference between what the client said it wanted and what the server ended up doing is hard to spot. This is further masked by client-side prediction (i.e. the weapon firing animation begins when the client pushes the button so it looks immediate, but enemies won't be hit until the server says so).
Errors only get really obvious when the ping is very high (> 400ms) or your internet connection goes really bad for a noticeable amount of time (e.g. lots of noise in the DSL line, an overheated router/modem, an overloaded ISP, an overloaded server, Wi-Fi connectivity issues, etc.), so that lots of packets get dropped or reordered until the connection quality improves again.
For more information, read the Gaffer On Games networking series, and read it several times (start with the articles at the bottom of the list and work your way up to the top).