That is some great info there! Thanks a bunch! I had the above in mind, but having someone verify it is great. I'll be re-reading this answer a few times as I work out the kinks of server to master server to client and back. Fingers crossed I get a better understanding of where bottlenecks happen so I have more educated answers to my questions.
Snapshot Interpolation
These are excellent points.
Latency is actually a lot more complicated than people realize. Not only do you have a server loop, but you'll have encoding time and other factors that can sneak a millisecond in here and there.
The approach we took was to run two threads: a game loop thread and a network thread. All the serialization/encoding happens in the network thread, so we have a pretty reliable loop latency. Typically we send data about 2-3 ms after the logic loop has completed. When the encode runs longer, we track the extra ms and try to catch up in subsequent frames that finish earlier. This helps to smooth out the arrival of packets on the client side.
We do something a bit goofy, but it works for us. The network library runs at a rate independent of the main loop, and it can also receive at a different rate than it sends. This means that if we set the main loop to 60 Hz, we can run the networking send loop at 20 Hz, so you get the precision of 60 Hz but the data reduction of fewer packets. The network receive can still happen at 60 Hz, which means you can receive RPCs fast enough for them to be meaningful in the game logic. http://kazap.io runs at 30 Hz but feels quite responsive despite being a web game that uses websockets and has a relatively low update rate (okay, I consider 30 Hz low...). Part of the reason it feels responsive is that we use local input and extrapolation to mask the extra latency. Lag manifests differently when you use extrapolation. Instead of a delayed response, you get sluggishness, which is preferable in some cases.
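To make the 60 Hz simulation / 20 Hz send split concrete, here is a minimal single-threaded sketch. It is not our actual implementation (which uses a separate network thread); the function name `snapshot_ticks` and the integer accumulator are assumptions chosen for illustration.

```python
def snapshot_ticks(total_sim_ticks, sim_hz=60, send_hz=20):
    """Return the simulation tick indices on which a snapshot would be sent.

    Hypothetical helper: the simulation steps every tick, while sends are
    paced by an integer accumulator so no floating-point drift builds up.
    """
    sent = []
    acc = 0  # send credit; one send is due each time it reaches sim_hz
    for tick in range(total_sim_ticks):
        # simulate(tick) would run here, at full sim_hz precision
        acc += send_hz
        if acc >= sim_hz:
            acc -= sim_hz
            sent.append(tick)  # encode and send the snapshot for this tick
    return sent


if __name__ == "__main__":
    print(snapshot_ticks(12))
```

With the default rates, one snapshot goes out for every third simulation tick, so one second of simulation (60 ticks) produces 20 sends.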
We use UDP and TCP. TCP has a lot of drawbacks but is otherwise fine for development, and when your frames encode to < minimum MTU it does behave a lot like an RUDP library. The key to making sure sends go out quickly over TCP is to disable Nagle's algorithm (by setting the TCP_NODELAY socket option).
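For reference, disabling Nagle's algorithm looks roughly like this with Python's standard `socket` module. The helper name `make_low_latency_socket` is made up for this sketch; the `setsockopt` call and constants are the standard ones.

```python
import socket


def make_low_latency_socket():
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY = 1 tells the kernel to send small writes immediately
    # instead of coalescing them while waiting for outstanding ACKs.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```

Most other languages expose the same option under the same name.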
To address that last point, which I think is the most important: Synchronous multiplayer is an illusion - focus on the appearance of synchronization, not the accuracy of it. Most of the time you'll be close enough that it won't be an issue, anyway.
Working on Scene Fusion - real-time collaboration for Unity3D (and other engines soon!)
It's pretty typical that networking runs at a different tick rate than simulation.
You will generally pack messages for multiple ticks / timestamps into a single networking packet, and exactly how the packets are scheduled (fixed-clock, minimum-size, adaptive, etc) varies between implementations and games.
when your frames encode to < minimum MTU it does behave a lot like an RUDP library
The whole problem with TCP is that it is in-order. If you drop packet N, the kernel won't deliver packet N+1 to the application until packet N can be re-transmitted and received. Most games would be much happier receiving packet N+1 as soon as possible, and may in fact not care about packet N at all once N+1 has arrived. Any good RUDP library will provide the latter semantic.
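As a rough illustration of the "latest wins" semantic, here is a small receiver sketch. The class name, the 16-bit sequence width, and the wrap-around comparison are assumptions for the example, not any particular library's API.

```python
def is_newer(seq, last, bits=16):
    """True if seq is ahead of last, allowing for sequence wrap-around."""
    mod = 1 << bits
    half = 1 << (bits - 1)
    return seq != last and ((seq - last) % mod) < half


class LatestStateReceiver:
    """Keeps only the newest state; late or duplicate packets are ignored."""

    def __init__(self):
        self.last_seq = None
        self.state = None

    def on_packet(self, seq, payload):
        # Accept packet N+1 right away, even if packet N never arrived.
        if self.last_seq is None or is_newer(seq, self.last_seq):
            self.last_seq = seq
            self.state = payload
            return True
        return False  # stale: a newer state has already been applied
```

Over TCP the application never gets to make this choice, because the kernel holds back N+1 until N has been retransmitted.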
Synchronous multiplayer is an illusion
Not if you use the "RTS" deterministic lockstep model, and accept a round-trip of latency for all commands.
1 hour ago, hplus0603 said: It's pretty typical that networking runs at a different tick rate than simulation.
Yes. In our case we run network receives at a different rate than sends as well in order to process RPCs as fast as possible.
1 hour ago, hplus0603 said: The whole problem with TCP is that it is in-order. If you drop packet N, the kernel won't deliver packet N+1 to the application until packet N can be re-transmitted and received. Most games would be much happier receiving packet N+1 as soon as possible, and may in fact not care about packet N at all once N+1 has arrived. Any good RUDP library will provide the latter semantic.
In our case we use previous packet data to predict current state, simply because of the number of objects we are describing. That means we need packets to always be in-order and guaranteed to arrive. If order does not matter, dropped packets won't matter either, and you don't need RUDP; regular UDP will suffice.
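One way to see why state built on previous packets needs ordered, guaranteed delivery is plain delta encoding. This is only a sketch of the general idea, not the actual encoding used here; the function names are hypothetical and key removal is deliberately not handled.

```python
def encode_delta(prev, curr):
    """Return only the entries of curr that differ from prev."""
    return {k: v for k, v in curr.items() if k not in prev or prev[k] != v}


def apply_delta(prev, delta):
    """Rebuild the current state from the previous state plus a delta."""
    out = dict(prev)
    out.update(delta)
    return out


if __name__ == "__main__":
    s0 = {"a": 1, "b": 2}
    s1 = {"a": 1, "b": 3}
    s2 = {"a": 4, "b": 3}
    d1, d2 = encode_delta(s0, s1), encode_delta(s1, s2)
    print(apply_delta(apply_delta(s0, d1), d2))  # both deltas applied in order
    print(apply_delta(s0, d2))                   # d1 lost: "b" stays stale
```

If a delta is lost or applied out of order, the receiver's state silently diverges, which is why this style of encoding wants a reliable, ordered channel.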
1 hour ago, hplus0603 said: Not if you use the "RTS" deterministic lockstep model, and accept a round-trip of latency for all commands.
True. I should have qualified that better: It is possible for each user to experience exactly the same simulation, just not at the exact same time.
It is possible for each user to experience exactly the same simulation, just not at the exact same time.
With deterministic lockstep, users actually (can) perceive the same events at the same time, within the limits of speed-of-light-in-copper.
However, perhaps the more important observation is that it's more important that users experience events in the same relative order, rather than "at the same time" (which, physically, isn't even a concept! :-)
@JM-KinematicSoup we have also catered for multithreaded serialisation, and I've done some checks: even on mobile we get something like 100-300 ticks and 0 ms. I have also added delta compression and quaternion compression to keep our packets as tiny as possible. Server-wise, we use RUDP for performance.
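For readers unfamiliar with quaternion compression, one common scheme is "smallest three": drop the largest component and quantize the other three. The sketch below assumes a unit quaternion given as a 4-tuple and 10 bits per component; the function names and bit width are illustrative choices, not necessarily what is used in the library discussed here.

```python
import math

LIMIT = 1.0 / math.sqrt(2.0)  # non-largest components lie in [-LIMIT, LIMIT]


def compress_quat(q, bits=10):
    """Return (index of dropped component, three quantized integers)."""
    largest = max(range(4), key=lambda i: abs(q[i]))
    # q and -q are the same rotation, so force the dropped component positive.
    sign = -1.0 if q[largest] < 0 else 1.0
    scale = (1 << bits) - 1
    packed = []
    for i in range(4):
        if i == largest:
            continue
        norm = (q[i] * sign + LIMIT) / (2.0 * LIMIT)  # map to 0..1
        norm = min(1.0, max(0.0, norm))               # guard rounding error
        packed.append(round(norm * scale))
    return largest, tuple(packed)


def decompress_quat(largest, packed, bits=10):
    """Rebuild a unit quaternion from the output of compress_quat."""
    scale = (1 << bits) - 1
    small = [p / scale * 2.0 * LIMIT - LIMIT for p in packed]
    big = math.sqrt(max(0.0, 1.0 - sum(s * s for s in small)))
    small.insert(largest, big)  # put the dropped component back in place
    return tuple(small)
```

With these settings a rotation fits in 2 + 3 × 10 = 32 bits instead of four 32-bit floats, at the cost of a small quantization error.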
I chose to go with snapshot interpolation rather than deterministic lockstep because it felt like a better technique for a generic multiplayer library. Even though lockstep might be OK, quoting Glenn Fiedler, it struggles with floating-point determinism, which won't help our library since it has to work multi-platform across PC, consoles and mobile. He also comments that it doesn't scale very well with a lot of players, as you can't afford to wait for all of them to send input before continuing.
Rather than extrapolation, which I will be trying anyway for predictions etc., I think state synchronisation might help my issue but it might also not.
Glenn's articles are a good place to start. Multiplayer is such a complicated topic that even in all his blog posts he only covers most of the basics. We are using snapshot encoding as well because we need to do dynamic physics, and it's easier and much more reliable to reconcile from a single authoritative source.
We can add a deterministic lockstep method later, however we will still have an authoritative sim running somewhere. We also will need to be able to sync in at any point without rerunning the sim from the beginning.
23 hours ago, Tipotas688 said: @JM-KinematicSoup we have also catered for multithreaded serialisation, and I've done some checks: even on mobile we get something like 100-300 ticks and 0 ms. I have also added delta compression and quaternion compression to keep our packets as tiny as possible. Server-wise, we use RUDP for performance.
What's a 'tick' in this context?
23 hours ago, Tipotas688 said: Rather than extrapolation, which I will be trying anyway for predictions etc., I think state synchronisation
This sounds like a false dichotomy. Extrapolation in this context is extrapolating from previously synchronised states. It's not an either/or thing. All these methods are all doing the same thing, just with different tradeoffs for latency/accuracy/synchronicity.
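A tiny sketch may make the point clearer: interpolation and extrapolation can be the same function applied to the same synchronised snapshots, differing only in whether the sample time falls inside or beyond the received data. The function name and the 1D positions are assumptions for illustration.

```python
def sample_position(snapshots, t):
    """Sample a 1D position at time t from (time, position) snapshots.

    snapshots must be sorted by time and contain at least two entries.
    If t lies between two snapshots the result is interpolated; if t is
    newer than the last snapshot it is extrapolated from the last two.
    """
    if len(snapshots) < 2:
        raise ValueError("need at least two snapshots")
    for (t0, p0), (t1, p1) in zip(snapshots, snapshots[1:]):
        if t <= t1:
            break  # found the bracketing pair
    # If the loop ran to the end, t0..t1 is the newest pair.
    alpha = (t - t0) / (t1 - t0)
    return p0 + (p1 - p0) * alpha
```

Rendering slightly in the past (alpha between 0 and 1) trades latency for accuracy; rendering ahead of the newest snapshot (alpha above 1) trades accuracy for latency.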
21 hours ago, Kylotan said: What's a 'tick' in this context?
This sounds like a false dichotomy. Extrapolation in this context is extrapolating from previously synchronised states. It's not an either/or thing. All these methods are all doing the same thing, just with different tradeoffs for latency/accuracy/synchronicity.
Sure, I meant that I'll try both and see which works best for my game. I just think that state synchronisation might yield better results, although honestly "cheating" with deterministic lockstep is the easiest way to go, and I'm worried that if I don't manage good results for pong, that could be the best alternative.
On 29/03/2018 at 8:22 PM, JM-KinematicSoup said: Glenn's articles are a good place to start. Multiplayer is such a complicated topic that even in all his blog posts he only covers most of the basics. We are using snapshot encoding as well because we need to do dynamic physics, and it's easier and much more reliable to reconcile from a single authoritative source.
We can add a deterministic lockstep method later, however we will still have an authoritative sim running somewhere. We also will need to be able to sync in at any point without rerunning the sim from the beginning.
Where can I go after that? The more I learn the more ideas I might come up with for my solutions.
2 hours ago, Tipotas688 said: I meant that I'll try both and see which works best for my game. I just think that state synchronisation might yield better results, although honestly "cheating" with deterministic lockstep is the easiest way to go, and I'm worried that if I don't manage good results for pong, that could be the best alternative.
Everything about game networking is 'state synchronisation'. Sometimes we transmit entire state snapshots, sometimes we transmit state changes/deltas, sometimes we transmit inputs so that the state change can be replicated, but it all results in essentially the same thing, i.e. the state on one machine is replicated on another machine. So when you say "I think state synchronisation might help my issue but it might also not", that doesn't really make sense.
When we talk about "deterministic lockstep", that's actually 2 concepts that are usually used together. First, a deterministic engine, which means that the same inputs will yield the same outputs on any machine. Second, a lockstep model, meaning that the simulation moves in discrete stages and each computer displays the same stage at broadly the same time. The first part means you can transmit less data, making it practical for larger states, as seen in RTS games. The second part means everyone sees exactly the same thing, given some inevitable delay between issuing a command and it getting replicated out to all the simulations. What this means is that a pong game won't see the benefits of the first part, and will actively suffer from the second part.
At this stage I think the problem is not so much about getting new ideas as about implementation. Most likely all your problems are at the implementation level rather than the conceptual level, but since we've seen no code or gameplay it's hard to know what they would be.
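To illustrate the two halves, here is a toy sketch: `step` is the deterministic part (integer-only, so it behaves identically everywhere), and `LockstepPeer` is the lockstep part (a turn only advances once every player's input for it has arrived). All names are hypothetical and networking is left out entirely.

```python
def step(state, inputs):
    """Deterministic update: each player's position moves by its input."""
    return tuple(pos + move for pos, move in zip(state, inputs))


class LockstepPeer:
    def __init__(self, num_players):
        self.num_players = num_players
        self.state = (0,) * num_players
        self.turn = 0
        self.pending = {}  # turn -> {player index: input}

    def receive(self, turn, player, move):
        """Record one player's input for a given turn."""
        self.pending.setdefault(turn, {})[player] = move

    def try_advance(self):
        """Advance one turn if all inputs are in; otherwise stall."""
        inputs = self.pending.get(self.turn, {})
        if len(inputs) < self.num_players:
            return False  # still waiting on someone: this is the latency cost
        ordered = [inputs[p] for p in range(self.num_players)]
        self.state = step(self.state, ordered)
        del self.pending[self.turn]
        self.turn += 1
        return True
```

Only inputs cross the wire, which is the bandwidth win for large states; the stall in `try_advance` is the input delay that a fast two-player game like pong would feel directly.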