when i did the multiplayer version of SIMTrek / SIMSpace i also went with lockstep and fixed update speed. it was necessary for the high degree of accuracy required by a hard core flight sim.
but i handled things a bit differently...
"if every player shares their input with every other player, the simulation can be run on everyone’s machines with identical input to produce identical output"
it's only necessary to share relevant state changes with other players to keep all players in sync. input can be processed locally, and just the pertinent results are sent to other players.
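a rough sketch of what "process input locally, send just the results" could look like. the ShipState record, the field layout and the function names are all made up for illustration; none of this is the actual SIMTrek / SIMSpace code.

```python
import struct
from dataclasses import dataclass

# Hypothetical wire format: entity id, tick, then position and velocity as
# little-endian 64-bit floats (doubles survive the round trip bit-for-bit).
_FORMAT = "<HI3d3d"


@dataclass
class ShipState:
    entity_id: int
    tick: int
    pos: tuple
    vel: tuple


def apply_input_locally(state, thrust, dt):
    """Run only on the machine that owns the ship: turn input into a new state."""
    vel = tuple(v + t * dt for v, t in zip(state.vel, thrust))
    pos = tuple(p + v * dt for p, v in zip(state.pos, vel))
    return ShipState(state.entity_id, state.tick + 1, pos, vel)


def encode(state):
    """Pack the resulting state (not the input) for transmission."""
    return struct.pack(_FORMAT, state.entity_id, state.tick, *state.pos, *state.vel)


def decode(payload):
    fields = struct.unpack(_FORMAT, payload)
    return ShipState(fields[0], fields[1], fields[2:5], fields[5:8])


def apply_remote(world, payload):
    """Run on every other machine: adopt the result, do no simulation math."""
    state = decode(payload)
    world[state.entity_id] = state
```

the receiving side never sees the thrust input and never repeats the arithmetic. it just overwrites its copy of the ship with whatever the owner computed.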
" lockstep is especially attractive to mobile developers because cellular and bluetooth connections can be extremely poor relative to the broadband connections that PC and console developers can generally rely upon."
i wrote my own transfer protocol. it was so robust you could unplug the phone line and plug it back in and the game would keep right on running without missing a beat, so lost packets were a non-issue. all said, at the end of the day, you can only ACK so many ACKs, then you just have to take it on faith that the packet got through. if it didn't, that's what auto-re-send and auto-re-sync are for.
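for anyone who hasn't built one, here is a minimal sketch of the ACK plus auto-re-send idea. it is not my original protocol; the class names, the `send` / `send_ack` / `deliver` hooks and the 0.25 second timeout are assumptions picked for the example, and re-sync is left out.

```python
class ReliableSender:
    """Keeps every packet until it is acknowledged; resends on timeout."""

    def __init__(self, send, resend_after=0.25):
        self.send = send                # hypothetical transport hook: send(seq, payload)
        self.resend_after = resend_after
        self.next_seq = 0
        self.unacked = {}               # seq -> (payload, time last sent)

    def submit(self, payload, now):
        seq = self.next_seq
        self.next_seq += 1
        self.unacked[seq] = (payload, now)
        self.send(seq, payload)
        return seq

    def on_ack(self, seq):
        self.unacked.pop(seq, None)     # duplicate acks are harmless

    def tick(self, now):
        """Call regularly; resends anything that has waited too long for an ack."""
        for seq, (payload, sent_at) in list(self.unacked.items()):
            if now - sent_at >= self.resend_after:
                self.unacked[seq] = (payload, now)
                self.send(seq, payload)


class ReliableReceiver:
    """Acks every packet; delivers each sequence number once, in order."""

    def __init__(self, deliver, send_ack):
        self.deliver = deliver          # hypothetical hook: deliver(payload)
        self.send_ack = send_ack        # hypothetical hook: send_ack(seq)
        self.expected = 0
        self.pending = {}

    def on_packet(self, seq, payload):
        self.send_ack(seq)              # ack duplicates too: the earlier ack may be lost
        if seq < self.expected or seq in self.pending:
            return
        self.pending[seq] = payload
        while self.expected in self.pending:
            self.deliver(self.pending.pop(self.expected))
            self.expected += 1
```

note the ACK itself is never ACKed. if it gets lost, the sender just resends, the receiver drops the duplicate and ACKs again.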
"The code that drives your game logic must be fully deterministic across all the machines that will play against each other. That is, the machines must run the exact same set of calculations based on the exact same set of inputs and produce the exact same results."
unnecessary if calculations are performed locally, and just the results are transmitted.
" in common with most lockstep games we share checksums of the game state between machines to detect desynchronisation and treat checksum mismatches similarly to network errors by stopping the game and displaying an appropriate message."
with a robust protocol, lost packets go away. by transmitting results, not input, you always get the same results on all machines. with no lost packets and the same results on all machines, you basically can't lose sync, so a checksum is unnecessary.
"One of the simplest but often forgotten steps that we took was to organize our code to make it obvious which systems needed to be fully deterministic for lockstep"
by transmitting results, not input, nothing has to be deterministic.
"Right from the start we made sure that our simulation used an independent random number generator from the rest of the game.... ...Obviously, the random number seed used by the simulator needs to be agreed upon by all machines and we did this by generating a seed from a checksum of the shared launch settings."
by transmitting results, nothing has to be deterministic, so separate random number generators with matching seeds for deterministic code sections are unnecessary.
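a small sketch of why the seeds don't have to match. the shot, the hull values and the function names are invented for the example: one machine rolls the dice, and everybody applies the outcome it sends.

```python
import random


def resolve_shot(rng, hit_chance):
    """Run on the shooter's machine only: roll locally, return the outcome."""
    return "hit" if rng.random() < hit_chance else "miss"


def apply_shot(hull, target_id, outcome, damage=10):
    """Run on every machine, the shooter's included, with the same outcome."""
    if outcome == "hit":
        hull[target_id] -= damage


# Two machines with deliberately different seeds.
shooter_rng = random.Random(1234)
peer_rng = random.Random(9999)          # never consulted for this shot

shooter_hull = {"enemy": 100}
peer_hull = {"enemy": 100}

outcome = resolve_shot(shooter_rng, hit_chance=0.5)   # decided exactly once
apply_shot(shooter_hull, "enemy", outcome)
apply_shot(peer_hull, "enemy", outcome)               # outcome arrives over the wire
```

both hull tables end up identical whatever the peer's generator would have produced, because the peer never rolls for this event.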
"Floating point numbers can present something of a problem"
if you perform floating point operations locally, then transmit just the results, floats are not an issue.
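a tiny illustration, with made-up helper names: two machines that add the same three numbers in a different order get different doubles, but if one machine sends its result as raw bytes, the other ends up with exactly the same value.

```python
import struct


def encode_result(value):
    # "<d" is a little-endian IEEE 754 double: fixed layout on every host
    return struct.pack("<d", value)


def decode_result(payload):
    return struct.unpack("<d", payload)[0]


# Same inputs, different evaluation order, as two compilers might produce.
machine_a = (0.1 + 0.2) + 0.3
machine_b = 0.1 + (0.2 + 0.3)
orders_disagree = machine_a != machine_b

# Machine A owns the calculation and sends the result; B adopts it as-is.
received_on_b = decode_result(encode_result(machine_a))
```

here `orders_disagree` is true, yet `received_on_b` equals `machine_a` exactly, since B copies the bits instead of redoing the sum.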
"Having decided not to use floating point math, we naturally decided to use fixed point math in its place."
by transmitting results not input, floats are not an issue, so fixed point is unnecessary.
"The main tool we used was that every time we step our simulation forward, we load the previous state, and step forward again. We then compared the two new states we produced and if there are any differences then it indicates a problem in our determinism. in order to achieve this we wrote code to serialize and deserialize the entire simulation state. "
sending results, not input, means nothing has to be deterministic, so all this is totally unnecessary.
"We found it was incredibly valuable to invest the time on code to let the computer run automated matches overnight... ...our overnight single-player tests caught lots of rare event desync bugs."
sending results over a robust protocol means no loss of sync, so no automated testing for desyncs is required.
"Despite all our care there were a small handful of desync bugs that slipped through the net."
sending results over a robust protocol basically means this can't happen.
"It is invaluable during development to have extensive debug logging, so that if a rare desync occurs you can pinpoint the cause without necessarily needing to repro. Our multiplayer logging in Rapture involved serialising and writing the entire state to a logfile every frame, along with the launch data structure and any input messages. "
with no loss of sync, no desync debug logging is required.
"We had an interesting bug caused by ambiguous sequencing that looked something like: MyFunction(myRNG.GetRand(), myRNG.GetRand());"
by sending results, this becomes a non-issue.
it's encouraging to see someone building a game with lockstep accuracy, as opposed to the typical sloppy prediction BS of non-lockstep. but you might want to consider sending results, and building a more robust transmission pipeline, as a way to vastly simplify your life.
by sending results over a robust protocol, all of the following become unnecessary:
1. sending all input to every player
2. deterministic code
3. checking sync by transmitting game state checksums
4. keeping track of deterministic vs non-deterministic code
5. a separate random number generator for the simulation
6. identical seed values for all random number generators
7. dealing with floating point issues between compilers and platforms
8. avoiding use of floating point
9. use of fixed point
10. serializing / de-serializing the entire game state
11. running update twice to check for sync errors
12. automated "burn-in" testing for sync errors
13. debug logging of sync errors
14. dealing with different evaluation orders on different compilers / platforms