with a time stamp, it is still quite likely that they arrive in a different order.
Yes, so you have to wait for some amount of time (200 milliseconds? Until you've heard from both?) until you declare a winner.
Continuing to send a "nothing pressed" status from both clients is quite useful, because it confirms that the clients are in fact visible to the server, and vice versa.
clocks drift and clocks on computers still represent an interval generally on the order of microseconds
Yes. The drift should be compensated for using the periodic updates the server sends out. NTP has this down pat, but you can do a more ghetto version on your own if you want.
Luckily, OP has a totally doable upper bound on precision:
if we're talking a few milliseconds then that's acceptable
So, one simple implementation would be:
- server broadcasts clock packets 10 times a second, each carrying a high-resolution server time stamp (based on QueryPerformanceCounter, CLOCK_MONOTONIC_RAW, or similar)
- clients broadcast state 10 times a second, with either "nothing pressed, last server timestamp was X" or "I saw a button, X amount of time after last timestamp, which was Y"
- server makes determination about winner when it has received updates from both clients from a time stamp AFTER the time claimed by at least one client (this will generally be within 100 milliseconds)
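As a rough sketch of the server-side rule in the list above, something like the following could work. The `BuzzerArbiter` class, its method names, and the use of integer milliseconds are all made up for illustration. It also assumes that "after the time claimed by at least one client" means after the earliest claimed press, and that a client which has pressed keeps sending updates with newer time stamps.

```python
class BuzzerArbiter:
    """Hypothetical server-side arbiter; times are server-clock milliseconds."""

    def __init__(self, client_ids):
        # Newest server time stamp each client has acknowledged so far.
        self.latest_ts = {cid: None for cid in client_ids}
        # First claimed press time per client, converted to server time.
        self.press_time = {}

    def on_state(self, client_id, last_server_ts, press_delay=None):
        """Handle one periodic state packet and return the winner, if decidable.

        press_delay is None for "nothing pressed"; otherwise it is how long
        after last_server_ts the client saw its button go down.
        """
        prev = self.latest_ts[client_id]
        if prev is None or last_server_ts > prev:
            self.latest_ts[client_id] = last_server_ts
        # Only the first press from a client counts (assumption).
        if press_delay is not None and client_id not in self.press_time:
            self.press_time[client_id] = last_server_ts + press_delay
        return self.winner()

    def winner(self):
        if not self.press_time:
            return None
        first = min(self.press_time, key=self.press_time.get)
        earliest = self.press_time[first]
        # Decide only once every client has reported against a time stamp
        # later than the earliest claim, so no earlier press can still arrive.
        if all(ts is not None and ts > earliest
               for ts in self.latest_ts.values()):
            return first
        return None
```

For example, if "b" reports a press 30 ms after time stamp 1000 and "a" then reports one 20 ms after the same time stamp, the arbiter returns nothing until both clients have sent an update carrying a time stamp later than 1020, and then picks "a".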
To compensate for jitter in receiving the time stamps, the clients should probably update their "server offset estimate" by perhaps 10% of the delta, rather than the full delta, for each received time stamp. (This is the "ghetto NTP alternative" option.)
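A minimal sketch of that smoothed offset update, assuming the client reads its own monotonic clock when each clock packet arrives. The `ServerClockEstimate` class and its method names are hypothetical, and unlike real NTP this ignores network latency entirely.

```python
class ServerClockEstimate:
    """Client-side running estimate of (server clock - local clock)."""

    def __init__(self, gain=0.1):
        self.gain = gain    # fraction of each delta to apply (the 10% above)
        self.offset = None  # no estimate until the first packet arrives

    def on_clock_packet(self, server_ts, local_ts):
        """Fold one received server time stamp into the estimate."""
        sample = server_ts - local_ts
        if self.offset is None:
            # First packet: nothing to smooth against, so take it as-is.
            self.offset = sample
        else:
            # Move only part of the way toward the new sample, so a single
            # late or early packet cannot swing the estimate much.
            self.offset += self.gain * (sample - self.offset)
        return self.offset

    def to_server_time(self, local_ts):
        """Convert a local clock reading to estimated server time."""
        return local_ts + self.offset
```

A lower `gain` rides out jitter better but follows real drift more slowly; 0.1 is just the 10% figure from the text, not a tuned value.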
Note: This doesn't immediately send a packet on button press, but instead sends a continual stream of state, and has the server make a determination when it can be as sure as possible. If the server needs to make a decision sooner, you have to trade "tolerance to delay" against "quickness of decision."