UDP - Custom Checksum or Built-In?
For detecting accidental hardware data corruption, is it worthwhile to implement my own checksum, or is the UDP header checksum for IPv4/IPv6 sufficient? If I should implement my own, what do you recommend? CRC32?
In my opinion checksums do not make sense at the application level, as you would be trying to compensate for a hypothetical data corruption event which is out of your control, not local to the user's system, and not specific to your application or game. Furthermore, you can't handle it correctly anyway: if there is corruption somewhere along the link, it should be detected by the link-layer nodes immediately surrounding the faulty link, which can then route around it. Triggering an application-level resend just because a checksum failed is simply not the correct way to handle things.
Do note that the IPv4/IPv6 checksums do not cover the actual packet data; they only cover the header, and are mostly there to protect the router/switch from doing stupid things, not to protect your data. The real checksum magic happens at e.g. the Ethernet or PPP layers, and these layers implement rather strong error detection and correction schemes because they are closest to the hardware. There is also the TCP checksum, which helps a tiny bit but is not as good as the lower-layer checksums; that checksum is in my opinion misplaced, but it is redeemed by TCP being widely used, so it's not completely worthless. In practice it is extremely rare for corrupt frames to escape into the IP layer, so it's not something you need to worry about for most applications.
Also do note that checksums are virtually useless against someone maliciously modifying a packet's contents, so they can only deal with accidental software or hardware faults.
Also do note that checksums are virtually useless against someone maliciously modifying a packet's contents, so they can only deal with accidental software or hardware faults.
I'm thinking about the checksum mostly for hardware issues along the way. Malicious tampering would be beyond the scope of the checksum's protection. In the case of a bad checksum, I would just drop the packet (since I have a fairly robust reliability layer). I'd rather drop the packet (repeatedly, if necessary, ultimately resulting in that user timing out) than allow strange values to corrupt my simulation.
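To make that drop-on-bad-checksum idea concrete, here is a minimal sketch in C# with a plain CRC-32 appended to each packet (the PacketCheck name and layout are illustrative choices of mine, not from anyone's actual code; the CRC itself is the standard IEEE/zlib polynomial, computed bitwise for brevity):

    using System;

    static class PacketCheck
    {
        // Bitwise CRC-32 (IEEE polynomial 0xEDB88320, reflected form).
        // A table-driven version is faster; this is the shortest correct form.
        static uint Crc32(byte[] data)
        {
            uint crc = 0xFFFFFFFFu;
            foreach (byte b in data)
            {
                crc ^= b;
                for (int i = 0; i < 8; i++)
                    crc = (crc & 1) != 0 ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
            }
            return ~crc;
        }

        // Returns the payload if the trailing 4-byte CRC matches, else null
        // (caller drops the packet and lets the reliability layer resend).
        public static byte[] VerifyAndStrip(byte[] packet)
        {
            if (packet.Length < 4) return null;
            byte[] payload = packet[..^4];
            uint stored = BitConverter.ToUInt32(packet, packet.Length - 4);
            return Crc32(payload) == stored ? payload : null;
        }
    }

On the sending side you would append BitConverter.GetBytes(Crc32(payload)) so the byte order of the stored CRC matches what VerifyAndStrip reads back.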
Do note that the IPv4/IPv6 checksums do not cover the actual packet data, they only cover the header and are mostly there to protect the router/switch from doing stupid things, not to protect your data. ... It is in practice extremely rare for corrupt frames to escape into the IP layer, so it's not something you need to worry about for most applications.
I wasn't aware the checksum only covered the header. However, if corruption in the wild is extremely rare, I won't worry about it. Eventually if I want to add shared-key encryption or something (is this actually practical for most moment-to-moment game data?), I'll probably add an encrypted checksum as part of the protection.
Are there any other data safety things I should be adding to my packets? Currently I send them more or less "naked". My five header bytes are just the message type, two bytes for a ping timestamp, and two bytes for a pong timestamp. The rest is just raw game state data. Eventually I'll have an initial handshake packet that checks against a known hail message, and probably includes a protocol version number once I'm a little more stable.
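For what it's worth, that five-byte header could be laid out like this (a minimal sketch; the field names, big-endian byte order, and the BuildPacket helper are mine for illustration, not from your code):

    // Sketch of the 5-byte header described above:
    // [0] message type, [1-2] ping timestamp, [3-4] pong timestamp.
    static byte[] BuildPacket(byte messageType, ushort pingTime, ushort pongTime, byte[] payload)
    {
        var buffer = new byte[5 + payload.Length];
        buffer[0] = messageType;
        buffer[1] = (byte)(pingTime >> 8);   // big-endian ping timestamp
        buffer[2] = (byte)pingTime;
        buffer[3] = (byte)(pongTime >> 8);   // big-endian pong timestamp
        buffer[4] = (byte)pongTime;
        payload.CopyTo(buffer, 5);           // raw game state follows the header
        return buffer;
    }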
I wasn't aware the checksum only covered the header. However, if corruption in the wild is extremely rare, I won't worry about it. Eventually if I want to add shared-key encryption or something (is this actually practical for most moment-to-moment game data?), I'll probably add an encrypted checksum as part of the protection.
I don't know about practical (probably?), but it doesn't seem very useful. The one thing you strictly need to secure is player authentication; once you've mapped UDP sockets to player identities, you're pretty much done. Packets containing game state info don't really need to be encrypted and strongly authenticated: a cheater should be detected by server-side verification, and the client should only get good data from the server side. As for a man-in-the-middle eavesdropping on and interfering with the client-server communication (also known as "not your typical hacker"), the game network packets don't really have a lot of value in and of themselves, so there is nothing to protect. I mean, you can if you want... but players won't care, and to be honest, if someone really cares that much about the feature they can always tunnel your game's network traffic through a VPN or something.
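As a minimal sketch of what "mapping UDP sockets to player identities" might look like on the server side (the SessionTable name and types are hypothetical, not from any real library):

    using System.Collections.Generic;
    using System.Net;

    class SessionTable
    {
        // After a successful login handshake, remember which endpoint
        // belongs to which player; unknown endpoints are simply ignored.
        readonly Dictionary<IPEndPoint, int> _players = new();

        public void Register(IPEndPoint endpoint, int playerId) =>
            _players[endpoint] = playerId;

        // Returns false for packets from endpoints we never authenticated.
        public bool TryResolve(IPEndPoint endpoint, out int playerId) =>
            _players.TryGetValue(endpoint, out playerId);
    }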
I would probably do it only if you have CPU cycles and developer time to spare, and are confident you'll get it right (correctly and reliably encrypting a UDP link is not as easy as it is with TCP). Make sure to do it after you've debugged your network code, so that you don't have your encryption getting in the way of your debugging.
Other things you might want to secure, though, include player account operations, player-to-player communications (often this feature can be provided by a middleware service or library, e.g. Steam chat, or you can just put something together over HTTPS/TLS), and of course in-game microtransactions, that kind of stuff. But I definitely don't think everything needs to be encrypted.
As for the protocol, what would you add? Game network protocols tend to be kept simple to cut down on bandwidth: usually just a tiny header with the message type (and a message length, for TCP) and any other needed stuff (like your timestamps), followed by some serialized payload.
Do note that the IPv4/IPv6 checksums do not cover the actual packet data; they only cover the header, and are mostly there to protect the router/switch from doing stupid things, not to protect your data.
This is true for the IP checksum, but the UDP header actually also has a checksum, and that one covers a pseudo-header taken from the IP header, the UDP header, and the payload. The UDP checksum is optional on IPv4 (it is mandatory on IPv6), but most operating systems use it. That said, I agree that the link layer also has some good error detection/correction mechanisms, so you can safely assume that when you receive a UDP packet it is error free. UDP packets are either dropped or delivered without error.
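The UDP checksum is the standard 16-bit ones' complement Internet checksum. As a rough sketch of the core algorithm (over an arbitrary buffer here; the real computation also folds in the pseudo-header assembly, which this deliberately skips):

    // 16-bit ones' complement Internet checksum (RFC 1071 style).
    // The real UDP checksum runs this over pseudo-header + UDP header
    // + payload; here we just show the core sum over a buffer.
    static ushort InternetChecksum(byte[] data)
    {
        uint sum = 0;
        for (int i = 0; i + 1 < data.Length; i += 2)
            sum += (uint)((data[i] << 8) | data[i + 1]);
        if (data.Length % 2 == 1)
            sum += (uint)(data[^1] << 8);   // pad the odd final byte
        while ((sum >> 16) != 0)            // fold carries back into 16 bits
            sum = (sum & 0xFFFF) + (sum >> 16);
        return (ushort)~sum;
    }

Because it is just a folded 16-bit sum, reordered 16-bit words and offsetting bit flips cancel out, which is part of why this checksum is so much weaker than a CRC.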
I don't think it's much of an issue. Generally, you will want some kind of checksum (or rather a MAC) to be on the safe side where it matters, such as when downloading a huge file, or for making sure your executables and data files are undamaged and untampered with.
But... as part of your application-level network protocol... no.
Ethernet typically has something like a 10^-8 to 10^-10 bit error rate (depending on what kind of network, and depending on whom you ask; some claim you can expect 10^-12 on your LAN, but... whatever). The IEEE 802 functional requirements as of 1991 require (5.6.1) 10^-8 or better for a device to be compliant, so 10^-10 is probably not too unreasonable an expectation today.
Anyway, seeing how your traffic goes over the internet and you don't know how barely standards-conforming cables somewhere on the internet may be, I will assume the worst case, 10^-8.
Ethernet frames are terminated by a 32-bit CRC which is guaranteed (neglecting collisions) to catch all single-, double-, and triple-bit errors. That doesn't mean that 4-bit or 5-bit errors cannot (or will not) be detected; it just isn't guaranteed by the mathematical model. This means that (neglecting collisions), in order to have a "silent" bit error, i.e. one that isn't directly discarded by the hardware and actually makes it to the IP layer, you need to have at least 4 bit errors in one frame.
Assuming no jumbo frames (internet, eh!) you have a maximum of 1,500 bytes, or 12,000 bits, in a frame (well, a bit more, something like 1,536 bytes or such... but it makes no difference). In order to encounter a single silent bit error in that frame, you thus need 4 bit errors happening among 12,000 bits on the wire; 4/12,000 = 1/3,000 is quite a different number from 10^-8. Also, if you aren't doing bulk transfers, your frames will usually be smaller than the maximum size, so it's even less likely for this to happen (same number of bit errors on the wire, but more frames, more checksums, more interpacket gaps).
Or, look at it from the opposite side. A gigabit Ethernet link can push 81,247 frames over the wire each second (that's for maximum-sized frames; the likelihood of getting erroneous bits is smaller with smaller frames, both because you have more checksums and because the interframe gaps, which are always 96 bits, relatively grow in size). The 81,246 interpacket gaps correspond to 7,799,616 bits (which are harmless, the network card doesn't look at them). If you count in preambles and destination MACs, which is reasonable because they are "harmless", too (any bit error there and the packet will simply not arrive), it's 16,899,376 bits (about 1.7%) which can contain errors and be totally harmless to begin with.
At a BER of 10^-8, a billion bits will contain 10 error bits. Actually only 9.9 bits if we consider the "harmless" locations; there is about a 10% chance that we only have 9 bits to deal with (the tenth disappears into an interpacket gap). But let's be pessimistic and say 10. That's up to 10 affected frames, but in that case the CRC is guaranteed to pick them up; or, if we assume a "somewhat malicious" clustering of our bad bits so that 4 of them make it into a single frame, we have a maximum of 2 frames where it's not guaranteed that the CRC will catch the error. IEEE 802 specifies (5.6.2) a maximum tolerable likelihood for that case too, which is 10^-14.
(cough) On the other hand, if we assume that the bad bits cluster up in such a way, we should also consider that there's a fair chance they disappear altogether into a single interpacket gap, too... :)
So, we have 2 out of 81,247 frames (0.002%) that are problematic. They contain an error, and we don't know for sure that the network card will discard the frame (it probably will, but we don't know... there is a 10^-14 chance that it won't). That's in the worst, theoretical case, on a network that only just barely conforms to a 25-year-old standard.
Unless you plan to do a transmission that saturates your gigabit link for about 1.58 million years, that 10^-14 chance should be no biggie; you can let the TCP or UDP checksum deal with it.
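If you want to replay that last step, the back-of-envelope arithmetic looks like this (just the figures from above re-done as C# top-level statements, nothing more):

    using System;

    // Back-of-envelope check of the figures above.
    const double undetectedPerBadFrame = 1e-14; // IEEE 802 (5.6.2) bound
    const double suspectFramesPerSecond = 2;    // worst-case clustering, from above

    double secondsPerSilentError =
        1.0 / (suspectFramesPerSecond * undetectedPerBadFrame); // 5e13 seconds

    double years = secondsPerSilentError / (365.25 * 24 * 3600);
    Console.WriteLine(years); // ~1.58 million years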
If you implement cryptography on top of UDP, you may be able to get "checksumming" for free, by having the sender sign each packet and the receiver discard any packet that doesn't match its signature.
Note that a signature doesn't HAVE to be 20-32 bytes of extra data; if you're OK with an easier-to-break signature scheme (just making real-time game attacks harder rather than protecting authenticity against state actors), you can use just the last 4-8 bytes of the HMAC hash.
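As a concrete sketch of that truncated-MAC idea, assuming a shared key has already been established somehow (the PacketMac name and the 8-byte tag length are illustrative choices of mine; HMACSHA256 and FixedTimeEquals are standard System.Security.Cryptography APIs):

    using System;
    using System.Security.Cryptography;

    static class PacketMac
    {
        const int TagLength = 8; // truncated tag, per the trade-off above

        // Append the last 8 bytes of HMAC-SHA256(key, payload) to the packet.
        public static byte[] Seal(byte[] key, byte[] payload)
        {
            using var hmac = new HMACSHA256(key);
            byte[] tag = hmac.ComputeHash(payload)[^TagLength..];
            byte[] packet = new byte[payload.Length + TagLength];
            payload.CopyTo(packet, 0);
            tag.CopyTo(packet, payload.Length);
            return packet;
        }

        // Recompute and compare in constant time; null means "drop it".
        public static byte[] Open(byte[] key, byte[] packet)
        {
            if (packet.Length < TagLength) return null;
            byte[] payload = packet[..^TagLength];
            using var hmac = new HMACSHA256(key);
            byte[] expected = hmac.ComputeHash(payload)[^TagLength..];
            return CryptographicOperations.FixedTimeEquals(
                expected, packet[^TagLength..]) ? payload : null;
        }
    }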
The UDP checksum algorithm is unfortunately not very good.
As in, I shouldn't rely on it?
If you implement cryptography on top of UDP, you may be able to get "checksumming" for free, by having the sender sign each packet and the receiver discard any packet that doesn't match its signature. Note that a signature doesn't HAVE to be 20-32 bytes of extra data; if you're OK with an easier-to-break signature scheme (just making real-time game attacks harder rather than protecting authenticity against state actors), you can use just the last 4-8 bytes of the HMAC hash.
So far I wasn't planning on doing encryption for my game state packets, since I didn't figure it was worth the computational cost. If anyone has any resources on protecting my packets given a C# byte array, though, I'd love reading material. Everything I've read so far has been very conflicting.