Is it worth logging every packet as a trace request?

Started by August 18, 2014 03:47 PM
6 comments, last by Ravyne 10 years, 3 months ago

I'm writing a game, and logging all packets sounds like a good idea in theory, but it'll increase I/O load a lot and use a lot more disk space (plus increase filtering times.)

Is logging every packet sent and received by a client in an authoritative server model a good idea? Does anyone have some stories that show it can be useful?

Is it worth it TO YOU?

Sometimes, when you are hunting down a particularly rare bug, it can be quite useful: you can find a malformed packet or other quirk in the system.

Sometimes they are awesome as they can let you completely replay a game through time to watch how to reproduce a bug.

Sometimes it is useful for logging to be triggerable: when an unexpected pattern happens, logging gets turned on for a time and the results are then reported to either community managers or the dev team.
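
One way to read "triggerable" is a rolling in-memory window that only becomes a report once something odd is detected. The sketch below is in Python for brevity; the class name, the `history` and `capture_after` parameters, and the idea of keeping the lead-up packets are all assumptions for illustration, not something the thread prescribes.

```python
from collections import deque


class TriggeredPacketLog:
    """Keeps the last `history` packets in memory; captures only after a trigger."""

    def __init__(self, history=256, capture_after=64):
        self.recent = deque(maxlen=history)  # rolling window of recent packets
        self.capture_after = capture_after   # packets to keep after a trigger
        self.remaining = 0                   # > 0 while a capture is active
        self.captured = []                   # what would be sent to the dev team

    def record(self, packet: bytes) -> None:
        if self.remaining > 0:
            self.captured.append(packet)
            self.remaining -= 1
        else:
            self.recent.append(packet)       # oldest entry falls off when full

    def trigger(self) -> None:
        """Call when an unexpected pattern is detected (detection is up to you)."""
        self.captured.extend(self.recent)    # include the lead-up to the anomaly
        self.recent.clear()
        self.remaining = self.capture_after
```

In normal operation nothing is written anywhere; the cost is one deque append per packet.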

Other times it serves as little more than a distraction.

Packet-level logs are only useful in limited circumstances. They should be configurable: something that can be reduced to zero output, that can be rotated based on size and on duration, and that indicates exactly which game session and which players are involved, in addition to the packet data.
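
As a rough illustration of those requirements, here is a sketch built on Python's standard `logging` module. `RotatingFileHandler` covers size-based rotation; the standard library's `TimedRotatingFileHandler` would be the analogous choice for rotation by duration. The logger name, the function names, and the `session=`/`player=` line layout are made up for this example.

```python
import logging
from logging.handlers import RotatingFileHandler


def make_packet_logger(path, enabled=True, max_bytes=64 * 1024 * 1024, backups=5):
    logger = logging.getLogger("packets")      # hypothetical logger name
    logger.propagate = False
    for old in list(logger.handlers):          # allow reconfiguration at runtime
        logger.removeHandler(old)
        old.close()
    if enabled:
        handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
        logger.setLevel(logging.DEBUG)
    else:
        logger.addHandler(logging.NullHandler())
        logger.setLevel(logging.CRITICAL + 1)  # reduces output to zero
    return logger


def log_packet(logger, session_id, player_id, direction, payload: bytes):
    # Tag every line with the session and player so logs can be filtered later.
    logger.debug("session=%s player=%s dir=%s data=%s",
                 session_id, player_id, direction, payload.hex())
```

With `enabled=False` no file is created at all, so the feature can ship switched off and be turned on per server when needed.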


it'll increase I/O load a lot and use a lot more disk space (plus increase filtering times.)


If done well, it won't have any significant impact on performance. Queue up the packets into a buffer and then flush that on a separate thread (or even over a socket to another process/machine) when the buffer fills. The main processing of packets will be unaffected by the logging. The actual logs then can be gzipped and stored on a storage array with multiple terabytes of space (you can pick up 1TB disks for pretty cheap these days; just set up your log storage server/NAS with a handful of those and you should be good during at least development).
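
A minimal sketch of the queue-and-flush idea, in Python for brevity. The class name, the batch size, and the 4-byte length prefix are assumptions added for the example; the gzip step follows the suggestion above but is applied while writing rather than afterwards.

```python
import gzip
import queue
import struct
import threading


class AsyncPacketLog:
    """Packets are queued by the caller and written by a separate thread."""

    def __init__(self, path, batch_size=256):
        self._q = queue.Queue()
        self._batch_size = batch_size
        self._out = gzip.open(path, "ab")
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()

    def log(self, packet: bytes) -> None:
        self._q.put(packet)          # the packet-processing thread never touches disk

    def close(self) -> None:
        self._q.put(None)            # sentinel: flush what is left and stop
        self._thread.join()
        self._out.close()

    def _drain(self) -> None:
        batch = []
        while True:
            item = self._q.get()
            if item is not None:
                # Length-prefix each packet so the log can be split again later.
                batch.append(struct.pack("<I", len(item)) + item)
            if item is None or len(batch) >= self._batch_size:
                self._out.write(b"".join(batch))   # flush only when the batch fills
                batch.clear()
            if item is None:
                return
```

Note that packets still sitting in the batch are lost if the process crashes before `close()`; that is the trade-off of buffering in user space.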

Sean Middleditch – Game Systems Engineer – Join my team!

Yes, a full log of all user input and network data is quite common (and handy.)

Given that your network throughput (10 kB/s ?) is about 10,000 times lower than your disk throughput (100 MB/s ?), it's unlikely there will be any perceptible impact from the logging. Make sure you use a buffered I/O mechanism: fwrite() is fine, or collect into your own 4kB buffer and flush when full. The kernel will in turn perform the write asynchronously, so there will be no stall in the writing thread.
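
The back-of-the-envelope numbers can be checked directly. The two rates below are the guesses from the post, not measurements, and `open_trace` is a hypothetical helper showing a 4 kB user-space buffer in Python (roughly the role fwrite() plays in C).

```python
NETWORK_BYTES_PER_S = 10_000       # 10 kB/s, the post's guess for one client
DISK_BYTES_PER_S = 100_000_000     # 100 MB/s, the post's guess for one disk

# How many times faster the disk is than the incoming packet stream.
headroom = DISK_BYTES_PER_S // NETWORK_BYTES_PER_S

# Raw log volume for one client connected around the clock.
bytes_per_day = NETWORK_BYTES_PER_S * 24 * 60 * 60


def open_trace(path, buffer_size=4096):
    """Open a log for appending; small writes are collected in a 4 kB buffer."""
    return open(path, "ab", buffering=buffer_size)
```

At these assumed rates the disk has a factor of 10,000 to spare, and one always-connected client produces about 864 MB of raw packets per day before compression, which is the figure to multiply by your player count when sizing storage.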

Setting it up so that you can also play back the full stream is the pay-off for that logging -- this will let you very easily reproduce rare bugs.
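
For playback to work, the log has to preserve packet boundaries, order, and timing. A minimal sketch of one possible record format follows; the header layout (timestamp, direction byte, length) and the function names are assumptions for illustration only.

```python
import io
import struct

# timestamp in seconds, direction (0 = received, 1 = sent), payload length
HEADER = struct.Struct("<dBI")


def record(stream, timestamp, direction, payload: bytes):
    stream.write(HEADER.pack(timestamp, direction, len(payload)) + payload)


def replay(stream):
    """Yield (timestamp, direction, payload) in the order they were recorded."""
    while True:
        header = stream.read(HEADER.size)
        if len(header) < HEADER.size:
            return                      # end of log, or a truncated final record
        timestamp, direction, length = HEADER.unpack(header)
        payload = stream.read(length)
        if len(payload) < length:
            return                      # payload cut short, e.g. by a crash
        yield timestamp, direction, payload


# Usage: feed the replayed packets back into the same handler the live
# socket would call, sleeping between timestamps if timing matters.
log = io.BytesIO()
record(log, 1.5, 0, b"join")
record(log, 2.25, 1, b"ack")
log.seek(0)
replayed = list(replay(log))
```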
enum Bool { True, False, FileNotFound };

Agreed with Sean -- implemented well, with threading, it shouldn't have an undue load on I/O, your main thread, or the CPU in general.

If the packets are many and/or very large, then you can choose different hardware destinations in order to meet the throughput needs. By nature of these things coming over the network, I don't imagine you'd overshoot even a rotating magnetic HDD over SATA II/III, but it could be the case if you've got other IO happening on that spindle. Adding another physical drive dedicated to log capture should solve that, and if you have a second HDD controller, consider putting that dedicated drive on it, separate from the controller handling the game load. If that's still insufficient, consider an SSD instead of a mechanical drive; they can be had now for under 50 cents / GB even for modestly-sized drives, and you can get them in capacities ranging up to 1TB in the same 50-60 cents / GB ballpark. If a traditional SSD is still insufficient, a PCIe-based SSD will be more expensive but can achieve more than a GB / second transfer rate.

Finally, you should queue the packets into a memory buffer and write out larger chunks, as Sean says. A very simple approach, if you know that your traces will be relatively small (or that you care only about a rolling 'window' within the whole packet stream) -- say, under a handful of GB -- and you have the RAM (and memory bandwidth, shared with your game/server) to spare, is to create a RAMDisk, use normal file-io operations on it, then copy the file to permanent storage when the game's no longer running. A RAMDisk is basically a driver module that appears as a normal hard disk but is backed by a portion of your RAM -- it has very, very high bandwidth and low latency as a result, but you obviously need a good chunk of RAM to spare. Another benefit is that using a RAMDisk means you can log the packets with normal file-io and be insulated from whether the backing storage is provided by any of the options I've mentioned, including RAM, which you would otherwise have to code specially.

Another thing you might consider is simply pushing the packets out of another network interface to some kind of service backed by a fast key-value store (MongoDB or similar). This will push much of the load off of your machine, and also allow you to isolate all the provisioning decisions (whether to use mechanical, solid-state, or PCIe storage, whether to cache in memory) behind standard and well understood software and interfaces -- the downside, I suppose, is that you need another machine and some know-how to set that all up -- but it would be a solid and robust solution.

Finally part-II, consider the format that you will store your traced packets in. The raw packets are often super important, but you might like to also/instead store them in human-readable form, such as JSON or perhaps YAML. You need to be careful not to introduce translation errors, which is why keeping the raw packets might be important too (so you can verify suspected mistranslations), but for a quick look-see, having human-readable text at the ready can be beneficial.
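
One way to keep both forms together is a JSON line per packet that carries the exact bytes next to the decoded view. This is only a sketch: `decode_packet` and its 3-byte header (type, then 16-bit length) are entirely made up, since the thread does not describe a wire format.

```python
import base64
import json


def decode_packet(raw: bytes) -> dict:
    """Hypothetical decoder for a made-up header: 1-byte type, 2-byte length."""
    return {
        "type": raw[0],
        "length": int.from_bytes(raw[1:3], "big"),
        "body": raw[3:].hex(),
    }


def to_log_line(raw: bytes) -> str:
    entry = {"raw": base64.b64encode(raw).decode("ascii")}  # exact bytes, always kept
    try:
        entry["decoded"] = decode_packet(raw)
    except Exception as exc:       # a decoder bug must not lose the packet
        entry["decode_error"] = repr(exc)
    return json.dumps(entry, sort_keys=True)


def raw_from_log_line(line: str) -> bytes:
    """Recover the original bytes to check a suspected mistranslation."""
    return base64.b64decode(json.loads(line)["raw"])
```

Because the raw bytes ride along in every line, a suspicious decoded value can always be re-checked against what was actually on the wire.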

throw table_exception("(╯°□°)╯︵ ┻━┻");

implemented well, with threading


You do not need threading. Unless you explicitly call sync() or a similar system call, write() will simply copy the data into a kernel buffer and schedule it for I/O later, and will not block your application. And, in fact, you may want to simply write() each piece of data directly, because this will survive an application crash. Data written with write() on a raw file descriptor will survive a process crash as soon as the write() returns, whereas fwrite() needs an fflush() to get to that point. write() will not guarantee against a full system crash, though -- only sync() does that (and then again, sometimes there's NO way of doing that, depending on how fast-and-loose the hard disks and drivers are playing.)
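
The same distinction can be seen from Python, where `os.write` is a thin wrapper over write() and a buffered file object plays the role of fwrite(). In this sketch the helper names are hypothetical, and `os.fsync` (the per-file relative of sync()) is swapped in for the durability step.

```python
import os


def append_unbuffered(fd: int, packet: bytes) -> None:
    """write() straight to the descriptor: the data is in the kernel on return."""
    view = memoryview(packet)
    while view:                      # os.write may accept fewer bytes than offered
        written = os.write(fd, view)
        view = view[written:]


def make_durable(fd: int) -> None:
    """Ask the kernel to push this file's data to the storage device."""
    os.fsync(fd)
```

A descriptor for this can be opened with `os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)`. Calling `make_durable` after every packet would reintroduce the blocking that plain write() avoids, so it is probably best reserved for checkpoints.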
enum Bool { True, False, FileNotFound };

Ah, I forgot that writes are cached by the kernel and flushed later (so write calls don't block.) That's a pretty big oversight on my part and makes this a lot easier.


You do not need threading


You're right -- threading isn't necessary if all you're doing is writing out data packets in raw format. I should have been more clear, though admittedly it was 50% my forgetting that the kernel doesn't block on disk IO. The other 50% was that if they want to do any additional processing on the data before logging, OP probably wants to thread that if it takes any significant time. At any rate, in the interest of full disclosure, my advice is sometimes over-engineered :)

throw table_exception("(╯°□°)╯︵ ┻━┻");

This topic is closed to new replies.
