Throttling

Started by Oogst July 27, 2014 01:10 PM

2 comments, last by Oogst 10 years, 3 months ago

Author

481

July 27, 2014 01:10 PM

We recently managed to reduce network errors in Awesomenauts by 10% by improving our throttling algorithm. This sounds pretty good to me, but we are quite inexperienced with this topic so I am curious to hear whether our approach is good or maybe something else would be better. Have any of you ever implemented a throttling algorithm for a game and what were your experiences? Did it have any effect? What approach did you use?

I wrote a blogpost explaining our approach here: Using throttling to reduce network errors

My dev blog
Ronimo Games (my game dev company)
Awesomenauts (2D MOBA for Steam/PS4/PS3/360)
Swords & Soldiers (2D RTS for Wii/PS3/Steam/mobile)

Swords & Soldiers 2 (WiiU)
Proun (abstract racing game for PC/iOS/3DS)
Cello Fortress (live performance game controlled by cello)

hplus0603

11,938

July 27, 2014 08:02 PM

In general, my experience is that you should send no more data than is necessary. Throttling means that you were sending data that wasn't necessary, and now you aren't, so that's an improvement!

Do you have good measurements of how the throttling helps? What is the bottleneck that it's helping with, and where is that bottleneck?

enum Bool { True, False, FileNotFound };

All8Up

6,000

July 28, 2014 01:40 PM

So, I don't completely agree with hplus regarding the sent data, given that I tend to send some redundant data in UDP, but I do agree that there is missing information in your blog post and the description above. The key missing item is: how much data are you averaging per packet? I will make some assumptions about the data but mostly just cover, at a very high level (i.e. missing enough details to drive a truck through), some of the problems involved with the naive solution you have.

First off, there is little reason to be sending packets at such a high rate. Your reasoning for wanting to get things on the wire as fast as possible is relevant, but when you look at the big picture it is hardly viable. The number you need to be considering here is latency, and latency consists of three specific pieces: the delay until the data is put on the wire, the network transit time, and the time the receiver takes to act on it. Assuming that your NIC is completely ready to receive a packet and put it on the wire, your minimal latency is 5-10 ms because the NIC/wifi/whatever needs to form the data into a packet, prepend the headers and then actually transmit the data at the appropriate rate over the wire. Add on top of this the fact that you are sending packets every ~33.33 ms, and you have a potential maximal latency of 40ish ms from the point you call the send API to when the data actually hits the wire. If the network is busy, or the wifi is congested or weak, you can easily be in the 50+ ms range before a packet even hits the wire from the point you call the function to send the data. In general, you need more intelligence in your system than simply sending packets at a high fixed rate if you want to reduce latency but still not "cause" errors and dropped packets.
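To make the arithmetic above concrete, here is a rough sketch in Python. It assumes my reading of the numbers: data waits for the next fixed send tick, then spends some time in the NIC. The function name and the 7.5 ms NIC delay (the midpoint of the 5-10 ms range) are made up for illustration.

```python
def send_delay_bounds_ms(send_rate_hz, nic_delay_ms):
    """Best and worst case delay between data being ready and hitting the wire,
    assuming packets only leave on fixed ticks (hypothetical model)."""
    interval_ms = 1000.0 / send_rate_hz
    best = nic_delay_ms                 # data became ready right before a tick
    worst = interval_ms + nic_delay_ms  # data became ready right after a tick
    return best, worst


# 30 packets per second, NIC delay assumed to be 7.5 ms.
best, worst = send_delay_bounds_ms(30, 7.5)
print(f"best {best:.1f} ms, worst {worst:.1f} ms")
```

With these assumed inputs the worst case comes out at roughly 41 ms, which is where the "40ish ms" figure comes from.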

The next thing to understand is that routers tend to drop UDP before TCP. At a high level this is a technically incorrect description; it's more that the routers will see small, high-rate packets from your client, potentially even having two or three buffered for transit to the next hop, alongside larger TCP packets arriving at a more reasonable rate, and will prefer to drop your little packets in favor of the larger packets from someone else. Given there are easily 10+ hops between a client and a server, the packet lottery is pretty easy to lose under such conditions when the network is even minimally congested. Add in reliable data getting dropped regularly and now your latencies are creeping up into the 200+ ms range depending on how you manage resends.

Fixing all these issues, dealing with the random and unexplained nature of networking while maintaining low latency, is all about intelligent networking systems. Your "experiment" to reduce packet rates is headed in the correct direction, but unfortunately a simple on/off is not the best solution. The direction you need to be headed is more TCP-like in some ways: specifically, you should be balancing latency and throughput as best you can while also (unlike what hplus suggested) using any extra bandwidth required to reduce the likelihood of errors causing hiccups.

I'll start by mentioning how to reduce the effect of errors on your networking. The common reliable case is the fire button or the do-something button, which must reach the other side. In my networking systems I have an "always send" buffer which represents any critical data such as the fire button. So, if I'm sending a position update several times a second, each packet also contains the information for the fire button until such time as the other side acks that it received it. So, barring a massive network hiccup, even though a couple of packets may have been dropped, the fire button message will likely get through as quickly as possible. This is specifically for "critical" data; I wouldn't use this for chat or other things which are not twitch critical. In general, this alone allows you to avoid the worst cases of "just the wrong packet got dropped", which throws off the player's game. Yup, it uses more data than strictly necessary, but for very good reason.
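As a minimal sketch of the "always send" idea, assuming packets are plain dicts and every critical message gets a sequential id (both are choices made for this example, not anything from Awesomenauts or a real library):

```python
class AlwaysSendBuffer:
    """Sender side: critical messages ride along in every packet until acked."""

    def __init__(self):
        self._next_id = 0
        self._pending = {}  # message id -> payload, kept until the peer acks it

    def queue(self, payload):
        msg_id = self._next_id
        self._next_id += 1
        self._pending[msg_id] = payload
        return msg_id

    def ack(self, msg_id):
        self._pending.pop(msg_id, None)  # duplicate acks are harmless

    def build_packet(self, state_update):
        # Every outgoing packet repeats all critical messages not yet acked.
        return {"state": state_update, "critical": sorted(self._pending.items())}


class CriticalReceiver:
    """Receiver side: apply each critical message once, ack every copy seen."""

    def __init__(self):
        self._seen = set()  # grows forever in this sketch; real code would prune

    def receive(self, packet):
        fresh = []
        for msg_id, payload in packet["critical"]:
            if msg_id not in self._seen:
                self._seen.add(msg_id)
                fresh.append(payload)
        acks = [msg_id for msg_id, _ in packet["critical"]]
        return fresh, acks


sender = AlwaysSendBuffer()
receiver = CriticalReceiver()
fire_id = sender.queue("fire")
lost_packet = sender.build_packet({"x": 1})    # pretend the network drops this
second_packet = sender.build_packet({"x": 2})  # still carries "fire"
fresh, acks = receiver.receive(second_packet)
print(fresh)
```

The receiver has to deduplicate by id, because the same fire message will usually arrive several times before the ack makes it back to the sender.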

Moving towards the TCP-like stuff, let me clarify a bit. What you really want here is the bits which replace your "experiment" piece of code with something a bit more robust. In general, you want three things: MTU detection (for games you just want to know "can I send my biggest packet safely?"), slow start/restart of packet rates, and a non-buffered variation of the sliding window algorithm. The MTU (maximum transmission unit) part is pretty simple and kinda like your current throttling detection: send bigger packets until they start consistently failing, then back off till they get through. Somewhere between where they were failing and where they are getting through is the MTU for the route you are transmitting on. You don't need to actually detect the MTU for a game; you just want to know that if everything starts failing, MTU could be the reason and you should back off from large packets till they get through.
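A rough sketch of the "back off large packets till they get through" part, in Python. The class name, the starting size, the step and the failure threshold are all invented for illustration and would need tuning against real traffic:

```python
class PacketSizeLimiter:
    """Shrinks the allowed packet size when near-limit packets keep failing."""

    def __init__(self, start_size=1200, min_size=200, step=100, fail_threshold=3):
        self.max_size = start_size
        self.min_size = min_size
        self.step = step
        self.fail_threshold = fail_threshold
        self._failures = 0  # consecutive failures of packets near the limit

    def report(self, packet_size, delivered):
        if packet_size <= self.max_size - self.step:
            return  # small packets say nothing about the size limit
        if delivered:
            self._failures = 0
            return
        self._failures += 1
        if self._failures >= self.fail_threshold:
            # Large packets are consistently failing: back off one step.
            self.max_size = max(self.min_size, self.max_size - self.step)
            self._failures = 0


limiter = PacketSizeLimiter()
for _ in range(3):
    limiter.report(1200, delivered=False)
print(limiter.max_size)
```

This only backs off; it never probes upwards again. Growing the limit back after a stretch of successful sends would be the natural next step.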

The second bit, slow start/restart, is actually a lot more important than many folks realize. Network snarls happen regularly: either things are being rerouted, something has a hiccup or potentially real hardware failures crop up. In regards to UDP, the rerouting can be devastating because your previously detected "safe" values are now all invalid and you need to reset them and start over. A sliding window is generally going to take care of this, but I wanted to call it out separately because you need to plan for it.
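A sketch of the restart idea in Python. To be clear, this is not TCP's actual slow start algorithm, just a simple ramp-up/back-off/restart controller in the same spirit. The rates, thresholds and step size are assumptions picked for the example:

```python
class SendRateController:
    """Ramps the packet rate up while the link is healthy, restarts on heavy loss."""

    def __init__(self, min_rate=10.0, max_rate=30.0,
                 backoff_loss=0.1, restart_loss=0.5, step=2.0):
        self.min_rate = min_rate
        self.max_rate = max_rate
        self.backoff_loss = backoff_loss
        self.restart_loss = restart_loss
        self.step = step
        self.rate = min_rate  # always start slow

    def update(self, loss_fraction):
        """Call once per measurement window with the observed loss (0.0 to 1.0)."""
        if loss_fraction >= self.restart_loss:
            # Old "safe" values are probably invalid (e.g. a reroute): start over.
            self.rate = self.min_rate
        elif loss_fraction >= self.backoff_loss:
            self.rate = max(self.min_rate, self.rate * 0.5)
        else:
            self.rate = min(self.max_rate, self.rate + self.step)
        return self.rate


controller = SendRateController()
for _ in range(10):
    controller.update(0.0)           # clean link: climb from 10 up to 30
healthy_rate = controller.rate
after_backoff = controller.update(0.2)   # moderate loss: halve the rate
after_restart = controller.update(0.8)   # heavy loss: restart from the minimum
print(healthy_rate, after_backoff, after_restart)
```

How the loss fraction is measured (acks per window, sequence gaps, and so on) is left out here on purpose.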

The sliding window (see: http://en.wikipedia.org/wiki/Sliding_Window_Protocol) is modified from TCP for UDP requirements. Instead of filling a buffer with future data to be sent, you simply maintain the packets per second and average size of the packets you "think" you will be sending. The purpose of computing the sliding window, though, is so you can build heuristics for send rate and packet sizes in order to "play nice" with the routers between two points and still minimize the latencies involved. Additionally, somewhat like the inverse of the Nagle algorithm, you can introduce "early send" for those critical items in order to avoid the maximal latencies. I.e. if you are sending at 10 packets a second and the last packet goes out just as a "fire" button is received, you can look at sending the next packet early to reduce the critical latency but still stay in the nice flow that the routers expect from you. A little jitter in the timing of packets is completely expected and they don't get mad about that too much. And even if some router drops the packet, your next regularly scheduled packet with the duplicated data might get through.
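The "early send" part could look roughly like the Python sketch below. It only covers the timing decision, not the window bookkeeping, and the names plus the "at most half an interval early" rule are assumptions made for this example:

```python
class PacketScheduler:
    """Fixed-rate sends, with critical data allowed to pull the next send forward."""

    def __init__(self, rate_hz=10.0, min_gap_fraction=0.5):
        self.interval = 1.0 / rate_hz
        # Never send closer together than this, so the flow stays fairly regular.
        self.min_gap = self.interval * min_gap_fraction
        self.last_send = None

    def should_send(self, now, has_critical):
        if self.last_send is None:
            return True
        elapsed = now - self.last_send
        if elapsed >= self.interval:
            return True  # regularly scheduled packet
        # Early send: only for critical data, and only after the minimum gap.
        return has_critical and elapsed >= self.min_gap

    def mark_sent(self, now):
        self.last_send = now


scheduler = PacketScheduler(rate_hz=10.0)
scheduler.mark_sent(0.0)
too_early = scheduler.should_send(0.02, has_critical=True)
early_ok = scheduler.should_send(0.06, has_critical=True)
no_reason = scheduler.should_send(0.06, has_critical=False)
on_schedule = scheduler.should_send(0.11, has_critical=False)
print(too_early, early_ok, no_reason, on_schedule)
```

Because the next regular packet is timed from the early one, an early send nudges the average rate up slightly, which is the small amount of jitter described above.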

I could go on for a while here but I figure this is already getting to be a novel-sized post. I'd suggest looking at the sliding window algorithm, why it exists, how it works, etc., and then considering how you can use that with UDP without the data stream portion. I've implemented it a number of times and, while far from perfect, it is about the best you can get given the randomness of networking in general.

Oogst

Author

481

July 28, 2014 08:38 PM

@AllEightUp:

That is one really interesting write-up, thanks for the extensive information!

As far as I understood from other sources, there is a minimum MTU size that all remotely acceptable connections support. In theory the IPv4 MTU can be as low as 68 bytes, but in practice there is a minimum MTU that is quite workable and works for 99.99% of internet connections.

This is really important for the case of Awesomenauts. In Awesomenauts we want to send large numbers of really small packets. Data is highly optimised, so 100 bytes per packet is already a lot for us. We are a fully peer-to-peer game with up to six players, which means that data needs to go to all players. This brings our outgoing packet counts to between 50 and 120 per second in total. Since we are sending so many packets we need to stay extremely far from the maximum MTU anyway, because sending 120 packets at maximum MTU is a completely unacceptable amount of network traffic.

Our bandwidth usage is at the moment between 10 and 13 kB/s in total. This includes UDP, IP and Steam headers, but does not include DSL headers (I didn't even know those exist until yesterday...).
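For a sense of scale, a quick back-of-the-envelope calculation in Python. It assumes "maximum MTU" means the common 1500-byte Ethernet MTU and uses the minimum IPv4 (20 bytes) and UDP (8 bytes) header sizes; the Steam header is left out because its size isn't given here:

```python
ETHERNET_MTU_BYTES = 1500  # assumed meaning of "maximum MTU"
IPV4_HEADER_BYTES = 20     # minimum IPv4 header, no options
UDP_HEADER_BYTES = 8


def bytes_per_second(packets_per_second, bytes_per_packet):
    return packets_per_second * bytes_per_packet


# 120 packets per second, each filled up to the assumed MTU.
full_mtu_traffic = bytes_per_second(120, ETHERNET_MTU_BYTES)

# IP + UDP header cost alone at the same packet rate.
header_overhead = bytes_per_second(120, IPV4_HEADER_BYTES + UDP_HEADER_BYTES)

print(full_mtu_traffic, header_overhead)
```

That is 180,000 bytes per second at full-size packets, against the 10 to 13 kB/s actually used, and about 3.4 kB/s of that budget goes to IP and UDP headers alone at 120 packets per second.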

hplus0603 wrote: "In general, my experience is that you should send no more data than is necessary. Throttling means that you were sending data that wasn't necessary, and now you aren't, so that's an improvement!"

This is actually not true in our case. Since Awesomenauts is a very fast game, I want to send character position updates 30 times per second. When we throttle, we send fewer than that. This essentially increases lag and reduces precision. So throttling makes things worse and we only do it when the connection otherwise cannot handle it. Dropping the connection altogether is of course worse than just sending fewer position updates.

