
Single Threaded UDP

Started by July 23, 2005 11:21 PM
18 comments, last by Megahertz 19 years, 6 months ago
Quote:
Original post by hplus0603
If you're reading from a single socket, using select() just means you make a useless kernel call. It's more efficient to just set the socket in non-blocking mode, and reading from the socket until it returns no more data; then go through your regular game loop.


Agreed. If you're really concerned about optimizing your main game loop, you need to evaluate and weigh every single kernel transition your loop causes.
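For what it's worth, the non-blocking drain hplus0603 describes might look roughly like this (a sketch assuming Winsock; the socket, buffer size, and HandlePacket() are placeholders):

    #include <winsock2.h>

    // Put the UDP socket into non-blocking mode once, at startup.
    u_long nonBlocking = 1;
    ioctlsocket(udpSocket, FIONBIO, &nonBlocking);

    // Then, once per game tick, drain everything that has arrived.
    char buf[1400];
    sockaddr_in from;
    int fromLen = sizeof(from);
    for (;;)
    {
        int bytes = recvfrom(udpSocket, buf, sizeof(buf), 0, (sockaddr*)&from, &fromLen);
        if (bytes == SOCKET_ERROR)
            break;                          // WSAEWOULDBLOCK just means the queue is empty
        HandlePacket(buf, bytes, from);     // hypothetical packet handler
    }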

Quote:
Non-threaded game loops mean that you don't have to worry about locking your world state, which is a performance improvement. However, on a multi-core CPU, you'll probably do OK by doing receipt in one thread, and put data on a queue for the main world processing thread to pick up; you only need to lock this queue once per game tick which is a negligible cost. You can even double-buffer the queue if you're so inclined, although that may incur a latency penalty.


I do most of my programming on a dual Xeon, and all of it in Windows. When it came time to write my octree code and run through all the entities to determine which entities can "see" the other entities around them, I found doing this in multiple threads was both easy and yielded phenomenal performance gains. In my simple, unoptimized test octree app I could run this computation over 500,000 entities in about 750 ms using both CPUs (speed depends on the density of the entities and other factors, of course).

This ran with two threads working side by side, calling InterlockedIncrement() on a shared index into the entity array. There is one SWMR (single writer, multiple readers) lock class I wrote that manages the array in its entirety, but that lock is only taken at the beginning and end of each walk.
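The skeleton of that scheme looks roughly like this (a sketch; the Entity type, ComputeVisibility(), and the thread start-up are simplified placeholders):

    #include <windows.h>

    struct Entity { /* position, bounds, ... */ };     // placeholder entity type
    const LONG ENTITY_COUNT = 500000;
    Entity g_entities[ENTITY_COUNT];                    // placeholder entity storage
    volatile LONG g_nextIndex = -1;                     // shared cursor, advanced atomically
    void ComputeVisibility(Entity& e);                  // hypothetical octree visibility query

    DWORD WINAPI VisibilityWorker(LPVOID)
    {
        for (;;)
        {
            // Atomically claim the next entity; both threads pull from the same cursor.
            LONG i = InterlockedIncrement(&g_nextIndex);
            if (i >= ENTITY_COUNT)
                break;
            ComputeVisibility(g_entities[i]);
        }
        return 0;
    }

    // Main thread: take the SWMR lock for the walk, spawn one VisibilityWorker per CPU
    // with CreateThread(), WaitForMultipleObjects() on both, then release the lock.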

Y'all can pry multithreading from my cold dead fingers! Until then, I ain't givin' it up!! :)

Robert
I think as long as you spend more time on the computations than you do on context switches or waiting for mutexes on shared resources to be unlocked, multithreading is going to see a performance gain.

-=[ Megahertz ]=-
Quote:
Original post by rmsimpson
I do most of my programming on a dual xeon, and all of it in Windows. When it came time to write my octree code and run through all the entities and determine which entities can "see" the other entities around it, I found doing this in multiple threads was both easy and yielded phenomenal performance gains.
Naturally, if you actually have two physical processors, making your code multithreaded will see performance gains of, oh, say... double? :) But how about a multithreaded game server with threads using real critical sections, blocking while waiting for input, running on a single CPU?
My bet is that any performance improvement is not worth the effort of dealing with the issues multithreading brings in.
Maybe I should have qualified that: any performance improvement is not worth the effort of dealing with the issues multithreading brings in for threads doing socket management. Most likely the main reason multithreading topics come up in relation to network programming is that people see that sockets block. Just think: if sockets were non-blocking by default and there was no way to set them to blocking, what problem would that pose to a game server? Having a socket block is only really useful for applications that must have some data to read/send or else they have nothing else to do, e.g. FTP.
Of course there's nothing wrong with multithreading in general. With sockets, though, since the OS is already managing a thread under your socket layer, there seems little point in adding another thread between that and your main app. As someone mentioned here already, it's mostly a matter of taste whether multithreaded code is uglier than a single-threaded 'giant loop' with interruption logic; I just don't think either has any great performance advantage over the other. Single-threaded is definitely easier to debug (but comes with more management issues, like handling a non-blocking connect() properly, etc.).

Here's another statement that needs qualification: "You can get more than double the performance if you pay close attention to what you are doing."
On a single CPU??
Quote:
Naturally, if you actually have two physical processors, making your code multithreaded will see performance gains of, oh, say... double? :) But how about a multithreaded game server with threads using real critical sections, blocking while waiting for input, running on a single CPU?
My bet is that any performance improvement is not worth the effort of dealing with the issues multithreading brings in.


My point, really, is this: the days of single-core CPUs are numbered. Coding single-threaded, for a single CPU, will very soon be like writing for DOS or in 16-bit.

If you're going to write new code, right now, and actually want to see improvement in its performance when you upgrade your equipment in a year, you need to start writing scalable code.

While we're on the subject of performance, which is more taxing on the CPU:

A. Polling a socket in a tight game loop in a single thread, potentially 100,000 kernel transitions a second or even higher.

B. Putting a thread to sleep and having the kernel wake it up when socket data arrives. Add a couple of small kernel sync objects to that and you have no more than, say, a dozen kernel transitions per message.
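For concreteness, option B might look something like this (a sketch assuming Winsock event objects; the exact sync objects aren't specified above, and DrainSocket() is a placeholder):

    // Tie an event to the socket so the kernel signals it when data arrives.
    WSAEVENT netEvent = WSACreateEvent();
    WSAEventSelect(udpSocket, netEvent, FD_READ);   // this also puts the socket in non-blocking mode

    // The network thread sleeps here, costing nothing, until the kernel wakes it.
    for (;;)
    {
        WaitForSingleObject(netEvent, INFINITE);
        WSAResetEvent(netEvent);                    // manual-reset event, so clear it ourselves
        DrainSocket(udpSocket);                     // hypothetical: recvfrom() until WSAEWOULDBLOCK
    }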

Oh, and just for kicks to demonstrate performance of multiple threads in a single CPU ... try this on for size:

In a single thread, do the following (assuming Windows because, well, that's all I write for!):

Call FindFirstFile() on "*.*" in the root of the C drive.
For each directory, re-enter the function with the new path.
For each file, open the file and call CreateFileMapping() to map it, then close the mapping and the file.
Call FindNextFile() and loop until there are no more files.

A simple recursive function that iterates the entire filesystem of your hard drive, maps every file into memory, unmaps it, and closes it.
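In code, the single-threaded walk might look roughly like this (a sketch assuming the Win32 ANSI APIs; error handling and long-path concerns are trimmed):

    #include <windows.h>

    void WalkAndMap(const char* path)
    {
        char pattern[MAX_PATH];
        wsprintfA(pattern, "%s\\*.*", path);

        WIN32_FIND_DATAA fd;
        HANDLE hFind = FindFirstFileA(pattern, &fd);
        if (hFind == INVALID_HANDLE_VALUE)
            return;

        do {
            if (!lstrcmpA(fd.cFileName, ".") || !lstrcmpA(fd.cFileName, ".."))
                continue;

            char full[MAX_PATH];
            wsprintfA(full, "%s\\%s", path, fd.cFileName);

            if (fd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) {
                WalkAndMap(full);                                   // recurse into the subdirectory
            } else {
                HANDLE hFile = CreateFileA(full, GENERIC_READ, FILE_SHARE_READ,
                                           NULL, OPEN_EXISTING, 0, NULL);
                if (hFile != INVALID_HANDLE_VALUE) {
                    // Map the file, then immediately unmap and close it again.
                    HANDLE hMap = CreateFileMappingA(hFile, NULL, PAGE_READONLY, 0, 0, NULL);
                    if (hMap) CloseHandle(hMap);
                    CloseHandle(hFile);
                }
            }
        } while (FindNextFileA(hFind, &fd));

        FindClose(hFind);
    }

    // Usage: WalkAndMap("C:");   // walks the whole C: drive from the root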

The first run will take some time, but subsequent runs will be much faster (caching). Now complicate this function a bit by introducing another thread and some synchronization between the two threads. Some of you might guess that the thread overhead and synchronization would just cause the whole thing to take much longer, but you'd be dead wrong: the multithreaded version completely obliterates the single-threaded version in performance. I'm more than happy to follow up with source code here if anyone's remotely interested.

Robert
Quote:
Naturally, if you actually have two physical processors, making your code multithreaded will see performance gains of, oh, say... double? :)


If you are entirely compute bound and your working set fits in cache, that's possible.

If you're contending for the memory bus, then that's not true. There's still only a single interface to main memory, which now two separate CPUs have to suck data through.

It all depends on what you're trying to optimize.
enum Bool { True, False, FileNotFound };
To multithread or not to multithread. This is a topic under much debate in many worlds. My personal opinion is as follows.

If you plan to run your software on a single CPU, don't multithread. I noticed that you said "I multithread database accesses", but does this really need to be multithreaded? If your database API is accessed over TCP/IP, writing your own handler might do wonders for the efficiency of your code. Task switches on one CPU, with all the locking and blocking, waste more cycles than they're worth. Try to look at your design and clean it up; you might dazzle yourself with what you learn about how to handle multiple tasks at once. For example, in the client engine for my games, I get events like input (mouse etc.), hard drive, sockets, and more from a single system call. It's a lot cleaner, a lot more optimized, and god does it feel good ^_^
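For illustration only (the post above doesn't name the call), one Windows call that can cover window messages, socket events, and file I/O events in a single wait is MsgWaitForMultipleObjects; the handles and handlers below are placeholders:

    // One blocking call covering window messages (mouse/keyboard), a socket event,
    // and a file I/O event; the return value tells us which source woke us up.
    HANDLE handles[2] = { socketEvent, fileEvent };           // placeholder event handles
    DWORD r = MsgWaitForMultipleObjects(2, handles, FALSE, INFINITE, QS_ALLINPUT);
    if (r == WAIT_OBJECT_0)              ServiceSocket();     // hypothetical handlers
    else if (r == WAIT_OBJECT_0 + 1)     ServiceFile();
    else if (r == WAIT_OBJECT_0 + 2)     PumpWindowMessages();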

On my personal project I'm developing microthreaded/multithreaded software: on a single CPU it uses microthreading, and on multiple CPUs it uses multithreading.
The software will run on machines with more than one CPU as long as the code can be designed to take advantage of more than one and see a reasonable performance increase from doing so.

"If your database API is accessing over TCP/IP writing your own handeller might do wonders for the efficiency of your code."

I'm not sure what you mean by this, but it sounds similar to what hplus has mentioned before about what he did in his own project. If it's definitely more efficient, then perhaps it's worth looking into. Could you explain what you mean a little better?

Right now I have a thread that gets "query packets" from the server. Queries are put on a queue and then executed, and their results are stored and returned on a queue that the server can pop the results off of later, once they've been processed.

The only reason I did this is that I didn't want to block waiting for the results of a query that could potentially take a while to process, thus holding up the server from doing other things in the meantime.
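In rough code, the arrangement looks like this (a sketch; the Query/Result types, the locking, and Execute() are simplified placeholders, and dbLock would need InitializeCriticalSection() at startup):

    #include <windows.h>
    #include <deque>

    struct Query  { /* ... */ };          // placeholder types
    struct Result { /* ... */ };
    Result Execute(const Query& q);       // hypothetical blocking database call

    CRITICAL_SECTION dbLock;              // protects both queues
    std::deque<Query>  pendingQueries;    // pushed by the server thread
    std::deque<Result> finishedResults;   // pushed by the database thread

    DWORD WINAPI DatabaseThread(LPVOID)
    {
        for (;;)
        {
            Query q;
            EnterCriticalSection(&dbLock);
            bool haveWork = !pendingQueries.empty();
            if (haveWork) { q = pendingQueries.front(); pendingQueries.pop_front(); }
            LeaveCriticalSection(&dbLock);

            if (!haveWork) { Sleep(1); continue; }     // crude; an event would avoid the polling

            Result r = Execute(q);                     // the potentially slow part, off the main loop
            EnterCriticalSection(&dbLock);
            finishedResults.push_back(r);
            LeaveCriticalSection(&dbLock);
        }
    }

    // Server loop: push queries onto pendingQueries under the lock, and each tick pop any
    // finished results off finishedResults without ever blocking on the database.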

It's my first go around with coming up with a solution to such a problem, so my current solution could be less than ideal. Just one of those things you have to try and see how it works and if it doesn't, learn from it and do it better the next time around.

Ideas are always appreciated tho. =)

-=[ Megahertz ]=-

This topic is closed to new replies.
