
help me build the server (hardware)

Started by January 27, 2005 02:42 PM
29 comments, last by kman12 20 years ago
Quote:
You can "NOT" do this without at least one SMP machine.


We're operating a large virtual world without any multi-threaded servers and without any SMP machines(*). For more information, see the Forum FAQ regarding the pros/cons of multi-threaded versus asynchronous programming. Moral: buy hardware that matches your chosen implementation paradigm.


(*) Actually, we run our web servers on Apache 2 on Linux with threads, rather than fork() -- but the rest of the servers are thread-less.
enum Bool { True, False, FileNotFound };
We don't use any threads, and I am totally happy with it. Using threads is debugging hell when it comes to an MMO server.
Quote:
Original post by Anonymous Poster
Having worked on a major MMPG...

... make sure you program "solid" multi-threaded code. You can "NOT" do this without at least one SMP machine. This means that your primary development machine needs to be a dual processor, dual Xeon or Athlon doesn't matter, just be sure it's two honest to god processors.

... Long story short. Program for threads and remote latency using pretty much the worst hardware you can get. Building "big" boxes is actually counter to the idea of building something simple (for 32 players) and then scaling up because it will not "show" the issues with scaling.


There are only two MMOs that I know of (having worked with several of the "major" ones myself) that use high-end hardware. For scalability and cost control reasons, using numerous cheap (i.e. 1U single-CPU) servers is the way to go for game state management. For bottleneck processes, you beef up the hardware to meet your scalability requirements. Your edge servers (app gateways, whatever you call it at company XXX) and your DB server(s) will be the first weaknesses exposed in the infrastructure of any MMO.

I agree that *IF* you use multithreading, you need to make sure it is "solid". Multithreaded code is typically an order of magnitude more difficult to debug than single threaded, and in a service oriented environment your number one priority has to be to keep the service running - maintainability of hardware and software are the things that will keep you in business. Again, for scalability, consider using discrete processes rather than threading. You can always move a discrete process to another piece o' cheap hardware (IO threads excluded for obvious reasons).
hey everyone,

thanks for your replies.

@hplus

do you really think that memory latency will be such a bottleneck that i need such a powerful processor? im kind of on a budget here [grin]. however, if it is truly worth it, then i will invest. i just thought that the server would work even on a crappy machine.

also, about the second hard drive. you were right, i actually planned on using the CD burner rather than a second hard-drive. for one thing, this way i could keep backups on an infinite amount of disks, rather than just 2. but also, it is cheaper to use a CD burner. is the main reason you recommend the dual disk drives because of the if a disk crashes you're not screwed thing? because i could just make backups of the DB every day or so. if my DB ever gets bigger than can fit on a CD, then it might be time to upgrade =).

however, the extra disk is only 45 bucks, so ill most likely go with it anyway. also, i should look for a MB which supports RAID, correct? also, why do you recommend intel over AMD? i had always used AMD, and always thought they were faster / better, and even cheaper than Intel.

@Onemind

do you think i really need 512 RAM? i think what will use memory the most is definitely the maps. i would be really surprised if there was even 10 MB worth of objects in my world at any given time. and even if each map was 500x500 tiles and each tile took up 20 bytes, that is still under 5 megs for each map. i would need 50 of these huge maps to take up 256. i see your point though, since 50 huge maps i guess is not too unreasonable (however, more realistically the game will probably have 50 much smaller than 500x500 maps..).
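As a quick check of the arithmetic above, here is a minimal sketch. The 20 bytes per tile and the map counts are the figures from this post; the function and constant names are made up for illustration, and object data and allocator overhead are ignored.

```python
# Back-of-the-envelope map memory estimate, using the numbers from the post.
BYTES_PER_TILE = 20          # per-tile cost assumed in the post
MIB = 1024 * 1024


def map_bytes(width, height, bytes_per_tile=BYTES_PER_TILE):
    """Memory for one tile map, ignoring objects and allocator overhead."""
    return width * height * bytes_per_tile


one_map = map_bytes(500, 500)    # 5,000,000 bytes
fifty_maps = 50 * one_map        # 250,000,000 bytes

print(f"one 500x500 map: {one_map / MIB:.2f} MiB")
print(f"fifty such maps: {fifty_maps / MIB:.1f} MiB")
print("fits in 256 MiB:", fifty_maps < 256 * MIB)
```

Fifty full-size maps come in just under 256 MiB on their own, which leaves nothing for the OS, the database, or the game objects.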

last, could someone explain what exactly is errors in memory and error checking? (and why it would be important to have this..)

oh and BTW everyone, the game runs in a single thread. so no MP systems for me =).

thanks again.
FTA, my 2D futuristic action MMORPG
RAM is the cheapest way to improve perf. One page fault is equal to millions of instructions that could have been doing useful work instead. Having said that, how much RAM you need depends on a lot of things that only you can answer. You need to look at the working set of your server when it's running heavily loaded and go up from there.
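One rough way to watch for this from inside a server process is sketched below, assuming a Unix-like system and Python's standard `resource` module. Peak resident set size is only a proxy for the working set, and the `memory_report` name is hypothetical.

```python
import resource


def memory_report():
    """Return peak resident set size and page-fault counts for this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "peak_rss": usage.ru_maxrss,       # kilobytes on Linux, bytes on macOS
        "minor_faults": usage.ru_minflt,   # satisfied without disk I/O
        "major_faults": usage.ru_majflt,   # needed disk I/O (the expensive kind)
    }


report = memory_report()
print(report)
```

Logging this periodically under heavy load shows whether major faults are climbing, which is the sign that the machine needs more RAM.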

2+ GHz cpu is probably faster than most commercial mmps out today. Maybe WoW or EQ2 uses servers in that range but anything else I'm sure is slower.

memory and error checking - I'm interpreting this to mean you're asking about ECC RAM. Basically the thing is that memory can be corrupted for essentially random reasons even if your app is 100% correct - bad chips, stray cosmic rays, power glitches, etc, etc. ECC RAM automagically fixes one-bit errors and can detect multiple-bit errors. This is a decent discussion. For small scale stuff you don't really need it but when real money is on the line you're risking undetectable data corruption or really weird crashes that take a lot of (expensive) dev time to analyse. When I posted my initial comment I had just spent most of a day debugging what turned out to be a single-bit error in the esp register that threw the stack off in a real subtle way.
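To illustrate the "fix one-bit errors, detect two-bit errors" behaviour, here is a toy sketch of a Hamming code with an extra overall parity bit. It protects only 4 data bits, whereas ECC DIMMs commonly store 8 check bits per 64 data bits and do the work in hardware. The `encode` and `decode` names are made up for this example.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit codeword (a list of bits)."""
    code = [0] * 8                  # index 0: overall parity; 1..7: Hamming positions
    code[3] = nibble & 1
    code[5] = (nibble >> 1) & 1
    code[6] = (nibble >> 2) & 1
    code[7] = (nibble >> 3) & 1
    code[1] = code[3] ^ code[5] ^ code[7]   # parity over positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]   # parity over positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]   # parity over positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2             # make the whole word even parity
    return code


def decode(code):
    """Return (nibble, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos         # XOR of set positions is 0 for a valid word
    parity_broken = sum(code) % 2 == 1
    if syndrome == 0 and not parity_broken:
        status = "ok"
    elif parity_broken:
        code[syndrome] ^= 1         # single flip; syndrome 0 means the parity bit itself
        status = "corrected"
    else:
        return None, "uncorrectable"    # two bits flipped: detected, not fixable
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status


word = encode(0b1011)
word[6] ^= 1                        # simulate a stray single-bit flip
print(decode(word))
word[2] ^= 1                        # a second flip in the same word
print(decode(word))
```

Without the check bits, the first flip would silently turn one value into another, which is the undetectable corruption described above.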

Threading. The mmp I worked on used threading but in a different way than is commonly discussed. Basically we looked at the logical divisions between our data and built our threading around that. It was nice because any particular set of data was accessed in a single-threaded way (avoiding the complexities of full-on threading). These different tasks communicated via sockets on the backend and thus we could put them in the same process, a different process, or even on a different machine as we (or our load balancing task) decided. If you're familiar with COM it was a lot like the difference between apartment threading and free threading.
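A minimal sketch of that idea, not the actual code from that game: one task owns a piece of data outright and everything else talks to it over a socket. The inventory example, the line-based message format, and all names here are assumptions for illustration.

```python
import socket
import threading


def inventory_task(conn):
    """Owns the inventory data; no other task touches it directly."""
    inventory = {}                          # private to this task, so no locks
    reader = conn.makefile("r")
    writer = conn.makefile("w")
    while True:
        line = reader.readline()
        if not line:                        # peer closed the connection
            break
        player, item = line.split()
        inventory.setdefault(player, []).append(item)
        writer.write(f"{player} now has {len(inventory[player])} item(s)\n")
        writer.flush()


# Same process here; a TCP connection would put the task on another machine.
client_sock, server_sock = socket.socketpair()
threading.Thread(target=inventory_task, args=(server_sock,), daemon=True).start()

reader = client_sock.makefile("r")
writer = client_sock.makefile("w")


def request(player, item):
    """Send one request to the inventory task and wait for its reply."""
    writer.write(f"{player} {item}\n")
    writer.flush()
    return reader.readline().strip()


first = request("alice", "sword")
second = request("alice", "shield")
print(first)
print(second)

writer.close()
reader.close()
client_sock.close()
```

Because the only interface is the socket, moving the task to another process or host changes how the connection is made and nothing else.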
-Mike
Quote:
do you really think that memory latency will be such a bottleneck that i need such a powerful processor?


You said you wanted to scale. After disk, memory throughput is where the next bottleneck is likely to be. Somewhat depending on your game, though -- a profile will tell you. Chances are, anything will do for now. And, chances are, your simulation is less costly than the simulation we do, so you won't actually go memory bound at all as quickly.

Thinking more about your game, and especially what the AP said: if you're serving a large number of maps, AND are running a SQL server on the same machine, upgrading to 512 MB of RAM is likely to help more than upgrading CPU and RAM bandwidth, once you go limited. So, if you want to save money, AMD is fine, and the RAM you have is fine, just add the second disk. Your next upgrade is to swap a 256 (or even 512) MB stick for the 128s, when and if you need it. If you want to prepare, get one 256 stick instead of two 128s (it's not going to matter with that CPU, because it has a FSB slower than the memory).

Of course, getting to even 30 simultaneous online players might take a while, and you might be OK with swapping out parts once you find that there's a problem. Also, you should run some load tests, and optimize the obvious bloated parts, but then let the system run under production load. It's more important that you build good measurement and reporting, than optimizing everything up-front. Once the measurements and reports tell you what's the problem ("I'm paging because of badly indexed MySQL queries" or "I'm leaking memory when players log off" or whatever), you can optimize that part, and save time on the others that don't need it.
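As a rough sketch of what "measurement and reporting" could look like at its simplest, the class below accumulates wall-clock time per named section and reports the most expensive ones first. The `Stats` class and the section names are hypothetical, and the sleeps stand in for real work.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class Stats:
    """Accumulates call counts and wall-clock time per named section."""

    def __init__(self):
        self.calls = defaultdict(int)
        self.seconds = defaultdict(float)

    @contextmanager
    def timed(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.calls[name] += 1
            self.seconds[name] += time.perf_counter() - start

    def report(self):
        """Return (name, calls, total_seconds) rows, slowest section first."""
        rows = sorted(self.seconds.items(), key=lambda kv: kv[1], reverse=True)
        return [(name, self.calls[name], total) for name, total in rows]


stats = Stats()
with stats.timed("db_query"):
    time.sleep(0.05)        # stand-in for a slow, badly indexed query
with stats.timed("world_tick"):
    time.sleep(0.001)       # stand-in for one simulation step

for name, calls, total in stats.report():
    print(f"{name:12s} calls={calls} total={total * 1000:.1f} ms")
```

Dumping a report like this to a log every few minutes under production load is usually enough to show which part deserves optimizing.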


I didn't say use RAID 1 instead of back-ups. I said use RAID-1 in addition to back-ups. Back-ups are taken at some point in time -- say, weekly, or daily. However, if you lose a disk, then all player data back to the time of the backup may be lost. With RAID 1, that's much less likely to happen, as you can keep running degraded, while sending away for a replacement disk (and making sure to be extra cautious about your backups until you get the second disk). In fact, you'll save money getting 1x256 instead of 2x128.

You don't need motherboard RAID if you're running Linux, as the "md" driver that comes with Linux does RAID-1 for you all by itself, when correctly configured.
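Running degraded only helps if you notice it. Below is a minimal sketch of a check that could run from cron, assuming the usual `/proc/mdstat` layout where each array's member map looks like `[UU]` and a missing disk shows as `_`. The sample text, device names, and function name are made up for illustration.

```python
import re

# Hypothetical sample of /proc/mdstat; md1 has lost one of its two disks.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 hdc1[1] hda1[0]
      39061952 blocks [2/2] [UU]

md1 : active raid1 hda2[0]
      1048512 blocks [2/1] [U_]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return names of md arrays whose member map shows a missing disk."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        status = re.search(r"\[([U_]+)\]", line)
        if current and status and "_" in status.group(1):
            degraded.append(current)
    return degraded


print(degraded_arrays(SAMPLE_MDSTAT))
```

On a real server the text would come from reading `/proc/mdstat` instead of the sample string, and a non-empty result would trigger an email or similar alert.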


Intel vs AMD: AMD is all right, although unless you get the latest FX stuff, their memory busses are not usually as fast as Intel. However, I've had some really bad experiences with VIA chip sets, so it'll be Intel for me for reliability. I guess a second option would be NVIDIA chip sets -- although not during the first six months of release :-/. SiS and the others aren't even options, for my personal stuff -- your mileage may vary.
enum Bool { True, False, FileNotFound };
Having written an OS for SMP machines, I can say with certainty that any user-level bug that would show up on a MP (Multi-Processor) machine, will also show up on a SP (Single Processor) machine, although sometimes with less frequency. The big difference with SMP is the driver model -- "cli" no longer ensures that nothing else is mucking with your devices and memory. Assuming your drivers and kernel are properly written, however, the application model and failure modes really are no different between a MP and a SP machine, except possibly for certain timing related anomalies, and the anomaly that on an MP machine, you have more CPUs competing for the same memory bus, and thus lower throughput per CPU.

When it comes to clustered machine setups (which IBM tries to call "grids" these days), I think that the best way to test is to run the processes on different physical machines -- chances are, you'll get two fast SP machines for less money than one "equivalently fast" MP machine. In my mind, as an OS guy who turned to distributed simulation, SMP only helps if you are compute bound (not memory bound) AND your architecture needs to use shared memory for IPC, say because you do zero-lookahead interactions and don't have a facility to replay.

So, I guess I disagree that SMP is a necessary testing environment, unless you don't trust your drivers or kernel (or that's what you're developing). I think it's great that you expressed your opinions with clarity in this thread, though, so readers can read both arguments and make up their own minds.
enum Bool { True, False, FileNotFound };
hi,

i have been doing a little more shopping and have decided on a new mb, cpu, ram, and HD based on your suggestions.

first, hplus, when you were talking about memory latency, this has to do with the speed of the RAM and also the FSB of the MB? anyway, heres the new stuff.. (going with Intel for one)

new motherboard:


ASUS "P4S800" SiS648FX Chipset Motherboard for Intel Socket 478 CPU -RETAIL


one thing that bothered me, is someone left a review for this board and they said "the only drag to this mobo is the byzantine set of memory restrictions.". anyone know what this means and if it affects me? the other downside is theres no on board video. in fact, i cant find any ATX MB's with onboard video. maybe i should switch to Micro ATX? it seems Micro has cheaper MB's with better specs..

CPU:

Intel Celeron D 330 2.66 GHz, 533 MHz FSB, 256K L2 Cache Processor - Retail


this CPU has a higher FSB, and more MHz, and its cheaper. however it is a Celeron, a CPU ive never used before. do you think this MB/CPU combo is better than my old one? its a little more expensive, but it has a higher FSB, which is what you were saying i should upgrade, right? (like i said, im not sure if FSB is related to memory latency..)

RAM:
VIKING 184 Pin 256MB DDR PC-3200 - OEM

this has a slower "Cas latency", whatever that means. so im thinking its better than the old RAM (and yeah its also 256 instead of 128).

HDD:

Hitachi 40GB 7200RPM IDE Hard Drive, Model HDS728040PLAT20, OEM Drive Only


this HD has an 8.5 ms seek time instead of 11. im getting 2 of these as well.

just to make sure, my MB doesnt have to mention RAID, correct? it should be configurable no matter what? just wanted to be sure, i think you mentioned something about this though..

if you check out this page, it shows some CD burners. now, none of them mention Linux / Unix as the operating system. does this mean they wont work with *nix? i didnt think that was possible, however i would like to get one of the first few mentioned on that page..

thanks again for any help.



FTA, my 2D futuristic action MMORPG
First: if you're using Linux, then you can have one boot partition that's small, and contains only the boot file system(s). All the "real" stuff can go into one or more devices managed by the "md" driver, which will create RAID-1 for you. There's a little bit of configuration to set it up, but really not all that much. There are HOWTO documents on how to do it.

Second: The FSB affects latency a bit, because loading a cache line when you get a cache line miss is faster with a faster FSB. However, you're saying that the memory you chose has a slower CAS -- CAS also affects latency. If you meant "lower CAS" (like, 2.5 instead of 3), then it'll likely be faster.

Third: OEM drives have crappier warranties. However, if you use RAID-1, and you're willing to buy a new drive when the first drive goes bad, you can save a little money up-front going that route. Just make sure you run a script that tells you about failure messages going to syslog, so you know when things go bad :-)

Fourth: CD writers use a common, standard command set, plus or minus implementation bugs. The Linux CD drivers support almost all CD burners you can buy. If you really want to be sure, take a look at the compatibility list (or the list of the burning program, such as cdrecord). For reading (not writing), all ATAPI CD drives will work just fine.

Fifth: The machine you're suggesting will probably work fine, as will the initial machine you suggested. 30 players just isn't that much.

Sixth: The 800 FSB of the chipset is a little bit wasted when you stick a 533 FSB CPU into it. 533 is still more than 333, though :-) Regarding the Byzantine memory restriction, I think they refer to the text in blue on this ASUS.com page. Seems like you'll be OK (2 PC3200 DIMMs).
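To put rough numbers on the CAS point (second item above): CAS latency is quoted in memory clock cycles, so the actual delay depends on the clock. The sketch below assumes PC-3200 (DDR-400) memory, which runs a 200 MHz clock, and compares the CL 3 and CL 2.5 figures mentioned there. The function name is made up for illustration.

```python
PC3200_CLOCK_MHZ = 200      # DDR-400 / PC-3200: 200 MHz clock, two transfers per cycle


def cas_latency_ns(cas_cycles, memory_clock_mhz):
    """CAS delay in nanoseconds: cycles divided by the clock frequency."""
    return cas_cycles * 1000.0 / memory_clock_mhz


for cl in (3, 2.5):
    ns = cas_latency_ns(cl, PC3200_CLOCK_MHZ)
    print(f"CL {cl}: {ns:.1f} ns")
```

The difference is a few nanoseconds per access, so a lower CAS number is faster, though CAS is only one part of the total latency of a cache-line miss.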

If you want a stable board with built-in graphics, look for something with an i865G or i865GV chip set. Perhaps a P4P800-MX? Yes, Micro ATX is cheaper. Btw: "GV" means you don't get an AGP port, so you can't upgrade the graphics -- doesn't matter for a server, of course.


Good luck!
enum Bool { True, False, FileNotFound };
hey hplus,

thanks a lot for that reply. i only have one last question, so you think i should go with the Micro ATX then, since it has better specs (and is cheaper) ? im just making sure theres no "gotchas" or restrictions to the whole Micro ATX thing, since it seems to be cheaper and better than regular ATX (too good to be true kind of thing). so if i go Micro, then only the case and MB have to be micro? EDIT: sorry, actually thats 2 questions [grin].

thanks again.

[Edited by - graveyard filla on January 30, 2005 10:44:36 PM]
FTA, my 2D futuristic action MMORPG

This topic is closed to new replies.
