I'm not experienced with MMOs at all; however, I am experienced with scalability, games and networking, and I have a couple of comments.
In recent discussions with web and app developers one thing has become quite clear to me - the way they tend to approach scalability these days is somewhat different to how game developers do it. They are generally using a purer form of horizontal scaling - fire up a bunch of processes, each mostly isolated, communicating occasionally via message passing or via a database. This plays nicely with new technologies such as Amazon EC2, and is capable of handling 'web-scale' amounts of traffic - e.g. clients numbering in the tens or hundreds of thousands - without problem. And because the processes only communicate asynchronously, you might start up 8 separate processes on an 8-core server to make best use of the hardware.
I wouldn't put so much trust in "how web developers approach scalability".
Not too long ago, we had the C10K problem. Browsing the net, the recommendation was "just use fork, it scales very well". It turns out fork has an initialization overhead (which, like you said, you work around by preallocating). Then the advice became "use one socket per thread, super scalable! The Linux kernel does the magic for you", and select and poll were the recommended tools. Then someone dug through the Linux kernel source and found that the kernel walks linearly through the list of sockets to figure out which process/thread the data needs to be delivered to. Some of the algorithms involved had O(N^2) complexity.
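To make that concrete, here's a minimal sketch of my own (not code from this discussion) of the classic select() loop. Even before the kernel's internal scan, userspace walks every client socket twice per iteration, whether or not anything happened on it; handle_client() is just a hypothetical stub.

```c
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical stub: just drain whatever the client sent. */
static void handle_client(int fd)
{
    char buf[512];
    read(fd, buf, sizeof buf);
}

static void serve_select(int *client_fds, int n_clients)
{
    for (;;)
    {
        fd_set readfds;
        FD_ZERO(&readfds);
        int maxfd = 0;

        /* O(N) walk just to rebuild the fd set every single iteration
         * (and fd_set tops out at FD_SETSIZE anyway)... */
        for (int i = 0; i < n_clients; ++i)
        {
            FD_SET(client_fds[i], &readfds);
            if (client_fds[i] > maxfd)
                maxfd = client_fds[i];
        }

        if (select(maxfd + 1, &readfds, NULL, NULL, NULL) < 0)
            break;

        /* ...and another O(N) walk to find the ready sockets, even if
         * only one client out of 10,000 actually sent something. */
        for (int i = 0; i < n_clients; ++i)
            if (FD_ISSET(client_fds[i], &readfds))
                handle_client(client_fds[i]);
    }
}
```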
Someone fixed this, and then we got epoll.
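The epoll version, again just a rough sketch of mine: each socket is registered with the kernel once, and epoll_wait() hands back only the descriptors that are actually ready, so the per-iteration cost follows the number of active connections rather than the total number of connections.

```c
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stub, same as before: drain whatever the client sent. */
static void handle_client(int fd)
{
    char buf[512];
    read(fd, buf, sizeof buf);
}

static void serve_epoll(int listen_fd)
{
    int epfd = epoll_create1(0);

    struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
    epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);

    struct epoll_event ready[64];
    for (;;)
    {
        /* Blocks until something happens; returns only the ready fds. */
        int n = epoll_wait(epfd, ready, 64, -1);
        for (int i = 0; i < n; ++i)
        {
            if (ready[i].data.fd == listen_fd)
            {
                int client = accept(listen_fd, NULL, NULL);
                struct epoll_event cev = { .events = EPOLLIN, .data.fd = client };
                epoll_ctl(epfd, EPOLL_CTL_ADD, client, &cev); /* registered once */
            }
            else
            {
                handle_client(ready[i].data.fd);
            }
        }
    }
}
```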
But the problem still remains that we have one socket per TCP connection, and that sucks hard. A C10M-problem article points out these issues and names the driver stack as the biggest bottleneck.
The first time I dug deeply into scalable networking for a client project, I ran into this bizarre architectural flaw, and nobody talked about it, as if there were no problem at all. Then that C10M article appeared, and I was relieved that finally someone was pointing out the same problems I saw.
So, no. I don't trust the majority of web developers to do highly scalable web development. Most of the time they just get lucky that their servers never see enough stress to be (D)DoSed. But my gut feeling is that if they were better at that job, they could handle the same server load with a far smaller server-farm budget.
Sure, at a very high level, with distributed servers like Amazon EC2, these paradigms work. But beware that a user waiting 5 seconds for the search results for their long-lost friend on Facebook is acceptable (*). A game with 5 seconds of lag for casting a spell is not.
Half an hour of delay until my APK gets propagated across Google Play Store servers is acceptable and reasonable. Half an hour of delay until my newly created character gets propagated so I can start playing is not.
(*) Many giants (e.g. Google, Amazon) are actively working on solutions, as Amazon (or was it Apple?) found that a few milliseconds' improvement in page load time correlated with higher sales.
Web content has a much higher consumption rate than production rate. Games have the annoying property that they have both frequent read and write access to everything (you can mitigate this by isolating state, but there's a limit).
Frequent write access hinders task division, which is necessary for scaling across cores/machines/people/whatever.
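To illustrate what I mean by isolating (purely my own sketch with made-up names, not anything from a real engine): split the world into zones, let each zone's thread be the only writer of that zone's entities, and turn every cross-zone effect into a message drained at a well-defined point in the tick. The frequent writes stay local and uncontended, and the limit shows up as soon as interactions have to cross a zone boundary.

```c
#include <pthread.h>
#include <string.h>

#define MAX_ENTITIES 1024
#define MAX_MSGS     256

/* A cross-zone effect (e.g. "deal damage to entity Y over there") becomes
 * data instead of a direct write into another zone's memory. */
typedef struct { int target_entity; int damage; } ZoneMsg;

typedef struct
{
    /* Owned exclusively by this zone's thread: read and written without locks. */
    int hp[MAX_ENTITIES];
    int entity_count;

    /* The only shared state: a small inbox that other zones append to. */
    pthread_mutex_t inbox_lock;
    ZoneMsg inbox[MAX_MSGS];
    int inbox_count;
} Zone;

/* Called by *other* zones instead of touching our entities directly. */
static void zone_post(Zone *z, ZoneMsg msg)
{
    pthread_mutex_lock(&z->inbox_lock);
    if (z->inbox_count < MAX_MSGS)
        z->inbox[z->inbox_count++] = msg;
    pthread_mutex_unlock(&z->inbox_lock);
}

/* Runs on the zone's own thread/core, once per tick. */
static void zone_update(Zone *z)
{
    /* Drain cross-zone messages at one well-defined point... */
    pthread_mutex_lock(&z->inbox_lock);
    ZoneMsg pending[MAX_MSGS];
    int n = z->inbox_count;
    memcpy(pending, z->inbox, n * sizeof(ZoneMsg));
    z->inbox_count = 0;
    pthread_mutex_unlock(&z->inbox_lock);

    for (int i = 0; i < n; ++i)
        z->hp[pending[i].target_entity] -= pending[i].damage;

    /* ...then the bulk of the tick writes local entities with zero contention. */
    for (int i = 0; i < z->entity_count; ++i)
        if (z->hp[i] < 0)
            z->hp[i] = 0;
}
```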