
Assembly : When is it worth your time?

Started by May 25, 2004 02:57 PM
101 comments, last by OpenGL_Guru 20 years, 5 months ago
Well, for example, Small, a nice scripting language we use.
It has a few implementations of the VM.
The ASM version is ~5 times faster than the ANSI C version.
But anyway, if you don't believe me, have fun not using any ASM and wondering why other people's code works better.
Also, look at the Quake 1 source code documentation. The 3D engine (some of it) has a C and an ASM implementation. The author claims that the ASM version is twice as fast. But of course, you know better.
Properly written assembler can be faster than C code for one single reason: a human can optimize much farther ahead than a compiler. I was once writing a software renderer, only for points, as a test to see which worked faster, inline assembler or C. I was using Visual Studio 6 with full optimization enabled, and the assembler version was 2-3x faster than the C version, simply because I could leave values on the FP stack and other places for use many lines later in the code. Aside from that, assembler is also useful for using CPU features that the compiler won't use.

Assembler is also useful for doing things that would be otherwise impossible for a C program.
I am damn sick of this argument; it crops up on every programming board every couple of weeks.

Asm IS faster than C, and is certainly faster than C++ (all those classes and templates do add overhead you know).

Asm IS needed to do things that can't be done with C/C++. Your compiler will not do SSE, SSE2 or 3DNow! for you, at least not at the same level as hand-coded assembly language. When you speed-optimise in asm, you make significant algorithm changes for extra speed, like working on multiple vectors at once in a vector x matrix function. A compiler cannot make those changes for you.
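
A rough sketch of what "working on multiple vectors at once" can look like, written here with SSE intrinsics rather than hand-coded assembly. The `Vec4Batch` layout, the `transformBatch` name and the row-major matrix convention are assumptions made for illustration, not anything from the post. A scalar fallback is included for builds without SSE.

```cpp
#include <cstddef>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAVE_SSE 1
#endif

// Hypothetical structure-of-arrays batch: vector k is (x[k], y[k], z[k], w[k]).
struct Vec4Batch {
    float x[4], y[4], z[4], w[4];
};

// Multiplies four column vectors by the row-major 4x4 matrix m (m[row * 4 + col]).
// in and out must be distinct objects.
void transformBatch(const float m[16], const Vec4Batch& in, Vec4Batch& out) {
    float* dst[4] = { out.x, out.y, out.z, out.w };
#ifdef SKETCH_HAVE_SSE
    // Each register holds the same component of four different vectors.
    const __m128 x = _mm_loadu_ps(in.x);
    const __m128 y = _mm_loadu_ps(in.y);
    const __m128 z = _mm_loadu_ps(in.z);
    const __m128 w = _mm_loadu_ps(in.w);
    for (int row = 0; row < 4; ++row) {
        const float* r = m + row * 4;
        __m128 acc = _mm_mul_ps(_mm_set1_ps(r[0]), x);
        acc = _mm_add_ps(acc, _mm_mul_ps(_mm_set1_ps(r[1]), y));
        acc = _mm_add_ps(acc, _mm_mul_ps(_mm_set1_ps(r[2]), z));
        acc = _mm_add_ps(acc, _mm_mul_ps(_mm_set1_ps(r[3]), w));
        _mm_storeu_ps(dst[row], acc);  // one store writes four results
    }
#else
    // Scalar fallback: same arithmetic, one vector at a time.
    for (int row = 0; row < 4; ++row) {
        const float* r = m + row * 4;
        for (std::size_t k = 0; k < 4; ++k) {
            dst[row][k] = r[0] * in.x[k] + r[1] * in.y[k]
                        + r[2] * in.z[k] + r[3] * in.w[k];
        }
    }
#endif
}
```

The point of the layout is the algorithm change the post describes: instead of transforming one vector per call, the data is rearranged so that each instruction does the same step for four vectors.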

If you don't want to learn asm, fine. But don't put down people who have invested the time to learn asm to speed up their programs.
Anyone who is not using ASM to optimize, or even worse is using a high-level language like Python, is wasting time in game development.
No game could possibly be written in something like C# and look nice + go fast + be complex (on a PC at 1800/2000 MHz, which is normal today). If you don't trust me, try running the DX9 SDK samples for C++ & C# and then see which goes faster.
Red Drake
The only time that I have had to use assembler is when coding MMX/SSE stuff. While some compilers have intrinsics now, it seems to be a very complex problem to "detect" loops and such that can be vectorized with SIMD, although that is partially due to the severe limits currently placed on SSE instructions etc.

That said, I've managed to get a 4-8x speed improvement on software alpha blending/mapping (it can easily get >200fps blending a full 1024x768 screen in 32-bit color... quite usable), and a 2x improvement on certain per-pixel filters (water in this case).

Then again, programming for that level of SSE is not trivial... it took me a few months to get good at. Unless you're doing obviously vectorizable code, I wouldn't suggest touching assembly at all. Even if you do, only convert the small vectorized portion to ASM.
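
One way to follow the "only convert the small vectorized portion" advice is to isolate the per-pixel loop in a single function, so that this function alone can later be swapped for an MMX/SSE version. The sketch below is a plain scalar baseline, not the poster's code; the `blendRow` and `blendChannel` names, the constant per-row alpha and the packed 8-bits-per-channel pixel layout are all assumptions.

```cpp
#include <cstddef>
#include <cstdint>

// Blends one 8-bit channel: (src * alpha + dst * (255 - alpha)) / 255, rounded.
static inline std::uint32_t blendChannel(std::uint32_t src, std::uint32_t dst,
                                         std::uint32_t alpha) {
    return (src * alpha + dst * (255u - alpha) + 127u) / 255u;
}

// Hypothetical hot loop: blends n packed 32-bit pixels of src over dst
// with one constant alpha. All four bytes of each pixel are blended.
void blendRow(std::uint32_t* dst, const std::uint32_t* src,
              std::size_t n, std::uint8_t alpha) {
    for (std::size_t i = 0; i < n; ++i) {
        std::uint32_t result = 0;
        for (int shift = 0; shift < 32; shift += 8) {
            const std::uint32_t s = (src[i] >> shift) & 0xFFu;
            const std::uint32_t d = (dst[i] >> shift) & 0xFFu;
            result |= blendChannel(s, d, alpha) << shift;
        }
        dst[i] = result;
    }
}
```

Everything outside `blendRow` stays in the high-level language; a SIMD rewrite would only need to match this function's signature and results.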
quote:
The only time that I have had to use assembler is when coding MMX/SSE stuff. While some compilers have intrinsics now, it seems to be a very complex problem to "detect" loops and such that can be vectorized with SIMD, although that is partially due to the severe limits currently placed on SSE instructions etc.


Didn't I say to use it for OPTIMIZATION?
Last time I checked, SIMD was in that group.

quote:
Then again, programming for that level of SSE is not trivial... it took me a few months to get good at.


It's true, but it's not that hard either. It's just very time consuming, but once you've learned it, it becomes trivial.
(I think that after writing your code you can write an assembler-optimized version in a snap.)

Today computers get faster, and this gives developers an excuse to use something easy like C# or Visual Basic to write their code.
This is probably a good thing when writing apps with low CPU usage, but when writing a top-class 3D engine there is just no place for things like high-level languages. So the point is: don't be lazy, write your code the best you can and do nothing less, so we can all play games at more than 15 FPS.
Red Drake
If you use a .NET language then you won't have to worry about this sort of thing. These languages automatically take advantage of 3DNow!, SSE, and naturally things that aren't out yet (e.g. 64-bit extensions). This is the way of the future: highly optimized, type-safe, cross-platform languages. ASM has no place here.


That said, when someone compares C code to raw ASM and says the ASM is 10x faster, the difference may not come from one being written in C and the other in asm.

for example:

int values[5000][5000];

void doSomething()
{
    for (int y=0; y<5000; y++)
        for (int x=0; x<5000; x++)
            values[x][y]++;
}


would actually be _extremely_ inefficient code... and the compiler would likely have to be very smart to optimize that... Yet I'd bet that occurs in a lot of people's code. This code is not very efficient with the stack, but far more importantly it's hugely inefficient with the cache. The thing is, it would be hard to write it with the same inefficiency in ASM, since asm is so low level and ugly.
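
To make the cache point concrete: in the loop above the inner index is the first subscript, so consecutive accesses land 5000 ints apart in memory instead of next to each other. A minimal sketch of the two access orders follows, using a flat `std::vector` and a smaller size `N` as stand-ins (both are assumptions for illustration, as are the function names). The two functions produce identical results and differ only in traversal order.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, smaller stand-in for the 5000 x 5000 array in the post.
constexpr std::size_t N = 1024;

// Same access pattern as values[x][y]++ with x in the inner loop:
// consecutive accesses are N ints apart in memory.
// values must hold N * N elements.
void incrementStrided(std::vector<int>& values) {
    for (std::size_t y = 0; y < N; ++y)
        for (std::size_t x = 0; x < N; ++x)
            values[x * N + y]++;
}

// Loops interchanged: the inner loop now walks adjacent ints.
// values must hold N * N elements.
void incrementContiguous(std::vector<int>& values) {
    for (std::size_t x = 0; x < N; ++x)
        for (std::size_t y = 0; y < N; ++y)
            values[x * N + y]++;
}
```

How large the gap is depends on the array size, cache sizes and compiler, so it is worth timing both on the target machine rather than assuming a figure.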

And as I've said before many, many times, the number one most important thing when it comes to making your program run fast is the algorithms you use. I think someone mentioned this but I can't find it anymore. And secondly, how well you exploit the CPU cache.


I gave a lecture yesterday about how important algorithms are.
The example I used was a raytracer, and the end point was that by using the right algorithm you can make it on the order of 20,000 times faster... with this sort of margin the language you write it in has absolutely no bearing whatsoever. And that's why I have made a decision that I will never write ASM again (and that includes shaders btw).

I don't have any experience with MMX or SIMD so I can't say what sort of speedup is possible with those features of the instruction set.

That aside, a good assembly language programmer can often produce code that is 2-5 times faster than the compiler. Compilers do a good job of optimizing code but lack ingenuity. A compiler analyzes code and is able to determine when to apply certain optimizations based on a series of rules. Human programmers aren't so limited and have the ability to try many different approaches before settling on the fastest one. Compilers are not as flexible in their register usage, loop unrolling (they can do some), limiting of memory accesses or a number of other things. Also, compiler writers are not the gods of assembly optimization. Surely, some of them are really good at assembly and others are not as good. Programmers, game and otherwise, vary greatly in their skill levels.
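
As a small illustration of the kind of hand transformation meant here (loop unrolling and keeping running values in registers), the sketch below sums an array with four independent accumulators. The `sumUnrolled` name is hypothetical. Because floating-point addition is not associative, this reorders the additions, which is typically why a compiler will not do it on its own without relaxed floating-point settings. Whether it is actually faster depends on the compiler and CPU.

```cpp
#include <cstddef>

// Sums n floats using four independent accumulators (manual 4x unrolling).
// The independent chains let the CPU overlap the additions.
float sumUnrolled(const float* data, std::size_t n) {
    float a0 = 0.0f, a1 = 0.0f, a2 = 0.0f, a3 = 0.0f;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += data[i];
        a1 += data[i + 1];
        a2 += data[i + 2];
        a3 += data[i + 3];
    }
    float sum = (a0 + a1) + (a2 + a3);
    // Handle the 0 to 3 leftover elements.
    for (; i < n; ++i)
        sum += data[i];
    return sum;
}
```

The same idea carries over to assembly, where the programmer decides directly which registers hold the accumulators.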

Assembly language is harder to debug and easier to screw up, especially when you are trying to cut corners for speed rather than write robust code. Coding whole applications in assembly isn't very useful anymore. Rewriting time-critical pieces of code in assembly will probably be useful for quite some time.

Many of the job postings at various game companies still ask for assembly experience.

"Assembly programming alas is all too often considered a dying art form; however, this is definitely not the case at Naughty Dog. We take assembly programming VERY seriously and use assembly extensively in our games"
- quote from the Naughty Dog website

If you're writing a game and find that you can improve the average framerate for your game by even a few frames a second if you recode a handful of functions in assembly, would you do it? For some games, it probably wouldn't matter. If you're trying to write the next great FPS...
I'd just like to add something extra:

quote:

No game could possibly be written in something like C# and look nice + go fast + be complex (on a PC at 1800/2000 MHz, which is normal today). If you don't trust me, try running the DX9 SDK samples for C++ & C# and then see which goes faster.




Sorry, but I believe that to be wrong.

I forget the name of the company, but a while ago they compiled the Quake 2 source code using managed C++ (C++ for .NET), i.e. the same byte code that C# will produce. And they stated the performance was approximately 85% of the original C/asm version. In my opinion that is more than acceptable for a just-in-time compiled language. And as time goes on and the .NET framework is optimized, and new features are supported, this margin will shrink further (if not reverse).
That said, Quake 2 is a very good test example since its BSP algorithms are extremely CPU dependent... look into the code and every single triangle is drawn with a series of glBegin(GL_POLYGON)...glEnd() calls and goes through all the appropriate BSP culling... So with that, considering these GL calls were likely running through a third-party .NET GL library, you may just have your 15% difference there alone. (Plus you get all the significant advantages of the .NET runtime, which is easily worth the 15%.)
quote: Original post by RipTorn
int values[5000][5000];

void doSomething()
{
    for (int y=0; y<5000; y++)
        for (int x=0; x<5000; x++)
            values[x][y]++;
}

would actually be _extremely_ inefficient code... and the compiler would likely have to be very smart to optimize that... Yet I'd bet that occurs in a lot of people's code. This code is not very efficient with the stack, but far more importantly it's hugely inefficient with the cache. The thing is, it would be hard to write it with the same inefficiency in ASM...

My problem is some of my algorithms are ugly but also more complex than this one, so it would be hard to optimise those in ASM; however, I do think it might make a difference (and probably not just a small one) because of the use of stacked loops and such.



[edited by - Tree Penguin on June 3, 2004 5:32:21 AM]

This topic is closed to new replies.
