
asm expert needed =)

Started by September 07, 2000 08:48 PM
8 comments, last by md2ge 24 years, 3 months ago
I wonder if somebody could help me with some asm code. I need to clear out (set to 0) an array of 129x129 char every frame. I'm wondering if a specific asm function to do this would be faster than memset(). Thanks in advance, Rick
Hi.

Well, this is the best I can think of...
(This code assumes the use of MASM)

    ClearArray  proc   near   uses eax ecx edi,
                              Destination:dword
    ;NOTE: ecx (the byte count) is never initialized in this version;
    ;the revised version later in the thread sets it
    ;cache destination address in a register
       mov   edi,Destination
    ;initialize value used to reset the destination
       xor   eax,eax
    FirstByteLoop:
    ;if the destination address is dword aligned, go by the dword
       test  edi,11b
       jz    ByTheDWord
    ;reset next byte
       mov   [edi],al   ;faster than stosb
       dec   ecx
       inc   edi
       jmp   FirstByteLoop
    ByTheDWord:
    ;see if there is a complete dword to reset
       cmp   ecx,4
       jb    LastByteLoop
    ;reset next dword
       mov   [edi],eax   ;faster than stosd
       sub   ecx,4
       add   edi,4
       jmp   ByTheDWord
    LastByteLoop:
    ;see if we have any bytes left to reset
       or    ecx,ecx
       jz    Exit
    LastByteLoop1:
    ;reset next byte
       mov   [edi],al
       inc   edi
       dec   ecx
       jnz   LastByteLoop1
    Exit:
       ret
    ClearArray  endp



Later, Topgoro
We emphasize "gotoless" programming in this company, so constructs like "goto hell" are strictly forbidden.
Why don't you try this:

    void set129x129array(void *dest)
    {
       asm{
          xor eax, eax
          mov ecx, 32
          mov edi, dest
          rep stosd
          stosb
       };
    };


"Now you will see time life music commercials for the rest of your life, no forever! Bwahaha!"
"Nooooooooo!!!!!!!!!!"

Sludge Software
www.sludgesoft.com
Developing a secret of mana style role-playing-game
I think a loop would be faster than stos* under multitasking, since a task switch can take place during the stos* and it would take longer to "return" to the stos* than to the loop. Well, I could be quite wrong; I have never tested it.


quote: Original post by Sludge

Why don't you try this:

    void set129x129array(void *dest)
    {
       asm{
          xor eax, eax
          mov ecx, 32
          mov edi, dest
          rep stosd
          stosb
       };
    };


Hi all.

I will tell you why: optimization

I start one byte at a time until the destination address is aligned, because due to the way the processor is connected to memory, aligned writes are faster than misaligned writes.

But you are right otherwise, once the destination address is aligned, it would be better to use a rep stosd.

By the way, you are only resetting a 129x1 array, and you did not make sure the direction flag is clear. But don't worry, I did not initialize ecx in my first code example at all LOL.
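
The byte counts behind that remark can be checked with a few lines of C. This is only an illustrative sketch: the helper names are made up for this example and do not appear anywhere in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, named only for this illustration. */

/* Total bytes in a dim x dim array of char. */
static size_t array_bytes(size_t dim)
{
    return dim * dim;
}

/* Whole dwords (4-byte units) in a byte count: the "shr ecx,2" step. */
static size_t whole_dwords(size_t bytes)
{
    return bytes >> 2;
}

/* Bytes left over after the dwords: the "and ecx,11b" step. */
static size_t leftover_bytes(size_t bytes)
{
    return bytes & 3u;
}

/* Bytes written by "rep stosd" with ecx = dwords, followed by
   `extra` single-byte stores. */
static size_t bytes_cleared(size_t dwords, size_t extra)
{
    return dwords * 4 + extra;
}
```

With these, a 129x129 array comes to 16641 bytes, which is 4160 dwords plus 1 leftover byte, while 32 dwords plus a single stosb only covers 129 bytes (one row).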

Here is the new version:

    ClearArray  proc   near   uses eax ecx edi,
                              Destination:dword
    ;cache destination address in a register
       mov   edi,Destination
    ;initialize value used to reset the destination
       xor   eax,eax
    ;initialize number of bytes to reset (129*129)
       mov   ecx,16641
    FirstByteLoop:
    ;if the destination address is dword aligned, go by the dword
       test  edi,11b
       jz    ByTheDWord
    ;reset next byte
       mov   [edi],al
       dec   ecx
       inc   edi
       jmp   FirstByteLoop
    ByTheDWord:
    ;save the byte count
       push  ecx
    ;determine number of dwords (bytes / 4)
       shr   ecx,2
    ;make sure the direction flag is clear so edi increments
       cld
    ;do the reset
       rep stosd
    ;restore the byte counter
       pop   ecx
    ;determine number of bytes left to reset
       and   ecx,11b
    LastByteLoop:
    ;do the reset
       rep stosb
       ret
    ClearArray  endp
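
For readers who do not use MASM, the same three-phase idea (bytes until aligned, then whole dwords, then the leftover bytes) can be sketched in C. This is an illustration, not a translation of the routine above: `clear_aligned` is a hypothetical name, it takes the byte count as a parameter instead of hard-coding 16641, and it adds a check for running out of bytes during the first phase.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: set `count` bytes starting at `dest` to zero. */
static void clear_aligned(unsigned char *dest, size_t count)
{
    assert(dest != NULL || count == 0);

    /* Phase 1: single-byte stores until dest is 4-byte aligned
       (or there is nothing left to clear). */
    while (count > 0 && ((uintptr_t)dest & 3u) != 0) {
        *dest++ = 0;
        --count;
    }

    /* Phase 2: whole dwords. memcpy of a 4-byte zero keeps the code
       free of aliasing problems; compilers commonly turn it into a
       single 32-bit store. */
    const uint32_t zero = 0;
    for (size_t dwords = count >> 2; dwords > 0; --dwords) {
        memcpy(dest, &zero, sizeof zero);
        dest += sizeof zero;
    }

    /* Phase 3: the remaining 0 to 3 bytes. */
    for (size_t rest = count & 3u; rest > 0; --rest) {
        *dest++ = 0;
    }
}
```

Calling it as `clear_aligned(buffer, 129 * 129)` clears the whole array. Whether this beats the library memset() on a given compiler is something to measure, not assume.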


MatsG: This is a very small block, so I don't think that would be a consideration (the task switching thing).

Topgoro

PS: Dang, you can only have one "source" block per reply

Edited by - Topgoro on September 8, 2000 9:20:21 AM
Hi everyone.
I agree that the loops presented above are good, but try to avoid inline asm, because it adds an unnecessary call from C to your C wrapper function, and the compiler then does some setup for the inline asm. It's best to write a separate .asm file, assemble it, and link the .obj file, because then the code is placed right alongside your code with only a call to it. If you do this, make sure you place an underscore (_) in front of the asm routine name. Otherwise you will get a linker error saying it can't find your routine.

cya!

EOT MR Master

Edited by - MR_MASTER on September 8, 2000 10:52:02 AM
How about..

    void clear(char* dest)
    {
       asm
       {
          nop
       }

       memset(dest, 0, sizeof(char)*129*129);
    }
Isn't there a compiler option to force arrays to a certain alignment?
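
Alongside any compiler-specific option, later C standards offer a way to request alignment in the source itself. The sketch below assumes a C11 compiler (so it would not have been available when this thread was written) and uses `alignas` from `<stdalign.h>`; the names `grid`, `is_aligned` and `clear_grid` are made up for the example.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { GRID_DIM = 129 };

/* Hypothetical 129x129 array, requested to start on a 16-byte boundary. */
static alignas(16) char grid[GRID_DIM][GRID_DIM];

/* Returns nonzero if p is a multiple of `alignment` bytes. */
static int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0);
    return ((uintptr_t)p % alignment) == 0;
}

/* Clear the whole array; with a known alignment, no byte-by-byte
   lead-in is needed before the wide stores. */
static void clear_grid(void)
{
    memset(grid, 0, sizeof grid);
}
```

With the start address guaranteed to be aligned, only the odd trailing byte (16641 is not a multiple of 4) still needs special handling in a hand-written loop.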


Mike
"Unintentional death of one civilian by the US is a tragedy; intentional slaughter of a million by Saddam - a statistic." - Unknown
Topgoro: so you mean that for small memory writes it is better to mov to [edi] than to use stos*, and for a really big array stos* is better than mov to [edi]?
quote: Original post by Sludge

Topgoro: so you mean that for small memory writes it is better to mov to [edi] than to use stos*, and for a really big array stos* is better than mov to [edi]?



Hi.

It really depends on what you want to do. In my game engine I am using both approaches in different functions. I tried switching between them, and in one of the functions stosd/stosb is faster, while in the other function mov [edi],xxx / inc edi is faster.

It must have something to do with out-of-order execution and other architecture considerations.
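
Since the winner apparently changes from one function to another, timing the candidates in place is the only reliable guide. Below is a rough C sketch of such a comparison. Everything in it is an assumption made for illustration: the names, the frame count, and the use of the standard `clock()` function, which is coarse and measures CPU time only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

enum { GRID_BYTES = 129 * 129, FRAMES = 10000 };

/* Common signature so different clearing routines can be swapped in. */
typedef void (*clear_fn)(unsigned char *dest, size_t count);

static void clear_with_memset(unsigned char *dest, size_t count)
{
    memset(dest, 0, count);
}

static void clear_with_loop(unsigned char *dest, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        dest[i] = 0;
    }
}

/* Calls fn once per simulated frame and returns the CPU seconds used. */
static double time_clear(clear_fn fn, unsigned char *buf, size_t count)
{
    assert(fn != NULL && buf != NULL && count > 0);
    clock_t start = clock();
    for (int frame = 0; frame < FRAMES; ++frame) {
        buf[(size_t)frame % count] = 1;  /* dirty one byte so each call has real work */
        fn(buf, count);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Comparing `time_clear(clear_with_memset, buf, GRID_BYTES)` against `time_clear(clear_with_loop, buf, GRID_BYTES)` gives a first impression. The numbers depend heavily on compiler, optimization level and CPU, so treat them as a hint only.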

MR_MASTER, if you use MASM (all of my code samples assume use of MASM, the only assembler I have experience with) you can also declare the function as using the "stdcall" calling convention (or use the /Gz command line switch when assembling the file) and not have to worry about yet another detail like adding an underscore to your function names (like we programmers don't already have enough details to worry about).

By the way, what the heck is the anonymous poster talking about? LOL

Topgoro

PS: I wonder if we are being of any help to md2ge


Edited by - Topgoro on September 8, 2000 3:20:17 PM

This topic is closed to new replies.
