Thanks for the replies guys, but
with either ZeroMemory or memset, the cleaning is sloooowwww...
I lose around 80 fps when I do the cleaning.
Is there any other way, so that I don't lose so much performance?
How can I do this with MMX instructions?
Thanks guys
Bruno
fastest way to clean an array
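Ok, let's say you already detected MMX.
Then:
_asm
{
mov edi, _arr_ptr // load edi with the address of the array
mov ecx, num_bytes // num_bytes: size in bytes, a nonzero multiple of 8
shr ecx, 3 // divide by 8: the loop clears 8 bytes per pass
pxor mm1, mm1 // reset MMX register mm1 to 0
clear_loop: // "loop" is an instruction mnemonic, so the label needs another name
movq [edi], mm1 // store 8 zero bytes
add edi, 8 // advance ptr to the next quadword
dec ecx
jnz clear_loop
emms // leave MMX state so FPU code works afterwards
}
Sorry if I have forgotten anything, but I hope this helps for a start.
/ Tooon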
October 09, 2000 12:08 PM
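For anyone who cannot get inline assembler to build yet, the same idea (store 8 zero bytes per pass, then finish whatever is left over) can be sketched in plain C. This is only an illustration under assumptions: clear_qwords is a hypothetical helper name that does not appear in the thread, and nothing here is claimed to beat memset.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the thread): zero n bytes starting at p.
   The main loop stores 8 bytes per pass, like the movq loop above; the
   second loop finishes the last 0..7 bytes one at a time. */
static void clear_qwords(void *p, size_t n)
{
    unsigned char *dst = (unsigned char *)p;
    const uint64_t zero = 0;

    while (n >= 8) {
        /* memcpy avoids alignment problems; optimizing compilers
           typically reduce it to a single 8-byte store. */
        memcpy(dst, &zero, 8);
        dst += 8;
        n -= 8;
    }
    while (n > 0) {   /* tail: fewer than 8 bytes remain */
        *dst++ = 0;
        n--;
    }
}
```

Calling clear_qwords(buffer, sizeof buffer) works for any size, because the tail loop covers sizes that are not a multiple of 8.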
Why are you clearing that much memory every frame anyway? I would suggest you redesign your system.
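One possible redesign, sketched here purely as an assumption about what the array is for: if it only records which pixels were used in the current frame, each cell can store the number of the frame that last touched it, and "clearing" becomes a single counter increment. The names used_stamp, current_frame, mark_used, is_used and begin_frame are all hypothetical, and the 200x200 size is borrowed from the example later in this thread.

```c
#include <stdint.h>
#include <string.h>

#define BUF_W 200   /* assumed size, taken from the 200x200 example */
#define BUF_H 200

/* Each cell holds the frame number in which it was last marked,
   instead of a 0/1 flag that has to be wiped every frame. */
static uint32_t used_stamp[BUF_H][BUF_W];
static uint32_t current_frame = 1;   /* 0 is reserved for "never used" */

static void mark_used(int x, int y)
{
    used_stamp[y][x] = current_frame;
}

static int is_used(int x, int y)
{
    return used_stamp[y][x] == current_frame;
}

/* Call once per frame in place of the full clear. */
static void begin_frame(void)
{
    current_frame++;
    if (current_frame == 0) {   /* counter wrapped: do one real clear */
        memset(used_stamp, 0, sizeof used_stamp);
        current_frame = 1;
    }
}
```

The trade-off is memory: four bytes per cell instead of one, in exchange for not touching the whole array every frame.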
quote:
Original post by Mithrandir
MOV is an expensive operation. Most ASM gurus use X xor X to clear a register/memory position.
Hmm... for a 486 it is:
xor mem, immediate value (or mem, register) - 3 clocks.
mov mem, immediate value (or mem, register) - 1 clock.
Doesn't look faster to me :-)
Well, to understand what I'm doing, download this:
www.geocities.com/brunomtc/test.zip
It's a c-buffer, and the array holds the pixels that were used. I still don't know how to use inline assembler; whatever assembler I put there, I always get an error message.
Does any of you guys know how to link assembler made by MASM with VC?
mr BiCEPS >> I think xor mem,mem is faster than mov mem,0 but when setting mem to zero the xor method can't be faster. Not in my world anyway
Wait... xor mem, mem - that isn't possible at all, is it?
One of the operands has to be either a register or an immediate.
At least that's what my opcodes manual says.
And all mov operations involving a register, an immediate value or a memory location only take one clock, so it doesn't actually get faster than that.
What are you trying to zero, the screen buffer? In OpenGL just use glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
I'd like to point out 4 things:
As usual, a better method is needed, not better code, but here goes anyway...
You can't xor one memory location with another. xor reg,reg is the usual idiom for clearing a register, mainly because it encodes shorter than mov reg,0.
An array declared as array[200][200] is contiguous RAM, but one built from separately allocated rows is not necessarily, so memset(array, 0x0, 200*200*sizeof(array[0][0])) is only a good idea when you know the array is contiguous.
If you use MMX (64-bit registers, so 8-byte blocks), the array size needs to fall evenly on an 8-byte block, which 200x200 does, so I guess you're OK. If you use a different sized array, you have to stop short and use normal regs for the last few bytes.
Edited by - Magmai Kai Holmlor on October 9, 2000 11:15:59 PM
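To make the contiguity point concrete, here is a small sketch of the two layouts. The unsigned char element type and the names ROWS, COLS, clear_static and clear_rows are assumptions for illustration only; the thread does not say how the array is actually declared.

```c
#include <stdlib.h>
#include <string.h>

#define ROWS 200
#define COLS 200

/* A real 2D array (unsigned char grid[ROWS][COLS]) is one contiguous
   block, so a single memset over ROWS * COLS elements covers it. */
static void clear_static(unsigned char grid[ROWS][COLS])
{
    memset(grid, 0, (size_t)ROWS * COLS * sizeof grid[0][0]);
}

/* An array of row pointers, each row allocated on its own, gives no
   such guarantee: the rows can live anywhere, so clear them one by one. */
static void clear_rows(unsigned char **rows)
{
    for (int y = 0; y < ROWS; y++)
        memset(rows[y], 0, COLS * sizeof rows[y][0]);
}
```

Note that sizeof grid[0] would be the size of a whole row, not of one element, which is why the element size is written as sizeof grid[0][0].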
- The trade-off between price and quality does not exist in Japan. Rather, the idea that high quality brings on cost reduction is widely accepted.-- Tajima & Matsubara