fastest way to clean an array

Started by October 06, 2000 05:51 AM
35 comments, last by Bruno 23 years, 7 months ago

memset() uses the asm REP STOSW instruction (on Intel machines). Even with MMX and its 64-bit registers, you'll find it difficult to beat in the general case.

Use memset(); it's fast.
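
For anyone who wants to try this, here is a minimal sketch of the memset() approach next to a plain loop. The helper names (clear_with_memset, clear_with_loop, is_all_zero) are made up for illustration and are not from any library. It assumes an int array, where all-zero bytes mean a value of 0. Which instructions memset() actually compiles to depends on your compiler and C runtime, so time both on your own machine before trusting any claim about speed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: clear `count` ints using the library routine.
   memset() takes a size in BYTES, hence the multiplication. */
void clear_with_memset(int *buf, size_t count)
{
    assert(buf != NULL);
    memset(buf, 0, count * sizeof *buf);
}

/* Hypothetical helper: the naive element-by-element loop, for comparison.
   An optimizing compiler may turn this into the same code as memset(). */
void clear_with_loop(int *buf, size_t count)
{
    assert(buf != NULL);
    for (size_t i = 0; i < count; ++i)
        buf[i] = 0;
}

/* Hypothetical helper: returns 1 if every element is zero, 0 otherwise. */
int is_all_zero(const int *buf, size_t count)
{
    for (size_t i = 0; i < count; ++i)
        if (buf[i] != 0)
            return 0;
    return 1;
}
```

Note that memset() only works this simply for a fill value of 0 (or another repeated byte pattern); to fill an int array with, say, 1, you still need a loop.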

Is there a REP STOSx opcode for MMX registers?
- The trade-off between price and quality does not exist in Japan. Rather, the idea that high quality brings on cost reduction is widely accepted.-- Tajima & Matsubara

I don't think so.
As for xor eax,eax vs. mov eax,0: on the Pentium 4, isn't the fastest way sub eax,eax? The reason is that on the P4 each integer op takes half a clock cycle, so you could clear two registers in one clock, which is the best you could hope for. As for MMX, it uses only 64-bit registers; SSE uses 128 bits, but it's made for floats. However, I do think there are integer ops for SSE... not sure, though.
Where do you get that it takes four clocks to load an opcode? While it may take four or more clocks to get a read cycle from the bus, I find it hard to believe that it takes any more than one to fetch one from the cache. Also, according to the docs for the AMD K6, it can decode many instructions shorter than nine bytes in a single clock.
On pretty much any processor (above the original K6s, I think), you have write combiners that buffer memory writes so that, for consecutive memory locations, the system always writes eight bytes at a time.

This means memory writing performance is pretty much unaffected by MMX or whatever code.
Heya,

the XOR trick only works with registers:
instead of saying mov eax, 0
you use xor eax, eax.
That way it is an optimization!

But I guess today's compilers/optimizers can make the change
automatically.

BTW
Jaap, I looked through your history and saw a post about Eindhoven from sometime in December 2000.
I have been studying in EHV for 3 years now, so if you still have questions....


Regards,
BoRReL
