a memset that doesn't suck
Is there a memset type command that copies DWORDs instead of BYTEs?
Anywhere in the C lib, STL, or Win32?
Or do you have to roll-your-own?
Magmai Kai Holmlor
- The disgruntled & disillusioned
- The trade-off between price and quality does not exist in Japan. Rather, the idea that high quality brings on cost reduction is widely accepted. -- Tajima & Matsubara
You'll need to roll your own.
Try here: http://www.azillionmonkeys.com/qed/blockcopy.html
Why not use MMX? Its 64-bit registers let you move two DWORDs at a time ...
"NPCs will be inherited from the basic Entity class. They will be fully independent, and carry out their own lives oblivious to the world around them ... that is, until you set them on fire ..." -- Merrick
"It is far easier for a camel to pass through the eye of a needle if it first passes through a blender" -- Damocles
The AMD SDK comes with a 3DNow!-enhanced memcpy, but I can't find an enhanced memset. Maybe you could take some hints from amd_memcpy, though.
http://www.gdarchive.net/druidgames/
ZeroMemory will do what you want. I ran a test clearing a 16 MB block 128 times, and it took about 12.72 seconds. That works out to around 160 MB/s with PC100 memory and a 450 MHz Pentium III.
Keys to success: Ability, ambition and opportunity.
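For anyone who wants to repeat a measurement like the ZeroMemory test above (16 MB x 128 passes = 2048 MB, and 2048 MB / 12.72 s is roughly 161 MB/s), here is a rough sketch of such a timing loop. The function name `measure_clear_mb_per_s` is made up for illustration, and it times plain `memset` rather than the Win32 call, so treat the numbers as ballpark only.

```cpp
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper (not from the thread): clears a block of block_bytes
// bytes `passes` times with memset and returns the throughput in MB/s,
// where 1 MB = 1024 * 1024 bytes. Returns 0 for an empty block or no passes.
double measure_clear_mb_per_s(std::size_t block_bytes, int passes)
{
    if (block_bytes == 0 || passes <= 0) {
        return 0.0;
    }

    std::vector<unsigned char> block(block_bytes, 0xFF);
    // Reading one byte back after each pass discourages the compiler
    // from dropping the clears as dead stores.
    volatile unsigned char sink = 0;

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < passes; ++i) {
        std::memset(block.data(), 0, block.size());
        sink = block[block.size() / 2];
    }
    const auto stop = std::chrono::steady_clock::now();
    (void)sink;

    const double seconds = std::chrono::duration<double>(stop - start).count();
    const double megabytes =
        static_cast<double>(block_bytes) * passes / (1024.0 * 1024.0);
    return megabytes / seconds;
}
```

Calling `measure_clear_mb_per_s(16 * 1024 * 1024, 128)` mirrors the block size and pass count quoted above. Results on current hardware will be far higher than the figures in this thread.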
I always thought that ZeroMemory was a macro for memset. If that's the case, that'll be just like using memset, right?
In a team, you either lead, follow or GET OUT OF THE WAY.
memset, memcpy, etc. are usually highly optimized by the maker of the compiler. Which compiler are you using?
I thought it was strange: when I traced into the Windows call, the routine was exactly the same as the source for the runtime library. OK, so a correction: the Borland runtime library implements memset with a Pentium-optimized algorithm.
Just implementing my own routine that increments a pointer and stores zero got 145 MB/s. Unrolling the loop by doing *j = *(j+1) = *(j+2) = *(j+3) = 0 instead of just *j = 0 kicked it back up to 160 MB/s. I don't think you really need anything elaborate to clear memory, although done just right you might be able to get it up to 200 MB/s.
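A minimal sketch of that kind of unrolled clear, assuming the buffer is addressed as 32-bit words. The function name `zero_dwords_unrolled` and the tail loop for counts that are not a multiple of four are illustrative additions, not the poster's actual routine.

```cpp
#include <cstddef>
#include <cstdint>

// Zeroes `count` 32-bit words starting at `dst`, four words per iteration.
// Illustrative only: the name and the tail handling are assumptions.
void zero_dwords_unrolled(std::uint32_t* dst, std::size_t count)
{
    std::uint32_t* j = dst;
    std::uint32_t* const end = dst + count;

    // Unrolled main loop: four stores per trip, as in
    // *j = *(j+1) = *(j+2) = *(j+3) = 0.
    while (end - j >= 4) {
        *j = *(j + 1) = *(j + 2) = *(j + 3) = 0;
        j += 4;
    }

    // Tail loop: clears the remaining zero to three words.
    while (j != end) {
        *j++ = 0;
    }
}
```

Whether the manual unrolling still pays off depends on the compiler. Modern optimizers often unroll or vectorize the simple loop on their own, so it is worth measuring both forms.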
I don't believe it'd do much good, but here's what I use. It pushes through about 169 MB/s (P3, 750 MHz). It is, as you can tell, rather restrictive, and may be more of a hindrance if your buffer size is not an exact multiple of four.
Regards,
Jumpster
Semper Fi
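For readers following along, here is a rough sketch of a DWORD-wise fill with the same restriction described in the post above (buffer size must be a multiple of four). This is not Jumpster's routine. The name `fill_dwords` and its signature are hypothetical, and the asserts only make the restriction explicit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in, not the routine from the post above.
// Writes `value` into every 32-bit word of the buffer at `dst`.
// Assumptions: dst points at storage for 32-bit words, is suitably
// aligned, and size_in_bytes is an exact multiple of four.
void fill_dwords(void* dst, std::uint32_t value, std::size_t size_in_bytes)
{
    assert(size_in_bytes % sizeof(std::uint32_t) == 0);
    assert(reinterpret_cast<std::uintptr_t>(dst) % alignof(std::uint32_t) == 0);

    std::uint32_t* p = static_cast<std::uint32_t*>(dst);
    for (std::size_t n = size_in_bytes / sizeof(std::uint32_t); n != 0; --n) {
        *p++ = value;   // one 32-bit store per word
    }
}
```

A call such as `fill_dwords(pixels, 0x00FF00FF, sizeof pixels)` would set every word of a hypothetical `pixels` array to the same 32-bit pattern, which plain memset cannot do because it only replicates a single byte.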
quote:
Original post by Magmai Kai Holmlor
Is there a memset type command that copies DWORDs instead of BYTEs?
Anywhere in the C lib, STL, or Win32?
Or do you have to roll-your-own?
Magmai Kai Holmlor
- The disgruntled & disillusioned
STL solution:
#include <algorithm>
DWORD array[ARRAY_SIZE];
std::fill(array, array + ARRAY_SIZE, INITIAL_DWORD_VALUE);
I don't know if you'll get the speed out of this that you'd get with Jumpster's solution, but if you're looking for a one-line standard library memset for any type, there you go.
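To make the difference from memset concrete, here is a small self-contained sketch. The `DWORD` alias below is an assumption standing in for the Win32 typedef, and both helper names are invented for the comparison.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using DWORD = std::uint32_t;  // assumption: stands in for the Win32 DWORD typedef

// Fills every element with the full 32-bit value.
std::vector<DWORD> filled_with_std_fill(std::size_t count, DWORD value)
{
    std::vector<DWORD> v(count);
    std::fill(v.begin(), v.end(), value);
    return v;
}

// Shows the memset limitation: memset converts its int argument to
// unsigned char, so only the low byte of `value` is replicated.
std::vector<DWORD> filled_with_memset(std::size_t count, DWORD value)
{
    std::vector<DWORD> v(count);
    if (!v.empty()) {
        std::memset(v.data(), static_cast<int>(value), v.size() * sizeof(DWORD));
    }
    return v;
}
```

With a value of 0x12345678, the `std::fill` version stores 0x12345678 in each element, while the memset version ends up with 0x78787878, since only the byte 0x78 gets copied into every position.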