a memset that doesn't suck

Started by February 10, 2001 03:58 PM
39 comments, last by Shannon Barber 23 years, 11 months ago
Is there a memset type command that copies DWORDs instead of BYTEs? Anywhere in the C lib, STL, or Win32? Or do you have to roll-your-own?

Magmai Kai Holmlor
- The disgruntled & disillusioned
- The trade-off between price and quality does not exist in Japan. Rather, the idea that high quality brings on cost reduction is widely accepted.-- Tajima & Matsubara
You need a roll-your-own.

Try here: http://www.azillionmonkeys.com/qed/blockcopy.html

Why not use MMX? Shift two DWORDs at a time ...




"NPCs will be inherited from the basic Entity class. They will be fully independent, and carry out their own lives oblivious to the world around them ... that is, until you set them on fire ..." -- Merrick

"It is far easier for a camel to pass through the eye of a needle if it first passes through a blender" -- Damocles
The AMD SDK comes with a 3DNow! enhanced memcpy, but I can't find an enhanced memset. Maybe you could take some hints from amd_memcpy, though.



http://www.gdarchive.net/druidgames/
ZeroMemory will do what you want. I ran a test clearing a 16MB block 128 times and it took about 12.72 seconds. That works out to around 160MB/s with PC100 memory and a 450MHz Pentium III.
Keys to success: Ability, ambition and opportunity.
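For anyone wanting to repeat that kind of measurement, here is a minimal timing sketch. It is not the poster's test program: the names throughput_mb_per_s and measure_clear_mb_per_s are made up for illustration, it times std::memset rather than ZeroMemory, and it assumes 1MB = 1024 * 1024 bytes.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: turn a timed run into MB/s (1 MB = 1024 * 1024 bytes).
double throughput_mb_per_s(std::size_t block_bytes, int passes, double seconds) {
    assert(seconds > 0.0);
    const double total_mb =
        static_cast<double>(block_bytes) * passes / (1024.0 * 1024.0);
    return total_mb / seconds;
}

// Hypothetical harness: clear one block `passes` times and report MB/s.
double measure_clear_mb_per_s(std::size_t block_bytes, int passes) {
    assert(block_bytes > 0 && passes > 0);
    std::vector<unsigned char> block(block_bytes, 0xFF);
    volatile unsigned char sink = 0;  // discourages the compiler from dropping the stores

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < passes; ++i) {
        std::memset(block.data(), 0, block.size());
        sink = block[block.size() - 1];  // read back one byte per pass
    }
    const auto stop = std::chrono::steady_clock::now();
    (void)sink;

    const double seconds = std::chrono::duration<double>(stop - start).count();
    return throughput_mb_per_s(block_bytes, passes, seconds);
}
```

Plugging the figures from the post into the helper (16MB, 128 passes, 12.72 seconds) gives roughly 161MB/s, which matches the "around 160MB/s" quoted above.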
I always thought that ZeroMemory was a macro for memset. If that's the case, that'll be just like using memset, right?
In a team, you either lead, follow or GET OUT OF THE WAY.
ZeroMemory is a macro for memset().

-Ironblayde
 Aeon Software

Down with Tiberia!!
"Your superior intellect is no match for our puny weapons!"
memset, memcpy, etc. are usually highly optimized by the maker of the compiler. Which compiler are you using?
I thought it was strange that, when I traced into the Windows call, the routine was exactly the same as the source for the runtime library. OK, so correction: the Borland runtime library implements memset with a Pentium-optimized algorithm. Just implementing my own routine to increment a pointer and move zero got 145MB/s. Unrolling the loop by doing *j = *(j+1) = *(j+2) = *(j+3) = 0 instead of just *j = 0 kicked it back up to 160MB/s. I don't think you really need anything elaborate to clear memory, although done just right you might be able to get it up to 200MB/s.
Keys to success: Ability, ambition and opportunity.
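The unrolled loop described above might look something like this. It is only a sketch: the function name clear_dwords_unrolled is hypothetical, and the tail loop for counts that are not a multiple of four is an addition the post does not mention.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical name; zeroes `count` 32-bit values starting at `dest`.
void clear_dwords_unrolled(std::uint32_t* dest, std::size_t count) {
    assert(dest != nullptr || count == 0);
    std::uint32_t* j = dest;
    std::uint32_t* const end = dest + count;

    // Main loop: four stores per iteration, as in the post.
    while (end - j >= 4) {
        *j = *(j + 1) = *(j + 2) = *(j + 3) = 0;
        j += 4;
    }

    // Tail loop (an assumption, not from the post): the 0 to 3 leftovers.
    while (j != end) {
        *j++ = 0;
    }
}
```

Whether the unrolling still buys anything depends on the compiler, since many optimizers unroll or vectorize a plain loop on their own, so it is worth timing both versions.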
////////////////////////////////////////////////////
//
//    Function Name:  MemSet32Bit
//          Summary:  Sets memory chunks of 32-bit
//                    blocks.
//
//       Parameters:  void* dest - The address to set.
//                    int fill - The value to set to.
//                    int count - The number of 32-bit
//                                blocks to set.
//
//          Returns:  None.
//
//      Description:  If the size of the memory to
//                    fill does not equal a multiple
//                    of four, you will be responsible
//                    for changing the remaining 1 to 3
//                    bytes. Because the "fill" value
//                    is expecting a 32-bit value, I
//                    can not have the function
//                    automatically adjust the value
//                    to the last remaining bytes.
//
//         Examples:
//
//   // MemSet32Bit requires the third value to be the
//   // number of 32-bit "fill" values to place. If the
//   // size of the "buffer" is 33 bytes, MemSet32Bit will
//   // not set the final byte in the buffer...
//   int* buffer = new int[4194304];
//   MemSet32Bit( buffer, 0, 4194304 );
//
////////////////////////////////////////////////////
__inline void MemSet32Bit ( void* dest, int fill, int count )
{
  if (count > 0)
  {
    _asm
    {
      mov   eax, fill;   // store value to copy...
      mov   ecx, count;  // how many copies
      mov   edi, dest;   // to where
      rep   stosd;
    }
  }
}


I don't believe it'd do much good, but here's what I use. It pushes through about 169MB/s (P3 750MHz). It is, as you can tell, rather restrictive and may be more of a hindrance if your buffer size is not an even multiple of four.

Regards,
Jumpster
Semper Fi
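For compilers without MSVC-style inline assembly, a portable variant that also covers the leftover 1 to 3 bytes could be sketched as below. This is not Jumpster's routine: the name fill_pattern32 is hypothetical, it takes a byte count instead of a count of 32-bit blocks, and filling the tail with the leading bytes of the pattern (in memory order) is one possible convention, not something the post specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical name; repeats the 32-bit `fill` pattern over `byte_count` bytes.
void fill_pattern32(void* dest, std::uint32_t fill, std::size_t byte_count) {
    assert(dest != nullptr || byte_count == 0);
    unsigned char* out = static_cast<unsigned char*>(dest);
    std::size_t remaining = byte_count;

    // Whole 32-bit blocks. memcpy keeps this valid for unaligned buffers.
    while (remaining >= sizeof fill) {
        std::memcpy(out, &fill, sizeof fill);
        out += sizeof fill;
        remaining -= sizeof fill;
    }

    // Assumed convention: the last 1 to 3 bytes get the start of the pattern.
    if (remaining > 0) {
        std::memcpy(out, &fill, remaining);
    }
}
```

With a 33-byte buffer, as in the comment block's example, this writes eight full blocks and then one more byte, so the final byte is no longer left untouched. No speed claim is made here; it would need timing against the rep stosd version.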
quote:
Original post by Magmai Kai Holmlor

Is there a memset type command that copies DWORDs instead of BYTEs?

Anywhere in the C lib, STL, or Win32?

Or do you have to roll-your-own?

Magmai Kai Holmlor
- The disgruntled & disillusioned



STL solution:
#include <algorithm>

DWORD array[ARRAY_SIZE];
std::fill(array, array + ARRAY_SIZE, INITIAL_DWORD_VALUE);


Don't know if you'll get the speed out of this that you'd get with Jumpster's solution, but if you're looking for a one-line standard library memset for any type, there ya go.
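To illustrate the "any type" part, here is a small sketch that wraps std::fill for built-in arrays. The names fill_array and demo_fill are hypothetical, and std::uint32_t stands in for the Win32 DWORD type so the snippet does not need windows.h.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical wrapper: fills every element of a built-in array with `value`.
// The array length N is deduced, so there is no count to get wrong.
template <typename T, std::size_t N>
void fill_array(T (&array)[N], const T& value) {
    std::fill(array, array + N, value);
}

// Hypothetical usage: works the same for 32-bit integers and for doubles.
bool demo_fill() {
    std::uint32_t dwords[8];
    fill_array(dwords, std::uint32_t{0xCAFEBABEu});

    double reals[3];
    fill_array(reals, 1.5);

    assert(dwords[0] == 0xCAFEBABEu && dwords[7] == 0xCAFEBABEu);
    return std::all_of(reals, reals + 3, [](double r) { return r == 1.5; });
}
```

Unlike memset, the fill value here is a whole element, so non-zero patterns and non-byte types behave as expected.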

This topic is closed to new replies.
