Advertisement

How to do Dynamic Virtual Memory Allocation on Windows

Started by June 22, 2024 06:01 PM
4 comments, last by Juliean 6 months ago

Pretty much all talks about Windows's Virtual memory API mentions the fact it allows Dynamic Arrays to be easily used.

You basically Reserve large virtual memory address with VirtualAlloc and Commit memory as you index through the reserved pages. In principle this sounds too simple. But I have been looking and could not find a single example of this. I have been told by a game developer, that this is actually one of the standard way to do dynamic arrays especially in Gamedev. So after a few hours inside Win32 API, I created this minimal example and would appreciate if someone can elaborate if this is actually the right way to do dynamic arrays with the help of virtual memory.

/*
Compile with:
set compiler_flags=/W4 /Zi /nologo /wd4100 /wd4505 /wd4201 /diagnostics:column /EHs- /EHc- /EHa- 
set libs=Ole32.lib Kernel32.lib User32.lib
set linker_flags=/INCREMENTAL:NO 
cl %compiler_flags% main.cpp %libs% /link %linker_flags%
*/

#include <Windows.h>
#include <stdint.h>
#include <stdio.h>

typedef int32_t s32;
typedef int64_t s64;
typedef int16_t s16;
typedef int8_t   s8;

typedef uint32_t u32;
typedef uint64_t u64;
typedef uint16_t u16;
typedef uint8_t   u8;

typedef float  f32;
typedef double f64;

#define ARRAY_COUNT(Array) (sizeof(Array)/(sizeof(Array[0])))
#define Global static

#define KB(Value) (  (Value) * 1024)
#define MB(Value) (KB(Value) * 1024)

struct dynamic_memory
{
    void* MemoryBase;
    s32   ReservedPages;
    s32   CommittedPages;
};

Global struct memory_table
{
    dynamic_memory Arrays[KB(4)];
    s32 Used;
} MemoryTable;



Global struct memory_info
{
    s32 PageSize;
    s32 AllocSize;

} MemoryInfo;

// TODO(ALI) A deallocation function might be used, that decommit the memory
static void*
CreateDynamicArray(SIZE_T AllocSize)
{

    // NOTE(ALI) Round Allocation to the system allocation granularity 
    // NOTE(ALI) This assumes the granularity is a power of two
    u64 Mask = MemoryInfo.AllocSize - 1;
    AllocSize = (AllocSize + Mask) & (~Mask);

    void* Memory = VirtualAlloc(0, AllocSize, MEM_RESERVE, PAGE_READWRITE);
    if(!Memory)
    {
        DWORD ErrorCode = GetLastError();
        printf("Failed to allocate with error code = %d\n", ErrorCode);
        return (0);
    }

    s32 MaxSizeArrays = ARRAY_COUNT(MemoryTable.Arrays);
    if (MemoryTable.Used >= MaxSizeArrays)
    {
        printf("You have used too much dynamic arrays \n");
        return(0);
    }

    dynamic_memory Entry = {};
    Entry.MemoryBase     = Memory;
    Entry.ReservedPages  = (s32)(AllocSize / MemoryInfo.PageSize); // NOTE(ALI) this assumes allocation size is a multiple of page size, which is reasonable
    Entry.CommittedPages = 0;

    MemoryTable.Arrays[MemoryTable.Used++] = Entry;

    return (Memory);
}

LONG WINAPI
PageFaultHander(struct _EXCEPTION_POINTERS *ExceptionInfo)
{
    if(ExceptionInfo->ExceptionRecord->ExceptionCode == EXCEPTION_ACCESS_VIOLATION)
    {
        if(ExceptionInfo->ExceptionRecord->ExceptionInformation[0])
        {
            void* PageMemory = (void*)ExceptionInfo->ExceptionRecord->ExceptionInformation[1];

            s32 DynamicArraysCount = MemoryTable.Used;

            s32 Found = 0;
            s32 Index = 0;
            for(; Index < DynamicArraysCount; ++Index)
            {
                if(MemoryTable.Arrays[Index].MemoryBase == PageMemory)
                {
                    Found = 1;
                    break;
                }
            }

            if(!Found)
            {
                return EXCEPTION_CONTINUE_SEARCH;   
            }


            dynamic_memory* Entry = &MemoryTable.Arrays[Index];

            if(Entry->ReservedPages <= 0)
            {
                printf("Trying to allocate memory but there is no more reserved pages\n");
                return EXCEPTION_CONTINUE_SEARCH;
            }

            void* BaseAddress = (u8*)Entry->MemoryBase;
            LPVOID Result = VirtualAlloc( BaseAddress, MemoryInfo.PageSize, MEM_COMMIT, PAGE_READWRITE);     

            if(!Result)
            {
                printf("Failed to commit pages with error code = %d\n", GetLastError());
                return EXCEPTION_CONTINUE_SEARCH;
            }
            
            Entry->CommittedPages++;
            Entry->ReservedPages--;
            Entry->MemoryBase = (u8*)Entry->MemoryBase + MemoryInfo.PageSize;
        }

        return EXCEPTION_CONTINUE_EXECUTION;
    }
    return EXCEPTION_CONTINUE_SEARCH;   
}

// STUB entity structure
struct entity
{
    f32 PosX;
    f32 PosY;
    s32 Data;
};

int main(void)
{
    //------------ init system info ---------------//
    SYSTEM_INFO Info;     
    GetSystemInfo(&Info); 
    MemoryInfo.PageSize  = Info.dwPageSize;
    MemoryInfo.AllocSize = Info.dwAllocationGranularity;
    PVOID h1 = AddVectoredExceptionHandler(1 , PageFaultHander);
        
    s32 EntityCount  = MB(1);
    entity* Entities = (entity*) CreateDynamicArray(EntityCount * sizeof(entity));

    entity Empty = {};
    for(s32 I = 0; I< EntityCount; ++I)
    {
        Entities[I] = Empty;
    }

    RemoveVectoredExceptionHandler(h1);    
    return(0);
}

This system might leave a lot to be desired, from the way it checks if the page-fault happened within our pages and not some other buggy place, to the fact it always increase by a fixed size and not pre-committing. But I hope it at least demonstrate the general idea.

I have been told that __try / __except block are considered to be generally faster, but I don't think it matters in this case.

Also, since I could not find a single blog entry on google, I would appreciate if someone can link me any blog related to this.

None

AliAbdulKareem said:
Pretty much all talks about Windows's Virtual memory API mentions the fact it allows Dynamic Arrays to be easily used.

Let's first be clear about what you're writing about. There are a few terms you're using that have confusingly similar words, but extremely different meanings.

Dynamic arrays generally means an array that automatically grows when you add more items than it has room for it. Unlike a regular array (such as int MyArray[100]; ) that is always the same size, a dynamic array can expand. In C++ the standard library provides an implementation, called std::vector. Other languages have similar implementations of dynamic arrays.

Dynamic memory allocation means allocating memory as it is used during the execution of the program. There are many ways to do it, including smart pointers, manually managing memory with the function pair malloc()/free(), manually managing memory with now/delete operators, or dynamically allocating memory with standard library containers, The library's containers internally perform dynamic memory allocations, and many have a function called reserve() that does the allocation all at once, to prepare for the space you're intending to use.

Virtual memory allocation like you described in your post are very different. Windows and many other operating systems allow programs to use more memory addresses than are physically installed on a computer. For example, a computer might have 8 GB of physical memory but the various programs running on a computer might have much more that they've requested, maybe 20 or 30 GB or more. Sometimes much more, hundreds of gigabytes or even terabytes of memory reserved through virtual allocation. That extra space is managed through virtual memory and swapped out to disk. The functions you call with VirtualAlloc and page fault manipulation in code are all about reserving space on a disk and manipulating the virtual memory. You didn't find blog entries about it because it is rarely done. There are games that do it, but it is rare, and far more advanced than most programmers ever dive into memory management.

So for clarity and to let us help you with the real problem you are trying to solve, what exactly are attempting to accomplish, and why are you trying to do it? I strongly suspect you've trying to solve a problem accomplished by the first one of the three, not the last of the three.

Advertisement

@frob
The technique I described is probably clearly stated here at 22:10 (Mysteries of Memory Management Revealed (Part 1/2) By Mark Russinovich) Just if my descriptions were not clear enough.

Regarding what I am trying to accomplish, I just wanted to test this new (to me) system. And see if it is better than wrapping my checks in a get function like this if(size ≥ capacity) grow(capacity * 2); every time I want to add elements. So there is no specific end goal, just wanted to test it. Since I have been hearing about it for a long time. My initial benchmarks of the code I posted is not great, iterating 1MB array adds roughly 10+ms over iterating them without exception handling (in debug build of course).

None

I remember virtual memory being proposed to implement a sparse grid for fluid simulations.
It's not about about extending RAM using the hard disk. Instead the goal is to achieve ‘hardware accelerated’ lookups into sparsely allocated memory pages. Basically the CPUs ability to map virtual memory pages is used to replace a search / traversal algorithm, if i got that right. Here is a related paper: https://pages.cs.wisc.edu/~sifakis/papers/SPGrid.pdf

I guess it's not really practical for a single dynamic array.
But maybe you can find related implementations for a more practical application like such sparse grids.

I've been using VirtualAlloc in some cases, but it never occurred to me using SEH to handle the page-commits. So if understand it correctly, you are trying to remove the need to do if-checks to for resizing/reserving? In this case, as you've seen, this is a pretty bad idea. Any kind of exception-handling on Windows has a significant overhead, be it SEH or c++-exceptions (which are just build on top of SEH). It involves evaluating and checking the validity of the stack-related CPU states, walking the stack (pretty much entirely to determine all types of installed handlers), looking up the exception-information for each frame, invoking them… Case and point, this won't save you anything, especially if you only commit one page at a time.

EDIT: While typing, I realized that you are trying to do what the video says. The video describes how a call-stack is managed. This is the only sensible reason to use SEH with a pre-reserved memory region, the way you attempt; and is the only sensible way to handle a call-stack. A call-stacks memory is mostly re-used. A stack is usually set to a few MBs capacity at max, and rarely uses it all. So doing those allocations is rare, thus the exception-overhead is small (and mostly handled by the kernel at an early stage for the default applications stack, so even less to do than your example). And the alternative of having to check for resizing on every modification of “RSP” would be way worse - that would include things like push,call, and even pop and ret (as underflows are handled too). So what the video does makes sense, but not like how you are trying to use it.

So I'd recommend trying to test VirtualAlloc with the same interface that you'd use for a new/malloc backed (std::)vector. This still gives you benefits - existing memory doesn't have to be copied/moved, making the cost of reserving O(n) (instead of amortized O(n)). Also, existing references stay intact, which is one of the reasons that I personally use VirtualAlloc in some places. Come to think of it, its kind of the primary reason. The other use-case is for my JIT-compiler, though reasoning is similar. I can reserve the maximum supported address-range for x64 code (2gb, so that relative displacement still works ), but only physically allocate what is currently needed. So if I compile a new section, I can just map the pages into that range, and unmap them when done.

So to sum up, VirtualAlloc definately has uses, though they also have downsides. Having a fixed allocation granularity (4096kb) makes it unsuitable for many types of dynamic arrays, and having to pre-reserve the entire possible memory can also be a problem. Meaning for both cases, you can use that type of VirtualAlloc for rare, large containers; but not for many small ones, so don't know who said that this was the “standard way to do dynamic arrays in game dev”, but its a bunch of baloney for sure. Unless he was talking about writing a custom allocator for all the games memory, from which individual allocations are made (forgot how that type of allocator is named).

This topic is closed to new replies.

Advertisement