Custom memory allocator for STL - dead end due to constructors?

Started by June 24, 2024 02:33 PM
36 comments, last by JoeJ 4 months ago

It turned out memory allocation is a real performance killer for my application, so i need to manage memory myself.
Using STL, i have made a custom allocator which should solve the fragmentation problem.
To my surprise, after some research and trial and error this works just fine for std::vector. It can do anything i need, although there are some hidden problems, e.g. using std::swap on vectors with different allocators, but i can live with that.
Here's how basic usage looks:

			using PagedAlloc = Allocators::PagedAllocator<Allocators::FreeListAllocator, 1024, 4>;
			using stdAlloc = Allocators::stdAllocator<int, PagedAlloc>;
			PagedAlloc pagedAlloc;
			PagedAlloc pagedAlloc2;

			stdAlloc alloc(&pagedAlloc);
			std::vector<int, stdAlloc > vec(alloc);
			
			stdAlloc alloc2(&pagedAlloc2);
			std::vector<int, stdAlloc> vec2(alloc2);

			std::vector<int, stdAlloc> test(10, -1, alloc);
			test.resize(20);		
					


Basically i need to pass the allocator to the constructor of std::vector.
This is a show stopper, because i can not create an array, for example:

std::vector<int, stdAlloc> vecN[10];

I can not call the proper constructor to set the allocator of an array of vectors at this point.
I could loop over the 10 elements to somehow set the allocator afterwards, but it's too late. STL calls the default constructor of vector and allocates one element (for whatever reason), which crashes because the allocator was not yet set.
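
One possible way around the array problem is to build a std::array whose elements are each constructed with the allocator, using a pack expansion instead of default construction. This is only a sketch: Arena stands in for the thread's PagedAlloc, and ArenaAlloc and make_vector_array are hypothetical names that do not appear in the original code. The allocator here deliberately has no default constructor.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

// Hypothetical stand-in for PagedAlloc: forwards to operator new and counts calls.
struct Arena {
    std::size_t allocations = 0;
    void* Allocate(std::size_t bytes) { ++allocations; return ::operator new(bytes); }
    void Deallocate(void* p) noexcept { ::operator delete(p); }
};

// Minimal stateful allocator (assumed shape); it cannot be default-constructed.
template <typename T>
struct ArenaAlloc {
    using value_type = T;
    Arena* arena;

    explicit ArenaAlloc(Arena* a) noexcept : arena(a) {}
    template <typename U>
    ArenaAlloc(const ArenaAlloc<U>& other) noexcept : arena(other.arena) {}

    T* allocate(std::size_t n) { return static_cast<T*>(arena->Allocate(n * sizeof(T))); }
    void deallocate(T* p, std::size_t) noexcept { arena->Deallocate(p); }

    friend bool operator==(const ArenaAlloc& a, const ArenaAlloc& b) { return a.arena == b.arena; }
    friend bool operator!=(const ArenaAlloc& a, const ArenaAlloc& b) { return !(a == b); }
};

using Vec = std::vector<int, ArenaAlloc<int>>;

template <std::size_t... I>
std::array<Vec, sizeof...(I)> make_vector_array_impl(const ArenaAlloc<int>& alloc,
                                                     std::index_sequence<I...>) {
    // The comma expression yields one Vec(alloc) per index in the pack.
    return {{ (static_cast<void>(I), Vec(alloc))... }};
}

// Builds N vectors that all use the given allocator from the start.
template <std::size_t N>
std::array<Vec, N> make_vector_array(const ArenaAlloc<int>& alloc) {
    return make_vector_array_impl(alloc, std::make_index_sequence<N>{});
}
```

A call such as make_vector_array<10>(ArenaAlloc<int>(&arena)) would then replace the plain vecN[10] declaration. The array size still has to be a compile-time constant.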

This means:
To use custom allocators, i'm now forced to remove ALL default constructors from any class that has vector members.
They all need to be constructed with the allocator passed to a parameterized constructor, which forwards it to the members in time during construction.
So i can't have an array of any of my data structures, and i also can not create data structures in memory and set them up later for use. It's no longer just data. C++ forces me to treat everything as ‘objects’.

This is nonsense. Bjarne was wrong, and they still haven't fixed it.

Or am i missing something? Are there some hacks to postpone calling constructors until it can actually be done? Maybe using placement new or something like that?

Any help welcome. But currently it looks as if i have to write my own version of STL… <:(
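
One standard way to postpone a constructor call without raw placement new is std::optional (C++17): the slot exists as plain storage until emplace() is called. The sketch below assumes a stateful allocator without a default constructor. Arena, ArenaAlloc and JobData are hypothetical names used only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <new>
#include <optional>
#include <vector>

// Hypothetical stand-in for PagedAlloc: forwards to operator new and counts calls.
struct Arena {
    std::size_t allocations = 0;
    void* Allocate(std::size_t bytes) { ++allocations; return ::operator new(bytes); }
    void Deallocate(void* p) noexcept { ::operator delete(p); }
};

// Minimal stateful allocator (assumed shape); it cannot be default-constructed.
template <typename T>
struct ArenaAlloc {
    using value_type = T;
    Arena* arena;

    explicit ArenaAlloc(Arena* a) noexcept : arena(a) {}
    template <typename U>
    ArenaAlloc(const ArenaAlloc<U>& other) noexcept : arena(other.arena) {}

    T* allocate(std::size_t n) { return static_cast<T*>(arena->Allocate(n * sizeof(T))); }
    void deallocate(T* p, std::size_t) noexcept { arena->Deallocate(p); }

    friend bool operator==(const ArenaAlloc& a, const ArenaAlloc& b) { return a.arena == b.arena; }
    friend bool operator!=(const ArenaAlloc& a, const ArenaAlloc& b) { return !(a == b); }
};

using Vec = std::vector<int, ArenaAlloc<int>>;

// A data structure that can be created first and bound to an arena later.
struct JobData {
    // Every slot starts empty: no vector constructor has run yet.
    std::array<std::optional<Vec>, 10> slots;

    // Constructs each vector in place once the arena is known.
    void bind(Arena* arena) {
        for (auto& slot : slots) {
            slot.emplace(ArenaAlloc<int>(arena));
        }
    }
};
```

The price is one extra flag per slot and the need to go through operator-> or operator* on every access.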

Make them 10 vector pointers and allocate when you want how you want.

Option 2: Just write your own vector class.

NBA2K, Madden, Maneater, Killing Floor, Sims

You can use an unholy hack such as this (placement new):

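using Vec = std::vector<int, stdAlloc>;
alignas(Vec) std::byte vecN[10*sizeof(Vec)];  // raw storage, aligned for Vec
for ( int i = 0; i < 10; i++ )
{
    // Placement new needs the type to construct; myAllocator is a stdAlloc instance.
    new ( (Vec*)vecN + i ) Vec( myAllocator );
}
// Each element must later be destroyed by hand: ((Vec*)vecN + i)->~Vec();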

You could get similar behavior by creating a vector-of-vectors, calling reserve(), then emplace_back() each one.
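
A sketch of the vector-of-vectors variant follows. It additionally wraps the outer allocator in std::scoped_allocator_adaptor, a standard facility that hands the allocator to each inner vector automatically, so emplace_back() needs no argument. Arena, ArenaAlloc, Row, Table and rows_share_arena are hypothetical names; without the adaptor one would call emplace_back(alloc) explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <scoped_allocator>
#include <vector>

// Hypothetical stand-in for PagedAlloc: forwards to operator new and counts calls.
struct Arena {
    std::size_t allocations = 0;
    void* Allocate(std::size_t bytes) { ++allocations; return ::operator new(bytes); }
    void Deallocate(void* p) noexcept { ::operator delete(p); }
};

// Minimal stateful allocator (assumed shape); it cannot be default-constructed.
template <typename T>
struct ArenaAlloc {
    using value_type = T;
    Arena* arena;

    explicit ArenaAlloc(Arena* a) noexcept : arena(a) {}
    // Must stay implicit so the adaptor can convert to the inner allocator type.
    template <typename U>
    ArenaAlloc(const ArenaAlloc<U>& other) noexcept : arena(other.arena) {}

    T* allocate(std::size_t n) { return static_cast<T*>(arena->Allocate(n * sizeof(T))); }
    void deallocate(T* p, std::size_t) noexcept { arena->Deallocate(p); }

    friend bool operator==(const ArenaAlloc& a, const ArenaAlloc& b) { return a.arena == b.arena; }
    friend bool operator!=(const ArenaAlloc& a, const ArenaAlloc& b) { return !(a == b); }
};

using Row = std::vector<int, ArenaAlloc<int>>;
using Table = std::vector<Row, std::scoped_allocator_adaptor<ArenaAlloc<Row>>>;

// Returns true if every inner vector ended up using the given arena.
bool rows_share_arena(Arena& arena) {
    ArenaAlloc<Row> outer(&arena);
    Table::allocator_type tableAlloc(outer);
    Table table(tableAlloc);

    table.reserve(10);              // one block for all ten vector headers
    for (int i = 0; i < 10; ++i) {
        table.emplace_back();       // the adaptor passes the allocator to each Row
    }
    table[0].push_back(1);

    for (const Row& row : table) {
        if (row.get_allocator().arena != &arena) return false;
    }
    return table.size() == 10;
}
```

The vector headers sit in one contiguous block taken from the same arena, so the only per-vector allocations are the element buffers themselves.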

I agree there are some serious design deficiencies in the STL allocators. The main one being that the allocator type must be a template of the type it allocates, which means you need separate allocators for every type, rather than an equivalent to malloc()/free(). This is extra annoying when dealing with containers that don't directly allocate their contents (e.g. std::list allocates nodes, not the template type T).
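
To illustrate the point about std::list: the container never allocates through the allocator<int> it was given, but through a copy rebound to its internal node type. The sketch below records the element size seen by allocate(). SpyAlloc, g_lastElementSize and list_node_size are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <new>
#include <type_traits>

// Size of the element type used by the most recent allocate() call.
static std::size_t g_lastElementSize = 0;

template <typename T>
struct SpyAlloc {
    using value_type = T;

    SpyAlloc() = default;
    template <typename U>
    SpyAlloc(const SpyAlloc<U>&) noexcept {}

    T* allocate(std::size_t n) {
        g_lastElementSize = sizeof(T);
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }

    friend bool operator==(const SpyAlloc&, const SpyAlloc&) { return true; }
    friend bool operator!=(const SpyAlloc&, const SpyAlloc&) { return false; }
};

// allocator_traits can rebind SpyAlloc<int> to any other element type.
static_assert(std::is_same<std::allocator_traits<SpyAlloc<int>>::rebind_alloc<long>,
                           SpyAlloc<long>>::value,
              "rebinding yields the same template with a new element type");

// Returns the size of the type std::list actually allocated for one int.
std::size_t list_node_size() {
    std::list<int, SpyAlloc<int>> values;
    values.push_back(1);
    return g_lastElementSize;
}
```

The returned size is that of the list node (the int plus its link pointers), which is why a typed allocator has to be convertible between element types.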

Personally I have my own “STL” which has been accumulated over the last 15-20 years. It has support for custom allocators in a much more sensible way than STL. At its core, an allocator is just:

class MyAllocator
{
public:
    void* allocate( size_t byteCount );
    void deallocate( void* pointer );
};

The allocate()/deallocate() functions can be static, or member functions in case the allocator has any state. There is a global default allocator which has static function pointers for the allocate()/deallocate() functions, which then redirect to malloc()/free(). This design allows easily inserting debugging (e.g. allocation tracking, prefix/suffixes to detect writing out-of-bounds). If the allocator has no state, then there is no need to pass it into the constructor of a container (it will be default-constructed internally).
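
A rough sketch of how such a design could look is given below. It is not Aressera's actual library: DefaultAllocator, Buffer, the tracking functions and run_tracked are hypothetical names, and Buffer only handles trivially constructible element types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Untyped global default allocator whose behaviour can be swapped at runtime.
struct DefaultAllocator {
    using AllocFn = void* (*)(std::size_t);
    using FreeFn = void (*)(void*);

    static void* allocate(std::size_t byteCount) { return allocateFn(byteCount); }
    static void deallocate(void* pointer) { deallocateFn(pointer); }

    static void* defaultAllocate(std::size_t byteCount) { return std::malloc(byteCount); }
    static void defaultDeallocate(void* pointer) { std::free(pointer); }

    static AllocFn allocateFn;
    static FreeFn deallocateFn;
};
DefaultAllocator::AllocFn DefaultAllocator::allocateFn = &DefaultAllocator::defaultAllocate;
DefaultAllocator::FreeFn DefaultAllocator::deallocateFn = &DefaultAllocator::defaultDeallocate;

// Debug hooks: count the blocks that are currently alive.
static int g_liveAllocations = 0;
void* trackingAllocate(std::size_t byteCount) { ++g_liveAllocations; return std::malloc(byteCount); }
void trackingDeallocate(void* pointer) { if (pointer) --g_liveAllocations; std::free(pointer); }

// The allocator has no state, so the container takes no allocator argument.
template <typename T, typename Alloc = DefaultAllocator>
class Buffer {
public:
    explicit Buffer(std::size_t count)
        : data_(static_cast<T*>(Alloc::allocate(count * sizeof(T)))) {}
    ~Buffer() { Alloc::deallocate(data_); }
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;
    T* data() { return data_; }
private:
    T* data_;
};

// Installs the tracking hooks, creates two buffers, and reports the peak count.
int run_tracked() {
    DefaultAllocator::allocateFn = &trackingAllocate;
    DefaultAllocator::deallocateFn = &trackingDeallocate;
    int peak = 0;
    {
        Buffer<int> a(16);
        Buffer<int> b(32);
        peak = g_liveAllocations;
    }
    DefaultAllocator::allocateFn = &DefaultAllocator::defaultAllocate;
    DefaultAllocator::deallocateFn = &DefaultAllocator::defaultDeallocate;
    return peak;
}
```

Because Buffer is not templated on a typed allocator, the same DefaultAllocator serves every element type, which is the malloc()/free()-like behaviour described above.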

It's hard to tell without more details. Currently it looks like a botched allocator implementation.


std::vector's default constructor default-constructs its allocator type, so std::vector<int,stdAlloc> should have no problem creating an Allocators::stdAllocator<int,Allocators::PagedAllocator<Allocators::FreeListAllocator,1024,4>>. It also should not allocate one element, since the default constructor should create a container with size 0 and capacity 0 in this case.
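
This claim can be checked with a counting allocator, as in the sketch below. CountingAlloc, g_allocationCalls and default_construct_array are hypothetical names. The result is what common standard libraries produce in release configurations; the C++ standard does not strictly promise zero capacity, and some debug configurations are known to allocate bookkeeping objects in container constructors, which might be what the original poster is running into (the thread does not confirm this).

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Number of allocate() calls made through any CountingAlloc.
static std::size_t g_allocationCalls = 0;

template <typename T>
struct CountingAlloc {
    using value_type = T;

    CountingAlloc() = default;
    template <typename U>
    CountingAlloc(const CountingAlloc<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++g_allocationCalls;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }

    friend bool operator==(const CountingAlloc&, const CountingAlloc&) { return true; }
    friend bool operator!=(const CountingAlloc&, const CountingAlloc&) { return false; }
};

struct DefaultConstructionReport {
    std::size_t allocationCalls;
    std::size_t totalCapacity;
};

// Default-constructs an array of ten vectors and reports what that cost.
DefaultConstructionReport default_construct_array() {
    g_allocationCalls = 0;
    std::vector<int, CountingAlloc<int>> vecN[10];
    std::size_t capacity = 0;
    for (const auto& v : vecN) {
        capacity += v.capacity();
    }
    return {g_allocationCalls, capacity};
}
```

If this reports a non-zero number of calls on a given toolchain, the first allocation comes from the library configuration rather than from the allocator wrapper.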

Can you share the definitions of the constructs used from Allocators? If you can't/are unwilling to share the code, could you explain:

  1. The meaning and architecture behind those templates.
    1. How they fulfill the Allocator requirements.
  2. The stdAlloc constructor taking a PagedAlloc pointer.
  3. What the code that leads to the crash looks like, and the exact crash reason.

@Verboten_9 Hmmm…. post reply makes no sense in context, account created today… I'd say this is AI generated BS. The only thing missing is a link to a spam site.

@Aressera That's a bit unfounded, I resent that. If you have problems understanding the topic I can help you with that. I can throw in some racial slurs or toxic opinions to prove I'm a human /s.

There should be a way to specify a partial specialization of your allocator to ‘override’ the default STL allocator for a specific type. I haven't tried this myself specifically for allocators, but I've used the same approach to ‘override’ the default delete for std::unique_ptr for objects allocated via a custom allocator (using the custom allocator for STL allocation is the next bridge I have to cross at some point).
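
The std::unique_ptr half of this can be sketched as follows, assuming a full specialization of std::default_delete for a program-defined type. Pool, particlePool, Particle and makeParticle are hypothetical names. Note that the deleter has to find its pool through global state, since it carries no data of its own.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical pool: forwards to operator new and counts live objects.
struct Pool {
    int live = 0;
    void* Allocate(std::size_t bytes) { ++live; return ::operator new(bytes); }
    void Deallocate(void* p) noexcept { --live; ::operator delete(p); }
};

// Global access point that the stateless deleter relies on.
Pool& particlePool() {
    static Pool pool;
    return pool;
}

struct Particle {
    float x = 0.0f;
    float y = 0.0f;
};

namespace std {
// Specializing default_delete for a program-defined type is permitted.
template <>
struct default_delete<Particle> {
    void operator()(Particle* p) const noexcept {
        p->~Particle();                 // run the destructor by hand
        particlePool().Deallocate(p);   // return the memory to the pool
    }
};
}  // namespace std

// Creates a Particle in pool memory; plain unique_ptr<Particle> now cleans it up.
std::unique_ptr<Particle> makeParticle() {
    void* memory = particlePool().Allocate(sizeof(Particle));
    return std::unique_ptr<Particle>(new (memory) Particle{});
}
```

Every std::unique_ptr<Particle> in the program then uses the pool, including ones created with plain new, so all Particle objects must come from makeParticle().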

dpadam450 said:
Make them 10 vector pointers and allocate when you want how you want.

Sure, solving a memory fragmentation problem by introducing even more memory fragmentation. :D

Aressera said:
You could get similar behavior by creating a vector-of-vectors, calling reserve(), then emplace_back() each one.

Pretty much the same.

Aressera said:
std::byte vecN[10*sizeof(vector<int, stdAlloc>)]; for ( int i = 0; i < 10; i++ ) { new ( (vector<int, stdAlloc>*)vecN + i ) vector<int, stdAlloc>( myAllocator ); }

Yeah, that's what i had in mind.
But doing this everywhere? Nah - i'll probably try EASTL. It's old, but they fixed all those problems it seems.

Verboten_9 said:
Can you share the definitions of the constructs used from Allocators?

	template<typename T, class Allocator>
	class stdAllocator 
	{
	public:
		using value_type = T;
    
		Allocator *allocator = nullptr; // null until an allocator is supplied; allocate() must not run before that

		stdAllocator() = default;
		stdAllocator(const stdAllocator&) = default;
		stdAllocator(Allocator *allocator) : allocator(allocator) {}
    
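		// Conversion constructor that passes the underlying allocator along.
		// Its template parameters must not reuse the name Allocator:
		// redeclaring a class template parameter is ill-formed.
		template <typename U>
		stdAllocator(const stdAllocator<U, Allocator>& other) noexcept : allocator(other.allocator) {}
		// Lets the conversion constructor read other.allocator even if it becomes private.
		template <typename U, class A> friend class stdAllocator;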

    
		T* allocate(std::size_t n) 
		{
			if (n > std::allocator_traits<stdAllocator>::max_size(*this)) 
			{
				throw std::bad_alloc();
			}
			return (T*)allocator->Allocate(n * sizeof(T), sizeof(T)); // todo: alignment as param?
		}

		void deallocate(T* p, std::size_t) noexcept 
		{
			allocator->Deallocate(p);
		}

		template<typename U, typename... Args>
		void construct(U* p, Args&&... args) 
		{
			new(p) U(std::forward<Args>(args)...);
		}

		template<typename U>
		void destroy(U* p) noexcept 
		{
			p->~U();
		}

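		// Two allocators are interchangeable only if they share the same underlying allocator.
		friend bool operator==(const stdAllocator& a, const stdAllocator& b) { return a.allocator == b.allocator; }
		friend bool operator!=(const stdAllocator& a, const stdAllocator& b) { return !(a == b); }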
	};

Mixed and matched from random blog posts.
I also have a simpler one where the allocator is not templated. Maybe less buggy:

	template<typename T>
	class stdPagedAllocator 
	{
	public:
		using value_type = T;
 		using PagedAlloc = Allocators::PagedAllocator<Allocators::FreeListAllocator, 1024, 4>;
   
		stdPagedAllocator() = default;
		stdPagedAllocator(const stdPagedAllocator&) = default;
		
		template <typename U>
		stdPagedAllocator(const stdPagedAllocator<U>& other) : flAlloc(other.flAlloc) {}

		stdPagedAllocator(PagedAlloc *flAlloc) : flAlloc(flAlloc) {}

		template <typename U> friend class stdPagedAllocator;

		
		PagedAlloc *flAlloc = nullptr; // null until a PagedAlloc is supplied

		
		
		T* allocate(std::size_t n) 
		{
			if (n > std::allocator_traits<stdPagedAllocator>::max_size(*this)) 
			{
				throw std::bad_alloc();
			}
			return (T*)flAlloc->Allocate(n * sizeof(T), sizeof(T)); // todo: alignment as param?
		}

		void deallocate(T* p, std::size_t) noexcept 
		{
			flAlloc->Deallocate(p);
		}

		template<typename U, typename... Args>
		void construct(U* p, Args&&... args) 
		{
			new(p) U(std::forward<Args>(args)...);
		}

		template<typename U>
		void destroy(U* p) noexcept 
		{
			p->~U();
		}

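		// Two allocators are interchangeable only if they share the same PagedAlloc.
		friend bool operator==(const stdPagedAllocator& a, const stdPagedAllocator& b) { return a.flAlloc == b.flAlloc; }
		friend bool operator!=(const stdPagedAllocator& a, const stdPagedAllocator& b) { return !(a == b); }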
	};
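
Regarding the std::swap problem mentioned in the opening post: for a stateful allocator the standard expects operator== to report whether memory from one instance can be freed by the other, and swapping containers whose allocators compare unequal is undefined behaviour unless the allocator opts into propagation. The sketch below shows the opt-in. Arena, ArenaAlloc and swap_moves_allocators_too are hypothetical names, not the thread's classes.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for PagedAlloc: forwards to operator new and counts calls.
struct Arena {
    std::size_t allocations = 0;
    void* Allocate(std::size_t bytes) { ++allocations; return ::operator new(bytes); }
    void Deallocate(void* p) noexcept { ::operator delete(p); }
};

template <typename T>
struct ArenaAlloc {
    using value_type = T;
    // Ask containers to exchange allocators together with their buffers.
    using propagate_on_container_swap = std::true_type;

    Arena* arena;

    explicit ArenaAlloc(Arena* a) noexcept : arena(a) {}
    template <typename U>
    ArenaAlloc(const ArenaAlloc<U>& other) noexcept : arena(other.arena) {}

    T* allocate(std::size_t n) { return static_cast<T*>(arena->Allocate(n * sizeof(T))); }
    void deallocate(T* p, std::size_t) noexcept { arena->Deallocate(p); }

    // Equal only when both refer to the same arena.
    friend bool operator==(const ArenaAlloc& a, const ArenaAlloc& b) { return a.arena == b.arena; }
    friend bool operator!=(const ArenaAlloc& a, const ArenaAlloc& b) { return !(a == b); }
};

using Vec = std::vector<int, ArenaAlloc<int>>;

// Swaps two vectors that live in different arenas and checks who owns what.
bool swap_moves_allocators_too() {
    Arena arenaA;
    Arena arenaB;
    ArenaAlloc<int> allocA(&arenaA);
    ArenaAlloc<int> allocB(&arenaB);
    Vec a(allocA);
    Vec b(allocB);
    a.assign(3, 1);
    b.assign(1, 2);

    using std::swap;
    swap(a, b);   // buffers and allocators change places together

    return a.size() == 1 && b.size() == 3 &&
           a.get_allocator().arena == &arenaB &&
           b.get_allocator().arena == &arenaA;
}
```

With the trait left at its default of std::false_type, the same swap would only be valid if both vectors used the same arena.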

@JoeJ This heavily depends on your use-cases and their context/environment:

https://godbolt.org/z/ohcsW876q

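#include <vector>
#include <iostream>
#include <cstdlib>
#include <memory>   // std::allocator_traits
#include <new>      // std::bad_alloc, placement new
#include <utility>  // std::forward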

template<typename T, class Allocator>
class stdAllocator 
{
    Allocator static* underlyingAllocatorInstance;
public:
    void static SetUnderlyingAllocatorInstance(Allocator* allocator = nullptr)
    {
        underlyingAllocatorInstance = allocator;
    }

    using value_type = T;
    using underlying_allocator_t = Allocator;

    Allocator *allocator = nullptr; // stays null if no static instance was set

    stdAllocator()
    {
        ::std::cout << "stdAllocator\n";
        if (underlyingAllocatorInstance)
        {
            this->allocator = underlyingAllocatorInstance;
        }
    }
    stdAllocator(const stdAllocator&) = default;
    stdAllocator(Allocator *allocator) : allocator(allocator) {}

    // Conversion constructor that passes the buffer along.
    template <typename U, class underlying_allocator_t>
    stdAllocator(const stdAllocator<U, underlying_allocator_t>& other) : allocator(other.allocator) {}
    // Required for the conversion constructor.
    template <typename U, class underlying_allocator_t> friend class stdAllocator;


    T* allocate(std::size_t n) 
    {
        if (n > std::allocator_traits<stdAllocator>::max_size(*this)) 
        {
            throw std::bad_alloc();
        }
        return (T*)allocator->allocate(n * sizeof(T)); // todo: alignment as param?
    }

    void deallocate(T* p, std::size_t n) noexcept 
    {
        allocator->deallocate(p, n);
    }

    template<typename U, typename... Args>
    void construct(U* p, Args&&... args) 
    {
        new(p) U(std::forward<Args>(args)...);
    }

    template<typename U>
    void destroy(U* p) noexcept 
    {
        p->~U();
    }

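    // Two allocators are interchangeable only if they share the same underlying allocator.
    friend bool operator==(const stdAllocator& a, const stdAllocator& b) { return a.allocator == b.allocator; }
    friend bool operator!=(const stdAllocator& a, const stdAllocator& b) { return !(a == b); }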
};

template<class T, class Allocator>
Allocator* stdAllocator<T,Allocator>::underlyingAllocatorInstance{};

template<class T, class... Args>
class Foo
{
public:
    using pointer = T*;
    using value_type = T;
    using size_type = std::size_t;

    Foo()
    {
        std::cout << "Foo\n";
    }

    auto allocate(size_type n) -> pointer
    {
        return reinterpret_cast<T*>(std::malloc(n));
    }

    void deallocate(pointer/* p*/, size_type/* n*/)
    {}
};

class Bar final
{
    int i{};

public:
    Bar()
    {
        std::cout << "Bar\n";
    }
};

auto main() -> int
{
    Foo<Bar> foo{};
    stdAllocator<Bar,Foo<Bar>>::SetUnderlyingAllocatorInstance(&foo);
    std::vector<Bar, stdAllocator<Bar,Foo<Bar>>> x[10];
    stdAllocator<Bar,Foo<Bar>>::SetUnderlyingAllocatorInstance();
    x[5].emplace_back();
}

(Please disregard the omissions in Foo or Bar - they're merely examples).

(And this https://godbolt.org/z/6d9fv7578 is some simple thing showing the default creation of allocator and the lack of element creation).

(Pink banana ← something random to prove my humanity.)

cgrant said:
There should be a way to specify a partial specialization of your allocator to ‘override’ the default STL allocator for a specific type.

Hmm, sounds interesting. But i guess it could only use global state at this point? Multiple threads create objects, so i can't use global state to manage multiple allocators as needed.

Verboten_9 said:
@JoeJ This heavily depends on your use-cases and their context/environment:

Thanks! That seems to implement just that idea.

But it's not really practical for me. I have many vectors for temporary data used by parallel jobs, and i can allocate the related memory completely lock-free. So adding a lock just to make a global workaround safe is not great.
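
A possible variation that avoids the lock is to make the "current allocator" thread_local instead of a shared static, so each job binds its own arena before creating containers. This is only a sketch under that assumption: Arena, t_currentArena, ArenaScope, TlsAlloc and worker are hypothetical names, and worker() stands for the body of one parallel job.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical stand-in for PagedAlloc: forwards to operator new and counts calls.
struct Arena {
    std::size_t allocations = 0;
    void* Allocate(std::size_t bytes) { ++allocations; return ::operator new(bytes); }
    void Deallocate(void* p) noexcept { ::operator delete(p); }
};

// One "current arena" per thread, so no synchronisation is needed.
thread_local Arena* t_currentArena = nullptr;

// RAII guard that binds an arena to the calling thread for a scope.
struct ArenaScope {
    Arena* previous;
    explicit ArenaScope(Arena* arena) noexcept : previous(t_currentArena) { t_currentArena = arena; }
    ~ArenaScope() { t_currentArena = previous; }
    ArenaScope(const ArenaScope&) = delete;
    ArenaScope& operator=(const ArenaScope&) = delete;
};

template <typename T>
struct TlsAlloc {
    using value_type = T;
    Arena* arena;

    // Default construction captures whatever arena the current thread has bound.
    TlsAlloc() noexcept : arena(t_currentArena) {}
    template <typename U>
    TlsAlloc(const TlsAlloc<U>& other) noexcept : arena(other.arena) {}

    T* allocate(std::size_t n) {
        if (!arena) throw std::bad_alloc();   // no arena was bound on this thread
        return static_cast<T*>(arena->Allocate(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { arena->Deallocate(p); }

    friend bool operator==(const TlsAlloc& a, const TlsAlloc& b) { return a.arena == b.arena; }
    friend bool operator!=(const TlsAlloc& a, const TlsAlloc& b) { return !(a == b); }
};

// Body of one job: plain arrays of vectors work because the default
// constructor finds the arena through the thread-local pointer.
std::size_t worker() {
    Arena local;
    ArenaScope scope(&local);
    std::vector<int, TlsAlloc<int>> temp[4];
    for (auto& v : temp) {
        v.push_back(1);
    }
    return local.allocations;
}
```

Each vector stores the arena pointer it captured at construction, so it keeps working after the scope ends, as long as the arena itself outlives the vector.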

Verboten_9 said:
(And this https://godbolt.org/z/6d9fv7578 is some simple thing showing the default creation of allocator and the lack of element creation).

Interesting that this works. If i return nullptr while the allocator pointer is still missing, i still get a crash. It tries to allocate one element, while your example seemingly does not.

If i could prevent this first allocation, i could probably set the allocator afterwards, which would solve the problem. I'll investigate… (but i have this strong feeling it's time to give up : )
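
If the default constructor really performs no allocation on the toolchain in use, setting the allocator afterwards could look like the sketch below: the allocator opts into propagate_on_container_move_assignment, and each default-constructed vector is replaced by move-assigning an empty vector that carries the real allocator. LateAlloc, Arena and Job are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for PagedAlloc: forwards to operator new and counts calls.
struct Arena {
    std::size_t allocations = 0;
    void* Allocate(std::size_t bytes) { ++allocations; return ::operator new(bytes); }
    void Deallocate(void* p) noexcept { ::operator delete(p); }
};

template <typename T>
struct LateAlloc {
    using value_type = T;
    // Move assignment replaces the target's allocator with the source's.
    using propagate_on_container_move_assignment = std::true_type;

    Arena* arena = nullptr;   // unbound after default construction

    LateAlloc() = default;
    explicit LateAlloc(Arena* a) noexcept : arena(a) {}
    template <typename U>
    LateAlloc(const LateAlloc<U>& other) noexcept : arena(other.arena) {}

    T* allocate(std::size_t n) {
        if (!arena) throw std::bad_alloc();   // fail cleanly instead of crashing
        return static_cast<T*>(arena->Allocate(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { arena->Deallocate(p); }

    friend bool operator==(const LateAlloc& a, const LateAlloc& b) { return a.arena == b.arena; }
    friend bool operator!=(const LateAlloc& a, const LateAlloc& b) { return !(a == b); }
};

using Vec = std::vector<int, LateAlloc<int>>;

// Plain data with a default constructor; the arena is attached later.
struct Job {
    Vec items[10];

    void bind(Arena* arena) {
        LateAlloc<int> alloc(arena);
        for (auto& v : items) {
            v = Vec(alloc);   // the empty temporary hands over its allocator
        }
    }
};
```

Any elements stored before bind() would be lost, so this only fits the case described here, where the vectors are still empty when the allocator becomes known.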
