
Understanding Visual Studio's Profiling

Started by August 31, 2016 11:41 AM
11 comments, last by jpetrie 8 years, 2 months ago

Hello forum!

I'm trying to profile the heap of my program, as it seems to be a bit too huge for just the things I'm doing.

Already hunted down a very ugly issue but now I've got multiple questions concerning the following:

My heap-profiler (Visual Studio 2015) claims that an object type called "void" allocated 729.015 bytes.

What is this? Looking at the largest instance with 726.919 bytes, it has this allocation stack:


 	programname.exe!operator new() - Line 19	C++
 	programname.exe!std::_Allocate<class sf::Vertex>(unsigned int,class sf::Vertex *,bool)()	C++
 	programname.exe!std::allocator<class sf::Vertex>::allocate(unsigned int)()	C++
 	programname.exe!std::_Wrap_alloc<class std::allocator<class sf::Vertex> >::allocate(unsigned int)()	C++
	programname.exe!std::vector<class sf::Vertex,class std::allocator<class sf::Vertex> >::_Reallocate(unsigned int)()	C++
	programname.exe!std::vector<class sf::Vertex,class std::allocator<class sf::Vertex> >::_Reserve(unsigned int)()	C++
	programname.exe!std::vector<class sf::Vertex,class std::allocator<class sf::Vertex> >::resize(unsigned int)()	C++
	programname.exe!sf::VertexArray::resize(unsigned int)()	C++
	programname.exe!Level_Tile::Level_Tile() - Line 22	C++
	programname.exe!std::make_unique<Level_Tile,Level_Field &,int,int,char const (&)[16]>() - Line 1630	C++
	programname.exe!Level_Field::Level_Field() - Line 18	C++
	programname.exe!std::make_unique<Level_Field,Game_Engine &>() - Line 1630	C++
	programname.exe!Level_State::Level_State() - Line 23	C++
	programname.exe!std::make_unique<Level_State,Game_Engine * &,char const (&)[12]>() - Line 1630	C++
	programname.exe!Intro_State::Update() - Line 29	C++
	programname.exe!Game_Engine::Update_Game_States() - Line 58	C++
	programname.exe!main() - Line 27	C++
	programname.exe!invoke_main() - Line 64	C++
	programname.exe!__scrt_common_main_seh() - Line 253	C++
	programname.exe!__scrt_common_main() - Line 296	C++
	programname.exe!mainCRTStartup() - Line 17	C++
	kernel32.dll!0x7492338a()	C++
	ntdll.dll!0x76ee9902()	C++

So, are these just a bunch of objects that I allocated? I know what a void is, but having this object type in the heap profiler confuses me.

There are 34 (debug) and 42 (release) of these voids.

What is this? As I want to reduce the RAM used by my application, should I reduce this? What can I do about this?

Another weird thing that appears in my profiler, is the std::_Container_proxy, 816 bytes with 102 instances.

How can I have 102 instances? What creates an instance? These also vanish completely once I build Release instead of Debug.

My last question, where is the rest of RAM usage?

The application is using 78.864 K in the windows task-manager. But the heap only counts 22.334 kb.

Where is the rest? On the stack?

Here is a screenshot of the profiler even claiming about 98 mb.

[Screenshot of the profiler: hp0wJ0l.png]

Thanks for taking your time reading through my thread! I hope you can help me out.

It seems quite clear from the trace that you're allocating a large vertex array in Level_Tile::Level_Tile(). Ignore the word 'void' - I don't know where that came from, but the important thing is the call stack. The thing to do with call stacks is look down the list until you see your code, then work out what it's doing from that. The same goes for your std::_Container_proxy example.

It's not possible to know whether it's trivial to reduce the memory usage without knowing what your code is trying to do - sometimes, a certain thing absolutely requires a certain amount of memory. Other times, you might be able to change your game - eg. make maps smaller by using fewer tiles.

As for "where is the rest of RAM usage?" - it could be the stack, it could be the executable itself, it could be memory that backs up other features (eg. file or network buffers), etc. This can also include memory that your application previously allocated but is no longer using (because it doesn't always get immediately returned to the operating system). You can reduce this sort of 'fragmented memory' by performing fewer allocations - for example, by reusing objects instead of deleting them and newing a replacement, or by creating vectors of the correct size at the start instead of growing them as needed.
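To illustrate the last tip, here is a minimal sketch (not taken from the original poster's code) that counts how often a std::vector has to move its buffer while it grows. The function name count_reallocations and its parameters are made up for this example. Every buffer move is a fresh allocation plus a release of the old block, which is the kind of churn described above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: counts how often the vector's buffer moves while
// `count` ints are appended. `reserve_first` selects whether the full
// capacity is requested up front.
std::size_t count_reallocations(std::size_t count, bool reserve_first)
{
    std::vector<int> values;
    if (reserve_first)
        values.reserve(count);  // one allocation of the final size

    std::size_t reallocations = 0;
    const int* previous = values.data();
    for (std::size_t i = 0; i < count; ++i)
    {
        values.push_back(static_cast<int>(i));
        if (values.data() != previous)  // the buffer moved, so it was reallocated
        {
            ++reallocations;
            previous = values.data();
        }
    }
    return reallocations;
}
```

Calling count_reallocations(1000, true) should report no buffer moves, while count_reallocations(1000, false) reports several; the exact number depends on the standard library's growth factor.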

Thanks a lot, this made things way clearer for me.

Are there any tools to track down instances on the stack? I mean, I know at some point they will go out of scope.

Is Visual Studio's profiling more trustworthy than Windows' tool?

The task manager claims it to be 75mb and the profiler says 95mb after my change to the vertex array.

The heap is only 21.8mb out of this.

I will try the tips you mentioned, though, even if it feels like looking for a needle in a haystack, as nothing seems suspicious.

It's quite unlikely that you have stack objects big enough to be a problem (at least in part because stack size is usually explicitly limited). However there could be some big global/static variables - in which case, convert them to heap variables.
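As a rough sketch of what "convert them to heap variables" can look like: the class name Lookup_Table and the commented-out array g_table are hypothetical and only serve as an illustration. A large static array is part of the program for its whole lifetime, whereas a heap-backed container can be sized at runtime and is released when its owner is destroyed.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical "before": a large static array that exists for the whole run.
// static int g_table[4 * 1024 * 1024];

// Hypothetical "after": the same data owned by an object on the heap.
// The memory is requested when the object is constructed and released
// when the object is destroyed.
class Lookup_Table
{
public:
    explicit Lookup_Table(std::size_t count) : values(count, 0) {}

    std::size_t size() const { return values.size(); }
    int& operator[](std::size_t index) { return values[index]; }

private:
    std::vector<int> values;  // heap-allocated storage
};
```

Such an object can be held in a std::unique_ptr next to the other game components, so it also shows up in the heap profiler with a call stack.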

Visual Studio is telling you about memory allocations made by the app - Windows Task Manager is telling you how much memory is allocated to the app. The former is always lower than the latter. Neither is more trustworthy than the other, as they measure different aspects. But as I said, if you're churning through memory and fragmenting it, that can account for a large part of the difference between them.

I can't see any situation where the profiler would say 95MB but Task Manager only says 75MB - where would your app be getting the other 20MB from, if not from the OS?

Are you checking for memory leaks, incidentally? It seems unlikely you'd have any if you're being careful with make_unique etc., but it's a consideration - especially if you see memory usage continue to rise unexpectedly.

I looked into it. Lua was taking ~20mb.

Down from 86.1 to 66.7mb.

Sadly, Lua isn't something I can stop using.

I could not reproduce this. Very weird.

I can't see any situation where the profiler would say 95MB but Task Manager only says 75MB - where would your app be getting the other 20MB from, if not from the OS?

Just found out that hovering pixel-perfectly over the graph actually shows the exact value: 86.1mb.

While my task manager says: 75.8mb.

I have no OS calls, if that would matter in any way.

Are you checking for memory leaks, incidentally? It seems unlikely you'd have any if you're being careful with make_unique etc., but it's a consideration - especially if you see memory usage continue to rise unexpectedly.

My application was running for over an hour, not a single bit increased. I highly doubt dead or leaked memory.

Nonetheless, now it says 67.7mb and 67.4mb without Lua.

Cleaning and rebuilding the solution brought me down to 65.9mb without Lua.

Or was this just confirmation bias?

Weird, whenever I take out Lua and debug my code again, it might even increase the size by small amounts.

This is very confusing. I would have never thought to have such a weird behaviour in my code. There is nothing "random", it is the same procedure with the same results every single time.

I have no OS calls, if that would matter in any way.

Maybe you don't make them directly, but almost certainly things you call make them (for example, anything that allocates memory or reads from a file).

Maybe you don't make them directly, but almost certainly things you call make them (for example, anything that allocates memory or reads from a file).

True!

Well, I've done some more profiling with break points, this seems to help.

Once the following initializer list is being worked through (this is a main component of the game),


texture_storage(std::make_unique<Texture_Store>()),
window_storage(std::make_unique<Window_Storage>(sf::VideoMode(1280, 720), "Window Name"))

68mb are allocated.

This does not include any textures, yet. Just constructors of null or default values.

Starting with the texture_storage:

First I want to show the members of the texture_store class:


	std::unordered_map<std::string, std::shared_ptr<Texture_Proxy>>
		hashed_texture_proxies;

	std::unordered_map<std::string, std::shared_ptr<Texture_Proxy>>::const_iterator
		hash_iterator;

	std::unique_ptr<Texture_Cache> 
		texture_cache;

I'm not sure how expensive this actually is - only being empty.

The cache class itself has these member variables:


	std::unordered_map<std::string, std::unique_ptr<sf::Texture>>
		cached_textures;

	std::unordered_map<std::string, std::unique_ptr<sf::Texture>>::const_iterator
		hash_iterator;

Summing the behaviour up:

The game creates its member of a texture storage. The storage instantiates its cache. That's it.

The other constructor being called, which is the window_storage, instantiates an object with the following members:


	std::unique_ptr<sf::RenderWindow> render_window;
	std::string title = "Default Name";

The sf::RenderWindow class allocates 520 bytes. Looking through the allocation stack of the object type void, I get the feeling that "void" refers to the std::make_unique<Texture_Store>() and std::make_unique<Window_Storage>(sf::VideoMode(1280, 720), "Window Name") calls from the beginning.

Last but not least, the constructor initializer list of the window_storage:


	window_pointer(std::make_unique<sf::RenderWindow>(window_mode, title_bar_text)),
	title(title_bar_text)

Nonetheless, the heap only shows 21mb out of these 60mb being allocated.

If there seems to be something smelly in my code, I would be happy to know about it.

While this might not be the perfect approach, the code works without any problems.

And as I mentioned, I did not allocate a single texture yet. Just the storage with its cache (and a window storage).

All you're doing is showing us member lists; that's not enough to give an accurate picture. When you indicate that 68MB of memory is allocated by initializing two items in the initializer list in your first example, that doesn't jibe with the members you assert those two types have. It doesn't take 68MB of RAM to store a default-constructed unordered map, an iterator, and a pointer.

What do the constructors of Texture_Store and Window_Storage actually do? It's also important to know what kind of allocation the 68MB is reported as.

If necessary, single-step through the code, watching the usage and taking heap snapshots before and after each initialisation step. If there's as little to this code as is suggested, this shouldn't take long to find out exactly where most of this memory gets allocated.
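If stepping through with snapshots gets tedious, a similar before/after measurement can be sketched in code. This is only an illustration under some assumptions: the names g_bytes_requested and bytes_requested_by are made up, the counter is not thread-safe, and it only sees requests that go through the global operator new. Memory obtained in other ways (for example malloc calls or OS-level allocations inside a library or driver) will not show up here.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Running total of bytes requested through the global operator new.
// Hypothetical name; not thread-safe, for illustration only.
static std::size_t g_bytes_requested = 0;

// Replacement for the global operator new that records each request.
void* operator new(std::size_t size)
{
    g_bytes_requested += size;
    if (void* memory = std::malloc(size ? size : 1))
        return memory;
    throw std::bad_alloc();
}

// Matching replacements so memory from std::malloc is released with std::free.
void operator delete(void* memory) noexcept { std::free(memory); }
void operator delete(void* memory, std::size_t) noexcept { std::free(memory); }

// Runs `step` and returns how many bytes it requested via operator new.
template <typename Step>
std::size_t bytes_requested_by(Step step)
{
    const std::size_t before = g_bytes_requested;
    step();
    return g_bytes_requested - before;
}
```

Wrapping each initialisation step in bytes_requested_by (for instance a lambda that constructs one of the storages) gives a number per step. If the sum stays far below what Task Manager reports, the remaining memory is coming from somewhere other than your own new calls.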

What do the constructors of Texture_Store and Window_Storage actually do? It's also important to know what kind of allocation the 68MB is reported as.

I did some further checking, the texture_store is fine.

What kind of allocation is it reported as? Well, the profiler does not say. It only shows a graph and the RAM currently in use. I cannot find it in the heap profiler.

I did some more profiling.

It pretty much is the sf::RenderWindow that allocates 68mb.


#include <SFML/Graphics.hpp>

int main()
{
    sf::RenderWindow window(sf::VideoMode(1280, 720), "Allocation Test");

    while(window.isOpen())
    {
        sf::Event event;

        while(window.pollEvent(event))
        {
            if(event.type == sf::Event::Closed)
                window.close();
        }
    }
}

This very basic piece allocates a lot of RAM on my end already.

Is it possible Visual Studio's compiler bloats this?

I tried different SFML-versions, same result. Ran it in debug and release, same result.
