Embedding comments in machine-code/JIT

Started by April 22, 2022 06:20 PM
20 comments, last by Juliean 8 months ago

frob said:
The PE (Portable Executable) format is used for both the exe and dll files. You can call LoadLibrary on an exe just as well as a dll.

I understand so far. But with the JIT-method, I have neither an exe nor a DLL file. So since what I'm doing is basically correct (VirtualAlloc+VirtualProtect), that means that instead of just writing the machine-code for each function into the executable memory, I have to write a PE-DLL-like file structure, and create a separate PDB on disk which I then load. Is that about right?

I don't know if VS can load symbols for that. Apart from the docs and just trying it, I don't know what more to offer there. It is not something commonly done.

frob said:
I don't know if VS can load symbols for that. Apart from the docs and just trying it, I don't know what more to offer there. It is not something commonly done.

Yeah, thanks anyway. Not being able to debug is a bit annoying at times, but at least I'm going to be able to create a walkable stack using the Sym-APIs that I already found; not having this would be an absolute no-go. I might be inclined to try out whether the PDB-loading works, but since it's a huge complication over my current simplistic “compiler”-design, it's not a top priority. It's really a lot of work for something that might fail in the end anyway.

Ok, I think I've found another clue for how this can be done. Visual Studio mentions a “symbol provider”, which is part of the extension-kit for their debugger:

https://docs.microsoft.com/en-us/visualstudio/extensibility/debugger/symbol-provider?view=vs-2022

https://docs.microsoft.com/en-us/visualstudio/extensibility/debugger/reference/symbol-provider-interfaces?view=vs-2022

It seems this is what I'm looking for. Now, due to some required changes to function-calls, the readability of the jit-code has decreased drastically again, making me more tempted to invest time into this. And even though I haven't dealt with writing VS-extensions before, at least it seems that this way is somewhat guaranteed to work, unlike trying to write a fully fledged PE/PDB-compiler.

Anyway, just thought I'd leave that here in case anyone comes across the same problems, since there does not seem to be much information about it.

Juliean said:

On the contrary, I did find some API-functions for supplying symbols to the debugger at runtime, via SymLoadModuleExW/SymAddSymbolW in DbgHelp.h:

// Initialize the DbgHelp symbol handler once for the current process.
const auto process = GetCurrentProcess();
static bool hasInitialized = false;
if (!hasInitialized)
{
	if (!SymInitialize(process, nullptr, false))
		Log::OutErrorFmt("Failed to call SymInitialize.");

	hasInitialized = true;
}

//if (!SymLoadModuleExW(process, nullptr, L"Test.dll", nullptr, (DWORD64)(pMemory + 8), DWORD(codeSize), nullptr, SLMFLAG_VIRTUAL))
//	Log::OutErrorFmt("Failed to call SymLoadModuleEx.");

uint32_t index = 0;
for (const auto [target, bytecode, info] : data.vBytecodeMapping)
{
	const auto end = [&]() -> uint32_t
	{
		const auto next = index + 1;
		if (data.vBytecodeMapping.IsValidIndex(next)) [[likely]]
			return data.vBytecodeMapping.At(next).target - REFERENCE_SIZE;
		else
			return uint32_t(codeSize);
	}();

	// Register a virtual module covering just this function, then add a named symbol spanning it.
	const auto code = DWORD64(pMemory + target);
	const auto size = end - target;
	if (!SymLoadModuleExW(process, nullptr, nullptr, nullptr, code, size, nullptr, SLMFLAG_VIRTUAL))
		Log::OutErrorFmt("Failed to call SymLoadModuleEx.");
	if (!SymAddSymbolW(process, code, L"Test(void)", code, size, 0))
		Log::OutErrorFmt("Failed to call SymAddSymbol.");

	index++;
}

This seems to work for supplying a name to the function, but it's critically lacking the ability to specify the line/source-file mapping that is enumerated by the SymGetLineFromXXX-functions as well as by the debugger itself. But SymAddSymbol seems to be the only way to “add” anything to the virtual module. Am I missing something, or do I really have to create a PE/COFF-formatted executable alongside a full PDB in order for these symbols to work?

(Sorry for the thread necromancy! I haven't found any better sources of info on SymAddSymbolW.)

Were you actually able to see these global symbols when you attach to your jit engine in Visual Studio? I've been fiddling with SymLoadModuleEx and SymAddSymbol for a while and I can't get VS to display anything. Are the virtual symbols added by the jit really supposed to be visible to the debugger (i.e. in a different process)? It sort of feels like the dbghelp data you create in the child process would be more likely to stay inside the child, rather than being shared with the debugger.

I was hoping, e.g., that SymLoadModuleEx would cause a new entry in the “Modules” window of VS, but I don't see that happening. And in the “Disassembly” view there are no symbol headings that would normally be there for a binary that has a pdb.

My silly plan was to abuse the symbols for a very basic form of debug info, by putting a symbol for each line of the source code where the symbol name is the full line of source code. It wouldn't be great, but might be sort of useful for a low-level style of debugging.
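A minimal sketch of what that per-line symbol table could look like, kept independent of DbgHelp so it stays portable. The names LineMapping, LineSymbol and BuildLineSymbols are hypothetical and not from the thread; each resulting entry carries the address, size and name that a call like SymAddSymbolW would take. It assumes the line mappings are already sorted by code offset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record: one source line and the offset where its code starts.
struct LineMapping {
    std::uint32_t codeOffset;  // offset of the line's first instruction in the JIT buffer
    std::string sourceText;    // the full line of source code
};

// Hypothetical record holding the values a symbol registration would need.
struct LineSymbol {
    std::uint64_t address;
    std::uint32_t size;
    std::string name;
};

// Builds one symbol per source line. Each symbol spans from its own offset
// to the next line's offset; the last one runs to the end of the code.
// Assumes `lines` is sorted by codeOffset with strictly increasing offsets.
std::vector<LineSymbol> BuildLineSymbols(std::uint64_t codeBase, std::uint32_t codeSize,
                                         const std::vector<LineMapping>& lines) {
    std::vector<LineSymbol> symbols;
    symbols.reserve(lines.size());
    for (std::size_t i = 0; i < lines.size(); ++i) {
        const std::uint32_t begin = lines[i].codeOffset;
        const std::uint32_t end =
            (i + 1 < lines.size()) ? lines[i + 1].codeOffset : codeSize;
        assert(begin < end && "offsets must increase and stay inside the buffer");
        symbols.push_back({codeBase + begin, end - begin, lines[i].sourceText});
    }
    return symbols;
}
```

Whether a debugger in another process ever displays symbols registered this way is a separate question, which is exactly the problem described here.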

But… I haven't even been able to get VS to see anything added by SymAddSymbol at all. Any hints or suggestions would be much appreciated! (Writing out a pdb with a stub dll seems possible, but definitely also a pretty large and messy undertaking.)

sgraham said:
Were you actually able to see these global symbols when you attach to your jit engine in Visual Studio? I've been fiddling with SymLoadModuleEx and SymAddSymbol for a while and I can't get VS to display anything. Are the virtual symbols added by the jit really supposed to be visible to the debugger (i.e. in a different process)? It sort of feels like the dbghelp data you create in the child process would be more likely to stay inside the child, rather than being shared with the debugger.

Hey, no, unfortunately SymLoadModule etc. doesn't do anything for the debugger. It's good enough to be able to get a stack-trace, but those can be manually injected, so I didn't bother with it. You can see my last post, though, for the information on how I think that could be done. It's just above yours.

Essentially, VS has something called a “symbol provider” which you'd have to implement in order to be able to see symbols for your JIT in the debugger/disassembler. At least that's what I think it is; I haven't bothered with it, as documentation seems lacking and it's a VS-extension, which I have no clue how to develop. But this is something you might want to investigate if this is really important to you. I kind of got away without it; the JIT is stable enough now that I don't need it. It really depends on how strongly you need it.
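For the manually injected stack trace mentioned above, one possible shape is a lookup from code addresses to JIT function names. This is only a sketch under assumptions: JitFunction, FindJitFunction and NameFrames are hypothetical names, and the table is assumed to be sorted by start address with non-overlapping ranges.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record describing one JIT-compiled function.
struct JitFunction {
    std::uint64_t begin;  // address of the first byte
    std::uint64_t end;    // one past the last byte
    std::string name;
};

// Returns the function whose range contains `address`, or nullptr.
// Assumes `functions` is sorted by `begin` and ranges do not overlap.
const JitFunction* FindJitFunction(const std::vector<JitFunction>& functions,
                                   std::uint64_t address) {
    assert(std::is_sorted(functions.begin(), functions.end(),
                          [](const JitFunction& a, const JitFunction& b) {
                              return a.begin < b.begin;
                          }));
    // First function that starts after `address`; the candidate is the one before it.
    auto it = std::upper_bound(
        functions.begin(), functions.end(), address,
        [](std::uint64_t addr, const JitFunction& f) { return addr < f.begin; });
    if (it == functions.begin()) return nullptr;
    --it;
    return address < it->end ? &*it : nullptr;
}

// Maps a list of code addresses (e.g. collected while walking the stack) to names.
std::vector<std::string> NameFrames(const std::vector<JitFunction>& functions,
                                    const std::vector<std::uint64_t>& addresses) {
    std::vector<std::string> names;
    names.reserve(addresses.size());
    for (const std::uint64_t address : addresses) {
        const JitFunction* fn = FindJitFunction(functions, address);
        names.push_back(fn ? fn->name : std::string("<unknown>"));
    }
    return names;
}
```

When the addresses are return addresses, it is common to look up `address - 1` instead, since a call placed at the very end of a function returns to the first byte after it.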

Thanks for the response. The extension development and installation does seem like a hassle and probably not well supported, especially since they seem to be encouraging you not to use it in the docs… so maybe the hacky PE/PDB is the “best” way. Or maybe nothing. :D

@sgraham If you do end up doing the PE/PDB-thing, let me know how it went. I read a bit into the format and it seems like a real chore: you'd have to develop a lot of things that usually wouldn't matter for a JIT, and you'd also lose the ability to separately compile functions as needed (which I don't do in my JIT, but I have heard that in general it's something that is done in JITs with high optimization and on large codebases, as compiling everything up front would otherwise take a lot of time).

There is not much you can do. Maybe you can add NOPs after every operation in debug builds for better readability, but that's all.
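A minimal sketch of that NOP padding, assuming the JIT appends raw x86/x64 bytes to a buffer. EmitOperation and its parameters are hypothetical names, not from anyone's JIT in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint8_t kNop = 0x90;  // single-byte x86/x64 NOP

// Hypothetical emitter helper: appends the bytes of one operation and, in
// debug builds only, a run of NOPs so operations are visually separated
// in the disassembly view.
void EmitOperation(std::vector<std::uint8_t>& code,
                   const std::vector<std::uint8_t>& opBytes,
                   bool debugBuild, std::size_t nopCount = 4) {
    assert(!opBytes.empty());
    code.insert(code.end(), opBytes.begin(), opBytes.end());
    if (debugBuild) {
        code.insert(code.end(), nopCount, kNop);
    }
}
```

Any relative jump or call offsets would have to be computed with the padding already included, otherwise the targets shift.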

Hi, thanks for the suggestion, I also did a blob of nops for a while to separate lines.

Another hack I belatedly thought of that might be useful is to do some nops coupled with “push [address of comment string] ; pop”, which might be helpful for debugging in some cases.
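A sketch of one way such a marker could be encoded on x64. Note that this is a variation and not the exact sequence described above: x64 has no push with a 64-bit immediate, so the sketch saves rax, loads the string address into it, and restores it. EmitCommentMarker is a hypothetical name, and the sketch assumes the Windows x64 ABI (no red zone below the stack pointer) and that the comment string outlives the generated code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: emits `push rax; mov rax, imm64; pop rax`, where the
// immediate is the address of a comment string. The sequence has no lasting
// effect on registers, but the address shows up in the disassembly and can
// be inspected in a memory or watch window.
void EmitCommentMarker(std::vector<std::uint8_t>& code, const char* comment) {
    assert(comment != nullptr);
    const std::uint64_t address =
        static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(comment));

    code.push_back(0x50);  // push rax
    code.push_back(0x48);  // REX.W prefix
    code.push_back(0xB8);  // mov rax, imm64
    for (int i = 0; i < 8; ++i) {
        // The immediate is stored in little-endian byte order.
        code.push_back(static_cast<std::uint8_t>(address >> (8 * i)));
    }
    code.push_back(0x58);  // pop rax
}
```

Each marker costs 12 bytes, so like the NOP padding it is probably something to enable only in debug builds.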

But I got sucked into the idea after I started looking at PDB files, so I wrote a library to write PDBs (and cause VS to load them) at runtime, if anyone's interested. It was a “real chore”, as you said, @juliean!

There's a blog post explaining a bit about it here: https://scot.tg/2023/05/02/debugging-with-pdbs/ with a short demo video and links to the library + example.

This topic is closed to new replies.
