If what you're writing already requires a dll, or you can augment an existing one, then you're already set and can use the fact that DllMain gets called when threads are created and destructed to your advantage. If you're not, or can't then you're pretty much stuck for an answer. Conventional wisdom on the web seems to revolve around hooking CreateThread or even use the kernel based notification scheme. However making a whole driver is overkill and with several methods of creating threads called at various levels of Windows, hooking isn't always sufficient either, especially if you want to execute code in the thread context. WMI is also a technical possibility, but with its '10,000 lines of code where 10 will do' philosophy, that's where its staying.
Dll thread_attach notifications work because when threads are created and torn down, ntdll loops around the internal structures corresponding to each module loaded in the process and calls their entry point if they meet certain criteria. The structure for the exe is included in the enumeration but as it doesn't identify as a dll, its entry point isn't called. The thing to do then, is modify the structure to a) look like a dll and b) make it think our entry point is a DllMain...
Continue reading on
Just Let It Flow