https://bugs.winehq.org/show_bug.cgi?id=53032
--- Comment #28 from Kevin Puetz PuetzKevinA@JohnDeere.com --- Created attachment 72849 --> https://bugs.winehq.org/attachment.cgi?id=72849 Test code exploring the load/unload order of delayload dlls
In the attached example, there are 5 dlls, all of which trace the order of their DllMain calls to stderr, as well as their namesake function
main is linked to a.dll, and /delayload linked to b.dll. e.dll is linked to b.dll and /delayload linked to d.dll. main also does a LoadLibrary/GetProcAddress/call/FreeLibrary on e() twice. The first time it frees it properly, the second it leaks it.
The resulting trace (diffing PE vs ELF, both x64) is:
--- MSVC2022 Win10 +++ winegcc 11.2 wine-7.0 a.DllMain(1) main() -- begin a() b.DllMain(1) b() c.DllMain(1) e.DllMain(1) c() d.DllMain(1) d() e() e.DllMain(0) c.DllMain(0) +e.destructor() +c.destructor() +d.destructor() c.DllMain(1) e.DllMain(1) c() +d.DllMain(1) d() e() main() -- return atexit handler +d.DllMain(0) e.DllMain(0) c.DllMain(0) -d.DllMain(0) b.DllMain(0) a.DllMain(0) +main.destructor() +a.destructor() +b.destructor() +e.destructor() +c.destructor() +d.destructor()
So the first time through, a is loaded up front, and b is delay-loaded before its first call (as expected). Loading e.dll first loads c.dll, and d is delay loaded before its first call. Unloading e frees c, but not d. The second call to load e reloads c, but d was still loaded from the first time through. When main exits, e, c, b, and a all unload (in that order). This eppears to just be the reverse order of the still-active LoadLibrary calls, nothing based on one library's finalizer containing code to free another. So, to first appearences, MSVC's delayimp.lib just leaks the LoadLibrary call, and it's cleaned up like any other still-loaded module during process exit. Running these msvc-built .dll/.exe files in wine produces a the same trace, so it appears that wine's ntdll contains the code to clean this up the same way, and for PE files everythign matches.
But building this same code with winegcc (and thus winecrt0.a) produces differences even prior to the (potential) crash. FreeLibray(e_dll) still omits d.DllMain(DLL_PROCESS_DETACH), but e.dll.so's matching *does* run the ELF finalizers (__attribute((destructor)) of d.dll.so, so apparently it did dlclose the handle for d.dll.so, just without doing the Win32 DLL_PROCESS_DETACH first. And, because we dumped the buitin d.dll.so, the second round of loading and calling e does a second LoadLibrary(d.dll) and we see d.DllMain(DLL_PROCESS_ATTACH) run again. This seems bad, since we've unbalanced DLL_PROCESS_{ATTACH,DETACH}. The DllMain call seems to be missing because of the free_lib_count check in LdrUnloadDll (https://gitlab.winehq.org/wine/wine/-/blob/wine-7.0/dlls/ntdll/loader.c#L374...). Since we've ended up making recursive FreeLibrary(e.dll) -> dlclose(e.dll.so) -> free_delay_imports -> FreeLibrary(d.dll.so), which would violate the loader_section lock, it skips skips calling process_detach, and the d.DllMain(DLL_PROCESS_DETACH) call is lost.
And also, this change to the load order means d.DllMain(DLL_PROCESS_DETACH) is the fist thing run after atexit, instead of e (and c) preceding it (which would be the reverse order of the LoadLibrary calls), which differs from windows.
So... after doing this more investigation of the actual behavior, it seems like free_delay_imports should simply not exist. Calling FreeLibrary during the unload of a .exe.so is pointless (everything gets properly cleaned up during LdrShutdownProcess anyway), and doing so during the unload of a .dll.so inherently leads to this this invalid recursion (and thus misses the DLL_PROCESS_DETACH). The safe thing to do would be to somehow decrement delayloaded dependencies during the recursive MODULE_DecRefCount, which would then lead to process_detach and MODULE_FlushModrefs calls unloading it sequentally (rather than recursively). But that would be seemingly require private API between ntdll and winecrt0, and while the result seems sound, it wouldn't match the apparent behavior of MSVC's delayimp.lib (which presumably omits freeing delayload libs for similar reasons). It's explicitly unsafe to call FreeLibrary from within DllMain, and there's no other unload hook available in a PE build, so there's just no way for delayimp.cpp to do something similar to what wine is trying here.
https://docs.microsoft.com/en-us/windows/win32/api/libloaderapi/nf-libloader...
It is not safe to call FreeLibrary from DllMain. For more information, see the Remarks section in DllMain.
And if it's not a part of the real PE/windows/MSVC implementation, then I'm not sure what wine is trying to match.
This is very old code (mid 2005 and pre-1.0), though it seems like there was some amount of investigation in b2d35c3620be73ca662e73fed49a280b405d1f1c that lead to disabling it for MacOS but keeping it for other platforms (?). So any other insight is welcome...