Changes from the previous version accross the patches: - use ldr_data_srw_lock instead of introducing a new one when modifying WINE_MODREF data with a shared loader lock. - protect cached_modref with ldr_data_srw_lock instead of introducing interlocked access; - remove redundant 'volatile' qualifier from locked_exclusive; - fix and improve tests, split them across the patchset.
I actually mean to send first 7 patches from this series for review and not 22 at once but it feels like the changes introduced by these first patches make sense in this exact form only if the general idea of locking rework is accepted. So I am sending the whole thing in its current state in a hope it would be possible to see what I am suggesting and make a decision on the general idea at least. I did some cleanup on all the functional patches, although would look once again at those below patch 7 at least before sending that for review. The test in the last patch obviously needs more work, it works well for me on the recent enough Windows 10 but not that well yet on earlier versions, also needs cleaning up and splitting.
Some facts about the current Windows loader behaviour: - It seems that loader lock (the one that can be managed externally through LdrLockLoaderLock) is held only when calling the app's functions from the loader: DLL entry point, TLS callbacks, LDR notifications and LdrEnumerateLoadedModules callback. There is a quirk (at least on the newer Win10) with LDR notifications: the loader lock is held during the unload event but not during the load event, at least the first one. I am guessing that maybe it is unheld only until the first dependency DLL entry point is called (so they take it once there and then keep until process attach ends, leaving first load notification behind), but I didn't test that with the DLL with dependencies yet. In any case it looks to me this quirk may be left for later. - There is apparently other lock mechanics involved as Windows does not allow loading process of different DLLs to go in parallel; - It looks like any function which does not end up requiring actual loading or unloading of the DLL and does not query anything from the module currently being loaded or unloaded can be executed in parallel with the current load process. E. g., LoadLibrary("b.dll") won't ever block if the DLL is already loaded, LdrAddRefDll never blocks at all even if called for the module currently being loaded (hanging in dll entry point process attach) or unloaded (hanging in detach), LdrUnloadDll will only block if it decrements load count to zero. - The behaviour of the functions operating on the module being currently loaded or unloaded looks quirky. Here is what I recovered from testing on the 19043.1237 (that is mostly covered in the test in the last patch): - LdrUnloadDll blocks until the DLL load sequence is complete and succeeds without waiting if unload sequence is in progress; - LdrAddRefDll never blocks but behaviour depends on whether it is called from the app callback (tested with entry point and LDR notification) or outside of that (patch 0014). - LdrGetDllHandleEx blocks during module unload but not load; - LdrGetProcedureAddress waits until the module is being loaded and immediately returns an error if module is being unloaded; - Looks like calling functions from LDR notifications is the special case. Those that wouldn't block work, but those that would block if called from outside return error STATUS_NOT_FOUND. That is apparently not how that behaves from DLL entry points and TLS callbacks.
I think the main patch in the series is patch 0008 which introduces exclusive locks downgrade / upgrade functions. The idea is that we can take shared or exclusive lock from loader function. The exclusive lock can be downgraded to shared. However, the thread still holds exlusiveness in this state in a sense that other exclusive lock cannot be taken until the initial exclusive lock is released. Shared lock waiters can joing when the exlusive lock is downgraded. This works recursively so loader functions can call each other and call outside callbacks with the lock held, with the exception that exclusive lock cannot be taken recursively if shared is currently held. So, the way of using it is: - Never call the callbacks with read lock held: that will assert once the application calls some loader function which takes the exclusive lock. - The way to call the callbacks is to have an exclusive lock and downgrade it to shared, then the recursive taking of exclusive lock is ok; - If it is unclear at function start whether exclusive lock is needed or not, the way to go is to start from shared lock, try to perform the operation (in case of LdrAddRefDll / LdrUnloadLibrary I used an extra short lived SRW lock), and if that is not possible release the shared lock and restart from scratch with exclusive. I think maybe it would be technically possible not to introduce the limitation and allow upgrading shared locks to exclusive but it seems it would make things much more complicated than they are: it would require some neater and finer grained structure access sync to avoid invalidating all the data the shared locker depends on at the moment it wants to upgrade to exclusive and yields to exclusive library load / unload sequence.
Signed-off-by: Paul Gofman pgofman@codeweavers.com