https://bugs.winehq.org/show_bug.cgi?id=50292
--- Comment #8 from Zebediah Figura z.figura12@gmail.com ---
(In reply to Rémi Bernon from comment #7)
Regarding the performance issue, I think the linked-list nature of the TEB list could also be part of the issue, especially when a large number of threads are created. On synthetic benchmarks, it's very clear that it induces a lot of CPU cache misses.
IIUC it's not really possible to avoid a global lock here. As the NtAlertThreadByThreadId function is supposed to return a specific status if the targeted TID does not exist, I guess there's no other way than tracking all threads.
However, it should be possible to make it faster by having a static TID + sync object array instead, to help the CPU's speculative execution do its magic. Of course it will then have to support a dynamic number of threads, but perhaps a reasonably large fixed maximum number may be acceptable and could still perform better than the linked list?
It should be possible to do it lock-free with the same mechanism as we use on the PE side, i.e. a growable linked list of arrays. I hadn't thought of that when I first wrote the patch. It was my vague plan to try that optimization and see if it helps.
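
For illustration, here is a rough sketch of what a lock-free growable linked list of arrays could look like with C11 atomics. It is not the code from the patch or from the PE side. All names (tid_block, tid_entry, tid_register, tid_find, tid_alert, tid_unregister, BLOCK_SIZE) are made up for this example, and a plain int flag stands in for the real per-thread sync object.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK_SIZE 256 /* entries per block; arbitrary value for this sketch */

struct tid_entry
{
    _Atomic uint32_t tid; /* 0 means the slot is free */
    atomic_int alerted;   /* hypothetical stand-in for the sync object */
};

struct tid_block
{
    struct tid_entry entries[BLOCK_SIZE];
    struct tid_block *_Atomic next; /* set once, never changed afterwards */
};

static struct tid_block first_block; /* static storage, so zero-initialized */

/* Claim a free slot for tid; append a new block if every slot is taken. */
static struct tid_entry *tid_register( uint32_t tid )
{
    struct tid_block *block = &first_block;

    assert( tid != 0 );
    for (;;)
    {
        for (unsigned int i = 0; i < BLOCK_SIZE; i++)
        {
            uint32_t expected = 0;
            if (atomic_compare_exchange_strong( &block->entries[i].tid, &expected, tid ))
                return &block->entries[i];
        }

        struct tid_block *next = atomic_load( &block->next );
        if (!next)
        {
            /* Assumes zeroed memory is a valid initial state for the atomics. */
            struct tid_block *new_block = calloc( 1, sizeof(*new_block) );
            struct tid_block *expected = NULL;

            if (!new_block) return NULL;
            if (atomic_compare_exchange_strong( &block->next, &expected, new_block ))
                next = new_block;
            else
            {
                /* Another thread appended a block first; use theirs. */
                free( new_block );
                next = expected;
            }
        }
        block = next;
    }
}

/* Return the entry for tid, or NULL if no such thread is registered. */
static struct tid_entry *tid_find( uint32_t tid )
{
    if (!tid) return NULL;
    for (struct tid_block *block = &first_block; block; block = atomic_load( &block->next ))
        for (unsigned int i = 0; i < BLOCK_SIZE; i++)
            if (atomic_load( &block->entries[i].tid ) == tid) return &block->entries[i];
    return NULL;
}

/* Returns 1 if the thread was found and alerted, 0 if the TID is unknown. */
static int tid_alert( uint32_t tid )
{
    struct tid_entry *entry = tid_find( tid );
    if (!entry) return 0;
    atomic_store( &entry->alerted, 1 );
    return 1;
}

/* Release a slot so that a later thread can reuse it. */
static void tid_unregister( struct tid_entry *entry )
{
    atomic_store( &entry->alerted, 0 );
    atomic_store( &entry->tid, 0 );
}
```

Blocks are only ever appended and never freed, which is what keeps the traversal safe without a lock. The sketch does not handle an alert racing with a slot being released and reused by another thread, so a real implementation would need to deal with that, and the 0 return from tid_alert is where the "TID does not exist" status would come from.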