https://bugs.winehq.org/show_bug.cgi?id=45546
--- Comment #20 from Ian Sabourin sslasher0@gmail.com --- A discussion took place on IRC yesterday, which I'll try my best to synthesize here. The main participants were Zebediah Figura, 'ken', and myself - with some input from julliard, and a few other comments. Apologies if I forgot anything.
Thread A requests the context of thread B, by calling get_thread_context(). The Microsoft API stipulates that prior to this, B must be suspended by the client program.
Regardless of whether this actually happened, it seems that the wine server is unable to produce the thread context, until B is suspended - since it is in fact B that writes its context, as part of cooperatively suspending itself. To further complicate things, it seems that the wine server has no means of forcibly suspending a thread. These are ken's comments, which I don't know enough to comment on.
If that's the case, the wine server has no choice but to request B's suspension, and hope B eventually cooperates. In the meantime, all the server can do is return 'PENDING'.
Correspondingly, the DLL code (in thread A) periodically retries asking the server. The question now is, what if B never suspends? Maybe it stopped executing, or maybe another thread C resumed it. These would be examples of incorrect client programs, and we could say, let the client program hang, if it does this. But the argument was: what if A is a debugger? Then we don't want it to hang forever in get_thread_context(), just because B doesn't suspend as asked.
Zebediah put forth the (good) idea of having the server signal thread A, instead of having A poll the server. But ultimately this doesn't make it any more certain that B will eventually suspend.
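To illustrate why signalling alone doesn't remove the need for a timeout, here is a rough sketch in Python (not wine code; the class, its method names and the 'eip' placeholder are all made up for illustration). The waiter is woken as soon as the context is published, but if nobody ever publishes it, the only thing that gets the waiter out is the timeout.

```python
import threading


class PendingContext:
    """Hypothetical stand-in for a server-side 'context is ready' notification."""

    def __init__(self):
        self._ready = threading.Event()
        self._context = None

    def publish(self, context):
        # Would be called once thread B has suspended and written its context.
        self._context = context
        self._ready.set()

    def wait(self, timeout):
        # Thread A blocks here instead of polling. Returns the context,
        # or None if B never cooperated within `timeout` seconds.
        if self._ready.wait(timeout):
            return self._context
        return None
```

With no publisher, wait() returns None once the timeout expires; after publish(), it returns immediately. So signalling saves the polling traffic, but the question of what timeout to pick is still there.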
As a result, if the server has no means of forcibly suspending a thread, and if we also want native debuggers not to hang in this scenario, there must be a timeout in get_thread_context().
The current problem is that the timeout occurs, but really thread B was just 'legitimately' taking a long time to suspend (quoted because a slow suspend probably indicates some problem in the client code, but that's in a sense irrelevant here). As a practical solution, julliard suggested "an exponential backoff over a few seconds". I took that to mean that the polling of the server starts out quick, and then slows down, to limit server contention when threads are slow to suspend. Nevertheless, there would remain the question of what timeout to select, which seems very arbitrary.
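For concreteness, this is how I read the "exponential backoff over a few seconds" suggestion, as a Python sketch rather than actual wine code. The function names, the query_server callback (returns the context, or None while the server says 'PENDING') and all the numbers (1 ms initial delay, doubling, 0.5 s cap, 3 s total) are my own assumptions, not anything that was agreed on.

```python
import time


def backoff_delays(initial=0.001, factor=2.0, max_delay=0.5, total=3.0):
    """Yield sleep intervals that grow geometrically and add up to `total` seconds."""
    elapsed = 0.0
    delay = initial
    while elapsed < total:
        # Never sleep longer than the cap or past the overall budget.
        step = min(delay, max_delay, total - elapsed)
        yield step
        elapsed += step
        delay *= factor


def poll_thread_context(query_server, sleep=time.sleep, **backoff):
    """Poll until query_server() returns a context, or the time budget runs out.

    Returns the context, or None if the thread never suspended in time.
    """
    context = query_server()
    for delay in backoff_delays(**backoff):
        if context is not None:
            return context
        sleep(delay)
        context = query_server()
    return context
```

The early polls are cheap and quick, so a thread that suspends promptly is picked up almost immediately, while a slow one costs the server fewer and fewer requests over time. The `total` value is still the arbitrary part.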
Before settling on that (a longer timeout), I'd like to ask two things:
1. Does the server have absolutely no way of forcibly suspending a thread, and then returning a context for it, even if this context is 'invalid'? Why not, exactly? This could open up different solutions.
2. Is it absolutely required that a native Windows debugger not hang in this degenerate scenario, when running on wine? Could there not be a custom debugger that targets the wine environment? It could speak directly to the wine server, instead of going through the MS API, which in this case already does not match the reality of the wine environment.