https://bugs.winehq.org/show_bug.cgi?id=50292
--- Comment #16 from Jacek Caban jacek@codeweavers.com --- I meant it as a replacement, not an addition. I may have been overly optimistic about just a 2-dim array; a 3-dim one could be less wasteful (although there may be a better structure). The point is to make a single tid slot cheap enough (4 bytes seems more than enough, but it could be even larger if needed) so that we may over-allocate and never free. Over-allocation makes allocation of a single slot cheap (because it will usually be already allocated) and never freeing avoids any need for locking.
It may seem like a wasteful strategy, but note that the server gives us good locality of tid values and reuses previously freed ids. In practice, I'd expect that a single block of a few thousand entries should be enough for the majority of applications. If we optimize for that (while being able to handle even a theoretical case of 2**30 created threads with enough allocations), I'd hope that we can get decent results. I can't be sure without trying, maybe I'm missing something.
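
To make the idea concrete, here is a rough sketch of what such a 3-dim table could look like. It is not taken from any patch: the names (node_t, leaf_t, get_or_alloc, tid_slot), the 10-bit split per level, and the use of C11 atomics with calloc are all assumptions for illustration, and how a tid maps to an index is left open. Blocks are allocated on first use and published with a compare-and-swap, so lookups take no lock and a published block is never freed.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical layout: 3 levels of 10 bits each cover 2**30 indices. */
#define LEVEL_BITS 10
#define LEVEL_SIZE (1u << LEVEL_BITS)
#define LEVEL_MASK (LEVEL_SIZE - 1)
#define MAX_INDEX  (1u << (3 * LEVEL_BITS))

/* Interior block: pointers to the next level, NULL until first use. */
typedef struct { _Atomic(void *) children[LEVEL_SIZE]; } node_t;
/* Leaf block: one 4-byte slot per index. */
typedef struct { _Atomic uint32_t slots[LEVEL_SIZE]; } leaf_t;

/* Top level lives in static storage, so it starts out zeroed. */
static node_t top;

/* Return the block stored in *slot, allocating and publishing a zeroed
 * one if it is still NULL. Assumes zero-filled memory from calloc is a
 * valid initial state for the atomics inside the block. */
static void *get_or_alloc(_Atomic(void *) *slot, size_t size)
{
    void *block = atomic_load_explicit(slot, memory_order_acquire);
    if (block) return block;

    void *fresh = calloc(1, size);
    if (!fresh) return NULL;

    void *expected = NULL;
    if (atomic_compare_exchange_strong(slot, &expected, fresh))
        return fresh;

    /* Another thread published first. Our block was never visible to
     * anyone, so dropping it does not break the "never free" rule. */
    free(fresh);
    return expected;
}

/* Return a pointer to the slot for index, or NULL if the index is out of
 * range or memory is exhausted. The pointer stays valid forever because
 * published blocks are never released. */
static _Atomic uint32_t *tid_slot(uint32_t index)
{
    if (index >= MAX_INDEX) return NULL;

    node_t *mid = get_or_alloc(&top.children[index >> (2 * LEVEL_BITS)],
                               sizeof(node_t));
    if (!mid) return NULL;

    leaf_t *leaf = get_or_alloc(&mid->children[(index >> LEVEL_BITS) & LEVEL_MASK],
                                sizeof(leaf_t));
    if (!leaf) return NULL;

    return &leaf->slots[index & LEVEL_MASK];
}
```

With this split, each leaf holds 1024 slots, so an application whose tids stay within a few thousand consecutive indices would touch one mid-level block and a handful of leaves. The common path is then two pointer loads and an array index, with allocation happening only the first time a new range is reached.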