I see. I don't see a simple way around this. Would it be really bad to have an internal `WakeByAddress` that takes a size?
Well, we'd need to store the size in the futex wait queue, which is a bit strange.
At that rate it may make more sense to have dedicated wait queues for SRW locks. That'd end up being something closer to 3504, but not necessarily with all of the complexity of (or even compatibility with) the Windows implementation.
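For reference, here is a rough sketch of what storing the size in the wait queue could look like. Everything in it is hypothetical: `struct wait_entry`, `ranges_overlap` and `wake_by_address_sized` are illustrative names, not existing code, and it assumes a sized wake should signal any waiter whose range overlaps the given range, which may not be the semantics we actually want. It is single-threaded and leaves out locking and the actual blocking.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wait-queue entry, for illustration only. */
struct wait_entry
{
    const void *addr;          /* address the waiter is blocked on */
    size_t size;               /* size of the compared value, in bytes */
    int woken;                 /* set once the entry has been signalled */
    struct wait_entry *next;
};

/* Returns nonzero if [a, a + a_size) and [b, b + b_size) share at least one byte. */
static int ranges_overlap( const void *a, size_t a_size, const void *b, size_t b_size )
{
    uintptr_t a_start = (uintptr_t)a, b_start = (uintptr_t)b;
    return a_start < b_start + b_size && b_start < a_start + a_size;
}

/* Marks every not-yet-woken waiter whose range overlaps the given one
 * (assumed semantics) and returns how many entries were marked. */
static int wake_by_address_sized( struct wait_entry *head, const void *addr, size_t size )
{
    int count = 0;
    struct wait_entry *entry;

    for (entry = head; entry; entry = entry->next)
    {
        if (!entry->woken && ranges_overlap( entry->addr, entry->size, addr, size ))
        {
            entry->woken = 1;
            count++;
        }
    }
    return count;
}
```

The extra `size` field is the part that feels strange: every waiter has to carry it even though only the SRW lock path would need the range check.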
One other note: regardless of which path we take, I think we should keep the test that replicates what the offending application is doing.