I am unsure about this one. If the thread that got through is de-scheduled in between the cmpxchgs, wouldn't all the spinning threads prevent it from being scheduled again in a timely manner?
I'm afraid I'm not familiar enough with kernel scheduling to know the answer to this one.
It's worth pointing out, though, that RtlWaitOnAddress() has its own spinlock. If there are more than two threads trying to concurrently add to the list, they're going to run into that spinlock anyway.
Does `RtlWaitOnAddress` not have a short busy-waiting loop?
It does not. Perhaps it should; I don't know, but given how much effort it took to get that function correct and performant, I'm wary of touching it.
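For illustration only, a short busy-wait of the kind asked about above might look roughly like the sketch below. It uses portable C11 atomics and is not taken from the `RtlWaitOnAddress` implementation; `spin_until_changed` and `SPIN_LIMIT` are hypothetical names, and the limit is an arbitrary placeholder that would need measuring. The point of the bound is that a waiter which exhausts it stops polling and falls back to a blocking wait, which frees the CPU for a de-scheduled thread.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spin budget; the right value would have to be measured. */
#define SPIN_LIMIT 64

/* Poll *addr at most SPIN_LIMIT times.
 * Returns true if the value was seen to differ from `expected` while
 * polling, false if the budget ran out and the caller should fall back
 * to a blocking wait instead of continuing to spin.
 * A real implementation would likely add a CPU pause hint in the loop;
 * it is left out here to keep the sketch portable. */
static bool spin_until_changed(atomic_int *addr, int expected)
{
    for (int i = 0; i < SPIN_LIMIT; i++)
    {
        if (atomic_load_explicit(addr, memory_order_acquire) != expected)
            return true;
    }
    return false;
}
```

A caller would try `spin_until_changed()` first and only enter the blocking path when it returns false.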