This patch uses ReadAcquire() in several places. I firmly believe that every barrier needs to be justified with a comment, and that the comment should say where the other half of the barrier is. If a read doesn't need a barrier, it should be ReadNoFence().
I believe there are two places where we need to care about acquire/release:
(1) Because we are implementing a lock, we need RtlAcquire* to have acquire semantics, and RtlRelease* to have release semantics. This should be covered by the cmpxchg of the lock Ptr value itself, in all relevant paths. We may consider weakening those to InterlockedCompareExchangePointerAcquire() / InterlockedCompareExchangePointerRelease(), but I'm slightly less concerned about that, if only because there's precedent.
(2) When we fill the srwlock_waiter structure, we need the writes to its members to be visible before we link it. Therefore the cmpxchgs to the lock Ptr and to the "next" pointer need to have release semantics, and on the other side, any reads of those pointers need to have acquire semantics. They currently do, but it is still worth justifying each one in a comment.
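
To make the two pairings concrete, here is a minimal sketch in portable C11 atomics rather than the Windows macros. It is not the patch's code: struct waiter, owned, waiters and all four function names are hypothetical stand-ins, and the lock word is split into two variables purely for readability. Roughly, memory_order_acquire on a load plays the role of ReadAcquire(), memory_order_relaxed plays the role of ReadNoFence(), and the acquire/release CAS variants play the role of InterlockedCompareExchangePointerAcquire() / InterlockedCompareExchangePointerRelease(). Each barrier carries the kind of comment I'm asking for.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for srwlock_waiter; the fields are assumptions. */
struct waiter
{
    struct waiter *next;  /* plain field, filled in before the waiter is linked */
    int exclusive;        /* plain field, filled in before the waiter is linked */
};

/* Hypothetical stand-ins for the lock's Ptr value, split in two for clarity. */
static atomic_bool owned;
static _Atomic(struct waiter *) waiters;

/* (1) Acquire: pairs with the release CAS in release_lock(), so everything
 * the previous owner wrote inside the critical section is visible to us. */
static bool try_acquire_lock(void)
{
    bool expected = false;
    return atomic_compare_exchange_strong_explicit(&owned, &expected, true,
            memory_order_acquire, memory_order_relaxed);
}

/* (1) Release: pairs with the acquire CAS in try_acquire_lock(). */
static void release_lock(void)
{
    bool expected = true;
    bool ok = atomic_compare_exchange_strong_explicit(&owned, &expected, false,
            memory_order_release, memory_order_relaxed);
    assert(ok); /* releasing a lock that is not held is a caller bug */
    (void)ok;
}

/* (2) Fill the waiter, then link it. */
static void publish_waiter(struct waiter *w, int exclusive)
{
    /* No fence: we do not dereference the old head here, and the CAS below
     * re-validates the value we read. */
    struct waiter *head = atomic_load_explicit(&waiters, memory_order_relaxed);

    w->exclusive = exclusive;
    do
    {
        w->next = head;
        /* Release: makes the plain writes to *w visible before the link.
         * Pairs with the acquire load in first_waiter(). */
    } while (!atomic_compare_exchange_weak_explicit(&waiters, &head, w,
            memory_order_release, memory_order_relaxed));
}

/* (2) Acquire: pairs with the release CAS in publish_waiter(), so the
 * caller may read the members of the returned waiter. */
static struct waiter *first_waiter(void)
{
    return atomic_load_explicit(&waiters, memory_order_acquire);
}
```

The relaxed load in publish_waiter() is the ReadNoFence() case: the value is only fed back into the CAS, so a barrier there would buy nothing. The acquire load in first_waiter() is the ReadAcquire() case, because its caller goes on to dereference the result.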