It turns out that no concurrent writes occur at all. I added mutex locks and tests with WARN messages to the descriptor creation functions and copy, but nothing at all showed up. The only issue is descriptors being written to while being copied. I sent a patch upstream which has couple of issues fixed. It doesn't slow down Metro Exodus, which is nice.

On Fri, Sep 20, 2019 at 7:29 AM Derek Lesho <dlesho@codeweavers.com> wrote:

On 9/19/19 12:43 PM, Derek Lesho wrote:

On 9/19/19 11:12 AM, Conor McCarthy wrote:
On Fri, Sep 20, 2019 at 1:32 AM Derek Lesho <dlesho@codeweavers.com> wrote:
Also, in your patch, I noticed that you intentionally unlock before
performing the copy or overwrite.  This puzzles me, if one thread waits
on the global mutex for access to a descriptor's view to dereference,
and we unlock the mutex for them when the view is NULL, they may crash
when trying to decrement the reference of the NULL view.

Yes, it's missing a null check after locking the mutex. If it's null after locking then the function can return because it means another thread has destroyed the descriptor. I think the only remaining problem is the old one of a single mutex being used for all descriptors. Maybe performance can be improved.
Even if one thread destroys a descriptor, and then we let both concurrently write their data, we will leak the Vulkan handle of written by the thread which writes it to the descriptor first and gets overwritten.

I did some testing with your patch as is, and compared it with a solution where we destroy the vulkan handle after unlocking the global mutex.  The difference was within the margin of error.

I also tried adding a dedicated mutex for this purpose, but that didn't have much of an affect on performance either.  A dedicated spinlock actually worsened performance, even without the Vulkan call inside the lock.

If we can confirm that no resource leakage is occurring, the patch you sent is probably fine as is.