On 04/10/2022 19:18, Esme Povirk (@madewokherd) wrote:
Zebediah has graciously linked me to a resource on memory barriers which I am planning to read in the hope of understanding this better. She also brought up the question of whether we'd be better served by a standard locking mechanism.
The image_lock and image_unlock functions were added with the intention of making image access thread-safe, so we should make sure they have the correct semantics. AFAICT the lock itself doesn't contain any information except the current thread id, so as long as the lock can be cheaply created and destroyed without creating any kernel objects (because we can create and destroy a lot of images), and can be used without blocking, I think a standard mechanism will work fine.
It looks to me like if we create a critical section without debug info and only use TryEnter, it should be as light as it can reasonably be. Using debug info should also be fine as it's just a single heap allocation, but I don't think we need it if we never block on the critical section.
I don't think a critical section is needed if it works now without one. These memory barriers do nothing on x86, and don't even exist (compile to no code at all), it's only useful for architectures without strong memory-ordering guarantees (like ARM or PowerPC, etc).
Those architectures have specific instructions to deal with this, because the CPU can reorder memory writes or reads without them. It's sort of like using "sync" on disk, but on the CPU's memory. You want the other pieces of memory to be synced (flushed) before you update the atomic lock because when you read the lock in some other thread, you expect to read the rest of the stuff and have it synced with it.
Anyway this is a very simplified explanation, but might be easier to grasp, at least as to why it's needed.