Jinoh Kang (@iamahuman) commented about dlls/ntdll/heap.c:
+ + if (!(block_get_flags( block ) & BLOCK_FLAG_LFH)) return STATUS_UNSUCCESSFUL; + + bin = heap->bins + BLOCK_SIZE_BIN( block_size ); + if (bin == last) return STATUS_UNSUCCESSFUL; + + i = block_get_group_index( block ); + valgrind_make_writable( block, sizeof(*block) ); + block_set_type( block, BLOCK_TYPE_FREE ); + block_set_flags( block, ~BLOCK_FLAG_LFH, BLOCK_FLAG_FREE ); + mark_block_free( block + 1, (char *)block + block_size - (char *)(block + 1), flags ); + + InterlockedOr( &group->free_bits, 1 << i ); + + /* if this was the last used block in a group and GROUP_FLAG_FREE was set */ + if (ReadNoFence( &group->free_bits ) == (GROUP_FREE_BITS_MASK | GROUP_FLAG_FREE)) You should revert to using the `InterlockedOr` return value here. Note how there is a race condition bug here: two threads can simultaneously attempt to claim ownership of the same group and release it. To demonstrate this, consider the following scenario:
**Precondition**: `group->free_bits == 0xfffffffc`. **Sequence**: 1. Thread A: calls `heap_free_block_lfh` with block `0` of group `group`. 2. Thread A: evaluates `InterlockedOr( &group->free_bits, 1 << 0 )`. `free_bits` is now `0xfffffffd`. 3. Thread B: calls `heap_free_block_lfh` with block `1` of group `group`. 4. Thread B: evaluates `InterlockedOr( &group->free_bits, 1 << 1 )`. `free_bits` is now `0xffffffff`. 5. Thread A: evalutes `ReadNoFence( &group->free_bits ) == (GROUP_FREE_BITS_MASK | GROUP_FLAG_FREE)`, which evalutes to _true_. 6. Thread B: evalutes `ReadNoFence( &group->free_bits ) == (GROUP_FREE_BITS_MASK | GROUP_FLAG_FREE)`, which evalutes to _true_. 7. Thread A: resets `group->free_bits` and releases `group`. 8. Thread B: resets `group->free_bits` and releases `group`. → [double release!] -- https://gitlab.winehq.org/wine/wine/-/merge_requests/1628#note_23819