Jinoh Kang (@iamahuman) commented about dlls/ntdll/heap.c:
+/* keep the block size category count reasonable */
+C_ASSERT( BLOCK_SIZE_CATEGORY_COUNT <= 256 );
+/* difference between block classes and all possible validation overhead must fit into block tail_size */
+C_ASSERT( BLOCK_SIZE_MEDIUM_STEP + 3 * BLOCK_ALIGN <= FIELD_MAX( struct block, tail_size ) );
+/* affinity to tid mapping array, limits the number of thread-local caches,
+ * and additional threads will fight for the global categories groups
+ */
+static LONG next_thread_affinity;
+/* a category of heap blocks of a certain size */
+struct category
+{
+    /* counters for LFH activation */
+    volatile LONG blocks_alive;
How about counting the cumulative total of _freed_ blocks instead? That would reduce the two counter writes on the allocation path to one.
`(ULONG)category->blocks_total - (ULONG)category->blocks_dead` will still evaluate to the (approximate) alive block count even if either counter individually overflows.
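For illustration, here is a minimal standalone sketch of that scheme (names and plain unsigned arithmetic are only stand-ins for the patch's `volatile LONG` fields and whatever atomic updates it uses; just the counting logic is shown). Both paths write a single monotonically increasing counter, and the unsigned difference keeps giving the approximate alive count even after wraparound:

```c
#include <assert.h>
#include <limits.h>

/* hypothetical stand-in for the per-size-category counters */
struct category_counters
{
    unsigned int blocks_total; /* cumulative allocated blocks, increment-only */
    unsigned int blocks_dead;  /* cumulative freed blocks, increment-only */
};

/* one counter write per allocation */
static void count_alloc( struct category_counters *c ) { c->blocks_total++; }

/* one counter write per free */
static void count_free( struct category_counters *c ) { c->blocks_dead++; }

/* approximate alive count; modular unsigned subtraction stays correct even
 * after either counter has wrapped, as long as the number of simultaneously
 * alive blocks fits in the counter width */
static unsigned int count_alive( const struct category_counters *c )
{
    return c->blocks_total - c->blocks_dead;
}

int main(void)
{
    /* counters that have already wrapped: total is 5, dead is UINT_MAX - 2,
     * which corresponds to 8 blocks currently alive under modular arithmetic */
    struct category_counters c = { 5, UINT_MAX - 2 };
    assert( count_alive( &c ) == 8 );

    count_alloc( &c );
    count_free( &c );
    assert( count_alive( &c ) == 8 ); /* still consistent after more traffic */
    return 0;
}
```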