On Thu Feb 12 21:25:02 2026 +0000, Paul Gofman wrote:
> Are there any measurements confirming that such a change gives a performance gain? If there are, that should be a kernel issue, or a host setup issue if the host uses some unusual NUMA defaults configuration. Single-threaded programs (and wineserver is single-threaded) should never need to care about NUMA locality: the kernel takes care of allocating memory on the node where the thread is currently running, and takes memory locality into account when making CPU scheduling decisions for the process or when migrating allocated memory (not bound to NUMA nodes) between NUMA nodes. Explicit NUMA node management only makes sense in multithreaded apps (while the rule of thumb for simple cases is just to allocate memory on the same thread that is going to primarily use it). But that is very specific to what the app is doing and usually only makes sense in combination with CPU pinning. I.e., it is mostly not applicable to Wine in general, because most of the time Wine can't control a thread's affinity and allocations; those are ultimately dictated by the app (apart from supporting the corresponding bits in the Nt memory allocation functions, which are currently not supported).

@gofman, numa_alloc() not only allocates memory on the node nearest to where the current thread is running, it also aligns memory blocks, because malloc() works very poorly on NUMA systems by default.
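
For reference, the rule of thumb quoted above (allocate memory on the thread that will primarily use it) can be sketched roughly as below. This is only an illustration, not code from Wine or from this merge request: `struct worker`, `worker_main` and `run_workers` are hypothetical names, and the sketch assumes Linux's default local allocation policy, under which a page is normally placed on the node of the CPU that first touches it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-worker state; not taken from Wine or the merge request. */
struct worker
{
    size_t size;       /* bytes this worker will use */
    uint64_t checksum; /* result reported back to the caller */
};

/* Each worker allocates and first-touches its own buffer. Under the default
 * local allocation policy (an assumption of this sketch) the pages are then
 * normally placed on the node the worker is running on at that moment. */
static void *worker_main(void *arg)
{
    struct worker *w = arg;
    unsigned char *buf;
    size_t i;

    assert(w != NULL);
    w->checksum = 0;
    buf = malloc(w->size);
    if (!buf) return NULL;

    memset(buf, 1, w->size); /* first touch happens here, on this thread */
    for (i = 0; i < w->size; i++) w->checksum += buf[i];

    free(buf);
    return NULL;
}

/* Runs `count` workers of `size` bytes each and returns the sum of their
 * checksums; workers that could not be started contribute nothing. */
static uint64_t run_workers(size_t count, size_t size)
{
    pthread_t *threads = calloc(count, sizeof(*threads));
    struct worker *workers = calloc(count, sizeof(*workers));
    uint64_t total = 0;
    size_t i, started = 0;

    if (threads && workers)
    {
        for (i = 0; i < count; i++)
        {
            workers[i].size = size;
            if (pthread_create(&threads[i], NULL, worker_main, &workers[i])) break;
            started++;
        }
        for (i = 0; i < started; i++)
        {
            pthread_join(threads[i], NULL);
            total += workers[i].checksum;
        }
    }
    free(threads);
    free(workers);
    return total;
}
```

Calling `run_workers(4, 1 << 20)` starts four threads that each allocate and fill their own 1 MiB buffer. No explicit NUMA API is involved; without CPU pinning the scheduler may still move a thread away from its memory later on.
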
About numa_alloc(): https://linux.die.net/man/3/numa_alloc

Detailed benchmarks of malloc() vs numa_alloc(): https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/issues/236

-- 
https://gitlab.winehq.org/wine/wine/-/merge_requests/10091#note_129553