These patches make a test case attached to the bug https://bugs.winehq.org/show_bug.cgi?id=33190 work.
--
v8: win32u: NtGdiExtTextOutW() should translate x,y from logical to device units at the last step.
win32u: Fix device<->world width/height converters.
win32u: Use slightly more readable names for DP/LP converters.
win32u: Use correct helper for converting width to device units.
gdi32/tests: Add some tests for rotated font metrics.
https://gitlab.winehq.org/wine/wine/-/merge_requests/5068
Patch 4/5 has `broken(1)` which is probably not very nice, but I don't know how to handle that otherwise. I would say it's a property that we want to enforce for our implementation; at the same time it breaks on older Windows builds without much logic, AFAICT. I would constrain the `broken()` on the Windows version, but my understanding is that this is not allowed.
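For readers unfamiliar with the pattern, here is a minimal sketch of what `ok(cond || broken(1), ...)` amounts to. The `sketch_*` helpers and the `running_on_wine` switch are stand-ins invented for this example, not the real harness from `include/wine/test.h`; the only behavior assumed is that `broken()` counts its condition on Windows and evaluates to false on Wine.

```c
#include <stdio.h>

/* Stand-ins for the Wine test harness, invented for this sketch. */
static int running_on_wine = 1;   /* hypothetical switch: 1 = Wine, 0 = Windows */
static int failures;

/* Mirrors the idea of broken(): the condition only counts on Windows. */
static int sketch_broken(int condition)
{
    return !running_on_wine && condition;
}

/* Mirrors the idea of ok(): record a failure when the condition is false. */
static void sketch_ok(int condition, const char *msg)
{
    if (!condition)
    {
        failures++;
        printf("Test failed: %s\n", msg);
    }
}

/* Written as ok(cond || broken(1)): the property is enforced on Wine,
 * while on Windows the check can never fail, whatever the version. */
static void check_property(int property_holds)
{
    sketch_ok(property_holds || sketch_broken(1), "property does not hold");
}
```

Under these assumptions, `broken(1)` disables the check on every Windows version at once, which is why constraining it to specific versions would be the more precise (if disallowed) alternative.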
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/8978
NtSetEventBoostPriority stub regression fixes for:
https://bugs.winehq.org/show_bug.cgi?id=58688
Regressing commit: ed9f31120b68e7d684c1544c05d94c38b25cb759
Other stubs were also changed by the regressing commit and may still be broken.
I'll test the other stubs and fix if necessary.
In the meantime I'm submitting this merge for review in case I missed anything that may be needed for the other stub fixes.
--
v4: ntdll: ZwSetEventBoostPriority typo correction for bug #58688
This merge request has too many patches to be relayed via email.
Please visit the URL below to see the contents of the merge request.
https://gitlab.winehq.org/wine/wine/-/merge_requests/8955
This merge request implements several NUMA functions previously stubbed in kernel32 and kernelbase, adds a basic NUMA node discovery/topology layer, and enriches the associated tests. It also improves the traceability of SetThreadGroupAffinity.
## Context / Motivation
Some Windows applications (game engines, middleware, runtimes) query the NUMA API to adapt memory allocation or thread distribution. The lack of an implementation returned errors (ERROR_CALL_NOT_IMPLEMENTED) or unhelpful values, which could degrade the internal heuristics of these programs. This first implementation provides:
- A logical topology derived from GetLogicalProcessorInformation.
- A reasonable approximation of available memory per node.
- Consistent processor masks for the present nodes.
It prepares for future optimizations (targeted memory allocation, better scheduling strategies) without modifying the existing behavior of generic allocations.
## Main Changes
- `kernel32/process.c`:
  - Implementation of GetNumaNodeProcessorMask, GetNumaAvailableMemoryNode / Ex, GetNumaProcessorNode / Ex, GetNumaProximityNode.
  - Parameter validation and consistent error propagation (ERROR_INVALID_PARAMETER).
- `kernelbase/memory.c`:
  - New NUMA infrastructure (topology cache, lazy initialization, dedicated critical lock).
  - Topology reading via GetLogicalProcessorInformation.
  - Runtime options via environment variables:
    - WINE_NUMA_FORCE_SINGLE: Force a single logical node.
    - WINE_NUMA_CONTIG: Remap masks to produce contiguous blocks.
  - Implementations of GetNumaHighestNodeNumber, GetNumaNodeProcessorMaskEx, GetNumaProximityNodeEx.
  - Robust fallback: if no NUMA info → single node.
- `kernelbase/thread.c`:
  - Added detailed traces in SetThreadGroupAffinity (removed the redundant DECLSPEC_HOTPATCH here).
- Tests (`dlls/kernel32/tests/process.c`):
  - Added a new test, test_NumaBasic, covering:
    - GetNumaHighestNodeNumber
    - GetNumaNodeProcessorMaskEx (nodes 0 and 1)
    - GetNumaProximityNodeEx
  - Tolerant behavior: accepts `ERROR_INVALID_FUNCTION` / `ERROR_INVALID_PARAMETER` depending on the platform.
- Added the `WINE_DEFAULT_DEBUG_CHANNEL(numa)` debug channel for the subsystem.
## Assumptions / Limitations
- Support for a single processor group (Group = 0) for now.
- Memory approximation: equal division of available physical memory (improvable later with internal counters per node).
- Proximity = node (simplistic direct mapping).
- No impact yet on VirtualAlloc / Heap allocation by node.
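The two approximations listed above (equal memory split, proximity equals node) can be illustrated with a small portable sketch. The function names, the return convention, and the `SKETCH_MAX_NODES` constant are hypothetical and are not taken from the patch; only the arithmetic reflects the description.

```c
#include <stdint.h>

#define SKETCH_MAX_NODES 64  /* bound quoted in this description */

/* Equal division of the available physical memory across all nodes.
 * Returns 1 on success, 0 for invalid arguments or an out-of-range node. */
static int sketch_node_available_memory(uint64_t total_available,
                                        unsigned int node_count,
                                        unsigned int node, uint64_t *bytes)
{
    if (!bytes || !node_count || node_count > SKETCH_MAX_NODES || node >= node_count)
        return 0;
    *bytes = total_available / node_count;
    return 1;
}

/* Proximity = node: the proximity id is used directly as the node number. */
static int sketch_proximity_to_node(unsigned int proximity_id,
                                    unsigned int node_count, unsigned int *node)
{
    if (!node || proximity_id >= node_count)
        return 0;
    *node = proximity_id;
    return 1;
}
```

With 8 GiB available and two nodes, each node would report 4 GiB regardless of where allocations actually landed, which is the coarseness noted in the limitations.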
## Security / Concurrency
- Initialization protected by dedicated critical section (numa_cs).
- Thread-safe lazy read.
- Table bounded to 64 nodes (historical Windows limit).
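As a rough illustration of the locking scheme, the sketch below caches a topology behind a lock and falls back to a single node when discovery yields nothing. It uses a pthread mutex as a portable stand-in for the critical section; the struct, the `sketch_discover()` stub, and all other names are assumptions made for this example, not code from the merge request.

```c
#include <pthread.h>
#include <stdint.h>

#define SKETCH_MAX_NODES 64

struct sketch_topology
{
    unsigned int node_count;
    uint64_t     masks[SKETCH_MAX_NODES];
};

static pthread_mutex_t sketch_lock = PTHREAD_MUTEX_INITIALIZER; /* plays the role of numa_cs */
static struct sketch_topology sketch_cache;
static int sketch_initialized;
static unsigned int sketch_init_calls; /* only here to observe that init runs once */

/* Hypothetical discovery step; returning 0 models "no NUMA info available". */
static int sketch_discover(struct sketch_topology *topo)
{
    (void)topo;
    return 0;
}

/* Lazy, lock-protected initialization; the cache is read-only afterwards. */
static const struct sketch_topology *sketch_get_topology(unsigned int cpu_count)
{
    pthread_mutex_lock(&sketch_lock);
    if (!sketch_initialized)
    {
        sketch_init_calls++;
        if (!sketch_discover(&sketch_cache) || !sketch_cache.node_count
            || sketch_cache.node_count > SKETCH_MAX_NODES)
        {
            /* Conservative fallback: one node whose mask covers all CPUs. */
            sketch_cache.node_count = 1;
            sketch_cache.masks[0] = cpu_count >= 64 ? ~(uint64_t)0
                                                    : (((uint64_t)1 << cpu_count) - 1);
        }
        sketch_initialized = 1;
    }
    pthread_mutex_unlock(&sketch_lock);
    return &sketch_cache;
}
```

Because the table is written only once under the lock and never modified afterwards, callers can read the returned pointer without holding the lock.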
## Compatibility Impact
- Improves compatibility with software probing the NUMA API.
- Low risk of regression: previously failed paths now return TRUE with consistent data.
- In case of topology collection failure → single-node fallback (conservative behavior).
## Validation / Tests
- New test_NumaBasic added and integrated into the process suite.
- Traces on the numa channel help diagnose topology detection.
- Invalid parameters tested (NULL, nodes out of range).
- Works in environments without real NUMA via fallback.
## Environment Variables (quick documentation)
- WINE_NUMA_FORCE_SINGLE=1: Forces a single node (mask covering all CPUs).
- WINE_NUMA_CONTIG=1: Reallocates compact bit blocks per node (useful if the topology returns sparse masks).
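A possible reading of these two options is sketched below in portable C. How the patch actually parses the variables is not shown here, so treating only the exact value `1` as enabled, and letting WINE_NUMA_FORCE_SINGLE take precedence over WINE_NUMA_CONTIG, are assumptions of this example; all `sketch_*` names are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: an option is enabled only when set to exactly "1". */
static int sketch_env_enabled(const char *name)
{
    const char *value = getenv(name);
    return value && !strcmp(value, "1");
}

static unsigned int sketch_popcount64(uint64_t mask)
{
    unsigned int count = 0;
    for (; mask; mask &= mask - 1) count++;  /* clear the lowest set bit */
    return count;
}

/* Rewrite each node mask as a contiguous run of bits, keeping the number
 * of CPUs per node. Assumes the input masks are disjoint. */
static void sketch_make_contiguous(uint64_t *masks, unsigned int node_count)
{
    unsigned int i, next_bit = 0;

    for (i = 0; i < node_count; i++)
    {
        unsigned int cpus = sketch_popcount64(masks[i]);
        uint64_t run = cpus >= 64 ? ~(uint64_t)0 : (((uint64_t)1 << cpus) - 1);

        masks[i] = next_bit >= 64 ? 0 : run << next_bit;
        next_bit += cpus;
    }
}

/* Apply the options to a discovered topology; returns the new node count. */
static unsigned int sketch_apply_options(uint64_t *masks, unsigned int node_count)
{
    unsigned int i;

    if (node_count && sketch_env_enabled("WINE_NUMA_FORCE_SINGLE"))
    {
        for (i = 1; i < node_count; i++) masks[0] |= masks[i];
        return 1;  /* single node whose mask covers all CPUs */
    }
    if (sketch_env_enabled("WINE_NUMA_CONTIG"))
        sketch_make_contiguous(masks, node_count);
    return node_count;
}
```

For example, two interleaved nodes with masks `0x5` and `0xA` would become `0x3` and `0xC` after remapping, which is the kind of change a profiling tool could find surprising.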
## Potential Next Steps (not included)
- Implement true memory tracking per node (via allocation hooks).
- Multi-group support (PROCESSOR_GROUP_INFO).
- Improved VirtualAllocExNuma / First-touch implementation.
- More accurate proximity-to-node mapping on complex NUMA platforms.
- Dedicated tests for environment variables.
## Potential Risks / Regressions
- Applications relying on the absence of an API may slightly change their strategy (low risk).
- Masks remapped with WINE_NUMA_CONTIG could surprise a profiling tool (opt-in option).
- Memory approximation too coarse for very fine-grained heuristics (no functional regression expected).
## Request for Review
- Verify logging conventions and TRACE_(numa) usage.
- Verify the relevance of removing DECLSPEC_HOTPATCH on SetThreadGroupAffinity (alignment with local conventions).
- Opinion on error granularity (ERROR_INVALID_PARAMETER vs. ERROR_INVALID_FUNCTION) for more accurate mimicry.
Once the kernel can handle those functions directly (e.g. in a NUMA module), we could use this implementation as a fallback when the kernel doesn't support NUMA natively (when the module cannot be loaded).
--
v2: kernelbase: Improve initialization of NUMA information to handle pathological cases
https://gitlab.winehq.org/wine/wine/-/merge_requests/8970
--
v2: dwrite: Reuse font set entries to return set instances for collections.
dwrite/tests: Add a small test for EUDC collection.
dwrite: Cache set elements for returned system sets.
dwrite: Remove nested structures in fontset entries.
dwrite: Simplify collection initialization helper.
dwrite: Create custom collections using font sets.
dwrite: Create both WWS and typographic system collections using system font set.
dwrite: Mark system font sets.
dwrite: Remove system collection marker.
dwrite: Check against local file loader in ConvertFontToLOGFONT().
dwrite/tests: Add some more tests for ConvertFontToLOGFONT().
https://gitlab.winehq.org/wine/wine/-/merge_requests/8961
Motivation is Ubisoft Connect, which calls IDWriteFontCollection::GetFontSet many times on the system font collection, and then GetMatchingFonts on those sets. Without a change like this, each set created by GetFontSet has a separate cache of `dwrite_fontset_entry`s, so the GetMatchingFonts calls rescan every font on the system (via dwritefontset_GetMatchingFonts -> fontset_entry_is_matching -> fontset_entry_get_property, which does not find cached properties and thus makes a new file stream).
Perhaps it makes sense to not have the `owns_entries` flag in all the dwrite_fontset initializers, and just set it as needed in IDWriteFontCollection::GetFontSet? (And perhaps to invert it to `unowned_entries` or something, so the calloc initialization sets the most common value?)
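To make the inverted-flag idea concrete, here is a small standalone sketch. The structs and functions are simplified stand-ins invented for illustration and do not mirror the real `dwrite_fontset` layout; the point is only that a zero-initialized `unowned_entries` gives "owns its entries" by default, so only the sharing path has to set the flag.

```c
#include <stdlib.h>

/* Simplified stand-ins; not the real dwrite structures. */
struct sketch_entry { int cached_property; };

struct sketch_fontset
{
    struct sketch_entry **entries;
    size_t count;
    int unowned_entries; /* 0 after calloc(): the set owns its entries */
};

static unsigned int sketch_entries_freed; /* bookkeeping for the example only */

/* Common case: nothing to set, calloc() already yields the owning state. */
static struct sketch_fontset *sketch_create_owning_set(size_t count)
{
    struct sketch_fontset *set = calloc(1, sizeof(*set));
    size_t i;

    if (!set) return NULL;
    if (!(set->entries = calloc(count ? count : 1, sizeof(*set->entries))))
    {
        free(set);
        return NULL;
    }
    for (i = 0; i < count; i++)
        set->entries[i] = calloc(1, sizeof(**set->entries));
    set->count = count;
    return set;
}

/* Sharing case: reuse the source entries (and their cached properties).
 * The source must outlive the view in this sketch. */
static struct sketch_fontset *sketch_create_view(const struct sketch_fontset *source)
{
    struct sketch_fontset *set = calloc(1, sizeof(*set));

    if (!set) return NULL;
    set->entries = source->entries;
    set->count = source->count;
    set->unowned_entries = 1;
    return set;
}

static void sketch_release_set(struct sketch_fontset *set)
{
    size_t i;

    if (!set) return;
    if (!set->unowned_entries)
    {
        for (i = 0; i < set->count; i++)
        {
            free(set->entries[i]);
            sketch_entries_freed++;
        }
        free(set->entries);
    }
    free(set);
}
```

A property cached through one set is then visible through every view of the same entries, which is the effect wanted for repeated GetFontSet calls on the system collection.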
--
v2: dwrite: Reuse entries when a font set is created from a collection.
https://gitlab.winehq.org/wine/wine/-/merge_requests/8906