Currently, the free list consists of a "small list" of linearly spaced
buckets for sizes below 256, and a "large list" which is manually split
into a few chunks.
This patch replaces it with a single log-linear policy, while expanding
the range the large list covers.
The old implementation had issues when a lot of large allocations
happened. In that case, all of the allocations went into the last
catch-all bucket of the "large list", with two consequences:
1. The linked list grew over time, causing search costs to skyrocket.
2. The first-fit allocation policy made fragmentation worse.
The new bucketing covers the entire range up to the point where we
start allocating large blocks, which bypass the free list entirely. It
also brings the allocation policy closer to best fit (although not
exactly), reducing fragmentation.
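As an illustration of the scheme, here is a minimal sketch with made-up
names; the exact identifiers and arithmetic in ntdll/heap.c may differ:

```c
/* Map an allocation size to a log-linear bucket index: sizes below
 * 2^LINEAR_BITS get one bucket each, and every power-of-two range above
 * that is split into 2^LINEAR_BITS linearly spaced buckets. */
#define LINEAR_BITS 3  /* 8 buckets per doubling, as in the final revision */
#define LINEAR_MASK ((1u << LINEAR_BITS) - 1)

static unsigned int bucket_from_size( size_t size )
{
    unsigned int msb, low;

    if (size < (1u << LINEAR_BITS)) return (unsigned int)size; /* purely linear range */
    msb = 63 - __builtin_clzll( size );                /* position of the highest set bit */
    low = (size >> (msb - LINEAR_BITS)) & LINEAR_MASK; /* next LINEAR_BITS bits below it */
    return ((msb - LINEAR_BITS + 1) << LINEAR_BITS) + low;
}
```

With LINEAR_BITS = 3, every doubling of the size range contributes eight
buckets, which is where the "8 buckets per doubling" figure below comes
from.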
The increased number of free lists does incur some cost when empty
lists have to be skipped over, but the improvement in allocation
performance outweighs it.
For future work, these ideas (mostly from glibc) might or might not
benefit performance:
- Use an exact best-fit allocation policy.
- Add a bitmap for the free lists, allowing empty lists to be skipped
with a single bit scan (see the sketch after this list).
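A hypothetical sketch of the bitmap idea (not part of this patch; the
names are made up):

```c
#include <stdint.h>

/* One bit per free-list bucket, set while the bucket is non-empty. */
static uint64_t freelist_bitmap;  /* assumes at most 64 buckets */

/* Find the first non-empty bucket at or above `first`, or -1 if the
 * request has to fall through to fresh heap memory. */
static int find_nonempty_bucket( unsigned int first )
{
    uint64_t candidates;

    if (first >= 64) return -1;
    candidates = freelist_bitmap & (~0ull << first);
    if (!candidates) return -1;
    return __builtin_ctzll( candidates );  /* single bit scan (GCC/Clang builtin) */
}
```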
As a benchmark, this patch drastically improves initial shader loading performance in Overwatch 2. In this workload, 78k shaders are passed to DXVK for DXBC -> SPIR-V translation, and for each shader a few allocations happen in the 4K – 100K range for the staging buffer.
Before this patch, malloc accounted for a whopping 43% of the overhead. The overhead with log-linear bucketing is drastically lower, resulting in a ~2x improvement in loading time.
The overhead for each `FREE_LIST_LINEAR_BITS` value is as follows:
- 0: 7.7%
- 1: 2.9%
- 2: 1.3%
- 3: 0.6%
Since performance seems to scale linearly with the number of buckets (up to the point I have tested), I've opted for 3 (8 buckets per doubling) in the current revision of the patch.
Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
--
v5: ntdll: Use log-linear bucketing for free lists.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2622
--
v2: imm32: Use INPUTCONTEXT directly in ImmSetConversionStatus.
imm32: Use INPUTCONTEXT directly in ImmGetConversionStatus.
imm32: Compare open status values in ImmSetOpenStatus.
imm32: Cache INPUTCONTEXT values for every IME.
imm32: Use INPUTCONTEXT directly in ImmSetOpenStatus.
imm32: Use INPUTCONTEXT directly in ImmGetOpenStatus.
imm32: Serialize ImeInquire / ImeDestroy calls.
imm32/tests: Cleanup the cross thread IMC tests.
imm32/tests: Reduce the number of IME installations.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2627
The main advantage is that this way we get valid DXBC checksums for DXBC blobs generated by d3dcompiler. See also https://bugs.winehq.org/show_bug.cgi?id=54464.
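For context, a minimal sketch of the kind of section lookup this enables,
assuming the vkd3d_shader_parse_dxbc() API from vkd3d_shader.h
(illustrative only, not the actual d3dcompiler code):

```c
#include <stdbool.h>
#include <vkd3d_shader.h>

/* Parse a DXBC container and return the data of the section matching
 * `tag`; the returned code pointer aliases the caller's blob.
 * Error handling is trimmed for brevity. */
static bool find_dxbc_section( const void *data, size_t size, uint32_t tag,
        struct vkd3d_shader_code *out )
{
    struct vkd3d_shader_code dxbc = { data, size };
    struct vkd3d_shader_dxbc_desc desc;
    bool found = false;
    size_t i;

    if (vkd3d_shader_parse_dxbc( &dxbc, 0, &desc, NULL ) < 0) return false;

    for (i = 0; i < desc.section_count; ++i)
    {
        if (desc.sections[i].tag != tag) continue;
        *out = desc.sections[i].data;
        found = true;
        break;
    }
    vkd3d_shader_free_dxbc( &desc );
    return found;
}
```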
--
v4: d3dcompiler: Use vkd3d_shader_parse_dxbc() in d3dcompiler_shader_reflection_init().
d3dcompiler: Use vkd3d_shader_parse_dxbc() in d3dcompiler_strip_shader().
d3dcompiler: Use vkd3d_shader_parse_dxbc() in d3dcompiler_get_blob_part().
https://gitlab.winehq.org/wine/wine/-/merge_requests/2577
--
v3: ntdll: Use log-linear bucketing for free lists.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2622
To simplify SM 6 support, insert a control point id relative address where needed, and declare control point phase inputs where missing.
--
v5: vkd3d-shader/ir: Insert hull shader control point input declarations if no control point phase is defined.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/141
The deleted test is a bad test: the behavior differs across drivers;
for example, the test fails on the Windows 10 and 11 testbot machines
that have AMD video cards. Furthermore, MSDN does not say that the
destination context "must not" have any display lists yet, but rather
that it "should not" have any.[1] The Khronos OpenGL Wiki similarly
advises against calling wglShareLists after the source or destination
context has at least one object, but it only says that "there is a
chance that wglShareLists will fail", not that it will necessarily
fail.[2] Since there is no clear "right" behavior here, we can adopt
the more permissive behavior that some programs expect, as long as it
doesn't corrupt the context.
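For illustration, the permissive pattern in question might look like the
following hypothetical sketch (error handling omitted; the real tests in
opengl32/tests are more thorough):

```c
#include <windows.h>
#include <GL/gl.h>

void share_after_creation( HDC hdc )
{
    HGLRC src = wglCreateContext( hdc );
    HGLRC dst = wglCreateContext( hdc );

    wglMakeCurrent( hdc, dst );
    glGenLists( 1 );                /* the destination context now owns an object */
    wglMakeCurrent( NULL, NULL );

    /* Sharing after the fact: MSDN says you "should not" do this, yet
     * some applications expect it to succeed anyway. */
    if (!wglShareLists( src, dst )) { /* handle failure */ }
}
```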
[1] https://learn.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-wglsha…
[2] https://www.khronos.org/opengl/wiki/Platform_specifics:_Windows#wglShareLis…
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=11436
--
v4: winex11: Allow replacing either context in wglShareLists.
opengl32/tests: Make the wglShareLists tests comprehensive.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2032
include: _InterlockedExchangePointer and _InterlockedCompareExchangePointer are intrinsics in x86 msvc.
I fixed this issue in ad05f33d67, but a40973f20 regressed it again. I had been carrying
a patch for quite a while, with a sense of déjà vu.
The _MSC_VER cutoff of 1900 is taken from Boost's interlocked.hpp, and matches MSVC 2015
(toolset version v140). Boost has a comment claiming that in MSVC 2012 those functions
were defined in intrin.h, but those definitions are broken in combination with
Microsoft's winnt.h.
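A sketch of the kind of guard this implies (illustrative; the actual
include/winnt.h change may differ):

```c
/* On x86, declare these as compiler intrinsics only for MSVC 2015
 * (_MSC_VER >= 1900) and later; older compilers instead get inline
 * helpers built on _InterlockedExchange / _InterlockedCompareExchange. */
#if defined(_MSC_VER) && _MSC_VER >= 1900 && defined(_M_IX86)

void *_InterlockedExchangePointer( void *volatile *target, void *value );
void *_InterlockedCompareExchangePointer( void *volatile *destination,
                                          void *exchange, void *comparand );
#pragma intrinsic(_InterlockedExchangePointer)
#pragma intrinsic(_InterlockedCompareExchangePointer)

#endif
```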
--
v2: include: x86 msvc has _InterlockedExchangePointer and _InterlockedCompareExchangePointer
https://gitlab.winehq.org/wine/wine/-/merge_requests/2591
--
v2: ntdll: Use log-linear bucketing for free lists.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2622