On Wed Apr 12 08:55:50 2023 +0000, Zhiyi Zhang wrote:
> Does the following help?
> ```
> [
> explicit_handle
> ]
> interface interface-name
> {
> ...
> }
> ```
> See https://learn.microsoft.com/en-us/windows/win32/midl/explicit-handle
> and https://learn.microsoft.com/en-us/windows/win32/rpc/explicit-binding-handles
I think in general, what you want is to add another call to run_client() with a new string, and handle that in client(). In general the flow should look pretty similar: create an RPC binding, run some tests with it [in this case you're using a different interface, so you don't want run_tests()]. You can duplicate the other tests while you're at it [authinfo, test_is_server_listening()] but it doesn't seem necessary.
On the server side, you'll need to register the new server interface as well, in server().
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/2305#note_29775
--
v2: mshtml/tests: Add tests for non-stringed url() with non-URL characters in CSS.
mshtml: Implement ProgressEvent's initProgressEvent method.
mshtml: Get rid of dispatch_nsevent_hook.
mshtml: Implement `complete` prop for input elements.
mshtml: Set dom.ipc.plugins.enabled to FALSE.
mshtml: Tell wine-gecko about the IE compat document mode.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2633
On Wed Apr 12 12:09:30 2023 +0000, **** wrote:
> Marvin replied on the mailing list:
> ```
> Hi,
> It looks like your patch introduced the new failures shown below.
> Please investigate and fix them before resubmitting your patch.
> If they are not new, fixing them anyway would help a lot. Otherwise
> please ask for the known failures list to be updated.
> The tests also ran into some preexisting test failures. If you know how
> to fix them that would be helpful. See the TestBot job for the details:
> The full results can be found at:
> https://testbot.winehq.org/JobDetails.pl?Key=131819
> Your paranoid android.
> === debian11 (32 bit report) ===
> ntdll:
> directory.c:164: Test failed: file L".": expected (null) (10), got 12
> directory.c:164: Test failed: file L"..": expected (null) (10), got 12
> === debian11 (32 bit zh:CN report) ===
> ntdll:
> directory.c:164: Test failed: file L".": expected (null) (10), got 12
> directory.c:164: Test failed: file L"..": expected (null) (10), got 12
> === debian11b (64 bit WoW report) ===
> ntdll:
> directory.c:164: Test failed: file L".": expected (null) (10), got 12
> directory.c:164: Test failed: file L"..": expected (null) (10), got 12
> ```
will fix
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/1148#note_29753
Currently, the free list consists of a "small list" for sizes below 256,
which are linearly spaced, and a "large list" which is manually split
into a few chunks.
This patch replaces it with a single log-linear policy, while expanding
the range the large list covers.
The old implementation had issues when a lot of large allocations
happened. In this case, all the allocations went in the last catch-all
bucket in the "large list", and what happens is:
1. The linked list grew in size over time, causing searching cost to
skyrocket.
2. With the first-fit allocation policy, fragmentation was also making
the problem worse.
The new bucketing covers the entire range up until we start allocating
large blocks, which will not enter the free list. It also makes the
allocation policy closer to best-fit (although not exactly), reducing
fragmentation.
The increase in number of free lists does incur some cost when it needs
to be skipped over, but the improvement in allocation performance
outweighs it.
For future work, these ideas (mostly from glibc) might or might not
benefit performance:
- Use an exact best-fit allocation policy.
- Add a bitmap for freelist, allowing empty lists to be skipped with a
single bit scan.
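As a rough illustration of the bitmap idea above (not code from this patch), here is a minimal sketch in C. It assumes the number of free lists fits in one 64-bit word; `NUM_LISTS`, `nonempty_map` and the helper names are hypothetical, and `__builtin_ctzll` is a GCC/Clang builtin rather than anything ntdll provides.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_LISTS 64  /* hypothetical: all lists tracked by a single 64-bit word */

/* Bit i is set when free list i holds at least one block. */
static uint64_t nonempty_map;

static void mark_nonempty(unsigned int index)
{
    assert(index < NUM_LISTS);
    nonempty_map |= (uint64_t)1 << index;
}

static void mark_empty(unsigned int index)
{
    assert(index < NUM_LISTS);
    nonempty_map &= ~((uint64_t)1 << index);
}

/* Return the first non-empty list at or above `start`, or -1 if there is none.
 * Empty lists are skipped with one mask and one bit scan instead of a loop. */
static int find_nonempty_list(unsigned int start)
{
    uint64_t candidates;

    assert(start < NUM_LISTS);
    candidates = nonempty_map & (~(uint64_t)0 << start);
    if (!candidates) return -1;
    return __builtin_ctzll(candidates);  /* index of the lowest set bit */
}
```

With more than 64 lists, the same approach would presumably need an array of words and a short loop over them.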
For the benchmark, this drastically improves initial shader loading performance in Overwatch 2. In this workload, 78k shaders are passed to DXVK for DXBC -> SPIRV translation, and for each shader a few allocations happen in the 4K – 100K range for the staging buffer.
Before this patch, malloc overhead was a whopping 43%. The overhead with log-linear bucketing is drastically lower, resulting in a ~2x improvement in loading time.
Overhead for each `FREE_LIST_LINEAR_BITS` is as below:
- 0: 7.7%
- 1: 2.9%
- 2: 1.3%
- 3: 0.6%
Since performance seems to scale linearly with the increase in buckets (up to the point I have tested), I've opted for 3 (8 buckets per doubling) in the current revision of the patch.
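To make "8 buckets per doubling" concrete, here is a minimal sketch of a log-linear index function in C. It is not the code from the patch: `LINEAR_BITS`, `highest_bit()` and `bucket_index()` are illustrative names, sizes are in arbitrary units, and the upper cut-off where large blocks bypass the free list is left out.

```c
#include <assert.h>
#include <stddef.h>

#define LINEAR_BITS 3  /* 2^3 = 8 buckets per doubling, the value chosen above */

/* floor(log2(x)) for x > 0. */
static unsigned int highest_bit(size_t x)
{
    unsigned int n = 0;
    while (x >>= 1) n++;
    return n;
}

/* Map a size to a bucket index: each power-of-two range [2^k, 2^(k+1))
 * is split into 2^LINEAR_BITS equally wide buckets. */
static unsigned int bucket_index(size_t size)
{
    unsigned int msb, sub;

    assert(size > 0);
    /* Sizes below 2^LINEAR_BITS get one bucket each. */
    if (size < ((size_t)1 << LINEAR_BITS)) return (unsigned int)size;

    msb = highest_bit(size);
    /* The LINEAR_BITS bits just below the top bit select the sub-bucket. */
    sub = (unsigned int)((size >> (msb - LINEAR_BITS)) & (((size_t)1 << LINEAR_BITS) - 1));
    return ((msb - LINEAR_BITS + 1) << LINEAR_BITS) + sub;
}
```

In this sketch the range 4096..8191 is covered by buckets 80 to 87, each 512 units wide, so a first-fit search within one bucket can only overshoot the request by less than one bucket width. That is the sense in which the policy gets closer to best-fit.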
Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki(a)gmail.com>
--
v7: ntdll: Use log-linear bucketing for free lists.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2622
Based on !2524.
This adds the remaining stubs needed for Crazy Machines 3 to work and decode preview images correctly in online mode.
--
v4: msvcr110: Implement _Context::_IsSynchronouslyBlocked.
msvcr110: Add _Context::_IsSynchronouslyBlocked stub.
msvcr110: Add _Cancellation_beacon::_Cancellation_beacon_dtor stub.
msvcr110: Add _Cancellation_beacon::_Cancellation_beacon_ctor stub.
https://gitlab.winehq.org/wine/wine/-/merge_requests/1979
Based on !2524.
This adds the remaining stubs needed for Crazy Machines 3 to work and decode preview images correctly in online mode.
--
v3: msvcr110: Implement _Context::_IsSynchronouslyBlocked.
msvcr110: Add _Context::_IsSynchronouslyBlocked stub.
msvcr110: Add _Cancellation_beacon::_Cancellation_beacon_dtor stub.
msvcr110: Add _Cancellation_beacon::_Cancellation_beacon_ctor stub.
https://gitlab.winehq.org/wine/wine/-/merge_requests/1979
* Fix InternetGetConnectedStateEx() parameter checking.
* InternetGetConnectedStateExA() must always null-terminate the state string.
* Dump the state string if it is not as expected.
* Remove a couple of redundant InternetGetConnectedStateEx*() tests.
* Avoid an unnecessary lstrlenW() call in internet.c.
--
v2: wininet/tests: Fix InternetGetConnectedStateEx() parameter checking.
wininet: InternetGetConnectedStateExA() must always null-terminate the state string.
wininet/tests: Dump the state string if it is not as expected.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2634
Fixes bug [53826](https://bugs.winehq.org/show_bug.cgi?id=53826).
--
v24: ntdll: Set xattr in NtCreateFile if inferred and requested attributes don't match.
ntdll: Only infer hidden attribute from file name if xattr is not present.
ntdll: Handle hidden file names inside get_file_info instead of after it.
ntdll/tests: Add test for file attributes of files with names beginning with a dot.
https://gitlab.winehq.org/wine/wine/-/merge_requests/1148
On Wed Apr 5 15:02:41 2023 +0000, Alexandre Julliard wrote:
> This doesn't make sense, it's using the wrong slashes and shouldn't be
> necessary in the first place.
Yup, this isn't necessary anymore and I forgot or accidentally dropped the change of the backslashes to forward slashes. The latest version of this MR now contains just a straightforward conversion of that function from working with nt paths to working with unix paths. The reason this conversion is necessary at all is because I'm moving the calls to this function to the `get_file_info` function, so that more places in the code use the correct file attributes (everything that uses `get_file_info` now automatically returns attributes of hidden files correctly), and so that the extended attribute can take precedence over the file name.
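The precedence described here can be sketched roughly as follows. This is not the ntdll code: `is_hidden_by_name()` and `file_is_hidden()` are hypothetical helpers, the xattr lookup is reduced to two booleans, and the name is assumed to be the last component of a Unix path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Infer "hidden" from a Unix file name: a leading dot, except for the
 * special entries "." and "..". */
static bool is_hidden_by_name(const char *name)
{
    assert(name != NULL);
    if (name[0] != '.') return false;
    if (!strcmp(name, ".") || !strcmp(name, "..")) return false;
    return true;
}

/* The extended attribute, when present, takes precedence over the name. */
static bool file_is_hidden(const char *name, bool has_xattr, bool xattr_hidden)
{
    if (has_xattr) return xattr_hidden;
    return is_hidden_by_name(name);
}
```

For example, `file_is_hidden(".bashrc", false, false)` is true, while `file_is_hidden(".bashrc", true, false)` is false because the stored attribute wins.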
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/1148#note_29732
To simplify SM 6 support, insert a control point id relative address where needed, and declare control point phase inputs where missing.
--
v6: vkd3d-shader/ir: Insert hull shader control point input declarations if no control point phase is defined.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/141
The D3D12 spec guarantees that lists submitted in ExecuteCommandLists()
will complete execution before any subsequent commands begin execution.
Based on a vkd3d-proton patch by Hans-Kristian Arntzen.
--
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/146
> The patch may fix an application, but that doesn't mean it's correct.
Yeah, I know that.
The problem is that my test doesn't just add a function to the existing interface; I need to add a new interface without 'implicit_handle'. I need to understand all of the rpcrt4 server test code, insert my interface in the correct place, and start the server and client at the right time. It's too hard; Visual Studio did this for me automatically.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/2305#note_29694
On Wed Apr 12 07:11:24 2023 +0000, Yeshun Ye wrote:
> I tried to add an interface like this:
> ```
> [
> uuid(00000000-4114-0704-2301-000000000002)
> ]
>
> interface IServer_Expilicit
> {
> void need_explicit_handle([in] handle_t binding, [in] unsigned int
> protseq);
> }
> ```
> I added the code above to the file 'server_interp.idl', the function
> code to 'server.c' like this:
> ```
> void __cdecl s_need_explicit_handle(RPC_BINDING_HANDLE binding, unsigned
> int protseq)
> {
> RPC_STATUS status;
> ULONG pid;
> winetest_push_context("%s", client_test_name);
> ok(binding != NULL, "Got unexpected binding\n");
> trace("s_need_explicit_handle\n");
> if (protseq == RPC_PROTSEQ_LRPC) /* Other protocol sequences throw
> exceptions */
> {
> trace("RPC_PROTSEQ_LRPC\n");
> status = I_RpcBindingInqLocalClientPID(binding, &pid);
> trace("I_RpcBindingInqLocalClientPID\n");
> ok(status == RPC_S_OK, "Got unexpected %ld.\n", status);
> ok(pid == client_info.dwProcessId, "Got unexpected pid.\n");
> trace("pid=%lx client_info.dwProcessId=%lx\n", pid, client_info.dwProcessId);
> }
> winetest_pop_context();
> }
> ```
> Then I don't know what to do next.
For a generic test case, you can see https://source.winehq.org/git/wine.git/commitdiff/056dbb04dea4d0b990e0177f6… for an example. However, in this case, you need to add a test case specifically for your changes. To be honest, I don't understand your patch, that's why I asked for a test case.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/2305#note_29693
On Wed Apr 12 06:56:41 2023 +0000, Zhiyi Zhang wrote:
> Could you be more specific about your question? You can paste your code here.
I tried to add an interface like this:
```
[
uuid(00000000-4114-0704-2301-000000000002)
]
interface IServer_Expilicit
{
void need_explicit_handle([in] handle_t binding, [in] unsigned int protseq);
}
```
I added the code above to the file 'server_interp.idl', the function code to 'server.c' like this:
```
void __cdecl s_need_explicit_handle(RPC_BINDING_HANDLE binding, unsigned int protseq)
{
RPC_STATUS status;
ULONG pid;
winetest_push_context("%s", client_test_name);
ok(binding != NULL, "Got unexpected binding\n");
trace("s_need_explicit_handle\n");
if (protseq == RPC_PROTSEQ_LRPC) /* Other protocol sequences throw exceptions */
{
trace("RPC_PROTSEQ_LRPC\n");
status = I_RpcBindingInqLocalClientPID(binding, &pid);
trace("I_RpcBindingInqLocalClientPID\n");
ok(status == RPC_S_OK, "Got unexpected %ld.\n", status);
ok(pid == client_info.dwProcessId, "Got unexpected pid.\n");
trace("pid=%lx client_info.dwProcessId=%lx\n", pid, client_info.dwProcessId);
}
winetest_pop_context();
}
```
Then I don't know what to do next.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/2305#note_29689
Needed for mingw Firefox build.
The WIDL error points to the wrong function, specifically the one after it.
include/windows.ui.composition.interop.idl:35:63: error: parameter 'swapchain' of function 'CreateCompositionSurfaceForHandle' cannot derive from void *
HRESULT CreateCompositionSurfaceForSwapChain([in] IUnknown *swapchain, [out, retval] ICompositionSurface **result);
^
make[1]: *** [Makefile:163749: include/windows.ui.composition.interop.h] Error 1
--
v4: include: Add windows.ui.composition.interop.idl file.
widl: Add support for WinRT HANDLE parameter type.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2620
On Wed Apr 12 05:38:56 2023 +0000, Yeshun Ye wrote:
> I'm sorry, I tried again but still failed. I just know that the
> interface should not add 'implicit_handle(handle_t IXXX_HANDLE)' after
> 'uuid(IXXX_UUID)'.
> Can you help me add a simple interface in the testcases and tell me how
> to call the interface function? Otherwise, all I can do is give up.
Could you be more specific about your question? You can paste your code here.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/2305#note_29687
On Wed Apr 12 05:47:11 2023 +0000, Huw Davies wrote:
> There's too much going on in this commit / MR. The changes to the
> drivers seem to be addressing a bug rather than implementing the
> Endpoint Volume API.
> Also, note that the values passed to the non-`Scalar` API are in
> decibels, the `Scalar` API appears to be somewhere between linear and
> logarithmic, and the `ISimpleAudioVolume` values are linear.
> Does your app call the `Scalar` versions of the API? If so, a good
> start would be to write some tests to figure out the mapping between
> `Scalar` and non-`Scalar` by e.g. setting using one and retrieving that
> value using the other.
Yes, I'm fixing a bug in an application, not implementing the Endpoint Volume API.
Do I really need to write some tests?
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/2565#note_29674
On Fri Apr 7 10:56:17 2023 +0000, Huw Davies wrote:
> Did these failures get addressed? I don't see testbot runs for the
> later versions of the commits in this MR.
No, I think the function will never return 'MIXERR_INVALCONTROL', so I just ran the test again without any changes, and it passed. These failures may be caused by a TestBot bug.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/2545#note_29672
On Wed Apr 12 05:38:56 2023 +0000, Zhiyi Zhang wrote:
> Hi Yeshun, you don't need to regularly rebase the MR on top of the
> latest master. Because this area of code is largely undocumented, to
> help this MR get in, you need to add a test, even if it might not be
> easy. I think you can use some of the example code in NCALRPC_example. Thanks.
I'm sorry, I tried again but still failed. I just know that the interface should not add 'implicit_handle(handle_t IXXX_HANDLE)' after 'uuid(IXXX_UUID)'.
Can you help me add a simple interface in the testcases and tell me how to call the interface function? Otherwise, all I can do is give up.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/2305#note_29671
* Fix InternetGetConnectedStateEx() parameter checking.
* InternetGetConnectedStateExA() must always null-terminate the state string.
* Dump the state string if it is not as expected.
* Remove a couple of redundant InternetGetConnectedStateEx*() tests.
* Avoid an unnecessary lstrlenW() call in internet.c.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/2634
Needed for mingw Firefox build.
The WIDL error points to the wrong function, specifically the one after it.
include/windows.ui.composition.interop.idl:35:63: error: parameter 'swapchain' of function 'CreateCompositionSurfaceForHandle' cannot derive from void *
HRESULT CreateCompositionSurfaceForSwapChain([in] IUnknown *swapchain, [out, retval] ICompositionSurface **result);
^
make[1]: *** [Makefile:163749: include/windows.ui.composition.interop.h] Error 1
--
v3: include: Add windows.ui.composition.interop.idl file.
widl: Add support for WinRT HANDLE parameter type.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2620
The app I'm considering opens a video_processor on its own, with
an NV12 format on input and an ARGB32 format on output.
Tested on Windows: the samples are flipped vertically, while Wine
keeps them untouched.
So I added a videoflip in the video processor, to be activated when needed.
Current activation is based on RGB vs non-RGB input/output formats.
Set as draft as it is somehow related to MR!2159.
Comments welcome.
Signed-off-by: Eric Pouech <epouech(a)codeweavers.com>
--
v3: winegstreamer: In video_processor, activate a videoflip converter.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2471
Zebediah Figura (@zfigura) commented about dlls/ntoskrnl.exe/ntoskrnl.c:
> irp->Tail.Overlay.Thread = (PETHREAD)KeGetCurrentThread();
> irp->Tail.Overlay.OriginalFileObject = file;
> irp->RequestorMode = UserMode;
> + HeapFree( GetProcessHeap(), 0, context->in_buff );
> context->in_buff = NULL;
I don't think we need to be deallocating the input buffer; we're not using it. Rather we should just remove the assignment to NULL.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/2439#note_29633
--
v2: winepulse: Use mmdevdrv structs from mmdevapi.
wineoss: Use mmdevdrv structs from mmdevapi.
winecoreaudio: Use mmdevdrv structs from mmdevapi.
winealsa: Move common mmdevdrv structs into mmdevapi.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2626
This fixes a bug when the session topology contains an invalid
source, which makes the session thread hang and stop executing
commands.
--
v6: mf/session: Handle error when a source fails to start.
mf/session: Handle errors when subscribing to source's events.
mf/tests: Test media session error handling.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2496
Today, the test scenario "ACTCTX_FLAG_HMODULE_VALID but hModule is not
set" is broken and unreliable. This problem is not evident in WineHQ
batch test runs; rather, the test failure seems to only be triggered
when the kernel32:actctx test is run in isolation.
When the flag ACTCTX_FLAG_HMODULE_VALID is specified in ACTCTX but
hModule is set to NULL, CreateActCtxW() may encounter different failure
modes depending on the test executable file. Error codes observed so
far include ERROR_SXS_CANT_GEN_ACTCTX and ERROR_SXS_MANIFEST_TOO_BIG.
It appears that the inconsistent failure was caused by Windows trying to
interpret the main executable file of the current process as an XML
manifest file. This fails due to one or more of the following reasons:
- A valid PE executable that starts with the "MZ" signature is not a
valid XML file.
- The executable's size may exceed the limit imposed by the manifest
parser. This is much more likely for binaries with debugging symbols.
Meanwhile, winetest.exe bundles a stripped version of the test
executable (kernel32_test-stripped.exe), which is often smaller than
the original executable (not stripped). This probably explains why
the problem was not visible in batch test runs.
Fix this by changing the FullDllName of the main executable module's
LDR_DATA_TABLE_ENTRY to the pathname of a temporary manifest file (valid
or invalid) before testing. The testing is performed in a child
process, since "corrupting" the internal state of a main test process
is not desirable for achieving deterministic and reliable tests.
Blocks !2555.
--
v4: kernel32/tests: Fix test for ACTCTX_FLAG_HMODULE_VALID with hModule = NULL case.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2617
Currently, the free list consists of a "small list" for sizes below 256,
which are linearly spaced, and a "large list" which is manually split
into a few chunks.
This patch replaces it with a single log-linear policy, while expanding
the range the large list covers.
The old implementation had issues when a lot of large allocations
happened. In this case, all the allocations went in the last catch-all
bucket in the "large list", and what happens is:
1. The linked list grew in size over time, causing searching cost to
skyrocket.
2. With the first-fit allocation policy, fragmentation was also making
the problem worse.
The new bucketing covers the entire range up until we start allocating
large blocks, which will not enter the free list. It also makes the
allocation policy closer to best-fit (although not exactly), reducing
fragmentation.
The increase in number of free lists does incur some cost when it needs
to be skipped over, but the improvement in allocation performance
outweighs it.
For future work, these ideas (mostly from glibc) might or might not
benefit performance:
- Use an exact best-fit allocation policy.
- Add a bitmap for freelist, allowing empty lists to be skipped with a
single bit scan.
For the benchmark, this drastically improves initial shader loading performance in Overwatch 2. In this workload, 78k shaders are passed to DXVK for DXBC -> SPIRV translation, and for each shader a few allocations happen in the 4K – 100K range for the staging buffer.
Before this patch, malloc overhead was a whopping 43%. The overhead with log-linear bucketing is drastically lower, resulting in a ~2x improvement in loading time.
Overhead for each `FREE_LIST_LINEAR_BITS` is as below:
- 0: 7.7%
- 1: 2.9%
- 2: 1.3%
- 3: 0.6%
Since performance seems to scale linearly with the increase in buckets (up to the point I have tested), I've opted for 3 (8 buckets per doubling) in the current revision of the patch.
Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki(a)gmail.com>
--
v6: ntdll: Use log-linear bucketing for free lists.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2622
Currently, the free list consists of a "small list" for sizes below 256,
which are linearly spaced, and a "large list" which is manually split
into a few chunks.
This patch replaces it with a single log-linear policy, while expanding
the range the large list covers.
The old implementation had issues when a lot of large allocations
happened. In this case, all the allocations went in the last catch-all
bucket in the "large list", and what happens is:
1. The linked list grew in size over time, causing searching cost to
skyrocket.
2. With the first-fit allocation policy, fragmentation was also making
the problem worse.
The new bucketing covers the entire range up until we start allocating
large blocks, which will not enter the free list. It also makes the
allocation policy closer to best-fit (although not exactly), reducing
fragmentation.
The increase in number of free lists does incur some cost when it needs
to be skipped over, but the improvement in allocation performance
outweighs it.
For future work, these ideas (mostly from glibc) might or might not
benefit performance:
- Use an exact best-fit allocation policy.
- Add a bitmap for freelist, allowing empty lists to be skipped with a
single bit scan.
For the benchmark, this drastically improves initial shader loading performance in Overwatch 2. In this workload, 78k shaders are passed to DXVK for DXBC -> SPIRV translation, and for each shader a few allocations happen in the 4K – 100K range for the staging buffer.
Before this patch, malloc overhead was a whopping 43%. The overhead with log-linear bucketing is drastically lower, resulting in a ~2x improvement in loading time.
Overhead for each `FREE_LIST_LINEAR_BITS` is as below:
- 0: 7.7%
- 1: 2.9%
- 2: 1.3%
- 3: 0.6%
Since performance seems to scale linearly with the increase in buckets (up to the point I have tested), I've opted for 3 (8 buckets per doubling) in the current revision of the patch.
Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki(a)gmail.com>
--
v5: ntdll: Use log-linear bucketing for free lists.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2622
--
v2: imm32: Use INPUTCONTEXT directly in ImmSetConversionStatus.
imm32: Use INPUTCONTEXT directly in ImmGetConversionStatus.
imm32: Compare open status values in ImmSetOpenStatus.
imm32: Cache INPUTCONTEXT values for every IME.
imm32: Use INPUTCONTEXT directly in ImmSetOpenStatus.
imm32: Use INPUTCONTEXT directly in ImmGetOpenStatus.
imm32: Serialize ImeInquire / ImeDestroy calls.
imm32/tests: Cleanup the cross thread IMC tests.
imm32/tests: Reduce the number of IME installations.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2627
The main advantage is that this way we're getting valid DXBC checksums for DXBC blobs generated by d3dcompiler. See also https://bugs.winehq.org/show_bug.cgi?id=54464.
--
v4: d3dcompiler: Use vkd3d_shader_parse_dxbc() in d3dcompiler_shader_reflection_init().
d3dcompiler: Use vkd3d_shader_parse_dxbc() in d3dcompiler_strip_shader().
d3dcompiler: Use vkd3d_shader_parse_dxbc() in d3dcompiler_get_blob_part().
https://gitlab.winehq.org/wine/wine/-/merge_requests/2577
Currently, the free list consists of a "small list" for sizes below 256,
which are linearly spaced, and a "large list" which is manually split
into a few chunks.
This patch replaces it with a single log-linear policy, while expanding
the range the large list covers.
The old implementation had issues when a lot of large allocations
happened. In this case, all the allocations went in the last catch-all
bucket in the "large list", and what happens is:
1. The linked list grew in size over time, causing searching cost to
skyrocket.
2. With the first-fit allocation policy, fragmentation was also making
the problem worse.
The new bucketing covers the entire range up until we start allocating
large blocks, which will not enter the free list. It also makes the
allocation policy closer to best-fit (although not exactly), reducing
fragmentation.
The increase in number of free lists does incur some cost when it needs
to be skipped over, but the improvement in allocation performance
outweighs it.
For future work, these ideas (mostly from glibc) might or might not
benefit performance:
- Use an exact best-fit allocation policy.
- Add a bitmap for freelist, allowing empty lists to be skipped with a
single bit scan.
For the benchmark, this drastically improves initial shader loading performance in Overwatch 2. In this workload, 78k shaders are passed to DXVK for DXBC -> SPIRV translation, and for each shader a few allocations happen in the 4K – 100K range for the staging buffer.
Before this patch, malloc overhead was a whopping 43%. The overhead with log-linear bucketing is drastically lower, resulting in a ~2x improvement in loading time.
Overhead for each `FREE_LIST_LINEAR_BITS` is as below:
- 0: 7.7%
- 1: 2.9%
- 2: 1.3%
- 3: 0.6%
Since performance seems to scale linearly with the increase in buckets (up to the point I have tested), I've opted for 3 (8 buckets per doubling) in the current revision of the patch.
Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki(a)gmail.com>
--
v3: ntdll: Use log-linear bucketing for free lists.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2622
To simplify SM 6 support, insert a control point id relative address where needed, and declare control point phase inputs where missing.
--
v5: vkd3d-shader/ir: Insert hull shader control point input declarations if no control point phase is defined.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/141
The deleted test is a bad test: The behavior is different on different
drivers, for example, the test fails on the Windows 10 and 11 testbot
machines that have AMD video cards. Furthermore, MSDN does not say that
the destination context "must not" have any display lists yet but rather
that it "should not" have any.[1] The Khronos OpenGL Wiki similarly
advises against calling wglShareLists after the source or destination
context has at least one object, but if you do, it only says that "there
is a chance that wglShareLists will fail", not that it will necessarily
fail.[2] Since there's no clear "right" behavior here, we can adopt the
more permissive behavior that some programs expect, as long as it
doesn't corrupt the context.
[1] https://learn.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-wglsha…
[2] https://www.khronos.org/opengl/wiki/Platform_specifics:_Windows#wglShareLis…
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=11436
--
v4: winex11: Allow replacing either context in wglShareLists.
opengl32/tests: Make the wglShareLists tests comprehensive.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2032
include: _InterlockedExchangePointer and _InterlockedCompareExchangePointer are intrinsics in x86 msvc.
I fixed this issue in ad05f33d67, but a40973f20 regressed it again. I have been carrying a
patch for quite a while; it feels like déjà vu.
The MSVC version of 1900 is taken from Boost's interlocked.hpp, which matches MSVC 2015
(toolset version v140). Boost has a comment that claims that in msvc 2012 those
functions were defined in intrin.h, but those defines are broken with Microsoft's
winnt.h.
--
v2: include: x86 msvc has _InterlockedExchangePointer and _InterlockedCompareExchangePointer
https://gitlab.winehq.org/wine/wine/-/merge_requests/2591
Currently, the free list consists of a "small list" for sizes below 256,
which are linearly spaced, and a "large list" which is manually split
into a few chunks.
This patch replaces it with a single log-linear policy, while expanding
the range the large list covers.
The old implementation had issues when a lot of large allocations
happened. In this case, all the allocations went in the last catch-all
bucket in the "large list", and what happens is:
1. The linked list grew in size over time, causing searching cost to
skyrocket.
2. With the first-fit allocation policy, fragmentation was also making
the problem worse.
The new bucketing covers the entire range up until we start allocating
large blocks, which will not enter the free list. It also makes the
allocation policy closer to best-fit (although not exactly), reducing
fragmentation.
The increase in number of free lists does incur some cost when it needs
to be skipped over, but the improvement in allocation performance
outweighs it.
For future work, these ideas (mostly from glibc) might or might not
benefit performance:
- Use an exact best-fit allocation policy.
- Add a bitmap for freelist, allowing empty lists to be skipped with a
single bit scan.
Signed-off-by: Tatsuyuki Ishi <ishitatsuyuki(a)gmail.com>
--
v2: ntdll: Use log-linear bucketing for free lists.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2622
Needed for mingw Firefox build.
The WIDL error points to the wrong function, specifically the one after it.
include/windows.ui.composition.interop.idl:35:63: error: parameter 'swapchain' of function 'CreateCompositionSurfaceForHandle' cannot derive from void *
HRESULT CreateCompositionSurfaceForSwapChain([in] IUnknown *swapchain, [out, retval] ICompositionSurface **result);
^
make[1]: *** [Makefile:163749: include/windows.ui.composition.interop.h] Error 1
--
v2: include: Add windows.ui.composition.interop.idl file.
widl: Add support for WinRT HANDLE parameter type.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2620
To simplify SM 6 support, insert a control point id relative address where needed, and declare control point phase inputs where missing.
--
v4: vkd3d-shader/ir: Insert hull shader control point input declarations if no control point phase is defined.
vkd3d-shader/ir: Normalise control point phase output registers to include the control point id.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/141
To simplify SM 6 support, insert a control point id relative address where needed, and declare control point phase inputs where missing.
--
v3: vkd3d-shader/ir: Insert hull shader control point input declarations if no control point phase is defined.
vkd3d-shader/ir: Normalise control point phase output registers to include the control point id.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/141
To simplify SM 6 support, insert a control point id relative address where needed, and declare control point phase inputs where missing.
--
v2: vkd3d-shader/ir: Insert hull shader control point input declarations if no control point phase is defined.
vkd3d-shader/ir: Normalise control point phase output registers to include the control point id.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/141
--
v5: vkd3d: Pass an offset and size to d3d12_heap_unmap() in d3d12_resource_WriteToSubresource().
vkd3d: Call vkFlushMappedMemoryRanges() when unmapping of a heap is requested.
vkd3d: Pass an offset and size to d3d12_heap_map() in d3d12_resource_ReadFromSubresource().
vkd3d: Call vkInvalidateMappedMemoryRanges() when a mapping is requested on a heap.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/126
Fixes Combat Mission: Battle for Normandy (GL game) failing to initialize GL on start. Looks like the game depends specifically on cAlphaShift being 24 (seemingly without any other prior checks, simply surfing through gdi32.DescribePixelFormat).
The shifts in winex11.drv are obviously wrong currently: the alpha channel doesn't precede BGR in BGRA format (which is also confirmed by the test). cAlphaShift is a bit trickier though, which is marked as unsupported on Deck. Here on a real AMD desktop it is 24 (and alpha bits are 8). That's not the case on one TestBot machine though (win11_nv64), where both cAlphaShift and cAlphaBits are 0. So I made the test accept that case as well.
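For reference, here is a small sketch in C of what the shifts mean for a 32-bit BGRA pixel, where blue sits in the lowest byte and alpha in the highest. The `channel_layout` struct and helper names are made up for illustration and are not the PIXELFORMATDESCRIPTOR fields themselves. Only the alpha values (8 bits, shift 24) come from the description above; the B/G/R shifts follow from the BGRA byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a bits/shift pair such as cAlphaBits/cAlphaShift. */
struct channel_layout
{
    unsigned char bits;
    unsigned char shift;
};

/* 32-bit BGRA viewed as one little-endian word: B, G, R, then A on top. */
static const struct channel_layout blue = {8, 0}, green = {8, 8},
                                   red = {8, 16}, alpha = {8, 24};

/* Place a channel value at its position inside the pixel word. */
static uint32_t pack_channel(struct channel_layout c, uint32_t value)
{
    assert(c.bits > 0 && c.bits < 32);
    return (value & (((uint32_t)1 << c.bits) - 1)) << c.shift;
}

/* Read a channel value back out of the pixel word. */
static uint32_t extract_channel(struct channel_layout c, uint32_t pixel)
{
    assert(c.bits > 0 && c.bits < 32);
    return (pixel >> c.shift) & (((uint32_t)1 << c.bits) - 1);
}
```

With this layout, a pixel with A=0x80, R=0x11, G=0x22, B=0x33 packs to 0x80112233. An application that hardcodes a shift of 24 for alpha would read the wrong byte if the driver reported alpha at shift 0.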
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/2625
This is required by https://bugs.winehq.org/show_bug.cgi?id=54660 .
--
v2: vkd3d-shader/hlsl: Error out when a semantic is used with multiple types.
vkd3d-shader/hlsl: Error out when an output semantic is used more than once.
vkd3d-shader/hlsl: Don't create semantic vars more than once.
vkd3d-shader/hlsl: Report missing semantics in struct fields.
vkd3d-shader/hlsl: Move get_array_size() and get_array_type() to hlsl.c.
vkd3d-shader/hlsl: Support semantics for array types.
tests: Test array types with semantics.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/148
This matches the declaration with official documentation and fixes
the following two compiler errors in the apitrace project.
d3d9trace.cpp:28469:59: error: invalid conversion from 'const DXVA2_ConfigPictureDecode*' to 'DXVA2_ConfigPictureDecode*' [-fpermissive]
d3d9trace.cpp:28194:65: error: invalid conversion from 'void*' to 'IUnknown*' [-fpermissive]
--
v2: include: Fix IDirectXVideoDecoderService declaration in dxva2api.idl.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2624
This matches the declaration with official documentation and fixes
the following two compiler errors in the apitrace project.
d3d9trace.cpp:28469:59: error: invalid conversion from 'const DXVA2_ConfigPictureDecode*' to 'DXVA2_ConfigPictureDecode*' [-fpermissive]
d3d9trace.cpp:28194:65: error: invalid conversion from 'void*' to 'IUnknown*' [-fpermissive]
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/2624