This behaviour is expected by games such as Borderlands 3.
--
v5: mfplay/tests: Add tests for MF_SD_LANGUAGE.
winegstreamer: Map MF_SD_LANGUAGE to ISO 639-1 for QuickTime media.
https://gitlab.winehq.org/wine/wine/-/merge_requests/1662
This behaviour is expected by games such as Borderlands 3.
--
v4: mfplay/tests: Add tests for MF_SD_LANGUAGE.
winegstreamer: Map MF_SD_LANGUAGE to ISO 639-1 for QuickTime media.
https://gitlab.winehq.org/wine/wine/-/merge_requests/1662
We are currently not initializing static values to zero by default.
Consider the following shader:
```hlsl
static float4 va;
float4 main() : sv_target
{
return va;
}
```
With the current code, we get the following output:
```
ps_5_0
dcl_output o0.xyzw
dcl_temps 2
mov r0.xyzw, r1.xyzw
mov o0.xyzw, r0.xyzw
ret
```
where r1.xyzw is not initialized.
This patch solves this by assigning the static variable the value of a
uint 0, thus relying on complex broadcasts.
This seems to be the behaviour of the native compiler, since it reports
the following error on a shader that lacks an initializer on a data type with
object components:
```
error X3017: cannot convert from 'uint' to 'struct <unnamed>'
```
--
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/54
It turns out that WinHttpReceiveResponse() completes synchronously in async mode (unless a recursive request for handling authorization or a redirect is involved). Some apps depend on that and do not wait for WINHTTP_CALLBACK_STATUS_HEADERS_AVAILABLE; they call WinHttpQueryHeaders() or WinHttpWebSocketCompleteUpgrade() right after WinHttpReceiveResponse() and rely on it having finished synchronously.
My initial out-of-tree testing shows that no network communication is performed during the WinHttpReceiveResponse() call (when a recursive request is not involved). I tested that by inserting a wait between WinHttpSendRequest and WinHttpReceiveResponse and disabling the network connection during the wait. WinHttpReceiveResponse still succeeds on Windows.
I think the above means that the response is actually received from the server during WinHttpSendRequest. WinHttpReceiveResponse is not a complete no-op, however. As shown by the existing tests, the notifications related to receiving the response are still delivered during WinHttpReceiveResponse (albeit in the same thread). WinHttpReceiveResponse also affects the request state: querying headers or upgrading to a websocket without calling WinHttpReceiveResponse does not succeed.
When a redirect is involved, the WINHTTP_CALLBACK_STATUS_RECEIVING_RESPONSE, WINHTTP_CALLBACK_STATUS_RESPONSE_RECEIVED and WINHTTP_CALLBACK_STATUS_REDIRECT notifications are all delivered synchronously from the calling thread. Then the send notifications and response receiving notifications for the new request are delivered from the new thread.
An interesting case is when WinHttpReceiveResponse is called from SendRequest callbacks in async mode. If it is called from WINHTTP_CALLBACK_STATUS_SENDING_REQUEST or WINHTTP_CALLBACK_STATUS_REQUEST_SENT (i.e., when the request is not complete yet), WinHttpReceiveResponse() surprisingly succeeds and shows the following message sequence (which is partially reflected in the tests I am adding):
- calling WinHttpReceiveResponse from WINHTTP_CALLBACK_STATUS_SENDING_REQUEST (which is already called on the async thread on Win10, thread A). Win8 queues WINHTTP_CALLBACK_STATUS_SENDING_REQUEST synchronously and goes async a bit later.
- WINHTTP_CALLBACK_STATUS_RECEIVING_RESPONSE, thread A;
- WinHttpReceiveResponse() returns to the caller WINHTTP_CALLBACK_STATUS_REQUEST_SENT callback; returning from user callback;
- WINHTTP_CALLBACK_STATUS_REQUEST_SENT, thread A;
- WINHTTP_CALLBACK_STATUS_SENDREQUEST_COMPLETE (another thread, although the sequence is probably synced; I am not implementing this part and calling this callback from the same thread A);
- WINHTTP_CALLBACK_STATUS_RESPONSE_RECEIVED in thread A;
- WINHTTP_CALLBACK_STATUS_HEADERS_AVAILABLE in thread A.
So the receive_response() state RECEIVE_RESPONSE_SEND_INCOMPLETE is primarily needed to handle this case.
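To make the ordering above easier to follow, here is a toy model of it. This is only a sketch under my own assumptions: `simulateRequest`, `api.receiveResponse` and the plain string status names are hypothetical and not WinHTTP or Wine API, threads are not modelled at all, and the only thing taken from the description is the name of the RECEIVE_RESPONSE_SEND_INCOMPLETE state and the notification order.

```javascript
// Toy, single-threaded model of the notification order described above.
// All names here are hypothetical; this is not the winhttp implementation.
function simulateRequest(onStatus) {
    var log = [];
    var req = { sendComplete: false, receiveState: "NONE" };

    var api = {
        // Stand-in for the user calling WinHttpReceiveResponse().
        receiveResponse: function () {
            if (!req.sendComplete) {
                // Called from a send callback: remember the call, announce
                // RECEIVING_RESPONSE now and finish once the send completes.
                req.receiveState = "RECEIVE_RESPONSE_SEND_INCOMPLETE";
                notify("RECEIVING_RESPONSE");
                return true;
            }
            finishReceive(true);
            return true;
        }
    };

    // Record a notification and hand it to the user callback.
    function notify(status) {
        log.push(status);
        onStatus(status, api);
    }

    // Deliver the remaining receive notifications.
    function finishReceive(announce) {
        if (announce) notify("RECEIVING_RESPONSE");
        notify("RESPONSE_RECEIVED");
        notify("HEADERS_AVAILABLE");
        req.receiveState = "DONE";
    }

    notify("SENDING_REQUEST");
    notify("REQUEST_SENT");
    req.sendComplete = true;
    notify("SENDREQUEST_COMPLETE");
    if (req.receiveState === "RECEIVE_RESPONSE_SEND_INCOMPLETE")
        finishReceive(false);
    return log;
}

// The user calls receiveResponse() from the SENDING_REQUEST callback.
var earlyLog = simulateRequest(function (status, api) {
    if (status === "SENDING_REQUEST") api.receiveResponse();
});
console.log(earlyLog.join(" -> "));
```

In this model the early call only records the pending state and emits RECEIVING_RESPONSE; the remaining receive notifications follow SENDREQUEST_COMPLETE, which matches the sequence listed above.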
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/1582
I'm not sure what the best way of registering the print processor is. The alternative is to register winprint in winspool when a driver is added.
--
v2: winprint: Register winprint print processor.
https://gitlab.winehq.org/wine/wine/-/merge_requests/1685
Implement a basic GC based on the mark-and-sweep algorithm, without requiring manually specifying "roots", which vastly simplifies the management. For now, it is triggered every 30 seconds since it last finished, on a new object initialization. Better heuristics could be used in the future.
The comments in the code should hopefully explain the high-level logic of this approach without boilerplate details. I've tested it on the FFXIV launcher (along with other patches from Proton to make it work) and it stops the massive memory leak successfully by itself, so at least it does its job properly. The last patch in the MR is just an optimization for a *very* common case.
For artificial testing, one could use something like:
```javascript
function leak() {
var a = {}, b = {};
a.b = b;
b.a = a;
}
```
which creates a circular ref and will leak when the function returns.
It also introduces and makes use of a "heap_stack", which prevents stack overflows on long chains.
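For readers unfamiliar with collecting cycles without explicit roots, here is a toy sketch of one well-known way to do it over reference-counted objects: subtract the references that come from other tracked objects, treat whatever still has a positive count as externally held, then mark from those. It is only an illustration; `makeObj`, `addRef` and `collect` are made-up names, and the actual jscript code may be structured quite differently. The plain array used as a stack stands in for the role the "heap_stack" plays, avoiding deep recursion on long chains.

```javascript
// Hypothetical object model: `ref` is the total reference count and
// `children` lists the objects this object holds references to.
function makeObj(name) {
    return { name: name, ref: 0, children: [], gcRefs: 0, marked: false };
}

// Add a reference to `to`; pass null as `from` for an external reference.
function addRef(from, to) {
    to.ref++;
    if (from) from.children.push(to);
}

function collect(heap) {
    // 1. Copy the refcounts, then subtract every reference held by another
    //    tracked object. What remains counts external references only.
    heap.forEach(function (o) { o.gcRefs = o.ref; o.marked = false; });
    heap.forEach(function (o) {
        o.children.forEach(function (c) { c.gcRefs--; });
    });

    // 2. Externally referenced objects act as implicit roots. Mark everything
    //    reachable from them using an explicit stack instead of recursion.
    var stack = heap.filter(function (o) { return o.gcRefs > 0; });
    while (stack.length) {
        var o = stack.pop();
        if (o.marked) continue;
        o.marked = true;
        o.children.forEach(function (c) { if (!c.marked) stack.push(c); });
    }

    // 3. Sweep: anything left unmarked is unreachable.
    return {
        live: heap.filter(function (o) { return o.marked; }),
        garbage: heap.filter(function (o) { return !o.marked; })
    };
}

// a <-> b form an unreachable cycle; c is held externally and references d.
var a = makeObj("a"), b = makeObj("b"), c = makeObj("c"), d = makeObj("d");
addRef(b, a);
addRef(a, b);
addRef(null, c);
addRef(c, d);
var result = collect([a, b, c, d]);
console.log(result.garbage.map(function (o) { return o.name; }).join(","));  // a,b
```

A real collector would also have to release the garbage objects and cope with objects being created or freed during the sweep, which this sketch ignores.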
--
v3: jscript: Create the source function's 'prototype' prop object on demand.
jscript: Run the garbage collector every 30 seconds on a new object
jscript: Implement CollectGarbage().
jscript: Implement a Garbage Collector to deal with circular references.
https://gitlab.winehq.org/wine/wine/-/merge_requests/1635
Not completely sure if it's worth having for 8.0, but opening this to share the target I'm trying to reach.
--
v2: ntdll: Implement Low Fragmentation Heap.
ntdll: Introduce per-thread free lists for heap blocks.
ntdll: Introduce a new BLOCK_FLAG_SPLIT heap block flag.
ntdll: Introduce a new subheap thread affinity field.
ntdll: Introduce a new heap block_init_used helper.
ntdll: Introduce a new heap free_list_init helper.
ntdll: Count allocations and automatically enable LFH.
ntdll: Implement HeapCompatibilityInformation.
https://gitlab.winehq.org/wine/wine/-/merge_requests/1628
On Sat Dec 3 08:33:04 2022 +0000, Huw Davies wrote:
> Yes, this is from the build dir. I'm setting
> `DYLD_FALLBACK_LIBRARY_PATH` in `.winewrapper` to avoid the issue of
> `/bin/sh` stripping the `DYLD_` variables, which worked before the
> mentioned commit.
Ah, OK, I've reproduced the problem and am looking into it.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/1616#note_18602
The title was set to 'wineconsole.exe' instead of the name of
the created process. conhost.exe loads its per application
configuration from the console's title.
Signed-off-by: Eric Pouech <eric.pouech(a)gmail.com>
--
v2: wineconsole: Set launched process name as created console title.
https://gitlab.winehq.org/wine/wine/-/merge_requests/1668
The title was set to 'wineconsole.exe' instead of the name of
the created process. conhost.exe loads its per application
configuration from the console's title.
Signed-off-by: Eric Pouech <eric.pouech(a)gmail.com>
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/1668
--
v2: localspl: Add AddPrintProcessor implementation.
winspool: Implement print processor validation in AddPrinter.
localspl: Support Port handles in EndDocPrinter.
localspl: Partially support Port handles in StartDocPrinter.
localspl: Support Port handles in WritePrinter.
https://gitlab.winehq.org/wine/wine/-/merge_requests/1672
On Sat Dec 3 04:33:26 2022 +0000, Brendan Shanks wrote:
> Hmm, and this worked correctly before? Are you running from the build directory?
> I didn't realize this before, but any DYLD_ environment variables will
> be stripped when running from the build directory, because `./wine` or
> `./wine64` is a shell script, and system binaries (including `/bin/sh`)
> ignore/strip `DYLD_` env vars. But if you edit `wine`/`wine64` and
> set/export `DYLD_FALLBACK_LIBRARY_PATH` inside the script, it works.
> Or, if you install Wine and then run it from there, `wine`/`wine64` is a
> binary, so `DYLD_FALLBACK_LIBRARY_PATH` works correctly.
> I did some tests with the preloader and couldn't find any cases where it
> loses or mishandles the environment variables.
Yes, this is from the build dir. I'm setting `DYLD_FALLBACK_LIBRARY_PATH` in `.winewrapper` to avoid the issue of `/bin/sh` stripping the `DYLD_` variables, which worked before the mentioned commit.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/1616#note_18558
On Sat Dec 3 04:33:26 2022 +0000, Huw Davies wrote:
> commit 588e5554252 is preventing a 32-bit build on Mojave from finding
> the 32-bit libfreetype.6.dylib which I have in a non-standard location.
> I suspect that DYLD_FALLBACK_LIBRARY_PATH is somehow being lost.
Hmm, and this worked correctly before? Are you running from the build directory?
I didn't realize this before, but any DYLD_ environment variables will be stripped when running from the build directory, because `./wine` or `./wine64` is a shell script, and system binaries (including `/bin/sh`) ignore/strip `DYLD_` env vars. But if you edit `wine`/`wine64` and set/export `DYLD_FALLBACK_LIBRARY_PATH` inside the script, it works.
Or, if you install Wine and then run it from there, `wine`/`wine64` is a binary, so `DYLD_FALLBACK_LIBRARY_PATH` works correctly.
I did some tests with the preloader and couldn't find any cases where it loses or mishandles the environment variables.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/1616#note_18544
The test for ISmbiosInformationStatics_get_SerialNumber is broken on the Windows 10 testbot VMs, presumably because they don't have a serial number. It results in an HRESULT of E_UNEXPECTED, so I added a broken test case for it. I'm assuming that normal installations of Windows return a valid serial number, or at least something like "Not Specified", and not NULL. Also, on my Linux OS, running cat /sys/class/dmi/id/product_serial returns "To be filled by O.E.M". So I added a fallback to return 0 as the number. Or is it fine to just return whatever string is found?
On the Windows 8 VMs, the test crashes at line 75, hr = ISmbiosInformationStatics_get_SerialNumber( smbios_statics, &serial ). Not sure what I should do in this case. I was hoping for a flag that checks if the VM is Windows 8, but there doesn't seem to be one. Should I wrap the test in if (0) or is there an alternative way?
Another weird thing is the test fails prematurely on only the 32-bit version of debian11b, saying that the runtimeclass is not registered. I'm assuming it's an issue with the testbot. Debian11 32 bit runs fine.
--
v5: windows.system.profile.systemmanufacturers: Implement ISmbiosInformationStatics_get_SerialNumber.
https://gitlab.winehq.org/wine/wine/-/merge_requests/1588
> * direct call: 5761
> * unpatched Wine: 13933
> * ret.diff: 6823 (55% time spent in \__wine_unix_call_dispatcher, 29% in PE vkGetPhysicalDeviceProperties)
>
> Looks impressive!
Thanks, committed as 0aae4b05633cb9b38eb37cc662f5a3aadb3ce108. Can we get rid of direct calls now? :wink:
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/1552#note_18536
> Would it help to return to the return address already on the PE stack?
Are we sure it's never clobbered?
> I guess that moving the ret address to rcx and push rcx / ret might be
> the same performance-wise as pushq 0x70(%rcx), ret.
Yes, skipping rcx save will break existing tests.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/1552#note_18485
On Fri Dec 2 20:30:47 2022 +0000, **** wrote:
> Paul Gofman replied on the mailing list:
> ```
> On 12/2/22 14:25, Gabriel Ivăncescu (@insn) wrote:
> > On Fri Dec 2 18:57:30 2022 +0000, Jacek Caban wrote:
> >>> This should help a bit more, does it make a difference for you?
> >> My previous test wasn't really good for measuring it.
> >> I hacked a micro-benchmark, which confirms that the patch improves
> >> performance a lot. It was visible when doing "real" Vulkan
> >> vkGetPhysicalDeviceProperties calls in a loop, but even cleaner when I
> >> changed it further to make Unix side to be no-op. It closes most of the
> >> gap between direct call and __wine_unix_call_dispatcher. Times recorded
> >> for no-op calls:
> >> - direct call: 5761
> >> - unpatched Wine: 13933
> >> - ret.diff: 6823 (55% time spent in __wine_unix_call_dispatcher, 29% in
> >> PE vkGetPhysicalDeviceProperties)
> >> Looks impressive!
> > @gofman This isn't about setting it in rcx or not, it's about
> mispairing `call`s and `ret`s, which basically means 100% mispredicted
> because CPUs are optimized for it, so it couldn't do any speculative
> execution past the return before.
> >
> Yes, I figured that much. Yet the attached diff removes the return
> address from rcx in wine_syscall_dispatcher(), so I thought it makes
> sense to note that it will break things.
> ```
Would it help to return to the return address already on the PE stack?
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/1552#note_18480
On Fri Dec 2 18:57:30 2022 +0000, Jacek Caban wrote:
> > This should help a bit more, does it make a difference for you?
> My previous test wasn't really good for measuring it.
> I hacked a micro-benchmark, which confirms that the patch improves
> performance a lot. It was visible when doing "real" Vulkan
> vkGetPhysicalDeviceProperties calls in a loop, but even cleaner when I
> changed it further to make Unix side to be no-op. It closes most of the
> gap between direct call and __wine_unix_call_dispatcher. Times recorded
> for no-op calls:
> - direct call: 5761
> - unpatched Wine: 13933
> - ret.diff: 6823 (55% time spent in __wine_unix_call_dispatcher, 29% in
> PE vkGetPhysicalDeviceProperties)
> Looks impressive!
@gofman This isn't about setting it in rcx or not, it's about mispairing `call`s and `ret`s. CPUs are optimized for paired calls and returns, so a mispaired `ret` is basically mispredicted 100% of the time, which means the CPU couldn't do any speculative execution past the return before.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/1552#note_18478
> This should help a bit more, does it make a difference for you?
My previous test wasn't really good for measuring it.
I hacked a micro-benchmark, which confirms that the patch improves performance a lot. It was visible when doing "real" Vulkan vkGetPhysicalDeviceProperties calls in a loop, but even cleaner when I changed it further to make Unix side to be no-op. It closes most of the gap between direct call and __wine_unix_call_dispatcher. Times recorded for no-op calls:
- direct call: 5761
- unpatched Wine: 13933
- ret.diff: 6823 (55% time spent in __wine_unix_call_dispatcher, 29% in PE vkGetPhysicalDeviceProperties)
Looks impressive!
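To put "most of the gap" into a number, the three timings above can be compared directly. This is just arithmetic on the figures quoted in this message; the variable names are mine and the time unit is whatever the micro-benchmark reported.

```javascript
// Timings quoted above for no-op calls (unit as reported by the benchmark).
var direct = 5761, unpatched = 13933, patched = 6823;

var overheadBefore = unpatched - direct;  // dispatcher cost without ret.diff
var overheadAfter = patched - direct;     // dispatcher cost with ret.diff
var gapClosed = 1 - overheadAfter / overheadBefore;

console.log("overhead before:", overheadBefore);                  // 8172
console.log("overhead after:", overheadAfter);                    // 1062
console.log("gap closed:", (gapClosed * 100).toFixed(1) + "%");   // 87.0%
```

So the extra cost over a direct call drops from 8172 to 1062, i.e. roughly 87% of the gap is closed.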
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/1552#note_18474
Signed-off-by: Nikolay Sivov <nsivov(a)codeweavers.com>
--
v2: d3d10/effect: Add 'frc' instruction support for expressions.
d3d10/effect: Add 'rcp' instruction support for expressions.
d3d10/effect: Add 'div' instruction support for expressions.
d3d10/effect: Add 'ftob' instruction support for expressions.
d3d10/effect: Partially implement updates through value expressions.
https://gitlab.winehq.org/wine/wine/-/merge_requests/1622
Implement a basic GC based on the mark-and-sweep algorithm, without requiring manually specifying "roots", which vastly simplifies the management. For now, it is triggered every 30 seconds since it last finished, on a new object initialization. Better heuristics could be used in the future.
The comments in the code should hopefully explain the high-level logic of this approach without boilerplate details. I've tested it on the FFXIV launcher (along with other patches from Proton to make it work) and it stops the massive memory leak successfully by itself, so at least it does its job properly. The second patch in the MR is just an optimization for a *very* common case.
For artificial testing, one could use something like:
```javascript
function leak() {
var a = {}, b = {};
a.b = b;
b.a = a;
}
```
which creates a circular ref and will leak when the function returns.
It also introduces and makes use of a "heap_stack", which prevents stack overflows on long chains.
--
v2: jscript: Create the source function's 'prototype' prop object on demand.
jscript: Run the garbage collector every 30 seconds on a new object
jscript: Implement CollectGarbage().
jscript: Implement a Garbage Collector to deal with circular references.
https://gitlab.winehq.org/wine/wine/-/merge_requests/1635
On Fri Dec 2 16:22:01 2022 +0000, Jacek Caban wrote:
> In a Vulkan sample that I previously used to measure the impact on
> command buffers, I can see a really nice improvement. If I disable all
> direct calls, overhead drops from __wine_syscall_dispatcher ~8%
> (measured in Wine without your recent patches) to 1.05% (and <0.2% for
> __wine_syscall_dispatcher, so not related to winevulkan). That compares
> to 0.6% for direct Unix calls. FPS differences roughly match that. It
> looks promising.
[ret.diff](/uploads/4cc909db6cfc3d4029b8a8bcec669de5/ret.diff)
This should help a bit more, does it make a difference for you?
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/1552#note_18459
On Fri Dec 2 10:52:41 2022 +0000, mbriar wrote:
> Hello,
> I've compared a windows build of
> [vkoverhead](https://github.com/zmike/vkoverhead) test case 92, which
> uses descriptor buffers, on latest wine from git. I'm getting a score of
> around 48000 operations per second without this patch, and around 84000
> with this patch using direct calls, so still almost a 2x difference.
> FWIW, a linux build without using wine gets around 320000 in the same
> test, all using the RADV vulkan driver.
> I haven't tested it with actual games yet, but I expect it to still have
> a noticeable effect on CPU-bound games with vkd3d-proton.
In a Vulkan sample that I previously used to measure the impact on command buffers, I can see a really nice improvement. If I disable all direct calls, overhead drops from __wine_syscall_dispatcher ~8% (measured in Wine without your recent patches) to 1.05% (and <0.2% for __wine_syscall_dispatcher, so not related to winevulkan). That compares to 0.6% for direct Unix calls. FPS differences roughly match that. It looks promising.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/1552#note_18458