On Fri Dec 2 18:57:30 2022 +0000, Jacek Caban wrote:
> > This should help a bit more, does it make a difference for you?
> My previous test wasn't really good for measuring it.
> I hacked a micro-benchmark, which confirms that the patch improves
> performance a lot. It was visible when doing "real" Vulkan
> vkGetPhysicalDeviceProperties calls in a loop, but even cleaner when I
> changed it further to make the Unix side a no-op. It closes most of the
> gap between direct call and __wine_unix_call_dispatcher. Times recorded
> for no-op calls:
> - direct call: 5761
> - unpatched Wine: 13933
> - ret.diff: 6823 (55% time spent in __wine_unix_call_dispatcher, 29% in
> PE vkGetPhysicalDeviceProperties)
> Looks impressive!
@gofman This isn't about whether it's set in rcx or not; it's about mispairing `call`s and `ret`s. CPUs are optimized for paired `call`/`ret`, so a mispaired `ret` is basically mispredicted 100% of the time, which means the CPU couldn't do any speculative execution past the return before.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/1552#note_18478
> This should help a bit more, does it make a difference for you?
My previous test wasn't really good for measuring it.
I hacked together a micro-benchmark, which confirms that the patch improves performance a lot. It was visible when doing "real" Vulkan vkGetPhysicalDeviceProperties calls in a loop, but even cleaner when I changed it further to make the Unix side a no-op. It closes most of the gap between a direct call and __wine_unix_call_dispatcher. Times recorded for no-op calls:
- direct call: 5761
- unpatched Wine: 13933
- ret.diff: 6823 (55% time spent in __wine_unix_call_dispatcher, 29% in PE vkGetPhysicalDeviceProperties)
Looks impressive!
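To put "closes most of the gap" into numbers, here is a small worked calculation using only the three timings listed above. The variable names are illustrative, and the unit of the timings is not stated, so only the ratios are meaningful:

```javascript
// Timings for no-op calls, copied from the list above (unit not stated).
var direct = 5761;     // direct call
var unpatched = 13933; // unpatched Wine
var patched = 6823;    // with ret.diff

// Overhead on top of a direct call, before and after the patch.
var gapBefore = unpatched - direct; // 8172
var gapAfter = patched - direct;    // 1062

// Fraction of the original gap that the patch removes.
var gapClosed = 1 - gapAfter / gapBefore;

console.log("gap closed: " + Math.round(gapClosed * 100) + "%"); // prints "gap closed: 87%"
console.log("slowdown vs direct: " + (unpatched / direct).toFixed(2) + "x -> " +
            (patched / direct).toFixed(2) + "x"); // prints "slowdown vs direct: 2.42x -> 1.18x"
```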
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/1552#note_18474
Signed-off-by: Nikolay Sivov <nsivov(a)codeweavers.com>
--
v2: d3d10/effect: Add 'frc' instruction support for expressions.
d3d10/effect: Add 'rcp' instruction support for expressions.
d3d10/effect: Add 'div' instruction support for expressions.
d3d10/effect: Add 'ftob' instruction support for expressions.
d3d10/effect: Partially implement updates through value expressions.
https://gitlab.winehq.org/wine/wine/-/merge_requests/1622
Implement a basic GC based on the mark-and-sweep algorithm, without requiring "roots" to be specified manually, which vastly simplifies the management. For now, it is triggered on new object initialization once at least 30 seconds have passed since the last run finished. Better heuristics could be used in the future.
The comments in the code should hopefully explain the high-level logic of this approach without boilerplate details. I've tested it on the FFXIV launcher (along with other patches from Proton to get it working) and it stops the massive memory leak successfully by itself, so at least it does its job properly. The second patch in the MR is just an optimization for a *very* common case.
For artificial testing, one could use something like:
```javascript
function leak() {
var a = {}, b = {};
a.b = b;
b.a = a;
}
```
which creates a circular ref and will leak when the function returns.
It also introduces and makes use of a "heap_stack", which prevents stack overflows on long chains.
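For readers unfamiliar with the idea, here is a minimal sketch of how a mark-and-sweep pass can work without explicit roots. It is not the code from this MR: the object model (`ref`, `links`), the helper names, and the way implicit roots are derived from reference counts are all assumptions made for illustration. An object whose reference count exceeds the number of links pointing to it from other tracked objects must be held from outside, so it is treated as a root. Marking uses an explicit array as a stack instead of recursion, which is the same reason a heap-allocated stack avoids overflows on long chains.

```javascript
// Hypothetical object model: `ref` counts every reference to the object,
// `links` lists the tracked objects it points to.
function makeObj(name) {
    return { name: name, ref: 0, links: [], internal: 0, marked: false };
}

function link(from, to) {
    from.links.push(to);
    to.ref++;
}

// `heap` must contain every object reachable through `links`.
function collect(heap) {
    // Pass 1: count references that come from other tracked objects.
    heap.forEach(function (o) { o.internal = 0; o.marked = false; });
    heap.forEach(function (o) {
        o.links.forEach(function (t) { t.internal++; });
    });

    // Pass 2: anything with more refs than internal links is held from
    // outside the heap, so it acts as an implicit root.
    var stack = [];
    heap.forEach(function (o) {
        if (o.ref > o.internal) { o.marked = true; stack.push(o); }
    });

    // Pass 3: mark everything reachable from the roots, using an explicit
    // stack so that long chains cannot overflow the call stack.
    while (stack.length) {
        var cur = stack.pop();
        cur.links.forEach(function (t) {
            if (!t.marked) { t.marked = true; stack.push(t); }
        });
    }

    // Pass 4: sweep. Unlink unmarked objects so their cycles are broken.
    var live = [], dead = [];
    heap.forEach(function (o) { (o.marked ? live : dead).push(o); });
    dead.forEach(function (o) {
        o.links.forEach(function (t) { t.ref--; });
        o.links = [];
    });
    return { live: live, dead: dead };
}

// Same shape as leak() above: a and b only reference each other.
var a = makeObj("a"), b = makeObj("b"), c = makeObj("c");
link(a, b);
link(b, a);
c.ref++; // c is held by something outside the heap
var result = collect([a, b, c]);
console.log(result.dead.map(function (o) { return o.name; })); // [ 'a', 'b' ]
console.log(result.live.map(function (o) { return o.name; })); // [ 'c' ]
```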
--
v2: jscript: Create the source function's 'prototype' prop object on demand.
jscript: Run the garbage collector every 30 seconds on a new object
jscript: Implement CollectGarbage().
jscript: Implement a Garbage Collector to deal with circular references.
https://gitlab.winehq.org/wine/wine/-/merge_requests/1635
On Fri Dec 2 16:22:01 2022 +0000, Jacek Caban wrote:
> In a Vulkan sample that I previously used to measure the impact on
> command buffers, I can see a really nice improvement. If I disable all
> direct calls, overhead drops from __wine_syscall_dispatcher ~8%
> (measured in Wine without your recent patches) to 1.05% (and <0.2% for
> __wine_syscall_dispatcher, so not related to winevulkan). That compares
> to 0.6% for direct Unix calls. FPS differences roughly match that. It
> looks promising.
[ret.diff](/uploads/4cc909db6cfc3d4029b8a8bcec669de5/ret.diff)
This should help a bit more, does it make a difference for you?
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/1552#note_18459
On Fri Dec 2 10:52:41 2022 +0000, mbriar wrote:
> Hello,
> I've compared a windows build of
> [vkoverhead](https://github.com/zmike/vkoverhead) test case 92, which
> uses descriptor buffers, on latest wine from git. I'm getting a score of
> around 48000 operations per second without this patch, and around 84000
> with this patch using direct calls, so still almost a 2x difference.
> FWIW, a linux build without using wine gets around 320000 in the same
> test, all using the RADV vulkan driver.
> I haven't tested it with actual games yet, but I expect it to still have
> a noticeable effect on CPU-bound games with vkd3d-proton.
In a Vulkan sample that I previously used to measure the impact on command buffers, I can see a really nice improvement. If I disable all direct calls, overhead drops from __wine_syscall_dispatcher ~8% (measured in Wine without your recent patches) to 1.05% (and <0.2% for __wine_syscall_dispatcher, so not related to winevulkan). That compares to 0.6% for direct Unix calls. FPS differences roughly match that. It looks promising.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/1552#note_18458
--
v5: mfplay/tests: Add MF_SD_LANGUAGE and MF_SD_STREAM_NAME value tests.
winegstreamer: Extract stream name from QT demuxer private data.
winegstreamer: Query stream tags and set MF_SD_LANGUAGE attribute.
https://gitlab.winehq.org/wine/wine/-/merge_requests/1542