Re: [PATCH v2 0/1] MR1552: winevulkan: Use direct calls for vkGetDescriptorEXT.
Dec. 2, 2022
8:25 p.m.
On Fri Dec 2 18:57:30 2022 +0000, Jacek Caban wrote:
> > This should help a bit more, does it make a difference for you?
> My previous test wasn't really good for measuring it.
> I hacked a micro-benchmark, which confirms that the patch improves
> performance a lot. It was visible when doing "real" Vulkan
> vkGetPhysicalDeviceProperties calls in a loop, but even cleaner when I
> changed it further to make Unix side to be no-op. It closes most of the
> gap between direct call and __wine_unix_call_dispatcher. Times recorded
> for no-op calls:
> - direct call: 5761
> - unpatched Wine: 13933
> - ret.diff: 6823 (55% time spent in __wine_unix_call_dispatcher, 29% in
> PE vkGetPhysicalDeviceProperties)
> Looks impressive!

@gofman This isn't about setting it in rcx or not, it's about mispairing `call`s and `ret`s, which basically means the return is 100% mispredicted, because CPUs are optimized for paired calls and returns, so it couldn't do any speculative execution past the return before.

--
https://gitlab.winehq.org/wine/wine/-/merge_requests/1552#note_18478
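The point about mispaired `call`s and `ret`s can be illustrated with a toy model of a return-address predictor. This is only a sketch under loud assumptions: the class `ReturnStackPredictor`, the function `run`, the depth of 16 and the address labels are all hypothetical, it models no specific CPU, and the "inner routine goes back with a jump" case is just one way the pairing could be broken, not a description of the actual Wine dispatcher code.

```python
from collections import deque


class ReturnStackPredictor:
    """Toy return-address stack: call pushes, ret pops and compares."""

    def __init__(self, depth=16):
        # Fixed-size stack; the oldest entries fall off when it is full.
        self.stack = deque(maxlen=depth)
        self.rets = 0
        self.mispredicted = 0

    def call(self, return_addr):
        self.stack.append(return_addr)

    def ret(self, actual_target):
        predicted = self.stack.pop() if self.stack else None
        self.rets += 1
        if predicted != actual_target:
            self.mispredicted += 1


def run(iterations, paired):
    """Simulate an outer call that makes one inner call per iteration.

    paired=True:  the inner routine returns with a ret.
    paired=False: the inner routine goes back with a jump (hypothetical),
                  so its pushed entry is never popped.
    Returns (mispredicted rets, total rets).
    """
    p = ReturnStackPredictor()
    for _ in range(iterations):
        p.call("after_outer_call")
        p.call("after_inner_call")
        if paired:
            p.ret("after_inner_call")
        # Unpaired case: no ret here, the stale entry stays on top.
        p.ret("after_outer_call")
    return p.mispredicted, p.rets


if __name__ == "__main__":
    print("paired:   ", run(1000, paired=True))
    print("mispaired:", run(1000, paired=False))
```

In this model `run(1000, paired=True)` mispredicts no returns, while `run(1000, paired=False)` mispredicts every executed `ret`, because the stale inner entry always sits on top of the stack when the outer function returns.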
December 2022
8:30 p.m.
New subject: [PATCH v2 0/1] MR1552: winevulkan: Use direct calls for vkGetDescriptorEXT.
On 12/2/22 14:25, Gabriel Ivăncescu (@insn) wrote:
> On Fri Dec 2 18:57:30 2022 +0000, Jacek Caban wrote:
>>> This should help a bit more, does it make a difference for you?
>> My previous test wasn't really good for measuring it.
>> I hacked a micro-benchmark, which confirms that the patch improves
>> performance a lot. It was visible when doing "real" Vulkan
>> vkGetPhysicalDeviceProperties calls in a loop, but even cleaner when I
>> changed it further to make Unix side to be no-op. It closes most of the
>> gap between direct call and __wine_unix_call_dispatcher. Times recorded
>> for no-op calls:
>> - direct call: 5761
>> - unpatched Wine: 13933
>> - ret.diff: 6823 (55% time spent in __wine_unix_call_dispatcher, 29% in
>> PE vkGetPhysicalDeviceProperties)
>> Looks impressive!
> @gofman This isn't about setting it in rcx or not, it's about mispairing
> `call`s and `ret`s, which basically means the return is 100% mispredicted,
> because CPUs are optimized for paired calls and returns, so it couldn't do
> any speculative execution past the return before.
>

Yes, I figured that much. Yet the attached diff removes the return address from rcx in wine_syscall_dispatcher(), so I thought it made sense to note that it will break things.
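For reference, the "closes most of the gap" remark can be checked against the three no-op timings quoted in this thread. A minimal sketch: the helper names `gap_closed` and `overhead` are made up for illustration, and the units of the timings are not stated in the thread.

```python
# Times quoted in the thread for no-op calls (units not stated there).
DIRECT = 5761      # direct call
UNPATCHED = 13933  # unpatched Wine
PATCHED = 6823     # with ret.diff applied


def gap_closed(direct, unpatched, patched):
    """Fraction of the unpatched-vs-direct overhead removed by the patch."""
    return (unpatched - patched) / (unpatched - direct)


def overhead(direct, measured):
    """Extra time relative to a direct call, as a fraction of the direct time."""
    return (measured - direct) / direct


if __name__ == "__main__":
    print(f"gap closed:      {gap_closed(DIRECT, UNPATCHED, PATCHED):.1%}")
    print(f"overhead before: {overhead(DIRECT, UNPATCHED):.1%}")
    print(f"overhead after:  {overhead(DIRECT, PATCHED):.1%}")
```

With these numbers the patch removes roughly 87% of the difference between the unpatched dispatcher path and a direct call.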
participants (2): Gabriel Ivăncescu, Paul Gofman