On 24.01.2021 11:54, Stefan Dösinger wrote:
On Saturday, 23 January 2021, 14:40:08 EAT, Jacek Caban wrote:
On 22/01/2021 17:21, Paul Gofman wrote:
I think we still support processors which don't have AVX and thus don't have the xsave instruction (which is reported as a separate cpuid bit).
Looking closer at this, we indeed need a feature check here. I will work on a new version (patches 1-11 in the series should not be affected).
I am curious if you have tested the performance impact of this. While I agree that this change is the right thing to do, it would be nice to know the downsides.
It's not exactly clear to me what results you'd like to see. This is similar to what Windows has to do in its syscalls, so real applications already take that into account and avoid unneeded syscalls on hot paths. That leaves us with micro benchmarks. I came up with the attached benchmark, which tries to show the impact on three types of Nt* functions in Wine. It calls NtQueryInformationProcess with different arguments. Depending on the argument:
- ProcessIoCounters: Wine quickly returns some data. This is a typical thing that stubs do, but some implemented functions are like that as well.
- ProcessVmCounters: Wine does some work on the client side, including Linux syscalls.
- ProcessBasicInformation: Wine uses a server call to implement it.
Here are my averaged results of a few runs, but I really don't want to read too much into them. I originally planned to send the result of a single random run, but it showed patched Wine as notably faster on server calls, which means the run-to-run variation was higher than the impact:
Current Wine: 310 17692 4748
Patched Wine: 2910 18243 4898
For the patched version, I used my local tree, which has this series with additional runtime cpuid checks to use fxsave/xsavec/xsave depending on CPU capabilities. As expected, the impact on a plain stub call is large, but compared to a real load the impact seems marginal.
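To make the runtime check concrete, here is a rough sketch of how such a selection could look. It is not the code from my tree: the enum, macro and function names are made up for this mail, and it assumes a GCC or Clang toolchain that ships <cpuid.h>. The bit positions are the architectural ones (CPUID leaf 1: EDX bit 24 is FXSR, ECX bit 26 is XSAVE, ECX bit 27 is OSXSAVE; leaf 0xD, subleaf 1: EAX bit 1 is XSAVEC).

```c
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>  /* GCC/Clang helper header; an assumption about the toolchain */
#endif

/* Hypothetical names for this sketch; they are not taken from the patches. */
enum save_method { SAVE_NONE, SAVE_FXSAVE, SAVE_XSAVE, SAVE_XSAVEC };

#define CPUID1_EDX_FXSR     (1u << 24)  /* fxsave/fxrstor available */
#define CPUID1_ECX_XSAVE    (1u << 26)  /* CPU implements xsave */
#define CPUID1_ECX_OSXSAVE  (1u << 27)  /* OS has enabled xsave */
#define CPUIDD1_EAX_XSAVEC  (1u << 1)   /* leaf 0xD, subleaf 1: xsavec available */

/* Pure decision step: takes raw cpuid register values and picks the most
 * capable save instruction that both the CPU and the OS support. */
enum save_method pick_save_method(uint32_t leaf1_ecx, uint32_t leaf1_edx,
                                  uint32_t leafd1_eax)
{
    if ((leaf1_ecx & CPUID1_ECX_XSAVE) && (leaf1_ecx & CPUID1_ECX_OSXSAVE))
        return (leafd1_eax & CPUIDD1_EAX_XSAVEC) ? SAVE_XSAVEC : SAVE_XSAVE;
    if (leaf1_edx & CPUID1_EDX_FXSR)
        return SAVE_FXSAVE;
    return SAVE_NONE;
}

/* Queries the running CPU. On non-x86 builds it reports SAVE_NONE. */
enum save_method detect_save_method(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    unsigned int d1_eax = 0, d1_ebx, d1_ecx, d1_edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return SAVE_NONE;
    /* Leaf 0xD may be missing on older CPUs; treat that as "no xsavec". */
    if (!__get_cpuid_count(0xd, 1, &d1_eax, &d1_ebx, &d1_ecx, &d1_edx))
        d1_eax = 0;
    return pick_save_method(ecx, edx, d1_eax);
#else
    return SAVE_NONE;
#endif
}
```

The idea would be to call detect_save_method() once at startup and cache the result, so the syscall path only branches on a stored value. Keeping the decision separate from the cpuid query also makes it easy to check the fallback order without needing an old CPU.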
For comparison, Windows results are something like 2200, 140, 140.
Jacek