On 11/4/22 05:25, Jacek Caban (@jacek) wrote:
Some tests fail for me in VirtualBox running on an AMD CPU:
thread.c:230: Test failed: got xmm0:lo 0
thread.c:231: Test failed: got xmm0:hi 0
thread.c:277: Test failed: got xmm0:lo 0
thread.c:278: Test failed: got xmm0:hi 0
But yes, if we can skip the full context store, it would be nice. I've been thinking about skipping it for the `__wine_unix_call` syscall, but skipping it for more syscalls would be even nicer.
I don't remember the details, but the full context store was needed to pass the existing ntdll AVX tests. It's possible that they depend on triggering some 'slow' code path one way or another; I guess we will find out when we try to implement this.
I think the plan is to stop using ms_abi for syscalls and rely on the syscall dispatcher to handle the ms_abi->sysv conversion, but for that we need to get rid of the remaining direct calls first.
I suspect that the majority of the xsavec overhead is not in saving the volatile XMM registers. Most of the time they are in their init state (that is, all zero) and nothing is actually saved for them, but xsavec saves more than that. IMO the only feasible way to solve this issue is to have a lighter wine_unix_call that skips any FPU state save/restore, and possibly something else, with delayed processing of NtGetContextThread (should it happen during that wine_unix_call), so that NtGetContextThread still works correctly and doesn't break DRMs and debuggers.
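
To make the "delayed processing" idea concrete, here is a rough single-threaded toy model in C. It is only a sketch under stated assumptions, not Wine code: every name in it (`sim_thread`, `light_unix_call`, `sim_get_context`, `full_save`) is hypothetical, the "registers" are plain struct fields, and it does not model the Unix side clobbering registers or any real cross-thread synchronization.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the extended register state (not a Wine type). */
struct fpu_state { uint64_t xmm0_lo, xmm0_hi; };

/* Hypothetical per-thread bookkeeping; none of these fields exist in Wine. */
struct sim_thread
{
    struct fpu_state live;         /* registers as they are "in the CPU" */
    struct fpu_state saved;        /* snapshot that a context query reads */
    bool in_light_call;            /* inside a call that skipped the save */
    bool context_request_pending;  /* a query arrived during that call */
    struct fpu_state *request_out; /* where to deliver the deferred result */
    unsigned full_saves;           /* counts expensive saves, for illustration */
};

typedef int (*unix_func)(struct sim_thread *thread, void *args);

/* Models the expensive full state store (the xsavec step). */
static void full_save(struct sim_thread *t)
{
    t->saved = t->live;
    t->full_saves++;
}

/* Simulated context query. Returns true if answered immediately,
 * false if it had to be deferred because the target is in a light call. */
static bool sim_get_context(struct sim_thread *t, struct fpu_state *out)
{
    if (t->in_light_call)
    {
        t->context_request_pending = true;
        t->request_out = out;
        return false;
    }
    full_save(t);
    *out = t->saved;
    return true;
}

/* Light call: no state save on entry; a save happens on exit only if
 * a context query showed up while the call was running. */
static int light_unix_call(struct sim_thread *t, unix_func func, void *args)
{
    int ret;

    assert(!t->in_light_call); /* nesting is not modeled in this sketch */
    t->in_light_call = true;
    ret = func(t, args);
    t->in_light_call = false;

    if (t->context_request_pending)
    {
        full_save(t);
        *t->request_out = t->saved;
        t->context_request_pending = false;
        t->request_out = NULL;
    }
    return ret;
}

/* Example callee that does nothing interesting. */
static int plain_call(struct sim_thread *t, void *args)
{
    (void)t; (void)args;
    return 0;
}

/* Example callee that simulates a context query arriving mid-call;
 * args points at the struct fpu_state that should receive the result. */
static int query_during_call(struct sim_thread *t, void *args)
{
    return sim_get_context(t, args) ? 1 : 0;
}
```

In this model a `light_unix_call` with `plain_call` never pays for a save, while one with `query_during_call` pays for exactly one save on the way out and the requester still receives the register values. A real implementation would also have to decide what the requester does while waiting and how that interacts with thread suspension, which the toy ignores.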