Hi Paul,
On 22.01.2021 17:21, Paul Gofman wrote:
On 1/22/21 18:51, Jacek Caban wrote:
Signed-off-by: Jacek Caban <jacek@codeweavers.com>
 dlls/ntdll/unix/signal_x86_64.c | 32 ++++++++++++++++++++++++++------
 1 file changed, 26 insertions(+), 6 deletions(-)
This (together with saving all the basic and XMM registers) looks like a big overhead on every Nt function call. Is it maybe possible to do that only when explicitly requested (some option)?
If you mean a user-configurable option, I do not think that's the right direction (just like any BreakXXXFeature option). I'd rather optimize this solution to make sure that its performance is acceptable. This series is what I considered good enough for the first iteration.
We may want an extension enabling a debugger to get a context inside a syscall. I've been thinking about a flag in the PEB that winedbg could set when desired.
I think we still support processors which don't have AVX and thus don't have xsave instruction (which is reported as a separate cpuid bit).
xsave is separate from AVX (it has its own cpuid bit), and it should ignore unsupported requested features, so the patch should be fine as is on hardware without AVX. xsave does, however, need to be enabled by the OS, so we may need a feature check if we want to support OSes without xsave enabled.
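To illustrate the kind of feature check meant here, a minimal sketch that only decodes the ECX output of CPUID leaf 1. The bit positions are the documented ones; the helper name xsave_usable is hypothetical and not something in the patch.

```c
#include <stdint.h>

/* CPUID leaf 1, ECX feature bits. */
#define CPUID1_ECX_XSAVE   (1u << 26)  /* CPU implements xsave/xrstor */
#define CPUID1_ECX_OSXSAVE (1u << 27)  /* OS has enabled xsave (CR4.OSXSAVE) */
#define CPUID1_ECX_AVX     (1u << 28)  /* AVX, reported independently of xsave */

/* Hypothetical helper: nonzero if xsave is both implemented by the CPU
 * and enabled by the OS. The AVX bit is deliberately not consulted. */
static int xsave_usable(uint32_t cpuid1_ecx)
{
    uint32_t need = CPUID1_ECX_XSAVE | CPUID1_ECX_OSXSAVE;
    return (cpuid1_ecx & need) == need;
}
```

With GCC or clang the ECX value could come from __get_cpuid(1, &eax, &ebx, &ecx, &edx) in <cpuid.h>; how the patch would actually obtain it is left open.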
Also, to save at least this part, it is possible to use xsavec, which won't save anything (aside from the mask) if the high part of ymm is zero (that is, in its initial state, which is quite a common case when the ymm regs were not used before the call; compilers even tend to reset the higher part of ymm when done with them). There is user_shared_data->XState.EnabledFeatures, which tells whether xsave is supported at all, and user_shared_data->XState.CompactionEnabled, which tells whether xsavec is available.
If I read the documentation right (and my testing confirms that), xsave does what you described as well. It will not store the high ymm part if it's in its initial state. The difference between xsave and xsavec is about the storage format, but that doesn't make a difference here. xsavec is also not exactly free: in addition to the feature check, it also requires the entire xsave header to be initialized.
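For reference, the selection based on the two XState fields mentioned above could be sketched roughly as follows. The struct is a hypothetical stand-in for those fields, not the real user_shared_data layout, and the SAVE_LEGACY fallback is an assumption about what a non-xsave path would be called.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two XState fields discussed above. */
struct xstate_info
{
    uint64_t enabled_features;    /* stands in for XState.EnabledFeatures */
    int      compaction_enabled;  /* stands in for XState.CompactionEnabled */
};

enum save_method
{
    SAVE_LEGACY,  /* assumed non-xsave fallback path */
    SAVE_XSAVE,
    SAVE_XSAVEC
};

/* Pick the save instruction from the reported xstate configuration. */
static enum save_method pick_save_method(const struct xstate_info *xs)
{
    if (!xs->enabled_features) return SAVE_LEGACY;   /* xsave not supported */
    if (xs->compaction_enabled) return SAVE_XSAVEC;  /* compacted format available */
    return SAVE_XSAVE;
}
```

This only shows the feature check; the xsave header initialization that xsavec needs is a separate cost and is not modelled.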
What seems more interesting is xsaveopt, which I think could make a difference. That would, however, need the xsave area to be at a constant address. I've been thinking about storing it next to the TEB, but we can't do that as long as winsock is called on the signal stack, so I left experimenting with it for the future.
Thanks,
Jacek