FWIW, more than 75% of the CPU time spent in the syscall dispatcher is attributed to xsavec. Forcing xsave or fxsave instead didn't change much.
This, as well as the measurements above, is probably hardware-dependent: everything was measured on an AMD Ryzen 9 5900X CPU and an RX 580 GPU, with all graphics settings at their lowest.