Hi Fabian,
On 8/20/20 7:13 PM, Fabian Maurer wrote:
it would probably also be useful to benchmark the different glibc implementations, because for games 10% more speed would be nice. I doubt a C implementation can compete with an AVX-based one.
Yes, we might need to add a better (platform-specific) implementation to Wine at some point. It still makes sense to improve the generic code (especially if its performance can easily be increased 2-3 times for bigger move operations).
Your patch only affects a subset of memmove calls. It also slows down
some cases a lot (around 1.5-2 times).
You mean the cases where we could use memcpy?
And the cases when buffers can't be word aligned.
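To illustrate the alignment point, here is a minimal sketch of a generic word-wise memmove. It is not the patch under discussion and not musl's code; the name sketch_memmove is made up for this example. It assumes a flat address space and takes word copies only when source and destination share the same misalignment, otherwise it falls back to byte copies, which is the slow case.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration only: not the Wine patch, not musl's memmove. */
void *sketch_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    const size_t W = sizeof(size_t);
    size_t w;

    if (d == s || n == 0)
        return dst;

    /* Word copies are only possible if both pointers can reach a word
     * boundary at the same time, i.e. they share the same misalignment. */
    int same_align = ((uintptr_t)d % W) == ((uintptr_t)s % W);

    if ((uintptr_t)d < (uintptr_t)s || (uintptr_t)d >= (uintptr_t)s + n) {
        /* Forward copy: safe when dst is below src or there is no overlap. */
        if (same_align) {
            while (n && (uintptr_t)d % W) { *d++ = *s++; n--; }
            while (n >= W) {
                /* Fixed-size memcpy stands in for an aligned word load/store
                 * without breaking strict aliasing. */
                memcpy(&w, s, W);
                memcpy(d, &w, W);
                d += W; s += W; n -= W;
            }
        }
        while (n--) *d++ = *s++;   /* tail, or everything if misaligned */
    } else {
        /* Backward copy: dst overlaps the end of src, so start from the end. */
        d += n; s += n;
        if (same_align) {
            while (n && (uintptr_t)d % W) { *--d = *--s; n--; }
            while (n >= W) {
                d -= W; s -= W; n -= W;
                memcpy(&w, s, W);
                memcpy(d, &w, W);
            }
        }
        while (n--) *--d = *--s;   /* tail, or everything if misaligned */
    }
    return dst;
}
```

When the two pointers differ by an amount that is not a multiple of the word size, same_align is false and the whole move degrades to the byte loops. A real implementation would handle that case with shifted word loads or unaligned accesses instead.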
I've also tested the full implementation from musl (which uses their memcpy
implementation in some cases). It performs much better, but it's still much slower than native if the buffers overlap (around 3 times slower).
musl is slower in a lot of cases. I'm attaching a cheap test program. You can compile it normally with "gcc" or you can link musl statically with "musl-gcc". That should compare the best glibc implementation vs the best musl implementation. Correct me if I'm wrong though.
I was comparing the musl, glibc and native implementations. In the i386 case, on average, glibc was the fastest (e.g. it was ~2 times faster for big memmoves than both musl and native msvcrt). musl performed similarly to native msvcrt in the memcpy case. When memory was copied starting from the end, musl was much slower (~3 times).
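For reference, the kind of measurement described above could be sketched roughly as follows. This is not the test program attached earlier in the thread and not the code I actually ran; bench_memmove and its parameters are hypothetical names. It times whichever memmove the binary is linked against and reports CPU time via clock().

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: time `iters` calls of memmove that move `len` bytes
 * from buf + src_off to buf + dst_off inside a single buffer.
 * Returns elapsed CPU seconds, or -1.0 if the allocation fails. */
double bench_memmove(size_t len, size_t src_off, size_t dst_off, unsigned iters)
{
    size_t max_off = src_off > dst_off ? src_off : dst_off;
    unsigned char *buf = malloc(len + max_off);
    if (!buf)
        return -1.0;
    memset(buf, 0x5a, len + max_off);

    /* Call through a volatile function pointer so the compiler cannot
     * inline the call or optimize the loop away. */
    void *(*volatile fn)(void *, const void *, size_t) = memmove;

    clock_t start = clock();
    for (unsigned i = 0; i < iters; i++)
        fn(buf + dst_off, buf + src_off, len);
    clock_t end = clock();

    free(buf);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

With this helper, bench_memmove(1 << 20, 64, 0, 1000) exercises an overlapping forward copy, bench_memmove(1 << 20, 0, 64, 1000) forces the copy to start from the end, and bench_memmove(1 << 20, 1, 0, 1000) covers buffers that cannot both be word-aligned. Building the same file with "gcc" and with "musl-gcc" gives the glibc and musl numbers respectively.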
Thanks, Piotr