Hi Fabian,

I'll be back from vacation on Monday (I currently have very limited internet access). I'll look into it then.

I'm not sure how complicated the assembly implementation is, but I expect that a separate assembly file will not be needed. Also, AFAIK, we can't take the implementation from glibc. It would also be useful to know how efficient Microsoft's implementation is.

Musl also has platform-specific implementations of memmove (for i386 and x64) written in assembly. I bet they should be good enough for Wine.

Thanks,
Piotr

On Aug 12, 2020 23:33, Fabian Maurer <dark.shadow4@web.de> wrote:

Hello,

since msvcrt isn't relying on the standard library memmove/memcpy anymore,
there's been a pretty bad performance regression. See
https://bugs.winehq.org/show_bug.cgi?id=49663.

For the best performance, and since those memory operations are pretty common,
we'd presumably like to optimize them as much as possible. You might have seen
my patch for an implementation from musl, although Zebediah rightly pointed
out we might want to opt for the best performance we can get...
glibc currently offers the best performance, thanks to SSE/AVX implementations
and runtime selection of the best supported path.
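
To illustrate what runtime selection of a path could look like, here is a
minimal sketch in C. It is not how glibc does it (glibc uses its own resolver
machinery); it assumes a GCC or Clang build where __builtin_cpu_supports() is
available. The names memmove_generic, memmove_sse2, memmove_avx2 and
dispatch_memmove are hypothetical, and the SSE2/AVX2 variants are placeholders
that just forward to the portable loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef void *(*memmove_fn)(void *dst, const void *src, size_t n);

/* Portable fallback: picks the copy direction so overlapping ranges are safe. */
static void *memmove_generic(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if (d == s || n == 0) return dst;
    if ((uintptr_t)d < (uintptr_t)s)
        while (n--) *d++ = *s++;        /* forward copy */
    else
        while (n--) d[n] = s[n];        /* backward copy */
    return dst;
}

/* Placeholders (assumption): real versions would use SSE2/AVX2 instructions. */
static void *memmove_sse2(void *dst, const void *src, size_t n)
{
    return memmove_generic(dst, src, n);
}

static void *memmove_avx2(void *dst, const void *src, size_t n)
{
    return memmove_generic(dst, src, n);
}

/* Pick the best path the running CPU supports. */
static memmove_fn resolve_memmove(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) return memmove_avx2;
    if (__builtin_cpu_supports("sse2")) return memmove_sse2;
#endif
    return memmove_generic;
}

/* Resolve lazily on first call; a real version would want this thread-safe. */
void *dispatch_memmove(void *dst, const void *src, size_t n)
{
    static memmove_fn impl;

    if (!impl) impl = resolve_memmove();
    return impl(dst, src, n);
}
```

Resolving once and calling through a pointer keeps the per-call cost to a
single indirect call, whichever path is selected.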

First, would you have any objections to adding specialized paths written in
assembly for x86?
And if we were to add them, would we link against assembly files, or somehow
transform them into inline assembly? AFAIK, Wine doesn't ship any pure
assembly files yet...

If you want, I could set up a few crude benchmarks to see how different
versions compare.
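
A crude benchmark along those lines could look roughly like the sketch below.
The helper bench_memmove and its parameters are hypothetical; it times
overlapping moves with clock(), which is coarse, so the numbers are only good
for rough comparisons between implementations.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

typedef void *(*memmove_fn)(void *dst, const void *src, size_t n);

/* Returns approximate throughput in MiB/s, 0.0 if the run was too short
 * to measure, or -1.0 on invalid input or allocation failure. */
double bench_memmove(memmove_fn fn, size_t size, size_t iterations)
{
    volatile unsigned char sink = 0;    /* keeps the copies observable */
    unsigned char *buf;
    clock_t start, end;
    double seconds;
    size_t i;

    if (size == 0) return -1.0;
    buf = malloc(size * 2);
    if (!buf) return -1.0;
    memset(buf, 0xa5, size * 2);

    start = clock();
    for (i = 0; i < iterations; i++) {
        /* Alternate direction so both overlap cases are exercised. */
        if (i & 1) fn(buf, buf + size / 2, size);
        else       fn(buf + size / 2, buf, size);
        sink ^= buf[i % size];
    }
    end = clock();
    (void)sink;
    free(buf);

    seconds = (double)(end - start) / CLOCKS_PER_SEC;
    if (seconds <= 0.0) return 0.0;
    return ((double)size * (double)iterations) / (1024.0 * 1024.0) / seconds;
}
```

Calling bench_memmove(memmove, 4096, 1000000) and then the same with another
implementation gives two numbers to compare; running it over several sizes
would show where each version does well.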

Regards,
Fabian Maurer