https://bugs.winehq.org/show_bug.cgi?id=38558
--- Comment #9 from katsunori.kumatani@gmail.com ---
I can confirm now: this patch fixes the bug. Good job locating it!
If I'm not wrong, a "fast" check that picks the "slow" case (1 byte at a time) would be something like this:
if((UINT_PTR)((unsigned char*)(dst+dstlen) - src) < srclen) { /* slow path */ }
It looks ugly, but it has only one branch (it relies on two's complement, which Windows runs on anyway). I tried to keep the overhead minimal for most cases before.
But this is probably too conservative? A more normal / simpler check can obviously be done, if it's better. Just offering some ideas.
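To illustrate the idea behind the one-branch check, here is a minimal sketch, not the actual Wine code. It assumes the check above has the general shape "(p - base) < len" done in unsigned arithmetic, with p = dst+dstlen, base = src and len = srclen. The helper names are hypothetical, addresses are passed as plain integers, and uintptr_t stands in for UINT_PTR.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if address p lies in [base, base + len).
 * One comparison only: if p < base, the unsigned subtraction wraps
 * around to a very large value, which can never be below len. */
static int in_range_one_branch(uintptr_t p, uintptr_t base, size_t len)
{
    return (uintptr_t)(p - base) < len;
}

/* The "more normal" form of the same test, with two comparisons. */
static int in_range_two_branches(uintptr_t p, uintptr_t base, size_t len)
{
    return p >= base && (uintptr_t)(p - base) < len;
}
```

Both helpers should agree for every input; the first just folds the "p >= base" test into the wraparound of the unsigned subtraction. Which form is preferable is a readability vs. branch count trade-off.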
Anyway, the patch fixes the bug, so something definitely has to be done about it. I think the check above won't impact performance in most cases, only in those that gave wrong results before (and correct results are more important than performance, IMO).