On 8/31/21 2:41 PM, Rémi Bernon wrote:
On 8/31/21 2:18 PM, Piotr Caban wrote:
On 8/31/21 12:52 PM, Rémi Bernon wrote:
For a start, we could add the -ffreestanding / -fno-builtin compilation flags to msvcrt instead. It won't be much faster, but it will at least fix the issue with the builtin replacement.
As far as I understand, -fno-builtin does not guarantee that the compiler will not introduce a call to memset. Isn't it similar to the memcpy problem we have when using clang?
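To make the concern concrete, here is a minimal sketch of the kind of byte-fill loop that an optimizing compiler can recognize as a memset idiom and replace with a call to memset. The name naive_memset is hypothetical and this is not msvcrt's actual code. Whether the replacement happens depends on the compiler, its version and the flags used, so it is worth checking the generated assembly (for example with gcc -O2 -S) rather than assuming either way.

```c
#include <stddef.h>

/* Hypothetical name, for illustration only; not msvcrt's actual code. */
void *naive_memset(void *dst, int c, size_t n)
{
    unsigned char *d = dst;
    size_t i;

    /* A plain byte-fill loop: this is the idiom an optimizer may
     * recognize and turn into a call to memset. */
    for (i = 0; i < n; i++) d[i] = (unsigned char)c;
    return dst;
}
```

If a function like this is itself the implementation of memset, such a replacement would make it call itself, which is why the flags matter here.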
I don't really know, but in practice it has the desired effect. And yes, there's probably the same problem with memcpy.
Note that there is apparently a flag for each builtin, but -fno-builtin-memset doesn't do the trick for GCC (it does for clang).
Also note that, as far as I could see, unrolling the loop a bit, as in my quote, is already enough to stop the compiler from optimizing it into a builtin memset call, without having to add specific flags (although there's always the risk of the compiler becoming even more clever).
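For readers without the quoted code at hand, a minimal sketch of what "unrolling the loop a bit" could look like is below. The name unrolled_memset and the unroll factor of four are assumptions for illustration; this is not the code from the quote or from any patch in this thread, and nothing guarantees that a given compiler version will leave it alone.

```c
#include <stddef.h>

/* Hypothetical sketch; not the code quoted earlier in the thread. */
void *unrolled_memset(void *dst, int c, size_t n)
{
    unsigned char *d = dst;
    unsigned char v = (unsigned char)c;

    /* Main loop: four stores per iteration instead of one. */
    while (n >= 4)
    {
        d[0] = v;
        d[1] = v;
        d[2] = v;
        d[3] = v;
        d += 4;
        n -= 4;
    }

    /* Tail: the remaining 0 to 3 bytes, handled without a loop. */
    if (n >= 2)
    {
        d[0] = v;
        d[1] = v;
        d += 2;
        n -= 2;
    }
    if (n) d[0] = v;

    return dst;
}
```

The tail is written without a loop on purpose, so that no simple byte-fill loop is left for the optimizer to pattern-match.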
For memcpy, it's a bit longer but I have this, which works too: https://github.com/rbernon/wine/commit/6f0b5ed9abb5d8.patch