On Wed, 15 Sep 2021, Rémi Bernon wrote:
On 9/15/21 10:27 PM, Martin Storsjo wrote:
From: Martin Storsjö <martin@martin.st>
This fixes a regression in memset on ARM since 7b17d7081512db52ef852705445762ac4016c29f.
ARM can do 64 bit writes with the STRD instruction, but that instruction requires a 32 bit aligned address - while these stores are unaligned.
Two consecutive stores to uint32_t* pointers can also be fused into one single STRD, as a uint32_t* is supposed to be properly aligned - therefore, do these stores as stores to volatile uint32_t* to avoid fusing them.
Signed-off-by: Martin Storsjö <martin@martin.st>
---
 dlls/msvcrt/string.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)
I'm confused that it causes trouble when I thought it could benefit more than just Intel architectures...
It's kinda arch dependent - some architectures (sparc, mips iirc?) flat out reject that kind of unaligned load/store. ARM used to be much more strict about it too; newer architecture versions are more lenient, but some instructions still require proper alignment. And when casting an unaligned pointer to a larger data type like that, the compiler is free to assume the pointer is aligned to that size.
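To illustrate the alignment point, here is a minimal sketch, not taken from the patch. The helper names store_u64_cast and store_u64_memcpy are made up for this example. The first one is the problematic pattern; the second is the general C way to store to an arbitrarily aligned address (the patch itself uses volatile uint32_t* stores instead).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the cast tells the compiler that dst is suitably
 * aligned for uint64_t, so on 32-bit ARM it may emit STRD, which
 * requires a 32-bit aligned address. */
static inline void store_u64_cast(unsigned char *dst, uint64_t v)
{
    assert(dst != NULL);
    *(uint64_t *)dst = v;
}

/* Hypothetical helper: memcpy carries no alignment assumption, so the
 * compiler has to pick stores that are valid for any dst. */
static inline void store_u64_memcpy(unsigned char *dst, uint64_t v)
{
    assert(dst != NULL);
    memcpy(dst, &v, sizeof(v));
}
```

Calling store_u64_memcpy(buf + 1, v) is well defined on every architecture, while store_u64_cast(buf + 1, v) only happens to work where the hardware and the generated instructions tolerate the misalignment.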
Maybe it could be made not too ugly with some macro to wrap the 64bit stores, defined differently for arm?
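A rough sketch of what such a wrapper could look like, under stated assumptions: STORE_U64 and fill8 are hypothetical names for this example, not the macro from the actual patch, and the ARM branch assumes a little-endian target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__arm__)
/* 32-bit ARM: two 32-bit stores through a volatile pointer. Volatile
 * accesses are performed as written, so the compiler does not fuse
 * them into one STRD. Assumes little-endian (low word first). */
#define STORE_U64(dst, val) do { \
        uint64_t v_ = (val); \
        volatile uint32_t *p_ = (volatile uint32_t *)(dst); \
        p_[0] = (uint32_t)v_; \
        p_[1] = (uint32_t)(v_ >> 32); \
    } while (0)
#else
/* Elsewhere: a plain 64-bit store, relying on the target tolerating
 * unaligned stores. */
#define STORE_U64(dst, val) do { \
        *(uint64_t *)(dst) = (val); \
    } while (0)
#endif

/* Hypothetical caller: write 8 copies of byte c at a possibly
 * unaligned address, as a memset inner step might. */
static inline void fill8(unsigned char *dst, unsigned char c)
{
    assert(dst != NULL);
    STORE_U64(dst, 0x0101010101010101ULL * c);
}
```

With a wrapper of this shape the call sites stay identical on all architectures, and only the macro definition differs for ARM.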
Thanks, that makes it much more tolerable.
// Martin