On 30/03/2022 19:33, Rémi Bernon wrote:
On 3/30/22 17:27, Jinoh Kang wrote:
On 3/23/22 10:33, Elaine Lefler wrote:
Signed-off-by: Elaine Lefler <elaineclefler@gmail.com>
New vectorized implementation improves performance up to 65%.
MSVCRT has one. Maybe deduplicate?
IIUC upstream isn't very interested in assembly-optimized routines unless they are really necessary.
The msvcrt implementation was probably necessary because it's often called by apps, and needs to be as optimal as possible, but I'm not sure ntdll memcpy is used so much. Maybe for realloc though, in which case it might be useful indeed.
I think an unrolled version, like what was done for memset, should already give good results and should work portably (though I already got bitten with memset, and I wasn't very keen on trying again with memcpy so soon).
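For illustration only, here is a minimal sketch of what an unrolled, portable copy loop could look like. The name unrolled_memcpy is hypothetical, and this is not the existing ntdll memset or the msvcrt memcpy code; it only shows the general technique of copying a fixed-size block per iteration in plain C, without assembly:

```c
#include <stddef.h>

/* Hypothetical sketch of a portable unrolled copy; not Wine's code.
 * Like memcpy, behaviour is not guaranteed for overlapping buffers. */
void *unrolled_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Main loop: copy 8 bytes per iteration, so the loop condition and
     * pointer updates are paid once per 8 bytes instead of once per byte. */
    while (n >= 8)
    {
        d[0] = s[0]; d[1] = s[1]; d[2] = s[2]; d[3] = s[3];
        d[4] = s[4]; d[5] = s[5]; d[6] = s[6]; d[7] = s[7];
        d += 8;
        s += 8;
        n -= 8;
    }

    /* Tail: copy the remaining 0..7 bytes one at a time. */
    while (n--)
        *d++ = *s++;

    return dst;
}
```

Whether a compiler turns such a loop into wide loads and stores, leaves it byte-wise, or replaces it with something else depends on the compiler and flags, so any speedup would need to be measured rather than assumed.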
Why not just copy-paste it from msvcrt, since it's already done?