Re: [PATCH] msvcrt: Import memmove from musl

21 Aug 2020


On 8/21/20 1:51 PM, Gabriel Ivăncescu wrote:
...
> FWIW "rep movsb" is supposedly the fastest when transferring larger
> blocks (I think more than 128 bytes?) on recent CPUs. The cool thing is
> that the CPU handles everything, no matter the alignment or "memcpy vs
> memmove", so it's by far the simplest, and since it knows about the
> alignment requirements of that particular CPU it can optimize it
> internally itself.
> Same story with "rep stosb" for memset. Unfortunately these are very
> slow on older CPUs. I think there's a CPUID flag that says whether they
> are fast, we could use that.
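For reference, the flag in question is most likely ERMS (Enhanced REP MOVSB/STOSB), reported in CPUID leaf 7, subleaf 0, EBX bit 9. A minimal sketch of the check, assuming GCC or Clang and their <cpuid.h> helper; has_erms is a hypothetical name, not something from the patch:

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header, assumed available */
#endif

/* Hypothetical helper: returns 1 if the CPU advertises ERMS
 * (CPUID.(EAX=7,ECX=0):EBX bit 9), 0 otherwise or on non-x86 builds. */
static int has_erms(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid_count returns 0 when leaf 7 is not supported. */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return 0;
    return (ebx >> 9) & 1;
#else
    return 0;
#endif
}
```

The bit only says that the CPU claims an enhanced implementation; it does not guarantee that "rep movsb" beats other copy loops on a given machine, so it would still have to be measured.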
"rep movsb" is ~3 times slower than "rep movsl" on my CPU (AMD Ryzen 7 
2700X). Maybe the single-byte variant is better optimized on Intel CPUs.
Also, the "rep movsl" implementation is still almost 2 times slower than 
memmove from glibc (tested on 64MB data blocks).
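To make the comparison concrete, here is a rough sketch of the two variants as forward-only copies, assuming GCC/Clang extended inline asm on x86. "rep movsl" is the AT&T spelling of the 4-byte string move. The function names are hypothetical and this is not the code that was benchmarked:

```c
#include <stddef.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
/* Forward copy, one byte per iteration, using "rep movsb".
 * Relies on the direction flag being clear, as the ABI guarantees. */
static void copy_rep_movsb(void *dst, const void *src, size_t n)
{
    __asm__ volatile ("rep movsb"
                      : "+D" (dst), "+S" (src), "+c" (n)
                      :
                      : "memory");
}

/* Forward copy in 4-byte units using "rep movsl", then the 0-3 tail bytes. */
static void copy_rep_movsl(void *dst, const void *src, size_t n)
{
    size_t dwords = n / 4;
    size_t tail = n % 4;

    __asm__ volatile ("rep movsl"
                      : "+D" (dst), "+S" (src), "+c" (dwords)
                      :
                      : "memory");
    /* dst and src were advanced past the copied dwords by the asm above. */
    __asm__ volatile ("rep movsb"
                      : "+D" (dst), "+S" (src), "+c" (tail)
                      :
                      : "memory");
}
#else
/* Non-x86 fallback so the sketch still builds. */
static void copy_rep_movsb(void *dst, const void *src, size_t n) { memcpy(dst, src, n); }
static void copy_rep_movsl(void *dst, const void *src, size_t n) { memcpy(dst, src, n); }
#endif
```

Both only handle non-overlapping buffers or dst below src; a real memmove would also need a backward path for the overlapping dst-above-src case.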
Thanks,
Piotr
