It is as if the code from your program is executing far slower than anything from libc!
That is very well possible if he has the right libc. Take a look at the optimizations in the string functions in glibc, and you'll have an idea why. Unless you are an assembler programming guru for a certain architecture, you'll have a hard time beating them.
I'm a fairly good assembler programmer....
Actually I don't have glibc - I'm running NetBSD, not Linux. NetBSD might benefit from faster strxxx routines.
OTOH the times are very dependent on the CPU model! My Slot A Athlon 700 executes my str_add() faster the way I coded it - I tried the other order and it sucked. Similarly, escaping with: lim[-1] = 0; return lim; didn't help.
Of course you can use the same tricks as glibc does to speed up your own variant of the copy routine.
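For reference, one of the better-known tricks of this kind is testing a whole machine word for a zero byte at once instead of checking each character. What follows is only a rough sketch of that idea in portable C, not glibc's actual code and not the str_add() discussed above; the names has_zero_byte() and word_copy() are made up for illustration, and 64-bit words are assumed. Note that it may read a few bytes past the terminating NUL (never past the end of the aligned word that contains it), which real libc implementations also do but which is formally outside what the C standard promises.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if any of the 8 bytes in w is 0x00 (classic bit trick). */
static int has_zero_byte(uint64_t w)
{
    const uint64_t ones  = 0x0101010101010101ULL;
    const uint64_t highs = 0x8080808080808080ULL;
    return ((w - ones) & ~w & highs) != 0;
}

/*
 * Hypothetical strcpy-like routine: copies src (including the NUL)
 * to dst and returns a pointer to the NUL written in dst.
 * The caller must make sure dst is large enough.
 */
static char *word_copy(char *dst, const char *src)
{
    /* Copy byte by byte until src is 8-byte aligned. */
    while (((uintptr_t)src & 7) != 0) {
        if ((*dst = *src) == '\0')
            return dst;
        dst++;
        src++;
    }

    /* Copy 8 bytes at a time while the word holds no NUL.
     * memcpy keeps the loads/stores free of aliasing problems;
     * compilers turn these into plain word moves. */
    for (;;) {
        uint64_t w;
        memcpy(&w, src, sizeof w);   /* may read past the NUL */
        if (has_zero_byte(w))
            break;
        memcpy(dst, &w, sizeof w);
        dst += sizeof w;
        src += sizeof w;
    }

    /* The NUL is somewhere in this last word: finish byte by byte. */
    while ((*dst = *src) != '\0') {
        dst++;
        src++;
    }
    return dst;
}
```

Whether this actually beats a plain byte loop depends on the CPU and the typical string length, so it needs to be timed on the machine in question.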
David