On Sun, 13 Feb 2005 10:31:12 +0100, you wrote:
With one inline asm statement this function would be smaller and faster.
You are underestimating what compilers can do.
Filling a few gigabytes with your patch applied
(gcc 3.4, optimization -O2):
real 0m32.037s user 0m29.584s sys 0m0.051s
Original:
real 0m31.471s user 0m29.004s sys 0m0.081s
And the original is optimized on all supported CPUs.
Rein.