Something like the following may be faster:
static const int bit_pos[] = { 0x00, 0x01, 0x0e, 0x02, 0x0f, 0x1b, 0x12, 0x03, 0x1e, 0x10, 0x1c, 0x16, 0x18, 0x13, 0x0a, 0x04, 0x1f, 0x0d, 0x1a, 0x11, 0x1d, 0x15, 0x17, 0x09, 0x0c, 0x19, 0x14, 0x08, 0x0b, 0x07, 0x06, 0x05, }; const unsigned int c = 0x7dc4d65; return bit_pos[((x & -x) * c) >> 27];
I'd expect that mostly due to eliminating the branches and the implicit shift & mask operations in something like "y.c[1]".
Henri Verbeet hverbeet@gmail.com writes:
Something like the following may be faster:
static const int bit_pos[] = { 0x00, 0x01, 0x0e, 0x02, 0x0f, 0x1b, 0x12, 0x03, 0x1e, 0x10, 0x1c, 0x16, 0x18, 0x13, 0x0a, 0x04, 0x1f, 0x0d, 0x1a, 0x11, 0x1d, 0x15, 0x17, 0x09, 0x0c, 0x19, 0x14, 0x08, 0x0b, 0x07, 0x06, 0x05, }; const unsigned int c = 0x7dc4d65; return bit_pos[((x & -x) * c) >> 27];
I'd expect that mostly due to eliminating the branches and the implicit shift & mask operations in something like "y.c[1]".
Maybe, but there's no point. This is just a fallback implementation for broken platforms, any platform that we care about should never use this.
On 03/11/2011 07:19 AM, Alexandre Julliard wrote:
Henri Verbeethverbeet@gmail.com writes:
Something like the following may be faster:
static const int bit_pos[] = { 0x00, 0x01, 0x0e, 0x02, 0x0f, 0x1b, 0x12, 0x03, 0x1e, 0x10, 0x1c, 0x16, 0x18, 0x13, 0x0a, 0x04, 0x1f, 0x0d, 0x1a, 0x11, 0x1d, 0x15, 0x17, 0x09, 0x0c, 0x19, 0x14, 0x08, 0x0b, 0x07, 0x06, 0x05, }; const unsigned int c = 0x7dc4d65; return bit_pos[((x& -x) * c)>> 27];
I'd expect that mostly due to eliminating the branches and the implicit shift& mask operations in something like "y.c[1]".
Maybe, but there's no point. This is just a fallback implementation for broken platforms, any platform that we care about should never use this.
For systems with fast multiplication that way is probably faster. This was just something I noticed that took me 5 mins and was clearly better than what we had.