Vitaly Budovski wrote:
Make use of the Triangle Inequality Nearest Neighbour algorithm to find the nearest colour more efficiently than a simple linear search. The improvements are most noticeable with a palette of 256 colours. Testing shows approximately 3-4x performance increase for 256 colours.
Overall idea looks good, however the big problem with it is use of float point numbers. Fortunately you can get rid of them and use integers instead. Because you never use the distance in the calculations, but only to compare against other distances, you can skip "sqrt" and just compare squares, as the original code does. That will give you even more speed improvements.
Few more nitpicks about this and other patches in the series:
+static int compare_distance(const void *left, const void *right) +{
- const struct tinn *l = left;
- const struct tinn *r = right;
Please don't use void pointers. Use typed pointers, especially that you cast them to a hard-coded type.
+const void *init_tinn(struct tinn *dest, const void *buf, size_t num,
- size_t size, float (*distance)(const void *, const void *))
+{
- size_t i;
- size_t ref = rand() % num;
I'm not so sure using random reference point really necessary.
+static struct tinn tinnPalette[256];
This won't work well with multi-threaded apps. You probably should make it on the stack for each X11DRV_DIB_SetImageBits_* function. Especially that some need only 2 elements. Btw, how big of the gain/loss whill there be for 1-bit and 4-bit DIBs?
q.rgbRed = srcval.peRed;
q.rgbGreen = srcval.peGreen;
q.rgbBlue = srcval.peBlue;
Looks much better, if written as: q = srcval;
((srcval >> 7) & 0xf8) | /* r */
((srcval >> 12) & 0x07),
((srcval >> 2) & 0xf8) | /* g */
((srcval >> 7) & 0x07),
((srcval << 3) & 0xf8) | /* b */
((srcval >> 2) & 0x07) ) << (7-(x&7)) );
q.rgbRed = ((srcval >> 7) & 0xf8) |
((srcval >> 12) & 0x07);
q.rgbGreen = ((srcval >> 2) & 0xf8) |
((srcval >> 7) & 0x07);
q.rgbBlue = ((srcval << 3) & 0xf8) |
((srcval >> 2) & 0x07);
What happened to the comments and alignment?
Vitaliy Margolen.
Vitaliy Margolen wrote:
Vitaly Budovski wrote:
Make use of the Triangle Inequality Nearest Neighbour algorithm to find the nearest colour more efficiently than a simple linear search. The improvements are most noticeable with a palette of 256 colours. Testing shows approximately 3-4x performance increase for 256 colours.
Overall idea looks good, however the big problem with it is use of float point numbers. Fortunately you can get rid of them and use integers instead. Because you never use the distance in the calculations, but only to compare against other distances, you can skip "sqrt" and just compare squares, as the original code does. That will give you even more speed improvements.
Thanks for the tip. Unfortunately in this instance it will not work as I do in fact query more than a < b. See the nearest function in patch 2.
Few more nitpicks about this and other patches in the series:
+static int compare_distance(const void *left, const void *right) +{
- const struct tinn *l = left;
- const struct tinn *r = right;
Please don't use void pointers. Use typed pointers, especially that you cast them to a hard-coded type.
The qsort function I use internally takes that particular function pointer.
+const void *init_tinn(struct tinn *dest, const void *buf, size_t num,
- size_t size, float (*distance)(const void *, const void *))
+{
- size_t i;
- size_t ref = rand() % num;
I'm not so sure using random reference point really necessary.
This is actually very important. By randomly selecting a reference point, the chance that we get worst-case linear performance out of the algorithm becomes extremely small.
+static struct tinn tinnPalette[256];
This won't work well with multi-threaded apps. You probably should make it on the stack for each X11DRV_DIB_SetImageBits_* function. Especially that some need only 2 elements. Btw, how big of the gain/loss whill there be for 1-bit and 4-bit DIBs?
I'm sure you are right. I'll change the array allocation as you suggest. I'm sure you mean the X11DRV_DIB_GetImageBits_* though. As a percentage probably quite significant, linear would be reasonably fast for small sized palettes though. Although this algorithm will never really be any slower.
q.rgbRed = srcval.peRed;
q.rgbGreen = srcval.peGreen;
q.rgbBlue = srcval.peBlue;
Looks much better, if written as: q = srcval;
I guess it does. I had to do a lot of copy+paste though so I kept the format the same throughout, since colors are not always retrieved in that way.
((srcval >> 7) & 0xf8) | /* r */
((srcval >> 12) & 0x07),
((srcval >> 2) & 0xf8) | /* g */
((srcval >> 7) & 0x07),
((srcval << 3) & 0xf8) | /* b */
((srcval >> 2) & 0x07) ) << (7-(x&7)) );
q.rgbRed = ((srcval >> 7) & 0xf8) |
((srcval >> 12) & 0x07);
q.rgbGreen = ((srcval >> 2) & 0xf8) |
((srcval >> 7) & 0x07);
q.rgbBlue = ((srcval << 3) & 0xf8) |
((srcval >> 2) & 0x07);
What happened to the comments and alignment?
Vitaliy Margolen.
The comments are not really needed now. First of all they weren't very descriptive, and also I'm now assigning the colors to individual bytes which are named appropriately so it is easy to see what they refer to. The code looks to be aligned I think.
Vitaly Budovski wrote:
Vitaliy Margolen wrote:
Vitaly Budovski wrote:
Make use of the Triangle Inequality Nearest Neighbour algorithm to find the nearest colour more efficiently than a simple linear search. The improvements are most noticeable with a palette of 256 colours. Testing shows approximately 3-4x performance increase for 256 colours.
Overall idea looks good, however the big problem with it is use of float point numbers. Fortunately you can get rid of them and use integers instead. Because you never use the distance in the calculations, but only to compare against other distances, you can skip "sqrt" and just compare squares, as the original code does. That will give you even more speed improvements.
Thanks for the tip. Unfortunately in this instance it will not work as I do in fact query more than a < b. See the nearest function in patch 2.
And what is so special about subtracting one from the other?
Few more nitpicks about this and other patches in the series:
+static int compare_distance(const void *left, const void *right) +{
- const struct tinn *l = left;
- const struct tinn *r = right;
Please don't use void pointers. Use typed pointers, especially that you cast them to a hard-coded type.
The qsort function I use internally takes that particular function pointer.
And the problem is what? Any pointer is compatible with void* (not other way around).
Vitaliy Margolen.
Vitaliy Margolen wrote:
Vitaly Budovski wrote:
Vitaliy Margolen wrote:
Thanks for the tip. Unfortunately in this instance it will not work as I do in fact query more than a < b. See the nearest function in patch 2.
And what is so special about subtracting one from the other?
A lot actually. As an example with squared distances: a^2 - b^2 <= c^2. If a^2 = 27, b^2 = 16, c^2 = 9, then 27 - 16 <= 9 is false. However, if we use the square root in distance calculations, sqrt(27) - 4 <= 3 is true. As you can see, we get entirely different results.
The qsort function I use internally takes that particular function pointer.
And the problem is what? Any pointer is compatible with void* (not other way around).
Vitaliy Margolen.
I get a warning from gcc: passing argument 4 of ‘qsort’ from incompatible pointer type.
Vitaly Budovski wrote:
Vitaliy Margolen wrote:
Vitaly Budovski wrote:
Vitaliy Margolen wrote: Thanks for the tip. Unfortunately in this instance it will not work as I do in fact query more than a < b. See the nearest function in patch 2.
And what is so special about subtracting one from the other?
A lot actually. As an example with squared distances: a^2 - b^2 <= c^2. If a^2 = 27, b^2 = 16, c^2 = 9, then 27 - 16 <= 9 is false. However, if we use the square root in distance calculations, sqrt(27) - 4 <= 3 is true. As you can see, we get entirely different results.
I didn't really read your code, but answering on above, it's still possible in this case to avoid sqrt. A bit of math: You want to check if a-b <= c. Assuming a>=b (otherwise it's always true), it's equal to: (a-b)^2 <= c^2 is equal to a^2 + b^2 - 2ab <= c^2 is equal to a^2 + b^2 - c^2 <= 2ab is equal to (assuming a^2 + b^2 - c^2 >= 0) (a^2 + b^2 - c^2)^2 <= (2ab)^2 is equal to (a^2 + b^2 - c^2)^2 <= 4 a^2 b^2 so if you store a^2, b^2 and c^2 instead of a, b, c you may compare it using a few integer operations. In C it would look like:
BOOL abccmp(int a2, int b2, int c2) { int tmp = a2+b2-c2; return a2 <= b2 || tmp <= 0 || tmp*tmp <= 4*a2*b2; }
It may be not so easy in a real code, but in most cases it should be possible to use such tricks.
Jacek