Henri Verbeet <hverbeet@codeweavers.com> wrote:

> +static void d2d_point_lerp(D2D1_POINT_2F *out,
> +        const D2D1_POINT_2F *a, const D2D1_POINT_2F *b, float t)
> +{
> +    out->x = a->x * (1.0f - t) + b->x * t;
> +    out->y = a->y * (1.0f - t) + b->y * t;
> +}

According to my investigation of bilinear blending (lerp) implementations, it's better to explicitly simplify the formula above to

out->x = a->x + (b->x - a->x) * t;
out->y = a->y + (b->y - a->y) * t;

or, even better, to avoid floating-point operations altogether by replacing them with either fixed-point math or an MMX/SSE helper, which can improve performance considerably (roughly 10x).
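For reference, here are the two formulations side by side as plain C; the function names are illustrative only, not part of the patch:

static float lerp_two_mul(float a, float b, float t)
{
    /* Current code: weight both endpoints. */
    return a * (1.0f - t) + b * t;
}

static float lerp_one_mul(float a, float b, float t)
{
    /* Proposed simplification: one multiplication fewer. */
    return a + (b - a) * t;
}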
On 21 May 2017 at 18:23, Dmitry Timoshkov <dmitry@baikal.ru> wrote:
> Henri Verbeet <hverbeet@codeweavers.com> wrote:
> > +static void d2d_point_lerp(D2D1_POINT_2F *out,
> > +        const D2D1_POINT_2F *a, const D2D1_POINT_2F *b, float t)
> > +{
> > +    out->x = a->x * (1.0f - t) + b->x * t;
> > +    out->y = a->y * (1.0f - t) + b->y * t;
> > +}
>
> According to my investigation of bilinear blending (lerp) implementations, it's better to explicitly simplify the formula above to
>
> out->x = a->x + (b->x - a->x) * t;
> out->y = a->y + (b->y - a->y) * t;

For performance, quite possibly. This isn't particularly performance-sensitive code though, and I'd be surprised if that variant had better accuracy.

> or, even better, to avoid floating-point operations altogether by replacing them with either fixed-point math or an MMX/SSE helper, which can improve performance considerably (roughly 10x).

Yes, for performance-sensitive code we'd want to use SSE intrinsics. There are places in d3dx9 where that would certainly be worthwhile.
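To make the intrinsics suggestion concrete, here is a minimal sketch of the kind of SSE helper being discussed. It is not from any actual patch; the function name and the four-floats-at-a-time batching are assumptions for illustration:

#include <xmmintrin.h>

/* Hypothetical helper: lerp four packed floats at once as a + (b - a) * t.
 * Two adjacent D2D1_POINT_2F values are four floats, so a caller could
 * process points pairwise this way. */
static void lerp4_sse(float *out, const float *a, const float *b, float t)
{
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    __m128 vt = _mm_set1_ps(t);

    _mm_storeu_ps(out, _mm_add_ps(va, _mm_mul_ps(_mm_sub_ps(vb, va), vt)));
}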
Henri Verbeet <hverbeet@gmail.com> wrote:
> > > +static void d2d_point_lerp(D2D1_POINT_2F *out,
> > > +        const D2D1_POINT_2F *a, const D2D1_POINT_2F *b, float t)
> > > +{
> > > +    out->x = a->x * (1.0f - t) + b->x * t;
> > > +    out->y = a->y * (1.0f - t) + b->y * t;
> > > +}
> >
> > According to my investigation of bilinear blending (lerp) implementations, it's better to explicitly simplify the formula above to
> >
> > out->x = a->x + (b->x - a->x) * t;
> > out->y = a->y + (b->y - a->y) * t;
>
> For performance, quite possibly. This isn't particularly performance-sensitive code though, and I'd be surprised if that variant had better accuracy.

I'd dare to claim that both variants should be equivalent in accuracy; the second version just avoids a redundant multiplication.
On 21 May 2017 at 19:03, Dmitry Timoshkov <dmitry@baikal.ru> wrote:
> I'd dare to claim that both variants should be equivalent in accuracy,

I'd be interested to see that analysis.
Henri Verbeet <hverbeet@gmail.com> wrote:
> > I'd dare to claim that both variants should be equivalent in accuracy,
>
> I'd be interested to see that analysis.

I guess an interested person would need to perform some elementary math, like opening the braces and regrouping the variables in the equation.
On 21 May 2017 at 20:58, Dmitry Timoshkov <dmitry@baikal.ru> wrote:
> Henri Verbeet <hverbeet@gmail.com> wrote:
> > > I'd dare to claim that both variants should be equivalent in accuracy,
> >
> > I'd be interested to see that analysis.
>
> I guess an interested person would need to perform some elementary math, like opening the braces and regrouping the variables in the equation.

I was afraid that might be about the level of analysis done. Unfortunately, floating-point arithmetic doesn't work that way. I don't care enough to do a proper analysis and calculate error bounds, but consider for example the trivial case of t = 1.0f, in which case

out->x = a->x + (b->x - a->x) * t;

would be equivalent to

out->x = a->x + (b->x - a->x);

then note that "a ⊕ (b ⊖ a)" isn't necessarily equal to b, while the original formulation yields exactly b when t = 1.0f.

Of course plenty has been written on the subject of floating-point computation, but some classic introductions are section 4.2 of TAoCP (volume 2) by Knuth, and the 1991 paper "What Every Computer Scientist Should Know About Floating-Point Arithmetic" by David Goldberg.
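A concrete instance of the failure mode above, with deliberately extreme values so the rounding is visible at a glance (any values for which b - a rounds would do):

#include <stdio.h>

int main(void)
{
    float a = 1.0e20f, b = 1.0f, t = 1.0f;

    /* Original formulation: a * 0.0f + b * 1.0f gives exactly b. */
    float v1 = a * (1.0f - t) + b * t;
    /* Simplified formulation: (b - a) rounds to -a, so the sum is 0. */
    float v2 = a + (b - a) * t;

    printf("v1 = %g, v2 = %g, b = %g\n", v1, v2, b); /* v1 = 1, v2 = 0 */
    return 0;
}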
Henri Verbeet <hverbeet@gmail.com> wrote:
> > > > I'd dare to claim that both variants should be equivalent in accuracy,
> > >
> > > I'd be interested to see that analysis.
> >
> > I guess an interested person would need to perform some elementary math, like opening the braces and regrouping the variables in the equation.
>
> I was afraid that might be about the level of analysis done. Unfortunately, floating-point arithmetic doesn't work that way. I don't care enough to do a proper analysis and calculate error bounds, but consider for example the trivial case of t = 1.0f, in which case
>
> out->x = a->x + (b->x - a->x) * t;
>
> would be equivalent to
>
> out->x = a->x + (b->x - a->x);
>
> then note that "a ⊕ (b ⊖ a)" isn't necessarily equal to b, while the original formulation yields exactly b when t = 1.0f.
>
> Of course plenty has been written on the subject of floating-point computation, but some classic introductions are section 4.2 of TAoCP (volume 2) by Knuth, and the 1991 paper "What Every Computer Scientist Should Know About Floating-Point Arithmetic" by David Goldberg.

I guess it's always possible to try to justify one way of doing the math as the preferred one. But when one has to choose between "(a + b) * c" and "a * c + b * c", it's better to use common sense IMO. Since there are numerous ways to compute linear blending, and each of them will always have subtle differences, is it really possible to plausibly explain a personal preference, or to justify one choice over another? Is there really "the true one", rather than every variant having equal standing?
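For what it's worth, "(a + b) * c" and "a * c + b * c" are not interchangeable in floating point either; a deliberately extreme demonstration:

#include <stdio.h>

int main(void)
{
    float a = 1.0e30f, b = -1.0e30f, c = 1.0e10f;

    /* The sum cancels exactly before the multiplication: result 0. */
    float r1 = (a + b) * c;
    /* Both products overflow to +/-infinity, and inf + -inf is NaN. */
    float r2 = a * c + b * c;

    printf("r1 = %g, r2 = %g\n", r1, r2); /* r1 = 0, r2 = nan */
    return 0;
}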
On 22 May 2017 at 07:21, Dmitry Timoshkov <dmitry@baikal.ru> wrote:
> I guess it's always possible to try to justify one way of doing the math as the preferred one. But when one has to choose between "(a + b) * c" and "a * c + b * c", it's better to use common sense IMO. Since there are numerous ways to compute linear blending, and each of them will always have subtle differences, is it really possible to plausibly explain a personal preference, or to justify one choice over another? Is there really "the true one", rather than every variant having equal standing?

Well, you're the one trying to justify changing the current code, but yes.

There's a trade-off between performance and accuracy here. As I said in my initial reply, I don't think it really matters one way or the other here, but there may very well be other places where trading accuracy for performance would be a legitimate choice to make. You seem to be arguing that there's no trade-off at all instead. Even if that were true (and it isn't; please read up on floating-point computation before continuing that line of argument), it wouldn't justify changing the code, or indeed even arguing about it.
Henri Verbeet <hverbeet@gmail.com> wrote:
> > I guess it's always possible to try to justify one way of doing the math as the preferred one. But when one has to choose between "(a + b) * c" and "a * c + b * c", it's better to use common sense IMO. Since there are numerous ways to compute linear blending, and each of them will always have subtle differences, is it really possible to plausibly explain a personal preference, or to justify one choice over another? Is there really "the true one", rather than every variant having equal standing?
>
> Well, you're the one trying to justify changing the current code, but yes.
>
> There's a trade-off between performance and accuracy here. As I said in my initial reply, I don't think it really matters one way or the other here, but there may very well be other places where trading accuracy for performance would be a legitimate choice to make. You seem to be arguing that there's no trade-off at all instead. Even if that were true (and it isn't; please read up on floating-point computation before continuing that line of argument), it wouldn't justify changing the code, or indeed even arguing about it.

I'm perfectly aware that floating-point computations are somewhat tricky with regard to accuracy. The starting point is the claim about which version should be considered "the true one"; that's why I mentioned common sense. IMHO it's pretty fragile to use this kind of argument (or rather playful argumentation?), because the next time somebody tries to optimize, simplify, or otherwise change some code that performs math computations, it will always be tempting to use the same kind of arguments, leading to an endless loop :)
On 21.05.2017 18:56, Henri Verbeet wrote:
> On 21 May 2017 at 18:23, Dmitry Timoshkov <dmitry@baikal.ru> wrote:
> > Henri Verbeet <hverbeet@codeweavers.com> wrote:
> > > +static void d2d_point_lerp(D2D1_POINT_2F *out,
> > > +        const D2D1_POINT_2F *a, const D2D1_POINT_2F *b, float t)
> > > +{
> > > +    out->x = a->x * (1.0f - t) + b->x * t;
> > > +    out->y = a->y * (1.0f - t) + b->y * t;
> > > +}
> >
> > According to my investigation of bilinear blending (lerp) implementations, it's better to explicitly simplify the formula above to
> >
> > out->x = a->x + (b->x - a->x) * t;
> > out->y = a->y + (b->y - a->y) * t;
>
> For performance, quite possibly. This isn't particularly performance-sensitive code though, and I'd be surprised if that variant had better accuracy.
If you go by the simple rule that addition and subtraction are bad because small values get swallowed:

- For t almost 1.0f, a->x * (1.0f - t) + b->x * t will produce roughly b->x. In a->x + (b->x - a->x) * t, what matters more is the result of (b->x - a->x), which might be almost zero, letting a->x win the result. Remember, for this a->x and b->x must be almost equal.
- For t almost 0, both variants are dominated by a->x.
- For b->x almost 0 and, let's say, t = 1, we get b->x in the first case and 0 in the second, since a->x is assumed big enough to dominate b->x - a->x with -a->x.
- For a->x almost 0 and, let's say, t = 0, both cases still give a->x.

For me that means the weak point of case one is t being almost 1, and of case two a->x >> b->x; the sketch below runs both variants through exactly these cases.
bye Jochen
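A small program exercising both variants on the cases above; the input values are illustrative picks intended to trigger each regime:

#include <stdio.h>

static float lerp_v1(float a, float b, float t)
{
    return a * (1.0f - t) + b * t;
}

static float lerp_v2(float a, float b, float t)
{
    return a + (b - a) * t;
}

int main(void)
{
    static const struct { const char *desc; float a, b, t; } cases[] =
    {
        {"t almost 1",             1.0f,     2.0f, 0.99999994f},
        {"t almost 0",             1.0f,     2.0f, 1.0e-7f},
        {"b almost 0 vs a, t = 1", 1.0e20f,  1.0f, 1.0f},
        {"a almost 0, t = 0",      1.0e-20f, 1.0f, 0.0f},
    };
    unsigned int i;

    for (i = 0; i < sizeof(cases) / sizeof(cases[0]); ++i)
        printf("%-24s v1 = %.9g, v2 = %.9g\n", cases[i].desc,
                lerp_v1(cases[i].a, cases[i].b, cases[i].t),
                lerp_v2(cases[i].a, cases[i].b, cases[i].t));
    return 0;
}

On the third case v1 returns b exactly while v2 collapses to 0, matching the summary above.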