Hello
thanks for the review. I don't think that calling defines is the way to go. Indeed, I tested my patch and yours. Yours is about 12% slower than mine in my computer. And now, we try to take care of efficiency of this dll. So, it is not the time to increase latency.
I used 10 digits since there are a lot of computation, I want to avoid as much as possible big rounding errors. If we want to uniformize, then we should uniformize d3dxshmultiply 2,3,4 with 10 digits. But that is for an another patch.
Nozomi.
Looks pretty much ok, but isn't there a way to reduce the size a bit? Just see the dirty hack which is attached. D3DXSHMultiply6 will add a lot of lines too...
Also is there a reason why we use constants with different accuracy (e.g. 0.28209479f in D3DXSHMultiply4 and 0.2820948064f)?
Cheers Rico