On 25.02.2013 06:03, Nozomi Kodama wrote:
out.u.m[2][i] = v.z / signed_det;
out.u.m[3][i] = v.w / signed_det; }
*pout = out;
While you are at it, you may fix the indentation of out*, "}", "*pout = out;" and "return pout;".
signed_det = (i % 2)? -det: det;
Couldn't you just use something like "det = -det;" instead of the modulo? This should be a little bit faster.
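To illustrate the equivalence, here is a minimal sketch. Both helper names are made up for this mail and are not part of the patch. It compares the modulo form with a value whose sign is flipped once per iteration:

```c
/* Hypothetical helper: the sign selection as written in the patch. */
static float signed_det_modulo(float det, int i)
{
    return (i % 2) ? -det : det;
}

/* Hypothetical helper: fills out[0..n-1] with det, -det, det, ...
 * by negating a running value instead of testing i % 2. */
static void signed_det_running(float det, float *out, int n)
{
    int i;

    for (i = 0; i < n; i++)
    {
        out[i] = det;
        det = -det; /* flip the sign for the next iteration */
    }
}
```

With four iterations the sign is flipped an even number of times, so the running value ends up with its original sign again.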
I did some small speed tests, with the following results. You could also avoid the many variable assignments, such as *pout = out, by using 4 vecs instead. This should save ~48 assignments and should also improve the speed a bit more (~10%). Native is still 40% faster than that, though.
With the change above it should look like:

    int i;
    D3DXVECTOR4 v, vec[4];
    FLOAT det;
    ...
    for (i = 0; i < 4; i++)
    {
        vec[i].x = pm->u.m[i][0];
        vec[i].y = pm->u.m[i][1];
        vec[i].z = pm->u.m[i][2];
        vec[i].w = pm->u.m[i][3];
    }

    for (i = 0; i < 4; i++)
    {
        switch (i)
        {
            case 0: D3DXVec4Cross(&v, &vec[1], &vec[2], &vec[3]); break;
            case 1: D3DXVec4Cross(&v, &vec[0], &vec[2], &vec[3]); break;
            case 2: D3DXVec4Cross(&v, &vec[0], &vec[1], &vec[3]); break;
            case 3: D3DXVec4Cross(&v, &vec[0], &vec[1], &vec[2]); break;
        }
        pout->u.m[0][i] = v.x / det;
        pout->u.m[1][i] = v.y / det;
        pout->u.m[2][i] = v.z / det;
        pout->u.m[3][i] = v.w / det;
        det = -det;
    }
    return pout;
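For anyone who wants to try the loop outside of Wine, here is a self-contained sketch that compiles without the D3DX headers. It rests on a few assumptions: vec4, mat4, det3, vec4_cross and mat4_inverse are stand-in names invented for this mail, vec4_cross is only meant to mirror the cofactor layout of D3DXVec4Cross, and the determinant is obtained by expanding along the first row with the same cross product, which is not necessarily how the elided part of the patch computes it.

```c
#include <stddef.h>

/* Stand-ins for D3DXVECTOR4 and D3DXMATRIX (hypothetical names). */
typedef struct { float x, y, z, w; } vec4;
typedef struct { float m[4][4]; } mat4;

/* Determinant of the 3x3 matrix [a b c; d e f; g h i]. */
static float det3(float a, float b, float c,
                  float d, float e, float f,
                  float g, float h, float i)
{
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g);
}

/* Stand-in for D3DXVec4Cross: component j is (-1)^j times the 3x3 minor
 * left after dropping column j from the rows a, b, c. */
static vec4 *vec4_cross(vec4 *out, const vec4 *a, const vec4 *b, const vec4 *c)
{
    vec4 r;

    r.x =  det3(a->y, a->z, a->w, b->y, b->z, b->w, c->y, c->z, c->w);
    r.y = -det3(a->x, a->z, a->w, b->x, b->z, b->w, c->x, c->z, c->w);
    r.z =  det3(a->x, a->y, a->w, b->x, b->y, b->w, c->x, c->y, c->w);
    r.w = -det3(a->x, a->y, a->z, b->x, b->y, b->z, c->x, c->y, c->z);
    *out = r;
    return out;
}

/* Inverse via cofactors; returns NULL for a singular matrix. */
static mat4 *mat4_inverse(mat4 *pout, const mat4 *pm)
{
    vec4 v, vec[4];
    float det;
    int i;

    /* Copy the rows first, so that pout may alias pm. */
    for (i = 0; i < 4; i++)
    {
        vec[i].x = pm->m[i][0];
        vec[i].y = pm->m[i][1];
        vec[i].z = pm->m[i][2];
        vec[i].w = pm->m[i][3];
    }

    /* Assumption: determinant by expansion along row 0. */
    vec4_cross(&v, &vec[1], &vec[2], &vec[3]);
    det = vec[0].x * v.x + vec[0].y * v.y + vec[0].z * v.z + vec[0].w * v.w;
    if (det == 0.0f) return NULL;

    for (i = 0; i < 4; i++)
    {
        /* Cross product of the three rows other than row i. */
        switch (i)
        {
            case 0: vec4_cross(&v, &vec[1], &vec[2], &vec[3]); break;
            case 1: vec4_cross(&v, &vec[0], &vec[2], &vec[3]); break;
            case 2: vec4_cross(&v, &vec[0], &vec[1], &vec[3]); break;
            case 3: vec4_cross(&v, &vec[0], &vec[1], &vec[2]); break;
        }
        /* Column i of the inverse; det carries the (-1)^i sign. */
        pout->m[0][i] = v.x / det;
        pout->m[1][i] = v.y / det;
        pout->m[2][i] = v.z / det;
        pout->m[3][i] = v.w / det;
        det = -det;
    }
    return pout;
}
```

Because the rows are copied into vec[] before anything is written, calling mat4_inverse(&m, &m) should work in place, which is the property that makes the temporary "out" matrix unnecessary.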
Maybe we could reuse some calculations from the D3DXVec4Cross function ...
Cheers
Rico