http://bugs.winehq.org/show_bug.cgi?id=17715
Summary: Incorrect translation of D3D asm instruction "expp"
Product: Wine
Version: 1.1.17
Platform: PC-x86-64
OS/Version: Linux
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: directx-d3d
AssignedTo: wine-bugs@winehq.org
ReportedBy: liquid.acid@gmx.net
CC: stefandoesinger@gmx.at
Hi there,
currently in wined3d (I only tested this in ARB mode, but it might also affect GLSL mode) the D3D assembler instruction "expp" is incorrectly translated to the ARBvp assembler language.
wined3d takes this D3D asm line "expp r3.y, r3" and translates it into "EXP R3.y, R3;". Well, this isn't working.
EXP is of scalarop type, the first parameter being of "masked destination register" type (masking is used in this case above) and the second parameter (here lies the problem) is of type "scalar source register".
Sadly R3 is a vector :(
Well, let's add the full shader source for completeness:
--------------------------------------------
vs_1_0
//D3DX8 Shader Assembler Version 0.91
mov r0, v0
add r1, r0, -c85
dp3 r1, r1, r1
rsq r1.x, r1.y
mov r3, r0
dp3 r3.w, r3, r3
rsq r3.w, r3.w
mul r7, r3, r3.w
mul r1.x, r1.x, r1.y
mad r2.x, r1.x, c86.x, c86.y
slt r4, r2.x, c0
mul r4, r4, -c83.w
max r2.x, r2.x, -r2.x
add r2.x, r4.z, r2.x
mul r3, r2.x, c83.y
expp r3.y, r3
mad r3, r3.y, c83.z, c83.w
mul r2, r3.x, r3.x
mul r3.y, r2.x, r3.x
mul r3.z, r2.x, r3.y
mul r3.w, r2.x, r3.z
dp4 r2, c82, r3
mul r2, r2, c86.z
mul r1.x, r1.x, c86.w
add r1.x, c1, -r1.x
max r1.x, c0, r1.x
mad r0.y, r2, r1.x, r0.y
dp4 oPos.x, r0, c2
dp4 oPos.y, r0, c3
dp4 oPos.z, r0, c4
dp4 oPos.w, r0, c5
mov oT0, v3
mov oT1, v3
mov oT2, v3
mov oT3, v3
mov oD0, c1
mov oFog.x, c0
--------------------------------------------
First of all, according to the current MSDN the instruction used above in the source is NOT valid.
See this: http://msdn.microsoft.com/en-us/library/bb173373(VS.85).aspx
They explicitly state "expp dst, src.{x|y|z|w}" as the syntax. They furthermore mention:
------------QUOTE---------------------------
src is a source register. Source register requires explicit use of replicate swizzle, that is, exactly one of the .x, .y, .z, .w swizzle components (or the .r, .g, .b, .a equivalents) must be specified.
------------UNQUOTE--------------------------
Well, this comment about the src reg doesn't seem to be valid at all....
Let's just take a look at the original D3D8 documentation ("D3DX8 Shader Assembler Version 0.91" <- !!!).
To fully quote this:
------------------------------------------------------
expp

Provides exponential 2^x partial support.

Syntax:
expp vDest, vSrc0

Registers:
vDest: Destination register, holding the result of the operation.
vSrc0: Source register, specifying the input argument.

Operation:
The following code fragment shows the operations performed by the expp instruction to write a result to the destination.

SetDestReg();
SetSrcReg(0);

float w = m_Source[0].w;
float v = (float)floor(m_Source[0].w);

m_TmpReg.x = (float)pow(2, v);
m_TmpReg.y = w - v;

// Reduced precision exponent
float tmp = (float)pow(2, w);
DWORD tmpd = *(DWORD*)&tmp & 0xffffff00;

m_TmpReg.z = *(float*)&tmpd;
m_TmpReg.w = 1;

WriteResult();

Remarks:
The expp instruction produces undefined results if fed a negative value for the exponent.
This instruction provides exponential base 2 partial precision. It generates an approximate answer in vDest.z and allows for a more accurate determination of vDest.x*function(vDest.y), where function is a user approximation to 2**vDest.y over the limited range (0.0 <= vDest.y < 1.0).
This instruction accepts a scalar source, and reduced precision arithmetic is acceptable in evaluating vDest.z. However, the approximation error must be less than 1/(2^11) the absolute error (10-bit precision) and over the range (0.0 <= t.y < 1.0). Also, expp returns 1.0 in w.
The following example illustrates how the expp instruction might be used.
expp r5, r0
------------------------------------------------------
So, the correct translation should be "EXP R3.y, R3.w;"
CCing Stefan Dösinger and Henri Verbeet.
I tried to patch this myself, but it looks like "expp" has to be moved out of shader_hw_map2gl to be able to make such an adjustment (but maybe not?).
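Just to illustrate the kind of adjustment I mean, here is a rough sketch in plain C. It is NOT wined3d code: emit_arb_exp is a made-up helper working on already formatted register strings, and it only covers the two simple cases (no swizzle at all, or a single-component swizzle that is passed through unchanged). A real fix would have to work on wined3d's own parameter representation.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of wined3d.
 * Writes an ARBvp EXP instruction into out. If the source register string
 * carries no swizzle, ".w" is appended, because the D3D8 expp reference
 * reads only the w component and ARB EXP needs a scalar source. */
static int emit_arb_exp(char *out, size_t out_len,
                        const char *dst, const char *src)
{
    const char *swizzle = strchr(src, '.');

    return snprintf(out, out_len, "EXP %s, %s%s;",
                    dst, src, swizzle ? "" : ".w");
}
```

With dst "R3.y" and src "R3" this produces "EXP R3.y, R3.w;", which is the translation suggested above.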