On Tuesday, 12 May 2009 21:17:10, Henri Verbeet wrote:
I'm probably missing something obvious, but with pre-1.4 texcoord registers being read-write, what happens when you write to the register before using it to sample a texture?
Nothing special. However, you can't use it to sample afterwards, because you can't use a tex* instruction after a non-tex* instruction. I.e.:
ps_1_3
tex t0
mov t1, t0
tex t1
mov r0, t0
Z:\dev\shm\shader.txt(4,1): error X5055: Cannot use tex* instruction after non-tex* instruction.
With my patch, the mov will write to the temporary register form of t1; fragment.texcoord[1] will never be used in a correct shader. Right now we don't enforce the tex-after-non-tex rule, but if we ever run into such a shader, enforcing this restriction is the way to fix the resulting problems.
You cannot sample a Tx coord twice either:
ps_1_3
tex t0
tex t1
texbem t1, t0
mov r0, t1
Z:\dev\shm\shader.txt(4,1): error X5053: Tex register t1 already declared.
This even applies to texkill:
ps_1_3
tex t0
tex t1
texkill t1
mov r0, t0
Z:\dev\shm\shader.txt(4,1): error X5053: Tex register t1 already declared.
or
ps_1_3
texkill t0
tex t0
mov r0, t0
Z:\dev\shm\shader.txt(3,1): error X5053: Tex register t0 already declared.
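To make the two rules above concrete, here is a minimal sketch of a checker in Python. It is not what the D3D assembler or Wine actually does; the function name `validate` and the simplification "any opcode starting with tex counts as a tex* instruction that declares its first operand" are my assumptions for illustration. It only reproduces the X5055 and X5053 cases from the examples in this mail.

```python
# Hypothetical sketch: checks only the two ps_1_x rules discussed above.
# Assumption: every opcode starting with "tex" (tex, texbem, texkill, ...)
# is a tex* instruction and "declares" its first operand.

def validate(shader_text):
    """Return a list of (line_number, error_code) tuples."""
    errors = []
    declared = set()      # t registers already used by a tex* instruction
    seen_non_tex = False  # becomes True at the first non-tex* instruction
    for lineno, line in enumerate(shader_text.strip().splitlines(), start=1):
        tokens = line.replace(",", " ").split()
        if not tokens or tokens[0].startswith("ps_"):
            continue  # skip blank lines and the version token
        opcode, operands = tokens[0], tokens[1:]
        if opcode.startswith("tex"):
            if seen_non_tex:
                # tex* instruction after a non-tex* instruction
                errors.append((lineno, "X5055"))
                continue
            dest = operands[0]
            if dest in declared:
                # tex register already declared
                errors.append((lineno, "X5053"))
            declared.add(dest)
        else:
            seen_non_tex = True
    return errors
```

Feeding it the first shader in this mail reports X5055 on line 4, and the texbem and texkill shaders report X5053 on the same lines as the assembler output quoted above.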
If you want dependent reads in ps_1_3 and earlier, you have to use texreg2rgb:
ps_1_3
tex t0
texreg2rgb t1, t0
mov r0, t1
texreg2rgb knows that it should take the address from the tempreg form of t0, not fragment.texcoord[0]. You cannot use texreg2rgb to read from the texture coord:
ps_1_3
texreg2rgb t1, t0
mov r0, t1
Z:\dev\shm\shader.txt(2,1): error X5036: Read of uninitialized components(*) in t0: *r/0 *g/1 *b/2 *a/3
If you want to do that, you have to use texcoord:
ps_1_3
texcoord t0
texreg2rgb t1, t0
mov r0, t1
in which case texcoord t0 becomes MOV_SAT t0, fragment.texcoord[0];
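As a rough illustration of the register separation, here is a small Python sketch that maps a tiny ps_1_x subset to ARB-style lines. This is not the code from my patch: the function name `translate`, the 2D texture target, and reusing "t0" as the name of the temporary are all assumptions, and TEMP declarations are left out. The point is only that tex and texcoord read fragment.texcoord[n], while texreg2rgb and mov read the temporary form of the register.

```python
# Hypothetical sketch of the register separation; names and the 2D target
# are assumptions, and only four opcodes are handled.

def translate(shader_text):
    """Translate a tiny ps_1_x subset to ARB-style instruction strings."""
    out = []
    for line in shader_text.strip().splitlines():
        tokens = line.replace(",", " ").split()
        if not tokens or tokens[0].startswith("ps_"):
            continue  # skip blank lines and the version token
        op, args = tokens[0], tokens[1:]
        if op == "texcoord":
            # Copy the interpolated coordinate into the temporary form of tN.
            n = args[0][1:]
            out.append(f"MOV_SAT {args[0]}, fragment.texcoord[{n}];")
        elif op == "tex":
            # Sample stage N with the interpolated coordinate.
            n = args[0][1:]
            out.append(f"TEX {args[0]}, fragment.texcoord[{n}], texture[{n}], 2D;")
        elif op == "texreg2rgb":
            # Dependent read: the address comes from the temporary form of
            # the source register, not from fragment.texcoord.
            n = args[0][1:]
            out.append(f"TEX {args[0]}, {args[1]}, texture[{n}], 2D;")
        elif op == "mov":
            out.append(f"MOV {args[0]}, {args[1]};")
        else:
            raise ValueError(f"unsupported opcode in this sketch: {op}")
    return out
```

For the texcoord shader above, this yields a MOV_SAT from fragment.texcoord[0] into t0, followed by a TEX that addresses stage 1 with the temporary t0.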
I've put it in the backend and not the frontend because a hypothetical nvts shader backend would be confused by this.
Completely off-topic of course, but isn't it somewhat questionable to anthropomorphize source code?
Maybe.
I had to make a decision where to put this register separation code. I decided that this strange t0 register is valuable information on hardware/APIs that work the same way (i.e., nvts, but not atifs), so implementing the register separation in the frontend would throw away information that may be important to such a shader implementation.