Hi,
I drafted what I think is a final implementation. This one should pass on all the HW you have. Once that is confirmed, I will clean up the patches for submission.
Wine results are set to emulate the results with:
* +0.5 offset (unconditionally),
* AMD swizzle,
* 3D textures off,
* Fetch4 on for all texldXX instructions (texldp projected).
- Some AMD HW decides not to enable fetch4 on texldl, texldb and texldd. But I think it makes more sense to have it on, since some AMD devices have it on, and Intel does as well.
- 3D textures are a mess: some devices enable fetch4 and some do not, some round the z axis to the nearest texel, and some set it to 0. The best and simplest thing to do, in my opinion, is to consider it totally broken and leave it disabled, like some AMD HW does. It is also quite hard to implement in GL.
In the end I increased the test range to 2, to overcome rounding issues.
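To illustrate why the +0.5 offset mentioned above matters for the expected results, here is a minimal CPU-side sketch (not the Wine code) of which four texels a bilinear-style gather would read. The function name `fetch4_footprint`, the `offset_texels` parameter, the sign of the offset and the clamp addressing are all assumptions made for illustration; the vendor-specific channel swizzle is not modelled.

```python
import math


def fetch4_footprint(u, v, width, height, offset_texels=0.0):
    """Return the four (x, y) texel indices a bilinear-style gather reads.

    u, v are normalized coordinates. offset_texels (hypothetical parameter)
    shifts the sample point, in texel units, before the 2x2 footprint is
    chosen. Clamp-to-edge addressing is assumed.
    """
    # Sample position in texel space, measured relative to texel centers.
    x = u * width - 0.5 + offset_texels
    y = v * height - 0.5 + offset_texels
    x0, y0 = math.floor(x), math.floor(y)

    def clamp(i, size):
        return max(0, min(size - 1, i))

    xs = (clamp(x0, width), clamp(x0 + 1, width))
    ys = (clamp(y0, height), clamp(y0 + 1, height))
    # Plain row-major order; the order in which real hardware packs the
    # four values into the output channels is not represented here.
    return [(xs[0], ys[0]), (xs[1], ys[0]), (xs[0], ys[1]), (xs[1], ys[1])]
```

In this sketch, on a 4x4 texture, `fetch4_footprint(0.6, 0.6, 4, 4)` starts its footprint at texel (1, 1), while the same call with `offset_texels=0.5` starts at (2, 2). At (0.4, 0.4) both variants select the same four texels, so whether the offset is visible depends on where the sample falls inside the texel.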
PS: Thanks Axel for the comment on the R500 bug. That is really helpful and explains why we are seeing the results we have. In the end, it looks like Intel is indeed following the spec, and AMD is the one that introduced the bug. It is funny, though, that AMD never amended the spec to clarify what they considered to be the default fetch4 behavior on their devices.
BR, Daniel
On Mon, 28 Jan 2019 at 22:42, Axel Davy (davyaxel0@gmail.com) wrote:
On 28/01/2019 11:16, Stefan Dösinger wrote:
On 27/01/2019 01:04, Axel Davy wrote:
Hi,
Another piece of information about the 0.5 offset is the following comment in the r600 gallium driver:
/* Gather4 should follow the same rules as bilinear filtering, but the hardware
 * incorrectly forces nearest filtering if the texture format is integer.
 * The only effect it has on Gather4, which always returns 4 texels for
 * bilinear filtering, is that the final coordinates are off by 0.5 of
 * the texel size.
This is interesting, and I guess it explains why I saw this behavior on r500 only when point mag filtering was enabled, but not when linear mag filters were set.
Does that also apply for minification filters?
I'm not able to say, I guess you'd have to ask an AMD dev.
I experimented with 3DMark06, disabling support for D24X8 texturing to force FETCH4.
Do you know of any application that uses fetch4 without having an alternative codepath, or that insists on using it on AMD cards even though an alternative codepath like PCF is supported by the application and used on Nvidia cards? For us, the reason to implement fetch4 is that DF24 implies it, and there are games like CS:GO that insist on using DF24 on AMD cards even though they happily use INTZ on Nvidia cards.
Well I think it makes sense to use DF24 over INTZ if one doesn't need stencil.
As for the apps, apparently some old AMD demos are supposed to use it, but I haven't tested them.
Axel