-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256
sprintf(register_name, "(%s + %u >= 0 &&
%s + %u < %u ? %s_c[%s + %u] : vec4(0.0))",
Did you do any testing on the performance impact of this? This seems like something where it's convenient to sacrifice performance everywhere for a corner case, but later on we wonder why games are unplayably slow. Keep in mind that currently it's pretty difficult to be GPU limited. With the CSMT patches you'll have a higher chance of being GPU limited, but there's no guarantee of course.
dx10+ GPUs don't return 0.0 reliably on Windows here. My Geforce GPUs returned the int constants at c[-32] to c[-48].
I think there are additional options of increasing our odds of behaving like Windows. E.g. with constant buffers we could create a buffer that matches the hardware page size (which we can't query, I know) and fill everything except the 256 d3d9 constants with zeroes. If the out of bounds read falls into the buffer we get a zero. If not, the hardware should detect the invalid read and I'd expect it to return zero due to exactly this d3d requirement.
Also I think d3d9 HW should "just work" in this regard because only 256 registers exist.
Yes, I am aware that there are more games than just The Witcher that are affected by this problem.
On 11 February 2016 at 22:03, Stefan Dösinger stefandoesinger@gmail.com wrote:
sprintf(register_name, "(%s + %u >= 0 &&
%s + %u < %u ? %s_c[%s + %u] : vec4(0.0))",
Did you do any testing on the performance impact of this? This seems like something where it's convenient to sacrifice performance everywhere for a corner case, but later on we wonder why games are unplayably slow. Keep in mind that currently it's pretty difficult to be GPU limited. With the CSMT patches you'll have a higher chance of being GPU limited, but there's no guarantee of course.
dx10+ GPUs don't return 0.0 reliably on Windows here. My Geforce GPUs returned the int constants at c[-32] to c[-48].
Yeah, I'm not all that convinced this is how we want to fix this, or that this is necessarily something that needs fixing on our end. I could perhaps be convinced to have a registry setting along the lines of StrictDrawOrdering though.
2016-02-11 22:03 GMT+01:00 Stefan Dösinger stefandoesinger@gmail.com:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256
sprintf(register_name, "(%s + %u >= 0 &&
%s + %u < %u ? %s_c[%s + %u] : vec4(0.0))",
Did you do any testing on the performance impact of this?
No, no serious testing. I can try to collect some data though.
I think there are additional options of increasing our odds of behaving like Windows. E.g. with constant buffers we could create a buffer that matches the hardware page size (which we can't query, I know) and fill everything except the 256 d3d9 constants with zeroes. If the out of bounds read falls into the buffer we get a zero. If not, the hardware should detect the invalid read and I'd expect it to return zero due to exactly this d3d requirement.
In practice that should work although OpenGL doesn't offer any guarantee. Even ARB_robust_buffer_access_behavior explicitly mentions "Out-of-bounds reads may return values from within the buffer object or zero." I guess we can reevaluate that kind of solution when we'll switch to using UBOs for d3d8/9 constants.
Yes, I am aware that there are more games than just The Witcher that are affected by this problem.
That's ultimately the reason I sent this. Currently we don't quite know which other games are broken (maybe more subtly) by this issue and every time there is some kind of weird bug in a game one has always the suspicion "this isn't the float constant register overrun thing again, right?" in the back of his head...
2016-02-11 22:27 GMT+01:00 Henri Verbeet hverbeet@gmail.com:
Yeah, I'm not all that convinced this is how we want to fix this, or that this is necessarily something that needs fixing on our end. I could perhaps be convinced to have a registry setting along the lines of StrictDrawOrdering though.
Actually a registry setting (RangeCheckFloatConstants?) should work WRT my worry above. I'll send an updated version.