This implementation has some issues, e.g. a generated GLSL shader will produce a compilation error when the immediate constant buffer contains an unsigned integer which represents NaN. Morever, from testing a similar implementation, it seems that some drivers don't handle shaders with huge arrays very well and such shaders perform poorly. My current impression is that we should implement immediate constant buffers using a uniform variable.
Also, a test would be nice.
On Mon, Jun 13, 2016 at 6:41 PM, Guillaume Charifi guillaume.charifi@sfr.fr wrote:
Signed-off-by: Guillaume Charifi guillaume.charifi@sfr.fr
dlls/wined3d/glsl_shader.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/dlls/wined3d/glsl_shader.c b/dlls/wined3d/glsl_shader.c index 55affc1..def224c 100644 --- a/dlls/wined3d/glsl_shader.c +++ b/dlls/wined3d/glsl_shader.c @@ -1886,6 +1886,19 @@ static void shader_generate_glsl_declarations(const struct wined3d_context *cont if (shader->limits->constant_bool > 0 && reg_maps->boolean_constants) shader_addline(buffer, "uniform bool %s_b[%u];\n", prefix, shader->limits->constant_bool);
- if (reg_maps->icb)
- {
shader_addline(buffer, "const vec4 %s_icb[%u] = vec4[%u](\n",
prefix, reg_maps->icb->element_count / 4, reg_maps->icb->element_count / 4);
for (i = 0; i < reg_maps->icb->element_count / 4; i++) {
shader_glsl_append_imm_vec4(buffer, (float *)®_maps->icb->data[4 * i]);
if (i != reg_maps->icb->element_count / 4 - 1)
shader_addline(buffer, ",");
shader_addline(buffer, "\n");
}
shader_addline(buffer, ");\n");
- }
I think it would be better to store the number of vectors instead of the element count in "reg_maps->icb". Not that it matters much, but this block of code would certainly look nicer.
- for (i = 0; i < WINED3D_MAX_CBS; ++i) { if (reg_maps->cb_sizes[i])
@@ -8561,7 +8574,7 @@ static const SHADER_HANDLER shader_glsl_instruction_handler_table[WINED3DSIH_TAB /* WINED3DSIH_DCL_GLOBAL_FLAGS */ shader_glsl_nop, /* WINED3DSIH_DCL_HS_FORK_PHASE_INSTANCE_COUNT */ NULL, /* WINED3DSIH_DCL_HS_MAX_TESSFACTOR */ NULL,
- /* WINED3DSIH_DCL_IMMEDIATE_CONSTANT_BUFFER */ NULL,
- /* WINED3DSIH_DCL_IMMEDIATE_CONSTANT_BUFFER */ shader_glsl_nop, /* WINED3DSIH_DCL_INPUT */ shader_glsl_nop, /* WINED3DSIH_DCL_INPUT_CONTROL_POINT_COUNT */ NULL, /* WINED3DSIH_DCL_INPUT_PRIMITIVE */ shader_glsl_nop,
-- 2.7.4
On 13 June 2016 at 20:32, Józef Kucia joseph.kucia@gmail.com wrote:
This implementation has some issues, e.g. a generated GLSL shader will produce a compilation error when the immediate constant buffer contains an unsigned integer which represents NaN. Morever, from testing a similar implementation, it seems that some drivers don't handle shaders with huge arrays very well and such shaders perform poorly. My current impression is that we should implement immediate constant buffers using a uniform variable.
Which drivers are that? Using uniforms does have some advantages, but it also means the GLSL compiler potentially has less opportunities for optimisation. Perhaps that's ok if we can safely assume any such optimisations would have already been done by the HLSL compiler. There's also the consideration that accessing uniforms can potentially have higher latency than accessing immediate values, although I'm not aware of any hardware that supports arrays of immediate values, as opposed to single (vec4) values.
On Wed, Jun 15, 2016 at 1:35 PM, Henri Verbeet hverbeet@gmail.com wrote:
On 13 June 2016 at 20:32, Józef Kucia joseph.kucia@gmail.com wrote:
This implementation has some issues, e.g. a generated GLSL shader will produce a compilation error when the immediate constant buffer contains an unsigned integer which represents NaN. Morever, from testing a similar implementation, it seems that some drivers don't handle shaders with huge arrays very well and such shaders perform poorly. My current impression is that we should implement immediate constant buffers using a uniform variable.
Which drivers are that? Using uniforms does have some advantages, but it also means the GLSL compiler potentially has less opportunities for optimisation. Perhaps that's ok if we can safely assume any such optimisations would have already been done by the HLSL compiler. There's also the consideration that accessing uniforms can potentially have higher latency than accessing immediate values, although I'm not aware of any hardware that supports arrays of immediate values, as opposed to single (vec4) values.
Switching from arrays to uniforms improved performance noticeably for Gauntlet on OS X and on Linux with Nvidia binary drivers.
Mesa seems to lower const arrays to uniforms anyway. However, we cannot declare these arrays as const when uintBitsToFloat is used (at least in GLSL 1.30), so we are at risk of spilling these arrays to scratch memory. An excerpt from a comment in the Mesa source code: "Lower constant arrays to uniform arrays. Some driver backends (such as i965 and nouveau) don't handle constant arrays gracefully, instead treating them as ordinary writable temporary arrays. Since arrays can be large, this often means spilling them to scratch memory, which usually involves a large number of instructions.".
On 15 June 2016 at 15:22, Józef Kucia joseph.kucia@gmail.com wrote:
Switching from arrays to uniforms improved performance noticeably for Gauntlet on OS X and on Linux with Nvidia binary drivers.
Ok, that's good to know.
Mesa seems to lower const arrays to uniforms anyway. However, we cannot declare these arrays as const when uintBitsToFloat is used (at least in GLSL 1.30), so we are at risk of spilling these arrays to
You don't necessarily need uintBitsToFloat() when you declare the icb though, you can also do that conversion in shader_glsl_get_register_name() or shader_glsl_add_src_param().
On Wed, Jun 15, 2016 at 3:49 PM, Henri Verbeet hverbeet@gmail.com wrote:
You don't necessarily need uintBitsToFloat() when you declare the icb though, you can also do that conversion in shader_glsl_get_register_name() or shader_glsl_add_src_param().
Yeah, we could declare ICB as constant unsigned integer array. I guess we could spend more time to figure out if this approach works well but I am not sure if there is point in doing so. If Mesa lowers const arrays to uniforms we can use uniforms directly as well.
On 15 June 2016 at 15:57, Józef Kucia joseph.kucia@gmail.com wrote:
On Wed, Jun 15, 2016 at 3:49 PM, Henri Verbeet hverbeet@gmail.com wrote:
You don't necessarily need uintBitsToFloat() when you declare the icb though, you can also do that conversion in shader_glsl_get_register_name() or shader_glsl_add_src_param().
Yeah, we could declare ICB as constant unsigned integer array. I guess we could spend more time to figure out if this approach works well but I am not sure if there is point in doing so. If Mesa lowers const arrays to uniforms we can use uniforms directly as well.
Oh sure, I think using uniforms makes sense.