Re: [PATCH 2/3] d3dx9: Support relative addressing in preshader.

24 Mar 2017

2017-03-24 0:39 GMT+01:00 Paul Gofman gofmanp@gmail.com:
...
On 03/24/2017 01:57 AM, Matteo Bruni wrote:
...
2017-03-23 16:06 GMT+01:00 Paul Gofman gofmanp@gmail.com:
...
Signed-off-by: Paul Gofman gofmanp@gmail.com
dlls/d3dx9_36/preshader.c    | 103
++++++++++++++++++++++++++++++++++++-------
  dlls/d3dx9_36/tests/effect.c |  38 ++++++++--------
  2 files changed, 106 insertions(+), 35 deletions(-)

diff --git a/dlls/d3dx9_36/preshader.c b/dlls/d3dx9_36/preshader.c
index 9aedc99..2974ffd 100644
--- a/dlls/d3dx9_36/preshader.c
+++ b/dlls/d3dx9_36/preshader.c
@@ -191,6 +191,8 @@ struct d3dx_pres_operand
      /* offset is component index, not register index, e. g.
         offset for component c3.y is 13 (3 * 4 + 1) */
      unsigned int offset;

enum pres_reg_tables base_reg_table;
unsigned int base_reg_offset;
};

It would be nicer to create a struct for storing together table and
offset and to pass it around to the various functions taking both
arguments, like parse_pres_reg().
It should probably be a separate patch before this one.
Actually I can't guess the case when base_reg_table would be not a floating
constant preshader input. Maybe I just remove this variable and check with
FIXME during parsing?
Sure, that's fine. Still the struct seems useful for parameter
passing, if not right there in struct d3dx_pres_operand.
...
...
...

size = rs->table_sizes[table] *

table_info[table].reg_component_count;

if (offset >= size)
{
   if ((offset & 255) < size)


       offset &= 255;


   else


       offset = offset % size;


}

Is this just to try and avoid the modulo operation in some cases or is
there a deeper reason?
If the former, you could extend that to all the power-of-two sizes by
first checking if that's the case (!(size & size - 1)) and, if so,
and-ing offset with the mask (size - 1). That's assuming the compiler
doesn't do the same optimization already.
Let me know if there is some other reason instead.
Just doing (offset % size) will lead to different results and will break the
tests. The logic in this out of range index checking is a bit mind breaking.
I guess there was a deeper reason indeed ;)
Is that visible in the tests included in this series? Either way,
maybe add a comment so that it's obvious why that thing is there.
...
...
...
static double exec_get_arg(struct d3dx_regstore *rs, const struct
d3dx_pres_operand *opr, unsigned int comp)
  {

if (!regstore_is_val_set_reg(rs, opr->table, (opr->offset + comp) /

table_info[opr->table].reg_component_count))

   WARN("Using uninitialized input, table %u, offset %u.\n",



opr->table, opr->offset + comp);

unsigned int base_offset, index;

if (opr->base_reg_table == PRES_REGTAB_COUNT)
   index = 0;


else
   index = (int)exec_get_reg_value(rs, opr->base_reg_table,



opr->base_reg_offset);
I think it's better with lrint() in place of the cast.
I just want to note that whenever the input is from floating values FX
compiler generates preshader instructions which do the rounding, and skips
that if the base register comes from integers. Do you think we need to do
extra rounding in this case?
Ah, you're right, I somehow overlooked that. Actually I think
doublechecking (e.g. with a FIXME) that the source is really integer
would be useful. If it isn't too much of a mess it would be nice to
avoid the roundtrip to doubles for the index entirely.
...
...
...

if (index >= rs->table_sizes[opr->table] && opr->table ==

PRES_REGTAB_CONST)

{
   unsigned int size_pow2;



   for (size_pow2 = 1; size_pow2 < rs->table_sizes[opr->table];



size_pow2 <<= 1)

       ;


   index &= (size_pow2 - 1);


   if (index >= rs->table_sizes[opr->table])


       return 0.0;


}

I don't think we necessarily have to follow native's madness on buffer
overflows to the letter. Since you did, that's certainly fine though.
I think I finally guessed the principle how native d3dx clamps indexes,

at least finally I got stable match for the tests involving any index
numbers. I thought it could make some sense as we sometimes see apps
depending on out of bounds access behaviour, especially if native
implementation can return zeroes (either stable or by chance like here).
Yeah, sure, it's just that maybe getting all the finer details right
would turn out unnecessary in practice. Props to you for doing all the
testing and getting things correct to this degree :)
...
As I can guess the logic behind it, it has some allocation size on the

floating constant table which is the power of 2, and it first ultimately
puts the index to this allocation range with modulo operation. Then, if the
index is bigger than the actual number of constants, it returns zero. This
logic is a bit different for indexing immediate constants though (which is
used when indirect addressing selects something from vector or matrix),
which never returns ultimate zeroes, probably because the memory allocation
for immediate constants is always exact and is not done in chunks.
That seems like a reasonable deduction. We could do the same in our
implementation and avoid the modulo non-power-of-two sizes but that
can certainly be something left for future improvements (if ever,
there is something to be said about being moderate with memory
consumption).
...
I have no
clear idea though why (offset & 255) < size is a special case, can only
guess than maybe it is related somehow to casting offset to byte at some
point.
Yeah, that is also plausible, FWIW.

    

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2002

2001

Re: [PATCH 2/3] d3dx9: Support relative addressing in preshader.

Signed-off-by: Paul Gofman gofmanp@gmail.com