Signed-off-by: Ziqing Hui <zhui(a)codeweavers.com>
--
v2: d2d1/tests: Test value size checking for custom properties.
d2d1/tests: Add tests for GetPropertyCount().
d2d1/tests: Add tests for system property name, type, value size.
https://gitlab.winehq.org/wine/wine/-/merge_requests/330
Increment pixel pointer for every *pixel*, not every *stride*.
Signed-off-by: Jinoh Kang <jinoh.kang.kr(a)gmail.com>
--
v2: windowscodecs: Fix non-zero alpha detection in ImagingFactory_CreateBitmapFromHICON.
https://gitlab.winehq.org/wine/wine/-/merge_requests/315
The following thread is based partly on, and makes reference to, private
conversation, but for the sake of openness I've elected to post it to
wine-devel.
A long time ago, HLSL_IR_LOAD—then called HLSL_IR_DEREF—was this:

enum hlsl_ir_deref_type
{
    HLSL_IR_DEREF_VAR,
    HLSL_IR_DEREF_ARRAY,
    HLSL_IR_DEREF_RECORD,
};

struct hlsl_deref
{
    enum hlsl_ir_deref_type type;
    union
    {
        struct hlsl_ir_var *var;
        struct
        {
            struct hlsl_ir_node *array;
            struct hlsl_ir_node *index;
        } array;
        struct
        {
            struct hlsl_ir_node *record;
            struct hlsl_struct_field *field;
        } record;
    } v;
};

struct hlsl_ir_deref
{
    struct hlsl_ir_node node;
    struct hlsl_deref src;
};

Now, one problem with this is that it was kind of mean to register
allocation (RA) and liveness analysis. For example, a line of HLSL like

    var.a.b = 2.0;

produced the following IR:

    2: 2.0
    3: deref(var)
    4: @3.a
    5: @4.b = @2

This is annoying because:
* to discover that "var" is written, @5 needs to reach upwards through
a deref chain;
* reaching through the deref chain requires lots of assert() statements;
* @3 implies that "var" is read, which it isn't (and, if we reach
upwards through the deref chain, @4 implies the same thing).
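
To make the "reach upwards" complaint concrete, here is a rough sketch of what finding the written variable looks like with the old structures. The node type enum, the helper names deref_from_node() and deref_get_var(), and the trimmed-down struct bodies are illustrative assumptions, not the code that was in the tree:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the structures quoted above. */
enum hlsl_ir_node_type { HLSL_IR_CONSTANT, HLSL_IR_DEREF };

struct hlsl_struct_field;
struct hlsl_ir_var { const char *name; };
struct hlsl_ir_node { enum hlsl_ir_node_type type; };

enum hlsl_ir_deref_type
{
    HLSL_IR_DEREF_VAR,
    HLSL_IR_DEREF_ARRAY,
    HLSL_IR_DEREF_RECORD,
};

struct hlsl_deref
{
    enum hlsl_ir_deref_type type;
    union
    {
        struct hlsl_ir_var *var;
        struct { struct hlsl_ir_node *array, *index; } array;
        struct { struct hlsl_ir_node *record; struct hlsl_struct_field *field; } record;
    } v;
};

struct hlsl_ir_deref
{
    struct hlsl_ir_node node;
    struct hlsl_deref src;
};

/* Every step up the chain has to assert that the generic node really is a
 * deref before it can be downcast. */
static const struct hlsl_deref *deref_from_node(const struct hlsl_ir_node *node)
{
    assert(node->type == HLSL_IR_DEREF);
    return &((const struct hlsl_ir_deref *)node)->src;
}

/* Walk from the innermost deref (e.g. the store target) up to the variable. */
static struct hlsl_ir_var *deref_get_var(const struct hlsl_deref *deref)
{
    for (;;)
    {
        switch (deref->type)
        {
            case HLSL_IR_DEREF_VAR:
                return deref->v.var;
            case HLSL_IR_DEREF_ARRAY:
                deref = deref_from_node(deref->v.array.array);
                break;
            case HLSL_IR_DEREF_RECORD:
                deref = deref_from_node(deref->v.record.record);
                break;
        }
    }
}
```

Note that nothing in this walk can tell whether the intermediate derefs are "real" reads or just path components of a store, which is the third complaint in the list.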
I proposed that instead of using generic node pointers, we could have
arbitrarily long deref chains encoded in the hlsl_deref structure
itself. [1]
There was some discussion on that, mostly concentrated in that thread
and on IRC. Most of the concern was about being nicer to liveness
analysis and RA.
What ultimately ended up happening is that Matteo proposed numeric
(register) offsets calculated at parse time, which is fundamentally
similar to my idea except that it's a lot simpler to work with.
Interestingly, the problem of multiple register sets was brought up [2]:
From my testing it essentially does, yes, i.e. if you have

    struct { int unused; float f; bool b; } s;

    float4 main(float4 pos : POSITION) : POSITION
    {
        if (s.b)
            return s.f;
        return pos;
    }

then "s" gets allocated to registers b0-b2 and c0-c1, but only b2
and c1 are ever used.
So yeah, it makes things pretty simple. I can see how it would have
been a lot uglier otherwise.
I guess we've finally run into that ugliness now :-(
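
As a rough illustration of why a single parse-time offset was enough for that SM1 case, the sketch below computes one offset per field that is shared by every register set. The struct layout, the enum and the helper name are hypothetical, and it assumes every field occupies exactly one register:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register sets; names are illustrative only. */
enum reg_set { REG_SET_BOOL, REG_SET_INT, REG_SET_FLOAT };

struct field
{
    const char *name;
    enum reg_set set;        /* register file the field is read from */
    unsigned int reg_count;  /* assumed: one register per field */
};

/* One offset for all register sets: every preceding field advances the
 * counter, regardless of which set it belongs to. */
static unsigned int field_shared_offset(const struct field *fields, size_t count, size_t index)
{
    unsigned int offset = 0;
    size_t i;

    assert(index < count);
    for (i = 0; i < index; ++i)
        offset += fields[i].reg_count;
    return offset;
}

/* struct { int unused; float f; bool b; } s; */
static const struct field s_fields[] =
{
    {"unused", REG_SET_INT,   1},  /* never read; its set is an assumption */
    {"f",      REG_SET_FLOAT, 1},
    {"b",      REG_SET_BOOL,  1},
};
```

Under these assumptions "f" lands at offset 1 and "b" at offset 2, matching the c1 and b2 in the quoted test, at the cost of leaving unused gaps in each register set.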
The ultimate conclusions to draw from this historical exercise are:
- what I said about "we used to have derefs handled like that" is mostly
correct, although not quite. We did use to have richer type
information, and we did decide that offsets calculated at parse time
were preferable to that type information, although I thought we at one
point had something like [1] in the tree, which we didn't. Anyway the
decision to use offsets calculated at parse time seems to have been
motivated only by simplicity. To be fair, at the time, it *was* simpler.
- [1] and the later patch that replaced it were mostly motivated by RA.
We will probably end up doing RA after SMxIR translation, but we may
well do RA *before* it too (e.g. tracking SMx instructions with
register numbers instead of def-use chains). A more salient
concern is that I still don't like the idea of having instructions in
the tree that aren't actually translated (or translatable) to SMxIR,
which means that we shouldn't have instructions that yield e.g. structs.
The ugliness that we've run into is: how do we emit IR for the following
variable load?

struct apple
{
    int a;
    struct
    {
        Texture2D b;
        int c;
    } s;
} a;

/* in some expression */
func(a.s);

Unlike the SM1 example above, the register numbers don't match up.
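
A small sketch of the mismatch, under loudly simplified assumptions: every leaf field takes exactly one register in its own set, and real packing and alignment rules are ignored. The enum, struct and function names are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register sets for this example. */
enum reg_set { REG_SET_NUMERIC, REG_SET_TEXTURE, REG_SET_COUNT };

struct field
{
    const char *name;
    enum reg_set set;
    unsigned int reg_count;  /* assumed: one register per leaf field */
};

struct reg_offsets { unsigned int v[REG_SET_COUNT]; };

/* Offset of field "index" in each register set, counting only the
 * preceding fields that live in that set. */
static struct reg_offsets field_offsets(const struct field *fields, size_t count, size_t index)
{
    struct reg_offsets offsets = {{0}};
    size_t i;

    assert(index <= count);
    for (i = 0; i < index; ++i)
        offsets.v[fields[i].set] += fields[i].reg_count;
    return offsets;
}

/* struct apple flattened to its leaves: a.a, a.s.b, a.s.c. */
static const struct field apple_fields[] =
{
    {"a",   REG_SET_NUMERIC, 1},
    {"s.b", REG_SET_TEXTURE, 1},
    {"s.c", REG_SET_NUMERIC, 1},
};
```

With these assumptions "a.s" starts at numeric offset 1 but texture offset 0, so no single number describes where the sub-struct begins.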
Separately, it's kind of ugly that backend-specific details regarding
register size and alignment are leaking into the frontend so much.
Similarly, the amount of code that has to deal with matrix majority is
unfortunate.
The former problem can potentially be solved by embedding multiple
register offsets into hlsl_deref (one per register type). Neither this
nor the latter problem is prohibitive, and I was at one point in favour
of continuing to use register offsets everywhere, but at this point my
feeling has changed, and I think using register offsets is looking more
ugly than the alternatives. I get the impression that Francisco
disagrees, though, which is why we should probably hash this out now.
Nor do I think we should use both register offsets and component offsets
(either in the same node type, or in different node types). That just
makes the IR way more complicated. Rather, I think we should be doing
everything in *just* component offsets until translation from HLSL IR to
SMx IR.
In order to deal with the problem of translating dynamic offsets from
components to registers, I see three options:
(a) emit code that does the translation at runtime, or do some
sophisticated lowering,
(b) use special offsetof and sizeof nodes,
(c) introduce a structured deref type, much like [1]. Francisco was
actually proposing something like this, although with an array instead
of a recursive structure, which strikes me as an improvement.
My guess is that (a) is very hard. I haven't really tried to reason it
out, though.
Given a choice between (b) and (c), I'm more inclined to pick (c). It
makes the IR structure more restrictive, and those restrictions
fundamentally match the structured nature of the language we're working
with, both things I tend to like.
Note that either way we're going to need specialized functions to
resolve deref offsets in one step. I also think that should depend on
the domain—e.g. for copy-prop we'll actually want to do everything in
component counts, but when translating to SMxIR we'll evaluate given the
register alignment constraints of the shader model. In the case of (b)
it's not going to be as simple as running the existing constant folding
pass, because we can't actually fold the sizeof/offsetof constants
(unless we dup the node list, evaluate, and then fold, which seems very
hairy and more work than the alternative).
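
To sketch what "resolve in one step, depending on the domain" might look like for option (c) with an array-style path: the code below walks a path of constant indices and sums sizes either in components or in registers. Everything here is hypothetical (the type description, the domain enum, the function names), and the register rule is a deliberately crude assumption (one whole register per vector), not the actual packing rules of any shader model. It also only handles constant indices; a dynamic index would have to be multiplied by the domain-specific element size at that step.

```c
#include <assert.h>
#include <stddef.h>

enum type_kind { TYPE_VECTOR, TYPE_ARRAY, TYPE_STRUCT };
enum domain { DOMAIN_COMPONENTS, DOMAIN_REGISTERS };

struct type
{
    enum type_kind kind;
    unsigned int count;                /* components, elements or fields */
    const struct type *elem;           /* TYPE_ARRAY only */
    const struct type *const *fields;  /* TYPE_STRUCT only */
};

static unsigned int type_size(const struct type *t, enum domain domain)
{
    unsigned int size = 0, i;

    switch (t->kind)
    {
        case TYPE_VECTOR:
            /* Assumed rule: a vector fills one register. */
            return domain == DOMAIN_COMPONENTS ? t->count : 1;
        case TYPE_ARRAY:
            return t->count * type_size(t->elem, domain);
        case TYPE_STRUCT:
            for (i = 0; i < t->count; ++i)
                size += type_size(t->fields[i], domain);
            return size;
    }
    return 0;
}

/* Resolve a whole path (array index or field index per step) at once. */
static unsigned int deref_offset(const struct type *type, const unsigned int *path,
        size_t path_len, enum domain domain)
{
    unsigned int offset = 0, i;
    size_t p;

    for (p = 0; p < path_len; ++p)
    {
        unsigned int idx = path[p];

        assert(type->kind != TYPE_VECTOR);
        assert(idx < type->count);
        if (type->kind == TYPE_ARRAY)
        {
            offset += idx * type_size(type->elem, domain);
            type = type->elem;
        }
        else
        {
            for (i = 0; i < idx; ++i)
                offset += type_size(type->fields[i], domain);
            type = type->fields[idx];
        }
    }
    return offset;
}

/* struct { float x; struct { float3 v; float w; } arr[4]; } */
static const struct type float1 = {TYPE_VECTOR, 1, NULL, NULL};
static const struct type float3 = {TYPE_VECTOR, 3, NULL, NULL};
static const struct type *const inner_fields[] = {&float3, &float1};
static const struct type inner = {TYPE_STRUCT, 2, NULL, inner_fields};
static const struct type inner_array = {TYPE_ARRAY, 4, &inner, NULL};
static const struct type *const outer_fields[] = {&float1, &inner_array};
static const struct type outer = {TYPE_STRUCT, 2, NULL, outer_fields};
```

For the path {1, 2, 1} (i.e. arr[2].w) this yields 12 in the component domain and 6 in the register domain, from the same deref; the path itself carries no backend-specific numbers.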
I invite thoughts—especially from Matteo, since we discussed this sort
of problem ages ago.
ἔρρωσθε,
Zeb
[1] https://www.winehq.org/pipermail/wine-devel/2020-April/164399.html
[2] https://www.winehq.org/pipermail/wine-devel/2020-April/165493.html