On Sat Oct 14 22:06:36 2023 +0000, Bartosz Kosiorek wrote:
> With the previous implementation we have 2*4 float multiplications inside
> the `GdipTransformMatrixPoints` function:
> ```
> pts[i].X = x * matrix->matrix[0] + y * matrix->matrix[2] + matrix->matrix[4];
> pts[i].Y = x * matrix->matrix[1] + y * matrix->matrix[3] + matrix->matrix[5];
> ```
> Here is the full source code:
> ```
> GpStatus WINGDIPAPI GdipTransformMatrixPoints(GpMatrix *matrix, GpPointF *pts,
>     INT count)
> {
>     REAL x, y;
>     INT i;
>
>     TRACE("(%s, %p, %d)\n", debugstr_matrix(matrix), pts, count);
>
>     if(!matrix || !pts || count <= 0)
>         return InvalidParameter;
>
>     for(i = 0; i < count; i++)
>     {
>         x = pts[i].X;
>         y = pts[i].Y;
>         pts[i].X = x * matrix->matrix[0] + y * matrix->matrix[2] + matrix->matrix[4];
>         pts[i].Y = x * matrix->matrix[1] + y * matrix->matrix[3] + matrix->matrix[5];
>     }
>
>     return Ok;
> }
> ```
> As the points are (0,0) and (1,1), we could replace the invocation of this
> function with a simple assignment (no need to multiply anything):
> ```
> scale_x = graphics->worldtrans.matrix[0] + graphics->worldtrans.matrix[2];
> scale_y = graphics->worldtrans.matrix[1] + graphics->worldtrans.matrix[3];
> ```
> We could also optimize it further and calculate the width via `sqrt` only
> once (I would like to leave that for the next MR).
This seems simple enough that I have no problem changing it without performance numbers. Note that we immediately turn around and do the same thing with graphics->gdi_transform (with some indirection through gdip_transform_points and get_graphics_transform that could be eliminated).
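
For reference, the equivalence being relied on can be checked with a small standalone sketch. It assumes the caller transformed (0,0) and (1,1) and then subtracted the results, which is what the quoted mail implies but does not show. `PointF`, `Matrix`, `transform_points`, `scale_via_transform` and `scale_direct` are hypothetical stand-ins for illustration, not the gdiplus definitions.

```c
#include <stddef.h>

/* Hypothetical, reduced stand-ins for the GDI+ types (not the Wine headers). */
typedef struct { float X, Y; } PointF;
typedef struct { float matrix[6]; } Matrix;

/* Same arithmetic as the loop in the quoted GdipTransformMatrixPoints. */
void transform_points(const Matrix *m, PointF *pts, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
    {
        float x = pts[i].X;
        float y = pts[i].Y;
        pts[i].X = x * m->matrix[0] + y * m->matrix[2] + m->matrix[4];
        pts[i].Y = x * m->matrix[1] + y * m->matrix[3] + m->matrix[5];
    }
}

/* Long form (assumed shape of the old caller): transform (0,0) and (1,1),
 * then take the difference, which cancels the translation terms. */
PointF scale_via_transform(const Matrix *m)
{
    PointF pts[2] = { {0.0f, 0.0f}, {1.0f, 1.0f} };
    PointF d;

    transform_points(m, pts, 2);
    d.X = pts[1].X - pts[0].X;
    d.Y = pts[1].Y - pts[0].Y;
    return d;
}

/* Short form proposed in the quoted mail: no multiplications needed. */
PointF scale_direct(const Matrix *m)
{
    PointF d;

    d.X = m->matrix[0] + m->matrix[2];
    d.Y = m->matrix[1] + m->matrix[3];
    return d;
}
```

The two forms agree up to floating-point rounding: the long form adds the translation terms and subtracts them again, so with large translations its result can differ from the direct sum in the last bits.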
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/3971#note_48812
Goes atop MR 403 and 388. The last four commits belong to this MR.
--
v7: vkd3d-shader: Clone descriptor scan info from struct vkd3d_shader_desc.
vkd3d-shader/dxil: Read CBV descriptors.
vkd3d-shader/dxil: Validate the descriptor list metadata nodes.
vkd3d-shader/spirv: Align constant buffer sizes to 16 bytes.
vkd3d-shader/dxil: Read DXIL compute shader thread group dimensions.
vkd3d-shader/dxil: Read DXIL global flags.
vkd3d-shader: Define more global flags.
vkd3d-shader/dxil: Handle multi-row signature elements.
vkd3d-shader/dxil: Handle signature element additional tag/value pairs.
vkd3d-shader/dxil: Read the DXIL input and output signatures.
vkd3d-shader/dxil: Validate the entry point info.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/401
The PE build uses FlsAlloc(), which for our purposes makes no difference vs TlsAlloc(), and allows the use of a destruction callback.
--
v5: vkd3d: Replace the descriptor object cache with a thread-local implementation.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/384
This is in view of eventually running the vkd3d cross tests in the CI. With this MR d3d12 passes all the tests ([see a test pipeline](https://gitlab.winehq.org/giomasce/vkd3d/-/jobs/32587)), but some shader runner tests are still failing.
I haven't investigated all these issues in detail. Also, WARP is known not to emulate a hardware device faithfully. The idea is that having the CI able to quickly check most (even if not all) of our tests on native is still better than nothing.
--
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/406
Signed-off-by: Nikolay Sivov <nsivov(a)codeweavers.com>
--
v15: vkd3d-shader/tpf: Write out 'switch' statements.
vkd3d-shader/hlsl: Add a pass to validate switch cases blocks.
vkd3d-shader/hlsl: Add a pass to remove unreachable code.
vkd3d-shader/hlsl: Add copy propagation logic for switches.
vkd3d-shader/hlsl: Validate break/continue context.
vkd3d-shader/hlsl: Check for duplicate case statements.
vkd3d-shader/hlsl: Add initial support for parsing 'switch' statements.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/361