I don't think this should be committed so late in the release cycle, but the work others have been doing with point checking functions inspired me to move on this, and I wanted to let folks know what I'm thinking about and get some feedback. Consider it a proof of concept for now.
The problem: GdipGetRegionHRgn is expensive. Regardless of how much of the region we actually care about, we must rasterize the entire plane of that region. Currently, we rely on gdi32 to do this, which doesn't provide us with any way to ask for a smaller area. This has been a real problem for applications that work with very large, non-rectangular regions or paths.
It seems to me that the solution involves writing two functions, one of which is in this MR: a bounding-box calculation for regions, and a rasterizer. Since the result of rasterization will be short-lived, I think it should always be an array of bytes (effectively a bitmap, but one that doesn't require bitwise operations to work with). The rasterization should be limited to a specific integer rectangle, which in the case of GdipIsVisibleRegionPoint can be a single 1x1 pixel. We should then be able to eliminate all internal uses of GdipGetRegionHRgn, except when we have to return an HRGN to the application.
This would also be useful for implementing anti-aliasing, as we could keep the memory usage down by only rasterizing the part we need for each scanline at a time.
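To make the rasterization idea concrete, here is a minimal sketch of what "rasterize only an integer rectangle into an array of bytes" could look like. Everything in it is hypothetical (`int_rect`, `contains_fn`, `rasterize_area`, `point_is_visible`, and the `disc` shape are not gdiplus names), and sampling at pixel centers is an assumption for illustration, not a statement about what native GDI+ does.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical types for illustration; none of these exist in gdiplus. */
struct int_rect { int x, y, width, height; };

/* Returns nonzero if the sample point lies inside the shape. */
typedef int (*contains_fn)(const void *shape, float x, float y);

/* A stand-in shape so the sketch has something to rasterize. */
struct disc { float cx, cy, radius; };

int disc_contains(const void *shape, float x, float y)
{
    const struct disc *d = shape;
    float dx = x - d->cx, dy = y - d->cy;
    return dx * dx + dy * dy <= d->radius * d->radius;
}

/* Rasterize only the requested rectangle: one byte per pixel, 1 = covered.
 * Assumption: each pixel is sampled at its center. Caller frees the result. */
unsigned char *rasterize_area(contains_fn contains, const void *shape,
                              const struct int_rect *area)
{
    unsigned char *mask;
    int x, y;

    assert(contains && area);
    if (area->width <= 0 || area->height <= 0) return NULL;

    mask = calloc((size_t)area->width * (size_t)area->height, 1);
    if (!mask) return NULL;

    for (y = 0; y < area->height; y++)
        for (x = 0; x < area->width; x++)
            mask[(size_t)y * area->width + x] =
                contains(shape, area->x + x + 0.5f, area->y + y + 0.5f) != 0;
    return mask;
}

/* A point test is just a 1x1 rasterization. */
int point_is_visible(contains_fn contains, const void *shape, int x, int y)
{
    struct int_rect pixel = { x, y, 1, 1 };
    unsigned char *mask = rasterize_area(contains, shape, &pixel);
    int visible = mask && mask[0];

    free(mask);
    return visible;
}
```

The scanline case would be the same call with a rectangle of height 1, so the memory needed at any time is one row rather than the whole region.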
My current concerns about what I've written so far:
* Although it shouldn't change the behavior, I doubt it's adequately tested, even including the Mono tests. Ideally, I'd like to test every combine mode in a variety of cases.
* This is too clever for its own good. Unfortunately, with the complexity of handling so many combine modes as well as infinities for each region being combined, I think the only available choices are "too clever for its own good" and "20 different special cases that need to be considered manually". Still, I want to make this as simple and easy to follow as possible, within its requirements.
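
For readers who want a feel for the problem, below is a rough sketch of a conservative bounding-box combination. It is not the code in this MR. It assumes infinities can be encoded as INT_MIN/INT_MAX edges so that min/max handle them without special cases, and it deliberately over-estimates for some modes (for example, Exclude simply returns the left operand's bounds). The names `bounds`, `toy_combine_mode`, and `combine_bounds` are made up for this example.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-ins for the GDI+ combine modes (Replace omitted). */
enum toy_combine_mode { TOY_INTERSECT, TOY_UNION, TOY_XOR, TOY_EXCLUDE, TOY_COMPLEMENT };

/* Half-open box; empty when left >= right or top >= bottom.
 * Assumption: an infinite edge is encoded as INT_MIN or INT_MAX. */
struct bounds { int left, top, right, bottom; };

static const struct bounds BOUNDS_INFINITE = { INT_MIN, INT_MIN, INT_MAX, INT_MAX };

static int imin(int a, int b) { return a < b ? a : b; }
static int imax(int a, int b) { return a > b ? a : b; }

int bounds_is_empty(const struct bounds *b)
{
    return b->left >= b->right || b->top >= b->bottom;
}

static struct bounds bounds_intersect(struct bounds a, struct bounds b)
{
    struct bounds r = { imax(a.left, b.left), imax(a.top, b.top),
                        imin(a.right, b.right), imin(a.bottom, b.bottom) };
    return r;
}

static struct bounds bounds_union(struct bounds a, struct bounds b)
{
    struct bounds r = { imin(a.left, b.left), imin(a.top, b.top),
                        imax(a.right, b.right), imax(a.bottom, b.bottom) };

    /* An empty operand must not drag the union toward its coordinates. */
    if (bounds_is_empty(&a)) return b;
    if (bounds_is_empty(&b)) return a;
    return r;
}

/* Conservative bounds of "a <mode> b": the result always contains the
 * true region, but may be larger than necessary. */
struct bounds combine_bounds(enum toy_combine_mode mode, struct bounds a, struct bounds b)
{
    switch (mode)
    {
    case TOY_INTERSECT:  return bounds_intersect(a, b);
    case TOY_UNION:
    case TOY_XOR:        return bounds_union(a, b);
    case TOY_EXCLUDE:    return a; /* a minus b is a subset of a */
    case TOY_COMPLEMENT: return b; /* b minus a is a subset of b */
    }
    assert(0 && "unknown combine mode");
    return BOUNDS_INFINITE;
}
```

A tighter result for Exclude, Complement, and Xor needs to know whether an operand actually covers the plane rather than merely having infinite bounds, which is where the special cases start to multiply.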
--
v2: gdiplus: Check bounding box in GdipIsVisibleRegionPoint.
https://gitlab.winehq.org/wine/wine/-/merge_requests/4206
First part of the continuation of the implementation of non-constant offset dereferences (a.k.a. relative addressing) for SM4, now that we use vsir registers in tpf.c.
As a quick recap: while parsing HLSL we express derefs as paths, and then lower these paths into a single offset node (which is closer to the bytecode) using the replace_deref_path_with_offset() pass, right before register allocation.
This first part of the series splits this offset node into 2 parts:
- A constant uint, which will be called hlsl_deref.offset_const.
- A non-hlsl_ir_constant offset node, present only when we need relative addressing, which we will end up calling hlsl_deref.offset_rel.
These two fields will be analogous to the ones used in vsir register indexes, vkd3d_shader_register_index.offset and vkd3d_shader_register_index.rel_addr respectively, which is something we need for the second part of this series.
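
As a rough illustration of the split described above, here is a toy sketch. The `toy_*` types and `toy_split_offset()` are hypothetical and much simpler than the real hlsl_ir structures. The constant-only and node-only cases follow the description; peeling a constant off an `expr + constant` node is an extra assumption added for illustration and is not something this series claims to do.

```c
#include <assert.h>
#include <stddef.h>

/* Toy IR, hypothetical and far simpler than hlsl_ir_node. */
enum toy_node_type { TOY_CONSTANT, TOY_ADD, TOY_OTHER };

struct toy_node
{
    enum toy_node_type type;
    unsigned int value;          /* TOY_CONSTANT only */
    struct toy_node *lhs, *rhs;  /* TOY_ADD only */
};

/* Mirrors the two parts described above. */
struct toy_deref
{
    unsigned int offset_const;   /* constant part, in whole registers */
    struct toy_node *offset_rel; /* NULL unless relative addressing is needed */
};

/* Split a single offset node into a constant part and a relative part. */
void toy_split_offset(struct toy_node *offset, struct toy_deref *deref)
{
    assert(deref);
    deref->offset_const = 0;
    deref->offset_rel = NULL;

    if (!offset)
        return;

    if (offset->type == TOY_CONSTANT)
    {
        /* Fully constant: no relative addressing at all. */
        deref->offset_const = offset->value;
        return;
    }

    if (offset->type == TOY_ADD && offset->rhs && offset->rhs->type == TOY_CONSTANT)
    {
        /* Illustrative assumption: "expr + constant" keeps only expr as relative. */
        deref->offset_const = offset->rhs->value;
        deref->offset_rel = offset->lhs;
        return;
    }

    /* Anything else stays entirely in the relative part. */
    deref->offset_rel = offset;
}
```

With this shape, the constant part maps onto an index's offset and the node, when present, onto its rel_addr.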
The following patches are in my [nonconst-offsets-8](https://gitlab.winehq.org/fcasas/vkd3d/-/commits/noncon… branch; if something is not clear in this series, it may be worth skimming through them.
Supersedes !229.
--
v4: vkd3d-shader/tpf: Declare indexable temps.
vkd3d-shader/hlsl: Mark vars that require non-constant dereferences.
vkd3d-shader/hlsl: Rename hlsl_deref.offset to hlsl_deref.rel_offset.
vkd3d-shader/hlsl: Absorb hlsl_ir_constant deref offsets into const_offset.
vkd3d-shader/hlsl: Express deref->offset in whole registers.
vkd3d-shader/hlsl: Split deref-offset into a node and a constant uint.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/396
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=55786
This changes the way we measure minimum timeouts in user32:msg tests.
Currently, it's done by using GetTickCount() to wait approximately 1 second (since GetTickCount() is approximate) and counting the number of times the timer fires in that second. On Wine, a range of 91-109 counts (equivalent to 9.17-10.99 ms per timer interval) is accepted, and on Windows a range of 33-74 counts is additionally accepted (equivalent to 13.51-30.30 ms). (That wide range of Windows timings is obfuscated by presenting it as accepting 43 or 64 counts, each +/- 10, but those ranges actually touch.)
With the patch, we instead wait for the timer to fire 500 times (approximately 5 seconds on Wine, 8 seconds on Windows; I tried 100 times, as before, but the results were less consistent), calculate the delay between consecutive firings using QueryPerformanceCounter, and take the median delay. On the test bot, all my results with this method were within 0.15 ms of the expected values, and most were within 0.05 ms. I'm using a maximum error of 1 ms, which is slightly more forgiving than before.
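
The median calculation can be sketched as follows. This is not the test code from the patch: `median_delay_ms()` and `delay_is_acceptable()` are hypothetical helpers, and they take plain tick values and a frequency (such as those QueryPerformanceCounter and QueryPerformanceFrequency would provide) so that the example stays portable.

```c
#include <assert.h>
#include <stdlib.h>

static int compare_doubles(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* ticks: one timestamp per timer firing; freq: ticks per second.
 * Returns the median interval in milliseconds, or -1.0 on bad input. */
double median_delay_ms(const long long *ticks, size_t count, long long freq)
{
    double *delays, median;
    size_t n, i;

    assert(ticks);
    if (count < 2 || freq <= 0) return -1.0;

    n = count - 1;
    delays = malloc(n * sizeof(*delays));
    if (!delays) return -1.0;

    for (i = 0; i < n; i++)
        delays[i] = (double)(ticks[i + 1] - ticks[i]) * 1000.0 / (double)freq;

    qsort(delays, n, sizeof(*delays), compare_doubles);
    median = (n % 2) ? delays[n / 2] : (delays[n / 2 - 1] + delays[n / 2]) / 2.0;

    free(delays);
    return median;
}

/* True if the measured median is within max_error_ms of the expected value. */
int delay_is_acceptable(double median_ms, double expected_ms, double max_error_ms)
{
    double error = median_ms - expected_ms;

    if (error < 0) error = -error;
    return error <= max_error_ms;
}
```

The point of the median is that a single late firing (a scheduling hiccup on a loaded CI machine, say) shifts the old per-second count but leaves the median untouched.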
I'm hoping this cuts down on the CI failures in these tests, but I have no way of knowing for sure.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/4205