Convert all consecutive calls to d7_DrawPrimitive(TRIANGLE_FAN) into
a single call to d7_DrawPrimitive(TRIANGLE_LIST) with all the vertices.
Note that this *increases* the number of vertices, but bandwidth is much less costly
than multiple calls.
Note also that only a very precise subset of the calls gets buffered, in order to
ensure that the disruption is minimal.
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=33814
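To make the vertex-count trade-off concrete, here is a minimal sketch of expanding one fan into list triangles. It is not the code from the patch: `struct vertex`, `fan_list_vertex_count()` and `append_fan_as_list()` are hypothetical names, and the real vertex layout depends on the FVF. A fan of n vertices becomes 3 * (n - 2) list vertices.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical vertex layout; the real one depends on the FVF in use. */
struct vertex { float x, y, z; };

/* Number of list vertices needed to express a fan of fan_count vertices. */
size_t fan_list_vertex_count(size_t fan_count)
{
    return fan_count < 3 ? 0 : 3 * (fan_count - 2);
}

/* Append the triangles of one fan to a triangle-list buffer.
 * Each triangle is (fan[0], fan[i], fan[i + 1]), which keeps the winding.
 * Returns the number of vertices written. */
size_t append_fan_as_list(const struct vertex *fan, size_t fan_count,
                          struct vertex *list, size_t list_capacity)
{
    size_t i, out = 0;

    assert(fan_list_vertex_count(fan_count) <= list_capacity);
    for (i = 1; i + 1 < fan_count; ++i)
    {
        list[out++] = fan[0];
        list[out++] = fan[i];
        list[out++] = fan[i + 1];
    }
    return out;
}
```

Calling `append_fan_as_list()` once per buffered fan into the same buffer, at increasing offsets, yields the vertex data for a single list draw.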
--
v6: ddraw: avoid magic fvf number
ddraw: using d3d_device7_DrawIndexedPrimitive()
https://gitlab.winehq.org/wine/wine/-/merge_requests/2105
Normalise the incoming vkd3d_shader_instruction IR to the shader model 6 pattern where only one patch constant function is emitted. This allows generation of a single patch constant function in SPIR-V.
--
v5: vkd3d-shader: Introduce an internal sm6 signature structure.
vkd3d-shader/spirv: Move the function declaration from spirv_compiler_begin_shader_phase() to spirv_compiler_enter_shader_phase().
vkd3d-shader/spirv: Remove the hull shader phase array.
vkd3d-shader/spirv: Merge all shader IR fork and join phases into a single phase.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/84
Sorry to let this stagnate for months, but I can confirm that an NVidia GPU which exposes 32768 as a limit for Vulkan will still refuse to create d3d12 textures with a dimension larger than 16384, even with feature level 12.1. (And it still exposes 16384 as a limit for d3d9). Given that, I think the right thing to do is just add a test that the limit is no higher than 16384, and probably also cap this in wined3d rather than ddraw.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/126#note_23674
I took some time to look into this to see if there's extra overhead, and while I think there are some things we could do better in draw_primitive(), there's probably not very much. Depending on what's in frame, the CS spends the majority of its time in draw_primitive(). Probably about 20% of that is spent acquiring the GL context, 40% loading the RTVs, 20% in context_apply_draw_state(); the rest is difficult to measure. This is on a relatively powerful radeonsi machine, with the swap interval hacked to zero; the total frame time is probably about 9 ms in the scenes I'm testing.
I think we can potentially cut draw_primitive() down to 10% of its current overhead if none of the state changes, but when we're doing 5000 draw calls per frame, even that may be too much. We could potentially buffer in wined3d, perhaps making use of EXT_multi_draw_arrays, but as Henri pointed out on IRC, we'd have to do a fair amount of work to invalidate (less than in ddraw itself), and this sort of thing probably doesn't perform well in newer d3d versions on Windows anyway. So buffering in ddraw is probably the right way to go. I'll look at the patch itself anon.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/2105#note_23673
First part of v2 of !27, which aims to:
* Allow allocation of variables of complex types that contain both numerics and objects across multiple register sets (regsets).
* Support the tex2D and tex3D intrinsics, inferring the dimension of generic samplers from usage, writing sampler declarations, and writing sample instructions.
* Support for arrays of resources for both SM1 and SM4 (not to be confused with the resource-arrays of SM 5.1, which can have non-constant indexes).
* Support for resources declared within structs.
* Support for synthetic combined samplers for SM1 and synthetic separated samplers for SM4, considering that they can be arrays or members of structs.
* Imitate the way the native compiler assigns the register indexes of the resources on allocation, which proved to be the most difficult thing.
* Support for object components within complex input parameters.
* Small fixes to corner cases.
This part consists of parsing the `tex2D()` and `tex3D()` intrinsics and beginning to support the allocation of variables across multiple regsets.
The whole series is on my [master6](https://gitlab.winehq.org/fcasas/vkd3d/-/commits/master6) branch.
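As a rough illustration of what "allocation across multiple regsets" means, here is a minimal sketch. The names (`enum regset`, `struct type_info`, `struct var_regs`, `allocate_var()`) are made up for the example and are not the vkd3d structures, and sizes are counted in whole registers, ignoring the real packing and alignment rules. The idea is that a type has one size per register set, and a variable gets one starting index per register set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register sets; the real list in vkd3d may differ. */
enum regset { REGSET_NUMERIC, REGSET_SAMPLERS, REGSET_TEXTURES, REGSET_COUNT };

/* Size of a type in each register set, in whole registers. */
struct type_info { unsigned int reg_size[REGSET_COUNT]; };

/* Where a variable landed in each register set. */
struct var_regs
{
    unsigned int index[REGSET_COUNT];
    int allocated[REGSET_COUNT];
};

/* A struct occupies, in each set, the sum of what its fields occupy there. */
struct type_info struct_type_info(const struct type_info *fields, size_t field_count)
{
    struct type_info total = {{0}};
    size_t i;
    int k;

    assert(fields != NULL || field_count == 0);
    for (i = 0; i < field_count; ++i)
        for (k = 0; k < REGSET_COUNT; ++k)
            total.reg_size[k] += fields[i].reg_size[k];
    return total;
}

/* Give the variable a range in every set it uses; next_free tracks each set separately. */
void allocate_var(const struct type_info *type, unsigned int next_free[REGSET_COUNT],
                  struct var_regs *regs)
{
    int k;

    for (k = 0; k < REGSET_COUNT; ++k)
    {
        regs->allocated[k] = type->reg_size[k] != 0;
        regs->index[k] = next_free[k];
        next_free[k] += type->reg_size[k];
    }
}
```

A struct holding two numeric fields, a sampler and a texture then takes two numeric registers, one sampler register and one texture register, each counted independently.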
--
v7: vkd3d-shader/hlsl: Allocate register reservations in a separate pass.
vkd3d-shader/hlsl: Respect object reservations even if the object is unused.
vkd3d-shader/hlsl: Allocate objects according to register set.
vkd3d-shader/hlsl: Keep an hlsl_reg for each register set in hlsl_ir_var.
vkd3d-shader/hlsl: Store the type's register size for each register set.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/66
On Windows it seems sending to port 0 does nothing and does not error.
Currently, sendmsg() fails with EINVAL.
This works around it by checking whether the destination port is 0 and, if so, skipping the data.
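A minimal sketch of the idea using plain POSIX sockets, not the ntdll code from the patch. `is_port_zero()` and `sendto_skip_port_zero()` are hypothetical helpers, and reporting the full length as "sent" is an assumption about how the skipped send should look to the caller.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Nonzero if addr is an IPv4 or IPv6 address whose port is 0.
 * Port 0 is 0 in both host and network byte order, so no conversion is needed. */
int is_port_zero(const struct sockaddr *addr)
{
    if (!addr) return 0;
    if (addr->sa_family == AF_INET)
        return ((const struct sockaddr_in *)addr)->sin_port == 0;
    if (addr->sa_family == AF_INET6)
        return ((const struct sockaddr_in6 *)addr)->sin6_port == 0;
    return 0;
}

/* Like sendto(), but silently drops data addressed to port 0 instead of
 * letting the host call fail with EINVAL. */
ssize_t sendto_skip_port_zero(int fd, const void *buf, size_t len,
                              const struct sockaddr *addr, socklen_t addr_len)
{
    assert(buf != NULL || len == 0);
    if (is_port_zero(addr))
        return (ssize_t)len; /* assumption: pretend everything was sent */
    return sendto(fd, buf, len, 0, addr, addr_len);
}
```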
--
v16: ntdll: Do not send data to port 0.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2100
Without the changes, the test passes or fails depending on the binutils version being used. binutils 2.35 (used on the Debian 11 testbot) gives 64-bit DLLs an image base below 4 GB, and the test always passes. On a system with binutils 2.37 or later, 64-bit DLLs are based above 4 GB, and the test fails.
The map_view() change fixes native DLLs, and the virtual_map_section() change
fixes builtin DLLs. I wasn't sure how to test a native DLL.
This showed up under Wow64 when running the 64-bit Notepad++ installer
(a 32-bit EXE), which runs 32-bit regsvr32 to register a 64-bit DLL.
regsvr32 calls LoadLibraryExW() with LOAD_LIBRARY_AS_IMAGE_RESOURCE,
which was returning a truncated pointer to the DLL's base address.
Accessing this then crashed.
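To show why a base above 4 GB is fatal for a 32-bit caller, here is a tiny sketch. The helper names are hypothetical and 0x180000000 is only an example of an address above 4 GB; this is not the ntdll code.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if the range [base, base + size) lies entirely below 4 GB,
 * i.e. is fully addressable through a 32-bit pointer. */
int mapping_fits_below_4gb(uint64_t base, uint64_t size)
{
    const uint64_t limit = UINT64_C(1) << 32;

    assert(size != 0);
    return base < limit && size <= limit - base;
}

/* What a 32-bit caller ends up with when a 64-bit base address is
 * returned through a 32-bit pointer: the high bits are simply lost. */
uint32_t truncate_to_32bit(uint64_t base)
{
    return (uint32_t)base;
}
```

For example, a base of 0x180000000 truncates to 0x80000000, which points at unrelated memory, so the mapping has to be placed below the caller's limit in the first place.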
--
v2: ntdll: Respect zero_bits/limit when mapping a PE file.
ntdll/tests: Test NtMapViewOfSection with a 64-bit DLL and zero_bits > 31.
https://gitlab.winehq.org/wine/wine/-/merge_requests/269
On Windows it seems sending to port 0 does nothing and does not error.
Currently, sendmsg() fails with EINVAL.
This works around it by checking whether the destination port is 0 and, if so, skipping the data.
--
v17: ntdll: Do not send data to port 0.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2100
Multiple fork and join phases are eliminated. Signature elements are merged where required, and all input/output parameters are rewritten.
--
v3: vkd3d-shader/spirv: Move the function declaration from spirv_compiler_begin_shader_phase() to spirv_compiler_enter_shader_phase().
vkd3d-shader/spirv: Remove the hull shader phase array.
vkd3d-shader/spirv: Merge all shader IR fork and join phases into a single phase.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/84
This patch addresses an issue in Second Life and potentially other
multi-threaded applications which process WM_KEYDOWN in one thread
and then verify that the key is "still down" with GetAsyncKeyState
from another thread. Wine uses a per-thread key cache, resulting
in inconsistent views of key status. Caches are now invalidated
when an input event is injected by the driver or via SendInput.
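One way to picture the invalidation is a generation counter that every injected event bumps. This is a single-threaded sketch with made-up names (`global_keys`, `input_generation`, `struct thread_key_cache`), not the mechanism the patch actually uses; real code would need shared, properly synchronized state.

```c
#include <assert.h>
#include <string.h>

#define KEY_COUNT 256

/* Stand-in for the authoritative key state. */
unsigned char global_keys[KEY_COUNT];
/* Bumped on every injected input event. */
unsigned int input_generation;

/* Per-thread copy of the key state, tagged with the generation it was read at. */
struct thread_key_cache
{
    unsigned char keys[KEY_COUNT];
    unsigned int generation;
    int valid;
};

/* Called when the driver or SendInput injects a key event. */
void inject_input(unsigned int vkey, int down)
{
    assert(vkey < KEY_COUNT);
    global_keys[vkey] = down ? 0x80 : 0;
    ++input_generation; /* every thread's cache is now stale */
}

/* GetAsyncKeyState-like query that refreshes a stale cache before answering. */
int cached_key_down(struct thread_key_cache *cache, unsigned int vkey)
{
    assert(vkey < KEY_COUNT);
    if (!cache->valid || cache->generation != input_generation)
    {
        memcpy(cache->keys, global_keys, KEY_COUNT);
        cache->generation = input_generation;
        cache->valid = 1;
    }
    return (cache->keys[vkey] & 0x80) != 0;
}
```

With this, a thread that cached "key down" before the key-up event was injected sees the release on its next query instead of a stale value.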
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/2153
While running in Xcode's profiler, I noticed memory leaks when safearrays were used in `For Each` statements.
The following code would leak a safearray allocation:
```
For Each obj In vpDict.Keys : Debug.Print "Key" : Next
```
The following code does not leak:
```
Dim x
x = vpDict.Keys
For Each obj In x : Debug.Print "Key" : Next
```
Fixes: https://bugs.winehq.org/show_bug.cgi?id=54456
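The difference between the two cases is who owns the array being iterated. Below is a minimal sketch of an iterator with an ownership flag. `struct array`, `struct array_iter` and the helpers are hypothetical, not the vbscript implementation: a temporary such as the result of `vpDict.Keys` has to be freed by the iterator, while an array held in a variable must not be.

```c
#include <assert.h>
#include <stdlib.h>

/* Counts outstanding arrays, only so the example can show the leak is gone. */
int live_arrays;

struct array { int *items; size_t count; };

struct array *array_create(size_t count)
{
    struct array *a = malloc(sizeof(*a));

    assert(a != NULL);
    a->items = calloc(count ? count : 1, sizeof(*a->items));
    assert(a->items != NULL);
    a->count = count;
    ++live_arrays;
    return a;
}

void array_destroy(struct array *a)
{
    free(a->items);
    free(a);
    --live_arrays;
}

/* owned != 0 means the iterator holds the only reference to the array. */
struct array_iter { struct array *array; size_t pos; int owned; };

/* Fetch the next element; returns 0 once the array is exhausted. */
int iter_next(struct array_iter *it, int *value)
{
    if (it->pos >= it->array->count) return 0;
    *value = it->array->items[it->pos++];
    return 1;
}

/* Releasing the iterator must also free an array it owns. */
void iter_release(struct array_iter *it)
{
    if (it->owned) array_destroy(it->array);
    it->array = NULL;
}
```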
--
v3: vbscript: Fix memory leak in owned safearray iterator
https://gitlab.winehq.org/wine/wine/-/merge_requests/2141
I ran into a script where someone placed a `:` on a new line after an `Else` but before another statement:
```
Else
: VelCoef = LinearEnvelope(BallPos, VelocityIn, VelocityOut)
if Enabled then aBall.Velx = aBall.Velx*VelCoef
if Enabled then aBall.Vely = aBall.Vely*VelCoef
end if
```
I confirmed that this is allowed and works.
I've updated the grammar, and replaced `NL` with `StSep_opt` as it seems to cover all the bases.
Fixes: https://bugs.winehq.org/show_bug.cgi?id=54234
--
v2: vbscript: fix compile when colon follows Else on new line
https://gitlab.winehq.org/wine/wine/-/merge_requests/2142
wine-gecko does actually support synchronous XMLHttpRequests, even if they are deprecated, but unfortunately it is very broken compared to IE or all the other major browsers (more details [here](https://bugzilla.mozilla.org/show_bug.cgi?id=697151)). Thankfully, it still provides us with the event message loop, which is enough to fix it on mshtml side and act like IE.
For sync XHRs, send() is supposed to block all script code, except for notifications sent to the sync XHR. This is because when send() blocks, no other piece of code in the script should execute other than the sync XHR handlers. Unfortunately, that's not the case in Gecko's broken implementation, which can execute handlers as if they were APCs while it's "blocked" in send().
Note that it doesn't actually block, though, because we still process the message loop (non-event related tasks such as navigation, paints, and other stuff), and that's normal. Gecko doesn't block everything related to script events, only some things on the document and timers, but that's far from enough and not even enough for our purposes since we use our own timer tasks. And not even message events are blocked, even though we *also* use our own message event dispatchers (but it doesn't block Gecko ones either).
So what we have to do is track all the timers and events dispatched that result in script handlers being executed while "blocking" inside of a sync XHR send(), and queue them up so they can be dispatched later, after send() unblocks. But this is easier said than done. There are many corner cases to consider here, for example:
* Events dispatched *synchronously* from within a sync XHR handler or other piece of code called from it.
* Nested sync XHR send() calls (called during handler of another sync XHR).
* Async XHRs having their events dispatched during a blocking sync XHR send() are complicated. `readyStateChange` for example needs to have the async XHR's readyState at the time it was sent, **not** at the time it was actually dispatched, since we're going to delay it and dispatch it later. It's similar with other XHR states, such as responseText (which can be partial).
* Aborts of async XHRs during such handlers.
These patches hopefully should address all the issues and, on a high level, work like this:
* Track the `readyState` and `responseText` length manually, so we can override / force them to specific values.
* "Snapshot" the async XHR at the time we encounter an event for it, and queue this event with such information. When later dispatching this event (after being deferred), we temporarily set the state of the async XHR to the snapshot.
* To deal with nested event dispatches, keep track of a "dispatch depth" (every time we dispatch an event we increase the depth, and when it returns we decrease it).
* For sync XHRs, we note the depth when send() is called, and defer events at that depth, since they're dispatched during the "blocked" message loop and need to be delayed (except for sync XHR events, which is a special case since it must be dispatched synchronously). When it returns, we restore the previous blocking depth.
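The depth bookkeeping above can be sketched roughly like this. It is a simplified model with hypothetical names (`struct dispatcher`, `dispatch_event()`, `sync_send()`), not the mshtml code: it leaves out the async XHR snapshots entirely, and it assumes deferred events are flushed only once the outermost send() has returned.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_EVENTS 16

struct dispatcher
{
    int depth;                /* current dispatch nesting depth */
    int blocking_depth;       /* depth noted by a blocked send(), or -1 */
    int deferred[MAX_EVENTS]; /* ids of events waiting for send() to unblock */
    size_t deferred_count;
    void (*handler)(int event_id);
};

/* Dispatch an event, or defer it if it arrives at the depth where a sync
 * send() is blocked and it does not belong to the sync XHR itself. */
void dispatch_event(struct dispatcher *d, int event_id, int is_sync_xhr_event)
{
    if (d->blocking_depth == d->depth && !is_sync_xhr_event)
    {
        assert(d->deferred_count < MAX_EVENTS);
        d->deferred[d->deferred_count++] = event_id;
        return;
    }
    ++d->depth;
    d->handler(event_id);
    --d->depth;
}

/* pump stands in for the message loop that runs while send() is "blocked". */
void sync_send(struct dispatcher *d, void (*pump)(struct dispatcher *))
{
    int prev = d->blocking_depth, pending[MAX_EVENTS];
    size_t i, count;

    d->blocking_depth = d->depth;
    pump(d);
    d->blocking_depth = prev; /* restore the previous blocking depth */
    if (prev != -1) return;   /* an outer send() is still blocked */

    count = d->deferred_count;
    memcpy(pending, d->deferred, count * sizeof(*pending));
    d->deferred_count = 0;
    for (i = 0; i < count; ++i) dispatch_event(d, pending[i], 0);
}

/* Demo plumbing: a handler that records the order in which events ran. */
int ran[MAX_EVENTS];
size_t ran_count;
void record_handler(int event_id) { ran[ran_count++] = event_id; }

void example_pump(struct dispatcher *d)
{
    dispatch_event(d, 1, 0); /* timer: deferred */
    dispatch_event(d, 2, 1); /* sync XHR event: runs immediately */
    dispatch_event(d, 3, 0); /* async XHR event: deferred */
}
```

Running `sync_send()` with `example_pump` executes the handlers in the order 2, 1, 3: the sync XHR's own event goes through while send() is blocked, and the other two run only after it returns.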
--
v2: mshtml: Send all readystatechange events for synchronous XHRs in IE9
mshtml: Implement synchronous XMLHttpRequest.
mshtml: Track responseText's length in XHRs and report it manually.
mshtml: Track readyState in XHRs and report it manually.
mshtml: Pass optional args to XMLHttpRequest.open() correctly.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2098