[PATCH v7 0/1] MR10541: vbscript: Fast-path assign_value for simple non-refcounted types.
Skip VariantCopyInd and VariantClear when both source and destination are simple scalar types (VT_I2, VT_I4, VT_R8, VT_BOOL, VT_EMPTY, etc.) that require no memory management. A direct struct assignment suffices for these types.

Benchmarks (10M iterations, measured in isolation on top of master):

- Do-While loop: 824 -> 472 ms (1.75×)
- For + assignment: 324 -> 265 ms (1.22×)
- Dict.Items iteration: 683 -> 324 ms (2.1×)
- If-Else branching: 726 -> 644 ms (1.13×)

Every variable assignment in the interpreter benefits. The fast path eliminates two cross-DLL calls to oleaut32 (VariantCopyInd + VariantClear) for each assignment of simple scalar types (VT_I2, VT_I4, VT_R8, VT_BOOL, etc.), replacing them with a single struct copy.

-- 
v7: vbscript: Fast-path assign_value for simple non-refcounted types.

https://gitlab.winehq.org/wine/wine/-/merge_requests/10541
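The speedup factors quoted in the benchmark list can be re-derived from the raw timings. The helper below is a minimal sketch for checking that arithmetic; the function names are hypothetical and are not part of the patch.

```c
/* Speedup factor: run time before the change divided by run time after it. */
static double speedup(double before_ms, double after_ms)
{
    return before_ms / after_ms;
}

/* Average time saved per loop pass, in nanoseconds, for a benchmark that
 * runs `iterations` passes (1 ms = 1e6 ns). */
static double saved_ns_per_iteration(double before_ms, double after_ms,
                                     double iterations)
{
    return (before_ms - after_ms) * 1e6 / iterations;
}
```

For the Do-While case, `speedup(824, 472)` gives roughly 1.75, and `saved_ns_per_iteration(824, 472, 1e7)` gives roughly 35 ns per pass, which puts the absolute size of the saving in perspective.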
From: Francis De Brabandere <francisdb@gmail.com>

Skip VariantCopyInd and VariantClear when both source and destination are simple scalar types (VT_I2, VT_I4, VT_R8, VT_BOOL, VT_EMPTY, etc.) that require no memory management. A direct struct assignment suffices for these types.

This avoids two cross-DLL calls into oleaut32 for every integer or float variable assignment.

The fast path covers VT_EMPTY through VT_UINT (19 types) excluding VT_BSTR, VT_DISPATCH, VT_UNKNOWN, VT_VARIANT, and VT_RECORD which need refcounting or deep copy. Types with VT_BYREF or VT_ARRAY flags always take the slow path.
---
 dlls/vbscript/interp.c | 46 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/dlls/vbscript/interp.c b/dlls/vbscript/interp.c
index 76ea15a6e2e..281b4201e78 100644
--- a/dlls/vbscript/interp.c
+++ b/dlls/vbscript/interp.c
@@ -899,11 +899,57 @@ static HRESULT interp_ident(exec_ctx_t *ctx)
     return stack_push(ctx, &v);
 }
 
+/* Returns TRUE for scalar value types that need no memory management.
+ * These can be copied with a plain struct assignment and cleared by
+ * simply overwriting - no VariantCopyInd, VariantClear, or Release
+ * calls required. Excludes VT_BSTR, VT_DISPATCH, VT_UNKNOWN,
+ * VT_RECORD (ref-counted or allocated) and any VT_BYREF / VT_ARRAY
+ * combinations (indirect). */
+static inline BOOL is_simple_variant(const VARIANT *v)
+{
+    VARTYPE vt = V_VT(v);
+
+    if (vt & ~VT_TYPEMASK)
+        return FALSE;
+
+    switch (vt)
+    {
+    case VT_EMPTY:
+    case VT_NULL:
+    case VT_I2:
+    case VT_I4:
+    case VT_R4:
+    case VT_R8:
+    case VT_CY:
+    case VT_DATE:
+    case VT_ERROR:
+    case VT_BOOL:
+    case VT_DECIMAL:
+    case VT_I1:
+    case VT_UI1:
+    case VT_UI2:
+    case VT_UI4:
+    case VT_I8:
+    case VT_UI8:
+    case VT_INT:
+    case VT_UINT:
+        return TRUE;
+    default:
+        return FALSE;
+    }
+}
+
 static HRESULT assign_value(exec_ctx_t *ctx, VARIANT *dst, VARIANT *src, WORD flags)
 {
     VARIANT value;
     HRESULT hres;
 
+    if (is_simple_variant(src) && is_simple_variant(dst))
+    {
+        *dst = *src;
+        return S_OK;
+    }
+
     V_VT(&value) = VT_EMPTY;
     hres = VariantCopyInd(&value, src);
     if(FAILED(hres))
-- 
GitLab

https://gitlab.winehq.org/wine/wine/-/merge_requests/10541
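The classification in the patch can be exercised outside of Wine with a small stand-alone sketch. Everything prefixed `sk_`/`SK_` below is a hypothetical stand-in, not Wine or oleaut32 code. The sketch assumes the usual VARENUM numbering from the Windows headers and rewrites the switch as a range test with explicit exclusions, which is a reformulation rather than the patch's own code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for VARENUM constants (assumed to match the Windows headers). */
enum {
    SK_VT_EMPTY = 0, SK_VT_I4 = 3, SK_VT_R8 = 5, SK_VT_BSTR = 8,
    SK_VT_DISPATCH = 9, SK_VT_VARIANT = 12, SK_VT_UNKNOWN = 13,
    SK_VT_UINT = 23, SK_VT_RECORD = 36,
    SK_VT_TYPEMASK = 0x0fff, SK_VT_ARRAY = 0x2000, SK_VT_BYREF = 0x4000
};

/* Same decision as is_simple_variant(), taken on a bare type tag. */
static bool sk_is_simple_vt(uint16_t vt)
{
    if (vt & ~SK_VT_TYPEMASK)          /* VT_BYREF / VT_ARRAY etc.: slow path */
        return false;
    if (vt > SK_VT_UINT || vt == 15)   /* past VT_UINT, or the unassigned tag 15 */
        return false;
    switch (vt) {
    case SK_VT_BSTR:                   /* allocated string */
    case SK_VT_DISPATCH:               /* ref-counted interface */
    case SK_VT_VARIANT:                /* excluded per the commit message */
    case SK_VT_UNKNOWN:                /* ref-counted interface */
        return false;
    default:
        return true;
    }
}

/* The fast path needs BOTH sides to be simple: a simple source may be
 * copied bitwise, and a simple destination may be overwritten without
 * releasing anything it previously held. */
static bool sk_takes_fast_path(uint16_t dst_vt, uint16_t src_vt)
{
    return sk_is_simple_vt(src_vt) && sk_is_simple_vt(dst_vt);
}

/* Counts the base type tags accepted by the check. */
static int sk_count_simple_types(void)
{
    int count = 0;
    for (unsigned vt = 0; vt <= SK_VT_TYPEMASK; vt++)
        if (sk_is_simple_vt((uint16_t)vt))
            count++;
    return count;
}
```

Under these assumptions the count comes out at 19, matching the figure in the commit message, and a destination that currently holds a string (`SK_VT_BSTR`) is routed to the slow path even when the source is a plain integer.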
Jacek Caban (@jacek) commented about dlls/vbscript/interp.c:
> +        return TRUE;
> +    default:
> +        return FALSE;
> +    }
> +}
> +
>  static HRESULT assign_value(exec_ctx_t *ctx, VARIANT *dst, VARIANT *src, WORD flags)
>  {
>      VARIANT value;
>      HRESULT hres;
>  
> +    if (is_simple_variant(src) && is_simple_variant(dst))
> +    {
> +        *dst = *src;
> +        return S_OK;
> +    }

It is a bit of a tradeoff, since for "non-simple" types we would make things slower. I'm not sure what makes the difference so noticeable. The type validation performed by `VariantCopy` and `VariantClear` should have overhead similar to your helper.

I see that, with the current code, we have an extra `VariantClear` call on the `value` variable (inside `VariantCopyInd`), which we need only for `VT_DISPATCH` handling. Would it help if we tried harder to do that only for `VT_DISPATCH`, and otherwise used `VariantCopyInd` directly into the destination?

-- 
https://gitlab.winehq.org/wine/wine/-/merge_requests/10541#note_139234
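The alternative raised in this comment can be illustrated with a toy model. All `toy_`/`TOY_` names are hypothetical stand-ins, not the oleaut32 API or Wine code. The model assumes, as the comment describes, that the copy routine clears its destination before copying, and it ignores `VT_BYREF` indirection and the actual `VT_DISPATCH` handling entirely. It only counts how many clear operations each shape of `assign_value` performs.

```c
/* Toy type tags (hypothetical; numbering mirrors VARENUM for readability). */
enum { TOY_VT_EMPTY = 0, TOY_VT_I4 = 3, TOY_VT_DISPATCH = 9 };

typedef struct {
    unsigned short vt;
    long payload;
} toy_variant;

static int toy_clear_calls;   /* number of clear operations performed */

/* Stand-in for a clear routine: resets the variant to empty. */
static void toy_clear(toy_variant *v)
{
    toy_clear_calls++;
    v->vt = TOY_VT_EMPTY;
    v->payload = 0;
}

/* Stand-in for a copy routine that clears its destination first. */
static void toy_copy(toy_variant *dst, const toy_variant *src)
{
    toy_clear(dst);
    *dst = *src;
}

/* Shape of the existing code: copy into a temporary, then clear the
 * destination and move the temporary in. Two clears per assignment. */
static void toy_assign_via_temp(toy_variant *dst, const toy_variant *src)
{
    toy_variant tmp = { TOY_VT_EMPTY, 0 };

    toy_copy(&tmp, src);   /* clears the (already empty) temporary */
    /* VT_DISPATCH handling mentioned in the review would go here. */
    toy_clear(dst);
    *dst = tmp;
}

/* Shape of the suggestion: keep the temporary only for dispatch values,
 * otherwise copy straight into the destination. One clear per assignment. */
static void toy_assign_direct(toy_variant *dst, const toy_variant *src)
{
    if (src->vt == TOY_VT_DISPATCH) {
        toy_assign_via_temp(dst, src);
        return;
    }
    toy_copy(dst, src);
}
```

In this model an integer assignment costs two clears through the temporary and one when copied directly, which is the saving the comment asks about. Whether that carries over to the real interpreter is exactly the open question in the review.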
On Fri May 8 16:19:45 2026 +0000, Jacek Caban wrote:
> It is a bit of a tradeoff, since for "non-simple" types we would make things slower. I'm not sure what makes the difference so noticeable. The type validation performed by `VariantCopy` and `VariantClear` should have overhead similar to your helper.
>
> I see that, with the current code, we have an extra `VariantClear` call on the `value` variable (inside `VariantCopyInd`), which we need only for `VT_DISPATCH` handling. Would it help if we tried harder to do that only for `VT_DISPATCH`, and otherwise used `VariantCopyInd` directly into the destination?

I'm going to put this on draft; these are synthetic hot-loop tests. The main perf issue to look at first is !10528. Once that is merged, I will rerun my benchmarks and real-world test suite with this patch and update this MR.

https://github.com/francisdb/vpinball-test/tree/main/examples

-- 
https://gitlab.winehq.org/wine/wine/-/merge_requests/10541#note_139269
Closing, no real-world perf improvement.

-- 
https://gitlab.winehq.org/wine/wine/-/merge_requests/10541#note_140280
This merge request was closed by Francis De Brabandere.

-- 
https://gitlab.winehq.org/wine/wine/-/merge_requests/10541
participants (3)
- Francis De Brabandere
- Francis De Brabandere (@francisdb)
- Jacek Caban (@jacek)