[PATCH 0/1] MR11196: vbscript: Store global variables and functions in a single tagged tree.
ScriptDisp kept separate var_tree and func_tree even though a global name resolves to a single member. Replace them with one tree of scriptdisp_entry_t nodes that tag their payload (a variable or a function), so the global identifier path does one lookup instead of two and a name maps to exactly one member. The tagged payload leaves room for further kinds without adding more trees. This makes re-declaration match native: a Sub or Function re-declares a prior Dim, Sub or Function of the same name, while a Dim keeps an existing variable or function (a shadowed const is still replaced by a fresh variable). -- https://gitlab.winehq.org/wine/wine/-/merge_requests/11196
From: Francis De Brabandere <francisdb@gmail.com> ScriptDisp kept separate var_tree and func_tree even though a global name resolves to a single member. Replace them with one tree of scriptdisp_entry_t nodes that tag their payload (a variable or a function), so the global identifier path does one lookup instead of two and a name maps to exactly one member. The tagged payload leaves room for further kinds without adding more trees. This makes re-declaration match native: a Sub or Function re-declares a prior Dim, Sub or Function of the same name, while a Dim keeps an existing variable or function (a shadowed const is still replaced by a fresh variable). --- dlls/vbscript/interp.c | 3 +- dlls/vbscript/vbdisp.c | 97 +++++++++++++++++++++++++++------------- dlls/vbscript/vbscript.c | 42 ++++++++--------- dlls/vbscript/vbscript.h | 28 ++++++++++-- 4 files changed, 110 insertions(+), 60 deletions(-) diff --git a/dlls/vbscript/interp.c b/dlls/vbscript/interp.c index c57b28ab90e..6d107eb2419 100644 --- a/dlls/vbscript/interp.c +++ b/dlls/vbscript/interp.c @@ -296,7 +296,8 @@ static HRESULT add_dynamic_var(exec_ctx_t *ctx, const WCHAR *name, } { new_var->index = script_obj->global_vars_cnt; - rb_put(&script_obj->var_tree, new_var->name, &new_var->entry); + if(!script_disp_add_var(script_obj, new_var)) + return E_OUTOFMEMORY; } script_obj->global_vars[script_obj->global_vars_cnt++] = new_var; }else { diff --git a/dlls/vbscript/vbdisp.c b/dlls/vbscript/vbdisp.c index d953c19bb30..790dad2604d 100644 --- a/dlls/vbscript/vbdisp.c +++ b/dlls/vbscript/vbdisp.c @@ -29,36 +29,77 @@ static const GUID GUID_VBScriptTypeInfo = {0xc59c6b12,0xf6c1,0x11cf,{0x88,0x35,0 #define DISPID_FUNCTION_MASK 0x20000000 #define FDEX_VERSION_MASK 0xf0000000 -static int func_name_cmp(const void *key, const struct rb_entry *entry) +static int scriptdisp_member_cmp(const void *key, const struct rb_entry *entry) { - function_t *func = RB_ENTRY_VALUE(entry, function_t, entry); - return vbs_wcsicmp(key, func->name); + scriptdisp_entry_t *member = RB_ENTRY_VALUE(entry, scriptdisp_entry_t, entry); + return vbs_wcsicmp(key, member->name); } -static int var_name_cmp(const void *key, const struct rb_entry *entry) +scriptdisp_entry_t *script_disp_find_member(ScriptDisp *disp, const WCHAR *name) { - dynamic_var_t *var = RB_ENTRY_VALUE(entry, dynamic_var_t, entry); - return vbs_wcsicmp(key, var->name); + struct rb_entry *entry = rb_get(&disp->members, name); + return entry ? RB_ENTRY_VALUE(entry, scriptdisp_entry_t, entry) : NULL; +} + +/* Insert name into the member tree, or return the existing entry for it; the + caller fills in the type and payload. The name string must stay valid for + the lifetime of the entry. */ +static scriptdisp_entry_t *script_disp_add_member(ScriptDisp *disp, const WCHAR *name) +{ + scriptdisp_entry_t *member = script_disp_find_member(disp, name); + + if (member) { + member->name = name; + return member; + } + + if (!(member = heap_pool_alloc(&disp->heap, sizeof(*member)))) + return NULL; + member->name = name; + rb_put(&disp->members, name, &member->entry); + return member; +} + +scriptdisp_entry_t *script_disp_add_var(ScriptDisp *disp, dynamic_var_t *var) +{ + scriptdisp_entry_t *member = script_disp_add_member(disp, var->name); + + if (member) { + member->type = SCRIPTDISP_VAR; + member->u.var = var; + } + return member; +} + +scriptdisp_entry_t *script_disp_add_func(ScriptDisp *disp, function_t *func) +{ + scriptdisp_entry_t *member = script_disp_add_member(disp, func->name); + + if (member) { + member->type = SCRIPTDISP_FUNC; + member->u.func = func; + } + return member; } function_t *script_disp_find_func(ScriptDisp *disp, const WCHAR *name) { - struct rb_entry *entry = rb_get(&disp->func_tree, name); + scriptdisp_entry_t *member = script_disp_find_member(disp, name); - if (!entry) + if (!member || member->type != SCRIPTDISP_FUNC) return NULL; - return RB_ENTRY_VALUE(entry, function_t, entry); + return member->u.func; } dynamic_var_t *script_disp_find_var(ScriptDisp *disp, const WCHAR *name) { - struct rb_entry *entry = rb_get(&disp->var_tree, name); + scriptdisp_entry_t *member = script_disp_find_member(disp, name); - if (!entry) + if (!member || member->type != SCRIPTDISP_VAR) return NULL; - return RB_ENTRY_VALUE(entry, dynamic_var_t, entry); + return member->u.var; } static inline BOOL is_func_id(vbdisp_t *This, DISPID id) @@ -1209,10 +1250,10 @@ static HRESULT WINAPI ScriptTypeInfo_GetIDsOfNames(ITypeInfo *iface, LPOLESTR *r } { - struct rb_entry *entry = rb_get(&This->disp->var_tree, name); - if (entry) + dynamic_var_t *var = script_disp_find_var(This->disp, name); + if (var) { - pMemId[0] = RB_ENTRY_VALUE(entry, dynamic_var_t, entry)->index + 1; + pMemId[0] = var->index + 1; return S_OK; } } @@ -1513,10 +1554,9 @@ static HRESULT WINAPI ScriptTypeComp_Bind(ITypeComp *iface, LPOLESTR szName, ULO } { - struct rb_entry *entry = rb_get(&This->disp->var_tree, szName); - if (entry) + dynamic_var_t *var = script_disp_find_var(This->disp, szName); + if (var) { - dynamic_var_t *var = RB_ENTRY_VALUE(entry, dynamic_var_t, entry); if (!(flags & INVOKE_PROPERTYGET)) return TYPE_E_TYPEMISMATCH; hr = ITypeInfo_GetVarDesc(&This->ITypeInfo_iface, var->index, &pBindPtr->lpvardesc); @@ -1736,23 +1776,19 @@ static HRESULT WINAPI ScriptDisp_Invoke(IDispatchEx *iface, DISPID dispIdMember, static HRESULT WINAPI ScriptDisp_GetDispID(IDispatchEx *iface, BSTR bstrName, DWORD grfdex, DISPID *pid) { ScriptDisp *This = ScriptDisp_from_IDispatchEx(iface); - struct rb_entry *entry; + scriptdisp_entry_t *member; TRACE("(%p)->(%s %lx %p)\n", This, debugstr_w(bstrName), grfdex, pid); if(!This->ctx) return E_UNEXPECTED; - entry = rb_get(&This->var_tree, bstrName); - if(entry) { - *pid = RB_ENTRY_VALUE(entry, dynamic_var_t, entry)->index + 1; - return S_OK; - } - - entry = rb_get(&This->func_tree, bstrName); - if(entry) { - function_t *func = RB_ENTRY_VALUE(entry, function_t, entry); - *pid = func->index + 1 + DISPID_FUNCTION_MASK; + member = script_disp_find_member(This, bstrName); + if(member) { + if(member->type == SCRIPTDISP_FUNC) + *pid = member->u.func->index + 1 + DISPID_FUNCTION_MASK; + else + *pid = member->u.var->index + 1; return S_OK; } @@ -1892,8 +1928,7 @@ HRESULT create_script_disp(script_ctx_t *ctx, ScriptDisp **ret) script_disp->ref = 1; script_disp->ctx = ctx; heap_pool_init(&script_disp->heap); - rb_init(&script_disp->func_tree, func_name_cmp); - rb_init(&script_disp->var_tree, var_name_cmp); + rb_init(&script_disp->members, scriptdisp_member_cmp); script_disp->rnd = 0x50000; *ret = script_disp; diff --git a/dlls/vbscript/vbscript.c b/dlls/vbscript/vbscript.c index 43f8dd2fdd5..9f2aa30413c 100644 --- a/dlls/vbscript/vbscript.c +++ b/dlls/vbscript/vbscript.c @@ -140,12 +140,13 @@ HRESULT exec_global_code(script_ctx_t *ctx, vbscode_t *code, VARIANT *res, BOOL for (i = 0; i < code->main_code.var_cnt; i++) { - dynamic_var_t *existing = script_disp_find_var(obj, code->main_code.vars[i].name); + scriptdisp_entry_t *existing = script_disp_find_member(obj, code->main_code.vars[i].name); - /* A Dim may shadow a const from a previous compile unit: it creates a - fresh variable that name lookups resolve to from now on, while the - defining compile unit keeps using the inlined const value. */ - if (existing && !existing->is_const) + /* Dim is permissive: it keeps an existing variable or function of the + same name. A const from a previous compile unit is the exception: it + is shadowed by a fresh variable that name lookups resolve to from now + on, while the defining compile unit keeps using the inlined value. */ + if (existing && (existing->type != SCRIPTDISP_VAR || !existing->u.var->is_const)) continue; if (!(var = heap_pool_alloc(&obj->heap, sizeof(*var)))) @@ -158,33 +159,26 @@ HRESULT exec_global_code(script_ctx_t *ctx, vbscode_t *code, VARIANT *res, BOOL var->is_const = FALSE; var->array = NULL; var->index = obj->global_vars_cnt; - if (existing) - rb_remove(&obj->var_tree, &existing->entry); - rb_put(&obj->var_tree, var->name, &var->entry); + if (!script_disp_add_var(obj, var)) + return E_OUTOFMEMORY; obj->global_vars[obj->global_vars_cnt++] = var; } for (func_iter = code->funcs; func_iter; func_iter = func_iter->next) { - struct rb_entry *entry = rb_get(&obj->func_tree, func_iter->name); + scriptdisp_entry_t *existing = script_disp_find_member(obj, func_iter->name); - if (entry) - { - function_t *old_func = RB_ENTRY_VALUE(entry, function_t, entry); - size_t old_index = old_func->index; - /* global function already exists, replace it */ - rb_remove(&obj->func_tree, &old_func->entry); - func_iter->index = old_index; - obj->global_funcs[old_index] = func_iter; - rb_put(&obj->func_tree, func_iter->name, &func_iter->entry); - } + if (existing && existing->type == SCRIPTDISP_FUNC) + /* global function already exists, replace it in its slot */ + func_iter->index = existing->u.func->index; else - { - func_iter->index = obj->global_funcs_cnt; - obj->global_funcs[obj->global_funcs_cnt++] = func_iter; - rb_put(&obj->func_tree, func_iter->name, &func_iter->entry); - } + /* a new function, or one re-declaring a variable of the same name */ + func_iter->index = obj->global_funcs_cnt++; + + obj->global_funcs[func_iter->index] = func_iter; + if (!script_disp_add_func(obj, func_iter)) + return E_OUTOFMEMORY; } if (code->classes) diff --git a/dlls/vbscript/vbscript.h b/dlls/vbscript/vbscript.h index 9450851e4a4..9522a10503f 100644 --- a/dlls/vbscript/vbscript.h +++ b/dlls/vbscript/vbscript.h @@ -119,10 +119,27 @@ typedef struct _dynamic_var_t { const WCHAR *name; BOOL is_const; SAFEARRAY *array; - struct rb_entry entry; size_t index; } dynamic_var_t; +typedef enum { + SCRIPTDISP_VAR, + SCRIPTDISP_FUNC, +} scriptdisp_entry_type_t; + +/* A single named member of a ScriptDisp. A global name resolves to exactly + one member, so variables and functions share one tree; the payload selected + by type can grow to cover further kinds (e.g. cached host properties). */ +typedef struct { + struct rb_entry entry; + const WCHAR *name; + scriptdisp_entry_type_t type; + union { + dynamic_var_t *var; + function_t *func; + } u; +} scriptdisp_entry_t; + typedef struct { IDispatchEx IDispatchEx_iface; LONG ref; @@ -130,12 +147,13 @@ typedef struct { dynamic_var_t **global_vars; size_t global_vars_cnt; size_t global_vars_size; - struct rb_tree var_tree; function_t **global_funcs; size_t global_funcs_cnt; size_t global_funcs_size; - struct rb_tree func_tree; + + /* maps a global name to its scriptdisp_entry_t */ + struct rb_tree members; class_desc_t *classes; @@ -145,6 +163,9 @@ typedef struct { unsigned int rnd; } ScriptDisp; +scriptdisp_entry_t *script_disp_find_member(ScriptDisp *disp, const WCHAR *name); +scriptdisp_entry_t *script_disp_add_var(ScriptDisp *disp, dynamic_var_t *var); +scriptdisp_entry_t *script_disp_add_func(ScriptDisp *disp, function_t *func); dynamic_var_t *script_disp_find_var(ScriptDisp *disp, const WCHAR *name); typedef struct _builtin_prop_t builtin_prop_t; @@ -391,7 +412,6 @@ struct _function_t { unsigned code_off; vbscode_t *code_ctx; function_t *next; - struct rb_entry entry; size_t index; }; -- GitLab https://gitlab.winehq.org/wine/wine/-/merge_requests/11196
@jacek This implements the consolidation suggested in !11136: global variables and functions now live in a single tree of tagged entries (`scriptdisp_entry_t` - type plus payload) instead of the separate `var_tree`/`func_tree`, so a global name resolves to exactly one member and the identifier path does one lookup instead of two. It also makes re-declaration match native (a Sub/Function re-declares a prior Dim/Sub/Function; Dim keeps an existing name). I left the per-named-item host `DISPID` cache out of this as a separate concern - the payload union can grow a host-property kind later if that's the direction. On the "emit local indexes bound in `exec_global_code`" idea, I'd like to confirm the approach before building it, since the binding has to coexist with the ctx-dependent resolution in `lookup_identifier`: a bare identifier can resolve to a function local, an argument, a runtime `dynamic_var`, a `Me`/class property, a global member, a builtin, or a named item, in that precedence - and the compiler can't statically tell which, so a slot can't be bound purely at compile time. Two directions I can see: 1. Compile-time slots, bound at link only where the name provably can't be shadowed by ctx (not a local/arg of the owning function, not a class property), falling back to the full lookup otherwise. 2. A per-instruction resolution cache filled on first execution and invalidated by a generation counter bumped whenever the global tree changes, sitting after the ctx-dependent checks. Did you have (1) in mind, or something closer to (2)? And how would you want `dynamic_var`/`Me`-property shadowing handled relative to a bound global slot? -- https://gitlab.winehq.org/wine/wine/-/merge_requests/11196#note_143613
This merge request was approved by Jacek Caban. -- https://gitlab.winehq.org/wine/wine/-/merge_requests/11196
I was thinking along the lines of the compiler allocating an "extern" slot for each referenced global identifier, including unknown ones and class names, similar to what we currently do for declared variables. The bitcode would then address these slots directly instead of string identifiers. Each slot would initially be empty, but since we can always retrieve its corresponding name from the compiler metadata, we could easily fall back to our current behavior when necessary. When linking the code, we could fill these slots with direct pointers to the actual variables for dims, etc. For other identifiers, we could resolve and populate the slots opportunistically whenever we run into them for the first time (unless no other limitations apply). I guess `scriptdisp_entry_t` pointers could be used for these slots, but I'm not sure, there might be a better representation. -- https://gitlab.winehq.org/wine/wine/-/merge_requests/11196#note_143645
participants (3)
-
Francis De Brabandere -
Francis De Brabandere (@francisdb) -
Jacek Caban (@jacek)