[PATCH v14 0/1] MR10244: vbscript: Fix Sub first argument parentheses handling.
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=54177 Fix parsing of expressions like `Sub (i - 20) * 8,y` where a space-separated parenthesized expression followed by a binary operator(*` /` \` ^` &`) was incorrectly treated as call argument parentheses instead of expression grouping. If you look at all the extra test cases in `lang.vbs` you will see which parentheses related issues this solves. I have been testing with a repo full of scripts: https://github.com/sverrewl/vpxtable_scripts There are still quite some compiler issues left but this addresses the biggest parser issues. -- v14: vbscript: Fix Sub first argument parentheses handling. https://gitlab.winehq.org/wine/wine/-/merge_requests/10244
From: Francis De Brabandere <francisdb@gmail.com> Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=54177 --- dlls/vbscript/lex.c | 73 +++++++++++++++++++++++++++++++- dlls/vbscript/parse.h | 2 + dlls/vbscript/parser.y | 12 ++++-- dlls/vbscript/tests/lang.vbs | 80 ++++++++++++++++++++++++++++++++++++ 4 files changed, 162 insertions(+), 5 deletions(-) diff --git a/dlls/vbscript/lex.c b/dlls/vbscript/lex.c index 75838ac586a..e4fe101a739 100644 --- a/dlls/vbscript/lex.c +++ b/dlls/vbscript/lex.c @@ -450,7 +450,8 @@ static int parse_next_token(void *lval, unsigned *loc, parser_ctx_t *ctx) return comment_line(ctx); ctx->ptr++; return '-'; - case '(': + case '(': { + const WCHAR *paren_pos = ctx->ptr; /* NOTE: * We resolve empty brackets in lexer instead of parser to avoid complex conflicts * in call statement special case |f()| without 'call' keyword @@ -464,10 +465,38 @@ static int parse_next_token(void *lval, unsigned *loc, parser_ctx_t *ctx) /* * Parser can't predict if bracket is part of argument expression or an argument * in call expression. We predict it here instead. + * + * In statement context, 'f (x) * 8, y' treats (x) as expression grouping, + * not call parens. Detect this when: identifier/')' precedes '(' with a space, + * and a binary-only operator (*, /, \, ^, &) follows the matching ')'. */ - if(ctx->last_token == tIdentifier || ctx->last_token == ')' || ctx->last_token == tME) + if(ctx->last_token == tIdentifier || ctx->last_token == ')' || ctx->last_token == tME) { + if(paren_pos > ctx->code && (paren_pos[-1] == ' ' || paren_pos[-1] == '\t') + && ctx->is_statement_ctx) { + const WCHAR *p = ctx->ptr; + int depth = 1; + + while(p < ctx->end && depth > 0) { + if(*p == '(') + depth++; + else if(*p == ')') + depth--; + if(depth > 0) + p++; + } + + if(depth == 0) { + p++; + while(p < ctx->end && (*p == ' ' || *p == '\t')) + p++; + if(p < ctx->end && (*p == '*' || *p == '/' || *p == '\\' || *p == '^' || *p == '&')) + return tEXPRLBRACKET; + } + } return '('; + } return tEXPRLBRACKET; + } case '"': return parse_string_literal(ctx, lval); case '#': @@ -546,5 +575,45 @@ int parser_lex(void *lval, unsigned *loc, parser_ctx_t *ctx) ctx->last_nl = ctx->ptr-ctx->code; } + /* Track paren depth and update is_statement_ctx only at depth 0. + * Tokens inside parentheses (e.g. the 'idx' in 'obj(idx).method') + * must not affect statement context tracking. */ + if(ret == '(' || ret == tEXPRLBRACKET) + ctx->paren_depth++; + else if(ret == ')' || ret == tEMPTYBRACKETS) { + if(ctx->paren_depth > 0) + ctx->paren_depth--; + } + + if(ctx->paren_depth == 0) { + switch(ret) { + case tNL: + case ':': + case tTHEN: + case tELSE: + case tELSEIF: + ctx->is_statement_ctx = TRUE; + break; + case tDOT: + case '.': + case ')': + case tEMPTYBRACKETS: + /* Dots and closing parens in member chains keep current state. */ + break; + case tIdentifier: + /* First identifier after statement start or after a dot keeps TRUE. + * An identifier after another identifier (e.g. 'Sub Arg') means + * we're in argument context, so set to FALSE. */ + if(ctx->last_token != '.' && ctx->last_token != tDOT + && ctx->last_token != tNL && ctx->last_token != ':' + && ctx->last_token != tTHEN && ctx->last_token != tELSE + && ctx->last_token != tELSEIF) + ctx->is_statement_ctx = FALSE; + break; + default: + ctx->is_statement_ctx = FALSE; + break; + } + } return (ctx->last_token = ret); } diff --git a/dlls/vbscript/parse.h b/dlls/vbscript/parse.h index 64c3dee2a69..b9fdcfb843a 100644 --- a/dlls/vbscript/parse.h +++ b/dlls/vbscript/parse.h @@ -301,6 +301,8 @@ typedef struct { LCID lcid; int last_token; + BOOL is_statement_ctx; + int paren_depth; unsigned last_nl; statement_t *stats; diff --git a/dlls/vbscript/parser.y b/dlls/vbscript/parser.y index 42df9afaf8f..6a3ca597f1f 100644 --- a/dlls/vbscript/parser.y +++ b/dlls/vbscript/parser.y @@ -772,9 +772,13 @@ static call_expression_t *make_call_expression(parser_ctx_t *ctx, expression_t * return call_expr; if(arguments->type != EXPR_NOARG) { - FIXME("Invalid syntax: missing comma\n"); - ctx->hres = E_FAIL; - return NULL; + /* 'f (x) + expr, ...' — combine bracketed arg with the +/- expression. */ + expression_t *remaining = arguments->next; + call_expr->args = new_binary_expression(ctx, EXPR_ADD, call_expr->args, arguments); + if(!call_expr->args) + return NULL; + call_expr->args->next = remaining; + return call_expr; } call_expr->args->next = arguments->next; @@ -1241,6 +1245,8 @@ HRESULT parse_script(parser_ctx_t *ctx, const WCHAR *code, const WCHAR *delimite ctx->hres = S_OK; ctx->error_loc = -1; ctx->last_token = tNL; + ctx->is_statement_ctx = TRUE; + ctx->paren_depth = 0; ctx->last_nl = 0; ctx->stats = ctx->stats_tail = NULL; ctx->class_decls = NULL; diff --git a/dlls/vbscript/tests/lang.vbs b/dlls/vbscript/tests/lang.vbs index 4374d7b1f25..ac0a8f581f3 100644 --- a/dlls/vbscript/tests/lang.vbs +++ b/dlls/vbscript/tests/lang.vbs @@ -1045,6 +1045,86 @@ Call TestSubExit2 TestSubMultiArgs 1, 2, 3, 4, 5 Call TestSubMultiArgs(1, 2, 3, 4, 5) +Sub TestSubParenExpr(a, b) + Call ok(a=16, "a = " & a) + Call ok(b=7, "b = " & b) +End Sub + +TestSubParenExpr (2) * 8, 7 +TestSubParenExpr 8 * (2), 7 + +Sub TestSubParenExprAdd(a, b) + Call ok(a=6, "a = " & a) + Call ok(b=7, "b = " & b) +End Sub + +TestSubParenExprAdd (2) + 4, 7 +TestSubParenExprAdd 4 + (2), 7 + +Sub TestSubParenExprNoSpace(a) + Call ok(a=6, "a = " & a) +End Sub + +TestSubParenExprNoSpace(10 \ 2) + 1 + +' Regression test: function call with space before ( in expression context +' e.g. x = (CInt (2) + 1) * 3 must parse and evaluate correctly +x = CInt (2) + 1 +Call ok(x = 3, "CInt (2) + 1 = " & x) +x = (CInt (2) + 1) * 3 +Call ok(x = 9, "(CInt (2) + 1) * 3 = " & x) + +' Regression test: function call with space before ( and * in expression context +' e.g. x = CInt (2) * 3 must treat CInt (2) as a function call, not expression grouping +x = CInt (2) * 3 +Call ok(x = 6, "CInt (2) * 3 = " & x) + +' Test member expression in statement context: obj.Method (x) * y, z +Class TestObjParenExpr + Sub Check(a, b) + Call ok(a=16, "obj a = " & a) + Call ok(b=7, "obj b = " & b) + End Sub +End Class + +Dim objParenExpr +Set objParenExpr = New TestObjParenExpr +objParenExpr.Check (2) * 8, 7 + +Sub TestSubParenExprConcat(a, b) + Call ok(a="helloworld", "a = " & a) + Call ok(b=7, "b = " & b) +End Sub + +TestSubParenExprConcat ("hello") & "world", 7 + +' Test: function call as argument with & after paren must be a call, not grouping +' e.g. TestSub Mid ("hello", 2) & "x" should call TestSub with "ellox" +' Mid("hello", 2) returns "ello", & "x" concatenates to "ellox" +Sub TestSubArgCallConcat(a) + Call ok(a="ellox", "a = " & a) +End Sub + +TestSubArgCallConcat Mid ("hello", 2) & "x" + +' Test: obj(idx).method (expr) * val, y in statement context +' The (expr) after .method must be expression grouping, not call paren +Class TestIndexedObjParenExpr + Public arr_(1) + Public Sub Init() + Set arr_(0) = New TestObjParenExpr + End Sub + Public Default Property Get Item(idx) + Set Item = arr_(idx) + End Property +End Class + +Dim idxObj +Set idxObj = New TestIndexedObjParenExpr +Call idxObj.Init() +idxObj(0).Check (2) * 8, 7 + + Sub TestSubLocalVal x = false Call ok(not x, "local x is not false?") -- GitLab https://gitlab.winehq.org/wine/wine/-/merge_requests/10244
@jacek I would really like this one to be reviewed. It's the fix that brings the most value of all my MR's. -- https://gitlab.winehq.org/wine/wine/-/merge_requests/10244#note_132969
Sorry that some MRs take longer than others to get to. I'd like to see this bug fixed the most as well, but it is also a tricky one for me to review. It has been a long time since I worked on vbscript, so I need to relearn the code in the process. Overall, the patch is doing a lot of logic in the lexer that really belongs in the parser. This is already true with the existing hack, but extending it this much further does not seem right to me. I know that it is tricky, which is why we have the hack, but as your tests show, such hacks tend to push problems into corner cases rather than being a complete solution. My preference would be to handle this in the parser, even if it means more invasive changes there. If that is not feasible, at least let us try to keep the special-casing to a minimum. -- https://gitlab.winehq.org/wine/wine/-/merge_requests/10244#note_133015
On Fri Mar 20 13:57:42 2026 +0000, Jacek Caban wrote:
Sorry that some MRs take longer than others to get to. I'd like to see this bug fixed the most as well, but it is also a tricky one for me to review. It has been a long time since I worked on vbscript, so I need to relearn the code in the process. Overall, the patch is doing a lot of logic in the lexer that really belongs in the parser. This is already true with the existing hack, but extending it this much further does not seem right to me. I know that it is tricky, which is why we have the hack, but as your tests show, such hacks tend to push problems into corner cases rather than being a complete solution. My preference would be to handle this in the parser, even if it means more invasive changes there. If that is not feasible, at least let us try to keep the special-casing to a minimum. I think this can maybe be solved on the parser side but with a lot more duplication. The current result is what I ended up with after many iterations. It's also already validated and in use on the vpinball project.
I fear that some of these issues just don't have a clean solution. But I can have another go on the parser side. Maybe in a different MR with the same tests. Then we can compare. -- https://gitlab.winehq.org/wine/wine/-/merge_requests/10244#note_133030
On Fri Mar 20 13:57:42 2026 +0000, Francis De Brabandere wrote:
I think this can maybe be solved on the parser side but with a lot more duplication. The current result is what I ended up with after many iterations. It's also already validated and in use on the vpinball project. I fear that some of these issues just don't have a clean solution. But I can have another go on the parser side. Maybe in a different MR with the same tests. Then we can compare. After having another look I don't think we can handle this without the help of the lexer prediction/parenthesis tracking because bison can't look ahead. A GLR parser can do this, eg https://github.com/uwol/proleap-vb6-parser/blob/main/src/main/antlr4/io/prol...
The current solution works, there's good test coverage. Nothing keeps somebody to come up with a cleaner solution later. -- https://gitlab.winehq.org/wine/wine/-/merge_requests/10244#note_133122
participants (3)
-
Francis De Brabandere -
Francis De Brabandere (@francisdb) -
Jacek Caban (@jacek)