> static BOOL skip_spaces(parser_ctx_t *ctx)
> {
> - while(ctx->ptr < ctx->end && isspaceW(*ctx->ptr)) {
> + while(ctx->ptr < ctx->end && (isspaceW(*ctx->ptr) || *ctx->ptr == 0xFEFF /* UTF16 BOM */)) {
> if(is_endline(*ctx->ptr++))
> ctx->nl = TRUE;
> }
This looks correct according to ECMA-262 section 7.2 - all of the
following are white space:
- tab 0x9 and vertical tab 0xb
- form feed 0xc
- space 0x20
- NBSP 0xa0
- UTF-16 BOM 0xfeff
- any other Unicode "space separator"
Hopefully isspaceW() covers everything but the BOM. What worries me is
that isspaceW() is also called on its own in numerous other places in the
code, and those callers would still not treat the BOM as white space. So
we probably need more tests covering the cases where space separators can
appear, and later our own is_space() helper that conforms to the
standard.
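
To make the idea concrete, here is a rough sketch of what such an is_space() could look like. It is only an illustration, not existing jscript code: the name, the uint16_t parameter (jscript would use WCHAR), and the hard-coded list of space separators are my assumptions. The Zs ranges are taken from current Unicode data and should be checked against whatever Unicode version we want to follow (U+180E, for example, was in Zs in older versions).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true for ECMA-262 7.2 white space only.
 * Line terminators (7.3) are deliberately not included here. */
static bool is_space(uint16_t c)
{
    switch(c) {
    case 0x0009: /* tab */
    case 0x000B: /* vertical tab */
    case 0x000C: /* form feed */
    case 0x0020: /* space */
    case 0x00A0: /* NBSP */
    case 0xFEFF: /* UTF-16 BOM */
        return true;
    }

    /* Remaining Unicode "space separator" (Zs) code points; assumed list. */
    return c == 0x1680
        || (c >= 0x2000 && c <= 0x200A)
        || c == 0x202F
        || c == 0x205F
        || c == 0x3000;
}
```

Note that this does not accept \n or \r, which the standard treats as line terminators rather than white space. If skip_spaces() used a helper like this, the loop condition would presumably have to test is_endline() as well so that ctx->nl is still set.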