From: Arkadiusz Hiler ahiler@codeweavers.com
Signed-off-by: Arkadiusz Hiler ahiler@codeweavers.com --- programs/cmd/batch.c | 4 ++++ programs/cmd/tests/batch.c | 22 ++++++++++++++++++++++ programs/cmd/tests/test_builtins.cmd | 3 +++ programs/cmd/tests/test_builtins.cmd.exp | 2 ++ 4 files changed, 31 insertions(+)
diff --git a/programs/cmd/batch.c b/programs/cmd/batch.c index 9a262c5fec5..5665eb8be53 100644 --- a/programs/cmd/batch.c +++ b/programs/cmd/batch.c @@ -272,6 +272,10 @@ WCHAR *WCMD_fgets(WCHAR *buf, DWORD noChars, HANDLE h) for (p = bufA; p < (bufA + charsRead); p = CharNextExA(cp, p, 0)) { if (*p == '\n' || *p == '\r') break; + if (*p == '\0') { + heap_free(bufA); + return NULL; + } }
/* Sets file pointer to the start of the next line, if any */ diff --git a/programs/cmd/tests/batch.c b/programs/cmd/tests/batch.c index 3846693ef2e..ade6587be81 100644 --- a/programs/cmd/tests/batch.c +++ b/programs/cmd/tests/batch.c @@ -455,6 +455,24 @@ static int cmd_available(void) return FALSE; }
+void create_nul_test_file(void) +{ + HANDLE file; + DWORD size; + BOOL bres; + char contents[] = "a b c\nd e\0f\ng h i"; + + file = CreateFileA("nul_test_file", GENERIC_WRITE, 0, NULL, CREATE_ALWAYS, + FILE_ATTRIBUTE_NORMAL, NULL); + ok(file != INVALID_HANDLE_VALUE, "CreateFile failed\n"); + if(file == INVALID_HANDLE_VALUE) + return; + + bres = WriteFile(file, contents, ARRAYSIZE(contents), &size, NULL); + ok(bres, "Could not write to file: %lu\n", GetLastError()); + CloseHandle(file); +} + START_TEST(batch) { int argc; @@ -479,9 +497,13 @@ START_TEST(batch) } shortpath_len = GetShortPathNameA(path, shortpath, ARRAY_SIZE(shortpath));
+ create_nul_test_file(); + argc = winetest_get_mainargs(&argv); if(argc > 2) run_from_file(argv[2]); else EnumResourceNamesA(NULL, "TESTCMD", test_enum_proc, 0); + + DeleteFileA("nul_test_file"); } diff --git a/programs/cmd/tests/test_builtins.cmd b/programs/cmd/tests/test_builtins.cmd index 3f410e55166..3829be50f1a 100644 --- a/programs/cmd/tests/test_builtins.cmd +++ b/programs/cmd/tests/test_builtins.cmd @@ -1957,6 +1957,9 @@ FOR /F "delims=. tokens=1*" %%A IN (testfile) DO @echo 5:%%A,%%B FOR /F "delims=. tokens=2*" %%A IN (testfile) DO @echo 6:%%A,%%B FOR /F "delims=. tokens=3*" %%A IN (testfile) DO @echo 7:%%A,%%B del testfile +rem file contains NUL, created by the .exe +for /f %%A in (nul_test_file) DO echo %%A +for /f "tokens=*" %%A in (nul_test_file) DO echo %%A
echo ------------ Testing del ------------ echo abc > file diff --git a/programs/cmd/tests/test_builtins.cmd.exp b/programs/cmd/tests/test_builtins.cmd.exp index 8b6e0914112..437ded18000 100644 --- a/programs/cmd/tests/test_builtins.cmd.exp +++ b/programs/cmd/tests/test_builtins.cmd.exp @@ -1315,6 +1315,8 @@ h=%h i=b j=c k= l= m=%m n=%n o=%o@or_broken@h=%h i=b j=c k= l= m= n=%n o=%o 4:3.14,%B 5:3,14 6:14, +a +a b c ------------ Testing del ------------ deleting 'file' errorlevel is 0, good
the changes to fgets don't look fully right you should return the characters in bufA until the NUL character (as done for \n) if there are some, and NULL in case there are none... (it works in your test case as the string 'a b c' is terminated by the new line)
Hi. Thanks for the review.
Either I don't fully understand your comment or it doesn't work the way you describe. According to my testing the line containing NUL and every line after it seems to be fully ignored by `FOR /F`. If we would return characters up to NUL that would not be the case.
Can you elaborate a bit more?
Hi Arek
I'm not concerned about droping all characters after the NULL (1)
I'm concerned by what's before the NUL character! and basically what should be an marker for end of file
if instead of writing "a b c\n\0f .." into your test file, you just write "a b c\0" (removing the CR at the end of "a b c", I believe that "a b c" should be returned by WCMD_fgets and subsequent WCMD_fgets should return NULL to mark EOF.
(so from a cursory look in pseudo code ``` if (*p == '\0' && p > bufA) break; /* reposition at '\0' to reread the NULL in next call) */ if (*p == '\0' && p == bufA) return NULL; ``` )
(1): if you have time, it would be interesting to see what happens if instead of \0, ^Z (0x1A) is used (it's the end of text file marker in the old DOS days). I wouldn't be surprised if it stops also reading the file. (and maybe at some point we'd better use read/write from msvcrt which should handle all this DOS stuff directly, but that's likely a too large change for what you're trying to solve)
since WCMD_fgets is used by most of the input methods, I want to be sure that this doens't break other things
I tested some more on my side, and it's not what I expected :-( https://testbot.winehq.org/JobDetails.pl?Key=117992
these tests need to be extended (cannot tell which one of the two new tests display nothing) (likely the contents1 displays nothing and for contents2 printf doesn't print the ctrl-z character)
anyway, I tried also to run this simple batch on windows: `"echo\necho\0\echo"` and it hangs after printing the first 'echo on' (so \0 isn't a generic EOF marker)
so changing WCMD_fgets when running into \0 to return EOF isn't likely the right solution (at least as a generic solution) (and I still don't like in case of "abc\0" not returning the "abc" characters)
more tests are needed :-(
Hi Arek,
After testing a bit more locally, what I get is: - no support for ctrl-z in for /F handling - current code (before your patch) is looping endlessly in WCMD_fgets when a stray NUL character is read from a file - native cmd.exe behavior (tested in W11) is inconsistent + native cmd stops reading a line with a stray NUL (like your tests show) and throws it away + native cmd loops endlessly in some cases (like when parsing a .bat/.cmd file) ("echo\necho\0echo" thingie)
So it boils down to what error handling we put in place in case of a NUL character in text file.
Your patch implies: - never loop endlessly for all callers on NUL character - NUL is an EOF marker - throw away previous characters on line (if any)
So I'd rather fix WCMD_fgets with attached [patch](/uploads/24ba4006fb80e1460e6f0251e0dd9160/err): - never loop endlessly for all callers on NUL character (not consistent with native, but we can live with that) - NUL is not an EOF marker - just include the NUL in the returned buffer + so if caller cares (like in 'for /F' handler), he can take appropriate actions + if caller doesn't care (like the other actual callers in builtin cmd.exe, he will get the characters before the NUL, and continue on next line
But I agree that we likely haven't been hit by the differences above as it would be still be looping endlessly <g>.
Ah. Thanks for testing! I didn't even know about the `^Z` behavior. A lot of the Windows things are fairly foreign to me.
I was testing `WCMD_for()` while failing to recognize that `WCMD_fgets()` may be used by other things that behave a bit differently.
Your attached patch looks better than my attempt at the fix and solves my problem. Can you make it in a proper patch so we can supersede this MR? I don't feel like claiming authorship on this one.
I did some extra testing myself and there's even more weirdness to `FOR /F`... It handles UTF-16 but only in some cases. When you try to read a UTF-16 file with a BOM it just stop on the first NUL byte, but if the output comes from a command (via USEBACKQ, one example would be `wmic`) it seems to handle UTF-16 just fine. So it looks like there's some conversion happening when shelling out. But that looks like a separate problem.
Hi Arek
regarding the UTF16 items: - looks like the handling of file reading and command output (via pipes or similar) isn't fully consistent :-( - we may have to reconsider the unique WCMD_fgets implementation. - yes separate issue (do you have an app depending on it?). At least this MR will get the case for reading from a file with a BOM right <g>