[PATCH v2 0/2] MR8331: urlmon: Fix out-of-bound access when trying to remove dot segments from URI.
Dot segments removal in parse_canonicalize makes the assumption that the entire path part of the URI fits into the output buffer (len is the length the output URI would have taken had the output buffer been large enough). So when the output buffer is too small, remove_dot_segments will access out-of-bound. Considering URI unescaping could be happening at the same time, there is no simple way to calculate the required output buffer size correctly without keeping the actual output URI in memory. * * * As you can see from the [test results](https://testbot.winehq.org/JobDetails.pl?Key=158742), this seems to be bugged on native. I don't know if we need to replicate native's behavior -- v2: iertutil: Fix out-of-bound access in remove_dot_segments. urlmon/tests: Test output length calculation when output buffer is too small. https://gitlab.winehq.org/wine/wine/-/merge_requests/8331
From: Yuxuan Shui <yshui@codeweavers.com> Native handles this poorly. When the buffer isn't large enough for URI unescaping, native will return STATUS_SUCCESS with a 0 output length. Besides that, on <= Windows 10 v1507, there are additional cases where the returned length is 0. This appears to have been fixed in later versions. Unsure if we need bug compatibility here. --- dlls/urlmon/tests/uri.c | 26 +++++++++++++++++++++++--- 1 file changed, 23 insertions(+), 3 deletions(-) diff --git a/dlls/urlmon/tests/uri.c b/dlls/urlmon/tests/uri.c index 4a0ae350b33..ec8a451aba1 100644 --- a/dlls/urlmon/tests/uri.c +++ b/dlls/urlmon/tests/uri.c @@ -8250,6 +8250,9 @@ typedef struct _uri_parse_test { const char *property; HRESULT expected; BOOL todo; + /* Whether CoInternetParseIUri returns S_OK when provided + * with a buffer that's too small. */ + BOOL insufficient_buffer_ok; const char *property2; } uri_parse_test; @@ -8257,7 +8260,8 @@ static const uri_parse_test uri_parse_tests[] = { /* PARSE_CANONICALIZE tests. */ {"zip://google.com/test<|>",0,PARSE_CANONICALIZE,0,"zip://google.com/test<|>",S_OK,FALSE}, {"http://google.com/test<|>",0,PARSE_CANONICALIZE,0,"http://google.com/test%3C%7C%3E",S_OK,FALSE}, - {"http://google.com/%30%23%3F",0,PARSE_CANONICALIZE,URL_UNESCAPE,"http://google.com/0#?",S_OK,FALSE}, + {"http://google.com/%30%23%3F",0,PARSE_CANONICALIZE,URL_UNESCAPE,"http://google.com/0#?",S_OK,FALSE,TRUE}, + {"http://google.com/%30%23%3F/..",0,PARSE_CANONICALIZE,URL_UNESCAPE,"http://google.com/",S_OK,FALSE,TRUE}, {"test <|>",Uri_CREATE_ALLOW_RELATIVE,PARSE_CANONICALIZE,URL_ESCAPE_UNSAFE,"test %3C%7C%3E",S_OK,FALSE}, {"test <|>",Uri_CREATE_ALLOW_RELATIVE,PARSE_CANONICALIZE,URL_ESCAPE_SPACES_ONLY,"test%20<|>",S_OK,FALSE}, {"test%20<|>",Uri_CREATE_ALLOW_RELATIVE,PARSE_CANONICALIZE,URL_UNESCAPE|URL_ESCAPE_UNSAFE,"test%20%3C%7C%3E",S_OK,FALSE}, @@ -8317,7 +8321,7 @@ static const uri_parse_test uri_parse_tests[] = { {"file://server/test",0,PARSE_SITE,0,"server",S_OK,FALSE}, /* PARSE_DOMAIN tests. */ - {"http://google.com.uk/",0,PARSE_DOMAIN,0,"google.com.uk",S_OK,FALSE,"com.uk"}, + {"http://google.com.uk/",0,PARSE_DOMAIN,0,"google.com.uk",S_OK,FALSE,FALSE,"com.uk"}, {"http://google.com.com/",0,PARSE_DOMAIN,0,"com.com",S_OK,FALSE}, {"test/test",Uri_CREATE_ALLOW_RELATIVE,PARSE_DOMAIN,0,"",S_OK,FALSE}, {"file://server/test",0,PARSE_DOMAIN,0,"",S_OK,FALSE}, @@ -11725,9 +11729,25 @@ static void test_CoInternetParseIUri(void) { hr = pCreateUri(uriW, test.uri_flags, 0, &uri); ok(SUCCEEDED(hr), "Error: CreateUri returned 0x%08lx on uri_parse_tests[%ld].\n", hr, i); if(SUCCEEDED(hr)) { - WCHAR result[INTERNET_MAX_URL_LENGTH+1]; + WCHAR result[INTERNET_MAX_URL_LENGTH+1], short_result[1]; DWORD result_len = -1; + if (!test.expected && lstrlenA(test.property) > 1) { + HRESULT expected_hr = test.insufficient_buffer_ok ? 0 : STRSAFE_E_INSUFFICIENT_BUFFER; + DWORD expected_len = test.insufficient_buffer_ok ? 0 : lstrlenA(test.property); + /* test result_len calculation with insufficient buffer. */ + hr = pCoInternetParseIUri(uri, test.action, test.flags, short_result, 1, &result_len, 0); + todo_wine_if(test.insufficient_buffer_ok) + ok(hr == expected_hr, + "Error: CoInternetParseIUri returned 0x%08lx, expected 0x%08lx on uri_parse_tests[%ld].\n", + hr, expected_hr, i); + todo_wine_if(test.insufficient_buffer_ok) + ok((expected_len == result_len || broken(0 == result_len) /* <= win10 v1507 */) || + (test.property2 && lstrlenA(test.property2) == result_len), + "Error: Expected %lu, but got %ld instead on uri_parse_tests[%ld] - %s.\n", + expected_len, result_len, i, wine_dbgstr_w(uriW)); + } + hr = pCoInternetParseIUri(uri, test.action, test.flags, result, INTERNET_MAX_URL_LENGTH+1, &result_len, 0); todo_wine_if(test.todo) ok(hr == test.expected, -- GitLab https://gitlab.winehq.org/wine/wine/-/merge_requests/8331
From: Yuxuan Shui <yshui@codeweavers.com> parse_canonicalize will stop writing into the output buffer when the amount to write exceeded output_len, but it will still call remove_dot_segments and give it a path length that goes past the end of the output buffer, causing it to read out of bound. Additionally, we can't simply reject the input once we exceeded the capacity of the output buffer because dot segments removal might bring the length down within the limit. The current implementation is incorrect in this case since the output has already been cut off. This commit creates a temporary buffer and does dot segments removal first, and then everything else. There is further evidence that this is how native does it too, because if a path segment is "%2E." (%2E is "." when decoded), native will not remove it, which it would have done if dot segments removal happens after unescaping. This is included as an additional uri_parse_test. --- dlls/iertutil/uri.c | 53 +++++++++++++++++++---------------------- dlls/urlmon/tests/uri.c | 1 + 2 files changed, 25 insertions(+), 29 deletions(-) diff --git a/dlls/iertutil/uri.c b/dlls/iertutil/uri.c index 38408c4baf5..c2ad0915bf8 100644 --- a/dlls/iertutil/uri.c +++ b/dlls/iertutil/uri.c @@ -6332,8 +6332,8 @@ static HRESULT combine_uri(Uri *base, Uri *relative, DWORD flags, IUri **result, static HRESULT parse_canonicalize(const Uri *uri, DWORD flags, LPWSTR output, DWORD output_len, DWORD *result_len) { - const WCHAR *ptr = NULL; - WCHAR *path = NULL; + const WCHAR *ptr = NULL, *end_ptr = NULL; + WCHAR *reduced = NULL; const WCHAR **pptr; DWORD len = 0; BOOL reduce_path; @@ -6354,24 +6354,28 @@ static HRESULT parse_canonicalize(const Uri *uri, DWORD flags, LPWSTR output, reduce_path = !(flags & URL_DONT_SIMPLIFY) && ptr && check_hierarchical(pptr); - for(ptr = uri->canon_uri; ptr < uri->canon_uri+uri->canon_len; ++ptr) { - BOOL do_default_action = TRUE; - - /* Keep track of the path if we need to remove dot segments from - * it later. - */ - if(reduce_path && !path && ptr == uri->canon_uri+uri->path_start) - path = output+len; - - /* Check if it's time to reduce the path. */ - if(reduce_path && ptr == uri->canon_uri+uri->path_start+uri->path_len) { - DWORD current_path_len = (output+len) - path; - DWORD new_path_len = remove_dot_segments(path, current_path_len); + /* Reduce path first, otherwise it's difficult to check if the output will exceed + * output_len during escaping/unescaping. */ + if (reduce_path && uri->path_start > -1) + { + DWORD path_end, new_path_end; + path_end = uri->path_start+uri->path_len; + reduced = malloc(sizeof(WCHAR) * uri->canon_len); + memcpy(reduced, uri->canon_uri, sizeof(WCHAR) * path_end); + new_path_end = uri->path_start+remove_dot_segments(reduced+uri->path_start, uri->path_len); + memcpy(reduced+new_path_end, uri->canon_uri+path_end, + sizeof(WCHAR) * (uri->canon_len-path_end)); + ptr = reduced; + end_ptr = ptr+new_path_end+(uri->canon_len-path_end); + } + else + { + ptr = uri->canon_uri; + end_ptr = ptr+uri->canon_len; + } - /* Update the current length. */ - len -= (current_path_len-new_path_len); - reduce_path = FALSE; - } + for(; ptr < end_ptr; ++ptr) { + BOOL do_default_action = TRUE; if(*ptr == '%') { const WCHAR decoded = decode_pct_val(ptr); @@ -6416,16 +6420,7 @@ static HRESULT parse_canonicalize(const Uri *uri, DWORD flags, LPWSTR output, } } - /* Sometimes the path is the very last component of the IUri, so - * see if the dot segments need to be reduced now. - */ - if(reduce_path && path) { - DWORD current_path_len = (output+len) - path; - DWORD new_path_len = remove_dot_segments(path, current_path_len); - - /* Update the current length. */ - len -= (current_path_len-new_path_len); - } + if (reduced) free(reduced); if(len < output_len) output[len] = 0; diff --git a/dlls/urlmon/tests/uri.c b/dlls/urlmon/tests/uri.c index ec8a451aba1..6767f278579 100644 --- a/dlls/urlmon/tests/uri.c +++ b/dlls/urlmon/tests/uri.c @@ -8270,6 +8270,7 @@ static const uri_parse_test uri_parse_tests[] = { {"http://google.com/test/../",Uri_CREATE_NO_CANONICALIZE,PARSE_CANONICALIZE,URL_NO_META,"http://google.com/test/../",S_OK,FALSE}, {"http://google.com/test/../",Uri_CREATE_NO_CANONICALIZE,PARSE_CANONICALIZE,0,"http://google.com/",S_OK,FALSE}, {"zip://google.com/test/../",Uri_CREATE_NO_CANONICALIZE,PARSE_CANONICALIZE,0,"zip://google.com/",S_OK,FALSE}, + {"zip://google.com/test/%2E./",Uri_CREATE_NO_CANONICALIZE,PARSE_CANONICALIZE,URL_UNESCAPE,"zip://google.com/test/../",S_OK,FALSE,TRUE}, {"file:///c:/test/../test",Uri_CREATE_NO_CANONICALIZE,PARSE_CANONICALIZE,URL_DONT_SIMPLIFY,"file:///c:/test/../test",S_OK,FALSE}, /* PARSE_FRIENDLY tests. */ -- GitLab https://gitlab.winehq.org/wine/wine/-/merge_requests/8331
Hi, it's been a while since I somewhat forgotten about this MR. I updated the fix to do a dots removal pass with a temporary buffer first. And fixed the tests to check against what the native does and use `todo_wine`. Native's behavior is different before/after Windows 10 v1507. I believe `broken()` is appropriate in this case? -- https://gitlab.winehq.org/wine/wine/-/merge_requests/8331#note_136645
Jacek Caban (@jacek) commented about dlls/iertutil/uri.c:
- * it later. - */ - if(reduce_path && !path && ptr == uri->canon_uri+uri->path_start) - path = output+len; - - /* Check if it's time to reduce the path. */ - if(reduce_path && ptr == uri->canon_uri+uri->path_start+uri->path_len) { - DWORD current_path_len = (output+len) - path; - DWORD new_path_len = remove_dot_segments(path, current_path_len); + /* Reduce path first, otherwise it's difficult to check if the output will exceed + * output_len during escaping/unescaping. */ + if (reduce_path && uri->path_start > -1) + { + DWORD path_end, new_path_end; + path_end = uri->path_start+uri->path_len; + reduced = malloc(sizeof(WCHAR) * uri->canon_len); Please check for allocation failure.
-- https://gitlab.winehq.org/wine/wine/-/merge_requests/8331#note_136772
participants (3)
-
Jacek Caban (@jacek) -
Yuxuan Shui -
Yuxuan Shui (@yshui)