Normally I am a strong proponent of adjusting implementations one small step at a time, but this is a bit of an extreme case.
The current state of UrlCanonicalize() is a prototypical example of "spaghetti code": the implementation has been repeatedly adjusted to fix each newly discovered edge case, without properly testing the scope of the new logic, with the effect that the current logic is a scattered mess of conditions that interact in unpredictable ways.
To be fair, the actual function is much more complicated than one would anticipate, and the number of new weird edge cases I found while trying to flesh out existing logic was rather exhausting.
I initially tried to "fix" the existing implementation one step at a time. After racking up dozens of commits without an end in sight, though, and having to waste time carefully unravelling the broken code in the right order so as to avoid temporary failures, I decided instead to just fix everything at once, effectively rewriting the function from scratch, and this proved to be much more productive.
Hopefully the huge swath of new tests is enough to prove that this new implementation really is correct, and has no more spaghetti than what Microsoft made necessary.
--
v3: shlwapi/tests: Add many more tests for UrlCanonicalize().
    kernelbase: Use scheme_is_opaque() in UrlIs().
    kernelbase: Reimplement UrlCanonicalize().
From: Zebediah Figura zfigura@codeweavers.com
--- dlls/urlmon/tests/misc.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/dlls/urlmon/tests/misc.c b/dlls/urlmon/tests/misc.c index 796f54a4f62..e65feed7318 100644 --- a/dlls/urlmon/tests/misc.c +++ b/dlls/urlmon/tests/misc.c @@ -401,6 +401,17 @@ static void test_CoInternetParseUrl(void) if(parse_tests[i].rootdocument) ok(!lstrcmpW(parse_tests[i].rootdocument, buf), "[%d] wrong rootdocument, received %s\n", i, wine_dbgstr_w(buf)); } + + size = 0xdeadbeef; + hres = pCoInternetParseUrl(L"http://a/b/../c", PARSE_CANONICALIZE, 0, buf, 3, &size, 0); + ok(hres == E_POINTER, "got %#lx\n", hres); + ok(size == wcslen(L"http://a/c") + 1, "got %lu\n", size); + + size = 0xdeadbeef; + hres = pCoInternetParseUrl(L"http://a/b/../c", PARSE_CANONICALIZE, 0, buf, sizeof(buf), &size, 0); + ok(hres == S_OK, "got %#lx\n", hres); + ok(!wcscmp(buf, L"http://a/c"), "got %s\n", debugstr_w(buf)); + ok(size == wcslen(buf), "got %lu\n", size); }
static void test_CoInternetCompareUrl(void)
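The size contract exercised by the new CoInternetParseUrl() test can be sketched in portable C. This is only an illustration under stated assumptions, not the urlmon code: sketch_copy_out() and the sketch_status values are hypothetical stand-ins for the real function and for S_OK / E_POINTER / E_INVALIDARG. When the buffer is too small the required size (including the terminator) is reported and the buffer is left untouched; on success the reported size is the string length without the terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes standing in for S_OK / E_POINTER / E_INVALIDARG. */
enum sketch_status { SKETCH_OK, SKETCH_TOO_SMALL, SKETCH_INVALID };

/*
 * Hypothetical helper: copy src into dst.
 * On entry, *size is the capacity of dst in characters.
 * If dst is NULL or too small, *size becomes the required capacity
 * (length + 1 for the terminator) and dst is not written to.
 * On success, *size becomes the length written, excluding the terminator.
 */
static enum sketch_status sketch_copy_out(const char *src, char *dst, size_t *size)
{
    size_t len;

    if (!src || !size)
        return SKETCH_INVALID;

    len = strlen(src);
    if (!dst || *size < len + 1)
    {
        *size = len + 1;
        return SKETCH_TOO_SMALL;
    }

    memcpy(dst, src, len + 1);
    *size = len;
    return SKETCH_OK;
}

/* Usage: query the size first, then call again with a large enough buffer. */
static void sketch_copy_out_demo(void)
{
    char buf[32] = "x";
    size_t size = 0;

    assert(sketch_copy_out("http://a/c", NULL, &size) == SKETCH_TOO_SMALL);
    assert(size == strlen("http://a/c") + 1);

    assert(sketch_copy_out("http://a/c", buf, &size) == SKETCH_OK);
    assert(size == strlen(buf));
}
```

The asymmetry between the two reported sizes is the part that is easy to get wrong, which is why the test checks both.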
From: Zebediah Figura zfigura@codeweavers.com
--- dlls/shlwapi/tests/url.c | 245 +++++++++++++++++++-------------------- 1 file changed, 120 insertions(+), 125 deletions(-)
diff --git a/dlls/shlwapi/tests/url.c b/dlls/shlwapi/tests/url.c index 160f7696cc2..d99be7b6a43 100644 --- a/dlls/shlwapi/tests/url.c +++ b/dlls/shlwapi/tests/url.c @@ -74,127 +74,6 @@ static const TEST_URL_APPLY TEST_APPLY[] = { {"file://server/share", URL_APPLY_GUESSSCHEME, S_FALSE}, };
-/* ################ */
-
-typedef struct _TEST_URL_CANONICALIZE {
-    const char *url;
-    DWORD flags;
-    const char *expecturl;
-    BOOL todo;
-} TEST_URL_CANONICALIZE;
-
-static const TEST_URL_CANONICALIZE TEST_CANONICALIZE[] = {
-    {"", 0, ""},
-    {"http://www.winehq.org/tests/../tests/../..", 0, "http://www.winehq.org/", TRUE},
-    {"http://www.winehq.org/..", 0, "http://www.winehq.org/..", FALSE},
-    {"http://www.winehq.org/tests/tests2/../../tests", 0, "http://www.winehq.org/tests", FALSE},
-    {"http://www.winehq.org/tests/../tests", 0, "http://www.winehq.org/tests", FALSE},
-    {"http://www.winehq.org/tests\n", URL_WININET_COMPATIBILITY|URL_ESCAPE_SPACES_ONLY|URL_ESCAPE_UNSAFE, "http://www.winehq.org/tests", FALSE},
-    {"http://www.winehq.org/tests\r", URL_WININET_COMPATIBILITY|URL_ESCAPE_SPACES_ONLY|URL_ESCAPE_UNSAFE, "http://www.winehq.org/tests", FALSE},
-    {"http://www.winehq.org/tests\r", 0, "http://www.winehq.org/tests", FALSE},
-    {"http://www.winehq.org/tests\r", URL_DONT_SIMPLIFY, "http://www.winehq.org/tests", FALSE},
-    {"http://www.winehq.org/tests/../tests/", 0, "http://www.winehq.org/tests/", FALSE},
-    {"http://www.winehq.org/tests/../tests/..", 0, "http://www.winehq.org/", FALSE},
-    {"http://www.winehq.org/tests/../tests/../", 0, "http://www.winehq.org/", FALSE},
-    {"http://www.winehq.org/tests/..", 0, "http://www.winehq.org/", FALSE},
-    {"http://www.winehq.org/tests/../", 0, "http://www.winehq.org/", FALSE},
-    {"http://www.winehq.org/tests/..?query=x&return=y", 0, "http://www.winehq.org/?query=x&return=y", FALSE},
-    {"http://www.winehq.org/tests/../?query=x&return=y", 0, "http://www.winehq.org/?query=x&return=y", FALSE},
-    {"\tht\ttp\t://www\t.w\tineh\t\tq.or\tg\t/\ttests/..\t?\tquer\ty=x\t\t&re\tturn=y\t\t", 0, "http://www.winehq.org/?query=x&return=y", FALSE},
-    {"http://www.winehq.org/tests/..#example", 0, "http://www.winehq.org/#example", FALSE},
-    {"http://www.winehq.org/tests/../#example", 0, "http://www.winehq.org/#example", FALSE},
-    {"http://www.winehq.org/tests\\../#example", 0, "http://www.winehq.org/#example", FALSE},
-    {"http://www.winehq.org/tests/..\\#example", 0, "http://www.winehq.org/#example", FALSE},
-    {"http://www.winehq.org\\tests/../#example", 0, "http://www.winehq.org/#example", FALSE},
-    {"http://www.winehq.org/tests/../#example", URL_DONT_SIMPLIFY, "http://www.winehq.org/tests/../#example", FALSE},
-    {"http://www.winehq.org/tests/foo bar", URL_ESCAPE_SPACES_ONLY | URL_DONT_ESCAPE_EXTRA_INFO, "http://www.winehq.org/tests/foo%20bar", FALSE},
-    {"http://www.winehq.org/tests/foo%20bar", URL_UNESCAPE, "http://www.winehq.org/tests/foo bar", FALSE},
-    {"http://www.winehq.org", 0, "http://www.winehq.org/", FALSE},
-    {"http:///www.winehq.org", 0, "http:///www.winehq.org", FALSE},
-    {"http:////www.winehq.org", 0, "http:////www.winehq.org", FALSE},
-    {"file:///c:/tests/foo%20bar", URL_UNESCAPE, "file:///c:/tests/foo bar", FALSE},
-    {"file:///c:/tests\\foo%20bar", URL_UNESCAPE, "file:///c:/tests/foo bar", FALSE},
-    {"file:///c:/tests/foo%20bar", 0, "file:///c:/tests/foo%20bar", FALSE},
-    {"file:///c:/tests/foo%20bar", URL_FILE_USE_PATHURL, "file://c:\\tests\\foo bar", FALSE},
-    {"file://localhost/c:/tests/../tests/foo%20bar", URL_FILE_USE_PATHURL, "file://c:\\tests\\foo bar", FALSE},
-    {"file://localhost\\c:/tests/../tests/foo%20bar", URL_FILE_USE_PATHURL, "file://c:\\tests\\foo bar", FALSE},
-    {"file://localhost\\\\c:/tests/../tests/foo%20bar", URL_FILE_USE_PATHURL, "file://c:\\tests\\foo bar", FALSE},
-    {"file://localhost\\c:\\tests/../tests/foo%20bar", URL_FILE_USE_PATHURL, "file://c:\\tests\\foo bar", FALSE},
-    {"file://c:/tests/../tests/foo%20bar", URL_FILE_USE_PATHURL, "file://c:\\tests\\foo bar", FALSE},
-    {"file://c:/tests\\../tests/foo%20bar", URL_FILE_USE_PATHURL, "file://c:\\tests\\foo bar", FALSE},
-    {"file://c:/tests/foo%20bar", URL_FILE_USE_PATHURL, "file://c:\\tests\\foo bar", FALSE},
-    {"file:///c://tests/foo%20bar", URL_FILE_USE_PATHURL, "file://c:\\\\tests\\foo bar", FALSE},
-    {"file:///c:\\tests\\foo bar", 0, "file:///c:/tests/foo bar", FALSE},
-    {"file:///c:\\tests\\foo bar", URL_DONT_SIMPLIFY, "file:///c:/tests/foo bar", FALSE},
-    {"file:///c:\\tests\\foobar", 0, "file:///c:/tests/foobar", FALSE},
-    {"file:///c:\\tests\\foobar", URL_WININET_COMPATIBILITY, "file://c:\\tests\\foobar", FALSE},
-    {"file://home/user/file", 0, "file://home/user/file", FALSE},
-    {"file:///home/user/file", 0, "file:///home/user/file", FALSE},
-    {"file:////home/user/file", 0, "file://home/user/file", FALSE},
-    {"file://home/user/file", URL_WININET_COMPATIBILITY, "file://\\\\home\\user\\file", FALSE},
-    {"file:///home/user/file", URL_WININET_COMPATIBILITY, "file://\\home\\user\\file", FALSE},
-    {"file:////home/user/file", URL_WININET_COMPATIBILITY, "file://\\\\home\\user\\file", FALSE},
-    {"file://///home/user/file", URL_WININET_COMPATIBILITY, "file://\\\\home\\user\\file", FALSE},
-    {"file://C:/user/file", 0, "file:///C:/user/file", FALSE},
-    {"file://C:/user/file/../asdf", 0, "file:///C:/user/asdf", FALSE},
-    {"file:///C:/user/file", 0, "file:///C:/user/file", FALSE},
-    {"file:////C:/user/file", 0, "file:///C:/user/file", FALSE},
-    {"file://C:/user/file", URL_WININET_COMPATIBILITY, "file://C:\\user\\file", FALSE},
-    {"file:///C:/user/file", URL_WININET_COMPATIBILITY, "file://C:\\user\\file", FALSE},
-    {"file:////C:/user/file", URL_WININET_COMPATIBILITY, "file://C:\\user\\file", FALSE},
-    {"http:///www.winehq.org", 0, "http:///www.winehq.org", FALSE},
-    {"http:///www.winehq.org", URL_WININET_COMPATIBILITY, "http:///www.winehq.org", FALSE},
-    {"http://www.winehq.org/site/about", URL_FILE_USE_PATHURL, "http://www.winehq.org/site/about", FALSE},
-    {"file_://www.winehq.org/site/about", URL_FILE_USE_PATHURL, "file_://www.winehq.org/site/about", FALSE},
-    {"c:\\dir\\file", 0, "file:///c:/dir/file", FALSE},
-    {"file:///c:\\dir\\file", 0, "file:///c:/dir/file", FALSE},
-    {"c:dir\\file", 0, "file:///c:dir/file", FALSE},
-    {"c:\\tests\\foo bar", URL_FILE_USE_PATHURL, "file://c:\\tests\\foo bar", FALSE},
-    {"c:\\tests\\foo bar", 0, "file:///c:/tests/foo%20bar", FALSE},
-    {"c\t:\t\\te\tsts\\fo\to \tbar\t", 0, "file:///c:/tests/foo%20bar", FALSE},
-    {"res://file", 0, "res://file/", FALSE},
-    {"res://file", URL_FILE_USE_PATHURL, "res://file/", FALSE},
-    {"res:///c:/tests/foo%20bar", URL_UNESCAPE, "res:///c:/tests/foo bar", FALSE},
-    {"res:///c:/tests\\foo%20bar", URL_UNESCAPE, "res:///c:/tests\\foo bar", FALSE},
-    {"res:///c:/tests/foo%20bar", 0, "res:///c:/tests/foo%20bar", FALSE},
-    {"res:///c:/tests/foo%20bar", URL_FILE_USE_PATHURL, "res:///c:/tests/foo%20bar", FALSE},
-    {"res://c:/tests/../tests/foo%20bar", URL_FILE_USE_PATHURL, "res://c:/tests/foo%20bar", FALSE},
-    {"res://c:/tests\\../tests/foo%20bar", URL_FILE_USE_PATHURL, "res://c:/tests/foo%20bar", FALSE},
-    {"res://c:/tests/foo%20bar", URL_FILE_USE_PATHURL, "res://c:/tests/foo%20bar", FALSE},
-    {"res:///c://tests/foo%20bar", URL_FILE_USE_PATHURL, "res:///c://tests/foo%20bar", FALSE},
-    {"res:///c:\\tests\\foo bar", 0, "res:///c:\\tests\\foo bar", FALSE},
-    {"res:///c:\\tests\\foo bar", URL_DONT_SIMPLIFY, "res:///c:\\tests\\foo bar", FALSE},
-    {"res://c:\\tests\\foo bar/res", URL_FILE_USE_PATHURL, "res://c:\\tests\\foo bar/res", FALSE},
-    {"res://c:\\tests/res\\foo%20bar/strange\\sth", 0, "res://c:\\tests/res\\foo%20bar/strange\\sth", FALSE},
-    {"res://c:\\tests/res\\foo%20bar/strange\\sth", URL_FILE_USE_PATHURL, "res://c:\\tests/res\\foo%20bar/strange\\sth", FALSE},
-    {"res://c:\\tests/res\\foo%20bar/strange\\sth", URL_UNESCAPE, "res://c:\\tests/res\\foo bar/strange\\sth", FALSE},
-    {"/A/../B/./C/../../test_remove_dot_segments", 0, "/test_remove_dot_segments", FALSE},
-    {"/A/../B/./C/../../test_remove_dot_segments", URL_FILE_USE_PATHURL, "/test_remove_dot_segments", FALSE},
-    {"/A/../B/./C/../../test_remove_dot_segments", URL_WININET_COMPATIBILITY, "/test_remove_dot_segments", FALSE},
-    {"/A/B\\C/D\\E", 0, "/A/B\\C/D\\E", FALSE},
-    {"/A/B\\C/D\\E", URL_FILE_USE_PATHURL, "/A/B\\C/D\\E", FALSE},
-    {"/A/B\\C/D\\E", URL_WININET_COMPATIBILITY, "/A/B\\C/D\\E", FALSE},
-    {"///A/../B", 0, "///B", FALSE},
-    {"///A/../B", URL_FILE_USE_PATHURL, "///B", FALSE},
-    {"///A/../B", URL_WININET_COMPATIBILITY, "///B", FALSE},
-    {"A", 0, "A", FALSE},
-    {"../A", 0, "../A", FALSE},
-    {"A/../B", 0, "B", TRUE},
-    {"/uri-res/N2R?urn:sha1:B3K", URL_DONT_ESCAPE_EXTRA_INFO | URL_WININET_COMPATIBILITY /*0x82000000*/, "/uri-res/N2R?urn:sha1:B3K", FALSE} /*LimeWire online installer calls this*/,
-    {"http:www.winehq.org/dir/../index.html", 0, "http:www.winehq.org/index.html"},
-    {"http://localhost/test.html", URL_FILE_USE_PATHURL, "http://localhost/test.html"},
-    {"http://localhost/te%20st.html", URL_FILE_USE_PATHURL, "http://localhost/te%20st.html"},
-    {"http://www.winehq.org/%E6%A1%9C.html", URL_FILE_USE_PATHURL, "http://www.winehq.org/%E6%A1%9C.html"},
-    {"mk:@MSITStore:C:\\Program Files/AutoCAD 2008\\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm", 0, "mk:@MSITStore:C:\\Program Files/AutoCAD 2008\\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm"},
-    {"ftp:@MSITStore:C:\\Program Files/AutoCAD 2008\\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm", 0, "ftp:@MSITStore:C:\\Program Files/AutoCAD 2008\\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm"},
-    {"file:@MSITStore:C:\\Program Files/AutoCAD 2008\\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm", 0, "file:@MSITStore:C:/Program Files/AutoCAD 2008/Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm"},
-    {"http:@MSITStore:C:\\Program Files/AutoCAD 2008\\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm", 0, "http:@MSITStore:C:/Program Files/AutoCAD 2008/Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm"},
-    {"http:@MSITStore:C:\\Program Files/AutoCAD 2008\\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm", URL_FILE_USE_PATHURL, "http:@MSITStore:C:/Program Files/AutoCAD 2008/Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm"},
-    {"mk:@MSITStore:C:\\Program Files/AutoCAD 2008\\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm", URL_FILE_USE_PATHURL, "mk:@MSITStore:C:\\Program Files/AutoCAD 2008\\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm"},
-};
-
-/* ################ */
-
 typedef struct _TEST_URL_ESCAPE {
     const char *url;
     DWORD flags;
@@ -1084,6 +963,124 @@ static void test_UrlCanonicalizeA(void)
     DWORD urllen;
     HRESULT hr;

+    static const struct
+    {
+        const char *url;
+        DWORD flags;
+        const char *expect;
+        BOOL todo;
+    }
+    tests[] =
+    {
+        {"", 0, ""},
+        {"http://www.winehq.org/tests/../tests/../..", 0, "http://www.winehq.org/", TRUE},
+        {"http://www.winehq.org/..", 0, "http://www.winehq.org/.."},
+        {"http://www.winehq.org/tests/tests2/../../tests", 0, "http://www.winehq.org/tests"},
+        {"http://www.winehq.org/tests/../tests", 0, "http://www.winehq.org/tests"},
+        {"http://www.winehq.org/tests\n", URL_WININET_COMPATIBILITY|URL_ESCAPE_SPACES_ONLY|URL_ESCAPE_UNSAFE, "http://www.winehq.org/tests"},
+        {"http://www.winehq.org/tests\r", URL_WININET_COMPATIBILITY|URL_ESCAPE_SPACES_ONLY|URL_ESCAPE_UNSAFE, "http://www.winehq.org/tests"},
+        {"http://www.winehq.org/tests\r", 0, "http://www.winehq.org/tests"},
+        {"http://www.winehq.org/tests\r", URL_DONT_SIMPLIFY, "http://www.winehq.org/tests"},
+        {"http://www.winehq.org/tests/../tests/", 0, "http://www.winehq.org/tests/"},
+        {"http://www.winehq.org/tests/../tests/..", 0, "http://www.winehq.org/"},
+        {"http://www.winehq.org/tests/../tests/../", 0, "http://www.winehq.org/"},
+        {"http://www.winehq.org/tests/..", 0, "http://www.winehq.org/"},
+        {"http://www.winehq.org/tests/../", 0, "http://www.winehq.org/"},
+        {"http://www.winehq.org/tests/..?query=x&return=y", 0, "http://www.winehq.org/?query=x&return=y"},
+        {"http://www.winehq.org/tests/../?query=x&return=y", 0, "http://www.winehq.org/?query=x&return=y"},
+        {"\tht\ttp\t://www\t.w\tineh\t\tq.or\tg\t/\ttests/..\t?\tquer\ty=x\t\t&re\tturn=y\t\t", 0, "http://www.winehq.org/?query=x&return=y"},
+        {"http://www.winehq.org/tests/..#example", 0, "http://www.winehq.org/#example"},
+        {"http://www.winehq.org/tests/../#example", 0, "http://www.winehq.org/#example"},
+        {"http://www.winehq.org/tests\\../#example", 0, "http://www.winehq.org/#example"},
+        {"http://www.winehq.org/tests/..\\#example", 0, "http://www.winehq.org/#example"},
+        {"http://www.winehq.org\\tests/../#example", 0, "http://www.winehq.org/#example"},
+        {"http://www.winehq.org/tests/../#example", URL_DONT_SIMPLIFY, "http://www.winehq.org/tests/../#example"},
+        {"http://www.winehq.org/tests/foo bar", URL_ESCAPE_SPACES_ONLY | URL_DONT_ESCAPE_EXTRA_INFO, "http://www.winehq.org/tests/foo%20bar"},
+        {"http://www.winehq.org/tests/foo%20bar", URL_UNESCAPE, "http://www.winehq.org/tests/foo bar"},
+        {"http://www.winehq.org", 0, "http://www.winehq.org/"},
+        {"http:///www.winehq.org", 0, "http:///www.winehq.org"},
+        {"http:////www.winehq.org", 0, "http:////www.winehq.org"},
+        {"file:///c:/tests/foo%20bar", URL_UNESCAPE, "file:///c:/tests/foo bar"},
+        {"file:///c:/tests\\foo%20bar", URL_UNESCAPE, "file:///c:/tests/foo bar"},
+        {"file:///c:/tests/foo%20bar", 0, "file:///c:/tests/foo%20bar"},
+        {"file:///c:/tests/foo%20bar", URL_FILE_USE_PATHURL, "file://c:\\tests\\foo bar"},
+        {"file://localhost/c:/tests/../tests/foo%20bar", URL_FILE_USE_PATHURL, "file://c:\\tests\\foo bar"},
+        {"file://localhost\\c:/tests/../tests/foo%20bar", URL_FILE_USE_PATHURL, "file://c:\\tests\\foo bar"},
+        {"file://localhost\\\\c:/tests/../tests/foo%20bar", URL_FILE_USE_PATHURL, "file://c:\\tests\\foo bar"},
+        {"file://localhost\\c:\\tests/../tests/foo%20bar", URL_FILE_USE_PATHURL, "file://c:\\tests\\foo bar"},
+        {"file://c:/tests/../tests/foo%20bar", URL_FILE_USE_PATHURL, "file://c:\\tests\\foo bar"},
+        {"file://c:/tests\\../tests/foo%20bar", URL_FILE_USE_PATHURL, "file://c:\\tests\\foo bar"},
+        {"file://c:/tests/foo%20bar", URL_FILE_USE_PATHURL, "file://c:\\tests\\foo bar"},
+        {"file:///c://tests/foo%20bar", URL_FILE_USE_PATHURL, "file://c:\\\\tests\\foo bar"},
+        {"file:///c:\\tests\\foo bar", 0, "file:///c:/tests/foo bar"},
+        {"file:///c:\\tests\\foo bar", URL_DONT_SIMPLIFY, "file:///c:/tests/foo bar"},
+        {"file:///c:\\tests\\foobar", 0, "file:///c:/tests/foobar"},
+        {"file:///c:\\tests\\foobar", URL_WININET_COMPATIBILITY, "file://c:\\tests\\foobar"},
+        {"file://home/user/file", 0, "file://home/user/file"},
+        {"file:///home/user/file", 0, "file:///home/user/file"},
+        {"file:////home/user/file", 0, "file://home/user/file"},
+        {"file://home/user/file", URL_WININET_COMPATIBILITY, "file://\\\\home\\user\\file"},
+        {"file:///home/user/file", URL_WININET_COMPATIBILITY, "file://\\home\\user\\file"},
+        {"file:////home/user/file", URL_WININET_COMPATIBILITY, "file://\\\\home\\user\\file"},
+        {"file://///home/user/file", URL_WININET_COMPATIBILITY, "file://\\\\home\\user\\file"},
+        {"file://C:/user/file", 0, "file:///C:/user/file"},
+        {"file://C:/user/file/../asdf", 0, "file:///C:/user/asdf"},
+        {"file:///C:/user/file", 0, "file:///C:/user/file"},
+        {"file:////C:/user/file", 0, "file:///C:/user/file"},
+        {"file://C:/user/file", URL_WININET_COMPATIBILITY, "file://C:\\user\\file"},
+        {"file:///C:/user/file", URL_WININET_COMPATIBILITY, "file://C:\\user\\file"},
+        {"file:////C:/user/file", URL_WININET_COMPATIBILITY, "file://C:\\user\\file"},
+        {"http:///www.winehq.org", 0, "http:///www.winehq.org"},
+        {"http:///www.winehq.org", URL_WININET_COMPATIBILITY, "http:///www.winehq.org"},
+        {"http://www.winehq.org/site/about", URL_FILE_USE_PATHURL, "http://www.winehq.org/site/about"},
+        {"file_://www.winehq.org/site/about", URL_FILE_USE_PATHURL, "file_://www.winehq.org/site/about"},
+        {"c:\\dir\\file", 0, "file:///c:/dir/file"},
+        {"file:///c:\\dir\\file", 0, "file:///c:/dir/file"},
+        {"c:dir\\file", 0, "file:///c:dir/file"},
+        {"c:\\tests\\foo bar", URL_FILE_USE_PATHURL, "file://c:\\tests\\foo bar"},
+        {"c:\\tests\\foo bar", 0, "file:///c:/tests/foo%20bar"},
+        {"c\t:\t\\te\tsts\\fo\to \tbar\t", 0, "file:///c:/tests/foo%20bar"},
+        {"res://file", 0, "res://file/"},
+        {"res://file", URL_FILE_USE_PATHURL, "res://file/"},
+        {"res:///c:/tests/foo%20bar", URL_UNESCAPE, "res:///c:/tests/foo bar"},
+        {"res:///c:/tests\\foo%20bar", URL_UNESCAPE, "res:///c:/tests\\foo bar"},
+        {"res:///c:/tests/foo%20bar", 0, "res:///c:/tests/foo%20bar"},
+        {"res:///c:/tests/foo%20bar", URL_FILE_USE_PATHURL, "res:///c:/tests/foo%20bar"},
+        {"res://c:/tests/../tests/foo%20bar", URL_FILE_USE_PATHURL, "res://c:/tests/foo%20bar"},
+        {"res://c:/tests\\../tests/foo%20bar", URL_FILE_USE_PATHURL, "res://c:/tests/foo%20bar"},
+        {"res://c:/tests/foo%20bar", URL_FILE_USE_PATHURL, "res://c:/tests/foo%20bar"},
+        {"res:///c://tests/foo%20bar", URL_FILE_USE_PATHURL, "res:///c://tests/foo%20bar"},
+        {"res:///c:\\tests\\foo bar", 0, "res:///c:\\tests\\foo bar"},
+        {"res:///c:\\tests\\foo bar", URL_DONT_SIMPLIFY, "res:///c:\\tests\\foo bar"},
+        {"res://c:\\tests\\foo bar/res", URL_FILE_USE_PATHURL, "res://c:\\tests\\foo bar/res"},
+        {"res://c:\\tests/res\\foo%20bar/strange\\sth", 0, "res://c:\\tests/res\\foo%20bar/strange\\sth"},
+        {"res://c:\\tests/res\\foo%20bar/strange\\sth", URL_FILE_USE_PATHURL, "res://c:\\tests/res\\foo%20bar/strange\\sth"},
+        {"res://c:\\tests/res\\foo%20bar/strange\\sth", URL_UNESCAPE, "res://c:\\tests/res\\foo bar/strange\\sth"},
+        {"/A/../B/./C/../../test_remove_dot_segments", 0, "/test_remove_dot_segments"},
+        {"/A/../B/./C/../../test_remove_dot_segments", URL_FILE_USE_PATHURL, "/test_remove_dot_segments"},
+        {"/A/../B/./C/../../test_remove_dot_segments", URL_WININET_COMPATIBILITY, "/test_remove_dot_segments"},
+        {"/A/B\\C/D\\E", 0, "/A/B\\C/D\\E"},
+        {"/A/B\\C/D\\E", URL_FILE_USE_PATHURL, "/A/B\\C/D\\E"},
+        {"/A/B\\C/D\\E", URL_WININET_COMPATIBILITY, "/A/B\\C/D\\E"},
+        {"///A/../B", 0, "///B"},
+        {"///A/../B", URL_FILE_USE_PATHURL, "///B"},
+        {"///A/../B", URL_WININET_COMPATIBILITY, "///B"},
+        {"A", 0, "A"},
+        {"../A", 0, "../A"},
+        {"A/../B", 0, "B", TRUE},
+        {"/uri-res/N2R?urn:sha1:B3K", URL_DONT_ESCAPE_EXTRA_INFO | URL_WININET_COMPATIBILITY /*0x82000000*/, "/uri-res/N2R?urn:sha1:B3K"} /*LimeWire online installer calls this*/,
+        {"http:www.winehq.org/dir/../index.html", 0, "http:www.winehq.org/index.html"},
+        {"http://localhost/test.html", URL_FILE_USE_PATHURL, "http://localhost/test.html"},
+        {"http://localhost/te%20st.html", URL_FILE_USE_PATHURL, "http://localhost/te%20st.html"},
+        {"http://www.winehq.org/%E6%A1%9C.html", URL_FILE_USE_PATHURL, "http://www.winehq.org/%E6%A1%9C.html"},
+        {"mk:@MSITStore:C:\\Program Files/AutoCAD 2008\\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm", 0, "mk:@MSITStore:C:\\Program Files/AutoCAD 2008\\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm"},
+        {"ftp:@MSITStore:C:\\Program Files/AutoCAD 2008\\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm", 0, "ftp:@MSITStore:C:\\Program Files/AutoCAD 2008\\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm"},
+        {"file:@MSITStore:C:\\Program Files/AutoCAD 2008\\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm", 0, "file:@MSITStore:C:/Program Files/AutoCAD 2008/Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm"},
+        {"http:@MSITStore:C:\\Program Files/AutoCAD 2008\\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm", 0, "http:@MSITStore:C:/Program Files/AutoCAD 2008/Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm"},
+        {"http:@MSITStore:C:\\Program Files/AutoCAD 2008\\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm", URL_FILE_USE_PATHURL, "http:@MSITStore:C:/Program Files/AutoCAD 2008/Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm"},
+        {"mk:@MSITStore:C:\\Program Files/AutoCAD 2008\\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm", URL_FILE_USE_PATHURL, "mk:@MSITStore:C:\\Program Files/AutoCAD 2008\\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm"},
+    };
+
     urllen = lstrlenA(winehqA);

/* Parameter checks */ @@ -1161,10 +1158,8 @@ static void test_UrlCanonicalizeA(void) ok(hr == S_OK, "hr = %lx\n", hr);
/* test url-modification */ - for (i = 0; i < ARRAY_SIZE(TEST_CANONICALIZE); i++) { - check_url_canonicalize(i, TEST_CANONICALIZE[i].url, TEST_CANONICALIZE[i].flags, - TEST_CANONICALIZE[i].expecturl, TEST_CANONICALIZE[i].todo); - } + for (i = 0; i < ARRAY_SIZE(tests); i++) + check_url_canonicalize(i, tests[i].url, tests[i].flags, tests[i].expect, tests[i].todo); }
/* ########################### */
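Many of the table entries above exercise "." and ".." collapsing. For orientation, here is a minimal sketch of generic dot-segment removal in the style of RFC 3986. It is an assumption-laden illustration, not the kernelbase implementation: sketch_remove_dot_segments() is a hypothetical name, only '/' is treated as a separator, and only the path component is handled. It also does not reproduce the Windows quirks the tests record; for example it would collapse "/.." to "/", whereas the table expects "http://www.winehq.org/.." to be left alone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: collapse "." and ".." segments of an absolute path
 * in place. The path must be empty or start with '/'.
 */
static void sketch_remove_dot_segments(char *path)
{
    const char *in = path;
    char *out = path;

    while (*in)
    {
        const char *seg, *end;
        size_t len;

        /* Each iteration consumes one "/segment" pair. */
        assert(*in == '/');
        seg = in + 1;
        end = seg;
        while (*end && *end != '/')
            ++end;
        len = (size_t)(end - seg);

        if (len == 1 && seg[0] == '.')
        {
            /* "/." is dropped; keep a trailing slash if it ends the path. */
            if (!*end)
                *out++ = '/';
        }
        else if (len == 2 && seg[0] == '.' && seg[1] == '.')
        {
            /* "/.." removes the previously emitted segment, if any. */
            while (out > path && *--out != '/')
                ;
            if (!*end)
                *out++ = '/';
        }
        else
        {
            /* Ordinary segment: copy it together with its leading slash. */
            memmove(out, in, len + 1);
            out += len + 1;
        }
        in = end;
    }
    *out = '\0';
}
```

Because the output never grows past the input that has already been consumed, the rewrite can be done in place without a scratch buffer.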
From: Zebediah Figura zfigura@codeweavers.com
--- dlls/shlwapi/tests/url.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/dlls/shlwapi/tests/url.c b/dlls/shlwapi/tests/url.c index d99be7b6a43..6f8bae7a21a 100644 --- a/dlls/shlwapi/tests/url.c +++ b/dlls/shlwapi/tests/url.c @@ -777,7 +777,6 @@ static void check_url_canonicalize(int index, const char *szUrl, DWORD dwFlags, CHAR szReturnUrl[INTERNET_MAX_URL_LENGTH]; WCHAR wszReturnUrl[INTERNET_MAX_URL_LENGTH]; LPWSTR wszUrl = GetWideString(szUrl); - LPWSTR wszExpectUrl = GetWideString(szExpectUrl); LPWSTR wszConvertedUrl; HRESULT ret;
@@ -804,7 +803,6 @@ static void check_url_canonicalize(int index, const char *szUrl, DWORD dwFlags, FreeWideString(wszConvertedUrl);
FreeWideString(wszUrl); - FreeWideString(wszExpectUrl); }
From: Zebediah Figura zfigura@codeweavers.com
--- dlls/shlwapi/tests/url.c | 51 ++++++++++++++++++++-------------------- 1 file changed, 26 insertions(+), 25 deletions(-)
diff --git a/dlls/shlwapi/tests/url.c b/dlls/shlwapi/tests/url.c index 6f8bae7a21a..3856701f24b 100644 --- a/dlls/shlwapi/tests/url.c +++ b/dlls/shlwapi/tests/url.c @@ -772,37 +772,38 @@ static void test_UrlGetPart(void) }
/* ########################### */ -static void check_url_canonicalize(int index, const char *szUrl, DWORD dwFlags, const char *szExpectUrl, BOOL todo) +static void check_url_canonicalize(const char *url, DWORD flags, const char *expect, BOOL todo) { - CHAR szReturnUrl[INTERNET_MAX_URL_LENGTH]; - WCHAR wszReturnUrl[INTERNET_MAX_URL_LENGTH]; - LPWSTR wszUrl = GetWideString(szUrl); - LPWSTR wszConvertedUrl; - HRESULT ret; + char output[INTERNET_MAX_URL_LENGTH]; + WCHAR outputW[INTERNET_MAX_URL_LENGTH]; + WCHAR *urlW = GetWideString(url); + WCHAR *expectW; + HRESULT hr; + DWORD size;
- DWORD dwSize; + winetest_push_context("URL %s, flags %#lx", debugstr_a(url), flags);
- dwSize = INTERNET_MAX_URL_LENGTH; - ret = UrlCanonicalizeA(szUrl, NULL, &dwSize, dwFlags); - ok(ret == E_INVALIDARG, "Got unexpected hr %#lx for index %d.\n", ret, index); - ret = UrlCanonicalizeA(szUrl, szReturnUrl, &dwSize, dwFlags); - ok(ret == S_OK || (!szUrl[0] && ret == S_FALSE) /* Vista+ */, - "Got unexpected hr %#lx for index %d.\n", ret, index); + size = INTERNET_MAX_URL_LENGTH; + hr = UrlCanonicalizeA(url, NULL, &size, flags); + ok(hr == E_INVALIDARG, "Got unexpected hr %#lx.\n", hr); + hr = UrlCanonicalizeA(url, output, &size, flags); + ok(hr == S_OK || (!url[0] && hr == S_FALSE) /* Vista+ */, "Got unexpected hr %#lx.\n", hr); todo_wine_if (todo) - ok(strcmp(szReturnUrl,szExpectUrl)==0, "UrlCanonicalizeA dwFlags 0x%08lx url '%s' Expected \"%s\", but got \"%s\", index %d\n", dwFlags, szUrl, szExpectUrl, szReturnUrl, index); + ok(!strcmp(output, expect), "Expected %s, got %s.\n", debugstr_a(expect), debugstr_a(output)); + + size = INTERNET_MAX_URL_LENGTH; + hr = UrlCanonicalizeW(urlW, NULL, &size, flags); + ok(hr == E_INVALIDARG, "Got unexpected hr %#lx.\n", hr); + hr = UrlCanonicalizeW(urlW, outputW, &size, flags); + ok(hr == S_OK, "Got unexpected hr %#lx.\n", hr);
- dwSize = INTERNET_MAX_URL_LENGTH; - ret = UrlCanonicalizeW(wszUrl, NULL, &dwSize, dwFlags); - ok(ret == E_INVALIDARG, "Got unexpected hr %#lx for index %d.\n", ret, index); - ret = UrlCanonicalizeW(wszUrl, wszReturnUrl, &dwSize, dwFlags); - ok(ret == S_OK, "Got unexpected hr %#lx for index %d.\n", ret, index); + expectW = GetWideString(output); + ok(!wcscmp(outputW, expectW), "Expected %s, got %s.\n", debugstr_w(expectW), debugstr_w(outputW)); + FreeWideString(expectW);
- wszConvertedUrl = GetWideString(szReturnUrl); - ok(lstrcmpW(wszReturnUrl, wszConvertedUrl)==0, - "Strings didn't match between ansi and unicode UrlCanonicalize, index %d!\n", index); - FreeWideString(wszConvertedUrl); + FreeWideString(urlW);
- FreeWideString(wszUrl); + winetest_pop_context(); }
@@ -1157,7 +1158,7 @@ static void test_UrlCanonicalizeA(void)
/* test url-modification */ for (i = 0; i < ARRAY_SIZE(tests); i++) - check_url_canonicalize(i, tests[i].url, tests[i].flags, tests[i].expect, tests[i].todo); + check_url_canonicalize(tests[i].url, tests[i].flags, tests[i].expect, tests[i].todo); }
/* ########################### */
From: Zebediah Figura zfigura@codeweavers.com
And expand them a bit while we're at it. --- dlls/shlwapi/tests/url.c | 72 +++++++++++++++++++++++++++------------- 1 file changed, 49 insertions(+), 23 deletions(-)
diff --git a/dlls/shlwapi/tests/url.c b/dlls/shlwapi/tests/url.c index 3856701f24b..dbee7926aa8 100644 --- a/dlls/shlwapi/tests/url.c +++ b/dlls/shlwapi/tests/url.c @@ -1263,34 +1263,13 @@ static void check_url_combine(const char *szUrl1, const char *szUrl2, DWORD dwFl wszUrl2 = GetWideString(szUrl2); wszExpectUrl = GetWideString(szExpectUrl);
- hr = UrlCombineA(szUrl1, szUrl2, NULL, NULL, dwFlags); - ok(hr == E_INVALIDARG, "UrlCombineA returned 0x%08lx, expected 0x%08lx\n", hr, E_INVALIDARG); - - dwSize = 0; - hr = UrlCombineA(szUrl1, szUrl2, NULL, &dwSize, dwFlags); - ok(hr == E_POINTER, "Checking length of string, return was 0x%08lx, expected 0x%08lx\n", hr, E_POINTER); - ok(dwSize == dwExpectLen+1, "Got length %ld, expected %ld\n", dwSize, dwExpectLen+1); - - dwSize--; - hr = UrlCombineA(szUrl1, szUrl2, szReturnUrl, &dwSize, dwFlags); - ok(hr == E_POINTER, "UrlCombineA returned 0x%08lx, expected 0x%08lx\n", hr, E_POINTER); - ok(dwSize == dwExpectLen+1, "Got length %ld, expected %ld\n", dwSize, dwExpectLen+1); - + dwSize = ARRAY_SIZE(szReturnUrl); hr = UrlCombineA(szUrl1, szUrl2, szReturnUrl, &dwSize, dwFlags); ok(hr == S_OK, "Got unexpected hr %#lx.\n", hr); ok(dwSize == dwExpectLen, "Got length %ld, expected %ld\n", dwSize, dwExpectLen); ok(!strcmp(szReturnUrl, szExpectUrl), "Expected %s, got %s.\n", szExpectUrl, szReturnUrl);
- dwSize = 0; - hr = UrlCombineW(wszUrl1, wszUrl2, NULL, &dwSize, dwFlags); - ok(hr == E_POINTER, "Checking length of string, return was 0x%08lx, expected 0x%08lx\n", hr, E_POINTER); - ok(dwSize == dwExpectLen+1, "Got length %ld, expected %ld\n", dwSize, dwExpectLen+1); - - dwSize--; - hr = UrlCombineW(wszUrl1, wszUrl2, wszReturnUrl, &dwSize, dwFlags); - ok(hr == E_POINTER, "UrlCombineW returned 0x%08lx, expected 0x%08lx\n", hr, E_POINTER); - ok(dwSize == dwExpectLen+1, "Got length %ld, expected %ld\n", dwSize, dwExpectLen+1); - + dwSize = ARRAY_SIZE(wszReturnUrl); hr = UrlCombineW(wszUrl1, wszUrl2, wszReturnUrl, &dwSize, dwFlags); ok(hr == S_OK, "Got unexpected hr %#lx.\n", hr); ok(dwSize == dwExpectLen, "Got length %ld, expected %ld\n", dwSize, dwExpectLen); @@ -1308,7 +1287,54 @@ static void check_url_combine(const char *szUrl1, const char *szUrl2, DWORD dwFl
static void test_UrlCombine(void) { + WCHAR bufferW[30]; + char buffer[30]; unsigned int i; + HRESULT hr; + DWORD size; + + hr = UrlCombineA("http://base/", "relative", NULL, NULL, 0); + ok(hr == E_INVALIDARG, "Got hr %#lx.\n", hr); + + size = 0; + hr = UrlCombineA("http://base/", "relative", NULL, &size, 0); + ok(hr == E_POINTER, "Got hr %#lx.\n", hr); + ok(size == strlen("http://base/relative") + 1, "Got size %lu.\n", size); + + --size; + strcpy(buffer, "x"); + hr = UrlCombineA("http://base/", "relative", buffer, &size, 0); + ok(hr == E_POINTER, "Got hr %#lx.\n", hr); + ok(size == strlen("http://base/relative") + 1, "Got size %lu.\n", size); + ok(!strcmp(buffer, "x"), "Got buffer contents %s.\n", debugstr_a(buffer)); + + strcpy(buffer, "x"); + hr = UrlCombineA("http://base/", "relative", buffer, &size, 0); + ok(hr == S_OK, "Got hr %#lx.\n", hr); + ok(size == strlen("http://base/relative"), "Got size %lu.\n", size); + ok(!strcmp(buffer, "http://base/relative"), "Got buffer contents %s.\n", debugstr_a(buffer)); + + hr = UrlCombineW(L"http://base/", L"relative", NULL, NULL, 0); + ok(hr == E_INVALIDARG, "Got hr %#lx.\n", hr); + + size = 0; + hr = UrlCombineW(L"http://base/", L"relative", NULL, &size, 0); + ok(hr == E_POINTER, "Got hr %#lx.\n", hr); + ok(size == strlen("http://base/relative") + 1, "Got size %lu.\n", size); + + --size; + wcscpy(bufferW, L"x"); + hr = UrlCombineW(L"http://base/", L"relative", bufferW, &size, 0); + ok(hr == E_POINTER, "Got hr %#lx.\n", hr); + ok(size == strlen("http://base/relative") + 1, "Got size %lu.\n", size); + ok(!wcscmp(bufferW, L"x"), "Got buffer contents %s.\n", debugstr_w(bufferW)); + + wcscpy(bufferW, L"x"); + hr = UrlCombineW(L"http://base/", L"relative", bufferW, &size, 0); + ok(hr == S_OK, "Got hr %#lx.\n", hr); + ok(size == strlen("http://base/relative"), "Got size %lu.\n", size); + ok(!wcscmp(bufferW, L"http://base/relative"), "Got buffer contents %s.\n", debugstr_w(bufferW)); + for (i = 0; i < ARRAY_SIZE(TEST_COMBINE); 
i++) { check_url_combine(TEST_COMBINE[i].url1, TEST_COMBINE[i].url2, TEST_COMBINE[i].flags, TEST_COMBINE[i].expecturl); }
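The size handling that the new UrlCombine tests check can be sketched in portable C. This is only an illustration, not Wine code: sketch_combine() and the sketch_status values are hypothetical stand-ins for UrlCombineA() and for S_OK / E_POINTER / E_INVALIDARG, and the "combining" is plain concatenation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes standing in for S_OK, E_POINTER and
 * E_INVALIDARG; they are not part of any real API. */
enum sketch_status { SKETCH_OK, SKETCH_TOO_SMALL, SKETCH_INVALID };

/* Concatenate base and relative into out, following the size convention
 * that the tests above expect:
 *  - a NULL size pointer is an invalid argument;
 *  - if the buffer is missing or too small, report the required size
 *    including the terminator and leave the buffer untouched;
 *  - on success, report the length excluding the terminator. */
static enum sketch_status sketch_combine(const char *base, const char *relative,
                                         char *out, size_t *size)
{
    size_t needed;

    if (!base || !relative || !size)
        return SKETCH_INVALID;

    needed = strlen(base) + strlen(relative) + 1;
    if (!out || *size < needed)
    {
        *size = needed;
        return SKETCH_TOO_SMALL;
    }

    strcpy(out, base);
    strcat(out, relative);
    *size = needed - 1;
    return SKETCH_OK;
}
```

A caller would typically call once with a NULL buffer to learn the required size, allocate, and call again; the second call then returns the string length without the terminator.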
From: Zebediah Figura zfigura@codeweavers.com
--- dlls/kernelbase/path.c | 31 +++++++++++++------------------ 1 file changed, 13 insertions(+), 18 deletions(-)
diff --git a/dlls/kernelbase/path.c b/dlls/kernelbase/path.c index 40ddf840045..1e55bcbcf40 100644 --- a/dlls/kernelbase/path.c +++ b/dlls/kernelbase/path.c @@ -4662,7 +4662,7 @@ HRESULT WINAPI UrlCombineA(const char *base, const char *relative, char *combine HRESULT WINAPI UrlCombineW(const WCHAR *baseW, const WCHAR *relativeW, WCHAR *combined, DWORD *combined_len, DWORD flags) { DWORD i, len, process_case = 0, myflags, sizeloc = 0; - LPWSTR work, preliminary, mbase, mrelative; + LPWSTR work, preliminary, mbase, canonicalized; PARSEDURLW base, relative; HRESULT hr;
@@ -4677,7 +4677,7 @@ HRESULT WINAPI UrlCombineW(const WCHAR *baseW, const WCHAR *relativeW, WCHAR *co /* Get space for duplicates of the input and the output */ preliminary = heap_alloc(3 * INTERNET_MAX_URL_LENGTH * sizeof(WCHAR)); mbase = preliminary + INTERNET_MAX_URL_LENGTH; - mrelative = mbase + INTERNET_MAX_URL_LENGTH; + canonicalized = mbase + INTERNET_MAX_URL_LENGTH; *preliminary = '\0';
/* Canonicalize the base input prior to looking for the scheme */ @@ -4685,10 +4685,6 @@ HRESULT WINAPI UrlCombineW(const WCHAR *baseW, const WCHAR *relativeW, WCHAR *co len = INTERNET_MAX_URL_LENGTH; UrlCanonicalizeW(baseW, mbase, &len, myflags);
- /* Canonicalize the relative input prior to looking for the scheme */ - len = INTERNET_MAX_URL_LENGTH; - UrlCanonicalizeW(relativeW, mrelative, &len, myflags); - /* See if the base has a scheme */ if (ParseURLW(mbase, &base) != S_OK) { @@ -4787,12 +4783,12 @@ HRESULT WINAPI UrlCombineW(const WCHAR *baseW, const WCHAR *relativeW, WCHAR *co * the last '/' */
- if (ParseURLW(mrelative, &relative) != S_OK) + if (ParseURLW(relativeW, &relative) != S_OK) { /* No scheme in relative */ TRACE("no scheme detected in Relative\n"); - relative.pszSuffix = mrelative; /* case 3,4,5 depends on this */ - relative.cchSuffix = lstrlenW(mrelative); + relative.pszSuffix = relativeW; /* case 3,4,5 depends on this */ + relative.cchSuffix = lstrlenW( relativeW ); if (*relativeW == ':') { /* Case that is either left alone or uses base. */ @@ -4804,26 +4800,26 @@ HRESULT WINAPI UrlCombineW(const WCHAR *baseW, const WCHAR *relativeW, WCHAR *co process_case = 1; break; } - if (is_drive_spec( mrelative )) + if (is_drive_spec( relativeW )) { /* case that becomes "file:///" */ lstrcpyW(preliminary, L"file:///"); process_case = 1; break; } - if (*mrelative == '/' && *(mrelative+1) == '/') + if (relativeW[0] == '/' && relativeW[1] == '/') { /* Relative has location and the rest. */ process_case = 3; break; } - if (*mrelative == '/') + if (*relativeW == '/') { /* Relative is root to location. */ process_case = 4; break; } - if (*mrelative == '#') + if (*relativeW == '#') { if (!(work = wcschr(base.pszSuffix+base.cchSuffix, '#'))) work = (LPWSTR)base.pszSuffix + lstrlenW(base.pszSuffix); @@ -4880,12 +4876,12 @@ HRESULT WINAPI UrlCombineW(const WCHAR *baseW, const WCHAR *relativeW, WCHAR *co { case 1: /* Return relative appended to whatever is in combined (which may the string "file:///" */ - lstrcatW(preliminary, mrelative); + lstrcatW(preliminary, relativeW); break;
case 2: /* Relative replaces scheme and location */ - lstrcpyW(preliminary, mrelative); + lstrcpyW(preliminary, relativeW); break;
case 3: @@ -4920,12 +4916,11 @@ HRESULT WINAPI UrlCombineW(const WCHAR *baseW, const WCHAR *relativeW, WCHAR *co
if (hr == S_OK) { - /* Reuse mrelative as temp storage as it's already allocated and not needed anymore */ if (*combined_len == 0) *combined_len = 1; - hr = UrlCanonicalizeW(preliminary, mrelative, combined_len, flags & ~URL_FILE_USE_PATHURL); + hr = UrlCanonicalizeW(preliminary, canonicalized, combined_len, flags & ~URL_FILE_USE_PATHURL); if (SUCCEEDED(hr) && combined) - lstrcpyW(combined, mrelative); + lstrcpyW( combined, canonicalized );
TRACE("return-%ld len=%ld, %s\n", process_case, *combined_len, debugstr_w(combined)); }
From: Zebediah Figura zfigura@codeweavers.com
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=23166 --- dlls/kernelbase/path.c | 878 +++++++++++++++++++++++++++------------ dlls/shlwapi/tests/url.c | 12 +- 2 files changed, 618 insertions(+), 272 deletions(-)
diff --git a/dlls/kernelbase/path.c b/dlls/kernelbase/path.c index 1e55bcbcf40..e361a5a5ee5 100644 --- a/dlls/kernelbase/path.c +++ b/dlls/kernelbase/path.c @@ -1,6 +1,7 @@ /* * Copyright 2018 Nikolay Sivov * Copyright 2018 Zhiyi Zhang + * Copyright 2021-2023 Zebediah Figura * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -18,7 +19,9 @@ */
#include <stdarg.h> +#include <stdbool.h> #include <string.h> +#include <wchar.h>
#include "windef.h" #include "winbase.h" @@ -91,6 +94,38 @@ static WCHAR *heap_strdupAtoW(const char *str) return ret; }
+static bool array_reserve(void **elements, size_t *capacity, size_t count, size_t size) +{ + unsigned int new_capacity, max_capacity; + void *new_elements; + + if (count <= *capacity) + return true; + + max_capacity = ~(SIZE_T)0 / size; + if (count > max_capacity) + return false; + + new_capacity = max(4, *capacity); + while (new_capacity < count && new_capacity <= max_capacity / 2) + new_capacity *= 2; + if (new_capacity < count) + new_capacity = max_capacity; + + if (!(new_elements = heap_realloc( *elements, new_capacity * size ))) + return false; + + *elements = new_elements; + *capacity = new_capacity; + + return true; +} + +static bool is_slash( char c ) +{ + return c == '/' || c == '\\'; +} + static BOOL is_drive_spec( const WCHAR *str ) { return isalpha( str[0] ) && str[1] == ':'; @@ -2770,6 +2805,13 @@ url_schemes[] = { URL_SCHEME_RES, L"res"}, };
+static const WCHAR *parse_scheme( const WCHAR *p ) +{ + while (isalnum( *p ) || *p == '+' || *p == '-' || *p == '.') + ++p; + return p; +} + static DWORD get_scheme_code(const WCHAR *scheme, DWORD scheme_len) { unsigned int i; @@ -3554,348 +3596,666 @@ HRESULT WINAPI UrlCanonicalizeA(const char *src_url, char *canonicalized, DWORD return hr; }
-HRESULT WINAPI UrlCanonicalizeW(const WCHAR *src_url, WCHAR *canonicalized, DWORD *canonicalized_len, DWORD flags) +static bool scheme_is_opaque( URL_SCHEME scheme ) { - WCHAR *url_copy, *url, *wk2, *mp, *mp2; - DWORD nByteLen, nLen, nWkLen; - const WCHAR *wk1, *root; - DWORD escape_flags; - WCHAR slash = '\0'; - HRESULT hr = S_OK; - BOOL is_file_url; - INT state; + switch (scheme) + { + case URL_SCHEME_ABOUT: + case URL_SCHEME_JAVASCRIPT: + case URL_SCHEME_MAILTO: + case URL_SCHEME_SHELL: + case URL_SCHEME_VBSCRIPT: + return true;
- TRACE("%s, %p, %p, %#lx\n", wine_dbgstr_w(src_url), canonicalized, canonicalized_len, flags); + default: + return false; + } +}
- if (!src_url || !canonicalized || !canonicalized_len || !*canonicalized_len) - return E_INVALIDARG; +static bool scheme_preserves_backslashes( URL_SCHEME scheme ) +{ + switch (scheme) + { + case URL_SCHEME_FTP: + case URL_SCHEME_INVALID: + case URL_SCHEME_LOCAL: + case URL_SCHEME_MK: + case URL_SCHEME_RES: + case URL_SCHEME_UNKNOWN: + case URL_SCHEME_WAIS: + return true;
- if (!*src_url) + default: + return false; + } +} + +static bool scheme_uses_hostname( URL_SCHEME scheme ) +{ + switch (scheme) { - *canonicalized = 0; - return S_OK; + case URL_SCHEME_ABOUT: + case URL_SCHEME_JAVASCRIPT: + case URL_SCHEME_MAILTO: + case URL_SCHEME_MK: + case URL_SCHEME_SHELL: + case URL_SCHEME_VBSCRIPT: + return false; + + default: + return true; } +}
- /* Remove '\t' characters from URL */ - nByteLen = (lstrlenW(src_url) + 1) * sizeof(WCHAR); /* length in bytes */ - url = HeapAlloc(GetProcessHeap(), 0, nByteLen); - if(!url) - return E_OUTOFMEMORY; +static bool scheme_char_is_separator( URL_SCHEME scheme, WCHAR c ) +{ + if (c == '/') + return true; + if (c == '\\' && scheme != URL_SCHEME_INVALID && scheme != URL_SCHEME_UNKNOWN) + return true; + return false; +} + +static bool scheme_char_is_hostname_separator( URL_SCHEME scheme, DWORD flags, WCHAR c ) +{ + switch (c) + { + case 0: + case '/': + return true; + case '\\': + return !scheme_preserves_backslashes( scheme ); + case '?': + return scheme != URL_SCHEME_FILE || (flags & (URL_WININET_COMPATIBILITY | URL_FILE_USE_PATHURL)); + case '#': + return scheme != URL_SCHEME_FILE; + default: + return false; + } +}
- wk1 = src_url; - wk2 = url; - do +static bool scheme_char_is_dot_separator( URL_SCHEME scheme, DWORD flags, WCHAR c ) +{ + switch (c) { - while(*wk1 == '\t') - wk1++; - *wk2++ = *wk1; - } while (*wk1++); + case 0: + case '/': + case '?': + return true; + case '#': + return (scheme != URL_SCHEME_FILE || !(flags & (URL_WININET_COMPATIBILITY | URL_FILE_USE_PATHURL))); + case '\\': + return (scheme != URL_SCHEME_INVALID && scheme != URL_SCHEME_UNKNOWN && scheme != URL_SCHEME_MK); + default: + return false; + } +}
- /* Allocate memory for simplified URL (before escaping) */ - nByteLen = (wk2-url)*sizeof(WCHAR); - url_copy = heap_alloc(nByteLen + sizeof(L"file:///")); - if (!url_copy) +/* There are essentially two types of behaviour concerning dot simplification, + * not counting opaque schemes: + * + * 1) Simplify dots if and only if the first element is not a single or double + * dot. If a double dot would rewind past the root, ignore it. For example: + * + * http://hostname/a/../../b/. -> http://hostname/b/ + * http://hostname/./../../b/. -> http://hostname/./../../b/. + * + * 2) Effectively treat all paths as relative. Always simplify, except if a + * double dot would rewind past the root, in which case emit it verbatim. + * For example: + * + * wine://hostname/a/../../b/. -> wine://hostname/../b/ + * wine://hostname/./../../b/. -> wine://hostname/../b/ + * + * For unclear reasons, this behaviour also correlates with whether a final + * slash is always emitted after a single or double dot (e.g. if + * URL_DONT_SIMPLIFY is specified). The former type does not emit a slash; the + * latter does. 
+ */ +static bool scheme_is_always_relative( URL_SCHEME scheme, DWORD flags ) +{ + switch (scheme) { - heap_free(url); - return E_OUTOFMEMORY; + case URL_SCHEME_INVALID: + case URL_SCHEME_UNKNOWN: + return true; + + case URL_SCHEME_FILE: + return flags & (URL_WININET_COMPATIBILITY | URL_FILE_USE_PATHURL); + + default: + return false; } +} + +struct string_buffer +{ + WCHAR *string; + size_t len, capacity; +}; + +static void append_string( struct string_buffer *buffer, const WCHAR *str, size_t len ) +{ + array_reserve( (void **)&buffer->string, &buffer->capacity, buffer->len + len, sizeof(WCHAR) ); + memcpy( buffer->string + buffer->len, str, len * sizeof(WCHAR) ); + buffer->len += len; +} + +static void append_char( struct string_buffer *buffer, WCHAR c ) +{ + append_string( buffer, &c, 1 ); +} + +static char get_slash_dir( URL_SCHEME scheme, DWORD flags, char src, const struct string_buffer *dst ) +{ + if (src && scheme_preserves_backslashes( scheme )) + return src; + + if (scheme == URL_SCHEME_FILE && (flags & (URL_FILE_USE_PATHURL | URL_WININET_COMPATIBILITY)) + && !wmemchr( dst->string, '#', dst->len )) + return '\\';
- is_file_url = !wcsncmp(url, L"file:", 5); + return '/'; +} + +static void rewrite_url( struct string_buffer *dst, const WCHAR *url, DWORD *flags_ptr ) +{ + DWORD flags = *flags_ptr; + bool pathurl = (flags & (URL_FILE_USE_PATHURL | URL_WININET_COMPATIBILITY)); + bool is_relative = false, has_hostname = false, has_initial_slash = false; + const WCHAR *query = NULL, *hash = NULL; + URL_SCHEME scheme = URL_SCHEME_INVALID; + size_t query_len = 0, hash_len = 0; + const WCHAR *scheme_end, *src_end; + const WCHAR *hostname = NULL; + size_t hostname_len = 0; + const WCHAR *src = url; + size_t root_offset;
- if ((nByteLen >= 5*sizeof(WCHAR) && !wcsncmp(url, L"http:", 5)) || is_file_url) - slash = '/'; + /* Determine the scheme. */
- if ((flags & (URL_FILE_USE_PATHURL | URL_WININET_COMPATIBILITY)) && is_file_url) - slash = '\\'; + scheme_end = parse_scheme( url );
- if (nByteLen >= 4*sizeof(WCHAR) && !wcsncmp(url, L"res:", 4)) + if (*scheme_end == ':' && scheme_end >= url + 2) { - flags &= ~URL_FILE_USE_PATHURL; - slash = '\0'; + size_t scheme_len = scheme_end + 1 - url; + + scheme = get_scheme_code( url, scheme_len - 1 ); + + for (size_t i = 0; i < scheme_len; ++i) + append_char( dst, tolower( *src++ )); } + else if (url[0] == '\\' && url[1] == '\\') + { + append_string( dst, L"file:", 5 ); + if (!pathurl && !(flags & URL_UNESCAPE)) + flags |= URL_ESCAPE_UNSAFE | URL_ESCAPE_PERCENT; + scheme = URL_SCHEME_FILE;
- /* - * state = - * 0 initial 1,3 - * 1 have 2[+] alnum 2,3 - * 2 have scheme (found :) 4,6,3 - * 3 failed (no location) - * 4 have // 5,3 - * 5 have 1[+] alnum 6,3 - * 6 have location (found /) save root location - */ + has_hostname = true; + }
- wk1 = url; - wk2 = url_copy; - state = 0; + if (is_escaped_drive_spec( url )) + { + append_string( dst, L"file://", 7 ); + if (!pathurl && !(flags & URL_UNESCAPE)) + flags |= URL_ESCAPE_UNSAFE | URL_ESCAPE_PERCENT; + scheme = URL_SCHEME_FILE;
- /* Assume path */ - if (url[1] == ':') + hostname_len = 0; + has_hostname = true; + } + else if (scheme == URL_SCHEME_MK) { - lstrcpyW(wk2, L"file:///"); - wk2 += lstrlenW(wk2); - if (flags & (URL_FILE_USE_PATHURL | URL_WININET_COMPATIBILITY)) + if (src[0] == '@') { - slash = '\\'; - --wk2; + while (*src && *src != '/') + append_char( dst, *src++ ); + if (*src == '/') + append_char( dst, *src++ ); + else + append_char( dst, '/' ); + + if ((src[0] == '.' && scheme_char_is_dot_separator( scheme, flags, src[1] )) || + (src[0] == '.' && src[1] == '.' && scheme_char_is_dot_separator( scheme, flags, src[2] ))) + is_relative = true; } - else - flags |= URL_ESCAPE_UNSAFE; - state = 5; - is_file_url = TRUE; } - else if (url[0] == '/') + else if (scheme_uses_hostname( scheme ) && scheme_char_is_separator( scheme, src[0] ) + && scheme_char_is_separator( scheme, src[1] )) { - state = 5; - is_file_url = TRUE; + append_char( dst, scheme_preserves_backslashes( scheme ) ? src[0] : '/' ); + append_char( dst, scheme_preserves_backslashes( scheme ) ? src[1] : '/' ); + src += 2; + if (scheme == URL_SCHEME_FILE && is_slash( src[0] ) && is_slash( src[1] )) + { + while (is_slash( *src )) + ++src; + } + + hostname = src; + + while (!scheme_char_is_hostname_separator( scheme, flags, *src )) + ++src; + hostname_len = src - hostname; + has_hostname = true; + has_initial_slash = true; } + else if (scheme_char_is_separator( scheme, src[0] )) + { + has_initial_slash = true; + + if (scheme == URL_SCHEME_UNKNOWN || scheme == URL_SCHEME_INVALID) + { + /* Special case: an unknown scheme starting with a single slash + * considers the "root" to be the single slash. + * Most other schemes treat it as an empty path segment instead. */ + append_char( dst, *src++ ); + + if (*src == '\\') + ++src; + } + else if (scheme == URL_SCHEME_FILE) + { + src++; + + append_string( dst, L"//", 2 );
- while (*wk1) + hostname_len = 0; + has_hostname = true; + } + } + else { - switch (state) + if (scheme == URL_SCHEME_FILE) { - case 0: - if (!isalnum(*wk1)) {state = 3; break;} - *wk2++ = *wk1++; - if (!isalnum(*wk1)) {state = 3; break;} - *wk2++ = *wk1++; - state = 1; - break; - case 1: - *wk2++ = *wk1; - if (*wk1++ == ':') state = 2; - break; - case 2: - *wk2++ = *wk1++; - if (*wk1 != '/') {state = 6; break;} - *wk2++ = *wk1++; - if ((flags & URL_FILE_USE_PATHURL) && nByteLen >= 9*sizeof(WCHAR) && is_file_url - && !wcsncmp(wk1, L"localhost", 9)) + if (is_escaped_drive_spec( src )) + { + append_string( dst, L"//", 2 ); + hostname_len = 0; + has_hostname = true; + } + else { - wk1 += 9; - while (*wk1 == '\' && (flags & URL_FILE_USE_PATHURL)) - wk1++; + if (flags & URL_FILE_USE_PATHURL) + append_string( dst, L"//", 2 ); } + } + } + + if (scheme == URL_SCHEME_FILE && (flags & URL_FILE_USE_PATHURL)) + flags |= URL_UNESCAPE; + + *flags_ptr = flags;
- if (*wk1 == '/' && (flags & URL_FILE_USE_PATHURL)) - wk1++; - else if (is_file_url) + if (has_hostname) + { + if (scheme == URL_SCHEME_FILE) + { + bool is_drive = false; + + if (is_slash( *src )) + ++src; + + if (hostname_len >= 2 && is_escaped_drive_spec( hostname )) + { + hostname_len = 0; + src = hostname; + is_drive = true; + } + else if (is_escaped_drive_spec( src )) { - const WCHAR *body = wk1; + is_drive = true; + }
- while (*body == '/') - ++body; + if (pathurl) + { + if (hostname_len == 9 && !wcsnicmp( hostname, L"localhost", 9 )) + { + hostname_len = 0; + if (is_slash( *src )) + ++src; + if (is_escaped_drive_spec( src )) + is_drive = true; + }
- if (is_drive_spec( body )) + if (!is_drive) { - if (!(flags & (URL_WININET_COMPATIBILITY | URL_FILE_USE_PATHURL))) + if (hostname_len) { - if (slash) - *wk2++ = slash; - else - *wk2++ = '/'; + append_string( dst, L"\\\\", 2 ); + append_string( dst, hostname, hostname_len ); } + + if ((*src && *src != '?') || (flags & URL_WININET_COMPATIBILITY)) + append_char( dst, get_slash_dir( scheme, flags, 0, dst )); } - else + } + else + { + if (hostname_len) + append_string( dst, hostname, hostname_len ); + append_char( dst, '/' ); + } + + if (is_drive) + { + /* Root starts after the first slash when file flags are in use, + * but directly after the drive specification if not. */ + if (pathurl) { - if (flags & URL_WININET_COMPATIBILITY) + while (!scheme_char_is_hostname_separator( scheme, flags, *src )) + append_char( dst, *src++ ); + if (is_slash( *src )) { - if (*wk1 == '/' && *(wk1 + 1) != '/') - { - *wk2++ = '\\'; - } - else - { - *wk2++ = '\\'; - *wk2++ = '\\'; - } - } - else - { - if (*wk1 == '/' && *(wk1+1) != '/') - { - if (slash) - *wk2++ = slash; - else - *wk2++ = '/'; - } + append_char( dst, '\\' ); + src++; } } - wk1 = body; - } - state = 4; - break; - case 3: - nWkLen = lstrlenW(wk1); - memcpy(wk2, wk1, (nWkLen + 1) * sizeof(WCHAR)); - mp = wk2; - wk1 += nWkLen; - wk2 += nWkLen; - - if (slash) - { - while (mp < wk2) + else { - if (*mp == '/' || *mp == '\\') - *mp = slash; - mp++; + append_char( dst, *src++ ); + append_char( dst, *src++ ); + if (is_slash( *src )) + { + append_char( dst, '/' ); + src++; + } } } - break; - case 4: - if (!isalnum(*wk1) && (*wk1 != '-') && (*wk1 != '.') && (*wk1 != ':')) - { - state = 3; - break; - } - while (isalnum(*wk1) || (*wk1 == '-') || (*wk1 == '.') || (*wk1 == ':')) - *wk2++ = *wk1++; - state = 5; - if (!*wk1) + } + else + { + for (size_t i = 0; i < hostname_len; ++i) { - if (slash) - *wk2++ = slash; + if (scheme == URL_SCHEME_UNKNOWN || scheme == URL_SCHEME_INVALID) + append_char( dst, hostname[i] ); else - *wk2++ = '/'; + 
append_char( dst, tolower( hostname[i] )); } - break; - case 5: - if (*wk1 != '/' && *wk1 != '\\') + + if (*src == '/' || *src == '\\') { - state = 3; - break; + append_char( dst, scheme_preserves_backslashes( scheme ) ? *src : '/' ); + src++; } - while (*wk1 == '/' || *wk1 == '\\') + else { - if (slash) - *wk2++ = slash; - else - *wk2++ = *wk1; - wk1++; + append_char( dst, '/' ); } - state = 6; - break; - case 6: - if (flags & URL_DONT_SIMPLIFY) + } + + if ((src[0] == '.' && scheme_char_is_dot_separator( scheme, flags, src[1] )) || + (src[0] == '.' && src[1] == '.' && scheme_char_is_dot_separator( scheme, flags, src[2] ))) + { + if (!scheme_is_always_relative( scheme, flags )) + is_relative = true; + } + } + + /* root_offset now points to the point past which we will not rewind. + * If there is a hostname, it points to the character after the closing + * slash. */ + + root_offset = dst->len; + + /* Break up the rest of the URL into the body, query, and hash parts. */ + + src_end = src + wcslen( src ); + + if (scheme_is_opaque( scheme )) + { + /* +1 for null terminator */ + append_string( dst, src, src_end + 1 - src ); + return; + } + + if (scheme == URL_SCHEME_FILE) + { + if (!pathurl) + { + if (src[0] == '#') + hash = src; + else if (is_slash( src[0] ) && src[1] == '#') + hash = src + 1; + + if (src[0] == '?') + query = src; + else if (is_slash( src[0] ) && src[1] == '?') + query = src + 1; + } + else + { + query = wcschr( src, '?' ); + } + + if (!hash) + { + for (const WCHAR *p = src; p < src_end; ++p) { - state = 3; - break; + if (!wcsnicmp( p, L".htm#" , 5)) + hash = p + 4; + else if (!wcsnicmp( p, L".html#", 6 )) + hash = p + 5; } + } + } + else + { + query = wcschr( src, '?' ); + hash = wcschr( src, '#' ); + } + + if (query) + query_len = ((hash && hash > query) ? hash : src_end) - query; + if (hash) + hash_len = ((query && query > hash) ? 
query : src_end) - hash; + + if (query) + src_end = query; + if (hash && hash < src_end) + src_end = hash; + + if (scheme == URL_SCHEME_UNKNOWN && !has_initial_slash) + { + if (!(flags & URL_DONT_SIMPLIFY) && src[0] == '.' && src_end == src + 1) + src++; + flags |= URL_DONT_SIMPLIFY; + } + + while (src < src_end) + { + bool is_dots = false; + size_t len;
- /* Now at root location, cannot back up any more. */ - /* "root" will point at the '/' */ + for (len = 0; src + len < src_end && !scheme_char_is_separator( scheme, src[len] ); ++len) + ;
- root = wk2-1; - while (*wk1) + if (src[0] == '.' && scheme_char_is_dot_separator( scheme, flags, src[1] )) + { + if (!is_relative) { - mp = wcschr(wk1, '/'); - mp2 = wcschr(wk1, '\\'); - if (mp2 && (!mp || mp2 < mp)) - mp = mp2; - if (!mp) + if (flags & URL_DONT_SIMPLIFY) + { + is_dots = true; + } + else { - nWkLen = lstrlenW(wk1); - memcpy(wk2, wk1, (nWkLen + 1) * sizeof(WCHAR)); - wk1 += nWkLen; - wk2 += nWkLen; + ++src; + if (*src == '/' || *src == '\\') + ++src; continue; } - nLen = mp - wk1; - if (nLen) + } + } + else if (src[0] == '.' && src[1] == '.' && scheme_char_is_dot_separator( scheme, flags, src[2] )) + { + if (!is_relative) + { + if (flags & URL_DONT_SIMPLIFY) { - memcpy(wk2, wk1, nLen * sizeof(WCHAR)); - wk2 += nLen; - wk1 += nLen; + is_dots = true; } - if (slash) - *wk2++ = slash; - else - *wk2++ = *wk1; - wk1++; - - while (*wk1 == '.') + else if (dst->len == root_offset && scheme_is_always_relative( scheme, flags )) { - TRACE("found '/.'\n"); - if (wk1[1] == '/' || wk1[1] == '\\') - { - /* case of /./ -> skip the ./ */ - wk1 += 2; - } - else if (wk1[1] == '.' && (wk1[2] == '/' || wk1[2] == '\\' || wk1[2] == '?' - || wk1[2] == '#' || !wk1[2])) - { - /* case /../ -> need to backup wk2 */ - TRACE("found '/../'\n"); - *(wk2-1) = '\0'; /* set end of string */ - mp = wcsrchr(root, '/'); - mp2 = wcsrchr(root, '\\'); - if (mp2 && (!mp || mp2 < mp)) - mp = mp2; - if (mp && (mp >= root)) - { - /* found valid backup point */ - wk2 = mp + 1; - if(wk1[2] != '/' && wk1[2] != '\\') - wk1 += 2; - else - wk1 += 3; - } - else - { - /* did not find point, restore '/' */ - *(wk2-1) = slash; - break; - } - } + /* We could also use is_dots here, except that we need to + * update root afterwards. 
*/ + + append_char( dst, *src++ ); + append_char( dst, *src++ ); + if (*src == '/' || *src == '\\') + append_char( dst, get_slash_dir( scheme, flags, *src++, dst )); else - break; + append_char( dst, get_slash_dir( scheme, flags, 0, dst )); + root_offset = dst->len; + continue; + } + else + { + if (dst->len > root_offset) + --dst->len; /* rewind past the last slash */ + + while (dst->len > root_offset && !scheme_char_is_separator( scheme, dst->string[dst->len - 1] )) + --dst->len; + + src += 2; + if (*src == '/' || *src == '\\') + ++src; + continue; } } - *wk2 = '\0'; - break; - default: - FIXME("how did we get here - state=%d\n", state); - heap_free(url_copy); - heap_free(url); - return E_INVALIDARG; } - *wk2 = '\0'; - TRACE("Simplified, orig <%s>, simple <%s>\n", wine_dbgstr_w(src_url), wine_dbgstr_w(url_copy)); + + if (len) + { + append_string( dst, src, len ); + src += len; + } + + if (*src == '?' || *src == '#' || !*src) + { + if (scheme == URL_SCHEME_UNKNOWN && !has_initial_slash) + is_dots = false; + + if (is_dots && scheme_is_always_relative( scheme, flags )) + append_char( dst, get_slash_dir( scheme, flags, 0, dst )); + } + else /* slash */ + { + append_char( dst, get_slash_dir( scheme, flags, *src++, dst )); + } } - nLen = lstrlenW(url_copy); - while ((nLen > 0) && ((url_copy[nLen-1] <= ' '))) - url_copy[--nLen]=0;
- if ((flags & URL_UNESCAPE) || - ((flags & URL_FILE_USE_PATHURL) && nByteLen >= 5*sizeof(WCHAR) && !wcsncmp(url, L"file:", 5))) + /* If the source was non-empty but collapsed to an empty string, output a + * single slash. */ + if (!dst->len && src_end != url) + append_char( dst, '/' ); + + /* UNKNOWN and FILE schemes usually reorder the ? before the #, but others + * emit them in the original order. */ + if (query && hash && scheme != URL_SCHEME_FILE && scheme != URL_SCHEME_INVALID && scheme != URL_SCHEME_UNKNOWN) + { + if (query < hash) + { + append_string( dst, query, query_len ); + append_string( dst, hash, hash_len ); + } + else + { + append_string( dst, hash, hash_len ); + append_string( dst, query, query_len ); + } + } + else if (!(scheme == URL_SCHEME_FILE && (flags & URL_FILE_USE_PATHURL))) { - UrlUnescapeW(url_copy, NULL, &nLen, URL_UNESCAPE_INPLACE); + if (query) + append_string( dst, query, query_len ); + + if (hash) + append_string( dst, hash, hash_len ); }
+ append_char( dst, 0 ); +} + +HRESULT WINAPI UrlCanonicalizeW(const WCHAR *src_url, WCHAR *canonicalized, DWORD *canonicalized_len, DWORD flags) +{ + struct string_buffer rewritten = {0}; + DWORD escape_flags; + HRESULT hr = S_OK; + const WCHAR *src; + WCHAR *url, *dst; + DWORD len; + + TRACE("%s, %p, %p, %#lx\n", wine_dbgstr_w(src_url), canonicalized, canonicalized_len, flags); + + if (!src_url || !canonicalized || !canonicalized_len || !*canonicalized_len) + return E_INVALIDARG; + + if (!*src_url) + { + *canonicalized = 0; + return S_OK; + } + + /* PATHURL takes precedence. */ + if (flags & URL_FILE_USE_PATHURL) + flags &= ~URL_WININET_COMPATIBILITY; + + /* strip initial and final C0 control characters and space */ + src = src_url; + while (*src > 0 && *src <= 0x20) + ++src; + len = wcslen( src ); + while (len && src[len - 1] > 0 && src[len - 1] <= 0x20) + --len; + + if (!(url = HeapAlloc( GetProcessHeap(), 0, (len + 1) * sizeof(WCHAR) ))) + return E_OUTOFMEMORY; + + dst = url; + for (size_t i = 0; i < len; ++i) + { + if (src[i] != '\t' && src[i] != '\n' && src[i] != '\r') + *dst++ = src[i]; + } + *dst++ = 0; + + rewrite_url( &rewritten, url, &flags ); + + if (flags & URL_UNESCAPE) + { + len = rewritten.len; + UrlUnescapeW( rewritten.string, NULL, &len, URL_UNESCAPE_INPLACE); + rewritten.len = wcslen( rewritten.string ) + 1; + } + + /* URL_ESCAPE_SEGMENT_ONLY seems to be ignored. */ escape_flags = flags & (URL_ESCAPE_UNSAFE | URL_ESCAPE_SPACES_ONLY | URL_ESCAPE_PERCENT | - URL_DONT_ESCAPE_EXTRA_INFO | URL_ESCAPE_SEGMENT_ONLY); + URL_DONT_ESCAPE_EXTRA_INFO);
if (escape_flags) { escape_flags &= ~URL_ESCAPE_UNSAFE; - hr = UrlEscapeW(url_copy, canonicalized, canonicalized_len, escape_flags); + hr = UrlEscapeW( rewritten.string, canonicalized, canonicalized_len, escape_flags ); } else { /* No escaping needed, just copy the string */ - nLen = lstrlenW(url_copy); - if (nLen < *canonicalized_len) - memcpy(canonicalized, url_copy, (nLen + 1)*sizeof(WCHAR)); + if (rewritten.len <= *canonicalized_len) + { + memcpy( canonicalized, rewritten.string, rewritten.len * sizeof(WCHAR) ); + *canonicalized_len = rewritten.len - 1; + } else { hr = E_POINTER; - nLen++; + *canonicalized_len = rewritten.len; } - *canonicalized_len = nLen; }
- heap_free(url_copy); - heap_free(url); + heap_free( rewritten.string ); + heap_free( url );
if (hr == S_OK) TRACE("result %s\n", wine_dbgstr_w(canonicalized)); @@ -4223,13 +4583,6 @@ HRESULT WINAPI UrlGetPartA(const char *url, char *out, DWORD *out_len, DWORD par return hr; }
-static const WCHAR *parse_scheme( const WCHAR *p ) -{ - while (isalnum( *p ) || *p == '+' || *p == '-' || *p == '.') - ++p; - return p; -} - static const WCHAR *parse_url_element( const WCHAR *url, const WCHAR *separators ) { const WCHAR *p; @@ -4239,11 +4592,6 @@ static const WCHAR *parse_url_element( const WCHAR *url, const WCHAR *separators return url + wcslen( url ); }
-static BOOL is_slash( char c ) -{ - return c == '/' || c == '\\'; -} - static void parse_url( const WCHAR *url, struct parsed_url *pl ) { const WCHAR *work; diff --git a/dlls/shlwapi/tests/url.c b/dlls/shlwapi/tests/url.c index dbee7926aa8..53cc6d4a066 100644 --- a/dlls/shlwapi/tests/url.c +++ b/dlls/shlwapi/tests/url.c @@ -772,7 +772,7 @@ static void test_UrlGetPart(void) }
/* ########################### */ -static void check_url_canonicalize(const char *url, DWORD flags, const char *expect, BOOL todo) +static void check_url_canonicalize(const char *url, DWORD flags, const char *expect) { char output[INTERNET_MAX_URL_LENGTH]; WCHAR outputW[INTERNET_MAX_URL_LENGTH]; @@ -788,8 +788,7 @@ static void check_url_canonicalize(const char *url, DWORD flags, const char *exp ok(hr == E_INVALIDARG, "Got unexpected hr %#lx.\n", hr); hr = UrlCanonicalizeA(url, output, &size, flags); ok(hr == S_OK || (!url[0] && hr == S_FALSE) /* Vista+ */, "Got unexpected hr %#lx.\n", hr); - todo_wine_if (todo) - ok(!strcmp(output, expect), "Expected %s, got %s.\n", debugstr_a(expect), debugstr_a(output)); + ok(!strcmp(output, expect), "Expected %s, got %s.\n", debugstr_a(expect), debugstr_a(output));
size = INTERNET_MAX_URL_LENGTH; hr = UrlCanonicalizeW(urlW, NULL, &size, flags); @@ -967,12 +966,11 @@ static void test_UrlCanonicalizeA(void) const char *url; DWORD flags; const char *expect; - BOOL todo; } tests[] = { {"", 0, ""}, - {"http://www.winehq.org/tests/../tests/../..", 0, "http://www.winehq.org/", TRUE}, + {"http://www.winehq.org/tests/../tests/../..", 0, "http://www.winehq.org/"}, {"http://www.winehq.org/..", 0, "http://www.winehq.org/.."}, {"http://www.winehq.org/tests/tests2/../../tests", 0, "http://www.winehq.org/tests"}, {"http://www.winehq.org/tests/../tests", 0, "http://www.winehq.org/tests"}, @@ -1066,7 +1064,7 @@ static void test_UrlCanonicalizeA(void) {"///A/../B", URL_WININET_COMPATIBILITY, "///B"}, {"A", 0, "A"}, {"../A", 0, "../A"}, - {"A/../B", 0, "B", TRUE}, + {"A/../B", 0, "B"}, {"/uri-res/N2R?urn:sha1:B3K", URL_DONT_ESCAPE_EXTRA_INFO | URL_WININET_COMPATIBILITY /*0x82000000*/, "/uri-res/N2R?urn:sha1:B3K"} /*LimeWire online installer calls this*/, {"http:www.winehq.org/dir/../index.html", 0, "http:www.winehq.org/index.html"}, {"http://localhost/test.html", URL_FILE_USE_PATHURL, "http://localhost/test.html"}, @@ -1158,7 +1156,7 @@ static void test_UrlCanonicalizeA(void)
/* test url-modification */ for (i = 0; i < ARRAY_SIZE(tests); i++) - check_url_canonicalize(tests[i].url, tests[i].flags, tests[i].expect, tests[i].todo); + check_url_canonicalize(tests[i].url, tests[i].flags, tests[i].expect); }
/* ########################### */
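The first dot-simplification behaviour described in the comment above scheme_is_always_relative() (the one used for schemes such as http) can be sketched in portable C as follows. This is a simplified illustration under stated assumptions, not the patch's code: sketch_simplify_dots() is a hypothetical helper that only sees the path after the root slash, treats '/' as the only separator, and ignores queries, hashes, flags and backslashes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the segment of the given length is exactly "." or "..". */
static bool sketch_is_dot_segment(const char *seg, size_t len)
{
    return (len == 1 && seg[0] == '.')
            || (len == 2 && seg[0] == '.' && seg[1] == '.');
}

/* Simplify dot segments in path (the part after the root slash) into out,
 * which must hold at least strlen(path) + 1 bytes.
 * If the first segment is "." or "..", the path is copied verbatim.
 * Otherwise "." is dropped and ".." removes the previous segment; a ".."
 * that would rewind past the root is ignored. */
static void sketch_simplify_dots(const char *path, char *out)
{
    const char *src = path;
    size_t out_len = 0;

    if (sketch_is_dot_segment(path, strcspn(path, "/")))
    {
        strcpy(out, path);
        return;
    }

    while (*src)
    {
        size_t len = strcspn(src, "/");

        if (sketch_is_dot_segment(src, len))
        {
            if (len == 2)
            {
                if (out_len)
                    --out_len; /* rewind past the last slash */
                while (out_len && out[out_len - 1] != '/')
                    --out_len;
            }
            src += len;
            if (*src == '/')
                ++src;
            continue;
        }

        memcpy(out + out_len, src, len);
        out_len += len;
        src += len;
        if (*src == '/')
            out[out_len++] = *src++;
    }
    out[out_len] = 0;
}
```

With the path portions of the examples from that comment, "a/../../b/." becomes "b/" while "./../../b/." is left untouched. The second behaviour (always-relative schemes) would instead emit the excess ".." and move the root forward, which this sketch does not attempt.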
From: Zebediah Figura zfigura@codeweavers.com
--- dlls/kernelbase/path.c | 11 +---------- 1 file changed, 1 insertion(+), 10 deletions(-)
diff --git a/dlls/kernelbase/path.c b/dlls/kernelbase/path.c index e361a5a5ee5..e3f06966789 100644 --- a/dlls/kernelbase/path.c +++ b/dlls/kernelbase/path.c @@ -4816,16 +4816,7 @@ BOOL WINAPI UrlIsA(const char *url, URLIS Urlis) case URLIS_OPAQUE: base.cbSize = sizeof(base); if (ParseURLA(url, &base) != S_OK) return FALSE; /* invalid scheme */ - switch (base.nScheme) - { - case URL_SCHEME_MAILTO: - case URL_SCHEME_SHELL: - case URL_SCHEME_JAVASCRIPT: - case URL_SCHEME_VBSCRIPT: - case URL_SCHEME_ABOUT: - return TRUE; - } - return FALSE; + return scheme_is_opaque( base.nScheme );
case URLIS_FILEURL: return (CompareStringA(LOCALE_INVARIANT, NORM_IGNORECASE, url, 5, "file:", 5) == CSTR_EQUAL);
From: Zebediah Figura zfigura@codeweavers.com
--- dlls/shlwapi/tests/url.c | 1165 +++++++++++++++++++++++++++++++++++--- 1 file changed, 1083 insertions(+), 82 deletions(-)
diff --git a/dlls/shlwapi/tests/url.c b/dlls/shlwapi/tests/url.c index 53cc6d4a066..e51a080c330 100644 --- a/dlls/shlwapi/tests/url.c +++ b/dlls/shlwapi/tests/url.c @@ -950,53 +950,861 @@ static void test_UrlEscapeW(void) } }
-/* ########################### */ +struct canonicalize_test +{ + const char *url; + DWORD flags; + const char *expect; +};
static void test_UrlCanonicalizeA(void) { - unsigned int i; CHAR szReturnUrl[4*INTERNET_MAX_URL_LENGTH]; CHAR longurl[4*INTERNET_MAX_URL_LENGTH]; + char url[200], expect[200]; + unsigned int f, i, j; DWORD dwSize; DWORD urllen; HRESULT hr;
+ static const struct canonicalize_test unk_scheme_tests[] = + { + /* Single and double dots behave as one would expect, with the following + * notable rules: + * + * (1) A single or double dot as the first element (the "hostname") is + * always emitted as-is. + * + * (2) If a double dot would undo the hostname, it is emitted as-is + * instead. + * + * (3) If a single or double dot is the last element (either because of + * the above rule or because of URL_DONT_SIMPLIFY), a trailing + * backslash is appended. + * + * A trailing backslash is always appended after the hostname. + */ + + {"//", 0, "///"}, + {"//a", 0, "//a/"}, + {"//a/", 0, "//a/"}, + {"//a/b", 0, "//a/b"}, + {"//a/b/", 0, "//a/b/"}, + {"//.", 0, "//./"}, + {"//./", 0, "//./"}, + {"//./a", 0, "//./a"}, + {"//././a", 0, "//./a"}, + {"//a/.", 0, "//a/"}, + {"//a/./", 0, "//a/"}, + {"//a/./b", 0, "//a/b"}, + {"///./a", 0, "///a"}, + {"//a/.b/", 0, "//a/.b/"}, + {"//a/b./", 0, "//a/b./"}, + + {"//..", 0, "//../"}, + {"//../", 0, "//../"}, + {"//../a", 0, "//../a"}, + {"//../a/..", 0, "//../"}, + {"//.././..", 0, "//../../"}, + {"//../a/../..", 0, "//../../"}, + {"//./a/../..", 0, "//./../"}, + {"//a/..", 0, "//a/../"}, + {"//a/../../b/./c/..", 0, "//a/../../b/"}, + {"//a/b/..", 0, "//a/"}, + {"//a/b/...", 0, "//a/b/..."}, + {"//a/b/../", 0, "//a/"}, + {"//a/b/../c", 0, "//a/c"}, + {"//a/b/../c/..", 0, "//a/"}, + {"//a/b/../c/../..", 0, "//a/../"}, + {"//a/b/../../../c", 0, "//a/../../c"}, + {"///..", 0, "///../"}, + {"////..", 0, "///"}, + {"//a/..b/", 0, "//a/..b/"}, + {"//a/b../", 0, "//a/b../"}, + {"//A/B", 0, "//A/B"}, + + {"//././a", URL_DONT_SIMPLIFY, "//././a"}, + {"//a/.", URL_DONT_SIMPLIFY, "//a/./"}, + {"//a/./", URL_DONT_SIMPLIFY, "//a/./"}, + {"//a/./b", URL_DONT_SIMPLIFY, "//a/./b"}, + {"///./a", URL_DONT_SIMPLIFY, "///./a"}, + + {"//..", URL_DONT_SIMPLIFY, "//../"}, + {"//../", URL_DONT_SIMPLIFY, "//../"}, + {"//../a", URL_DONT_SIMPLIFY, "//../a"}, + {"//../a/..", URL_DONT_SIMPLIFY, 
"//../a/../"}, + {"//../a/...", URL_DONT_SIMPLIFY, "//../a/..."}, + {"//.././..", URL_DONT_SIMPLIFY, "//.././../"}, + {"//../a/../..", URL_DONT_SIMPLIFY, "//../a/../../"}, + {"//./a/../..", URL_DONT_SIMPLIFY, "//./a/../../"}, + {"//a/..", URL_DONT_SIMPLIFY, "//a/../"}, + {"//a/../../b/./c/..", URL_DONT_SIMPLIFY, "//a/../../b/./c/../"}, + {"//a/b/..", URL_DONT_SIMPLIFY, "//a/b/../"}, + {"//a/b/../", URL_DONT_SIMPLIFY, "//a/b/../"}, + {"//a/b/../c", URL_DONT_SIMPLIFY, "//a/b/../c"}, + {"//a/b/../c/..", URL_DONT_SIMPLIFY, "//a/b/../c/../"}, + {"//a/b/../c/../..", URL_DONT_SIMPLIFY, "//a/b/../c/../../"}, + {"///..", URL_DONT_SIMPLIFY, "///../"}, + {"////..", URL_DONT_SIMPLIFY, "////../"}, + + /* After ? or #, dots are not simplified. */ + {"//a/b?c/./d", 0, "//a/b?c/./d"}, + {"//a/b#c/./d", 0, "//a/b#c/./d"}, + {"//a/b#c/.", 0, "//a/b#c/."}, + /* ? and # can also be considered a boundary for trailing dots. */ + {"//a/b/.?", 0, "//a/b/?"}, + {"//a/b/..?", 0, "//a/?"}, + {"//a/b/..?", URL_DONT_SIMPLIFY, "//a/b/../?"}, + {"//a/b/.#", 0, "//a/b/#"}, + {"//a/b/..#", 0, "//a/#"}, + {"//a/b/..#", URL_DONT_SIMPLIFY, "//a/b/../#"}, + {"//a/..?", 0, "//a/../?"}, + {"//a/..#", 0, "//a/../#"}, + {"//..?", 0, "//../?"}, + {"//..#", 0, "//../#"}, + {"//?/a/./", 0, "///?/a/./"}, + {"//#/a/./", 0, "///#/a/./"}, + /* The first ? is reordered before the first #. */ + {"//a/b#c?d", 0, "//a/b?d#c"}, + {"//a/b?c#d?e", 0, "//a/b?c#d?e"}, + {"//a/b#c?d#e", 0, "//a/b?d#e#c"}, + {"//a/b#c#d?e", 0, "//a/b?e#c#d"}, + {"//a/b#c?d?e", 0, "//a/b?d?e#c"}, + + /* Backslashes are not treated as path separators. */ + {"//a/b\c/../.\", 0, "//a/.\"}, + {"//a\b/../", 0, "//a\b/../"}, + {"//a/b\../", 0, "//a/b\../"}, + {"//a/b/..\", 0, "//a/b/..\"}, + + /* Whitespace and unsafe characters are not (by default) escaped. */ + {"//a/b &c", 0, "//a/b &c"}, + + /* If one slash is omitted, the rules are much the same, except that + * there is no "hostname". 
Single dots are always collapsed; double dots + * are collapsed unless they would undo the "scheme". */ + + {"/a", 0, "/a"}, + {"/a/", 0, "/a/"}, + {"/.", 0, "/"}, + {"/./", 0, "/"}, + {"/././a", 0, "/a"}, + {"/a/.", 0, "/a/"}, + {"/a/./", 0, "/a/"}, + {"/a/./b", 0, "/a/b"}, + + {"/..", 0, "/../"}, + {"/../", 0, "/../"}, + {"/../a", 0, "/../a"}, + {"/../a/..", 0, "/../"}, + {"/a/..", 0, "/"}, + {"/a/../..", 0, "/../"}, + {"/a/b/..", 0, "/a/"}, + {"/a/b/../", 0, "/a/"}, + {"/a/b/../c", 0, "/a/c"}, + {"/a/b/../c/..", 0, "/a/"}, + {"/a/b/../c/../..", 0, "/"}, + + {"/a/b?c/./d", 0, "/a/b?c/./d"}, + {"/a/b#c/./d", 0, "/a/b#c/./d"}, + {"/a/b#c?d", 0, "/a/b?d#c"}, + + /* Just as above, backslashes are not treated as path separators. */ + {"/a/b\c/../.\", 0, "/a/.\"}, + {"/a/b\/c", 0, "/a/b\/c"}, + {"/a/b\.c", 0, "/a/b\.c"}, + /* If the first character after the slash is a backslash, it is skipped. + * It is not interpreted as a forward slash. + * The tests above show that this is not due to the backslash being + * interpreted as an escape character. */ + {"/\././a", 0, "/a"}, + /* The sequence // does not result in use of the double-slash rules. + * Rather, the resulting // is treated as an empty path element. */ + {"/\/././a", 0, "//a"}, + {"/\/././a/", 0, "//a/"}, + {"/\/..", 0, "/"}, + {"//a/\b", 0, "//a/\b"}, + + {"/a/b &c", 0, "/a/b &c"}, + + {"//a/b%20%26c", URL_UNESCAPE, "//a/b &c"}, + }; + static const struct { const char *url; DWORD flags; const char *expect; + const char *expect_ftp; } - tests[] = + http_tests[] = + { + /* A set of schemes including http differs from the "default" behaviour + * in the following ways: + * + * (1) If a double dot would undo the hostname, it is dropped instead. + * + * (2) If the first element after the hostname is a single or double + * dot, no further dots are simplified. + * + * (3) Trailing backslashes are not automatically appended after dots. 
+ */ + + {"//", 0, "///"}, + {"//a", 0, "//a/"}, + {"//a/", 0, "//a/"}, + {"//a/b", 0, "//a/b"}, + {"//a/b/", 0, "//a/b/"}, + {"//.", 0, "//./"}, + {"//./", 0, "//./"}, + {"//././a/.", 0, "//././a/."}, + {"//a/.", 0, "//a/."}, + {"//a/./b/./../", 0, "//a/./b/./../"}, + {"//a/b/.", 0, "//a/b/"}, + {"//a/b/.", URL_DONT_SIMPLIFY, "//a/b/."}, + {"//a/b/./", 0, "//a/b/"}, + {"//a/b/./c", 0, "//a/b/c"}, + {"///./a", 0, "///./a"}, + {"////./a", 0, "////a"}, + + {"//..", 0, "//../"}, + {"//../", 0, "//../"}, + {"//../a", 0, "//../a"}, + {"//../a/..", 0, "//../"}, + {"//../a/../..", 0, "//../"}, + {"//./a/../..", 0, "//./"}, + {"//a/../", 0, "//a/../"}, + {"//a/../../b/./../", 0, "//a/../../b/./../"}, + {"//a/.././", 0, "//a/.././"}, + {"//a/b/..", 0, "//a/"}, + {"//a/b/..", URL_DONT_SIMPLIFY, "//a/b/.."}, + {"//a/b/../", 0, "//a/"}, + {"//a/b/.././", 0, "//a/"}, + {"//a/b/../c", 0, "//a/c"}, + {"//a/b/../c/..", 0, "//a/"}, + {"//a/b/../c/../..", 0, "//a/"}, + {"//a/b/../../../c", 0, "//a/c"}, + {"///a/.", 0, "///a/"}, + {"///..", 0, "///.."}, + {"////..", 0, "///"}, + {"//a//../../..", 0, "//a/"}, + + {"//a/b?c/./d", 0, "//a/b?c/./d"}, + {"//a/b#c/./d", 0, "//a/b#c/./d"}, + {"//a/b#c?d", 0, "//a/b#c?d"}, + {"//a/b?c#d", 0, "//a/b?c#d"}, + + {"//localhost/b", 0, "//localhost/b"}, + + /* Most of these schemes translates backslashes to forward slashes, + * including the initial pair, and interpret them appropriately. + * + * A few schemes, including ftp, don't translate backslashes to forward + * slashes, but still interpret them as path separators, with the + * exception that the hostname must end in a forward slash. 
*/ + + {"//a/b\", 0, "//a/b/", "//a/b\"}, + {"//a/b\./c", 0, "//a/b/c", "//a/b\c"}, + {"//a/b/.\c", 0, "//a/b/c"}, + {"//a/b\c/../.\", 0, "//a/b/", "//a/b\"}, + {"//a\b", 0, "//a/b", "//a\b/"}, + {"//a\b/..", 0, "//a/", "//a\b/.."}, + {"//a/b\c", 0, "//a/b/c", "//a/b\c"}, + {"/\a\..", 0, "//a/..", "/\a\../"}, + {"\/a\..", 0, "//a/..", "\/a\../"}, + + {"//a/b &c", 0, "//a/b &c"}, + + /* If one or both slashes is missing, the portion after the colon is + * treated like a normal path, without a hostname. Single and double + * dots are always collapsed, and double dots which would rewind past + * the scheme are dropped instead. */ + + {"a", 0, "a"}, + {"a/", 0, "a/"}, + {"a/.", 0, "a/"}, + {"a/..", 0, ""}, + {"a/../..", 0, ""}, + {"a/../..", URL_DONT_SIMPLIFY, "a/../.."}, + {"", 0, ""}, + {"/", 0, "/"}, + {"/.", 0, "/"}, + {"/..", 0, ""}, + {"/../..", 0, ""}, + {".", 0, ""}, + {"..", 0, ""}, + {"./", 0, ""}, + {"../", 0, ""}, + + {"a/b?c/.\d", 0, "a/b?c/.\d"}, + {"a/b#c/.\d", 0, "a/b#c/.\d"}, + + {"a\b\", 0, "a/b/", "a\b\"}, + + {"a/b &c", 0, "a/b &c"}, + + {"/foo/bar/baz", URL_ESCAPE_SEGMENT_ONLY, "/foo/bar/baz"}, + {"/foo/bar/baz?a#b", URL_ESCAPE_SEGMENT_ONLY, "/foo/bar/baz?a#b"}, + + {"//www.winehq.org/tests\n", URL_ESCAPE_SPACES_ONLY | URL_ESCAPE_UNSAFE, "//www.winehq.org/tests"}, + {"//www.winehq.org/tests\r", URL_ESCAPE_SPACES_ONLY | URL_ESCAPE_UNSAFE, "//www.winehq.org/tests"}, + {"//www.winehq.org/tests/foo bar", URL_ESCAPE_SPACES_ONLY | URL_DONT_ESCAPE_EXTRA_INFO, "//www.winehq.org/tests/foo%20bar"}, + {"//www.winehq.org/tests/foo%20bar", 0, "//www.winehq.org/tests/foo%20bar"}, + {"//www.winehq.org/tests/foo%20bar", URL_UNESCAPE, "//www.winehq.org/tests/foo bar"}, + {"//www.winehq.org/%E6%A1%9C.html", 0, "//www.winehq.org/%E6%A1%9C.html"}, + }; + + static const struct canonicalize_test opaque_tests[] = { + /* Opaque protocols, predictably, do not modify the portion after the + * scheme. 
*/ + {"//a/b/./c/../d\e", 0, "//a/b/./c/../d\e"}, + {"/a/b/./c/../d\e", 0, "/a/b/./c/../d\e"}, + {"a/b/./c/../d\e", 0, "a/b/./c/../d\e"}, {"", 0, ""}, - {"http://www.winehq.org/tests/../tests/../..", 0, "http://www.winehq.org/"}, - {"http://www.winehq.org/..", 0, "http://www.winehq.org/.."}, - {"http://www.winehq.org/tests/tests2/../../tests", 0, "http://www.winehq.org/tests"}, - {"http://www.winehq.org/tests/../tests", 0, "http://www.winehq.org/tests"}, - {"http://www.winehq.org/tests\n", URL_WININET_COMPATIBILITY|URL_ESCAPE_SPACES_ONLY|URL_ESCAPE_UNSAFE, "http://www.winehq.org/tests"}, - {"http://www.winehq.org/tests\r", URL_WININET_COMPATIBILITY|URL_ESCAPE_SPACES_ONLY|URL_ESCAPE_UNSAFE, "http://www.winehq.org/tests"}, - {"http://www.winehq.org/tests\r", 0, "http://www.winehq.org/tests"}, - {"http://www.winehq.org/tests\r", URL_DONT_SIMPLIFY, "http://www.winehq.org/tests"}, - {"http://www.winehq.org/tests/../tests/", 0, "http://www.winehq.org/tests/"}, - {"http://www.winehq.org/tests/../tests/..", 0, "http://www.winehq.org/"}, - {"http://www.winehq.org/tests/../tests/../", 0, "http://www.winehq.org/"}, - {"http://www.winehq.org/tests/..", 0, "http://www.winehq.org/"}, - {"http://www.winehq.org/tests/../", 0, "http://www.winehq.org/"}, - {"http://www.winehq.org/tests/..?query=x&return=y", 0, "http://www.winehq.org/?query=x&return=y"}, - {"http://www.winehq.org/tests/../?query=x&return=y", 0, "http://www.winehq.org/?query=x&return=y"}, - {"\tht\ttp\t://www\t.w\tineh\t\tq.or\tg\t/\ttests/..\t?\tquer\ty=x\t\t&re\tturn=y\t\t", 0, "http://www.winehq.org/?query=x&return=y"}, - {"http://www.winehq.org/tests/..#example", 0, "http://www.winehq.org/#example"}, - {"http://www.winehq.org/tests/../#example", 0, "http://www.winehq.org/#example"}, - {"http://www.winehq.org/tests\\../#example", 0, "http://www.winehq.org/#example"}, - {"http://www.winehq.org/tests/..\\#example", 0, 
"http://www.winehq.org/#example"}, - {"http://www.winehq.org\\tests/../#example", 0, "http://www.winehq.org/#example"}, - {"http://www.winehq.org/tests/../#example", URL_DONT_SIMPLIFY, "http://www.winehq.org/tests/../#example"}, - {"http://www.winehq.org/tests/foo bar", URL_ESCAPE_SPACES_ONLY | URL_DONT_ESCAPE_EXTRA_INFO, "http://www.winehq.org/tests/foo%20bar"}, - {"http://www.winehq.org/tests/foo%20bar", URL_UNESCAPE, "http://www.winehq.org/tests/foo bar"}, - {"http://www.winehq.org", 0, "http://www.winehq.org/"}, - {"http:///www.winehq.org", 0, "http:///www.winehq.org"}, - {"http:////www.winehq.org", 0, "http:////www.winehq.org"}, + {"//a/b &c", 0, "//a/b &c"}, + {"//a/b%20%26c", URL_UNESCAPE, "//a/b &c"}, + }; + + static const struct canonicalize_test file_tests[] = + { + /* file:// is almost identical to http://, except that a URL beginning + * with file://// (four or more slashes) is stripped down to two + * slashes. The first non-empty element is interpreted as a hostname; + * and the rest follows the usual rules. + * + * The intent here is probably to detect UNC paths, although it's + * unclear why an arbitrary number of slashes are skipped in that case. 
+ */ + + {"file://", 0, "file:///"}, + {"file://a", 0, "file://a/"}, + {"file://a/", 0, "file://a/"}, + {"file://a//", 0, "file://a//"}, + {"file://a/b", 0, "file://a/b"}, + {"file://a/b/", 0, "file://a/b/"}, + {"file://.", 0, "file://./"}, + {"file://./", 0, "file://./"}, + {"file://././a/.", 0, "file://././a/."}, + {"file://a/.", 0, "file://a/."}, + {"file://a/./b/./../", 0, "file://a/./b/./../"}, + {"file://a/b/.", 0, "file://a/b/"}, + {"file://a/b/.", URL_DONT_SIMPLIFY, "file://a/b/."}, + {"file://a/b/./", 0, "file://a/b/"}, + {"file://a/b/./c", 0, "file://a/b/c"}, + {"file:///./a", 0, "file:///./a"}, + {"file:////./a", 0, "file://./a"}, + + {"file://..", 0, "file://../"}, + {"file://../", 0, "file://../"}, + {"file://../a", 0, "file://../a"}, + {"file://../a/..", 0, "file://../"}, + {"file://../a/../..", 0, "file://../"}, + {"file://./a/../..", 0, "file://./"}, + {"file://a/../", 0, "file://a/../"}, + {"file://a/../../b/./../", 0, "file://a/../../b/./../"}, + {"file://a/.././", 0, "file://a/.././"}, + {"file://a/b/..", 0, "file://a/"}, + {"file://a/b/../", 0, "file://a/"}, + {"file://a/b/.././", 0, "file://a/"}, + {"file://a/b/../c", 0, "file://a/c"}, + {"file://a/b/../c/..", 0, "file://a/"}, + {"file://a/b/../c/../..", 0, "file://a/"}, + {"file://a/b/../../../c", 0, "file://a/c"}, + {"file:///.", 0, "file:///."}, + {"file:///..", 0, "file:///.."}, + {"file:///a/.", 0, "file:///a/"}, + + {"file:////", 0, "file:///"}, + {"file:////a/./b/../c", 0, "file://a/./b/../c"}, + {"file://///a/./b/../c", 0, "file://a/./b/../c"}, + {"file://////a/./b/../c", 0, "file://a/./b/../c"}, + {"file:////a/b/./../c", 0, "file://a/c"}, + {"file:////a/b/./../..", 0, "file://a/"}, + {"file://///a/b/./../c", 0, "file://a/c"}, + {"file://////a/b/./../c", 0, "file://a/c"}, + {"file:////.", 0, "file://./"}, + {"file:////..", 0, "file://../"}, + {"file:////./b/./../c", 0, "file://./c"}, + {"file:////./b/./../..", 0, "file://./"}, + {"file://///./b/./../c", 0, "file://./c"}, + 
{"file://////./b/./../c", 0, "file://./c"}, + + /* Drive-like paths get an extra slash (i.e. an empty hostname, to + * signal that the host is the local machine). The drive letter is + * treated as the path root. */ + {"file://a:", 0, "file:///a:"}, + {"file://a:/b", 0, "file:///a:/b"}, + {"file://a:/b/../..", 0, "file:///a:/"}, + {"file://a:/./../..", 0, "file:///a:/./../.."}, + {"file://a|/b", 0, "file:///a|/b"}, + {"file://ab:/c", 0, "file://ab:/c"}, + {"file:///a:", 0, "file:///a:"}, + {"file:////a:", 0, "file:///a:"}, + {"file://///a:", 0, "file:///a:"}, + {"file://host/a:/b/../..", 0, "file://host/a:/"}, + + /* URL_FILE_USE_PATHURL (and URL_WININET_COMPATIBILITY) have their own + * set of rules: + * + * (1) Dot processing works exactly like the "unknown scheme" rules, + * instead of the file/http rules demonstrated above. + * + * (2) Some number of backslashes is appended after the two forward + * slashes. The number basically corresponds to the detected path + * type (two for a remote path, one for a local path, none for a + * local drive path). A local path is one where the hostname is + * empty or "localhost". If all path elements are empty then no + * backslashes are appended. 
+ */ + + {"file://", URL_FILE_USE_PATHURL, "file://"}, + {"file://a", URL_FILE_USE_PATHURL, "file://\\a"}, + {"file://a/", URL_FILE_USE_PATHURL, "file://\\a"}, + {"file://a//", URL_FILE_USE_PATHURL, "file://\\a\\"}, + {"file://a/b", URL_FILE_USE_PATHURL, "file://\\a\b"}, + {"file://a//b", URL_FILE_USE_PATHURL, "file://\\a\\b"}, + {"file://a/.", URL_FILE_USE_PATHURL, "file://\\a\"}, + {"file://a/../../b/./c/..", URL_FILE_USE_PATHURL, "file://\\a\..\..\b\"}, + {"file://./../../b/./c/..", URL_FILE_USE_PATHURL, "file://\\.\..\..\b\"}, + {"file://../../../b/./c/..", URL_FILE_USE_PATHURL, "file://\\..\..\..\b\"}, + {"file://a/b/.", URL_FILE_USE_PATHURL, "file://\\a\b\"}, + {"file://a/b/.", URL_FILE_USE_PATHURL | URL_DONT_SIMPLIFY, "file://\\a\b\.\"}, + {"file://a/b/../../../c", URL_FILE_USE_PATHURL, "file://\\a\..\..\c"}, + + {"file:///", URL_FILE_USE_PATHURL, "file://"}, + {"file:///.", URL_FILE_USE_PATHURL, "file://\"}, + {"file:///..", URL_FILE_USE_PATHURL, "file://\..\"}, + {"file:///../../b/./c/..", URL_FILE_USE_PATHURL, "file://\..\..\b\"}, + {"file:///a/b/./c/..", URL_FILE_USE_PATHURL, "file://\a\b\"}, + + {"file:////", URL_FILE_USE_PATHURL, "file://"}, + {"file:////.", URL_FILE_USE_PATHURL, "file://\\."}, + {"file:////a/./b/../c", URL_FILE_USE_PATHURL, "file://\\a\c"}, + {"file://///", URL_FILE_USE_PATHURL, "file://"}, + {"file://///a/./b/../c", URL_FILE_USE_PATHURL, "file://\\a\c"}, + + {"file://a:", URL_FILE_USE_PATHURL, "file://a:"}, + {"file://a:/", URL_FILE_USE_PATHURL, "file://a:\"}, + {"file://a:/b", URL_FILE_USE_PATHURL, "file://a:\b"}, + {"file://a:/b/../..", URL_FILE_USE_PATHURL, "file://a:\..\"}, + {"file://a|/b", URL_FILE_USE_PATHURL, "file://a|\b"}, + {"file:///a:", URL_FILE_USE_PATHURL, "file://a:"}, + {"file:////a:", URL_FILE_USE_PATHURL, "file://a:"}, + + /* URL_WININET_COMPATIBILITY is almost identical, but ensures a trailing + * backslash in two cases: + * + * (1) if all path elements are empty, + * + * (2) if the path consists of just the 
hostname. + */ + + {"file://", URL_WININET_COMPATIBILITY, "file://\"}, + {"file://a", URL_WININET_COMPATIBILITY, "file://\\a\"}, + {"file://a/", URL_WININET_COMPATIBILITY, "file://\\a\"}, + {"file://a//", URL_WININET_COMPATIBILITY, "file://\\a\\"}, + {"file://a/b", URL_WININET_COMPATIBILITY, "file://\\a\b"}, + {"file://a//b", URL_WININET_COMPATIBILITY, "file://\\a\\b"}, + {"file://a/.", URL_WININET_COMPATIBILITY, "file://\\a\"}, + {"file://a/../../b/./c/..", URL_WININET_COMPATIBILITY, "file://\\a\..\..\b\"}, + {"file://./../../b/./c/..", URL_WININET_COMPATIBILITY, "file://\\.\..\..\b\"}, + {"file://../../../b/./c/..", URL_WININET_COMPATIBILITY, "file://\\..\..\..\b\"}, + {"file://a/b/../../../c", URL_WININET_COMPATIBILITY, "file://\\a\..\..\c"}, + + {"file:///", URL_WININET_COMPATIBILITY, "file://\"}, + {"file:///.", URL_WININET_COMPATIBILITY, "file://\"}, + {"file:///..", URL_WININET_COMPATIBILITY, "file://\..\"}, + {"file:///../../b/./c/..", URL_WININET_COMPATIBILITY, "file://\..\..\b\"}, + {"file:///a/b/./c/..", URL_WININET_COMPATIBILITY, "file://\a\b\"}, + + {"file:////", URL_WININET_COMPATIBILITY, "file://\"}, + {"file:////.", URL_WININET_COMPATIBILITY, "file://\\.\"}, + {"file:////a/./b/../c", URL_WININET_COMPATIBILITY, "file://\\a\c"}, + {"file://///", URL_WININET_COMPATIBILITY, "file://\"}, + {"file://///a/./b/../c", URL_WININET_COMPATIBILITY, "file://\\a\c"}, + + {"file://a:", URL_WININET_COMPATIBILITY, "file://a:"}, + + {"file://", URL_FILE_USE_PATHURL | URL_WININET_COMPATIBILITY, "file://"}, + + {"file://localhost/a", 0, "file://localhost/a"}, + {"file://localhost//a", 0, "file://localhost//a"}, + {"file://localhost/a:", 0, "file://localhost/a:"}, + {"file://localhost/a:/b/../..", 0, "file://localhost/a:/"}, + {"file://localhost/a:/./../..", 0, "file://localhost/a:/./../.."}, + {"file://localhost", URL_FILE_USE_PATHURL, "file://"}, + {"file://localhost/", URL_FILE_USE_PATHURL, "file://"}, + {"file://localhost/b", URL_FILE_USE_PATHURL, "file://\b"}, + 
{"file://127.0.0.1/b", URL_FILE_USE_PATHURL, "file://\\127.0.0.1\b"}, + {"file://localhost//b", URL_FILE_USE_PATHURL, "file://\b"}, + {"file://localhost///b", URL_FILE_USE_PATHURL, "file://\\b"}, + {"file:///localhost/b", URL_FILE_USE_PATHURL, "file://\localhost\b"}, + {"file:////localhost/b", URL_FILE_USE_PATHURL, "file://\b"}, + {"file://///localhost/b", URL_FILE_USE_PATHURL, "file://\b"}, + {"file://localhost/a:", URL_FILE_USE_PATHURL, "file://a:"}, + {"file://localhost/a:/b/../..", URL_FILE_USE_PATHURL, "file://a:\..\"}, + {"file://localhost?a/b", URL_FILE_USE_PATHURL, "file://"}, + {"file://localhost#a/b", URL_FILE_USE_PATHURL, "file://\\localhost#a/b"}, + {"file://localhostq", URL_FILE_USE_PATHURL, "file://\\localhostq"}, + + {"file://localhost", URL_WININET_COMPATIBILITY, "file://\"}, + {"file://localhost/", URL_WININET_COMPATIBILITY, "file://\"}, + {"file://localhost/b", URL_WININET_COMPATIBILITY, "file://\b"}, + {"file://localhost//b", URL_WININET_COMPATIBILITY, "file://\b"}, + {"file://127.0.0.1/b", URL_WININET_COMPATIBILITY, "file://\\127.0.0.1\b"}, + {"file://localhost/a:", URL_WININET_COMPATIBILITY, "file://a:"}, + {"file://localhost?a/b", URL_WININET_COMPATIBILITY, "file://\?a/b"}, + {"file://localhost#a/b", URL_WININET_COMPATIBILITY, "file://\\localhost#a/b"}, + + /* # has some weird behaviour: + * + * - Dot processing happens normally after it, including rewinding past + * the #. It's not treated as a path separator for the purposes of + * rewinding. + * + * - However, if neither file flag is used, and the first character + * after the hostname (plus an optional slash) is a hash, no dot + * processing takes place. + * + * - If the previous path segment ends in .htm or .html, the rest of + * the URL is emitted verbatim (no dot or slash canonicalization). + * This does not apply to the hostname. If URL_FILE_USE_PATHURL is + * used, though, the rest of the URL including the # is omitted. 
+ * + * - It is treated as a path terminator for dots, but only if neither + * file flag is used. It does not begin a path element. + * + * - If there is a # anywhere in the output string (and the string + * doesn't fall under the .html exception), all subsequent slashes + * are converted to forward slashes instead of backslashes. + * This means that rewinding past the hash will revert to backslashes. + * (This of course only affects the case where file flags are used; + * if no file flags are used then slashes are converted to forward + * slashes anyway.) + */ + {"file://a/b#c/../d\e", 0, "file://a/d/e"}, + {"file://a/b#c/./d\e", 0, "file://a/b#c/d/e"}, + {"file://a/b.htm#c/../d\e", 0, "file://a/b.htm#c/../d\e"}, + {"file://a/b.html#c/../d\e", 0, "file://a/b.html#c/../d\e"}, + {"file://a/b.hTmL#c/../d\e", 0, "file://a/b.hTmL#c/../d\e"}, + {"file://a/b.xhtml#c/../d\e", 0, "file://a/d/e"}, + {"file://a/b.php#c/../d\e", 0, "file://a/d/e"}, + {"file://a/b.asp#c/../d\e", 0, "file://a/d/e"}, + {"file://a/b.aspx#c/../d\e", 0, "file://a/d/e"}, + {"file://a/b.ht#c/../d\e", 0, "file://a/d/e"}, + {"file://a/b.txt#c/../d\e", 0, "file://a/d/e"}, + {"file://a/b.htmlq#c/../d\e", 0, "file://a/d/e"}, + {"file://a/b.html/q#c/../d\e", 0, "file://a/b.html/d/e"}, + {"file://a/.html#c/../d\e", 0, "file://a/.html#c/../d\e"}, + {"file://a/html#c/../d\e", 0, "file://a/d/e"}, + {"file://a/b#c/./d.html#e/../f", 0, "file://a/b#c/d.html#e/../f"}, + {"file://a.html#/b/../c", 0, "file://a.html#/c"}, + {"file://a/b#c/../d/e", URL_FILE_USE_PATHURL, "file://\\a\d\e"}, + {"file://a/b#c/./d/e", URL_FILE_USE_PATHURL, "file://\\a\b#c/d/e"}, + {"file://a/b.html#c/../d\e", URL_FILE_USE_PATHURL, "file://\\a\b.html"}, + {"file://a/b.html#c/../d\e", URL_FILE_USE_PATHURL | URL_WININET_COMPATIBILITY, "file://\\a\b.html"}, + {"file://a/b#c/../d/e", URL_WININET_COMPATIBILITY, "file://\\a\d\e"}, + {"file://a/b#c/./d/e", URL_WININET_COMPATIBILITY, "file://\\a\b#c/d/e"}, + {"file://a/b.html#c/../d\e", 
URL_WININET_COMPATIBILITY, "file://\\a\b.html#c/../d\e"}, + {"file://a/c#/../d", 0, "file://a/d"}, + {"file://a/c#/../d", URL_FILE_USE_PATHURL, "file://\\a\d"}, + {"file://a/c#/../d", URL_WININET_COMPATIBILITY, "file://\\a\d"}, + {"file://a/#c/../d\e", 0, "file://a/#c/../d\e"}, + {"file://a/#c/../d/e", URL_FILE_USE_PATHURL, "file://\\a\d\e"}, + {"file://a/#c/../d/e", URL_WININET_COMPATIBILITY, "file://\\a\d\e"}, + {"file://a//#c/../d", 0, "file://a//#c/../d"}, + {"file://a//#c/../d", URL_FILE_USE_PATHURL, "file://\\a\\d"}, + {"file://a//#c/../d", URL_WININET_COMPATIBILITY, "file://\\a\\d"}, + {"file://a/\#c/../d", 0, "file://a//#c/../d"}, + {"file://a///#c/../d", 0, "file://a///d"}, + {"file://a/b/#c/../d", 0, "file://a/b/d"}, + {"file://a/b/.#c", 0, "file://a/b/#c"}, + {"file://a/b/..#c", 0, "file://a/#c"}, + {"file://a/b/.#c", URL_FILE_USE_PATHURL, "file://\\a\b\.#c"}, + {"file://a/b/..#c", URL_FILE_USE_PATHURL, "file://\\a\b\..#c"}, + {"file://a/b/.#c", URL_WININET_COMPATIBILITY, "file://\\a\b\.#c"}, + {"file://a/b/..#c", URL_WININET_COMPATIBILITY, "file://\\a\b\..#c"}, + {"file://a/b#../c", 0, "file://a/b#../c"}, + {"file://a/b/#../c", 0, "file://a/b/#../c"}, + {"file://a/b#../c", URL_FILE_USE_PATHURL, "file://\\a\b#../c"}, + {"file://a/b#../c", URL_WININET_COMPATIBILITY, "file://\\a\b#../c"}, + {"file://#/b\./", 0, "file://#/b/"}, + {"file://#/./b\./", 0, "file://#/./b/./"}, + {"file://#/b\./", URL_FILE_USE_PATHURL, "file://\\#/b/"}, + {"file://#/b\./", URL_WININET_COMPATIBILITY, "file://\\#/b/"}, + {"file://a#/b\./", 0, "file://a#/b/"}, + {"file://a#/./b\./", 0, "file://a#/./b/./"}, + {"file://a#/b\./", URL_FILE_USE_PATHURL, "file://\\a#/b/"}, + {"file://a#/b\./", URL_WININET_COMPATIBILITY, "file://\\a#/b/"}, + {"file://a#/b\./", URL_FILE_USE_PATHURL | URL_DONT_SIMPLIFY, "file://\\a#/b/./"}, + {"file://a#/b\.", URL_FILE_USE_PATHURL | URL_DONT_SIMPLIFY, "file://\\a#/b/./"}, + {"file://a#/b/../../", 0, "file://a#/"}, + + /* ? 
is similar, with the following exceptions: + * + * - URLs ending in .htm(l) are not treated specially. + * + * - With URL_FILE_USE_PATHURL, the rest of the URL including the ? is + * just omitted (much like the .html case above). + * + * - With URL_WININET_COMPATIBILITY, the rest of the URL is always + * emitted verbatim (completely opaque, like other schemes). + */ + + {"file://a/b?c/../d\e", 0, "file://a/d/e"}, + {"file://a/b.html?c/../d\e", 0, "file://a/d/e"}, + {"file://a/b?c/../d\e", URL_FILE_USE_PATHURL, "file://\\a\b"}, + {"file://a/b.html?c/../d\e", URL_FILE_USE_PATHURL, "file://\\a\b.html"}, + {"file://a/b?c/../d\e", URL_WININET_COMPATIBILITY, "file://\\a\b?c/../d\e"}, + {"file://a/b.html?c/../d\e", URL_WININET_COMPATIBILITY, "file://\\a\b.html?c/../d\e"}, + {"file://a/b?c/../d", URL_FILE_USE_PATHURL | URL_WININET_COMPATIBILITY, "file://\\a\b"}, + {"file://a/?c/../d", 0, "file://a/?c/../d"}, + {"file://a/?c/../d", URL_FILE_USE_PATHURL, "file://\\a"}, + {"file://a/?c/../d", URL_WININET_COMPATIBILITY, "file://\\a\?c/../d"}, + {"file://a//?c/../d", 0, "file://a//?c/../d"}, + {"file://a//?c/../d", URL_FILE_USE_PATHURL, "file://\\a\\"}, + {"file://a//?c/../d", URL_WININET_COMPATIBILITY, "file://\\a\\?c/../d"}, + {"file://a/\?c/../d", 0, "file://a//?c/../d"}, + {"file://a///?c/../d", 0, "file://a///d"}, + {"file://a/b/?c/../d", 0, "file://a/b/d"}, + {"file://a/b/.?c", 0, "file://a/b/?c"}, + {"file://a/b/..?c", 0, "file://a/?c"}, + {"file://a/b/.?c", URL_FILE_USE_PATHURL, "file://\\a\b\"}, + {"file://a/b/..?c", URL_FILE_USE_PATHURL, "file://\\a\"}, + {"file://a/b/.?c", URL_WININET_COMPATIBILITY, "file://\\a\b\?c"}, + {"file://a/b/..?c", URL_WININET_COMPATIBILITY, "file://\\a\?c"}, + {"file://?/a\./", 0, "file://?/a/"}, + {"file://?/./a\./", 0, "file://?/./a/./"}, + {"file://?/a\./", URL_FILE_USE_PATHURL, "file://"}, + {"file://?/a\./", URL_WININET_COMPATIBILITY, "file://\?/a\./"}, + {"file://a?/a\./", 0, "file://a?/a/"}, + {"file://a?/./a\./", 0, 
"file://a?/./a/./"}, + {"file://a?/a\./", URL_FILE_USE_PATHURL, "file://\\a"}, + {"file://a?/a\./", URL_WININET_COMPATIBILITY, "file://\\a\?/a\./"}, + + {"file://a/b.html?c#d/..", 0, "file://a/"}, + {"file://a/b.html?c.html#d/..", 0, "file://a/b.html?c.html#d/.."}, + {"file://a/b?\#c\d", 0, "file://a/b?/#c/d"}, + {"file://a/b?\#c\d", URL_WININET_COMPATIBILITY, "file://\\a\b?\#c\d"}, + {"file://a/b?\#c\d", URL_FILE_USE_PATHURL, "file://\\a\b"}, + {"file://a/b#\?c\d", 0, "file://a/b#/?c/d"}, + {"file://a/b#\?c\d", URL_WININET_COMPATIBILITY, "file://\\a\b#/?c\d"}, + {"file://a/b#\?c\d", URL_FILE_USE_PATHURL, "file://\\a\b#/"}, + {"file://a/b.html#c?d", URL_WININET_COMPATIBILITY, "file://\\a\b.html?d#c"}, + + /* file: treats backslashes like forward slashes, including the + * initial pair. */ + {"file://a/b\", 0, "file://a/b/"}, + {"file://a/b\c/../.\", 0, "file://a/b/"}, + {"file://a\b", 0, "file://a/b"}, + {"file:/\a\..", 0, "file://a/.."}, + {"file:\/a\..", 0, "file://a/.."}, + {"file:\\a\b", URL_FILE_USE_PATHURL, "file://\\a\b"}, + {"file:\\a\b", URL_WININET_COMPATIBILITY, "file://\\a\b"}, + {"file:\///a/./b/../c", 0, "file://a/./b/../c"}, + {"file:/\//a/./b/../c", 0, "file://a/./b/../c"}, + {"file://\/a/./b/../c", 0, "file://a/./b/../c"}, + {"file:///\a/./b/../c", 0, "file://a/./b/../c"}, + + {"file://a/b &c", 0, "file://a/b &c"}, + {"file://a/b &c", URL_FILE_USE_PATHURL, "file://\\a\b &c"}, + {"file://a/b &c", URL_WININET_COMPATIBILITY, "file://\\a\b &c"}, + {"file://a/b !"$%&'()*+,-:;<=>@[]^_`{|}~c", URL_ESCAPE_UNSAFE, "file://a/b%20!%22$%%26'()*+,-:;%3C=%3E@%5B%5D%5E_%60%7B%7C%7D~c"}, + {"file://a/b%20%26c", 0, "file://a/b%20%26c"}, + {"file://a/b%20%26c", URL_FILE_USE_PATHURL, "file://\\a\b &c"}, + {"file://a/b%20%26c", URL_WININET_COMPATIBILITY, "file://\\a\b%20%26c"}, + + /* Omitting one slash behaves as if the URL had been written with an + * empty hostname, and the output adds two slashes as such. 
*/ + + {"file:/", 0, "file:///"}, + {"file:/a", 0, "file:///a"}, + {"file:/./a", 0, "file:///./a"}, + {"file:/../a/..", 0, "file:///../a/.."}, + {"file:/./..", 0, "file:///./.."}, + {"file:/a/.", 0, "file:///a/"}, + {"file:/a/../..", 0, "file:///"}, + {"file:/a:", 0, "file:///a:"}, + {"file:/a:/b/../..", 0, "file:///a:/"}, + + /* The same applies to the flags. */ + + {"file:/", URL_FILE_USE_PATHURL, "file://"}, + {"file:/a", URL_FILE_USE_PATHURL, "file://\a"}, + {"file:/.", URL_FILE_USE_PATHURL, "file://\"}, + {"file:/./a", URL_FILE_USE_PATHURL, "file://\a"}, + {"file:/../a", URL_FILE_USE_PATHURL, "file://\..\a"}, + {"file:/a/../..", URL_FILE_USE_PATHURL, "file://\..\"}, + {"file:/a/.", URL_FILE_USE_PATHURL | URL_DONT_SIMPLIFY, "file://\a\.\"}, + {"file:/a:", URL_FILE_USE_PATHURL, "file://a:"}, + {"file:/a:/b/../..", URL_FILE_USE_PATHURL, "file://a:\..\"}, + + {"file:/", URL_WININET_COMPATIBILITY, "file://\"}, + {"file:/a", URL_WININET_COMPATIBILITY, "file://\a"}, + {"file:/.", URL_WININET_COMPATIBILITY, "file://\"}, + {"file:/a:", URL_WININET_COMPATIBILITY, "file://a:"}, + + {"file:/a/b#c/../d", 0, "file:///a/d"}, + {"file:/a/b?c/../d", 0, "file:///a/d"}, + + {"file:/a\b\", 0, "file:///a/b/"}, + {"file:\a/b/", 0, "file:///a/b/"}, + {"file:\a\b", URL_FILE_USE_PATHURL, "file://\a\b"}, + {"file:\a\b", URL_WININET_COMPATIBILITY, "file://\a\b"}, + + {"file:/a/b &c", 0, "file:///a/b &c"}, + + /* Omitting both slashes causes all dots to be collapsed, in the same + * way as bare http. */ + + {"file:a", 0, "file:a"}, + {"file:a/", 0, "file:a/"}, + {"file:a/.", 0, "file:a/"}, + {"file:a/..", 0, "file:"}, + {"file:a/../..", 0, "file:"}, + {"file:", 0, "file:"}, + {"file:.", 0, "file:"}, + {"file:..", 0, "file:"}, + {"file:./", 0, "file:"}, + {"file:../", 0, "file:"}, + + {"file:a:", 0, "file:///a:"}, + + /* URL_FILE_USE_PATHURL treats everything here as a local (relative?) + * path. 
In the case that the path resolves to the current directory + * a single backslash is emitted. */ + {"file:", URL_FILE_USE_PATHURL, "file://"}, + {"file:a", URL_FILE_USE_PATHURL, "file://a"}, + {"file:a/.", URL_FILE_USE_PATHURL, "file://a\"}, + {"file:a/../..", URL_FILE_USE_PATHURL, "file://..\"}, + {"file:./a", URL_FILE_USE_PATHURL, "file://a"}, + {"file:../a", URL_FILE_USE_PATHURL, "file://..\a"}, + {"file:a/.", URL_FILE_USE_PATHURL | URL_DONT_SIMPLIFY, "file://a\.\"}, + {"file:a:", URL_FILE_USE_PATHURL, "file://a:"}, + + /* URL_WININET_COMPATIBILITY doesn't emit a double slash. */ + {"file:", URL_WININET_COMPATIBILITY, "file:"}, + {"file:a", URL_WININET_COMPATIBILITY, "file:a"}, + {"file:./a", URL_WININET_COMPATIBILITY, "file:a"}, + {"file:../a", URL_WININET_COMPATIBILITY, "file:..\a"}, + {"file:../b/./c/../d", URL_WININET_COMPATIBILITY | URL_DONT_SIMPLIFY, "file:..\b\.\c\..\d"}, + {"file:a:", URL_WININET_COMPATIBILITY, "file://a:"}, + + {"file:a/b?c/../d", 0, "file:a/d"}, + {"file:a/b#c/../d", 0, "file:a/d"}, + + {"file:a\b\", 0, "file:a/b/"}, + + {"file:a/b &c", 0, "file:a/b &c"}, + + {"fIlE://A/B", 0, "file://A/B"}, + {"fIlE://A/B", URL_FILE_USE_PATHURL, "file://\\A\B"}, + {"fIlE://A/B", URL_WININET_COMPATIBILITY, "file://\\A\B"}, + {"fIlE:A:/B", 0, "file:///A:/B"}, + {"fIlE:A:/B", URL_FILE_USE_PATHURL, "file://A:\B"}, + {"fIlE:A:/B", URL_WININET_COMPATIBILITY, "file://A:\B"}, + {"fIlE://lOcAlHoSt/B", 0, "file://lOcAlHoSt/B"}, + {"fIlE://lOcAlHoSt/B", URL_FILE_USE_PATHURL, "file://\B"}, + + /* Drive paths are automatically converted to file paths. Dots are + * collapsed unless the first segment after q: or q:/ is a dot. 
*/ + + {"q:a", 0, "file:///q:a"}, + {"q:a/.", 0, "file:///q:a/"}, + {"q:a/..", 0, "file:///q:"}, + {"q:a/../..", 0, "file:///q:"}, + {"q:./a/..", 0, "file:///q:./a/.."}, + {"q:../a/..", 0, "file:///q:../a/.."}, + {"q:/", 0, "file:///q:/"}, + {"q:/a", 0, "file:///q:/a"}, + {"q:/a/.", 0, "file:///q:/a/"}, + {"q:/a/..", 0, "file:///q:/"}, + {"q:/./a/..", 0, "file:///q:/./a/.."}, + {"q:/../a/..", 0, "file:///q:/../a/.."}, + {"q://./a", 0, "file:///q://a"}, + {"q://../a", 0, "file:///q:/a"}, + + /* File flags use the "unknown scheme" rules, and the root of the path + * is the first slash. */ + + {"q:/a", URL_FILE_USE_PATHURL, "file://q:\a"}, + {"q:/a/../..", URL_FILE_USE_PATHURL, "file://q:\..\"}, + {"q:a/../../b/..", URL_FILE_USE_PATHURL, "file://q:a\..\..\"}, + {"q:./../../b/..", URL_FILE_USE_PATHURL, "file://q:.\..\..\"}, + {"q:/a", URL_WININET_COMPATIBILITY, "file://q:\a"}, + {"q:/a/../..", URL_WININET_COMPATIBILITY, "file://q:\..\"}, + {"q:a/../../b/..", URL_WININET_COMPATIBILITY, "file://q:a\..\..\"}, + {"q:./../../b/..", URL_WININET_COMPATIBILITY, "file://q:.\..\..\"}, + + {"q:/a/b?c/../d", 0, "file:///q:/a/d"}, + {"q:/a/b#c/../d", 0, "file:///q:/a/d"}, + {"q:a?b", URL_FILE_USE_PATHURL, "file://q:a"}, + + {"q:a\b\", 0, "file:///q:a/b/"}, + {"q:\a/b", 0, "file:///q:/a/b"}, + + /* Drive paths are also unique in that unsafe characters (and spaces) + * are automatically escaped—but not if the file flags are used. 
*/ + + {"q:/a/b !"$%&'()*+,-:;<=>@[]^_`{|}~c", 0, "file:///q:/a/b%20!%22$%25%26'()*+,-:;%3C=%3E@%5B%5D%5E_%60%7B%7C%7D~c"}, + {"q:/a/b &c", URL_FILE_USE_PATHURL, "file://q:\a\b &c"}, + {"q:/a/b &c", URL_WININET_COMPATIBILITY, "file://q:\a\b &c"}, + + {"q:/a/b%20%26c", 0, "file:///q:/a/b%2520%2526c"}, + {"q:/a/b%20%26c", URL_UNESCAPE, "file:///q:/a/b &c"}, + {"q:/a/b%20%26c", URL_UNESCAPE | URL_ESCAPE_UNSAFE, "file:///q:/a/b%20%26c"}, + {"q:/a/b%20%26c", URL_FILE_USE_PATHURL, "file://q:\a\b &c"}, + {"q:/a/b%20%26c", URL_FILE_USE_PATHURL | URL_UNESCAPE, "file://q:\a\b &c"}, + {"q:/a/b%20%26c", URL_WININET_COMPATIBILITY, "file://q:\a\b%20%26c"}, + {"q:/a/b%20%26c", URL_WININET_COMPATIBILITY | URL_UNESCAPE, "file://q:\a\b &c"}, + + {"q|a", 0, "file:///q%7Ca"}, + {"-:a", 0, "-:a"}, + {"Q:A", 0, "file:///Q:A"}, + + /* A double initial backslash is also converted to a file path. The same + * rules for hostnames apply. */ + + {"\\", 0, "file:///"}, + {"\\a", 0, "file://a/"}, + {"\\../a\b/..\c/.\", 0, "file://../a/c/"}, + {"\\a/./b/../c", 0, "file://a/./b/../c"}, + /* And, of course, four or more slashes gets collapsed... */ + {"\\//./b/./../c", 0, "file://./c"}, + {"\\///./b/./../c", 0, "file://./c"}, + + {"\\a/b?c/../d", 0, "file://a/d"}, + {"\\a/b#c/../d", 0, "file://a/d"}, + + /* Drive paths are "recognized" too, though. The following isn't + * actually a local path, but UrlCanonicalize() doesn't seem to realize + * that. */ + {"\\a:/b", 0, "file:///a:/b"}, + + {"\\", URL_FILE_USE_PATHURL, "file://"}, + {"\\a", URL_FILE_USE_PATHURL, "file://\\a"}, + {"\\a/./..", URL_FILE_USE_PATHURL, "file://\\a\..\"}, + {"\\a:/b", URL_FILE_USE_PATHURL, "file://a:\b"}, + {"\\", URL_WININET_COMPATIBILITY, "file://\"}, + {"\\a", URL_WININET_COMPATIBILITY, "file://\\a\"}, + {"\\a/./..", URL_WININET_COMPATIBILITY, "file://\\a\..\"}, + {"\\a:/b", URL_WININET_COMPATIBILITY, "file://a:\b"}, + + /* And, as with drive paths, unsafe characters are escaped. 
*/ + {"\\a/b !"$%&'()*+,-:;<=>@[]^_`{|}~c", 0, "file://a/b%20!%22$%25%26'()*+,-:;%3C=%3E@%5B%5D%5E_%60%7B%7C%7D~c"}, + {"\\a/b &c", URL_FILE_USE_PATHURL, "file://\\a\b &c"}, + {"\\a/b &c", URL_WININET_COMPATIBILITY, "file://\\a\b &c"}, + + {"\\/b", 0, "file:///b"}, + {"\\/b", URL_FILE_USE_PATHURL, "file://\b"}, + {"\\localhost/b", URL_FILE_USE_PATHURL, "file://\b"}, + {"\\127.0.0.1/b", URL_FILE_USE_PATHURL, "file://\\127.0.0.1\b"}, + {"\\localhost/b", URL_WININET_COMPATIBILITY, "file://\b"}, + {"\\127.0.0.1/b", URL_WININET_COMPATIBILITY, "file://\\127.0.0.1\b"}, + + {"\\A/B", 0, "file://A/B"}, + {"file:///c:/tests/foo%20bar", URL_UNESCAPE, "file:///c:/tests/foo bar"}, {"file:///c:/tests\foo%20bar", URL_UNESCAPE, "file:///c:/tests/foo bar"}, {"file:///c:/tests/foo%20bar", 0, "file:///c:/tests/foo%20bar"}, @@ -1027,67 +1835,188 @@ static void test_UrlCanonicalizeA(void) {"file://C:/user/file", URL_WININET_COMPATIBILITY, "file://C:\user\file"}, {"file:///C:/user/file", URL_WININET_COMPATIBILITY, "file://C:\user\file"}, {"file:////C:/user/file", URL_WININET_COMPATIBILITY, "file://C:\user\file"}, - {"http:///www.winehq.org", 0, "http:///www.winehq.org"}, - {"http:///www.winehq.org", URL_WININET_COMPATIBILITY, "http:///www.winehq.org"}, - {"http://www.winehq.org/site/about", URL_FILE_USE_PATHURL, "http://www.winehq.org/site/about"}, - {"file_://www.winehq.org/site/about", URL_FILE_USE_PATHURL, "file_://www.winehq.org/site/about"}, - {"c:\dir\file", 0, "file:///c:/dir/file"}, - {"file:///c:\dir\file", 0, "file:///c:/dir/file"}, - {"c:dir\file", 0, "file:///c:dir/file"}, - {"c:\tests\foo bar", URL_FILE_USE_PATHURL, "file://c:\tests\foo bar"}, - {"c:\tests\foo bar", 0, "file:///c:/tests/foo%20bar"}, - {"c\t:\t\te\tsts\fo\to \tbar\t", 0, "file:///c:/tests/foo%20bar"}, - {"res://file", 0, "res://file/"}, - {"res://file", URL_FILE_USE_PATHURL, "res://file/"}, - {"res:///c:/tests/foo%20bar", URL_UNESCAPE, "res:///c:/tests/foo bar"}, - {"res:///c:/tests\foo%20bar", 
URL_UNESCAPE, "res:///c:/tests\foo bar"}, - {"res:///c:/tests/foo%20bar", 0, "res:///c:/tests/foo%20bar"}, - {"res:///c:/tests/foo%20bar", URL_FILE_USE_PATHURL, "res:///c:/tests/foo%20bar"}, - {"res://c:/tests/../tests/foo%20bar", URL_FILE_USE_PATHURL, "res://c:/tests/foo%20bar"}, - {"res://c:/tests\../tests/foo%20bar", URL_FILE_USE_PATHURL, "res://c:/tests/foo%20bar"}, - {"res://c:/tests/foo%20bar", URL_FILE_USE_PATHURL, "res://c:/tests/foo%20bar"}, - {"res:///c://tests/foo%20bar", URL_FILE_USE_PATHURL, "res:///c://tests/foo%20bar"}, - {"res:///c:\tests\foo bar", 0, "res:///c:\tests\foo bar"}, - {"res:///c:\tests\foo bar", URL_DONT_SIMPLIFY, "res:///c:\tests\foo bar"}, - {"res://c:\tests\foo bar/res", URL_FILE_USE_PATHURL, "res://c:\tests\foo bar/res"}, - {"res://c:\tests/res\foo%20bar/strange\sth", 0, "res://c:\tests/res\foo%20bar/strange\sth"}, - {"res://c:\tests/res\foo%20bar/strange\sth", URL_FILE_USE_PATHURL, "res://c:\tests/res\foo%20bar/strange\sth"}, - {"res://c:\tests/res\foo%20bar/strange\sth", URL_UNESCAPE, "res://c:\tests/res\foo bar/strange\sth"}, - {"/A/../B/./C/../../test_remove_dot_segments", 0, "/test_remove_dot_segments"}, - {"/A/../B/./C/../../test_remove_dot_segments", URL_FILE_USE_PATHURL, "/test_remove_dot_segments"}, - {"/A/../B/./C/../../test_remove_dot_segments", URL_WININET_COMPATIBILITY, "/test_remove_dot_segments"}, - {"/A/B\C/D\E", 0, "/A/B\C/D\E"}, - {"/A/B\C/D\E", URL_FILE_USE_PATHURL, "/A/B\C/D\E"}, - {"/A/B\C/D\E", URL_WININET_COMPATIBILITY, "/A/B\C/D\E"}, - {"///A/../B", 0, "///B"}, - {"///A/../B", URL_FILE_USE_PATHURL, "///B"}, - {"///A/../B", URL_WININET_COMPATIBILITY, "///B"}, - {"A", 0, "A"}, - {"../A", 0, "../A"}, - {"A/../B", 0, "B"}, - {"/uri-res/N2R?urn:sha1:B3K", URL_DONT_ESCAPE_EXTRA_INFO | URL_WININET_COMPATIBILITY /*0x82000000*/, "/uri-res/N2R?urn:sha1:B3K"} /*LimeWire online installer calls this*/, - {"http:www.winehq.org/dir/../index.html", 0, "http:www.winehq.org/index.html"}, - {"http://localhost/test.html", 
URL_FILE_USE_PATHURL, "http://localhost/test.html"}, - {"http://localhost/te%20st.html", URL_FILE_USE_PATHURL, "http://localhost/te%20st.html"}, - {"http://www.winehq.org/%E6%A1%9C.html", URL_FILE_USE_PATHURL, "http://www.winehq.org/%E6%A1%9C.html"}, - {"mk:@MSITStore:C:\Program Files/AutoCAD 2008\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm", 0, "mk:@MSITStore:C:\Program Files/AutoCAD 2008\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm"}, - {"ftp:@MSITStore:C:\Program Files/AutoCAD 2008\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm", 0, "ftp:@MSITStore:C:\Program Files/AutoCAD 2008\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm"}, - {"file:@MSITStore:C:\Program Files/AutoCAD 2008\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm", 0, "file:@MSITStore:C:/Program Files/AutoCAD 2008/Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm"}, - {"http:@MSITStore:C:\Program Files/AutoCAD 2008\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm", 0, "http:@MSITStore:C:/Program Files/AutoCAD 2008/Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm"}, - {"http:@MSITStore:C:\Program Files/AutoCAD 2008\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm", URL_FILE_USE_PATHURL, "http:@MSITStore:C:/Program Files/AutoCAD 2008/Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm"}, - {"mk:@MSITStore:C:\Program Files/AutoCAD 2008\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm", URL_FILE_USE_PATHURL, "mk:@MSITStore:C:\Program Files/AutoCAD 2008\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm"}, };
+ static const struct canonicalize_test misc_tests[] = + { + {"", 0, ""}, + + /* If both slashes are omitted, everything afterwards is replicated + * as-is, with the exception that the final period is dropped from + * "scheme:." */ + + {"wine:.", 0, "wine:"}, + {"wine:.", URL_DONT_SIMPLIFY, "wine:."}, + {"wine:./", 0, "wine:./"}, + {"wine:..", 0, "wine:.."}, + {"wine:../", 0, "wine:../"}, + {"wine:a", 0, "wine:a"}, + {"wine:a/", 0, "wine:a/"}, + {"wine:a/b/./../c", 0, "wine:a/b/./../c"}, + + {"wine:a/b?c/./d", 0, "wine:a/b?c/./d"}, + {"wine:a/b#c/./d", 0, "wine:a/b#c/./d"}, + {"wine:a/b#c?d", 0, "wine:a/b?d#c"}, + {"wine:.#c?d", 0, "wine:?d#c"}, + + /* A backslash directly after the colon is not treated specially. */ + {"wine:\././a", 0, "wine:\././a"}, + + {"wine:a/b &c", 0, "wine:a/b &c"}, + + /* If there's no scheme or hostname, things mostly follow the "unknown + * scheme" rules, except that a would-be empty string results in a + * single slash instead. */ + + {"a", 0, "a"}, + {"a/", 0, "a/"}, + {".", 0, "/"}, + {".", URL_DONT_SIMPLIFY, "./"}, + {"./", 0, "/"}, + {"./.", 0, "/"}, + {"././a", 0, "a"}, + {"a/.", 0, "a/"}, + {"a/./", 0, "a/"}, + {"a/./b", 0, "a/b"}, + + {"..", 0, "../"}, + {"../", 0, "../"}, + {"../a", 0, "../a"}, + {"../a/..", 0, "../"}, + {"a/..", 0, "/"}, + {"a/../..", 0, "../"}, + {"a/b/..", 0, "a/"}, + {"a/b/../", 0, "a/"}, + {"a/b/../c", 0, "a/c"}, + {"a/b/../c/..", 0, "a/"}, + {"a/b/../c/../..", 0, "/"}, + + {"a/b?c/./d", 0, "a/b?c/./d"}, + {"a/b#c/./d", 0, "a/b#c/./d"}, + {"a/b#c?d", 0, "a/b?d#c"}, + {"?c", 0, "?c"}, + {".?c", 0, "/?c"}, + + {"?c/./d", 0, "?c/./d"}, + {"#c/./d", 0, "#c/./d"}, + + {"a\b/..", 0, "/"}, + + {"a/b &c", 0, "a/b &c"}, + + /* A colon by itself is not interpreted as any sort of scheme. */ + {"://../../a", 0, "a"}, + + /* mk: is another idiosyncratic scheme, although thankfully it behaves + * rather simply. 
It has no concept of a hostname; if two slashes follow + * the scheme it simply treats them as two empty path elements. */ + {"mk:", 0, "mk:"}, + {"mk:.", 0, "mk:"}, + {"mk:..", 0, "mk:"}, + {"mk:/", 0, "mk:/"}, + {"mk:/.", 0, "mk:/"}, + {"mk:/..", 0, "mk:"}, + {"mk:a", 0, "mk:a"}, + {"mk:a:", 0, "mk:a:"}, + {"mk://", 0, "mk://"}, + {"mk://.", 0, "mk://"}, + {"mk://..", 0, "mk:/"}, + {"mk://../..", 0, "mk:"}, + {"mk://../..", URL_DONT_SIMPLIFY, "mk://../.."}, + {"mk://../../..", 0, "mk:"}, + + /* Backslashes are not translated into forward slashes. They are treated + * as path separators, but in a somewhat buggy manner: only dots before + * a forward slash are collapsed, and a double dot rewinds to the + * previous forward slash. */ + {"mk:a/.\", 0, "mk:a/.\"}, + {"mk:a/.\b", 0, "mk:a/.\b"}, + {"mk:a\.\b", 0, "mk:a\.\b"}, + {"mk:a\./b", 0, "mk:a\b"}, + {"mk:a./b", 0, "mk:a./b"}, + {"mk:a\b/..\c", 0, "mk:a\b/..\c"}, + {"mk:a\b\..\c", 0, "mk:a\b\..\c"}, + {"mk:a/b\../c", 0, "mk:a/c"}, + {"mk:a\b../c", 0, "mk:a\b../c"}, + + /* Progids get a forward slash appended if there isn't one already, and + * dots don't rewind past them. Despite the fact that progids are + * supposed to end with a colon, UrlCanonicalize() considers them to + * end with the slash. + * + * If the first path segment is a dot or double dot, it's treated as + * a relative path, like http, but only before a forward slash. 
*/ + + {"mk:@", 0, "mk:@/"}, + {"mk:@progid", 0, "mk:@progid/"}, + {"mk:@progid:a", 0, "mk:@progid:a/"}, + {"mk:@progid:a/b", 0, "mk:@progid:a/b"}, + {"mk:@Progid:a/b/../..", 0, "mk:@Progid:a/"}, + {"mk:@progid/a", 0, "mk:@progid/a"}, + {"mk:@progid\a", 0, "mk:@progid\a/"}, + {"mk:@progid/a/../..", 0, "mk:@progid/"}, + {"mk:@progid/.", 0, "mk:@progid/."}, + {"mk:@progid/.?", 0, "mk:@progid/.?"}, + {"mk:@progid/./..", 0, "mk:@progid/./.."}, + {"mk:@progid/../..", 0, "mk:@progid/../.."}, + {"mk:@progid/a\.\b", 0, "mk:@progid/a\.\b"}, + {"mk:@progid/a\..\b", 0, "mk:@progid/a\..\b"}, + {"mk:@progid/.\..", 0, "mk:@progid/"}, + + {"mk:a/b?c/../d", 0, "mk:a/b?c/../d"}, + {"mk:a/b#c/../d", 0, "mk:a/b#c/../d"}, + {"mk:a/b#c?d", 0, "mk:a/b#c?d"}, + {"mk:@progid/a/b?c/../d", 0, "mk:@progid/a/b?c/../d"}, + {"mk:@progid?c/d/..", 0, "mk:@progid?c/"}, + + {"mk:a/b &c", 0, "mk:a/b &c"}, + + {"mk:@MSITStore:dir/test.chm::/file.html/..", 0, "mk:@MSITStore:dir/test.chm::/"}, + {"mk:@MSITStore:dir/test.chm::/file.html/../..", 0, "mk:@MSITStore:dir/"}, + + /* Whitespace except for plain spaces are stripped before parsing. */ + {" \t\n\rwi\t\n\rne\t\n\r:\t\n\r/\t\n\r/\t\n\r./../a/.\t\n\r./ \t\n\r", 0, "wine://./../"}, + /* Initial and final spaces and C0 control characters are also stripped, + * but not 007F or C1 control characters. */ + {" \a\t\x01 wine://./.. \x1f\n\v ", 0, "wine://./../"}, + {" wine ://./..", 0, "wine :/"}, + {" wine: //a/../b", 0, "wine: //a/../b"}, + {" wine://a/b c/.. ", 0, "wine://a/"}, + {"\x7f/\a/\v/\x01/\x1f/\x80", 0, "\x7f/\a/\v/\x01/\x1f/\x80"}, + + /* Schemes are not case-sensitive, but are flattened to lowercase. + * The hostname for http-like schemes is also flattened to lowercase + * (but not for file; see above). 
*/ + {"wInE://A/B", 0, "wine://A/B"}, + {"hTtP://A/b/../../C", 0, "http://a/C"}, + {"fTP://A/B\./C", 0, "ftp://a/B\C"}, + {"aBoUT://A/B/./", 0, "about://A/B/./"}, + {"mK://..", 0, "mk:/"}, + + /* Characters allowed in a scheme are alphanumeric, hyphen, plus, period. */ + {"0Aa+-.://./..", 0, "0aa+-.://./../"}, + {"a_://./..", 0, "a_:/"}, + {"a,://./..", 0, "a,:/"}, + + {"/uri-res/N2R?urn:sha1:B3K", URL_DONT_ESCAPE_EXTRA_INFO | URL_WININET_COMPATIBILITY, "/uri-res/N2R?urn:sha1:B3K"} /* LimeWire online installer calls this */, + {"mk:@MSITStore:C:\Program Files/AutoCAD 2008\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm", 0, + "mk:@MSITStore:C:\Program Files/AutoCAD 2008\Help/acad_acg.chm::/WSfacf1429558a55de1a7524c1004e616f8b-322b.htm"}, + }; + + static const DWORD file_flags[] = {0, URL_FILE_USE_PATHURL, URL_WININET_COMPATIBILITY}; + urllen = lstrlenA(winehqA);
/* Parameter checks */ dwSize = ARRAY_SIZE(szReturnUrl); hr = UrlCanonicalizeA(NULL, szReturnUrl, &dwSize, URL_UNESCAPE); ok(hr == E_INVALIDARG, "Got unexpected hr %#lx.\n", hr); + ok(dwSize == ARRAY_SIZE(szReturnUrl), "got size %lu\n", dwSize);
dwSize = ARRAY_SIZE(szReturnUrl); hr = UrlCanonicalizeA(winehqA, NULL, &dwSize, URL_UNESCAPE); ok(hr == E_INVALIDARG, "Got unexpected hr %#lx.\n", hr); + ok(dwSize == ARRAY_SIZE(szReturnUrl), "got size %lu\n", dwSize);
hr = UrlCanonicalizeA(winehqA, szReturnUrl, NULL, URL_UNESCAPE); ok(hr == E_INVALIDARG, "Got unexpected hr %#lx.\n", hr); @@ -1095,6 +2024,7 @@ static void test_UrlCanonicalizeA(void) dwSize = 0; hr = UrlCanonicalizeA(winehqA, szReturnUrl, &dwSize, URL_UNESCAPE); ok(hr == E_INVALIDARG, "Got unexpected hr %#lx.\n", hr); + ok(!dwSize, "got size %lu\n", dwSize);
/* buffer has no space for the result */ dwSize=urllen-1; @@ -1153,10 +2083,71 @@ static void test_UrlCanonicalizeA(void) longurl[sizeof(longurl)-1] = '\0'; hr = UrlCanonicalizeA(longurl, szReturnUrl, &dwSize, URL_WININET_COMPATIBILITY | URL_ESCAPE_UNSAFE); ok(hr == S_OK, "hr = %lx\n", hr); + ok(dwSize == strlen(szReturnUrl), "got size %lu\n", dwSize); + + for (f = 0; f < ARRAY_SIZE(file_flags); ++f) + { + for (i = 0; i < ARRAY_SIZE(unk_scheme_tests); ++i) + { + check_url_canonicalize(unk_scheme_tests[i].url, + unk_scheme_tests[i].flags | file_flags[f], unk_scheme_tests[i].expect); + sprintf(url, "wine:%s", unk_scheme_tests[i].url); + sprintf(expect, "wine:%s", unk_scheme_tests[i].expect); + check_url_canonicalize(url, unk_scheme_tests[i].flags | file_flags[f], expect); + } + + for (i = 0; i < ARRAY_SIZE(http_tests); ++i) + { + static const struct + { + const char *prefix; + BOOL ftp_like; + } + prefixes[] = + { + {"ftp", TRUE}, + {"gopher"}, + {"http"}, + {"https"}, + {"local", TRUE}, + {"news"}, + {"nntp"}, + {"res", TRUE}, + {"snews"}, + {"telnet"}, + {"wais", TRUE}, + }; + + for (j = 0; j < ARRAY_SIZE(prefixes); ++j) + { + sprintf(url, "%s:%s", prefixes[j].prefix, http_tests[i].url); + if (prefixes[j].ftp_like && http_tests[i].expect_ftp) + sprintf(expect, "%s:%s", prefixes[j].prefix, http_tests[i].expect_ftp); + else + sprintf(expect, "%s:%s", prefixes[j].prefix, http_tests[i].expect); + + check_url_canonicalize(url, http_tests[i].flags | file_flags[f], expect); + } + } + + for (i = 0; i < ARRAY_SIZE(opaque_tests); ++i) + { + static const char *const prefixes[] = {"about", "javascript", "mailto", "shell", "vbscript"}; + + for (j = 0; j < ARRAY_SIZE(prefixes); ++j) + { + sprintf(url, "%s:%s", prefixes[j], opaque_tests[i].url); + sprintf(expect, "%s:%s", prefixes[j], opaque_tests[i].expect); + check_url_canonicalize(url, opaque_tests[i].flags | file_flags[f], expect); + } + }
- /* test url-modification */ - for (i = 0; i < ARRAY_SIZE(tests); i++) - check_url_canonicalize(tests[i].url, tests[i].flags, tests[i].expect); + for (i = 0; i < ARRAY_SIZE(misc_tests); i++) + check_url_canonicalize(misc_tests[i].url, misc_tests[i].flags | file_flags[f], misc_tests[i].expect); + } + + for (i = 0; i < ARRAY_SIZE(file_tests); i++) + check_url_canonicalize(file_tests[i].url, file_tests[i].flags, file_tests[i].expect); }
/* ########################### */ @@ -1175,10 +2166,12 @@ static void test_UrlCanonicalizeW(void) dwSize = ARRAY_SIZE(szReturnUrl); hr = UrlCanonicalizeW(NULL, szReturnUrl, &dwSize, URL_UNESCAPE); ok(hr == E_INVALIDARG, "Got unexpected hr %#lx.\n", hr); + ok(dwSize == ARRAY_SIZE(szReturnUrl), "got size %lu\n", dwSize);
dwSize = ARRAY_SIZE(szReturnUrl); hr = UrlCanonicalizeW(winehqW, NULL, &dwSize, URL_UNESCAPE); ok(hr == E_INVALIDARG, "Got unexpected hr %#lx.\n", hr); + ok(dwSize == ARRAY_SIZE(szReturnUrl), "got size %lu\n", dwSize);
hr = UrlCanonicalizeW(winehqW, szReturnUrl, NULL, URL_UNESCAPE); ok(hr == E_INVALIDARG, "Got unexpected hr %#lx.\n", hr); @@ -1186,6 +2179,7 @@ static void test_UrlCanonicalizeW(void) dwSize = 0; hr = UrlCanonicalizeW(winehqW, szReturnUrl, &dwSize, URL_UNESCAPE); ok(hr == E_INVALIDARG, "Got unexpected hr %#lx.\n", hr); + ok(!dwSize, "got size %lu\n", dwSize);
/* buffer has no space for the result */ dwSize = (urllen-1); @@ -1228,6 +2222,13 @@ static void test_UrlCanonicalizeW(void) "got 0x%lx with %lu and size %lu for %u (expected 'S_OK' and size %lu)\n", hr, GetLastError(), dwSize, lstrlenW(szReturnUrl), urllen);
+ /* Only ASCII alphanumeric characters are allowed in a scheme. */ + dwSize = ARRAY_SIZE(szReturnUrl); + hr = UrlCanonicalizeW(L"f\xe8ve://./..", szReturnUrl, &dwSize, 0); + ok(hr == S_OK, "Got hr %#lx.\n", hr); + ok(!wcscmp(szReturnUrl, L"f\xe8ve:/"), "Got URL %s.\n", debugstr_w(szReturnUrl)); + ok(dwSize == wcslen(szReturnUrl), "got size %lu\n", dwSize); + /* check that the characters 1..32 are chopped from the end of the string */ for (i = 1; i < 65536; i++) {
Hi,
It looks like your patch introduced the new failures shown below. Please investigate and fix them before resubmitting your patch. If they are not new, fixing them anyway would help a lot. Otherwise please ask for the known failures list to be updated.
The tests also ran into some preexisting test failures. If you know how to fix them that would be helpful. See the TestBot job for the details:
The full results can be found at: https://testbot.winehq.org/JobDetails.pl?Key=142741
Your paranoid android.
=== w10pro64_en_AE_u8 (32 bit report) ===
shlwapi: url.c:791: Test failed: URL "\x7f/\x07/\x0b/\x01/\x1f/\x80", flags 0: Expected "\x7f/\x07/\x0b/\x01/\x1f/\x80", got "\x7f/\x07/\x0b/\x01/\x1f/\xef\xbf\xbd". url.c:791: Test failed: URL "\x7f/\x07/\x0b/\x01/\x1f/\x80", flags 0x10000: Expected "\x7f/\x07/\x0b/\x01/\x1f/\x80", got "\x7f/\x07/\x0b/\x01/\x1f/\xef\xbf\xbd". url.c:791: Test failed: URL "\x7f/\x07/\x0b/\x01/\x1f/\x80", flags 0x80000000: Expected "\x7f/\x07/\x0b/\x01/\x1f/\x80", got "\x7f/\x07/\x0b/\x01/\x1f/\xef\xbf\xbd".
=== debian11 (32 bit hi:IN report) ===
shlwapi: url.c:791: Test failed: URL "\x7f/\x07/\x0b/\x01/\x1f/\x80", flags 0: Expected "\x7f/\x07/\x0b/\x01/\x1f/\x80", got "\x7f/\x07/\x0b/\x01/\x1f/../b". url.c:800: Test failed: URL "\x7f/\x07/\x0b/\x01/\x1f/\x80", flags 0: Expected L"\007f/\0007/\000b/\0001/\001f/../b", got L"\007f/\0007/\000b/\0001/\001f/\fffd". url.c:791: Test failed: URL "\x7f/\x07/\x0b/\x01/\x1f/\x80", flags 0x10000: Expected "\x7f/\x07/\x0b/\x01/\x1f/\x80", got "\x7f/\x07/\x0b/\x01/\x1f/../b". url.c:800: Test failed: URL "\x7f/\x07/\x0b/\x01/\x1f/\x80", flags 0x10000: Expected L"\007f/\0007/\000b/\0001/\001f/../b", got L"\007f/\0007/\000b/\0001/\001f/\fffd". url.c:791: Test failed: URL "\x7f/\x07/\x0b/\x01/\x1f/\x80", flags 0x80000000: Expected "\x7f/\x07/\x0b/\x01/\x1f/\x80", got "\x7f/\x07/\x0b/\x01/\x1f/../b". url.c:800: Test failed: URL "\x7f/\x07/\x0b/\x01/\x1f/\x80", flags 0x80000000: Expected L"\007f/\0007/\000b/\0001/\001f/../b", got L"\007f/\0007/\000b/\0001/\001f/\fffd".
All the failures are addressed now.
Alexandre Julliard (@julliard) commented about dlls/kernelbase/path.c:
{ URL_SCHEME_RES, L"res"},
};
+static const WCHAR *parse_scheme( const WCHAR *p )
+{
+    while (isalnum( *p ) || *p == '+' || *p == '-' || *p == '.')
+        ++p;
I know this was already in the existing code, but you can't use isalnum() with Unicode chars.
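To illustrate the concern: isalnum() is locale-dependent and is only defined for arguments representable as unsigned char (or EOF), so passing it an arbitrary WCHAR is not safe. One possible way to avoid it is an explicit ASCII range check, sketched below. The names is_ascii_alnum() and parse_scheme_sketch() are hypothetical, wchar_t stands in for WCHAR so the snippet builds outside of Wine, and returning a pointer to the first non-scheme character is an assumption, since the quoted hunk is cut off before the end of the function. This is not necessarily how the patch ended up being fixed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: true only for ASCII letters and digits.
 * Unlike isalnum(), it is locale-independent and is well defined
 * for any wide character value. */
static bool is_ascii_alnum(wchar_t c)
{
    return (c >= L'0' && c <= L'9')
        || (c >= L'A' && c <= L'Z')
        || (c >= L'a' && c <= L'z');
}

/* Sketch of a scheme scanner: skips characters that are valid in a
 * scheme (ASCII alphanumeric, '+', '-', '.') and returns a pointer to
 * the first character that is not. The return convention is assumed. */
static const wchar_t *parse_scheme_sketch(const wchar_t *p)
{
    while (is_ascii_alnum(*p) || *p == L'+' || *p == L'-' || *p == L'.')
        ++p;
    return p;
}
```

With this check, a string such as L"f\u00e8ve://./.." stops scanning at the second character, which is consistent with the new UrlCanonicalizeW() test expecting that input not to be treated as having a scheme.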