On Thu Feb 6 16:35:42 2025 +0000, Jinoh Kang wrote:
I suppose a theoretical fix here would be to do a case-insensitive comparison. But I'm not sure about the details. Does wineserver's `to_lower` match the kernel's (whatever it uses for case folding)?
NTFS has a `$UpCase` table (in filesystem data) for UCS-2 codepoints, presumably initialized from some NLS table at filesystem creation (partition format). Wineserver's `to_lower` uses a preloaded `casemap` lookup table, in turn loaded from `l_intl.nls`. So I'm guessing they match most of the time, although deviation is theoretically possible on very old NTFS filesystems.
Meanwhile, it seems like ext4 uses UTF8_NFDICF (NFD normalization plus removal of ignorable code points before case folding). Since NTFS uses legacy UTF-16 with unpaired surrogates permitted, I think it's safe to assume that it (NTFS) has no support for case folding of supplementary-plane codepoints (U+10000 and above).
`rename` doesn't even work properly in this case: unlike on Windows, it's a no-op when the two names differ only by case folding, so it can't change the case the way Wine needs it to. On a pure case-change operation in an ext4 +F directory (e.g., `FOObar` -> `fooBAR`), `stat()` will basically report that the destination is a hard link to the source. Instead of unlinking the source immediately, we should:
1. Rename the source to some other temporary name.
   - In case of a casefold (+F) directory, the destination will disappear as well (since it's just the source).
2. Create a hard link from the destination to the temporary name via `link(2)`.
   - If this succeeds, then the destination (`fooBAR` in the example above) was removed in step (1), which means it was a casefold (+F) directory and we have successfully changed the case. Proceed to (3).
   - If this fails with `EEXIST`, the destination still exists. Proceed to (3).
   - If this fails for any other reason, revert (1) and exit early.
3. Unlink the temporary name.
In case of casefold filesystems, the source briefly disappears in (1) but comes back under the destination name in (2). In case of non-casefold filesystems, this is still correct: since (2) does nothing, (1) and (3) together simply unlink the source without touching the destination.
That actually looks cleaner than what I had in mind (rename to temporary, then rename to destination), but there is still the same problem: we need a temporary filename. Not sure how to do that in a good/clean way?
Also, to alleviate possible worries: even though this complicates the code a bit, it's a path that won't be taken except when renaming a file to itself with different casing, so it won't slow down the general path. And even that case is currently broken (it's treated as a no-op), so IMO correctness matters more than performance here.
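As a rough illustration of how the slow path could be gated, the sketch below compares `st_dev` and `st_ino` from `stat()`, which is what makes the destination look like a hard link to the source. `same_file` is a hypothetical name used only for this example, not an existing Wine function.

```c
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if both paths resolve to the same inode
 * on the same device, 0 otherwise (including when either stat() fails). */
int same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) == -1 || stat(b, &sb) == -1) return 0;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

Only when this returns 1 would the temporary-name sequence be needed; every other rename keeps going through the existing path.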