http://bugs.winehq.org/show_bug.cgi?id=3817
ebfe knabberknusperhaus@yahoo.de changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |knabberknusperhaus@yahoo.de
--- Comment #25 from ebfe knabberknusperhaus@yahoo.de 2008-02-03 16:34:26 ---
i've taken a look at this and imho several things matter here:
- the major performance drawback comes from wine trying to find files in large directories. i've attached a profiling graph that shows 70% of the cpu cycles for a single file lookup being spent in the function 'wine_nt_to_unix_file_name'. other profiling scenarios show similar results.
- the basic problem is mapping a case-insensitive filesystem like NTFS/FAT onto a case-sensitive one like those behind the linux vfs. when wine gets a request for "foo.BAR", it first looks for exactly that filename. if it is not found, wine walks through the entire directory, converting every filename to utf8 and doing a case-insensitive comparison on it. this is repeated for every filename lookup that has no exact match, so in a directory of a thousand files, adding another 100 files leads to roughly 100,000 utf8/case conversions.
- the hash and utf8-conversion functions are horribly slow. they rely on byte-wise operations, which are in fact only emulated on newer (read: 486 and up) processors. rewriting the utf8 conversion to process 2 bytes at a time (e.g. by building a conversion table at runtime for two-byte input) will effectively double the performance of those functions.
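to make the lookup pattern from the second point concrete, here is a minimal sketch in C of "exact name first, then scan the whole directory". this is not wine's code: lookup_nocase and its parameters are made up for illustration, and strcasecmp only folds ASCII where real code would need proper unicode case folding.

```c
#define _POSIX_C_SOURCE 200809L   /* for strcasecmp under strict -std= modes */
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <strings.h>
#include <sys/stat.h>

/* hypothetical helper, not wine's code: resolve `wanted` inside `dir`.
 * returns 1 and stores the on-disk spelling in `out`, or 0 if nothing matches.
 * `*compared` reports how many directory entries had to be examined. */
static int lookup_nocase(const char *dir, const char *wanted,
                         char *out, size_t out_len, unsigned *compared)
{
    char path[4096];
    struct stat st;
    struct dirent *e;
    DIR *d;
    int len, found = 0;

    assert(dir && wanted && out && compared);
    *compared = 0;

    /* fast path: the exact spelling exists, so no scan is needed */
    len = snprintf(path, sizeof(path), "%s/%s", dir, wanted);
    if (len < 0 || (size_t)len >= sizeof(path)) return 0;
    if (stat(path, &st) == 0) {
        snprintf(out, out_len, "%s", wanted);
        return 1;
    }

    /* slow path: walk every entry and compare ignoring case */
    if ((d = opendir(dir)) == NULL) return 0;
    while ((e = readdir(d)) != NULL) {
        (*compared)++;
        if (strcasecmp(e->d_name, wanted) == 0) {
            snprintf(out, out_len, "%s", e->d_name);
            found = 1;
            break;   /* first match in readdir order wins */
        }
    }
    closedir(d);
    return found;
}
```

a lookup that misses pays for one comparison per directory entry, which is where the cost for large directories comes from.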
there are some general problems:
- currently wine simply opens a stream to the directory and reads all entries. as no ordering of filenames is applied, the first entry that matches wins. the order is left completely to the OS and is therefore random from our point of view. this may lead to the situation where wine returns the file "foo.Bar" one time and "Foo.bar" another time when asked for "FOO.BAR".
- as far as i can see, there is an unsolvable file-locking problem. creating a file on the "real" fs implies automatic locking against other processes which may do the same. since a filename means the same thing to all processes operating on the "real" filesystem, but not to wine (which treats names differing only in case as the same file), we will in theory always face toctou bugs.
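one way to take the randomness out of the ordering problem above is a fixed tie-break rule. the sketch below is only an assumption about how that could look: pick_match is a made-up helper working on a list of names as readdir might have returned them, an exact match wins, and otherwise the byte-wise smallest candidate is chosen. strcasecmp again stands in for real unicode case folding.

```c
#define _POSIX_C_SOURCE 200809L   /* for strcasecmp under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* hypothetical helper: among `names` (in whatever order the OS delivered
 * them), choose the entry that matches `wanted` ignoring case.
 * an exact match always wins; otherwise the byte-wise smallest candidate is
 * returned, so the result does not depend on directory order.
 * returns NULL when nothing matches. */
static const char *pick_match(const char *const *names, size_t n,
                              const char *wanted)
{
    const char *best = NULL;
    size_t i;

    assert(names != NULL && wanted != NULL);
    for (i = 0; i < n; i++) {
        if (strcasecmp(names[i], wanted) != 0) continue;          /* no match */
        if (strcmp(names[i], wanted) == 0) return names[i];        /* exact */
        if (best == NULL || strcmp(names[i], best) < 0) best = names[i];
    }
    return best;
}
```

with this rule a request for "FOO.BAR" resolves to "Foo.bar" no matter whether the directory lists "Foo.bar" or "foo.Bar" first. it still needs a full scan, since the loop can only stop early on an exact match.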
imho the best solution for this (except the locking thing) would be to rely on a cached intermediate state. when looking up files in a certain directory, we read the entire directory into a cache structure (including a fixed ordering of Foo.bar and foo.Bar) and start monitoring this directory using inotify. one big advantage of inotify is the clarity of its messages: when a new file is added to the directory in question, the event tells us the file's name, so we don't need to re-read the entire directory. consider a directory with 1,000 files: adding 100 files would require reading the directory once, reading from the cache 100 times and adding to the cache 100 times. if inotify is not present, we can always fall back to the slower version already implemented.
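a rough sketch of the cache idea above, in C for linux. none of this is wine code: struct dir_cache, CACHE_MAX and the cache_* functions are made up for illustration, strcasecmp stands in for proper unicode case folding, and error handling is minimal. in particular a full cache or an inotify queue overflow (IN_Q_OVERFLOW) is not handled; both would mean re-reading the directory or falling back to the plain scan.

```c
#define _POSIX_C_SOURCE 200809L   /* for strdup under strict -std= modes */
#include <assert.h>
#include <dirent.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>
#include <sys/inotify.h>
#include <unistd.h>

#define CACHE_MAX 4096   /* fixed capacity keeps the sketch short */
struct dir_cache { char *names[CACHE_MAX]; size_t count; };  /* kept sorted */
typedef int (*cmp_fn)(const char *, const char *);

/* case-insensitive order; ties are broken byte-wise: "Foo.bar" < "foo.Bar" */
static int name_cmp(const char *a, const char *b)
{
    int r = strcasecmp(a, b);
    return r ? r : strcmp(a, b);
}

/* index of the first cached name that is not less than `name` under `cmp` */
static size_t lower_bound(const struct dir_cache *c, const char *name, cmp_fn cmp)
{
    size_t lo = 0, hi = c->count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (cmp(c->names[mid], name) < 0) lo = mid + 1; else hi = mid;
    }
    return lo;
}

static int cache_add(struct dir_cache *c, const char *name)
{
    size_t i = lower_bound(c, name, name_cmp);
    char *copy;
    if (i < c->count && strcmp(c->names[i], name) == 0) return 0;  /* known */
    if (c->count == CACHE_MAX || (copy = strdup(name)) == NULL) return -1;
    memmove(&c->names[i + 1], &c->names[i], (c->count - i) * sizeof(char *));
    c->names[i] = copy;
    c->count++;
    return 0;
}

static void cache_remove(struct dir_cache *c, const char *name)
{
    size_t i = lower_bound(c, name, name_cmp);
    if (i == c->count || strcmp(c->names[i], name) != 0) return;
    free(c->names[i]);
    memmove(&c->names[i], &c->names[i + 1], (c->count - i - 1) * sizeof(char *));
    c->count--;
}

/* exact spelling first, else the first name of the case-insensitive group */
static const char *cache_lookup(const struct dir_cache *c, const char *wanted)
{
    size_t i = lower_bound(c, wanted, name_cmp);
    if (i < c->count && strcmp(c->names[i], wanted) == 0) return c->names[i];
    i = lower_bound(c, wanted, strcasecmp);
    return (i < c->count && !strcasecmp(c->names[i], wanted)) ? c->names[i] : NULL;
}

/* apply one inotify event (mask plus file name) to the cache */
static void cache_apply(struct dir_cache *c, uint32_t mask, const char *name)
{
    assert(c != NULL && name != NULL);
    if (mask & (IN_CREATE | IN_MOVED_TO)) cache_add(c, name);
    else if (mask & (IN_DELETE | IN_MOVED_FROM)) cache_remove(c, name);
}

/* watch `path`, read it once into an empty cache; returns inotify fd or -1 */
static int cache_open(struct dir_cache *c, const char *path)
{
    const uint32_t mask = IN_CREATE | IN_DELETE | IN_MOVED_FROM | IN_MOVED_TO;
    struct dirent *e;
    DIR *d;
    int fd = inotify_init1(IN_NONBLOCK);
    if (fd < 0) return -1;   /* no inotify: caller uses the slow path */
    /* watch before reading so files created during the scan are not missed */
    if (inotify_add_watch(fd, path, mask) < 0) { close(fd); return -1; }
    if ((d = opendir(path)) == NULL) { close(fd); return -1; }
    while ((e = readdir(d)) != NULL) cache_add(c, e->d_name);
    closedir(d);
    return fd;
}

/* drain pending events; meant to be called before each lookup */
static void cache_sync(struct dir_cache *c, int fd)
{
    char buf[4096] __attribute__((aligned(__alignof__(struct inotify_event))));
    ssize_t len;
    while ((len = read(fd, buf, sizeof(buf))) > 0)
        for (char *p = buf; p < buf + len; ) {
            struct inotify_event *ev = (struct inotify_event *)p;
            if (ev->len) cache_apply(c, ev->mask, ev->name);
            p += sizeof(*ev) + ev->len;
        }
}
```

the intended use is cache_open once per directory, then cache_sync followed by cache_lookup for each request. because the array is sorted with a fixed tie-break, a request for "FOO.BAR" always resolves to the same on-disk name, and each lookup is a binary search instead of a directory scan.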