http://bugs.winehq.org/show_bug.cgi?id=16061
Summary: HEAP_FindSubHeap taking excessive CPU time Product: Wine Version: 1.1.5 Platform: Other OS/Version: other Status: UNCONFIRMED Severity: enhancement Priority: P2 Component: ntdll AssignedTo: wine-bugs@winehq.org ReportedBy: jburgess777@googlemail.com
Created an attachment (id=17290) --> (http://bugs.winehq.org/attachment.cgi?id=17290) oprofile output and analysis
I'm a developer of a utility called osm2pgsql which parses the XML map data produced by the OpenStreetMap project. Normally this tool runs on Linux but I recently compiled a version for Windows using MinGW and ran it with Wine as a test. It ran, but was very slow.
The execution time for the application is about 90 seconds for the Linux native version vs 78 minutes with the Windows binary executing under Wine. A user has reported that it takes about 5 minutes to run natively on Windows but I do not know what hardware that was and have not measured it myself.
Running oprofile identified HEAP_FindSubHeap() as being responsible for over 75% of the CPU total usage. The application allocates memory for lots of small objects and strings so a large number of calls to malloc/free is to be expected, but not the slow performance.
This may ultimately be a duplicate of bug 9263 but I figured this may be an interesting data point for you. It looks like optimising the linear list search in HEAP_FindSubHeap may be a useful performance win for Wine.
I'll attach the oprofile output. If you are interested in this as a test case then I can probably produce a simplified standalone executable exhibiting the same behaviour.
http://bugs.winehq.org/show_bug.cgi?id=16061
--- Comment #1 from Alexandre Julliard julliard@winehq.org 2008-11-15 03:49:34 --- (In reply to comment #0)
I'll attach the oprofile output. If you are interested in this as a test case then I can probably produce a simplified standalone executable exhibiting the same behaviour.
If you could that would be great. I expect the heap code to be slower than the Linux heap because we have to do some extra checks to cope with broken apps, but it shouldn't be that bad.
http://bugs.winehq.org/show_bug.cgi?id=16061
--- Comment #2 from Jon Burgess jburgess777@googlemail.com 2008-11-15 08:51:19 --- I've created a cut down version of the osm2pgsql utility which demonstrates the slow down. The archive at the link below has the sources and some pre-compiled executables. The information on how I ran the test is included in README.TXT.
http://tile.openstreetmap.org/direct/osm2pgsql-wine-example.tar.bz2
http://bugs.winehq.org/show_bug.cgi?id=16061
--- Comment #3 from Jon Burgess jburgess777@googlemail.com 2008-11-16 07:41:31 --- Created an attachment (id=17310) --> (http://bugs.winehq.org/attachment.cgi?id=17310) Memory allocation speed test
In case the last program was a little too complex, here is a trivial one which demonstrates the non-linear allocation performance under wine. The program allocates a total of 1GB of memory using 1kB allocations and times how long it takes for each 100MB. On Linux the allocation rate is constant. On wine it drops:
[jburgess@shark win-exception]$ gcc -Wall -o test test.c [jburgess@shark win-exception]$ ./test Allocated 100MB of data in 0.09 seconds (1055.02MB/s) Allocated 100MB of data in 0.09 seconds (1064.45MB/s) Allocated 100MB of data in 0.09 seconds (1073.83MB/s) Allocated 100MB of data in 0.09 seconds (1085.35MB/s) Allocated 100MB of data in 0.09 seconds (1085.84MB/s) Allocated 100MB of data in 0.09 seconds (1097.26MB/s) Allocated 100MB of data in 0.09 seconds (1090.07MB/s) Allocated 100MB of data in 0.09 seconds (1089.22MB/s) Allocated 100MB of data in 0.09 seconds (1087.18MB/s) Allocated 100MB of data in 0.10 seconds (994.81MB/s)
[jburgess@shark win-exception]$ i686-pc-mingw32-gcc -Wall -o test.exe test.c [jburgess@shark win-exception]$ wine ./test.exe Allocated 100MB of data in 0.15 seconds (670.75MB/s) Allocated 100MB of data in 0.28 seconds (357.92MB/s) Allocated 100MB of data in 0.45 seconds (224.02MB/s) Allocated 100MB of data in 0.68 seconds (148.07MB/s) Allocated 100MB of data in 0.85 seconds (117.64MB/s) Allocated 100MB of data in 1.03 seconds (97.39MB/s) Allocated 100MB of data in 1.31 seconds (76.22MB/s) Allocated 100MB of data in 1.77 seconds (56.45MB/s) Allocated 100MB of data in 2.42 seconds (41.36MB/s) Allocated 100MB of data in 3.28 seconds (30.52MB/s)
http://bugs.winehq.org/show_bug.cgi?id=16061
Austin English austinenglish@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Keywords| |download, testcase
http://bugs.winehq.org/show_bug.cgi?id=16061
--- Comment #4 from Alexandre Julliard julliard@winehq.org 2008-11-17 08:07:40 --- Created an attachment (id=17322) --> (http://bugs.winehq.org/attachment.cgi?id=17322) Increase subheap size quadratically
Thanks for the test case. Something like this should help.
http://bugs.winehq.org/show_bug.cgi?id=16061
Austin English austinenglish@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Keywords| |patch
http://bugs.winehq.org/show_bug.cgi?id=16061
--- Comment #5 from Jon Burgess jburgess777@googlemail.com 2008-11-17 14:49:37 --- The patch fixes the speed problem in both the small example and the original osm2pgsql program. Unfortunately the osm2pgsql program now fails to import the Finnish data set. It runs out of memory trying to allocate 570MB when it already has about 1.2GB allocated:
mmap2(NULL, 570490880, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
Perhaps the upper bound needs to be dropped to something smaller, e.g. 128MB. Or perhaps it could be made intelligent and try backing down the size if the mmap fails?
http://bugs.winehq.org/show_bug.cgi?id=16061
--- Comment #6 from Alexandre Julliard julliard@winehq.org 2008-11-17 15:35:50 --- A 128Mb limit should be fine. I doubt we need to fall back to something smaller, it's OK if we don't manage to allocate every last bit of VM. Also if you need that much memory you should probably set the LARGE_ADDRESS_AWARE, that should give you another Gb or so.
http://bugs.winehq.org/show_bug.cgi?id=16061
--- Comment #7 from Jon Burgess jburgess777@googlemail.com 2008-11-17 16:38:02 --- I dropped the limit to 128MB and now the simple test program fails at 1900MB.
Setting the LARGE_ADDRESS_AWARE flag does not seem to make any difference though:
$ i686-pc-mingw32-gcc -o test.exe test.c -Wl,--large-address-aware $ i686-pc-mingw32-objdump -p test.exe | grep large large address aware $ wine ./test.exe ... Allocated 100MB of data in 1.93 seconds (51.79MB/s) malloc(4096) failed at 1900Mb
This is with wine-1.1.6 + your patch, adjusted to 128MB. It is a 64 bit host w/ 4GB of RAM + 6GB swap.
Any ideas?
http://bugs.winehq.org/show_bug.cgi?id=16061
--- Comment #8 from Jon Burgess jburgess777@googlemail.com 2008-11-17 17:23:15 --- I found the reason, I was running in Win98 mode. Once I switched to WinXP the test app got up to 2714MB. At this point I think the process hit the 4GB virtual size limit.
The cut down osm2pgsql utility now runs through the Finnish data set fine. It takes about twice as long as the native Linux version which I think is really pretty good.
http://bugs.winehq.org/show_bug.cgi?id=16061
Alexandre Julliard julliard@winehq.org changed:
What |Removed |Added ---------------------------------------------------------------------------- Status|UNCONFIRMED |RESOLVED Resolution| |FIXED
--- Comment #9 from Alexandre Julliard julliard@winehq.org 2008-11-18 09:48:56 --- I committed the patch. Thanks for your help!
http://bugs.winehq.org/show_bug.cgi?id=16061
Alexandre Julliard julliard@winehq.org changed:
What |Removed |Added ---------------------------------------------------------------------------- Status|RESOLVED |CLOSED
--- Comment #10 from Alexandre Julliard julliard@winehq.org 2008-11-21 10:32:16 --- Closing bugs fixed in 1.1.9.