http://bugs.winehq.org/show_bug.cgi?id=13335
Tony Wasserka tony.wasserka@freenet.de changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|tony.wasserka@freenet.de    |
Gerald Pfeifer gerald@pfeifer.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |gerald@pfeifer.com
Brian Rogers brian@xyzw.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |brian@xyzw.org
Paul "TBBle" Hampson Paul.Hampson@Pobox.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Paul.Hampson@Pobox.com
  Attachment #20401|0                           |1
        is obsolete|                            |
--- Comment #140 from François Gouget fgouget@codeweavers.com 2009-03-25 04:49:06 --- Created an attachment (id=20117) --> (http://bugs.winehq.org/attachment.cgi?id=20117) Test application
I am attaching a nice little application which can be used to reproduce and analyze this issue, without requiring an OpenGL capable machine (so it can be used in virtual machines).
What it does is allocate memory via either Unix malloc() or Unix mmap(). The nice thing is that you can compile it as a Winelib application, but also as a native application, including as a native application loaded at a specific address (all the instructions are in the C file). So you can use it to compare the situation in Wine with the one in native applications. For instance to allocate 500 chunks of 10MB, call it as follows:
  ./memtest malloc 10 500
or
  ./memtest mmap 10 500
or
  wine ./memtest.exe.so malloc 10 500
or
  wine ./memtest.exe.so mmap 10 500
Of course you won't be able to allocate 5GB of memory; the allocations will start failing before that (don't worry, it won't bring down your machine, we have overcommit to thank for that). But what's interesting is that you'll get the addresses of all the successful allocations, the total amount of memory allocated, and a pause at the end of the application so you can inspect the memory map (via winedbg or /proc).
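The allocation loop described above can be sketched roughly as follows. This is not the attached memtest source: the function name alloc_chunks and its parameters are made up for illustration, and the memory is deliberately never freed so that the map can still be inspected afterwards.

```c
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Hypothetical helper (not taken from memtest): try to allocate `count`
 * chunks of `chunk_mb` megabytes each, print the address of every chunk
 * that succeeds, and return how many succeeded. Stops at the first
 * failure. Nothing is freed on purpose. */
size_t alloc_chunks(int use_mmap, size_t chunk_mb, size_t count)
{
    size_t size = chunk_mb * 1024 * 1024;
    size_t ok = 0;

    for (size_t i = 0; i < count; i++) {
        void *p;

        if (use_mmap) {
            p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            if (p == MAP_FAILED)
                break;
        } else {
            p = malloc(size);
            if (p == NULL)
                break;
        }
        printf("chunk %zu at %p\n", i, p);
        ok++;
    }
    printf("allocated %zu MB in total\n", ok * chunk_mb);
    return ok;
}
```

Calling alloc_chunks(1, 10, 500) would correspond to the "mmap 10 500" invocation, and alloc_chunks(0, 10, 500) to "malloc 10 500".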
Here are some results:
* Native on Linux
  Allocations start around 0xb74e8000 and go down to 0x00300000, then continue from 0xb805f000 up to 0xbee5f000 for a total of around 3000MB allocated. The application load address has no impact.
* Winelib on Linux
  Allocations start at 0x7e053000 and go down to 0x60700000 so that only 480MB can be allocated. Everything below that is reserved.
* Native on FreeBSD 7.0
  Allocations start after the main executable and go up. So by default they start around 0x28300000 and go up to 0xbed00000 for a total of around 2400MB allocated. But if the executable is loaded at a higher address, such as Wine's default 0x7bf00400, then allocations start at 0x9c200000 and end at 0xbe800000 so that only 560MB can be allocated.
* Winelib on FreeBSD 7.0
  Allocations start at 0x7e4dd000 and end at 0x7eedd000 so that only 20MB can be allocated. That's obviously way too little.
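The totals quoted above can be sanity-checked from the address ranges. The sketch below assumes that the reported addresses are the start addresses of the lowest and highest chunk and that every chunk is 10MB; the helper name span_mb is invented for this note. It gives an upper bound only, since holes inside the range are not accounted for.

```c
#include <stdint.h>

/* Hypothetical helper: size in whole megabytes of the address span that
 * runs from the start of the lowest chunk to the end of the highest chunk.
 * Assumes both arguments are chunk start addresses. */
uint64_t span_mb(uint64_t lowest_start, uint64_t highest_start,
                 uint64_t chunk_mb)
{
    const uint64_t mb = 1024 * 1024;

    return (highest_start - lowest_start + chunk_mb * mb) / mb;
}
```

For the Winelib on FreeBSD figures, span_mb(0x7e4dd000, 0x7eedd000, 10) gives 20, which matches the reported 20MB. For Winelib on Linux, span_mb(0x60700000, 0x7e053000, 10) gives 483, in line with the reported 480MB.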
It would be nice to retry this with the mmap patch, but unfortunately it does not apply anymore.
--- Comment #141 from Rico kgbricola@web.de 2009-03-27 06:46:08 --- Created an attachment (id=20139) --> (http://bugs.winehq.org/attachment.cgi?id=20139) Results of the test application for different platforms (linux{32,32pae,64})
Attached is the result for the test application for different linux versions (32Bit, 32Bit+pae, 64Bit, +wine).
The result is that Wine shows the same behaviour on all tested Linux platforms. This proves my comment (http://bugs.winehq.org/show_bug.cgi?id=16456#c48) and shows that, on the tested platforms, the kernel variant Wine runs on has no influence.
--- Comment #142 from Rico kgbricola@web.de 2009-04-02 14:31:39 ---
> It would be nice to retry this with the mmap patch, but unfortunately it does not apply anymore.
Your test app doesn't trigger any call to mmap (with modified mmap patch). So the result is identical to my previous test. Could anyone confirm that?
--- Comment #143 from Rico kgbricola@web.de 2009-04-03 13:44:51 --- Created an attachment (id=20275) --> (http://bugs.winehq.org/attachment.cgi?id=20275) Test results with wglgears on wine on different linux versions (32Bit, 32Bit+pae, 64Bit, 64Bit+mem4g)
This test addresses the speed problem on several machines.
Attached is the test run of wglgears on wine on different linux kernels with a nvidia graphics card. The result is that in every frame on a clean 64Bit system there are 3 additional mmap/munmap calls which aren't there in any other run (32bit,32bitpae,64bitmem4g).
Is there an ati user with a machine like this (x86_64, ati card, 4+GB ram) who could run the test?
--- Comment #144 from Rico kgbricola@web.de 2009-04-10 11:45:56 --- Created an attachment (id=20366) --> (http://bugs.winehq.org/attachment.cgi?id=20366) wglgears debug output on 64bit with original mmap and virtual mmap wrapper
I've done some further testing with the slowness on a nvidia gpu and >4gb ram on 64bit. I've added some debug output to the patch. Also, I checked the test on an ati x800 with fglrx on the same machine and there was no slowdown.
The parts of the file names have these meanings:
  64bit - linux 64bit kernel
  mem4g - kernel option mem=4g was used
  orig  - original mmap was taken
  slow  - the new mmap (virtual_mmap_wrapper) was taken
Result: The logs 64bit_mmap_slow.log and 64bitmem4g_mmap_slow.log differ at line 178. And there the nvidia driver couldn't get memory, which probably leads to a slow rendering path. Also note that the run on 64bit with mem=4g option isn't slow! Probably I should have given it another name ...
Test system: geforce 8800gts, nvidia driver 185.19, kernel 2.6.27.21-170.2.56.fc10.x86_64
--- Comment #145 from Paul "TBBle" Hampson Paul.Hampson@Pobox.com 2009-04-12 13:07:26 --- Created an attachment (id=20401) --> (http://bugs.winehq.org/attachment.cgi?id=20401) Forward port of proposed patch to 1.1.19
I've ported and hopefully fixed-up the patch from 1.1.15 (attachment 19905) for 1.1.19 (current HEAD).
Because wine_pthread_callbacks and related code went away, the code's shorter but slightly ass. The main ass thing is that the mmap wrappers are now calling wine_pthread_get_functions every time, because they need to get the changed mmap redirections once ntdll's virtual.c has applied them, but there's no other signalling system in place for libwine's code (which receives the changed mmap redirections) to tell the loader to use them.
Moving the mmap wrappers to libwine would fix this, but then the mmap wrappers don't override the libc functions as libwine is loaded after libc. It might be possible to get around that by redirecting the loader mmap wrappers back into libwine. I'm not sure that's a lot less ass, but it might be faster, and would fix the slight chance that exit_thread calls the wrong munmap.
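To make the "look up the redirections on every call" idea concrete, here is a minimal sketch. It is not the code from the patch and not Wine's actual structures: struct mmap_redirects, get_redirects, set_redirects, wrapped_mmap, wrapped_munmap and counting_mmap are all names invented for illustration. The wrappers fetch the table each time, so a replacement installed later is picked up without any extra signalling.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>

/* Hypothetical redirection table; Wine's real structures differ. */
struct mmap_redirects {
    void *(*mmap_fn)(void *, size_t, int, int, int, off_t);
    int   (*munmap_fn)(void *, size_t);
};

/* Starts out pointing at the libc functions. */
static struct mmap_redirects redirects = { mmap, munmap };

const struct mmap_redirects *get_redirects(void) { return &redirects; }
void set_redirects(const struct mmap_redirects *r) { redirects = *r; }

/* The wrappers consult the table on every call, so whatever was
 * installed most recently is what gets used. */
void *wrapped_mmap(void *addr, size_t len, int prot, int flags,
                   int fd, off_t off)
{
    return get_redirects()->mmap_fn(addr, len, prot, flags, fd, off);
}

int wrapped_munmap(void *addr, size_t len)
{
    return get_redirects()->munmap_fn(addr, len);
}

/* Example replacement: counts calls, then defers to the libc mmap. */
int mmap_calls;

void *counting_mmap(void *addr, size_t len, int prot, int flags,
                    int fd, off_t off)
{
    mmap_calls++;
    return mmap(addr, len, prot, flags, fd, off);
}
```

The cost of this design is one extra indirect lookup per call, which is the overhead being complained about above; caching the table would be faster but needs some way to learn that it has changed.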
Anyway, on my 4GB x86_64 system with nVidia 8600M running driver 180.29, this patch causes Warhammer Online to not crash due to an OUT_OF_MEMORY error from OpenGL, and the test application attached here goes from allocating 480MB to 4040MB or 4050MB compiled as a Winelib application (mmap 10 500). For comparison, a native 32-bit build of the test app produces 4070MB, and a 64-bit build (printfs needed fixing, will attach modified source later) happily allocates 5000MB (i.e. everything asked for) in the mmap 10 500 test.
--- Comment #146 from Paul "TBBle" Hampson Paul.Hampson@Pobox.com 2009-04-12 13:12:42 --- Created an attachment (id=20402) --> (http://bugs.winehq.org/attachment.cgi?id=20402) François Gouget's test application, 64-bit clean
64-bit clean version of the test application.
--- Comment #147 from Paul "TBBle" Hampson Paul.Hampson@Pobox.com 2009-04-12 20:10:45 ---
It's been pointed out to me on IRC that the patch doesn't fix the malloc case in the test application, presumably because glibc's malloc doesn't call glibc's mmap wrapper code directly, or at least not in a way we can intercept.
So if we need to fix malloc as well as mmap, then we need to redirect malloc to HeapAlloc or something...
--- Comment #148 from Paul "TBBle" Hampson Paul.Hampson@Pobox.com 2009-04-12 22:32:58 ---
I've knocked up a quick malloc/calloc/free/realloc passthrough (that doesn't redirect to HeapAlloc, just passes straight on to the RTLD_NEXT function) and it seems to cause the test app to lock up.
However, redirecting malloc won't capture, for example, a library that uses C++ internally (as the C++ allocator is not required to use malloc internally) anyway.
On examination, libc's internal malloc appears to be calling the mmap2 system call.
Anyway, I think overriding malloc should be addressed with a separate patch, since the mmap patch is sufficient to fix the GL memory exhaustion failures this bug was submitted for, and libraries are likely to malloc much smaller amounts of memory than they're willing to mmap...
--- Comment #149 from Paul "TBBle" Hampson Paul.Hampson@Pobox.com 2009-04-12 22:46:12 ---
A quick note: electric-fence and duma are examples of things that override malloc.
Duma overrides:
  void * malloc(size_t size)
  void * calloc(size_t nelem, size_t elsize)
  void free(void * address)
  void * memalign(size_t alignment, size_t size)
  int posix_memalign(void **memptr, size_t alignment, size_t size)
  void * realloc(void * oldBuffer, size_t newSize)
  void * valloc(size_t size)
  char * strdup(const char * str)
  void * memcpy(void *dest, const void *src, size_t size)
  char * strcpy(char *dest, const char *src)
  char * strncpy(char *dest, const char *src, size_t size)
  char * strcat(char *dest, const char *src)
  char * strncat(char *dest, const char *src, size_t size)
--- Comment #150 from Paul "TBBle" Hampson Paul.Hampson@Pobox.com 2009-04-13 11:34:42 ---
Just a quick note, I've confirmed from looking at the disassembly of both 32-bit and 64-bit libc6 on Debian/unstable (glibc 2.9) that malloc there calls glibc's mmap methods directly, not via the PLT (compared to strdup, for example, which calls malloc via the PLT and so uses an overridden malloc).
So the original intent of the patch to fix malloc by replacing mmap isn't portable, so if malloc needs fixing, malloc and friends need wrapping.
--- Comment #151 from Paul "TBBle" Hampson Paul.Hampson@Pobox.com 2009-04-16 08:58:34 --- Created an attachment (id=20483) --> (http://bugs.winehq.org/attachment.cgi?id=20483) Forward port of proposed patch to 1.1.19
Slight update to mmap redirection patch, to only call wine_pthread_get_functions when it is safe to do so, ie. we've finished our dlsyming.
--- Comment #152 from Paul "TBBle" Hampson Paul.Hampson@Pobox.com 2009-04-16 09:07:09 --- Created an attachment (id=20484) --> (http://bugs.winehq.org/attachment.cgi?id=20484) Redirect malloc to Win32 Heap
This is a redirection of malloc along the lines of attachment 20483 and its predecessors into the Win32 Heap.
It makes memtest (attachment 20402) able to malloc 2GB with the 500 x 10MB test, up from 480MB.
Structurally, it's not spectacular. The redirects don't really belong in ntdll/virtual.c that I can see; they are only there because I don't want to add another caller of wine_pthread_set_functions unless it's the earliest point where GetProcessHeap() becomes valid.
And the calling of wine_pthread_get_functions in the mmap and malloc override sets is nasty, but works.
The main disadvantage of this patch is that instead of having a 500MB address space for POSIX malloc and a 2GB address space for Win32 HeapAlloc, POSIX malloc and Win32 HeapAlloc share a 2GB address space. More space for POSIX, less total address space.
This could be alleviated by only redirecting POSIX when out of POSIX address space, I guess.
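A rough sketch of that "fall back only when exhausted" idea follows. It is not part of either patch: struct pool, fallback_alloc, fallback_free and exhausted_alloc are invented names, and plain malloc/free stand in for whatever the Win32 heap calls would be. The one real design point it shows is that each block has to remember which allocator it came from, otherwise free cannot be routed correctly.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical allocator pair, e.g. "POSIX" or "Win32 heap". */
struct pool {
    void *(*alloc)(size_t);
    void  (*release)(void *);
};

/* Small header in front of each block recording the owning pool.
 * The union keeps the user pointer suitably aligned. */
union block_header {
    const struct pool *owner;
    max_align_t align;
};

/* Try the primary pool first; use the secondary one only if that fails. */
void *fallback_alloc(const struct pool *primary,
                     const struct pool *secondary, size_t size)
{
    const struct pool *pools[2] = { primary, secondary };

    if (size > SIZE_MAX - sizeof(union block_header))
        return NULL;
    for (int i = 0; i < 2; i++) {
        union block_header *h = pools[i]->alloc(sizeof *h + size);
        if (h != NULL) {
            h->owner = pools[i];
            return h + 1;
        }
    }
    return NULL;
}

/* Hand the block back to whichever pool it was taken from. */
void fallback_free(void *p)
{
    if (p != NULL) {
        union block_header *h = (union block_header *)p - 1;
        h->owner->release(h);
    }
}

/* Stand-in for an exhausted address space: always fails. */
void *exhausted_alloc(size_t size)
{
    (void)size;
    return NULL;
}
```

With a primary pool built on exhausted_alloc and a secondary pool built on malloc/free, fallback_alloc still returns a usable block, and fallback_free releases it through the secondary pool.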
At this point, I don't know that this change is even necessary; we're not seeing POSIX address space exhaustion due to malloc that I am aware of or can replicate, except in memtest. ^_^
Anyway, this is more of a proof-of-concept/hammer-upon patch, feel free to take it and mangle it into shape. You may add
Thrown-Together-In-An-Evening: Paul "TBBle" Hampson Paul.Hampson@Pobox.com
or similar if you wish. ^_^
--- Comment #153 from DL taedium_vitae@eml.cc 2009-04-16 22:37:34 ---
Just tested the latest mmap and malloc patches. When both are applied to 1.1.19, stalker and bioshock no longer crash immediately at specific points of each game. Without the malloc patch applied, these crashes still occur, so the malloc patch certainly seems to be necessary. These 2 patches seem to work at least as well as the preloader hack. I'll have to play a long session to see if any crashes occur, as this was where the games would eventually crash with the preloader hack.