Hello,
I was debugging on a popular game [1] [2] (9 GB), it crashes on start.
+relay,+seh,+tid log show that there are some calls to wined3d.wined3d_mutex_lock / wined3d.wined3d_buffer_map / wined3d.wined3d_mutex_unlock before crashing, so I turn on +d3d9 trace to get a new log in details.
To my surprise, the game doesn't crash anymore with +d3d9 trace!
After a serials of bisect, I found d3d9_AddRef is the key function call related to the crashing. If I remove the TRACE statement in line 71, the game will crash with +d3d9 as well; if I upgrade the TRACE in 71 to FIXME, the game will not crash even without +d3d9.
66 static ULONG WINAPI d3d9_AddRef(IDirect3D9Ex *iface) 67 { 68 struct d3d9 *d3d9 = impl_from_IDirect3D9Ex(iface); 69 ULONG refcount = InterlockedIncrement(&d3d9->refcount); 70 71 TRACE("%p increasing refcount to %u.\n", iface, refcount); 72 73 return refcount; 74 }
Further tests show that the simplest hack to avoid crashing is replacing line 71 to: FIXME("anything %x\n", 0xdeadbeef);
I have no idea what the real fix is, any suggestion what is the next step to debug?
Thanks in advance!
Environment: Wine 1.6-rc5 Ubuntu 12.04.2 LTS gcc 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)
The installer ship with native d3dx9_40.dll
Backtrace on crash: Wine-dbg>c Unhandled exception: page fault on read access to 0x00000001 in 32-bit code (0x0649e6e9). Register dump: CS:0073 SS:007b DS:007b ES:007b FS:0033 GS:003b EIP:0649e6e9 ESP:0033cc60 EBP:00000000 EFLAGS:00210206( R- -- I - -P- ) EAX:00000001 EBX:19b871e0 ECX:00000000 EDX:00000000 ESI:23f420b0 EDI:23f4f20c Stack dump: 0x0033cc60: 80004005 19c0e0c0 00000000 19b871e0 0x0033cc70: 00000000 23f420b0 23f4f1f0 19b871e0 0x0033cc80: 00000018 00000040 0000001c 00000000 0x0033cc90: 0a9101df 00000020 00000034 7bc34159 0x0033cca0: 00110064 ffffffff 0033ccd8 7bc34d2d 0x0033ccb0: 00000000 0000000c 00000010 00000001 Backtrace: =>0 0x0649e6e9 in kg3dengine (+0x16e6e9) (0x00000000) 0x0649e6e9: movl 0x0(%eax),%ecx
[1] http://jx3.xoyo.com/download [2] http://jx3.client.cdn.kingsoft.com/JXOnline3-v3.0.4.4980/JXOnline3-v3.0.4.49...
-- Regards, Qian Hong
On 19 July 2013 15:34, Qian Hong fracting@gmail.com wrote:
Further tests show that the simplest hack to avoid crashing is replacing line 71 to: FIXME("anything %x\n", 0xdeadbeef);
I have no idea what the real fix is, any suggestion what is the next step to debug?
Well, it sounds a bit like some form of memory corruption. Sometimes WINEDEBUG=warn+heap helps with tracking those down. Also, I don't suppose the d3d9 object was already destroyed earlier?
On Fri, Jul 19, 2013 at 04:06:56PM +0200, Henri Verbeet wrote:
On 19 July 2013 15:34, Qian Hong fracting@gmail.com wrote:
Further tests show that the simplest hack to avoid crashing is replacing line 71 to: FIXME("anything %x\n", 0xdeadbeef);
I have no idea what the real fix is, any suggestion what is the next step to debug?
Well, it sounds a bit like some form of memory corruption. Sometimes WINEDEBUG=warn+heap helps with tracking those down. Also, I don't suppose the d3d9 object was already destroyed earlier?
Or timing ... Also can you print out refcount there to see if the refcounting is good?
Ciao, Marcus
On Sat, Jul 20, 2013 at 2:21 AM, Marcus Meissner marcus@jet.franken.de wrote:
Or timing ... Also can you print out refcount there to see if the refcounting is good?
Thanks Marcus, yes, I tried replace the TRACE to printf, to my surprise, the magic disappear. In other words, if I replace the TRACE in d3d9_AddRef to FIXME, the crashing would be worked around. However, replacing the TRACE to printf doesn't help. I'm not sure why.
by replacing TRACE to printf, I can reproduce the crashing while have a look at the refcounting, it seems good to me.
Hello,
Thanks everyone's help, sorry for delay, I was blocked by something else. I've created http://bugs.winehq.org/show_bug.cgi?id=34125 to track down this bug.
On Fri, Jul 19, 2013 at 10:06 PM, Henri Verbeet hverbeet@gmail.com wrote:
Well, it sounds a bit like some form of memory corruption. Sometimes WINEDEBUG=warn+heap helps with tracking those down.
Thanks Henri, warn+heap doesn't show warning, I've attached to Bug 34125, it shows some HeapSetInformation calls, I don't know if it harms.
Also, I don't suppose the d3d9 object was already destroyed earlier?
By reading +d3d9 log, I believe the d3d9 object was not destroyed earlier.
On Jul 19, 2013, at 8:34 AM, Qian Hong wrote:
I was debugging on a popular game [1] [2] (9 GB), it crashes on start.
+relay,+seh,+tid log show that there are some calls to wined3d.wined3d_mutex_lock / wined3d.wined3d_buffer_map / wined3d.wined3d_mutex_unlock before crashing, so I turn on +d3d9 trace to get a new log in details.
To my surprise, the game doesn't crash anymore with +d3d9 trace!
After a serials of bisect, I found d3d9_AddRef is the key function call related to the crashing. If I remove the TRACE statement in line 71, the game will crash with +d3d9 as well; if I upgrade the TRACE in 71 to FIXME, the game will not crash even without +d3d9.
66 static ULONG WINAPI d3d9_AddRef(IDirect3D9Ex *iface) 67 { 68 struct d3d9 *d3d9 = impl_from_IDirect3D9Ex(iface); 69 ULONG refcount = InterlockedIncrement(&d3d9->refcount); 70 71 TRACE("%p increasing refcount to %u.\n", iface, refcount); 72 73 return refcount; 74 }
Further tests show that the simplest hack to avoid crashing is replacing line 71 to: FIXME("anything %x\n", 0xdeadbeef);
I have no idea what the real fix is, any suggestion what is the next step to debug?
Backtrace on crash: Wine-dbg>c Unhandled exception: page fault on read access to 0x00000001 in 32-bit code (0x0649e6e9).
This reminds me of bugs where programs fail to initialize variables before use. So, the variable contains whatever value happened to be left at that stack location by previous calls. The specific behavior of those previous calls can leave different values, so changing seemingly unrelated code changes the crash (or makes it go away).
Instead of enabling the trace, try replacing it with something like:
char dummy[256]; memset(dummy, 0x55, sizeof(dummy));
I bet the crash will change from a read access to 0x00000001 to 0x55555555.
The question is: is it a bug in the game or a bug in Wine where Wine is returning a garbage pointer to the game and the game is innocently dereferencing it? For that, you'd have to trace the execution from the call to d3d9_AddRef() to the crashing point to see what else is called and whether it uses an uninitialized value.
Of course, another approach would be to run under Valgrind, if you can get it to work.
-Ken
Hi Ken,
On Sun, Jul 21, 2013 at 2:39 PM, Ken Thomases ken@codeweavers.com wrote:
Instead of enabling the trace, try replacing it with something like:
char dummy[256]; memset(dummy, 0x55, sizeof(dummy));
I bet the crash will change from a read access to 0x00000001 to 0x55555555.
Thanks for the advice, it sounds cool and I learned new things, thank you. However, it doesn't help in this case, I didn't found any change after the dummy hack.
The question is: is it a bug in the game or a bug in Wine where Wine is returning a garbage pointer to the game and the game is innocently dereferencing it? For that, you'd have to trace the execution from the call to d3d9_AddRef() to the crashing point to see what else is called and whether it uses an uninitialized value.
Of course, another approach would be to run under Valgrind, if you can get it to work.
I've run it under Valgrind, there are many outputs, I'm not sure which part of valgrind output is important and which part could be ignore, the output is attached to bug 34125 as well.
Quite a lot of output are related to native msvcr80 installed by the app, I tried force using buitlin msvcr80, unfortunately the game doesn't start at all with buitlin msvcr80. I've send one test case (Patch 97446), but there is still other msvcrt bugs after that issue.
By the way, the game ship with many native d3dx9_??, and with buitin d3dx9_?? it doesn't start as well.
On Jul 24, 2013, at 12:49 PM, Qian Hong wrote:
Hi Ken,
On Sun, Jul 21, 2013 at 2:39 PM, Ken Thomases ken@codeweavers.com wrote:
Instead of enabling the trace, try replacing it with something like:
char dummy[256]; memset(dummy, 0x55, sizeof(dummy));
I bet the crash will change from a read access to 0x00000001 to 0x55555555.
Thanks for the advice, it sounds cool and I learned new things, thank you. However, it doesn't help in this case, I didn't found any change after the dummy hack.
Hmm. You may need to mark the dummy array as volatile to prevent it from being optimized away. Worth trying. You might also increase the size of the array.
-Ken
Hi Ken,
On Thu, Jul 25, 2013 at 2:33 AM, Ken Thomases ken@codeweavers.com wrote:
Hmm. You may need to mark the dummy array as volatile to prevent it from being optimized away. Worth trying. You might also increase the size of the array.
You are right, thanks! Increasing the size of the array helps, the minimum working size I found is 588, I've attach the patch and the backtrace to Bug 34125.
In my hack, I let: + char dummy[588]; + memset(dummy, 0x55, sizeof(dummy)); + dummy[4] = 0x88; + dummy[5] = 0x77; + dummy[6] = 0x66; + dummy[7] = 0x55; /* why 584 doesn't work? */
And I get a crashing like: Unhandled exception: page fault on read access to 0x55667788 in 32-bit code
I've also hack wined3d/directx.c in a similar way, the minimum size is 592: + char dummy[592]; + memset(dummy, 0xee, sizeof(dummy)); + dummy[8] = 0xdd; + dummy[9] = 0xcc; + dummy[10] = 0xbb; + dummy[11] = 0xaa; /* why 584 and 588 doesn't work ? */ The crashing is like: Unhandled exception: page fault on read access to 0xaabbccdd in 32-bit code
When combining the d3d9 hack and the wined3d hack, the behavior is the same to d3d9 hack only.
I'll see how far I can get from here, also I'm open to any suggestion and appreciate to any help!
Thanks again.
Hi Qian,
On Jul 25, 2013, at 11:53 AM, Qian Hong wrote:
On Thu, Jul 25, 2013 at 2:33 AM, Ken Thomases ken@codeweavers.com wrote:
Hmm. You may need to mark the dummy array as volatile to prevent it from being optimized away. Worth trying. You might also increase the size of the array.
You are right, thanks! Increasing the size of the array helps, the minimum working size I found is 588, I've attach the patch and the backtrace to Bug 34125.
In my hack, I let:
- char dummy[588];
- memset(dummy, 0x55, sizeof(dummy));
- dummy[4] = 0x88;
- dummy[5] = 0x77;
- dummy[6] = 0x66;
- dummy[7] = 0x55;
/* why 584 doesn't work? */
And I get a crashing like: Unhandled exception: page fault on read access to 0x55667788 in 32-bit code
I think your Valgrind results are telling us that there's a bug in the game where it's using an uninitialized stack variable.
There's still a chance that it's something in Wine that's using the uninitialized variable and passing a garbage value to the game, causing it to crash. I'm not sure if Valgrind would attribute that to Wine for copying the uninitialized variable's value or to the game for using what Wine gave it. I _think_ Valgrind would blame Wine in that case. Since it's blaming the game, I lean toward the bug being in the game.
There may not be any solution other than reporting the issue to the game developer.
-Ken
On Fri, Jul 26, 2013 at 2:52 AM, Ken Thomases ken@codeweavers.com wrote:
I think your Valgrind results are telling us that there's a bug in the game where it's using an uninitialized stack variable.
There's still a chance that it's something in Wine that's using the uninitialized variable and passing a garbage value to the game, causing it to crash. I'm not sure if Valgrind would attribute that to Wine for copying the uninitialized variable's value or to the game for using what Wine gave it. I _think_ Valgrind would blame Wine in that case. Since it's blaming the game, I lean toward the bug being in the game.
There may not be any solution other than reporting the issue to the game developer.
Hmm, thanks a lot for the great analysis :) I've just sent a message to someone in the game company, not sure if they care :)
On Jul 25, 2013, at 2:11 PM, Qian Hong wrote:
On Fri, Jul 26, 2013 at 2:52 AM, Ken Thomases ken@codeweavers.com wrote:
I think your Valgrind results are telling us that there's a bug in the game where it's using an uninitialized stack variable.
There's still a chance that it's something in Wine that's using the uninitialized variable and passing a garbage value to the game, causing it to crash. I'm not sure if Valgrind would attribute that to Wine for copying the uninitialized variable's value or to the game for using what Wine gave it. I _think_ Valgrind would blame Wine in that case. Since it's blaming the game, I lean toward the bug being in the game.
There may not be any solution other than reporting the issue to the game developer.
Hmm, thanks a lot for the great analysis :) I've just sent a message to someone in the game company, not sure if they care :)
I suppose another possibility is that Wine is returning a value to the game which is causing it to take a code path it never takes on Windows and it's only this code path which results in the game reading the uninitialized variable. In that case, there's still a game bug, but we may be able to work around it by changing Wine.
It's going to be really hard to figure out, though, without cooperation from the game developers.
-Ken