On Thu, Sep 15, 2011 at 8:55 PM, Bernhard Loos bernhardloos@googlemail.com wrote:
On Thu, Sep 15, 2011 at 8:33 PM, Dan Kegel dank@kegel.com wrote:
On Thu, Sep 15, 2011 at 11:13 AM, Bernhard Loos bernhardloos@googlemail.com wrote:
It might be just me, but I've seen five very strange test failures today out of about 30 build/test runs. Has anybody else noticed problems?
It smells like the rpcrt4 change.
I thought so, too.
I admit I'm a bit at a loss as to how this could happen. Could you create a +seh,+rpc log? And does it go away if you revert those changes?
I've reproduced it twice now in an hour on two quad-core machines with the script
    for try in `seq 1 100`
    do
        echo try $try
        server/wineserver -k || true
        rm -rf ~/.wine
        cd dlls/advapi32/tests
        rm -f *.ok
        make test
        cd ../../msi/tests
        rm -f action.ok
        make action.ok
        cd ../../..
    done
so it may take me a while to verify that it goes away after reverting the rpc change or to get you a detailed log.
In one of the failures, I got the service.ok failure from above:
    service.c:152: Test failed: Expected success, got error 1060
    err:rpc:I_RpcGetBuffer no binding
    service.c:176: Test failed: Expected ERROR_SERVICE_DOES_NOT_EXIST, got 123
In the other, I got a crash in service.ok:
    ../../../tools/runtest -q -P wine -M advapi32.dll -T ../../.. -p advapi32_test.exe.so service.c && touch service.ok
    err:rpc:I_RpcGetBuffer no binding
    err:rpc:I_RpcGetBuffer no binding
    wine: Unhandled page fault on read access to 0x60481043 at address 0x7bc34227 (thread 0021), starting debugger..
    err:ntdll:RtlpWaitForCriticalSection section 0x110530 "?" wait timed out in thread 002a, blocked by 0021, retrying (60 sec)
    ...
This crash is the most interesting thing by far. It looks like it's somewhere in ntdll. Could you please check where exactly? Ntdll should always get mapped to the same place, so if you haven't made any changes to it in the meantime, you can check it with any running wine process without waiting for the crash to happen again.
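One possible way to do that check, as a rough sketch only: feed the memory map of any running wine process to a small helper that reports which mapping contains the faulting address and at what file offset. The helper name find_mapping, the sample base address 0x7bc10000, and the path /usr/lib/wine/ntdll.dll.so are assumptions made up for illustration; the real values come from /proc/<pid>/maps on the affected machine.

```shell
# find_mapping ADDR
# Hypothetical helper: reads /proc/<pid>/maps-style lines on stdin and prints
# the backing file plus the file offset that corresponds to ADDR.
find_mapping() {
    fm_addr=$(( $1 ))                       # accepts hex such as 0x7bc34227
    while read -r fm_range fm_perms fm_off fm_dev fm_inode fm_path; do
        fm_start=$(( 0x${fm_range%-*} ))    # start of the mapping
        fm_end=$(( 0x${fm_range#*-} ))      # end of the mapping (exclusive)
        if [ "$fm_addr" -ge "$fm_start" ] && [ "$fm_addr" -lt "$fm_end" ]; then
            # offset in file = (address - mapping start) + mapping's file offset
            printf '%s +0x%x\n' "$fm_path" "$(( fm_addr - fm_start + 0x$fm_off ))"
            return 0
        fi
    done
    return 1                                # no mapping contains ADDR
}

# Example with a made-up maps line (the base address is an assumption):
printf '%s\n' '7bc10000-7bcc0000 r-xp 00000000 08:01 1234 /usr/lib/wine/ntdll.dll.so' |
    find_mapping 0x7bc34227
```

Against a live process the call would be `find_mapping 0x7bc34227 < /proc/<pid>/maps`; the resulting offset can then be looked up in the ntdll binary with whatever symbol tool is at hand.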
I don't get backtraces much anymore because of that darn deadlock (unrelated to the current problem), and btall doesn't seem to show a crash (though it does have some interesting stack traces); see the attachment.
Austin may have a related crash in rpcrt4_test.exe.so server.c, see http://bugs.winehq.org/show_bug.cgi?id=28383#c2
This patch should fix the problem: http://source.winehq.org/patches/data/79042
I switched it back to the non-overlapped named pipe functions, since only the ConnectNamedPipe operation actually needs overlapped mode; that call now runs in a dedicated thread. The other option would be to create an overlapped completion event for each read/write operation, but that would add a lot more overhead.
Bernhard