Hello, I am working on a DDraw rewrite with WineD3D, and I've hit a problem with Half-Life, which looks like a stack corruption. HL makes a ret from a function which leads into sensless code, and I wasn't able to debug it.
That's what happens: From 0x00428a35 there's a call to a function at 0x00408b70. This function calls a lot of DDraw calls, which seem to work fine, but on return, it doesn't return to 0x00428a3a, as it should, but to 0x7fbcd888, which is valid, but contains only garbage code.
I assumed that some call runs amok on the stack, and I've set tried setting breakpoints at 0x00428a35, 0x00408b70 and some other addresses. Oddly, this doesn't work(winedbg disables them when continuing), although I can access the code right after start. I can set a DebugBreak() in the last called function(IDirectDraw::FlipToGDISurface), but there it is too late.
Are there any other ways to debug this? Does anyone know why setting the breakpoints fails?
Well, that's my suspicion for the problem: (an any COM expert comment on this?) To create interfaces for IDirectDraw1 to 7, I created 4 lpVtbl structures for each version, and I cast most DD7 funtions info the older versions. Only where the type or the number of parameters is different, I use a wrapper function. At DD creation, I use whatever version was requested by the app for the new object. The old dd version used a number of macros to cast the various interfaces(defined in ddcomimpl.h). I don't use them. Might this cause the problem?
Thanks for your help, Stefan
Are there any other ways to debug this? Does anyone know why setting the breakpoints fails?
when winedbg cannot set a bp (when continuing or something) it tells it with a message like "invalid addr for bp, disabling it". but, are you sure of the addresses? I mean, how did you get the 0x428a35 address ? In some cases, running the program from the debugger and from the command line lead to different memory layouts, hence potentially different addresses. A+
On Tue, Jan 03, 2006 at 09:17:14PM +0100, Stefan Dösinger wrote:
To create interfaces for IDirectDraw1 to 7, I created 4 lpVtbl structures for each version, and I cast most DD7 funtions info the older versions. Only where the type or the number of parameters is different, I use a wrapper function. At DD creation, I use whatever version was requested by the app for the new object. The old dd version used a number of macros to cast the various interfaces(defined in ddcomimpl.h). I don't use them. Might this cause the problem?
Best would to see the actual code for that as I do not really understand what you did by reading your description of it.
But I still find what you wrote suspicious: if you have 4 VTables you should NEVER cast functions even if they have the same signature - casts are only useful if multiple object versions share the same VTable. Basically (from what I remember :-) ), the pointer to the VTable is stored at the address returned to the application as the COM object. Wine then use a fixed offset to find it's private data from the COM object (basically, the offset between the start of Wine's data to the VTable it returned to the application). Of course, if you have 4 VTables, these offsets are different => you cannot find the address of Wine's internal data without knowing exactly which object was given as an argument to the function.
So by just casting, you will apply the wrong offset and so have completely bogus internal datas (for example, if you do another DDraw call inside the called DDraw function), you may use completely bogus values to do the jump (and so maybe jump to a function which does not have the correct signature => stack corruption).
Lionel
Hi,
Best would to see the actual code for that as I do not really understand what you did by reading your description of it.
I only have to implement D3D Textures, IDirect3DVertexBuffer and IDirect3DExecuteBuffer, then my new patch will be more or less complete and I'll post it here. Shouldn't take too long, depending on the work I have to do for University ;)
But I still find what you wrote suspicious: if you have 4 VTables you should NEVER cast functions even if they have the same signature - casts are only useful if multiple object versions share the same VTable. Basically (from what I remember :-) ), the pointer to the VTable is stored at the address returned to the application as the COM object. Wine then use a fixed offset to find it's private data from the COM object (basically, the offset between the start of Wine's data to the VTable it returned to the application). Of course, if you have 4 VTables, these offsets are different => you cannot find the address of Wine's internal data without knowing exactly which object was given as an argument to the function.
That's basically what the original version does with the macros in ddcomimpl. When I looked at those I didn't really understand their functionality and what they were good for, so I did it a little different.
That's what it basically looks like( IDirectDraw for example):
My IDirectDrawImpl structure :
struct IDirectDrawImpl { IDirectDraw7Vtbl *lpVtbl; ULONG ref;
/* Other members follow here */ }
Most implementation functions are DD7: HRESULT WINAPI IDirectDrawImpl_SetCooperativeLevel(IDirectDraw7 *iface, HWND hWnd, DWORD dwFlags); HRESULT WINAPI IDirectDrawImpl_SetDisplayMode(IDirectDraw7 *iface, DWORD dwWidth, DWORD dwHeight, DWORD dwBPP, DWORD dwRefreshRate, DWORD dwFlags);
I have 4 versions of the Vtables:
IDirectDraw7Vtbl IDirectDraw7_Vtbl = { /* IUnknown */ ... /* IDirectDraw7 */ ... IDirectDrawImpl_SetCooperativeLevel, IDirectDrawImpl_SetDisplayMode, ... }
IDirectDraw4Vtbl IDirectDraw4_Vtbl = { /* IUnknown */ ... /* IDirectDraw4 */ ... ( void * ) IDirectDrawImpl_SetCooperativeLevel, ( void * ) IDirectDrawImpl_SetDisplayMode, ... }
The arguments for SetCooperativeLevel and SetDisplayMode are the same for DD7 and DD4 ( IDirectDrawX *, HWND, DWORD), (IDirectDrawX *, DWORD, DWORD, DWORD, DWORD, DWORD). The only difference is the IDirectDrawX pointer, so I have to use a cast to supress the warning in older Vtables( The (void *) cast is equal to the XCAST makro in the original implementation).
A problem arises then the arguments are different in varios versions: For example IDirectDraw::SetDisplayMode: it only takes a Pointer and 3 DWORDs. Do I created another function:
HRESULT WINAPI IDirectDrawImpl1_SetDisplayMode(IDirectDraw *iface, DWORD dwWidth, DWORD dwHeight, DWORD dwBPP);
And for the DD1 Vtable I use
IDirectDrawVtbl IDirectDraw1_Vtbl = { /* IUnknown */ ... /* IDirectDraw4 */ ... ( void * ) IDirectDrawImpl_SetCooperativeLevel, IDirectDrawImpl1_SetDisplayMode, ... }
Such functions eighter contain a independent implementation, like in SetDisplayMode(which is only a one-liner call to WineD3D) or they call the V7 implementation(Like in IDirect3DDevcie). I can't use the VTable for this, because there's no DD7 Vtable assigned, so I call the I<Iface>Impl_<Method> function directly(This happens a few times in IDirect3DDevice).
In the implementation functions I can get the Implementation structure easily:
IDirectDrawImpl *This = (IDirectDrawImpl *) iface;
There's no difference in the address of the VTable and the Implementation structure.
When creating the Interface, I allocate a I<Interface>Impl structure, and I assign the VTable of the requested version:
if(DD7) object->lpVtbl = &IDirectDraw7_Vtbl; else if(DD4) (IDirectDraw4Vtbl *) object->lpVtbl = &IDirectDraw4_Vtbl; else if(DD2) (IDirectDraw2Vtbl *) object->lpVtbl = &IDirectDraw2_Vtbl; else if(DD1) (IDirectDrawVtbl *) object->lpVtbl = &IDirectDraw1_Vtbl;
If any implementation funtion really needs to know the version of the interface, it can compare the VTable addresses:
if(This->lpVtbl == &IDirectDraw7_Vtbl) { DD7 } else if( (IDirectDraw4Vtbl *) This->lpVtbl == &IDirectDraw4_Vtbl) { DD4 } ....
So I do not need a Version member in the Implementation structures ;)
This works for all games I tried so far, except for HL 1.1.1.0, so I suspect the problem to be somewhere else. HL 1.1.1.1 (the Steam version) works, except that it crashes in native shdocvw when entering the game(This happens with the original ddraw version too). OpenGL mode still works fine. If there are problems I don't know yet, they'll arise in IDirect3DDevice, as it has heavily different VTables.
What are the advantages of this? * No need for the Thunk_I<Interface>_<Method> functions, except for IDirectDraw::SetDisplayMode. The app calls directly into the methods. * No need for the ICOM_THIS_FROM, COM_INTERFACE_CAST, ... Makros. * No difference between the Implementation address and the address passed to the app. This makes traces easier to read IMO.
The problems? * The (void *) cast makes it hard to find incorrectly assigned VTable members. Maybe that's the problem with HL 1.1.1.0 * Implementation functions can't use the VTables * It's different from other dlls that use multiple interface versions. (Are there any? I found only ddraw)
I think I can send a new patch in the next 2 weeks, so you can look at the whole thing.
Stefan
Stefan Dösinger wrote:
IDirectDraw7Vtbl IDirectDraw7_Vtbl = { /* IUnknown */ ... /* IDirectDraw7 */ ... IDirectDrawImpl_SetCooperativeLevel, IDirectDrawImpl_SetDisplayMode, ... }
IDirectDraw4Vtbl IDirectDraw4_Vtbl = { /* IUnknown */ ... /* IDirectDraw4 */ ... ( void * ) IDirectDrawImpl_SetCooperativeLevel, ( void * ) IDirectDrawImpl_SetDisplayMode, ... }
You can't do this. In order to access the object that the vtable is part of you need to subtract an offset from the vtable pointer. If you use the same function for two vtables then which offset do you use? The answer: both or neither - it is impossible. Therefore, even though the implementation of SetCooperativeLevel may be the same for the second vtable, you still need to implement another function for it, even if it just forwards to the true implementation.
You should never cast function pointers in vtables. The risks are just too great.
On Sat, Jan 07, 2006 at 12:02:27PM +0100, Stefan Dösinger wrote:
When creating the Interface, I allocate a I<Interface>Impl structure, and I assign the VTable of the requested version:
if(DD7) object->lpVtbl = &IDirectDraw7_Vtbl; else if(DD4) (IDirectDraw4Vtbl *) object->lpVtbl = &IDirectDraw4_Vtbl; else if(DD2) (IDirectDraw2Vtbl *) object->lpVtbl = &IDirectDraw2_Vtbl; else if(DD1) (IDirectDrawVtbl *) object->lpVtbl = &IDirectDraw1_Vtbl;
Well, an object can have multiple interfaces and still be the same object. So basically if at each new 'QueryInterface' call you allocation a new I<Interface>Impl structure it's wrong regarding COM too...
For example, you could create a surface1, query interface it to be a surface2. Then if you changed stuff using the 'surface1' methods (like, for example, attaching a palette to it), you should be able to query the same palette back using the 'surface2' methods ... Which would not be the case using your solution as both interfaces have separate contexts.
So your object need to have all vtables set at the beginning and the only difference between the interfaces is the pointer returned to the application.
What are the advantages of this?
- No need for the Thunk_I<Interface>_<Method> functions, except for
IDirectDraw::SetDisplayMode. The app calls directly into the methods.
- No need for the ICOM_THIS_FROM, COM_INTERFACE_CAST, ... Makros.
- No difference between the Implementation address and the address passed to
the app. This makes traces easier to read IMO.
Yeah but all this is wrong and does not respect what COM is... I did not spend hours writing the Python script that auto-generated most of the initial code for nothing :-)
Lionel
PS: actually at the beginning of the rewrite we toyed with removing the thunks and adding a pointer to the object data after each VTable. This would have removed all the ICOM_THIS_FROM stuff as getting the address of the object itself would just have been to read the 4 bytes after the VTable. As this idea was frowned upon by AJ, we went the thunks way :-)
Well, an object can have multiple interfaces and still be the same object. So basically if at each new 'QueryInterface' call you allocation a new I<Interface>Impl structure it's wrong regarding COM too...
For example, you could create a surface1, query interface it to be a surface2. Then if you changed stuff using the 'surface1' methods (like, for example, attaching a palette to it), you should be able to query the same palette back using the 'surface2' methods ... Which would not be the case using your solution as both interfaces have separate contexts.
Right, didn't know this QueryInterface thing. Of course if an application QueryInterface()s a IDirectDrawSurface for IDirectDrawSurface2, I replace the VTable and return the same pointer. This scews up the IDirectDrawSurface of course...
Well, I'll use the original way. I guess I should have studied the COM basics before :)
You were right: The crash in HL 1.1.1.0 is fixed with this :) :)
Thanks, Stefan