Hello all,
As I mentioned in a recent post, I have almost completed the GLX->WGL conversion. Last night I tracked down my last bug that was causing most of my demo apps to fail.
It seems that the problem was the conversion from glXGetProcAddress to using wglGetProcAddress. wined3d uses glXGetProcAddress to get the OpenGL extension function pointers, which is what wglGetProcAddress also does. However wglGetProcAddress _first_ checks opengl32.dll for the extension and returns the thunk function pointer if it exists, and only then falls back to libGL.so by calling glXGetProcAddress.
So now I am stuck... if I use wglGetProcAddress for OpenGL extensions I get crashes in most D3D9 applications. If I use glXGetProcAddress in wined3d everything works fine, but then wined3d is still dependent on glx.
So my questions:
1) should the thunks returned from wglGetProcAddress be causing crashes at all? Note that they don't crash right away, but "eventually" usually during a call to glDrawArrays it seems (after a call to glSecondaryColor3fEXT).
2) what is the reason for wglGetProcAddress to check opengl32.dll before libGL.so? Would the reverse logic still be a reasonable solution?
Regards, Aric
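The lookup order described in this message can be sketched in C roughly as follows. This is a minimal illustration under stated assumptions, not the code in dlls/opengl32: proc_t, ext_entry, lookup_proc and the demo_* helpers are hypothetical names, and the native lookup is passed in as a callback (standing in for glXGetProcAddress) so the sketch builds without any GL headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Generic function pointer type (hypothetical stand-in for PROC). */
typedef void (*proc_t)(void);

/* Hypothetical registry entry: extension name -> thunk inside opengl32. */
struct ext_entry {
    const char *name;
    proc_t thunk;
};

/* Hypothetical callback standing in for glXGetProcAddress. */
typedef proc_t (*native_lookup_t)(const char *name);

/* Check the thunk table first; only ask the native library on a miss. */
static proc_t lookup_proc(const struct ext_entry *table, size_t count,
                          native_lookup_t native, const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0)
            return table[i].thunk;      /* registered: return the thunk */
    }
    return native ? native(name) : NULL; /* unregistered: native fallback */
}

/* Demo data so the lookup can be exercised without a GL library. */
static void demo_thunk(void) {}
static void demo_native(void) {}
static proc_t demo_native_lookup(const char *name)
{
    (void)name;
    return demo_native;
}
static const struct ext_entry demo_table[] = {
    { "glSecondaryColor3fEXT", demo_thunk },
};
```

With this table, a lookup of "glSecondaryColor3fEXT" yields the thunk, while any other name is handed to the native callback.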
On Wednesday 14 December 2005 04:53, Aric Cyr wrote:
Hello all,
Hi,
So my questions:
- should the thunks returned from wglGetProcAddress be causing crashes at all? Note that they don't crash right away, but "eventually" usually during a call to glDrawArrays it seems (after a call to glSecondaryColor3fEXT).
The real problem here is "why does it crash?" It shouldn't, as the wine gl* calls are only "decorator" calls. Maybe we have a declaration error...
Can you provide more information: a WINEDEBUG=+opengl log plus a winedbg crash log?
- what is the reason for wglGetProcAddress to check opengl32.dll before libGL.so? Would the reverse logic still be a reasonable solution?
No
Regards, Aric
Regards, Raphael
Raphael <fenix <at> club-internet.fr> writes:
On Wednesday 14 December 2005 04:53, Aric Cyr wrote:
- should the thunks returned from wglGetProcAddress be causing crashes at all? Note that they don't crash right away, but "eventually" usually during a call to glDrawArrays it seems (after a call to glSecondaryColor3fEXT).
The real problem here is "why does it crash?" It shouldn't, as the wine gl* calls are only "decorator" calls. Maybe we have a declaration error...
Yes that was what I was thinking as well. I haven't looked into it very much yet, but I'll double check the definitions, although since opengl_ext.c is autogenerated I can't imagine why there would be a problem (unless include/wine/wined3d_gl.h has the wrong definitions... I'll have to check that out)
Also I'm not sure if calling conventions matter, or what is the correct way to define a function. I'm referring to the WINAPI vs APIENTRY (vs stdcall?) function pointer prefixes. Any pointers (no pun intended) in this area would be much appreciated.
Can you provide more information: a WINEDEBUG=+opengl log plus a winedbg crash log?
Will do when I get home tonight, but there wasn't any useful info in there. I went through it many times. Also wine doesn't cleanly crash (if there is such a thing), instead it hangs on the backtrace and all I see is the first bt entry like:
<debug stuff> 1) memcpy .... <hang>
I can kill it with a ctrl-c just fine though.
- what is the reason for wglGetProcAddress to check opengl32.dll before libGL.so? Would the reverse logic still be a reasonable solution?
No
Ya kinda didn't think so :)
Thanks, Aric
Aric Cyr <Aric.Cyr <at> gmail.com> writes:
I debugged it more last night and I found the offending call. If I comment out just that one (probably unneeded from what I can tell) function everything works great.
The code is in dlls/wined3d/drawprim.c (about halfway down):
    if (GL_SUPPORT(EXT_SECONDARY_COLOR)) {
        glDisableClientState(GL_SECONDARY_COLOR_ARRAY_EXT);
        checkGLcall("glDisableClientState(GL_SECONDARY_COLOR_ARRAY_EXT)");
        GL_EXTCALL(glSecondaryColor3fEXT)(0, 0, 0);
        checkGLcall("glSecondaryColor3fEXT(0, 0, 0)");
    } else {
        /* Missing specular color is not critical, no warnings */
        VTRACE(("Specular colour is not supported in this GL implementation\n"));
    }
If I comment out the GL_EXTCALL(glSecondaryColor3fEXT)(0,0,0); line everything works perfectly (well, the four dx9 demos I have been using). I tried debugging it (put a breakpoint in wine_glSecondaryColor3fEXT) but the function seems to work fine. It is only after this function is called, followed shortly by glDrawArrays() that the actual segfault occurs (inside glDrawArrays there is a NULL dereference it seems). I would blame ATI's drivers but as I said using glXGetProcAddress instead of wglGetProcAddress works without commenting this line.
The only other thing I can think of is that the calling conventions are messed up somehow, either due to mismatched function declarations (which I thoroughly checked, but still could have missed something...) or some weird wine thing I don't understand.
Also other GL_EXTCALL functions were not affected. For example, glPointParameter...EXT() calls are frequently made and don't seem to cause any problems at all.
Any help or tips would be much appreciated, as I am running out of things to try and debug.
- Aric
Aric Cyr <Aric.Cyr <at> gmail.com> writes:
I debugged it more last night and I found the offending call. If I comment out just that one (probably unneeded from what I can tell) function everything works great.
Replying to myself again...
Okay, so I think this whole problem might turn out to be a compiler bug. I reverted all the changes I made to drawprim.c to track down this bug, then I tried something different. Since I was crashing in glDrawArrays() I was thinking that the vertex array or normal array was not properly set for some reason. Out of the 4 demos I test with, one of them is unaffected by my bug, and running with WINEDEBUG=d3d_draw I can see that the glDrawArrays path is not taken. So that narrowed the problem down to the glDrawArrays code path for me. In checking the glVertexPointer function call I changed the VTRACE message from
    /* Note dwType == float3 or float4 == 2 or 3 */
    VTRACE(("glVertexPointer(%ld, GL_FLOAT, %ld, %p)\n",
            sd->u.s.position.dwStride,
            sd->u.s.position.dwType + 1,
            sd->u.s.position.lpData));
to
    /* Note dwType == float3 or float4 == 2 or 3 */
    VTRACE(("glVertexPointer(%u, %u, %ld, %p)\n",
            WINED3D_ATR_SIZE(position),
            WINED3D_ATR_GLTYPE(position),
            sd->u.s.position.dwStride,
            sd->u.s.position.lpData));
After making this change everything started working... this is silly though since VTRACE is just a macro around TRACE(). So either there is stack corruption somewhere or the compiler is wonky. Since it is gcc-4.0.2 I'm guessing the compiler is buggy (Ubuntu breezy's standard gcc is 4.0.2 now). I'll try recompiling with gcc-3.3 later and see if the original code is working.
Still interested in any comments though :)
On Fri, 2005-12-16 at 02:08 +0000, Aric Cyr wrote:
Aric Cyr <Aric.Cyr <at> gmail.com> writes:
I debugged it more last night and I found the offending call. If I comment out just that one (probably unneeded from what I can tell) function everything works great.
Replying to myself again...
After making this change everything started working... this is silly though since VTRACE is just a macro around TRACE(). So either there is stack corruption somewhere or the compiler is wonky. Since it is gcc-4.0.2 I'm guessing the compiler is buggy (Ubuntu breezy's standard gcc is 4.0.2 now). I'll try recompiling with gcc-3.3 later and see if the original code is working.
Still interested in any comments though :)
Hi Aric, If you see stack corruption like this, you might want to try compiling with optimization turned off. Put the -O0 (a capital letter O followed by a zero) flag in your CFLAGS when you run configure. I had a similar situation where gcc was using fuzzy math when working with structures, and turning off optimization helped.
James
James Liggett <jrliggett <at> cox.net> writes:
Hi Aric, If you see stack corruption like this, you might want to try compiling with optimization turned off. Put the -O0 (a capital letter O followed by a zero) flag in your CFLAGS when you run configure. I had a similar situation where gcc was using fuzzy math when working with structures, and turning off optimization helped.
Thanks for the suggestion, but I already have wine compiled with only "-g". Last time I checked gcc doesn't enable optimizations unless a -O option is specified explicitly (don't know if this has changed though).
Anyways, I think I finally figured this all out. As I thought it turned out to be stack corruption, due to differing calling conventions. It just required fixing up my function pointers to be WINAPI (which I believe becomes stdcall) for all the wgl functions.
Now everything works fine, except this brings up another issue with wglGetProcAddress. The problem is that all GL extension function pointers are declared WINAPI, and indeed this is the type of function wglGetProcAddress is expected to return. However, for extensions that are not registered in opengl_ext.c, wglGetProcAddress falls back to glXGetProcAddressARB, which would return a non-WINAPI function pointer. After the first call to such a function we would have corrupted the stack. There doesn't seem to be a nice way to fix this that I can think of. wglGetProcAddress should never return a non-WINAPI function though, as that is just asking for trouble. Better to return NULL instead of falling back to glXGetProcAddressARB.
As an aside to the wglGetProcAddress issue, all GL extensions called by wined3d are now passed through the thunks. It is only one extra function call so the impact should be minimal (too bad we can't inline these!), but I thought I should mention it anyways.
After cleaning up everything I've worked on I think I'll start submitting patches. First some pre-requisite patches for dlls/opengl32 and include/ and then the changes for dlls/wined3d. All together it is quite a large change, but I should be able to break it up into manageable pieces (hopefully!).
Regards, Aric
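The "return NULL instead of falling back" policy proposed in this message could look roughly like the sketch below. It is only an illustration under stated assumptions: proc_t, ext_entry, lookup_registered_only, ext_supported and the demo_* names are hypothetical, and the table stands in for whatever opengl_ext.c registers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Generic function pointer type (hypothetical stand-in for PROC). */
typedef void (*proc_t)(void);

/* Hypothetical registry entry: extension name -> WINAPI thunk. */
struct ext_entry {
    const char *name;
    proc_t thunk;
};

/* Strict lookup: a miss returns NULL, with no native fallback, so the
 * caller can never receive a pointer with the wrong calling convention. */
static proc_t lookup_registered_only(const struct ext_entry *table,
                                     size_t count, const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0)
            return table[i].thunk;
    }
    return NULL;
}

/* Caller side: a NULL result simply means "extension not supported". */
static int ext_supported(const struct ext_entry *table, size_t count,
                         const char *name)
{
    return lookup_registered_only(table, count, name) != NULL;
}

/* Demo data for exercising the lookup. */
static void demo_thunk(void) {}
static const struct ext_entry demo_table[] = {
    { "glSecondaryColor3fEXT", demo_thunk },
};
```

The trade-off is that an extension the native library supports but the table does not know about becomes unavailable, rather than being returned with an unchecked convention.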
Aric Cyr wrote:
Thanks for the suggestion, but I already have wine compiled with only "-g". Last time I checked gcc doesn't enable optimizations unless a -O option is specified explicitly (don't know if this has changed though).
Wine uses -O2 by default even if you don't tell it to optimize. -Os will produce buggy code, and with -mfpmath=sse,387 (yes, only when you ask for BOTH sse and 387) problems will occur (big ones at that).
On 12/17/05, Segin segin2005@gmail.com wrote:
Wine uses -O2 by default even if you don't tell it to optimize. -Os will produce buggy code, and with -mfpmath=sse,387 (yes, only when you ask for BOTH sse and 387) problems will occur (big ones at that).
If you run configure with no other options "-g -O2" is default, however I run it as "CFLAGS=-g CXXFLAGS=-g ./configure"... so I'm positive (verified by make output) that there is no -O2 or any other options besides -g anywhere. Anyways it turned out not to be a compiler bug, so this is not too important now, but thanks for the suggestions.
- Aric
-- Aric Cyr <Aric.Cyr at gmail dot com> (http://acyr.net)
On Sat, 2005-12-17 at 07:07 +0000, Aric Cyr wrote:
Thanks for the suggestion, but I already have wine compiled with only "-g". Last time I checked gcc doesn't enable optimizations unless a -O option is specified explicitly (don't know if this has changed though).
Anyways, I think I finally figured this all out. As I thought it turned out to be stack corruption, due to differing calling conventions. It just required fixing up my function pointers to be WINAPI (which I believe becomes stdcall) for all the wgl functions.
I think you're right, IIRC. I think all win32 calls are stdcall (or sometimes pascal, which is more or less the same thing I think, where the function cleans up the stack and not the caller, as opposed to cdecl, where the opposite happens). Calling conventions are a lot of fun, aren't they? :)
Now everything works fine, except this brings up another issue with wglGetProcAddress. The problem is that all GL extension function pointers are declared WINAPI, and indeed this is the type of function wglGetProcAddress is expected to return. However, for extensions that are not registered in opengl_ext.c, wglGetProcAddress falls back to glXGetProcAddressARB, which would return a non-WINAPI function pointer. After the first call to such a function we would have corrupted the stack. There doesn't seem to be a nice way to fix this that I can think of. wglGetProcAddress should never return a non-WINAPI function though, as that is just asking for trouble. Better to return NULL instead of falling back to glXGetProcAddressARB.
I agree here as well. The whole idea here is to avoid glx and use the WGL layer to abstract it, right? It would be a really bad idea to trust glXGetProcAddressARB because you can't really know for sure if the calling convention is right, and that issue becomes a problem with any shared library on any platform, precisely because of the problems you were experiencing with the stack.
As an aside to the wglGetProcAddress issue, all GL extensions called by wined3d are now passed through the thunks. It is only one extra function call so the impact should be minimal (too bad we can't inline these!), but I thought I should mention it anyways.
Good work! ;-)
After cleaning up everything I've worked on I think I'll start submitting patches. First some pre-requisite patches for dlls/opengl32 and include/ and then the changes for dlls/wined3d. All together it is quite a large change, but I should be able to break it up into manageable pieces (hopefully!).
Great! I'd really like to see this kind of thing come to fruition. I think it would be especially good to see wined3d run on windows for testing purposes.
Regards, Aric
Aric Cyr wrote:
Now everything works fine, except this brings up another issue with wglGetProcAddress. The problem is that all GL extension function pointers are declared WINAPI, and indeed this is the type of function wglGetProcAddress is expected to return. However, for extensions that are not registered in opengl_ext.c, wglGetProcAddress falls back to glXGetProcAddressARB, which would return a non-WINAPI function pointer. After the first call to such a function we would have corrupted the stack. There doesn't seem to be a nice way to fix this that I can think of. wglGetProcAddress should never return a non-WINAPI function though, as that is just asking for trouble. Better to return NULL instead of falling back to glXGetProcAddressARB.
Maybe it'd be possible to make a wrapper function -- a WINAPI function pointer that just does a call to this glXGetProcAddressARB ? (Just me thinking aloud ;-) )
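A wrapper of the kind suggested here is essentially what the existing thunks are: a function with the Win32 convention that forwards to the native pointer. The sketch below shows one such thunk for a single, known signature. It is an assumption-laden illustration, not Wine's generated code: DEMO_WINAPI, native_color3f, thunk_color3f and the stub are hypothetical names, and on anything other than 32-bit x86 with a GCC-compatible compiler the macro expands to nothing, so the thunk is a plain forwarder there. Note that a stdcall callee pops its own arguments, so a wrapper has to know the signature; a single generic wrapper for unregistered extensions cannot be written this way.

```c
#include <assert.h>
#include <stddef.h>

/* WINAPI is stdcall on 32-bit x86; elsewhere this expands to nothing so
 * the sketch still builds (hypothetical macro, not Wine's definition). */
#if defined(__i386__) && defined(__GNUC__)
#define DEMO_WINAPI __attribute__((__stdcall__))
#else
#define DEMO_WINAPI
#endif

/* Native pointer type, default (cdecl) convention, as a
 * glXGetProcAddress-style lookup would hand it back. */
typedef void (*native_color3f_t)(float r, float g, float b);

/* Pointer type the Win32 side expects to receive. */
typedef void (DEMO_WINAPI *win_color3f_t)(float r, float g, float b);

/* Filled in at load time with the native entry point. */
static native_color3f_t native_color3f;

/* The thunk: entered with the Win32 convention, forwards to the native
 * function with the native convention. */
static void DEMO_WINAPI thunk_color3f(float r, float g, float b)
{
    assert(native_color3f != NULL);
    native_color3f(r, g, b);
}

/* Stub native implementation that records its arguments. */
static float last_color[3];
static void stub_native_color3f(float r, float g, float b)
{
    last_color[0] = r;
    last_color[1] = g;
    last_color[2] = b;
}
```

A caller would store thunk_color3f in a win_color3f_t variable and call through it; the compiler then generates the matching call sequence on both sides.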
On Friday 16 December 2005 23:07, Aric Cyr wrote:
Last time I checked gcc doesn't enable optimizations unless a -O option is specified explicitly (don't know if this has changed though).
I realize it is a moot point wrt the topic, but as you may know, there's an easy way to check. Running
    sh$ echo 'main () {}' > bloated.c
    sh$ gcc -v -Q $CFLAGS bloated.c
will, amongst other things, dump exactly what optimizations (in terms of atomic gcc command-line optimizations) are used given the current setup and CFLAGS environment variable.
My 3.4.4 does indeed do exactly the same thing with "-g" and "-g -O0".
On Wed, Dec 14, 2005 at 03:53:12AM +0000, Aric Cyr wrote:
So now I am stuck... if I use wglGetProcAddress for OpenGL extensions I get crashes in most D3D9 applications. If I use glXGetProcAddress in wined3d everything works fine, but then wined3d is still dependent on glx.
The answer is easy (I did not read the complete thread in detail to know if you found the solution or not): basically, 'wglGetProcAddress' returns functions as expected by Win32 applications, so using the 'stdcall' calling convention, whereas 'glXGetProcAddress' returns them in the standard Unix calling convention 'cdecl'.
So you basically have the same problem with GL extensions that you had with direct linking to OpenGL32.DLL instead of to libGL.so => all the calls going through function pointers that you retrieved via 'wglGetProcAddress' will go through thunks to change the calling convention (at the price of a slight performance hit).
Moreover, you will have a nice 'header' head-ache as you won't be able to rely on the Linux distribution's version of 'glext.h' but on a version compatible with Windows that adds the proper 'STDCALL' types to the function pointer prototypes.
Lionel
On 12/17/05, Lionel Ulmer lionel.ulmer@free.fr wrote:
On Wed, Dec 14, 2005 at 03:53:12AM +0000, Aric Cyr wrote:
So now I am stuck... if I use wglGetProcAddress for OpenGL extensions I get crashes in most D3D9 applications. If I use glXGetProcAddress in wined3d everything works fine, but then wined3d is still dependent on glx.
The answer is easy (I did not read the complete thread in detail to know if you found the solution or not): basically, 'wglGetProcAddress' returns functions as expected by Win32 applications, so using the 'stdcall' calling convention, whereas 'glXGetProcAddress' returns them in the standard Unix calling convention 'cdecl'.
So you basically have the same problem with GL extensions that you had with direct linking to OpenGL32.DLL instead of to libGL.so => all the calls going through function pointers that you retrieved via 'wglGetProcAddress' will go through thunks to change the calling convention (at the price of a slight performance hit).
That's exactly right. That's the conclusion I came to as well. It's all buried in my previous reply somewhere :)
Moreover, you will have a nice 'header' head-ache as you won't be able to rely on the Linux distribution's version of 'glext.h' but on a version compatible with Windows that adds the proper 'STDCALL' types to the function pointer prototypes.
Actually the header 'ache' doesn't hurt so much. The standard (official) glext.h from SGI defines APIENTRY, which we can just define to WINAPI (or __stdcall) before including glext.h. By default, on Linux anyways, APIENTRY is defined as nothing so there shouldn't be any problems. I have this all working already, and managed to get rid of all those horrible copy&paste #defines in wined3d_gl.h and a few other places as well. All-in-all things look a lot cleaner both in dlls/opengl32 and dlls/wined3d. The only catch is to watch out for what should be WINAPI and what should not, but it seems I got it all under control so far. Looking forward to your comments once I start sending the patches in.
Regards, Aric
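The APIENTRY approach described in this message can be sketched as follows. To keep it buildable without GL headers, the real header is not included; the few lines marked as a stand-in mimic the guard and typedef style that glext.h uses. DEMO_WINAPI, demo_secondary_color, seen_red and call_checked are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for WINAPI: stdcall on 32-bit x86, nothing elsewhere
 * (hypothetical macro so the sketch builds on any platform). */
#if defined(__i386__) && defined(__GNUC__)
#define DEMO_WINAPI __attribute__((__stdcall__))
#else
#define DEMO_WINAPI
#endif

/* Step 1: define APIENTRY before the extension header is seen. */
#define APIENTRY DEMO_WINAPI

/* Step 2: stand-in for the relevant part of glext.h. In real code this
 * would be an #include of the header instead of the lines below. */
#ifndef APIENTRY
#define APIENTRY            /* header default: no calling convention */
#endif
#ifndef APIENTRYP
#define APIENTRYP APIENTRY *
#endif
typedef float GLfloat;
typedef void (APIENTRYP PFNGLSECONDARYCOLOR3FEXTPROC)(GLfloat red,
                                                      GLfloat green,
                                                      GLfloat blue);

/* A function declared with the same convention can be assigned to the
 * header's pointer type without a cast. */
static GLfloat seen_red;
static void APIENTRY demo_secondary_color(GLfloat r, GLfloat g, GLfloat b)
{
    seen_red = r;
    (void)g;
    (void)b;
}

/* Call through the typedef'd pointer after a NULL check. */
static void call_checked(PFNGLSECONDARYCOLOR3FEXTPROC fp, GLfloat r)
{
    assert(fp != NULL);
    fp(r, 0.0f, 0.0f);
}
```

Because the header only defines APIENTRY when it is not already defined, the earlier definition wins and every function pointer typedef picks up the Win32 convention.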