I have told you about the long delays I have encountered with Guild Wars 2 under wine. I mentioned that I suspected the wined3d_mutex was the source of the problem. I have just finished a test that shows that it is NOT the source of this problem.
Specifically, I dummied the wined3d_mutex_lock and wined3d_mutex_unlock procedures so that they did nothing. That version showed the same pattern of delays as the unaltered system.
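In case anyone wants to repeat the experiment, this is roughly what the change looked like. In the tree I used, the two helpers live in dlls/wined3d/wined3d_main.c and wrap a critical section; the exact file and the name of that critical section may differ between Wine versions.

void wined3d_mutex_lock(void)
{
    /* EnterCriticalSection(&wined3d_cs); */   /* disabled for the test */
}

void wined3d_mutex_unlock(void)
{
    /* LeaveCriticalSection(&wined3d_cs); */   /* disabled for the test */
}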
If anybody has an idea about what might be causing these delays, please tell me.
Note that my review of the use of this mutex indicates that it is of very questionable value. In my opinion it should be replaced by a set of read/write locks built into the individual data structures. Comments?
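To make the suggestion a little more concrete, here is a rough sketch of the kind of thing I have in mind, using a Win32 SRWLOCK embedded in a hypothetical structure; the names are made up for illustration only.

#include <windows.h>

struct wined3d_thing                  /* hypothetical structure */
{
    SRWLOCK lock;                     /* protects only this object's fields;
                                       * initialize with InitializeSRWLock() */
    unsigned int width, height;       /* example fields */
};

static unsigned int thing_get_width(struct wined3d_thing *thing)
{
    unsigned int w;

    AcquireSRWLockShared(&thing->lock);     /* many readers can hold this at once */
    w = thing->width;
    ReleaseSRWLockShared(&thing->lock);
    return w;
}

static void thing_resize(struct wined3d_thing *thing, unsigned int w, unsigned int h)
{
    AcquireSRWLockExclusive(&thing->lock);  /* a writer excludes readers and other writers */
    thing->width = w;
    thing->height = h;
    ReleaseSRWLockExclusive(&thing->lock);
}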
max
On 2014-03-15 21:17, Max TenEyck Woodbury wrote:
If anybody has an idea about what might be causing these delays, please tell me.
Sorry, no idea :-( . Maybe perf or oprofile show something.
Note that my review of the use of this mutex indicates that it is of very questionable value. In my opinion it should be replaced by a set of read/write locks built into the individual data structures.
I have to give the infamous answers "it depends" and "this needs tests". It's best to use the same lock granularity as native, everything else is asking for trouble.
My very hacky testing suggests that native does not do any locking unless D3DCREATE_MULTITHREADED is passed. Applications can still call d3d from multiple threads without that flag, but then they have to do their own locking. If they don't, the application crashes sooner rather than later.
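For reference, this is the flag in question. A minimal sketch of creating a d3d9 device that asks for the built-in locking; the presentation parameters are illustrative and error handling is omitted.

#include <windows.h>
#include <d3d9.h>

IDirect3DDevice9 *create_mt_device(HWND window)
{
    IDirect3D9 *d3d = Direct3DCreate9(D3D_SDK_VERSION);
    IDirect3DDevice9 *device = NULL;
    D3DPRESENT_PARAMETERS pp = {0};

    pp.Windowed = TRUE;
    pp.SwapEffect = D3DSWAPEFFECT_DISCARD;
    pp.hDeviceWindow = window;

    /* Without D3DCREATE_MULTITHREADED the application has to serialize
     * its own d3d calls. */
    IDirect3D9_CreateDevice(d3d, D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL, window,
            D3DCREATE_SOFTWARE_VERTEXPROCESSING | D3DCREATE_MULTITHREADED,
            &pp, &device);
    IDirect3D9_Release(d3d);
    return device;
}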
Ddraw has one global DLL lock. That can be demonstrated easily with the various enum callbacks in the API. Native does not bother to release the lock when calling the application callback.
I suspect d3d8/9 has a per device lock, but I have not done any testing to confirm this. IDirect3DResource9::SetPrivateData is your ticket to finding out. It's the only way you can make native d3d8/9 call your code (IUnknown::AddRef/Release, thanks to D3DSPD_IUNKNOWN). Maybe Microsoft was careful enough to release the lock before calling those methods, but I'd bet a beer that they weren't.
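A sketch of that trick, in case someone wants to poke at it. The GUID and the probe object are made up; what you do inside AddRef (for example, racing another d3d call from a second thread to see whether it blocks) is up to the test.

#include <windows.h>
#include <stdio.h>
#include <d3d9.h>

static const GUID probe_guid =
        {0x12345678, 0x1234, 0x1234, {0x12,0x34,0x12,0x34,0x12,0x34,0x12,0x34}};

struct probe
{
    IUnknown IUnknown_iface;
    LONG refcount;
};

static HRESULT WINAPI probe_QueryInterface(IUnknown *iface, REFIID riid, void **out)
{
    *out = iface;
    IUnknown_AddRef(iface);
    return S_OK;
}

static ULONG WINAPI probe_AddRef(IUnknown *iface)
{
    struct probe *p = CONTAINING_RECORD(iface, struct probe, IUnknown_iface);

    /* Native calls this from inside SetPrivateData / FreePrivateData.
     * This is the place to probe whether a device lock is held. */
    printf("AddRef called by d3d9\n");
    return InterlockedIncrement(&p->refcount);
}

static ULONG WINAPI probe_Release(IUnknown *iface)
{
    struct probe *p = CONTAINING_RECORD(iface, struct probe, IUnknown_iface);

    printf("Release called by d3d9\n");
    return InterlockedDecrement(&p->refcount);
}

static IUnknownVtbl probe_vtbl = {probe_QueryInterface, probe_AddRef, probe_Release};
static struct probe probe_object = {{&probe_vtbl}, 1};

void attach_probe(IDirect3DResource9 *resource)
{
    /* With D3DSPD_IUNKNOWN, pData is the interface pointer itself and the
     * size must be sizeof(IUnknown *). */
    IDirect3DResource9_SetPrivateData(resource, &probe_guid,
            &probe_object.IUnknown_iface, sizeof(IUnknown *), D3DSPD_IUNKNOWN);
}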
The other thing to consider wrt locking is the multithreaded command stream. Some fields need synchronization. We can't use the wined3d lock for that. My general goal is to avoid sharing resource / device / whatever fields between the main and worker thread as much as possible, and when I do share one (e.g. (sub)resource.locations), to set up CS and resource idle waits in such a way that the main thread knows no command is scheduled or being executed while it operates on those fields. Right now there are plenty of bugs in my preview code (e.g. struct wined3d_buffer.flags).
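To make the "idle wait" part a bit more concrete, here is a generic sketch of the pattern, not the actual wined3d CS code: the main thread queues commands for a worker and can wait until nothing is scheduled or being executed before it touches shared fields. Names are made up, the main thread is assumed to be the only submitter, and a real implementation would block on work instead of polling.

#include <windows.h>

struct command
{
    struct command *next;
    void (*execute)(void *data);
    void *data;
};

static struct command *queue_head, *queue_tail;
static LONG pending;                    /* scheduled + currently executing */
static CRITICAL_SECTION queue_lock;
static HANDLE queue_idle;               /* manual-reset, set while pending == 0 */

void cs_submit(struct command *cmd)     /* main thread */
{
    EnterCriticalSection(&queue_lock);
    cmd->next = NULL;
    if (queue_tail) queue_tail->next = cmd;
    else queue_head = cmd;
    queue_tail = cmd;
    ++pending;
    ResetEvent(queue_idle);             /* work is now outstanding */
    LeaveCriticalSection(&queue_lock);
}

static DWORD WINAPI cs_worker(void *arg)
{
    for (;;)
    {
        struct command *cmd;

        EnterCriticalSection(&queue_lock);
        cmd = queue_head;
        if (cmd)
        {
            queue_head = cmd->next;
            if (!queue_head) queue_tail = NULL;
        }
        LeaveCriticalSection(&queue_lock);

        if (!cmd)
        {
            Sleep(1);                   /* crude; real code would block on work */
            continue;
        }

        cmd->execute(cmd->data);        /* still counted as pending while running */

        EnterCriticalSection(&queue_lock);
        if (!--pending) SetEvent(queue_idle);
        LeaveCriticalSection(&queue_lock);
    }
    return 0;
}

void cs_init(void)
{
    InitializeCriticalSection(&queue_lock);
    queue_idle = CreateEventW(NULL, TRUE, TRUE, NULL);  /* start out idle */
    CreateThread(NULL, 0, cs_worker, NULL, 0, NULL);
}

/* Main thread: once this returns, no command is scheduled or executing, so
 * fields shared with the worker can be read or written safely. */
void cs_wait_idle(void)
{
    WaitForSingleObject(queue_idle, INFINITE);
}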
On 03/17/2014 06:28 AM, Stefan Dösinger wrote:
On 2014-03-15 21:17, Max TenEyck Woodbury wrote:
If anybody has an idea about what might be causing these delays, please tell me.
Sorry, no idea :-( . Maybe perf or oprofile show something.
OK. A new technique I probably need to learn. Please tell me about it.
I was thinking that 'top' has some ability to monitor process wait states and I was going to start from there.
Note that my review of the use of this mutex indicates that it is of very questionable value. In my opinion it should be replaced by a set of read/write locks built into the individual data structures.
I have to give the infamous answers "it depends" and "this needs tests". It's best to use the same lock granularity as native, everything else is asking for trouble.
Well, at least as fine as native. Finer might be safer since there is little besides direct examination of the native execution stream (a big NONO!) that assures that all grain boundaries have been identified.
My very hacky testing suggests that native does not do any locking unless D3DCREATE_MULTITHREADED is passed. Applications can still call d3d from multiple threads without that flag, but then they have to do their own locking. If they don't, the application crashes sooner rather than later.
:-)
Ddraw has one global DLL lock. That can be demonstrated easily with the various enum callbacks in the API. Native does not bother to release the lock when calling the application callback.
OK. I believe you, but exactly what kind of lock is it, and how did you go about identifying it? The fact that 'we' hold it during the callbacks is something that bothered me. I also noted that 'we' sometimes take it out twice, which also makes me think its use is rather superficial.
I suspect d3d8/9 has a per device lock, but I have not done any testing to confirm this. IDirect3DResource9::SetPrivateData is your ticket to finding out. It's the only way you can make native d3d8/9 call your code (IUnknown::AddRef/Release, thanks to D3DSPD_IUNKNOWN). Maybe Microsoft was careful enough to release the lock before calling those methods, but I'd bet a beer that they weren't.
:-) No bet. Even good coders have a bit of trouble with that kind of thing, and while they do have some of those, they also have some of the other kind.
The other thing to consider wrt locking is the multithreaded command stream. Some fields need synchronization. We can't use the wined3d lock for that. My general goal is to avoid sharing resource / device / whatever fields between the main and worker thread as much as possible, and when I do share one (e.g. (sub)resource.locations), to set up CS and resource idle waits in such a way that the main thread knows no command is scheduled or being executed while it operates on those fields. Right now there are plenty of bugs in my preview code (e.g. struct wined3d_buffer.flags).
I'm not sure I'm following this properly, particularly the part about the main thread knowing about commands scheduled on other threads...
max
On 2014-03-17 15:30, Max TenEyck Woodbury wrote:
Well, at least as fine as native. Finer might be safer since there is little besides direct examination of the native execution stream (a big NONO!) that assures that all grain boundaries have been identified.
An application may (inadvertently rather than knowingly) expect d3d to block the execution of one of its threads, and it may corrupt its internal data if we don't. As long as we don't know of any such case we don't have to copy every stupid behavior native has, but there's also no point in being extra clever about things when applications don't expect d3d to do that.
OK. I believe you, but exactly what kind of lock is it, and how did you go about identifying it? The fact that 'we' hold it during the callbacks is something that bothered me. I also noted that 'we' sometimes take it out twice, which also makes me think its use is rather superficial.
Something like this:
event e1 = cleared, e2 = cleared; IDirectDraw ddraw1, ddraw2;
Thread 1: { Start thread 2; Wait for event e1; ddraw1->AddRef(); Signal event e2; return; }
Thread 2: { ddraw2->EnumSurfaces (callback); callback: { Signal event e1; Wait for event e2 - expect a timeout; } }
The AddRef call in thread 1 does not finish until the EnumSurfaces callback finishes. Since the callback waits for e2, this pseudocode deadlocks. The deadlock is detected by a timed-out wait with a reasonably large (e.g. 500ms) wait time.
This deadlock even happens with two totally unrelated ddraw objects, and functions that shouldn't need a lock like AddRef. (AddRef can just be implemented using interlocked increment ops like in Wine.)
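Fleshed out a little, as a sketch only: I am using EnumDisplayModes as the callback source here because it does not need any surfaces to exist, but any of the ddraw enumeration callbacks should do; the 500ms timeout and the names are arbitrary. Build with a mingw or winegcc toolchain, linking against ddraw and dxguid (or defining INITGUID before the includes).

#include <windows.h>
#include <stdio.h>
#include <ddraw.h>

static IDirectDraw7 *ddraw1, *ddraw2;
static HANDLE e1, e2;

static HRESULT WINAPI mode_callback(DDSURFACEDESC2 *desc, void *ctx)
{
    DWORD ret;

    SetEvent(e1);                           /* let thread 1 run */
    ret = WaitForSingleObject(e2, 500);     /* WAIT_TIMEOUT here means the
                                             * AddRef in thread 1 is blocked
                                             * on a lock held during this
                                             * callback */
    printf("wait for e2 returned %#lx\n", ret);
    return DDENUMRET_CANCEL;                /* one callback is enough */
}

static DWORD WINAPI thread2_proc(void *arg)
{
    IDirectDraw7_EnumDisplayModes(ddraw2, 0, NULL, NULL, mode_callback);
    return 0;
}

int main(void)
{
    HANDLE thread;

    DirectDrawCreateEx(NULL, (void **)&ddraw1, &IID_IDirectDraw7, NULL);
    DirectDrawCreateEx(NULL, (void **)&ddraw2, &IID_IDirectDraw7, NULL);
    e1 = CreateEventW(NULL, FALSE, FALSE, NULL);
    e2 = CreateEventW(NULL, FALSE, FALSE, NULL);

    thread = CreateThread(NULL, 0, thread2_proc, NULL, 0, NULL);  /* thread 2 */

    WaitForSingleObject(e1, INFINITE);      /* wait until we are inside the callback */
    IDirectDraw7_AddRef(ddraw1);            /* blocks if a global lock is held */
    SetEvent(e2);

    WaitForSingleObject(thread, INFINITE);
    return 0;
}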
Alexandre said there was no point in having such a test in the tree unless we find an application that depends on the coarse locking behavior. You can search wine-patches for the test; I think I submitted it sometime in late 2006 or 2007.
The same can be done in d3d9, but the AddRef and Release methods of an IUnknown * passed to SetPrivateData are the only case I know of where those libraries call application code. I expect d3d9 to be better wrt concurrency than ddraw. But it needs tests, not suspicions.
On 03/17/2014 10:50 AM, Stefan Dösinger wrote:
On 2014-03-17 15:30, Max TenEyck Woodbury wrote:
Well, at least as fine as native. Finer might be safer since there is little besides direct examination of the native execution stream (a big NONO!) that assures that all grain boundaries have been identified.
An application may (inadvertently rather than knowingly) expect d3d to block the execution of one of its threads, and it may corrupt its internal data if we don't. As long as we don't know of any such case we don't have to copy every stupid behavior native has, but there's also no point in being extra clever about things when applications don't expect d3d to do that.
Yes, but it still might be 'better to err on the side of caution'...
OK. I believe you, but exactly what kind of lock is it, and how did you go about identifying it? The fact that 'we' hold it during the callbacks is something that bothered me. I also noted that 'we' sometimes take it out twice, which also makes me think its use is rather superficial.
Something like this:
event e1 = cleared, e2 = cleared; IDirectDraw ddraw1, ddraw2;
Thread 1: { Start thread 2; Wait for event e1; ddraw1->AddRef(); Signal event e2; return; }
Thread 2: { ddraw2->EnumSurfaces (callback); callback: { Signal event e1; Wait for event e2 - expect a timeout; } }
The AddRef call in thread 1 does not finish until the EnumSurfaces callback finishes. Since the callback waits for e2, this pseudocode deadlocks. The deadlock is detected by a timed-out wait with a reasonably large (e.g. 500ms) wait time.
This deadlock even happens with two totally unrelated ddraw objects, and functions that shouldn't need a lock like AddRef. (AddRef can just be implemented using interlocked increment ops like in Wine.)
Alexandre said there was no point in having such a test in the tree unless we find an application that depends on the coarse locking behavior. You can search wine-patches for the test; I think I submitted it sometime in late 2006 or 2007.
The same can be done in d3d9, but the AddRef and Release methods of an IUnknown * passed to SetPrivateData are the only case I know of where those libraries call application code. I expect d3d9 to be better wrt concurrency than ddraw. But it needs tests, not suspicions.
OK. That shows that there is such a lock, but it leaves open the question of exactly what kind of lock it is (see the Win32 sketch after the list). If I remember correctly, there are three kinds:
- Critical sections, which can be entered multiple times by the same thread without blocking but delay other threads; this is what Wine uses.
- Mutex locks, which (at least in some implementations, e.g. default POSIX mutexes) block if the same thread tries to take one out a second time.
- Read/write locks, which can be taken out many times by many threads for reading, but allow only one writer, which blocks further readers and waits for existing readers to finish.
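In Win32 terms the three look roughly like this; the names are illustrative only. Note that a Win32 kernel mutex, unlike a default POSIX mutex, can actually be re-acquired by the thread that already owns it.

#include <windows.h>

static CRITICAL_SECTION cs;     /* recursive: the owning thread may re-enter */
static HANDLE mutex;            /* kernel object; on Win32 also recursive for its owner */
static SRWLOCK rwlock;          /* slim reader/writer lock; not recursive */

static void locks_init(void)
{
    InitializeCriticalSection(&cs);
    mutex = CreateMutexW(NULL, FALSE, NULL);
    InitializeSRWLock(&rwlock);
}

static void use_critical_section(void)
{
    EnterCriticalSection(&cs);
    EnterCriticalSection(&cs);          /* fine: recursive for the owner */
    LeaveCriticalSection(&cs);
    LeaveCriticalSection(&cs);
}

static void use_mutex(void)
{
    WaitForSingleObject(mutex, INFINITE);
    /* ... */
    ReleaseMutex(mutex);
}

static void use_rw_lock(void)
{
    AcquireSRWLockShared(&rwlock);      /* many readers may hold this at once */
    ReleaseSRWLockShared(&rwlock);

    AcquireSRWLockExclusive(&rwlock);   /* one writer, excludes readers too */
    ReleaseSRWLockExclusive(&rwlock);
}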
max
On 17.03.2014 at 17:20, Max Woodbury <mtewoodbury@gmail.com> wrote:
Yes, but it still might be 'better to err on the side of caution'…
Except that "caution" usually means coarse locking. It’s much easier to screw up fine-grained locking.
On 03/17/2014 03:00 PM, Stefan Dösinger wrote:
On 17.03.2014 at 17:20, Max Woodbury <mtewoodbury@gmail.com> wrote:
Yes, but it still might be 'better to err on the side of caution'…
Except that "caution" usually means coarse locking. It’s much easier to screw up fine-grained locking.
I was thinking otherwise, but I see your point.
max