I have told you about the long delays I have encountered with Guild Wars 2 under wine. I mentioned that I suspected the wined3d_mutex was the source of the problem. I have just finished a test that shows that it is NOT the source of this problem.
Specifically, I dummied the wined3d_mutex_lock and wined3d_mutex_unlock procedures so that they did nothing. That version showed the same pattern of delays as the unaltered system.
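In case anyone wants to repeat the experiment, this is roughly what the change looked like. In the tree I used, the two helpers live in dlls/wined3d/wined3d_main.c and wrap a critical section; the exact file and the name of that critical section may differ between Wine versions.

void wined3d_mutex_lock(void)
{
    /* EnterCriticalSection(&wined3d_cs); */   /* disabled for the test */
}

void wined3d_mutex_unlock(void)
{
    /* LeaveCriticalSection(&wined3d_cs); */   /* disabled for the test */
}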
If anybody has an idea about what might be causing these delays, please tell me.
Note that my review of the use of this mutex indicates that it is of very questionable value. In my opinion it should be replaced by a set of read/write locks built into the individual data structures. Comments?
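To make the suggestion a little more concrete, here is a rough sketch of the kind of thing I have in mind, using a Win32 SRWLOCK embedded in a hypothetical structure; the names are made up for illustration only.

#include <windows.h>

struct wined3d_thing                  /* hypothetical structure */
{
    SRWLOCK lock;                     /* protects only this object's fields;
                                       * initialize with InitializeSRWLock() */
    unsigned int width, height;       /* example fields */
};

static unsigned int thing_get_width(struct wined3d_thing *thing)
{
    unsigned int w;

    AcquireSRWLockShared(&thing->lock);     /* many readers can hold this at once */
    w = thing->width;
    ReleaseSRWLockShared(&thing->lock);
    return w;
}

static void thing_resize(struct wined3d_thing *thing, unsigned int w, unsigned int h)
{
    AcquireSRWLockExclusive(&thing->lock);  /* a writer excludes readers and other writers */
    thing->width = w;
    thing->height = h;
    ReleaseSRWLockExclusive(&thing->lock);
}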
max
On 2014-03-15 21:17, Max TenEyck Woodbury wrote:
If anybody has an idea about what might be causing these delays, please tell me.
Sorry, no idea :-( . Maybe perf or oprofile show something.
Note that my review of the use of this mutex indicates that it is of very questionable value. In my opinion it should be replaced by a set of read/write locks built into the individual data structures.
I have to give the infamous answers "it depends" and "this needs tests". It's best to use the same lock granularity as native, everything else is asking for trouble.
My very hacky testing suggests that native does not do any locking unless D3DCREATE_MULTITHREADED is passed. Applications can still call d3d from multiple threads without that flag, but then they have to do their own locking. If they don't, the application crashes sooner rather than later.
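For reference, this is the flag in question. A minimal sketch of creating a d3d9 device that asks for the built-in locking; the presentation parameters are illustrative and error handling is omitted.

#include <windows.h>
#include <d3d9.h>

IDirect3DDevice9 *create_mt_device(HWND window)
{
    IDirect3D9 *d3d = Direct3DCreate9(D3D_SDK_VERSION);
    IDirect3DDevice9 *device = NULL;
    D3DPRESENT_PARAMETERS pp = {0};

    pp.Windowed = TRUE;
    pp.SwapEffect = D3DSWAPEFFECT_DISCARD;
    pp.hDeviceWindow = window;

    /* Without D3DCREATE_MULTITHREADED the application has to serialize
     * its own d3d calls. */
    IDirect3D9_CreateDevice(d3d, D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL, window,
            D3DCREATE_SOFTWARE_VERTEXPROCESSING | D3DCREATE_MULTITHREADED,
            &pp, &device);
    IDirect3D9_Release(d3d);
    return device;
}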
Ddraw has one global DLL lock. That can be demonstrated easily with the various enum callbacks in the API. Native does not bother to release the lock when calling the application callback.
I suspect d3d8/9 has a per device lock, but I have not done any testing to confirm this. IDirect3DResource9::SetPrivateData is your ticket to finding out. It's the only way you can make native d3d8/9 call your code (IUnknown::AddRef/Release, thanks to D3DSPD_IUNKNOWN). Maybe Microsoft was careful enough to release the lock before calling those methods, but I'd bet a beer that they weren't.
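A sketch of that trick, in case someone wants to poke at it. The GUID and the probe object are made up; what you do inside AddRef (for example, racing another d3d call from a second thread to see whether it blocks) is up to the test.

#include <windows.h>
#include <stdio.h>
#include <d3d9.h>

static const GUID probe_guid =
        {0x12345678, 0x1234, 0x1234, {0x12,0x34,0x12,0x34,0x12,0x34,0x12,0x34}};

struct probe
{
    IUnknown IUnknown_iface;
    LONG refcount;
};

static HRESULT WINAPI probe_QueryInterface(IUnknown *iface, REFIID riid, void **out)
{
    *out = iface;
    IUnknown_AddRef(iface);
    return S_OK;
}

static ULONG WINAPI probe_AddRef(IUnknown *iface)
{
    struct probe *p = CONTAINING_RECORD(iface, struct probe, IUnknown_iface);

    /* Native calls this from inside SetPrivateData / FreePrivateData.
     * This is the place to probe whether a device lock is held. */
    printf("AddRef called by d3d9\n");
    return InterlockedIncrement(&p->refcount);
}

static ULONG WINAPI probe_Release(IUnknown *iface)
{
    struct probe *p = CONTAINING_RECORD(iface, struct probe, IUnknown_iface);

    printf("Release called by d3d9\n");
    return InterlockedDecrement(&p->refcount);
}

static IUnknownVtbl probe_vtbl = {probe_QueryInterface, probe_AddRef, probe_Release};
static struct probe probe_object = {{&probe_vtbl}, 1};

void attach_probe(IDirect3DResource9 *resource)
{
    /* With D3DSPD_IUNKNOWN, pData is the interface pointer itself and the
     * size must be sizeof(IUnknown *). */
    IDirect3DResource9_SetPrivateData(resource, &probe_guid,
            &probe_object.IUnknown_iface, sizeof(IUnknown *), D3DSPD_IUNKNOWN);
}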
The other thing to consider wrt locking is the multithreaded command stream. Some fields need synchronization. We can't use the wined3d lock for that. My general goal is to avoid sharing resource / device / whatever fields between the main and worker thread as much as possible, and when I do share one (e.g. (sub)resource.locations), to set up CS and resource idle waits in such a way that the main thread knows no command is scheduled or being executed while it operates on those fields. Right now there are plenty of bugs in my preview code (e.g. struct wined3d_buffer.flags).
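To make the "idle wait" part a bit more concrete, here is a generic sketch of the pattern, not the actual wined3d CS code: the main thread queues commands for a worker and can wait until nothing is scheduled or being executed before it touches shared fields. Names are made up, the main thread is assumed to be the only submitter, and a real implementation would block on work instead of polling.

#include <windows.h>

struct command
{
    struct command *next;
    void (*execute)(void *data);
    void *data;
};

static struct command *queue_head, *queue_tail;
static LONG pending;                    /* scheduled + currently executing */
static CRITICAL_SECTION queue_lock;
static HANDLE queue_idle;               /* manual-reset, set while pending == 0 */

void cs_submit(struct command *cmd)     /* main thread */
{
    EnterCriticalSection(&queue_lock);
    cmd->next = NULL;
    if (queue_tail) queue_tail->next = cmd;
    else queue_head = cmd;
    queue_tail = cmd;
    ++pending;
    ResetEvent(queue_idle);             /* work is now outstanding */
    LeaveCriticalSection(&queue_lock);
}

static DWORD WINAPI cs_worker(void *arg)
{
    for (;;)
    {
        struct command *cmd;

        EnterCriticalSection(&queue_lock);
        cmd = queue_head;
        if (cmd)
        {
            queue_head = cmd->next;
            if (!queue_head) queue_tail = NULL;
        }
        LeaveCriticalSection(&queue_lock);

        if (!cmd)
        {
            Sleep(1);                   /* crude; real code would block on work */
            continue;
        }

        cmd->execute(cmd->data);        /* still counted as pending while running */

        EnterCriticalSection(&queue_lock);
        if (!--pending) SetEvent(queue_idle);
        LeaveCriticalSection(&queue_lock);
    }
    return 0;
}

void cs_init(void)
{
    InitializeCriticalSection(&queue_lock);
    queue_idle = CreateEventW(NULL, TRUE, TRUE, NULL);  /* start out idle */
    CreateThread(NULL, 0, cs_worker, NULL, 0, NULL);
}

/* Main thread: once this returns, no command is scheduled or executing, so
 * fields shared with the worker can be read or written safely. */
void cs_wait_idle(void)
{
    WaitForSingleObject(queue_idle, INFINITE);
}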
On 03/17/2014 06:28 AM, Stefan Dösinger wrote:
On 2014-03-15 21:17, Max TenEyck Woodbury wrote:
If anybody has an idea about what might be causing these delays, please tell me.
Sorry, no idea :-( . Maybe perf or oprofile show something.
OK. A new technique I probably need to learn. Please tell me about it.
I was thinking that 'top' has some ability to monitor process wait states and I was going to start from there.
Note that my review of the use of this mutex indicates that it is of very questionable value. In my opinion it should be replaced by a set of read/write locks built into the individual data structures.
I have to give the infamous answers "it depends" and "this needs tests". It's best to use the same lock granularity as native, everything else is asking for trouble.
Well, at least as fine as native. Finer might be safer since there is little besides direct examination of the native execution stream (a big NONO!) that assures that all grain boundaries have been identified.
My very hacky testing suggests that native does not do any locking unless D3DCREATE_MULTITHREADED is passed. Applications can still call d3d from multiple threads without that flag, but then they have to do their own locking. If they don't, the application crashes sooner rather than later.
:-)
Ddraw has one global DLL lock. That can be demonstrated easily with the various enum callbacks in the API. Native does not bother to release the lock when calling the application callback.
OK. I believe you, but exactly what kind of lock is it, and how did you go about identifying it? The fact that 'we' hold it during the callbacks is something that bothered me. I also noted that 'we' sometimes take it out twice, which also makes me think its use is rather superficial.
I suspect d3d8/9 has a per device lock, but I have not done any testing to confirm this. IDirect3DResource9::SetPrivateData is your ticket to finding out. It's the only way you can make native d3d8/9 call your code (IUnknown::AddRef/Release, thanks to D3DSPD_IUNKNOWN). Maybe Microsoft was careful enough to release the lock before calling those methods, but I'd bet a beer that they weren't.
:-) No bet. Even good coders have a bit of trouble with that kind of thing, and while they do have some of those, they also have some of the other kind.
The other thing to consider wrt locking is the multithreaded command stream. Some fields need synchronization. We can't use the wined3d lock for that. My general goal is to avoid sharing resource / device / whatever fields between the main and worker thread as much as possible, and when I do share one (e.g. (sub)resource.locations), to set up CS and resource idle waits in such a way that the main thread knows no command is scheduled or being executed while it operates on those fields. Right now there are plenty of bugs in my preview code (e.g. struct wined3d_buffer.flags).
I'm not sure I'm following this properly, particularly the part about the main thread knowing about commands scheduled on other threads...
max
On 2014-03-17 15:30, Max TenEyck Woodbury wrote:
Well, at least as fine as native. Finer might be safer since there is little besides direct examination of the native execution stream (a big NONO!) that assures that all grain boundaries have been identified.
An application may (inadvertently rather than knowingly) expect d3d to block the execution of one of its threads, and it may corrupt its internal data if we don't. As long as we don't know of any such case we don't have to copy every stupid behavior native has, but there's also no point in being extra clever about things when applications don't expect d3d to do that.
OK. I believe you, but exactly what kind of lock is it, and how did you go about identifying it? The fact that 'we' hold it during the callbacks is something that bothered me. I also noted that 'we' sometimes take it out twice, which also makes me think its use is rather superficial.
Something like this:
event e1 = cleared, e2 = cleared; IDirectDraw ddraw1, ddraw2;
Thread 1: { Start thread 2; Wait for event e1; ddraw1->AddRef(); Signal event e2; return; }
Thread 2: { ddraw2->EnumSurfaces (callback); callback: { Signal event e1; Wait for event e2 - expect a timeout; } }
The AddRef call in thread 1 does not finish until the EnumSurfaces callback finishes. Since the callback waits for e2, this pseudocode deadlocks. The deadlock is detected by a timed-out wait with a reasonably large (e.g. 500ms) wait time.
This deadlock even happens with two totally unrelated ddraw objects, and functions that shouldn't need a lock like AddRef. (AddRef can just be implemented using interlocked increment ops like in Wine.)
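Fleshed out a little, as a sketch only: I am using EnumDisplayModes as the callback source here because it does not need any surfaces to exist, but any of the ddraw enumeration callbacks should do; the 500ms timeout and the names are arbitrary. Build with a mingw or winegcc toolchain, linking against ddraw and dxguid (or defining INITGUID before the includes).

#include <windows.h>
#include <stdio.h>
#include <ddraw.h>

static IDirectDraw7 *ddraw1, *ddraw2;
static HANDLE e1, e2;

static HRESULT WINAPI mode_callback(DDSURFACEDESC2 *desc, void *ctx)
{
    DWORD ret;

    SetEvent(e1);                           /* let thread 1 run */
    ret = WaitForSingleObject(e2, 500);     /* WAIT_TIMEOUT here means the
                                             * AddRef in thread 1 is blocked
                                             * on a lock held during this
                                             * callback */
    printf("wait for e2 returned %#lx\n", ret);
    return DDENUMRET_CANCEL;                /* one callback is enough */
}

static DWORD WINAPI thread2_proc(void *arg)
{
    IDirectDraw7_EnumDisplayModes(ddraw2, 0, NULL, NULL, mode_callback);
    return 0;
}

int main(void)
{
    HANDLE thread;

    DirectDrawCreateEx(NULL, (void **)&ddraw1, &IID_IDirectDraw7, NULL);
    DirectDrawCreateEx(NULL, (void **)&ddraw2, &IID_IDirectDraw7, NULL);
    e1 = CreateEventW(NULL, FALSE, FALSE, NULL);
    e2 = CreateEventW(NULL, FALSE, FALSE, NULL);

    thread = CreateThread(NULL, 0, thread2_proc, NULL, 0, NULL);  /* thread 2 */

    WaitForSingleObject(e1, INFINITE);      /* wait until we are inside the callback */
    IDirectDraw7_AddRef(ddraw1);            /* blocks if a global lock is held */
    SetEvent(e2);

    WaitForSingleObject(thread, INFINITE);
    return 0;
}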
Alexandre said there was no point in having such a test in the tree unless we find an application that depends on the coarse locking behavior. You can search wine-patches for the test; I think I submitted it sometime in late 2006 or 2007.
The same can be done in d3d9, but the AddRef and Release methods of an IUnknown * passed to SetPrivateData are the only case I know of where those libraries call application code. I expect d3d9 to be better wrt concurrency than ddraw. But it needs tests, not suspicions.
On 03/17/2014 10:50 AM, Stefan Dösinger wrote:
On 2014-03-17 15:30, Max TenEyck Woodbury wrote:
Well, at least as fine as native. Finer might be safer since there is little besides direct examination of the native execution stream (a big NONO!) that assures that all grain boundaries have been identified.
An application may (inadvertently rather than knowingly) expect d3d to block the execution of one of its threads, and it may corrupt its internal data if we don't. As long as we don't know of any such case we don't have to copy every stupid behavior native has, but there's also no point in being extra clever about things when applications don't expect d3d to do that.
Yes, but it still might be 'better to err on the side of caution'...
OK. I believe you, but exactly what kind of lock is it, and how did you go about identifying it? The fact that 'we' hold it during the callbacks is something that bothered me. I also noted that 'we' sometimes take it out twice, which also makes me think its use is rather superficial.
Something like this:
event e1 = cleared, e2 = cleared; IDirectDraw ddraw1, ddraw2;
Thread 1: { Start thread 2; Wait for event e1; ddraw1->AddRef(); Signal event e2; return; }
Thread 2: { ddraw2->EnumSurfaces (callback); callback: { Signal event e1; Wait for event e2 - expect a timeout; } }
The AddRef call in thread 1 does not finish until the EnumSurfaces callback finishes. Since the callback waits for e2, this pseudocode deadlocks. The deadlock is detected by a timed-out wait with a reasonably large (e.g. 500ms) wait time.
This deadlock even happens with two totally unrelated ddraw objects, and functions that shouldn't need a lock like AddRef. (AddRef can just be implemented using interlocked increment ops like in Wine.)
Alexandre said there was no point in having such a test in the tree unless we find an application that depends on the coarse locking behavior. You can search wine-patches for the test; I think I submitted it sometime in late 2006 or 2007.
The same can be done in d3d9, but the AddRef and Release methods of an IUnknown * passed to SetPrivateData are the only case I know of where those libraries call application code. I expect d3d9 to be better wrt concurrency than ddraw. But it needs tests, not suspicions.
OK. That shows that there is such a lock, but it leaves open the question of exactly what kind of lock it is (see the Win32 sketch after the list). If I remember correctly, there are three kinds:
- Critical sections, which can be entered multiple times by the same thread without blocking but delay other threads; this is what Wine uses.
- Mutex locks, which (at least in some implementations, e.g. default POSIX mutexes) block if the same thread tries to take one out a second time.
- Read/write locks, which can be taken out many times by many threads for reading, but allow only one writer, which blocks further readers and waits for existing readers to finish.
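In Win32 terms the three look roughly like this; the names are illustrative only. Note that a Win32 kernel mutex, unlike a default POSIX mutex, can actually be re-acquired by the thread that already owns it.

#include <windows.h>

static CRITICAL_SECTION cs;     /* recursive: the owning thread may re-enter */
static HANDLE mutex;            /* kernel object; on Win32 also recursive for its owner */
static SRWLOCK rwlock;          /* slim reader/writer lock; not recursive */

static void locks_init(void)
{
    InitializeCriticalSection(&cs);
    mutex = CreateMutexW(NULL, FALSE, NULL);
    InitializeSRWLock(&rwlock);
}

static void use_critical_section(void)
{
    EnterCriticalSection(&cs);
    EnterCriticalSection(&cs);          /* fine: recursive for the owner */
    LeaveCriticalSection(&cs);
    LeaveCriticalSection(&cs);
}

static void use_mutex(void)
{
    WaitForSingleObject(mutex, INFINITE);
    /* ... */
    ReleaseMutex(mutex);
}

static void use_rw_lock(void)
{
    AcquireSRWLockShared(&rwlock);      /* many readers may hold this at once */
    ReleaseSRWLockShared(&rwlock);

    AcquireSRWLockExclusive(&rwlock);   /* one writer, excludes readers too */
    ReleaseSRWLockExclusive(&rwlock);
}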
max
On 17.03.2014 at 17:20, Max Woodbury <mtewoodbury@gmail.com> wrote:
Yes, but it still might be 'better to err on the side of caution'…
Except that "caution" usually means coarse locking. It’s much easier to screw up fine-grained locking.
On 03/17/2014 03:00 PM, Stefan Dösinger wrote:
On 17.03.2014 at 17:20, Max Woodbury <mtewoodbury@gmail.com> wrote:
Yes, but it still might be 'better to err on the side of caution'…
Except that "caution" usually means coarse locking. It’s much easier to screw up fine-grained locking.
I was thinking otherwise, but I see your point.
max