On Mon, Jan 30, 2012 at 11:37:17AM +0100, Joerg-Cyril.Hoehle@t-systems.com wrote:
Fixes bug #29657
While I like the idea of this patch and how you implemented it, I wonder if it's appropriate for code freeze.
The main problem in that bug is that the CA driver can deadlock, or release memory out of order. When working on this patch, were you able to identify specifically where the contention is? The idea of the InTransition state was to handle the lock release in mid-call. Can we add a check for that state in the callbacks to solve this problem less intrusively?
The first patch seems fine.
Andrew
Hi,
Andrew Eikum wrote:
I wonder if it's appropriate for code freeze.
You choose: - either 1.4 keeps the old code as is and suffers random dead-locks, or - 1.4 integrates code void of race-conditions on the basis of a solid design.
When working on this patch, were you able to identify specifically where the contention is?
The "10 liner test hack" I mentioned in bug #29657 was simply to remove OSSpinLock inside the callbacks. No more dead-lock. It worked quite well (over hundred successful test iterations to verify the hypothesis), but it was a time bomb and eventually crashed.
Unlike a 10 line hack, a proper solution costs a hundred lines.
The idea of the InTransition state was to handle the lock release in mid-call. Can we add a check for that state in the callbacks to solve this problem less intrusively?
and then do what?
I've never experienced code that works releasing locks mid-call. Quite to the contrary, I'm convinced that such patterns are a sure indicator of a design flaw and bound to cause trouble.
Of the same vein, one liners like EnterCS This-> playing = StateStopped; LeaveCS are suspicious too. What does that guarantee?
You're right that InTransition is no more needed with my patch. A post code-freeze patch can clean up. More specifically, I've kept it so far because of Wait(event) in Stop, outside the lock.
While winealsa would benefit too from a lock-free player as sketched in bug #29531, I'm not considering submitting it for 1.4. The difference is that some apps may suffer audio glitches with winealsa, however any app can suffer a random freeze with winecoreaudio. (Well, perhaps I'll write it meanwhile and if it exhibits significant improvements, submit it anyway).
Next on my list for 1.4 is #28388. There too, a random dead-lock in any MIDI playing app is not acceptable IMHO.
GetPosition is also crucial for 1.4, regardless of code-freeze.
Regards, Jörg Höhle
Joerg-Cyril.Hoehle@t-systems.com writes:
Hi,
Andrew Eikum wrote:
I wonder if it's appropriate for code freeze.
You choose:
- either 1.4 keeps the old code as is and suffers random dead-locks, or
- 1.4 integrates code void of race-conditions on the basis of a solid design.
Can you point to a real app that triggers the bug?
Joerg-Cyril.Hoehle@t-systems.com skrev 2012-01-30 18:30:
Of the same vein, one liners like EnterCS This-> playing = StateStopped; LeaveCS are suspicious too. What does that guarantee?
Perhaps that another thread doing as below isn't fooled into calling both one() and three()?
EnterCS if (This-> playing == StateRunning) { one(); } two(); if (This-> playing == StateStopped) { three(); } LeaveCS
I don't know whether, or not, that has any bearing what-so-ever on the patch under discussion...
Cheers, Peter
Peter Rosin wrote:
Of the same vein, one liners like EnterCS This-> playing = StateStopped; LeaveCS are suspicious too. What does that guarantee?
Perhaps that another thread doing as below isn't fooled into calling both one() and three()?
Right. The next question is: What does that not guarantee (that you'd like to have)?
Alexandre asked:
Can you point to a real app that triggers the bug?
The children's use of the Mac has decreased since XMas because of a new laptop, however IIRC Sid Meyer's Pirates! was the most likely to dead-lock, more than Haegemonia: Legions of Iron (which crashes from time to time on its own). Heroes of Might & Magic V runs well. When an MS window app crashes rarely, you often don't know whether it's an app bug or not.
The typical audio dead-lock scenario goes as follows: The load average / performance meter window suddenly goes to >50% green. Hitting ^C in wine't terminal window yields 100% CPU usage, half red / half green. Often enough, another ^C is not enough to kill wine and the children would call me to issue killall -INT wineserver.
Better ask users what they think about winecoreaudio.
Regards, Jörg Höhle
Joerg-Cyril.Hoehle@t-systems.com skrev 2012-01-31 15:23:
Peter Rosin wrote:
Of the same vein, one liners like EnterCS This-> playing = StateStopped; LeaveCS are suspicious too. What does that guarantee?
Perhaps that another thread doing as below isn't fooled into calling both one() and three()?
Right. The next question is: What does that not guarantee (that you'd like to have)?
It does not guarantee that I have money in my pocket.
Seriously, I'm not sure in what way I'm supposed to not get what a CS does or doesn't do. Please spell it out!
Cheers, Peter
Hi,
What does that not guarantee (that you'd like to have)?
Consider this scenario:
Play: EnterCS-2 do some stuff, start playing state=Playing; LeaveCS-2
Stop: EnterCS-1 do stuff LeaveCS-1 ; because we want to wait for something to finish wait EnterCS-3 state=Stopped LeaveCS-3
Now issue Stop + Play from 2 threads. Play will start playing while the Stop thread is waiting. Then CS-3 will stomp over the state, leading to inconsistent data structures. The player thread is still running while the state variable indicates Stopped.
What can be done? A. Design the system such that waiting can happen inside the CS. winecoreaudio now can do that.
B. Design the system such that it accomodates transition states. This means a much more complex state machine than the API describes. "Can I restart playing if I'm mid-closing? mid-stopping?" "Am I allowed to refuse starting playing when I acknowledged Stop earlier such that the app now believes I've stopped and submits the next song?" It also likely begs for InterlockedCompareExchange and the sort. You then have to be careful about which slots are valid when. It is hard to get correct. (An example of using Interlocked* is in dlls/mci*/mci*.c notification, can you spot the one fault?)
OO programmers (should) know this well: while mid-method, your object state is not consistent; calling another method may call out to other objects which can again call one of your API methods. But the average API methods are not preprared to face the temporary inconsistent state. Erlang avoids this by design.
C. Restart the transaction Add a loop atop CS-1 After exiting CS-1 and waiting, check whether the state is the one you wish to reach. Otherwise restart the loop. (An example is PlaySound in dlls/winmm/playsound.c). All implementations of interlocked lists do something like this: check the final state and restart if not satisfying.
A variation on the above scenario, found in the MCI: the player thread wants to enter Stopped state at the end, after waiting for all resources to be returned. When is it allowed to modify the state variable?
Regards, Jörg Höhle