That's a lot of patches! In the future, please try to send no more than about 5 at a time, until each batch is committed. That's much easier to review.
I reviewed the first 12, and will do the rest after those are committed. Just reading the patches, I didn't notice any show-stoppers. Overall looks good to me. A couple of thoughts below.
On Thu, Oct 04, 2012 at 01:49:13PM +0200, Joerg-Cyril.Hoehle@t-systems.com wrote:
- switch (wmm->dwStatus) {
- case MCI_MODE_PLAY:
- case MCI_MODE_PAUSE:
- wmm->dwStatus = MCI_MODE_STOP;
- }
This is weird. You're "missing" a break, and it seems like an if statement would be more appropriate here, anyway.
Aside from that, you introduce a scary magic number, 0x78B0. I guess that's the MIDI sound off + control message. It would be nice if it was a proper symbol, but native's headers don't give us MIDI symbols, and there are a lot to define. So maybe it's not worth it. Anyway, something to think about cleaning up in the future.
Thanks for improving this code!
Andrew
Hi,
- switch (wmm->dwStatus) {
- case MCI_MODE_PLAY:
- case MCI_MODE_PAUSE:
- wmm->dwStatus = MCI_MODE_STOP;
- }
This is weird. You're "missing" a break,
You're right, at least a /* fall through */ or break is custom.
and it seems like an if statement would be more appropriate here, anyway.
I wouldn't like it. dwStatus is to be considered like a volatile. The Switch statements embeds the idea that the value is loaded once, then dispatched upon. The code does not depend on that (nor is the C compiler prevented from reloading from memory without volatile declaration), but that's my model. A chain of If statements would not carry that idea.
That's why, in patch 20/25, I wrote: + if (wmm->dwStatus == MCI_MODE_STOP || wmm->dwStatus == MCI_MODE_NOT_READY) + break; + else if (wmm->dwStatus != MCI_MODE_PLAY) + continue;
From a pure logic point of view, it could have been more directly expressed as: else if (wmm->dwStatus != MCI_MODE_PLAY) break; or if (wmm->dwStatus == MCI_MODE_PAUSE) continue; if (wmm->dwStatus != MCI_MODE_PLAY) break; but that's not the same. A concurrent thread might change dwStatus to PAUSE right between the 2 lines.
Unfortunately, within a Switch statement, Break cannot be used anymore to get out of the surrounding While loop. So I either had to use Goto or write the convoluted If statement.
The Ada language allows its Break statement to exit nested scopes. These are reliably referenced by name instead of by numeric nesting level like in Tcl.
Rethinking the issue, it could have been written as: else if (wmm->dwStatus != MCI_MODE_PLAY) continue; when reintroducing the original loop condition: while (wmm->dwStatus != MCI_MODE_STOP && wmm->dwStatus != MCI_MODE_NOT_READY) that still must *not* be expressed more readably as: while (wmm->dwStatus == MCI_MODE_PLAY || wmm->dwStatus == MCI_MODE_PAUSE)
you introduce a scary magic number, 0x78B0
It's always been in the >10 year old code, just moved around. Some Linux/ALSA header has symbolic names for the MIDI commands, but only winealsa.drv/midi.c can use these.
Thank you, Jörg Höhle
Joerg-Cyril.Hoehle@t-systems.com writes:
Hi,
- switch (wmm->dwStatus) {
- case MCI_MODE_PLAY:
- case MCI_MODE_PAUSE:
- wmm->dwStatus = MCI_MODE_STOP;
- }
This is weird. You're "missing" a break,
You're right, at least a /* fall through */ or break is custom.
and it seems like an if statement would be more appropriate here, anyway.
I wouldn't like it. dwStatus is to be considered like a volatile. The Switch statements embeds the idea that the value is loaded once, then dispatched upon. The code does not depend on that (nor is the C compiler prevented from reloading from memory without volatile declaration), but that's my model. A chain of If statements would not carry that idea.
That's why, in patch 20/25, I wrote:
- if (wmm->dwStatus == MCI_MODE_STOP || wmm->dwStatus == MCI_MODE_NOT_READY)
break;
- else if (wmm->dwStatus != MCI_MODE_PLAY)
continue;
From a pure logic point of view, it could have been more directly expressed as: else if (wmm->dwStatus != MCI_MODE_PLAY) break; or if (wmm->dwStatus == MCI_MODE_PAUSE) continue; if (wmm->dwStatus != MCI_MODE_PLAY) break; but that's not the same. A concurrent thread might change dwStatus to PAUSE right between the 2 lines.
If the code depends on that sort of thing then it's broken. If another thread can change dwStatus you need a critical section or an interlocked function.
Alexandre wrote:
if (wmm->dwStatus == MCI_MODE_PAUSE) continue; if (wmm->dwStatus != MCI_MODE_PLAY) break; but that's not the same. A concurrent thread might change dwStatus to PAUSE right between the 2 lines.
If the code depends on that sort of thing then it's broken. If another thread can change dwStatus you need a critical section or an interlocked function.
I believe it's not broken: 1. All writers of wmm->dwStatus are already within a critical section (except at mciOpen time where there's no threading to consider). 2. The player MUST NOT get stuck in a CS while playing. 3. But it must read that variable. 4. I believe Interlocked* is strictly superfluous when only reading, so I see no value in InterlockedExchange or InterlockedCompareExchange here.
The player must consider dwStatus conceptually "volatile" (without requiring the broken C keyword). Therefore one has to carefully think about possible values of (dwStatus == MCI_PLAY || dwStatus == MODE_PAUSE). The value could switch from PAUSE to PLAY just when evaluating ||.
However, the statement + if (wmm->dwStatus == MCI_MODE_STOP || wmm->dwStatus == MCI_MODE_NOT_READY) + break; cannot be fooled by interleaved changes.
Regards, Jörg Höhle
Joerg-Cyril.Hoehle@t-systems.com writes:
Alexandre wrote:
if (wmm->dwStatus == MCI_MODE_PAUSE) continue; if (wmm->dwStatus != MCI_MODE_PLAY) break; but that's not the same. A concurrent thread might change dwStatus to PAUSE right between the 2 lines.
If the code depends on that sort of thing then it's broken. If another thread can change dwStatus you need a critical section or an interlocked function.
I believe it's not broken:
- All writers of wmm->dwStatus are already within a critical section
(except at mciOpen time where there's no threading to consider). 2. The player MUST NOT get stuck in a CS while playing. 3. But it must read that variable. 4. I believe Interlocked* is strictly superfluous when only reading, so I see no value in InterlockedExchange or InterlockedCompareExchange here.
The player must consider dwStatus conceptually "volatile" (without requiring the broken C keyword). Therefore one has to carefully think about possible values of (dwStatus == MCI_PLAY || dwStatus == MODE_PAUSE). The value could switch from PAUSE to PLAY just when evaluating ||.
This makes no sense, you can't make assumptions about the generated code from the shape of the if statements. Also there's no such thing as "conceptually volatile", unless you add explicit memory barriers.
Alexandre Julliard wrote:
This makes no sense, you can't make assumptions about the generated code from the shape of the if statements.
Don't || and && impose an evaluation order?
Also there's no such thing as "conceptually volatile", unless you add explicit memory barriers.
Forget about "conceptually volatile". I meant to express the idea that some variable may be updated from another thread. This has nothing to do with the C keyword of that name.
Once again I read some articles about memory barriers and fail to see what's wrong with the final code (after patch 21/25):
if (wmm->dwStatus == MCI_MODE_STOP || wmm->dwStatus == MCI_MODE_NOT_READY) break; else if (wmm->dwStatus != MCI_MODE_PLAY || recheck) continue;
0. dwStatus is aligned within HeapAlloc'ed WINE_MCIMIDI, so we only consider atomic reads and writes.
1. Every writer of wmm->dwStatus is already within a critical section (after patch 11/25).
2. I don't want the player to block in a CS. All it needs is to poll dwStatus.
3. The player need not react immediately to status changes. Nothing in it depends on reacting ASAP to status changes. Say "stop" to children, and they'll stop eventually, but not immediately. :-) IOW, I don't mind if it reads a stale status value. It may stop after playing another note at the next iteration.
The last sentence sounds a bit strange. If every thread/core had 1GB local storage, would the player never see the status update by another core?
I expect memory writes to eventually propagate to all cores. Memory barriers are all about ordering reads and writes. Given dwStatus is a single memory location, do we care about ordering? (SetEvent doesn't matter)
Changing the code to: if (InterlockedCompareExchange(&wmm->dwStatus, PLAY, PLAY) != PAUSE && InterlockedCompareExchange(&wmm->dwStatus, PLAY, PLAY) != PLAY) break; would be as broken as: if (wmm->dwStatus != PAUSE && wmm->dwStatus != PLAY) break;
I propose possible rewrites that would make it clear that dwStatus is updated asynchronously. Wine lacks the MemoryBarrier() macro that MSDN documents, however InterlockedCompareExchange can be used as a no-op. At the cost of a goto, since a break within a while can't abort the while:
A switch (InterlockedCompareExchange(&wmm->dwStatus, PLAY, PLAY)) { case MODE_PLAY: break; case MODE_PAUSE: continue; default: goto finish; } if (recheck) continue;
or B InterlockedExchange(&status, wmm->dwStatus); if (status == MODE_PAUSE) continue; if (status != MODE_PLAY) break; if (recheck) continue;
but how is that different from C status = wmm->dwStatus; if (status == MODE_PAUSE) continue; if (status != MODE_PLAY) break; if (recheck) continue; given point 3. i.e. I don't care whether the player aborts now or at the next round.
Would one of these solve the issue in your opinion?
To make progress meanwhile, you may consider comitting the patches 13/14/16 (MCI_STATUS/SEEK/INFO).
Regards, Jörg Höhle
Joerg-Cyril.Hoehle@t-systems.com writes:
Alexandre Julliard wrote:
This makes no sense, you can't make assumptions about the generated code from the shape of the if statements.
Don't || and && impose an evaluation order?
Only if there are side effects.
Also there's no such thing as "conceptually volatile", unless you add explicit memory barriers.
Forget about "conceptually volatile". I meant to express the idea that some variable may be updated from another thread. This has nothing to do with the C keyword of that name.
Once again I read some articles about memory barriers and fail to see what's wrong with the final code (after patch 21/25):
if (wmm->dwStatus == MCI_MODE_STOP || wmm->dwStatus == MCI_MODE_NOT_READY) break; else if (wmm->dwStatus != MCI_MODE_PLAY || recheck) continue;
That depends on what state transitions are possible and what the behavior should be in various states. I haven't analyzed that in detail.
However, it does *not* depend on how you write the code. Whether you use one complex if() instead of two simple ones, or a switch(), or any other way that is logically equivalent. Either it's reliable in all cases, or in none. Writing complicated if() statements just because you think it somehow helps with concurrency is misguided.
- I don't want the player to block in a CS. All it needs is to poll dwStatus.
It wouldn't block if you don't do other things inside the CS.
On Tue, Oct 09, 2012 at 04:05:09PM +0200, Alexandre Julliard wrote: ...
Also there's no such thing as "conceptually volatile", unless you add explicit memory barriers.
Forget about "conceptually volatile". I meant to express the idea that some variable may be updated from another thread. This has nothing to do with the C keyword of that name.
Once again I read some articles about memory barriers and fail to see what's wrong with the final code (after patch 21/25):
if (wmm->dwStatus == MCI_MODE_STOP || wmm->dwStatus == MCI_MODE_NOT_READY) break; else if (wmm->dwStatus != MCI_MODE_PLAY || recheck) continue;
That depends on what state transitions are possible and what the behavior should be in various states. I haven't analyzed that in detail.
If the dwStatus field ISN'T volatile then the compile might generate code for the above that reads it once, twice, or three times.
If we assume that reads of it are atomic (ie it is not larger than a machine word, and is aligned) then there is no need to acquire any mutex across the read - unless you are also checking another variable.
The above code almost certainly requires that only a single read be done. This is likely to be the case if the code is optimised, and unlikely if not. So in the above the repeated reads probably technically require a lock.
Better to code as: status = wmm->dwStatus; if (...) but even then I think the compiler is allowed to perform extra reads. It might, for example, do so if the condition was complicayted and there was a lot of pressure on registers - so instead of spilling 'status' to stack, it would re-read it.
With gcc you can force it to only to one read by adding: asm volatile("":::"memory"); before the first if. The "memory" constraint of the asm statement tells gcc that it might change any memory location - thus acting as a barrier.
The only portable way is with volatile.
David
Hi,
David Laight wrote:
Better to code as: status = wmm->dwStatus; if (...)
Incidentally, that's what I did tonight in patch 20/25 try 2.
but even then I think the compiler is allowed to perform extra reads. It might, for example, do so if the condition was complicayted and there was a lot of pressure on registers - so instead of spilling 'status' to stack, it would re-read it.
Interesting. Do you think/believe or are you sure?
With gcc you can force it to only to one read by adding: asm volatile("":::"memory"); The only portable way is with volatile.
"only portable way" from the C perspective. However, given a specific target environment, we could use its API functions. Either MemoryBarrier(); /* which MSDN documents but Wine does not provide in include/*.h Or InterlockedExchange(&status, wmm->dwStatus);
So if AJ is still not satisfied with try 2, I'll change all reads of wmm->dwStatus within the player into InterlockedExchange. Yet I think that would be a superfluous extra memory barrier within the player.
Thank you, Jörg Höhle
On 10/10/2012 10:42 AM, Joerg-Cyril.Hoehle@t-systems.com wrote:
Hi,
David Laight wrote:
Better to code as: status = wmm->dwStatus; if (...)
Incidentally, that's what I did tonight in patch 20/25 try 2.
but even then I think the compiler is allowed to perform extra reads. It might, for example, do so if the condition was complicayted and there was a lot of pressure on registers - so instead of spilling 'status' to stack, it would re-read it.
Interesting. Do you think/believe or are you sure?
With gcc at -O0 'status' will get a stack slot http://gcc.gnu.org/ml/gcc/2010-05/msg00116.html With optimization turned on probably not as it is just a reference to an other memory location which can be easily optimized away.
With gcc you can force it to only to one read by adding: asm volatile("":::"memory"); The only portable way is with volatile.
"only portable way" from the C perspective. However, given a specific target environment, we could use its API functions. Either MemoryBarrier(); /* which MSDN documents but Wine does not provide in include/*.h Or InterlockedExchange(&status, wmm->dwStatus);
So if AJ is still not satisfied with try 2, I'll change all reads of wmm->dwStatus within the player into InterlockedExchange. Yet I think that would be a superfluous extra memory barrier within the player.
bye michael
On Wed, Oct 10, 2012 at 10:42:53AM +0200, Joerg-Cyril.Hoehle@t-systems.com wrote:
Hi,
David Laight wrote:
Better to code as: status = wmm->dwStatus; if (...)
Incidentally, that's what I did tonight in patch 20/25 try 2.
but even then I think the compiler is allowed to perform extra reads. It might, for example, do so if the condition was complicayted and there was a lot of pressure on registers - so instead of spilling 'status' to stack, it would re-read it.
Interesting. Do you think/believe or are you sure?
Probably in the realms of 'is allowed to' (under the general 'is if' rule), but in practise won't.
With gcc you can force it to only to one read by adding: asm volatile("":::"memory"); The only portable way is with volatile.
"only portable way" from the C perspective. However, given a specific target environment, we could use its API functions. Either MemoryBarrier(); /* which MSDN documents but Wine does not provide in include/*.h Or InterlockedExchange(&status, wmm->dwStatus);
So if AJ is still not satisfied with try 2, I'll change all reads of wmm->dwStatus within the player into InterlockedExchange. Yet I think that would be a superfluous extra memory barrier within the player.
That use of InterlockedExchange() is OTT. As well as being a slow, locked bus cycle is will force a stack slot for 'status', and the value be read back from there after anything that might modify memory (eg a function call).
I can't remember the minumum requirement to stop the compiler doing a reload. A call to an external function is more than enough.
There are also a lot of different tyoes of 'memory barrier' dunno which sort MSDN's MemoryBarrier() generates. On x86/amd64 you almost never need to request a barrier.
David
David Laight wrote:
InterlockedExchange(&status, wmm->dwStatus);
That use of InterlockedExchange() is OTT.
Thank you for the clarification.
I've come across http://gcc.gnu.org/wiki/Atomic/GCCMM http://gcc.gnu.org/wiki/Atomic/GCCMM/PracticalUsage and found the official expression for what I've been trying to explain:
I want the MIDI player to use the *relaxed* memory model.
"Relaxed [...] if you are using an atomic variable simply as a value and don't care about the synchronization aspects (ie, you just want to always see a valid value for the variable), then that maps to the relaxed mode. There may be some academic babble about certain provisions, but this is effectively what it boils down to. The relaxed mode is what you use when you don't care about all that memory flushing and just want to see the values of the atomic itself. This is the fastest model, but make sure you don't depend on the values of other shared variables in other threads. This is also what you get when you use the basic atomic STORE and LOAD macros in C."
MCISEQ is designed around an asymmetric N+1 threading model.
- N application threads may issue MCI commands. Critical sections are used to serialize status changes. - 1 player runs. All it cares about is to observe one valid (atomic) value for the variable wmm->dwStatus. It does not need to lock the memory bus or synchronise caches.
If dwStatus indicates a change, let the player act upon it. If the core's cached dwStatus is stale, the player will see the change after another iteration, afer the memory subsystem will have propagated changes.
This morning, I've even had the idea to poll dwStatus less often, i.e. only after sleeping. That way, all notes of a chord that should be played or released simultaneously would be processed together.
After the player loop finishes, the player too uses a critical section like the N app threads to perform the transition from PLAY (or PAUSE) to STOP.
Regards, Jörg höhle
Hi,
Alexandre Julliard wrote:
If another thread can change dwStatus you need a critical section or an interlocked function.
So if AJ is still not satisfied with try 2, I'll change all reads of wmm->dwStatus within the player into InterlockedExchange. Yet I think that would be a superfluous extra memory barrier within the player.
I've now submitted try 4 which uses InterlockedExchange. Given time, I'd like to find another place where to debate this issue, perhaps Usenet's comp.lang.c or a GCC mailing list (even if not specific to GCC)?
Regards, Jörg Höhle
Hi,
I'm missing a portable compiler barrier for use within Wine.
MSVC has _ReadBarrier, _ReadWriteBarrier etc. http://msdn.microsoft.com/en-us/library/f20w0x5e(v=vs.80).aspx
GCC has __asm__ __volatile__("":::"memory");
However compilers other than GCC may be used to compile Wine, e.g. I believe some people use LLVM (I've heard that LLVM parses GCC asm statements, but what #ifdef would be appropriate?).
Is __asm__ __volatile__("":::"memory"); portable across all targets that Wine supports?
Without that, it seems to me that using DWORD foo = (volatile DWORD) bar; is the most "portable" way of achieving a compiler barrier within the scope of Wine, even though the pthreads gurus will spit at it.
At least, every C compiler parses volatile, unlike __asm__...
Background: I want an atomic read of whatever value a single aligned int (or DOWRD or whatever) currently appears visible to one core. - Critical sections and mutexes are out of question, because they both can block. - InterlockedXyExchange are not appropriate, because there's nothing to exchange. Furthermore, they introduce a full hardware memory barrier whereas a compiler barrier suffices.
All I want is something like memory_order_consume from the C++11 standard.
Alexandre Julliard wrote:
- I don't want the player to block in a CS. All it needs is to poll dwStatus.
It wouldn't block if you don't do other things inside the CS.
I don't understand that one. Code in the player like: EnterCS(wmm->lock); local = wmm->dwStatus; LeaveCS(wmm->lock); *will* block when the app calls API functions that take the lock, even if for a short time. For instance, an app bombarding mmdevapi with GetCurrentPadding requests will disturb the periodic feeder in winealsa.drv:alsa_push_buffer_data. When that happens, only the UNIX scheduler knows when that thread will get scheduled again (even worse, it must first schedule the Wine server).
What do you think? Jörg Höhle
Joerg-Cyril.Hoehle@t-systems.com writes:
Alexandre Julliard wrote:
- I don't want the player to block in a CS. All it needs is to poll dwStatus.
It wouldn't block if you don't do other things inside the CS.
I don't understand that one. Code in the player like: EnterCS(wmm->lock); local = wmm->dwStatus; LeaveCS(wmm->lock); *will* block when the app calls API functions that take the lock, even if for a short time. For instance, an app bombarding mmdevapi with GetCurrentPadding requests will disturb the periodic feeder in winealsa.drv:alsa_push_buffer_data. When that happens, only the UNIX scheduler knows when that thread will get scheduled again (even worse, it must first schedule the Wine server).
A critical section doesn't involve the wine server. I also fail to see why something like GetCurrentPadding would need to change the status. Obviously if you have a global critical section protecting everything that may be an issue, but you can fix that.
I also have a hard time believing that a simple crit section would make a difference here, considering how many other sections are potentially entered in that player loop.
I strongly recommend that you stop playing games with the compiler and start thinking about doing this using the existing synchronization primitives (either crit section or interlocked functions) in the way that they are meant to be used. Once you have code that is demonstrably correct WRT threading, and you can show that the primitives in question are the cause of trouble, then we can talk about taking shortcuts.
Alexandre Julliard wrote:
A critical section doesn't involve the wine server.
I simply saw that path: RtlEnterCriticalSection -> RtlpWaitForCriticalSection -> -> wait_semaphore -> fast_wait | NTDLL_wait_for_multiple_objects -> SERVER_START_REQ but I don't like to argue about your server with you.
It doesn't bug me at all whether a blocked CS needs help from the Wine server or whether a Linux futex suffices. I simply want to avoid a CS where it is strictly superfluous.
I also fail to see why something like GetCurrentPadding would need to change the status.
Sorry for introducing that example from another Wine dll. It caused confusion. mmdevapi:GCP is unrelated to mcimidi:wmm->dwStatus
Obviously if you have a global critical section protecting everything that may be an issue, but you can fix that.
Indeed, winealsa could use 2 CS: one for preventing application-level concurrency, the other for controlling the internal periodic feeder.
However rather than augmenting the number and uses of CS and thinking about lock convoys and dead locks, I prefer correct designs with fewer CS.
I also have a hard time believing that a simple crit section would make a difference here, considering how many other sections are potentially entered in that player loop.
There aren't. The loop is: - get note (no CS) - may Sleep (small wrapper above UNIX sleep) - somehow check wmm->dwStatus for STOP - midiOutShortMsg (any number of CS, I don't care, the contract of this function simply is "get that note out ASAP".)
The only question is: how to read wmm->dwStatus?
Regards, Jörg Höhle
On Thu, Nov 01, 2012 at 09:25:53AM +0100, Joerg-Cyril.Hoehle@t-systems.com wrote:
Background: I want an atomic read of whatever value a single aligned int (or DOWRD or whatever) currently appears visible to one core.
A simple read of a volatile data item will do that. Look at what signal handlers are allowed to change.
New compilers link clang support the gcc asm semantics - they need to in order to compile a lot of their target code.
The purpose of 'asm volatile ("":::""memory);' is to constrain the values that the compiler can have cached in registers.
I use it for 3 purposes: 1) To ensure object code accesses memory in the correct order (provided the cpu can't reorder the memory accesses itself (which many modern cpu will do - at least for cached accesses). 2) To reduce the number 'live' values in a function to stop values being spilled to stack. 3) To force the generated code to read values early (sometimes before an 'if' when the value is only used in one clause) in order to remove cpu stalls waiting for memory and to fill otherwise unfillable delay slots. This one is very cpu specific - but I am counting every instruction in that function. (This is a MIPS-like embedded cpu.)
David