https://bugs.winehq.org/show_bug.cgi?id=53475
Bug ID: 53475 Summary: Hang: underrun of data / failed to create thread Product: Wine Version: 7.14 Hardware: x86-64 OS: Linux Status: UNCONFIRMED Severity: normal Priority: P2 Component: dsound Assignee: wine-bugs@winehq.org Reporter: ilkka.prusi@gmail.com Distribution: ---
Playing Alansya Chronicles in some situations the game will hang and needs to be forcibly closed. This happens on all Wine 7.x versions and I haven't tested with older versions of Wine.
There are specific log entries to note when this hang occurs:
(wine:3413): GStreamer-WARNING **: 22:20:23.968: failed to create thread: Error creating thread: Resource temporarily unavailable
028c:fixme:quartz:DSoundRender_UpdatePositions Underrun of data occurred!
Game uses many short audio samples during playback so perhaps thread pool is exhausted? Second entry would indicate that there is synchronization issue in the playback.
But there is another indication that the dsound code is iffy: 01a0:err:quartz:DSoundRender_SendSampleData WaitForSingleObject() returned 0.
This log entry is triggered when This->flush_event is signalled and the code is expecting timeout. This by itself is already strange looking code (from Win32-style standpoint), but what the code does next is even stranger.
Flush event looks like it is meant as signal that code trying to write should exit, the comment says that much: /* Signaled when a flush or state change occurs, i.e. anything that needs * to immediately unblock the streaming thread. */
But when DSoundRender_SendSampleData() gets the signalled event it goes back and tries to continue the loop instead of exiting.
I suspect the code expects that This->sink.flushing would have been set already when the event is signalled but apparently it isn't.
However, having DSoundRender_SendSampleData() bail out on the event does not fix the hang, which would indicate there is more to the issue than just one part.
So, maybe DSoundRender_SendSampleData() could check that if WaitForSingleObject() returns 0 (event signalled) it could just exit and log that instead of trying to repeat? It seems to be normal occurrence instead of error?
The state management seems oddly complicated and like it was added later than by design initially? I suspect this could be made much simpler to avoid the issue.
If there is race condition in state management, is it possible the code will hang indefinitely in DSoundRender_HandleEndOfStream()? That looks like a possible place that would explain the hang I'm seeing.
https://bugs.winehq.org/show_bug.cgi?id=53475
--- Comment #1 from ilkka.prusi@gmail.com --- The things that I've mentioned in some parts of the code might be different from the actual bug, which seems to point to either the way GStreamer is utilized or in GStreamer itself. The code might be odd, but most of the time it seems to work: it is a situation where many short samples are triggered one after another that seems to manifest the actual hang.
https://bugs.winehq.org/show_bug.cgi?id=53475
Zeb Figura z.figura12@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Summary|Hang: underrun of data / |Alansya Chronicles hangs |failed to create thread |with GStreamer "failed to | |create thread" error CC| |z.figura12@gmail.com Component|dsound |quartz
--- Comment #2 from Zeb Figura z.figura12@gmail.com --- The dsoundrender code is pretty awful, but in this case I think the root of the error is GStreamer failing to create threads, which is probably because it's running out of virtual address space. What does "some situations" mean? Is there a way to consistently reproduce? Does it happen after some (long or short) period of time?
https://bugs.winehq.org/show_bug.cgi?id=53475
--- Comment #3 from Rafał Mużyło galtgendo@o2.pl --- ...I strongly suspect this is a dupe of my bug 50779.
The exact conditions needed for a ruby maker game to trigger this bug are somewhat unclear.
The mentioned recent pulseaudio flag should workaround the bug, but only in a 'paper over' sense.
https://bugs.winehq.org/show_bug.cgi?id=53475
--- Comment #4 from ilkka.prusi@gmail.com --- Some parts (some scenes) of the game play fine. In certain game situation (one narrative scene) it can be consistently reproduced: hang in same location every time.
I re-tested with Wine 7.0 and that specific location works but some time after that the hang occurs with same GStreamer error. So while the problem exists it happens much sooner with Wine 7.14.
Side note: game is basically interactive fiction with narrative scenes. Scenes have text, images and audio (background and triggered samples). Scene are different narrative situations and while some might work others seems to trigger problems.
https://bugs.winehq.org/show_bug.cgi?id=53475
--- Comment #5 from ilkka.prusi@gmail.com --- (In reply to Rafał Mużyło from comment #3)
...I strongly suspect this is a dupe of my bug 50779.
The exact conditions needed for a ruby maker game to trigger this bug are somewhat unclear.
The mentioned recent pulseaudio flag should workaround the bug, but only in a 'paper over' sense.
And it likely doesn't solve the thing when using Pipewire instead of Pulseaudio?
https://bugs.winehq.org/show_bug.cgi?id=53475
--- Comment #6 from ilkka.prusi@gmail.com --- (In reply to ilkka.prusi from comment #5)
(In reply to Rafał Mużyło from comment #3)
...I strongly suspect this is a dupe of my bug 50779.
The exact conditions needed for a ruby maker game to trigger this bug are somewhat unclear.
The mentioned recent pulseaudio flag should workaround the bug, but only in a 'paper over' sense.
And it likely doesn't solve the thing when using Pipewire instead of Pulseaudio?
I speculate that with synchronization bugs there will happen memory leaks and other issues as well, which can manifest as running out of memory at some point. Differences in latency (kernel scheduling options, Pipewire vs. Pulseaudio) might also change how contested the critical sections are and contribute to how synchronization bugs appear.
https://bugs.winehq.org/show_bug.cgi?id=53475
--- Comment #7 from Rafał Mużyło galtgendo@o2.pl --- To clarify a thing: take stock of the sound files.
ogg files are unaffected, as they're handled by code embedded in the maker, not sure about wav, but mp3 files definitely trigger the bug.
As I've said in bug 50779 comment 16, a corner case was a game where this bug could be triggered by pressing down on the title screen enough times.
Most likely it comes down to sound effects (due to the way maker handles those), so figure out which one is most likely to trigger it for you.
https://bugs.winehq.org/show_bug.cgi?id=53475
--- Comment #8 from ilkka.prusi@gmail.com --- I noticed something strange in dsound_render_sink_eos() in dsoundrenderer.c.
There is sequence of Lock(), memset(..0..), Unlock() before leaving the method. However: a) if this is the last use of the sample and buffer will be destroyed b) if the buffer will be reused new data will overwrite existing on next use
In both cases, why bother waiting to set the buffer to zero?
This leads to next thing: before that reset there is call to DSoundRender_HandleEndOfStream() (which I mentioned earlier) and that looks like a place where you could get infinite wait. This is also the only place where that method is called and all it does is call DSoundRender_UpdatePositions() in a loop.
Commenting out that HandleEndOfStream() removes the "underrun" error from happening so that error might be bogus. And it does not seem to interfere with samples playing (I didn't notice problems in a quick test. That makes a call to UpdatePositions(), which maybe isn't needed if you are about to stop the stream anyway?
This does not stop GStreamer error and the hang from happening but with these simplifications maybe the error for "underrun" does not matter.
Maybe rest of the code can be simplified in same way.
https://bugs.winehq.org/show_bug.cgi?id=53475
--- Comment #9 from ilkka.prusi@gmail.com --- Created attachment 72847 --> https://bugs.winehq.org/attachment.cgi?id=72847 Simplify end of stream and flush event handling
Flush event handling has bogus error message: it is normal that WaitForSingleObject() returns 0 when event has been signalled. Also at that point it is expected that code will stop blocking so expedited return would be preferred instead of continuing in SendSampleData(). This has a case-check in the patch.
In dsound_render_sink_eos(), when stream is finished resetting should buffer have no meaningful purpose: a) buffer will be destroyed and sample no longer matters b) buffer could be reused and rewritten with new sample Zeroing serves no purpose in either case and is removed in the patch.
Additionally, in dsound_render_sink_eos() there is a waiting loop which seems to have no purpose and causes bogus "underrun" message when sample playing is supposed to end already. Code does other things that seem to have no useful meaning when stream is ending. The lines that may have use are lifter up the call chain in the patch.
Finally, filter->state_event should only be set after all modifying of the state is done, so ensure that is the last. This is also in the patch.
https://bugs.winehq.org/show_bug.cgi?id=53475
--- Comment #10 from ilkka.prusi@gmail.com --- Some typos: resetting should buffer have no -> resetting buffer should have no
https://bugs.winehq.org/show_bug.cgi?id=53475
--- Comment #11 from Zeb Figura z.figura12@gmail.com --- (In reply to ilkka.prusi from comment #9)
Created attachment 72847 [details] Simplify end of stream and flush event handling
Flush event handling has bogus error message: it is normal that WaitForSingleObject() returns 0 when event has been signalled. Also at that point it is expected that code will stop blocking so expedited return would be preferred instead of continuing in SendSampleData(). This has a case-check in the patch.
The problem with this diff is that under normal operation we really shouldn't be getting there in the first place. If flush_event is signaled then the sink should be flushing or stopped. In either case the upstream filter that's driving the graph (i.e. the parser) should have waited for the streaming thread to finish before un-flushing or restarting the graph.
The code is awful and doesn't make this clear, of course.
In dsound_render_sink_eos(), when stream is finished resetting should buffer have no meaningful purpose: a) buffer will be destroyed and sample no longer matters b) buffer could be reused and rewritten with new sample Zeroing serves no purpose in either case and is removed in the patch.
The buffer may not be destroyed or refilled immediately; it may take an arbitrary amount of time, during which time dsound will continue to loop over it. If we don't zero the buffer the user will hear the last second of audio repeat.
Additionally, in dsound_render_sink_eos() there is a waiting loop which seems to have no purpose and causes bogus "underrun" message when sample playing is supposed to end already. Code does other things that seem to have no useful meaning when stream is ending. The lines that may have use are lifter up the call chain in the patch.
It's waiting for the buffer to finish playing, so that the aforementioned zeroing won't overwrite data that hasn't been written out to the audio device yet.
Finally, filter->state_event should only be set after all modifying of the state is done, so ensure that is the last. This is also in the patch.
That's not necessary; there's nothing that needs to actually be synchronized there. The point of state_event is to signal preroll; usually it's signaled on a sample rather than EOS.
https://bugs.winehq.org/show_bug.cgi?id=53475
--- Comment #12 from ilkka.prusi@gmail.com --- (In reply to Zeb Figura from comment #11)
(In reply to ilkka.prusi from comment #9)
Created attachment 72847 [details] Simplify end of stream and flush event handling
Flush event handling has bogus error message: it is normal that WaitForSingleObject() returns 0 when event has been signalled. Also at that point it is expected that code will stop blocking so expedited return would be preferred instead of continuing in SendSampleData(). This has a case-check in the patch.
The problem with this diff is that under normal operation we really shouldn't be getting there in the first place. If flush_event is signaled then the sink should be flushing or stopped. In either case the upstream filter that's driving the graph (i.e. the parser) should have waited for the streaming thread to finish before un-flushing or restarting the graph.
The code is awful and doesn't make this clear, of course.
So the filter state and event state can end in a conflicting situation if I understood this correctly?
Looking at the code more it might be attempting to write out rest of the sample after flushing was signalled
In dsound_render_sink_eos(), when stream is finished resetting should buffer have no meaningful purpose: a) buffer will be destroyed and sample no longer matters b) buffer could be reused and rewritten with new sample Zeroing serves no purpose in either case and is removed in the patch.
The buffer may not be destroyed or refilled immediately; it may take an arbitrary amount of time, during which time dsound will continue to loop over it. If we don't zero the buffer the user will hear the last second of audio repeat.
Additionally, in dsound_render_sink_eos() there is a waiting loop which seems to have no purpose and causes bogus "underrun" message when sample playing is supposed to end already. Code does other things that seem to have no useful meaning when stream is ending. The lines that may have use are lifter up the call chain in the patch.
It's waiting for the buffer to finish playing, so that the aforementioned zeroing won't overwrite data that hasn't been written out to the audio device yet.
So end of stream happens earlier than actually stopping playback? But a couple of lines earlier there's IMediaEventSink_Release() already so it is supposed to end? Is it only relying on actual object destroying happening at a later time?
Maybe that function should just call IDirectSoundBuffer_Stop() to ensure it is ending cleanly? Or is there something conceptually different about end of stream rather than end of sample playback?
https://bugs.winehq.org/show_bug.cgi?id=53475
--- Comment #13 from Zeb Figura z.figura12@gmail.com --- (In reply to ilkka.prusi from comment #12)
So the filter state and event state can end in a conflicting situation if I understood this correctly?
Looking at the code more it might be attempting to write out rest of the sample after flushing was signalled
I'm not sure what you mean by a conflicting situation, but broadly, flushing is asynchronous relative to sample/EOS processing. Under normal operation it'll always happen on a different thread, and that means that a flush request can arrive at any point relative to where we are in dsound_render_sink_Receive() or dsound_render_sink_eos(). Continuing to process the current sample (and not checking for flushing every other line) is hence not necessarily wrong—the purpose of a flush is simply to prevent us from blocking for an arbitrary length of time.
So end of stream happens earlier than actually stopping playback?
Broadly, yes. Everything that happens here is buffered; we expect to receive sound and EOS notifications hundreds of ms before they should actually be rendered.
But a couple of lines earlier there's IMediaEventSink_Release() already so it is supposed to end? Is it only relying on actual object destroying happening at a later time?
IMediaEventSink is a different interface on a different object, and that Release() call should never be the last one—it's just there for refcounting correctness.
Maybe that function should just call IDirectSoundBuffer_Stop() to ensure it is ending cleanly? Or is there something conceptually different about end of stream rather than end of sample playback?
I think we could potentially stop the dsound buffer, yes. I don't know that it'd be worth doing—the current code works, and if we really wanted to improve it we'd need a much bigger fundamental rewrite—but as far as I can think of it wouldn't be wrong.
https://bugs.winehq.org/show_bug.cgi?id=53475
--- Comment #14 from ilkka.prusi@gmail.com --- After starting a new game with Wine 7.14 I got Wine crash and following errors:
0378:err:quartz:DSoundRender_SendSampleData WaitForSingleObject() returned 0. wine: Unhandled exception 0x40000015 in thread 24 at address 00404290 (thread 0024), starting debugger... 0394:fixme:imm:ImeSetActiveContext (0002007C, 1): stub 0394:fixme:imm:ImmReleaseContext (00020084, 0002007C): stub The futex facility returned an unexpected error code. The futex facility returned an unexpected error code. wine client error:1f0: wine client error:210: write: Bad file descriptor write: Bad file descriptor wine client error:134: The futex facility returned an unexpected error code.
And the lines about bad file descriptor repeat after that.
I'll retest with older Wine.
https://bugs.winehq.org/show_bug.cgi?id=53475
--- Comment #15 from ilkka.prusi@gmail.com --- (In reply to ilkka.prusi from comment #14)
After starting a new game with Wine 7.14 I got Wine crash and following errors:
...
And the lines about bad file descriptor repeat after that.
I'll retest with older Wine.
Wine 7.0 works where 7.14 crashes.