I'm taking a public thread that I responded to privately back to public view:
Robert Reif wrote:
Jeremy White wrote:
Putting traces into the mmtimer thread loop shows WaitForSingleObject is sleeping for about 20 ms and then calling the callback twice to catch up.
Well, that means mmtimer is right.
You might try putting traces around the conditional YieldExecution on line 730 of ntdll/sync.c and see if that chews up time; if it does, removing the Yield might help.
Interesting results. It fixes the sound quality and FIXME problem but it is still going about 20 ms between callback pairs.
Could the yield just be aggravating the timer resolution problem?
I think this is the thread priority problem I mentioned earlier.
The behavior of winmm is correct; it is supposed to deliver every single timer request, even if it has to double or triple (or more!) up to do it. I verified that very carefully on Windows.
The reason you see pretty steady 10 ms resolution on Windows is that the winmm thread on Windows is flagged as a priority 15 thread, so it does not get interrupted or time sliced away; it fires precisely when it wants to. It's much harder to achieve the same thing on Linux, sadly.
Hmm. As I sit here and think about this, though, I do find myself a little bit puzzled. I would have expected that, if my understanding is right, you would have seen much more dramatic time gaps. On a 2.6 kernel, for example, I would expect you to see jumps on the order of 100 ms (hence ten winmm requests fired). And exactly 20 ms spacing is a little too consistent and interesting a number; it suggests that there is a more precise understanding that should be available.
That may be a function of the test program, though. I suspect in a 'real' program, where other active threads are demanding more cpu that you might get more sporadic results. It might be interesting to add a spin looping thread to the test case and see if you suddenly get larger gaps.
Cheers,
Jeremy
Jeremy White wrote:
I'm taking a public thread that I responded to privately back to public view:
Robert Reif wrote:
Jeremy White wrote:
Putting traces into the mmtimer thread loop shows WaitForSingleObject is sleeping for about 20 ms and then calling the callback twice to catch up.
Well, that means mmtimer is right.
You might try putting traces around the conditional YieldExecution on line 730 of ntdll/sync.c and see if that chews up time; if it does, removing the Yield might help.
Interesting results. It fixes the sound quality and FIXME problem but it is still going about 20 ms between callback pairs.
Could the yield just be aggravating the timer resolution problem?
I think this is the thread priority problem I mentioned earlier.
The behavior of winmm is correct; it is supposed to deliver every single timer request, even if it has to double or triple (or more!) up to do it. I verified that very carefully on Windows.
The reason you see pretty steady 10 ms resolution on Windows is that the winmm thread on Windows is flagged as a priority 15 thread, so it does not get interrupted or time sliced away; it fires precisely when it wants to. It's much harder to achieve the same thing on Linux, sadly.
Hmm. As I sit here and think about this, though, I do find myself a little bit puzzled. I would have expected that, if my understanding is right, you would have seen much more dramatic time gaps. On a 2.6 kernel, for example, I would expect you to see jumps on the order of 100 ms (hence ten winmm requests fired). And exactly 20 ms spacing is a little too consistent and interesting a number; it suggests that there is a more precise understanding that should be available.
That may be a function of the test program, though. I suspect in a 'real' program, where other active threads are demanding more cpu that you might get more sporadic results. It might be interesting to add a spin looping thread to the test case and see if you suddenly get larger gaps.
Cheers,
Jeremy
I just posted a simple mmtimer test for the regression tests.
Running it like this
WINETEST_DEBUG=2 wine winmm_test.exe.so timer
will print out the actual raw data.
The tests with a period of 1 or 20 ms gives good results (much better than on xp). The 10 ms tests give the 19 and 1 ms steps like I'm seeing in direct sound.
I just posted a simple mmtimer test for the regression tests.
Hmm. When I run it, I see only the expected behavior, both on a 2.4 and 2.6 kernel.
Note that the resolution setting is completely meaningless; the only setting that matters is the period.
And, as far as I can tell, it exactly reproduces the period (and I don't see any strange variance in the min and max).
By the way, shame on me for not having written such a nice regression test when I revamped this code. Thanks for covering for me.
Cheers,
Jer
Jeremy White wrote:
Hmm. When I run it, I see only the expected behavior, both on a 2.4 and 2.6 kernel.
Here is what I get with an up to date RH9 on a Dell Dimension 8300 with latest CVS:
Without ntdll/sync.c patch:
timer.c:59:wPeriodMin = 1, wPeriodMax = 65535 timer.c:110:period = 1, resolution = 0 timer.c:136:min = 0, max = 13, average = 1.000000 timer.c:110:period = 1, resolution = 1 timer.c:136:min = 0, max = 2, average = 1.000000 timer.c:110:period = 10, resolution = 0 timer.c:136:min = 0, max = 23, average = 9.818182 timer.c:110:period = 10, resolution = 1 timer.c:136:min = 0, max = 20, average = 9.818182 timer.c:110:period = 10, resolution = 10 timer.c:136:min = 0, max = 24, average = 9.818182 timer.c:110:period = 20, resolution = 0 timer.c:136:min = 12, max = 21, average = 19.919192 timer.c:110:period = 20, resolution = 1 timer.c:136:min = 12, max = 24, average = 19.919192 timer.c:110:period = 20, resolution = 10 timer.c:136:min = 12, max = 21, average = 19.919192 timer.c:110:period = 20, resolution = 20 timer.c:136:min = 1, max = 30, average = 19.919192 timer: 40 tests executed, 0 marked as todo, 0 failures.
and with patch:
timer.c:59:wPeriodMin = 1, wPeriodMax = 65535 timer.c:110:period = 1, resolution = 0 timer.c:136:min = 0, max = 2, average = 1.000000 timer.c:110:period = 1, resolution = 1 timer.c:136:min = 0, max = 2, average = 1.000000 timer.c:110:period = 10, resolution = 0 timer.c:136:min = 0, max = 20, average = 9.818182 timer.c:110:period = 10, resolution = 1 timer.c:136:min = 0, max = 23, average = 9.818182 timer.c:110:period = 10, resolution = 10 timer.c:136:min = 0, max = 24, average = 9.818182 timer.c:110:period = 20, resolution = 0 timer.c:136:min = 12, max = 21, average = 19.919192 timer.c:110:period = 20, resolution = 1 timer.c:136:min = 12, max = 23, average = 19.919192 timer.c:110:period = 20, resolution = 10 timer.c:136:min = 12, max = 22, average = 19.919192 timer.c:110:period = 20, resolution = 20 timer.c:136:min = 12, max = 23, average = 19.919192 timer: 40 tests executed, 0 marked as todo, 0 failures.
I guess I need to add a standard deviation to the tests.
Hmm. Actually, both cases are broken; you should have a min/max that hovers right around the target period. This is what I get on a 2.6.9 kernel:
timer.c:59:wPeriodMin = 1, wPeriodMax = 65535 timer.c:110:period = 1, resolution = 0 timer.c:136:min = 0, max = 2, average = 1.000000 timer.c:110:period = 1, resolution = 1 timer.c:136:min = 0, max = 2, average = 1.000000 fixme:mmtime:timeBeginPeriod Stub; we set our timer resolution at minimum fixme:mmtime:timeEndPeriod Stub; we set our timer resolution at minimum timer.c:110:period = 10, resolution = 0 timer.c:136:min = 9, max = 11, average = 10.000000 fixme:mmtime:timeBeginPeriod Stub; we set our timer resolution at minimum fixme:mmtime:timeEndPeriod Stub; we set our timer resolution at minimum timer.c:110:period = 10, resolution = 1 timer.c:136:min = 9, max = 11, average = 10.000000 fixme:mmtime:timeBeginPeriod Stub; we set our timer resolution at minimum fixme:mmtime:timeEndPeriod Stub; we set our timer resolution at minimum timer.c:110:period = 10, resolution = 10 timer.c:136:min = 9, max = 11, average = 10.000000 fixme:mmtime:timeBeginPeriod Stub; we set our timer resolution at minimum fixme:mmtime:timeEndPeriod Stub; we set our timer resolution at minimum timer.c:110:period = 20, resolution = 0 timer.c:136:min = 19, max = 21, average = 20.000000 fixme:mmtime:timeBeginPeriod Stub; we set our timer resolution at minimum fixme:mmtime:timeEndPeriod Stub; we set our timer resolution at minimum timer.c:110:period = 20, resolution = 1 timer.c:136:min = 19, max = 21, average = 20.000000 fixme:mmtime:timeBeginPeriod Stub; we set our timer resolution at minimum fixme:mmtime:timeEndPeriod Stub; we set our timer resolution at minimum timer.c:110:period = 20, resolution = 10 timer.c:136:min = 19, max = 21, average = 20.000000 fixme:mmtime:timeBeginPeriod Stub; we set our timer resolution at minimum fixme:mmtime:timeEndPeriod Stub; we set our timer resolution at minimum timer.c:110:period = 20, resolution = 20 timer.c:136:min = 19, max = 21, average = 20.000000 timer: 40 tests executed, 0 marked as todo, 0 failures.
Now I just went to prove that my tests on 2.4 got the same results, and I now get results similar to what you report. I suspect I just misread them earlier (operator error), and that what we're really seeing is a difference bewteen the 2.4 and 2.6 kernels.
Cheers,
Jeremy
Robert Reif wrote:
Jeremy White wrote:
Hmm. When I run it, I see only the expected behavior, both on a 2.4 and 2.6 kernel.
Here is what I get with an up to date RH9 on a Dell Dimension 8300 with latest CVS:
Without ntdll/sync.c patch:
timer.c:59:wPeriodMin = 1, wPeriodMax = 65535 timer.c:110:period = 1, resolution = 0 timer.c:136:min = 0, max = 13, average = 1.000000 timer.c:110:period = 1, resolution = 1 timer.c:136:min = 0, max = 2, average = 1.000000 timer.c:110:period = 10, resolution = 0 timer.c:136:min = 0, max = 23, average = 9.818182 timer.c:110:period = 10, resolution = 1 timer.c:136:min = 0, max = 20, average = 9.818182 timer.c:110:period = 10, resolution = 10 timer.c:136:min = 0, max = 24, average = 9.818182 timer.c:110:period = 20, resolution = 0 timer.c:136:min = 12, max = 21, average = 19.919192 timer.c:110:period = 20, resolution = 1 timer.c:136:min = 12, max = 24, average = 19.919192 timer.c:110:period = 20, resolution = 10 timer.c:136:min = 12, max = 21, average = 19.919192 timer.c:110:period = 20, resolution = 20 timer.c:136:min = 1, max = 30, average = 19.919192 timer: 40 tests executed, 0 marked as todo, 0 failures.
and with patch:
timer.c:59:wPeriodMin = 1, wPeriodMax = 65535 timer.c:110:period = 1, resolution = 0 timer.c:136:min = 0, max = 2, average = 1.000000 timer.c:110:period = 1, resolution = 1 timer.c:136:min = 0, max = 2, average = 1.000000 timer.c:110:period = 10, resolution = 0 timer.c:136:min = 0, max = 20, average = 9.818182 timer.c:110:period = 10, resolution = 1 timer.c:136:min = 0, max = 23, average = 9.818182 timer.c:110:period = 10, resolution = 10 timer.c:136:min = 0, max = 24, average = 9.818182 timer.c:110:period = 20, resolution = 0 timer.c:136:min = 12, max = 21, average = 19.919192 timer.c:110:period = 20, resolution = 1 timer.c:136:min = 12, max = 23, average = 19.919192 timer.c:110:period = 20, resolution = 10 timer.c:136:min = 12, max = 22, average = 19.919192 timer.c:110:period = 20, resolution = 20 timer.c:136:min = 12, max = 23, average = 19.919192 timer: 40 tests executed, 0 marked as todo, 0 failures.
I guess I need to add a standard deviation to the tests.
I just posted a revised test which also prints out the standard deviation.
Same results, 1 and 20 ms have a small deviation and 10 ms has a deviation of around 10.
While people are thinking about mmtimer I've got a question, or more a problem.
Sometimes, mainly when escape from Bink videos (or they run to the end) the linked list in time.c is getting corrupt. (this happens about 1 in 5 times)
i.e.
TIME_MMSysTimeCallback() .... EnterCriticalSection(&iData->cs); for (ptimer = &TIME_TimersList; *ptimer != NULL; ) { /* silly test */ if((int)*ptimer < 1000){ FIXME("ptimer out of expected range, timers list is probably corrupt %p \n", *ptimer); break; }
I've had a look through the code and can't spot any problems with managing the linked list which would point to problems with TIME_TimersList but when the crash occurs ptimer != &TIME_TimersList
Also if the we're using a = &TIME_TimersList, a=&(*a)->lpNext and then referencing *a in critical sections doesn't anything that modifies TIME_TimersList or ... lpNext also have to be in a critical section? e.g. TIME_MMTimeStart and TIME_MMTimeStop.
If so should the code be change to copy the values in TIME_TimersList and ..->lpNext instead of referencing them or should it be changed to make sure TIME_MMTimeStart and TIME_MMTimeStop etc.. use critical sections.
Send instant messages to your online friends http://uk.messenger.yahoo.com
Oliver Stieber wrote:
I've had a look through the code and can't spot any problems with managing the linked list which would point to problems with TIME_TimersList but when the crash occurs ptimer != &TIME_TimersList
I looked as well, and couldn't see anything obvious.
The only 'flaw' I saw was that we used the WINMM_IData->cs, when, for logical separation, it seems we should have our own cs. But I can't imagine any way in which that flaw would create the condition you describe.
Some sub process causing heap corruption would explain this.
Also if the we're using a = &TIME_TimersList, a=&(*a)->lpNext and then referencing *a in critical sections doesn't anything that modifies TIME_TimersList or ... lpNext also have to be in a critical section? e.g. TIME_MMTimeStart and TIME_MMTimeStop.
If so should the code be change to copy the values in TIME_TimersList and ..->lpNext instead of referencing them or should it be changed to make sure TIME_MMTimeStart and TIME_MMTimeStop etc.. use critical sections.
For correctness, I agree that the clearing of TIME_TimersList should better be protected by a crit section; I admire your thoroughness on that. I also don't understand why time.c shouldn't have it's own crit sec for the TimersList.
But a case where that could credibly become an issue is hard to imagine. You have to have 1 thread termintating the MM Timer while another thread is simultaneously killing or setting an event. Further, the event loops inside the set and term event loops are pretty tight; it's hard to imagine having just the right magic timing conditions and then also having a context switch occur during the short tight loops.
I guess that's a long way of saying that yes, I think you have identified a flaw in the code, but that I can't imagine any way it's could cause the problem you're seeing.
Cheers,
Jer
in
TIME_TimersList and ..->lpNext instead of referencing them or should it be changed to make
sure
TIME_MMTimeStart and TIME_MMTimeStop etc.. use critical sections.
it's hard to imagine having just the right magic timing conditions and then also having a context switch occur during the short tight loops.
I guess that's a long way of saying that yes, I think you have identified a flaw in the code, but that I can't imagine any way it's could cause the problem you're seeing.
Well, I've managed to reproduce the fault once with the debugger running.. the backtrace I get is
TIME_MMSysTimeThread(arg=0x403ebcb8) winmm/time.c:249 THREAD_Start (ptr=0x40411f00) kernel/thread.c:107 start_thread (info=0x40411f20) ntdll/thread.c:200 libpthread.so.0 (0x4a8f8b2c) ?? +0x5a in libc.so.6 (0x00000000)
But that's not really useful, I couldn't find out how to switch threads in winedbg so I don't known what the backtraces of the other threads are..
would something like....
void TIME_MMTimeStart(void) { if(!InterlockedExchange(&TIME_activeTimer, TRUE)) { ...... } } }
void TIME_MMTimeStop(void) { if(InterlockedExchange(&TIME_activeTimer, FALSE)) { ...... } }
get around the critical section problems or should the code be change to use it's own critical section or both?
Cheers,
Jer
___________________________________________________________ ALL-NEW Yahoo! Messenger - all new features - even more fun! http://uk.messenger.yahoo.com
On Sat, 12 Feb 2005 17:36:58 +0000, Oliver Stieber wrote:
But that's not really useful, I couldn't find out how to switch threads in winedbg so I don't known what the backtraces of the other threads are..
You put the tid after the bt command, eg:
bt 0x9
or you can use "bt all". Yes it's not very well documented. Yes I had exactly the same question when I started Wine hacking :)
It's now in the "developer cheatsheet" document for this very reason. I'd recommend you read it ;)
--- Mike Hearn mh@codeweavers.com wrote:
On Sat, 12 Feb 2005 17:36:58 +0000, Oliver Stieber wrote:
But that's not really useful, I couldn't find out
how
to switch threads in winedbg so I don't known what
the
backtraces of the other threads are..
You put the tid after the bt command, eg:
bt 0x9
or you can use "bt all". Yes it's not very well documented. Yes I had exactly the same question when I started Wine hacking :)
It's now in the "developer cheatsheet" document for this very reason. I'd recommend you read it ;)
I'll put a patch together so that it appears in winedgb help too.
There are a few more things that aren't in the deveoplers guide like WINEDEBUG=+tid,+snoop etc..
http://www.winehq.org/docs/wine-devel.html#DBG-CLASSES mentions snoop in controling debugger output, but doesn't tell you what it does.
Is there any chance of a partial merge between http://www.winehq.com/site/developer-cheatsheet and http://www.winehq.org/docs/wine-devel.html
the contents of expr leaves a lot to the imagination too!
___________________________________________________________ ALL-NEW Yahoo! Messenger - all new features - even more fun! http://uk.messenger.yahoo.com
--- Oliver Stieber oliver_stieber@yahoo.co.uk > >
I guess that's a long way of saying that yes, I think you have identified a flaw in the code, but that I can't imagine any way it's could cause the problem you're seeing.
would something like....
void TIME_MMTimeStop(void) { if(InterlockedExchange(&TIME_activeTimer, FALSE)) { ...... } }
get around the critical section problems or should the code be change to use it's own critical section or both?
Hmmm, my never trust a dyslexic programmer on a satarday morning. now that I've read the code I can see that my suggestion would require three states and spin, so why not just use a critical section?
___________________________________________________________ Trial Yahoo! Mobile for FREE and win 3 dream holidays. http://www.yahoo.co.uk/trip
Well, I've managed to reproduce the fault once with the debugger running.. the backtrace I get is
Could you cut and paste in the full backtrace, with addresses, along with a diff of the debug changes you've made so I can see what's happening?
Thanks,
Jer
Ok, it looks like dsound is causing problems, or at the very least least alsa.
I've put the time.c stuff into it's own critical section, to no effect.
I've also been getting a 'dead lock' with the critical section pdbi->mmap_crst, but it's not really a dead lock because the holding thread is zero
a bit of watching showed that crit->OwningThread was zero but crit->LockCount was 1, I've set watchpoints on crit->OwningThread and couldn't find any problems so I can only assume that crit->LockCount is being set somewhere.. probably the same place as the MMTimer stuff is getting clobbered.
Unfortunately winedbg's expressions aren't up to the job of trapping OwningThread ==0 and LockCount != -1.
I've hacked out some basic internal type and pointer support for expressions in winedbg, but I promised Jason I'd start sending in DirectX 9 so it'll have to wait.
disp *((int *)0xaddress)) .. now works ok and so does x *(0xaddress) so in theory cond 1 (*((int *)0xaddress)) != 0) should work too?
--- Jeremy White jwhite@codeweavers.com wrote:
Well, I've managed to reproduce the fault once
with
the debugger running.. the backtrace I get is
Could you cut and paste in the full backtrace, with addresses, along with a diff of the debug changes you've made so I can see what's happening?
Thanks,
Jer
___________________________________________________________ ALL-NEW Yahoo! Messenger - all new features - even more fun! http://uk.messenger.yahoo.com