On 14.03.2017 16:20, Sebastian Lackner wrote:
> On 13.03.2017 13:31, Jacek Caban wrote:
>> I drafted an optimization discussed in this thread. It's just a proof of concept. A better implementation will store user APCs on the server side when the async is created, and I'd consider moving the handling of returned data and the wait handle out of the fd-specific ops. I ran it on my benchmark from last year [1] and it was 2 times slower than what I called "plain wine + msgmode patchset + optimizations". That's expected, since this version needs two server calls where that one needed one. Going down to one server round-trip for simple cases is easy from here and should give roughly equal results.
>> The point is, the named pipe code needs only small changes for such an optimization, no refactoring. I hope that addresses your concerns.
>> Jacek
>> [1] https://www.winehq.org/pipermail/wine-devel/2016-October/115071.html
> Thanks for sharing the draft patch. All of the ideas you mentioned sound reasonable, but if you already know how to solve these problems, I am not sure why you insist on merging your named pipe implementation first. The goal should be to keep the risk of regressions as small as possible - this also includes performance regressions, or weird issues because USR1 signals arrive on other threads, possibly interrupting libraries that cannot deal with them. Imho it would be preferable to fix the limitations first and then add the named pipe implementation.
I guess I look at this from a different perspective. Adding a different way of returning data before the named pipe work would introduce dead and untested code, which would be enabled only later by the patch changing named pipes. I tried to construct the whole series so that the single most dangerous patch is not only as small as possible (and it's still pretty long), but also limited in its impact. That's a way to actually reduce regression risk, in my opinion.
This way:
- in case of a regression in the new transport code, bisect will point at the faulty patch instead of the named pipe commit (which just enables previously dead code)
- the first version of the named pipe changes uses a single, simpler and already tested way of transport, ensuring that this part works well
- APCs, which are needed for async results anyway, are better tested
- once we start improving performance, the new code will be immediately covered by both Wine tests and user testing
The way I see it, a temporary performance degradation is the cost of better incremental changes that should reduce regression risk.
> So far I'm also not sure yet that this separate async_complete wineserver call really is the best approach. What about just using the existing wait mechanism? For example, read/write could immediately start a wait, and the ntdll side would look similar to server_select().
That's interesting, but note that we can't finish the async on just any wait from a given thread, because the signal may arrive before the data is actually read. But it should be doable.
> Alternatively, it would also be sufficient to block USR1 signals until ntdll has reached NtWaitForSingleObject.
That alone would be slower: it would need 3 server calls. But it's a nice idea to do that anyway for calls that end with a wait.
Thanks, Jacek