On 22.02.2017 14:51, Jacek Caban wrote:
Signed-off-by: Jacek Caban [email protected]
 dlls/kernel32/tests/pipe.c |  28 ++-
 dlls/ntdll/file.c          |   7 +-
 dlls/ntdll/tests/file.c    |   2 +-
 server/async.c             |   9 +
 server/file.h              |   1 +
 server/named_pipe.c        | 446 ++++++++++++++++++++++++++++++++++++++++++---
 6 files changed, 447 insertions(+), 46 deletions(-)
As I see it, this updated patchset allows returning the result immediately, but still uses a USR1 signal to transfer back the IO completion info. Wouldn't it be useful to get rid of that as well?
Also please note that we need to distinguish "immediate returns" and "async returns" to properly implement FileIoCompletionNotificationInformation. This especially means that waiting internally is not an option.
    iosb = async_get_iosb( async );
    if ((blocking || iosb->status != STATUS_PENDING)
        && !(handle = alloc_handle( current->process, async, SYNCHRONIZE, 0 )))
        async_terminate( async, get_error() );
I don't think this will work as expected. async_terminate might queue an APC, but the caller will immediately deallocate the async struct in server_write_file because status != STATUS_PENDING.
Hi Sebastian,
On 27.02.2017 19:42, Sebastian Lackner wrote:
On 22.02.2017 14:51, Jacek Caban wrote:
Signed-off-by: Jacek Caban [email protected]
 dlls/kernel32/tests/pipe.c |  28 ++-
 dlls/ntdll/file.c          |   7 +-
 dlls/ntdll/tests/file.c    |   2 +-
 server/async.c             |   9 +
 server/file.h              |   1 +
 server/named_pipe.c        | 446 ++++++++++++++++++++++++++++++++++++++++++---
 6 files changed, 447 insertions(+), 46 deletions(-)
As I see it, this updated patchset allows returning the result immediately, but still uses a USR1 signal to transfer back the IO completion info. Wouldn't it be useful to get rid of that as well?
I believe it would be useful, but I assumed it does not block this patch. It's tricky, but I gave it some thought before sending the patches and intend to look at that at some point. Here are some thoughts:
As long as we transfer the result over the server socket, we don't have control in the server over when the iosb and result buffer are filled, so we need a signal from the client telling us when we can signal async completion (signal the event/object, queue the completion). I sent a patch that was more in line with how we do that for calls where I/O is implemented on the client side [1], but Alexandre preferred that to be in the server [2]. Having that in the server is not only cleaner, but also gives us more control, like in which case to fill the iosb or signal the event. It will also make it possible to implement the other solution (below).
As long as we need a way for the client to signal that the data has been transferred, we need some sort of notification from the client. We could add a new server call, which the client would be responsible for making after receiving the data, and which we could use instead of the wait handle and APCs, but I think that moving more logic to the client is not the right solution.
Another solution is not to transfer the result over the server socket, but to use write_process_memory() instead. This would allow us to do everything in a single server call. Unfortunately, it seems to me that there will be cases (like writing to memory that has a write watch) when we will need an APC-based fallback anyway. This is a huge change on its own; it needs at least:
- Use write_process_memory() in async_terminate() when we know we can do that for a particular async.
- Since the above will complicate the wait handle logic, it would be nice to move this logic to the read/write/ioctl/flush request handlers instead of duplicating it in fd ops.
- The above will need an invasive change for device IRP-based calls, which currently use irp_call, not async, for wait handles.
- The above has a side quest: it would be nice to consider making the blocking behaviour of device calls more compatible. I believe that we should always wait until the real driver completes the request (even for non-blocking calls) and return STATUS_PENDING only if the driver returned STATUS_PENDING. That's something to keep in mind while making design decisions, not strictly related.
- Right now, user-passed APCs are stored in the client and transferred as APC_ASYNC_IO. This would need to be changed.
I'm sure some of those tasks will need more smaller changes. Those would be much nicer to work on with named pipes in the tree first, so that we can at least test the code. And since I think we will need APC+blocking as a fallback anyway, we can use blocking as the main solution for the time being and have it well tested.
That said, I agree it would be useful, I just think it shouldn't block the patch.
Also please note that we need to distinguish "immediate returns" and "async returns" to properly implement FileIoCompletionNotificationInformation. This especially means that waiting internally is not an option.
It's easy to distinguish and still have internal waits. I can't point to the right solution without looking deeper at FileIoCompletionNotificationInformation, but just as an example, you could do:
if (iosb->status != STATUS_PENDING) mark_async_as_immediate_result();
inside the request handlers just after calling the fd ops. Then you'd have all the info you need inside the async object. Am I missing something?
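For illustration only - the flag and the helpers below are made up, not something in the current code:

    /* hypothetical flag in struct async, set right after the fd op returned */
    static void mark_async_as_immediate_result( struct async *async )
    {
        async->immediate = 1;
    }

    /* and wherever the completion is queued, e.g. for
       FileIoCompletionNotificationInformation semantics: */
    if (async->immediate && (comp_flags & FILE_SKIP_COMPLETION_PORT_ON_SUCCESS))
        return;  /* immediate success, so skip the completion port */

The point is just that the information would live on the async object, so it would be available wherever the completion is delivered.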
    iosb = async_get_iosb( async );
    if ((blocking || iosb->status != STATUS_PENDING)
        && !(handle = alloc_handle( current->process, async, SYNCHRONIZE, 0 )))
        async_terminate( async, get_error() );
I don't think this will work as expected. async_terminate might queue an APC, but the caller will immediately deallocate the async struct in server_write_file because status != STATUS_PENDING.
Good point, I will send an updated patch.
Thanks for the review,
Jacek
[1] https://www.winehq.org/pipermail/wine-patches/2017-January/157013.html
[2] https://www.winehq.org/pipermail/wine-devel/2017-February/116060.html
On 28.02.2017 14:03, Jacek Caban wrote:
Hi Sebastian,
On 27.02.2017 19:42, Sebastian Lackner wrote:
On 22.02.2017 14:51, Jacek Caban wrote:
Signed-off-by: Jacek Caban [email protected]
 dlls/kernel32/tests/pipe.c |  28 ++-
 dlls/ntdll/file.c          |   7 +-
 dlls/ntdll/tests/file.c    |   2 +-
 server/async.c             |   9 +
 server/file.h              |   1 +
 server/named_pipe.c        | 446 ++++++++++++++++++++++++++++++++++++++++++---
 6 files changed, 447 insertions(+), 46 deletions(-)
As I see it, this updated patchset allows returning the result immediately, but still uses a USR1 signal to transfer back the IO completion info. Wouldn't it be useful to get rid of that as well?
I believe it would be useful, but I assumed it does not block this patch. It's tricky, but I gave it some thought before sending the patches and intend to look at that at some point. Here are some thoughts:
I'm not sure if this should be blocking the patch, but at least for me it was not really obvious how you were planning to fix it afterwards.
As long as we transfer the result over the server socket, we don't have control in the server over when the iosb and result buffer are filled, so we need a signal from the client telling us when we can signal async completion (signal the event/object, queue the completion). I sent a patch that was more in line with how we do that for calls where I/O is implemented on the client side [1], but Alexandre preferred that to be in the server [2]. Having that in the server is not only cleaner, but also gives us more control, like in which case to fill the iosb or signal the event. It will also make it possible to implement the other solution (below).
I assume Alexandre was mainly referring to the event / APC stuff. I assume he also would be fine with a code path to return the result immediately when available. This could for example be done by marking the async object as "in progress" to avoid queuing unnecessary APCs. If a real result is available at the end of the call, it could be transferred right away. This would also help to fix the problem with "immediate returns" below.
As long as we need a way for the client to signal that the data has been transferred, we need some sort of notification from the client. We could add a new server call, which the client would be responsible for making after receiving the data, and which we could use instead of the wait handle and APCs, but I think that moving more logic to the client is not the right solution.
Another solution is not to transfer the result over the server socket, but to use write_process_memory() instead. This would allow us to do everything in a single server call. Unfortunately, it seems to me that there will be cases (like writing to memory that has a write watch) when we will need an APC-based fallback anyway. This is a huge change on its own; it needs at least: [...]
I don't think write_process_memory is the right way to do it. This function also has a lot of overhead, like suspending the thread, or, if /proc is not available, reading/writing the memory in chunks of sizeof(long). I'm pretty sure it would be faster to just transfer back the result with the same wineserver call. Also please note that the client side already adds a reply buffer, so the changes would mostly be on the server side.
Also please note that we need to distinguish "immediate returns" and "async returns" to properly implement FileIoCompletionNotificationInformation. This especially means that waiting internally is not an option.
It's easy to distinguish and still have internal waits. I can't point to the right solution without looking deeper at FileIoCompletionNotificationInformation, but just as an example, you could do:
if (iosb->status != STATUS_PENDING) mark_async_as_immediate_result();
inside the request handlers just after calling the fd ops. Then you'd have all the info you need inside the async object. Am I missing something?
With the currently proposed solution this would not work. The fd ops already call async_terminate -> async_set_result -> add_async_completion -> add_completion, but depending on the completion flags this should not be done if it is an immediate result. There is no way to know if it's supposed to be an "immediate return" inside the async_*() helpers, and checking afterwards is too late (unless we introduce a flag to block async objects from doing actions right away). Did you have a different idea how it should be implemented?
Best regards, Sebastian
Sebastian Lackner [email protected] writes:
I don't think write_process_memory is the right way to do it. This function also has a lot of overhead, like suspending the thread, or, if /proc is not available, reading/writing the memory in chunks of sizeof(long). I'm pretty sure it would be faster to just transfer back the result with the same wineserver call. Also please note that the client side already adds a reply buffer, so the changes would mostly be on the server side.
Agreed. write_process_memory has many issues, and should only be used for debugging APIs. We don't want to make it a required part of the server protocol.
On 28.02.2017 15:12, Alexandre Julliard wrote:
Sebastian Lackner [email protected] writes:
I don't think write_process_memory is the right way to do it. This function also has a lot of overhead, like suspending the thread, or, if /proc is not available, reading/writing the memory in chunks of sizeof(long). I'm pretty sure it would be faster to just transfer back the result with the same wineserver call. Also please note that the client side already adds a reply buffer, so the changes would mostly be on the server side.
Agreed. write_process_memory has many issues, and should only be used for debugging APIs. We don't want to make it a required part of the server protocol.
I never meant this to be required; we could still do everything without it using APCs. I also never meant it to be part of the server protocol itself. I imagined it being used instead of APCs in some cases (like setting the IO_STATUS_BLOCK bytes, possibly more for the data in read/ioctl) to avoid round-trips, falling back to APCs in case of any error or other problem.
Back to the current patch: it just uses currently available mechanisms to achieve the task and it seems to work well. As I described earlier, I could implement that without APCs for immediate results, but we would still need some notification to the server that the data has been completely transferred. If we want to do it that way, the cleanest solution IMO would be a new server call, made by clients in the server_*_file() functions for immediate results, after setting the iosb fields and receiving the data. Would you prefer that?
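Roughly something like this on the client side (the request name and fields are invented, just to show the shape):

    /* in server_write_file(), only for an immediate result, after we have
       filled the iosb and copied the data: */
    SERVER_START_REQ( async_data_received )
    {
        req->user = wine_server_client_ptr( io );   /* identify the async */
        wine_server_call( req );
    }
    SERVER_END_REQ;

The server would then signal the event / queue the completion from that request handler instead of waiting for the APC_ASYNC_IO round-trip.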
Thanks, Jacek
Jacek Caban [email protected] writes:
Back to the current patch: it just uses currently available mechanisms to achieve the task and it seems to work well. As I described earlier, I could implement that without APCs for immediate results, but we would still need some notification to the server that the data has been completely transferred. If we want to do it that way, the cleanest solution IMO would be a new server call, made by clients in the server_*_file() functions for immediate results, after setting the iosb fields and receiving the data. Would you prefer that?
That would certainly be better than write_process_memory, yes. But we can go with APCs for now.
On 28.02.2017 14:40, Sebastian Lackner wrote:
On 28.02.2017 14:03, Jacek Caban wrote:
As long as we transfer the result over the server socket, we don't have control in the server over when the iosb and result buffer are filled, so we need a signal from the client telling us when we can signal async completion (signal the event/object, queue the completion). I sent a patch that was more in line with how we do that for calls where I/O is implemented on the client side [1], but Alexandre preferred that to be in the server [2]. Having that in the server is not only cleaner, but also gives us more control, like in which case to fill the iosb or signal the event. It will also make it possible to implement the other solution (below).
I assume Alexandre was mainly referring to the event / APC stuff. I assume he also would be fine with a code path to return the result immediately when available. This could for example be done by marking the async object as "in progress" to avoid queuing unnecessary APCs. If a real result is available at the end of the call, it could be transferred right away. This would also help to fix the problem with "immediate returns" below.
What about the problem of the client having to signal the server when it has received and processed the data? When would you set the event in that case? We can transfer the data immediately, but we still need to know when we can set the event / queue the APC. Currently we know that we can do that when APC_ASYNC_IO returns.
As long as we need a way for the client to signal that the data has been transferred, we need some sort of notification from the client. We could add a new server call, which the client would be responsible for making after receiving the data, and which we could use instead of the wait handle and APCs, but I think that moving more logic to the client is not the right solution.
Another solution is not to transfer the result over the server socket, but to use write_process_memory() instead. This would allow us to do everything in a single server call. Unfortunately, it seems to me that there will be cases (like writing to memory that has a write watch) when we will need an APC-based fallback anyway. This is a huge change on its own; it needs at least: [...]
I don't think write_process_memory is the right way to do it. This function also has a lot of overhead, like suspending the thread, or, if /proc is not available, reading/writing the memory in chunks of sizeof(long).
That's another reason for a good fallback. I imagine we could use the APC fallback if /proc is not available. Suspending the thread doesn't sound too bad, especially given that the alternative is an extra server call.
I'm pretty sure it would be faster to just transfer back the result with the same wineserver call. Also please note that the client side already adds a reply buffer, so the changes would mostly be on the server side.
Yes, the data transfer is probably faster this way, but, as I said, the problem is not really about transfer speed itself, but the fact that we need server round-trips to know when the data has been transferred.
Also please note that we need to distinguish "immediate returns" and "async returns" to properly implement FileIoCompletionNotificationInformation. This especially means that waiting internally is not an option.
It's easy to distinguish and still have internal waits. I can't point to the right solution without looking deeper at FileIoCompletionNotificationInformation, but just as an example, you could do:
if (iosb->status != STATUS_PENDING) mark_async_as_immediate_result();
inside the request handlers just after calling the fd ops. Then you'd have all the info you need inside the async object. Am I missing something?
With the currently proposed solution this would not work. The fd ops already call async_terminate -> async_set_result -> add_async_completion -> add_completion, but depending on the completion flags this should not be done if it is an immediate result. There is no way to know if it's supposed to be an "immediate return" inside the async_*() helpers, and checking afterwards is too late (unless we introduce a flag to block async objects from doing actions right away). Did you have a different idea how it should be implemented?
In the cases covered by my patch, async_terminate doesn't call async_set_result, but queues an APC instead, so that's not a problem.
More generally speaking, if we had a code path like you described, the solution I proposed would still work, just with the logic reversed: mark all created asyncs as immediate returns, and change them to async where appropriate, something like:
if (iosb->status == STATUS_PENDING) mark_async_as_async_result();
in read/write/ioctl/flush should do. The register_async request could do that unconditionally.
Thanks, Jacek
On 28.02.2017 15:14, Jacek Caban wrote:
On 28.02.2017 14:40, Sebastian Lackner wrote:
On 28.02.2017 14:03, Jacek Caban wrote:
As long as we transfer the result over the server socket, we don't have control in the server over when the iosb and result buffer are filled, so we need a signal from the client telling us when we can signal async completion (signal the event/object, queue the completion). I sent a patch that was more in line with how we do that for calls where I/O is implemented on the client side [1], but Alexandre preferred that to be in the server [2]. Having that in the server is not only cleaner, but also gives us more control, like in which case to fill the iosb or signal the event. It will also make it possible to implement the other solution (below).
I assume Alexandre was mainly referring to the event / APC stuff. I assume he also would be fine with a code path to return the result immediately when available. This could for example be done by marking the async object as "in progress" to avoid queuing unnecessary APCs. If a real result is available at the end of the call, it could be transferred right away. This would also help to fix the problem with "immediate returns" below.
What about the problem of the client having to signal the server when it has received and processed the data? When would you set the event in that case? We can transfer the data immediately, but we still need to know when we can set the event / queue the APC. Currently we know that we can do that when APC_ASYNC_IO returns.
Well, there are also situations where it would be relatively easy to avoid the roundtrip. We could return immediately at least when no events, APCs or completions are involved, right? Since this is the most common situation it might still be worth the effort.
Also please note that we need to distinguish "immediate returns" and "async returns" to properly implement FileIoCompletionNotificationInformation. This especially means that waiting internally is not an option.
It's easy to distinguish and still have internal waits. I can't point to the right solution without looking deeper at FileIoCompletionNotificationInformation, but just as an example, you could do:
if (iosb->status != STATUS_PENDING) mark_async_as_immediate_result();
inside the request handlers just after calling the fd ops. Then you'd have all the info you need inside the async object. Am I missing something?
With the currently proposed solution this would not work. The fd ops already call async_terminate -> async_set_result -> add_async_completion -> add_completion, but depending on the completion flags this should not be done if it is an immediate result. There is no way to know if it's supposed to be an "immediate return" inside the async_*() helpers, and checking afterwards is too late (unless we introduce a flag to block async objects from doing actions right away). Did you have a different idea how it should be implemented?
In the cases covered by my patch, async_terminate doesn't call async_set_result, but queues an APC instead, so that's not a problem.
More generally speaking, if we had a code path like you described, the solution I proposed would still work, just with the logic reversed: mark all created asyncs as immediate returns, and change them to async where appropriate, something like:
if (iosb->status == STATUS_PENDING) mark_async_as_async_result();
in read/write/ioctl/flush should do. The register_async request could do that unconditionally.
For flush we are currently not using any callback, so marking it in the handler would be too late. That's also something we could fix, of course. Nevertheless, as you can see, it already gets pretty complicated. ;) If it's possible to simplify this somehow, that would certainly be an improvement.
Best regards, Sebastian
On 28.02.2017 16:44, Sebastian Lackner wrote:
On 28.02.2017 15:14, Jacek Caban wrote:
On 28.02.2017 14:40, Sebastian Lackner wrote:
On 28.02.2017 14:03, Jacek Caban wrote:
As long as we transfer the result over the server socket, we don't have control in the server over when the iosb and result buffer are filled, so we need a signal from the client telling us when we can signal async completion (signal the event/object, queue the completion). I sent a patch that was more in line with how we do that for calls where I/O is implemented on the client side [1], but Alexandre preferred that to be in the server [2]. Having that in the server is not only cleaner, but also gives us more control, like in which case to fill the iosb or signal the event. It will also make it possible to implement the other solution (below).
I assume Alexandre was mainly referring to the event / APC stuff. I assume he also would be fine with a code path to return the result immediately when available. This could for example be done by marking the async object as "in progress" to avoid queuing unnecessary APCs. If a real result is available at the end of the call, it could be transferred right away. This would also help to fix the problem with "immediate returns" below.
What about the problem of the client having to signal the server when it has received and processed the data? When would you set the event in that case? We can transfer the data immediately, but we still need to know when we can set the event / queue the APC. Currently we know that we can do that when APC_ASYNC_IO returns.
Well, there are also situations where it would be relatively easy to avoid the roundtrip. We could return immediately at least when no events, APCs or completions are involved, right? Since this is the most common situation it might still be worth the effort.
If an event is not passed, we still need to signal the object itself. Thinking about it some more, I guess we could just signal the object before returning for non-overlapped operations that return immediately. It wouldn't be exactly right, but it can't be used for synchronization reliably in this case anyway.
Another thing is that we currently don't know if there is a user APC until APC_ASYNC_IO returns, but that can be fixed.
Since it's just an optimization, can we proceed with the patch itself first? I'm happy to optimize that, but I find it hard to agree that an optimization should block inclusion of patches targeted at correctness.
Also please note that we need to distinguish "immediate returns" and "async returns" to properly implement FileIoCompletionNotificationInformation. This especially means that waiting internally is not an option.
It's easy to distinguish and still have internal waits. I can't point to the right solution without looking deeper at FileIoCompletionNotificationInformation, but just as an example, you could do:
if (iosb->status != STATUS_PENDING) mark_async_as_immediate_result();
inside the request handlers just after calling the fd ops. Then you'd have all the info you need inside the async object. Am I missing something?
With the currently proposed solution this would not work. The fd ops already call async_terminate -> async_set_result -> add_async_completion -> add_completion, but depending on the completion flags this should not be done if it is an immediate result. There is no way to know if it's supposed to be an "immediate return" inside the async_*() helpers, and checking afterwards is too late (unless we introduce a flag to block async objects from doing actions right away). Did you have a different idea how it should be implemented?
In the cases covered by my patch, async_terminate doesn't call async_set_result, but queues an APC instead, so that's not a problem.
More generally speaking, if we had a code path like you described, the solution I proposed would still work, just with the logic reversed: mark all created asyncs as immediate returns, and change them to async where appropriate, something like:
if (iosb->status == STATUS_PENDING) mark_async_as_async_result();
in read/write/ioctl/flush should do. The register_async request could do that unconditionally.
For flush we are currently not using any callback, so marking it in the handler would be too late. That's also something we could fix, of course. Nevertheless, as you can see, it already gets pretty complicated. ;) If it's possible to simplify this somehow, that would certainly be an improvement.
I think you misunderstood me somehow. I can only guess what you mean, since the current flush() implementations queue an async or return no wait handle - they never call async_terminate() from what I can tell. Even if they did, it would work with what I proposed:
- the flush handler creates an async - it's marked as returning immediately
- flush calls async_terminate() - due to the above, async_terminate is aware that it's an immediate result
- the flush op returns to the request handler, which finds that there was an immediate return, so it does not mark the async as returning asynchronously
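Or, as pseudo-code with invented names (not how the current handler looks, just the control flow):

    DECL_HANDLER(flush)
    {
        struct async *async = create_async( ... );   /* starts out marked as an immediate return */
        struct iosb *iosb;

        fd->fd_ops->flush( fd, async );              /* may call async_terminate() right away */
        iosb = async_get_iosb( async );
        if (iosb->status == STATUS_PENDING)
            mark_async_as_async_result( async );     /* only now does it become a real async return */
        release_object( iosb );
        release_object( async );
    }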
I'm not sure what you expect me to improve in this regard. This is about a hypothetical solution to a problem that's not strictly related to this patch. And I don't think it's that complicated ;)
Cheers, Jacek
On 28.02.2017 18:10, Jacek Caban wrote:
On 28.02.2017 16:44, Sebastian Lackner wrote:
Well, there are also situations where it would be relatively easy to avoid the roundtrip. We could return immediately at least when no events, APCs or completions are involved, right? Since this is the most common situation it might still be worth the effort.
If an event is not passed, we still need to signal the object itself. Thinking about it some more, I guess we could just signal the object before returning for non-overlapped operations that return immediately. It wouldn't be exactly right, but it can't be used for synchronization reliably in this case anyway.
Another thing is that we currently don't know if there is a user APC until APC_ASYNC_IO returns, but that can be fixed.
Since it's just an optimization, can we proceed with the patch itself first? I'm happy to optimize that, but I find it hard to agree that an optimization should block inclusion of patches targeted at correctness.
I'm not saying that it should block anything; I will leave the design decision up to Alexandre. Nevertheless, I think it certainly makes sense to discuss such aspects in advance. Depending on the solution used it might also require refactoring of this patch, which is easier to do before it is accepted.
For flush we are currently not using any callback, so marking it in the handler would be too late. That's also something we could fix, of course. Nevertheless, as you can see, it already gets pretty complicated. ;) If it's possible to simplify this somehow, that would certainly be an improvement.
I think you misunderstood me somehow. I can only guess what you mean, since the current flush() implementations queue an async or return no wait handle - they never call async_terminate() from what I can tell. Even if they did, it would work with what I proposed:
Sorry if the previous message wasn't clear. Yes, I was thinking about the hypothetical situation that we have to call async_terminate from a flush handler.
- the flush handler creates an async - it's marked as returning immediately
- flush calls async_terminate() - due to the above, async_terminate is aware that it's an immediate result
- the flush op returns to the request handler, which finds that there was an immediate return, so it does not mark the async as returning asynchronously
Yes, something like that would be necessary.
I'm not sure what you expect me to improve in this regard. This is about a hypothetical solution to a problem that's not strictly related to this patch. And I don't think it's that complicated ;)
Well, my main suggestion is that it would be better to handle async returns differently than sync returns. Especially, we probably shouldn't use APC_ASYNC_IO as a way to delay async_set_result for synchronous returns. Also, we should use the reply buffer for read calls, or replace the wine_server_set_reply with a length argument if this is not feasible.
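For the read case I mean something along these lines on the ntdll side (the request name is just for illustration, the fields are whatever the patch ends up defining):

    SERVER_START_REQ( read )
    {
        wine_server_set_reply( req, buffer, length );   /* server copies the data straight into the caller's buffer */
        status = wine_server_call( req );
        if (status == STATUS_SUCCESS) io->Information = wine_server_reply_size( reply );
    }
    SERVER_END_REQ;

so that an immediate result does not need a second transfer at all.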
Best regards, Sebastian
On 28.02.2017 20:46, Sebastian Lackner wrote:
On 28.02.2017 18:10, Jacek Caban wrote:
On 28.02.2017 16:44, Sebastian Lackner wrote:
Well, there are also situations where it would be relatively easy to avoid the roundtrip. We could return immediately at least when no events, APCs or completions are involved, right? Since this is the most common situation it might still be worth the effort.
If an event is not passed, we still need to signal the object itself. Thinking about it some more, I guess we could just signal the object before returning for non-overlapped operations that return immediately. It wouldn't be exactly right, but it can't be used for synchronization reliably in this case anyway.
Another thing is that we currently don't know if there is a user APC until APC_ASYNC_IO returns, but that can be fixed.
Since it's just an optimization, can we proceed with the patch itself first? I'm happy to optimize that, but I find it hard to agree that an optimization should block inclusion of patches targeted at correctness.
I'm not saying that it should block anything; I will leave the design decision up to Alexandre. Nevertheless, I think it certainly makes sense to discuss such aspects in advance. Depending on the solution used it might also require refactoring of this patch, which is easier to do before it is accepted.
I drafted the optimization discussed in this thread. It's just a proof of concept. A better implementation would store user APCs on the server side when the async is created, and I'd consider moving the handling of the returned data and the wait handle out of fd-specific ops. I ran it on my benchmark from last year [1] and it was 2 times slower than what I called "plain wine+msgmode patchset+optimizations". That's expected, since with this we need two server calls (that one needed only one). Going down to one server round-trip for simple cases is easy with this approach and should give about equal results.
The point is, the named pipe code needs just small changes for such an optimization, no refactoring. I hope that addresses your concerns.
Jacek
[1] https://www.winehq.org/pipermail/wine-devel/2016-October/115071.html
On 13.03.2017 13:31, Jacek Caban wrote:
I drafted the optimization discussed in this thread. It's just a proof of concept. A better implementation would store user APCs on the server side when the async is created, and I'd consider moving the handling of the returned data and the wait handle out of fd-specific ops. I ran it on my benchmark from last year [1] and it was 2 times slower than what I called "plain wine+msgmode patchset+optimizations". That's expected, since with this we need two server calls (that one needed only one). Going down to one server round-trip for simple cases is easy with this approach and should give about equal results.
The point is, the named pipe code needs just small changes for such an optimization, no refactoring. I hope that addresses your concerns.
Jacek
[1] https://www.winehq.org/pipermail/wine-devel/2016-October/115071.html
Thanks for sharing the draft patch. All of the ideas you mentioned sound reasonable, but if you already know how to solve these problems, I am not sure why you insist on merging your named pipe implementation first. The goal should be to keep the risk of regressions (this also includes performance regressions - or weird issues because USR1 signals arrive on other threads, possibly interrupting libraries which cannot deal with it) as small as possible. Imho it would be preferable to fix the limitations first, then add the named pipe implementation.
So far I'm also not sure yet if this separate async_complete wineserver call really is the best approach. What about just using the existing wait mechanism? Either read/write immediately starts a wait, and the ntdll side would look similar to server_select(). Alternatively it would also be sufficient to block USR1 signals until ntdll has reached NtWaitForSingleObject. If you decide to add such a flag, it could probably also be reused to fix another bug.
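The signal blocking part would be something like this (just the rough idea, not actual Wine code):

    /* needs <signal.h> and <pthread.h>; block SIGUSR1 before issuing the request */
    sigset_t usr1, old;
    sigemptyset( &usr1 );
    sigaddset( &usr1, SIGUSR1 );
    pthread_sigmask( SIG_BLOCK, &usr1, &old );
    /* ...issue the read/write server call here... */
    pthread_sigmask( SIG_SETMASK, &old, NULL );   /* unblock once we are about to wait */

i.e. the system APC could then only interrupt us once we are in a known place.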
[@ Alexandre: Do you think those concerns are valid, or do you think the patches would be acceptable despite the problems?]
Best regards, Sebastian
Sebastian Lackner [email protected] writes:
Thanks for sharing the draft patch. All of the ideas you mentioned sound reasonable, but if you already know how to solve these problems, I am not sure why you insist on merging your named pipe implementation first. The goal should be to keep the risk of regressions (this also includes performance regressions - or weird issues because USR1 signals arrive on other threads, possibly interrupting libraries which cannot deal with it) as small as possible. Imho it would be preferable to fix the limitations first, then add the named pipe implementation.
So far I'm also not sure yet if this separate async_complete wineserver call really is the best approach. What about just using the existing wait mechanism? Either read/write immediately starts a wait, and the ntdll side would look similar to server_select(). Alternatively it would also be sufficient to block USR1 signals until ntdll has reached NtWaitForSingleObject. If you decide to add such a flag, it could probably also be reused to fix another bug.
[@ Alexandre: Do you think those concerns are valid, or do you think the patches would be acceptable despite the problems?]
I think we can do it either way. I'm not necessarily opposed to merging the named pipe support first and improving the APC mechanism afterwards. I would like to see smaller patches though...
On 14.03.2017 16:47, Alexandre Julliard wrote:
I would like to see smaller patches though...
I did all I could without introducing dead code. I may prepare a series with use_server_io always returning 0 until a patch that flips it. It won't help with bisects, but I guess such a series will be easier to review.
Thanks,
Jacek
On 14.03.2017 16:20, Sebastian Lackner wrote:
On 13.03.2017 13:31, Jacek Caban wrote:
I drafted the optimization discussed in this thread. It's just a proof of concept. A better implementation would store user APCs on the server side when the async is created, and I'd consider moving the handling of the returned data and the wait handle out of fd-specific ops. I ran it on my benchmark from last year [1] and it was 2 times slower than what I called "plain wine+msgmode patchset+optimizations". That's expected, since with this we need two server calls (that one needed only one). Going down to one server round-trip for simple cases is easy with this approach and should give about equal results.
The point is, the named pipe code needs just small changes for such an optimization, no refactoring. I hope that addresses your concerns.
Jacek
[1] https://www.winehq.org/pipermail/wine-devel/2016-October/115071.html
Thanks for sharing the draft patch. All of the ideas you mentioned sound reasonable, but if you already know how to solve these problems, I am not sure why you insist on merging your named pipe implementation first. The goal should be to keep the risk of regressions (this also includes performance regressions - or weird issues because USR1 signals arrive on other threads, possibly interrupting libraries which cannot deal with it) as small as possible. Imho it would be preferable to fix the limitations first, then add the named pipe implementation.
I guess I look at this from a different perspective. Adding a different way of returning data before named pipes would introduce dead and untested code, which would only be enabled by the patch changing named pipes. I tried to construct the whole series in a way that makes the single most dangerous patch not only as small in size as possible (and it's still pretty long), but also reduces its impact. That's a way to actually reduce regression risk, in my opinion.
This way:
- in case of a regression in the new transport code, bisect will show the faulty patch instead of the named pipe commit (which just enables previously dead code)
- the first version of the named pipe changes uses a single, simpler and already tested way of transport, ensuring that this works well
- APCs, which are needed for async results anyway, are better tested
- once we start improving performance, the new code will be immediately covered by both Wine tests and user testing
The way I see it, a temporary performance degradation is the cost of better incremental changes that should reduce regression risk.
So far I'm also not sure yet if this separate async_complete wineserver call really is the best approach. What about just using the existing wait mechanism? Either read/write immediately starts a wait, and the ntdll side would look similar to server_select().
That's interesting, but note that we can't finish the async on just any wait from a given thread, because a signal may happen before the data is actually read. But it should be doable.
Alternatively it would also be sufficient to block USR1 signals until ntdll has reached NtWaitForSingleObject.
That alone would be slower; it would need 3 server calls. But it's a nice idea to do that for calls ending with a wait anyway.
Thanks, Jacek