Hmm, I need to think about that. When I said it fully works, I should maybe indicate that GDB doesn't usually display local variables for the first frame below the syscall, so maybe it's because of this even though it manages to catch up for lower frames.
As far as I remember, local variables were available in all frames in our case. That is most likely because we specify the callee-saved registers right away when they get stored. It is also likely that, if the computation is not correct in the first place, it will catch up later, as there will be new unwind information mostly specifying "register foo is saved at that position on the stack".
FWIW this also works with Valgrind, and you might be interested in https://gitlab.winehq.org/wine/wine/-/merge_requests/1074 too.
That is interesting. But do I see correctly that this only works if the Windows PE/COFF files actually have **DWARF** unwind information embedded? As far as I am aware, that is only the case for cross-compilation with mingw, right?
It's not exactly the `pushfl` but more that the dispatcher pops `rip` first, which effectively removes the return address from the stack, then `pushfl` overwrites its value. It's not causing problems later because the dispatcher uses the popped `rip` value to return to the caller, but it temporarily confuses GDB stack unwinding. Maybe with your CFI instructions it's not required.
Right! And that is the reason why we don't need it: we specify "RIP" based on the syscall frame and not based on the CFA, right as it gets popped. I think that's another case where it is actually better to use the `breg` instructions, as they give you the freedom to specify some values based on the CFA and some based on register content.
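To make the `breg` idea concrete, here is a minimal Python sketch (not the actual Wine macros or our patch) that builds the raw bytes of a `DW_CFA_expression` rule saying "register X is saved at `[base register + offset]`". The opcode values and x86-64 register numbers come from the DWARF specification and the System V psABI; the choice of `rcx` as the frame pointer and the `0x70` offset are assumptions picked purely for illustration.

```python
# DWARF constants (values from the DWARF specification).
DW_CFA_expression = 0x10  # "register is saved at the address this expression computes"
DW_OP_breg0 = 0x70        # DW_OP_breg<N> = 0x70 + N: push (register N + signed offset)

# DWARF register numbers for x86-64 (System V psABI).
DWARF_RCX = 2
DWARF_RIP = 16  # return address column


def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value == 0:
            out.append(byte)
            return bytes(out)
        out.append(byte | 0x80)


def sleb128(value: int) -> bytes:
    """Encode a signed integer as signed LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        sign_bit_set = bool(byte & 0x40)
        if (value == 0 and not sign_bit_set) or (value == -1 and sign_bit_set):
            out.append(byte)
            return bytes(out)
        out.append(byte | 0x80)


def reg_saved_at(reg: int, base_reg: int, offset: int) -> bytes:
    """CFI bytes for: `reg` is saved in memory at [base_reg + offset]."""
    assert 0 <= base_reg < 32, "DW_OP_breg0..31 only cover registers 0-31"
    expr = bytes([DW_OP_breg0 + base_reg]) + sleb128(offset)
    return bytes([DW_CFA_expression]) + uleb128(reg) + uleb128(len(expr)) + expr


def as_cfi_escape(data: bytes) -> str:
    """Format the bytes as a GNU assembler .cfi_escape directive."""
    return ".cfi_escape " + ", ".join(f"0x{b:02x}" for b in data)


# Hypothetical layout: the saved rip lives at offset 0x70 from rcx.
print(as_cfi_escape(reg_saved_at(DWARF_RIP, DWARF_RCX, 0x70)))
```

Because the rule is expressed relative to a register rather than the CFA, it stays valid while the stack pointer moves, which is the freedom mentioned above.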
The reason I've been told is that some DRMs are checking that the NT syscalls didn't touch the stack. I don't really know more.
Oh interesting. That makes a lot of sense, and makes our lives so much harder... :smile:
Yes, it's already an issue for `perf` captures when you want stack traces that cross user and kernel stacks, and I don't really have any good solution for that. Having an optional working mode which would interleave kernel frames and user frames may be possible but it doesn't seem very tractable.
Yep, `perf` is actually using the same `perf_event_open` syscall that we are using, so it makes sense that we are observing the same issue. The "bruteforce" way we worked around that is to inject into the syscall dispatcher (using a uprobe) and always collect the "user-space" stack, then, for the actual samples, make both stacks available to the unwinder. This obviously comes with a big performance hit, and might distort your performance characteristics. However, it actually turned out to be useful to find some issues.
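The "make both stacks available to the unwinder" part can be pictured with a small Python sketch. This is not our actual implementation; `StackSnapshot`, `CombinedStackMemory`, and the addresses are hypothetical names and values chosen for illustration. The idea is simply that the unwinder reads memory through one object that resolves each address against whichever captured stack region contains it.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class StackSnapshot:
    """A captured stack region: `data` was copied starting at address `base`."""
    base: int
    data: bytes

    def contains(self, addr: int, size: int) -> bool:
        return self.base <= addr and addr + size <= self.base + len(self.data)

    def read(self, addr: int, size: int) -> bytes:
        offset = addr - self.base
        return self.data[offset:offset + size]


class CombinedStackMemory:
    """Memory view for an unwinder, backed by several stack snapshots."""

    def __init__(self, snapshots: Iterable[StackSnapshot]):
        self.snapshots = list(snapshots)

    def read(self, addr: int, size: int) -> Optional[bytes]:
        # Serve the read from the first snapshot that fully covers it.
        for snapshot in self.snapshots:
            if snapshot.contains(addr, size):
                return snapshot.read(addr, size)
        return None  # address was not captured in any snapshot

    def read_u64(self, addr: int) -> Optional[int]:
        raw = self.read(addr, 8)
        return None if raw is None else int.from_bytes(raw, "little")


# Hypothetical example: one stack captured with the sample, and the
# "user-space" stack captured earlier by the uprobe in the syscall dispatcher.
sampled_stack = StackSnapshot(base=0x7000_0000, data=(0x1111).to_bytes(8, "little"))
user_stack = StackSnapshot(base=0x0002_0000, data=(0x2222).to_bytes(8, "little"))
memory = CombinedStackMemory([sampled_stack, user_stack])
```

When the unwinder follows a saved stack pointer from one stack into the other, the read simply lands in the other snapshot instead of failing.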
An alternative solution, which we never ended up actually implementing, would be to use an eBPF program for the sampling/stack collection. We could make it copy both stacks explicitly.
Obviously, the more elegant solution would be, as you mentioned, to either have a switch in Wine to use only one stack or, alternatively, make it possible for `perf_event_open` to collect both stacks. It would not be the first time that the Linux kernel has special handling for Wine.
I should probably add that it only does that when it has the debug info for the PE modules; this isn't currently well supported, and needs to be done somewhat manually.
Are you referring to the Windows unwind information (runtime functions), or the DWARF information mentioned above? Our LLDB patch (which we hopefully upstream soon), as well as the libunwindstack implementation, actually take care of all kinds of PE/COFF modules, and are able to use runtime functions as well as DWARF information.