FWIW I'm achieving similar results in GDB with https://gitlab.winehq.org/rbernon/wine/-/commit/bf615abeb0c206e40678cec0583e....
As I've been told it's incorrect I didn't try upstreaming it further, but as far as I'm concerned GDB is fully happy with it, and combined with https://gitlab.winehq.org/rbernon/wine/-/commit/9e900267f6eacb8511e18ff2fc19... and https://gitlab.winehq.org/rbernon/wine/-/commit/e21cf3f7702837557f9cc221b204... it can unwind through the syscall dispatcher just fine.
Imho, this variant with its many macros is likely to be considered too complex for the small benefit (it's really only for the convenience of debugging Wine itself with a Unix debugger, which, I completely agree, *is* useful, though mostly for developers).
----
Hey, I saw your change and would also have hoped that it actually works, as it is obviously much more elegant not to fall back on the `.cfi_escape` primitives.
I guess the biggest issue with that change, however, is that it goes against what the DWARF spec says about the CFA:
"Typically, the CFA is defined to be the value of the stack pointer at the call site in the previous frame (which may be different from its value on entry to the current frame)."
In particular, your change unfortunately did not work with our version of [libunwindstack](https://github.com/google/orbit/tree/main/third_party/libunwindstack) or with our LLDB changes (unfortunately not yet upstream). I am curious: does your change manage to unwind completely through the dispatcher and through the entire Windows stack, as shown here? ![mixed_callstack_unwinding](https://werat.dev/blog/how-wine-works-101/6.png)
A note about the macros: I agree, they look horrible at first glance, but they are really just there for readability. If you ignore them for a second, you only need to grasp the `__ASM_CFI_DW_OP_bregN` and `__ASM_CFI_DW_OP_deref` instructions to follow the computation of the registers and the CFA.
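To make that a bit more concrete, here is a rough sketch of what an unwinder does with such an expression. It is not the code from the patch: the register values, the stack slot at offset 0x10 and the helper names (`read_sleb128`, `evaluate`) are made up for illustration. Only the opcode values come from the DWARF spec, and register 7 is `rsp` in the x86-64 DWARF numbering.

```python
# Opcode values as defined by the DWARF standard.
DW_OP_deref = 0x06
DW_OP_plus_uconst = 0x23
DW_OP_breg0 = 0x70  # DW_OP_breg0 .. DW_OP_breg31 occupy 0x70 .. 0x8f


def read_uleb128(data, pos):
    """Decode an unsigned LEB128 value, return (value, next position)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos


def read_sleb128(data, pos):
    """Decode a signed LEB128 value, return (value, next position)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            if byte & 0x40:  # sign bit of the last byte is set
                result -= 1 << shift
            return result, pos


def evaluate(expr, regs, memory):
    """Evaluate a tiny subset of DWARF expressions on a value stack."""
    stack = []
    pos = 0
    while pos < len(expr):
        op = expr[pos]
        pos += 1
        if DW_OP_breg0 <= op <= DW_OP_breg0 + 31:
            # bregN: push (value of register N) + signed offset
            offset, pos = read_sleb128(expr, pos)
            stack.append(regs[op - DW_OP_breg0] + offset)
        elif op == DW_OP_deref:
            # deref: replace the address on top of the stack by the value stored there
            stack.append(memory[stack.pop()])
        elif op == DW_OP_plus_uconst:
            addend, pos = read_uleb128(expr, pos)
            stack.append(stack.pop() + addend)
        else:
            raise ValueError(f"unsupported opcode {op:#x}")
    return stack[-1]


# Hypothetical situation: the caller's stack pointer was saved at rsp + 0x10.
regs = {7: 0x7000}               # register 7 = rsp (x86-64 DWARF numbering)
memory = {0x7010: 0x7FFF0000}    # made-up saved value
expr = bytes([DW_OP_breg0 + 7, 0x10, DW_OP_deref])  # breg7 +0x10; deref
cfa = evaluate(expr, regs, memory)
```

So "breg7 0x10, deref" reads as "take `rsp + 0x10` and load the value stored there", which is all the CFA and register rules in the dispatcher annotations boil down to.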
Regarding the usefulness of that change: this is clearly for devs (not only of Wine itself, but also of e.g. DXVK or similar projects). It is also not only for debuggers but for profiling as well, to actually find performance problems in the syscalls. As the whole code here is already very low level, I think this is also pretty helpful for people maintaining the syscall dispatcher, since you can, as said in the commit message, step through each instruction of the dispatcher with a debugger.
So I really think it is worth the effort.