[PATCH 0/1] MR10471: unwind: fix some invalid `reinterpret_cast`s
Performing type `reinterpret-cast`s between pointers to types that are not _pointers-interconvertible_ and then using the results is incorrect usage. Only the “array of N unsigned char” or of type “array of N std::byte” types can used as storage (in general case). To convert a pointer to a structure object containing a byte array into a pointer to an object explicitly created in that storage, std::launder must be called on that "newly typed" pointer. Accessing an explicitly created object in such a byte store through a pointer to an object of the structure type that contains the given byte array requires calling std::launder. ```cpp struct S { unsigned char data[_LIBUNWIND_CURSOR_SIZE * sizeof(uint64_t)]; } s; struct S2 { int i; }; new(s.data) S2(42); auto s2_ptr1 = reinterpret_cast<S2*>(s.data); auto s2_ptr2 = reinterpret_cast<S2*>(s); // this pointer is not changed ([expr.static.cast#12]) and points to s (struct S) auto s2_ptr3 = std::launder(reinterpret_cast<S2*>(s)); int i1 = s2_ptr1->i; // OK int i3 = s2_ptr3->i; // OK (?) int i2 = s2_ptr2->i; // this is undefined behavior ``` Also a pointer to object of type uint16\[2\] cannot be reinterpreted as uint32 -- this is undefined behavior (https://godbolt.org/z/vY1zKEvo6). https://eel.is/c++draft/intro.object#3 https://eel.is/c++draft/basic.compound#5 https://eel.is/c++draft/basic.compound#6 https://eel.is/c++draft/basic.lval#11 https://eel.is/c++draft/expr.static.cast#12 https://eel.is/c++draft/expr.reinterpret.cast#7 https://en.cppreference.com/w/cpp/utility/launder.html https://en.cppreference.com/w/cpp/language/reinterpret_cast.html -- https://gitlab.winehq.org/wine/wine/-/merge_requests/10471
From: Safocl Stollmannovic <safocl88@gmail.com> Some `reinterpret_cast`s were invalid. https://eel.is/c++draft/basic.compound#5 https://eel.is/c++draft/basic.compound#6 https://eel.is/c++draft/basic.lval#11 https://eel.is/c++draft/expr.static.cast#12 https://eel.is/c++draft/expr.reinterpret.cast#7 https://en.cppreference.com/w/cpp/utility/launder.html --- libs/unwind/include/libunwind.h | 8 +++++++- libs/unwind/src/Unwind-seh.cpp | 21 +++++++++------------ libs/unwind/src/UnwindCursor.hpp | 16 +++++++++++++--- libs/unwind/src/libunwind.cpp | 9 ++++----- 4 files changed, 33 insertions(+), 21 deletions(-) diff --git a/libs/unwind/include/libunwind.h b/libs/unwind/include/libunwind.h index df596ee0bfd..9d12be47ae2 100644 --- a/libs/unwind/include/libunwind.h +++ b/libs/unwind/include/libunwind.h @@ -68,7 +68,13 @@ struct unw_context_t { typedef struct unw_context_t unw_context_t; struct unw_cursor_t { - uint64_t data[_LIBUNWIND_CURSOR_SIZE]; + /* Only the “array of N unsigned char” or of type “array of N std::byte” types can used as + storage (https://eel.is/c++draft/intro.object#3) */ + union { + unsigned char data[_LIBUNWIND_CURSOR_SIZE * sizeof(uint64_t)]; + /* `aligner` exists only for alignment */ + uint64_t aligner__[_LIBUNWIND_CURSOR_SIZE]; + }; }; typedef struct unw_cursor_t unw_cursor_t; diff --git a/libs/unwind/src/Unwind-seh.cpp b/libs/unwind/src/Unwind-seh.cpp index 078edc97d03..588a9d4283c 100644 --- a/libs/unwind/src/Unwind-seh.cpp +++ b/libs/unwind/src/Unwind-seh.cpp @@ -442,21 +442,18 @@ _Unwind_GetRegionStart(struct _Unwind_Context *context) { static int _unw_init_seh(unw_cursor_t *cursor, CONTEXT *context) { #ifdef _LIBUNWIND_TARGET_X86_64 - new ((void *)cursor) UnwindCursor<LocalAddressSpace, Registers_x86_64>( + auto *co = new (cursor->data) UnwindCursor<LocalAddressSpace, Registers_x86_64>( context, LocalAddressSpace::sThisAddressSpace); - auto *co = reinterpret_cast<AbstractUnwindCursor *>(cursor); co->setInfoBasedOnIPRegister(); return UNW_ESUCCESS; #elif defined(_LIBUNWIND_TARGET_ARM) - new ((void *)cursor) UnwindCursor<LocalAddressSpace, Registers_arm>( + auto *co = new (cursor->data) UnwindCursor<LocalAddressSpace, Registers_arm>( context, LocalAddressSpace::sThisAddressSpace); - auto *co = reinterpret_cast<AbstractUnwindCursor *>(cursor); co->setInfoBasedOnIPRegister(); return UNW_ESUCCESS; #elif defined(_LIBUNWIND_TARGET_AARCH64) - new ((void *)cursor) UnwindCursor<LocalAddressSpace, Registers_arm64>( + auto *co = new (cursor->data) UnwindCursor<LocalAddressSpace, Registers_arm64>( context, LocalAddressSpace::sThisAddressSpace); - auto *co = reinterpret_cast<AbstractUnwindCursor *>(cursor); co->setInfoBasedOnIPRegister(); return UNW_ESUCCESS; #else @@ -467,11 +464,11 @@ _unw_init_seh(unw_cursor_t *cursor, CONTEXT *context) { static DISPATCHER_CONTEXT * _unw_seh_get_disp_ctx(unw_cursor_t *cursor) { #ifdef _LIBUNWIND_TARGET_X86_64 - return reinterpret_cast<UnwindCursor<LocalAddressSpace, Registers_x86_64> *>(cursor)->getDispatcherContext(); + return reinterpret_cast<UnwindCursor<LocalAddressSpace, Registers_x86_64> *>(cursor->data)->getDispatcherContext(); #elif defined(_LIBUNWIND_TARGET_ARM) - return reinterpret_cast<UnwindCursor<LocalAddressSpace, Registers_arm> *>(cursor)->getDispatcherContext(); + return reinterpret_cast<UnwindCursor<LocalAddressSpace, Registers_arm> *>(cursor->data)->getDispatcherContext(); #elif defined(_LIBUNWIND_TARGET_AARCH64) - return reinterpret_cast<UnwindCursor<LocalAddressSpace, Registers_arm64> *>(cursor)->getDispatcherContext(); + return reinterpret_cast<UnwindCursor<LocalAddressSpace, Registers_arm64> *>(cursor->data)->getDispatcherContext(); #else return nullptr; #endif @@ -480,11 +477,11 @@ _unw_seh_get_disp_ctx(unw_cursor_t *cursor) { static void _unw_seh_set_disp_ctx(unw_cursor_t *cursor, DISPATCHER_CONTEXT *disp) { #ifdef _LIBUNWIND_TARGET_X86_64 - reinterpret_cast<UnwindCursor<LocalAddressSpace, Registers_x86_64> *>(cursor)->setDispatcherContext(disp); + reinterpret_cast<UnwindCursor<LocalAddressSpace, Registers_x86_64> *>(cursor->data)->setDispatcherContext(disp); #elif defined(_LIBUNWIND_TARGET_ARM) - reinterpret_cast<UnwindCursor<LocalAddressSpace, Registers_arm> *>(cursor)->setDispatcherContext(disp); + reinterpret_cast<UnwindCursor<LocalAddressSpace, Registers_arm> *>(cursor->data)->setDispatcherContext(disp); #elif defined(_LIBUNWIND_TARGET_AARCH64) - reinterpret_cast<UnwindCursor<LocalAddressSpace, Registers_arm64> *>(cursor)->setDispatcherContext(disp); + reinterpret_cast<UnwindCursor<LocalAddressSpace, Registers_arm64> *>(cursor->data)->setDispatcherContext(disp); #endif } diff --git a/libs/unwind/src/UnwindCursor.hpp b/libs/unwind/src/UnwindCursor.hpp index 52439f9b545..b305a373982 100644 --- a/libs/unwind/src/UnwindCursor.hpp +++ b/libs/unwind/src/UnwindCursor.hpp @@ -1772,9 +1772,19 @@ bool UnwindCursor<A, R>::getInfoFromSEH(pint_t pc) { // these structures.) // N.B. UNWIND_INFO structs are DWORD-aligned. uint32_t lastcode = (xdata->CountOfCodes + 1) & ~1; - const uint32_t *handler = reinterpret_cast<uint32_t *>(&xdata->UnwindCodes[lastcode]); - _info.lsda = reinterpret_cast<unw_word_t>(handler+1); - if (*handler) { + // NOTE: lastcode can be equal to or greater than 2, then accessing UnwindCodes[lastcode] is + // out of bound (i.e. undefined behavior). The external memory outside the class object + // cannot be reached from a pointer to a subobject. The only valid case is when CountOfCodes + // is 0, and then lastcode will be equal 0. + // However, `reinterpret_cast` from a pointer to uint16[2] to a pointer to uint32 is not + // allowed in any case. (https://godbolt.org/z/vY1zKEvo6) + // It turns out that here we just need to check each of the array elements for zero. + const uint16_t hi = xdata->UnwindCodes[0]; + const uint16_t low = xdata->UnwindCodes[1]; +#warning "invalid reinterpret-cast" + // FIXME: Does `xdata` actually point to an array of UNWIND_INFO? + _info.lsda = reinterpret_cast<unw_word_t>(xdata+1); + if (hi && low) { _info.handler = reinterpret_cast<unw_word_t>(__libunwind_seh_personality); } else _info.handler = 0; diff --git a/libs/unwind/src/libunwind.cpp b/libs/unwind/src/libunwind.cpp index 5afe6235c4c..b5421136768 100644 --- a/libs/unwind/src/libunwind.cpp +++ b/libs/unwind/src/libunwind.cpp @@ -73,10 +73,9 @@ _LIBUNWIND_EXPORT int unw_init_local(unw_cursor_t *cursor, # error Architecture not supported #endif // Use "placement new" to allocate UnwindCursor in the cursor buffer. - new ((void *)cursor) UnwindCursor<LocalAddressSpace, REGISTER_KIND>( + auto *co = new (cursor->data) UnwindCursor<LocalAddressSpace, REGISTER_KIND>( context, LocalAddressSpace::sThisAddressSpace); #undef REGISTER_KIND - AbstractUnwindCursor *co = (AbstractUnwindCursor *)cursor; co->setInfoBasedOnIPRegister(); return UNW_ESUCCESS; @@ -94,17 +93,17 @@ _LIBUNWIND_EXPORT int unw_init_remote_thread(unw_cursor_t *cursor, // use "placement new" to allocate UnwindCursor in the cursor buffer switch (as->cpuType) { case CPU_TYPE_I386: - new ((void *)cursor) + new (cursor->data) UnwindCursor<RemoteAddressSpace<Pointer32<LittleEndian>>, Registers_x86>(((unw_addr_space_i386 *)as)->oas, arg); break; case CPU_TYPE_X86_64: - new ((void *)cursor) + new (cursor->data) UnwindCursor<RemoteAddressSpace<Pointer64<LittleEndian>>, Registers_x86_64>(((unw_addr_space_x86_64 *)as)->oas, arg); break; case CPU_TYPE_POWERPC: - new ((void *)cursor) + new (cursor->data) UnwindCursor<RemoteAddressSpace<Pointer32<BigEndian>>, Registers_ppc>(((unw_addr_space_ppc *)as)->oas, arg); break; -- GitLab https://gitlab.winehq.org/wine/wine/-/merge_requests/10471
Alfred Agrell (@Alcaro) commented about libs/unwind/include/libunwind.h:
typedef struct unw_context_t unw_context_t;
struct unw_cursor_t { - uint64_t data[_LIBUNWIND_CURSOR_SIZE]; + /* Only the “array of N unsigned char” or of type “array of N std::byte” types can used as + storage (https://eel.is/c++draft/intro.object#3) */ + union { + unsigned char data[_LIBUNWIND_CURSOR_SIZE * sizeof(uint64_t)]; + /* `aligner` exists only for alignment */ Wouldn't alignas make more sense? Or are we trying to avoid c++11 features?
-- https://gitlab.winehq.org/wine/wine/-/merge_requests/10471#note_134023
Alfred Agrell (@Alcaro) commented about libs/unwind/src/UnwindCursor.hpp:
uint32_t lastcode = (xdata->CountOfCodes + 1) & ~1; - const uint32_t *handler = reinterpret_cast<uint32_t *>(&xdata->UnwindCodes[lastcode]); - _info.lsda = reinterpret_cast<unw_word_t>(handler+1); - if (*handler) { + // NOTE: lastcode can be equal to or greater than 2, then accessing UnwindCodes[lastcode] is + // out of bound (i.e. undefined behavior). The external memory outside the class object + // cannot be reached from a pointer to a subobject. The only valid case is when CountOfCodes + // is 0, and then lastcode will be equal 0. + // However, `reinterpret_cast` from a pointer to uint16[2] to a pointer to uint32 is not + // allowed in any case. (https://godbolt.org/z/vY1zKEvo6) + // It turns out that here we just need to check each of the array elements for zero. + const uint16_t hi = xdata->UnwindCodes[0]; + const uint16_t low = xdata->UnwindCodes[1]; +#warning "invalid reinterpret-cast" + // FIXME: Does `xdata` actually point to an array of UNWIND_INFO? + _info.lsda = reinterpret_cast<unw_word_t>(xdata+1);
AFAIK memcpy can alias anything, no need for std::byte or whatever. -- https://gitlab.winehq.org/wine/wine/-/merge_requests/10471#note_134024
Alfred Agrell (@Alcaro) commented about libs/unwind/src/UnwindCursor.hpp:
- const uint32_t *handler = reinterpret_cast<uint32_t *>(&xdata->UnwindCodes[lastcode]); - _info.lsda = reinterpret_cast<unw_word_t>(handler+1); - if (*handler) { + // NOTE: lastcode can be equal to or greater than 2, then accessing UnwindCodes[lastcode] is + // out of bound (i.e. undefined behavior). The external memory outside the class object + // cannot be reached from a pointer to a subobject. The only valid case is when CountOfCodes + // is 0, and then lastcode will be equal 0. + // However, `reinterpret_cast` from a pointer to uint16[2] to a pointer to uint32 is not + // allowed in any case. (https://godbolt.org/z/vY1zKEvo6) + // It turns out that here we just need to check each of the array elements for zero. + const uint16_t hi = xdata->UnwindCodes[0]; + const uint16_t low = xdata->UnwindCodes[1]; +#warning "invalid reinterpret-cast" + // FIXME: Does `xdata` actually point to an array of UNWIND_INFO? + _info.lsda = reinterpret_cast<unw_word_t>(xdata+1); + if (hi && low) { This looks like a behavioral change, or two. Are they intentional?
-- https://gitlab.winehq.org/wine/wine/-/merge_requests/10471#note_134025
I don't think we have any expertise on this lib at winehq. I'd recommend submitting it upstream https://github.com/llvm/llvm-project/blob/main/libunwind/include/libunwind.h..., and seeing if they like it. (yes, things have moved around a little since LLVM 8. No, I don't know why we're using something that old. I don't think the reason is secret, but I never asked.) -- https://gitlab.winehq.org/wine/wine/-/merge_requests/10471#note_134026
(yes, things have moved around a little since LLVM 8. No, I don't know why we're using something that old. I don't think the reason is secret, but I never asked.)
It's the last version that was MIT-licensed. Newer versions are Apache2 which is not compatible with LGPL v2. -- https://gitlab.winehq.org/wine/wine/-/merge_requests/10471#note_134037
On Fri Mar 27 12:31:30 2026 +0000, Alfred Agrell wrote:
Wouldn't alignas make more sense? Or are we trying to avoid c++11 features? This is C code as I believe
-- https://gitlab.winehq.org/wine/wine/-/merge_requests/10471#note_134047
On Fri Mar 27 13:17:26 2026 +0000, Safocl Stollmannovic wrote:
This is C code as I believe Before C11, there was no compiler-independent version of \_Alignas.
-- https://gitlab.winehq.org/wine/wine/-/merge_requests/10471#note_134049
On Fri Mar 27 12:31:31 2026 +0000, Alfred Agrell wrote:
AFAIK memcpy can alias anything, no need for std::byte or whatever. Yes, but I'm not using std::byte here. And I don't think std::memcpy is better than explicitly comparing each element.
-- https://gitlab.winehq.org/wine/wine/-/merge_requests/10471#note_134050
On Fri Mar 27 13:25:03 2026 +0000, Safocl Stollmannovic wrote:
Yes, but I'm not using std::byte here. And I don't think std::memcpy is better than explicitly comparing each element. Can you suggest pseudocode (or perhaps actual final desired code) here that you think would be better?
-- https://gitlab.winehq.org/wine/wine/-/merge_requests/10471#note_134052
On Fri Mar 27 13:26:54 2026 +0000, Safocl Stollmannovic wrote:
Can you suggest pseudocode (or perhaps actual final desired code) here that you think would be better? Unless it's actually broken on some toolchain that's reasonably in use, it seems better option would be not to touch anything.
-- https://gitlab.winehq.org/wine/wine/-/merge_requests/10471#note_134053
On Fri Mar 27 12:31:31 2026 +0000, Alfred Agrell wrote:
This looks like a behavioral change, or two. Are they intentional? Since the code had (potencially) _undefined behavior_ in this section, it's difficult to predict what was actually intended. I'm really hoping for help from the community and developers in this case. Since it's difficult for me to clearly trace what was supposed to happen here, I would be very grateful if someone could answer the questions I've temporarily posed in the code comments. This is, in any case, fixable within the context of this merge request, and I really appreciate any hints.
-- https://gitlab.winehq.org/wine/wine/-/merge_requests/10471#note_134054
On Fri Mar 27 13:30:55 2026 +0000, Safocl Stollmannovic wrote:
Since the code had (potencially) _undefined behavior_ in this section, it's difficult to predict what was actually intended. I'm really hoping for help from the community and developers in this case. Since it's difficult for me to clearly trace what was supposed to happen here, I would be very grateful if someone could answer the questions I've temporarily posed in the code comments. This is, in any case, fixable within the context of this merge request, and I really appreciate any hints. but if you are talking about converting the code
```cpp uint16 a[2] = {...}; if (*reinterpret_cast<uint32*>(a)){...} ``` into the code ```cpp uint16 a[2] = {...}; const uint16_t hi = a[0]; const uint16_t low = a[1]; if (hi && low){...} ``` , then yes, I consider this change to be logically consistent with the original code. -- https://gitlab.winehq.org/wine/wine/-/merge_requests/10471#note_134056
On Fri Mar 27 12:59:30 2026 +0000, Alexandre Julliard wrote:
(yes, things have moved around a little since LLVM 8. No, I don't know why we're using something that old. I don't think the reason is secret, but I never asked.) It's the last version that was MIT-licensed. Newer versions are Apache2 which is not compatible with LGPL v2. But the LLVM project doesn't use the placement-new-operator to create new objects in storage. They use the corresponding compiler intrinsics (https://github.com/llvm/llvm-project/blob/abc0674f839ab4b932f5cc6e15da045668...).
However, here (and in several other code snippets) the placement-new operator is used: https://gitlab.winehq.org/wine/wine/-/blob/master/libs/unwind/src/libunwind.... -- https://gitlab.winehq.org/wine/wine/-/merge_requests/10471#note_134057
I created an issue in the C++ CWG repository on some similar points. https://github.com/cplusplus/CWG/issues/874 -- https://gitlab.winehq.org/wine/wine/-/merge_requests/10471#note_134063
participants (5)
-
Alexandre Julliard (@julliard) -
Alfred Agrell (@Alcaro) -
Nikolay Sivov (@nsivov) -
Safocl Stollmannovic -
Safocl Stollmannovic (@safocl88)