In C++11, char16_t is a distinct fundamental type, but in C11 it is merely a typedef in <uchar.h>.
Explicitly mention char16_t only in C++11 (where it is built-in); otherwise define WCHAR to match u"...", without naming char16_t.
Remove WINE_UNICODE_CHAR16; it is now the default when supported.
Signed-off-by: Kevin Puetz <PuetzKevinA@JohnDeere.com>
---
Per https://www.winehq.org/pipermail/wine-devel/2020-July/170502.html, Jacek wants to avoid including <uchar.h>; without it, C11 has to make some assumptions the standard doesn't guarantee. But that's far from Wine's first foray into implementation-defined behavior, so OK :-)
C11 defines char16_t as a typedef for uint_least16_t, which is likely (though not guaranteed) the same as unsigned short (e.g. it could be unsigned int if int were a 16-bit type). I do see that basetsd.h has a comment saying only ILP32, LP64, or P64 type models are supported, though, so Wine would probably already not work on a platform with a 16-bit int.
However, GCC and clang provide a macro, __CHAR16_TYPE__, which is easy to test and, if defined, will definitely match <uchar.h>. This patch uses it if available, and otherwise falls back to unsigned short as before.
One *could* delete the #if defined(__CHAR16_TYPE__) case too and just assume unsigned short is right, which in practice it is. But IMO using the gcc/clang #define makes the goal clearer.
---
Removing WINE_UNICODE_CHAR16 changes little for C, C++98/C++03, or anything using -fshort-wchar/WINE_UNICODE_NATIVE, but it has some ABI implications for wineg++ -std=c++11 -fno-short-wchar.
On the plus side:
* TEXT() and OLESTR() macros "just work" in UNICODE builds (for C11 and C++11).
* We revert an #ifdef with ABI implications; now there's just -f(no-)short-wchar and -std=c++*, which already mattered.
* In C++11, WCHAR overloads are no longer ambiguous vs. integral types, and WCHAR is recognizable as text rather than promoted to int. This is much more like the situation on MSVC, where WCHAR == wchar_t; char16_t is also distinct from uint16_t, unsigned short, etc.
* We avoid ever having the situation where TEXT("...") -> u"..." is accepted by the compiler and produces char16_t[], but that isn't compatible with TCHAR[] or LPCTSTR.
Silently getting the wrong type is obnoxious in C++, since templates can move the type-mismatch compile error far away from the offending TEXT("...") macro, or even compile but use an unexpected overload. Now u"..." either doesn't compile, or it's correct, so we can drop https://www.winehq.org/pipermail/wine-devel/2020-July/170227.html
On the minus side:
* any libraries built with wineg++ -std=c++11 -fno-short-wchar change their mangled names for functions with WCHAR/LPCWSTR parameters because char16_t is a distinct fundamental type.
This doesn't affect compatibility of wine itself (which always exports things under their MSVC-ish mangling as-if using wchar_t), but wineg++ could fail to link to a .dll.so built with wine 5.0.x headers. I don't know to what extent wineg++ tries to promise that this works, but IMO we are headed into a major version 6, and binaries are already not fully interchangeable due to winecrt0/__wine_spec_init changes. So I think it preferable to just have good defaults that are MSVC-like, and fewer possible mistakes.
But it's certainly possible to keep the WINE_UNICODE_CHAR16 opt-in (or add a WINE_NO_UNICODE_CHAR16 opt-out), if we must. Or we could follow gcc's lead on the upcoming C++20 -f(no-)char8_t, and winegcc could have -f(no-)char16_t, or -fchar16_t-wchar (synthesizing the macro like it does with WINE_UNICODE_NATIVE). But I like the simplicity of "just works if the compiler has support", unless someone objects.
---
 include/sqltypes.h | 4 +++-
 include/tchar.h    | 4 +++-
 include/winnt.h    | 4 +++-
 3 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/include/sqltypes.h b/include/sqltypes.h
index 0923f6b362..08b6fc2f7b 100644
--- a/include/sqltypes.h
+++ b/include/sqltypes.h
@@ -30,8 +30,10 @@ extern "C" {
 typedef unsigned char SQLCHAR;
 #if defined(WINE_UNICODE_NATIVE)
 typedef wchar_t SQLWCHAR;
-#elif defined(WINE_UNICODE_CHAR16)
+#elif __cpp_unicode_literals >= 200710
 typedef char16_t SQLWCHAR;
+#elif defined(__CHAR16_TYPE__)
+typedef __CHAR16_TYPE__ SQLWCHAR;
 #else
 typedef unsigned short SQLWCHAR;
 #endif
diff --git a/include/tchar.h b/include/tchar.h
index 9fc4c72099..e1e21df272 100644
--- a/include/tchar.h
+++ b/include/tchar.h
@@ -240,8 +240,10 @@ typedef unsigned short wctype_t;
 #ifndef __TCHAR_DEFINED
 #if defined(WINE_UNICODE_NATIVE)
 typedef wchar_t _TCHAR;
-#elif defined(WINE_UNICODE_CHAR16)
+#elif __cpp_unicode_literals >= 200710
 typedef char16_t _TCHAR;
+#elif defined(__CHAR16_TYPE__)
+typedef __CHAR16_TYPE__ _TCHAR;
 #else
 typedef unsigned short _TCHAR;
 #endif
diff --git a/include/winnt.h b/include/winnt.h
index 63567ba62e..1874d53430 100644
--- a/include/winnt.h
+++ b/include/winnt.h
@@ -462,8 +462,10 @@ typedef int LONG, *PLONG;
 /* Some systems might have wchar_t, but we really need 16 bit characters */
 #if defined(WINE_UNICODE_NATIVE)
 typedef wchar_t WCHAR;
-#elif defined(WINE_UNICODE_CHAR16)
+#elif __cpp_unicode_literals >= 200710
 typedef char16_t WCHAR;
+#elif defined(__CHAR16_TYPE__)
+typedef __CHAR16_TYPE__ WCHAR;
 #else
 typedef unsigned short WCHAR;
 #endif
When winegcc is using an underlying POSIX libc (rather than -mno-cygwin), it will only have `char` and `wchar_t` functions. If _TCHAR is neither of these, there may be no suitable function to alias the _tcs* macros to.
Signed-off-by: Kevin Puetz <PuetzKevinA@JohnDeere.com>
---
wine-5.2 (c12089039637dec5e598ed1c41e707f057494242) allowed <tchar.h> to be used without MSVCRT. The _TEXT(...) macro and the _TCHAR typedef are still useful there, but the _tcs* macros may be unusable.
Omitting them yields a clearer compile error than mapping them to e.g. the wcs* functions, which do not accept a Windows WCHAR (!= wchar_t).
---
 include/tchar.h | 8 ++++++++
 1 file changed, 8 insertions(+)
diff --git a/include/tchar.h b/include/tchar.h
index e1e21df272..5eb8972563 100644
--- a/include/tchar.h
+++ b/include/tchar.h
@@ -37,9 +37,15 @@ extern "C" {
 #define _strninc(str,n) (((char*)(str))+(n))
 #define _strspnp(s1,s2) (*((s1)+=strspn((s1),(s2))) ? (s1) : NULL)
 
+#if defined(__MSVCRT__) || defined(_MSC_VER) || (defined(WINE_UNICODE_NATIVE) && defined(_UNICODE)) || !(defined(_UNICODE) || defined(_MBCS))
 /*****************************************************************************
  * tchar mappings
+ *
+ * These can only be defined when the libc in use will have functions accepting _TCHAR, i.e.
+ * -mno-cygwin / __MSVCRT__ or _MSC_VER
+ * -fshort-wchar / WINE_UNICODE_NATIVE and _UNICODE (_TCHAR == WCHAR == wchar_t, so the libc wcs* functions are UTF-16)
+ * _TCHAR == `char` without _MBCS
  */
 #ifndef _UNICODE
 # ifndef _MBCS
@@ -223,6 +229,8 @@ extern "C" {
 #define _vtprintf WINE_tchar_routine(vprintf, vprintf, vwprintf)
 #define _TEOF WINE_tchar_routine(EOF, EOF, WEOF)
 
+#endif /* tchar mappings */
+
 #define __T(x) __TEXT(x)
 #define _T(x) __T(x)
 #define _TEXT(x) __T(x)
Hi Kevin,
On 18.09.2020 00:49, Kevin Puetz wrote:
> In C++11, char16_t is a distinct fundamental type, but in C11 it is merely a typedef in <uchar.h>.
> Explicitly mention char16_t only in C++11 (where it is built-in); otherwise define WCHAR to match u"...", without naming char16_t.
> Remove WINE_UNICODE_CHAR16; it is now the default when supported.
I like the part that uses __cpp_unicode_literals; it would make our headers much more C++ friendly by default.
I'm less sure whether we want __CHAR16_TYPE__. In practice, it will only affect C, which is much less strict about types anyway. We assume in the Wine code base that WCHAR is a 16-bit unsigned integer. If __CHAR16_TYPE__ is something else, we shouldn't use it; if it's the same, then there is little point in using it. I think that the original problem is fixed by the __cpp_unicode_literals change alone, so how about doing just that?
Thanks,
Jacek
-----Original Message-----
From: Jacek Caban <jacek@codeweavers.com>
Sent: Wednesday, September 23, 2020 9:44 AM
> In C++11, char16_t is a distinct fundamental type, but in C11 it is merely a typedef in <uchar.h>.
> Explicitly mention char16_t only in C++11 (where it is built-in); otherwise define WCHAR to match u"...", without naming char16_t.
> Remove WINE_UNICODE_CHAR16; it is now the default when supported.
>
> I like the part that uses __cpp_unicode_literals; it would make our headers much more C++ friendly by default.
>
> I'm less sure whether we want __CHAR16_TYPE__. In practice, it will only affect C, which is much less strict about types anyway. We assume in the Wine code base that WCHAR is a 16-bit unsigned integer. If __CHAR16_TYPE__ is something else, we shouldn't use it; if it's the same, then there is little point in using it. I think that the original problem is fixed by the __cpp_unicode_literals change alone, so how about doing just that?
__CHAR16_TYPE__ (per the C11 standard) is always the same type as uint_least16_t, which in turn is always a typedef for one of the ordinary integer types. So it is just a 16-bit unsigned integer, unless one doesn't exist at all, in which case it's the smallest unsigned integer type that can hold 0..65535. It's never a distinct type like char16_t is for C++11/__cpp_unicode_literals.
What isn't guaranteed is that the only suitable type is `unsigned short`; it could legally be `unsigned int` (if that were also 16-bit).
I don't quite agree that ignoring this is preferable because C "is much less strict about types anyway":

1. The __CHAR16_TYPE__ path applies to C++98 and C++03, which lack __cpp_unicode_literals but might have a C-ish char16_t. So its behavior shouldn't completely ignore C++.

2. C has mostly the same strict-aliasing rules as C++, which would not permit a `const unsigned short *` pointing into an `unsigned int[]`, even if they are the same size. So when LPCTSTR = TEXT("...") becomes const WCHAR * = u"...", if WCHAR is `unsigned short` but __CHAR16_TYPE__ is e.g. `unsigned int`, there is a strict-aliasing violation (and undefined behavior) in C too.

Now, in practice Wine probably doesn't support any platform with a 16-bit `int`, so __CHAR16_TYPE__ is just going to be `unsigned short` anyway. This way is pedantically more portable (to various architectures, as long as they use gcc or clang, which provide __CHAR16_TYPE__), but in practice they will preprocess to the same thing, which makes both the arguments for and against moot.
So I'll drop it (after this last attempt to justify it) if that's what you want; I care about char16_t mostly in C++11, but was just trying to fix the C path to be portable too on general principle. Not a hill I need to die on :-)
On 23.09.2020 19:43, Puetz Kevin A wrote:
> -----Original Message-----
> From: Jacek Caban <jacek@codeweavers.com>
> Sent: Wednesday, September 23, 2020 9:44 AM
>
>> In C++11, char16_t is a distinct fundamental type, but in C11 it is merely a typedef in <uchar.h>.
>> Explicitly mention char16_t only in C++11 (where it is built-in); otherwise define WCHAR to match u"...", without naming char16_t.
>> Remove WINE_UNICODE_CHAR16; it is now the default when supported.
>>
>> I like the part that uses __cpp_unicode_literals; it would make our headers much more C++ friendly by default.
>>
>> I'm less sure whether we want __CHAR16_TYPE__. In practice, it will only affect C, which is much less strict about types anyway. We assume in the Wine code base that WCHAR is a 16-bit unsigned integer. If __CHAR16_TYPE__ is something else, we shouldn't use it; if it's the same, then there is little point in using it. I think that the original problem is fixed by the __cpp_unicode_literals change alone, so how about doing just that?
>
> __CHAR16_TYPE__ (per the C11 standard) is always the same type as uint_least16_t, which in turn is always a typedef for one of the ordinary integer types. So it is just a 16-bit unsigned integer, unless one doesn't exist at all, in which case it's the smallest unsigned integer type that can hold 0..65535. It's never a distinct type like char16_t is for C++11/__cpp_unicode_literals.
>
> What isn't guaranteed is that the only suitable type is `unsigned short`; it could legally be `unsigned int` (if that were also 16-bit).
>
> I don't quite agree that ignoring this is preferable because C "is much less strict about types anyway":
>
> - The __CHAR16_TYPE__ path applies to C++98 and C++03, which lack __cpp_unicode_literals but might have a C-ish char16_t. So its behavior shouldn't completely ignore C++.

It's not clear to me that we want to change the default in this case (and even if we do, it could be a C++-only change).

> - C has mostly the same strict-aliasing rules as C++, which would not permit a `const unsigned short *` pointing into an `unsigned int[]`, even if they are the same size. So when LPCTSTR = TEXT("...") becomes const WCHAR * = u"...", if WCHAR is `unsigned short` but __CHAR16_TYPE__ is e.g. `unsigned int`, there is a strict-aliasing violation (and undefined behavior) in C too.
>
> Now, in practice Wine probably doesn't support any platform with a 16-bit `int`, so __CHAR16_TYPE__ is just going to be `unsigned short` anyway. This way is pedantically more portable (to various architectures, as long as they use gcc or clang, which provide __CHAR16_TYPE__), but in practice they will preprocess to the same thing, which makes both the arguments for and against moot.
>
> So I'll drop it (after this last attempt to justify it) if that's what you want; I care about char16_t mostly in C++11, but was just trying to fix the C path to be portable too on general principle. Not a hill I need to die on :-)
I can see your point, but by changing the C path you put compatibility of Wine itself into consideration. WCHAR is an important core type for Wine, and we have a simple declaration that has worked for years. If we were writing it from scratch, the considerations could be different. I think we agree the existing declaration works equally well in practice (esp. once WINE_UNICODE_CHAR16 handling is fixed in C), so I would prefer to avoid the additional complication. If we had a 16-bit int, we would have much bigger problems than the WCHAR declaration.
Thanks,
Jacek