On Aug 20, 2013, at 10:49 PM, Charles Davis wrote:
> In the Windows world, "Unicode" almost universally means "UTF-16". So, use the well-known UTF-16 type instead of making up our own.
> I have to wonder if there was a good reason Ken didn't use this initially.
Please hold this patch while I review it.
I think there is a good reason why I didn't use it, but I have to figure it out again. :-/
It has to do with a promised pasteboard type (what Microsoft calls delayed rendering) and the conversions to the other text types. I think we need to use a custom type to distinguish between being asked for promised data vs. being asked for a conversion.
-Ken
On Aug 21, 2013, at 10:12 AM, Ken Thomases wrote:
> On Aug 20, 2013, at 10:49 PM, Charles Davis wrote:
> > In the Windows world, "Unicode" almost universally means "UTF-16". So, use the well-known UTF-16 type instead of making up our own.
> > I have to wonder if there was a good reason Ken didn't use this initially.
> Please hold this patch while I review it.
> I think there is a good reason why I didn't use it, but I have to figure it out again. :-/
> It has to do with a promised pasteboard type (what Microsoft calls delayed rendering) and the conversions to the other text types. I think we need to use a custom type to distinguish between being asked for promised data vs. being asked for a conversion.
Well, that wasn't the reason. However, I've found a simpler reason not to commit this. The data supplied for CF_UNICODETEXT is null-terminated, but public.utf16-plain-text shouldn't be. (Neither CFString nor NSString treats a 0x0000 code unit specially. It just ends up part of the string.)
Was there a specific compatibility problem you were trying to solve? Or just reviewing the code and found this strange and wanted to improve it?
If you like, it may make sense to add conversions to public.utf16-plain-text like those done for public.utf8-plain-text, but the import/export functions will have to add/remove the terminating null.
-Ken
On Aug 21, 2013, at 10:11 AM, Ken Thomases wrote:
> On Aug 21, 2013, at 10:12 AM, Ken Thomases wrote:
> > On Aug 20, 2013, at 10:49 PM, Charles Davis wrote:
> > > In the Windows world, "Unicode" almost universally means "UTF-16". So, use the well-known UTF-16 type instead of making up our own.
> > > I have to wonder if there was a good reason Ken didn't use this initially.
> > Please hold this patch while I review it.
> > I think there is a good reason why I didn't use it, but I have to figure it out again. :-/
> > It has to do with a promised pasteboard type (what Microsoft calls delayed rendering) and the conversions to the other text types. I think we need to use a custom type to distinguish between being asked for promised data vs. being asked for a conversion.
> Well, that wasn't the reason. However, I've found a simpler reason not to commit this. The data supplied for CF_UNICODETEXT is null-terminated, but public.utf16-plain-text shouldn't be. (Neither CFString nor NSString treats a 0x0000 code unit specially. It just ends up part of the string.)
Wow, I didn't know that. I guess that makes sense, because NSCFStrings are counted.
> Was there a specific compatibility problem you were trying to solve? Or just reviewing the code and found this strange and wanted to improve it?
I was looking over the patches as you sent them, found this strange, and thought this would improve it.
> If you like, it may make sense to add conversions to public.utf16-plain-text like those done for public.utf8-plain-text, but the import/export functions will have to add/remove the terminating null.
Sounds good. But it might take me a little while to rework this. I have a bunch of other things going on at the moment. I'll probably miss today's commit wave, but not tomorrow's.
Chip
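
For reference, the add/remove-the-terminating-null step discussed in this thread could look roughly like the sketch below. It is only an illustration in plain C: the helper names `utf16_export_length` and `utf16_import_with_null` are hypothetical and are not the functions in the actual patch. It assumes the Windows-side CF_UNICODETEXT buffer is an array of 16-bit code units ending in a 0x0000 unit, and that the Mac-side data is a counted run of code units with no terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical export helper (Windows -> Mac pasteboard).
 * Returns the number of UTF-16 code units before the first 0x0000 unit,
 * never reading past size_bytes. The caller would hand only this many
 * units to the pasteboard, so the terminator is not part of the string. */
static size_t utf16_export_length(const uint16_t *data, size_t size_bytes)
{
    size_t max_units = size_bytes / sizeof(uint16_t);
    size_t len = 0;

    assert(data != NULL || max_units == 0);
    while (len < max_units && data[len] != 0)
        len++;
    return len;
}

/* Hypothetical import helper (Mac pasteboard -> Windows).
 * Copies `units` counted code units into a new buffer and appends the
 * 0x0000 terminator that CF_UNICODETEXT consumers expect.
 * Returns NULL on allocation failure; the caller frees the result. */
static uint16_t *utf16_import_with_null(const uint16_t *data, size_t units,
                                        size_t *out_bytes)
{
    uint16_t *ret;

    assert(data != NULL || units == 0);
    ret = malloc((units + 1) * sizeof(uint16_t));
    if (!ret)
        return NULL;
    if (units)
        memcpy(ret, data, units * sizeof(uint16_t));
    ret[units] = 0;
    if (out_bytes)
        *out_bytes = (units + 1) * sizeof(uint16_t);
    return ret;
}
```

On the export side, the returned length (rather than the full buffer size) is what would be used when building the counted data for public.utf16-plain-text; on the import side, the extra code unit restores the terminator before the data is handed back as CF_UNICODETEXT.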