Re: [PATCH 1/4] kernel32: Support UTF-7 in MultiByteToWideChar.

5 Dec 2012


On Tue, Dec 04, 2012 at 08:30:55PM -0700, Alex Henrie wrote:
> ...
> 2012/12/4 Frédéric Delanoy frederic.delanoy@gmail.com:
> > ...
> > The above MSDN comment indicates pre-Vista versions are buggy, so it's
> > probably not a good idea to match that behaviour.
> I think encoding and decoding in UTF-7 arbitrary binary data was
> considered a "feature" in Windows XP. As MSDN said, "Code written in
> earlier versions of Windows that rely on this behavior to encode
> random non-text binary data might run into problems." So I'm sure
> there's at least one application that depends on the data not being
> Unicode-normalized. Whoever adds normalization will have to make sure
> it's turned off in Windows XP (or older) mode.
Actually UTF-8 is a PITA - a program has to know whether every
individual C string (or file) is UTF-8 or 8-bit ASCII (well, ISO 8859-x).
Assuming UTF-8 doesn't work unless it can process all arbitrary
byte sequences (and write them back) - which the standard doesn't
allow for.
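
To illustrate the round-trip point: a strict UTF-8 decoder rejects arbitrary bytes, so a program that wants to read unknown data and write it back unchanged needs some escape hatch outside the standard. A minimal sketch in Python, using its `surrogateescape` error handler (a Python-specific mechanism, not part of UTF-8 itself; the sample bytes are made up for illustration):

```python
# Arbitrary bytes: plain ASCII plus a lone 0xA3 (the pound sign in
# ISO 8859-1), which is not a valid UTF-8 sequence on its own.
raw = b"cost: \xa3 5"

# A strict UTF-8 decode refuses the data outright.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape maps each undecodable byte to a lone surrogate
# (0xA3 -> U+DCA3), so the original bytes can be restored on encode.
text = raw.decode("utf-8", errors="surrogateescape")
round_tripped = text.encode("utf-8", errors="surrogateescape")
```

The decoded string contains a lone surrogate, so it is not well-formed Unicode text; the trick only preserves the bytes for writing back.
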
In the US it probably isn't often an issue, but in Europe there are
many files that have occasional characters with the top bit set.
In the UK we only see 0xA3 (pound sterling) - but it can crop up
anywhere - and causes my mail program (which, for some reason I
don't understand, assumes UTF-8) to drop core when responding to mails!
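
One common way to cope with such files is to try UTF-8 first and fall back to a single-byte encoding when decoding fails. The sketch below assumes the fallback is ISO 8859-1; `decode_lenient` is a hypothetical helper name, and the heuristic can misdetect legacy text whose bytes happen to form valid UTF-8:

```python
def decode_lenient(data: bytes) -> str:
    """Try UTF-8 first; fall back to ISO 8859-1, which accepts every byte."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return data.decode("iso-8859-1")

# The pound sign as a single ISO 8859-1 byte (0xA3) ...
legacy = b"Price: \xa35"
# ... and the same text in UTF-8, where U+00A3 becomes 0xC2 0xA3.
modern = "Price: \u00a35".encode("utf-8")

legacy_text = decode_lenient(legacy)
modern_text = decode_lenient(modern)
```

Both inputs decode to the same string, without the crash described above.
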
David
-- 
David Laight: david@l8s.co.uk
