Re: Unicode normalization for Wine

26 Jul 2017

Hello,
On 7/25/17 4:33 PM, Artur Świgoń wrote:
...
Dear All,
My name is Artur and I'm participating in Google Summer of Code 2017 for Wine.
Under Nikolay's supervision, I'm working on implementation of Unicode
normalization. I probably should have introduced myself some time ago to share
results of my research and my ideas, but I also wanted to wait until I could
illustrate my points with some code.
Very cool! I ran into this problem with Japanese Unicode string comparisons a while ago, so it is great that it will be addressed! After that we will have to investigate the behavior of CompareStringW and its family.
...

Mappings for characters above 0xFFFF are encoded as UTF-16 (using surrogate
 pairs), but a single codepoint (UTF-32 if you like) is used for table
 indexing. Setting $utflim in make_unicode to 65536 is the simplest way to
 disable support for such characters, but supporting surrogate pairs should
 not affect any text-related Wine component in a negative way.
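
To make the quoted description concrete, here is a minimal sketch of how one code point above 0xFFFF becomes a UTF-16 surrogate pair. The helper name encode_utf16() is hypothetical and is not taken from make_unicode or from any Wine source; it only illustrates the standard UTF-16 arithmetic.

```c
#include <stdint.h>

/* Hypothetical helper (not Wine code): encode one code point as UTF-16.
 * Returns the number of 16-bit units written (1 or 2), or 0 if cp is
 * not a valid Unicode scalar value. */
static unsigned int encode_utf16(uint32_t cp, uint16_t out[2])
{
    /* Reject values past the Unicode range and lone surrogate values. */
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return 0;

    /* BMP characters fit in a single unit. */
    if (cp < 0x10000)
    {
        out[0] = (uint16_t)cp;
        return 1;
    }

    /* Split the remaining 20 bits into a high and a low surrogate. */
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 + (cp >> 10));
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));
    return 2;
}
```

With this sketch, U+1D15E would be stored as the pair 0xD834 0xDD5E, while a table lookup would still use the single value 0x1D15E as its index.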

There is some very basic support for non-BMP Unicode glyphs and surrogate pairs in Uniscribe (usp10). I wrote a quick decode_surrogate_pair() function that returns the DWORD Unicode value for a surrogate pair, so you can look at that if you are interested!
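
For reference, the decoding step looks roughly like the sketch below. This is not the actual usp10 code: the name decode_utf16(), its signature, and the choice to pass unpaired surrogates through unchanged are all assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical helpers (not the usp10 implementation). */
static int is_high_surrogate(uint16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
static int is_low_surrogate(uint16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

/* Read one code point from a UTF-16 buffer of len units.
 * Stores the code point in *cp and returns the number of units
 * consumed (1 or 2), or 0 if the buffer is empty.
 * Assumption: an unpaired surrogate is returned as-is. */
static unsigned int decode_utf16(const uint16_t *str, unsigned int len, uint32_t *cp)
{
    if (len == 0) return 0;

    if (len >= 2 && is_high_surrogate(str[0]) && is_low_surrogate(str[1]))
    {
        /* Recombine the two 10-bit halves and add back the 0x10000 offset. */
        *cp = 0x10000 + (((uint32_t)(str[0] - 0xD800) << 10) |
                         (uint32_t)(str[1] - 0xDC00));
        return 2;
    }

    *cp = str[0];
    return 1;
}
```

A caller would walk a string by advancing by the returned count each time, so a table indexed by code point never sees half of a pair.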
Thanks!
-aric
