Re: [PATCH v3 1/2] kernelbase/locale: Implement comparison on top of official unicode weight tables

4 Mar 2020

Fabian Maurer <dark.shadow4@web.de> writes:
...
Hello Alexandre,
...
Multi-language support, Japanese, Korean, multi-char sequences,
surrogates, linguistic mappings, etc.
There are a million things that need to be supported for proper
sorting. You don't have to implement them all, but it should be clear
from your approach that they can be added. Which in practice means you
need to at least prototype most of them.
Well, they can be added, it's just that I left them out for the initial
versions...
Short breakdown:

Multi-language: The character is looked up in the current language; as a
fallback the default is used. Currently, only the default is implemented.
I don't see any language support, there's just one big sortkey
table. Yes, that's what the current code is doing too, but if we are
rewriting it, we should get the architecture right.
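
To make the architectural point concrete, here is a minimal sketch of a lookup that consults a per-language exception table first and falls back to the default weights. Everything in it is an assumption made for illustration: the struct layout, the names (lang_table, lookup_weight, toy_lang) and the weight values are hypothetical and do not reflect the Microsoft tables or the patch under review.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical four-part weight; not the layout of any real table. */
struct weight { uint8_t script, primary, diacritic, casew; };

/* One per-language override: this character gets this weight instead. */
struct exception { uint16_t ch; struct weight w; };

struct lang_table {
    const char *name;             /* made-up locale name */
    const struct exception *exc;  /* overrides, sorted by ch */
    size_t count;
};

/* Toy default table: only ASCII letters get a real weight. */
struct weight default_weight(uint16_t ch)
{
    struct weight w = { 0, 0, 2, 2 };
    if (ch >= 'a' && ch <= 'z') {
        w.script = 14;
        w.primary = (uint8_t)(2 + (ch - 'a'));
    } else if (ch >= 'A' && ch <= 'Z') {
        w.script = 14;
        w.primary = (uint8_t)(2 + (ch - 'A'));
        w.casew = 18;
    }
    return w;
}

/* Binary search in the language's exception list; NULL if absent. */
const struct weight *find_exception(const struct lang_table *lang, uint16_t ch)
{
    size_t lo = 0, hi = lang->count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (lang->exc[mid].ch == ch) return &lang->exc[mid].w;
        if (lang->exc[mid].ch < ch) lo = mid + 1;
        else hi = mid;
    }
    return NULL;
}

/* Language-specific weight if there is one, default weight otherwise. */
struct weight lookup_weight(const struct lang_table *lang, uint16_t ch)
{
    if (lang) {
        const struct weight *w = find_exception(lang, ch);
        if (w) return *w;
    }
    return default_weight(ch);
}

/* Made-up language in which 'y' sorts together with 'u'. */
const struct exception toy_exceptions[] = { { 'y', { 14, 22, 2, 2 } } };
const struct lang_table toy_lang = { "x-toy", toy_exceptions, 1 };
```

With this shape, lookup_weight(&toy_lang, 'y') returns the overridden weight, while lookup_weight(NULL, 'y') and lookup_weight(&toy_lang, 'a') both fall through to the default table.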
...

Multi-char sequences: You mean when a single codepoint is encoded as more
than one WCHAR? That is supported; Windows seems to treat each WCHAR separately.
I mean when multiple chars map to one sortkey. The COMPRESSION sections
in the Microsoft table.
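
As an illustration of several characters mapping to a single sort weight, the sketch below does a longest-match lookup against a small contraction table before falling back to per-character weights. The table, the weight values and the function names (next_primary, compare_primary) are hypothetical; this is not the format of the COMPRESSION sections, only one way such a lookup might be prototyped.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical contraction entry: a short sequence with a single weight. */
struct contraction {
    uint16_t seq[3];   /* up to 3 code units, zero-padded */
    size_t len;        /* number of code units actually used */
    uint32_t primary;  /* weight for the whole sequence */
};

/* Toy weights: a single code unit weighs ch * 16, which leaves gaps.
 * Here "ch" is placed between 'c' and 'd'. */
const struct contraction toy_contractions[] = {
    { { 'c', 'h', 0 }, 2, 'c' * 16 + 8 },
};

/* Store the weight of the next element of s (len must be >= 1) and
 * return how many code units it consumed. The longest match wins. */
size_t next_primary(const struct contraction *tab, size_t ntab,
                    const uint16_t *s, size_t len, uint32_t *primary)
{
    size_t best = 0;
    for (size_t i = 0; i < ntab; i++) {
        size_t n = tab[i].len, j;
        if (n > len || n <= best) continue;
        for (j = 0; j < n && tab[i].seq[j] == s[j]; j++)
            ;
        if (j == n) {
            best = n;
            *primary = tab[i].primary;
        }
    }
    if (best) return best;
    *primary = (uint32_t)s[0] * 16;
    return 1;
}

/* Compare two strings on primary weights only; returns -1, 0 or 1. */
int compare_primary(const struct contraction *tab, size_t ntab,
                    const uint16_t *a, size_t alen,
                    const uint16_t *b, size_t blen)
{
    while (alen && blen) {
        uint32_t wa, wb;
        size_t na = next_primary(tab, ntab, a, alen, &wa);
        size_t nb = next_primary(tab, ntab, b, blen, &wb);
        if (wa != wb) return wa < wb ? -1 : 1;
        a += na; alen -= na;
        b += nb; blen -= nb;
    }
    if (alen) return 1;
    if (blen) return -1;
    return 0;
}
```

Without the table, "ch" compares below "cz" because 'h' precedes 'z'. With toy_contractions, "ch" is a single element that sorts after every plain 'c' and before 'd'.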
...

Linguistic mappings: Not sure what you mean, sorry

NORM_LINGUISTIC_CASING and the like.
...
Question: How should I prove it works? I can't possibly add all of that in the
first draft.
The usual way is to add a bunch of tests with todo_wine, and then send a
patch series with each patch removing the corresponding todos.
...
...
We only have tests for a very small number of strings, that's clearly
not proper coverage. Some way of systematically generating test strings
should be considered.
Like, random strings from a known seed? I intentionally didn't do that,
because of performance concerns.
Not necessarily random, but some interesting data. For instance the
normalization tests can run the entire test suite from unicode.org, you
may be able to find something similar. Or build your own somehow.
...
...
Also testing sort keys directly, like you did in
the first try (but without depending on the exact values).
I have that planned, yes. Do you want that in the first version already?
The tests should come before the code, or at the same time.
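
One possible reading of "testing sort keys without depending on the exact values" is a consistency check: for every pair of test strings, the byte order of the two sort keys should agree in sign with the result of the comparison function. The sketch below is self-contained and uses toy stand-ins (toy_compare, toy_sortkey, raw_sortkey, all hypothetical). In an actual Wine test the two callbacks would presumably wrap CompareStringW and LCMapStringW with LCMAP_SORTKEY instead.

```c
#include <stddef.h>
#include <string.h>

typedef int (*compare_fn)(const char *a, const char *b);
typedef size_t (*sortkey_fn)(const char *s, unsigned char *key, size_t size);

static int sign(int v) { return (v > 0) - (v < 0); }

/* Count the ordered pairs for which sort-key order and comparison
 * result disagree in sign. Key bytes are never checked literally. */
size_t check_key_order(compare_fn cmp, sortkey_fn mkkey,
                       const char *const *strs, size_t n)
{
    unsigned char ka[64], kb[64];
    size_t failures = 0;
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            size_t la = mkkey(strs[i], ka, sizeof ka);
            size_t lb = mkkey(strs[j], kb, sizeof kb);
            int k = memcmp(ka, kb, la < lb ? la : lb);
            if (!k) k = (la > lb) - (la < lb);  /* shorter key sorts first */
            if (sign(k) != sign(cmp(strs[i], strs[j]))) failures++;
        }
    }
    return failures;
}

/* Toy stand-ins: ASCII case-insensitive ordering. */
static int fold(int c) { return (c >= 'a' && c <= 'z') ? c - 'a' + 'A' : c; }

int toy_compare(const char *a, const char *b)
{
    while (*a && fold((unsigned char)*a) == fold((unsigned char)*b)) { a++; b++; }
    return fold((unsigned char)*a) - fold((unsigned char)*b);
}

/* Key that matches toy_compare: the case-folded bytes. */
size_t toy_sortkey(const char *s, unsigned char *key, size_t size)
{
    size_t n = 0;
    while (s[n] && n < size) {
        key[n] = (unsigned char)fold((unsigned char)s[n]);
        n++;
    }
    return n;
}

/* Deliberately inconsistent key: raw bytes, so case leaks into the order. */
size_t raw_sortkey(const char *s, unsigned char *key, size_t size)
{
    size_t n = 0;
    while (s[n] && n < size) {
        key[n] = (unsigned char)s[n];
        n++;
    }
    return n;
}
```

Run over the strings "abc", "ABD", "b" and "", the check reports no mismatches for toy_sortkey and a nonzero count for raw_sortkey, without any literal key value appearing in the test.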
...
...
Note that we most likely want to use a Windows-compatible NLS file, like
we are now using for codepage or normalization tables. I can work on
that part.
I have to admit, I don't know what you mean by that. I don't know about NLS
files.
This is new stuff. Look at the nls directory, and at the make_unicode
script.
-- 
Alexandre Julliard
julliard@winehq.org
