Maarten Lankhorst wrote:
On a related note - I haven't been able to get an answer to that one, not even through experimentation. Does anyone know whether Windows' Unicode is UTF-16 or UCS-2? Whether it's necessary to handle aggregates is crucially important when reordering characters.
Shachar
I'm guessing UTF-16, though I'm not 100% sure.
Ok. Just so you know, this means the reordering code is buggy for UTF-16 aggregates. I suspect the classification code is too. From the recent changes to bidi.c:
    if (odd(levels[lastgood]))
        for (k = j - 1; k >= lastgood; --k)
            lpOrder[done + k] = done + j - 1 - k;
An aggregate (a UTF-16 surrogate pair) in an odd level will have its two parts reversed, making it meaningless at best. At worst, the trailing part will match the leading part of the previous character, creating a totally unrelated legal character. I suspect you have a similar problem in the classification part of the code.
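To make the problem concrete, here is a minimal sketch of a reversal that keeps the two halves of an aggregate in their original order. It is not the bidi.c code and not a proposed patch: the names utf16_unit, is_high_surrogate, is_low_surrogate and reverse_run_keep_pairs are hypothetical, and it assumes the text really is UTF-16, which is exactly the open question here. It only covers a single run that is to be reversed in full.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for WCHAR: one 16-bit UTF-16 code unit. */
typedef unsigned short utf16_unit;

static int is_high_surrogate(utf16_unit c) { return c >= 0xD800 && c <= 0xDBFF; }
static int is_low_surrogate(utf16_unit c)  { return c >= 0xDC00 && c <= 0xDFFF; }

/* Fill order[0..len-1] with the indices of text[0..len-1] in reverse,
 * but emit each high/low surrogate pair as (high, low) so the pair
 * still decodes to the same character after reordering.
 * Unpaired surrogates are treated like any other single code unit. */
void reverse_run_keep_pairs(const utf16_unit *text, size_t len, int *order)
{
    size_t i = len, out = 0;

    assert(text != NULL && order != NULL);

    while (i > 0)
    {
        if (i >= 2 && is_low_surrogate(text[i - 1]) && is_high_surrogate(text[i - 2]))
        {
            /* Walking backwards we hit the low half first; keep the pair intact. */
            order[out++] = (int)(i - 2);
            order[out++] = (int)(i - 1);
            i -= 2;
        }
        else
        {
            order[out++] = (int)(i - 1);
            i -= 1;
        }
    }
}
```

For the code units 0x05D0, 0xD802, 0xDC00, 0x05D1, a plain index reversal gives the order 3, 2, 1, 0, which puts the low surrogate before the high one. The sketch gives 3, 1, 2, 0 instead.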
I've gone over the tables (a cursory glance), and haven't been able to find characters in the aggregate area that would naturally receive an odd level. I also asked on the fribidi list several times, and got the reply that there are such letters, but no specifics. This has a lot to do with my inability to empirically test whether Windows handles these. My latest test was an attempt to run a string that has an RLO through GetCharacterPlacement, but even that fails at the moment (not to mention that GetCharacterPlacement is an old interface under Windows, and is effectively deprecated, despite not being documented as such).
It is true that if I'm having such a hard time finding out whether aggregates are supported on Windows, any bug we introduce through lack of support for aggregates will be a rare one. I would still prefer not to introduce bugs into Wine if they can be avoided.
Maarten
Shachar