Hi Maarten,
Can you please explain the advantage of creating our own implementation of the BiDi algorithm over using an existing one? I know ICU sucks (especially as far as linkage is concerned), but there are other implementations, chief among them fribidi, which are free, C based, and have a compatible license. Is there any need for Wine to be aware of the inner workings of the algorithm?
Also, as long as you are picking up the BiDi glove I dropped oh so many years ago, it seems to me that the proper place to implement BiDi would be in Uniscribe, where it lives on Windows. The GDI implementation Wine has is a hack that reached the end of its usefulness the moment you realize that DrawText needs its own implementation, independent of ExtTextOut (mostly because of the line breaking code).
Thanks, Shachar
Shachar Shemesh shachar@shemesh.biz writes:
Hi Maarten,
Can you please explain the advantage of creating our own implementation of the BiDi algorithm over using an existing one? I know ICU sucks (especially as far as linkage is concerned), but there are other implementations, chief among them fribidi, which are free, C based, and have a compatible license. Is there any need for Wine to be aware of the inner workings of the algorithm?
The algorithm is pretty simple, and since we need to have the character tables anyway there's no reason to add an external dependency for this.
Also, as long as you are picking up the BiDi glove I dropped oh so many years ago, it seems to me that the proper place to implement BiDi would be in Uniscribe, where it lives on Windows. The GDI implementation Wine has is a hack that reached the end of its usefulness the moment you realize that DrawText needs its own implementation, independent of ExtTextOut (mostly because of the line breaking code).
Actually the proper place would be libwine along with the rest of the Unicode support.
Alexandre Julliard wrote:
Actually the proper place would be libwine along with the rest of the Unicode support.
I've spent the past hour downloading 12% of the git repository, so I'm unable to look at current Wine code for at least the next 24 hours :-(.
From memory, libwine contains mostly tables and stuff, not actual algorithms. Wouldn't it be better to place the tables in libwine, but the algorithm in Uniscribe, if only to follow Windows' design of things?
Also, I suspect it might be necessary, at some point in the extremely distant future, to deviate somewhat from the Unicode algorithm. That's pretty far away, as we're talking about nuances that are hard to pick up on if you don't know your stuff, but there are places where Windows can be said to be either implementing an old version of the standard, or implementing its own idea altogether.
My original plan was to import the fribidi code into a subdirectory of the wine tree, and make the necessary changes there.
On a related note - I haven't been able to get an answer to that one, not even through experimentation. Does anyone know whether Windows' Unicode is UTF-16 or UCS-2? Whether it's necessary to handle aggregates is crucially important when reordering characters.
Shachar
Shachar Shemesh shachar@shemesh.biz writes:
I've spent the past hour downloading 12% of the git repository, so I'm unable to look at current Wine code for at least the next 24 hours :-(.
You can always browse the code at http://source.winehq.org/git/wine.git
From memory, libwine contains mostly tables and stuff, not actual algorithms. Wouldn't it be better to place the tables in libwine, but the algorithm in Uniscribe, if only to follow Windows' design of things?
libwine certainly contains algorithms, but Uniscribe is a possibility too, if it provides everything we need.
Also, I suspect it might be necessary, at some point in the extremely distant future, to deviate somewhat from the Unicode algorithm. That's pretty far away, as we're talking about nuances that are hard to pick up on if you don't know your stuff, but there are places where Windows can be said to be either implementing an old version of the standard, or implementing its own idea altogether.
That's one more reason for having our own instead of using an external library.
My original plan was to import the fribidi code into a subdirectory of the wine tree, and make the necessary changes there.
I think Maarten's work shows that this is not necessary.
On a related note - I haven't been able to get an answer to that one, not even through experimentation. Does anyone know whether Windows' Unicode is UTF-16 or UCS-2? Whether it's necessary to handle aggregates is crucially important when reordering characters.
Recent Windows versions do support surrogates.
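For reference, with surrogates in play a character above U+FFFF takes two 16-bit units, so anything that looks characters up in a table has to combine the pair first. A minimal sketch, assuming a hypothetical helper name (next_code_point is not an existing Wine function):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Wine: read one code point from UTF-16
 * text starting at *pos and advance *pos past it. An unpaired surrogate
 * is returned unchanged as a single unit. */
static unsigned int next_code_point(const unsigned short *text, size_t len, size_t *pos)
{
    unsigned int lead, trail;

    assert(text != NULL && pos != NULL && *pos < len);
    lead = text[(*pos)++];
    if (lead >= 0xD800 && lead <= 0xDBFF && *pos < len)
    {
        trail = text[*pos];
        if (trail >= 0xDC00 && trail <= 0xDFFF)
        {
            (*pos)++;  /* consume the trailing unit as well */
            return 0x10000 + ((lead - 0xD800) << 10) + (trail - 0xDC00);
        }
    }
    return lead;
}
```

The returned value is what a classification table would need to be indexed by, rather than the individual 16-bit units.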
Shachar Shemesh wrote:
Alexandre Julliard wrote:
Actually the proper place would be libwine along with the rest of the Unicode support.
I've spent the past hour downloading 12% of the git repository, so I'm unable to look at current Wine code for at least the next 24 hours :-(.
From memory, libwine contains mostly tables and stuff, not actual algorithms. Wouldn't it be better to place the tables in libwine, but the algorithm in Uniscribe, if only to follow Windows' design of things?
Also, I suspect it might be necessary, at some point in the extremely distant future, to deviate somewhat from the Unicode algorithm. That's pretty far away, as we're talking about nuances that are hard to pick up on if you don't know your stuff, but there are places where Windows can be said to be either implementing an old version of the standard, or implementing its own idea altogether.
My original plan was to import the fribidi code into a subdirectory of the wine tree, and make the necessary changes there.
Actually, dlls/gdi32/bidi.c now has a full implementation of the bidirectional algorithm. You could easily take it out of there and replace it with a memcpy or a reversed copy, depending on the forced direction, since the forced-direction flag overrides all other flags that determine direction. It shouldn't take much effort to implement whatever Uniscribe needs using bidi.c.
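Roughly what that fallback could look like, as a sketch only: the helper name and the force_rtl parameter are made up for illustration and are not taken from bidi.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Wine: copy 'count' 16-bit units from
 * src to dst in the forced direction. force_rtl is an assumed flag.
 * Note that the reversed copy works unit by unit and does not keep
 * surrogate pairs together. */
static void copy_forced_direction(unsigned short *dst, const unsigned short *src,
                                  size_t count, int force_rtl)
{
    size_t i;

    assert(dst != NULL && src != NULL);
    if (!force_rtl)
    {
        memcpy(dst, src, count * sizeof(*src));  /* forced LTR: plain copy */
        return;
    }
    for (i = 0; i < count; i++)                   /* forced RTL: reversed copy */
        dst[i] = src[count - 1 - i];
}
```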
On a related note - I haven't been able to get an answer to that one, not even through experimentation. Does anyone know whether Windows' Unicode is UTF-16 or UCS-2? Whether it's necessary to handle aggregates is crucially important when reordering characters.
Shachar
I'm guessing UTF-16, though I'm not 100% sure.
Maarten
Maarten Lankhorst wrote:
On a related note - I haven't been able to get an answer to that one, not even through experimentation. Does anyone know whether Windows' Unicode is UTF-16 or UCS-2? Whether it's necessary to handle aggregates is crucially important when reordering characters.
Shachar
I'm guessing UTF-16, though I'm not 100% sure.
Ok. Just so you know, this means the reordering code is buggy for UTF-16 aggregates. I suspect the classification code is too. From the recent changes to bidi.c:
    if (odd(levels[lastgood]))
        for (k = j - 1; k >= lastgood; --k)
            lpOrder[done + k] = done + j - 1 - k;
An aggregate in an odd level will have its two parts reversed, making it meaningless at best (at worst, the trailing part will pair up with the leading part of the previous character, creating a totally unrelated legal character). I suspect you have a similar problem in the classification part of the code.
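One way to avoid splitting an aggregate (surrogate pair) is to reverse by character rather than by 16-bit unit. The following is only a sketch under assumptions: the helper name is hypothetical, and it reorders the units themselves instead of filling an index array the way lpOrder does in bidi.c.

```c
#include <assert.h>
#include <stddef.h>

#define IS_HIGH_SURROGATE(c) ((c) >= 0xD800 && (c) <= 0xDBFF)
#define IS_LOW_SURROGATE(c)  ((c) >= 0xDC00 && (c) <= 0xDFFF)

/* Hypothetical helper, not part of Wine: write the 'count' units of src
 * into dst in reverse character order, keeping every well-formed
 * surrogate pair in its original lead, trail order. */
static void reverse_run_utf16(unsigned short *dst, const unsigned short *src, size_t count)
{
    size_t in = 0, out = count;

    assert(dst != NULL && src != NULL);
    while (in < count)
    {
        if (in + 1 < count && IS_HIGH_SURROGATE(src[in]) && IS_LOW_SURROGATE(src[in + 1]))
        {
            out -= 2;                   /* the pair moves as one block */
            dst[out] = src[in];
            dst[out + 1] = src[in + 1];
            in += 2;
        }
        else
        {
            dst[--out] = src[in++];     /* single unit, or unpaired surrogate */
        }
    }
}
```

With the input 05D0 D802 DC00 05D1 this yields 05D1 D802 DC00 05D0, whereas a unit-by-unit reversal would produce DC00 D802 in the middle.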
I've gone over the tables (a cursory glance), and haven't been able to find characters in the aggregate area that are naturally likely to receive an odd level. I also asked on the fribidi list several times, and got the reply that there are such letters, but no specifics. This has a lot to do with my inability to empirically test whether Windows handles these. My latest test was an attempt to run a string that has RLO through GetCharacterPlacement, but even that fails at the moment (not to mention that GetCharacterPlacement is an old interface under Windows, and is more or less deprecated, despite not being documented as such).
It is true that my having such a hard time finding out whether aggregates are supported on Windows means that any bug we introduce through lack of aggregate support will be a rare one. Still, I would prefer not to introduce bugs into Wine if they can be avoided.
Maarten
Shachar