Behdad Esfahbod wrote:
I wonder if WINE already has Unicode bidi tables too; if it has, no problem using it.
Wrong, I'm afraid. The WINE BiDi table only has the information as defined in Windows' GetStringType function. Quoting:

Name                  Value   Meaning

*Strong*
C2_LEFTTORIGHT        0x0001  Left to right
C2_RIGHTTOLEFT        0x0002  Right to left

*Weak*
C2_EUROPENUMBER       0x0003  European number, European digit
C2_EUROPESEPARATOR    0x0004  European numeric separator
C2_EUROPETERMINATOR   0x0005  European numeric terminator
C2_ARABICNUMBER       0x0006  Arabic number
C2_COMMONSEPARATOR    0x0007  Common numeric separator

*Neutral*
C2_BLOCKSEPARATOR     0x0008  Block separator
C2_SEGMENTSEPARATOR   0x0009  Segment separator
C2_WHITESPACE         0x000A  White space
C2_OTHERNEUTRAL       0x000B  Other neutrals

*Not applicable*
C2_NOTAPPLICABLE      0x0000  No implicit directionality (for example, control codes)
If that's enough - great. I was under the impression this is not enough for a Unicode 3.0 implementation.
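To make the concern concrete, here is a rough Python sketch that folds the Unicode bidi classes into the C2_* values listed above. The mapping table BIDI_TO_C2 and the helper c2_class are my own guesses for illustration; they are not taken from WINE or from Windows. Under this assumed mapping, R and AL collapse into one value, and the explicit embedding/override codes, NSM and BN all have nowhere to go except C2_NOTAPPLICABLE.

```python
import unicodedata

# Hypothetical many-to-one mapping from Unicode bidi classes to the
# C2_* names quoted above. This is an assumption for illustration,
# not the table WINE or Windows actually uses.
BIDI_TO_C2 = {
    "L": "C2_LEFTTORIGHT",
    "R": "C2_RIGHTTOLEFT",
    "AL": "C2_RIGHTTOLEFT",   # Arabic letters lose their separate class
    "EN": "C2_EUROPENUMBER",
    "ES": "C2_EUROPESEPARATOR",
    "ET": "C2_EUROPETERMINATOR",
    "AN": "C2_ARABICNUMBER",
    "CS": "C2_COMMONSEPARATOR",
    "B": "C2_BLOCKSEPARATOR",
    "S": "C2_SEGMENTSEPARATOR",
    "WS": "C2_WHITESPACE",
    "ON": "C2_OTHERNEUTRAL",
}


def c2_class(ch):
    """Return the assumed C2_* name for one character.

    Classes with no obvious C2_* counterpart (LRE, RLE, LRO, RLO, PDF,
    NSM, BN, ...) fall back to C2_NOTAPPLICABLE.
    """
    return BIDI_TO_C2.get(unicodedata.bidirectional(ch), "C2_NOTAPPLICABLE")


if __name__ == "__main__":
    for ch in ["a", "\u05d0", "\u0627", "1", "\u202e"]:
        print("U+%04X" % ord(ch), unicodedata.bidirectional(ch), c2_class(ch))
```

If the mapping is anywhere near right, a bidi implementation fed only C2_* values cannot tell Hebrew from Arabic letters, and never sees the explicit override codes as such.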
As for compiling Fribidi with UTF-16 - as I understood from what you said before, such a thing may cause reordering problems if Fribidi decides, for whatever reason, that a surrogate character needs to be right to left. I am not 100% familiar with the bidi algorithm yet, but won't marking all surrogate forms (both halves) as strong left to right solve this problem?
The problem you are talking about will show itself when someone uses RLO..PDF pairs (override embeddings) over some surrogate pairs, but this is not the main problem: I can hack fribidi to take care of surrogates and reorder them back if needed (really easy). The real problem is that when using UTF-16, fribidi will assume all surrogate characters are LTR (strong left to right), but there *are* non-LTR characters there already (like language tags), and this may lead to different renderings. If you want conformance, UTF-32 is needed ;-(.
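The language tag case can be illustrated with a small Python sketch. The helper utf16_units is a name made up for this example; the bidi class comes from Python's own unicodedata module, not from fribidi. It splits U+E0001 LANGUAGE TAG into its UTF-16 code units and compares what a per-code-unit lookup would see with the class of the real code point.

```python
import unicodedata


def utf16_units(s):
    """Return the UTF-16 code units of s as a list of integers.

    Hypothetical helper for this example only.
    """
    data = s.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


tag = "\U000E0001"  # LANGUAGE TAG, a supplementary-plane character

units = utf16_units(tag)

# Every code unit falls in the surrogate range, so a table indexed by
# 16-bit units can only report "surrogate" here, never the real class.
all_surrogates = all(0xD800 <= u <= 0xDFFF for u in units)

# The class of the actual code point, as a UTF-32 lookup would see it.
real_class = unicodedata.bidirectional(tag)

if __name__ == "__main__":
    print([hex(u) for u in units], all_surrogates, real_class)
```

With a blanket "surrogates are strong LTR" rule, this character would be treated as L, while its real class is not L, which is where the renderings can diverge.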
Hmm. Ok, I'll make a mental note to look at it later.
If not, we can always put in special handling that reorders surrogates as a pair.
That sounds like the ultimate solution? If we already implement re-reordering for wrongly ordered surrogates, we don't have to mark them all as strong LTR. Right?
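For what it is worth, the re-reordering pass could look roughly like the following Python sketch. It is not fribidi code; restore_surrogate_pairs and utf16_units are hypothetical names, and the sketch assumes the text was well-formed UTF-16 before reordering and that both halves of a pair always end up adjacent (in order or swapped) afterwards.

```python
def utf16_units(s):
    """Return the UTF-16 code units of s as a list of integers."""
    data = s.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def is_high(u):
    return 0xD800 <= u <= 0xDBFF


def is_low(u):
    return 0xDC00 <= u <= 0xDFFF


def restore_surrogate_pairs(units):
    """Put surrogate pairs back into high, low order after reordering.

    Walks the visual-order code units pair by pair: an intact pair is
    skipped as a whole, a swapped pair (low, high) is swapped back.
    Consuming two units at a time keeps adjacent pairs such as
    H1 L1 H2 L2 from being mistaken for a swapped L1 H2.
    """
    out = list(units)
    i = 0
    while i + 1 < len(out):
        a, b = out[i], out[i + 1]
        if is_high(a) and is_low(b):
            i += 2                      # already in order
        elif is_low(a) and is_high(b):
            out[i], out[i + 1] = b, a   # reversed by reordering, swap back
            i += 2
        else:
            i += 1
    return out


if __name__ == "__main__":
    logical = utf16_units("a\U00010000\U000E0001b")
    visual = list(reversed(logical))    # stand-in for an RLO..PDF run
    print([hex(u) for u in restore_surrogate_pairs(visual)])
```

This only repairs the order of the halves. It does nothing about the pair having been given the wrong bidi class in the first place, which is the conformance issue raised above.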
Shachar