[PATCH v5 0/1] MR10704: Display invalid Indic syllables
Use the U+25CC DOTTED CIRCLE as the base glyph for invalid Indic syllables

https://bugs.winehq.org/show_bug.cgi?id=27637

--
v5: Display invalid Indic syllables

https://gitlab.winehq.org/wine/wine/-/merge_requests/10704
From: Aric Stewart <aric@codeweavers.com>

Use the U+25CC DOTTED CIRCLE as the base glyph for invalid Indic syllables

Handle the resulting pwLogClust changes correctly.

https://bugs.winehq.org/show_bug.cgi?id=27637
---
 dlls/gdi32/uniscribe/indic.c          | 38 ++++++++------
 dlls/gdi32/uniscribe/shape.c          | 75 +++++++++++++++++++++++++++
 dlls/gdi32/uniscribe/usp10_internal.h |  1 +
 3 files changed, 98 insertions(+), 16 deletions(-)

diff --git a/dlls/gdi32/uniscribe/indic.c b/dlls/gdi32/uniscribe/indic.c
index 2d527ddbd1a..bae638ced63 100644
--- a/dlls/gdi32/uniscribe/indic.c
+++ b/dlls/gdi32/uniscribe/indic.c
@@ -326,6 +326,7 @@ void Indic_ParseSyllables(HDC hdc, SCRIPT_ANALYSIS *psa, ScriptCache *psc, const
     unsigned int center = 0;
     int index = 0;
     int next = 0;
+    BOOL valid;

     *syllable_count = 0;

@@ -344,24 +345,29 @@ void Indic_ParseSyllables(HDC hdc, SCRIPT_ANALYSIS *psa, ScriptCache *psc, const
         if (next >= cChar)
             break;
         next = Indic_process_next_syllable(input, cChar, 0, &center, index, lex);
-        if (next != -1)
-        {
-            *syllables = realloc(*syllables, sizeof(IndicSyllable)*(*syllable_count+1));
-            (*syllables)[*syllable_count].start = index;
-            (*syllables)[*syllable_count].base = center;
-            (*syllables)[*syllable_count].ralf = -1;
-            (*syllables)[*syllable_count].blwf = -1;
-            (*syllables)[*syllable_count].pref = -1;
-            (*syllables)[*syllable_count].end = next-1;
-            FindBaseConsonant(hdc, psa, psc, input, &(*syllables)[*syllable_count], lex, modern);
-            index = next;
-            *syllable_count = (*syllable_count)+1;
-        }
-        else if (index < cChar)
-        {
+        valid = (next != -1);
+        if (index < cChar && !valid) {
             TRACE("Processing failed at %i\n",index);
-            next = ++index;
+            center = index;
+            next = index + 1;
         }
+        *syllables = realloc(*syllables, sizeof(IndicSyllable)*(*syllable_count+1));
+        if (!*syllables) {
+            ERR("Allocation failure of syllables\n");
+            *syllable_count = 0;
+            return;
+        }
+        (*syllables)[*syllable_count].valid = valid;
+        (*syllables)[*syllable_count].start = index;
+        (*syllables)[*syllable_count].base = center;
+        (*syllables)[*syllable_count].ralf = -1;
+        (*syllables)[*syllable_count].blwf = -1;
+        (*syllables)[*syllable_count].pref = -1;
+        (*syllables)[*syllable_count].end = next-1;
+        if (valid)
+            FindBaseConsonant(hdc, psa, psc, input, &(*syllables)[*syllable_count], lex, modern);
+        index = next;
+        *syllable_count = (*syllable_count)+1;
     }
     TRACE("Processed %i of %i characters into %i syllables\n",index,cChar,*syllable_count);
 }
diff --git a/dlls/gdi32/uniscribe/shape.c b/dlls/gdi32/uniscribe/shape.c
index 9f67b99c11d..37254e19a8f 100644
--- a/dlls/gdi32/uniscribe/shape.c
+++ b/dlls/gdi32/uniscribe/shape.c
@@ -2209,6 +2209,58 @@ static void ShapeIndicSyllables(HDC hdc, ScriptCache *psc, SCRIPT_ANALYSIS *psa,
     }
 }

+static void mark_invalid_syllables(HDC hdc, const WCHAR* pwcChars, INT cChars, WORD *pwGlyphs, INT *pcGlyphs, INT cMaxGlyphs, WORD *pwLogClust, IndicSyllable *syllables, int syllable_count, lexical_function lexical)
+{
+    int i;
+    WCHAR invalid = 0x25cc;
+    WORD invalid_glyph;
+    int offset = 0;
+
+    if (!hdc || !pwcChars || !pwGlyphs || !pcGlyphs || !pwLogClust || !syllables || syllable_count <= 0) {
+        ERR("Invalid parameters in mark_invalid_syllables\n");
+        return;
+    }
+    if (cChars <= 0 || cMaxGlyphs <= 0) {
+        ERR("Invalid size parameters\n");
+        return;
+    }
+    if (NtGdiGetGlyphIndicesW(hdc, &invalid, 1, &invalid_glyph, 0) == GDI_ERROR || invalid_glyph == 0x0000) {
+        TRACE("Invalid glyph 0x25cc not found in font, using placeholder\n");
+        invalid_glyph = 0x0020; // Use space as fallback
+    }
+
+    for (i = 0; i < syllable_count; i++)
+        if (!syllables[i].valid) break;
+
+    if (i >= syllable_count) {
+        /* Everything valid */
+        return;
+    }
+
+    /* Mark invalid combinations */
+    for (i = 0; i < syllable_count; i++)
+    {
+        if (!syllables[i].valid) {
+            if (*pcGlyphs + 1 > cMaxGlyphs) {
+                ERR("Number of glyphs exceed buffer(%i, %i)\n", *pcGlyphs, cMaxGlyphs);
+                pwGlyphs[syllables[i].start] = invalid_glyph;
+            } else {
+                int dir = (lexical(pwcChars[syllables[i].start]) == lex_Matra_pre)?1:0;
+                int index = syllables[i].start+dir+offset;
+                if (pwGlyphs[index-1] != invalid_glyph) {
+                    for (int j = *pcGlyphs; j>=index; j--)
+                        pwGlyphs[j+1] = pwGlyphs[j];
+                    pwGlyphs[index] = invalid_glyph;
+                    *pcGlyphs = *pcGlyphs+1;
+                    for (int j = cChars; j>syllables[i].start; j--)
+                        pwLogClust[j] = pwLogClust[j] + 1;
+                    offset++;
+                }
+            }
+        }
+    }
+}
+
 static inline int unicode_lex(WCHAR c)
 {
     int type;
@@ -2322,6 +2374,9 @@ static void ContextualShape_Sinhala(HDC hdc, ScriptCache *psc, SCRIPT_ANALYSIS *
     /* Step 4: Base Form application to syllables */
     NtGdiGetGlyphIndicesW(hdc, input, cCount, pwOutGlyphs, 0);
     *pcGlyphs = cCount;
+
+    mark_invalid_syllables(hdc, input, cCount, pwOutGlyphs, pcGlyphs, cMaxGlyphs, pwLogClust, syllables, syllable_count, sinhala_lex);
+
     ShapeIndicSyllables(hdc, psc, psa, input, cChars, syllables, syllable_count, pwOutGlyphs, pcGlyphs, pwLogClust, sinhala_lex, NULL, TRUE);

     free(input);
@@ -2379,6 +2434,8 @@ static void ContextualShape_Devanagari(HDC hdc, ScriptCache *psc, SCRIPT_ANALYSI
     NtGdiGetGlyphIndicesW(hdc, input, cCount, pwOutGlyphs, 0);
     *pcGlyphs = cCount;

+    mark_invalid_syllables(hdc, input, cCount, pwOutGlyphs, pcGlyphs, cMaxGlyphs, pwLogClust, syllables, syllable_count, devanagari_lex);
+
     /* Step 3: Base Form application to syllables */
     ShapeIndicSyllables(hdc, psc, psa, input, cChars, syllables, syllable_count, pwOutGlyphs, pcGlyphs, pwLogClust, devanagari_lex, NULL, modern);

@@ -2436,6 +2493,8 @@ static void ContextualShape_Bengali(HDC hdc, ScriptCache *psc, SCRIPT_ANALYSIS *
     NtGdiGetGlyphIndicesW(hdc, input, cCount, pwOutGlyphs, 0);
     *pcGlyphs = cCount;

+    mark_invalid_syllables(hdc, input, cCount, pwOutGlyphs, pcGlyphs, cMaxGlyphs, pwLogClust, syllables, syllable_count, bengali_lex);
+
     /* Step 3: Initial form is only applied to the beginning of words */
     for (cCount = cCount - 1 ; cCount >= 0; cCount --)
     {
@@ -2499,6 +2558,8 @@ static void ContextualShape_Gurmukhi(HDC hdc, ScriptCache *psc, SCRIPT_ANALYSIS
     NtGdiGetGlyphIndicesW(hdc, input, cCount, pwOutGlyphs, 0);
     *pcGlyphs = cCount;

+    mark_invalid_syllables(hdc, input, cCount, pwOutGlyphs, pcGlyphs, cMaxGlyphs, pwLogClust, syllables, syllable_count, gurmukhi_lex);
+
     /* Step 3: Base Form application to syllables */
     ShapeIndicSyllables(hdc, psc, psa, input, cChars, syllables, syllable_count, pwOutGlyphs, pcGlyphs, pwLogClust, gurmukhi_lex, NULL, modern);

@@ -2539,6 +2600,8 @@ static void ContextualShape_Gujarati(HDC hdc, ScriptCache *psc, SCRIPT_ANALYSIS
     NtGdiGetGlyphIndicesW(hdc, input, cCount, pwOutGlyphs, 0);
     *pcGlyphs = cCount;

+    mark_invalid_syllables(hdc, input, cCount, pwOutGlyphs, pcGlyphs, cMaxGlyphs, pwLogClust, syllables, syllable_count, gujarati_lex);
+
     /* Step 2: Base Form application to syllables */
     ShapeIndicSyllables(hdc, psc, psa, input, cChars, syllables, syllable_count, pwOutGlyphs, pcGlyphs, pwLogClust, gujarati_lex, NULL, modern);

@@ -2595,6 +2658,8 @@ static void ContextualShape_Oriya(HDC hdc, ScriptCache *psc, SCRIPT_ANALYSIS *ps
     NtGdiGetGlyphIndicesW(hdc, input, cCount, pwOutGlyphs, 0);
     *pcGlyphs = cCount;

+    mark_invalid_syllables(hdc, input, cCount, pwOutGlyphs, pcGlyphs, cMaxGlyphs, pwLogClust, syllables, syllable_count, oriya_lex);
+
     /* Step 3: Base Form application to syllables */
     ShapeIndicSyllables(hdc, psc, psa, input, cChars, syllables, syllable_count, pwOutGlyphs, pcGlyphs, pwLogClust, oriya_lex, NULL, modern);

@@ -2645,6 +2710,8 @@ static void ContextualShape_Tamil(HDC hdc, ScriptCache *psc, SCRIPT_ANALYSIS *ps
     NtGdiGetGlyphIndicesW(hdc, input, cCount, pwOutGlyphs, 0);
     *pcGlyphs = cCount;

+    mark_invalid_syllables(hdc, input, cCount, pwOutGlyphs, pcGlyphs, cMaxGlyphs, pwLogClust, syllables, syllable_count, tamil_lex);
+
     /* Step 3: Base Form application to syllables */
     ShapeIndicSyllables(hdc, psc, psa, input, cChars, syllables, syllable_count, pwOutGlyphs, pcGlyphs, pwLogClust, tamil_lex, SecondReorder_Like_Tamil, modern);

@@ -2694,6 +2761,8 @@ static void ContextualShape_Telugu(HDC hdc, ScriptCache *psc, SCRIPT_ANALYSIS *p
     NtGdiGetGlyphIndicesW(hdc, input, cCount, pwOutGlyphs, 0);
     *pcGlyphs = cCount;

+    mark_invalid_syllables(hdc, input, cCount, pwOutGlyphs, pcGlyphs, cMaxGlyphs, pwLogClust, syllables, syllable_count, telugu_lex);
+
     /* Step 3: Base Form application to syllables */
     ShapeIndicSyllables(hdc, psc, psa, input, cChars, syllables, syllable_count, pwOutGlyphs, pcGlyphs, pwLogClust, telugu_lex, SecondReorder_Like_Telugu, modern);

@@ -2746,6 +2815,8 @@ static void ContextualShape_Kannada(HDC hdc, ScriptCache *psc, SCRIPT_ANALYSIS *
     NtGdiGetGlyphIndicesW(hdc, input, cCount, pwOutGlyphs, 0);
     *pcGlyphs = cCount;

+    mark_invalid_syllables(hdc, input, cCount, pwOutGlyphs, pcGlyphs, cMaxGlyphs, pwLogClust, syllables, syllable_count, kannada_lex);
+
     /* Step 3: Base Form application to syllables */
     ShapeIndicSyllables(hdc, psc, psa, input, cChars, syllables, syllable_count, pwOutGlyphs, pcGlyphs, pwLogClust, kannada_lex, SecondReorder_Like_Telugu, modern);

@@ -2791,6 +2862,8 @@ static void ContextualShape_Malayalam(HDC hdc, ScriptCache *psc, SCRIPT_ANALYSIS
     NtGdiGetGlyphIndicesW(hdc, input, cCount, pwOutGlyphs, 0);
     *pcGlyphs = cCount;

+    mark_invalid_syllables(hdc, input, cCount, pwOutGlyphs, pcGlyphs, cMaxGlyphs, pwLogClust, syllables, syllable_count, malayalam_lex);
+
     /* Step 3: Base Form application to syllables */
     ShapeIndicSyllables(hdc, psc, psa, input, cChars, syllables, syllable_count, pwOutGlyphs, pcGlyphs, pwLogClust, malayalam_lex, SecondReorder_Like_Tamil, modern);

@@ -2825,6 +2898,8 @@ static void ContextualShape_Khmer(HDC hdc, ScriptCache *psc, SCRIPT_ANALYSIS *ps
     NtGdiGetGlyphIndicesW(hdc, input, cCount, pwOutGlyphs, 0);
     *pcGlyphs = cCount;

+    mark_invalid_syllables(hdc, input, cCount, pwOutGlyphs, pcGlyphs, cMaxGlyphs, pwLogClust, syllables, syllable_count, khmer_lex);
+
     /* Step 2: Base Form application to syllables */
     ShapeIndicSyllables(hdc, psc, psa, input, cChars, syllables, syllable_count, pwOutGlyphs, pcGlyphs, pwLogClust, khmer_lex, NULL, FALSE);

diff --git a/dlls/gdi32/uniscribe/usp10_internal.h b/dlls/gdi32/uniscribe/usp10_internal.h
index b8ae1fb1a57..aa24df22308 100644
--- a/dlls/gdi32/uniscribe/usp10_internal.h
+++ b/dlls/gdi32/uniscribe/usp10_internal.h
@@ -218,6 +218,7 @@ typedef struct _scriptData
 } scriptData;

 typedef struct {
+    BOOL valid;
     INT start;
     INT base;
     INT ralf;
--
GitLab

https://gitlab.winehq.org/wine/wine/-/merge_requests/10704
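
For readers following the glyph and cluster bookkeeping in the patch, the following is a minimal standalone sketch of the general idea: insert one placeholder glyph and move the cluster indices of the later characters along by one. It is not the Wine code. `insert_placeholder` and all of its parameter names are hypothetical, plain `unsigned short` stands in for `WORD`, and the bounds used here are a choice made for this sketch rather than a copy of the loops in the patch.

```c
#include <assert.h>

/* Hypothetical helper, not part of Wine.
 *
 * Inserts `glyph` at position `pos` in `glyphs`, which currently holds
 * `*count` entries and has room for `max` entries. Every character after
 * `first_char` then starts one glyph later, so its entry in `log_clust`
 * (one entry per character, `char_count` in total) is incremented.
 *
 * Returns 1 on success, 0 if the buffer is full or `pos` is out of range. */
int insert_placeholder(unsigned short *glyphs, int *count, int max,
                       unsigned short *log_clust, int char_count,
                       int pos, int first_char, unsigned short glyph)
{
    int j;

    assert(glyphs && count && log_clust);

    if (*count + 1 > max || pos < 0 || pos > *count)
        return 0;

    /* Shift glyphs [pos, *count) one slot to the right, last one first. */
    for (j = *count - 1; j >= pos; j--)
        glyphs[j + 1] = glyphs[j];

    glyphs[pos] = glyph;
    (*count)++;

    /* Characters after the invalid one now map one glyph further along. */
    for (j = first_char + 1; j < char_count; j++)
        log_clust[j]++;

    return 1;
}
```

As a usage illustration under the same assumptions: with glyphs {10, 20, 30} and a cluster map {0, 1, 2}, inserting glyph 99 at position 1 for character 1 leaves glyphs {10, 99, 20, 30} and a cluster map {0, 1, 3}. The sketch keeps every index inside the first `*count` glyph slots and the first `char_count` cluster entries, and it refuses the insert rather than overwriting a glyph when the buffer is full.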
On Fri May 1 17:37:14 2026 +0000, समीरसिंह Sameer Singh wrote:
Hmm, it looks like harfbuzz also groups all Mm at the beginning and all Mp at the end, with the dotted circle in the middle.

```
hb-view /usr/share/fonts/TiroIndigo-otf/TiroBangla-Regular.otf "ৌৌৌ" --output-format=png --output-file=test.png
```

[image attachment]

This can also be viewed using `hb-shape`:

```
hb-shape /usr/share/fonts/TiroIndigo-otf/TiroBangla-Regular.otf "ৌৌৌ"
[bSignE.init=0+396|bSignE=0+405|bSignE=0+405|BASE=0+724|bAuMark=0+247|bAuMark=0+247|bAuMark.fina=0+247]
```

where bSignE = Mm and bAuMark = Mp. So this does not seem like a bug.

The bug I was talking about was that when U+09CC is preceded by a space, a dotted circle was inserted at the start of the glyph. Are you aware of this?

[image attachment]

~~Looking closely at the second line, I can see that the glyph is decomposed here, evident by the fact that Mm has a top line.~~ I may be wrong here.

> I would also be curious about the behavior of U+09cc in other situations. Is it being properly shaped in a string where it is being used correctly? I do not know Bengali so I am not sure how to find it.
>
> This string `মৌমাছি` appears to be `u+09ae u+09cc u+09ae u+09be u+099b u+09bf` and a quick visual inspection seems to show it being shaped correctly.

Yes, I can see that it is being shaped properly.

Thank you, that pointed out a flaw in the logic that I corrected. I believe that should be corrected now.
-- https://gitlab.winehq.org/wine/wine/-/merge_requests/10704#note_138559
On Fri May 1 19:01:36 2026 +0000, Aric Stewart wrote:
Thank you, that pointed out a flaw in the logic that I corrected. I believe that should be corrected now.

Yes, the error seems to be gone now. I will test this patch some more and will get back to you in a few days to report if I find something else missing.
-- https://gitlab.winehq.org/wine/wine/-/merge_requests/10704#note_138564
participants (3)
- Aric Stewart
- Aric Stewart (@aricstewart)
- समीरसिंह Sameer Singh (@ss141309)