Signed-off-by: Rémi Bernon rbernon@codeweavers.com ---
This brings another ~200ms prefix startup time improvement, from ~1.2s to ~1s for "wine cmd /c exit" execution time on average, as well as a ~50ms process startup time improvement, from ~0.25s to ~0.2s execution time when prefix is already started.
The test shows that using wcsicmp is incorrect for face name comparison, or at least that we should not rely on the current locale, and perf also reports a high number of CPU cache miss coming from the locale refcounting, which is the main source of improvement here.
IIUC RtlDowncaseUnicodeChar also does locale dependent case folding, but I'm not sure to see how it's controlled (it's the system default locale that defines the loaded tables right?), and we should probably check if case matching depends on it. If not, is there any canonical normalized unicode case folding that should be used instead?
dlls/gdi32/tests/font.c | 19 ++++++++++++++----- dlls/gdi32/tests/wine_langnames3.sfd | 4 ++-- dlls/gdi32/tests/wine_langnames3.ttf | Bin 2048 -> 2076 bytes 3 files changed, 16 insertions(+), 7 deletions(-)
diff --git a/dlls/gdi32/tests/font.c b/dlls/gdi32/tests/font.c index d3ec971dce0..3bcbebaf91a 100644 --- a/dlls/gdi32/tests/font.c +++ b/dlls/gdi32/tests/font.c @@ -7269,12 +7269,21 @@ static void test_lang_names(void) /* either because it's the primary language, or because it's a secondary */ ok( efnd.total == min( 2, i + 1 ), "%d: EnumFontFamiliesExA unexpected count %u.\n", i, efnd.total );
- strcpy( font.lfFaceName, "Wine Lang Cond (fr)" ); - memset( &efnd, 0, sizeof(efnd) ); - EnumFontFamiliesExA( dc, &font, enum_fullname_data_proc, (LPARAM)&efnd, 0 ); + wcscpy( font_w.lfFaceName, L"Wine Police d'écriture (fr)" ); + memset( &efnd_w, 0, sizeof(efnd_w) ); + EnumFontFamiliesExW( dc, &font_w, enum_fullname_data_proc_w, (LPARAM)&efnd_w, 0 ); /* as wine_langnames3.sfd does not specify (en) name, (fr) is preferred */ - if (i == 2) ok( efnd.total == 1, "%d: EnumFontFamiliesExA unexpected count %u.\n", i, efnd.total ); - else ok( efnd.total == 0, "%d: EnumFontFamiliesExA unexpected count %u.\n", i, efnd.total ); + if (i == 2) ok( efnd_w.total == 1, "%d: EnumFontFamiliesExW unexpected count %u.\n", i, efnd_w.total ); + else ok( efnd_w.total == 0, "%d: EnumFontFamiliesExW unexpected count %u.\n", i, efnd_w.total ); + + /* case matching should not depend on the current locale */ + if (i == 2) + { + wcscpy( font_w.lfFaceName, L"Wine POLICE D'ÉCRITURE (fr)" ); + memset( &efnd_w, 0, sizeof(efnd_w) ); + EnumFontFamiliesExW( dc, &font_w, enum_fullname_data_proc_w, (LPARAM)&efnd_w, 0 ); + todo_wine ok( efnd_w.total == 1, "%d: EnumFontFamiliesExW unexpected count %u.\n", i, efnd_w.total ); + }
strcpy( font.lfFaceName, "Wine Lang Cond (ko)" ); memset( &efnd, 0, sizeof(efnd) ); diff --git a/dlls/gdi32/tests/wine_langnames3.sfd b/dlls/gdi32/tests/wine_langnames3.sfd index ea0dbff57d9..55536fa58c1 100644 --- a/dlls/gdi32/tests/wine_langnames3.sfd +++ b/dlls/gdi32/tests/wine_langnames3.sfd @@ -39,8 +39,8 @@ MarkAttachClasses: 1 DEI: 91125 LangName: 1036 \ "" \ - "Wine Lang Cond (fr)" \ - "Reg (fr)" \ + "Wine Police d'+AOkA-criture (fr)" \ + "R+AOkA-guli+AOgA-re (fr)" \ "" \ "Wine Lang Cond Reg (fr)" LangName: 1042 \ diff --git a/dlls/gdi32/tests/wine_langnames3.ttf b/dlls/gdi32/tests/wine_langnames3.ttf index dd76bd478bbb2546cfbbb7e9184cc7f602c59303..408d127fb1ef2a1d74977aecfb6d70a4ae03d838 100644 GIT binary patch delta 236 zcmZn=m?JPjOo)R42xQz{T-_K%8AKQu*q=<)uxGq8F(X4`#|$<b1_tIi3=AwH1^LA# z|L-xF0TpZjYS3a}1d>4PxX7<-<Bw=&dow0I24)6E79IvB1{tO_D4T^rkEsRB76PhP zW7+~`GcgD--GQ<}YTj+0#G=c{ZpNU?V8~!N`53Dii<z#W;Uu=x_5lp}3^@#$49N_s w3<?Y>4C)Lof$SonNC`tJkOqlrFr)!lnm{K70cF#H;vgL_K(d?Pu?aB(09p|&jsO4v
delta 192 zcmbOu&>%2DOo;sn0|SGMyNjzEgD8Ut0|Wbsi5m8d2PS4@Xsl=c_lALic@7Z&D9A4^ z`G1eW45(lVP=gi&Baj5*kW;bSHvWiawpU})V_;@rWZ_|8Vh~^ogR)r|^q4ZhY$2d( zHKrL*Hb~6@C>x~a(&kAlx{U0G3<eCk43?9Rv6`_M8t7V1Vmm##n_YxogCUKfh(Qx* PWe`IuL;7Yeb|FRpHXtN(
Signed-off-by: Rémi Bernon rbernon@codeweavers.com --- dlls/gdi32/font.c | 68 ++++++++++++++++++++++++++--------------- dlls/gdi32/tests/font.c | 2 +- 2 files changed, 44 insertions(+), 26 deletions(-)
diff --git a/dlls/gdi32/font.c b/dlls/gdi32/font.c index 303467af466..0098fbb2437 100644 --- a/dlls/gdi32/font.c +++ b/dlls/gdi32/font.c @@ -97,6 +97,24 @@ static struct font_gamma_ramp font_gamma_ramp; static void add_face_to_cache( struct gdi_font_face *face ); static void remove_face_from_cache( struct gdi_font_face *face );
+static inline WCHAR facename_tolower( WCHAR c ) +{ + if ((char)c >= 'A' && (char)c <= 'Z') return c - 'A' + 'a'; + else if (c > 127) return RtlDowncaseUnicodeChar( c ); + else return c; +} + +static inline int facename_compare( const WCHAR *str1, const WCHAR *str2, SIZE_T len ) +{ + while (len--) + { + WCHAR c1 = facename_tolower( *str1++ ), c2 = facename_tolower( *str2++ ); + if (c1 != c2) return c1 - c2; + else if (!c1) return 0; + } + return 0; +} + /* Device -> World size conversion */
/* Performs a device to world transformation on the specified width (which @@ -465,7 +483,7 @@ static const WCHAR *get_gdi_font_subst( const WCHAR *from_name, int from_charset
LIST_FOR_EACH_ENTRY( subst, &font_subst_list, struct gdi_font_subst, entry ) { - if (!wcsicmp(subst->names, from_name) && + if (!facename_compare( subst->names, from_name, -1 ) && (subst->from_charset == from_charset || subst->from_charset == -1)) { if (to_charset) *to_charset = subst->to_charset; @@ -538,19 +556,19 @@ static int family_namecmp( const WCHAR *str1, const WCHAR *str2 ) { int prio1, prio2, vert1 = (str1[0] == '@' ? 1 : 0), vert2 = (str2[0] == '@' ? 1 : 0);
- if (!wcsnicmp( str1, ff_swiss_default, LF_FACESIZE - 1 )) prio1 = 0; - else if (!wcsnicmp( str1, ff_modern_default, LF_FACESIZE - 1 )) prio1 = 1; - else if (!wcsnicmp( str1, ff_roman_default, LF_FACESIZE - 1 )) prio1 = 2; + if (!facename_compare( str1, ff_swiss_default, LF_FACESIZE - 1 )) prio1 = 0; + else if (!facename_compare( str1, ff_modern_default, LF_FACESIZE - 1 )) prio1 = 1; + else if (!facename_compare( str1, ff_roman_default, LF_FACESIZE - 1 )) prio1 = 2; else prio1 = 3;
- if (!wcsnicmp( str2, ff_swiss_default, LF_FACESIZE - 1 )) prio2 = 0; - else if (!wcsnicmp( str2, ff_modern_default, LF_FACESIZE - 1 )) prio2 = 1; - else if (!wcsnicmp( str2, ff_roman_default, LF_FACESIZE - 1 )) prio2 = 2; + if (!facename_compare( str2, ff_swiss_default, LF_FACESIZE - 1 )) prio2 = 0; + else if (!facename_compare( str2, ff_modern_default, LF_FACESIZE - 1 )) prio2 = 1; + else if (!facename_compare( str2, ff_roman_default, LF_FACESIZE - 1 )) prio2 = 2; else prio2 = 3;
if (prio1 != prio2) return prio1 - prio2; if (vert1 != vert2) return vert1 - vert2; - return wcsnicmp( str1 + vert1, str2 + vert2, LF_FACESIZE - 1 ); + return facename_compare( str1 + vert1, str2 + vert2, LF_FACESIZE - 1 ); }
static int family_name_compare( const void *key, const struct wine_rb_entry *entry ) @@ -837,7 +855,7 @@ static int remove_font( const WCHAR *file, DWORD flags )
static inline BOOL faces_equal( const struct gdi_font_face *f1, const struct gdi_font_face *f2 ) { - if (wcsicmp( f1->full_name, f2->full_name )) return FALSE; + if (facename_compare( f1->full_name, f2->full_name, -1 )) return FALSE; if (f1->scalable) return TRUE; if (f1->size.y_ppem != f2->size.y_ppem) return FALSE; return !memcmp( &f1->fs, &f2->fs, sizeof(f1->fs) ); @@ -1173,7 +1191,7 @@ static struct gdi_font_link *find_gdi_font_link( const WCHAR *name ) struct gdi_font_link *link;
LIST_FOR_EACH_ENTRY( link, &font_links, struct gdi_font_link, entry ) - if (!wcsnicmp( link->name, name, LF_FACESIZE - 1 )) return link; + if (!facename_compare( link->name, name, LF_FACESIZE - 1 )) return link; return NULL; }
@@ -1186,8 +1204,8 @@ static struct gdi_font_family *find_family_from_font_links( const WCHAR *name, c
LIST_FOR_EACH_ENTRY( link, &font_links, struct gdi_font_link, entry ) { - if (!wcsnicmp( link->name, name, LF_FACESIZE - 1) || - (subst && !wcsnicmp( link->name, subst, LF_FACESIZE - 1 ))) + if (!facename_compare( link->name, name, LF_FACESIZE - 1 ) || + (subst && !facename_compare( link->name, subst, LF_FACESIZE - 1 ))) { TRACE("found entry in system list\n"); LIST_FOR_EACH_ENTRY( entry, &link->links, struct gdi_font_link_entry, entry ) @@ -1288,7 +1306,7 @@ static void populate_system_links( const WCHAR *name, const WCHAR * const *value font_link = add_gdi_font_link( name ); for ( ; *values; values++) { - if (!wcsicmp( name, *values )) continue; + if (!facename_compare( name, *values, -1 )) continue; if (!(value = get_gdi_font_subst( *values, -1, NULL ))) value = *values; if (!(family = find_family_from_name( value ))) continue; /* use first extant filename for this Family */ @@ -1371,12 +1389,12 @@ static void load_system_links(void) { const WCHAR *subst = get_gdi_font_subst( font_links_defaults_list[i].shelldlg, -1, NULL );
- if ((!wcsicmp( font_links_defaults_list[i].shelldlg, shelldlg_name ) || - (subst && !wcsicmp( subst, shelldlg_name )))) + if ((!facename_compare( font_links_defaults_list[i].shelldlg, shelldlg_name, -1 ) || + (subst && !facename_compare( subst, shelldlg_name, -1 )))) { for (j = 0; j < ARRAY_SIZE(font_links_list); j++) populate_system_links( font_links_list[j], font_links_defaults_list[i].substitutes ); - if (!wcsicmp(shelldlg_name, font_links_defaults_list[i].substitutes[0])) + if (!facename_compare(shelldlg_name, font_links_defaults_list[i].substitutes[0], -1)) populate_system_links( shelldlg_name, font_links_defaults_list[i].substitutes ); } } @@ -1477,7 +1495,7 @@ static struct gdi_font_face *find_matching_face_by_name( const WCHAR *name, cons /* search by full face name */ WINE_RB_FOR_EACH_ENTRY( family, &family_name_tree, struct gdi_font_family, name_entry ) LIST_FOR_EACH_ENTRY( face, get_family_face_list(family), struct gdi_font_face, entry ) - if (!wcsnicmp( face->full_name, name, LF_FACESIZE - 1 ) && + if (!facename_compare( face->full_name, name, LF_FACESIZE - 1 ) && can_select_face( face, fs, can_use_bitmap )) return face;
@@ -2150,7 +2168,7 @@ static void create_child_font_list( struct gdi_font *font ) * Sans Serif. This is how asian windows get default fallbacks for fonts */ if (is_dbcs_ansi_cp(GetACP()) && font->charset != SYMBOL_CHARSET && font->charset != OEM_CHARSET && - wcsicmp( font_name, L"Microsoft Sans Serif" ) != 0) + facename_compare( font_name, L"Microsoft Sans Serif", -1 ) != 0) { if ((font_link = find_gdi_font_link( L"Microsoft Sans Serif" ))) { @@ -2175,7 +2193,7 @@ static BOOL fontcmp( const struct gdi_font *font, DWORD hash, const LOGFONTW *lf if (memcmp( &font->matrix, matrix, sizeof(*matrix))) return TRUE; if (memcmp( &font->lf, lf, offsetof(LOGFONTW, lfFaceName))) return TRUE; if (!font->can_use_bitmap != !can_use_bitmap) return TRUE; - return wcsicmp( font->lf.lfFaceName, lf->lfFaceName); + return facename_compare( font->lf.lfFaceName, lf->lfFaceName, -1 ); }
static DWORD hash_font( const LOGFONTW *lf, const FMAT2 *matrix, BOOL can_use_bitmap ) @@ -2696,16 +2714,16 @@ static BOOL family_matches( struct gdi_font_family *family, const WCHAR *face_na { struct gdi_font_face *face;
- if (!wcsnicmp( face_name, family->family_name, LF_FACESIZE - 1 )) return TRUE; + if (!facename_compare( face_name, family->family_name, LF_FACESIZE - 1 )) return TRUE; LIST_FOR_EACH_ENTRY( face, get_family_face_list(family), struct gdi_font_face, entry ) - if (!wcsnicmp( face_name, face->full_name, LF_FACESIZE - 1 )) return TRUE; + if (!facename_compare( face_name, face->full_name, LF_FACESIZE - 1 )) return TRUE; return FALSE; }
static BOOL face_matches( const WCHAR *family_name, struct gdi_font_face *face, const WCHAR *face_name ) { - if (!wcsnicmp( face_name, family_name, LF_FACESIZE - 1)) return TRUE; - return !wcsnicmp( face_name, face->full_name, LF_FACESIZE - 1 ); + if (!facename_compare( face_name, family_name, LF_FACESIZE - 1)) return TRUE; + return !facename_compare( face_name, face->full_name, LF_FACESIZE - 1 ); }
static BOOL enum_face_charsets( const struct gdi_font_family *family, struct gdi_font_face *face, @@ -3571,7 +3589,7 @@ static struct gdi_font *select_font( LOGFONTW *lf, FMAT2 dcmat, BOOL can_use_bit SYMBOL_CHARSET so that Symbol gets picked irrespective of the original value lfCharSet. Note this is a special case for Symbol and doesn't happen at least for "Wingdings*" */ - if (!wcsicmp( lf->lfFaceName, L"Symbol" )) lf->lfCharSet = SYMBOL_CHARSET; + if (!facename_compare( lf->lfFaceName, L"Symbol", -1 )) lf->lfCharSet = SYMBOL_CHARSET;
/* check the cache first */ if ((font = find_cached_gdi_font( lf, &dcmat, can_use_bitmap ))) @@ -7775,7 +7793,7 @@ struct external_key
static int compare_external_key( const void *key, const struct wine_rb_entry *entry ) { - return wcsicmp( key, WINE_RB_ENTRY_VALUE( entry, struct external_key, entry )->value ); + return facename_compare( key, WINE_RB_ENTRY_VALUE( entry, struct external_key, entry )->value, -1 ); }
static struct wine_rb_tree external_keys = { compare_external_key }; diff --git a/dlls/gdi32/tests/font.c b/dlls/gdi32/tests/font.c index 3bcbebaf91a..0d5fa42f043 100644 --- a/dlls/gdi32/tests/font.c +++ b/dlls/gdi32/tests/font.c @@ -7282,7 +7282,7 @@ static void test_lang_names(void) wcscpy( font_w.lfFaceName, L"Wine POLICE D'ÉCRITURE (fr)" ); memset( &efnd_w, 0, sizeof(efnd_w) ); EnumFontFamiliesExW( dc, &font_w, enum_fullname_data_proc_w, (LPARAM)&efnd_w, 0 ); - todo_wine ok( efnd_w.total == 1, "%d: EnumFontFamiliesExW unexpected count %u.\n", i, efnd_w.total ); + ok( efnd_w.total == 1, "%d: EnumFontFamiliesExW unexpected count %u.\n", i, efnd_w.total ); }
strcpy( font.lfFaceName, "Wine Lang Cond (ko)" );
Signed-off-by: Huw Davies huw@codeweavers.com
Rémi Bernon rbernon@codeweavers.com writes:
Signed-off-by: Rémi Bernon rbernon@codeweavers.com
This brings another ~200ms prefix startup time improvement, from ~1.2s to ~1s for "wine cmd /c exit" execution time on average, as well as a ~50ms process startup time improvement, from ~0.25s to ~0.2s execution time when prefix is already started.
The test shows that using wcsicmp is incorrect for face name comparison, or at least that we should not rely on the current locale, and perf also reports a high number of CPU cache miss coming from the locale refcounting, which is the main source of improvement here.
IIUC RtlDowncaseUnicodeChar also does locale dependent case folding, but I'm not sure to see how it's controlled (it's the system default locale that defines the loaded tables right?), and we should probably check if case matching depends on it. If not, is there any canonical normalized unicode case folding that should be used instead?
Unicode case mapping in ntdll is not locale-dependent. You can use CompareStringOrdinal() for that sort of thing.
On 11/23/20 2:16 PM, Alexandre Julliard wrote:
Rémi Bernon rbernon@codeweavers.com writes:
Signed-off-by: Rémi Bernon rbernon@codeweavers.com
This brings another ~200ms prefix startup time improvement, from ~1.2s to ~1s for "wine cmd /c exit" execution time on average, as well as a ~50ms process startup time improvement, from ~0.25s to ~0.2s execution time when prefix is already started.
The test shows that using wcsicmp is incorrect for face name comparison, or at least that we should not rely on the current locale, and perf also reports a high number of CPU cache miss coming from the locale refcounting, which is the main source of improvement here.
IIUC RtlDowncaseUnicodeChar also does locale dependent case folding, but I'm not sure to see how it's controlled (it's the system default locale that defines the loaded tables right?), and we should probably check if case matching depends on it. If not, is there any canonical normalized unicode case folding that should be used instead?
Unicode case mapping in ntdll is not locale-dependent. You can use CompareStringOrdinal() for that sort of thing.
I also wanted to avoid iterating the whole strings as much as possible. RtlCompareUnicodeStrings (and so CompareStringOrdinal too) needs the actual strings length, or at least one of the two strings length to stop the iteration. This starts to add up as these comparisons are done quite a lot, specially when there's a lot of fonts.
Using CompareStringOrdinal, with strlenW calls to get the actual lengths, instead adds ~50ms to the results above when prefix is not start, and ~15ms to the process execution time when prefix is already started.
Of course the results depends on how many system fonts are installed, and I'm testing it with the debian fonts-noto package which brings a lot, but I don't think it's so unusual (I didn't even install it for the test, for some reason I've got it installed for a while).
On 11/23/20 2:40 PM, Rémi Bernon wrote:
On 11/23/20 2:16 PM, Alexandre Julliard wrote:
Rémi Bernon rbernon@codeweavers.com writes:
Signed-off-by: Rémi Bernon rbernon@codeweavers.com
This brings another ~200ms prefix startup time improvement, from ~1.2s to ~1s for "wine cmd /c exit" execution time on average, as well as a ~50ms process startup time improvement, from ~0.25s to ~0.2s execution time when prefix is already started.
The test shows that using wcsicmp is incorrect for face name comparison, or at least that we should not rely on the current locale, and perf also reports a high number of CPU cache miss coming from the locale refcounting, which is the main source of improvement here.
IIUC RtlDowncaseUnicodeChar also does locale dependent case folding, but I'm not sure to see how it's controlled (it's the system default locale that defines the loaded tables right?), and we should probably check if case matching depends on it. If not, is there any canonical normalized unicode case folding that should be used instead?
Unicode case mapping in ntdll is not locale-dependent. You can use CompareStringOrdinal() for that sort of thing.
I also wanted to avoid iterating the whole strings as much as possible. RtlCompareUnicodeStrings (and so CompareStringOrdinal too) needs the actual strings length, or at least one of the two strings length to stop the iteration. This starts to add up as these comparisons are done quite a lot, specially when there's a lot of fonts.
Using CompareStringOrdinal, with strlenW calls to get the actual lengths, instead adds ~50ms to the results above when prefix is not start, and ~15ms to the process execution time when prefix is already started.
Of course the results depends on how many system fonts are installed, and I'm testing it with the debian fonts-noto package which brings a lot, but I don't think it's so unusual (I didn't even install it for the test, for some reason I've got it installed for a while).
Well nevermind, it looks like that RtlCompareUnicodeStrings should support -1 length and stop the comparison on first EOS, according to:
https://testbot.winehq.org/JobDetails.pl?Key=82338
On 11/23/20 3:12 PM, Rémi Bernon wrote:
On 11/23/20 2:40 PM, Rémi Bernon wrote:
On 11/23/20 2:16 PM, Alexandre Julliard wrote:
Rémi Bernon rbernon@codeweavers.com writes:
Signed-off-by: Rémi Bernon rbernon@codeweavers.com
This brings another ~200ms prefix startup time improvement, from ~1.2s to ~1s for "wine cmd /c exit" execution time on average, as well as a ~50ms process startup time improvement, from ~0.25s to ~0.2s execution time when prefix is already started.
The test shows that using wcsicmp is incorrect for face name comparison, or at least that we should not rely on the current locale, and perf also reports a high number of CPU cache miss coming from the locale refcounting, which is the main source of improvement here.
IIUC RtlDowncaseUnicodeChar also does locale dependent case folding, but I'm not sure to see how it's controlled (it's the system default locale that defines the loaded tables right?), and we should probably check if case matching depends on it. If not, is there any canonical normalized unicode case folding that should be used instead?
Unicode case mapping in ntdll is not locale-dependent. You can use CompareStringOrdinal() for that sort of thing.
I also wanted to avoid iterating the whole strings as much as possible. RtlCompareUnicodeStrings (and so CompareStringOrdinal too) needs the actual strings length, or at least one of the two strings length to stop the iteration. This starts to add up as these comparisons are done quite a lot, specially when there's a lot of fonts.
Using CompareStringOrdinal, with strlenW calls to get the actual lengths, instead adds ~50ms to the results above when prefix is not start, and ~15ms to the process execution time when prefix is already started.
Of course the results depends on how many system fonts are installed, and I'm testing it with the debian fonts-noto package which brings a lot, but I don't think it's so unusual (I didn't even install it for the test, for some reason I've got it installed for a while).
Well nevermind, it looks like that RtlCompareUnicodeStrings should support -1 length and stop the comparison on first EOS, according to:
Or not, it just compares lengths... so what I initially said.