Word 2016 queries a lot of font glyph bounding boxes and bitmaps with translation matrices. The
result of these queries is not cached because the transform matrix is not the identity matrix.
However, the translation offsets don't affect FreeType font operations at all, which can be
verified in ft_matrix_from_dwrite_matrix() called by get_glyph_transform(). So these results with
translation matrices can be cached as well. With this patch, Word 2016 stuttering is reduced
significantly.
--
v3: dwrite: Use cache when font transform matrix contains only translation offsets.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2482
Based on [a patch](https://www.winehq.org/mailman3/hyperkitty/list/wine-devel@winehq.or… by Jinoh Kang (@iamahuman) from February 2022.
I removed the need for the event object and implemented fast paths for Linux.
On macOS 10.14+ `thread_get_register_pointer_values` is called on every thread of the process.
On Linux 4.14+ `membarrier(MEMBARRIER_CMD_GLOBAL_EXPEDITED, ...)` is used.
On x86 Linux <= 4.13 and on other platforms `madvise(..., MADV_DONTNEED)` is used, which sends IPIs to all cores causing them to do a memory barrier.
--
v10: ntdll: Add thread_get_register_pointer_values-based implementation of NtFlushProcessWriteBuffers.
ntdll: Add sys_membarrier-based implementation of NtFlushProcessWriteBuffers.
ntdll: Add MADV_DONTNEED-based implementation of NtFlushProcessWriteBuffers.
https://gitlab.winehq.org/wine/wine/-/merge_requests/741