On Fri Dec 2 21:47:10 2022 +0000, Bartosz Kosiorek wrote:
With only the `gdiplus: improve performance of matrix multiplication by unrolling loop.` commit applied, I had:

Wine gdiplus.dll **with only `matrix_multiply` optimizations** and `matrix_multiply` inlining:
- 500 x `GdipScaleMatrix` time (seconds): 0.21s
- 700 x `GdipMultiplyMatrix` time (seconds): 0.14s
Generally I cannot get a value below 0.21s (although with `GdipScaleMatrix` I sometimes get 0.06s). It seems that similar optimizations are applied in the native Windows gdiplus.dll, as `GdipScaleMatrix` is faster than `GdipMultiplyMatrix` (which would not be possible when using `matrix_multiply`):

Native (Windows gdiplus.dll)
- 500 x `GdipScaleMatrix` time (seconds): 0.84s
- 700 x `GdipMultiplyMatrix` time (seconds): 1.22s
I just wanted to verify that it simplifies the generated code for `GdipScaleMatrix` even with the second patch applied, which it does.
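
For reference, here is a rough sketch of the two approaches being compared. It is not the code from either patch. It assumes a 3x2 affine matrix stored as six floats `[m11, m12, m21, m22, dx, dy]`, and the function names (`affine_multiply_unrolled`, `affine_scale_prepend`, `affine_scale_append`) are hypothetical.

```c
#include <string.h>

/* Hypothetical sketch: general 3x2 affine multiply, out = left * right,
 * with the loop fully unrolled. A temporary is used so that `out` may
 * alias `left` or `right`. */
static void affine_multiply_unrolled(const float *left, const float *right, float *out)
{
    float temp[6];

    temp[0] = left[0] * right[0] + left[1] * right[2];
    temp[1] = left[0] * right[1] + left[1] * right[3];
    temp[2] = left[2] * right[0] + left[3] * right[2];
    temp[3] = left[2] * right[1] + left[3] * right[3];
    temp[4] = left[4] * right[0] + left[5] * right[2] + right[4];
    temp[5] = left[4] * right[1] + left[5] * right[3] + right[5];

    memcpy(out, temp, sizeof(temp));
}

/* Hypothetical sketch: scale specialization for m = S * m, where
 * S = [sx, 0, 0, sy, 0, 0]. The zero terms drop out, leaving four
 * multiplications and an untouched translation row. */
static void affine_scale_prepend(float *m, float sx, float sy)
{
    m[0] *= sx;
    m[1] *= sx;
    m[2] *= sy;
    m[3] *= sy;
}

/* Hypothetical sketch: scale specialization for m = m * S.
 * Each column is scaled, including the translation row. */
static void affine_scale_append(float *m, float sx, float sy)
{
    m[0] *= sx;
    m[1] *= sy;
    m[2] *= sx;
    m[3] *= sy;
    m[4] *= sx;
    m[5] *= sy;
}
```

The specialized versions need at most six multiplications and no temporary copy, which is why a scale call can in principle be cheaper than a general multiply.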