Avoiding floorf from the external library ucrtbase.dll improves the performance of GdipDrawImagePointsRect.
From: Bartosz Kosiorek <gang65@poczta.onet.pl>
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=53947
---
 dlls/gdiplus/graphics.c | 38 +++++++++++++++++++++++++++-----------
 1 file changed, 27 insertions(+), 11 deletions(-)

diff --git a/dlls/gdiplus/graphics.c b/dlls/gdiplus/graphics.c
index 371629a5bef..b75a46c57eb 100644
--- a/dlls/gdiplus/graphics.c
+++ b/dlls/gdiplus/graphics.c
@@ -1055,22 +1055,38 @@ static ARGB resample_bitmap_pixel(GDIPCONST GpRect *src_rect, LPBYTE bits, UINT
     }
     case InterpolationModeNearestNeighbor:
     {
-        FLOAT pixel_offset;
+        /* Using floorf from ucrtbase.dll is extremely slow compared to own implementation (casting to INT) */
+        INT pixel_with_offset_x, pixel_with_offset_y;
         switch (offset_mode)
         {
-        default:
-        case PixelOffsetModeNone:
-        case PixelOffsetModeHighSpeed:
-            pixel_offset = 0.5;
-            break;
+            default:
+            case PixelOffsetModeNone:
+            case PixelOffsetModeHighSpeed:
+                if (point->X >= 0.0f)
+                    pixel_with_offset_x = (INT)(point->X + 0.5f);
+                else
+                    pixel_with_offset_x = (INT)(point->X - 0.5f);
 
-        case PixelOffsetModeHalf:
-        case PixelOffsetModeHighQuality:
-            pixel_offset = 0.0;
-            break;
+                if (point->Y >= 0.0f)
+                    pixel_with_offset_y = (INT)(point->Y + 0.5f);
+                else
+                    pixel_with_offset_y = (INT)(point->Y - 0.5f);
+                break;
+            case PixelOffsetModeHalf:
+            case PixelOffsetModeHighQuality:
+                if (point->X >= 0.0f)
+                    pixel_with_offset_x = (INT)(point->X);
+                else
+                    pixel_with_offset_x = (INT)(point->X - 1.0f);
+
+                if (point->Y >= 0.0f)
+                    pixel_with_offset_y = (INT)(point->Y);
+                else
+                    pixel_with_offset_y = (INT)(point->Y - 1.0f);
+                break;
         }
         return sample_bitmap_pixel(src_rect, bits, width, height,
-            floorf(point->X + pixel_offset), floorf(point->Y + pixel_offset), attributes);
+            pixel_with_offset_x, pixel_with_offset_y, attributes);
     }
 
     }
It looks like floor does a lot more, for example handling all the corner cases. Are you sure this isn't a problem here?
FWIW, just having the function inlined already helps a lot.
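One possible reading of the inlining remark, as a rough sketch only: a small static inline helper that computes the floor through an integer cast and then corrects for truncation toward zero. The name floor_to_int is hypothetical and not an existing gdiplus or CRT function. It assumes the input is not NaN and fits in an int; outside that range the cast is undefined behaviour.

```c
/* Hypothetical inline floor for floats that fit in an int (no NaN/Inf). */
static inline int floor_to_int(float x)
{
    int i = (int)x;             /* truncates toward zero */
    /* For negative non-integers truncation lands one above the floor,
     * so step down by one; the comparison yields 0 or 1. */
    return i - (x < (float)i);
}
```

With this shape, negative integers such as -1.0f map to -1 and -0.2f maps to -1, matching floorf for in-range values, while the call itself can be inlined by the compiler instead of going through an external DLL.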
Btw, is there a reason you changed the switch/case indentation? IMHO it makes reading the patch harder.
I think a bigger question is why we need float coordinates here to begin with.