I'm still primarily concerned with getting all of the features in place and behaving correctly first. Performance isn't a priority for me yet, but if someone points out a real app where it is an issue then I may be more interested.
I agree that image drawing is far enough along that we can start to optimize it, and I'm not surprised that the performance is bad. I imagine that convert_pixels is very slow in most cases, and optimized codepaths for the common cases could lead to some general improvement (even the cases that are already treated specially have not been profiled, so they may not be ideal). Worse, alpha_blend_pixels goes through GdipBitmapGetPixel/GdipBitmapSetPixel for each pixel, which will hurt the performance of any drawing operation on a bitmap.
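To make the "optimized codepaths for the common cases" idea concrete, here is a minimal sketch of what I mean. It is not the real convert_pixels: the names (sketch_convert, sketch_format and so on) and the two formats are made up for illustration, and I'm assuming pixels are stored as B, G, R[, A] bytes. The point is only the shape: check for a cheap special case first, and fall back to the generic per-pixel decode/encode otherwise.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical formats for this sketch only; bytes are stored B, G, R[, A]. */
enum sketch_format { SKETCH_24BPP_RGB, SKETCH_32BPP_ARGB };

static size_t sketch_bpp(enum sketch_format fmt)
{
    return fmt == SKETCH_32BPP_ARGB ? 4 : 3;
}

/* Generic slow path: decode one pixel of any supported format to ARGB. */
static uint32_t sketch_read(const uint8_t *p, enum sketch_format fmt)
{
    uint32_t a = (fmt == SKETCH_32BPP_ARGB) ? p[3] : 0xFF;
    return (a << 24) | ((uint32_t)p[2] << 16) | ((uint32_t)p[1] << 8) | p[0];
}

/* Generic slow path: encode one ARGB value into any supported format. */
static void sketch_write(uint8_t *p, enum sketch_format fmt, uint32_t argb)
{
    p[0] = (uint8_t)(argb & 0xFF);
    p[1] = (uint8_t)((argb >> 8) & 0xFF);
    p[2] = (uint8_t)((argb >> 16) & 0xFF);
    if (fmt == SKETCH_32BPP_ARGB)
        p[3] = (uint8_t)(argb >> 24);
}

void sketch_convert(int width, int height,
                    uint8_t *dst, int dst_stride, enum sketch_format dst_fmt,
                    const uint8_t *src, int src_stride, enum sketch_format src_fmt)
{
    int x, y;

    if (src_fmt == dst_fmt)
    {
        /* Fast path: identical formats, so copy whole rows at once. */
        for (y = 0; y < height; y++)
            memcpy(dst + y * dst_stride, src + y * src_stride,
                   (size_t)width * sketch_bpp(src_fmt));
        return;
    }

    /* Fallback: per-pixel decode and re-encode for every other pair. */
    for (y = 0; y < height; y++)
        for (x = 0; x < width; x++)
            sketch_write(dst + y * dst_stride + x * sketch_bpp(dst_fmt), dst_fmt,
                         sketch_read(src + y * src_stride + x * sketch_bpp(src_fmt),
                                     src_fmt));
}
```

Which format pairs deserve their own fast path is exactly what profiling would have to tell us; the same-format copy is just the most obvious candidate.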
When drawing to an HDC (or a Bitmap with premultiplied alpha), we convert from the bitmap format to non-premultiplied alpha and then convert back to premultiplied alpha in alpha_blend_pixels. That round trip is probably wasteful. In cases where we have no ImageAttributes object, we may be able to use premultiplied alpha for the whole process.
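As a rough illustration of staying in premultiplied alpha throughout, here is a minimal sketch of a source-over blend done entirely on premultiplied 32-bit ARGB values. None of these helpers exist in the tree; the names (mul_div_255, premultiply, unpremultiply, blend_premultiplied) are hypothetical, and I'm assuming both source and destination are already premultiplied when the blend runs. The premultiply/unpremultiply pair is included only to show what the current round trip costs: extra work per pixel, plus some precision loss at low alpha values.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; pixels are 32-bit ARGB values. */

/* Multiply two 8-bit values treated as fractions of 255, with rounding. */
uint8_t mul_div_255(uint8_t a, uint8_t b)
{
    unsigned t = (unsigned)a * b + 128;
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* Scale each color channel by the pixel's own alpha. */
uint32_t premultiply(uint32_t argb)
{
    uint8_t a = (uint8_t)(argb >> 24);
    uint32_t r = mul_div_255((uint8_t)(argb >> 16), a);
    uint32_t g = mul_div_255((uint8_t)(argb >> 8), a);
    uint32_t b = mul_div_255((uint8_t)argb, a);
    return ((uint32_t)a << 24) | (r << 16) | (g << 8) | b;
}

/* Undo premultiply; values lost to rounding cannot be recovered. */
uint32_t unpremultiply(uint32_t argb)
{
    uint32_t a = argb >> 24, out, shift;
    if (a == 0)
        return 0;
    out = a << 24;
    for (shift = 0; shift < 24; shift += 8)
    {
        uint32_t c = (argb >> shift) & 0xFF;
        c = (c * 255 + a / 2) / a;
        if (c > 255) c = 255;
        out |= c << shift;
    }
    return out;
}

/* Source-over blend where src and dst are both premultiplied:
 * out = src + dst * (255 - src_alpha) / 255, applied to all four channels. */
uint32_t blend_premultiplied(uint32_t src, uint32_t dst)
{
    uint8_t inv = (uint8_t)(255 - (src >> 24));
    uint32_t out = 0, shift;
    for (shift = 0; shift < 32; shift += 8)
    {
        uint32_t c = ((src >> shift) & 0xFF)
                   + mul_div_255((uint8_t)(dst >> shift), inv);
        if (c > 255) c = 255; /* guard against inputs that are not truly premultiplied */
        out |= c << shift;
    }
    return out;
}
```

With this shape, no channel ever has to be divided by alpha during the blend, so the unpremultiply step only matters if something later needs non-premultiplied data (which is presumably the case when an ImageAttributes object is involved).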