Thanks for writing tests. It turns out I was a bit confused when I asked for them.
For some reason, I was under the impression that GdipMeasureDriverString would return a measurement in device coordinates, when it should have been obvious to me that it returns world coordinates. Looking back at the existing tests confirms that, since the measurements are the same with a NULL matrix regardless of the world transform.
That means the test I wanted doesn't answer the question I wanted answered. I'm sorry I created extra work for you with my mistake.
The approach of rotating and scaling still works, but it requires drawing text and visually inspecting the output. I've written and attached a test program that does that (with a matrix that rotates by 90 degrees and a world transform with a much greater scale on the Y axis than the X axis). It shows that things work as I expected. The text is stretched vertically AFTER the rotation is applied, and it's easy to see the difference between native's behavior and your patch.
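To make the ordering concrete, here is a rough sketch in plain C of the arithmetic I have in mind. It is not GDI+ code: the xform2 struct and the helper names are made up for illustration, translation is left out, and the scale factor of 10 is just an example value. It uses the row-vector convention, with a 90 degree rotation written as (0, 1, -1, 0).

```c
#include <assert.h>

/* Hypothetical 2x2 linear transform (no translation); not a GDI+ type. */
typedef struct { double m11, m12, m21, m22; } xform2;

/* Apply t to the row vector (x, y):
 * x' = x*m11 + y*m21, y' = x*m12 + y*m22. */
static void apply(const xform2 *t, double *x, double *y)
{
    double nx = *x * t->m11 + *y * t->m21;
    double ny = *x * t->m12 + *y * t->m22;
    *x = nx;
    *y = ny;
}

/* The order the test program shows: matrix first, world transform second. */
static void matrix_then_world(const xform2 *matrix, const xform2 *world,
                              double *x, double *y)
{
    apply(matrix, x, y);
    apply(world, x, y);
}

/* The opposite order, for comparison only. */
static void world_then_matrix(const xform2 *matrix, const xform2 *world,
                              double *x, double *y)
{
    apply(world, x, y);
    apply(matrix, x, y);
}

/* Follow a unit step along the text baseline through both orders. */
static void demo(void)
{
    const xform2 rot90 = { 0.0, 1.0, -1.0, 0.0 };  /* rotate by 90 degrees */
    const xform2 world = { 1.0, 0.0, 0.0, 10.0 };  /* Y scale >> X scale */
    double x, y;

    x = 1.0; y = 0.0;
    matrix_then_world(&rot90, &world, &x, &y);
    assert(x == 0.0 && y == 10.0);  /* baseline is vertical AND stretched */

    x = 1.0; y = 0.0;
    world_then_matrix(&rot90, &world, &x, &y);
    assert(x == 0.0 && y == 1.0);   /* baseline is vertical, not stretched */
}
```

With the matrix applied first, the rotated baseline ends up along the Y axis and then picks up the large Y scale, which is the vertical stretch visible in the test program's output. In the other order the baseline is never stretched, so the two cases are easy to tell apart by eye.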
It also appears that the origin point (and presumably the later points as well) should not have the matrix applied to it. It looks to me like your patch does not transform the points in the GDI32 codepath, and thus gets the origin point correct in the test case. It does transform them in the SOFTWARE codepath, and if I force that codepath to be taken, the text draws outside the bounds of the window. So the matrix shouldn't be passed to transform_and_round_points, and it's not necessary to add a matrix argument to that function.
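Here is a small sketch of how I currently understand the placement, again in plain C rather than GDI+ code. The pt and xform types, the function names, and the numbers are all hypothetical, and it assumes the matrix only affects the glyph shapes (its linear part) while the positions go through the world transform alone. That assumption is based only on what the test program shows.

```c
#include <assert.h>

/* Hypothetical types for illustration; not GDI+ structures. */
typedef struct { double x, y; } pt;
typedef struct { double m11, m12, m21, m22, dx, dy; } xform;

/* Linear part only (row-vector convention), ignoring translation. */
static pt apply_linear(const xform *t, pt p)
{
    pt r = { p.x * t->m11 + p.y * t->m21, p.x * t->m12 + p.y * t->m22 };
    return r;
}

/* Full transform: linear part plus translation. */
static pt apply(const xform *t, pt p)
{
    pt r = apply_linear(t, p);
    r.x += t->dx;
    r.y += t->dy;
    return r;
}

/* What native appears to do: the position sees only the world transform,
 * while the glyph shape offset sees the matrix and then the world transform. */
static pt place_glyph_point(const xform *matrix, const xform *world,
                            pt origin, pt glyph_offset)
{
    pt base = apply(world, origin);
    pt off = apply_linear(world, apply_linear(matrix, glyph_offset));
    pt r = { base.x + off.x, base.y + off.y };
    return r;
}

/* The variant that also pushes the position through the matrix, roughly
 * what happens when the points are transformed with the matrix. */
static pt place_glyph_point_wrong(const xform *matrix, const xform *world,
                                  pt origin, pt glyph_offset)
{
    pt base = apply(world, apply_linear(matrix, origin));
    pt off = apply_linear(world, apply_linear(matrix, glyph_offset));
    pt r = { base.x + off.x, base.y + off.y };
    return r;
}

static void demo_origin(void)
{
    const xform rot90 = { 0.0, 1.0, -1.0, 0.0, 0.0, 0.0 };
    const xform world = { 1.0, 0.0, 0.0, 10.0, 0.0, 0.0 };
    const pt origin = { 5.0, 2.0 }, offset = { 1.0, 0.0 };

    pt ok = place_glyph_point(&rot90, &world, origin, offset);
    assert(ok.x == 5.0 && ok.y == 30.0);

    pt bad = place_glyph_point_wrong(&rot90, &world, origin, offset);
    assert(bad.x == -2.0 && bad.y == 60.0);  /* negative X: off the window */
}
```

With these example values, rotating the origin moves it to a negative X coordinate before the world transform is applied, which is the same kind of failure I see when forcing the SOFTWARE codepath.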