I don't think these tests make sense. They're very awkward, both because they have to poll until the window is visible and because they compare translated text (which I don't think we do anywhere else). They're also not very valuable, since this feature is just for debugging anyway.
I think this is a case where an out-of-tree test (or really just manual verification) is better.