actually the test bot fails on your first commit (we require no test failures after each commit, not just at the end of the series)
so the preferred approach is to mark the failing tests with todo_wine in the first commit (which shows what isn't working yet), and remove the todo_wine markers in the second commit (once the implementation is actually committed)
the different results you get on win7 should be wrapped in 'broken(...)' inside the ok() condition
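
for illustration, here is a rough standalone sketch of how the two markers behave. it does not use the real macros from wine/test.h: check(), running_on_wine and run_check() are made-up stand-ins, and the values 42 (expected) and 7 (win7 result) are invented, so treat it as an approximation of the semantics rather than the actual implementation

```c
#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-ins for illustration only; the real ok(), todo_wine and
 * broken() live in Wine's include/wine/test.h and do more than this. */
static bool running_on_wine;   /* hypothetical platform flag */
static int failures;           /* counts results the test bot would reject */

/* Accept a known-wrong Windows result, but never accept it on Wine. */
static bool broken(bool condition)
{
    return !running_on_wine && condition;
}

/* 'todo' plays the role of a todo_wine prefix in front of ok(). */
static void check(bool todo, bool condition, const char *msg)
{
    if (todo && running_on_wine)
    {
        /* A marked test is expected to fail on Wine; passing is an error,
         * which is why the marker has to go in the fixing commit. */
        if (condition)
        {
            failures++;
            printf("succeeded inside todo: %s\n", msg);
        }
    }
    else if (!condition)
    {
        failures++;
        printf("failed: %s\n", msg);
    }
}

/* Runs one check against 'got' and returns the number of failures.
 * 42 is the expected value and 7 the win7 result (both made up).
 * Rough equivalent in a real test:
 *     todo_wine
 *     ok(got == 42 || broken(got == 7), "got %d\n", got);            */
static int run_check(bool on_wine, bool todo, int got)
{
    running_on_wine = on_wine;
    failures = 0;
    check(todo, got == 42 || broken(got == 7), "unexpected value");
    return failures;
}
```

with this sketch, run_check(true, true, 0) reports no failure (first commit: not implemented yet, but marked), while run_check(true, true, 42) reports one (the fix went in but the marker was left behind). run_check(false, false, 7) passes because the win7 value is tolerated on Windows, and run_check(true, false, 7) fails because the same value is not accepted on Wine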