Although this shows that it's feasible, I'm going to leave the proper unwinding for a later MR. I think that not calling `pthread_exit` is a good first step: it doesn't call the pthread cancellation cleanup handlers and may leave objects in an invalid state, but it still solves the mentioned bugs.
Proper unwinding is a bit more complicated. For instance, the version I pushed earlier didn't handle legacy code with threads created without a syscall frame, which can nonetheless use pthread condition variables in mixed unix / win32 code (Proton has such things, for example).
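
To illustrate what skipping `pthread_exit` gives up, here is a minimal standalone sketch of the POSIX behaviour: a handler registered with `pthread_cleanup_push` runs when the thread terminates through `pthread_exit`. This is not code from the MR; `cleanup_handler`, `thread_proc`, `cleanup_ran` and `cleanup_runs_on_pthread_exit` are hypothetical names used only for the example. A thread that exits without going through `pthread_exit` would leave such handlers uncalled.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical flag, set by the cleanup handler when it runs. */
static int cleanup_ran;

/* Hypothetical cleanup handler: records that it was invoked. */
static void cleanup_handler(void *arg)
{
    *(int *)arg = 1;
}

static void *thread_proc(void *arg)
{
    pthread_cleanup_push(cleanup_handler, arg);
    /* pthread_exit terminates the thread and runs the pending cleanup handlers. */
    pthread_exit(NULL);
    /* Never reached; required because push/pop must be lexically paired. */
    pthread_cleanup_pop(0);
    return NULL;
}

/* Returns 1 if the handler ran, 0 if it did not, -1 if the thread
 * could not be created or joined. */
static int cleanup_runs_on_pthread_exit(void)
{
    pthread_t thread;

    cleanup_ran = 0;
    if (pthread_create(&thread, NULL, thread_proc, &cleanup_ran)) return -1;
    if (pthread_join(thread, NULL)) return -1;
    return cleanup_ran;
}
```

Calling `cleanup_runs_on_pthread_exit()` from a test program built with `-pthread` should return 1, since the handler runs as part of `pthread_exit`.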