 
`mach_continuous_approximate_time()` has the necessary precision for win32 ticks and can be up to 4x faster than `mach_continuous_time()`.
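For illustration only, here is a minimal sketch of how a Mach time value could be scaled to 100 ns win32 ticks. It is not the code from this MR: the helper names `mach_units_to_ticks` and `monotonic_ticks_approx` are hypothetical, and the macOS-only part assumes a 10.12+ SDK.

```c
#include <stdint.h>

#ifdef __APPLE__
#include <mach/mach_time.h>
#endif

/* Hypothetical helper: convert Mach time units to 100 ns ticks.
 * numer/denom are the scale factors reported by mach_timebase_info(),
 * i.e. nanoseconds = units * numer / denom. */
static uint64_t mach_units_to_ticks( uint64_t units, uint32_t numer, uint32_t denom )
{
    /* Split the division to reduce the risk of overflow in units * numer. */
    uint64_t whole = units / denom;
    uint64_t rem = units % denom;
    uint64_t ns = whole * numer + rem * numer / denom;

    return ns / 100;
}

#ifdef __APPLE__
/* Hypothetical caller (macOS only): read the approximate continuous clock
 * and convert the result to ticks. */
static uint64_t monotonic_ticks_approx( void )
{
    mach_timebase_info_data_t tb;

    mach_timebase_info( &tb );
    return mach_units_to_ticks( mach_continuous_approximate_time(), tb.numer, tb.denom );
}
#endif
```

A real implementation would presumably query the timebase once and cache it, since it does not change while the system is running.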
Also, `clock_gettime( CLOCK_REALTIME, &ts )` calls always end up in `__commpage_gettimeofday( struct timeval *tp )`:
```
* frame #0: 0x00007ff806788763 libsystem_kernel.dylib`__commpage_gettimeofday
  frame #1: 0x00007ff8066709a3 libsystem_c.dylib`gettimeofday + 45
  frame #2: 0x00007ff806678b31 libsystem_c.dylib`clock_gettime + 117
```
The extra calls, setup, and conversion from one struct format to another cost another 60 CPU cycles; in my testing, avoiding them makes `NtQuerySystemTime` approximately 30% faster as well with this MR. This is a fairly hot code path (especially when using certain out-of-tree in-process synchronization patch sets), so the optimization is probably worthwhile here.
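To make the struct conversion concrete, here is a rough sketch of going straight from a `struct timeval` to 100 ns ticks since 1601, without a detour through `struct timespec`. It uses plain `gettimeofday()` so that it stays portable; it does not call `__commpage_gettimeofday` directly, and the names `timeval_to_ticks` and `system_time_ticks` are illustrative rather than taken from the patch.

```c
#include <stdint.h>
#include <stddef.h>
#include <sys/time.h>

#define TICKS_PER_SEC 10000000ULL
/* 100 ns ticks between 1601-01-01 and 1970-01-01 (11644473600 seconds). */
#define TICKS_1601_TO_1970 (11644473600ULL * TICKS_PER_SEC)

/* Illustrative helper: convert a Unix timeval to ticks since 1601. */
static uint64_t timeval_to_ticks( const struct timeval *tv )
{
    return (uint64_t)tv->tv_sec * TICKS_PER_SEC
           + (uint64_t)tv->tv_usec * 10   /* 1 us = 10 ticks */
           + TICKS_1601_TO_1970;
}

/* Illustrative caller: current wall-clock time in ticks. */
static uint64_t system_time_ticks( void )
{
    struct timeval tv;

    gettimeofday( &tv, NULL );
    return timeval_to_ticks( &tv );
}
```

Since a microsecond is exactly 10 ticks, the `timeval` result converts without any loss beyond its own resolution.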
All of these APIs have been available since macOS 10.12.
-- 
v2: ntdll: Replace '0' with 'NULL' in gettimeofday() calls.
    ntdll: Use __commpage_gettimeofday in NtQuerySystemTime on macOS.
    ntdll: Always use mach_continuous_approximate_time on macOS.