Hi, the last thing I have to do for the profiling work is also the trickiest: deciding on the best way to measure the times.
To get times that are useful, I'm using the RDTSC opcode (which returns the number of CPU clock cycles since the processor was last reset). Basically, I've added a couple of defines to wine/debug.h:
#define GET_COUNTER(__COUNTER) __asm__ __volatile__ ( "rdtsc" : "=a" (__COUNTER.LowPart), "=d" (__COUNTER.HighPart) )
and there's another, GET_ELAPSED, which takes a start time and calculates the difference.
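For illustration only, here is a minimal sketch of how an elapsed-cycles calculation over such a counter could look. It is not the actual GET_ELAPSED define. The struct tsc_counter type and the helpers read_counter, counter_to_u64 and elapsed_cycles are hypothetical names; only the LowPart/HighPart fields and the rdtsc asm come from the macro above.

```c
#include <stdint.h>

/* Hypothetical stand-in for the 64-bit counter type; only the LowPart and
 * HighPart field names are taken from the GET_COUNTER macro. */
struct tsc_counter {
    uint32_t LowPart;
    uint32_t HighPart;
};

#if defined(__i386__) || defined(__x86_64__)
/* Same instruction and register constraints as GET_COUNTER:
 * rdtsc leaves the low 32 bits in EAX and the high 32 bits in EDX. */
static inline struct tsc_counter read_counter(void)
{
    struct tsc_counter c;
    __asm__ __volatile__ ("rdtsc" : "=a" (c.LowPart), "=d" (c.HighPart));
    return c;
}
#endif

/* Join the two 32-bit halves into one 64-bit cycle count. */
static inline uint64_t counter_to_u64(struct tsc_counter c)
{
    return ((uint64_t)c.HighPart << 32) | c.LowPart;
}

/* Cycles elapsed between two readings. Subtracting the combined 64-bit
 * values handles a carry from LowPart into HighPart correctly. */
static inline uint64_t elapsed_cycles(struct tsc_counter start,
                                      struct tsc_counter end)
{
    return counter_to_u64(end) - counter_to_u64(start);
}
```

On an x86 machine the intended use would be to call read_counter() on function entry and again on return, then pass both readings to elapsed_cycles().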
(BTW: I don't think these names are very good; I'm going to change them)
The elapsed value is output when the relay trace for the function return is printed.
I've also created a new script to analyse these to give you the average times, total times etc, and I've fixed tools/examine-relay to cope with the addition to the trace output.
The problem is that not all i386-derived CPUs support this instruction. As far as I can see, there are three ways of determining whether it is available. Which would people prefer?
1) Check in /proc/cpuinfo for the "tsc" flag.
2) I could write a short amount of assembly to interrogate the chip itself for this feature (a couple of instructions).
3) Use the existing QueryPerformanceCounter().
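To make options (1) and (2) concrete, here is a rough sketch of each. It assumes a GCC or Clang toolchain, since it uses the compiler's <cpuid.h> helper __get_cpuid() rather than hand-written assembly. The function names cpu_has_tsc_cpuid, flags_have_token and cpuinfo_has_tsc are made up for this example. The TSC feature flag is bit 4 of EDX for CPUID leaf 1.

```c
#include <stdio.h>
#include <string.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC/Clang helper header, assumed to be available */
#endif

/* Option (2): ask the CPU directly. __get_cpuid() returns 0 if the
 * requested leaf (or CPUID itself) is not supported. */
static int cpu_has_tsc_cpuid(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return (edx >> 4) & 1;   /* CPUID.1:EDX bit 4 = TSC */
#else
    return 0;                /* not an x86 build */
#endif
}

/* Return 1 if 'token' appears as a whole whitespace-separated word in
 * 'flags', so that "tsc" does not match "rdtscp" or "constant_tsc". */
static int flags_have_token(const char *flags, const char *token)
{
    size_t len = strlen(token);
    const char *p = flags;
    while (*p) {
        while (*p == ' ' || *p == '\t' || *p == '\n') p++;
        const char *start = p;
        while (*p && *p != ' ' && *p != '\t' && *p != '\n') p++;
        if ((size_t)(p - start) == len && strncmp(start, token, len) == 0)
            return 1;
    }
    return 0;
}

/* Option (1): look for "tsc" on the first "flags" line of /proc/cpuinfo.
 * Linux-specific; returns 0 if the file cannot be read. A flags line
 * longer than the buffer would be split across reads. */
static int cpuinfo_has_tsc(void)
{
    char line[4096];
    int found = 0;
    FILE *f = fopen("/proc/cpuinfo", "r");
    if (!f)
        return 0;
    while (!found && fgets(line, sizeof line, f)) {
        if (strncmp(line, "flags", 5) == 0) {
            const char *colon = strchr(line, ':');
            if (colon)
                found = flags_have_token(colon + 1, "tsc");
        }
    }
    fclose(f);
    return found;
}
```

The CPUID route has no dependency on the host OS exposing /proc, while the /proc/cpuinfo route avoids any inline assembly or compiler-specific headers.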
Personally, I'm favouring (2)... but what do other people think?