On Fri Apr 4 15:55:02 2025 +0000, Connor McAdams wrote:
Correct. No two runs are identical in either case, but it seems to be within the same variance as before the function was split off.
For what it's worth, the difference doesn't seem large enough for me to consider `FORCEINLINE` necessary here[*]. What I feared was that we'd fall off a performance cliff with the additional function call overhead, but that definitely doesn't seem to be the case from your tests.
The gap with native is significant, though. No need to tackle that until necessary, but it's something to keep in mind. BTW, what are the time units in the tables above? Milliseconds?
[*]: There are other possible concerns with inlining (e.g. additional cache pressure) that are hard to evaluate in a standalone test and might discourage it, all things being (roughly) equal otherwise.