Re: d3dx9: Avoid expensive computations

26 Feb 2013


On Mon, Feb 25, 2013 at 11:08:02AM +0100, Henri Verbeet wrote:
> On 25 February 2013 10:24, Rico Schüller <kgbricola@web.de> wrote:
> > I did some small tests for speed with the following results. You
> > could also avoid the many variable assignments like *pout = out by
> > using 4 vecs instead. This should save ~48 assignments and should
> > also improve the speed a bit more (~10%). Native is still 40%
> > faster than that, though.
> I'd somewhat expect native to use SSE versions of this kind of thing
> when the CPU supports those instructions. You also generally want to
> pay attention to the order in which you access memory, although
> perhaps it doesn't matter so much here because an entire matrix should
> be able to fit in a single cacheline, provided it's properly aligned.

Also make sure that any memory that is written to can't be aliased
by the memory that is read.
If aliasing is possible the compiler has to sequence the code to
ensure the writes and reads happen in the correct order.
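
To illustrate the aliasing point, here is a minimal sketch in C99. It is
not the d3dx9 code: the mat4 type and both function names are made up
for this example. The restrict qualifier is the caller's promise that
the output does not overlap either input, which leaves the compiler free
to keep the inputs in registers and reorder loads and stores. A wrapper
that computes into a local temporary keeps in-place calls safe.

```c
/* Hypothetical 4x4 row-major matrix type, for illustration only
 * (not the real D3DXMATRIX definition). */
typedef struct { float m[4][4]; } mat4;

/* restrict: the caller promises that out overlaps neither a nor b.
 * a and b may still point at the same matrix, since both are only read. */
static void mat4_mul_noalias(mat4 *restrict out,
                             const mat4 *restrict a,
                             const mat4 *restrict b)
{
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            out->m[i][j] = a->m[i][0] * b->m[0][j]
                         + a->m[i][1] * b->m[1][j]
                         + a->m[i][2] * b->m[2][j]
                         + a->m[i][3] * b->m[3][j];
}

/* Alias-safe entry point: the result goes to a local first, so
 * out may be the same matrix as a and/or b. */
static void mat4_mul_any(mat4 *out, const mat4 *a, const mat4 *b)
{
    mat4 tmp;
    mat4_mul_noalias(&tmp, a, b);
    *out = tmp;
}
```

With this split, mat4_mul_any(&m, &m, &m) is fine, while calling
mat4_mul_noalias with out equal to an input would break the promise and
is undefined behaviour.
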
That function probably has too many live values for the normal
registers - so some values will get spilled to the stack.
SSE might be better.
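
As a rough idea of what an SSE version could look like, here is a sketch
of a 4x4 row-major multiply using the intrinsics from <xmmintrin.h>. The
mat4 type and the function name are assumptions for this example, and
this is not a claim about how native does it. Each output row is built
in one 128-bit register, so the 16 results never occupy 16 separate
scalar registers. A scalar fallback is used where SSE is unavailable.

```c
#if defined(__SSE__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define EXAMPLE_HAVE_SSE 1
#else
#define EXAMPLE_HAVE_SSE 0
#endif

/* Hypothetical 4x4 row-major matrix type, for illustration only. */
typedef struct { float m[4][4]; } mat4;

/* out = a * b. out may alias a or b: every input is read before the
 * first store. */
static void mat4_mul_sse(mat4 *out, const mat4 *a, const mat4 *b)
{
#if EXAMPLE_HAVE_SSE
    /* Unaligned loads, so the matrices need no special alignment. */
    __m128 b0 = _mm_loadu_ps(b->m[0]);
    __m128 b1 = _mm_loadu_ps(b->m[1]);
    __m128 b2 = _mm_loadu_ps(b->m[2]);
    __m128 b3 = _mm_loadu_ps(b->m[3]);
    __m128 rows[4];

    for (int i = 0; i < 4; i++) {
        /* Row i of the result is a[i][0]*b0 + ... + a[i][3]*b3. */
        __m128 r = _mm_mul_ps(_mm_set1_ps(a->m[i][0]), b0);
        r = _mm_add_ps(r, _mm_mul_ps(_mm_set1_ps(a->m[i][1]), b1));
        r = _mm_add_ps(r, _mm_mul_ps(_mm_set1_ps(a->m[i][2]), b2));
        r = _mm_add_ps(r, _mm_mul_ps(_mm_set1_ps(a->m[i][3]), b3));
        rows[i] = r;
    }
    for (int i = 0; i < 4; i++)
        _mm_storeu_ps(out->m[i], rows[i]);
#else
    /* Scalar fallback with the same aliasing guarantee. */
    mat4 tmp;
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            tmp.m[i][j] = a->m[i][0] * b->m[0][j]
                        + a->m[i][1] * b->m[1][j]
                        + a->m[i][2] * b->m[2][j]
                        + a->m[i][3] * b->m[3][j];
    *out = tmp;
#endif
}
```

If the matrices are known to be 16-byte aligned, _mm_load_ps and
_mm_store_ps could replace the unaligned variants. Whether any of this
beats the compiler's own output would need measuring.
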
David
-- 
David Laight: david@l8s.co.uk
