On Sunday, 22 October 2006, at 10:31, you wrote:
Warning: a lot of ranting follows. For the specific answers to ivg2's concerns, see the end of the mail.
n0dalus wrote:
On 10/22/06, Ivan Gyurdiev ivg231@gmail.com wrote:
Constant is convenient, but if it can't meet all necessary requirements, I wouldn't hesitate to drop the idea - never compromise on design in favor of C optimizations. Tomorrow's hardware will make any non-algorithmic optimizations irrelevant.
While this is true for most things, it shouldn't be applied in all cases. For things like graphics processing, I would say every bit of optimization is worth it, even at the expense of a little design flexibility.
Bah.. excuses for bad design. Constant-time access is important, but you need to index on the right thing - see other mail.
Which leads to stuff like Java or .NET, where you need ~100 MB of runtime to run a hello world app. Bigger Java / .NET apps feel bloated even on shiny new hardware (my personal impression, I have no statistics).
Keep in mind that having everyone in the world constantly upgrading their hardware because of attitudes like this is not sustainable --
Sure it is, my computer at work disagrees w/ you.
Well, the thing is that I want yesterday's games to run on yesterday's hardware. So upgrading hardware to improve performance is *not* an option.
Linux does a good job of running properly even on old hardware. I have an up-to-date Gentoo setup on my old notebook (120 MHz, 16 MB RAM) set up as a router, and it runs as well as the old SuSE 6.4 I used to have on that notebook. I think we shouldn't waste that potential :-)
Well, my aims regarding performance are basically to run the games as fast as on Windows on the same hardware, and to run games on hardware that fulfills the games' minimum requirements. Some more specific targets:
Run Half-Life 2 at >60 fps in dxlevel 90 with all stuff enabled on my gf7600 amd64 dual core box and on an Intel Mac (Core Duo ~2 GHz, Radeon X1600) (*)
Run Half-Life 2 at a playable speed on my notebook (1.6 GHz Pentium M, Radeon M9) (**)
Get 14000 3DMarks on my notebook in 3DMark2000
Run older stuff (Tomb Raider 3, Moto Racer 2, Empire Earth, Diablo 2) on my brother's notebook (700 MHz, 128 MB RAM, ATI Mach64 :-D (***) )
(*) Might be impossible because Mac OS X is bloat and only runs that nicely because Apple uses superior hardware
(**) Needs better ATI drivers :-o
(***) Easy because the drivers are way better than the Windows ones.
a far better future would be where a standard computer is cheaper, needs less power, produces less noise and heat, and just does its job.
Why? Just like you upgrade software to get new features and solve problems, you should upgrade hardware for the same purpose. Some problems are best solved in the hardware, rather than wasting programmers' time. What is it with software developers and old computers ?
Well, as we all know, upgrading is enforced the Wintel way via file formats and so on. The usual users use Office 2003 for the same things they used Office 95 for, but try to do anything proper with Office 95 these days ;-)
I dislike that attitude, so I try to make newer versions of my own code run as fast as or faster than older versions on the same hardware with the same features. In the commercial world, upgrading hardware is preferred over writing fast code because hardware upgrades are cheaper and easier, as long as the cost stays within acceptable bounds. I have seen commercial software development in a company (I'm not talking about cw here), and I have known open source development for 3 years, and the open source model with little to no financial pressure offers a good way to develop proper software instead of upgrading hardware :-)
--------rant end-----------
Ivan, your concerns about the flexibility of a static table are right. The const pixel format table has hit limits with CheckDeviceFormat and sRGB textures. The solution for CheckDeviceFormat was to take it out of the table, and for sRGB Phil Costin decided to use a second table for sRGB formats. However, I do not think that more dynamic things like a switch-case statement or a non-const table would change anything. CheckDeviceFormat would still stay as it is, and for sRGB we would need the same kind of switch as with the const table.
Regarding the sampler states, I claim that the biggest problem wouldn't be solved by a changeable state table or by the code handling it. The problem is that sampler settings are per texture in GL and per device in D3D (unless I am completely misinformed). The hardest part of that is keeping track of changes. Render states and other things like matrices, shaders, viewport, ..., do not have that problem.
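
To make the table argument a bit more concrete, here is a minimal sketch of a const format table with a second sRGB table next to it. All names (fmt_id, gl_fmt, format_desc, get_format_desc) and the two formats shown are made up for illustration and are not the real wined3d identifiers or GL constants. The point is only that the sRGB decision is a single branch on top of a constant-time lookup, and a switch-based version would need the same branch.

```c
#include <assert.h>

/* Hypothetical format identifiers, not the real D3D or Wine enums. */
enum fmt_id { FMT_X8R8G8B8, FMT_A8R8G8B8, FMT_COUNT };

/* Hypothetical stand-ins for GL internal formats. */
enum gl_fmt { GLF_RGB8, GLF_RGBA8, GLF_SRGB8, GLF_SRGB8_ALPHA8 };

struct format_desc {
    enum gl_fmt internal;      /* internal format to create the texture with */
    unsigned bytes_per_pixel;  /* size of one pixel in the source data */
};

/* Main const table, indexed directly by fmt_id. */
static const struct format_desc format_table[FMT_COUNT] = {
    [FMT_X8R8G8B8] = { GLF_RGB8,  4 },
    [FMT_A8R8G8B8] = { GLF_RGBA8, 4 },
};

/* Second const table with the sRGB variants, same index. */
static const struct format_desc srgb_format_table[FMT_COUNT] = {
    [FMT_X8R8G8B8] = { GLF_SRGB8,        4 },
    [FMT_A8R8G8B8] = { GLF_SRGB8_ALPHA8, 4 },
};

/* Constant-time lookup; the sRGB choice is the only branch. */
static const struct format_desc *get_format_desc(enum fmt_id id, int srgb)
{
    assert((unsigned)id < FMT_COUNT);
    return srgb ? &srgb_format_table[id] : &format_table[id];
}
```

Something like CheckDeviceFormat depends on runtime capabilities, so it would stay outside of both tables in this sketch as well.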
What the current code basically does is:
-> bind texture
-> apply sampler states
A first improvement would be:
-> if the texture is different, bind the new texture
-> check each sampler state against the old values for the texture; if different, apply the new state
I am afraid that there is not much improvement possible over that.
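
The improved scheme above could look roughly like the following sketch. It assumes that each texture remembers the sampler values last applied to it (per texture, as in GL), while the device holds the values the application asked for (per sampler, as in D3D). The structures, the three example states and the counters are hypothetical; the counters only stand in for the real GL calls so the effect is visible.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SAMPLERS 4
enum { SAMP_MINFILTER, SAMP_MAGFILTER, SAMP_ADDRESSU, SAMP_STATE_COUNT };

struct texture {
    /* Values last applied to this texture object (GL keeps them per texture). */
    unsigned applied[SAMP_STATE_COUNT];
};

struct device {
    /* Values requested by the application (D3D keeps them per sampler). */
    unsigned sampler_states[MAX_SAMPLERS][SAMP_STATE_COUNT];
    struct texture *wanted[MAX_SAMPLERS];  /* texture set by the application */
    struct texture *bound[MAX_SAMPLERS];   /* texture currently bound */
    unsigned bind_calls, state_calls;      /* stand-ins for real GL calls */
};

static void apply_sampler(struct device *dev, unsigned s)
{
    struct texture *tex;
    unsigned i;

    assert(s < MAX_SAMPLERS);
    tex = dev->wanted[s];
    if (dev->bound[s] != tex) {
        dev->bound[s] = tex;
        dev->bind_calls++;          /* real code would bind the texture here */
    }
    if (tex == NULL)
        return;                     /* nothing bound, no sampler state to set */

    for (i = 0; i < SAMP_STATE_COUNT; i++) {
        if (tex->applied[i] != dev->sampler_states[s][i]) {
            tex->applied[i] = dev->sampler_states[s][i];
            dev->state_calls++;     /* real code would set the parameter here */
        }
    }
}
```

Switching back to a texture that already carries the right values costs one bind and no state calls in this sketch, but the per-state comparison still runs every time, which is the part that is hard to get rid of.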
The approach with my constant table would be to group the sampler states for sampler X with the texture bound to sampler X, which would basically mean "if the sampler states or the texture changed, reapply (or verify) sampler and texture". It is also possible not to group them and to have the function handling the bound textures trigger an extra check for the sampler: "if the texture changed, apply the texture and apply/verify the sampler states", and "if the sampler states changed, reapply them (and do not care about the texture)".
Another difference between samplers and texture stage states, render states, ..., is that samplers are still effective with shaders, while the rest is part of the fixed pipeline. Thus samplers can be subject to change in DX10 and future versions, while the rest is pretty much in its final form (unless MS decides to reintroduce the fixed pipeline in DX11).
That said, I do not claim that my constant table is the ultimate solution. I have a patch for render states that is in principle a no-op: it moves them to a different file and replaces the switch statement with the table, but does no dirtifying yet. I will add shaders, matrices, ..., to it and see how it works out before sending patches. Samplers will not go into that table; I think we shouldn't try to mix apples and pears by force. I am rather thinking about a dirty list per supported sampler (including the bound texture) in addition to the dirty state list. SetTexture and SetSamplerState for a certain sampler will dirtify the pixel shader state, so that the source samplers used in the shader get verified.
Anyone who has other suggestions is free to make them concrete. Henri's trees turned out to be basically the same as my idea, except that Henri preferred a list for the dirty list while I originally planned an array. We decided to go for a list because of more efficient memory usage. (Or is there anything else I missed? Lionel talked about trees too, but I don't know what exactly his idea was.)
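
For illustration only, this is roughly what replacing the switch statement with a table plus dirtifying could look like. It is a sketch under assumptions, not the actual patch: the state names, the device layout, apply_generic and the plain flag array used for dirty tracking are all hypothetical.

```c
#include <assert.h>

/* Hypothetical render states; the real list is much longer. */
enum state_id { STATE_FOG, STATE_LIGHTING, STATE_ZENABLE, STATE_COUNT };

struct device;
typedef void (*apply_fn)(struct device *dev, enum state_id id);

struct device {
    unsigned values[STATE_COUNT];        /* values set by the application */
    unsigned applied[STATE_COUNT];       /* stand-in for the GL-side state */
    unsigned char is_dirty[STATE_COUNT]; /* set but not yet applied */
    unsigned apply_calls;                /* counts handler invocations */
};

/* One handler per state; a single generic one is enough for the sketch. */
static void apply_generic(struct device *dev, enum state_id id)
{
    dev->applied[id] = dev->values[id];
    dev->apply_calls++;
}

/* The const table that takes the place of the switch statement. */
static const apply_fn state_table[STATE_COUNT] = {
    [STATE_FOG]      = apply_generic,
    [STATE_LIGHTING] = apply_generic,
    [STATE_ZENABLE]  = apply_generic,
};

/* Setting a state only records the value and marks the state dirty. */
static void set_state(struct device *dev, enum state_id id, unsigned value)
{
    assert((unsigned)id < STATE_COUNT);
    dev->values[id] = value;
    dev->is_dirty[id] = 1;
}

/* Before drawing, apply each dirty state once through the table. */
static void flush_states(struct device *dev)
{
    unsigned i;
    for (i = 0; i < STATE_COUNT; i++) {
        if (dev->is_dirty[i]) {
            state_table[i](dev, (enum state_id)i);
            dev->is_dirty[i] = 0;
        }
    }
}
```

Setting the same state several times between two draws then costs only one handler call, which is the gain expected from dirtifying.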
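
As one possible reading of the list variant, here is a small sketch of a dirty list that only holds entries for states that are actually dirty. The node layout and the function names are assumptions for illustration, and the duplicate check is a simple linear scan over the queued entries; this is not what was agreed on in detail.

```c
#include <assert.h>
#include <stdlib.h>

/* One node per dirty state; clean states take no memory in the list. */
struct dirty_node {
    unsigned state;
    struct dirty_node *next;
};

/* Queue a state. Returns 1 if added, 0 if already queued, -1 on
 * allocation failure. */
static int dirty_list_add(struct dirty_node **head, unsigned state)
{
    struct dirty_node *n;

    assert(head != NULL);
    for (n = *head; n != NULL; n = n->next) {
        if (n->state == state)
            return 0;
    }
    n = malloc(sizeof(*n));
    if (n == NULL)
        return -1;
    n->state = state;
    n->next = *head;
    *head = n;
    return 1;
}

/* Hand every queued state to the optional callback, free the nodes and
 * return how many states were flushed. */
static unsigned dirty_list_flush(struct dirty_node **head,
                                 void (*apply)(unsigned state, void *ctx),
                                 void *ctx)
{
    unsigned count = 0;

    assert(head != NULL);
    while (*head != NULL) {
        struct dirty_node *n = *head;
        *head = n->next;
        if (apply != NULL)
            apply(n->state, ctx);
        free(n);
        count++;
    }
    return count;
}
```

Flushing walks only the dirty entries instead of every known state, at the price of an allocation per newly dirtied state in this simple form.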