Re: [PATCH 6/6] wined3d: Do not warn about WINED3DUSAGE_WRITEONLY.
On 3 June 2014 09:41, Stefan Dösinger <stefan(a)codeweavers.com> wrote:
> As the previous tests show, we can't do anything with this flag.

Sort of. It may make sense to set WINED3D_BUFFER_DOUBLEBUFFER if WINED3DUSAGE_WRITEONLY isn't set.
On 04.06.2014 at 16:02, Henri Verbeet <hverbeet(a)gmail.com> wrote:
> On 3 June 2014 09:41, Stefan Dösinger <stefan(a)codeweavers.com> wrote:
> > As the previous tests show, we can't do anything with this flag.
> Sort of. It may make sense to set WINED3D_BUFFER_DOUBLEBUFFER if WINED3DUSAGE_WRITEONLY isn't set.

Why would we want to do that? The only thing I can think of is if there's a GL driver where reading back the buffer object has the same performance characteristics as reading back a WRITEONLY resource on Windows. So far I haven't seen an application / driver combination with such an issue, but I didn't really look for one either.
On 4 June 2014 18:35, Stefan Dösinger <stefandoesinger(a)gmail.com> wrote:
> On 04.06.2014 at 16:02, Henri Verbeet <hverbeet(a)gmail.com> wrote:
> > Sort of. It may make sense to set WINED3D_BUFFER_DOUBLEBUFFER if WINED3DUSAGE_WRITEONLY isn't set.
> Why would we want to do that? The only thing I can think of is if there's a GL driver where reading back the buffer object has the same performance characteristics as reading back a WRITEONLY resource on Windows. So far I haven't seen an application / driver combination with such an issue, but I didn't really look for one either.
I could imagine buffers being moved from VRAM to GART when they're mapped, which would then make subsequent draws potentially slower. Dynamic buffers are more or less expected to be in GART, but we want static buffers to be in VRAM as much as possible. I could also perhaps imagine the driver keeping a copy in CPU memory instead, but that would then use up address space.
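
To make the suggestion concrete, here is a minimal sketch of requesting double buffering whenever the application did not promise write-only access. The constant values and the helper name `sketch_buffer_flags()` are placeholders made up for illustration; they are not the wined3d definitions, and this is not a claim about how buffer_init() is structured.

```c
#include <stdint.h>

/* Placeholder values for illustration only. The real definitions live in
 * the wined3d sources and may differ. */
#define SKETCH_USAGE_WRITEONLY     0x00000008u
#define SKETCH_BUFFER_DOUBLEBUFFER 0x00000004u

/* Hypothetical helper: add the double buffer flag to the internal buffer
 * flags unless the application created the buffer as write-only. With a
 * system memory copy, read-back maps never have to touch the buffer
 * object, so it can stay wherever the driver placed it. */
static uint32_t sketch_buffer_flags(uint32_t usage, uint32_t flags)
{
    if (!(usage & SKETCH_USAGE_WRITEONLY))
        flags |= SKETCH_BUFFER_DOUBLEBUFFER;
    return flags;
}
```

The trade-off is the one mentioned above: the extra copy costs address space, in exchange for keeping static buffers in VRAM.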
On 2014-06-04 19:21, Henri Verbeet wrote:
> I could imagine buffers being moved from VRAM to GART when they're mapped, which would then make subsequent draws potentially slower. Dynamic buffers are more or less expected to be in GART, but we want static buffers to be in VRAM as much as possible. I could also perhaps imagine the driver keeping a copy in CPU memory instead, but that would then use up address space.

I guess those things are possible, but at the moment hypothetical. I don't think we should keep printing the FIXME because of them. If we print anything it would be better to write a FIXME if DYNAMIC is set, but WRITEONLY isn't.

(Yes, I am aware of some buffer handling performance problems on the Nvidia driver, but as far as I understand them they are not about VRAM vs GART placement, but about unneeded synchronization.)
On 5 June 2014 12:32, Stefan Dösinger <stefandoesinger(a)gmail.com> wrote:
> On 2014-06-04 19:21, Henri Verbeet wrote:
> > I could imagine buffers being moved from VRAM to GART when they're mapped, which would then make subsequent draws potentially slower. Dynamic buffers are more or less expected to be in GART, but we want static buffers to be in VRAM as much as possible. I could also perhaps imagine the driver keeping a copy in CPU memory instead, but that would then use up address space.
> I guess those things are possible, but at the moment hypothetical. I don't think we should keep printing the FIXME because of them. If we print anything it would be better to write a FIXME if DYNAMIC is set, but WRITEONLY isn't.
I suppose you could replace the current FIXME with some kind of d3d_perf WARN in buffer_init(). I'd still prefer that the impact of the flag be benchmarked in various scenarios.
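
As a rough illustration of the check being discussed (warn when DYNAMIC is set but WRITEONLY isn't), something along these lines would do. It is only a sketch: `fprintf()` stands in for a d3d_perf WARN, the constants are placeholders chosen to resemble the D3D9 usage flags, and `sketch_warn_buffer_usage()` is a hypothetical name, not an existing wined3d function.

```c
#include <stdint.h>
#include <stdio.h>

/* Placeholder usage flags for illustration only. */
#define SKETCH_USAGE_WRITEONLY 0x00000008u
#define SKETCH_USAGE_DYNAMIC   0x00000200u

/* Hypothetical check for buffer creation: returns 1 and prints a
 * performance warning if the buffer is dynamic but not write-only,
 * otherwise returns 0 and prints nothing. */
static int sketch_warn_buffer_usage(uint32_t usage)
{
    if ((usage & SKETCH_USAGE_DYNAMIC) && !(usage & SKETCH_USAGE_WRITEONLY))
    {
        fprintf(stderr, "perf: dynamic buffer created without WRITEONLY (usage %#x).\n",
                (unsigned int)usage);
        return 1;
    }
    return 0;
}
```

Static buffers without WRITEONLY are deliberately left alone here, since the thread has no data on them yet.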
On 05.06.2014 at 12:40, Henri Verbeet <hverbeet(a)gmail.com> wrote:
> I suppose you could replace the current FIXME with some kind of d3d_perf WARN in buffer_init(). I'd still prefer that the impact of the flag be benchmarked in various scenarios.

I have done some basic benchmarking of the WRITEONLY flag in combination with DYNAMIC on Windows. The short summary: D3DUSAGE_WRITEONLY has no impact on Nvidia. On AMD GPUs not setting D3DUSAGE_WRITEONLY makes the common CPU->GPU streaming use case slower. If the application maps the buffer with D3DLOCK_READONLY or even reads back its contents, not setting D3DUSAGE_WRITEONLY improves performance considerably.

All tests were run on Windows 7. I have not tested this on Intel.

This is the raw data. The values are frames per second. The GPU is mostly idle in my test application. "draw" means the common write-only use case of buffers where data is written with DISCARD or NOOVERWRITE maps. "read" writes data the usual way, draws, then performs a read-only map and copies the data from the buffer into a separate block of memory. "lock only" behaves like read, but does not perform the memcpy.

                    dynamic    dynamic | writeonly
Geforce 650m
    draw            925        980
    read            1.4        1.4
    lock only       390        385
X1600
    draw            167        220
    read            45         1.69
    lock only       159        11.24
hd5770
    draw            157        345
    read            40         0.39
    lock only       145        30
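
To put numbers on "slower" and "considerably", the ratio of the two columns can be computed directly from the table. The sketch below only restates the figures above; the struct and function names are made up for illustration. A ratio above 1 means setting WRITEONLY was faster, below 1 means it was slower.

```c
#include <stddef.h>
#include <stdio.h>

struct fps_sample
{
    const char *gpu;
    const char *test;
    double dynamic_fps;           /* D3DUSAGE_DYNAMIC only */
    double dynamic_writeonly_fps; /* D3DUSAGE_DYNAMIC | D3DUSAGE_WRITEONLY */
};

/* Frames per second, copied from the table in this message. */
static const struct fps_sample fps_samples[] =
{
    {"Geforce 650m", "draw",      925.0, 980.0},
    {"Geforce 650m", "read",        1.4,   1.4},
    {"Geforce 650m", "lock only", 390.0, 385.0},
    {"X1600",        "draw",      167.0, 220.0},
    {"X1600",        "read",       45.0,   1.69},
    {"X1600",        "lock only", 159.0,  11.24},
    {"hd5770",       "draw",      157.0, 345.0},
    {"hd5770",       "read",       40.0,   0.39},
    {"hd5770",       "lock only", 145.0,  30.0},
};

/* Frame rate with WRITEONLY divided by the frame rate without it. */
static double writeonly_ratio(const struct fps_sample *s)
{
    return s->dynamic_writeonly_fps / s->dynamic_fps;
}

/* Print one line per GPU and test with the computed ratio. */
static void print_writeonly_ratios(void)
{
    size_t i;

    for (i = 0; i < sizeof(fps_samples) / sizeof(fps_samples[0]); ++i)
        printf("%-12s %-9s %.3f\n", fps_samples[i].gpu, fps_samples[i].test,
               writeonly_ratio(&fps_samples[i]));
}
```

For example, the hd5770 "draw" case comes out at 345 / 157, a bit over twice as fast with WRITEONLY, while its "read" case is 0.39 / 40, roughly a hundred times slower.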
On 4 July 2014 21:27, Stefan Dösinger <stefandoesinger(a)gmail.com> wrote:
> I have done some basic benchmarking of the WRITEONLY flag in combination with DYNAMIC on Windows. The short summary: D3DUSAGE_WRITEONLY has no impact on Nvidia. On AMD GPUs not setting D3DUSAGE_WRITEONLY makes the common CPU->GPU streaming use case slower. If the application maps the buffer with D3DLOCK_READONLY or even reads back its contents, not setting D3DUSAGE_WRITEONLY improves performance considerably.
>
> All tests were run on Windows 7. I have not tested this on Intel.
>
> This is the raw data. The values are frames per second. The GPU is mostly idle in my test application. "draw" means the common write-only use case of buffers where data is written with DISCARD or NOOVERWRITE maps. "read" writes data the usual way, draws, then performs a read-only map and copies the data from the buffer into a separate block of memory. "lock only" behaves like read, but does not perform the memcpy.
>
>                     dynamic    dynamic | writeonly
> Geforce 650m
>     draw            925        980
>     read            1.4        1.4
>     lock only       390        385
> X1600
>     draw            167        220
>     read            45         1.69
>     lock only       159        11.24
> hd5770
>     draw            157        345
>     read            40         0.39
>     lock only       145        30

Actually, another guess would be that the driver will hand you a write-combined mapping if D3DUSAGE_WRITEONLY is set. (And just always on NVIDIA.) In theory that would be visible through VirtualQuery(). Of course OpenGL doesn't really allow you to control that, other than implicitly through GL_MAP_WRITE_BIT / GL_MAP_READ_BIT and usage hints.
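
For the GL side, the implicit control mentioned above could look roughly like the following: pick the glMapBufferRange() access bits from the D3D usage and lock flags. This is a sketch, not a description of what wined3d does. The constants repeat the d3d9 and OpenGL header values under `SKETCH_` names so the snippet compiles on its own, and `sketch_gl_access()` is a hypothetical helper.

```c
#include <stdint.h>

/* D3D9 lock flags, repeated here under sketch names. */
#define SKETCH_D3DLOCK_READONLY    0x00000010u
#define SKETCH_D3DLOCK_NOOVERWRITE 0x00001000u
#define SKETCH_D3DLOCK_DISCARD     0x00002000u

/* glMapBufferRange() access bits, repeated here under sketch names. */
#define SKETCH_GL_MAP_READ_BIT              0x0001u
#define SKETCH_GL_MAP_WRITE_BIT             0x0002u
#define SKETCH_GL_MAP_INVALIDATE_BUFFER_BIT 0x0008u
#define SKETCH_GL_MAP_UNSYNCHRONIZED_BIT    0x0020u

/* Hypothetical mapping from D3D flags to GL access bits. The read bit is
 * never combined with the invalidate or unsynchronized bits, because
 * glMapBufferRange() rejects that combination. */
static uint32_t sketch_gl_access(int usage_writeonly, uint32_t lock_flags)
{
    if (lock_flags & SKETCH_D3DLOCK_DISCARD)
        return SKETCH_GL_MAP_WRITE_BIT | SKETCH_GL_MAP_INVALIDATE_BUFFER_BIT;
    if (lock_flags & SKETCH_D3DLOCK_NOOVERWRITE)
        return SKETCH_GL_MAP_WRITE_BIT | SKETCH_GL_MAP_UNSYNCHRONIZED_BIT;
    if (lock_flags & SKETCH_D3DLOCK_READONLY)
        return SKETCH_GL_MAP_READ_BIT;
    /* Plain map: only ask for read access if the application did not
     * promise write-only usage at creation time. */
    return usage_writeonly ? SKETCH_GL_MAP_WRITE_BIT
            : (SKETCH_GL_MAP_READ_BIT | SKETCH_GL_MAP_WRITE_BIT);
}
```

Whether a driver actually returns a write-combined mapping when the read bit is absent is up to the driver; the access bits are only a hint in that respect.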
On 2014-07-07 14:09, Henri Verbeet wrote:
> Actually, another guess would be that the driver will hand you a write-combined mapping if D3DUSAGE_WRITEONLY is set. (And just always on NVIDIA.) In theory that would be visible through VirtualQuery().

I did a quick test, and this is exactly what happens.
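
For reference, a check of this kind can be sketched as follows. This is not the actual test that was run, just a minimal example assuming the pointer comes from a successful buffer lock. VirtualQuery(), MEMORY_BASIC_INFORMATION and PAGE_WRITECOMBINE are the real Win32 names; the `sketch_` helpers are hypothetical, and the fallback define only exists so the snippet compiles outside Windows.

```c
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Fallback so the sketch compiles elsewhere; same value as in winnt.h. */
#define PAGE_WRITECOMBINE 0x400
#endif

/* Returns 1 if a page protection value has the write-combine modifier. */
static int sketch_protect_is_write_combined(unsigned long protect)
{
    return (protect & PAGE_WRITECOMBINE) != 0;
}

#ifdef _WIN32
/* Returns 1 if the memory at ptr is mapped write-combined, 0 if it is
 * not, and -1 if VirtualQuery() fails. ptr would be the pointer handed
 * back by locking the buffer. */
static int sketch_mapping_is_write_combined(const void *ptr)
{
    MEMORY_BASIC_INFORMATION info;

    if (!VirtualQuery(ptr, &info, sizeof(info)))
        return -1;
    return sketch_protect_is_write_combined(info.Protect);
}
#endif
```

Comparing the result for a buffer created with and without D3DUSAGE_WRITEONLY on each vendor's driver shows whether the flag changes the kind of mapping.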
participants (2)
- Henri Verbeet
- Stefan Dösinger