On Fri Mar 22 11:53:45 2024 +0000, Giovanni Mascellani wrote:
- vkd3d_shader_cache_put() has a "flags" parameter, but
vkd3d_shader_cache_get() doesn't. Are we sure we'll never need it?
I can add one without defining any valid flags for now. I have no
particular preference either way. However, see below on your question regarding eviction - we could potentially drop the flags from put() as well. One advantage of not exposing anything in a public interface for the moment is that we can let the interfaces evolve a bit before we commit to anything.
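For illustration, a symmetric get() could take a reserved flags argument that must be zero until any flags are defined. The signature, error code, and names below are a hypothetical sketch based on this discussion, not the actual vkd3d interface:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error code and opaque type, for this sketch only. */
#define VKD3D_E_INVALIDARG (-1)
struct vkd3d_shader_cache;

/* A get() mirroring put(): the reserved "flags" argument must be zero
 * until actual flags are defined, which keeps room to extend later
 * without breaking the signature. */
static int vkd3d_shader_cache_get(struct vkd3d_shader_cache *cache,
        const void *key, size_t key_size,
        void *value, size_t *value_size, uint32_t flags)
{
    (void)cache; (void)key; (void)key_size; (void)value; (void)value_size;
    if (flags)
        return VKD3D_E_INVALIDARG; /* no flags defined yet */
    return 0;
}
```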
- Presumably "mem_size" and "disk_size" in struct
vkd3d_shader_cache_info are specified in bytes. Being uint32_t's that would make their maximum size about 4 GiB. And while you can store a fair amount of data in 4 GiB, it's perhaps not entirely inconceivable to go over that limit these days. Should we use size_t for these?
32 bit sizes follow Microsoft's API. If we want to make 64 bit limits
possible I'd prefer uint64_t over size_t: if we use size_t we have to store 64 bit sizes in the disk files anyway, and we'd have to beware of situations where a cache written by a 64 bit process is opened in a 32 bit one. I'd agree with using `uint64_t` rather than `size_t`.
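As a sketch of what fixed 64-bit sizes would look like - the field names follow the discussion, but this is not the actual vkd3d header:

```c
#include <stdint.h>

/* Sketch: sizes fixed at 64 bits both in memory and in the disk file,
 * independent of the host's pointer width.  Field names follow the
 * discussion above; this is not the actual vkd3d header. */
struct vkd3d_shader_cache_info
{
    uint64_t mem_size;  /* maximum in-memory size, in bytes */
    uint64_t disk_size; /* maximum on-disk size, in bytes */
};

/* A 32 bit process opening a cache written by a 64 bit one must check
 * that a stored size is representable in its own size_t before using
 * it for allocations. */
static int cache_size_representable(uint64_t size)
{
    return size <= (uint64_t)SIZE_MAX;
}
```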
- Do we want to give the application control over when the cache
should be flushed to disk? Probably not a requirement for the initial version, but it seems useful to have options other than "the cache gets serialised to disk when the cache is closed" and "any shader cache API call may cause the cache to get serialised to disk".
Yes, I think we want that. The second game I tested in wined3d (Mass
Effect 2) already ran into problems there - it just calls exit() rather than closing the device properly.
The default behavior should probably be something more reliable than
"flush only on close": either flush after every n writes, have a worker thread flush after a period of inactivity (although I don't like the idea of spawning a thread), or flush from an _onexit() callback. A worker thread wouldn't seem inappropriate to me. It has the advantage that writes are quick for the application (because they don't immediately hit the disk) and can be pooled into batches (which is considerably faster for the database, because it only syncs at the end of the transaction).
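The "flush after every n writes" option could be as simple as a counter. A minimal sketch, with the interval and all names chosen purely for illustration:

```c
/* Illustrative flush policy: request a flush every FLUSH_INTERVAL
 * writes instead of only when the cache is closed.  The interval is
 * arbitrary here. */
#define FLUSH_INTERVAL 64

struct cache_write_state
{
    unsigned int writes_since_flush;
};

/* Record one write; returns nonzero when the caller should serialise
 * the cache to disk now. */
static int cache_note_write(struct cache_write_state *state)
{
    if (++state->writes_since_flush < FLUSH_INTERVAL)
        return 0;
    state->writes_since_flush = 0;
    return 1;
}
```

This also survives a game that calls exit() without closing the device: at most the last FLUSH_INTERVAL - 1 writes are lost.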
- Suppose I want to run two instances of e.g. vkd3d-gears
simultaneously. How should that work?
Good question. Windows refuses to open an on-disk cache from two
different processes with CreateShaderCacheSession. I haven't tested what happens with the driver's own cache, but I guess one of the processes will end up without access to the cache.
And yes, share checks aren't implemented right now. Windows has better
support for file locking than Unix, and the answer is probably somewhere in flock() or CreateFile's dwShareMode. Do we consider it valuable to replicate Windows' defect here? It seems that sharing the cache, at least for reading, is just better. One of the main reasons I'm pushing for SQLite (rather than implementing our own disk format) is precisely that I expect it to make dealing with scenarios like this much easier. Two processes can open the same SQLite database with no problem; each time one of them has new shaders to put in the cache, it can begin a transaction, INSERT the shaders and close the transaction. Every now and then the two processes can SELECT to refresh their in-memory idea of which shaders are available in the database, and when they need one it's just another SELECT. There's no further need to deal with OS-specific details (like locking) or to rewrite the whole file each time and then move it into place (which helps ensure that the database remains consistent, but means that writes by one concurrent writer are overwritten by the next).
- How should names get resolved for on-disk caches? Should "cache"
and "./cache" be considered the same cache?
I think we should leave it up to the file system semantics. And yes,
we might run into a situation where a caller tries to open "c:/Cache", "C:\cache" and "\?\c:\cAcHe" and it fails because the existing cache list doesn't catch that they refer to the same file. I'm not yet convinced that keeping a global list of all the open caches and deduplicating open calls at that level is a good idea, for either in-memory or disk-backed caches. My feeling is that the caller has better knowledge of what's actually going on, and is better placed to decide whether two caches are really the same cache. For example, I suppose that right now you rely on this mechanism to ensure that all the devices created within the same process use the same cache object. Why not open the cache just once, as soon as the first device needs it, and then keep the cache handle around in a global? The problem of comparing file paths then goes out of the window.
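The "open once, keep a global handle" idea could look like the sketch below. All names are illustrative, and a real implementation would guard the lazy initialisation with a lock:

```c
/* Sketch: the first device to need the cache opens it; later devices
 * reuse the same handle, so file paths are never compared.  All names
 * here are illustrative, and real code would take a mutex around the
 * lazy initialisation. */
#include <stddef.h>

struct vkd3d_shader_cache { int placeholder; };

static struct vkd3d_shader_cache cache_object; /* stand-in storage */
static struct vkd3d_shader_cache *global_cache; /* process-wide handle */

/* Stand-in for the real open call. */
static struct vkd3d_shader_cache *shader_cache_open(const char *path)
{
    (void)path;
    return &cache_object;
}

static struct vkd3d_shader_cache *device_get_cache(void)
{
    if (!global_cache)
        global_cache = shader_cache_open("vkd3d-shader-cache");
    return global_cache;
}
```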
Fwiw I extended my tests a bit - native limited the maximum cache disk size to 1 GB, although not the memory size. If I try to insert huge items (approx. 512 MB), StoreValue() crashes. At larger values (1 GB) it returns an error, and at even larger values (around 2 GB) it crashes again.
I think using 64 bit sizes is overengineering. It's easy to do, though, and doesn't cost much, at least for the total cache size.