This improves performance for the game "Grounded" on an AMD Radeon RX 6700 XT, with radv from Mesa 22.3.6. Testing was done with the "cb_access_map_w" option enabled, which also improves performance in this game on its own.
From my testing, the threshold can be raised from 2 ms to roughly 5 ms before the driver or GPU seems to reclock back to the lower power level. However, this measurement is questionable for several reasons. It seems to vary depending on the scene being rendered, and it is in any case specific to the game, driver and GPU in question. The game also has a weird approach to vsync that seems to involve presenting stale frames (and hence artificially inflating the FPS), which I'm not fully sure I accounted for while measuring. It's also hard to be sure that 5 ms is actually how long the driver will wait before powering down the GPU. In any case, it seems better to err on the side of submitting more often, to make sure the fix affects more drivers.
While submission isn't cheap, it seems to me that submitting every 2 ms is unlikely to cause a bottleneck (consider that this is at most 8 additional submissions per frame).
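As a rough check of that figure, the bound can be worked out from the frame time. This is only a sketch: the helper name is hypothetical, and the 60 FPS frame time (about 16.7 ms) is my assumption, since the description does not state the frame rate behind the "at most 8" estimate.

```c
#include <assert.h>

/* Hypothetical helper, not part of wined3d: upper bound on the number of
 * periodic submissions that fit into one frame. Both arguments are in
 * microseconds. */
static unsigned int max_periodic_submits(unsigned int frame_us, unsigned int interval_us)
{
    assert(interval_us != 0);
    /* Integer division rounds down: a partially elapsed interval does not
     * trigger a submission. */
    return frame_us / interval_us;
}
```

With a 2 ms interval, `max_periodic_submits(16667, 2000)` gives 8 for an assumed 60 FPS frame. Slower frames would allow proportionally more submissions, so the bound only holds at or above that frame rate.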
The maximum of 4 concurrent periodically submitted buffers was chosen arbitrarily. Removing the maximum altogether does not measurably affect performance for this game either way.
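The time threshold and the cap on concurrent buffers described above could be combined roughly as follows. This is a minimal sketch, not the code from the patches: every identifier is hypothetical, and how the real implementation reads the clock and learns that a submission has completed is not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* All names below are hypothetical; none are taken from wined3d. */
#define PERIODIC_SUBMIT_INTERVAL_NS 2000000ull /* 2 ms threshold from the description */
#define MAX_PERIODIC_SUBMISSIONS 4u            /* arbitrary cap from the description */

struct submit_state
{
    uint64_t last_submit_ns;         /* time of the most recent submission */
    unsigned int periodic_in_flight; /* periodic submissions the GPU has not finished */
};

/* Returns true if the current command buffer should be submitted now. */
static bool should_submit_periodically(const struct submit_state *state, uint64_t now_ns)
{
    /* Respect the cap on concurrent periodically submitted buffers. */
    if (state->periodic_in_flight >= MAX_PERIODIC_SUBMISSIONS)
        return false;
    /* Submit once the threshold has elapsed since the last submission. */
    return now_ns - state->last_submit_ns >= PERIODIC_SUBMIT_INTERVAL_NS;
}

/* Record that a periodic submission was just made. */
static void note_periodic_submit(struct submit_state *state, uint64_t now_ns)
{
    state->last_submit_ns = now_ns;
    ++state->periodic_in_flight;
}

/* Record that the GPU finished one periodic submission. */
static void note_submission_completed(struct submit_state *state)
{
    if (state->periodic_in_flight)
        --state->periodic_in_flight;
}
```

In this sketch, removing the cap amounts to dropping the first check in `should_submit_periodically()`, which matches the observation that the cap does not measurably matter for this game.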
Credit goes to Philip Rebohle and his work on DXVK for helping me to notice that periodic submission might make a difference.
--
v3:
wined3d: Submit command buffers after 512 draw or dispatch commands.
wined3d: Retrieve the VkCommandBuffer from wined3d_context_vk after executing RTV barriers.