Sorry for the late review.
A lot of these patches need more description, ideally in the patch subject itself.
What's the point of 1/5; how does it help to calculate that info sooner?
Why 2/5? Isn't the decoder supposed to update the output format manually? Does 2/5 solve a specific problem, or is it just to make the code more conceptually correct?
Why 4/5? How does it help to flip that way?
5/5 seems to combine a refactor (changing the semantics of wg_format width/height to no longer include padding) with a functional change. From what I can read of the functional change, I also find it very confusing. Why can't we just put a meta on the input buffer? In what case will the input and output size be inconsistent, once we've already discarded padding?