I don't think adding `GST_ELEMENT_FACTORY_TYPE_HARDWARE` is required; we don't disallow hardware decoders. We then sort the elements by rank, so a decoder with rank `None` is unlikely to be chosen over software decoders with a higher rank.
I don't know if there's a good solution here. We could maybe order hardware decoders first, but at the same time I'm not sure that hardware decoding is very useful right now: it will always force an additional copy of the frames back to CPU memory, as we have no way to keep the frames on the GPU and pass them downstream as is. I haven't done many measurements though, so if you have numbers showing that it is still worth it even with the cost of the copy, please share :).
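To make the ordering question concrete, here is a rough standalone sketch, not the actual code. `struct candidate`, `compare_by_rank`, `compare_hardware_first` and `sort_candidates` are made-up names, and the `RANK_*` values are local copies of GStreamer's `GST_RANK_*` constants so that it builds without GStreamer. The `hardware` flag stands in for what `GST_ELEMENT_FACTORY_TYPE_HARDWARE` would tell us about a factory.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Local copies of the GST_RANK_* values (assumption: kept in sync by hand). */
enum { RANK_NONE = 0, RANK_MARGINAL = 64, RANK_SECONDARY = 128, RANK_PRIMARY = 256 };

/* Hypothetical stand-in for an element factory. */
struct candidate
{
    const char *name;
    unsigned int rank;
    bool hardware; /* would be derived from GST_ELEMENT_FACTORY_TYPE_HARDWARE */
};

/* Ordering as described above: highest rank first. */
static int compare_by_rank(const void *a, const void *b)
{
    const struct candidate *ca = a, *cb = b;
    return (ca->rank < cb->rank) - (ca->rank > cb->rank);
}

/* Possible alternative: hardware decoders first, then highest rank first. */
static int compare_hardware_first(const void *a, const void *b)
{
    const struct candidate *ca = a, *cb = b;
    if (ca->hardware != cb->hardware) return (int)cb->hardware - (int)ca->hardware;
    return compare_by_rank(a, b);
}

/* Sort the candidates in place with one of the two orderings. */
static void sort_candidates(struct candidate *candidates, size_t count, bool hardware_first)
{
    qsort(candidates, count, sizeof(*candidates),
            hardware_first ? compare_hardware_first : compare_by_rank);
}
```

With the rank-only ordering, a hardware decoder with rank `None` ends up last; with the hardware-first ordering it would be tried before any software decoder, regardless of rank.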
In any case, some of the changes look alright. I would approve changing to `GST_RANK_NONE` for instance, and adding a parser upfront seems useful too after all (there are still some shenanigans with stream formats, as discussed in https://gitlab.winehq.org/wine/wine/-/merge_requests/3810#note_45157). I would make it optional and generic though, something like this:
```c
if ((element = find_element(GST_ELEMENT_FACTORY_TYPE_PARSER, src_caps, src_caps))
        && !append_element(transform->container, element, &first, &last))
{
    gst_caps_unref(raw_caps);
    goto out;
}
```