On 23/04/20 8:15 a. m., Nikolay Sivov wrote:
Hi,
I'm replying to message 1/8, but it really applies to whole set. Could you explain how this works without going into details too much, e.g. some questions that I got so far:
- top level function first identifies pairs of {source, sink} paths as
branches, and that seems reasonable for trivial cases of 1:1 transforms, and no tee nodes.
How could this be used in generic case when you have some of the above? For tee node case for example, with single source and two sinks that will give two paths/pairs, while in fact, assuming sinks use same type for now, you only need to resolve from source to tee node. Same for n:m transform (is that even supported in default loader?), that gives (n x m) branches when n sources are connected to m sinks?
I think generic way would be to resolve connections, independently or as independently as transforms allow, not from every source to sink, but from {node1, output_index} -> {node2, input_index}.
Yeah... I had worked under the assumption that loader only works with straightforward 1:1 transforms; in general I have seen that the loader is simpler than one might suspect. It only solves topologies for playback, for example.
However I will do some tests to see whether a more general approach is justified.
- resolution seems to start with source nodes (and I apologize if I
got it wrong), is that correct? Is there an evidence that it works that way?
wouldn't this be an implementation detail? _Load either returns the new topology, or nothing at all, so I am not sure how we could see if it starts at source or at sink.
AFAIK user is expected to specify sink types, and that's the goal - to construct a graph that delivers such output type, if possible. Wouldn't it make sense to start from output types instead, and follow paths upstream? Sink could only be connected to source directly, transform output, or a tee. First case is trivial, transform will give you a list of what it can output to choose from, you can then add more transforms if necessary. For tee node, it will likely be dictated by primary output type.
the thing is, in the case of decoders/converters (which again I suspect are the only mfts the loader handles, since it only works for playback), it is a general tendency that the mft first expects an input type to be set through _SetInputType(), and only when it's set, and as a function of the type set, it gives available output types through _GetOutputAvailableType(). I remember that at first I intended to do it as you say, starting from sink, but I decided the other way round in part due to this.
Now you could get a list of mfts for the supported output type, but only considering majortype and subtype, which isn't a guarantee that the output is compatible with the downstream node. In any case, the detailed mediatypes for these mfts are obtained only when setting explicitly its input type, but for that we would need the possible output types for its upstream node, which may be another mft that would again require an input type set on it for it to display the detailed possible output types. Continuing like this we would finally get to the source and we would be practically be starting from it, which is what we wanted to avoid in the first place.
So suppose we have:
source --> sink
Start from sink, get its input media type, and get the mfts that output that media type through MFTEnumEx(). We get the mfts that can output that majortype and subtype, without further qualification:
source --> mft1 --> sink
we still can't call _GetOutputAvailableType() on mft1 to get detailed view of types and call _IsMediaTypeSupported on sink's media type handler, so we would need to see if source's output type is compatible as input type for mft1. If not, then get another list of mfts with output types compatible with mft1 as input through MFTEnumEX():
source --> mft2 --> mft1 --> sink
we have gone this far and we still don't even know if mft1 is good or not. Now if source's output is supported by mft2, _SetInputType(source_type) on mft2, which will produce the list of available outputs, and for each one of these we would call _SetInputType(mft2_output) on mft1, and then for each output type produced in mft1 we would call _IsMediaTypeSupported() on sink.
The other way round would be start with
source --> sink
we know source has a single output media type, so get mfts that support it as input through MFTEnumEx():
source --> mft1 --> sink
iterate over mft1's output types through _GetOutputAvailableType(), which we would have right away since we already set its input. Continue like this.
So I think that this tendency, that is common in decoder mfts, justifies starting from source. But apart from this, as far as I remember when I thought about it, I think there isn't much to gain with wherever we start the resolution.
Source nodes however could be in theory more flexible, capture sources might provide several output format and you can pick best suited one, to avoid conversion/resampling/etc.
I'm not sure that we can in any case choose the best suited type. If MF_TOPOLOGY_ENUMERATE_SOURCE_TYPES attribute is set on topology, the loader follows a precise algorithm, which doesn't include deciding the mediatype on the basis of convenience. And if MF_TOPOLOGY_ENUMERATE_SOURCE_TYPES is not set, it simply picks the current media type, and if that is not set, it picks the first available type (index 0), and only tries with that one.
- why is it useful to have lists of whole topology instances?
In each step of the resolution we have to change the state of nodes in order to get available types, and also there is the possibility that nodes and/or whole branches are optional, which would require reverting states.
Suppose we have an initial
source --> mft1 --> sink
the algorithms starts with the pair (source, mft1) and tries to solve that.
A solution to this pair includes much more state than just source node and mft1 node. For example, suppose one such resolution is
source --> mft2 --> mft1
State here includes source's output type, mft2 input type, mft2 output type and mft1 input type (same as mft2 output). It is also possible that another configuration is compatible, maybe with a different output from source, and different types for mft2. Then it is also possible that other resolutions of the pair (source, mft1) are possible, with other intermediate mfts and combination of input/output types.
To simplify this, I construct a topology and configure it with the compatible state for each possibility, cloning the nodes and setting their respective input/output types, and save a list of these topologies. The list is then used as basis for the next iteration (the pair (mft1, sink)), trying each compatible configuration/state in turn that was gathered in the previous iteration. The essence for justifying this, I think, is that in the process we are modifying the state of the nodes, in order to set its inputs and get its outputs, so we need to preserve the states that turn out to be compatible; going back to those states solely by using loops proved to be a nightmare, very unpractical and impossible to read. So cloning nodes and creating a new topology for each positive resolution that we gather on the way is, I believe, a very small price to pay for a much needed simplification.
The other thing is what I mentioned about optional nodes/branches (something that currently isn't handled). Suppose we get something like source --> mft1 --> mft2 --> ? --> sink, that is, we solved almost all the topology, but at the end we couldn't solve the last part. There should be the possibility to revert the state in case any node in the branch turns out to be optional, maybe go back to mft1 and try to connect to other intermediate mfts (if mft2 is optional, for example). There is definitely a need to recover previous useful states, for which I think a higher-level construct comes very handy. I opted for the particular solution of maintaining lists of topologies with the proper state, and cloned nodes, since the process needs to alter the original node's state to be able to figure out its things. I think this particular solution turned out relatively well, but of course I am open to discuss and work with other possibilities.
- what does connection method really do. I didn't get this part in
particular:
2780 hr = IMFAttributes_GetUINT32(attrs_src, &MF_TOPONODE_CONNECT_METHOD, &method); 2781 if (!enum_src_types || (hr == S_OK && !(method & MF_CONNECT_RESOLVE_INDEPENDENT_OUTPUTTYPES))) 2782 { 2783 for (method = MF_CONNECT_DIRECT; method < MF_CONNECT_RESOLVE_INDEPENDENT_OUTPUTTYPES; method++) 2784 { 2785 for (i = 0; i < num_media_types; i++) 2786 { 2787 IMFMediaTypeHandler_SetCurrentMediaType(mth_src, src_mediatypes[i]); 2788 if (SUCCEEDED(hr = topology_loader_resolve_branch_connect_nodes(full_topology, src->id, method))) 2789 goto out; 2790 } 2791 } 2792 } 2793 else 2794 { 2795 for (i = 0; i < num_media_types; i++) 2796 { 2797 for (method = MF_CONNECT_DIRECT; method < MF_CONNECT_RESOLVE_INDEPENDENT_OUTPUTTYPES; method++) 2798 { 2799 IMFMediaTypeHandler_SetCurrentMediaType(mth_src, src_mediatypes[i]); 2800 if (SUCCEEDED(hr = topology_loader_resolve_branch_connect_nodes(full_topology, src->id, method))) 2801 goto out; 2802 } 2803 } 2804 }
First source attribute is checked with MF_CONNECT_RESOLVE_INDEPENDENT_OUTPUTTYPES as a mask, and its actual value then is never used; and second - following loops iterate in [0,4), when only defined values are 0,1,3, and 4, plus some higher bits for OPTIONAL flags.
here the attribute that is retrieved is for the source. Because MF_CONNECT_DIRECT and MF_CONNECT_ALLOW_* applies to connections from upstream nodes, they don't apply to source, so here we only care if it has MF_CONNECT_RESOLVE_INDEPENDENT_OUTPUTTYPES.
I reuse 'method' to iterate through the methods, but I hadn't noticed that 2 isn't defined, so this is definitely a bug.
I can understand that DIRECT means that no decoders and no converters are allowed, and branch is not optional, so if direct connect fails, whole resolution fails. But what happens with method == 2? It's clear what loop reordering does here, and that's how it's documented, but going back to previous point, you can do that as last step too, after sinks are connected and configured.
method == 2 shouldn't happen. My bad here since I assumed the enum was continuous and I simply wanted to use that (supposed) fact to code the iterations.
I must change this to use the correct values from the enum, and first try MF_CONNECT_DIRECT, and then the others.
In reality, I really haven't found any difference from _DECODER and _CONVERTER, I really don't know what _CONVERTER is supposed to mean, since I always see that the category used for MFTEnumEx() is MFT_CATEGORY_*_DECODER. I must further investigate this, so in the meantime I think I am treating MF_CONNECT_ALLOW_CONVERTER/DECODER as the same, but I maintain the distinction at the formal level in the iteration so that we can more easily fix that in the future should any difference show up.
Also, does MF_CONNECT_RESOLVE_INDEPENDENT_OUTPUTTYPES actually apply to sources only as docs say? Do we apply other methods set with MF_TOPONODE_CONNECT_METHOD for other node types?
I haven't tested this attribute with other types of nodes, honestly. We can leave open this possibility. I wouldn't count on the loader accounting for the complexity it would mean to allow this attribute on other nodes, based on the limitations I've seen it has, but I will test this to find out.
Currently we don't consider MF_TOPONODE_CONNECT_METHOD for other nodes, but I think it will be relatively easy given the base code presented. I have considered that we will have to eventually handle this, and other things (for example, MF_CONNECT_AS_OPTIONAL), so the design was conceived with these extensions in mind. I just want first to have a base that works and then start working on these things (there are other attributes that require handling, actually).