On 3/26/20 3:55 PM, Derek Lesho wrote:
On 3/26/20 3:19 PM, Zebediah Figura wrote:
On 3/26/20 2:54 PM, Derek Lesho wrote:
On 3/26/20 2:46 PM, Zebediah Figura wrote:
On 3/26/20 12:18 PM, Derek Lesho wrote:
On 3/26/20 11:40 AM, Zebediah Figura wrote:
On 3/25/20 11:57 PM, Derek Lesho wrote: > On 3/24/20 3:22 PM, Zebediah Figura wrote: > >> General comments: >> >> It's not great to introduce code that's not used anywhere, it's >> essentially dead until then. >> >> This could, I think, be split up into much smaller pieces in any case: >> you're introducing two different functions here, and each function >> introduces support for several different formats. >> >> On 3/24/20 2:39 PM, Derek Lesho wrote: >>> Signed-off-by: Derek Lesho dlesho@codeweavers.com >>> --- >>> dlls/winegstreamer/gst_private.h | 4 + >>> dlls/winegstreamer/mfplat.c | 533 ++++++++++++++++++++++++++++++- >>> include/codecapi.h | 38 +++ >>> 3 files changed, 574 insertions(+), 1 deletion(-) >>> create mode 100644 include/codecapi.h >>> >>> diff --git a/dlls/winegstreamer/gst_private.h >>> b/dlls/winegstreamer/gst_private.h >>> index e6fb841fc8..a6c3fd3784 100644 >>> --- a/dlls/winegstreamer/gst_private.h >>> +++ b/dlls/winegstreamer/gst_private.h >>> @@ -36,6 +36,7 @@ >>> #include "winuser.h" >>> #include "dshow.h" >>> #include "strmif.h" >>> +#include "mfobjects.h" >>> #include "wine/heap.h" >>> #include "wine/strmbase.h" >>> @@ -54,4 +55,7 @@ void start_dispatch_thread(void) DECLSPEC_HIDDEN; >>> extern HRESULT mfplat_get_class_object(REFCLSID rclsid, REFIID riid, >>> void **obj) DECLSPEC_HIDDEN; >>> +IMFMediaType* media_type_from_caps(GstCaps *caps); >>> +GstCaps *caps_from_media_type(IMFMediaType *type); >>> + >> Using the generic name "media_type", in a module that serves multiple >> media APIs, is not great. > Would you prefer mf_media_type? That's probably better, yes.
>> Also, why is this in the public header? > Would it be better to split this into a mfplat_private.h header? I mean, why do you need to use it from anything other than mfplat.c?
Because I'd prefer not to merge around 4,000 lines of code into a single file. (See media_source.c, mf_decode.c)
This is another reason why it doesn't make a lot of sense to submit dead code.
The code which uses these functions is included in my more recent patch set.
While submitting the code that uses a helper function in the same patch set does help, it's still not the best way to organize patches. Also, in this case, it means submitting at least 16 patches in one set, which is not desirable either.
The best way to submit such a patch set is to add the code which uses (or is going to use) media_type_from_caps() first, then actually implement media_type_from_caps(). That can mean e.g. adding a stub media_type_from_caps() that prints a FIXME and returns NULL, such as in fb6956c7d, or just leaving that part out of the caller (and probably doing a similar fail-with-FIXME). I don't know what the best way to arrange that is in this case, but I'm not the one writing the patches.
Such a top-down approach is much easier to review, because you know exactly how a helper will be used when (or before) you have to review that helper's implementation; when the helper is submitted by itself first, it's hard to tell whether it's doing the right thing. It also means you won't have dead code (and won't have to work around compiler warnings for it by e.g. making functions non-static).
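To be concrete, the fail-with-FIXME stub could look roughly like this (just a sketch, not necessarily what fb6956c7d does):

static IMFMediaType *mf_media_type_from_caps(const GstCaps *caps)
{
    gchar *str = gst_caps_to_string(caps);

    FIXME("Unsupported caps %s.\n", debugstr_a(str));
    g_free(str);
    return NULL;
}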
Ah, I see what you mean, that makes sense.
>> Also, style nitpick: please try to be consistent about your asterisk >> placement (ideally using "type *var" style.) > Ack. >>> #endif /* __GST_PRIVATE_INCLUDED__ */ >> ... >> >>> @@ -433,3 +438,529 @@ HRESULT mfplat_get_class_object(REFCLSID >>> rclsid, REFIID riid, void **obj) >>> return CLASS_E_CLASSNOTAVAILABLE; >>> } >>> + >>> +struct aac_user_data >>> +{ >>> + WORD payload_type; >>> + WORD profile_level_indication; >>> + WORD struct_type; >>> + WORD reserved; >>> + /*BYTE audio_specific_config;*/ >> What's this field doing here? > We store the audio_config_config after these fields, and I wanted to > express that here, it's not important though. It's not necessarily a problem to specify that arbitrary data comes after the struct, but that comment is not particularly clear.
>>> +}; >>> + >>> +/* IMPORTANT: caps will be modified to represent the exact type >>> needed for the format */ >> Why? > Because in the case of a demuxer, the caps of the stream we receive > might not map 1:1 with the representation in media foundation. Because > of this, in the media source, if any adjustments are needed, we feed the > buffers through a parser to correct it. > > See: > https://github.com/Guy1524/wine/commit/7ab88be3882ab95f3fc17dab374184e06f018... This seems like a very confusing way to do that. At least I'd relegate it to a separate function. I wouldn't expect a conversion function to modify its argument, and it moreover makes it essentially unusable anywhere else.
The alternative is to just fail, because there's no mapping. For example, there's no equivalent to a non-Annex-B H.264 stream in MF.
Sure, I think that's a good design. It's a clear way to communicate "we don't support these caps".
But we do want to support those caps, with modifications. When modifications are needed, we try to use a parser to perform those transformations. For example, qtdemux doesn't output H.264 streams in Annex B form, so we find a parser that converts them into that form.
Right, the idea is to make those modifications *before* converting into an MF media type.
For reference, the way it's done in quartz is:
(1) During test-play, we record the pin's preferred caps and convert them to a DirectShow media type [setcaps_sink()];
(2) when connecting to the downstream DirectShow sink, we first propose that media type [gstdecoder_source_get_media_type(index = 0)];
(3) if that fails, we propose a bunch of other DirectShow types to the downstream sink [gstdecoder_source_get_media_type(index > 0)];
(4) if none of those work, it tries any types enumerated by the downstream sink, ensuring that GStreamer can understand them [gstdecoder_source_query_accept()];
(5) we convert that type to GstCaps, stripping details we don't care about if necessary, and specify that as the format our sink pad demands [query_sink(), case GST_QUERY_CAPS].
Some of this is implied by the design of quartz (e.g. source pins generally try formats suggested by the downstream sink, though it's not a requirement), but in general, the idea that I think also makes sense here is to determine a media type that you support from the media type that the element exposes, require that type on the sink pad, give GStreamer the tools to convert between the two if necessary, and let GStreamer's caps negotiation do the rest.
I don't think that using GStreamer's caps negotiation is mutually exclusive with the modifications I'm making to the caps in this function. The function's purpose is to find the closest matching GstCaps, and it describes how it got there by performing the modifications. Yes, right now, the code just manually finds a parser, but we could easily set the sink caps to those returned from this function, then use caps negotiation. Essentially, the function determines the media type we support by looking at the preferred caps' format, and just changes the details to make it match the Media Foundation representation. I don't see a need to split that off into a separate function, as, in my opinion, the function serves as good documentation on the exact meaning of a given IMFMediaType.
I wouldn't expect the purpose of a function called "media_type_from_caps()" to be to find MF-compatible GstCaps from input GstCaps.
Even if you were to rename the function, I don't see the benefit of performing both tasks in a single function. As I see it, you could just as easily have two functions, along the lines of:
static GstCaps *make_compatible_caps(const GstCaps *source_caps)
{
    GstCaps *caps = gst_caps_copy(source_caps);

    /* ... */

    if (!strcmp(type, "video/x-h264"))
    {
        /* Media Foundation does not support unparsed h264. */
        gst_structure_set(structure, "parsed", G_TYPE_BOOLEAN, TRUE, NULL);
    }

    return caps;
}

/* Returns NULL if the type cannot be converted to caps. */
static IMFMediaType *media_type_from_caps(const GstCaps *caps)
{
    /* ... */
}
Besides being clearer to read, as I see it this:
* allows you to use media_type_from_caps() in other places;
* allows you to support advertising multiple types more easily (which I believe mfplat supports in general);
* if, like quartz, we ever want to derive the caps from an IMFMediaType instead of from the source caps, you then don't have to change media_type_from_caps() at all.
That said, these modifications are specific to the format, and along those lines it may make more sense to append specific elements rather than to make specific changes to the caps and try to find an element that can accommodate those. This will also help if you ever need to append multiple such elements. Thus you can e.g. append an audioconvert element unconditionally, and if no conversion is necessary it'll just pass through.
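As a very rough sketch of the element-appending idea (the element and variable names here are made up, just to illustrate):

/* Sketch only: "bin", "upstream" and "sink" stand in for whatever
 * elements the media source already manages. */
static gboolean append_audioconvert(GstBin *bin, GstElement *upstream, GstElement *sink)
{
    GstElement *convert = gst_element_factory_make("audioconvert", NULL);

    if (!convert)
        return FALSE;

    gst_bin_add(bin, convert);

    /* audioconvert simply passes buffers through when no conversion is
     * needed, so it's safe to insert it unconditionally. */
    if (!gst_element_link_many(upstream, convert, sink, NULL))
        return FALSE;

    return gst_element_sync_state_with_parent(convert);
}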
In the case of compressed sample parsers, what would I append unconditionally? It's very specific to the type.
Looking at the modifications you do make:
- you force H.264 into Annex B format, which is the job of h264parse;
Yes, because that's how it's represented on Windows.
- you force all raw audio into 32-bit float. Does native mfplat really never output integer PCM?
I think I can fix that; I do know that MFAudioFormat_Float can only be F32LE, though.
64-bit float exists. (So do 16-bit and 24-bit, in fact.) That's not necessarily to say that any given MF object handles them, but I'd recommend at least checking whether the bit depth and endianness match what you expect, instead of just assuming that they do.
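Something along these lines, maybe (a sketch only; which raw formats native mfplat actually reports would need testing):

/* Sketch: map the GStreamer "format" field to an MF subtype instead of
 * assuming F32LE. */
static HRESULT set_raw_audio_format(IMFMediaType *media_type, const GstStructure *info)
{
    const char *format = gst_structure_get_string(info, "format");

    if (!format)
        return E_FAIL;

    if (!strcmp(format, "F32LE"))
        return IMFMediaType_SetGUID(media_type, &MF_MT_SUBTYPE, &MFAudioFormat_Float);

    if (!strcmp(format, "S16LE"))
    {
        IMFMediaType_SetUINT32(media_type, &MF_MT_AUDIO_BITS_PER_SAMPLE, 16);
        return IMFMediaType_SetGUID(media_type, &MF_MT_SUBTYPE, &MFAudioFormat_PCM);
    }

    FIXME("Unhandled raw audio format %s.\n", debugstr_a(format));
    return E_NOTIMPL;
}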
Okay, I heard somewhere that MFAudioFormat_Float was always 32-bit. That must have been wrong information; I'll fix that.
It's possible that MFAudioFormat_Float is always 32-bit, but GStreamer's audio/x-raw isn't always 32-bit.
Oh, that's why we set the F32LE format: so that if it isn't 32-bit little-endian, it will be converted to that.
>>> +IMFMediaType* media_type_from_caps(GstCaps *caps) >>> +{ >>> + IMFMediaType *media_type; >>> + GstStructure *info; >>> + const char *media_type_name; >>> + gchar *human_readable; >>> + >>> + if (FAILED(MFCreateMediaType(&media_type))) >>> + { >>> + return NULL; >>> + } >>> + >>> + info = gst_caps_get_structure(caps, 0); >>> + media_type_name = gst_structure_get_name(info); >>> + >>> + human_readable = gst_caps_to_string(caps); >>> + TRACE("caps = %s\n", human_readable); >>> + g_free(human_readable); >> Probably would be best to guard this with TRACE_ON, so that we don't >> bother allocating anything otherwise. >> >> Also, you'll want to use debugstr_a(), especially since caps can overrun >> the static buffer in ntdll. > Ack. >>> + >>> + if (!(strncmp(media_type_name, "video", 5))) >> Style nitpick, superfluous parentheses. >> >> I think Nikolay already mentioned this, but it's probably not a bad idea >> to just match against the whole "video/x-h264" etc. sequence. > Ack. >>> + { >>> + const char *video_format = media_type_name + 6; >>> + gint width, height, framerate_num, framerate_den; >>> + >>> + IMFMediaType_SetGUID(media_type, &MF_MT_MAJOR_TYPE, >>> &MFMediaType_Video); >>> + >>> + if (gst_structure_get_int(info, "width", &width) && >>> gst_structure_get_int(info, "height", &height)) >>> + { >>> + IMFMediaType_SetUINT64(media_type, &MF_MT_FRAME_SIZE, >>> ((UINT64)width << 32) | height); >>> + } >>> + if (gst_structure_get_fraction(info, "framerate", &framerate_num, >>> &framerate_den)) >>> + { >>> + IMFMediaType_SetUINT64(media_type, &MF_MT_FRAME_RATE, >>> ((UINT64)framerate_num << 32) | framerate_den); >>> + } >>> + >>> + if (!(strcmp(video_format, "x-h264"))) >>> + { >>> + const char *profile, *level; >>> + >>> + IMFMediaType_SetGUID(media_type, &MF_MT_SUBTYPE, &MFVideoFormat_H264); >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_COMPRESSED, TRUE); >>> + >>> + if ((profile = gst_structure_get_string(info, "profile"))) >>> + { >>> + if (!(strcmp(profile, "main"))) >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_MPEG2_PROFILE, >>> eAVEncH264VProfile_Main); >>> + else if (!(strcmp(profile, "high"))) >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_MPEG2_PROFILE, >>> eAVEncH264VProfile_High); >>> + else if (!(strcmp(profile, "high-4:4:4"))) >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_MPEG2_PROFILE, >>> eAVEncH264VProfile_444); >>> + else >>> + ERR("Unrecognized profile %s\n", profile); >> This ERR (and many below) should probably be a FIXME instead, methinks. > Ack. 
>>> + } >>> + if ((level = gst_structure_get_string(info, "level"))) >>> + { >>> + if (!(strcmp(level, "1"))) >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_MPEG2_LEVEL, >>> eAVEncH264VLevel1); >>> + else if (!(strcmp(level, "1.1"))) >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_MPEG2_LEVEL, >>> eAVEncH264VLevel1_1); >>> + else if (!(strcmp(level, "1.2"))) >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_MPEG2_LEVEL, >>> eAVEncH264VLevel1_2); >>> + else if (!(strcmp(level, "1.3"))) >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_MPEG2_LEVEL, >>> eAVEncH264VLevel1_3); >>> + else if (!(strcmp(level, "2"))) >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_MPEG2_LEVEL, >>> eAVEncH264VLevel2); >>> + else if (!(strcmp(level, "2.1"))) >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_MPEG2_LEVEL, >>> eAVEncH264VLevel2_1); >>> + else if (!(strcmp(level, "2.2"))) >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_MPEG2_LEVEL, >>> eAVEncH264VLevel2_2); >>> + else if (!(strcmp(level, "3"))) >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_MPEG2_LEVEL, >>> eAVEncH264VLevel3); >>> + else if (!(strcmp(level, "3.1"))) >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_MPEG2_LEVEL, >>> eAVEncH264VLevel3_1); >>> + else if (!(strcmp(level, "3.2"))) >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_MPEG2_LEVEL, >>> eAVEncH264VLevel3_2); >>> + else if (!(strcmp(level, "4"))) >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_MPEG2_LEVEL, >>> eAVEncH264VLevel4); >>> + else if (!(strcmp(level, "4.1"))) >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_MPEG2_LEVEL, >>> eAVEncH264VLevel4_1); >>> + else if (!(strcmp(level, "4.2"))) >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_MPEG2_LEVEL, >>> eAVEncH264VLevel4_2); >>> + else if (!(strcmp(level, "5"))) >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_MPEG2_LEVEL, >>> eAVEncH264VLevel5); >>> + else if (!(strcmp(level, "5.1"))) >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_MPEG2_LEVEL, >>> eAVEncH264VLevel5_1); >>> + else if (!(strcmp(level, "5.2"))) >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_MPEG2_LEVEL, >>> eAVEncH264VLevel5_2); >>> + else >>> + ERR("Unrecognized level %s\n", level); >>> + } >> Could we maybe make this a table instead? > Sure. >>> + gst_caps_set_simple(caps, "stream-format", G_TYPE_STRING, >>> "byte-stream", NULL); >>> + gst_caps_set_simple(caps, "alignment", G_TYPE_STRING, "au", NULL); >>> + for (unsigned int i = 0; i < gst_caps_get_size(caps); i++) >>> + { >>> + GstStructure *structure = gst_caps_get_structure (caps, i); >>> + gst_structure_remove_field(structure, "codec_data"); >>> + } >>> + } >>> + else if (!(strcmp(video_format, "x-wmv"))) >>> + { >>> + gint wmv_version; >>> + const char *format; >>> + const GValue *codec_data; >>> + >>> + if (gst_structure_get_int(info, "wmvversion", &wmv_version)) >>> + { >>> + switch (wmv_version) >>> + { >>> + case 1: >>> + IMFMediaType_SetGUID(media_type, &MF_MT_SUBTYPE, &MFVideoFormat_WMV1); >>> + break; >>> + case 2: >>> + IMFMediaType_SetGUID(media_type, &MF_MT_SUBTYPE, &MFVideoFormat_WMV2); >>> + break; >>> + case 3: >>> + IMFMediaType_SetGUID(media_type, &MF_MT_SUBTYPE, &MFVideoFormat_WMV3); >>> + break; >>> + default: >>> + ERR("Unrecognized wmvversion %d\n", wmv_version); >>> + } >>> + } >>> + >>> + if ((format = gst_structure_get_string(info, "format"))) >>> + { >>> + if (!(strcmp(format, "WVC1"))) >>> + IMFMediaType_SetGUID(media_type, &MF_MT_SUBTYPE, &MFVideoFormat_WVC1); >> What if it's not? I think that deserves at least a FIXME. >> >> (Style nitpick, extra parentheses.) > Ack. 
>>> + } >>> + >>> + if ((codec_data = gst_structure_get_value(info, "codec_data"))) >>> + { >>> + GstBuffer *codec_data_buffer = gst_value_get_buffer(codec_data); >>> + if (codec_data_buffer) >>> + { >>> + gsize codec_data_size = gst_buffer_get_size(codec_data_buffer); >>> + gpointer codec_data_raw = heap_alloc(codec_data_size); >>> + gst_buffer_extract(codec_data_buffer, 0, codec_data_raw, >>> codec_data_size); >>> + IMFMediaType_SetBlob(media_type, &MF_MT_USER_DATA, codec_data_raw, >>> codec_data_size); >>> + } >>> + } >>> + } >>> + else if (!(strcmp(video_format, "mpeg"))) >>> + { >>> + IMFMediaType_SetGUID(media_type, &MF_MT_SUBTYPE, &MFVideoFormat_M4S2); >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_COMPRESSED, TRUE); >> There are other video/mpeg formats. > TBH, the only reason I've included this is for the tests to work, I'll > look into how to differentiate the mpeg types tomorrow. >>> + } >>> + else if (!(strcmp(video_format, "x-raw"))) >>> + { >>> + const char *fourcc = gst_structure_get_string(info, "stream-format"); >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_COMPRESSED, FALSE); >>> + if (fourcc && (strlen(fourcc) == 4)) >>> + { >>> + GUID fourcc_subtype = MFVideoFormat_Base; >>> + fourcc_subtype.Data1 = MAKEFOURCC( >>> + toupper(fourcc[0]), toupper(fourcc[1]), toupper(fourcc[2]), >>> toupper(fourcc[3])); >>> + IMFMediaType_SetGUID(media_type, &MF_MT_SUBTYPE, &fourcc_subtype); >>> + } >>> + else >>> + ERR("uncompressed video has no stream-format\n"); >> I've never seen a FOURCC stored in the "stream-format" tag; where are >> you getting this from? > You're right, I think I'm supposed to use "format" here, but this is > dead code rn so I that's why I didn't see any problems. >>> + } >>> + else >>> + ERR("Unrecognized video format %s\n", video_format); >>> + } >>> + else if (!(strncmp(media_type_name, "audio", 5))) >>> + { >>> + const char *audio_format = media_type_name + 6; >>> + >>> + IMFMediaType_SetGUID(media_type, &MF_MT_MAJOR_TYPE, >>> &MFMediaType_Audio); >>> + if (!(strcmp(audio_format, "mpeg"))) >>> + { >>> + int mpeg_version = -1; >>> + >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_COMPRESSED, TRUE); >>> + >>> + if (!(gst_structure_get_int(info, "mpegversion", &mpeg_version))) >>> + ERR("Failed to get mpegversion\n"); >>> + switch (mpeg_version) >>> + { >>> + case 1: >>> + { >>> + IMFMediaType_SetGUID(media_type, &MF_MT_SUBTYPE, &MFAudioFormat_MPEG); >>> + break; >>> + } >> What about MFAudioFormat_MP3? > I'm actually not sure what to use here, I should probably remove it for now. 
>>> + case 2: >>> + case 4: >>> + { >>> + const char *format, *profile, *level; >>> + DWORD profile_level_indication = 0; >>> + const GValue *codec_data; >>> + DWORD asc_size = 0; >>> + struct aac_user_data *user_data = NULL; >>> + >>> + IMFMediaType_SetGUID(media_type, &MF_MT_SUBTYPE, &MFAudioFormat_AAC); >>> + >>> + codec_data = gst_structure_get_value(info, "codec_data"); >>> + if (codec_data) >>> + { >>> + GstBuffer *codec_data_buffer = gst_value_get_buffer(codec_data); >>> + if (codec_data_buffer) >>> + { >>> + if ((asc_size = gst_buffer_get_size(codec_data_buffer)) >= 2) >>> + { >>> + user_data = heap_alloc_zero(sizeof(*user_data)+asc_size); >>> + gst_buffer_extract(codec_data_buffer, 0, (gpointer)(user_data + 1), >>> asc_size); >>> + } >>> + else >>> + ERR("Unexpected buffer size\n"); >>> + } >>> + else >>> + ERR("codec_data not a buffer\n"); >>> + } >>> + else >>> + ERR("codec_data not found\n"); >>> + if (!user_data) >>> + user_data = heap_alloc_zero(sizeof(*user_data)); >>> + >>> + { >>> + int rate; >>> + if (gst_structure_get_int(info, "rate", &rate)) >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_AUDIO_SAMPLES_PER_SECOND, >>> rate); >>> + } >>> + { >>> + int channels; >>> + if (gst_structure_get_int(info, "channels", &channels)) >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_AUDIO_NUM_CHANNELS, >>> channels); >>> + } >> Did you mean to add these blocks? > Yeah, it's so I can declare the variables closer to where they are used. I'll admit I don't get the obsession with C99 variable declarations, but this just seems janky.
It wouldn't seem janky if we had C99 variable declarations :P
>>> + >>> + if ((format = gst_structure_get_string(info, "stream-format"))) >>> + { >>> + DWORD payload_type = -1; >>> + if (!(strcmp(format, "raw"))) >>> + payload_type = 0; >>> + else if (!(strcmp(format, "adts"))) >>> + payload_type = 1; >>> + else >>> + ERR("Unrecognized stream-format\n"); >>> + if (payload_type != -1) >>> + { >>> + IMFMediaType_SetUINT32(media_type, &MF_MT_AAC_PAYLOAD_TYPE, >>> payload_type); >>> + user_data->payload_type = payload_type; >>> + } >>> + } >>> + else >>> + { >>> + ERR("Stream format not present\n"); >>> + } >>> + >>> + profile = gst_structure_get_string(info, "profile"); >>> + level = gst_structure_get_string(info, "level"); >>> + /* Data from >>> https://docs.microsoft.com/en-us/windows/win32/medfound/aac-encoder#output-t... >>> */ >> I'm not sure I'd link to Microsoft documentation; it's not very stable. > Would a link to an archive.is backup of it be better? Probably.
>>> + if (profile && level) >>> + { >>> + if (!(strcmp(profile, "lc")) && !(strcmp(level, "2"))) >>> + profile_level_indication = 0x29; >>> + else if (!(strcmp(profile, "lc")) && !(strcmp(level, "4"))) >>> + profile_level_indication = 0x2A; >>> + else if (!(strcmp(profile, "lc")) && !(strcmp(level, "5"))) >>> + profile_level_indication = 0x2B; >>> + else >>> + ERR("Unhandled profile/level combo\n"); >>> + } >>> + else >>> + ERR("Profile or level not present\n"); >>> + >>> + if (profile_level_indication) >>> + { >>> + IMFMediaType_SetUINT32(media_type, >>> &MF_MT_AAC_AUDIO_PROFILE_LEVEL_INDICATION, profile_level_indication); >>> + user_data->profile_level_indication = profile_level_indication; >>> + } >>> + >>> + IMFMediaType_SetBlob(media_type, &MF_MT_USER_DATA, (BYTE >>> *)user_data, sizeof(user_data) + asc_size); >>> + heap_free(user_data); >>> + break; >>> + } >>> + default: >>> + ERR("Unhandled mpegversion %d\n", mpeg_version); >>> + } >>> + } >>> + else if (!(strcmp(audio_format, "x-raw"))) >>> + { >>> + IMFMediaType_SetGUID(media_type, &MF_MT_SUBTYPE, &MFAudioFormat_Float); >>> + >>> + gst_caps_set_simple(caps, "format", G_TYPE_STRING, "F32LE", NULL); >> There are other audio formats. > Ah, you mean PCM? I'll add a case for that tomorrow. f32le is PCM, but I mean integer PCM and other depths than 32-bit.
Hmm okay, I'll do more research on that.
Presumably there should also be channel and sample rate data here.
Yeah good catch.
>>> + } >>> + else >>> + ERR("Unrecognized audio format %s\n", audio_format); >>> + } >>> + else >>> + { >>> + goto fail; >> I'm generally of the opinion that one line of cleanup doesn't merit a >> "goto". > Okay I'll change that then. >>> + } >>> + >>> + return media_type; >>> + fail: >>> + IMFMediaType_Release(media_type); >>> + return NULL; >>> +} >>> + >>> +static const char *fourcc_str(DWORD fourcc) >>> +{ >>> + if (!fourcc) return NULL; >>> + return wine_dbg_sprintf ("%c%c%c%c", >>> + (char)(fourcc), (char)(fourcc >> 8), >>> + (char)(fourcc >> 16), (char)(fourcc >> 24)); >>> +} >> I don't think you want to use Wine's debugging utilities for non-debug >> code. > Ack. >>> + >>> +GstCaps *caps_from_media_type(IMFMediaType *type) >>> +{ >>> + GUID major_type; >>> + GUID subtype; >>> + GUID base_masked_subtype; >>> + GstCaps *output = NULL; >>> + >>> + if (FAILED(IMFMediaType_GetMajorType(type, &major_type))) >>> + return NULL; >>> + if (FAILED(IMFMediaType_GetGUID(type, &MF_MT_SUBTYPE, &subtype))) >>> + return NULL; >>> + base_masked_subtype = subtype; >>> + base_masked_subtype.Data1 = 0; >>> + >>> + if (IsEqualGUID(&major_type, &MFMediaType_Video)) >>> + { >>> + UINT64 frame_rate = 0, frame_size = 0; >>> + DWORD *framerate_num = ((DWORD*)&frame_rate) + 1; >>> + DWORD *framerate_den = ((DWORD*)&frame_rate); >>> + DWORD *width = ((DWORD*)&frame_size) + 1; >>> + DWORD *height = ((DWORD*)&frame_size); >> It seems simpler to me to do e.g. >> >> DWORD width = frame_size; >> DWORD height = frame_size >> 32; > I'm not getting the width and height here, I'm declaring pointers to > them which are set later on. Right, I mean actually set the variables after retrieving frame_size; in full something like
DWORD width, height;

/* ... */

IMFMediaType_GetUINT64(type, &MF_MT_FRAME_SIZE, &frame_size);
width = frame_size >> 32;
height = frame_size;
Yeah that works.
>>> + >>> + if (IsEqualGUID(&subtype, &MFVideoFormat_H264)) >>> + { >>> + enum eAVEncH264VProfile h264_profile; >>> + enum eAVEncH264VLevel h264_level; >>> + output = gst_caps_new_empty_simple("video/x-h264"); >>> + gst_caps_set_simple(output, "stream-format", G_TYPE_STRING, >>> "byte-stream", NULL); >>> + gst_caps_set_simple(output, "alignment", G_TYPE_STRING, "au", NULL); >>> + >>> + if (SUCCEEDED(IMFMediaType_GetUINT32(type, &MF_MT_MPEG2_PROFILE, >>> &h264_profile))) >>> + { >>> + const char *profile = NULL; >>> + switch (h264_profile) >>> + { >>> + case eAVEncH264VProfile_Main: profile = "main"; break; >>> + case eAVEncH264VProfile_High: profile = "high"; break; >>> + case eAVEncH264VProfile_444: profile = "high-4:4:4"; break; >>> + default: ERR("Unknown profile %u\n", h264_profile); >>> + } >>> + if (profile) >>> + gst_caps_set_simple(output, "profile", G_TYPE_STRING, profile, NULL); >>> + } >>> + if (SUCCEEDED(IMFMediaType_GetUINT32(type, &MF_MT_MPEG2_LEVEL, >>> &h264_level))) >>> + { >>> + const char *level = NULL; >>> + switch (h264_level) >>> + { >>> + case eAVEncH264VLevel1: level = "1"; break; >>> + case eAVEncH264VLevel1_1: level = "1.1"; break; >>> + case eAVEncH264VLevel1_2: level = "1.2"; break; >>> + case eAVEncH264VLevel1_3: level = "1.3"; break; >>> + case eAVEncH264VLevel2: level = "2"; break; >>> + case eAVEncH264VLevel2_1: level = "2.1"; break; >>> + case eAVEncH264VLevel2_2: level = "2.2"; break; >>> + case eAVEncH264VLevel3: level = "3"; break; >>> + case eAVEncH264VLevel3_1: level = "3.1"; break; >>> + case eAVEncH264VLevel3_2: level = "3.2"; break; >>> + case eAVEncH264VLevel4: level = "4"; break; >>> + case eAVEncH264VLevel4_1: level = "4.1"; break; >>> + case eAVEncH264VLevel4_2: level = "4.2"; break; >>> + case eAVEncH264VLevel5: level = "5"; break; >>> + case eAVEncH264VLevel5_1: level = "5.1"; break; >>> + case eAVEncH264VLevel5_2: level = "5.2"; break; >>> + default: ERR("Unknown level %u\n", h264_level); >>> + } >>> + if (level) >>> + gst_caps_set_simple(output, "level", G_TYPE_STRING, level, NULL); >>> + } >>> + } >>> + else if (IsEqualGUID(&subtype, &MFVideoFormat_WVC1)) >>> + { >>> + BYTE *user_data; >>> + DWORD user_data_size; >>> + output = gst_caps_new_empty_simple("video/x-wmv"); >>> + gst_caps_set_simple(output, "format", G_TYPE_STRING, "WVC1", NULL); >>> + >>> + gst_caps_set_simple(output, "wmvversion", G_TYPE_INT, 3, NULL); >>> + >>> + if (SUCCEEDED(IMFMediaType_GetAllocatedBlob(type, &MF_MT_USER_DATA, >>> &user_data, &user_data_size))) >>> + { >>> + GstBuffer *codec_data_buffer = gst_buffer_new_allocate(NULL, >>> user_data_size, NULL); >>> + gst_buffer_fill(codec_data_buffer, 0, user_data, user_data_size); >>> + gst_caps_set_simple(output, "codec_data", GST_TYPE_BUFFER, >>> codec_data_buffer, NULL); >>> + gst_buffer_unref(codec_data_buffer); >>> + CoTaskMemFree(user_data); >>> + } >>> + } >>> + else if (IsEqualGUID(&base_masked_subtype, &MFVideoFormat_Base)) >>> + { >>> + output = gst_caps_new_empty_simple("video/x-raw"); >>> + gst_caps_set_simple(output, "format", G_TYPE_STRING, >>> fourcc_str(subtype.Data1), NULL); >> What about RGB formats? > Ah, I didn't think about those, looks like we'll have to use a table of > known conversions instead. Well, to some degree, though you can also make use of gst_video_format_from_fourcc(). See also amt_to_gst_caps_video() in gstdemux.c.
Ah, check for RGB formats first, then fall back to FOURCC conversion; okay, sure.
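Something like this, maybe (just a sketch; the RGB mappings would need double-checking against gstdemux.c, and it needs gst/video/video.h):

/* Sketch: map an MF video subtype to a GStreamer format string, checking
 * the RGB subtypes before falling back to the FOURCC path. */
static const char *format_string_from_subtype(const GUID *subtype)
{
    GstVideoFormat format;

    if (IsEqualGUID(subtype, &MFVideoFormat_RGB32))
        return "BGRx";
    if (IsEqualGUID(subtype, &MFVideoFormat_ARGB32))
        return "BGRA";
    if (IsEqualGUID(subtype, &MFVideoFormat_RGB24))
        return "BGR";

    format = gst_video_format_from_fourcc(subtype->Data1);
    if (format == GST_VIDEO_FORMAT_UNKNOWN)
        return NULL;
    return gst_video_format_to_string(format);
}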
>>> + } >>> + else { >>> + ERR("Unrecognized subtype %s\n", debugstr_guid(&subtype)); >>> + return NULL; >>> + } >>> + >>> + IMFMediaType_GetUINT64(type, &MF_MT_FRAME_RATE, &frame_rate); >>> + IMFMediaType_GetUINT64(type, &MF_MT_FRAME_SIZE, &frame_size); >>> + >>> + if (frame_rate) >>> + gst_caps_set_simple(output, "framerate", GST_TYPE_FRACTION, >>> *framerate_num, *framerate_den, NULL); >>> + if (frame_size) >>> + { >>> + gst_caps_set_simple(output, "width", G_TYPE_INT, *width, NULL); >>> + gst_caps_set_simple(output, "height", G_TYPE_INT, *height, NULL); >>> + } >>> + return output; >>> + } >>> + else if (IsEqualGUID(&major_type, &MFMediaType_Audio)) >>> + { >>> + DWORD rate, channels; >>> + >>> + if (IsEqualGUID(&subtype, &MFAudioFormat_AAC)) >>> + { >>> + DWORD payload_type, indication; >>> + struct aac_user_data *user_data; >>> + UINT32 user_data_size; >>> + output = gst_caps_new_empty_simple("audio/mpeg"); >>> + >>> + /* TODO */ >>> + gst_caps_set_simple(output, "framed", G_TYPE_BOOLEAN, TRUE, NULL); >>> + gst_caps_set_simple(output, "mpegversion", G_TYPE_INT, 4, NULL); >> What's TODO here? > MFAudioFormat_AAC could also mean mpegversion=2, and I don't know what > the "framed" attribute is for. A TODO message should probably mention what exactly is to be done.
In general it's good practice to understand what your code is doing before you submit it, but regardless, "framed" means there is exactly one frame per buffer. Is that guaranteed by the MF source? (It's not obvious to me that it is...)
Yeah, I should probably remove it in that case; I was trying to match up all the attributes when going through the conversion to IMFMediaType and back, but it's probably not necessary.
>>> + >>> + if (SUCCEEDED(IMFMediaType_GetUINT32(type, &MF_MT_AAC_PAYLOAD_TYPE, >>> &payload_type))) >>> + { >>> + switch (payload_type) >>> + { >>> + case 0: >>> + gst_caps_set_simple(output, "stream-format", G_TYPE_STRING, "raw", >>> NULL); >>> + break; >>> + case 1: >>> + gst_caps_set_simple(output, "stream-format", G_TYPE_STRING, "adts", >>> NULL); >>> + break; >>> + default: >>> + gst_caps_set_simple(output, "stream-format", G_TYPE_STRING, "raw", >>> NULL); >> Seems to me that 2 and 3 should be mapped to "adif" and "loas", >> respectively. > Ack. >>> + } >>> + } >>> + else >>> + gst_caps_set_simple(output, "stream-format", G_TYPE_STRING, "raw", >>> NULL); >>> + >>> + if (SUCCEEDED(IMFMediaType_GetUINT32(type, >>> &MF_MT_AAC_AUDIO_PROFILE_LEVEL_INDICATION, &indication))) >>> + { >>> + switch (indication) >>> + { >>> + case 0x29: >>> + { >>> + gst_caps_set_simple(output, "profile", G_TYPE_STRING, "lc", NULL); >>> + gst_caps_set_simple(output, "level", G_TYPE_STRING, "2", NULL); >>> + break; >>> + } >>> + case 0x2A: >>> + { >>> + gst_caps_set_simple(output, "profile", G_TYPE_STRING, "lc", NULL); >>> + gst_caps_set_simple(output, "level", G_TYPE_STRING, "4", NULL); >>> + break; >>> + } >>> + case 0x2B: >>> + { >>> + gst_caps_set_simple(output, "profile", G_TYPE_STRING, "lc", NULL); >>> + gst_caps_set_simple(output, "level", G_TYPE_STRING, "5", NULL); >>> + break; >>> + } >>> + default: >>> + ERR("Unrecognized profile-level-indication %u\n", indication); >>> + } >> I think you could significantly deduplicate this switch. > Ack. >>> + } >>> + >>> + if (SUCCEEDED(IMFMediaType_GetAllocatedBlob(type, &MF_MT_USER_DATA, >>> (BYTE **) &user_data, &user_data_size))) >>> + { >>> + if (user_data_size > sizeof(sizeof(*user_data))) >>> + { >>> + GstBuffer *audio_specific_config = gst_buffer_new_allocate(NULL, >>> user_data_size - sizeof(*user_data), NULL); >>> + gst_buffer_fill(audio_specific_config, 0, user_data + 1, >>> user_data_size - sizeof(*user_data)); >>> + >>> + gst_caps_set_simple(output, "codec_data", GST_TYPE_BUFFER, >>> audio_specific_config, NULL); >>> + gst_buffer_unref(audio_specific_config); >>> + } >>> + CoTaskMemFree(user_data); >>> + } >>> + } >>> + else if (IsEqualGUID(&subtype, &MFAudioFormat_Float)) >>> + { >>> + output = gst_caps_new_empty_simple("audio/x-raw"); >>> + >>> + gst_caps_set_simple(output, "format", G_TYPE_STRING, "F32LE", NULL); >>> + } >>> + else >>> + { >>> + ERR("Unrecognized subtype %s\n", debugstr_guid(&subtype)); >>> + if (output) >>> + gst_caps_unref(output); >>> + return NULL; >>> + } >>> + if (SUCCEEDED(IMFMediaType_GetUINT32(type, >>> &MF_MT_AUDIO_SAMPLES_PER_SECOND, &rate))) >>> + { >>> + gst_caps_set_simple(output, "rate", G_TYPE_INT, rate, NULL); >>> + } >>> + if (SUCCEEDED(IMFMediaType_GetUINT32(type, >>> &MF_MT_AUDIO_NUM_CHANNELS, &channels))) >>> + { >>> + gst_caps_set_simple(output, "channels", G_TYPE_INT, channels, NULL); >>> + } >>> + >>> + return output; >>> + } >>> + >>> + ERR("Unrecognized major type %s\n", debugstr_guid(&major_type)); >>> + return NULL; >>> +} >>> diff --git a/include/codecapi.h b/include/codecapi.h >>> new file mode 100644 >>> index 0000000000..2690b523d7 >>> --- /dev/null >>> +++ b/include/codecapi.h >>> @@ -0,0 +1,38 @@ >>> +#ifndef __CODECAPI_H >>> +#define __CODECAPI_H >>> + >>> +enum eAVEncH264VProfile >>> +{ >>> + eAVEncH264VProfile_unknown = 0, >>> + eAVEncH264VProfile_Simple = 66, >>> + eAVEncH264VProfile_Base = 66, >>> + eAVEncH264VProfile_Main = 77, >>> + eAVEncH264VProfile_High = 100, >>> + eAVEncH264VProfile_422 
= 122, >>> + eAVEncH264VProfile_High10 = 110, >>> + eAVEncH264VProfile_444 = 244, >>> + eAVEncH264VProfile_Extended = 88, >>> +}; >>> + >>> +enum eAVEncH264VLevel >>> +{ >>> + eAVEncH264VLevel1 = 10, >>> + eAVEncH264VLevel1_b = 11, >>> + eAVEncH264VLevel1_1 = 11, >>> + eAVEncH264VLevel1_2 = 12, >>> + eAVEncH264VLevel1_3 = 13, >>> + eAVEncH264VLevel2 = 20, >>> + eAVEncH264VLevel2_1 = 21, >>> + eAVEncH264VLevel2_2 = 22, >>> + eAVEncH264VLevel3 = 30, >>> + eAVEncH264VLevel3_1 = 31, >>> + eAVEncH264VLevel3_2 = 32, >>> + eAVEncH264VLevel4 = 40, >>> + eAVEncH264VLevel4_1 = 41, >>> + eAVEncH264VLevel4_2 = 42, >>> + eAVEncH264VLevel5 = 50, >>> + eAVEncH264VLevel5_1 = 51, >>> + eAVEncH264VLevel5_2 = 52 >>> +}; >>> + >>> +#endif >>> \ No newline at end of file >>>