AddSourceFilter behavior - winapi

The following code correctly renders an MPG file without audio:
IBaseFilter *pRenderer;
CoCreateInstance(CLSID_VideoRenderer, NULL, CLSCTX_INPROC_SERVER, IID_PPV_ARGS(&pRenderer));
IFileSourceFilter *pSourceFilter;
IBaseFilter *pBaseFilter;
CoCreateInstance(CLSID_AsyncReader, NULL, CLSCTX_INPROC_SERVER, IID_PPV_ARGS(&pSourceFilter));
pSourceFilter->QueryInterface(IID_PPV_ARGS(&pBaseFilter));
pGraphBuilder->AddFilter(pRenderer, L"Renderer Filter");
pSourceFilter->Load(filename, NULL);
pGraphBuilder->AddFilter(pBaseFilter, L"File Source Filter");
But it fails with a WMV file that has audio. The failure happens at the following call, where I connect the only output of the video source to the only input of the video renderer:
pGraphBuilder->Connect(pOutPin[0], pInPin[0])
Which returns -2147220969. If I replace the code above with the following:
IBaseFilter *pRenderer;
CoCreateInstance(CLSID_VideoRenderer, NULL, CLSCTX_INPROC_SERVER, IID_PPV_ARGS(&pRenderer));
IBaseFilter *pBaseFilter;
pGraphBuilder->AddSourceFilter(filename, L"Source Filter", &pBaseFilter);
pGraphBuilder->AddFilter(pRenderer, L"Renderer Filter");
then the MPG plays fine with:
pGraphBuilder->Connect(pOutPin[0], pInPin[0])
while the WMV gives the same error as above; it does play (upside down), however, with:
pGraphBuilder->Connect(pOutPin[1], pInPin[0])
All of this suggests that the second coding style creates a source with two output pins, with audio probably mapped to the first one. Or perhaps DirectShow automatically inserts an A/V splitter.
My understanding is that AddSourceFilter can create a splitter transparently. Is it correct?
If I want to do it manually, which component should I use?
Why does the WMV video render upside down?

Which returns -2147220969
That is 0x80040217, VFW_E_CANNOT_CONNECT: "No combination of intermediate filters could be found to make the connection."
This is the result of your manually adding CLSID_AsyncReader: Windows Media files are typically rendered through another source filter (use GraphEdit from the Windows SDK to render the file and you will be able to inspect the topology).
My understanding is that AddSourceFilter can create a splitter transparently. Is it correct?
Yes, provided the splitter is compatible with the Async Reader, which is not the case here.
If I want to do it manually, which component should I use?
Use GraphEdit to create topologies interactively and you will have an idea what to do in code.
Why does the WMV video render upside down?
Because of the topology. Most likely you have a weird combination of filters in the pipeline, including third-party ones. Inspecting the effective topology is the key to resolving the problem.
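If you want to inspect the resulting topology from code rather than in GraphEdit, here is a small sketch (assuming pGraphBuilder is the question's IGraphBuilder and the graph has already been built) that lists the filters actually present in the graph:

// Sketch: dump the filters that ended up in the graph (needs <dshow.h>, <stdio.h>).
IEnumFilters *pEnum = NULL;
if (SUCCEEDED(pGraphBuilder->EnumFilters(&pEnum)))
{
    IBaseFilter *pFilter = NULL;
    while (pEnum->Next(1, &pFilter, NULL) == S_OK)
    {
        FILTER_INFO info;
        if (SUCCEEDED(pFilter->QueryFilterInfo(&info)))
        {
            wprintf(L"Filter: %s\n", info.achName);
            if (info.pGraph)
                info.pGraph->Release();   // QueryFilterInfo AddRef's the graph pointer
        }
        pFilter->Release();
    }
    pEnum->Release();
}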

Use pGraphBuilder->AddSourceFilter() to add the source filter for a specific file. Don't assume that the File Source (Async) is the right source filter (for some formats, the source and demux are combined into a single filter).
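For reference, a minimal sketch of that approach (assuming pGraphBuilder and filename from the question; error handling trimmed), letting Intelligent Connect insert the splitter and decoders instead of connecting pins by hand:

// Sketch: let the graph pick the right source filter, then render each output pin.
IBaseFilter *pSource = NULL;
HRESULT hr = pGraphBuilder->AddSourceFilter(filename, L"Source", &pSource);
if (SUCCEEDED(hr))
{
    IEnumPins *pEnumPins = NULL;
    if (SUCCEEDED(pSource->EnumPins(&pEnumPins)))
    {
        IPin *pPin = NULL;
        while (pEnumPins->Next(1, &pPin, NULL) == S_OK)
        {
            PIN_DIRECTION dir;
            pPin->QueryDirection(&dir);
            if (dir == PINDIR_OUTPUT)
                pGraphBuilder->Render(pPin);   // Intelligent Connect inserts splitter/decoders
            pPin->Release();
        }
        pEnumPins->Release();
    }
    pSource->Release();
}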

Related

Resize MFT Issues: Video Composition in Windows Media Foundation

I'm trying to do composition with two separate video sources in Media Foundation. I am attempting to encode a video with a video overlay. To do so I am attempting to use the Video Resizer on the smaller input.
I've seen several threads on this, but I thought I'd ask around in any case.
Basically the idea is to create two source readers and a sink writer. The source files are h264, so I use the reader to decode into YUY2. While processing samples, I send the appropriate sample to the Resize MFT, then down the line (I haven't made it this far) I combine the two images to create the overlay effect with MFCopyImage.
My problem is that I get E_INVALIDARG when I call ProcessInput on the Resize MFT.
To initialize the MFT, I give it the appropriate type from the reader via SetInputType. After that I set all the appropriate properties via the property store, and then update the frame size on the MFT's output type. I have read the documentation and modeled my implementation on the MFT Processing Model.
None of these steps raise any red flags until I actually attempt to use ProcessInput.
Although I have limited experience in Windows Media Foundation, I have been able to use the Framerate DSP with success. I would appreciate any advice.
Thank you!
For anyone else stuck in a similar situation, I ended up not using the Resizer MFT but the Video Processor MFT which worked with much less effort.
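For reference, here is a rough sketch of the Video Processor MFT route (CLSID_VideoProcessorMFT, available on Windows 8 and later). The inWidth/inHeight/outWidth/outHeight values are placeholders; in practice you would clone the source reader's current media type and only change MF_MT_FRAME_SIZE:

// Sketch: create the Video Processor MFT and ask it to scale YUY2 frames.
// Needs <mfapi.h>, <mfidl.h>, <mftransform.h>; COM and Media Foundation already started.
UINT32 inWidth = 1920, inHeight = 1080, outWidth = 640, outHeight = 360;

IMFTransform *pVP = NULL;
HRESULT hr = CoCreateInstance(CLSID_VideoProcessorMFT, NULL, CLSCTX_INPROC_SERVER,
                              IID_PPV_ARGS(&pVP));

IMFMediaType *pInType = NULL;                 // typically the reader's current media type
MFCreateMediaType(&pInType);
pInType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
pInType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_YUY2);
pInType->SetUINT32(MF_MT_INTERLACE_MODE, MFVideoInterlace_Progressive);
MFSetAttributeSize(pInType, MF_MT_FRAME_SIZE, inWidth, inHeight);
hr = pVP->SetInputType(0, pInType, 0);

IMFMediaType *pOutType = NULL;                // same format, smaller frame size
MFCreateMediaType(&pOutType);
pOutType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
pOutType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_YUY2);
pOutType->SetUINT32(MF_MT_INTERLACE_MODE, MFVideoInterlace_Progressive);
MFSetAttributeSize(pOutType, MF_MT_FRAME_SIZE, outWidth, outHeight);
hr = pVP->SetOutputType(0, pOutType, 0);

// Then drive it per the usual MFT processing model:
// ProcessInput(0, pSample, 0) followed by ProcessOutput(...).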

WMV encoding using Media Foundation: specifying "Number of B Frames"

I am encoding video to WMV using Media Foundation SDK. I see that the number of B frames can be set using a property, but I have no clue how/where to set it.
That property is called MFPKEY_NUMBFRAMES and is described here:
http://msdn.microsoft.com/en-us/library/windows/desktop/ff819354%28v=vs.85%29.aspx
Our code does roughly the following:
call MFStartup
call MFCreateAttributes once so we can set muxer, video and audio attributes
configure the IMFAttributes created in the previous step, for example by setting the video bitrate: pVideoOverrides->SetUINT32(MF_MT_AVG_BITRATE, m_iVideoBitrateBPS);
create sink writer by calling IMFReadWriteClassFactory::CreateInstanceFromURL
for each frame, call WriteSample on the sink writer
call MFShutdown
Am I supposed to set the b-frames property on the IMFAttribute on which I also set the video bitrate?
The property is applicable to the Windows Media Video 9 Encoder. That is, you need to locate that filter in your topology and adjust the property there. Other topology elements (e.g. the multiplexer) might accept other properties, but this one has no effect there.
MSDN gives you step-by-step instructions in Configuring a WMV Encoder, where it says:
To specify the target bitrate, set the MF_MT_AVG_BITRATE attribute on the media type.
You can alter other encoder properties the same way. There is also the detailed step-by-step Tutorial: 1-Pass Windows Media Encoding, which walks through the entire process.
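If you stay with the sink writer rather than building a topology by hand, one option is to hand the encoder a property store through the MF_SINK_WRITER_ENCODER_CONFIG attribute when creating the writer. A rough sketch, assuming the WMV encoder honors configuration delivered this way (the VT_I4 value type is an assumption; check the MFPKEY_NUMBFRAMES documentation):

// Sketch: pass WMV encoder properties to the sink writer via a property store.
// Needs <propsys.h>, <propvarutil.h>, <wmcodecdsp.h>, <mfreadwrite.h>.
IPropertyStore *pProps = NULL;
HRESULT hr = PSCreateMemoryPropertyStore(IID_PPV_ARGS(&pProps));

PROPVARIANT var;
InitPropVariantFromInt32(2, &var);             // e.g. two B-frames; VT_I4 assumed
hr = pProps->SetValue(MFPKEY_NUMBFRAMES, var);
PropVariantClear(&var);

IMFAttributes *pWriterAttrs = NULL;
MFCreateAttributes(&pWriterAttrs, 1);
pWriterAttrs->SetUnknown(MF_SINK_WRITER_ENCODER_CONFIG, pProps);

// Pass pWriterAttrs as the attributes argument when creating the sink writer,
// e.g. via MFCreateSinkWriterFromURL or IMFReadWriteClassFactory::CreateInstanceFromURL.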

DirectShow - How to read a file from a source filter

I'm writing a DirectShow source filter which is registered as a CLSID_VideoInputDeviceCategory, so it can be seen as a Video Capture Device (from Skype, for example, it is viewed as another WebCam).
My source filter is based on the VCam example from here, and, for now, the filter produces exactly the same output as that example (random colored pixels with one video output pin, no audio yet), all implemented in the FillBuffer() method of the one and only output pin.
Now the real scenario will be a bit more tricky - the filter uses a file handle to a hardware device, opened using the CreateFile() API call (opening the device is out of my control, and is done by a 3rd-party library). It should then read chunks of data from this handle (usually 256-512 byte chunks).
The device is a WinUSB device and the 3rd-party framework just "gives" me an opened file handle to read chunks from.
The data read by the filter is a *.mp4 file, which is streamed from the device to the "handle".
This scenario is equivalent to a source filter reading from a *.mp4 file on the disk (in "chunks") and pushing its data to the DirectShow graph, but without the ability to read the file entirely from start to end, so the file size is unknown (Correct?).
I'm pretty new to DirectShow and I feel as though I'm missing some basic concepts. I'll be happy if anyone can direct me to solutions\resources\explanations for the following questions:
1) From various sources on the web and Microsoft SDK (v7.1) samples, I understood that for an application (such as Skype) to build a correct and valid DirectShow graph (so it will render the video and audio successfully), the source filter pin (which inherits from CSourceStream) should implement the method "GetMediaType". Depending on the value returned from this function, an application will be able to build the correct graph to render the data, i.e. the correct order of filters. If this is correct - how would I implement it in my case so that the graph will be built to render *.mp4 input in chunks (we can assume constant chunk sizes)?
2) I've noticed that the FillBuffer() method is supposed to call SetTime() for the IMediaSample object it gets (and fills). I'm reading raw *.mp4 data from the device. Will I have to parse the data and extract the frames and time values from the stream? If yes - an example would be great.
3) Will I have to split the data received from the file handle (the "chunks") into video and audio, or can the data be pushed to the graph without the need to manipulate it in the source filter? If a split is needed - how can it be done (the data is not continuous, and is split into chunks) and will this affect the desired implementation of "GetMediaType"?
Please feel free to correct me if I'm using incorrect terminology.
Thanks :-)
This is a good question. On the one hand this is doable, but there are some specifics involved.
First of all, your filter registered under the CLSID_VideoInputDeviceCategory category is expected to behave as a live video source. By doing so you make it discoverable by applications (such as Skype, as you mentioned), and those applications will attempt to configure the video resolution; they expect video to flow at a real-time rate, and some applications (such as Skype) do not expect compressed video such as H.264 there, or would simply reject such a device. Nor can you attach audio directly to this filter, as applications would not even look for audio there (not sure whether you have audio in your filter, but you mentioned an .MP4 file, so audio might be there).
On your questions:
1 - You would have a better picture of the application's requirements by checking which interface methods applications call on your filter. Most of the methods are implemented by the BaseClasses, which convert the calls into internal methods such as GetMediaType. Yes, you need to implement it, and by doing so you will - among other things - enable your filter to connect to downstream filter pins by trying the specific media types you support.
Again, those cannot be MP4 chunks, even if such an approach can work in other DirectShow graphs. When implementing a video capture device you should be delivering exactly that: video frames, preferably uncompressed (they could be compressed too, but you are going to run into compatibility issues with applications immediately).
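As a rough illustration of such a GetMediaType, loosely patterned on the VCam/PushSource samples and using the DirectShow base classes (CMyStream, the 640x480 size and 30 fps are placeholder values):

// Sketch: advertise one uncompressed video format on the output pin.
HRESULT CMyStream::GetMediaType(CMediaType *pMediaType)
{
    CheckPointer(pMediaType, E_POINTER);
    CAutoLock lock(m_pLock);                          // filter lock, as in the SDK samples

    VIDEOINFOHEADER *pvi = (VIDEOINFOHEADER *)
        pMediaType->AllocFormatBuffer(sizeof(VIDEOINFOHEADER));
    if (pvi == NULL)
        return E_OUTOFMEMORY;
    ZeroMemory(pvi, sizeof(VIDEOINFOHEADER));

    pvi->bmiHeader.biSize        = sizeof(BITMAPINFOHEADER);
    pvi->bmiHeader.biWidth       = 640;               // placeholder resolution
    pvi->bmiHeader.biHeight      = 480;
    pvi->bmiHeader.biPlanes      = 1;
    pvi->bmiHeader.biBitCount    = 24;                // uncompressed RGB24
    pvi->bmiHeader.biCompression = BI_RGB;
    pvi->bmiHeader.biSizeImage   = GetBitmapSize(&pvi->bmiHeader);
    pvi->AvgTimePerFrame         = UNITS / 30;        // ~30 fps

    pMediaType->SetType(&MEDIATYPE_Video);
    pMediaType->SetSubtype(&MEDIASUBTYPE_RGB24);
    pMediaType->SetFormatType(&FORMAT_VideoInfo);
    pMediaType->SetTemporalCompression(FALSE);
    pMediaType->SetSampleSize(pvi->bmiHeader.biSizeImage);
    return S_OK;
}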
A solution you might be thinking of is to embed a fully featured graph internally, into which you inject your MP4 chunks; that pipeline then parses them, decodes them and delivers them to your custom renderer, from which you take the frames and re-expose them off your virtual device. This might be a good design, though it assumes a certain understanding of how filters work internally.
2 - Your device is typically treated as, and expected to be, a live source, which means that you deliver video in real time and frames are not necessarily time stamped. You can still put times there, and yes, you definitely need to extract time stamps from your original media (or have that done by the internal graph mentioned in item 1 above); however, be prepared for applications to strip time stamps, especially for preview purposes, since the source is "live".
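The time stamping itself usually looks roughly like this inside FillBuffer (a sketch; pSample is FillBuffer's IMediaSample argument, and m_iFrameNumber / m_rtFrameLength are assumed member variables tracking the frame count and per-frame duration in 100 ns units):

// Sketch: stamp each delivered sample with a start/stop time.
REFERENCE_TIME rtStart = m_iFrameNumber * m_rtFrameLength;
REFERENCE_TIME rtStop  = rtStart + m_rtFrameLength;
pSample->SetTime(&rtStart, &rtStop);
pSample->SetSyncPoint(TRUE);       // every uncompressed frame is a key frame
m_iFrameNumber++;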
3 - Getting back to audio: you cannot implement audio on the same virtual device. Well, you can, and such a filter might even work in a custom-built graph, but it is not going to work with applications. They will be looking for a separate audio device, and if you implement one, they will instantiate it separately. So you are expected to implement both a virtual video and a virtual audio source, and implement the synchronization between them behind the scenes. This is where time stamps become important: by providing them correctly you will keep the live session lip-synced with what was originally on the media file you are streaming from.

Can I get raw video frames from DirectShow without playback

I'm working on a media player using Media Foundation. I want to support VOB file playback. However, Media Foundation currently does not support the VOB container, so I wish to use DirectShow for that.
My idea here is not to take an alternate path using a DirectShow graph, but just to grab a video frame and pass it to the same pipeline in Media Foundation. In Media Foundation, I have an 'IMFSourceReader' which simply reads frames from the video file. Is there a DirectShow equivalent which just gives me the frames, without needing to create a graph, start a playback cycle, and then try to extract frames from the renderer pin? (To be clearer: does DirectShow support an architecture wherein it could give me raw frames without actually having to play the video?)
I've read about ISampleGrabber, but it's deprecated and I think it won't fit my architecture. I've not worked with DirectShow before.
Thanks,
Mots
You have to build a graph and accept frames from the respective parser/demultiplexer filter, which will read the container and deliver individual frames on its output.
The playback does not have to be real time, nor do you need to fake painting those video frames somewhere. Once you get the data you need in a Sample Grabber filter, or a custom filter, you can terminate the pipeline with a Null Renderer. That is, you can arrange to get the frames you need in a more or less convenient way.
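A rough sketch of such a graph, using the (deprecated, qedit.h) Sample Grabber and the Null Renderer; pin lookup and error handling are omitted, and "movie.vob" is a placeholder path:

// Sketch: media source -> (demux/decoder inserted by Connect) -> Sample Grabber -> Null Renderer.
IGraphBuilder *pGraph = NULL;
CoCreateInstance(CLSID_FilterGraph, NULL, CLSCTX_INPROC_SERVER, IID_PPV_ARGS(&pGraph));

IBaseFilter *pSource = NULL;
pGraph->AddSourceFilter(L"movie.vob", L"Source", &pSource);

IBaseFilter *pGrabberF = NULL;
CoCreateInstance(CLSID_SampleGrabber, NULL, CLSCTX_INPROC_SERVER, IID_PPV_ARGS(&pGrabberF));
pGraph->AddFilter(pGrabberF, L"Sample Grabber");

ISampleGrabber *pGrabber = NULL;
pGrabberF->QueryInterface(IID_PPV_ARGS(&pGrabber));
AM_MEDIA_TYPE mt = {};
mt.majortype = MEDIATYPE_Video;
mt.subtype   = MEDIASUBTYPE_RGB24;             // ask for uncompressed frames
pGrabber->SetMediaType(&mt);
pGrabber->SetBufferSamples(TRUE);              // or SetCallback(...) for push-style delivery

IBaseFilter *pNull = NULL;
CoCreateInstance(CLSID_NullRenderer, NULL, CLSCTX_INPROC_SERVER, IID_PPV_ARGS(&pNull));
pGraph->AddFilter(pNull, L"Null Renderer");

// Pin lookup omitted for brevity: connect the source's video output to the grabber's
// input with pGraph->Connect(...) so Intelligent Connect inserts the demux/decoder,
// then connect the grabber's output to the Null Renderer, and run the graph.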
You can use the Monogram frame grabber filter to connect to the VOB DirectShow filter's output - it works great. See the comments there for how to connect the output to an external application.

How do I implement a DShow filter for reading specialized AVI file

I'm trying to write a DirectShow filter which reads a file that contains some XML data at the beginning and AVI video after it. I plan to open the file in the filter, skip the XML data and begin playback. I found in the Windows SDK an example which plays a BMP file (Microsoft SDKs\Windows\v7.1\Samples\multimedia\directshow\filters\pushsource). Where can I find out how to read AVI frames, convert them and push them to an output pin?
Sorry for my English.
You can find the AVI file specs here. But there is an easier solution: use the standard AVI Splitter filter, which is part of DirectShow. Just take another sample filter from the SDK - Async - and make it read your XML data and then act as a regular file source, but reading data from your file at a shifted offset. This way all the parsing work is done by the AVI Splitter, and all your filter needs to do is read the parts of the file that the Splitter requests (a rough sketch of the offset idea follows below).
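A rough sketch of that shifted-offset idea, assuming the CAsyncStream interface from the SDK Async sample (method names and signatures are taken from that sample and may differ slightly between SDK versions; the reader is expected to call Lock()/Unlock() around SetPointer/Read):

// Sketch: an Async-sample stream that hides an XML header of m_xmlSize bytes,
// so the downstream AVI Splitter sees a file that starts at the AVI data.
class CSkipHeaderStream : public CAsyncStream
{
public:
    CSkipHeaderStream(HANDLE hFile, LONGLONG xmlSize)
        : m_hFile(hFile), m_xmlSize(xmlSize), m_pos(0) {}

    HRESULT SetPointer(LONGLONG llPos)
    {
        m_pos = llPos;                        // position relative to the AVI data
        return S_OK;
    }

    HRESULT Read(PBYTE pbBuffer, DWORD dwBytesToRead, BOOL bAlign, LPDWORD pdwBytesRead)
    {
        // bAlign is ignored in this sketch; every read is shifted past the XML header.
        LARGE_INTEGER li;
        li.QuadPart = m_pos + m_xmlSize;
        SetFilePointerEx(m_hFile, li, NULL, FILE_BEGIN);
        BOOL ok = ReadFile(m_hFile, pbBuffer, dwBytesToRead, pdwBytesRead, NULL);
        m_pos += *pdwBytesRead;
        return ok ? S_OK : HRESULT_FROM_WIN32(GetLastError());
    }

    LONGLONG Size(LONGLONG *pSizeAvailable)
    {
        LARGE_INTEGER size;
        GetFileSizeEx(m_hFile, &size);
        LONGLONG aviSize = size.QuadPart - m_xmlSize;   // report only the AVI part
        if (pSizeAvailable) *pSizeAvailable = aviSize;
        return aviSize;
    }

    DWORD Alignment() { return 1; }
    void  Lock()      { m_lock.Lock(); }
    void  Unlock()    { m_lock.Unlock(); }

private:
    HANDLE   m_hFile;
    LONGLONG m_xmlSize;
    LONGLONG m_pos;
    CCritSec m_lock;
};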

Resources