How does Windows Media Foundation Media Source or Source Reader handle overrun?

I've implemented a UVC video viewing application using the Source Reader in async mode (OnReadSample()). The connected camera produces raw10 frames, and the application can display just the raw images or perform additional processing (within the OnReadSample() callback) and display the generated output as well (i.e., two viewers). The two images are displayed correctly, with the exception of a camera-to-display lag caused by the additional processing time being greater than the frame period (1/FPS); a minimal sketch of this callback flow is shown below.
How does the Media Source handle an overrun scenario? My understanding (please correct me if I'm wrong) is that new IMFSample objects (i.e., image containers) are created and queued, but I've yet to find information on what happens when the queue depth is reached.
Can the Media Source queue depth be set to a particular number?
Some additional system details:
Win 10
Direct3D9
Thanks,
Steve.
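As referenced above, here is a minimal C++ sketch of that async Source Reader callback flow (illustrative only, not the asker's code). ProcessRaw10() and Present() are hypothetical placeholders for the processing and display steps, and IUnknown reference counting is stubbed out. It also shows the relevant pacing detail: in async mode the reader delivers one OnReadSample() call per ReadSample() request, so the next frame is only requested once the callback asks for it.

#include <mfapi.h>
#include <mfidl.h>
#include <mfreadwrite.h>

class ReaderCallback : public IMFSourceReaderCallback
{
public:
    explicit ReaderCallback(IMFSourceReader* reader) : m_reader(reader) {}

    // Invoked on a Media Foundation work-queue thread, once per ReadSample() request.
    STDMETHODIMP OnReadSample(HRESULT hrStatus, DWORD /*streamIndex*/,
                              DWORD /*streamFlags*/, LONGLONG /*timestamp*/,
                              IMFSample* sample) override
    {
        if (SUCCEEDED(hrStatus) && sample)
        {
            ProcessRaw10(sample);   // hypothetical: raw10 -> displayable image
            Present(sample);        // hypothetical: hand off to the viewer(s)
        }
        // Request the next frame; no further OnReadSample() calls arrive until we do.
        return m_reader->ReadSample(static_cast<DWORD>(MF_SOURCE_READER_FIRST_VIDEO_STREAM),
                                    0, nullptr, nullptr, nullptr, nullptr);
    }

    STDMETHODIMP OnFlush(DWORD) override { return S_OK; }
    STDMETHODIMP OnEvent(DWORD, IMFMediaEvent*) override { return S_OK; }

    // IUnknown; real code needs proper reference counting.
    STDMETHODIMP QueryInterface(REFIID riid, void** ppv) override
    {
        if (riid == IID_IUnknown || riid == __uuidof(IMFSourceReaderCallback))
        {
            *ppv = static_cast<IMFSourceReaderCallback*>(this);
            return S_OK;
        }
        *ppv = nullptr;
        return E_NOINTERFACE;
    }
    STDMETHODIMP_(ULONG) AddRef() override { return 1; }
    STDMETHODIMP_(ULONG) Release() override { return 1; }

private:
    void ProcessRaw10(IMFSample*) { /* additional processing */ }
    void Present(IMFSample*) { /* display */ }

    IMFSourceReader* m_reader; // not owned in this sketch
};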

Related

IMFMediaEngine duplicate player surface

How to use IMFMediaEngine for playing one video simultaneously in two areas or windows?
The IMFMediaEngineClassFactory::CreateInstance method supports a frame-server mode and a rendering mode. Rendering mode creates a single video output bound to a window HWND or a DirectComposition visual.
Does that mean I need to use frame-server mode? If so, how do I use it to produce two outputs? I also need asynchronous output, so the video won't be interrupted by the main thread.
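For illustration, here is a hedged C++ sketch of the frame-server path: when the engine is created without the MF_MEDIA_ENGINE_PLAYBACK_HWND or MF_MEDIA_ENGINE_PLAYBACK_VISUAL attribute it runs in frame-server mode, and the application pulls each frame itself with TransferVideoFrame(), so one frame can be copied into two Direct3D 11 textures and presented in two windows or areas. Engine and D3D11 setup are assumed to exist already; the function and parameter names are placeholders and error handling is trimmed.

#include <d3d11.h>
#include <mfapi.h>
#include <mfmediaengine.h>

// Called on a timer (e.g. once per vsync). Copies the current video frame,
// if a new one is ready, into two destination textures.
void DrawVideoToTwoSurfaces(IMFMediaEngine* engine,
                            ID3D11Texture2D* target1,  // first viewer's texture
                            ID3D11Texture2D* target2,  // second viewer's texture
                            UINT width, UINT height)
{
    LONGLONG pts = 0;
    // S_OK means a new frame is available at presentation time 'pts'.
    if (engine->OnVideoStreamTick(&pts) != S_OK)
        return;

    RECT dst = { 0, 0, static_cast<LONG>(width), static_cast<LONG>(height) };
    MFARGB black = { 0, 0, 0, 255 };

    // In frame-server mode the application owns presentation, so the same
    // frame can be transferred to as many surfaces as needed (here, two).
    engine->TransferVideoFrame(target1, nullptr, &dst, &black);
    engine->TransferVideoFrame(target2, nullptr, &dst, &black);
}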

Interesting behavior in Media Source Extensions

I'm trying to build a fairly standard video player using Media Source Extensions; however, I want the user to be able to control when the player moves on to a new video segment. For example, we might see the following behavior:
Video player plays 1st segment
Source Buffer runs out of data, causing the video to appear paused
When the user is ready, they click a button that adds the 2nd segment to the Source Buffer
The video continues by playing the 2nd segment
This works well, except that when the video appears paused during step 2 it doesn't stop at the last frame of the 1st segment. Instead, it stops two frames before the end of the 1st segment. Those last two frames aren't being dropped; they just get played after the user clicks the button to advance the video. This is an issue for my application, and I'm trying to figure out a way to make sure all of the frames from the 1st segment get played before the end of step 2.
I suspect that these last two frames are getting held up in the video decoder buffer, especially since calling endOfStream() on my Media Source after adding the 1st segment to the Source Buffer causes the 1st segment to play all the way through with no frames left behind.
Additional Info
I created each video segment file from a series of PNGs using the following ffmpeg command
ffmpeg -i %04d.png -movflags frag_keyframe+empty_moov+default_base_moof video_segment.mp4
Maybe this is a clue? End of stream situations not handled correctly (last frames are dropped)
Another interesting thing to note is that if the video has 2 frames or fewer, MSE doesn't play it at all.
The browser I'm using is Chrome. The code for my MSE player is just taken from the Google Developers example, but I'll post it here for completeness. This code only covers up to step 2 since that's where the issue is.
<script>
  // Assumes the page contains a <video> element.
  const video = document.querySelector('video');
  const mediaSource = new MediaSource();
  video.src = URL.createObjectURL(mediaSource);
  mediaSource.addEventListener('sourceopen', sourceOpen, { once: true });

  function sourceOpen() {
    URL.revokeObjectURL(video.src);
    const sourceBuffer = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.64001f"');
    sourceBuffer.mode = 'sequence';
    // Fetch the 1st video segment and add it to the Source Buffer
    fetch('https://s3.amazonaws.com/bucket_name/video_file.mp4')
      .then(response => response.arrayBuffer())
      .then(data => sourceBuffer.appendBuffer(data));
  }
</script>
This behavior is browser-dependent. Let's start with a quote from the spec:
When the media element needs more data, the user agent SHOULD transition it from HAVE_ENOUGH_DATA to HAVE_FUTURE_DATA early enough for a web application to be able to respond without causing an interruption in playback. For example, transitioning when the current playback position is 500ms before the end of the buffered data gives the application roughly 500ms to append more data before playback stalls.
The behavior you are seeing is that an MSE-compatible browser is aware that the stream has not ended yet, but it is also aware that it is running out of data. It indicates the need for further data by changing its ready state; however, it has no obligation to play out every frame it already holds. It enters a buffering state based on the clock time of the current playback position versus the end of the available data.
Even though the link above says...
For example, in video this corresponds to the user agent having data from the current frame, but not the next frame
...actual implementations might interpret this differently and switch to HAVE_CURRENT_DATA a bit early, that is, while still holding a few more video frames, because they know it is not yet the end of the stream and further frames are missing. It is the sort of browser implementation detail you just have to live with.

How to reset topology for changed webcam resolution (WMF)

I have set up a WMF session (built an IMFTopology object with a source pointing to a webcam and a standard EVR for screen output), assigned it to an IMFMediaSession, and started a preview. All is working great.
Now, I stop the session (waiting for the actual stop), change the source's resolution (setting an appropriate IMFMediaType via its IMFMediaTypeHandler), and then build a new topology with that new source and a newly created IMFActivate object for the EVR. I also change the output window's size to match the new frame size.
When I start that new session there's no image (or the image is garbled, or cut off at the bottom, depending on the change in resolution). It is almost as if the new topology is trying to reuse the previously set-up EVR and it is not working correctly.
I tried setting that new media type on the EVR when generating a new one, tried to force the new window size on the EVR (via a call to SetWindowPos()), tried to get that output node by its previously assigned stream ID and set its preferred input format... Nothing worked; I get the same black (or garbled) image when I start the playback.
The only time the "new" session plays correctly is when I switch back to the original source format. Then it continues as if nothing bad happened.
Why is that? How do I fix this?
I'm not providing the source code, as there's no easy way to extract just the relevant parts. Generally, my code closely follows the sample from MSDN's article on creating a Media Session for playing back a file.
According to Microsoft's documentation, the IMFMediaSession manages starting and stopping the source, so I'm relying on that when changing the source's video format (otherwise the application fails).
If you want to build a truly new topology, you need to release all of the Media Foundation objects involved (source, sink, topology, and so on) and create them again.
If you don't, things can get a little complicated.
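To make that concrete, here is a hedged C++ sketch of the release-and-rebuild approach. CreateWebcamSource(), SetSourceMediaType(), and BuildPreviewTopology() are hypothetical helpers wrapping the usual enumeration and topology-building code from the MSDN sample, and the waits for MESessionStopped/MESessionTopologyStatus events are omitted.

#include <mfapi.h>
#include <mfidl.h>
#include <evr.h>
#include <wrl/client.h>

using Microsoft::WRL::ComPtr;

// Hypothetical helpers wrapping the usual MSDN-sample code:
HRESULT CreateWebcamSource(IMFMediaSource** ppSource);
HRESULT SetSourceMediaType(IMFMediaSource* pSource, IMFMediaType* pNewType);
HRESULT BuildPreviewTopology(IMFMediaSource* pSource, IMFActivate* pEvrActivate,
                             IMFTopology** ppTopology);

HRESULT RestartWithNewResolution(IMFMediaSession* session,
                                 ComPtr<IMFMediaSource>& source,  // in: old source, out: new source
                                 HWND videoWindow,
                                 IMFMediaType* newType)
{
    // 1. Stop and fully tear down the old presentation.
    session->Stop();                  // real code waits for MESessionStopped
    session->ClearTopologies();
    if (source) { source->Shutdown(); source.Reset(); }

    // 2. Recreate the source and apply the new capture format.
    HRESULT hr = CreateWebcamSource(&source);
    if (FAILED(hr)) return hr;
    hr = SetSourceMediaType(source.Get(), newType);   // uses IMFMediaTypeHandler internally
    if (FAILED(hr)) return hr;

    // 3. Build a brand-new topology with a brand-new EVR activate object.
    ComPtr<IMFActivate> evrActivate;
    hr = MFCreateVideoRendererActivate(videoWindow, &evrActivate);
    if (FAILED(hr)) return hr;

    ComPtr<IMFTopology> topology;
    hr = BuildPreviewTopology(source.Get(), evrActivate.Get(), &topology);
    if (FAILED(hr)) return hr;

    // 4. Hand the new topology to the session; it is resolved and started fresh.
    return session->SetTopology(0, topology.Get());
}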

How to create our own custom filter to get access to the stream using DirectShow

I have a question regarding media playback, which is as below. Please help me solve my problem:
I have an audio stream which is successfully compressed using DirectShow. Now, before it enters the rendering filter, I need to create my own custom filter so that I have full access to it, because by using the existing filters I am not able to get access to the data.
I have read about DirectShow. The Microsoft DirectShow application programming interface (API) is a media-streaming architecture for Microsoft Windows. Using DirectShow, your applications can perform high-quality video and audio playback or capture.
The DirectShow headers, libraries, SDK tools, and samples are available in the Windows SDK.
Please advise.
The Windows SDK also offers samples. The Gargle filter sample in \Samples\multimedia\directshow\filters\gargle is close to what you need: a mid-point audio filter with full control over the streamed data. A rough sketch of the same pattern follows the sample's summary comment below.
// Summary
//
// A simple, in-place transform, audio effect which modifies the data
// in the samples that pass through it. The effect is an amplitude
// modulation with a synthesised secondary wave function.
// The secondary wave can be a triangular or square wave. A properties
// sheet allows the shape and frequency of the secondary wave to be chosen.
//
// At low modulation frequencies it sounds like a tremolo, at higher
// modulation frequencies it sounds like a distortion, adding extra
// frequencies above and below the original unmodulated sound.
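As referenced above, here is a hedged sketch of the same pattern: a minimal in-place transform filter built on the DirectShow base classes (streams.h). The class name and CLSID are placeholders, and the COM registration boilerplate (factory template, DLL exports, property page) is omitted; see the Gargle sample for the complete version.

#include <streams.h>   // DirectShow base classes
#include <initguid.h>

// Placeholder CLSID; generate a fresh GUID for a real filter.
DEFINE_GUID(CLSID_MyAudioTap,
    0xb5f8f2a1, 0x1111, 0x2222, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88, 0x99, 0xaa);

class CMyAudioTap : public CTransInPlaceFilter
{
public:
    CMyAudioTap(LPUNKNOWN pUnk, HRESULT* phr)
        : CTransInPlaceFilter(NAME("My Audio Tap"), pUnk, CLSID_MyAudioTap, phr) {}

    // Accept only uncompressed PCM audio arriving from the upstream decoder.
    HRESULT CheckInputType(const CMediaType* mtIn) override
    {
        if (*mtIn->Type() != MEDIATYPE_Audio || *mtIn->FormatType() != FORMAT_WaveFormatEx)
            return VFW_E_TYPE_NOT_ACCEPTED;
        return S_OK;
    }

    // Every buffer on its way to the renderer passes through here,
    // giving the application full access to the audio data.
    HRESULT Transform(IMediaSample* pSample) override
    {
        BYTE* pData = nullptr;
        HRESULT hr = pSample->GetPointer(&pData);
        if (FAILED(hr))
            return hr;
        const long cbData = pSample->GetActualDataLength();
        // Inspect, copy out, or modify cbData bytes of audio here.
        (void)pData; (void)cbData;
        return S_OK;
    }
};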

Delay in AUGraph callback

We are developing a music player app for OS X Lion (10.7), which applies different audio effects to a selected music file.
We have used the Audio Unit and AUGraph APIs to achieve this.
However, after connecting all the audio unit nodes, when we call AUGraphStart(mGraph) the graph takes around 1 second to invoke the first I/O callback.
Because of this, there is a slight delay at the beginning of playback.
How can we avoid this delay? Could anyone provide any input to help us solve this issue?
One solution is to start the audio graph running before displaying any UI that the user could use to start playback. Since the audio units will then be running, you could fill any audio output buffers with silence before the appropriate UI event. If the buffers are small/short, the latency from any UI event till an output buffer is filled may be small enough to be below normal human perception.
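A hedged C++ sketch of that idea against the Core Audio render callback API: the AUGraph is started at application launch, and the render callback emits silence until a hypothetical gPlaybackArmed flag is set when the user actually presses play.

#include <AudioToolbox/AudioToolbox.h>
#include <atomic>
#include <cstring>

// Flipped by the UI when the user presses play (hypothetical flag).
static std::atomic<bool> gPlaybackArmed{false};

// Render callback installed on the graph's output unit
// (e.g. via kAudioUnitProperty_SetRenderCallback or AUGraphSetNodeInputCallback).
static OSStatus RenderCallback(void* /*inRefCon*/,
                               AudioUnitRenderActionFlags* ioActionFlags,
                               const AudioTimeStamp* /*inTimeStamp*/,
                               UInt32 /*inBusNumber*/,
                               UInt32 /*inNumberFrames*/,
                               AudioBufferList* ioData)
{
    if (!gPlaybackArmed.load(std::memory_order_relaxed)) {
        // AUGraphStart(mGraph) was already called at launch, so its ~1 s
        // warm-up is paid before the user presses play. Until then, output silence.
        for (UInt32 i = 0; i < ioData->mNumberBuffers; ++i)
            std::memset(ioData->mBuffers[i].mData, 0, ioData->mBuffers[i].mDataByteSize);
        *ioActionFlags |= kAudioUnitRenderAction_OutputIsSilence;
        return noErr;
    }
    // Normal path: render the decoded music with the selected effects applied.
    return noErr;
}

// When the user presses play: gPlaybackArmed.store(true);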
