I'm trying to build a fairly standard video player using Media Source Extensions; however, I want the user to be able to control when the player moves on to a new video segment. For example, we might see the following behavior:
Video player plays 1st segment
Source Buffer runs out of data causing the video to appear paused
When the user is ready, they click a button that adds the 2nd segment to the Source Buffer
The video continues by playing the 2nd segment
This works well, except that when the video appears paused during step 2, it doesn't stop at the last frame of the 1st segment. Instead, it stops two frames before the end of the 1st segment. Those last two frames aren't being dropped; they just get played after the user clicks the button to advance the video. This is an issue for my application, and I'm trying to figure out a way to make sure all of the frames from the 1st segment get played before the end of step 2.
I suspect that these last two frames are getting held up in the video decoder buffer. Especially since calling endOfStream() on my Media Source after adding the 1st segment to the Source Buffer causes the 1st segment to play all the way through with no frames left behind.
Additional Info
I created each video segment file from a series of PNGs using the following ffmpeg command
ffmpeg -i %04d.png -movflags frag_keyframe+empty_moov+default_base_moof video_segment.mp4
Maybe this is a clue? End of stream situations not handled correctly (last frames are dropped)
Another interesting thing to note is that if the video has only two frames or fewer, MSE doesn't play it at all.
The browser I'm using is Chrome. The code for my MSE player is just taken from the Google Developers example, but I'll post it here for completeness. This code only covers up to step 2 since that's where the issue is.
<script>
const video = document.querySelector('video');
const mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);
mediaSource.addEventListener('sourceopen', sourceOpen, { once: true });

function sourceOpen() {
  URL.revokeObjectURL(video.src);
  const sourceBuffer = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.64001f"');
  sourceBuffer.mode = 'sequence';
  // Fetch the video and add it to the Source Buffer
  fetch('https://s3.amazonaws.com/bucket_name/video_file.mp4')
    .then(response => response.arrayBuffer())
    .then(data => sourceBuffer.appendBuffer(data));
}
</script>
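For completeness, here is a rough, untested sketch of what steps 3-5 might look like. The button element (nextButton) and the second segment's file name are hypothetical placeholders, and it assumes sourceBuffer is stored somewhere reachable from the click handler rather than being local to sourceOpen() as above.
// Hypothetical continuation (steps 3-5): append the 2nd segment when the user clicks.
// 'nextButton' and the segment URL are placeholders, not part of the original code.
nextButton.addEventListener('click', () => {
  fetch('https://s3.amazonaws.com/bucket_name/video_segment_2.mp4')
    .then(response => response.arrayBuffer())
    .then(data => {
      // appendBuffer() is asynchronous; wait for 'updateend' before appending more data.
      sourceBuffer.appendBuffer(data);
    });
});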
This works well, except that when the video appears paused during step 2 it doesn't stop at the last frame of the 1st segment. Instead, it stops two frames before the end of the 1st segment. Those last two frames aren't being dropped, they just get played after...
This behavior is browser-dependent. Let's start with a quote from the spec:
When the media element needs more data, the user agent SHOULD transition it from HAVE_ENOUGH_DATA to HAVE_FUTURE_DATA early enough for a web application to be able to respond without causing an interruption in playback. For example, transitioning when the current playback position is 500ms before the end of the buffered data gives the application roughly 500ms to append more data before playback stalls.
The behavior you are seeing is that an MSE-compatible browser knows the stream has not ended yet, but it also knows it is running out of data. It signals that it needs more data by changing its ready state; however, it has no obligation to play out every frame it already holds. It enters a buffering state based on the clock time of the current playback position versus the end of the available data.
Even though the link above says...
For example, in video this corresponds to the user agent having data from the current frame, but not the next frame
...actual implementations might interpret this differently and switch to HAVE_CURRENT_DATA a bit too early, that is, while still holding a few video frames, but knowing that it is not yet the end of the stream and that further frames are missing. It is a browser implementation quirk you just have to live with.
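To make this concrete, here is a rough, untested sketch that both observes the ready-state transition and applies the workaround implied by the question itself (that calling endOfStream() lets the held-back frames play out). It assumes the video, mediaSource and sourceBuffer variables from the question's code, and relies on the MSE rule that a later appendBuffer() call moves an 'ended' MediaSource back to 'open'.
video.addEventListener('waiting', () => {
  // At this point readyState is typically HAVE_CURRENT_DATA (2): the element
  // still holds a couple of frames but knows further frames are missing.
  console.log('Playback stalled, readyState =', video.readyState);
});

sourceBuffer.addEventListener('updateend', () => {
  // Signal end of stream after each append so the decoder plays out the
  // tail frames of the segment instead of holding them back.
  if (!sourceBuffer.updating && mediaSource.readyState === 'open') {
    mediaSource.endOfStream();
  }
});

// When the user advances, appending the next segment reopens the stream.
function appendNextSegment(data) {
  sourceBuffer.appendBuffer(data);
}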
Related
I've implemented a UVC video viewing application using the source reader in async mode (OnReadSample()). The connected camera produces raw10 frames, and the application can display just the raw images or perform additional processing (within the OnReadSample() callback) and display the generated output as well (i.e., two viewers). The two images are displayed correctly, with the exception of a lag (i.e., camera to display) due to the additional processing time being greater than the frame period (1/FPS).
How does the Media Source handle an overrun scenario? My understanding (please correct me if wrong) is that new MFSamples (i.e., image containers) are created and queued, but I've yet to find info on what happens when the queue depth is reached.
Can the Media Source queue depth be set to a particular number?
Some additional system details:
Win 10
Direct3D9
Thanks,
Steve.
I am trying to find a way to create a video widget and split it into two parts, in order to process stereo videos:
The first one would play a part of the video;
The second one would play the other part of the video.
I currently do not know where to start. I am searching around the Qt Multimedia module, but I do not know how to achieve this behavior.
Does anyone have an idea?
I was also thinking of building two video widgets and running them in two threads, but they would have to be perfectly synchronized. The idea was to cut the video into two with ffmpeg and assign each part to a video widget. However, I do not think it would be easy to achieve this (each frame would have to be in sync).
Thanks for your answers.
If your stereo video data is encoded in some special format that needs decoding at the codec/container level, I think that the QMultiMedia stuff in Qt is too basic for this kind of use case, as it does not allow tuning into "one stream" of a multi-stream transport container.
However, if you have alternating scan-lines, alternating frames, or even a "side-by-side" or "over-and-under" image per frame encoded in a "normal" video stream, then all you will have to do is intercept the frames as they are being decoded, separate each frame into two QImages and display them.
That is definitely doable!
However, depending on your video source and even the platform, you might want to select different methods. For example, if you are using a QCamera as the source of your video, you could use the QVideoProbe or QViewFinder approaches. Interestingly, the availability of those methods varies between platforms, so definitely figure that out first.
If you are decoding video using QMediaPlayer, QVideoProbe will probably be the way to go.
For an introduction to how you can grab frames using the different methods, please look at some of the examples in the official documentation on the subject.
Here is a short example of using the QVideoProbe approach:
videoProbe = new QVideoProbe(this);
// Here, myVideoSource is a camera or other media object compatible with QVideoProbe
if (videoProbe->setSource(myVideoSource)) {
// Probing succeeded, videoProbe->isValid() should be true.
connect(videoProbe, SIGNAL(videoFrameProbed(QVideoFrame)),
this, SLOT(processIndividualFrame(QVideoFrame)));
}
// Cameras need to be started. Do whatever your video source requires to start here
myVideoSource->start();
// [...]
// This is the slot where the magic happens (separating each single frame from video into two `QImage`s and posting the result to two `QLabel`s for example):
void processIndividualFrame(const QVideoFrame &frame){
    QVideoFrame cloneFrame(frame);
    cloneFrame.map(QAbstractVideoBuffer::ReadOnly);
    const QImage image(cloneFrame.bits(),
                       cloneFrame.width(),
                       cloneFrame.height(),
                       cloneFrame.bytesPerLine(),
                       QVideoFrame::imageFormatFromPixelFormat(cloneFrame.pixelFormat()));

    const int w = image.width();
    const int h2 = image.height() / 2;

    // Assumes "over-and-under" placement of stereo data for simplicity.
    // If you instead need access to individual scanlines, please have a look at [this][2].
    // QImage::copy() makes deep copies, so it is safe to unmap the frame afterwards.
    QImage leftImage = image.copy(0, 0, w, h2);
    QImage rightImage = image.copy(0, h2, w, h2);
    cloneFrame.unmap();

    // Assumes you have a UI set up with labels named as below, and with sizing / layout set up correctly
    ui->myLeftEyeLabel->setPixmap(QPixmap::fromImage(leftImage));
    ui->myRightEyeLabel->setPixmap(QPixmap::fromImage(rightImage));
    // Should play back rather smoothly since both labels are updated from the same frame
}
I hope this was useful.
BIG FAT WARNING: Only parts of this code have been tested or even compiled!
We are developing a music player app for OS X Lion (10.7) which applies different audio effects to a selected music file.
We have used the Audio Unit and AUGraph APIs to achieve this.
However, after connecting all the audio unit nodes, when we call AUGraphStart(mGraph) the graph takes around 1 second to invoke the first I/O callback.
Because of this there is a slight delay at the beginning of playback.
How can we avoid this delay? Could anyone provide any input to help us solve this issue?
One solution is to start the audio graph running before displaying any UI that the user could use to start playback. Since the audio units will then be running, you could fill any audio output buffers with silence before the appropriate UI event. If the buffers are small/short, the latency from any UI event till an output buffer is filled may be small enough to be below normal human perception.
How do I automatically expand an embedded youtube video when the user presses play?
The situation:
- For various reasons the ideal layout of the webpage means that the video player must appear initially at a small size (let's say 480x385) when the user arrives on the page
- The video being shown contains some detail and is difficult to watch at 480x385
- Right now the user must click on the "full-screen" icon which comes standard in every youtube video player. This is irritating to many people.
The desired solution:
- When users click on the video to play it, the player automatically expands to a more reasonable size (e.g., 640x385 or 853x505) and plays at that size
- The video could be played in a modal overlay, but other solutions would be welcome as well
- Upon completion of the video, the expanded view should automatically disappear and the video should appear at its original size on the page
You basically need two things:
The ability to detect when the video is playing. You should be able to figure that out from this YouTube example.
Resize the player by changing the height and width properties of the "object" node that contains the player.
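If the player is embedded with the (newer) IFrame Player API, a rough sketch of both steps could look like the following; the element id, video id and the two sizes are placeholders, and the same state handler could just as well toggle a CSS class for a modal overlay instead of calling setSize():
// Rough sketch using the YouTube IFrame Player API (loaded from https://www.youtube.com/iframe_api).
// 'player-container', 'VIDEO_ID' and the sizes below are placeholders.
function onYouTubeIframeAPIReady() {
  const player = new YT.Player('player-container', {
    width: 480,
    height: 385,
    videoId: 'VIDEO_ID',
    events: {
      onStateChange: function (event) {
        if (event.data === YT.PlayerState.PLAYING) {
          // Expand once playback starts (or show a modal overlay here).
          player.setSize(853, 505);
        } else if (event.data === YT.PlayerState.ENDED) {
          // Collapse back to the original in-page size when the video finishes.
          player.setSize(480, 385);
        }
      }
    }
  });
}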
I was wondering how do software like GotoMeeting capture desktop. I can do a full screen (or block by block) capture using GDI but that just seems too wasteful to me. Also I have looked into Mirror devices but I was wondering if there's a simpler technique or a library out there which does this.
I need fast and efficient desktop screen capture (10-15 fps) which I am eventually going to convert into a video file and integrate with my application to send the captured feed over the network or something.
Thanks!
Yes, taking a screen capture and finding the diff against the previous capture would be a good way to reduce transmission bandwidth by sending only the changes; of course, this is similar to video encoding techniques, which do this block by block.
It still means you need to do a capture plus extra processing to get the difference, i.e., to encode it.
By using mirror devices you can get both the updated rectangles that have changed and a pointer to the screen. The updated-rectangle pointer points to all the rectangles that have changed; these are the regions that change frequently. Filter out some of the rectangles, because in one second you can get a thousand of them.
I would either:
Do full screen captures, and then perform image processing to isolate the parts of the screen that have changed, to save bandwidth.
-OR-
Use a program like CamStudio.
I get 20 to 30 frames per second using the memory driver and display them in my picture box, but when I get a full-screen update these frames are buffered, as the picture box is slow. I have changed to my own component, which is somewhat faster, but still not good at full screen; on average I display 10 fps at full screen. I am facing a problem rendering frames: I can capture 20 to 30 frames per second, but my rendering is only 8 to 10 full-screen frames per second. If anyone has achieved rendering frames at full screen, please reply.
What language?
.NET provides Graphics.CopyFromScreen.