I'm using Media Foundation and the IMFSampleGrabberSinkCallback to playback video files and render them to a texture. I am able to get video samples in the IMFSampleGrabberSinkCallback::OnProcessSample method, but those samples are compressed. I have way less samples than I have pixels in my render target. According to this, the media session should load any decoder that is needed (if available), but that does not seem to be the case. Even if I create the decoder and add it to the topology myself, the video samples are still compressed. Is there anything in particular I am missing here ?
Thanks.
Related
I have use ffmpeg and mp4parser to add image watermark on video.
both works when video size is small like less than 5MB to 7Mb but
when it comes to large video size(anything above than 7MB or so..)
it fails and it doesn't not work.
what are the resources that helps to adding watermark on video quickly. if you have any useful resources that please let me know?
It depends on what exactly you need.
If the watermark is just needed when the video is viewed on the android device, the easiest and quickest way is to overlay the image with a transparent background over the video view. You will need to think about fullscreen vs inline and portrait vs landscape to ensure it lines up as you want.
If you want to watermark the video itself, so that the watermark is included if the video is copied or sent elsewhere, then ffmpeg is likely as fast as other solutions on the device itself. If you are able to send the video to a server and have the watermark applied there you will have the ability to use much more powerful compute resource.
We have a DirectShow application where we capture video input from USB, multiplex with audio from a WAV file (backing music), overlay audio and video effects, compress and write to an MP4 file.
Originally we were using an audio input source (microphone) and mixing our backing music and sound effects over the top but the decision was made to not capture live audio, and so I thought it would make more sense to use the backing music WAV file itself as the audio source.
Here is the filter graph we have:
backing.wav is a simple WAV file (stored locally), and was added to the graph using IFilterGraph::AddSourceFilter.
The problem is that when the graph is run, no audio samples are delivered from the WAV file. The video part of the graph runs as normal, but it's as if the audio part of the graph simply isn't running.
If I stop the graph in GraphEdit, add the Default DirectSound Device audio renderer and hook that up in place of the AAC Encoder filter and then run the graph again, the audio plays as you would expect.
Additionally, if backing.wav is replaced with an audio capture source like a microphone, audio data flows through as normal.
Does anyone have any ideas why the above graph, using a WAV file as the audio source, would fail to produce any audio samples?
I suppose the title is incorrectly identifying/summarizing the problem. There is nothing to tell for the fact that audio is not produced. It is likely that it is produced equally well with DirectSound Renderer and with AAC Encoder, specifically the data is reaching output pin of Mixing Transform Filter (is this your filter? You should be able to trace its flow and see media samples passing though).
With the information given, I would say it's likely that custom AAC encoder somehow does not like the feed and somehow either drops data or switches to erroneous state. You should be able to debug this further by inserting a Sample Grabber (or alike) filter¹ before the AAC encoder and tracing the media samples. Also comparing them to data from another source. The encoder might be sensitive to small details like media sample duration or discontinuity flag on the first sample streamed.
¹ With GraphStudioNext (GraphEdit makes no longer sense compared to) you can use internal Analyzer Filter and review media sample flow interactively using filter property page.
We are using a directshow interface to capture images from a video stream. These images are presented in a fixed size window.
Once we have captured an image we store it as a bitmap. Downstream we have the ability to add annotation to the image, for example letters in a fixed size font.
In one of our desktop environments, the annotation has started appearing at half the size that it normally appears at. This implies that the image we are merging the text onto has dimensions that are maybe twice as large.
The system that this happens on is a shared resource as in some unknown individual has installed software on the system that differs from our baseline.
We have two approaches - the 1st is to reimage the system to get our default text size behaviour back. The 2nd is to figure out how directshow manages image dimensions so that we can set the scaling on the image correctly.
A survey of the directshow literature indicates that the above is not a trivial task. The original work was done by another team that did not document what they did. Can anybody point us in the direction of what directshow object we want to deal with to properly size the sampled image?
DirectShow - as a framework - does not deal with resolutions directly. Your video source (such as capture hardware) is capable of providing video feed in certain resolution which you possibly can change. You normally use IAMStreamConfig as described in Configure the Video Output Format in order to choose capture resolution.
Sometimes you cannot affect capture resolution and you need to resample the image in whatever dimensions you captured it. There is no stock filter for this, however Media Foundation provides a suitable Video Resizer DSP which does most of the task. Unfortunately it does not fit DirectShow pipeline smoothly, so you need fitting and/or custom filter for resizing.
When filters connect in DirectShow, they have an AM_MEDIA_TYPE. Here you will find a VIDEOINFOHEADER with a BITMAPINFOHEADER and this header has a biWidth and biHeight.
Try to build the FilterGraph manually (with GraphEdit or GraphStudioNext) and inspect these fields.
Given a sample buffer of H.264, is there a way to extract the frame it represents as an image?
I'm using QTKit to capture video from a camera and using a QTCaptureMovieFileOutput as the output object.
I want something similar to the CVImageBufferRef that is passed as a parameter to the QTCaptureVideoPreviewOutput delegate method. For some reason, the file output doesn't contain the CVImageBufferRef.
What I do get is a QTSampleBuffer which, since I've set it in the compression options, contains an H.264 sample.
I have seen that on the iPhone, CoreMedia and AVFoundation can be used to create a CVImageBufferRef from the given CMSampleBufferRef (Which, I imagine is as close to the QTSampleBuffer as I'll be able to get) - but this is the Mac, not the iPhone.
Neither CoreMedia or AVFoundation are on the Mac, and I can't see any way to accomplish the same task.
What I need is an image (whether it be a CVImageBufferRef, CIImage or NSImage doesn't matter) from the current frame of the H.264 sample that is given to me by the Output object's call back.
Extended info (from the comments below)
I have posted a related question that focusses on the original issue - attempting to simply play a stream of video samples using QTKit: Playing a stream of video data using QTKit on Mac OS X
It appears not to be possible which is why I've moved onto trying to obtain frames as images and creating an appearance of video, by scaling, compressing and converting the image data from CVImageBufferRef to NSImage and sending it to a peer over the network.
I can use the QTCapturePreviewVideoOutput (or decompressed) to get uncompressed frame images in the form of CVImageBufferRef.
However, these images references need compressing, scaling and converting into NSImages before they're any use to me, hence the attempt to get an already scaled and compressed frame from the framework using the QTCaptureMovieFileOutput (which allows a compression and image size to be set before starting the capture), saving me from having to do the expensive compression, scale and conversion operations, which kill CPU.
Does the Creating a Single-Frame Grabbing Application section of the QTKit Application Programming Guide not work for you in this instance?
I am working on an application and I have a problem I just cant seem to find a solution for. The application is written in vc++. What I need to do is display a YUV video feed with text on top of it.
Right now it works correctly by drawing the text in the OnPaint method using GDI and the video on a DirectDraw overlay. I need to get rid of the overlay because it causes to many problems. It wont work on some video cards, vista, 7, etc.
I cant figure out a way to complete the same thing in a more compatible way. I can draw the video using DirectDraw with a back buffer and copy it to the primary buffer just fine. The issue here is that the text being drawn in GDI flashes because of the amount of times the video is refreshed. I would really like to keep the code to draw the text intact if possible since it works well.
Is there a way to draw the text directly to a DirectDraw buffer or memory buffer or something and then blt it to the back buffer? Should I be looking at another method all together? The two important OS's are XP and 7. If anyone has any ideas just let me know and I will test them out. Thanks.
Try to look into DirectShow and the Ticker sample on microsoft.com:
DirectShow Ticker sample
This sample uses the Video Mixing Renderer to blend video and text. It uses the IVMRMixerBitmap9 interface to blend text onto the bottom portion of the video window.
DirectShow is for building filter graphs for playing back audio or video streams an adding different filters for different effects and manipulation of video and audio samples.
Instead of using the Video Mixing Renderer of DirectShow, you can also use the ISampleGrabber interface. The advantage is, that it is a filter which can be used with other renderers as well, for example when not showing the video on the screen but streaming it over network or dumping it to a file.