Derive QSV hwdevice from D3D11VA hwdevice using FFmpeg

I'm trying to derive a QSV hwcontext from a D3D11VA device in order to encode D3D11 frames, but I'm getting an error when calling av_hwdevice_ctx_create_derived. This is how I set up the D3D11VA device context:
buffer_t ctx_buf { av_hwdevice_ctx_alloc(AV_HWDEVICE_TYPE_D3D11VA) };
auto ctx = (AVD3D11VADeviceContext *)((AVHWDeviceContext *)ctx_buf->data)->hwctx;
std::fill_n((std::uint8_t *)ctx, sizeof(AVD3D11VADeviceContext), 0);
auto device = (ID3D11Device *)hwdevice_ctx->data;
device->AddRef();
ctx->device = device;
ctx->lock_ctx = (void *)1;
ctx->lock = do_nothing;
ctx->unlock = do_nothing;
auto err = av_hwdevice_ctx_init(ctx_buf.get());
Then I call:
av_hwdevice_ctx_create_derived(&derive_hw_device_ctx, AV_HWDEVICE_TYPE_QSV, ctx_buf.get(), 0);
I'm seeing this in the log:
[AVHWDeviceContext # 000001de119a9b80] Initialize MFX session: API version is 1.35, implementation version is 1.30
[AVHWDeviceContext # 000001de119a9b80] Error setting child device handle: -16
Please let me know if you have any idea how to fix it or a different approach to encode d3d11 frames on QSV encoder.
Thank you.
OS: windows 10 64bits
CPU: Intel i5-8400
Graphics card: Nvidia GT1030 (has no hw encoder)

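Enabling multithread protection on the D3D11 device solved the issue:
ID3D10Multithread *pMultithread = nullptr;
HRESULT status = device->QueryInterface(IID_ID3D10Multithread, (void **)&pMultithread);
if(SUCCEEDED(status)) {
    // Let the runtime serialize access to the device so the QSV session can share it
    pMultithread->SetMultithreadProtected(TRUE);
    pMultithread->Release();
}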

Related

VideoToolbox hardware encoded I frame not clear on Intel Mac

When I capture video from the camera on an Intel Mac and use VideoToolbox to hardware-encode the raw pixel buffers into H.264 slices, the encoded I-frames are not clear, so the video looks blurry every several seconds. These are the properties I set:
self.bitrate = 1000000;
self.frameRate = 20;
int interval_second = 2;
NSDictionary *compressionProperties = @{
    (id)kVTCompressionPropertyKey_ProfileLevel: (id)kVTProfileLevel_H264_High_AutoLevel,
    (id)kVTCompressionPropertyKey_RealTime: @YES,
    (id)kVTCompressionPropertyKey_AllowFrameReordering: @NO,
    (id)kVTCompressionPropertyKey_H264EntropyMode: (id)kVTH264EntropyMode_CABAC,
    (id)kVTCompressionPropertyKey_PixelTransferProperties: @{
        (id)kVTPixelTransferPropertyKey_ScalingMode: (id)kVTScalingMode_Trim,
    },
    (id)kVTCompressionPropertyKey_AverageBitRate: @(self.bitrate),
    (id)kVTCompressionPropertyKey_ExpectedFrameRate: @(self.frameRate),
    (id)kVTCompressionPropertyKey_MaxKeyFrameInterval: @(self.frameRate * interval_second),
    (id)kVTCompressionPropertyKey_MaxKeyFrameIntervalDuration: @(interval_second),
    (id)kVTCompressionPropertyKey_DataRateLimits: @[@(self.bitrate / 8), @1.0],
};
result = VTSessionSetProperties(self.compressionSession, (CFDictionaryRef)compressionProperties);
if (result != noErr) {
    NSLog(@"VTSessionSetProperties failed: %d", (int)result);
    return;
} else {
    NSLog(@"VTSessionSetProperties succeeded");
}
These are very strange compression settings. Do you really need short GOP and very strict data rate limits?
I very much suspect you just copied some code off the internet without having any idea what it does. If that's the case, just set interval_second = 300 and remove kVTCompressionPropertyKey_DataRateLimits completely.
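To make the numbers in the question concrete, here is a small C++ sketch of the arithmetic those properties imply. The struct and function names are hypothetical and are not part of VideoToolbox; only the values (1,000,000 bit/s, 20 fps, 2-second interval, 1-second limit window) come from the question.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the values set in the question; not a VideoToolbox type.
struct EncoderSettings {
    std::int64_t bitrate_bps;       // AverageBitRate, in bits per second
    int frame_rate;                 // ExpectedFrameRate, in frames per second
    int keyframe_interval_seconds;  // MaxKeyFrameIntervalDuration, in seconds
};

// Number of frames between forced keyframes (the MaxKeyFrameInterval value).
constexpr std::int64_t max_keyframe_interval(const EncoderSettings& s) {
    return static_cast<std::int64_t>(s.frame_rate) * s.keyframe_interval_seconds;
}

// Byte budget for a one-second DataRateLimits window (bits converted to bytes).
constexpr std::int64_t data_rate_limit_bytes(const EncoderSettings& s) {
    return s.bitrate_bps / 8;
}

// Average number of bytes available per frame at the target bitrate.
constexpr std::int64_t average_bytes_per_frame(const EncoderSettings& s) {
    return s.bitrate_bps / 8 / s.frame_rate;
}

// Compile-time check against the values used in the question.
static_assert(max_keyframe_interval(EncoderSettings{1000000, 20, 2}) == 40,
              "20 fps with a 2 s interval forces a keyframe every 40 frames");
```

With the question's values this works out to a keyframe every 40 frames, a cap of 125,000 bytes in any one-second window, and roughly 6,250 bytes per frame on average. Keyframes are normally much larger than the average frame, so a cap this tight may leave too little room for them, which would fit the periodic blur described. With interval_second = 300 the interval becomes 6,000 frames.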

Capturing 48 kHz audio with FFmpeg and DirectShow (dshow input)

I tried to capture audio with 48 kHz in FFmpeg, the code as below:
AVInputFormat* ifmt = av_find_input_format("dshow");
CHECK_POINTER_RETURN_VALUE(ifmt, false)
pFmtCtx = avformat_alloc_context();
CHECK_POINTER_RETURN_VALUE(pFmtCtx, false)
AVDictionary *param = nullptr;
std::string sr = std::to_string(48000);
av_dict_set(&param, "sample_rate",sr.c_str(), 0);
int error = avformat_open_input(&pFmtCtx, ffName.c_str(), ifmt, &param);
if (error != 0) {
char buf[1024];
av_strerror(error, buf, sizeof(buf));
LOG(ERROR)<<"open audio device failed,err is "<<buf;
return false;
}
But avformat_open_input fails and the error is "I/O error". If the sample rate is 44100, everything works.
Does FFmpeg not support capturing 48 kHz audio?
This was an issue with the DirectShow API that FFmpeg was using. It has been resolved with a change to FFmpeg: https://github.com/FFmpeg/FFmpeg/commit/d9a9b4c877b85fea5a5bad74c3d592a756047f79
Specifically, DirectShow doesn't adequately describe the audio device capabilities with AUDIO_STREAM_CONFIG_CAPS when the audio device supports both 44.1 kHz and 48 kHz as clock multiples. WAVEFORMATEX within the AM_MEDIA_TYPE must be used instead.
As @die maus mentioned, the fact that this works if sample rate is set to 44100, but not 48000, likely indicates that your input device does not support sampling at 48 kHz. This is not a limitation of FFmpeg, but of the hardware.
And as @moi suggested, unless you have a specific need for 48 kHz, 44.1 should work just fine.
If you really need 48 kHz (e.g. you are sending the audio to something else that expects 48 kHz), you can resample the audio. FFmpeg includes libswresample for this purpose; see the example here.
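To illustrate what resampling from 44.1 kHz to 48 kHz involves, here is a deliberately naive C++ sketch using linear interpolation on mono float samples. It is not libswresample and does no anti-alias filtering; the function names are hypothetical, and real code should use libswresample as suggested above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of output samples when converting in_count samples from in_rate
// to out_rate, rounded down.
inline std::size_t resampled_length(std::size_t in_count, int in_rate, int out_rate) {
    assert(in_rate > 0 && out_rate > 0);
    return static_cast<std::size_t>(
        static_cast<std::uint64_t>(in_count) * static_cast<std::uint64_t>(out_rate) /
        static_cast<std::uint64_t>(in_rate));
}

// Naive linear-interpolation resampler for mono float samples.
// Illustration only: no low-pass filtering, so quality is limited.
inline std::vector<float> resample_linear(const std::vector<float>& in,
                                          int in_rate, int out_rate) {
    std::vector<float> out(resampled_length(in.size(), in_rate, out_rate));
    for (std::size_t i = 0; i < out.size(); ++i) {
        // Position of output sample i on the input timeline.
        const double pos = static_cast<double>(i) * in_rate / out_rate;
        const std::size_t i0 = static_cast<std::size_t>(pos);
        // Clamp the right-hand neighbour at the end of the input.
        const std::size_t i1 = (i0 + 1 < in.size()) ? i0 + 1 : i0;
        const double frac = pos - static_cast<double>(i0);
        out[i] = static_cast<float>(in[i0] * (1.0 - frac) + in[i1] * frac);
    }
    return out;
}
```

For example, 441 input samples captured at 44100 Hz become 480 output samples at 48000 Hz, since 48000 / 44100 = 160 / 147.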

MFTransform encoder->ProcessInput returns E_FAIL

When I run encoder->ProcessInput(stream_id, sample.Get(), 0) I get an E_FAIL ("Unspecified error"), which isn't very helpful.
I am either trying to (1) Figure out what the real error is and/or (2) get past this unspecified error.
Ultimately, my goal is achieving this: http://alax.info/blog/1716
Here's the gist of what I am doing:
(Error occurs in this block)
void encode_frame(ComPtr<ID3D11Texture2D> texture) {
_com_error error = NULL;
IMFTransform *encoder = nullptr;
encoder = get_encoder();
if (!encoder) {
cout << "Did not get a valid encoder to utilize\n";
return;
}
cout << "Making it Direct3D aware...\n";
setup_D3_aware_mft(encoder);
cout << "Setting up input/output media types...\n";
setup_media_types(encoder);
error = encoder->ProcessMessage(MFT_MESSAGE_COMMAND_FLUSH, NULL); // flush all stored data
error = encoder->ProcessMessage(MFT_MESSAGE_NOTIFY_BEGIN_STREAMING, NULL);
error = encoder->ProcessMessage(MFT_MESSAGE_NOTIFY_START_OF_STREAM, NULL); // first sample is about to be processed, req for async
cout << "Encoding image...\n";
IMFMediaEventGenerator *event_generator = nullptr;
error = encoder->QueryInterface(&event_generator);
while (true) {
IMFMediaEvent *event = nullptr;
MediaEventType type;
error = event_generator->GetEvent(0, &event);
error = event->GetType(&type);
uint32_t stream_id = get_stream_id(encoder); // Likely just going to be 0
uint32_t frame = 1;
uint64_t sample_duration = 0;
ComPtr<IMFSample> sample = nullptr;
IMFMediaBuffer *mbuffer = nullptr;
DWORD length = 0;
uint32_t img_size = 0;
MFCalculateImageSize(desktop_info.input_sub_type, desktop_info.width, desktop_info.height, &img_size);
switch (type) {
case METransformNeedInput:
ThrowIfFailed(MFCreateDXGISurfaceBuffer(__uuidof(ID3D11Texture2D), texture.Get(), 0, false, &mbuffer),
mbuffer, "Failed to generate a media buffer");
ThrowIfFailed(MFCreateSample(&sample), sample.Get(), "Couldn't create sample buffer");
ThrowIfFailed(sample->AddBuffer(mbuffer), sample.Get(), "Couldn't add buffer");
// Test (delete this) - fake buffer
/*byte *buffer_data;
MFCreateMemoryBuffer(img_size, &mbuffer);
mbuffer->Lock(&buffer_data, NULL, NULL);
mbuffer->GetCurrentLength(&length);
memset(buffer_data, 0, img_size);
mbuffer->Unlock();
mbuffer->SetCurrentLength(img_size);
sample->AddBuffer(mbuffer);*/
MFFrameRateToAverageTimePerFrame(desktop_info.fps, 1, &sample_duration);
sample->SetSampleDuration(sample_duration);
// ERROR
ThrowIfFailed(encoder->ProcessInput(stream_id, sample.Get(), 0), sample.Get(), "ProcessInput failed.");
I setup my media types like this:
void setup_media_types(IMFTransform *encoder) {
IMFMediaType *output_type = nullptr;
IMFMediaType *input_type = nullptr;
ThrowIfFailed(MFCreateMediaType(&output_type), output_type, "Failed to create output type");
ThrowIfFailed(MFCreateMediaType(&input_type), input_type, "Failed to create input type");
/*
List of all MF types:
https://learn.microsoft.com/en-us/windows/desktop/medfound/alphabetical-list-of-media-foundation-attributes
*/
_com_error error = NULL;
int stream_id = get_stream_id(encoder);
error = output_type->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
error = output_type->SetGUID(MF_MT_SUBTYPE, desktop_info.output_sub_type);
error = output_type->SetUINT32(MF_MT_AVG_BITRATE, desktop_info.bitrate);
error = MFSetAttributeSize(output_type, MF_MT_FRAME_SIZE, desktop_info.width, desktop_info.height);
error = MFSetAttributeRatio(output_type, MF_MT_FRAME_RATE, desktop_info.fps, 1);
error = output_type->SetUINT32(MF_MT_INTERLACE_MODE, MFVideoInterlace_Progressive); // motion will be smoother, fewer artifacts
error = output_type->SetUINT32(MF_MT_MPEG2_PROFILE, eAVEncH264VProfile_High);
error = output_type->SetUINT32(MF_MT_MPEG2_LEVEL, eAVEncH264VLevel3_1);
error = output_type->SetUINT32(CODECAPI_AVEncCommonRateControlMode, eAVEncCommonRateControlMode_CBR); // probably will change this
ThrowIfFailed(encoder->SetOutputType(stream_id, output_type, 0), output_type, "Couldn't set output type");
error = input_type->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
error = input_type->SetGUID(MF_MT_SUBTYPE, desktop_info.input_sub_type);
error = input_type->SetUINT32(MF_MT_INTERLACE_MODE, MFVideoInterlace_Progressive);
error = MFSetAttributeSize(input_type, MF_MT_FRAME_SIZE, desktop_info.width, desktop_info.height);
error = MFSetAttributeRatio(input_type, MF_MT_FRAME_RATE, desktop_info.fps, 1);
error = MFSetAttributeRatio(input_type, MF_MT_PIXEL_ASPECT_RATIO, 1, 1);
ThrowIfFailed(encoder->SetInputType(stream_id, input_type, 0), input_type, "Couldn't set input type");
}
My desktop_info struct is:
struct desktop_info {
int fps = 30;
int width = 2560;
int height = 1440;
uint32_t bitrate = 10 * 1000000; // 10Mb
GUID input_sub_type = MFVideoFormat_ARGB32;
GUID output_sub_type = MFVideoFormat_H264;
} desktop_info;
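For reference, the sizes implied by this struct can be checked by hand. The C++ sketch below uses hypothetical helper names and plain integer arithmetic; it assumes 4 bytes per pixel for ARGB32 with no row padding, and Media Foundation's own helpers (MFCalculateImageSize, MFFrameRateToAverageTimePerFrame) may round or pad differently.

```cpp
#include <cassert>
#include <cstdint>

// Uncompressed size in bytes of one frame with a fixed bytes-per-pixel format.
// Assumes tightly packed rows (no stride padding).
constexpr std::uint64_t frame_size_bytes(std::uint32_t width, std::uint32_t height,
                                         std::uint32_t bytes_per_pixel) {
    return static_cast<std::uint64_t>(width) * height * bytes_per_pixel;
}

// Duration of one frame in 100-nanosecond units (the time base used for
// sample durations), truncated toward zero.
constexpr std::int64_t frame_duration_100ns(std::uint32_t fps_num, std::uint32_t fps_den) {
    return static_cast<std::int64_t>(10000000) * fps_den / fps_num;
}

// Compile-time check for the 2560x1440 ARGB32 frame from the struct above.
static_assert(frame_size_bytes(2560, 1440, 4) == 14745600,
              "2560 * 1440 * 4 bytes");
```

So each 2560x1440 ARGB32 frame is 14,745,600 bytes uncompressed, and at 30 fps one frame lasts 333,333 units of 100 ns.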
Output of my program prior to reaching ProcessInput:
Hello World!
Number of devices: 3
Device #0
Adapter: Intel(R) HD Graphics 630
Got some information about the device:
\\.\DISPLAY2
Attached to desktop : 1
Got some information about the device:
\\.\DISPLAY1
Attached to desktop : 1
Did not find another adapter. Index higher than the # of outputs.
Successfully duplicated output from IDXGIOutput1
Accumulated frames: 0
Created a 2D texture...
Number of encoders/processors available: 1
Encoder name: Intel® Quick Sync Video H.264 Encoder MFT
Making it Direct3D aware...
Setting up input/output media types...
If you're curious what my Locals were right before ProcessInput: http://prntscr.com/mx1i9t
This may be an "unpopular" answer since it doesn't provide a solution for MFT specifically but after 8 months of working heavily on this stuff, I would highly recommend not using MFT and implementing encoders directly.
My solution was implementing an HW encoder like NVENC/QSV and you could fall back on a software encoder like x264 if the client doesn't have HW acceleration available.
The reason for this is that MFT is far more opaque and not well documented or supported by Microsoft. I think you'll also find that you want more control over the encoder's settings and parameter tuning, and each encoder implementation is subtly different.
We have seen this error coming from the Intel graphics driver. (The H.264 encoder MFT uses the Intel GPU to encode the video into H.264 format.)
In our case, I think the bug was triggered by configuring the encoder to a very high bit rate and then configuring to a low bit rate. In your sample code, it does not look like you are changing the bit rate, so I am not sure if it is the same bug.
Intel just released a new driver about two weeks ago, that is supposed to have the fix for the bug that we were seeing. So, you may want to give that new driver a try -- hopefully it will fix the problem that you are having.
The new driver is version 25.20.100.6519. You can get it from the Intel web site: https://downloadcenter.intel.com/download/28566/Intel-Graphics-Driver-for-Windows-10
If the new driver does not fix the problem, you could try running your program on a different PC that uses an NVIDIA or AMD graphics card, to see if the problem only happens on PCs that have Intel graphics.

OS X: respond to new audio device

I need to be notified when a new audio device appears on OS X. I'm not sure where to start. Can Core Audio do this for me, or do I need to go down to a lower level, for instance IOKit?
You can do this by observing kAudioHardwarePropertyDevices. The code looks roughly like:
AudioObjectPropertyAddress propertyAddress = {
.mSelector = kAudioHardwarePropertyDevices,
.mScope = kAudioObjectPropertyScopeGlobal,
.mElement = kAudioObjectPropertyElementMaster
};
OSStatus result = AudioObjectAddPropertyListener(kAudioObjectSystemObject, &propertyAddress, myAudioObjectPropertyListenerProc, NULL);
In myAudioObjectPropertyListenerProc you can determine what devices are currently available.

Decoder crashes after ffmpeg upgrade

Recently I upgraded ffmpeg from 0.9 to 1.0 (tested on Win7 x64 and on iOS), and now avcodec_decode_video2 segfaults. Long story short: the crash occurs every time the video dimensions change (e.g. from 320x240 to 160x120 or vice versa).
I receive mpeg4 video stream from some proprietary source and decode it like this:
// once, during initialization:
AVCodec *codec_ = avcodec_find_decoder(CODEC_ID_MPEG4);
AVCodecContext *ctx_ = avcodec_alloc_context3(codec_);
avcodec_open2(ctx_, codec_, 0);
AVPacket packet_;
av_init_packet(&packet_);
AVFrame *picture_ = avcodec_alloc_frame();
// on every frame:
int got_picture;
packet_.size = size;
packet_.data = (uint8_t *)buffer;
avcodec_decode_video2(ctx_, picture_, &got_picture, &packet_);
Again, all the above had worked flawlessly until I upgraded to 1.0. Now every time the frame dimensions change - avcodec_decode_video2 crashes. Note that I don't assign width/height in AVCodecContext - neither in the beginning, nor when the stream changes - can it be the reason?
I'd appreciate any idea!
Update: setting ctx_.width and ctx_.height doesn't help.
Update2: just before the crash I get the following log messages:
mpeg4, level 24: "Found 2 unreleased buffers!".
level 8: "Assertion i < avci->buffer_count failed at libavcodec/utils.c:603"
Update3: upgrading to 1.1.2 fixed this crash. The decoder can again cope with dimension changes on the fly.
You can try filling in AVPacket::side_data. When the frame size changes, the codec picks up the new dimensions from it (see the apply_param_change function in libavcodec/utils.c).
This structure can be filled as follows:
int my_ff_add_param_change(AVPacket *pkt, int32_t width, int32_t height)
{
uint32_t flags = 0;
int size = 4 * 3;
uint8_t *data;
if (!pkt)
return AVERROR(EINVAL);
flags = AV_SIDE_DATA_PARAM_CHANGE_DIMENSIONS;
data = av_packet_new_side_data(pkt, AV_PKT_DATA_PARAM_CHANGE, size);
if (!data)
return AVERROR(ENOMEM);
((uint32_t*)data)[0] = flags;
((uint32_t*)data)[1] = width;
((uint32_t*)data)[2] = height;
return 0;
}
You need to call this function every time the size changes.
I think this feature appeared recently. I didn't know about it until I looked at the new FFmpeg sources.
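The function above writes the three values through a uint32_t pointer, which gives the right byte order only on little-endian machines. As a rough sketch of the 12-byte payload, here is a standalone C++ version that packs the fields explicitly as little-endian. The names are hypothetical, and the flag value 0x0008 is an assumption meant to mirror AV_SIDE_DATA_PARAM_CHANGE_DIMENSIONS; check it against your libavcodec headers.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Assumed to match FFmpeg's AV_SIDE_DATA_PARAM_CHANGE_DIMENSIONS; verify
// against the libavcodec headers you build with.
constexpr std::uint32_t kParamChangeDimensions = 0x0008;

// Store v at dst[0..3] in little-endian byte order.
inline void put_le32(std::uint8_t* dst, std::uint32_t v) {
    dst[0] = static_cast<std::uint8_t>(v & 0xFF);
    dst[1] = static_cast<std::uint8_t>((v >> 8) & 0xFF);
    dst[2] = static_cast<std::uint8_t>((v >> 16) & 0xFF);
    dst[3] = static_cast<std::uint8_t>((v >> 24) & 0xFF);
}

// Build the 12-byte payload: flags, width, height, each a 32-bit
// little-endian integer, in that order.
inline std::array<std::uint8_t, 12> pack_dimension_change(std::int32_t width,
                                                          std::int32_t height) {
    assert(width > 0 && height > 0);
    std::array<std::uint8_t, 12> buf{};
    put_le32(buf.data(), kParamChangeDimensions);
    put_le32(buf.data() + 4, static_cast<std::uint32_t>(width));
    put_le32(buf.data() + 8, static_cast<std::uint32_t>(height));
    return buf;
}
```

The resulting bytes could then be copied into the buffer returned by av_packet_new_side_data in place of the three pointer writes.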
UPD
As you wrote, the easiest way to solve the problem is to restart the codec. Just call avcodec_close / avcodec_open2.
I just ran into the same issue when my frames were changing size on the fly. However, calling avcodec_close/avcodec_open2 is superfluous. A cleaner way is to reset your AVPacket structure before the call to avcodec_decode_video2. Here is the code:
av_init_packet(&packet_);
The key here is that this function resets all of the AVPacket fields to their defaults, apart from data and size, which you must set yourself. Check the docs for more info.
