Playing H.264 video in an application through ffmpeg using DXVA2 acceleration - Windows

I am trying to output H.264 video in a Windows application. I am moderately familiar with FFMPEG and have successfully gotten it to play H.264 in an SDL window without a problem. Still, I would really benefit from hardware acceleration (probably through DXVA2).
I am reading raw H264 video, no container, no audio ... just raw video (and no B-frames, just I and P). Also, I know that all the systems that will use this application have Nvidia GPUs supporting at least VP3.
Given that set of assumptions, I was hoping to cut some corners and make it simple instead of general: just have it working for my particular scenario.
So far I know that I need to set up hardware acceleration in the codec context by filling the hwaccel member through a call to ff_find_hwaccel. My plan is to look at Media Player Classic Home Cinema, which does a pretty good job of supporting DXVA2 through FFMPEG when decoding H.264. However, the code is quite large and I am not exactly sure where to look. I can find the place where ff_find_hwaccel is called in h264.c, but I was wondering where else I should be looking.
More specifically, I would like to know: what is the minimum set of steps I have to code to get DXVA2 working through FFMPEG?
EDIT: I am open to looking at VLC or anything else if someone knows where I can find the "important" piece of code that does the trick. I just mentioned MPC-HC because I think it is the easiest to get compiling on Windows.
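With FFmpeg's modern public API (av_hwdevice_ctx_create, available since FFmpeg 3.2) the minimum set of steps is small: create a DXVA2 hardware device context, attach it to the codec context, and pick AV_PIX_FMT_DXVA2_VLD in the get_format callback, without touching internals like ff_find_hwaccel. A rough sketch (function names are mine, error cleanup omitted):

extern "C" {
#include <libavcodec/avcodec.h>
#include <libavutil/hwcontext.h>
}

// Prefer the DXVA2 surface format when the decoder offers it.
static enum AVPixelFormat get_hw_format(AVCodecContext *,
                                        const enum AVPixelFormat *fmts)
{
    for (const enum AVPixelFormat *p = fmts; *p != AV_PIX_FMT_NONE; p++)
        if (*p == AV_PIX_FMT_DXVA2_VLD)
            return *p;
    return fmts[0]; // fall back to the decoder's first (software) choice
}

AVCodecContext *open_dxva2_h264_decoder()
{
    const AVCodec *codec = avcodec_find_decoder(AV_CODEC_ID_H264);
    AVCodecContext *ctx = avcodec_alloc_context3(codec);
    AVBufferRef *device = NULL;

    // Step 1: create a DXVA2 device on the default Direct3D9 adapter.
    if (av_hwdevice_ctx_create(&device, AV_HWDEVICE_TYPE_DXVA2,
                               NULL, NULL, 0) < 0)
        return NULL;

    // Step 2: hand the device to the codec context and pick the hw format.
    ctx->hw_device_ctx = av_buffer_ref(device);
    ctx->get_format = get_hw_format;

    // Step 3: open the decoder as usual; decoding now runs through DXVA2.
    if (avcodec_open2(ctx, codec, NULL) < 0)
        return NULL;
    return ctx;
}

// Decoded frames arrive as AV_PIX_FMT_DXVA2_VLD: frame->data[3] holds the
// IDirect3DSurface9*, which can be presented directly or copied back with
// av_hwframe_transfer_data() for CPU access.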

Related

In what way does this HEVC video not comply with Apple's requirements document?

My goal is to work out why a given video file does not play on macOS/Safari/QuickTime.
The background to this question is that it is possible to play HEVC videos with a transparent background/alpha channel on Safari/macOS. To be playable, a video must meet the specific requirements set out by Apple in this document:
https://developer.apple.com/av-foundation/HEVC-Video-with-Alpha-Interoperability-Profile.pdf
The video that does not play on Apple/Safari/QuickTime is an HEVC video with an alpha transparency channel. Note that VLC for macOS DOES play this file. Here it is:
https://drive.google.com/file/d/1ZnXjcDbk-_YxTgRuH_D7RSR9SXdY_XTv/view?usp=share_link
I have two example HEVC video files with a transparent background/alpha channel, and they both play fine in either QuickTime Player or Safari:
Working video #1:
https://drive.google.com/file/d/1PJAyg_sVKVvb-Py8PAu42c1qm8l2qCbh/view?usp=share_link
Working video #2:
https://drive.google.com/file/d/1kk8ssUyT7qAaK15afp8VPR6mIWPFX8vQ/view?usp=sharing
The first step is to work out in what way my non-working video ( https://drive.google.com/file/d/1ZnXjcDbk-_YxTgRuH_D7RSR9SXdY_XTv/view?usp=share_link ) does not comply with the specification.
Once it is clear which requirements are not met by the non-working video, I can move on to the next phase, which is to try to formulate an ffmpeg command that will output a video meeting the requirements.
I have read Apple's requirements document and I am out of my depth in trying to analyse the non-working video against the requirements - I don't know how to do it.
Can anyone suggest a way to identify what is wrong with the video?
Additional context is that I am trying to find a way to create Apple/macOS-compatible transparent (alpha channel) videos using ffmpeg with hevc_nvenc running on an Intel machine. I am aware that Apple hardware can create such videos, but for a wide variety of reasons it is not practical for me to use Apple hardware for the job. I have spent many hours trying all sorts of ffmpeg and ffprobe commands to work out what is wrong and modify the video to fix it, but to be honest most of my attempts have been guesswork.
The Apple specification for an alpha layer in HEVC requires that the encoder process and store the alpha in a certain manner. It also requires that the stream configuration syntax be formed in a specific manner. At the time of writing, I'm aware of only the videotoolbox HEVC encoder being capable of emitting such a stream.
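As a first diagnostic pass (a suggestion rather than a guaranteed diagnosis), it may help to dump the stream-level metadata of a working file and the non-working file side by side and compare, e.g.:

ffprobe -v error -select_streams v:0 -show_entries stream=codec_name,profile,pix_fmt,codec_tag_string -of json input.mov

Differences in fields like codec_tag_string (hvc1 vs. hev1) or profile are easy wins. Note, however, that the alpha layer itself is signalled in the VPS/SPS parameter sets, which ffprobe does not fully expose, so if the stream-level fields match you will likely need an HEVC bitstream analyzer to check the parameter sets against the Apple document.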

Real-time microphone audio manipulation on Windows

I would like to make an app (target PC, Windows) that lets you modify the microphone input in real time, like introducing sound effects or even modulating your voice.
I searched the internet and only found people saying it would not be possible without using a virtual audio cable.
However, I know of some apps with similar behavior (Voicemod, Resonance) that do not use a virtual audio cable, so I would like some help on how this can be done (just the name of a capable library would be enough) or where to start.
Firstly, you can use professional ready-made software for this: a digital audio workstation (DAW) in combination with any of a huge number of plugins.
See 5 steps to real-time process your instrument in the DAW.
And What is (audio) direct monitoring?
If you are sure you have to write your own, you can use libraries for real-time audio processing (as far as I know, C++ is better for this than C#).
These libraries really work; they are specifically designed for real-time use:
https://github.com/thestk/rtaudio
http://www.portaudio.com/
See also https://en.wikipedia.org/wiki/Csound
If you don't have a professional sound interface yet but want to minimize latency, read about ASIO4ALL.
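If you go the library route, a microphone-to-speaker passthrough is only a few lines, and your effect slots into the callback. A minimal sketch with PortAudio (callable from C++; mono float32 at 48 kHz and the default devices are assumptions for illustration, error handling trimmed):

#include <cstring>
#include "portaudio.h"

// Runs on the audio thread for every buffer: input -> (effect) -> output.
static int passthrough(const void *in, void *out, unsigned long frames,
                       const PaStreamCallbackTimeInfo *,
                       PaStreamCallbackFlags, void *)
{
    // Replace this copy with your effect (gain, echo, pitch shift, ...).
    std::memcpy(out, in, frames * sizeof(float)); // mono float32 samples
    return paContinue;
}

int main()
{
    Pa_Initialize();
    PaStream *stream = nullptr;
    // 1 input channel, 1 output channel, float32, 48 kHz; let PortAudio
    // choose the buffer size for the lowest stable latency.
    Pa_OpenDefaultStream(&stream, 1, 1, paFloat32, 48000,
                         paFramesPerBufferUnspecified, passthrough, nullptr);
    Pa_StartStream(stream);
    Pa_Sleep(10000); // process live audio for ten seconds
    Pa_StopStream(stream);
    Pa_CloseStream(stream);
    Pa_Terminate();
}

Keeping the callback free of blocking calls (no allocation, no file I/O) is what keeps the latency low enough for live voice effects.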
The linked tutorial worked for me. In it, a sound is recorded and saved to a .wav file.
The key to having this stream to a speaker would be opening a SourceDataLine and outputting to that instead of writing to a WAV file. So, instead of outputting on line 59 to AudioSystem.write, output to a SourceDataLine's write method.
I don't know if there will be a feedback issue. It's probably good to output to headphones and not your speakers!
To add an effect, the TargetDataLine has to be read and processed in segments. In each segment the following needs to happen:
obtain the byte array from the TargetDataLine
convert the audio bytes to PCM
apply your audio effect to the PCM (if the effect is a volume change over time, this could be done by progressively altering a volume factor between 0 and 1 and multiplying the factor against the PCM; see the sketch below)
convert back to audio bytes
write to the SourceDataLine
All these steps have been covered in Stack Overflow posts.
The linked tutorial simplifies how file locations, threads, and stopping and starting are handled. But most importantly, it shows a working, live audio line from the microphone.
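To make the convert/apply/convert steps concrete, here is a small sketch of the volume-factor case. It is written in C++ for brevity, but the byte arithmetic is the same on Java's byte[] segments; 16-bit signed little-endian PCM is assumed, and apply_gain is an illustrative name:

#include <algorithm>
#include <cstddef>
#include <cstdint>

// In-place gain on a segment of 16-bit little-endian PCM bytes.
void apply_gain(uint8_t *bytes, size_t len, float factor)
{
    for (size_t i = 0; i + 1 < len; i += 2) {
        // audio bytes -> PCM sample
        int16_t sample = static_cast<int16_t>(bytes[i] | (bytes[i + 1] << 8));
        // the effect: scale by a 0..1 factor, clamped to the 16-bit range
        int scaled = std::clamp(static_cast<int>(sample * factor),
                                -32768, 32767);
        // PCM sample -> audio bytes
        bytes[i]     = static_cast<uint8_t>(scaled & 0xFF);
        bytes[i + 1] = static_cast<uint8_t>((scaled >> 8) & 0xFF);
    }
}

A fade, for example, is just this function called per segment with a factor that steps from 1.0 toward 0.0.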

DirectX vs FFmpeg

I'm in the process of deciding how to decode received video frames, based on the following:
The platform is Windows.
Frames are encoded in H264 or H265.
The GPU should be used as much as possible.
We certainly prefer less coding and the simplest code. We just need to decode and show the result on screen; no recording or anything else is required.
Still, I'm a newbie, but I think one may decode a frame either directly with DirectX or through FFmpeg. Am I right?
If so, which one is preferred?
For a simple approach and simple code using only the GPU, take a look at my project using DirectX: H264Dxva2Decoder
If you are ready to code, you can use my approach.
If not, you can use MediaFoundation or FFMPEG; both can do the job.
MediaFoundation is C++ and COM oriented, while FFMPEG is C oriented. That may make the difference for you.
EDIT
You can use my program because you have frames encoded in H264 or H265. For H265, you will have to add extra code.
Of course, you need to make modifications. And yes, you can send frames to DirectX without using a file. This project only handles the AVCC video format, but it can be modified for other cases.
You don't need the atom parser. If your frames are in Annex-B format, for example, you will need to modify the NALU parser and the buffering mechanism.
I can help you if you provide frame samples encoded in H264.
As for FFmpeg, it has fewer limitations than my program with respect to the H264 specification, but it does not provide the rendering mechanism. You would have to combine FFmpeg with my rendering mechanism, for example.
Or study a program like MPC-HC that shows how to mix the two. I cannot help further here.
EDIT 2
One thing to know: you can't send encoded packets directly to the GPU. You need to parse them first; that's why there is a NALU parser (see DXVA_PicParams_H264).
If you are not ready to code and to understand how it all works, use FFmpeg; it will be simpler, in effect. You can focus on rendering, not on decoding.
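To show how little decoding code the FFmpeg route needs, here is a rough sketch of the send/receive loop (public API since FFmpeg 3.1; decode_packet is an illustrative name, and rendering the resulting AVFrame is still your job):

extern "C" {
#include <libavcodec/avcodec.h>
}

void decode_packet(AVCodecContext *ctx, const AVPacket *pkt)
{
    // Feed one encoded packet; FFmpeg's parser handles the NALUs for you.
    if (avcodec_send_packet(ctx, pkt) < 0)
        return;

    AVFrame *frame = av_frame_alloc();
    // One packet may yield zero or more decoded frames.
    while (avcodec_receive_frame(ctx, frame) == 0) {
        // frame->data / frame->width / frame->height are valid here;
        // hand the frame to your renderer (e.g. upload to a D3D texture).
    }
    av_frame_free(&frame);
}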
It's also important to know which one gives better results, consumes fewer resources (CPU, GPU, and RAM, both system memory and graphics card memory), supports a wider range of formats, and so on.
You're asking for real expertise...
If you code your own program, you will be able to optimize it and most likely get better results. If you use FFmpeg and it has performance problems in your context, you could be stuck, because you are not going to modify FFmpeg itself.
You say you will use Bosch cameras. Normally, all the encoded video will be in the same format, so once your code is able to decode it, you don't really need all of FFmpeg's features.

MP4 Fast Forward/Rewind

I need help with implementing fast forward and rewind. I'm using DirectShow in C# and have played with IMediaSeeking; however, the results suck! SetRate does not work at all, and SetPositions is choppy and apparently has sync issues with multiple threads, so it ceases to run after the first time it's called. I played with IMediaPosition but could not get it to work at all. My graph is simply
FileSourceAsync -> Intel Splitter -> MainConcept Decoder -> Decklink Render
After scanning the supported interfaces, it appears the file source and decoder do not support IMediaSeeking.
Does anybody have any ideas or clues that can help me fast forward and rewind an mp4 file in a directshow graph?
Cheers.
IMediaSeeking works properly when the underlying filters implement it properly. One of the filters you are using (Intel's?) seems to have issues with seeking. Perhaps you can replace it with a better alternative.
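For reference, this is roughly what the calls look like against the graph manager's IMediaSeeking (native C++ shown; DirectShow.NET wraps the same interface, and seek_and_speed is an illustrative name). The graph manager forwards these calls to the filters, which is exactly why one filter that mishandles seeking breaks it for the whole graph:

#include <dshow.h>

HRESULT seek_and_speed(IGraphBuilder *graph)
{
    IMediaSeeking *seeking = nullptr;
    HRESULT hr = graph->QueryInterface(IID_IMediaSeeking, (void **)&seeking);
    if (FAILED(hr))
        return hr;

    // Jump to the 10-second mark (default time format is 100 ns units).
    LONGLONG pos = 10LL * 10000000LL;
    hr = seeking->SetPositions(&pos, AM_SEEKING_AbsolutePositioning,
                               nullptr, AM_SEEKING_NoPositioning);

    // Ask for 2x forward playback; fails if any filter in the chain refuses.
    if (SUCCEEDED(hr))
        hr = seeking->SetRate(2.0);

    seeking->Release();
    return hr;
}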

How to play multiple mp3/wma files at once?

I need to play multiple sound effects at once in my WP7 app.
I currently have it working with WAV files that take around 5 MB, instead of the roughly 500 KB they would take encoded as WMA/MP3.
Current part of the code:
Stream stream = TitleContainer.OpenStream(String.Format("/location/{0}.wav", value));
SoundEffect effect = SoundEffect.FromStream(stream);
effect.Play();
This works great in a loop, preparing all effects, and then playing them.
However, I would really like to use mp3/wma/whatever-codec to slim my xap file down.
I tried to use MediaElement, but it appears that you can't use that to play multiple files either. Also, the XNA MediaPlayer can't be instantiated and, as far as I have seen, can't be made to play multiple files at once.
The only solution I see left is to somehow decode the MP3 to WAV and feed that Stream to SoundEffect.
Any ideas on how to accomplish the multiple playback? Or suggestions on how to decode mp3 to wav?
On the conversion... sorry - but I don't think there's any API currently available for WMA or MP3 decoding.
Also, I don't think there are any implementations of MP3, WMA or Ogg decoders available in pure C# code - all of those I've seen use DirectShow or P/Invoke - e.g. see C# Audio Library.
I personally do expect audio/video compression/decompression to be available at some point in the near future in the WP7 APIs - but I can't guess when!
For some simple compression you can try things like shipping mono instead of stereo files, or shipping 8-bit rather than 16-bit audio files - these are easy to convert back to 16-bit (with an obvious loss of resolution) on the phone.
Using compression like zip might also help for some sound effects... but I wouldn't expect it to be hugely successful.
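On the 8-bit idea: the conversion back up to 16-bit really is a couple of lines. A sketch (C++ for brevity; the arithmetic ports directly to C#, unsigned 8-bit WAV PCM is assumed, and expand_8_to_16 is an illustrative name):

#include <cstddef>
#include <cstdint>

// 8-bit WAV PCM is unsigned and centered on 128; 16-bit PCM is signed.
void expand_8_to_16(const uint8_t *in, int16_t *out, size_t samples)
{
    for (size_t i = 0; i < samples; i++)
        out[i] = static_cast<int16_t>((in[i] - 128) << 8); // recenter, rescale
}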
