Build an encoder on Android (FFmpeg)

I need to build an encoder on Android. I'm trying to encode the video stream captured by the camera to H.264.
I've got the libffmpeg.so file, but I don't know how to use it.
I'm new to this. Could anyone give me some suggestions?

To use the FFmpeg libraries on Android, you would have to integrate them as OMX components.
For ffmpeg compilation and OMX generation, you could refer to this link: FFmpeg on Android
Once you have the OMX component ready, you will have to integrate it into Android by registering it in media_codecs.xml. If you always want your specific encoder to be invoked, please ensure that your codec is the first codec registered in the list.
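For illustration only, an entry in media_codecs.xml has roughly the following shape. The component name below is a placeholder; it has to match whatever name your OMX core registers, and listing it ahead of the stock entries for the same MIME type makes it the first match:

    <MediaCodecs>
        <Encoders>
            <!-- Placeholder name; must match the component name your OMX core registers -->
            <MediaCodec name="OMX.myvendor.video.encoder.avc" type="video/avc" />
        </Encoders>
    </MediaCodecs>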
For the encoder, you will have to consider a couple of important points.
One, if you wish to optimize your system, then you may want to avoid copying of frames from the source (camera, surface or some other source) to the input port of your OMX encoder component. Hence, your codec will have to support passing of buffers through metadata (Reference: http://androidxref.com/4.2.2_r1/xref/frameworks/av/media/libmediaplayerservice/StagefrightRecorder.cpp#1413). If you require more information on this topic, please raise a separate question.
Two, the encoder will have to support the standard OMX indices and some new indices. For example, for Miracast, a new index prependSPSPPSToIDRFrames was introduced, which is supported through getExtensionIndex. For reference, you could refer to http://androidxref.com/4.2.2_r1/xref/frameworks/av/media/libstagefright/ACodec.cpp#891 .
In addition to the aforementioned index, the encoder will also get a new request to enableGraphicBuffers with a FALSE boolean value. The most important point for these two indices is to ensure that the OMX component doesn't fail when they are invoked.
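As a rough sketch of that last point, the component's GetExtensionIndex handler could map the Android-specific extension strings onto vendor index values and report anything it doesn't know as unsupported instead of failing hard. Everything here other than the two extension strings the framework queries (the function name, the vendor index values) is a placeholder:

    // Sketch only: the function name and vendor index values are hypothetical.
    #include <string.h>
    #include <OMX_Core.h>
    #include <OMX_Index.h>

    enum {
        kStoreMetaDataInBuffersIndex   = OMX_IndexVendorStartUnused + 1,
        kPrependSPSPPSToIDRFramesIndex = OMX_IndexVendorStartUnused + 2,
    };

    static OMX_ERRORTYPE MyEncoderGetExtensionIndex(
            OMX_HANDLETYPE hComponent, OMX_STRING cParameterName, OMX_INDEXTYPE *pIndexType) {
        (void)hComponent;
        if (!strcmp(cParameterName, "OMX.google.android.index.storeMetaDataInBuffers")) {
            *pIndexType = (OMX_INDEXTYPE)kStoreMetaDataInBuffersIndex;   // metadata buffer passing
            return OMX_ErrorNone;
        }
        if (!strcmp(cParameterName, "OMX.google.android.index.prependSPSPPSToIDRFrames")) {
            *pIndexType = (OMX_INDEXTYPE)kPrependSPSPPSToIDRFramesIndex; // SPS/PPS before IDR frames
            return OMX_ErrorNone;
        }
        // Never crash on an unknown extension; just report it as unsupported.
        return OMX_ErrorUnsupportedIndex;
    }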
With these modifications, you should be able to integrate your encoder into the Stagefright framework.

Related

In what way does this HEVC video not comply with Apple's requirements document?

My goal is to work out why a given video file does not play on Macos/Safari/Quicktime.
The background to this question is that it is possible to play HEVC videos with a transparent background/alpha channel on Safari/MacOS. To be playable, a video must meet the specific requirements set out by Apple in this document:
https://developer.apple.com/av-foundation/HEVC-Video-with-Alpha-Interoperability-Profile.pdf
The video that does not play on Apple/Safari/Quicktime is an HEVC video with an alpha transparency channel. Note that VLC for MacOS DOES play this file. Here it is:
https://drive.google.com/file/d/1ZnXjcDbk-_YxTgRuH_D7RSR9SXdY_XTv/view?usp=share_link
I have two example HEVC video files with a transparent background/alpha channel, and they both play fine using either Quicktime player or Safari:
Working video #1:
https://drive.google.com/file/d/1PJAyg_sVKVvb-Py8PAu42c1qm8l2qCbh/view?usp=share_link
Working video #2:
https://drive.google.com/file/d/1kk8ssUyT7qAaK15afp8VPR6mIWPFX8vQ/view?usp=sharing
The first step is to work out in what way my non-working video ( https://drive.google.com/file/d/1ZnXjcDbk-_YxTgRuH_D7RSR9SXdY_XTv/view?usp=share_link ) does not comply with the specification.
Once it is clear which requirements are not met by the non-working video then I can move onto the next phase, which is to try to formulate an ffmpeg command that will output a video meeting the requirements.
I have read Apple's requirements document, and I am out of my depth in trying to analyse the non-working video against the requirements - I don't know how to do it.
Can anyone suggest a way to identify what is wrong with the video?
Additional context is that I am trying to find a way to create Apple/MacOS compatible alpha channel / transparent videos using ffmpeg with hevc_nvenc running on an Intel machine. I am aware that Apple hardware can create such videos, but for a wide variety of reasons it is not practical for me to use Apple hardware to do the job. I have spent many hours trying all sorts of ffmpeg and ffprobe commands to try to work out what is wrong and modify the video to fix it, but to be honest most of my attempts are guesswork.
The Apple specification for an alpha layer in HEVC requires that the encoder process and store the alpha in a certain manner. It also requires that the stream configuration syntax be formed in a specific manner. At the time of writing, I'm aware of only the videotoolbox HEVC encoder being capable of emitting such a stream.
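Not part of the answer above, but one practical way to start the comparison is to dump both the stream-level metadata and the HEVC header syntax for a working file and for the non-working file, then diff the results; the trace_headers bitstream filter prints the VPS/SPS/PPS and SEI fields, which is where the alpha-layer signalling required by Apple's profile lives:

    # Container/stream level: codec tag, pixel format, colour metadata
    ffprobe -v error -show_format -show_streams input.mov

    # Bitstream level: dump parameter-set and SEI syntax to a text file for diffing
    ffmpeg -i input.mov -c:v copy -bsf:v trace_headers -f null - 2> headers.txt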

Free and open-source lib to decode an H.265 (HEVC) stream in a C project?

I'm doing a project in C which requires playing an incoming stream of HEVC content to the user. My understanding is that I need a library that gives me an API to an HEVC decoder (not an encoder, but a decoder). Here are my options so far:
x265 looks perfect, but it's all about the encoding part (and nothing about decoding!). I'm not interested in an API to an HEVC encoder; what I want is the decoder part.
There are libde265 and OpenHEVC, but I'm not sure they have what I want. I couldn't find anywhere in their docs an API that I can use to decode the content, but since there are players out there using those libs, I'm assuming it must be there somewhere ... I couldn't find it, though!
There is the FFmpeg project with its own decoders (HEVC included), but I'm not sure this is the right thing, since I only want the HEVC decoder and nothing else.
Cheers
Just go with FFmpeg; I'm guessing you'll only need to link against the libavcodec library and use its API/interfaces. And yes, the machine where your code runs may have the whole of FFmpeg installed (or maybe not; just the library might work).
Anyway, even that shouldn't be a problem unless the machine is an embedded system with tight space constraints (which is unlikely, since it's H.265, which implies plenty of resources are needed anyway).
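For what it's worth, a minimal libavcodec decode loop for an incoming HEVC Annex B byte stream looks roughly like the sketch below (error handling and end-of-stream flushing omitted; the buffering around the parser is simplified):

    #include <stddef.h>
    #include <libavcodec/avcodec.h>

    static void decode_hevc(const uint8_t *data, size_t size) {
        const AVCodec *codec = avcodec_find_decoder(AV_CODEC_ID_HEVC);
        AVCodecContext *ctx = avcodec_alloc_context3(codec);
        AVCodecParserContext *parser = av_parser_init(codec->id);
        AVPacket *pkt = av_packet_alloc();
        AVFrame *frame = av_frame_alloc();
        avcodec_open2(ctx, codec, NULL);

        while (size > 0) {
            /* Split the raw byte stream into complete packets (access units). */
            int used = av_parser_parse2(parser, ctx, &pkt->data, &pkt->size,
                                        data, (int)size,
                                        AV_NOPTS_VALUE, AV_NOPTS_VALUE, 0);
            data += used;
            size -= used;
            if (pkt->size == 0)
                continue;
            avcodec_send_packet(ctx, pkt);
            while (avcodec_receive_frame(ctx, frame) == 0) {
                /* frame->data / frame->linesize now hold one decoded picture. */
            }
        }

        av_frame_free(&frame);
        av_packet_free(&pkt);
        av_parser_close(parser);
        avcodec_free_context(&ctx);
    }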

WMV encoding using Media Foundation: specifying "Number of B Frames"

I am encoding video to WMV using Media Foundation SDK. I see that the number of B frames can be set using a property, but I have no clue how/where to set it.
That property is called MFPKEY_NUMBFRAMES and is described here:
http://msdn.microsoft.com/en-us/library/windows/desktop/ff819354%28v=vs.85%29.aspx
Our code does roughly the following:
call MFStartup
call MFCreateAttributes once so we can set muxer, video and audio attributes
configure the IMFAttributes created in the previous step, for example by setting the video bitrate: pVideoOverrides->SetUINT32(MF_MT_AVG_BITRATE, m_iVideoBitrateBPS);
create sink writer by calling IMFReadWriteClassFactory::CreateInstanceFromURL
for each frame, call WriteSample on the sink writer
call MFShutdown
Am I supposed to set the B-frames property on the IMFAttributes on which I also set the video bitrate?
The property is applicable to Windows Media Video 9 Encoder. That is, you need to locate it on your topology and adjust the property there. Other topology elements (e.g. multiplexer) might accept other properties, but this one has no effect there.
MSDN gives you step-by-step instructions in Configuring a WMV Encoder, where it says:
To specify the target bitrate, set the MF_MT_AVG_BITRATE attribute on the media type.
You can also alter other encoder properties there. There is also a detailed step-by-step tutorial, Tutorial: 1-Pass Windows Media Encoding, which shows the steps of the entire process.
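For the sink writer path the question describes, one hedged possibility is sketched below. It assumes (this is an assumption, not something stated in the answer or in the question's code) that the WMV encoder loaded by the sink writer exposes its configuration property store through IMFSinkWriter::GetServiceForStream, and that MFPKEY_NUMBFRAMES can be set there after the media types are configured but before the first WriteSample:

    // Sketch only. The function name and stream index are placeholders; MFPKEY_NUMBFRAMES
    // comes from wmcodecdsp.h (linking the corresponding uuid library may be required).
    #include <mfreadwrite.h>
    #include <wmcodecdsp.h>
    #include <propsys.h>
    #include <propvarutil.h>

    HRESULT SetNumBFrames(IMFSinkWriter *pWriter, DWORD videoStreamIndex, UINT32 numBFrames)
    {
        IPropertyStore *pProps = NULL;
        // GUID_NULL asks the writer for an interface directly on the encoder (or sink).
        HRESULT hr = pWriter->GetServiceForStream(videoStreamIndex, GUID_NULL,
                                                  IID_PPV_ARGS(&pProps));
        if (SUCCEEDED(hr))
        {
            PROPVARIANT var;
            InitPropVariantFromUInt32(numBFrames, &var);
            hr = pProps->SetValue(MFPKEY_NUMBFRAMES, var);  // before the first WriteSample
            PropVariantClear(&var);
            pProps->Release();
        }
        return hr;
    }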

Wrap a stream of raw H264 NALUs into a container like MP4

I have an application that sends raw H.264 NALUs generated on the fly by encoding with x264's x264_encoder_encode. I am receiving them over plain TCP, so I am not missing any frames.
I need to be able to decode such a stream in the client using hardware acceleration on Windows (DXVA2). I have been struggling to find a way to get this to work using FFmpeg. Perhaps it would be easier to try Media Foundation or DirectShow, but they won't take raw H.264.
I either need to:
Change the code in the server application to give back an MP4 stream. I am not that experienced with x264. I was able to get raw H.264 by calling x264_encoder_encode, following the answer to this question: How does one encode a series of images into H264 using the x264 C API? How can I go from this to something that is wrapped in MP4 while still being able to stream it in real time?
At the receiver, I could wrap it with MP4 headers and feed it into something that can play it using DXVA. I wouldn't know how to do this.
I could find another way to accelerate it using DXVA with FFmpeg, or something else that takes it in raw format.
An important restriction is that I need to be able to pre-process each decoded frame before displaying it. Any solution that does decoding and displaying in a single step would not work for me.
I would be fine with any of these solutions.
I believe you should be able to use H.264 packets off the wire with Media Foundation. There's an example on page 298 of this book, http://www.docstoc.com/docs/109589628/Developing-Microsoft-Media-Foundation-Applications#, that uses an HTTP stream with Media Foundation.
I'm only learning Media Foundation myself and am trying to do something similar to you; in my case I want to use H.264 payloads from RTP packets, and from my understanding that will require a custom IMFSourceReader. Accessing the decoded frames should also be possible from what I've read, since there seems to be complete flexibility in chaining components together into topologies.
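If the FFmpeg route stays on the table, the first two options from the question (wrapping the raw Annex B stream in MP4 while keeping it streamable) can be covered by a libavformat remux without re-encoding. Below is a rough sketch only: the file paths are placeholders for whatever byte stream the receiver actually has (a real implementation would feed the TCP data in through a custom AVIOContext), and fragmented MP4 is chosen so the output can be consumed while it is still growing:

    #include <libavformat/avformat.h>

    int remux_to_fragmented_mp4(const char *in_path, const char *out_path)
    {
        AVFormatContext *in = NULL, *out = NULL;
        avformat_open_input(&in, in_path, NULL, NULL);   /* raw .h264 input, one video stream */
        avformat_find_stream_info(in, NULL);

        avformat_alloc_output_context2(&out, NULL, "mp4", out_path);
        AVStream *vs = avformat_new_stream(out, NULL);
        avcodec_parameters_copy(vs->codecpar, in->streams[0]->codecpar);
        avio_open(&out->pb, out_path, AVIO_FLAG_WRITE);

        /* Fragmented MP4: playable/streamable before the file is finished. */
        AVDictionary *opts = NULL;
        av_dict_set(&opts, "movflags", "frag_keyframe+empty_moov", 0);
        avformat_write_header(out, &opts);

        AVPacket pkt;
        while (av_read_frame(in, &pkt) >= 0) {
            /* Raw Annex B carries no timestamps; a real implementation would stamp
               pkt.pts/pkt.dts here from a frame counter and the known frame rate. */
            pkt.stream_index = 0;
            av_packet_rescale_ts(&pkt, in->streams[0]->time_base, vs->time_base);
            av_interleaved_write_frame(out, &pkt);
            av_packet_unref(&pkt);
        }
        av_write_trailer(out);

        avio_closep(&out->pb);
        avformat_free_context(out);
        avformat_close_input(&in);
        return 0;
    }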

Can I get raw video frames from DirectShow without playback

I'm working on a media player using Media Foundation. I want to support VOB file playback. However, Media Foundation currently does not support the VOB container, therefore I wish to use DirectShow for this.
My idea here is not to take an alternate path using a DirectShow graph, but just to grab video frames and pass them to the same pipeline in Media Foundation. In Media Foundation, I have an 'IMFSourceReader' which simply reads frames from the video file. Is there a DirectShow equivalent which just gives me the frames, without needing to create a graph, start a playback cycle, and then try to extract frames from the renderer's pin? (To be clearer, does DirectShow support an architecture wherein it could give me raw frames without actually having to play the video?)
I've read about ISampleGrabber, but it's deprecated and I think it won't fit my architecture. I've not worked with DirectShow before.
Thanks,
Mots
You have to build a graph and accept frames from the respective parser/demultiplexer filter, which will read the container and deliver individual frames on its output.
The playback does not have to be real time, nor do you need to fake painting those video frames somewhere. Once you get the data you need in a Sample Grabber filter, or a custom filter, you can terminate the pipeline with a Null Renderer. That is, you can arrange to get the frames you need in a more or less convenient way.
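To make that concrete, here is a rough sketch of such a graph, using the deprecated qedit.h Sample Grabber purely for illustration (a custom filter slots in the same way). The function name and the RGB24 media type choice are assumptions, CoInitialize is assumed to have been called, and the graph still has to be run (IMediaControl::Run) before frames arrive in the callback:

    #include <dshow.h>
    #include <qedit.h>   // CLSID_SampleGrabber, CLSID_NullRenderer (older SDKs only)

    HRESULT BuildVobGrabGraph(const wchar_t *vobPath, ISampleGrabberCB *pCallback)
    {
        IGraphBuilder *pGraph = NULL;
        ICaptureGraphBuilder2 *pBuilder = NULL;
        IBaseFilter *pSource = NULL, *pGrabberF = NULL, *pNullF = NULL;
        ISampleGrabber *pGrabber = NULL;

        CoCreateInstance(CLSID_FilterGraph, NULL, CLSCTX_INPROC_SERVER, IID_PPV_ARGS(&pGraph));
        CoCreateInstance(CLSID_CaptureGraphBuilder2, NULL, CLSCTX_INPROC_SERVER, IID_PPV_ARGS(&pBuilder));
        pBuilder->SetFiltergraph(pGraph);

        pGraph->AddSourceFilter(vobPath, L"Source", &pSource);

        CoCreateInstance(CLSID_SampleGrabber, NULL, CLSCTX_INPROC_SERVER, IID_PPV_ARGS(&pGrabberF));
        pGraph->AddFilter(pGrabberF, L"Sample Grabber");
        pGrabberF->QueryInterface(IID_PPV_ARGS(&pGrabber));

        // Ask for uncompressed RGB so a decoder gets inserted upstream of the grabber.
        AM_MEDIA_TYPE mt = {};
        mt.majortype = MEDIATYPE_Video;
        mt.subtype   = MEDIASUBTYPE_RGB24;
        pGrabber->SetMediaType(&mt);
        pGrabber->SetCallback(pCallback, 0);   // 0 = deliver each frame via SampleCB

        CoCreateInstance(CLSID_NullRenderer, NULL, CLSCTX_INPROC_SERVER, IID_PPV_ARGS(&pNullF));
        pGraph->AddFilter(pNullF, L"Null Renderer");

        // Let the builder insert the demultiplexer/decoder between source, grabber and renderer.
        HRESULT hr = pBuilder->RenderStream(NULL, &MEDIATYPE_Video, pSource, pGrabberF, pNullF);

        // Interface releases omitted for brevity; run the graph with IMediaControl::Run.
        return hr;
    }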
You can use the Monogram frame grabber filter to connect to the VOB DS filter's output - it works great. See the comments there for how to connect the output to an external application.
