Wrap a stream of raw H264 NALUs into a container like MP4 - ffmpeg

I have an application that sends raw H.264 NALUs as they are generated on the fly by x264's x264_encoder_encode. I receive them over plain TCP, so I am not missing any frames.
I need to be able to decode such a stream in the client using hardware acceleration in Windows (DXVA2). I have been struggling to find a way to get this to work using FFmpeg. Perhaps it would be easier to try Media Foundation or DirectShow, but they won't take raw H.264.
I need to do one of the following:
Change the server application to give back an MP4 stream. I am not that experienced with x264. I was able to get raw H.264 by calling x264_encoder_encode, following the answer to this question: How does one encode a series of images into H264 using the x264 C API? How can I go from this to something that is wrapped in MP4 while still being able to stream it in real time?
Wrap the stream with MP4 headers at the receiver and feed it into something that can play it using DXVA. I wouldn't know how to do this.
Find another way to accelerate it using DXVA with FFmpeg, or with something else that takes it in raw format.
An important restriction is that I need to be able to pre-process each decoded frame before displaying it. Any solution that does decoding and displaying in a single step would not work for me.
I would be fine with any of these solutions.

I believe you should be able to use H.264 packets off the wire with Media Foundation. There's an example on page 298 of this book, http://www.docstoc.com/docs/109589628/Developing-Microsoft-Media-Foundation-Applications#, that uses an HTTP stream with Media Foundation.
I'm only learning Media Foundation myself and am trying to do something similar. In my case I want to use H.264 payloads from RTP packets, and from my understanding that will require a custom IMFSourceReader. Accessing the decoded frames should also be possible from what I've read, since there seems to be complete flexibility in chaining components together into topologies.
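On the "wrap it with MP4 headers" option from the question: MP4 does not store NAL units with Annex B start codes (00 00 01); it stores each NAL unit behind a length field, commonly 4 bytes big-endian. The following is a minimal sketch of that one conversion step only, assuming the input is an Annex B byte stream. The function names are made up for illustration, and a real MP4 additionally needs the box structure and a decoder configuration record carrying the SPS/PPS, which this does not produce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Find the next 3-byte start code (00 00 01) at or after pos.
 * Returns the index of its first byte, or len if there is none. */
static size_t find_start_code(const uint8_t *buf, size_t len, size_t pos)
{
    for (; pos + 3 <= len; pos++) {
        if (buf[pos] == 0 && buf[pos + 1] == 0 && buf[pos + 2] == 1)
            return pos;
    }
    return len;
}

/* Hypothetical helper: rewrite an Annex B buffer as NAL units that are each
 * prefixed with a 4-byte big-endian length.
 * Returns the number of bytes written, or 0 if out_cap is too small. */
size_t annexb_to_length_prefixed(const uint8_t *in, size_t in_len,
                                 uint8_t *out, size_t out_cap)
{
    size_t written = 0;
    size_t sc;

    assert(in != NULL && out != NULL);
    sc = find_start_code(in, in_len, 0);

    while (sc < in_len) {
        size_t nal_start = sc + 3;
        size_t next_sc = find_start_code(in, in_len, nal_start);
        size_t nal_end = next_sc;
        size_t nal_len;

        /* A NAL unit never ends in 0x00, so trailing zeros are padding or
         * the leading zero of a following 4-byte start code. */
        while (nal_end > nal_start && in[nal_end - 1] == 0)
            nal_end--;

        nal_len = nal_end - nal_start;
        if (nal_len > 0) {
            if (written + 4 + nal_len > out_cap)
                return 0;
            out[written + 0] = (uint8_t)(nal_len >> 24);
            out[written + 1] = (uint8_t)(nal_len >> 16);
            out[written + 2] = (uint8_t)(nal_len >> 8);
            out[written + 3] = (uint8_t)nal_len;
            memcpy(out + written + 4, in + nal_start, nal_len);
            written += 4 + nal_len;
        }
        sc = next_sc;
    }
    return written;
}
```

Called on the bytes of one encoded frame, this yields a buffer in the layout a length-prefixed consumer expects; whether a given decoder or muxer wants 4-byte lengths is something to confirm against its configuration.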

Related

Decode H.264 without stream

I have an application wherein I have H.264 frames from an RTSP stream stored in a proprietary database. I need to be able to present a frame to the H.264 decoder (frames in sequence, of course) and get back the decoded frame (bitmap, whatever) as output. I cannot use the traditional DirectShow streams because I don't have a stream. Is there any codec that can be used in this manner? Later I will need to go the other way as well (given bitmaps or other format images, create an H.264 stream). Any help you can give would be greatly appreciated.
Create a DirectShow Source Filter that assembles the H.264 stream from the database, then you can pass it to a standard DirectShow H.264 decoder. Look into the DirectShow samples for example source code.
As Isso mentioned already, you can push the H.264 data into a DirectShow pipeline and have the frame decoded. In addition, there is the H.264 Video Decoder MFT (Windows 7 and more recent only), which might be an easier way to use the decoder and to apply it to an individual "frame". You can use other decoders as well, such as FFmpeg/libavcodec; however, you would still need to interface with decoders that are typically designed for stream processing.

Encode WebCam frames with H.264 on .NET

What I want to do is the following:
Get a frame from the webcam.
Encode it with an H.264 encoder.
Create a packet containing that frame, using my own "protocol", to send it via UDP.
Receive it and decode it...
It would be live streaming.
I just need help with the second step.
I'm retrieving camera images with the AForge Framework.
I don't want to write frames to files and then decode them; I guess that would be very slow.
I would like to handle encoded frames in memory and then create the packets to be sent.
I need to use an open source encoder. I already tried x264, following this example:
How does one encode a series of images into H264 using the x264 C API?
but it seems it only works on Linux, or at least that's what I thought after I saw around 50 errors when trying to compile the example with Visual C++ 2010.
I have to make clear that I already did a lot of research (1 week of reading) before writing this, but couldn't find a (simple) way to do it.
I know there is the RTMP protocol, but the video stream will always be seen by one person at a time, and RTMP is more oriented towards streaming to many people. I also already streamed with an Adobe Flash application I made, but it was too laggy ¬¬.
I would also like some advice on whether it's OK to send frames one by one or whether it would be better to send several of them within each packet.
I hope that someone can at least point me in the right direction.
Apologies if my English is not good. :P
PS: it doesn't have to be in .NET; it can be in any language as long as it works on Windows.
Many thanks in advance.
You could try your approach using Microsoft's DirectShow technology. There is an open source x264 wrapper available for download at Monogram.
If you download the filter, you need to register it with the OS using regsvr32. I would suggest doing some quick testing to find out whether this approach is feasible: use the GraphEdit tool to connect your webcam to the encoder and have a look at the configuration options.
"I would also like some advice on whether it's OK to send frames one by one or whether it would be better to send several of them within each packet."
This really depends on the required latency: the more frames you package, the less header overhead, but the more latency, since you have to wait for multiple frames to be encoded before you can send them. For live streaming the latency should be kept to a minimum, and the typical protocols used are RTP/UDP. This implies that your maximum packet size is limited to the MTU of the network, often requiring IDR frames to be fragmented and sent in multiple packets.
My advice would be to not worry about sending more frames in one packet until/unless you have a reason to. This is more often necessary with audio streaming since the header size (e.g. IP + UDP + RTP) is considered big in relation to the audio payload.
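To put rough numbers on the overhead and fragmentation points, here is a small sketch. It assumes IPv4 without options (20 bytes), UDP (8 bytes), and the fixed RTP header (12 bytes) with no CSRC entries or extensions; a custom protocol like the one in the question would have its own header size instead of RTP's. The function names are hypothetical, and the extra per-fragment bytes that an RTP H.264 payload format adds are ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed header sizes in bytes (see the assumptions stated above). */
enum {
    IPV4_HEADER   = 20,
    UDP_HEADER    = 8,
    RTP_HEADER    = 12,
    HEADERS_TOTAL = IPV4_HEADER + UDP_HEADER + RTP_HEADER
};

/* Largest payload that fits in one packet for a given MTU. */
size_t max_payload(size_t mtu)
{
    assert(mtu > (size_t)HEADERS_TOTAL);
    return mtu - (size_t)HEADERS_TOTAL;
}

/* Number of packets needed to carry one encoded frame. */
size_t packets_needed(size_t frame_bytes, size_t mtu)
{
    size_t payload = max_payload(mtu);
    return (frame_bytes + payload - 1) / payload;  /* ceiling division */
}

/* Header overhead as a percentage of the bytes on the wire, rounded down. */
size_t overhead_percent(size_t payload_bytes)
{
    size_t headers = (size_t)HEADERS_TOTAL;
    return 100 * headers / (headers + payload_bytes);
}
```

With a 1500-byte MTU, a 60000-byte IDR frame needs packets_needed(60000, 1500) packets, while a 160-byte audio payload carries noticeably more relative overhead than a full-size video packet, which is why bundling matters more for audio.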

What is the minimum amount of metadata needed to stream only video using libx264 to encode at the server and libffmpeg to decode at the client?

I want to stream video (no audio) from a server to a client. I will encode the video using libx264 and decode it with ffmpeg. I plan to use fixed settings (at the very least they will be known in advance by both the client and the server). I was wondering if I can avoid wrapping the compressed video in a container format (like mp4 or mkv).
Right now I am able to encode my frames using x264_encoder_encode. I get a compressed frame back, and I can do that for every frame. What extra information (if anything at all) do I need to send to the client so that ffmpeg can decode the compressed frames, and more importantly, how can I obtain it with libx264? I assume I may need to generate NAL information (x264_nal_encode?). Having an idea of the minimum necessary to get the video across, and of how to put the pieces together, would be really helpful.
I found out that the minimum amount of information is the NAL units from each frame; this gives me a raw H.264 stream. If I write this to a file, I can watch it using VLC if I give the file a .h264 extension.
I can also open such a file using ffmpeg, but if I want to stream it, then it makes more sense to use RTSP, and a good open source library for that is Live555: http://www.live555.com/liveMedia/
In their FAQ they mention how to send the output from your encoder to live555, and there is source for both a client and a server. I have yet to finish coding this, but it seems like a reasonable solution.
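One detail behind "the NAL units from each frame": a decoder needs the sequence and picture parameter sets (NAL types 7 and 8) before it can decode the first IDR slice (type 5). The type is the low 5 bits of the first byte after the start code. The sketch below assumes an Annex B stream and only checks that ordering; the function names are made up for illustration. On the x264 side, the b_repeat_headers parameter is, as far as I know, what makes the encoder re-emit SPS/PPS with each keyframe, which helps a client that joins mid-stream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A few nal_unit_type values from the H.264 specification. */
enum {
    NAL_SLICE_NON_IDR = 1,
    NAL_SLICE_IDR     = 5,
    NAL_SEI           = 6,
    NAL_SPS           = 7,
    NAL_PPS           = 8
};

/* nal_unit_type is the low 5 bits of the NAL header byte. */
int nal_unit_type(uint8_t header)
{
    return header & 0x1F;
}

/* Hypothetical check: returns 1 if the buffer contains both an SPS and a PPS
 * before its first IDR slice, 0 otherwise (including when no IDR is found). */
int has_headers_before_idr(const uint8_t *buf, size_t len)
{
    int have_sps = 0, have_pps = 0;
    size_t i;

    assert(buf != NULL);
    for (i = 0; i + 3 < len; i++) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1) {
            int type = nal_unit_type(buf[i + 3]);
            if (type == NAL_SPS)
                have_sps = 1;
            else if (type == NAL_PPS)
                have_pps = 1;
            else if (type == NAL_SLICE_IDR)
                return have_sps && have_pps;
            i += 3;  /* with the loop increment, resume after the header byte */
        }
    }
    return 0;
}
```

A client could run a check like this on the first bytes it receives and discard data until it sees a parameter set, rather than feeding the decoder slices it cannot interpret yet.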

Save Live Video Stream To Local Storage

Problem:
I have to save live video stream data that arrives as RTP packets from an RTSP server.
The data comes in two formats: MPEG4 and H.264.
I do not want to encode/decode the input stream.
I just want to write it to a file that is playable with the proper codecs.
Any advice?
Best Wishes
History:
My solutions and their problems:
First attempt: FFmpeg
I use the FFmpeg library to get the audio and video RTP packets.
But in order to write packets I have to use av_write_frame,
and it seems that decoding/encoding takes place.
Also, when I set the output format to mp4 ( av_guess_format("mp4", NULL, NULL) ),
the output file is unplayable.
[ Anyway, ffmpeg has bad docs; it is hard to find out what is wrong. ]
Second attempt: DirectShow
Then I decided to use DirectShow. I found an RTSP Source Filter,
then a Mux and a File Writer.
I created a single graph:
RTSP Source --> MPEG MUX ---> File Writer
It worked, but the problem is that the output file is not playable
if the graph is not stopped. If something happens, for example the graph crashes,
the output file is not playable.
I am also able to write H.264 data, but the video is completely unplayable.
The MP4 file format has an index that is required for correct playback, and the index can only be created once you've finished recording. So any solution using MP4 container files (and other indexed files) is going to suffer from the same problem. You need to stop the recording to finalise the file, or it will not be playable.
One solution that might help is to break the graph up into two parts, so that you can keep recording to a new file while stopping the current one. There's an example of this at www.gdcl.co.uk/gmfbridge.
If you try the GDCL MP4 multiplexor, and you are having problems with H264 streams, see the related question GDCL Mpeg-4 Multiplexor Problem
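To see whether a recording was finalised, one can look for the index itself. An MP4 file is a sequence of boxes, each starting with a 4-byte big-endian size and a 4-character type, and the index lives in the 'moov' box. With a muxer that writes 'moov' at the end, a recording that was never stopped cleanly has media data ('mdat') but no 'moov'. The sketch below walks the top-level boxes of an in-memory buffer; the function name is hypothetical, and it deliberately does not descend into nested boxes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical helper: returns 1 if a top-level box with the given
 * four-character type starts within the buffer, 0 otherwise. */
int mp4_has_top_level_box(const uint8_t *buf, size_t len, const char *type)
{
    size_t pos = 0;

    assert(buf != NULL && type != NULL && strlen(type) == 4);
    while (pos + 8 <= len) {
        uint64_t size = read_be32(buf + pos);

        if (memcmp(buf + pos + 4, type, 4) == 0)
            return 1;

        if (size == 1) {
            /* A size of 1 means a 64-bit size follows the type field. */
            if (pos + 16 > len)
                return 0;
            size = ((uint64_t)read_be32(buf + pos + 8) << 32) |
                   read_be32(buf + pos + 12);
            if (size < 16)
                return 0;  /* malformed */
        } else if (size == 0) {
            return 0;      /* box runs to end of file, so it is the last one */
        } else if (size < 8) {
            return 0;      /* malformed */
        }

        if (size > len - pos)
            return 0;      /* box is truncated, nothing can follow it */
        pos += (size_t)size;
    }
    return 0;
}
```

Reading a file into memory and calling mp4_has_top_level_box(data, size, "moov") gives a quick indication of whether the index was ever written; for large recordings a streaming version that seeks from box to box would be more practical.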

Adding audio channel using ffmpeg

I am working with ffmpeg and trying to add an audio stream on the fly. I am using AudioQueues and I get a raw audio buffer. I am encoding audio as linear PCM, so the audio I get is in raw format, which I know ffmpeg accepts, but I cannot figure out how. I have looked into AVStream, where we have to create a new stream for this audio channel, but how do I encode it into a video that is already initialized in another AVStream structure?
Overall, I would like to have an idea of the architecture of ffmpeg. I find it difficult to work with since it is poorly documented. Any pointers or details are appreciated.
Thanks and Regards,
Raj Pawan G
If you want to use Java, you'll find a much better documented API wrapper for FFmpeg in Xuggler.
That said, FFmpeg can support raw PCM, but not all containers can contain it. Use the PCM codecs (see avcodec.h) and find the one that has the right size and attributes for your audio. To add the audio to the same container, find the AVFormatContext object that you use for your existing video stream and use av_new_stream(...) to add a new stream (newer FFmpeg versions replace this call with avformat_new_stream). Then attach your PCM encoder, 'encode' to that stream, and write the resulting packets. See output_example.c in the FFmpeg source for examples of this API in action.
