Convert webm (or any other) format's chunks to mp4 - ffmpeg

Is it possible to receive WebM (or other format) chunks from an HTTP POST (upload) on my server (I know how to do this), then feed them as chunks (as received from the browser) to GStreamer or FFmpeg to be converted to MP4 with reduced quality, without loading the entire file into memory or onto disk before saving the converted MP4? Why don't I want the file loaded fully into memory or onto disk? Scalability.

Yes, you can feed FFmpeg the data incrementally without keeping the whole video file locally. You can read chunks of data from the HTTP stream and pass them to the FFmpeg libraries to decode. Here is an official example.
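Separately from the official library example referred to above, the same idea can be sketched with the ffmpeg command-line tool and pipes. This is a minimal sketch, not a definitive implementation. It assumes an ffmpeg binary on the PATH built with libx264 and the native AAC encoder. The helper names build_ffmpeg_cmd and stream_through, and the CRF value, are made up for illustration. The -movflags frag_keyframe+empty_moov option asks the MP4 muxer for fragmented output, which is what allows writing MP4 to a non-seekable pipe.

```python
import subprocess
import threading
from typing import Callable, Iterable, List


def build_ffmpeg_cmd(crf: int = 30) -> List[str]:
    """ffmpeg command: read any container from stdin, write fragmented MP4 to stdout.

    Assumption: the build includes libx264; a higher CRF means lower quality.
    """
    return [
        "ffmpeg", "-hide_banner", "-loglevel", "error",
        "-i", "pipe:0",                       # input arrives on stdin
        "-c:v", "libx264", "-crf", str(crf),  # re-encode video at reduced quality
        "-c:a", "aac",                        # re-encode audio
        "-movflags", "frag_keyframe+empty_moov",  # fragmented MP4, no seeking back
        "-f", "mp4", "pipe:1",                # output leaves on stdout
    ]


def stream_through(cmd: List[str],
                   chunks: Iterable[bytes],
                   sink: Callable[[bytes], None],
                   read_size: int = 64 * 1024) -> int:
    """Feed chunks to a child process and hand its output to sink as it appears.

    Only one input chunk and one output block are held in memory at a time.
    Returns the child's exit code.
    """
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def drain() -> None:
        # Read output concurrently so the child never blocks on a full pipe.
        while True:
            data = proc.stdout.read1(read_size)
            if not data:
                break
            sink(data)

    reader = threading.Thread(target=drain, daemon=True)
    reader.start()
    try:
        for chunk in chunks:
            proc.stdin.write(chunk)
    except BrokenPipeError:
        pass  # the child exited early; its exit code reports the failure
    finally:
        try:
            proc.stdin.close()  # signals end of input
        except BrokenPipeError:
            pass
    reader.join()
    return proc.wait()
```

In an upload handler, chunks would be the iterator over the request body and sink could append to a file or forward to storage, for example stream_through(build_ffmpeg_cmd(30), request_chunks, out_file.write), where request_chunks and out_file are placeholders.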

Related

What exactly is Fragmented mp4(fMP4)? How is it different from normal mp4?

Media Source Extension (MSE) needs fragmented mp4 for playback in the browser.
A fragmented MP4 contains a series of segments which can be requested individually if your server supports byte-range requests.
Boxes aka Atoms
All MP4 files use an object-oriented format that contains boxes, aka atoms.
You can view a representation of the boxes in your MP4 using an online tool such as MP4 Parser or if you're using Windows, MP4 Explorer. Let's compare a normal MP4 with one that is fragmented:
Non-Fragmented MP4
This screenshot (from MP4 Parser) shows an MP4 that hasn't been fragmented and quite simply has one massive mdat (Movie Data) box.
If we were building a video player that supports adaptive bitrate, we might need to know the byte position of the 10 sec mark in a 0.5Mbps and a 1Mbps file in order to switch the video source between the two files at that moment. Determining this exact byte position within one massive mdat in each respective file is not trivial.
Fragmented MP4
This screenshot shows a fragmented MP4 which has been segmented using MP4Box with the onDemand profile.
You'll notice the sidx and series of moof+mdat boxes. The sidx is the Segment Index and stores metadata of the precise byte range locations of the moof+mdat segments.
Essentially, you can independently load the sidx (its byte-range will be defined in the accompanying .mpd Media Presentation Description file) and then choose which segments you'd like to subsequently load and add to the MSE SourceBuffer.
Importantly, each segment is created at a regular interval of your choosing (i.e. every 5 seconds), so the segments can have temporal alignment across files of different bitrates, making it easy to adapt the bitrate during playback.
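The box layout described above can also be inspected without a GUI tool. The following is a rough sketch that walks the top-level boxes of a file using the ISOBMFF box header (32-bit big-endian size followed by a four-character type). The function names are illustrative, and treating "has a top-level moof" as "is fragmented" is a simplifying assumption.

```python
import struct
from typing import BinaryIO, Iterator, Tuple


def iter_top_level_boxes(f: BinaryIO) -> Iterator[Tuple[str, int, int]]:
    """Yield (box_type, offset, size) for each top-level box in a seekable file."""
    offset = 0
    while True:
        f.seek(offset)
        header = f.read(8)
        if len(header) < 8:
            return  # end of file (or trailing garbage)
        size, box_type = struct.unpack(">I4s", header)
        if size == 1:
            # size == 1 means a 64-bit "largesize" follows the type field
            ext = f.read(8)
            if len(ext) < 8:
                return
            size = struct.unpack(">Q", ext)[0]
        elif size == 0:
            # size == 0 means the box runs to the end of the file
            size = f.seek(0, 2) - offset
        if size < 8:
            return  # malformed header; stop rather than loop forever
        yield box_type.decode("latin-1"), offset, size
        offset += size


def is_fragmented(f: BinaryIO) -> bool:
    """Heuristic: a file with at least one top-level moof box is fragmented."""
    return any(box_type == "moof" for box_type, _, _ in iter_top_level_boxes(f))
```

Run against an open binary file, a non-fragmented MP4 would typically list ftyp, moov and one large mdat, while a fragmented one would show repeated moof and mdat pairs, possibly preceded by a sidx.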
Media File Formats
Media data streams are wrapped in a container format. The container includes the physical data of the media but also metadata that are necessary for playback. For example, it signals to the video player the codec used, subtitle tracks etc. In video streaming there are two main formats that are used for storage and presentation of multimedia content: MPEG-2 Transport Streams (MPEG-2 TS) [25] and ISO Base Media File Formats (ISOBMFF) [24] (MP4 and fragmented MP4).
MPEG-2 Transport Streams are specified by [25] and are designed for broadcasting video through satellite networks. However, Apple adopted it for its adaptive streaming protocol, making it an important format. In MPEG-2 TS, audio, video and subtitle streams are multiplexed together.
MP4 and fragmented MP4 (fMP4) are both part of the MPEG-4 Part 12 standard that covers the ISOBMFF. MP4 is the best-known multimedia container format and it's widely supported in different operating systems and devices. The structure of an MP4 video file is shown in figure 2.2a. As shown, MP4 consists of different boxes, each with a different functionality. These boxes are the basic building block of every container in MP4. For example, the file type box ('ftyp') specifies the compatible brands (specifications) of the file. MP4 files have a Movie Box ('moov') that contains metadata of the media file and sample tables that are important for timing and indexing the media samples ('stbl'). Also there is a Media Data Box ('mdat') that contains the corresponding samples. In the fragmented container, shown in figure 2.2b, media samples are interleaved by using Movie Fragment boxes ('moof') which contain the sample table for the specific fragment (mdat box).
Ref : https://repository.tudelft.nl/islandora/object/uuid%3Ae06cde4c-1514-4a8d-90be-7e10eee5aac1

FFmpeg - Is it difficult to use

I am trying to use FFmpeg, and have been doing a lot of experiments over the last month.
I have not been able to get anywhere. Is FFmpeg really that difficult to use?
My requirement is simple, as described below.
Can you please tell me whether FFmpeg is a suitable choice, or whether I have to implement this on my own (using the available codec libraries)?
I have a WebM file (containing VP8 and Opus frames).
I will read the encoded data and send it to a remote guy.
The remote guy will read the encoded data from a socket.
The remote guy will write it to a file (can we avoid decoding?).
Then the remote guy should be able to play the file using ffplay or any other player.
Now I will take a specific example.
Say I have a file small.webm, containing VP8 and OPUS frames.
I am reading only the audio frames (Opus) using the av_read_frame API (I then check the stream index and keep audio frames only).
So now I have the encoded data buffer as packet.data and its size as packet.size (please correct me if I am wrong).
Here is my first doubt: the audio packet size is not the same every time. Why the difference? Sometimes the packet size is as low as 54 bytes and sometimes it is 420 bytes. For Opus, will the frame size vary from time to time?
Next, say I somehow extract a single frame from the packet (I really do not know how to extract a single frame) and send it to the remote guy.
Now the remote guy needs to write the buffer to a file. To write the file we can use the av_interleaved_write_frame or av_write_frame API. Both of them take an AVPacket as an argument. Now I can create an AVPacket and set its data and size members. Then I can call the av_write_frame API. But that does not work. The reason may be that one should set other members of the packet, like ts, dts, pts etc. But I do not have such information to set.
Can somebody help me learn whether FFmpeg is the right choice, or should I write custom logic, like parsing an Opus file and getting it frame by frame?
"Now the remote guy needs to write the buffer to a file. To write the file we can use the av_interleaved_write_frame or av_write_frame API. Both of them take an AVPacket as an argument. Now I can create an AVPacket and set its data and size members. Then I can call the av_write_frame API. But that does not work. The reason may be that one should set other members of the packet, like ts, dts, pts etc. But I do not have such information to set."
Yes, you do. They were in the original packet you received from the demuxer in the sender. You need to serialize all information in this packet and set each value accordingly in the receiver.

Extract frames as images from an RTMP stream in real-time

I am streaming short videos (4 or 5 seconds) encoded in H264 at 15 fps in VGA quality from different clients to a server using RTMP, which produces an FLV file. I need to analyse the frames from the video as images as soon as possible, so I need the frames to be written as PNG images as they are received.
Currently I use Wowza to receive the streams and I have tried using the transcoder API to access the individual frames and write them to PNGs. This partially works, but there is about a second of delay before the transcoder starts processing, and when the stream ends Wowza flushes its buffers, causing the last second not to get transcoded, meaning I can lose the last 25% of the video frames. I have tried to find a workaround but Wowza say that it is not possible to prevent the buffer getting flushed. It is also not the ideal solution because there is a 1 second delay before I start getting frames, and I have to re-encode the video when using the transcoder, which is computationally expensive and unnecessary for my needs.
I have also tried piping a video in real-time to FFmpeg and getting it to produce the PNG images but unfortunately it waits until it receives the entire video before producing the PNG frames.
How can I extract all of the frames from the stream as close to real-time as possible? I don’t mind what language or technology is used as long as it can run on a Linux server. I would be happy to use FFmpeg if I can find a way to get it to write the images while it is still receiving the video or even Wowza if I can find a way not to lose frames and not to re-encode.
Thanks for any help or suggestions.
Since you linked this question from the red5 user list, I'll add my two cents. You may certainly grab the video frames on the server side, but the issue you'll run into is transcoding from H.264 into PNG. The easiest way would be to use ffmpeg / avconv after getting the VideoData object. Here is a post that gives some details about getting the VideoData: http://red5.5842.n7.nabble.com/Snapshot-Image-from-VideoData-td44603.html
Another option is on the player side using one of Dan Rossi's FlowPlayer plugins: http://flowplayer.electroteque.org/snapshot
I finally found a way to do this with FFmpeg. The trick was to disable audio, use a different flv meta data analyser and to reduce the duration that FFmpeg waits for before processing. My FFmpeg command now starts like this:
ffmpeg -an -flv_metadata 1 -analyzeduration 1 ...
This starts producing frames within a second of receiving an input from a pipe so writes the streamed frames pretty close to real-time.
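If the frames need to be consumed by a program rather than written to files, one possible variation is to have FFmpeg write PNGs to stdout and split that byte stream as it arrives. This is only a sketch under assumptions: the command above is elided, so the output options (something like -f image2pipe -c:v png pipe:1) are a guess at one workable configuration and not the author's actual command, and PngStreamSplitter is a made-up helper. It relies only on the PNG layout: an 8-byte signature followed by chunks of length, type, data and CRC, ending with an IEND chunk.

```python
import struct
from typing import List, Optional

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


class PngStreamSplitter:
    """Cut a concatenated PNG byte stream into complete images as data arrives."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, data: bytes) -> List[bytes]:
        """Add newly received bytes; return every image completed so far."""
        self._buf += data
        images = []
        while True:
            end = self._find_image_end()
            if end is None:
                break
            images.append(bytes(self._buf[:end]))
            del self._buf[:end]
        return images

    def _find_image_end(self) -> Optional[int]:
        buf = self._buf
        if len(buf) < len(PNG_SIGNATURE):
            return None
        if not buf.startswith(PNG_SIGNATURE):
            raise ValueError("stream is not positioned at a PNG signature")
        pos = len(PNG_SIGNATURE)
        # Walk chunk headers; each chunk is length(4) + type(4) + data + CRC(4).
        while pos + 8 <= len(buf):
            length, chunk_type = struct.unpack_from(">I4s", buf, pos)
            chunk_end = pos + 12 + length
            if chunk_end > len(buf):
                return None  # chunk not fully received yet
            if chunk_type == b"IEND":
                return chunk_end
            pos = chunk_end
        return None
```

A caller would read from the FFmpeg process's stdout in a loop and pass each block to feed(), analysing the returned images immediately. The sketch does not verify CRCs and rescans the buffered chunks on every call, which is acceptable for small VGA frames but not tuned for throughput.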

What is the minimum amount of metadata needed to stream only video, using libx264 to encode at the server and libffmpeg to decode at the client?

I want to stream video (no audio) from a server to a client. I will encode the video using libx264 and decode it with ffmpeg. I plan to use fixed settings (at the very least they will be known in advance by both the client and the server). I was wondering if I can avoid wrapping the compressed video in a container format (like mp4 or mkv).
Right now I am able to encode my frames using x264_encoder_encode. I get a compressed frame back, and I can do that for every frame. What extra information (if anything at all) do I need to send to the client so that ffmpeg can decode the compressed frames, and more importantly how can I obtain it with libx264. I assume I may need to generate NAL information (x264_nal_encode?). Having an idea of what is the minimum necessary to get the video across, and how to put the pieces together would be really helpful.
I found out that the minimum amount of information is the NAL units from each frame; this gives me a raw H.264 stream. If I write this to a file with a .h264 extension, I can watch it using VLC.
I can also open such a file using ffmpeg, but if I want to stream it, then it makes more sense to use RTSP, and a good open source library for that is Live555: http://www.live555.com/liveMedia/
In their FAQ they mention how to send the output from your encoder to live555, and there is source for both a client and a server. I have yet to finish coding this, but it seems like a reasonable solution.
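To make "raw H.264 stream" a little more concrete: such a stream is normally in Annex B form, where each NAL unit is preceded by a 00 00 01 or 00 00 00 01 start code (to my knowledge x264 emits this form by default). The sketch below splits a byte string into NAL units and reads the type from the first header byte. The function names are illustrative, and the name table covers only a few common types.

```python
import re
from typing import List

_START_CODE = re.compile(b"\x00\x00\x01")

# A few common nal_unit_type values from the H.264 specification
NAL_NAMES = {1: "non-IDR slice", 5: "IDR slice", 6: "SEI", 7: "SPS", 8: "PPS", 9: "AUD"}


def split_annexb(stream: bytes) -> List[bytes]:
    """Return the NAL units of an Annex B byte stream, without start codes."""
    matches = list(_START_CODE.finditer(stream))
    nals = []
    for i, m in enumerate(matches):
        begin = m.end()
        end = matches[i + 1].start() if i + 1 < len(matches) else len(stream)
        # With a 4-byte start code the extra leading zero lands at the end of
        # the previous slice; a NAL unit never ends in 0x00, so strip it.
        nal = stream[begin:end].rstrip(b"\x00")
        if nal:
            nals.append(nal)
    return nals


def nal_unit_type(nal: bytes) -> int:
    """The low 5 bits of the first byte carry nal_unit_type."""
    return nal[0] & 0x1F


def describe(stream: bytes) -> List[str]:
    """Human-readable type for each NAL unit in the stream."""
    return [NAL_NAMES.get(nal_unit_type(n), "other") for n in split_annexb(stream)]
```

A decoder generally needs the SPS and PPS units before it can decode the first IDR slice, so a client that joins mid-stream would have to wait for (or be sent) those parameter sets first.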

Save Live Video Stream To Local Storage

Problem:
I have to save live video stream data which comes as RTP packets from an RTSP server.
The data comes in two formats: MPEG4 and H264.
I do not want to encode/decode the input stream.
I just want to write it to a file which is playable with the proper codecs.
Any advice?
Best Wishes
History:
My Solutions and their problems:
First attempt: FFmpeg
I used the FFmpeg library to get the audio and video RTP packets.
But in order to write the packets I have to use av_write_frame,
which seems to mean that decoding/encoding takes place.
Also, when I set the output format to mp4 (av_guess_format("mp4", NULL, NULL)),
the output file is unplayable.
[Anyway, FFmpeg has bad docs; it is hard to find out what is wrong.]
Second attempt: DirectShow
Then I decided to use DirectShow. I found an RTSP Source Filter,
then a Mux and a File Writer.
I created a single graph:
RTSP Source --> MPEG MUX ---> File Writer
It worked, but the problem is that the output file is not playable
if the graph is not stopped. If something happens, for example the graph crashes,
the output file is not playable.
I am also able to write H264 data, but the video is completely unplayable.
The MP4 file format has an index that is required for correct playback, and the index can only be created once you've finished recording. So any solution using MP4 container files (and other indexed files) is going to suffer from the same problem. You need to stop the recording to finalise the file, or it will not be playable.
One solution that might help is to break the graph up into two parts, so that you can keep recording to a new file while stopping the current one. There's an example of this at www.gdcl.co.uk/gmfbridge.
If you try the GDCL MP4 multiplexor, and you are having problems with H264 streams, see the related question GDCL Mpeg-4 Multiplexor Problem

Resources