How to reformat audio split into different blobs? - ffmpeg

I'm recording audio in the web browser with RecordRTC and sending it to the server in small blobs.
The first blob has headers with codec information, so I can convert it to any format with ffmpeg. All the other blobs carry no built-in information, and when I try to feed them to ffmpeg it just answers 'Codec not found'.
Of course I could concatenate them before any processing, but I want to stream them continuously to another computer.
So, how can I convert such chunks while the stream is running?
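
One possible approach, sketched under the assumption that the blobs are consecutive slices of a single container stream (so only the first one carries the header), is to keep one long-running ffmpeg process and write every blob to its standard input in arrival order. The function names, the `webm` input format, and the MP3 output are illustrative assumptions and do not come from the question.

```python
import subprocess
from typing import BinaryIO, Iterable, List


def build_ffmpeg_cmd(input_format: str, output_url: str) -> List[str]:
    """Build an ffmpeg command that reads one continuous stream from stdin."""
    return [
        "ffmpeg",
        "-f", input_format,  # container of the incoming stream (assumed, e.g. "webm")
        "-i", "pipe:0",      # read the stream from standard input
        "-f", "mp3",         # example output format (assumption)
        output_url,
    ]


def feed_blobs(blobs: Iterable[bytes], sink: BinaryIO) -> int:
    """Write blobs to sink in arrival order and return the number of bytes written."""
    total = 0
    for blob in blobs:
        sink.write(blob)
        sink.flush()
        total += len(blob)
    return total


def transcode_stream(blobs: Iterable[bytes], output_url: str) -> None:
    """Start a single ffmpeg process and pipe every blob into it."""
    cmd = build_ffmpeg_cmd("webm", output_url)
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
    try:
        feed_blobs(blobs, proc.stdin)
    finally:
        proc.stdin.close()  # end of input tells ffmpeg to finish
        proc.wait()
```

Because ffmpeg sees the header once and then a continuous byte stream, the later blobs never have to be decodable on their own. `blobs` can be any iterable, for example a generator that yields chunks as they arrive from the browser.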

Related

Livestream WebVTT with HLS

I've implemented an HLS service with ffmpeg (which pulls a live stream from nginx-rtmp). That all works fine, but now I'm wondering what kind of programming pattern I should be using to get live captioning to work.
I'm planning on using ffmpeg to output the incoming mp4 stream to multiple WAV chunks (i.e., the same way HLS fMP4 parts are created), and then sending those chunks over to Azure Cognitive Services for speech-to-text recognition. My question is, what do I do when I receive the speech results? Do I dump that vtt file into the same directory as my HLS chunks, and then serve that up using a single m3u8 file (with audio/video tracks along with the text track)?
Currently ffmpeg is updating the m3u8 playlist for HLS clients; would it be possible for me to create the m3u8 playlist just for the vtt files, and serve that concurrently with the "regular" HLS playlist? Also, time synchronization would seem to be difficult, because I'll be sending discrete WAV files over to Azure, so the vtt timestamps are going to be relative to the chunk I'm sending.
Help! I've done searches online, and I grasp the various issues, but I'm not sure how to plumb them all together.
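
The timestamp concern can be illustrated with a small sketch: if each WAV chunk has a known, fixed duration, the cue times returned for chunk N can be shifted by N times that duration before the cues are published. This assumes fixed-length chunks and cue timings written with an hours field (`HH:MM:SS.mmm`); the function names are hypothetical, and how the player maps these times onto the video timeline is not covered here.

```python
import re

# Matches timestamps of the form HH:MM:SS.mmm (hours field assumed present).
_TIMESTAMP = re.compile(r"(\d{2,}):(\d{2}):(\d{2})\.(\d{3})")


def format_timestamp(total_ms: int) -> str:
    """Format a millisecond count as HH:MM:SS.mmm."""
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def shift_cue_line(line: str, offset_ms: int) -> str:
    """Shift both timestamps of a cue timing line; return other lines unchanged."""
    if "-->" not in line:
        return line

    def shift(match: "re.Match[str]") -> str:
        h, m, s, ms = (int(group) for group in match.groups())
        return format_timestamp(((h * 60 + m) * 60 + s) * 1000 + ms + offset_ms)

    return _TIMESTAMP.sub(shift, line)


def shift_vtt(text: str, chunk_index: int, chunk_duration_ms: int) -> str:
    """Rebase chunk-relative cue times onto the stream timeline."""
    offset_ms = chunk_index * chunk_duration_ms
    return "\n".join(shift_cue_line(line, offset_ms) for line in text.split("\n"))
```

For example, with 6-second chunks a cue at `00:00:01.500` in the third chunk (index 2) would be rewritten to `00:00:13.500`.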

HTTP Live Streaming: Fragmented MP4 or MPEG-TS?

I have an IP camera which sends out a live stream in RTSP over UDP, and I want to display this stream in the browser. It should work on the major browsers and on mobile (both iOS and Android). To achieve this I want to convert the stream to HTTP Live Streaming (HLS) on the server before sending it to the client. I've read that not very long ago Apple added support for fragmented MP4 (fMP4) as a format for the stream, whereas normally the stream would be sent in MPEG-TS format. fMP4 is also the format that MPEG-DASH supports, and MPEG-DASH might be the industry standard in a few years.
Now my question is: what are the advantages and disadvantages of fMP4 and MPEG-TS?
EDIT: According to the technical notes for HLS from Apple, live streams must be encoded as MPEG-TS streams (https://developer.apple.com/library/content/technotes/tn2224/_index.html#//apple_ref/doc/uid/DTS40009745-CH1-ENCODEYOURVARIANTS). Is there a reason for this or is this information outdated?
fMP4 is likely to replace TS as a standard. It has less overhead and is required for HEVC in HLS, but the main advantage is compatibility with DASH, i.e. you can generate both HLS and DASH using the same files, which helps with compute and storage costs. For your particular use case, HLS TS probably has more coverage (due to old devices and players) than HLS fMP4, but HLS+DASH fMP4 is what I would choose.
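
As a rough illustration of how small the switch is on the packaging side, the sketch below builds an ffmpeg command for either segment type using the `hls` muxer's `-hls_segment_type` option. The camera URL, output path, and segment settings are placeholders, and `-c copy` assumes the camera already delivers codecs that HLS players accept.

```python
from typing import List


def build_hls_cmd(rtsp_url: str, playlist_path: str,
                  segment_type: str = "fmp4",
                  segment_seconds: int = 4) -> List[str]:
    """Build an ffmpeg command that repackages an RTSP stream as HLS."""
    if segment_type not in ("mpegts", "fmp4"):
        raise ValueError("segment_type must be 'mpegts' or 'fmp4'")
    return [
        "ffmpeg",
        "-rtsp_transport", "udp",           # the camera in the question sends RTSP over UDP
        "-i", rtsp_url,
        "-c", "copy",                       # no re-encoding (assumes compatible codecs)
        "-f", "hls",
        "-hls_time", str(segment_seconds),  # target segment length in seconds
        "-hls_list_size", "6",              # keep a short sliding window for live
        "-hls_flags", "delete_segments",    # remove segments that left the window
        "-hls_segment_type", segment_type,  # "mpegts" or "fmp4"
        playlist_path,
    ]


if __name__ == "__main__":
    # Placeholder URL and path, for illustration only.
    print(" ".join(build_hls_cmd("rtsp://camera.example/stream", "live/index.m3u8")))
```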

Live stream multi-bitrate video

Preface
I have read this two-part tutorial (Part-1 and Part-2) by Streamroot on MPEG-DASH, and below is my understanding (please correct me if I am wrong):
The video needs to be encoded into multiple bit-rates using FFmpeg.
The encoded videos need to be transcoded (dashified) using MP4Box.
The dashified videos can be served using a web server.
Problem
I intend to live-stream an event and I need help to understand the following:
Can I club the FFmpeg and MP4Box commands into a single step? Maybe through a wrapper program so that I do not have to run them separately? Is there any other or better solution?
How do I send the dashified content to the web server? FTP? Would any vanilla web server do?
Lastly, a friend had hinted that I could also use GStreamer to achieve my objective. But, I could not find any good resource on the internet for the same. So, where (and how) does GStreamer fit in the above process?
What format will you be getting out of your camera for your live event? There are many solutions better suited to live streaming (the tutorial I wrote is for VOD streams only). You can check out simple solutions like Wowza Streaming Server, Nimble Streamer (free), etc., that take an RTMP stream and transform it into other formats (HLS, DASH, etc.).
Most of the livestreaming platforms can even do that for you (livestream.com, YouTube, Twitch, or even Facebook now).
The dashified content will be requested as HTTP resources by the browser or other players. In the case of a VOD stream, you indeed just need to make the DASH segments available through a web server. For live content, you need something smarter that will encode and package the segments and make them available on the fly.
GStreamer can transcode and transmux the original content, and can do it on the fly. You will be able to get different formats as outputs, like RTMP, HLS, and probably even MPEG-DASH. Then you still need to make your content available via a web server.
In conclusion, if you just want to transmit an occasional live event, it's probably a lot easier to use a platform that will ingest your RTMP stream and do all the complicated steps for you.
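
For readers who still want to encode and package in one step themselves, a minimal sketch is shown here using ffmpeg's own `dash` muxer, which would replace the separate MP4Box step. This is a different route from the tutorial's, not the answer's recommendation; the input URL, bitrates, and output path are placeholders.

```python
from typing import List, Sequence


def build_dash_cmd(input_url: str, video_bitrates: Sequence[str],
                   manifest_path: str, segment_seconds: int = 4) -> List[str]:
    """Build an ffmpeg command that encodes several renditions and writes live DASH."""
    if not video_bitrates:
        raise ValueError("at least one video bitrate is required")
    cmd = ["ffmpeg", "-i", input_url]
    for _ in video_bitrates:
        cmd += ["-map", "0:v"]              # one copy of the video per rendition
    cmd += ["-map", "0:a"]                  # a single shared audio track
    cmd += ["-c:v", "libx264", "-c:a", "aac"]
    for index, bitrate in enumerate(video_bitrates):
        cmd += [f"-b:v:{index}", bitrate]   # target bitrate of rendition `index`
    cmd += [
        "-f", "dash",
        "-seg_duration", str(segment_seconds),
        "-window_size", "5",                # segments kept in the live manifest
        "-adaptation_sets", "id=0,streams=v id=1,streams=a",
        manifest_path,
    ]
    return cmd


if __name__ == "__main__":
    # Placeholder ingest URL and output path, for illustration only.
    print(" ".join(build_dash_cmd("rtmp://localhost/live/event",
                                  ["800k", "2500k"], "www/live/manifest.mpd")))
```

The manifest and segments are written as ordinary files, so any web server that can serve that directory should do. Keyframe alignment across renditions, which adaptive switching depends on, is not handled in this sketch.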

Live Transcoding & Streaming

My client has a requirement where he needs me to transcode a source file into a proxy with a unique burn in on it per playback.
For the proxy I will be using ffmpeg, nothing fancy, but ideally the users can play back the file as it is being transcoded since it may take up to several minutes to complete the transcoding.
Another restriction is that the player does not support HLS and other live streaming options and can only accept MP4s as a source.
Any ideas/suggestions would be great.
It seems you have conflicting requirements. MP4 is VERY poorly suited for live streaming. It is 'possible' to create a fake moov and have the player issue byte-range requests, but it is very inefficient. You really need a player or platform that supports streaming formats such as fMP4 (fragmented MP4/DASH), HLS, TS, FLV, RTMP, RTC, etc.
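
If the player turns out to accept fragmented MP4 (an assumption that would need testing), a sketch of the ffmpeg side could look like the following: the `-movflags` values make ffmpeg emit an initial empty moov followed by fragments, so the output can be written to a pipe while encoding is still running. The `drawtext` burn-in, the function name, and the restriction to simple tokens are illustrative choices.

```python
import re
from typing import List


def build_fmp4_cmd(source_path: str, burn_in: str) -> List[str]:
    """Build an ffmpeg command that writes fragmented MP4 with a burn-in to stdout."""
    # Restricting the text to a simple token avoids drawtext escaping issues.
    if not re.fullmatch(r"[A-Za-z0-9_-]+", burn_in):
        raise ValueError("burn-in text must contain only letters, digits, '_' or '-'")
    # drawtext needs an ffmpeg build with libfreetype; a fontfile= option may
    # also be required depending on the build (assumption).
    drawtext = f"drawtext=text={burn_in}:x=10:y=10:fontsize=24:fontcolor=white"
    return [
        "ffmpeg",
        "-i", source_path,
        "-vf", drawtext,
        "-c:v", "libx264",
        "-c:a", "aac",
        "-movflags", "frag_keyframe+empty_moov+default_base_moof",
        "-f", "mp4",
        "pipe:1",  # stream the result to stdout as it is produced
    ]


if __name__ == "__main__":
    # Hypothetical source file and per-playback token.
    print(" ".join(build_fmp4_cmd("source.mov", "user-1234")))
```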

Encoding and streaming continuous PNG output image files as live video streaming on Web browser

I have an OpenGL application that renders a simulation animation, outputs several PNG image files per second, and saves them to disk. I want to stream these images as video over HTTP so I can view the animation from a web browser. I already have a robust socket server that handles WebSocket connections, and I can handle all the handshake and message encoding/decoding. My server program and OpenGL application are written in C++.
A few questions:
What is the best way to stream this OpenGL animation output and view it from my web browser? The video frames are generated dynamically (continuously) by the OpenGL application as PNG image files. The web browser should display video corresponding to the OpenGL display output, with minimal latency.
How can I encode these PNG image files as a continuous (live) video programmatically using C/C++ (without manually pushing the image files to streaming server software such as Flash Media Live Encoder)? What video format should I produce?
Should I send/receive the animation data using a WebSocket, or is there a better way (a jQuery Ajax call, for instance; I am just making this up, so please point me to the correct way of implementing this)? It would be great if this live video streaming worked across different browsers.
Does the HTML5 video tag support live video streaming, or does it only work for a complete video file that exists at a particular URL/directory (not a live stream)?
Are there any existing code samples or tutorials for this kind of live video streaming, where a C/C++/Java application produces image frames and a web browser consumes the output as a video stream? I could barely find tutorials on this topic after spending a few hours searching on Google.
You definitely want to stop outputting PNG files to disk and instead input the frames of image data into a video encoder. A good bet is to use libav/ffmpeg. Next, you will have to encapsulate the encoded video to a network friendly format. I would recommend x264 as an encoder and MPEG4 or MPEG2TS stream format.
To view the video in the web browser, you'll have to choose the streaming format. HLS in HTML5 is supported by Safari, but unfortunately not much else. For wide client support you will need to use a plugin such as Flash or a media player.
The easiest way I can think of to do this is to use Wowza for doing a server-side restream. The GL program would stream MPEG2 TS to Wowza, and it would then prepare streams for HLS, RTMP (Flash), RTSP, and Microsoft Smooth Streaming (Silverlight). Wowza costs about $1000. You could set up an RTMP stream using Red5, which is free. Or you could do RTSP serving with VLC, but RTSP clients are universally terrible.
Unfortunately, at this time, the level of standardization for web video is very poor, and the video tooling is rather cumbersome. It's a large undertaking, but you can get hacking with ffmpeg/libav. A proof of concept could be writing image frames in YUV420p format to a pipe that ffmpeg is listening to and choosing an output stream that you can read with an RTSP client such as VLC, QuickTime, or Windows Media Player.
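
A minimal sketch of that proof of concept follows, written in Python for brevity although the application in the question is C++ (the pipe protocol is the same from any language). It assumes even frame dimensions, uses ffmpeg's `rawvideo` input, and sends MPEG-TS to a URL of your choice instead of RTSP, which is a simplification; the helper names and the solid-colour test frame are hypothetical stand-ins for real OpenGL output.

```python
import subprocess
from typing import Iterable, List


def yuv420p_frame_size(width: int, height: int) -> int:
    """Bytes per YUV420p frame: full-size Y plane plus quarter-size U and V planes."""
    return width * height * 3 // 2


def solid_frame(width: int, height: int, y: int = 128, u: int = 128, v: int = 128) -> bytes:
    """Build one planar YUV420p frame filled with a single colour (test stand-in)."""
    chroma_samples = (width // 2) * (height // 2)
    return bytes([y]) * (width * height) + bytes([u]) * chroma_samples + bytes([v]) * chroma_samples


def build_rawvideo_cmd(width: int, height: int, fps: int, output_url: str) -> List[str]:
    """Build an ffmpeg command that encodes raw frames from stdin with x264."""
    return [
        "ffmpeg",
        "-f", "rawvideo",
        "-pixel_format", "yuv420p",
        "-video_size", f"{width}x{height}",
        "-framerate", str(fps),
        "-i", "pipe:0",             # frames arrive on standard input
        "-c:v", "libx264",
        "-preset", "ultrafast",
        "-tune", "zerolatency",     # favour latency over compression
        "-f", "mpegts",
        output_url,                 # e.g. a UDP address a player can open (assumption)
    ]


def stream_frames(frames: Iterable[bytes], width: int, height: int,
                  fps: int, output_url: str) -> None:
    """Pipe raw frames into a single ffmpeg process."""
    expected = yuv420p_frame_size(width, height)
    proc = subprocess.Popen(build_rawvideo_cmd(width, height, fps, output_url),
                            stdin=subprocess.PIPE)
    try:
        for frame in frames:
            if len(frame) != expected:
                raise ValueError("frame has the wrong size for yuv420p")
            proc.stdin.write(frame)
    finally:
        proc.stdin.close()
        proc.wait()
```

The frame-size check matters because ffmpeg cannot resynchronise raw video: one short or long write shifts every following frame.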
Most live video is MPEG2 internally, wrapped up as RTMP (Flash) or HLS (Apple). There is probably a way to render your OpenGL output to frames and have them converted into MPEG2 as a live stream, but I don't know exactly how (maybe FFmpeg?). Once that is done, you can push the stream through Flash Media Live Encoder (it's free) and stream it out to Flash clients directly using RTMP, or push-publish it into Wowza Media Server to package it for Flash, HTTP Live Streaming (Cupertino), and Smooth Streaming for Silverlight.
Basically you can string together some COTS solutions into a pipeline and play on a standard player without handling the sockets and low level stuff yourself.
