RTMP live stream directly from NVENC encoder - ffmpeg

I am trying to create a live RTMP stream containing the animation generated with NVIDIA OptiX. The stream is to be received by nginx + rtmp module and broadcasted in MPEG-DASH format. Full chain up to dash.js player is working if the video is first saved to .flv file and then I send it with ffmpeg without any reformatting using command:
ffmpeg -re -i my_video.flv -c:v copy -f flv rtmp://x.x.x.x:1935/dash/test
But I want to stream directly from the code. And with this I am failng... Nginx logs an error "dash: invalid avcc received (2: No such file or directory)". Then it seems to receive the stream correctly (segments are rolling, dash manifest is there), however the stream is not possible to play in the browser.
I can see only one difference in the manifest between direct stream and stream from file. Codecs attribute of the representation in the direct stream is missed: codecs="avcc1.000000" instead of "avc1.640028" which I get when streaming from file.
My code opens the stream:
av_register_all();
AVOutputFormat* fmt = av_guess_format("flv",
file_name, nullptr);
fmt->video_codec = AV_CODEC_ID_H264;
AVFormatContext* _oc;
avformat_alloc_output_context2(&_oc, fmt, nullptr, "rtmp://x.x.x.x:1935/dash/test");
AVStream* _vs = avformat_new_stream(_oc, nullptr);
_vs->id = 0;
_vs->time_base = AVRational { 1, 25 };
_vs->avg_frame_rate = AVRational{ 25, 1 };
AVCodecParameters *vpar = _vs->codecpar;
vpar->codec_id = fmt->video_codec;
vpar->codec_type = AVMEDIA_TYPE_VIDEO;
vpar->format = AV_PIX_FMT_YUV420P;
vpar->profile = FF_PROFILE_H264_HIGH;
vpar->level = _level;
vpar->width = _width;
vpar->height = _height;
vpar->bit_rate = _avg_bitrate;
avio_open(&_oc->pb, _oc->filename, AVIO_FLAG_WRITE);
avformat_write_header(_oc, nullptr);
Width, height, bitrate, level and profile I get from NVENC encoder settings. I also do the error checking, ommited here. Then I have a loop writing each encoded packets, with IDR frames etc all prepared on the fly with NVENC. The loop body is:
auto & pkt_data = _packets[i];
AVPacket pkt = { 0 };
av_init_packet(&pkt);
pkt.pts = av_rescale_q(_n_frames++, AVRational{ 1, 25 }, _vs->time_base);
pkt.duration = av_rescale_q(1, AVRational{ 1, 25 }, _vs->time_base);
pkt.dts = pkt.pts;
pkt.stream_index = _vs->index;
pkt.data = pkt_data.data();
pkt.size = (int)pkt_data.size();
if (!memcmp(pkt_data.data(), "\x00\x00\x00\x01\x67", 5))
{
pkt.flags |= AV_PKT_FLAG_KEY;
}
av_write_frame(_oc, &pkt);
Obviously ffmpeg is writing avcc code somewhere... I have no clue where to add this code so the RTMP server can recognize it. Or I am missing something else?
Any hint greatly appreciated, folks!

Thanks to Gyan's comment I was able to solve the issue. Following the AV_CODEC_FLAG_GLOBAL_HEADER flag in the wrapper one can see how the global header is added, which was missing in my case. You can use directly the NVENC API function nvEncGetSequenceParams, but since I am anyway using SDK, it is a bit cleaner.
So I had to attach the header to AVCodecParameters::extradata:
std::vector<uint8_t> payload;
_encoder->GetSequenceParams(payload);
vpar->extradata_size = payload.size();
vpar->extradata = (uint8_t*)av_mallocz(payload.size() + AV_INPUT_BUFFER_PADDING_SIZE);
memcpy(vpar->extradata, payload.data(), payload.size());
_encoder is my instance of NvEncoder from SDK.
The wrapper is doing the same thing, however using deprecated struct AVCodecContext.

Related

How to configure AVStream to write 29.97FPS files using FFmpeg

I'm trying to write mkv file using ffmpeg to encode in FFV1 and FLAC in NTSC format, but the frame rate shown in VLC and media info are not correct.
Here is how I create and configure the output format context:
AVOutputFormat *outputFormat = av_guess_format("matroska", NULL, NULL);
//Allocate an AVFormatContext for an output format.
int err = avformat_alloc_output_context2(&_formatContext, outputFormat, NULL, filename);
//Specify the codec of the outputFormat
_formatContext->oformat->video_codec = _videoCodecContext->codec_id;
//Create AVStream
AVStream *videoStream = avformat_new_stream(_formatContext, NULL);
//FrameDuration.value : 1001, FrameDuration.timescale : 30000
videoStream->time_base = (AVRational){ (int)_frameDuration.value, (int)_frameDuration.timescale }; //1001 30000
//Copy video stream parameters to the muxer
err = avcodec_parameters_from_context(videoStream->codecpar, _videoCodecContext);
//Open file for writing
err = avio_open(&_formatContext->pb, filename, AVIO_FLAG_WRITE);
if (err >= 0) {
//Write header
err = avformat_write_header(_formatContext, &options);
}
Before writing the packet, I use this to convert PTS to the stream time_base
// Rescale output packet timestamp values from codec to stream timebase
av_packet_rescale_ts(inAVPacket, *inTimeStamp, [outputStream stream]->time_base);
The thing is that the avformat_write_header method is changing the stream time_base from 30000/1001 to 1/1000, so PTS loose precision. In VLC inspector, the frame rate shown is 1000 fps and in MediaInfo 30.033 fps.
The file is playing correctly and the video/audio sync is OK.
Is there something to do to specify the file frame rate somewhere else ?
Or a work around to avoid changing the time_base when calling avformat_write_header ?
Setting the avg_frame_rate fixes the issue...
videoStream->avg_frame_rate = _videoCodecContext->framerate;

sync dumped rtp streams [duplicate]

Hi I am in a need of a bit of a help/guidance because I got stuck in my research.
The problem:
How to convert RTP data using either gstreamer or avlib (ffmpeg) in either API (by programming) or console versions.
Data
I have RTP dump that comes from RTP/RTCP over TCP so I can get the precise start and stop for each RTP packet in file. It's a H264 video stream dump.
The data is in this fashion because I need to acquire the RTCP/RTP interleaved stream via libcurl (which I'm currently doing)
Status
I've tried to use ffmpeg to consume pure RTP packets but is seems that using rtp either by console or by programming involves "starting" the whole rtsp/rtp session business in ffmpeg. I've stopped there and for the time being I didn't pursue this avenue deeper. I guess this is possible with lover level RTP API like ff_rtp_parse_packet() I'm too new with this lib to do it straight out.
Then there is the gstreamer It has somewhat more capabilities to do it without programming, but for the time being I'm not able to figure out how to pass it the RTP dump I have.
I have also tried to do a little bit of a trickery and stream the dump via socat/nc to the udp port and listen on it via ffplay with sdp file as an input, there seems to be some progress the rtp at least gets recognized, but for socat there are loads of packet missing (data sent too fast perhaps?) and in the end the data is not visualized. When I used nc the video was badly misshapen but at least there were not that much receive errors.
One way or another the data is not properly visualized.
I know I can depacketize the data "by hand" but the idea is to do it via some kind of library because in the end there would also be second stream with audio that would have to be muxed together with the video.
I would appreciate any help on how to tackle this problem.
Thanks.
Finally after some period of time I had time to sit down at this problem again, and finally I've got the solution that satisfies me. I went on with RTP interleaved stream (RTP is interleaved with RTCP over single TCP connection).
So I had a interleaved RTCP/RTP stream that needed to be disassembled to Audio (PCM A-Law) and Video (h.264 Constrained baseline) RTP packets.
The decomposition of the RTSP stream containing RTP data is described here rfc2326.
Depacketization of the H264 is described here rfc6184, for the PCM A-Law the frames came out to be raw audio in RTP so no depacketization was necessary.
Next step was to calculate proper PTS (or presentation time stamp) for each stream, that was a bit of a hassle but finally the Live555 code came to help
(see RTP lipsync synchronization).
The last task was to mux it into a container that would support PCM alaw, I've used ffmpeg's avlibraries.
There are many examples over the Internet but many of them are outdated (ffmpeg is very 'dynamic' in API changes region) so I'm posting (most important parts of) what actually worked for me in the end:
The setup part:
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include "libavutil/intreadwrite.h"
#include "libavutil/mathematics.h"
AVFormatContext *formatContext;
AVOutputFormat *outputFormat;
AVStream *video_st;
AVStream *audio_st;
AVCodec *av_encode_codec = NULL;
AVCodec *av_audio_encode_codec = NULL;
AVCodecContext *av_video_encode_codec_ctx = NULL;
AVCodecContext *av_audio_encode_codec_ctx = NULL;
av_register_all();
av_log_set_level(AV_LOG_TRACE);
outputFormat = av_guess_format(NULL, pu8outFileName, NULL);
outputFormat->video_codec = AV_CODEC_ID_H264;
av_encode_codec = avcodec_find_encoder(AV_CODEC_ID_H264);
av_audio_encode_codec = avcodec_find_encoder(AV_CODEC_ID_PCM_ALAW);
avformat_alloc_output_context2(&formatContext, NULL, NULL, pu8outFileName);
formatContext->oformat = outputFormat;
strcpy(formatContext->filename, pu8outFileName);
outputFormat->audio_codec = AV_CODEC_ID_PCM_ALAW;
av_video_encode_codec_ctx = avcodec_alloc_context3(av_encode_codec);
av_audio_encode_codec_ctx = avcodec_alloc_context3(av_audio_encode_codec);
av_video_encode_codec_ctx->codec_id = outputFormat->video_codec;
av_video_encode_codec_ctx->codec_type = AVMEDIA_TYPE_VIDEO;
av_video_encode_codec_ctx->bit_rate = 4000;
av_video_encode_codec_ctx->width = u32width;
av_video_encode_codec_ctx->height = u32height;
av_video_encode_codec_ctx->time_base = (AVRational){ 1, u8fps };
av_video_encode_codec_ctx->max_b_frames = 0;
av_video_encode_codec_ctx->pix_fmt = AV_PIX_FMT_YUV420P;
av_audio_encode_codec_ctx->sample_fmt = AV_SAMPLE_FMT_S16;
av_audio_encode_codec_ctx->codec_id = AV_CODEC_ID_PCM_ALAW;
av_audio_encode_codec_ctx->codec_type = AVMEDIA_TYPE_AUDIO;
av_audio_encode_codec_ctx->sample_rate = 8000;
av_audio_encode_codec_ctx->channels = 1;
av_audio_encode_codec_ctx->time_base = (AVRational){ 1, u8fps };
av_audio_encode_codec_ctx->channel_layout = AV_CH_LAYOUT_MONO;
video_st = avformat_new_stream(formatContext, av_encode_codec);
audio_st = avformat_new_stream(formatContext, av_audio_encode_codec);
audio_st->index = 1;
video_st->avg_frame_rate = (AVRational){ 90000, 90000 / u8fps };
av_stream_set_r_frame_rate(video_st, (AVRational){ 90000, 90000 / u8fps });
The packets for video are written like this:
uint8_t *pu8framePtr = video_frame;
AVPacket pkt = { 0 };
av_init_packet(&pkt);
if (0x65 == pu8framePtr[4] || 0x67 == pu8framePtr[4] || 0x68 == pu8framePtr[4])
{
pkt.flags = AV_PKT_FLAG_KEY;
}
pkt.data = (uint8_t *)pu8framePtr;
pkt.size = u32LastFrameSize;
pkt.pts = av_rescale_q(s_video_sync.fSyncTime.tv_sec * 1000000 + s_video_sync.fSyncTime.tv_usec, (AVRational){ 1, 1000000 }, video_st->time_base);
pkt.dts = pkt.pts;
pkt.stream_index = video_st->index;
av_interleaved_write_frame(formatContext, &pkt);
av_packet_unref(&pkt);
and for the audio like this:
AVPacket pkt = { 0 };
av_init_packet(&pkt);
pkt.flags = AV_PKT_FLAG_KEY;
pkt.data = (uint8_t *)pu8framePtr;
pkt.size = u32AudioDataLen;
pkt.pts = av_rescale_q(s_audio_sync.fSyncTime.tv_sec * 1000000 + s_audio_sync.fSyncTime.tv_usec, (AVRational){ 1, 1000000 }, audio_st->time_base);
pkt.dts = pkt.pts;
pkt.stream_index = audio_st->index;
if (u8FirstIFrameFound) {av_interleaved_write_frame(formatContext, &pkt);}
av_packet_unref(&pkt)
and at the end some deinits:
av_write_trailer(formatContext);
av_dump_format(formatContext, 0, pu8outFileName, 1);
avcodec_free_context(&av_video_encode_codec_ctx);
avcodec_free_context(&av_audio_encode_codec_ctx);
avio_closep(&formatContext->pb);
avformat_free_context(formatContext);

Is it possible to decode MPEG4 frames without delay with ffmpeg?

I use ffmpeg's MPEG4 decoder. The decoder has CODEC_CAP_DELAY capability among others. It means the decoder will give me decoded frames with latency of 1 frame.
I have a set of MPEG4 (I- & P- )frames from AVI file and feed ffmpeg decoder with these frames. For the very first I-frame decoder gives me nothing, but decodes the frames successfully. I can force the decoder to get the decoded frame with the second call of avcodec_decode_video2 and providing nulls (flush it), but if I do so for each frame I get artifacts for the first group of pictures (e.g. second decoded P-frame is of gray color).
If I do not force ffmpeg decoder to give me decoded frame right now, then it works flawlessly and without artifacts.
Question: But is it possible to get decoded frame without giving the decoder next frame and without artifacts?
Small example of how decoding is implemented for each frame:
// decode
int got_frame = 0;
int err = 0;
int tries = 5;
do
{
err = avcodec_decode_video2(m_CodecContext, m_Frame, &got_frame, &m_Packet);
/* some codecs, such as MPEG, transmit the I and P frame with a
latency of one frame. You must do the following to have a
chance to get the last frame of the video */
m_Packet.data = NULL;
m_Packet.size = 0;
--tries;
}
while (err >= 0 && got_frame == 0 && tries > 0);
But as I said that gave me artifacts for the first gop.
Use the "-flags +low_delay" option (or in code, set AVCodecContext.flags |= CODEC_FLAG_LOW_DELAY).
I tested several options and "-flags low_delay" and "-probesize 32" is more important than others. bellow code worked for me.
AVDictionary* avDic = nullptr;
av_dict_set(&avDic, "flags", "low_delay", 0);
av_dict_set(&avDic, "probesize", "32", 0);
const int errorCode = avformat_open_input(&pFormatCtx, mUrl.c_str(), nullptr, &avDic);

Parsing FFMpeg AVPackets into h264 nal units

I am using FFMpeg To decode live video and stream it using Live555.i am able to decode video and getting the output AVPackets.
1. Convert the BGR Image to YUV422P format using FFMpeg's SWScale
// initilize a BGR To RGB converter using FFMpeg
ctx = sws_getContext(codecContext->width, codecContext->height, AV_PIX_FMT_BGR24, codecContext->width, codecContext->height, AV_PIX_FMT_YUV422P, SWS_BICUBIC, 0, 0, 0);
tempFrame = av_frame_alloc();
int num_bytes = avpicture_get_size(PIX_FMT_BGR24, codecContext->width, codecContext->height);
uint8_t* frame2_buffer = (uint8_t*)av_malloc(num_bytes*sizeof(uint8_t));
avpicture_fill((AVPicture*)tempFrame, frame2_buffer, PIX_FMT_BGR24, codecContext->width, codecContext->height);
// inside the loop of where frames are being encoded where rawFrame is a BGR image
tempFrame->data[0] = reinterpret_cast<uint8_t*>(rawFrame->_data);
sws_scale(ctx, tempFrame->data, tempFrame->linesize, 0, frame->height, frame->data, frame->linesize);
For decoding each Frame
ret = avcodec_encode_video2(codecContext, &packet, frame, &got_output);
if(ret < 0)
{
fprintf(stderr, "Error in encoding frame\n");
exit(1);
}
if(got_output)
{
//printf("Received frame! pushing to queue\n");
OutputFrame *outFrame = new OutputFrame();
outFrame->_data = packet.buf->data;
outFrame->_bufferSize = packet.buf->size;
outputQueue.push_back(outFrame);
}
Till here it works fine. i am able to write these frames to file and play it using VLC. after this i have to pass the output frame to Live555.i think AVPackets i am getting here doesn't need to be a single H264 Nal unit which is required by Live555.
How to break a AVPacket into Nal units which can be passed to Live555?
H264VideoStreamDiscreateFramer expect data without the start code '\x00\x00\x00\x01'.
It is needed to remove the 4 first bytes either in your LiveDeviceSource or inserting a FramedFilter to do this job.
Perhaps you can tried to use an H264VideoStreamFramer, like the testH264VideoStreamer test program.
If it could help, you can find one of my tries with live555 implementing an RTSP server feed from V4L2 capture https://github.com/mpromonet/h264_v4l2_rtspserver

Read from UDP Multicast RTSP Video Stream

I am currently developing an application that needs to decode a UDP multicast RTSP stream. At the moment, I can view the RTP stream using ffplay via
ffplay -rtsp_transport udp_multicast rtsp://streamURLGoesHere
However, I am trying to use FFMPEG to open the UDP stream via (error checking and cleanup code removed for the sake of brevity).
AVFormatContext* ctxt = NULL;
av_open_input_file(
&ctxt,
urlString,
NULL,
0,
NULL
);
av_find_stream_info(ctxt);
AVCodecContext* codecCtxt;
int videoStreamIdx = -1;
for (int i = 0; i < ctxt->nb_streams; i++)
{
if (ctxt->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO)
{
videoStreamIdx = i;
break;
}
}
AVCodecContext* codecCtxt = ctxt->streams[videoStreamIdx]->codec;
AVCodec* codec = avcodec_fine_decoder(codecCtxt->codec_id);
avcodec_open(codecCtxt, codec);
AVPacket packet;
while(av_read_frame(ctxt, &packet) >= 0)
{
if (packet.stream_index == videoStreamIdx)
{
/// Decoding performed here
...
}
}
...
This approach works fine with file inputs that consist of a raw encoded video stream, but for UDP multicast RTSP streams, it fails any error checking performed on av_open_input_file(). Please advise...
It turns out that opening a multicast UDP RTSP stream can be performed via the following:
AVFormatContext* ctxt = avformat_alloc_context();
AVDictionary* options = NULL;
av_dict_set(&options, "rtsp_transport", "udp_multicast", 0);
avformat_open_input(
&ctxt,
urlString,
NULL,
&options
);
...
avformat_free_context(ctxt);
Using avformat_open_input() in this manner instead of av_open_input_file() results in the desired behavior. I'm guessing that av_open_input_file() is either deprecated or was never intended to be used in this manner -- more than likely the latter ;)

Resources