raw h.264 bitstream decoding - ffmpeg

I can get raw h.264 frames from a camera. (it does NOT contain any network headers, for example rtsp, http).
They are h.264 raw data.
And I push these data to a queue frame by frame.
I googled many ffmpeg example which uses avformat_open_input() with either local file path or network path.
And I can see the video while I save the frames to a file and using avformat_open_input().
My problem is that I want to decode the frames realtime, not after it is saved as a file.
Does anyone have any idea on this?
Thanks!

You do not need avformat, you need avcodec. avformat is for parsing containers and protocols. avcodec is for encoding and decoding elementary streams (what you already have).
AVPacket avpkt; int err, frame_decoded = 0;
AVCodec *codec = avcodec_find_decoder ( AV_CODEC_ID_H264 );
AVCodecContext *codecCtx = avcodec_alloc_context3 ( codec );
avcodec_open2 ( codecCtx, codec, NULL );
// Set avpkt data and size here
err = avcodec_decode_video2 ( codecCtx, avframe, &frame_decoded, &avpkt );

Related

Fetching MacroBlock information in FFmpeg

I've a .mp4 file which contain h.264 video and AAC audio. I wants to extract MacroBlock and motion vector information of each frame while decoding. Please find my below Pseudocode.
avformat_open_input(file_name) //opening file
avcodec_open2(pCodecContext, pCodec, NULL) // opening decoder
while (response >= 0) // reading each frame
{
response = avcodec_receive_frame(pCodecContext, pFrame);
if (response == AVERROR(EAGAIN) || response == AVERROR_EOF || response < 0) {
break;
}
// extract macroblock of pFrame here
av_frame_unref(pFrame);
}
I have seen in other post mentioned that we can get MB information through MpegEncContext structure, but i'm confused were and how to instantiate object of that structure, how the MB data of the structure gets updated for each frame.?
Ultimately, I wants to compare macroblock from one frame to other frame using SAD (sum of absolute difference) and trigger alerts if there is any distortion in macroblock level.
I would really appreciate if anyone help on this.
You can get MB information(MV and other) from the AVFrame structure. There is a member int16_t (*motion_val[2])[2] in AVFrame where you can get MV.

FFMEG libavcodec decoder then re-encode video issue

I'm trying to use libavcodec library in FFMpeg to decode then re-encode a h264 video.
I have the decoding part working (rendes to an SDL window fine) but when I try to re-encode the frames I get bad data in the re-encoded videos samples.
Here is a cut down code snippet of my encode logic.
EncodeResponse H264Codec::EncodeFrame(AVFrame* pFrame, StreamCodecContainer* pStreamCodecContainer, AVPacket* pPacket)
{
int result = 0;
result = avcodec_send_frame(pStreamCodecContainer->pEncodingCodecContext, pFrame);
if(result < 0)
{
return EncodeResponse::Fail;
}
while (result >= 0)
{
result = avcodec_receive_packet(pStreamCodecContainer->pEncodingCodecContext, pPacket);
// If the encoder needs more frames to create a packed then return and wait for
// method to be called again upon a new frame been present.
// Else check if we have failed to encode for some reason.
// Else a packet has successfully been returned, then write it to the file.
if (result == AVERROR(EAGAIN) || result == AVERROR_EOF)
{
// Higher level logic, dedcodes next frame from source
// video then calls this method again.
return EncodeResponse::SendNextFrame;
}
else if (result < 0)
{
return EncodeResponse::Fail;
}
else
{
// Prepare packet for muxing.
if (pStreamCodecContainer->codecType == AVMEDIA_TYPE_VIDEO)
{
av_packet_rescale_ts(m_pPacket, pStreamCodecContainer->pEncodingCodecContext->time_base,
m_pDecodingFormatContext->streams[pStreamCodecContainer->streamIndex]->time_base);
}
m_pPacket->stream_index = pStreamCodecContainer->streamIndex;
int result = av_interleaved_write_frame(m_pEncodingFormatContext, m_pPacket);
av_packet_unref(m_pPacket);
}
}
return EncodeResponse::EncoderEndOfFile;
}
Strange behaviour I notice is that before I get the first packet from avcodec_receive_packet I have to send 50+ frames to avcodec_send_frame.
I built a debug build of FFMpeg and stepping into the code I notice that AVERROR(EAGAIN) is returned by avcodec_receive_packet because of the following in x264encoder::encode in encoder.c
if( h->frames.i_input <= h->frames.i_delay + 1 - h->i_thread_frames )
{
/* Nothing yet to encode, waiting for filling of buffers */
pic_out->i_type = X264_TYPE_AUTO;
return 0;
}
For some reason my code-context (h) never has any frames. I have spent a long time trying to debug ffmpeg and to determine what I'm doing wrong. But have reached the limit of my video codec knowledge (which is little).
I'm testing this with a video that has no audio to reduce complication.
I have created a cut down version of my application and provided a self contained (with ffmpeg and SDL built dependencies) project. Hopefully this can help anyone-one willing to help me :).
Project Link
https://github.com/maxhap/video-codec
After looking into encoder initialisation I found that I have to set the codec AV_CODEC_FLAG_GLOBAL_HEADER before calling avcodec_open2
pStreamCodecContainer->pEncodingCodecContext->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
This change led to the re-encoded moov box looking much heathier (used MP4Box.js to parse it). However, the video still does not play correctly, the output video has grey frames at the start when played in VLC and won't play in other players.
I have since tried creating an encoding context via the sample code, rather than using my decoding codec parameters. This led to fixing the bad/data or encoding issue. However, my DTS times are scaling to huge numbers
Here is my new codec init
if (pStreamCodecContainer->codecType == AVMEDIA_TYPE_VIDEO)
{
pStreamCodecContainer->pEncodingCodecContext->height = pStreamCodecContainer->pDecodingCodecContext->height;
pStreamCodecContainer->pEncodingCodecContext->width = pStreamCodecContainer->pDecodingCodecContext->width;
pStreamCodecContainer->pEncodingCodecContext->sample_aspect_ratio = pStreamCodecContainer->pDecodingCodecContext->sample_aspect_ratio;
/* take first format from list of supported formats */
if (pStreamCodecContainer->pEncodingCodec->pix_fmts)
{
pStreamCodecContainer->pEncodingCodecContext->pix_fmt = pStreamCodecContainer->pEncodingCodec->pix_fmts[0];
}
else
{
pStreamCodecContainer->pEncodingCodecContext->pix_fmt = pStreamCodecContainer->pDecodingCodecContext->pix_fmt;
}
/* video time_base can be set to whatever is handy and supported by encoder */
pStreamCodecContainer->pEncodingCodecContext->time_base = av_inv_q(pStreamCodecContainer->pDecodingCodecContext->framerate);
pStreamCodecContainer->pEncodingCodecContext->sample_aspect_ratio = pStreamCodecContainer->pDecodingCodecContext->sample_aspect_ratio;
}
else
{
pStreamCodecContainer->pEncodingCodecContext->channel_layout = pStreamCodecContainer->pDecodingCodecContext->channel_layout;
pStreamCodecContainer->pEncodingCodecContext->channels =
av_get_channel_layout_nb_channels(pStreamCodecContainer->pEncodingCodecContext->channel_layout);
/* take first format from list of supported formats */
pStreamCodecContainer->pEncodingCodecContext->sample_fmt = pStreamCodecContainer->pEncodingCodec->sample_fmts[0];
pStreamCodecContainer->pEncodingCodecContext->time_base = AVRational{ 1, pStreamCodecContainer->pEncodingCodecContext->sample_rate };
}
Any ideas why my DTS time is re-scaling incorrectly?
I managed to fix the DTS scalling by using the time_base value directly from the decoding streams.
So
pStreamCodecContainer->pEncodingCodecContext->time_base = m_pDecodingFormatContext->streams[pStreamCodecContainer->streamIndex]->time_base
Instead of
pStreamCodecContainer->pEncodingCodecContext->time_base = av_inv_q(pStreamCodecContainer->pDecodingCodecContext->framerate);
I will create an answer based on all my finding.
To fix the initial problem of a corrupted moov box I had to add the AV_CODEC_FLAG_GLOBAL_HEADER flag to the encoding codec context before calling avcodec_open2.
encCodecContext->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
The next issue was badly scaled DTS values in the encoded package, this was causing a side effect of the final mp4 duration being in the hundreds of hours long. To fix this I had to change the encoding codec context timebase to be that of the decoding context streams timebase. This is different than using av_inv_q(framerate) as suggested in the avcodec transcoding example.
encCodecContext->time_base = decCodecFormatContext->streams[streamIndex]->time_base;

Using FFMPEG to make HLS clips from H264

I am using a Hi35xx camera processor from HiSilicon. It is an Arm9 with a video pipeline bolted on the side. At one end of the pipeline is the CMOS sensor. At the other end is a H264 encoder. When I turn on the pipeline, the encoder outputs H264 NAL packets like this:
frame0: <SPS>,<PPS>,<SEI>,<key frame>
frame1: <delta frame>
frame2: <delta frame>
...
frameN: <delta frame>
frameN+1: <SPS>,<PPS>,<SEI><key frame>
frameN+2: <delta frame>
frameN+3: <delta frame>
...
etc.
I am turning that into HLS clips by doing the following (pseudo code for clarity) :
av_register_all();
avformat_network_init();
avformat_alloc_output_context2(&ctx_out, NULL, "hls", "./foo.m3u8");
strm_out = avformat_new_stream(ctx_out, NULL);
codec_out = strm_out->codecpar;
codec_out->codec_id = AV_CODEC_ID_H264;
codec_out->codec_type = AVMEDIA_TYPE_VIDEO;
codec_out->width = encoder_width;
codec_out->height = encoder_height;
codec_out->bit_rate = encoder_bitrate;
codec_out->codec_tag = 0;
avformat_write_header(ctx_out, NULL);
while(get_packet_from_pipeline_encoder(&encoder_packet)) {
AVPacket pkt;
av_init_packet(&pkt);
pkt.stream_index = 0;
pkt.dts = AV_NOPTS_VALUE;
pkt.pts = AV_NOPTS_VALUE;
pkt.duration = (1000000/FRAMERATE); // frame rate in microseconds
pkt.data = encoder_packet.data;
pkt.size = encoder_packet.size;
if (is_keyframe(&encoder_packet)) {
pkt.flags |= AV_PKT_FLAG_KEY;
}
av_write_frame(ctx_out, &pkt);
}
av_write_trailer(ctx_out);
avformat_free_context(ctx_out);
This seems to work fine except that the resulting HLS frame rate is not right. Of course, this happens because I am not setting the pts/dts stuff correctly and ffmpeg lets me know that. So I have two quetions:
Am I going about this right?
How can I set the pts/dts stuff correctly?
The encoder is giving me packets and I am submitting them as frames. Those <SPS>, <PPS> and <SEI> packets are really out of band data and don't really have a timestamp. How can I submit them correctly?
My conclusion is that I was going at this the wrong way. The basic problem was that I don't have an input context so there is no h264 parser that can take the SPS, PPS and SEI packets and do anything with them. I suspect that my loop appears to be working because I am writing to an 'mpegts' file which is just h264 packets with the leading NAL zeroes replace with length words (some bit-stream filter is doing that). But that would mean that there isn't much chance that I can get the timestamps right because I have to submit them as frames. I can't submit them as 'extradata/sidedata' because there is no decoder to catch them.
This problem is fixed by writing a custom IO context for the output of my encoder and then do a normal input context. I have done some experiments with this approach and it seems to work.

Does Cobalt support webm progressive playback

It seems that MediaSource and Progressive playback use the different demuxer. ChunkDemuxer is used for MediaSource, ShellDemuxer is used for Progressive playback.
In ShellParser.cpp implementation:
PipelineStatus ShellParser::Construct(
scoped_refptr<ShellDataSourceReader> reader,
scoped_refptr<ShellParser>* parser,
const scoped_refptr<MediaLog>& media_log) {
DCHECK(parser);
DCHECK(media_log);
*parser = NULL;
// download first 16 bytes of stream to determine file type and extract basic
// container-specific stream configuration information
uint8 header[kInitialHeaderSize];
int bytes_read = reader->BlockingRead(0, kInitialHeaderSize, header);
if (bytes_read != kInitialHeaderSize) {
return DEMUXER_ERROR_COULD_NOT_PARSE;
}
// attempt to construct mp4 parser from this header
return ShellMP4Parser::Construct(reader, header, parser, media_log);
}
It seems that Cobalt can only demux MP4 container(Only ShellMP4Parser) for progressive playback.
Is it known status for Cobalt ?how can we support webm progressive playback on the device?
Cobalt will not support WebM/VP9 progressive playback. We changed the Progressive Conformance test to change VP9 to H264. This will be pushed soon.
https://github.com/youtube/js_mse_eme/commit/d7767e13be7ed8b8bdb2efda39337a4a2fb121ba

encapsulating H.264 streams variable framerate in MPEG2 transport stream

Imagine I have H.264 AnxB frames coming in from a real-time conversation. What is the best way to encapsulate in MPEG2 transport stream while maintaining the timing information for subsequent playback?
I am using libavcodec and libavformat libraries. When I obtain pointer to object (*pcc) of type AVCodecContext, I set the foll.
pcc->codec_id = CODEC_ID_H264;
pcc->bit_rate = br;
pcc->width = 640;
pcc->height = 480;
pcc->time_base.num = 1;
pcc->time_base.den = fps;
When I receive NAL units, I create a AVPacket and call av_interleaved_write_frame().
AVPacket pkt;
av_init_packet( &pkt );
pkt.flags |= AV_PKT_FLAG_KEY;
pkt.stream_index = pst->index;
pkt.data = (uint8_t*)p_NALunit;
pkt.size = len;
pkt.dts = AV_NOPTS_VALUE;
pkt.pts = AV_NOPTS_VALUE;
av_interleaved_write_frame( fc, &pkt );
I basically have two questions:
1) For variable framerate, is there a way to not specify the foll.
pcc->time_base.num = 1;
pcc->time_base.den = fps;
and replace it with something to indicate variable framerate?
2) While submitting packets, what "timestamps" should I assign to
pkt.dts and pkt.pts?
Right now, when I play the output using ffplay it is playing at constant framerate (fps) which I use in the above code.
I also would love to know how to accommodate varying spatial resolution. In the stream that I receive, each keyframe is preceded by SPS and PPS. I know whenever the spatial resolution changes.
IS there a way to not have to specify
pcc->width = 640;
pcc->height = 480;
upfront? In other words, indicate that the spatial resolution can change mid-stream.
Thanks a lot,
Eddie
DTS and PTS are measured in a 90 KHz clock. See ISO 13818 part 1 section 2.4.3.6 way down below the syntax table.
As for the variable frame rate, your framework may or may not have a way to generate this (vui_parameters.fixed_frame_rate_flag=0). Whether the playback software handles it is an ENTIRELY different question. Most players assume a fixed frame rate regardless of PTS or DTS. mplayer can't even compute the frame rate correctly for a fixed-rate transport stream generated by ffmpeg.
I think if you're going to change the resolution you need to end the stream (nal_unit_type 10 or 11) and start a new sequence. It can be in the same transport stream (assuming your client's not too simple).

Resources