I have an .mp4 file that contains H.264 video and AAC audio. I want to extract macroblock and motion vector information for each frame while decoding. My pseudocode is below.
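avformat_open_input(&pFormatContext, file_name, NULL, NULL); // open the file
avcodec_open2(pCodecContext, pCodec, NULL); // open the decoder
while (av_read_frame(pFormatContext, pPacket) >= 0) // read each packet
{
    // video_stream_index is assumed to have been found while probing the file
    if (pPacket->stream_index == video_stream_index &&
        avcodec_send_packet(pCodecContext, pPacket) >= 0)
    {
        // drain every frame the decoder produced for this packet
        while (avcodec_receive_frame(pCodecContext, pFrame) >= 0)
        {
            // extract macroblock of pFrame here
            av_frame_unref(pFrame);
        }
    }
    av_packet_unref(pPacket);
}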
I have seen other posts mention that MB information is available through the MpegEncContext structure, but I'm confused about where and how to instantiate that structure, and how its MB data gets updated for each frame.
Ultimately, I want to compare macroblocks from one frame to the next using SAD (sum of absolute differences) and trigger alerts if there is any distortion at the macroblock level.
I would really appreciate any help with this.
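You can get MB information (MVs and other data) from the AVFrame structure. In older FFmpeg releases there is a member int16_t (*motion_val[2])[2] in AVFrame that holds the MVs. That member has since been removed; in current releases, open the decoder with the flags2 option set to +export_mvs and read the AVMotionVector entries from the frame side data of type AV_FRAME_DATA_MOTION_VECTORS.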
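For the SAD comparison the question describes, a minimal sketch in plain C could look like the following. It is not FFmpeg API: mb_sad is a hypothetical helper, and it assumes both frames are 8-bit luma planes with the same stride (AVFrame.linesize[0] in FFmpeg terms) and dimensions that are multiples of 16.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MB_SIZE 16 /* an H.264 macroblock covers 16x16 luma samples */

/* Hypothetical helper: sum of absolute differences between the co-located
 * 16x16 luma blocks of two frames. stride is the number of bytes per row
 * in both planes; mb_x and mb_y are given in macroblock units. */
unsigned mb_sad(const uint8_t *cur, const uint8_t *prev,
                int stride, int mb_x, int mb_y)
{
    size_t offset = (size_t)mb_y * MB_SIZE * (size_t)stride
                  + (size_t)mb_x * MB_SIZE;
    const uint8_t *a = cur + offset;
    const uint8_t *b = prev + offset;
    unsigned sad = 0;

    for (int y = 0; y < MB_SIZE; y++) {
        for (int x = 0; x < MB_SIZE; x++)
            sad += (unsigned)abs(a[x] - b[x]);
        a += stride; /* advance both pointers to the next row */
        b += stride;
    }
    return sad;
}
```

An alert could then be raised whenever mb_sad exceeds a threshold chosen for the content; a suitable threshold is something to tune experimentally, not a fixed constant.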
I'm trying to use the libavcodec library in FFmpeg to decode and then re-encode an H.264 video.
I have the decoding part working (it renders to an SDL window fine), but when I try to re-encode the frames I get bad data in the re-encoded video's samples.
Here is a cut-down code snippet of my encode logic.
EncodeResponse H264Codec::EncodeFrame(AVFrame* pFrame, StreamCodecContainer* pStreamCodecContainer, AVPacket* pPacket)
{
    int result = 0;
    result = avcodec_send_frame(pStreamCodecContainer->pEncodingCodecContext, pFrame);
    if (result < 0)
    {
        return EncodeResponse::Fail;
    }
    while (result >= 0)
    {
        result = avcodec_receive_packet(pStreamCodecContainer->pEncodingCodecContext, pPacket);
        // If the encoder needs more frames to create a packet, return and wait for
        // this method to be called again when a new frame is present.
        // Else check if we have failed to encode for some reason.
        // Else a packet has successfully been returned, so write it to the file.
        if (result == AVERROR(EAGAIN) || result == AVERROR_EOF)
        {
            // Higher level logic decodes the next frame from the source
            // video, then calls this method again.
            return EncodeResponse::SendNextFrame;
        }
        else if (result < 0)
        {
            return EncodeResponse::Fail;
        }
        else
        {
            // Prepare packet for muxing.
            if (pStreamCodecContainer->codecType == AVMEDIA_TYPE_VIDEO)
            {
                av_packet_rescale_ts(m_pPacket, pStreamCodecContainer->pEncodingCodecContext->time_base,
                    m_pDecodingFormatContext->streams[pStreamCodecContainer->streamIndex]->time_base);
            }
            m_pPacket->stream_index = pStreamCodecContainer->streamIndex;
            // Use a separate variable so the loop's result is not shadowed,
            // and report muxing failures instead of ignoring them.
            int writeResult = av_interleaved_write_frame(m_pEncodingFormatContext, m_pPacket);
            av_packet_unref(m_pPacket);
            if (writeResult < 0)
            {
                return EncodeResponse::Fail;
            }
        }
    }
    return EncodeResponse::EncoderEndOfFile;
}
The strange behaviour I notice is that I have to send 50+ frames to avcodec_send_frame before I get the first packet from avcodec_receive_packet.
I built a debug build of FFmpeg and, stepping into the code, I noticed that AVERROR(EAGAIN) is returned by avcodec_receive_packet because of the following check in x264_encoder_encode in encoder.c:
if( h->frames.i_input <= h->frames.i_delay + 1 - h->i_thread_frames )
{
    /* Nothing yet to encode, waiting for filling of buffers */
    pic_out->i_type = X264_TYPE_AUTO;
    return 0;
}
For some reason my codec context (h) never has any frames. I have spent a long time trying to debug FFmpeg and to determine what I'm doing wrong, but I have reached the limit of my video codec knowledge (which is little).
I'm testing this with a video that has no audio to reduce complication.
I have created a cut-down version of my application and provided a self-contained project (with prebuilt FFmpeg and SDL dependencies). Hopefully this helps anyone willing to help me :).
Project Link
https://github.com/maxhap/video-codec
After looking into encoder initialisation, I found that I have to set the AV_CODEC_FLAG_GLOBAL_HEADER flag on the codec context before calling avcodec_open2:
pStreamCodecContainer->pEncodingCodecContext->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
This change led to the re-encoded moov box looking much healthier (I used MP4Box.js to parse it). However, the video still does not play correctly: the output has grey frames at the start when played in VLC and won't play in other players.
I have since tried creating an encoding context as in the sample code, rather than using my decoding codec parameters. This fixed the bad data in the encoded output. However, my DTS times are scaling to huge numbers.
Here is my new codec init:
if (pStreamCodecContainer->codecType == AVMEDIA_TYPE_VIDEO)
{
    pStreamCodecContainer->pEncodingCodecContext->height = pStreamCodecContainer->pDecodingCodecContext->height;
    pStreamCodecContainer->pEncodingCodecContext->width = pStreamCodecContainer->pDecodingCodecContext->width;
    pStreamCodecContainer->pEncodingCodecContext->sample_aspect_ratio = pStreamCodecContainer->pDecodingCodecContext->sample_aspect_ratio;
    /* take first format from list of supported formats */
    if (pStreamCodecContainer->pEncodingCodec->pix_fmts)
    {
        pStreamCodecContainer->pEncodingCodecContext->pix_fmt = pStreamCodecContainer->pEncodingCodec->pix_fmts[0];
    }
    else
    {
        pStreamCodecContainer->pEncodingCodecContext->pix_fmt = pStreamCodecContainer->pDecodingCodecContext->pix_fmt;
    }
    /* video time_base can be set to whatever is handy and supported by encoder */
    pStreamCodecContainer->pEncodingCodecContext->time_base = av_inv_q(pStreamCodecContainer->pDecodingCodecContext->framerate);
}
else
{
    /* copy the sample rate first; the audio time_base below depends on it */
    pStreamCodecContainer->pEncodingCodecContext->sample_rate = pStreamCodecContainer->pDecodingCodecContext->sample_rate;
    pStreamCodecContainer->pEncodingCodecContext->channel_layout = pStreamCodecContainer->pDecodingCodecContext->channel_layout;
    pStreamCodecContainer->pEncodingCodecContext->channels =
        av_get_channel_layout_nb_channels(pStreamCodecContainer->pEncodingCodecContext->channel_layout);
    /* take first format from list of supported formats */
    pStreamCodecContainer->pEncodingCodecContext->sample_fmt = pStreamCodecContainer->pEncodingCodec->sample_fmts[0];
    pStreamCodecContainer->pEncodingCodecContext->time_base = AVRational{ 1, pStreamCodecContainer->pEncodingCodecContext->sample_rate };
}
Any ideas why my DTS time is re-scaling incorrectly?
I managed to fix the DTS scaling by using the time_base value directly from the decoding stream.
So I now use
pStreamCodecContainer->pEncodingCodecContext->time_base = m_pDecodingFormatContext->streams[pStreamCodecContainer->streamIndex]->time_base;
instead of
pStreamCodecContainer->pEncodingCodecContext->time_base = av_inv_q(pStreamCodecContainer->pDecodingCodecContext->framerate);
I will create an answer based on all my findings.
To fix the initial problem of a corrupted moov box I had to add the AV_CODEC_FLAG_GLOBAL_HEADER flag to the encoding codec context before calling avcodec_open2.
encCodecContext->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
The next issue was badly scaled DTS values in the encoded packets, which had the side effect of the final mp4 duration being hundreds of hours long. To fix this I had to change the encoding codec context's time base to the time base of the decoding format context's stream. This differs from using av_inv_q(framerate), as suggested in the avcodec transcoding example.
encCodecContext->time_base = decCodecFormatContext->streams[streamIndex]->time_base;
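One plausible reading of why this works, sketched in plain C: if the decoded frames keep the pts they had in the input stream's time base, but the encoder is told its time base is 1/framerate, then rescaling packets from the encoder time base back to the stream time base multiplies every timestamp by a large factor. Rational and rescale_ts below are hypothetical stand-ins for AVRational and av_rescale_q (the real function also rounds and guards against overflow), and the 1/12800 stream time base and 25 fps are illustrative values, not taken from the project.

```c
#include <stdint.h>

/* Hypothetical stand-in for AVRational. */
typedef struct { int num; int den; } Rational;

/* Convert ts from time base `from` to time base `to`, truncating.
 * Simplified stand-in for av_rescale_q. */
int64_t rescale_ts(int64_t ts, Rational from, Rational to)
{
    return ts * from.num * to.den / ((int64_t)from.den * to.num);
}
```

With a stream time base of 1/12800, the frame at one second has pts 12800. Treating that value as if it were in units of 1/25 and rescaling to 1/12800 yields 6553600, which is 512 seconds instead of 1. When the encoder time base equals the stream time base, the rescale leaves the value unchanged.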
I need to decode a video starting from a specific frame with FFmpeg. I know H.264 video has I-frames, P-frames and B-frames, and when I use av_seek_frame() to seek to a specific frame it only takes me to the nearest previous I-frame. So I tried to use AVPacket.dts and a while loop to advance to the specific frame, and then start demuxing video frames from there.
auto DurationTime = static_cast<uint64_t>(m_pAVFormatContext->streams[m_VideoStreamIndex]->duration);
auto FrameNum = m_pAVFormatContext->streams[m_VideoStreamIndex]->nb_frames;
av_seek_frame(m_pAVFormatContext, m_VideoStreamIndex, DurationTime / FrameNum * (vSpecificFrame), AVSEEK_FLAG_FRAME | AVSEEK_FLAG_BACKWARD);
int TempValue = 0;
while ((TempValue = av_read_frame(m_pAVFormatContext, &m_AVPacket)) >= 0 && m_AVPacket.stream_index != m_VideoStreamIndex)
{
    av_packet_unref(&m_AVPacket);
}
while (m_AVPacket.dts / (DurationTime / FrameNum) < vSpecificFrame)
{
    if (m_AVPacket.data)
        av_packet_unref(&m_AVPacket);
    while ((TempValue = av_read_frame(m_pAVFormatContext, &m_AVPacket)) >= 0 && m_AVPacket.stream_index != m_VideoStreamIndex)
    {
        av_packet_unref(&m_AVPacket);
    }
}
if (m_AVPacket.data)
    av_packet_unref(&m_AVPacket);
The problem is that when I demux the next frame I can get its pos, data, size and all other information, but the image cannot be displayed. The same is true of every frame between the specific frame and the next I-frame.
Am I using the wrong method? Is there another way to seek to a specific frame of a video? Thanks very much.
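One arithmetic detail in the snippet above is worth checking independently of the display problem: DurationTime / FrameNum * vSpecificFrame divides before multiplying, so the per-frame duration is truncated and the error grows with the frame number. A small sketch in plain C, with hypothetical function names and illustrative numbers, assuming a constant frame rate:

```c
#include <stdint.h>

/* Hypothetical helper: timestamp of `frame` in stream time_base units,
 * multiplying before dividing so the truncation happens only once. */
int64_t frame_to_ts(int64_t duration, int64_t nb_frames, int64_t frame)
{
    return duration * frame / nb_frames;
}

/* The evaluation order used in the snippet above: the per-frame
 * duration is truncated first, then scaled by the frame number. */
int64_t frame_to_ts_truncating(int64_t duration, int64_t nb_frames, int64_t frame)
{
    return duration / nb_frames * frame;
}
```

For a stream with duration 1000 and 30 frames, frame 29 maps to 966 with the first function and 957 with the second. Whether that difference matters depends on the stream's time base, and neither form is reliable for variable frame rate content.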
I can get raw H.264 frames from a camera (they do NOT contain any network headers such as RTSP or HTTP).
They are raw H.264 data.
I push this data to a queue frame by frame.
I googled many FFmpeg examples, which use avformat_open_input() with either a local file path or a network path.
I can see the video if I save the frames to a file and use avformat_open_input().
My problem is that I want to decode the frames in real time, not after they are saved to a file.
Does anyone have any idea how to do this?
Thanks!
You do not need avformat, you need avcodec. avformat is for parsing containers and protocols. avcodec is for encoding and decoding elementary streams (what you already have).
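const AVCodec *codec = avcodec_find_decoder(AV_CODEC_ID_H264);
AVCodecContext *codecCtx = avcodec_alloc_context3(codec);
avcodec_open2(codecCtx, codec, NULL);
AVPacket *avpkt = av_packet_alloc();
AVFrame *avframe = av_frame_alloc();
// Set avpkt->data and avpkt->size to one frame taken from the queue here
int err = avcodec_send_packet(codecCtx, avpkt);
// avcodec_decode_video2() is deprecated; receive frames until the decoder needs more input
while (err >= 0 && (err = avcodec_receive_frame(codecCtx, avframe)) >= 0) {
    // use the decoded avframe here
    av_frame_unref(avframe);
}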
I use packet duration to translate from frame index to pts and back, and I'd like to be sure that this is a reliable method of doing so.
Alternatively, is there a better way to translate pts to a frame index and vice versa?
A snippet showing my usage:
bool seekFrame(int64_t frame)
{
    if (frame > container.frameCount)
        frame = container.frameCount;
    // Seek to a frame behind the desired frame because nextFrame() will also increment the frame index
    int64_t seek = pts_cache[frame-1]; // pts_cache is an array of all frame pts values
    // get the nearest prior keyframe
    int preceedingKeyframe = av_index_search_timestamp(container.video_st, seek, AVSEEK_FLAG_BACKWARD);
    // here's where I'm worried that packetDuration isn't a reliable method of translating frame index to
    // pts value
    int64_t nearestKeyframePts = preceedingKeyframe * container.packetDuration;
    avcodec_flush_buffers(container.pCodecCtx);
    int ret = av_seek_frame(container.pFormatCtx, container.videoStreamIndex, nearestKeyframePts, AVSEEK_FLAG_ANY);
    if (ret < 0) return false;
    container.lastPts = nearestKeyframePts;
    AVFrame *pFrame = NULL;
    while (nextFrame(pFrame, NULL) && container.lastPts < seek)
    {
        ;
    }
    container.currentFrame = frame-1;
    av_free(pFrame);
    return true;
}
No, not guaranteed. It may work with some codec/container combinations where the frame rate is static; avi, raw h264 (Annex B) and yuv4mpeg come to mind. But other containers, such as flv, mp4 and ts, have a PTS/DTS (or CTS) for EVERY frame. The source could be variable frame rate, or frames could have been dropped at some point during processing due to bandwidth. Also, some codecs will remove duplicate frames.
So unless you created the file yourself, do not trust it. There is no guaranteed way to look at a frame and know its 'index' except to start at the beginning and count.
Your method MAY be good enough for most files, however.
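Since the question already builds a pts_cache by walking the whole file, the counting approach can be made cheap at seek time by searching that array instead of multiplying by a packet duration. A minimal sketch in plain C, assuming pts_cache is sorted in ascending presentation order; frame_index_for_pts and estimate_pts are hypothetical helper names, not FFmpeg functions:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: index of the last frame whose pts is <= target,
 * or -1 if target precedes the first frame. Binary search over a pts
 * array that was filled by counting frames from the start of the file. */
long frame_index_for_pts(const int64_t *pts_cache, size_t count, int64_t target)
{
    size_t lo = 0, hi = count; /* the answer lies in [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (pts_cache[mid] <= target)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (long)lo - 1;
}

/* The constant-duration estimate from the question, for comparison. */
int64_t estimate_pts(long frame, int64_t packet_duration)
{
    return frame * packet_duration;
}
```

With a variable frame rate sequence such as pts values 0, 40, 80, 200, 240, the frame at index 3 has pts 200, while the constant-duration estimate for index 3 with a duration of 40 is 120, which is the kind of mismatch described above.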
This is supposed to be possible on Mac OS X by overwriting the sample rate in the AudioStreamBasicDescription and then creating a new output queue.
I've been able to retrieve the default sample rate and write a new one (i.e. replace 44100 with 48000), but this does not result in any pitch change in the output signal.
err = AudioFileGetProperty(mAudioFile, kAudioFilePropertyDataFormat, &size, &mDataFormat);
if (err != noErr)
    NSLog(@"Couldn't determine the audio file format");
Float64 mySampleRate = mDataFormat.mSampleRate; //the initial rate
if (inRate != 1) {
    //write a new value
    mDataFormat.mSampleRate = inRate;
    //then
    err = AudioQueueNewOutput etc.
Any suggestions would be greatly appreciated.
Changing the sample rate doesn't change the pitch of the audio. You may perceive that something playing back faster has a higher pitch. However that's perception rather than reality.
To change pitch, you'll need to process the audio data through a Digital Signal Processing (DSP) library. Alternatively, take a look at running it through an AudioUnit:
Audio Unit Programming Guide