Faster way of getting number of key frames than "show_frames" in ffprobe? - ffmpeg

I'm making a little in-house utility using ffmpeg and ffprobe. It works fine and does what is needed: it gives a count of the number of key frames in a video file, plus some other details.
Alas, with the large video files this will be used on, it can take many seconds for show_frames to return, and I then have to parse the JSON dump of frame data and keep a running count of the total key frames.
Is there a faster way? Perhaps it is listed in the "stream" or "format" data dumps and I am not recognizing what it is being called? I've been through the ffmpeg and ffprobe docs and didn't find anything else.

For MP4 and MOV files, you can get this info by reading the contents of the STSS (sync sample) box.
You can use a tool like MP4parser, which will generate a log file with an entry like this:
/moov/trak/mdia/minf/stbl/stss # 0x1d7218e
Box size: 0x74 version: 0x0 flags: 0x0
entry_count: 0x19
sample_number:
0x1 0x86 0x180 0x27a ....
That entry_count value (in hex) is the number you want: 0x19 here, i.e. 25 key frames.
Alternatively, AtomicParsley will also tell you the location of the STSS box within the file, and you can then read it directly.
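For illustration, here is a naive C sketch of that read-it-directly approach (my own code, not from either tool). It pattern-matches the literal bytes "stss" rather than walking the moov/trak/mdia/minf/stbl hierarchy, so in principle it can hit a false match inside media data, and it will find nothing in fragmented files:
#include <stdio.h>
#include <stdint.h>
#include <string.h>
/* stss layout after the 4-byte size: type(4) version(1) flags(3)
   entry_count(4), all big endian; entry_count is the key frame count. */
int main(int argc, char **argv)
{
    if (argc < 2) return 1;
    FILE *f = fopen(argv[1], "rb");
    if (!f) return 1;
    uint8_t buf[65536];
    size_t n, carry = 0;
    while ((n = fread(buf + carry, 1, sizeof(buf) - carry, f)) > 0) {
        size_t total = carry + n;
        for (size_t i = 0; i + 12 <= total; i++) {
            if (memcmp(buf + i, "stss", 4) == 0) {
                uint32_t count = ((uint32_t)buf[i + 8] << 24) |
                                 ((uint32_t)buf[i + 9] << 16) |
                                 ((uint32_t)buf[i + 10] << 8) |
                                  (uint32_t)buf[i + 11];
                printf("key frames: %u\n", count);
                fclose(f);
                return 0;
            }
        }
        carry = total < 11 ? total : 11;   /* let a match straddle two reads */
        memmove(buf, buf + total - carry, carry);
    }
    fclose(f);
    fprintf(stderr, "no stss box found (fragmented file?)\n");
    return 1;
}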

Related

When creating a Xing or Info tag in an MP3, may I use any MP3 header or does it have to match other frames?

I have a set of bare MP3 files. Bare as in I removed all tags (no ID3, no Xing, no Info) from those files.
Just before sending one of these files to the client, I want to add an Info tag. All of my files are CBR so we will use an Info tag (no Xing).
Right now I get the first 4 bytes of the existing MP3 to get the Version (MPEG-1, Layer III), Bitrate, Frequency, Stereo Mode, etc. and thus determine the size of one frame. I create the tag that way, reusing these 4 bytes for the Info tag and determining the size of the frame.
For those wondering, these 4 bytes may look like this:
FF FB 78 04
To me it felt like you are expected to use the exact same first 4 bytes in the Info tag as found in the other audio frames of the MP3, but ffmpeg sticks in an Info tag with a hard-coded header (wrong bitrate, wrong frequency, etc.).
My question is: is ffmpeg really doing it right? (LAME doesn't do that.) Could I do the same, skip loading the first 4 bytes, and still have the great majority of the players out there play my files as expected?
Note: since I read these 4 bytes over the network, it would definitely save time and some bandwidth not to have to fetch them with a HEAD request; those are resources I could use for the GET requests instead.
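For concreteness, here is a minimal sketch (mine, not the asker's code) of that 4-byte header parse for the MPEG-1 Layer III case, using the standard bitrate and sample-rate tables and the usual frame-size formula:
#include <stdio.h>
#include <stdint.h>
static const int kBitrateKbps[16] = { 0, 32, 40, 48, 56, 64, 80, 96,
                                      112, 128, 160, 192, 224, 256, 320, 0 };
static const int kSampleRate[4]   = { 44100, 48000, 32000, 0 };
int main(void)
{
    const uint8_t h[4] = { 0xFF, 0xFB, 0x78, 0x04 }; /* example from the question */
    int bitrate = kBitrateKbps[h[2] >> 4];       /* top 4 bits of byte 2 */
    int rate    = kSampleRate[(h[2] >> 2) & 3];  /* next 2 bits */
    int padding = (h[2] >> 1) & 1;
    int mode    = h[3] >> 6;    /* 0 stereo, 1 joint stereo, 2 dual, 3 mono */
    int frame   = 144 * bitrate * 1000 / rate + padding;
    printf("%d kbps, %d Hz, mode %d, frame size %d bytes\n",
           bitrate, rate, mode, frame);
    return 0;
}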
The reason for the difference is that with certain configurations the size of a frame is less than 192 bytes, in which case the full Info/Xing tag does not fit (and from what I can see, the four optional fields are always included, so an Info/Xing tag is always full-size even when not required to be).
So, for example, if you have a single channel with 44.1kHz data at 32kbps, the MP3 frame is 117 or 118 bytes. This is less than what is necessary to save the Info/Xing tag.
What LAME does in that situation is forfeit the Info/Xing tag: it will not appear anywhere in the file.
What FFmpeg does instead is create a frame with a higher bitrate. So instead of 32 kbps, it will try 48 kbps and then 64 kbps. Once it finds a configuration that offers a frame large enough to hold the Info/Xing tag, it stops. (I have not looked at the code, so I do not know how FFmpeg really finds a large enough frame, but on my end I just incremented the bitrate index field by one until the frame size was >= 192 bytes, and it works.)
You can replicate this by first creating (or converting) a 44.1 kHz WAVE file, converting it to MP3 at 32 kbps with ffmpeg, and observing that the Info/Xing tag carries a different bitrate.
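A sketch of that bitrate-bumping loop under the same assumptions (44.1 kHz, starting at 32 kbps, 192 bytes needed for the tag, standard MPEG-1 Layer III frame-size formula):
#include <stdio.h>
static const int kBitrateKbps[16] = { 0, 32, 40, 48, 56, 64, 80, 96,
                                      112, 128, 160, 192, 224, 256, 320, 0 };
int main(void)
{
    int samplerate = 44100;
    int idx = 1;                                  /* start at 32 kbps */
    /* bump the bitrate index until one frame can hold the full tag */
    while (idx < 14 && 144 * kBitrateKbps[idx] * 1000 / samplerate < 192)
        idx++;
    printf("write the Info tag frame at %d kbps (%d bytes)\n",
           kBitrateKbps[idx],
           144 * kBitrateKbps[idx] * 1000 / samplerate);
    return 0;
}
With these numbers the loop settles on 64 kbps, matching the 48-then-64 progression described above.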

H.264 - Identify Access Units of an image

I need to parse an H.264 stream to collect only the NALs needed to form a complete image of just one frame. I'm reading the H.264 standard, but it's confusing and hard to read. I ran some experiments, but they did not work. For example, I extracted an access unit with primary_pic_type == 0 containing only slice_type == 7 (I-slice); it should give me a frame, but when I tried it with ffmpeg, it did not work. When I appended the next access unit, containing only slice_type == 5 (P-slice), it worked. Maybe I need to extract POC information, but I think not, because I only need to extract one frame; I'm not sure, though. Does anyone have a tip on how to get only the NALs I need to form one complete image?
I assume that you have an "Annex B" style stream that looks like this:
(AUD)(SPS)(PPS)(I-Slice)(PPS)(P-Slice)(PPS)(P-Slice) ... (AUD)(SPS)(PPS)(I-Slice)
I assume that you want to decode a single I frame, and we hope that your I frame is also an IDR frame.
You are somewhere in the middle of the stream.
Keep reading until you find an (AUD) = 0x00 0x00 0x00 0x01 0x09.
Now push everything into your decoder until you are in front of the | marking the second (PPS): (AUD)(SPS)(PPS)(I-Slice) | (PPS)
Flush your decoder to emit an uncompressed frame.
This doesn't solve the general case but probably decodes most well behaved streams.
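A minimal sketch of that scan (my code, assuming 4-byte start codes as in the answer; some streams use 3-byte 00 00 01 start codes, which this does not handle):
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
/* Find AUD offsets: a 00 00 00 01 start code followed by a NAL whose
   type (low 5 bits of the first NAL byte) is 9. The bytes between two
   consecutive AUD offsets form one access unit to push into the decoder. */
static long next_aud(const uint8_t *buf, size_t len, size_t from)
{
    for (size_t i = from; i + 5 <= len; i++)
        if (buf[i] == 0 && buf[i+1] == 0 && buf[i+2] == 0 &&
            buf[i+3] == 1 && (buf[i+4] & 0x1F) == 9)
            return (long)i;
    return -1;
}
int main(int argc, char **argv)
{
    if (argc < 2) return 1;
    FILE *f = fopen(argv[1], "rb");
    if (!f) return 1;
    fseek(f, 0, SEEK_END);
    long len = ftell(f);
    fseek(f, 0, SEEK_SET);
    uint8_t *buf = malloc(len);
    if (!buf || fread(buf, 1, len, f) != (size_t)len) return 1;
    fclose(f);
    for (long off = next_aud(buf, len, 0); off >= 0;
         off = next_aud(buf, len, (size_t)off + 4))
        printf("access unit starts at offset %ld\n", off);
    free(buf);
    return 0;
}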
Just in case someone has the same problem: I solved it. I read until I find an AUD with primary_pic_type == 0, then I extract that access unit and the next one (when it's a field), send the two access units to the server, and decode the frame using ffmpeg to generate a JPG image.

When recording MP4 using ffmpeg, a sudden power-off damages the file

I am using C and ffmpeg to mux real-time audio and video into MP4 files, and everything works fine. But if the power suddenly fails during recording, the MP4 file being written is damaged and VLC cannot play it.
I think the reason is that the trailer-writing function av_write_trailer is never called, so the index and timestamp information is lost. I used the Araxis Merge tool to compare a file where av_write_trailer was called successfully against a damaged file where it was not, and found two differences:
1. In the damaged file, the box size value in the file header is not right.
2. The damaged file has no end-of-file data.
Now I want my program to automatically repair the damaged files after power is restored, but I could not find an effective method on Google.
My idea is this: during normal recording, save once per second the two pieces of information the damaged file would be missing (the box size value and the end-of-file data) to a local side file; when the MP4 file is written completely, delete this side file. If the power fails and the recording is damaged, then at the next power-up, read the side file and write the saved information into the corresponding positions of the damaged file. The problem is that I don't know how to capture the box size value and the end-of-file data. Is this approach feasible? If so, what should I do? Looking forward to your reply!
The main cause of MP4 file damage is a header or trailer that was not written properly; the whole file then becomes junk data, and no media player is able to play the broken mp4 file.
So,
First, the broken file has to be repaired before it can be played.
There are some applications and tricks available to repair it and get the data back; links are given below:
http://grauonline.de/cms2/?page_id=5 (Windows / Mac)(paid :( )
https://github.com/ponchio/untrunc (Linux-based OS) (of course, free!)
Second, manually repairing the corrupt file using a hex editor.
The logic behind this hack: it requires the broken mp4 file and a good video file, where both videos were captured with the same camera and the good file is larger than the broken one.
Open both video files in any hex editor. Copy the trailer part from the good video file into the broken video file and save it. Done!
Note: always keep a backup of the video file.
Follow these links for detailed information:
http://janit.iki.fi/repair-corrupted-mp4-video/
https://www.lfs.net/forum/thread/45156-Repair-a-corrupt-mp4-file%3F
http://hackaday.com/2015/04/02/manual-data-recovery-with-a-hex-editor/
http://www.hexview.org/hex-repair-corrupt-file.html
Third, even though the MP4 format has many advantages, this kind of error is unpredictable and difficult to handle.
Thus, using a format such as MPG with AV_CODEC_ID_MPEG1VIDEO/AV_CODEC_ID_MPEG2VIDEO (in FFmpeg) may help avoid this kind of error. The MPG format does not require any header/trailer, so after a sudden power failure the MPG file will still play whatever frames were stored so far.
Note: there are other formats and codecs with this kind of property as well.
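For what it's worth, a minimal sketch of that suggestion using the ffmpeg C API (the function name open_mpg_output is mine; stream creation, encoder setup and error handling are omitted):
#include <libavformat/avformat.h>
/* Pick ffmpeg's "mpeg" (MPEG program stream) muxer instead of "mp4".
   Every packet written is then self-contained, so a power failure
   costs only the tail of the recording, not the whole file. */
int open_mpg_output(AVFormatContext **octx, const char *filename)
{
    int ret = avformat_alloc_output_context2(octx, NULL, "mpeg", filename);
    if (ret < 0)
        return ret;
    /* pair the streams with AV_CODEC_ID_MPEG1VIDEO or
       AV_CODEC_ID_MPEG2VIDEO, as suggested above */
    if (!((*octx)->oformat->flags & AVFMT_NOFILE))
        ret = avio_open(&(*octx)->pb, filename, AVIO_FLAG_WRITE);
    return ret;
}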

FFmpeg - Is it difficult to use?

I am trying to use ffmpeg, and have been experimenting a lot over the last month.
I have not been able to get through. Is it really difficult to use FFmpeg?
My requirement is simple, as below.
Can you please advise whether ffmpeg is the suitable tool, or whether I have to implement this on my own (using the available codec libraries)?
I have a webm file (containing VP8 and OPUS frames).
I will read the encoded data and send it to a remote guy.
The remote guy will read the encoded data from the socket.
The remote guy will write it to a file (can we avoid decoding?).
Then the remote guy should be able to play the file using ffplay or any player.
Now I will take a specific example.
Say I have a file small.webm, containing VP8 and OPUS frames.
I am reading only audio frames (OPUS) using the av_read_frame API (then I check the stream index and keep audio frames only).
So now I have the encoded data buffer as packet.data and the encoded data size as packet.size (please correct me if I am wrong).
Here is my first doubt: the audio packet size is not the same every time. Why the difference? Sometimes the packet size is as low as 54 bytes and sometimes it is 420 bytes. Does the frame size vary from time to time for OPUS?
Next, say I somehow extract a single frame (I really do not know how to extract a single frame) from the packet and send it to the remote guy.
Now the remote guy needs to write the buffer to a file. To write the file we can use the av_interleaved_write_frame or av_write_frame APIs. Both of them take an AVPacket as an argument. I can create an AVPacket and set its data and size members, then call av_write_frame. But that does not work. The reason may be that one should set other members in the packet, like pts, dts, etc. But I do not have that information to set.
Can somebody help me figure out whether FFmpeg is the right choice, or should I write custom logic, like parsing an OPUS file and getting it frame by frame?
Now the remote guy needs to write the buffer to a file. To write the file we can use the av_interleaved_write_frame or av_write_frame APIs. Both of them take an AVPacket as an argument. I can create an AVPacket and set its data and size members, then call av_write_frame. But that does not work. The reason may be that one should set other members in the packet, like pts, dts, etc. But I do not have that information to set.
Yes, you do. They were in the original packet you received from the demuxer in the sender. You need to serialize all information in this packet and set each value accordingly in the receiver.
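A sketch of what that means in practice, following ffmpeg's standard remuxing pattern (the function name forward_packet is mine; the socket serialization itself is not shown):
#include <libavformat/avformat.h>
/* The timing fields of the demuxed packet must travel with the data.
   Over a socket you would serialize pts, dts, duration and the stream
   index next to pkt->data, restore them on the receiving side,
   rescale, and only then write. */
int forward_packet(AVFormatContext *octx, AVPacket *pkt,
                   AVStream *in_stream, AVStream *out_stream)
{
    /* convert timestamps from the input stream's time base to the
       output stream's; without this the muxer rejects or misplaces
       the packet, which is why "set data and size only" did not work */
    av_packet_rescale_ts(pkt, in_stream->time_base, out_stream->time_base);
    pkt->pos = -1;
    pkt->stream_index = out_stream->index;
    return av_interleaved_write_frame(octx, pkt);
}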

Failing to generate a correct wav file from a raw stream

I captured raw audio data stream together with its WAVEFORMATEXTENSIBLE struct.
(The WAVEFORMATEXTENSIBLE contents were shown in a screenshot, not reproduced here.)
Following the WAV file standard, I tried to write the raw bits into a wav file.
What I do is:
write "RIFF".
write a DWORD. (filesize - sizeof("RIFF") - sizeof(DWORD)).
=== WaveFormat Chunk ===
write "WAVEfmt "
write a DWORD. (size of the WAVEFORMATEXTENSIBLE struct)
write the WAVEFORMATEXTENSIBLE struct.
=== Fact Chunk ===
write "fact"
write a DWORD. ( 4 )
write a DWORD. ( num of samples in the stream, which should be sizeof(rawdata)*8/wBitsPerSample ).
=== Data Chunk ===
write "data"
write a DWORD (size of rawdata)
write the raw data.
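For reference, those steps spelled out as a small C sketch (names like write_wav and fmt_bytes are mine; as the accepted fix below shows, the chunk layout itself was not the problem):
#include <stdio.h>
#include <stdint.h>
/* All multi-byte fields are little-endian. fmt_bytes/fmt_size are
   assumed to hold the captured WAVEFORMATEXTENSIBLE as raw bytes,
   raw/raw_size the sample data, and nsamples = raw_size * 8 / wBitsPerSample. */
static void write_u32(FILE *f, uint32_t v)
{
    uint8_t b[4] = { (uint8_t)v, (uint8_t)(v >> 8),
                     (uint8_t)(v >> 16), (uint8_t)(v >> 24) };
    fwrite(b, 1, 4, f);
}
void write_wav(const char *path,
               const uint8_t *fmt_bytes, uint32_t fmt_size,
               const uint8_t *raw, uint32_t raw_size, uint32_t nsamples)
{
    FILE *f = fopen(path, "wb");
    if (!f) return;
    uint32_t riff_size = 4 + (8 + fmt_size) + (8 + 4) + (8 + raw_size);
    fwrite("RIFF", 1, 4, f); write_u32(f, riff_size);   /* RIFF header */
    fwrite("WAVE", 1, 4, f);
    fwrite("fmt ", 1, 4, f); write_u32(f, fmt_size);    /* fmt chunk */
    fwrite(fmt_bytes, 1, fmt_size, f);
    fwrite("fact", 1, 4, f); write_u32(f, 4);           /* fact chunk */
    write_u32(f, nsamples);
    fwrite("data", 1, 4, f); write_u32(f, raw_size);    /* data chunk */
    fwrite(raw, 1, raw_size, f);
    fclose(f);
}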
After getting the wav file from the above steps, I played it with a media player: there is no sound. Playing it with Audacity gives a distorted sound; I can hear that it is the correct audio I want, but it is distorted with noise.
The raw data can be found here.
The wav file I generated is here.
It is very confusing to me, because when I use the same method to convert IEEE-float data to a wav file, it works just fine.
I figured this out. It seems the GetBuffer/ReleaseBuffer cycle in IAudioRenderClient produces raw data in the same format as the one passed into the Initialize method of IAudioClient.
In my case, the format returned by GetMixFormat on IAudioClient is different from the format passed into Initialize; I think GetMixFormat returns the format that the device supports.
IAudioClient must be converting from the initialized format to the mix format. I intercepted the Initialize method, got the format from there, and it works like a charm.
I'm intercepting WASAPI to access the audio data and faced the exact same issue: the audio file generated from the data sounds like the correct content but is somehow very noisy, although the frame rate, sample width, number of channels, etc. are set properly.
The SubFormat field of WAVEFORMATEXTENSIBLE shows that the data is actually KSDATAFORMAT_SUBTYPE_IEEE_FLOAT, while I was originally treating it as integers. According to this page, KSDATAFORMAT_SUBTYPE_IEEE_FLOAT is equivalent to WAVE_FORMAT_IEEE_FLOAT in WAVEFORMATEX. Hence, setting the "audio format" field in the wav file's fmt chunk (it normally starts at byte offset 20) to WAVE_FORMAT_IEEE_FLOAT (which is 3) solved the problem. Remember to write it in little endian.
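A tiny sketch of that patch (patch_format_tag is my name; offset 20 assumes the canonical layout where the fmt chunk immediately follows the 12-byte RIFF/WAVE header):
#include <stdio.h>
/* set wFormatTag to WAVE_FORMAT_IEEE_FLOAT (0x0003), little-endian,
   at byte offset 20 = 12 (RIFF/WAVE) + 8 (fmt chunk header) */
int patch_format_tag(const char *path)
{
    FILE *f = fopen(path, "r+b");
    if (!f) return -1;
    const unsigned char tag[2] = { 0x03, 0x00 };
    fseek(f, 20, SEEK_SET);
    fwrite(tag, 1, 2, f);
    fclose(f);
    return 0;
}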
