Seeking in any large MKV H.264 video works very slow.
Call of av_seek_frame() in my code using FFmpeg LibAV takes about 3-10 seconds to jump to a middle position in 1-2 hours video.
I tried all combinations of flags: AVSEEK_FLAG_BACKWARD, AVSEEK_FLAG_BYTE, AVSEEK_FLAG_ANY, AVSEEK_FLAG_FRAME
However VLC seeks very fast in same MVK videos.
Seeking in MP4 H.264 works instantly. The problem is only with MKV.
Latest FFmpeg 4.1.3.
Related
I want to use an ffmpeg command to save segments of length 10 seconds from a video being streamed via HTTP using VLC.
The command I used to carry out the task is:
ffmpeg -i http://[Local IPv4 address]:8080 -f segment -segment_time 10 -vcodec copy -acodec copy -reset_timestamps 1 video%03d.mp4
However, the lengths of the output videos I'm receiving are around 8.333 or 16.666 seconds in duration. (The videos video000.mp4, video005.mp4, video010.mp4, video015.mp4... have duration of around 16.666 seconds and the remaining videos have duration of around 8.333 seconds).
I'm aware that the segmentation of input video happens based on the occurrences of keyframes in the video. It appears that the key frames in the video being streamed occur with an interval of around 8 seconds.
Is there a way to obtain video segments that are closer to 10 seconds in duration from the live stream of such a video?
Also, I occasionally get the "Non-monotonous DTS in output stream 0:0" warning while executing the above command. I tried using different flags (+genpts, +igndts, +ignidx) hoping that the warning message would not be displayed, but no luck. Is it possible that there is any correlation between this warning and the issue with lengths of the segments?
I am using ffmpeg on Linux to transcode video files. The files are video from a race car camera. They have been downloaded from Youtube as "webm" format. I want to compare two of the videos, side-by-side, using GridPlayer, which uses vlc as its underlying video processor. GridPlayer has very nice, frame-by-frame controls, but, they are very slow. What video codec should I use to impose the least decoding overhead on vlc/GridPlayer for smoother playback?
I have tried re-encoding as h264, 1920x1080, 30 fps, in mp4 container. I have since discovered a "-tune fastdecode" option that seems to be helpful, along with resizing to 854x480. Any other suggestions?
When extracting segments from a media file with video and audio streams without re-encoding (-c copy), it is likely that the requested seek & end time specified will not land precisely on a keyframe in the source.
In this case, ffmpeg will grab the nearest keyframe of each track and position them with differing starting PTS values so that they remain in sync.
Video keyframes tend to be a lot more spaced apart, so you can often end up with something like this:
Viewing the clip in VLC, the audio will start at 5 seconds in.
However, in other video players or video editors I've noticed this can lead to some playback issues or a/v desync.
A solution would be to re-encode both streams when extracting the clip, allowing ffmpeg to precisely seek to the specified seek time and generating equal length & synced audio and video tracks.
However, in my case I do not want to re-encode the video, it is costly and produces lower quality video and/or greater file sizes. I would prefer to only re-encode the audio, filling the initial gap with generated silence.
This should be simple, but everything I've tried has failed to generate silence before the audio stream begins.
I've tried apad, aresample=sync=1, and using amerge to combine the audio with anullsrc. None of it works.
All I can think to possibly get around this is to use ffprobe on the misaligned source to retrieve the first audio PTS, and in a second ffmpeg process apply this value as a negative -itoffset, then concatting the audio track with generated silence lasting the duration of silence... But surely there's a better way, with just one instance of ffmpeg?
Any ideas?
I just stumbled across the solution by trying some more things.
I take the misaligned source media and process it with another ffmpeg instance with some audio filters:
ffmpeg -fflags +genpts -i input.mkv -c copy -c:a aac -af apad,aresample=async=1:first_pts=0 -ac 2 -shortest -y output.mkv
And it does exactly what I want, pads the beginning (and end) of the audio stream with silence making the audio stream equal length to the video.
The only drawback is that I can't combine this with my original ffmpeg command that extracts the clip, the only way this works is as a 2-step process.
I am using ffmpeg to rotate videos 90 or 180 degrees in a Python script. It works great. But, I am curious as to why the output file would be a smaller amount of bytes than the input file.
Here are the commands I use:
180 degrees:
ffmpeg -i ./input.mp4 -preset veryslow -vf "transpose=2,transpose=2,format=yuv420p" -metadata:s:v rotate=0 -codec:v libx264 -codec:a copy ./output.mp4
90 degrees:
ffmpeg -i ./input.mp4 -vf "transpose=2" ./output.mp4
For example, a GoPro Hero 3 MP4 file was originally 2.0 GB. The resulting output file was 480.9 MB. Another GoPro file was 2.0 and its resulting file was 671.5 MB. Is this maybe because the GoPro files were 2.0 but contains empty space, sort of like how some NTFS filesystems make a minimal 4k file, even when there is less bytes in it?
If this isn't the GoPro Hero 3, how do I rotate the files 90 or 180 degrees but ensure the output file size is the same? Or, is data loss expected? Does the data loss have to do with the format?
Note that the quality of the video doesn't appear to be damaged, which is good. So, I am interested in learning more about why this is happening, then I can read the section of ffmpeg documentation that is relevant to this.
Thank you!
Bitrate is ignored from the start
ffmpeg fully decodes the input into uncompressed raw video and audio (except when stream copying – more about that below). The input format or bitrate does not matter: it does this for all formats. The encoder then works from these raw, decoded frames. See diagram.
H.264 vs H.264
Your input and output are both H.264. A format, such as H.264, is created by an encoder. Anyone can make an encoder. However, not all encoders are equal. Given the same input, the output from one H.264 encoder may have the same quality as an output from another H.264 encoder, but the bitrate may be several times smaller.
The GoPro H.264 encoder was made to work on a platform with limited hardware. That means bitrate (file size) is sacrificed for speed and quality. x264 is the ultimate H.264 encoder: nothing can beat its quality-to-bitrate ratio.
Rotate without re-encoding
You can stream copy (re-mux) and rotate at the same time. The rotation is handled by the metadata/sidedata:
ffmpeg -i input.mp4 -metadata:s:v rotate=90 -c copy output.mp4
Downside is your player/device may ignore the rotation, so you may have to physically rotate with filters which requires re-encoding, and therefore stream copy can't be used.
I had the same rotation issue once...
I fixed it by "resetting" the rotation instead...
ffmpeg ...... -metadata:s:v rotate="0" ......
I'm creating a fragmented MP4 for the use of playing in Media Source Extensions.
The command line is: ffmpeg.exe -probesize 10000000 -r 10 -i - -vcodec copy -an -f mp4 -reset_timestamps 0 -blocksize 30000 -movflags empty_moov+default_base_moof+frag_keyframe -loglevel debug -
The source of the video is an IP camera streaming H.264.
The configured and expected frame rate is 10FPS but there is no guarantee for 10FPS, for example a frame may get dropped occasionally, or the camera may just not play nice with what it declares.
I have simulate a 10% p-frames drops to emphasize the following issue:
With the above command, the output video plays faster than real-time and this is a problem because the whole pipe is a live stream.
With the 10% frame drop simulation, the effective playback rate it 1.1x.
I don't want to obligate to a fixed frame rate because there is no guarantee for a fixed-rate.
If I remove the -r 10 flag entirely, the MP4 seems to the playing at 2x-3x speed.
Is there a way building the MP4 timestamps in a more dynamic way? for example, giving it the RTP timestamp or somehow telling ffmpeg to build the MP4 with the timestamp of the "feed" time?