How to accurately split a long-GOP video (H.264/XDCAM...) with FFmpeg? - bash

My goal is to split an XDCAM or H.264 video, frame-accurately, with ffmpeg.
I guess the problem comes from its long-GOP structure, but I'm looking for a way to split the video without re-encoding it.
I apply an offset to encode only a specific section of the video (let's say from the 10th second to the end of the media).
Any ideas?

Please refer to the ffmpeg documentation.
You will find an option -frames. That option can be used to specify, for a given stream (in the following, 0:0 addresses the first video stream of the first input file), the number of frames to record. It can be combined with other options to start somewhere in the input file (time offset, etc.):
ffmpeg -i input.ts -frames:0:0 100 -vcodec copy test.ts
That command demuxes and remuxes only the first 100 frames of the video (no re-encoding).
As said, you can combine it with a jump using '-ss offset' as an input option. Note that -ss takes hours:minutes:seconds.milliseconds, not a frame-based timecode, so a position such as frame 14 after 1 min 10 s has to be converted to seconds first. The option should be placed before the input, like below:
ffmpeg -ss 00:00:10.0 -i input.ts -frames:0:0 100 -vcodec copy test.ts
ffmpeg discards the first 10 seconds and passes 100 frames to the muxer.
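To make the trade-off explicit (a hedged sketch; file names are placeholders): with stream copy the cut can only land on a keyframe, so a truly frame-accurate split of long-GOP material needs a re-encode of at least the video stream:
# keyframe-limited cut, no re-encoding (fast, but starts at the previous keyframe)
ffmpeg -ss 00:00:10.0 -i input.ts -c copy -frames:v 100 cut_copy.ts
# frame-accurate cut, re-encoding the video stream only
ffmpeg -ss 00:00:10.0 -i input.ts -c:v libx264 -c:a copy -frames:v 100 cut_exact.ts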

I'm not sure if it's possible in one pass with ffmpeg, but in two or three:
1st pass: you just dump the raw frames to a file.
2nd pass: you find the closed GOP (XDCAM) / IDR frame (H.264) whose index is <= the index of the frame you want to start at.
If the indexes are equal you can start muxing. Otherwise you need to decode the sequence from that closed GOP/IDR frame to the next closed GOP/IDR frame, and encode starting from the frame you want.
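A hedged sketch of that second pass with ffprobe (decoding only keyframes; input.ts is a placeholder): it prints the timestamp and picture type of every keyframe, which you can compare against your target frame:
ffprobe -v error -select_streams v:0 -skip_frame nokey -show_entries frame=pts_time,pict_type -of csv=p=0 input.ts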

Related

How to remove a frame with ffmpeg without re-encoding?

I am making a datamoshing program in C++, and I need to find a way to remove one frame from a video (specifically, the p-frame right after a sequence jump) without re-encoding the video. I am currently using h.264 but would like to be able to do this with VP9 and AV1 as well.
I have one way of going about it, but it doesn't work for one frustrating reason (mentioned later). I can turn the original video into two intermediate videos - one with just the i-frame before the sequence jump, and one with the p-frame that was two frames later. I then create a concat.txt file with the following contents:
file video.mkv
file video1.mkv
And run ffmpeg -y -f concat -i concat.txt -c copy output.mp4. This produces the expected output, although it is of course not as efficient as I would like, since it requires creating intermediate files and reading the .txt file from disk (performance is very important in this project).
But worse yet, I couldn't generate the intermediate videos with ffmpeg; I had to use avidemux. I tried all sorts of variations on ffmpeg -y -ss 00:00:00 -i video.mp4 -t 0.04 -codec copy video.mkv, but that command seems to really bug out with videos of length 1-2 frames, while it works for longer videos no problem. My best guess is that there is some internal checker to ensure the output video is not corrupt (which, unfortunately, is exactly what I want it to be!).
Maybe there's a way to do it this way that gets around that problem, or better yet, a more elegant solution to the problem in the first place.
Thanks!
If you know the PTS or data offset or packet index of the target frame, then you can use the noise bitstream filter. This is codec-agnostic.
ffmpeg -copyts -i input -c copy -enc_time_base -1 -bsf:v:0 noise=drop=eq(pos\,11291) out
This will drop the packet from the first video stream stored at offset 11291 in the input file. See other available variables at http://www.ffmpeg.org/ffmpeg-bitstream-filters.html#noise
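To find a suitable pos value in the first place, one option (a sketch; input.mp4 is a placeholder) is to list the video packets with ffprobe and pick out the target packet's byte offset:
ffprobe -v error -select_streams v:0 -show_entries packet=pts_time,pos,flags -of csv=p=0 input.mp4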

Does ffmpeg allow for hr:min:sec:frame?

I am looking to concat a video and an image using ffmpeg, and was wondering if, when cutting the video, I can cut it with a timecode that uses seconds:frames (00:00:23:4) instead of seconds.milliseconds (00:00:23.43), and then concat it with seconds:frames?
No, see the time duration syntax documentation for accepted syntax.
You can use the reciprocal of the output frame rate instead. For example, frame #5 for a constant output frame rate of 25 is 5*(1/25)=0.2.
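A minimal bash sketch of that conversion, assuming a constant 25 fps (the timecode and file names are placeholders):
# convert the timecode 00:00:23:4 (HH:MM:SS:FF) to seconds at 25 fps
fps=25
tc="00:00:23:4"
ss=$(awk -v tc="$tc" -v r="$fps" 'BEGIN{ split(tc, p, ":"); printf "%.3f", p[1]*3600 + p[2]*60 + p[3] + p[4]/r }')
ffmpeg -ss "$ss" -i input.mp4 -c copy cut.mp4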

Scene detection and concat makes my video longer (FFMPEG)

I'm encoding videos by scenes. At the moment I have two ways of doing so. The first one is using a Python application which gives me a list of frame numbers that represent scene changes, like this:
285
378
553
1145
...
The first scene runs from frame 1 to 285, the second from 285 to 378, and so on. So I made a bash script which encodes all these scenes. Basically what it does is take the current and previous frame numbers, convert them to times, and finally run the ffmpeg command:
begin=$(awk -v f="$previous" 'BEGIN{ print f/24 }')
end=$(awk -v f="$current" 'BEGIN{ print f/24 }')
time=$(awk -v b="$begin" -v e="$end" 'BEGIN{ print e-b }')
ffmpeg -i $video -r 24 -c:v libx265 -f mp4 -c:a aac -strict experimental -b:v 1.5M -ss $begin -t $time "output$count.mp4" -nostdin
This works perfectly. The second method is using ffmpeg itself: I run its scene detection and it gives me a list of times, like this:
15.75
23.0417
56.0833
71.2917
...
Again I made a bash script that encodes all these segments. In this case I don't have to convert anything, because what I have are already times:
time=$(awk -v p="$previous" -v c="$current" 'BEGIN{ print c-p }')
ffmpeg -i $video -r 24 -c:v libx265 -f mp4 -c:a aac -strict experimental -b:v 1.5M -ss $previous -t $time "output$count.mp4" -nostdin
With all that explained, here comes the problem: once all the scenes are encoded I need to concat them, and for that I create a list with the video names and then run the ffmpeg command.
list.txt
file 'output1.mp4'
file 'output2.mp4'
file 'output3.mp4'
file 'output4.mp4'
command:
ffmpeg -f concat -i list.txt -c copy big_buck_bunny.mp4
The problem is that the "concated" video is longer than the original by 2.11 seconds: the original lasts 596.45 seconds and the encoded one lasts 598.56. I added up the duration of every segment and got 598.56, so I think the problem is in the encoding process. Both videos have the same number of frames. My goal is to get metrics about the encoding process, and when I run VQMT to get the PSNR and SSIM I get weird results, which I think is due to this problem.
By the way, I'm using the big_buck_bunny video.
The difference is probably due to the copy codec. In the latter case, you tell ffmpeg to copy the segments, but it can't do that exactly at your input times.
It first has to find the previous I-frame (a frame that can be decoded without reference to any previous frame) and start from there.
To get what you need, you either have to re-encode the video (as you did in the two earlier examples) or change the cut times so they fall on I-frames.
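If you are re-encoding once anyway, a hedged sketch of the second approach (the times are taken from the list above; the rest is illustrative): force I-frames at the scene-change times during one encode, after which copy cuts at exactly those timestamps are clean:
ffmpeg -i input.mp4 -c:v libx264 -force_key_frames 15.75,23.0417,56.0833,71.2917 -c:a copy keyed.mp4
# now a copy cut starting at 15.75 begins exactly on an I-frame
ffmpeg -ss 15.75 -i keyed.mp4 -t 7.2917 -c copy scene2.mp4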
To be sure I'm getting your issue correctly:
You have a source video (encoded at a variable frame rate, close to 18 fps).
You want to split the source video via ffmpeg, forcing the frame rate to 24 fps.
Then you want to concat each segment.
I think the issue is mainly that you have a discrepancy in the timing (if I divide the frame indexes by the times you've given, I get between 16 fps and 18 fps). When you convert them in step 2, the output video segments will be at 24 fps. ffmpeg does not resample along the time axis, so if you force a video rate, the video will accelerate or slow down.
There is also the issue of stream consistency:
Typically, a video stream must start with an I-frame, so when splitting with the copy codec, ffmpeg has to locate the previous I-frame, and this changes the duration of the segment.
When you are concatenating, you could also have a consistency issue: if the segment you are concatenating ends with an I-frame and the next one starts with an I-frame, it's possible ffmpeg drops one of them (although I don't remember the current behavior).
So, to solve your issue, if I were you I would avoid step 2 (it's bad for quality anyway). That is, I would use ffmpeg to split the segments of interest based on frame numbers (the only values that are not approximate in your scheme) into png or ppm frames (or to a pipe, if you don't care about keeping them), and then concat all the frames in a final encoding step, with the expected rate set to totalVideoTime / totalFrameCount.
You'll get a smaller and higher-quality final video.
If you can't do what I said for whatever reason, at least for the concat input, you should use the ffconcat format:
ffconcat version 1.0
file segment1
duration 12.2
file segment2
duration 10.3
This will give you the expected total duration, by cutting each segment short if it's longer than the stated duration.
For selecting by frame number (rather than by time, since time is hard to get right on a variable frame rate video), you should use the select filter, like this:
-vf "select=between(n\,start_frame_num\,end_frame_num),setpts=PTS-STARTPTS"
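As a concrete sketch of that (frame numbers taken from the first list above; the rest is illustrative), dumping the second scene as png frames:
ffmpeg -i input.mp4 -vf "select=between(n\,285\,377),setpts=PTS-STARTPTS" -vsync 0 scene2_%05d.png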
I also suggest checking the input and output frame rates and making sure they match; that could be a source of the discrepancy.

FFMPEG frame extraction - stuck

Trying to extract specific frames from a video with the following command (with the specific file names removed!):
ffmpeg -i video.mp4 -vf "select=gte(n\,6956)" -vframes 10262 folder/frame%d.jpg
However, in many cases this results in the same frame (the first one) being extracted repeatedly, rather than a progression of frames.
The image sequence muxer, by default, is set to assume a constant frame rate output, so it will fill in missing timestamp gaps with duplicates.
The select filter does not reset timestamps, so, in your command, there's a "gap" from 0 to the timestamp of the first selected frame.
Use instead
ffmpeg -i video.mp4 -vf "select=gte(n\,6956)" -vsync 0 -vframes 10262 folder/frame%d.jpg
This changes video sync method to prevent frame duplication.
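On newer ffmpeg releases (5.1 and later, if I remember correctly) -vsync is deprecated; the equivalent spelling is -fps_mode:
ffmpeg -i video.mp4 -vf "select=gte(n\,6956)" -fps_mode passthrough -frames:v 10262 folder/frame%d.jpg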

Create a video with timestamp from multiple images with "picture taken date/time" in the meta data with ffmpeg or similar?

I have two time-lapse videos with a rate of 1 fps; the camera took one image every minute. Unfortunately we forgot to set the camera to burn/print the time and date on every image, so I am trying to burn the time and date into the video afterwards.
With ffmpeg I decoded the two .avi files into ~7000 single images each and wrote an R script that renamed the files to their "creation" date (the time and date the pictures were taken). Then I used exiftool to write that information "into" the files, into their exif data or metadata or whatever this is called.
The final images in the folder are looking like this:
2018-03-12 17_36_40.png
2018-03-12 17_35_40.png
2018-03-12 17_34_40.png
...
Is it possible to create a video from these images again, with ffmpeg or similar, with a "timestamp" in the video, so that you can see a time and date stamp while watching?
I think this can be done in two steps.
First you create an mp4 file with the timestamp burned in for every picture. This is a batch file which creates such video files.
@echo off
set "INPUT=C:\t\video"
for %%a in ("%INPUT%\*.png") do (
ffmpeg -i "%%~a" -vf "drawtext=text=%%~na:x=50:y=100:fontfile=/Windows/Fonts/arial.ttf:fontsize=25:fontcolor=white" -c:v libx264 -pix_fmt yuv420p "output/%%~na.mp4"
)
This will create an mp4 file in the output/ directory for every png picture.
Explanation:
The for loop iterates over all *.png files and creates an *.mp4 file for each.
The text is added via the drawtext overlay; in batch, %%~na expands to the filename without the .png suffix, which is exactly the timestamp to display.
The font used is arial.ttf (feel free to use any font you want).
x and y are the coordinates where you want to place the text.
-c:v libx264 selects the x264 encoder.
-pix_fmt yuv420p is there so that crappy players are able to play it.
The second step is to concatenate the H.264 files using the concat demuxer.
You need to create a file list such as file_list.txt:
file '2018-03-12 17_34_40.mp4'
duration 10
file '2018-03-12 17_35_40.mp4'
duration 10
file '2018-03-12 17_36_40.mp4'
duration 10
...
Examples can be found in the ffmpeg concat documentation.
Then you simply concat all the *.mp4 files (run this in the output subdirectory):
ffmpeg -safe 0 -f concat -i file_list.txt -c copy output.mp4
This will create a single concatenated output.mp4 file.
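A possible single-pass alternative (an untested sketch: it assumes an ffmpeg new enough to have the image2 demuxer's export_path_metadata option, roughly 4.3+, and a build with glob pattern support) lets drawtext render each source filename directly, skipping the intermediate mp4 files:
ffmpeg -framerate 1 -pattern_type glob -export_path_metadata 1 -i "*.png" -vf "drawtext=text='%{metadata\:lavf.image2dec.source_basename}':x=50:y=100:fontfile=/Windows/Fonts/arial.ttf:fontsize=25:fontcolor=white" -c:v libx264 -pix_fmt yuv420p output.mp4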
If I understand correctly, you have some number of YUV frames and their information saved separately.
For example:
Let's say you originally have a 10-second video at 24 frames per second (constant, not variable).
So you have 240 YUV frames. In this case or similar, you can generate a video file via ffmpeg in a container format like mp4, supplying the resolution and frame rate information. You will not need any metadata to turn it back into a video or to play it; it will play normally in any decent player.
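For instance, a minimal sketch (resolution, pixel format and file names are assumptions) wrapping raw YUV frames into an mp4:
ffmpeg -f rawvideo -pix_fmt yuv420p -video_size 1280x720 -framerate 24 -i frames.yuv -c:v libx264 output.mp4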
If you only have KEY FRAMES with different frame timing between them, then yes, you will need the metadata information, and I'm not sure how you can do that.
Since you have the source, you can extract every single frame (24 frames per second in this example) and you are good to go. Similar answers have already been given for this; just look around.
Hope that helps.
