I've tried to encode one still image at 25 fps with H.264 (ffmpeg). Why does the codec produce DIFFERENT images?
This is my synthetic still-image example:
ffmpeg -framerate 1 -r 25 -i ttest%.jpg -vcodec libx264 -crf 25 -pix_fmt yuv420p still_image.mp4
I've checked the resulting file with probe (and other software), and found that all images in the encoded sequence are different!
I understand that H.264 is a lossy codec, and maybe by default the algorithm compresses some macroblocks (MB) differently, but I want to produce a file in which every MB is B_skip or P_skip in all slices except the first IDR frame. Is it possible?
Whenever I use the ffmpeg commands, the video is larger and the seeking/keyframe interval is not the same as in the video rendered from Blender. How can I convert the following Blender settings to ffmpeg, please?
Blender Settings:
Frame rate: 30
Codec: h.264
Output .mp4
Keyframe interval: 1
Output quality: Medium
Encoding speed: Good
Here's my current command; however, the seeking and file size are different:
ffmpeg -framerate 30 -i %04d.jpg -g 1 -vcodec libx264 video.mp4
ffmpeg -r 30 -i %04d.jpg -vcodec libx264 -crf 25 -x264-params keyint=30:scenecut=0 -preset veryslow video.mp4
Explanation:
-r 30 before the input pictures tells ffmpeg to read 30 pictures per second
-vcodec libx264 will let ffmpeg encode in plain old H.264
-crf 25 will let the encoder decide on the bitrate for a medium quality (lower it for better quality / higher file size, increase it for worse quality / lower file size. Need to find your right setting there through testing)
-x264-params keyint=30:scenecut=0 tells the x264 encoder to set a keyframe every 30 frames (here, every 1 s) and to disable scene detection. Be aware that this increases the file size a lot; you should not use a keyframe every second, except for livestreaming. Modern video encoders like AV1 will most of the time set a keyframe only every 10-20 s, based on scene detection.
-preset veryslow will use the best libx264 preset available to make the file as small as possible with H.264 (however needs more time to encode). If you want a faster encode but a larger file set it to slow.
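The keyint value above is simply the frame rate multiplied by the desired keyframe spacing in seconds. A minimal sketch of that arithmetic, assuming integer frame rates; the helper names keyint_for and x264_params are hypothetical, and the script only prints a parameter string without running ffmpeg:

```shell
#!/bin/sh
# keyint_for FPS SECONDS: print the keyframe interval in frames.
# (hypothetical helper, not part of ffmpeg or x264)
keyint_for() {
    echo $(( $1 * $2 ))
}

# x264_params FPS SECONDS: print a value suitable for -x264-params,
# with scene detection disabled as in the command above.
x264_params() {
    echo "keyint=$(keyint_for "$1" "$2"):scenecut=0"
}

x264_params 30 1     # one keyframe per second at 30 fps
x264_params 30 10    # one keyframe every 10 seconds at 30 fps
```

The second call shows how a longer spacing would be expressed if a keyframe every second turns out to cost too much file size.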
Some general opinions from me:
If you don't need compatibility with very old devices, encode with libx265 or 2-pass libvpx-vp9 instead. This will save you a lot of space without quality loss. libx265 slow is even faster than libx264 veryslow for me.
Currently I am using this command
ffmpeg -i <srcfile> -vcodec libx264 -profile:v main -level 3.1 -preset slower -crf 18 -x264-params ref=4 -acodec copy -movflags +faststart <outfile>
to convert some dashcam footage for viewing on an iOS device.
The above command took about 30 min to complete on a 2017 MacBook Pro with 16 GB of RAM.
I want to speed it up. One thing I tried is to harness the GPU in the computer, so I added the flag -c:v h264_videotoolbox
It sped things up a lot. I can complete the conversion in 1 min.
However, when I inspected the output, the GPU version suffers from banding and blurriness.
Here is a screenshot. CPU version on the left and GPU version on the right.
To highlight the difference, here are the parts of the videos
Trees in reflections
corrugated iron sheet wall
Is there any switch that I can manipulate to make the GPU version clearer?
This is a simplistic H.264 encoder compared to x264, so you're not going to get the same quality per bitrate. h264_videotoolbox is optimized for speed and does not support -crf.
You can view some options specific to this encoder with ffmpeg -h encoder=h264_videotoolbox, but as they are probably already set to "auto" (I didn't confirm via source code and I don't have the hardware to try it) these additional options may not make much of a difference.
So you'll just have to increase the bitrate, such as with -b:v 8000k.
Or continue to use libx264 with a faster -preset.
I see the question's been answered and nearly two years ago. Jumping in for others who might stumble on the thread. I get great results with VideoToolbox as encoder, using either GPU or software to accelerate, depending which machine I am using.
As already mentioned, setting a constant bitrate, and adjusting it upward is key to producing a result that is nearly indistinguishable from a large source file. A constant bitrate is as effective as two-pass encoding for high-quality output, when paired with other key parameters, and is much quicker than two-pass.
It may seem counter-intuitive, but a computer running on all threads, full throttle, to encode a video won't give you the best results. Several researchers have demonstrated that quality actually goes down if all CPU threads are engaged in encoding; it is better to use fewer threads and even throttle ffmpeg with a 3rd-party app (encoding does not slow down significantly, in my experience). So limit threads on newer multithreaded desktops and laptops.
Common practice for target bitrates (seen on Netflix, Amazon) varies with resolution, naturally: at least 5,000 kbps for 1080p; 3,500 for 720p. For a noticeable improvement in video quality, the encoder bitrate should be set to at least 1.5 times those common-practice bitrates: i.e., 7,500 for 1080p, 5,250 for 720p. Similarly for 4K GoPros or dash cams.
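The 1.5x rule of thumb is easy to script when the baseline changes. A minimal sketch using integer math; the helper name boosted_bitrate is hypothetical and the baselines are the ones quoted above:

```shell
#!/bin/sh
# boosted_bitrate KBPS: print 1.5 times a baseline bitrate, in kbit/s.
# (hypothetical helper; integer arithmetic, so odd inputs round down)
boosted_bitrate() {
    echo $(( $1 * 3 / 2 ))
}

echo "1080p: -b:v $(boosted_bitrate 5000)k"   # 1080p: -b:v 7500k
echo "720p:  -b:v $(boosted_bitrate 3500)k"   # 720p:  -b:v 5250k
```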
Often I work with large movie files from my Blu-ray library, and create slimmed-down versions that are 1/3 to 1/2 the size of the original (a 20 GB original gives way to a file of 8-10 GB with no perceptible loss of quality). Also: framerate. Maintaining the same framerate from source to slimmed-down file is essential, so that parameter is important. Framerate is either 24fps, 25fps, or 30fps for theatrical film, European tv, and North American tv, respectively. (Except in transferring film to a tv screen, 24fps becomes 23.976fps, in most cases.) Of course 60fps is common for GoPro-like cameras, but here 30fps would be a reasonable choice.
It is this control of framerate and bitrate that keeps ffmpeg in check, and gives you predictable, repeatable results. Not an errant, gigantic file that is larger than the one you may have started with.
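Part of why a fixed bitrate is predictable is that the video stream size follows directly from bitrate times duration. A rough sketch of that estimate, assuming a hypothetical 2-hour movie and ignoring audio, subtitles and container overhead; video_size_mb is an illustrative helper name:

```shell
#!/bin/sh
# video_size_mb KBPS SECONDS: approximate video stream size in decimal megabytes.
# kbit/s * s = kbit; / 8 = kilobytes; / 1000 = megabytes.
# (hypothetical helper; excludes audio and container overhead)
video_size_mb() {
    echo $(( $1 * $2 / 8 / 1000 ))
}

# 8500 kbit/s video over an assumed 2-hour (7200 s) running time
video_size_mb 8500 7200    # prints 7650, i.e. roughly 7.65 GB
```

Adding the copied audio stream on top of this figure lands in the same general range as the file sizes mentioned above.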
I work on a Mac, so there may be slight differences on the command line, and here I use VideoToolbox as software encoder, but a typical command reads:
ffmpeg -loglevel error -stats -i source.video -map 0:0 -filter:v fps\=24000/1001 -c:v h264_videotoolbox -b:v 8500k -profile 3 -level 41 -coder cabac -threads 4 -allow_sw:v 1 -map 0:1 -c:a:0 copy -disposition:a:0 default -map 0:6 -c:s:0 copy -disposition:s:0 0 -metadata:g title\="If you want file title in the metadata, goes here" -default_mode passthrough 'outfile.mkv'
-loglevel error (to troubleshoot errors)
-stats (provides progress status in terminal window)
-i infile (source video to transcode)
-map 0:0 (specify each stream in the original to map to output)
-filter:v fps\=24000/1001 (framerate of 23.976, like source)
-c:v h264_videotoolbox (encoder)
-b:v (set bitrate, here I chose 8500k)
-profile 3 -level 41 (h264 profile high, level 4.1)
-coder cabac (cabac coder chosen)
-threads 4 (limit of 4 cpu threads, of 8 on this laptop)
-allow_sw:v 1 (using VideoToolbox software encoding for acceleration; GPU is not enabled)
-map 0:1 -c:a:0 copy -disposition:a:0 default (copies audio stream over, unchanged, as default audio)
-map 0:6 -c:s:0 copy -disposition:s:0 0 (copies subtitle stream over, not as default ... i.e., will not play subtitles automatically)
-metadata:g (global metadata, you can reflect filename in metadata)
-default_mode passthrough (Matroska muxer option: a track is flagged as default only if its disposition says so)
outfile (NOTE: no dash precedes filename/path. Chose mkv format to hold my multiple streams; mp4 or other formats work just fine ... as long as contents are appropriate for format.)
In addition to llogan's answer, I'd recommend setting the 'realtime' property to zero (this can increase quality in motion scenes).
As llogan says, the bitrate option is a good parameter to adjust in this situation.
ffmpeg -i input.mov -c:v h264_videotoolbox -b:v {bitrate} -c:a aac output.mp4
If you want to set a 1000 kb/s bitrate, the command looks like this:
ffmpeg -i input.mov -c:v h264_videotoolbox -b:v 1000k -c:a aac output.mp4
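Both suggestions can be combined in one command line. The sketch below only prints the command instead of running it; the helper name vt_command is hypothetical, and the spelling -realtime 0 for the 'realtime' property is an assumption that should be checked against ffmpeg -h encoder=h264_videotoolbox on your build:

```shell
#!/bin/sh
# vt_command INPUT OUTPUT KBPS: print (do not run) an h264_videotoolbox command.
# (hypothetical helper; file names containing spaces would need extra quoting)
# Assumption: the encoder's 'realtime' property is exposed as "-realtime 0".
vt_command() {
    printf 'ffmpeg -i %s -c:v h264_videotoolbox -realtime 0 -b:v %sk -c:a aac %s\n' \
        "$1" "$3" "$2"
}

vt_command input.mov output.mp4 1000
```

Printing first makes it easy to inspect the bitrate and flags before pasting the line into a terminal.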
I am using ffmpeg to convert frames from an mp4 to png images.
I would like 20 frames per second AND I would also like the images to be scaled up to 1920x1080. The original mp4 is 240p (426x240).
It lets me specify 20 fps after the -vf flag, but it doesn't let me scale the images.
ffmpeg -i 240_video.mp4 -vf scale=1920:1080 fps=20 240_scaled/out%d.png
If I leave out scale=1920:1080 the command works, but of course, I get 426x240 images.
You can chain linear filters together with commas:
ffmpeg -i 240_video.mp4 -vf "fps=20,scale=1920:1080" 240_scaled/out%d.png
If your input has more than 20 fps, then ffmpeg will drop frames to convert to 20 fps. If your input has less than 20 fps, then ffmpeg will duplicate frames to convert to 20 fps. If you want all of the frames as is then omit the fps filter.
I used the fps filter first because in this case, assuming your input frame rate is higher than 20 fps, it will be slightly more efficient and faster than scaling first because frames will be dropped before the scale filter.
Many players won't like the output because it won't be 4:2:0, so you can add the format filter:
ffmpeg -i 240_video.mp4 -vf "fps=20,scale=1920:1080,format=yuv420p" 240_scaled/out%d.png
426x240 upscaled while keeping the aspect ratio is actually 1920x1082 or 1917x1080, so add pad or crop to compensate. Or refer to the force_original_aspect_ratio option in scale. setsar is added so you don't get a weird SAR. -movflags +faststart is added in case you are doing progressive playback.
ffmpeg -i 240_video.mp4 -vf "fps=20,scale=1920:-1,crop=1920:1080,setsar=1,format=yuv420p" -movflags +faststart 240_scaled/out%d.png
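The 1920x1082 and 1917x1080 figures come from scaling one dimension and letting the other follow the source aspect ratio. A small sketch of that arithmetic with round-to-nearest integer math; the helper names are hypothetical, and ffmpeg's own rounding inside the scale filter may differ by a pixel in edge cases:

```shell
#!/bin/sh
# scaled_height SRC_W SRC_H DST_W: height that keeps the aspect ratio.
# (hypothetical helper; adds half the divisor to round to nearest)
scaled_height() {
    echo $(( ($2 * $3 + $1 / 2) / $1 ))
}

# scaled_width SRC_W SRC_H DST_H: width that keeps the aspect ratio.
scaled_width() {
    echo $(( ($1 * $3 + $2 / 2) / $2 ))
}

scaled_height 426 240 1920   # prints 1082, so 1920x1082 needs a 2-pixel crop
scaled_width 426 240 1080    # prints 1917, so 1917x1080 needs 3 pixels of padding
```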
If using ffmpeg to precisely cut up a video into pieces, for instance using something like:
ffmpeg -i input.mkv -ss 0 -to 30 -c:v libx264 -preset ultrafast -qp 0 -c:a copy p_1.mkv
Would the resulting clip include the last frame? (e.g., frame 1800 for a 60FPS video)
Also, should the audio also be re-encoded to ensure that no audio de-sync happens if I were to concatenate it together with other clips?
If the video is constant-frame-rate, then it should include 1800 frames. -to stops when the position is reached, i.e. at 30.000 s. The 1800th frame in a 60 fps CFR video has a PTS of 29.983 s, so it is included.
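The frame arithmetic can be checked with integer math in milliseconds. This is a sketch under two assumptions: the first frame has PTS 0, and a frame is kept only when its PTS is strictly below the -to position. The helper names are hypothetical:

```shell
#!/bin/sh
# frame_pts_ms N FPS: PTS in milliseconds of the Nth frame (1-based) of a CFR
# stream whose first frame has PTS 0. Integer division truncates the result.
frame_pts_ms() {
    echo $(( ($1 - 1) * 1000 / $2 ))
}

# frames_before_ms END_MS FPS: number of frames with PTS strictly below END_MS.
frames_before_ms() {
    echo $(( ($1 * $2 + 999) / 1000 ))
}

frame_pts_ms 1800 60         # prints 29983 (29.983 s)
frame_pts_ms 1801 60         # prints 30000, exactly at the -to position
frames_before_ms 30000 60    # prints 1800
```

Frame 1801 sits exactly on the 30.000 s boundary, which is why the count stops at 1800 under the stated assumption.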
As for audio, if the audio is shorter than the video, it will be so even after encode unless you expressly apply padding. Since you seem to be extracting from inside larger videos, you shouldn't encounter that problem.