Allow MEncoder to use more CPU - bash

I am using MEncoder to combine a huge number of jpg pictures into a time-lapse video. I have two main folders with about 10 subfolders each, and in order to automate the process I am running:
find . -type d -name img -exec sh -c '(cd {} && /Volumes/SAMSUNG/PedestrianBehaviour/BreakableBonds/jpg2avi.sh t*)' ';'
where jpg2avi.sh contains the MEncoder settings:
mencoder 'mf://t00*.jpg' -mf fps=10 -ovc lavc -lavcopts vcodec=mpeg4:vhq:vbitrate=2000 -o out.avi
In order to parallelize it, I have started this command in the two folders BreakableBonds and UnBreakableBonds. However, each process only uses about 27% CPU, so a total of a bit above 50%. Is there any way I can accelerate this, so that each process takes up about 50%? (I am aware that exactly 50% on each process is not possible.)

Depending on the video codec you're using (x264 is a good choice), one encode should be able to saturate several CPU cores. I usually use ffmpeg directly, because mencoder was designed with some AVI-like assumptions.
See ffmpeg's wiki page on how to do this.
Again, I'd highly recommend h.264 in an mkv or mp4 container, rather than anything in an avi container. h.264 in avi is a hack, and the usual codec for avi is divx (h.263). h.264 is a big step forward.
h.264 is for video what mp3 is for audio: the first codec that's Good Enough, and that came along just as CPUs were getting fast enough to do it in realtime, and disks and networks were capable of handling the file sizes that produce good quality. Use it with ffmpeg as a frontend for libx264.
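As a rough sketch for the time-lapse in the question (the fps and filename pattern are taken from the question; the preset and CRF value are just illustrative choices, not tuned recommendations), the ffmpeg + libx264 equivalent would look something like:
ffmpeg -framerate 10 -pattern_type glob -i 't00*.jpg' -c:v libx264 -preset medium -crf 20 out.mkv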
h.265 (and vp9) are both better codecs (in terms of compression efficiency) than h.264, but are far less widely supported, and take more CPU time. If you want to use them, use ffmpeg as a frontend for libx265 or libvpx. x265 is under heavy development, so it's probably improved, but several months ago, given equal encode CPU time, x265 didn't beat x264 in quality per bitrate. Given much more CPU time, x265 can do a great job and make as-good-looking encodes at much less bitrate than x264, though.
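If you do want to try x265, a minimal sketch (again, the preset and CRF here are placeholder values, and the filenames are hypothetical):
ffmpeg -i input.mkv -c:v libx265 -preset medium -crf 26 output.mkv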
All 3 of those codecs are multi-threaded and can saturate 4 cores even at fast settings. At higher settings (spending more CPU time per block), you can saturate at least 16 cores, I'd guess. I'm not sure, but I think you can efficiently take advantage of 64 cores, and maybe more, on a HD x265 encode.
At fast settings and high bitrates, the gzip-style entropy coding final stage (i.e. CABAC for x264) limits the amount of CPUs you can keep busy. It's a serial task, so the whole encode only goes as fast as one CPU can compress the final bitstream.

Related

Is there a way to predict the amount of memory needed for ffmpeg?

I've just started using ffmpeg and I want to create a VR180 video from a list of images with resolution 11520x5760. (The images are 80 MB each; for now I have just 225 of them for testing.)
I used the command:
ffmpeg -framerate 30 -i "%06d.png" "output.mp4"
I ran out of my 8 GB of RAM and ffmpeg crashed.
So I created a 10 GB swap file; ffmpeg filled it up and crashed.
Is there a way to know how much memory is needed for an ffmpeg command to run properly?
Please provide output of the ffmpeg command when you run it.
I'm assuming FFmpeg will transcode to H.264, so it will create an H.264 encoder. Most memory sits in the lookahead queue and the reference buffers. For x264, the default --rc-lookahead is 40. I believe H.264 allows something like 2x4=8 references (?) + the current frame(s) (there can be frame-threading), so let's say roughly 50 frames in total. The frame size for YUV420P data is 1.5 bytes per pixel, so 1.5 x 11520 x 5760 x 50 ≈ 5 GB. Add to that encoder-specific data, which roughly doubles this, so 10 GB should be enough.
If 8 + 10 GB is not enough, my rough hand-wavy calculation is probably not precise enough. Your options are:
Significantly reduce --rc-lookahead, --threads and --level so there are fewer frames alive at a time. Read the documentation for each of these options to understand what they do, what their defaults are, and what to change them to in order to reduce memory usage (see e.g. the note here for --rc-lookahead).
You can also use a different (less complex) codec that has smaller memory requirements.
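As a sketch of the first option (the values below are arbitrary examples, not recommendations), the x264 options can be passed through ffmpeg like this:
ffmpeg -framerate 30 -i "%06d.png" -c:v libx264 -x264-params rc-lookahead=10 -threads 4 output.mp4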

Faster FFMPEG conversion from WMV

I'm converting a lot of WMV files to MP4.
The command is:
ffmpeg -loglevel 0 -y -i source.wmv destination.mp4
I have roughly 100 GB to convert. 24 hours later it's still not done (Xeon, 64 GB RAM, very fast storage).
Am I missing something? Is there a better way to convert?
Here's a list of various things you can try:
Preset
Use a faster x264 encoding preset. A preset is a set of options that gives a speed vs compression efficiency tradeoff. Current presets in descending order of speed are: ultrafast, superfast, veryfast, faster, fast, medium, slow, slower, veryslow, placebo. The default preset is "medium". Example:
ffmpeg -i input.wmv -preset fast output.mp4
CPU capabilities
Check that the encoder is actually using the capabilities of your CPU. When encoding via libx264 the console output should show something like:
[libx264 @ 0x7f8451001e00] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
If it shows none then encoding speed will suffer. Your x264 is likely misconfigured and you'll need to get a new one or re-compile.
ASM
Related to the above suggestion, but make sure your ffmpeg was not configured with --disable-asm, --disable-inline-asm, and/or --disable-yasm.
Also, make sure your x264 that is linked to ffmpeg is not compiled with --disable-asm.
If these configure options are used then encoding will be much slower.
AAC
You can encode AAC audio faster using the -aac_coder fast option when using the native FFmpeg AAC encoder (-c:a aac). However, this will have much less of an impact than choosing a faster preset for H.264 video encoding, and the audio quality will probably be reduced when compared to omitting -aac_coder fast.
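A hedged example combining a faster video preset with the fast AAC coder (filenames are placeholders):
ffmpeg -i input.wmv -c:v libx264 -preset fast -c:a aac -aac_coder fast output.mp4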
ffmpeg
Use a recent ffmpeg. See the FFmpeg Download page for links to builds for Linux, macOS, and Windows.
In my case, a simple WMV to MP4 conversion using the command ffmpeg -i input.wmv output.mp4 increased the file size from 5.7 MB to 350 MB, and the speed was 0.190x, so it took ages to produce the mp4 video. I waited more than 2 hours for it to finish and found out that the output video had 1000 frames/second. Considering that it was Full HD video, 2 hours was actually not unreasonable. My solution looks like this:
ffmpeg -i input.wmv -crf 26 -vf scale=iw/3:ih/3,fps=15 output.mp4
Here I reduce the video width and height by a factor of 3, drop the frame rate to 15 fps, and add a little compression (-crf 26),
which resulted in:
only 5.7 MB -> 15 MB output (instead of 350 MB)
278x speed improvement! (from 0.19x to 52.9x)
some quality loss, but for me quality was not that important

raspberry pi 3 OpenMax EmptyThisBuffer slow response when transcoding with libav or ffmpeg

The context is transcoding on a Raspberry Pi 3 from 1080i MPEG2 TS to 1080p@30fps H264 MP4 using libav avconv or ffmpeg. Both use an almost identical omx.c source file and give the same result.
The performance is short of 30fps (about 22fps) which makes it unsuitable for live transcoding without reducing the frame rate.
By timestamping the critical code, I noticed the following:
OMX_EmptyThisBuffer can take 10-20 msec to return. The spec/documentation indicates that this should be < 5 msec. This alone would almost account for the performance deficit. Can someone explain why this OMX call is out of spec?
In omx.c, a zerocopy option is used to optimize the image copying performance. But the precondition (contiguous planes and stride alignment) for this code is never satisfied, and thus the optimization was never in effect. Can someone explain how this zerocopy optimization can be employed?
An additional question on the h264_omx encoder: it seems to only accept MP4 or raw H264 output formats. How difficult is it to add another format, e.g. TS?
Thanks

Parallelize encoding of audio-only segments in ffmpeg

We are looking to decrease the execution time of segmenting/encoding wav to segmented aac for HTTP Live Streaming, using ffmpeg to segment and generate an m3u8 playlist, by utilizing all the cores of our machine.
In one experiment, I had ffmpeg directly segment a wav file into aac with libfdk_aac; however, it took quite a long time to finish.
In the second experiment, I had ffmpeg segment a wav file as is (wav), which was quite fast (< 1 second on our machines), then used GNU parallel to run ffmpeg again to encode the wav segments to aac, and manually changed the .m3u8 file without changing the segment durations. This was much faster; however, "silence" gaps could be heard when streaming the output audio.
I initially tried the second scenario using mp3, and the result was much the same. I've read that lame adds padding during encoding (http://scruss.com/blog/2012/02/21/generational-loss-in-mp3-re-encoding/); does this mean that libfdk_aac also adds padding during encoding?
Maybe this one is related to this question: How can I encode and segment audio files without having gaps (or audio pops) between segments when I reconstruct it?
According to section 4 of the HLS specification, we have this:
A Transport Stream or audio elementary stream segment MUST be the
continuation of the encoded media at the end of the segment with the
previous sequence number, where values in a continuous series, such as
timestamps and Continuity Counters, continue uninterrupted
"Silence" gaps are 99,99% of times related to wrong counters/discontinuity. Because you wrote that you manually changed the .m3u8 file without changing their durations I deduce you tried to cut the audio by yourself. It can't be done.
An HLS stream can't have a parallelizable creation because of these counters. They must follow a sequence [ MPEG2-TS :-( ]. You better get a faster processor.
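For reference, a single-command (non-parallel) version of what the question describes, segmenting straight to AAC with ffmpeg's HLS muxer, might look roughly like this (the bitrate and segment length are placeholder values):
ffmpeg -i input.wav -c:a libfdk_aac -b:a 128k -f hls -hls_time 10 -hls_list_size 0 out.m3u8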

Internet Video | FFMPEG | 2-PASS encoding vs. 1-PASS CRF

What is the most preferable way to encode internet video?
2-pass encoding probably takes longer processing time, but results in a lower file size and a more predictable average bitrate (?). Correct?
CRF (constant rate factor) results in a constant rate, but higher file size?
What is the default way sites like YouTube and Vimeo encode their videos? And should I do it any other way than I do now with 2-pass encoding?
Fredrick is right about VBR vs. CBR, but dropson mentions CRF (constant rate factor), which is actually kind of a third method. CBR and VBR both lock in on a bit rate, while CRF locks in on a perceived visual quality. It also takes into account motion in the video, and can typically achieve better compression than 2-pass VBR. More info.
It's the default setting if you're using x264 or Zencoder. I'd go with CRF any time you're doing h.264.
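A minimal CRF example with ffmpeg (CRF 23 is simply libx264's default, written out explicitly; the filenames are placeholders):
ffmpeg -i input.mov -c:v libx264 -crf 23 -c:a aac output.mp4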
There are two encoding modes for video:
CBR or Constant Bit Rate
The main usage is when you have a fixed carrier for your data; the best example here is the video telephony use case, where audio/video/control information needs to co-exist on a fixed 64 kbit/s carrier. Since this is a real-time use case, one-pass encoding is used, and the rate controller (RC) does its best to assign a fixed number of bits to each frame so that the bitrate is deterministic.
VBR or Variable Bit Rate
This encoding scheme is used practically everywhere else. Variable here means that, e.g., if the video goes black or there is no motion, no bits are sent, i.e. the bitrate is 0 for that particular moment; then, when things start to move again, the bitrate skyrockets.
This encoding scheme normally has no real-time requirements, e.g. when encoding/transcoding a video. Normally you would use a multi-pass encoder here to get the highest quality and to even out the bitrate peaks.
YouTube uses VBR. Use e.g. clive to download videos from YouTube and analyse them using ffmpeg, and you'll see the variable bitrate in action.
As always, Wikipedia is your friend; read its entries on VBR and CBR.
There is no reason for you to use anything other than VBR (unless you plan to set up a streaming server).
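For comparison, a sketch of a two-pass VBR encode with ffmpeg targeting an average bitrate (the 2 Mbit/s figure and the filenames are placeholders):
ffmpeg -y -i input.mov -c:v libx264 -b:v 2M -pass 1 -an -f null /dev/null
ffmpeg -i input.mov -c:v libx264 -b:v 2M -pass 2 -c:a aac output.mp4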

Resources