I am trying to build an ffmpeg encoder on Linux. I started with a custom-built server: dual socket-1366 2.6 GHz Xeon CPUs (6 cores each) with 16 GB RAM, running a minimal install of Ubuntu 16.04. I built ffmpeg with h264 and aac support. I am taking live source OTA channels and encoding/streaming them with the following parameters:
-vcodec libx264 -preset superfast -crf 25 -x264opts keyint=60:min-keyint=60:scenecut=-1 -bufsize 7000k -b:v 6000k -maxrate 6300k -muxrate 6000k -s 1920x1080 -pix_fmt yuv420p -g 60 -sn -c:a aac -b:a 384k -ar 44100
And I am able to successfully send the output over UDP as MPEG-TS. My problem starts with the 5th stream. The server can handle four streams, but as soon as I introduce a 5th I start seeing hiccups in the output. Looking at CPU usage with top I still see only 65% to 75% usage, with an occasional spike to 80%. Memory usage is well within acceptable parameters. So I am wondering whether top is not giving me accurate CPU usage or something is not right with ffmpeg. The server is isolated for UDP in/out on a 1 Gbps network.
I decided to up the CPU power and installed two 3.5 GHz CPUs (6 cores each), thinking the clock speed was perhaps the bottleneck. To my surprise the results were no different. So now I am wondering whether there is some built-in limit I am hitting when I process at 1080p. If I change the resolution to 720p the server can process 8 streams, but 720p is not acceptable.
My target is 10 1080p streams per server.
So my questions are:
1. If I use a quad-socket motherboard and up the CPU count to 4 (6 or 8 cores each), will I get 10 1080p streams? Is there any theoretical maximum I can reach with ffmpeg per machine?
2. Do cores matter more, or does clock speed matter more?
3. Any suggestions for improving my options? I have tried the ultrafast preset but the output quality is unacceptable.
Thanks in advance
Have you really excluded the CPU? Make sure to check how each individual core is doing. If no core is reaching 100%, then your most likely candidate is bandwidth: either your motherboard cannot handle all the data going back and forth, or your memory can't. Swapping in faster memory is a simple test and should give you your answer.
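To watch individual cores you can use mpstat from the sysstat package, or toggle the per-core view inside top. This is a generic sketch, not specific to the setup above:

```shell
# Per-core utilization, refreshed every second (requires the sysstat package):
mpstat -P ALL 1

# Alternatively, run top and press '1' to show each core separately.
# A single core pinned at 100% while the average sits at ~70% points at a
# serial bottleneck, not a lack of total CPU capacity.
```

If one core saturates while the others idle, the limit is a single-threaded stage of the pipeline rather than overall CPU headroom.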
I am building a local app for watching local videos from the browser. Because some of the videos are over 1 hour long they started to lag, and using HLS instead of .mp4 solved this.
In the app I'm building the user will often skip 10-40 seconds forward.
My question is: should I use -hls_time 60, or would it be better to just use -hls_time 10?
Current code: ffmpeg -i "input.mp4" -profile:v baseline -level 3.0 -start_number 0 -hls_time 10 -hls_playlist_type vod -f hls "input\index.m3u8"
Longer segments imply larger segment sizes, so after a seek the player might take longer to resume, depending on the available bandwidth and whether the required segment has already been retrieved.
If the app is intended for mobile devices where network conditions are expected to vary you will also need to consider adaptive streaming. In this case, with longer segments you will see less quality switching but you risk stalling the playback. You can find a more detailed article here.
Some observations about your ffmpeg command:
don't set the level: it's auto-calculated if not specified, and you risk getting it wrong and breaking device compatibility checks.
segments are cut only on keyframes, so their duration can be greater than the specified hls_time. If you need precise segment durations you need to insert a keyframe at the desired interval.
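As a sketch of that last point, assuming the 10-second target from the question, you can force a keyframe on every segment boundary so hls_time is honored exactly (the filenames are placeholders from the question):

```shell
# Force a keyframe every 10 s so each HLS segment can start exactly on one;
# n_forced counts previously forced keyframes, so this fires at t=0,10,20,...
ffmpeg -i "input.mp4" -profile:v baseline \
    -force_key_frames "expr:gte(t,n_forced*10)" \
    -start_number 0 -hls_time 10 -hls_playlist_type vod \
    -f hls "input/index.m3u8"
```

With keyframes aligned to the segment length, each segment can be cut at exactly hls_time seconds.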
I want to know how I can implement fast encoding with ffmpeg.
I used this code:
-vcodec libx264 -pix_fmt yuv420p -r 30 -g 60 -b:v 1000k -shortest -acodec libmp3lame -b:a 128k -ar 44100 -threads 0 -preset veryfast
But it only uses 50% CPU (dual Xeon 2.3 GHz) and 2% (of 15 GB) RAM.
Now I want it to use more CPU and RAM for faster encoding. How can I do that? Thanks everyone
How many threads are used depends heavily on the codec, the settings, and the hardware. Besides that, RAM usage rarely gets anywhere near that amount with "just" a 1000k bitrate and a small resolution, so you may never need 15 GB of RAM.
In your case, you're setting -threads 0, which means "optimal usage" of the hardware (the thread count is chosen automatically). I do not recommend overriding it, but you can try setting -threads 2 for 2 threads, or -threads 4 for 4 threads.
As a rule of thumb, you can set one thread per core (if you have 4 cores, use 4 threads, 8 cores - 8 threads, and so on).
Please be aware that encoding video on all cores while also encoding audio might result in lower speed than the "optimal usage" calculated by ffmpeg itself. Just give it a try ;-)
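Following the rule of thumb above, here's a sketch that matches the thread count to the core count (the filenames are placeholders):

```shell
# One thread per core: nproc reports the number of available cores.
THREADS=$(nproc)

ffmpeg -i input.mp4 -c:v libx264 -preset veryfast \
    -threads "$THREADS" -c:a libmp3lame -b:a 128k output.mp4
```

Benchmark this against the -threads 0 default; on most machines the automatic choice is at least as fast.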
I am piping frames into FFmpeg at quite a slow rate (1 per second) and I want to stream them out with very low latency.
Not only are there sources (for example here and here) that don't mention that I need to set the GOP size (keyint) to a small value, there are even sources (like here and here) that explicitly say I don't have to set the GOP size to a small value.
However, so far the only way I found to reduce the really long start delay is to actually reduce the GOP size to 1.
Anyway, here's my current command line:
ffmpeg -f image2pipe
-probesize 32
-i -
-c:v libx264
-preset veryfast
-crf 23
-vsync 2
-movflags "frag_keyframe+empty_moov"
-profile:v baseline
-x264-params "intra-refresh=1"
-tune zerolatency
-f mp4
-
(I also tried adding :bframes=0:force-cfr:no-mbtree:sync-lookahead=0:sliced-threads:rc-lookahead=0 to -x264-params (what -tune zerolatency is supposed to do) because some of those values didn't appear in the debug output, but as expected it had no effect.)
As you can see here, we are already 182 frames (= 3 minutes wall clock) into the stream, but it still hasn't emitted anything (size was 1kB from the start).
frame= 182 fps=1.0 q=20.0 size= 1kB time=00:00:07.24 bitrate= 0.8kbits/s speed=0.0402x
This actually talks about the time-to-first-picture, but it makes it seem like it's not a big deal. ;) It is for me, so maybe I have to make the first GOP 1 frame long and then I can switch to longer GOPs? Can FFmpeg do that?
Adding -force_key_frames expr:eq(n,1) will force a keyframe on the 2nd frame.
Since your rate is 1 fps, I would suggest an expr of lt(n,5). Also, x264's default keyint is 250 and min-keyint defaults to auto (keyint/10). So if a client leaves and rejoins the stream, it may take very long to resume. Consider reducing keyint.
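Putting that together with the command from the question, a sketch (assuming the same 1 fps piped input; the lt(n,5) expression is the suggestion above):

```shell
# Force keyframes on the first 5 frames so the fragmented-MP4 muxer can emit
# the first fragment quickly, then fall back to the normal (long) GOP.
ffmpeg -f image2pipe -probesize 32 -i - \
    -c:v libx264 -preset veryfast -crf 23 \
    -force_key_frames "expr:lt(n,5)" \
    -movflags "frag_keyframe+empty_moov" \
    -tune zerolatency -f mp4 -
```

Because frag_keyframe starts a new fragment at each keyframe, the early forced keyframes translate directly into early muxer output.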
I'm converting a lot of WMV files to MP4.
The command is:
ffmpeg -loglevel 0 -y -i source.wmv destination.mp4
I have roughly 100 GB to convert. 24 hours later it's still not done (Xeon, 64 GB RAM, super fast storage).
Am I missing something? Is there a better way to convert?
Here's a list of various things you can try:
Preset
Use a faster x264 encoding preset. A preset is a set of options that gives a speed vs compression efficiency tradeoff. Current presets in descending order of speed are: ultrafast, superfast, veryfast, faster, fast, medium, slow, slower, veryslow, placebo. The default preset is "medium". Example:
ffmpeg -i input.wmv -preset fast output.mp4
CPU capabilities
Check that the encoder is actually using the capabilities of your CPU. When encoding via libx264 the console output should show something like:
[libx264 @ 0x7f8451001e00] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
If it shows none then encoding speed will suffer. Your x264 is likely misconfigured and you'll need to get a new one or re-compile.
ASM
Related to the above suggestion, but make sure your ffmpeg was not configured with --disable-asm, --disable-inline-asm, and/or --disable-yasm.
Also, make sure your x264 that is linked to ffmpeg is not compiled with --disable-asm.
If these configure options are used then encoding will be much slower.
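You can check how your binary was configured without recompiling; -buildconf is a standard ffmpeg option that prints the configure line (a quick sketch):

```shell
# List the configure flags the ffmpeg binary was built with, and flag any
# option that disables assembly optimizations.
if ffmpeg -hide_banner -buildconf | grep -E 'disable-(asm|inline-asm|yasm)'; then
    echo "WARNING: asm optimizations disabled; expect slow encodes"
else
    echo "OK: no asm-disabling configure flags found"
fi
```

If the grep matches anything, rebuild (or replace) ffmpeg without those flags.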
AAC
You can encode AAC audio faster using the -aac_coder fast option when using the native FFmpeg AAC encoder (-c:a aac). However, this will have much less of an impact than choosing a faster preset for H.264 video encoding, and the audio quality will probably be reduced when compared to omitting -aac_coder fast.
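For example, combining it with a faster video preset (filenames are placeholders):

```shell
# Faster (slightly lower-quality) AAC encoding with the native encoder,
# alongside a faster x264 preset for the video.
ffmpeg -i input.wmv -c:v libx264 -preset fast \
    -c:a aac -aac_coder fast -b:a 128k output.mp4
```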
ffmpeg
Use a recent ffmpeg. See the FFmpeg Download page for links to builds for Linux, macOS, and Windows.
In my case a simple WMV to MP4 conversion using the command ffmpeg -i input.wmv output.mp4 increased the file size from 5.7 MB to 350 MB, and the speed was 0.190x, so making the mp4 took ages. Anyway, I waited more than 2 hours for it to finish and found out that the output video had 1000 frames/second. Bearing in mind that it was Full HD video, 2 hours was pretty okay. My solution looks like this:
ffmpeg -i input.wmv -crf 26 -vf scale=iw/3:ih/3,fps=15 output.mp4
Here I reduce the video height and width by a factor of 3, drop it to 15 fps, and apply a little compression,
which resulted in:
file size of only 15 MB instead of 350 MB (up from the 5.7 MB source)
a 278x speed improvement (from 0.19x to 52.9x)
some quality loss, but that was acceptable for me
I am using MEncoder to combine a huge number of jpg pictures into a time-lapse video. I have two main folders with about 10 subfolders each, and in order to automate the process I am running:
find . -type d -name img -exec sh -c '(cd {} && /Volumes/SAMSUNG/PedestrianBehaviour/BreakableBonds/jpg2avi.sh t*)' ';'
where jpg2avi.sh holds the settings for MEncoder:
mencoder 'mf://t00*.jpg' -mf fps=10 -ovc lavc -lavcopts vcodec=mpeg4:vhq:vbitrate=2000 -o out.avi
In order to parallelize it I have started this command in the two folders BreakableBonds and UnBreakableBonds. However, each process only uses about 27% CPU, for a total of a bit above 50%. Is there any way I can accelerate this, so that each process takes about 50%? (I am aware that a full 50% per process is not possible.)
Depending on the video codec you're using (x264 is a good choice), one encode should be able to saturate several CPU cores. I usually use ffmpeg directly, because mencoder was designed with some AVI-like assumptions.
See ffmpeg's wiki page on how to do this.
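A sketch of an equivalent ffmpeg invocation (the t00*.jpg glob and the 10 fps rate are taken from the question; adjust the pattern to your files):

```shell
# Encode a JPEG sequence to h.264 at 10 fps; glob input avoids requiring
# strictly consecutive frame numbering.
ffmpeg -framerate 10 -pattern_type glob -i 't00*.jpg' \
    -c:v libx264 -crf 20 -pix_fmt yuv420p out.mp4
```

Unlike the single-threaded mencoder run, libx264 will spread this encode across all available cores on its own.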
Again, I'd highly recommend h.264 in an mkv or mp4 container rather than anything in an avi container. h.264-in-avi is a hack, and the usual codec for avi is DivX-style MPEG-4 Part 2. h.264 is a big step forward.
h.264 is for video what mp3 is for audio: the first codec that's Good Enough, and that came along just as CPUs were getting fast enough to do it in realtime, and disks and networks were capable of handling the file sizes that produce good quality. Use it with ffmpeg as a frontend for libx264.
h.265 (and vp9) are both better codecs (in terms of compression efficiency) than h.264, but are far less widely supported, and take more CPU time. If you want to use them, use ffmpeg as a frontend for libx265 or libvpx. x265 is under heavy development, so it's probably improved, but several months ago, given equal encode CPU time, x265 didn't beat x264 in quality per bitrate. Given much more CPU time, x265 can do a great job and make as-good-looking encodes at much less bitrate than x264, though.
All 3 of those codecs are multi-threaded and can saturate 4 cores even at fast settings. At higher settings (spending more CPU time per block), you can saturate at least 16 cores, I'd guess. I'm not sure, but I think you can efficiently take advantage of 64 cores, and maybe more, on a HD x265 encode.
At fast settings and high bitrates, the gzip-style entropy-coding final stage (i.e. CABAC for x264) limits the number of CPUs you can keep busy. It's a serial task, so the whole encode only goes as fast as one CPU can compress the final bitstream.