I am currently playing with ffmpeg + libx264, but I couldn't find a way to limit the backward dependency between coded frames.
Let me explain what I mean: I want coded frames to reference at most, say, 5 frames in the future. As a result, no frame has to "wait" for more than 5 frames before it can be coded (which makes sense for low-latency applications).
I am aware of the -tune zerolatency option, but that's not what I want; I still want bidirectional prediction.
If you mean to limit the number of consecutive B-frames then you can use the --bframes <integer> x264 option or the -bf <integer> FFmpeg option.
See also: Diary Of An x264 Developer - x264: the best low-latency...
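As a rough illustration of the -bf suggestion, the sketch below prints an ffmpeg command that caps consecutive B-frames at 5. It also caps the rate-control lookahead (-rc-lookahead) at the same value, which is an assumption added here rather than something stated in the answer: lookahead and frame threading also buffer frames, so this is a starting point and not a guarantee of the exact delay. The helper name bframe_cmd and the file names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints an ffmpeg command that caps consecutive
# B-frames and the rate-control lookahead at the same value.
# input.mp4 / output.mp4 are placeholder names.
bframe_cmd() {   # usage: bframe_cmd MAX_FRAMES
    printf 'ffmpeg -i input.mp4 -c:v libx264 -bf %s -rc-lookahead %s output.mp4\n' "$1" "$1"
}

bframe_cmd 5
# Run the printed command directly, or: eval "$(bframe_cmd 5)"
```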
I've just started using ffmpeg and I want to create a VR180 video from a list of images with resolution 11520x5760. (The images are 80MB each; for now I have just 225 for testing.)
I used this command:
ffmpeg -framerate 30 -i "%06d.png" "output.mp4"
I ran out of my 8G of RAM and ffmpeg crashed.
So I created a 10G swap; ffmpeg filled it up and crashed.
Is there a way to know how much memory an ffmpeg command needs to run properly?
Please provide output of the ffmpeg command when you run it.
I'm assuming FFmpeg will transcode to H.264, so it will create an H.264 encoder. Most memory sits in the lookahead queue and reference buffers. For x264, the default for --rc-lookahead is 40. I believe H.264 allows something like 2×4=8 references (?) plus the current frame(s) (there can be frame threading), so let's say roughly 50 frames in total. Frame size for YUV420P data is 1.5 bytes per pixel, so 1.5 × 11520 × 5760 × 50 ≈ 5 GB. Add to that encoder-specific data, which roughly doubles this, so 10 GB should be enough.
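The arithmetic in this estimate can be reproduced with a few lines of shell. This is only a sketch of the same back-of-the-envelope calculation: it assumes 8-bit YUV420P (1.5 bytes per pixel) and roughly 50 live frames, exactly as guessed above, and the function names are made up for illustration.

```shell
#!/bin/sh
# Rough estimate of raw frame memory held by the encoder.
# Assumes 8-bit YUV420P (1.5 bytes per pixel); the frame count is a guess.
frame_bytes() {   # usage: frame_bytes WIDTH HEIGHT
    echo $(( $1 * $2 * 3 / 2 ))
}

queue_bytes() {   # usage: queue_bytes WIDTH HEIGHT FRAMES
    echo $(( $(frame_bytes "$1" "$2") * $3 ))
}

queue_bytes 11520 5760 50   # prints 4976640000 (about 5 GB)
```

Halving the frame count halves the estimate, which is why reducing the lookahead is the first thing to try.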
If 8+10GB is not enough, my rough handwavy calculation is probably not precise enough. Your options are:
Significantly reduce --rc-lookahead, --threads and --level so that fewer frames are alive at a time. Read the documentation for each of these options to understand what they do, what their defaults are, and how to change them to reduce memory usage (see e.g. note here for --rc-lookahead).
You can also use a different (less complex) codec that has smaller memory requirements.
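For the first option, a command along these lines might be a starting point. The sketch prints the command instead of running it; the lookahead of 10, the 2 threads, and -pix_fmt yuv420p (to match the YUV420P assumption in the estimate) are illustrative values, not tested recommendations, and low_mem_cmd is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: prints an ffmpeg command with a shorter lookahead
# and fewer threads, so fewer frames are kept in memory at once.
low_mem_cmd() {   # usage: low_mem_cmd LOOKAHEAD THREADS
    printf 'ffmpeg -framerate 30 -i "%%06d.png" -c:v libx264 -pix_fmt yuv420p -rc-lookahead %s -threads %s output.mp4\n' "$1" "$2"
}

low_mem_cmd 10 2
```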
I am building a local app for watching local videos in the browser. Because some of the videos are over 1 hour long, they started to lag, and using HLS instead of .mp4 solved this.
In the app I'm building, the user will often skip 10-40 seconds forward.
My question is: should I use -hls_time 60, or would it be better to just use -hls_time 10?
Current code: ffmpeg -i "input.mp4" -profile:v baseline -level 3.0 -start_number 0 -hls_time 10 -hls_playlist_type vod -f hls "input\index.m3u8"
Longer segments imply greater segment sizes so after a seek the player might take longer to resume depending on the available bandwidth and whether the required segment has already been retrieved or not.
If the app is intended for mobile devices where network conditions are expected to vary you will also need to consider adaptive streaming. In this case, with longer segments you will see less quality switching but you risk stalling the playback. You can find a more detailed article here.
Some observations about your ffmpeg command:
Don't set the level: it is auto-calculated if not specified, and you risk getting it wrong and breaking device compatibility checks.
Segments are cut only on keyframes, so their duration can be greater than the specified hls_time. If you need precise segment durations, you need to insert a keyframe at the desired interval.
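One way to insert keyframes at the segment interval is the -force_key_frames option with an expression. The sketch below prints such a command with the level setting dropped, as suggested above; hls_cmd is a hypothetical helper, and the input and playlist names are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: prints an HLS command that forces a keyframe every
# SEGMENT_SECONDS so that segments can be cut at that interval.
hls_cmd() {   # usage: hls_cmd SEGMENT_SECONDS
    printf 'ffmpeg -i input.mp4 -profile:v baseline -force_key_frames "expr:gte(t,n_forced*%s)" -start_number 0 -hls_time %s -hls_playlist_type vod -f hls index.m3u8\n' "$1" "$1"
}

hls_cmd 10
```

Using the same value for the keyframe interval and -hls_time keeps every segment close to the requested duration.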
I have a scenario where I am streaming a reference video from a server machine and receiving it on a client machine with the exact same codec, using FFmpeg via UDP/RTP.
So, I have a reference.avi file and a recording.ts file. Due to a network-side issue and FFmpeg discarding old frames, recording.ts often lacks exactly 12 FRAMES from the beginning. Sometimes it may lack more frames in between, but that would be due to general network traffic and packet loss, and I don't plan to account for that. Because of those 12 frames, when I calculate the PSNR it drops to ~13, even though the remaining frames may or may not be affected.
So, my aim is to discard the first 12 frames from reference.avi and then compare. For that, I would also need to adjust the frames from recording.ts.
Consider the following scenario:
reference.avi has 1500 frames. So naturally I am going to cut it down to 1488. Then we have the following cases:
recording.ts has 1500 frames. This is not affected. I will still remove 12 frames to match the count, so frame 1 would then represent frame 13.
recording.ts has 1496 frames. This is not affected. I will still remove 12 frames, even though the count drops to 1484, assuming that frame 1 would then represent frame 13.
recording.ts has 1488 frames. This is affected. No need to remove frames.
recording.ts has 1480 frames. This is affected. No need to remove frames.
Once that is done, I will calculate the PSNR. So FFmpeg should be able to do all this, hopefully in a single command in bash.
A better alternative would be for FFmpeg to find where the 13th frame is in recording.ts and then trim from the beginning. That would be preferred, and even more so if no trimming step is required, i.e. if the offset could be set inline in the command and no additional video output is generated for use in the PSNR comparison.
Currently I am using the following command to calculate the PSNR.
ffmpeg -i 'recording.ts' -vf "movie='reference.avi', psnr=stats_file='psnr.txt'" -f rawvideo -y /dev/null
It'd be great if somebody could help me in this regard. Thanks.
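A possible approach, sketched under assumptions: the trim filter's start_frame option can skip leading frames inline, so no intermediate file is needed. The decision rule in frames_to_drop is inferred from the four cases listed above (drop frames from the recording only when it has more frames than the shortened reference); it is not an established method. Both helper names are hypothetical, and the psnr pairing assumes both streams have the same frame rate.

```shell
#!/bin/sh
# Decide how many leading frames to drop from the recording.
# Rule inferred from the listed cases, not a general solution.
frames_to_drop() {   # usage: frames_to_drop REC_COUNT REF_COUNT LOST
    if [ "$1" -gt $(( $2 - $3 )) ]; then echo "$3"; else echo 0; fi
}

# Print a PSNR command that trims both inputs inline and resets timestamps.
psnr_cmd() {   # usage: psnr_cmd REC_DROP REF_DROP
    printf 'ffmpeg -i recording.ts -i reference.avi -lavfi "[0:v]trim=start_frame=%s,setpts=PTS-STARTPTS[rec];[1:v]trim=start_frame=%s,setpts=PTS-STARTPTS[ref];[rec][ref]psnr=stats_file=psnr.txt" -f null -\n' "$1" "$2"
}

# Frame counts could come from:
#   ffprobe -v error -select_streams v:0 -count_frames \
#     -show_entries stream=nb_read_frames -of csv=p=0 recording.ts
psnr_cmd "$(frames_to_drop 1500 1500 12)" 12
```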
I'm converting a lot of WMV files to MP4.
The command is:
ffmpeg -loglevel 0 -y -i source.wmv destination.mp4
I have roughly 100 GB to convert. 24 hours later it's still not done (Xeon, 64 GB RAM, very fast storage).
Am I missing something? Is there a better way to convert?
Here's a list of various things you can try:
Preset
Use a faster x264 encoding preset. A preset is a set of options that gives a speed vs compression efficiency tradeoff. Current presets in descending order of speed are: ultrafast, superfast, veryfast, faster, fast, medium, slow, slower, veryslow, placebo. The default preset is "medium". Example:
ffmpeg -i input.wmv -preset fast output.mp4
CPU capabilities
Check that the encoder is actually using the capabilities of your CPU. When encoding via libx264 the console output should show something like:
[libx264 @ 0x7f8451001e00] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
If it shows none then encoding speed will suffer. Your x264 is likely misconfigured and you'll need to get a new one or re-compile.
ASM
Related to the above suggestion: make sure your ffmpeg was not configured with --disable-asm, --disable-inline-asm, and/or --disable-yasm.
Also, make sure your x264 that is linked to ffmpeg is not compiled with --disable-asm.
If these configure options are used then encoding will be much slower.
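Both checks can be scripted. The sketch below defines a small filter that reads ffmpeg's log from standard input and succeeds only when libx264 reports some CPU capabilities; the function name x264_has_simd is made up for illustration, and treating "none" or "none!" as the failure case is an assumption about the exact log wording.

```shell
#!/bin/sh
# Reads an ffmpeg log on stdin; succeeds only if libx264 reports
# CPU capabilities other than "none".
x264_has_simd() {
    caps=$(sed -n 's/.*using cpu capabilities: *//p' | head -n 1)
    [ -n "$caps" ] && [ "$caps" != "none" ] && [ "$caps" != "none!" ]
}

# Example usage (needs ffmpeg built with libx264):
#   ffmpeg -f lavfi -i testsrc=duration=1 -c:v libx264 -f null - 2>&1 | x264_has_simd
# To look for disabled assembly in the ffmpeg build itself:
#   ffmpeg -hide_banner -buildconf | grep -e '--disable-asm' -e '--disable-inline-asm'
```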
AAC
You can encode AAC audio faster using the -aac_coder fast option when using the native FFmpeg AAC encoder (-c:a aac). However, this will have much less of an impact than choosing a faster preset for H.264 video encoding, and the audio quality will probably be reduced when compared to omitting -aac_coder fast.
ffmpeg
Use a recent ffmpeg. See the FFmpeg Download page for links to builds for Linux, macOS, and Windows.
In my case, a simple WMV to MP4 conversion using the command ffmpeg -i input.wmv output.mp4 increased the file size from 5.7MB to 350MB, and the speed was 0.190x, so it took ages to produce the MP4. I waited more than 2 hours for it to finish and then found out that the output video had 1000 frames/second. Bearing in mind that it was a Full HD video, 2 hours was not unreasonable. My solution looks like this:
ffmpeg -i input.wmv -crf 26 -vf scale=iw/3:ih/3,fps=15 output.mp4
Here I reduce the video width and height by a factor of 3, lower the frame rate to 15 fps, and apply a little more compression (-crf 26),
which resulted in:
only 5.7MB -> 15MB
278x speed improvement! (from 0.19x to 52.9x)
some quality loss, but that was not so important for me
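Since the 1000 frames/second output mentioned above was the real cost, it may be worth checking the frame rate ffmpeg detects before converting. ffprobe can print it as a ratio such as 1000/1; the hypothetical helper below converts that ratio to a decimal value so that an implausible rate stands out.

```shell
#!/bin/sh
# Convert a frame-rate ratio as printed by ffprobe (e.g. "30000/1001")
# into a decimal value with two places.
fps_value() {   # usage: fps_value RATIO
    echo "$1" | awk -F/ '{ d = ($2 == "" ? 1 : $2); printf "%.2f\n", $1 / d }'
}

# The ratio could come from:
#   ffprobe -v error -select_streams v:0 \
#     -show_entries stream=r_frame_rate -of csv=p=0 input.wmv
fps_value 1000/1
```

If the reported rate is far higher than the real one, setting it explicitly (as the fps=15 filter does above) avoids encoding a large number of duplicated frames.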
I am recording a continuous, live stream to a high-bitrate HLS stream. I then want to asynchronously transcode this to different formats/bitrates. I have this working, mostly, except audio artefacts are appearing between each segment (gaps and pops).
Here is an example ffmpeg command line:
ffmpeg -threads 1 -nostdin -loglevel verbose \
-nostdin -y -i input.ts -c:a libfdk_aac \
-ac 2 -b:a 64k -y -metadata -vn output.ts
Inspecting an example sound file shows that there is a gap at the end of the audio:
And the start of the file looks suspiciously attenuated (although this may not be an issue):
My suspicion is that these artefacts are happening because transcoding is occurring without the context of the stream as a whole.
Any ideas on how to convince FFmpeg to produce audio that will fit back into an HLS stream?
** UPDATE 1 **
Here are the start/end of the original segment. As you can see, the start still appears the same, but the end is cleanly ended at 30s. I expect some degree of padding with lossy encoding, but I think there is some way that HLS manages to do gapless playback (is this related to the iTunes method with custom metadata?)
** UPDATE 2 **
So, I converted both the original (128k aac in MPEG2 TS) and the transcoded (64k aac in aac/adts container) to WAV and put the two side-by-side. This is the result:
I'm not sure if this is representative of how a client will play it back, but it seems a bit odd that decoding the transcoded one introduces a gap at the start and makes the segment longer. Given they are both lossy encoding, I would have expected padding to be equally present in both (if at all).
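Some of the extra length is expected from frame-based encoding alone. AAC-LC encodes audio in frames of 1024 samples, so a segment whose sample count is not a multiple of 1024 gets padded up to the next whole frame. The sketch below computes that end padding; the 30-second duration matches the segment described above, but the 48000 Hz sample rate is an assumption, since the thread does not state it. Encoders also typically add priming samples at the start, and that amount varies by encoder, so it is not modelled here.

```shell
#!/bin/sh
# Samples of padding needed to fill the last 1024-sample AAC frame.
# The sample rate is an assumption; the thread does not give it.
aac_padding() {   # usage: aac_padding SECONDS SAMPLE_RATE
    samples=$(( $1 * $2 ))
    frames=$(( (samples + 1023) / 1024 ))
    echo $(( frames * 1024 - samples ))
}

aac_padding 30 48000   # prints 768 (16 ms at 48 kHz)
```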
** UPDATE 3 **
According to http://en.wikipedia.org/wiki/Gapless_playback - Only a handful of encoders support gapless - for MP3, I've switched to lame in ffmpeg, and the problem, so far, appears to have gone.
For AAC (see http://en.wikipedia.org/wiki/FAAC), I have tried libfaac (as opposed to libfdk_aac) and it also seems to produce gapless audio. However, the quality of libfaac isn't that great and I'd rather use libfdk_aac if possible.
This is more of a conceptual answer rather than containing explicit tools to use, sorry, but it may be of some use in any case - it removes the problem of introducing audio artifacts at the expense of introducing more complexity in your processing layer.
My suggestion would be to not split your uncompressed input audio at all, but only produce a contiguous compressed stream that you pipe into an audio proxy such as an icecast2 server (or similar, if icecast doesn't support AAC) and then do the split/recombine on the client-side of the proxy using chunks of compressed audio.
So, the method here would be to regularly (say, every 60sec?) connect to the proxy and collect a chunk of audio a little bit bigger than the period that you are polling (say, 75sec worth?) - this needs to be set up to run in parallel, since at some points there will be two clients running - it could even be run from cron if need be or backgrounded from a shell script ...
Once that's working, you will have a series of chunks of audio that overlap a little - you'd then need to do some processing work to compare these and isolate the section of audio in the middle which is unique to each chunk ...
Obviously this is a simplification, but assuming that the proxy does not add any metadata (i.e. ICY data or hinting), splitting up the audio this way should allow the processed chunks to be concatenated without any audio artifacts, since there is only one set of output for the original audio input. Comparing the chunks will be a doddle, since you actually don't care one whit about the format; it's just bytes at that point.
The benefit here is that you've disconnected the audio encoder from the client, so if you want to run some other process in parallel to transcode to different formats or bit rates or chunk the stream more aggressively for some other consumer then that doesn't change anything on the encoder side of the proxy - you just add another client to the proxy using a tool chain similar to the above.