I am searching for a method to extract certain frames from a video stream (.m2t or .ts file) as is, still encoded. OpenCV extracts frames easily, but it decodes them immediately.
Given:
A .ts or .m2t file with H.264/MPEG-4 encoded stream.
Starting point in time for extraction like h:m:s.f (example: 0:2:1.12).
Ending point in time in the same format.
I need to read all frames in the given interval from the file and provide them to another program as a buffer, frame by frame, as they are. The catch is to keep the frames encoded as they are, without decoding, re-encoding, or re-encapsulating them.
Picking a frame from the H.264 m2t to a pipe:
ffmpeg -ss 0:2:1.12 -i .\my_video.ts -c:v copy -f mpegts -frames:v 1 pipe: -y -hide_banner
Obviously, the timestamp increases for every next frame. Converting the pipe output to a buffer is not a problem.
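For completeness, here is a minimal Python sketch of picking the piped data up as a bytes buffer (an assumption on my side: Python with the ffmpeg binary on PATH; the file name and timestamp are the ones from the example above):

import subprocess

cmd = [
    "ffmpeg", "-ss", "0:2:1.12", "-i", "my_video.ts",
    "-c:v", "copy", "-f", "mpegts", "-frames:v", "1",
    "pipe:", "-y", "-hide_banner",
]
# the encoded TS packets land on stdout; capture them as raw bytes
proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL, check=True)
buf = proc.stdout
print(len(buf), "bytes of encoded data")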
Questions:
Is this method correct for extracting a separate frame as it is, without any reference to or recalculation with neighboring frames?
I am not sure that the -f mpegts flag really keeps the frame untouched. Is there a better flag? (Maybe -f null?)
How do I know the type of the extracted frame (I, P, or B)?
Thank you.
This answer diverges slightly from the original question. However, it does the job and gives a good enough result.
The original question asked to extract encoded video frame by frame. The suggested variants extract compressed video to standard output in batches of several frames. This is easy to pick up in software and process/concatenate later as needed.
Variant 1
This method gives really smooth video when the resulting chunks are concatenated. It consumes more CPU than Variant 2, and processing each subsequent chunk of the movie takes longer and longer. Therefore, Variant 2 is much better in terms of performance.
ffmpeg.exe -y -nostdin -loglevel quiet -hide_banner -i "c:\\temp\\in.ts" -vf trim=<from_second>:<to_second> -f mpegts pipe:
The order of the options is important. They mean:
-y -nostdin -loglevel quiet -hide_banner - don't ask questions, don't print excessive output.
-vf trim=<from_second>:<to_second> - video filter which trims away everything except the part of the original file between the start and stop positions of the required chunk. Both values are floats (seconds).
-f mpegts - normally not needed if the output goes to a file, because ffmpeg derives the format from the file name. Since the result goes to a pipe here, an explicit output format specification is needed.
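As an illustration only, a small Python sketch that runs Variant 1 and picks the chunk up from the pipe as bytes (assuming ffmpeg on PATH; the path and the 2.0–4.5 second window are made-up values):

import subprocess

def extract_chunk_v1(path, from_second, to_second):
    # Variant 1: trim video filter + explicit mpegts muxer, chunk read from stdout
    cmd = [
        "ffmpeg", "-y", "-nostdin", "-loglevel", "quiet", "-hide_banner",
        "-i", path,
        "-vf", f"trim={from_second}:{to_second}",
        "-f", "mpegts", "pipe:",
    ]
    return subprocess.run(cmd, stdout=subprocess.PIPE, check=True).stdout

chunk = extract_chunk_v1(r"c:\temp\in.ts", 2.0, 4.5)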
Variant 2
This method gives almost smooth video when the resulting chunks are concatenated. "Almost" means there are visible jumps in the video, so it is not a perfect method in terms of quality.
ffmpeg.exe -y -nostdin -loglevel quiet -hide_banner -ss <from_second> -t <chunk_duration> -i "c:\\temp\\in.ts" -copyts -f mpegts pipe:
Options already explained in Variant 1 are not repeated here; only the new options:
-ss <from_second> - skip the movie up to the specified position. It is important to give this option BEFORE -i to save processing time; otherwise ffmpeg reads the whole movie up to the specified position instead of skipping it. The value can be supplied in h:mm:ss.ff format or as plain float seconds.
-t <chunk_duration> - the required/optimal chunk duration, as a float. If the GOP size is known, it's better to take chunks of GOP size; this improves performance.
-copyts - keep the original video's timestamps on the chunks. Without this option, the result plays only the first frame of each chunk. A better interpretation/explanation is welcome.
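Again as an illustration only, a Python sketch that pulls GOP-sized chunks with Variant 2 and concatenates them (the input path, the 10-second total and the 2-second GOP guess are made-up values); MPEG-TS chunks can be joined by plain byte concatenation:

import subprocess

def iter_chunks_v2(path, total_seconds, gop_seconds=2.0):
    # Variant 2: seek before -i, take one GOP-sized chunk at a time, keep original timestamps
    t = 0.0
    while t < total_seconds:
        cmd = [
            "ffmpeg", "-y", "-nostdin", "-loglevel", "quiet", "-hide_banner",
            "-ss", f"{t:.3f}", "-t", f"{gop_seconds:.3f}",
            "-i", path, "-copyts", "-f", "mpegts", "pipe:",
        ]
        yield subprocess.run(cmd, stdout=subprocess.PIPE, check=True).stdout
        t += gop_seconds

with open("joined.ts", "wb") as out:
    for chunk in iter_chunks_v2(r"c:\temp\in.ts", 10.0):
        out.write(chunk)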
Related
The need
Hello, I need to extract two regions of a .h264 video file via the crop filter into two files. The output videos need to be monochrome and have the .mp4 extension. The encoding (or format?) should guarantee that the video frames are organized monotonically. Finally, I need to get the timestamps for both files (which, I'd bet, are the same timestamps I would get from the input file, see below).
In the end I will be happy to do everything in one command via an elegant one-liner (via a complex filter, I guess), but I am starting with multiple steps to break it down into simpler problems.
Along this path I run into many difficulties, and despite having searched in many places I don't seem to find solutions that work. Unfortunately I'm no expert in ffmpeg or video conversion, so the more I search, the more details I discover and the fewer problems I solve.
Below you find some of my attempts to work with the following options:
-filter:v "crop=400:ih:260:0,format=gray" to do the crop and the monochrome conversion
-vf showinfo possibly combined with -vsync 0 or -copyts to get the timestamps via stderr redirection &> filename
-c:v mjpeg to force monotony of frames (are there other ways?)
1. cropping each region and obtaining monochrome videos
$ ffmpeg -y -hide_banner -i inVideo.h264 -filter:v "crop=400:ih:260:0,format=gray" outL.mp4
$ ffmpeg -y -hide_banner -i inVideo.h264 -filter:v "crop=400:ih:1280:0,format=gray" outR.mp4
The issue here is that in the output files the frames are not organized monotonically (I don't understand why; how could that make sense in any video format? I can't say whether it comes from the input file).
EDIT. Maybe it is not the frames but the packets, as returned by the av demux() method, that are not monotonic (see "instructions how to reproduce" below).
I got the advice to run ffmpeg -i outL.mp4 outL.mjpeg afterwards, but this produces two videos that look very pixellated (at least when playing them with ffplay) despite being, surprisingly, 4x bigger than the input. Needless to say, I need both monotonic frames and lossless conversion.
EDIT. I acknowledge the advice to specify -q:v 1; this fixes the pixellation effect but produces an even bigger file, ~12x in size. Is it necessary? (See "instructions how to reproduce" below.)
2. getting the timestamps
I found this piece of advice, but I don't want to generate hundreds of image files, so I tried the following:
$ ffmpeg -y -hide_banner -i outL.mp4 -vf showinfo -vsync 0 &>tsL.txt
$ ffmpeg -y -hide_banner -i outR.mp4 -vf showinfo -vsync 0 &>tsR.txt
The issue here is that I don't get any output because ffmpeg claims it needs an output file.
The need to produce an output file, and the doubt that the timestamps could have been lost in the previous conversions, lead me back to a first attempt at a one-liner, where I am also testing the -copyts option and forcing the encoding with the -c:v mjpeg option as per the advice mentioned above (I don't know whether it is in the right position, though):
ffmpeg -y -hide_banner -i testTex2.h264 -copyts -filter:v "crop=400:ih:1280:0,format=gray" -vf showinfo -c:v mjpeg eyeL.mp4 &>tsL.txt
This does not work because, surprisingly, the output .mp4 I get is the same as the input. If instead I put the -vf showinfo option just before the stderr redirection, I get no redirected output:
ffmpeg -y -hide_banner -i testTex2.h264 -copyts -filter:v "crop=400:ih:260:0,format=gray" -c:v mjpeg outR.mp4 -vf showinfo dummy.mp4 &>tsR.txt
In this case I get the desired timestamp output (too much of it: I will need some way to grab only the pts and pts_time data out of it), but I have to produce a big dummy file. The worst thing, anyway, is that the mjpeg encoding again produces a low-resolution, very pixellated video.
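A minimal sketch of pulling only those fields out of a showinfo log such as tsR.txt (assuming Python; the exact spacing of showinfo's output can vary between ffmpeg versions):

import re

# pull 'pts' and 'pts_time' values out of the captured showinfo log
pat = re.compile(r"pts:\s*(\d+)\s+pts_time:\s*([\d.]+)")
with open("tsR.txt") as f:
    stamps = [(int(p), float(t)) for p, t in pat.findall(f.read())]
print(stamps[:5])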
I admit that the logic of how to place the options and the output files on the command line is obscure to me. There are many possible combinations, and the more options I try the more complicated it gets, and I am not getting much closer to the solution.
3. [EDIT] instructions how to reproduce this
get a .h264 video
turn it into .mp4 with the ffmpeg command $ ffmpeg -i inVideo.h264 out.mp4
run the following Python cell in a Jupyter notebook
see that the packet timestamps have diffs both greater than and less than zero
%matplotlib inline
import av
import numpy as np
import matplotlib.pyplot as mpl

fname, ext = "outL.direct", "mp4"

# packet-level pts, straight from the demuxer
cont = av.open(f"{fname}.{ext}")
pk_pts = np.array([p.pts for p in cont.demux(video=0) if p.pts is not None])

# frame-level pts, after decoding
cont = av.open(f"{fname}.{ext}")
fm_pts = np.array([f.pts for f in cont.decode(video=0) if f.pts is not None])

print(pk_pts.shape, fm_pts.shape)

# plot the successive differences; negative values mean non-monotonic timestamps
mpl.subplot(211)
mpl.plot(np.diff(pk_pts))
mpl.subplot(212)
mpl.plot(np.diff(fm_pts))
finally, also create the mjpeg-encoded files in various ways, and check packet monotonicity with the same script (also check the file sizes)
$ ffmpeg -i inVideo.h264 out.mjpeg
$ ffmpeg -i inVideo.h264 -c:v mjpeg out.c_mjpeg.mp4
$ ffmpeg -i inVideo.h264 -c:v mjpeg -q:v 1 out.c_mjpeg_q1.mp4
Finally, the question
What is a working way / the right way to do it?
Any hints, even about single steps and how to combine them correctly, will be appreciated. Also, I am not limited to the command line; I would be able to try a more programmatic solution in Python (Jupyter notebook) instead of the command line if someone points me in that direction.
There are two ffmpeg commands. The first one seeks and copies the video chunk. The second one transcodes the chunk, applying the select filter for an exact frame match.
Here is how:
ffmpeg -ss <sec_from> -to <sec_to> -copyts -i <input> -map 0:v:0 -c copy chunk.mp4
ffmpeg -copyts -i chunk.mp4 -vf 'select=between(pts\,<pts_from>\,<pts_to>)' transcoded_chunk.mp4
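A minimal sketch (assuming PyAV; the 2.0–4.5 second window is a made-up example) of turning the second-based window into the pts values the select filter expects, using the stream's time_base from the chunk:

import av

def seconds_to_pts(path, sec_from, sec_to):
    # pts = seconds divided by the video stream's time_base (container dependent)
    with av.open(path) as cont:
        tb = cont.streams.video[0].time_base  # a Fraction, e.g. 1/90000 for MPEG-TS
        return int(sec_from / tb), int(sec_to / tb)

pts_from, pts_to = seconds_to_pts("chunk.mp4", 2.0, 4.5)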
It works fine most of the time. But for some inputs there is a little pts drift in the downloaded chunk, so missing frames are possible. In other words, the pts of the same packets (compared by hash) are shifted slightly (in my case by 0.0002 sec) between the input and the chunked output.
What is the possible reason for such pts drift?
UPDATE 1: That's because ffmpeg sets timescale=1000 in the mvhd atom, so the edit-list media time to start from loses precision. Is it possible to force the mvhd timescale?
UPDATE 2: It's not possible to change the mvhd timescale, because ffmpeg uses a constant (MOV_TIMESCALE 1000):
https://github.com/FFmpeg/FFmpeg/blob/82bd02a2c73bb5e6b7cf5e5eba486e279f1a7358/libavformat/movenc.c#L3498
UPDATE 3: same issue discussed earlier
I am trying to concatenate multiple videos into one video and add background music to it.
For some reason the background music is added to the output video perfectly, but the audio of each part of the output is sped up to a chipmunk version of itself. This results in a 7-minute output video with about 5 minutes of silence, since everything is so fast that all the audio finishes after about 2 minutes.
My command is:
ffmpeg -safe 0 -i videolist.ffconcat -i bg_loop.mp3 -y -filter_complex "[1:0]volume=0.3[a1];[0:a][a1]amix=inputs=2" -vcodec libx264 -r 25 -filter:v scale=w=1920:h=1080 -map 0:v:0 output.mp4
I tried to remove the background music (since I wasn't able to loop it through the video, I thought maybe that was the issue), and still all the audio of the video clips is sped up, resulting in chaotic audio at the beginning and silence at the end.
My video list looks like this:
ffconcat version 1.0
file intro.mp4
file clip-x.mp4
file clip-y.mp4
file clip-x.mp4
file clip-y.mp4
[... and so on]
I hope somebody can tell me what I'm doing wrong here (and maybe how to adjust my command to loop the background music through all the clips).
I googled a bit and found the suggestion to adjust my command by adding amix=inputs=2:duration=first, but that doesn't do the trick, and adding duration=shortest or duration=longest doesn't change the output audio either.
The concat demuxer requires that all streams in the inputs have the same properties. For audio, that includes codec, sampling rate, channel layout, sample format, etc.
If the audio of some inputs sounds funny after concat, that usually indicates a sampling-rate mismatch. Run ffprobe -show_streams -select_streams a -v 0 "input-file" on each input to check. For those that differ, you can re-encode only the audio by adding -ar X, where X is the most common sampling rate found among your inputs, e.g. -ar 44100. Other parameters will depend on format details. Keep the video by using -c:v copy.
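For example, a small Python sketch that prints the sampling rate of the first audio stream of each input so mismatches are easy to spot (file names taken from the poster's list; assuming ffprobe on PATH):

import subprocess

def sample_rate(path):
    # ask ffprobe for the first audio stream's sampling rate only
    out = subprocess.run(
        ["ffprobe", "-v", "0", "-select_streams", "a:0",
         "-show_entries", "stream=sample_rate", "-of", "csv=p=0", path],
        stdout=subprocess.PIPE, check=True, text=True,
    )
    return out.stdout.strip()

for clip in ["intro.mp4", "clip-x.mp4", "clip-y.mp4"]:
    print(clip, sample_rate(clip))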
I have a project where I need to extract part of a video near the end of video files, and it needs to be frame accurate. I can extract my segment frame-accurately now using a '-vf' filter command, and it works very well. The only problem is that for large high-resolution files, the seek time ends up being 10x the time required for the extraction/encoding part. Is there any other, faster way to extract video frame-accurately?
The working command that I am currently using is below:
ffmpeg -i 'test_input.mp4' -vf 'trim=start_frame=134037:end_frame=135024,setpts=PTS-STARTPTS' -an -c:v libx264 -preset slow -f mp4 'test_output.mp4' -y 2>&1
My task is to generate (by piping, so that the file can be played while it is being generated) an mp4 file which is part of a larger file, with the result looking like a static file link and being seekable before it fully loads (i.e. supporting range headers).
Here is how I do it now:
ffmpeg -ss $1 -i teststream_hls1_replay.mp4 -t $2 -timecode $3 \
-codec copy -movflags frag_keyframe+faststart -f mp4 pipe:1
The result is OK (the video starts from the right point), except that the player does not see the total duration of the file, so the control bar looks weird and proper seeking isn't possible, simply because the control bar jumps all the time.
How do I indicate to ffmpeg that it has to set the moov atom to contain the right duration?
Basically the question boils down to: how do I force-set some arbitrary file duration in the moov atom when I am generating a fragmented mp4? ffmpeg cannot know how long it will be, so understandably it can't do it itself, but I know... Is there a command-line parameter to specify a 'forced duration'?
Even if it is possible to set the moov atom to contain the right duration, the player still won't be able to seek properly. Due to the nature of piping, the player needs to process the video sequentially. How many seconds the video can be sought forward or backward depends on how much of the video data the player is caching.
For example, in mpv you can set forward cache and backward cache to 4 MiB and 2 MiB, respectively:
ffmpeg [..] | mpv - --cache=yes --demuxer-max-bytes=4MiB --demuxer-max-back-bytes=2MiB
If the video's bitrate is 100 KiB/s, you can't seek forward and backward more than 40s and 20s, respectively, from the current timestamp.
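A quick back-of-the-envelope check of that arithmetic in Python (the cache sizes and bitrate are the figures used above):

# seekable window ≈ cache size / bitrate
forward_cache_kib, back_cache_kib = 4 * 1024, 2 * 1024
bitrate_kib_per_s = 100
print(forward_cache_kib / bitrate_kib_per_s, "s forward,",
      back_cache_kib / bitrate_kib_per_s, "s backward")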
For some players it might be more desirable to set the duration to 0 by setting -movflags empty_moov. In your case:
ffmpeg -ss $1 -i /root/nextpro-livestream/replay/teststream_hls1_replay.mp4 -t $2 -timecode $3 -codec copy -movflags frag_keyframe+empty_moov -f mp4 pipe:1
That way, the player's control bar won't jump all the time, so users will be able to seek more properly. But still, the seek amount is limited by the player's cache.
If you really want the users to be able to seek to any timestamp, you need to change the protocol from pipe to file or http or other protocols supported by ffmpeg. You might not need to generate a fragmented mp4 video (-movflags frag_keyframe), but you might still need to set -movflags +faststart.