How can I create a checksum of only the media data, without the metadata, to get a stable identifier for a media file? Preferably a cross-platform approach using a library that supports many formats, e.g. VLC, ffmpeg or mplayer.
(Media files would be audio and video in common formats; images would be nice to have too.)
Well, it may be 11 years too late for an answer, but in case others like me stumble upon this...
ffmpeg can output checksums for individual streams, so the same audio or video stream produces the same checksum regardless of its container format or metadata.
Example for the video track of file $filename, writing the output to $filename.md5:
ffmpeg -i "$filename" -map 0:v -codec copy -f md5 "$filename.md5"
For audio, use -map 0:a.
To output to STDOUT, use - as the output file name. For example:
ffmpeg -i "$filename" -map 0:a -codec copy -hide_banner -loglevel warning -f md5 -
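For scripting this across many files, a minimal Python sketch could wrap the same command. The function names are hypothetical, and it assumes ffmpeg is on the PATH and that the md5 muxer prints one line of the form MD5=<hex>. One thing worth checking before relying on it: some codecs are stored differently per container (H.264 is typically length-prefixed in MP4 and start-code-prefixed in MPEG-TS), so with -codec copy the checksum may differ between containers even when the encoded video is the same.

```python
import subprocess


def stream_md5_command(path, stream="v"):
    """Build the ffmpeg argument list that hashes one stream type ("v" or "a")."""
    return [
        "ffmpeg", "-hide_banner", "-loglevel", "warning",
        "-i", path,
        "-map", f"0:{stream}",   # select the video or audio streams of input 0
        "-codec", "copy",        # hash the stored packets, no re-encoding
        "-f", "md5", "-",        # md5 muxer, written to STDOUT
    ]


def stream_md5(path, stream="v"):
    """Run ffmpeg and return the hex digest (requires ffmpeg to be installed)."""
    result = subprocess.run(
        stream_md5_command(path, stream),
        capture_output=True, text=True, check=True,
    )
    # Assumed output format: a single line "MD5=<hex digest>".
    return result.stdout.strip().split("=", 1)[-1]
```

Calling stream_md5("movie.mkv", "a") would then return the digest for the audio, which can be stored next to the file or in a database.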
I don't know of any existing platform-independent software that will accomplish this, but I do know a way that this could be accomplished in an interpreted (platform-independent) language such as Java.
Essentially, we simply need to strip any metadata (tags) from the file, demultiplexing video files beforehand. Theoretically after demux and removing metadata, one could hash the file and compare against another file that has undergone the same process to match identical files despite having different tags. Unlike a fingerprint, this would not identify similar songs/movies but identical files (imagine you might want the 10 different versions or bitrates of a given song you've archived, but don't want 2 identical copies of any of them floating around).
The most troubling part of this is removing tags as there are many different specifications for tag formats which are not necessarily implemented the same across different applications, i.e. the same exact audio file given identical tags separately through two different applications may not result in identical output files. The only way this could pose an issue fatal to the concept of an audio-only checksum is if popular tagging software makes any changes to the binary audio portion of the file, or pads the audio in a non-standard way.
Taking a checksum is trivial, but I'm not aware off the top of my head of any platform independent libraries to demux and detag mpeg files. I know that in 'nix environments, mpgtx is a great command-line tool that could perform the demux and detag, but obviously that is not a platform-independent solution.
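As a rough illustration of the detag-then-hash idea for the MP3 case only, here is a minimal Python sketch. It assumes the only tags present are an ID3v2 tag at the start of the file and an ID3v1 tag in the last 128 bytes; other tag formats (APEv2, Lyrics3) and any demultiplexing of video files are not handled, and the function names are made up for this example.

```python
import hashlib


def synchsafe_to_int(b):
    """Decode a 4-byte ID3v2 synchsafe integer (7 useful bits per byte)."""
    return (b[0] << 21) | (b[1] << 14) | (b[2] << 7) | b[3]


def strip_id3(data):
    """Return data without a leading ID3v2 tag and a trailing ID3v1 tag."""
    start, end = 0, len(data)
    if len(data) >= 10 and data[:3] == b"ID3":
        size = synchsafe_to_int(data[6:10])      # tag size, excluding the header
        footer = 10 if data[5] & 0x10 else 0     # optional ID3v2.4 footer
        start = min(end, 10 + size + footer)
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128                               # ID3v1 is a fixed 128-byte trailer
    return data[start:end]


def audio_sha1(data):
    """SHA-1 of the bytes that remain after removing the tags."""
    return hashlib.sha1(strip_id3(data)).hexdigest()
```

Two copies of the same MP3 that differ only in these tags should then give the same digest, provided the tagging software did not touch or pad the audio bytes, which is exactly the caveat raised above.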
Maybe someone out there feels ambitious?
One possible solution I found seems to be with VLC:
./VLC -I rc snd.mp3 :sout='#std{mux=raw,access=file,dst=-}' vlc://quit | sha1sum
I am making a datamoshing program in C++, and I need to find a way to remove one frame from a video (specifically, the p-frame right after a sequence jump) without re-encoding the video. I am currently using h.264 but would like to be able to do this with VP9 and AV1 as well.
I have one way of going about it, but it doesn't work for one frustrating reason (mentioned later). I can turn the original video into two intermediate videos - one with just the i-frame before the sequence jump, and one with the p-frame that was two frames later. I then create a concat.txt file with the following contents:
file video.mkv
file video1.mkv
And run ffmpeg -y -f concat -i concat.txt -c copy output.mp4. This produces the expected output, although it is of course not as efficient as I would like, since it requires creating intermediate files and reading the .txt file from disk (performance is very important in this project).
But worse yet, I couldn't generate the intermediate videos with ffmpeg, I had to use avidemux. I tried all sorts of variations on ffmpeg -y -ss 00:00:00 -i video.mp4 -t 0.04 -codec copy video.mkv, but that command seems to really bug out with videos of length 1-2 frames - while it works for longer videos no problem. My best guess is that there is some internal checker to ensure the output video is not corrupt (which, unfortunately, is exactly what I want it to be!).
Maybe there's a way to do it this way that gets around that problem, or better yet, a more elegant solution to the problem in the first place.
Thanks!
If you know the PTS or data offset or packet index of the target frame, then you can use the noise bitstream filter. This is codec-agnostic.
ffmpeg -copyts -i input -c copy -enc_time_base -1 -bsf:v:0 "noise=drop=eq(pos\,11291)" out
This will drop the packet from the first video stream stored at offset 11291 in the input file. See other available variables at http://www.ffmpeg.org/ffmpeg-bitstream-filters.html#noise
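To find the offset in the first place, one option is to list the packets with ffprobe and pick the one that follows the keyframe of interest. The sketch below is a hedged illustration, not part of the original answer: it assumes ffprobe's JSON output for packet=pts,pos,flags, assumes keyframe packets have a flags string starting with "K", and treats "the packet right after the n-th keyframe" as a stand-in for the P-frame described in the question (with B-frames the next packet in file order may not be a P-frame). The helper names are hypothetical.

```python
import json

# Arguments for listing packets of the first video stream; append the input path.
FFPROBE_ARGS = [
    "ffprobe", "-v", "error", "-select_streams", "v:0",
    "-show_entries", "packet=pts,pos,flags", "-of", "json",
]


def first_packet_after_keyframe(ffprobe_json, keyframe_number=1):
    """Return the byte offset of the packet following the n-th keyframe (0-based)."""
    packets = json.loads(ffprobe_json)["packets"]
    seen = -1
    for i, pkt in enumerate(packets):
        if pkt.get("flags", "").startswith("K"):
            seen += 1
            if seen == keyframe_number and i + 1 < len(packets):
                return int(packets[i + 1]["pos"])
    return None  # no such keyframe, or nothing follows it


def drop_filter(pos):
    """Build the noise bitstream filter argument; the comma is escaped for ffmpeg."""
    return f"noise=drop=eq(pos\\,{pos})"
```

When the result is passed to ffmpeg as one element of a subprocess argument list, no shell quoting is needed; only the backslash before the comma has to stay.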
I discovered some damaged AVI files for which VLC complains about a broken index when I try to play them. I can either play them directly, without the ability to scroll the timeline, or wait...wait... for the index to be built (but not saved) and then play normally. Some other players play them without complaining, others refuse to play them.
I can solve the problem seamlessly in VirtualDub by opening the .avi with "extended options" in the Open dialog, with "re-derive keyframe flags" enabled, and then saving a new .AVI file with direct-stream-copy for video and audio. The resulting file plays perfectly.
I can also solve the problem with ffmpeg but not without problems.
ffmpeg -i INFILE -vcodec copy -acodec copy OUTFILE
Important: only stream copy and same container are of interest.
The resulting file plays in VLC without complaints or the next problem, but in many other players when jumping on the timeline the video gets distorted immediately at the jump destination and stays heavily distorted until the next I frame in the stream. All this doesn't happen when it was processed with VirtualDub.
ffmpeg is faster but most importantly it is scriptable and one could make automation for many files. With VirtualDub one has to manually process each file and wait a looooooong time for the open process to re-derive keyframe flags first. Wouldn't mind if ffmpeg speed was lost because of the automation it can provide.
So far I only found a very old unanswered mailing list post here
Can ffmpeg fix such files without the aforementioned problem? If yes, how?
Thank you.
AVI file indexes contain all frames (key or not), but they have a flags field (which FFmpeg fills in) which should help players seek only to keyframes. I don't have access to your exact file (ffprobe information would be helpful), but we can assume the flags field is not written correctly, e.g. it might be set for every frame or for none at all.
VLC likely parses the codec packets to derive the keyframe flag if absent in the container, but other players might not. I think what you're looking for is to derive keyframe flags while stream-copying. The exact commandline depends a bit on the codec. For example, for H264 you'd want to dump to annex-B as intermediate file format, and then re-read that so the H264 parser is invoked, which sets the keyframe flag, and then re-mux that into AVI - but H264 in AVI is rare so that's probably not what's happening here.
So for a solution, I will need the output of ffprobe $file so I know what codec the AVI file contains.
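In the meantime, a quick way to test the guess about the flags field is to count how many packets ffprobe reports as keyframes. The sketch below is only a diagnostic aid under stated assumptions: it expects the JSON produced by something like ffprobe -v error -select_streams v:0 -show_entries stream=codec_name:packet=flags -of json FILE, it assumes keyframe packets have a flags string starting with "K", and what ffprobe reports is FFmpeg's own view after demuxing, which may not be identical to the raw flags stored in the AVI index. The function name is hypothetical.

```python
import json


def keyframe_summary(ffprobe_json):
    """Summarise codec and keyframe flags from ffprobe JSON (streams + packets)."""
    info = json.loads(ffprobe_json)
    packets = info.get("packets", [])
    streams = info.get("streams", [])
    total = len(packets)
    key = sum(1 for p in packets if p.get("flags", "").startswith("K"))
    if total == 0:
        verdict = "no packets"
    elif key == 0:
        verdict = "none flagged"
    elif key == total:
        verdict = "all flagged"
    else:
        verdict = "mixed"
    codec = streams[0].get("codec_name") if streams else None
    return {"codec": codec, "total": total, "key": key, "verdict": verdict}
```

A verdict of "all flagged" or "none flagged" on an inter-frame codec would be consistent with the explanation above; "mixed" says nothing by itself about whether the flags are on the right frames.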
I recently asked about how I could download segments of an online m3u8 file, and someone pointed out that this could be accomplished via ffmpeg:
ffmpeg -i [LINK] -codec copy [OUTPUT FILE] #downloads only audio segments;
ffmpeg -i [LINK] -bsf:a aac_adtstoasc -vcodec copy -c copy -crf 50 [OUTPUT] #downloads audio and video segments
For those who aren't familiar, m3u8 is formatted kind of like a "playlist", with an m3u8 file pointing to a bunch of smaller "segments" which are pieced together to form the whole of the video. As a result, it's completely possible to halt the above commands partway through their execution and still produce a watchable video (i.e. one that will be interpreted correctly by video editors).
I'm wondering if there's a built-in method with ffmpeg that allows me to grab segments N-M of a given m3u8. If there are methods outside of ffmpeg, feel free to mention them as well. Thanks for the help.
After having looked into it, I can say that this isn't possible via ffmpeg. You could theoretically use the -ss and -t parameters to specify a starting point and duration, but ffmpeg appears to look at every clip up until the specified endpoint, making the download process prohibitively long.
If you want to download only a specific number of segments, you need to look at the m3u8 file, find its associated media list, and download segments from that media list.
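Reading the media list can be sketched in a few lines of Python. This is a simplified illustration, assuming a plain media playlist in which every non-empty line that does not start with # is a segment URI; encrypted streams (EXT-X-KEY), fMP4 initialization segments (EXT-X-MAP) and master playlists that point to other playlists are not handled. The function name and the 1-based numbering are choices made for this example.

```python
from urllib.parse import urljoin


def segment_urls(playlist_text, playlist_url, first, last):
    """Return absolute URLs of segments first..last (1-based, inclusive)."""
    uris = []
    for line in playlist_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):   # tags and comments start with '#'
            uris.append(line)
    # Relative segment URIs are resolved against the playlist's own URL.
    return [urljoin(playlist_url, uri) for uri in uris[first - 1:last]]
```

The returned URLs can then be fetched with any HTTP client; for MPEG-TS segments, appending the downloaded files in order usually gives a playable stream that ffmpeg can remux with -c copy.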
I understand how to make timelapse video from the sequence of files.
But what if my files have names like YYYYMMDDHHmmSS.jpg? How can I pass them in the correct order? I would prefer not to rename them (there are 55'000 files, almost 10 GB).
I just found that no additional steps are needed: the files are already sorted in the correct order, so the command below works well:
ffmpeg -framerate 500 -pattern_type glob -i '*.jpg' -c:v libx264 -pix_fmt yuv420p out2.mp4
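The reason this works is that fixed-width YYYYMMDDHHmmSS names sort the same way alphabetically and chronologically. For anyone who wants to confirm that on their own files before encoding, here is a small Python sketch; it assumes every file stem is exactly such a 14-digit timestamp and that the glob expansion is returned in sorted name order. The function names are hypothetical.

```python
from datetime import datetime
from pathlib import Path


def timestamp_of(name):
    """Parse a file name such as 20200102030405.jpg into a datetime."""
    return datetime.strptime(Path(name).stem, "%Y%m%d%H%M%S")


def name_order_is_chronological(names):
    """True if sorting by name gives the same sequence as sorting by timestamp."""
    return sorted(names) == sorted(names, key=timestamp_of)
```

Running it on the directory listing, e.g. name_order_is_chronological(p.name for p in Path(".").glob("*.jpg")) after converting the generator to a list, also surfaces stray files, since a name that is not a valid timestamp raises ValueError.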
I know there are some bat/shell commands possible for that, but IMHO it makes things more complicated for so little.
In similar cases I prefer using renaming software like Ant Renamer.
Drag & drop your files in the main window
In the Actions tab, click Enumeration in the list
You're given a naming scheme (look down the options to see the different schemes available).
I recommend using the default %name%_%num%%ext%, starting at 1 and with one more digit than your total number of files. Which in your case will result in YYYYMMDDHHmmSS_XXX.jpg
Click the Go button to process.
Once finished, check that the added numbers follow the order of the original file names (they should, since the naming used is already chronological, but do check for safety).
It might not suit you, especially if you really want to do everything from command lines. But for other people, it might be enough.
Say we have many video recordings that we want to merge with -vcodec copy (or equivalent syntax), i.e. without re-encoding and without loss of quality, plus a few recordings (a minority) that use other codecs, parameters and so on. We can run ffprobe on a file that represents the majority of the sources and get a lot of information.
But can we get from it command-line hints for ffmpeg that could be used to convert the other (not yet "compatible") files to the same format? At least for one selected stream of the "master" file, for example.
The question is not about any specific output codec.
There is no existing tool to do this. You would need to write one.
Each video stream inside a video file can use only one codec. So I recommend that, as a first step, you merge the files that share a codec with -vcodec copy. Then check which codec is most common among your merged files (e.g. CodecA). As a second step, convert the merged files that use other codecs to CodecA. Finally, merge all files (which now all use CodecA) with -vcodec copy.
Please keep in mind that if the video files have different frame sizes, you have to re-encode them.
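If you do write such a tool, the grouping step could start from ffprobe's per-stream fields. The Python sketch below is one possible shape for it, not an existing utility: it assumes you have already collected one video-stream dictionary per file (for example from ffprobe -show_streams -of json), and it compares only codec_name, width, height and pix_fmt. Other parameters, such as frame rate, time base, profile or the audio streams, may also have to match for stream-copy concatenation and are left out here. All names are hypothetical.

```python
from collections import Counter

# Fields compared between files; an assumption, extend as needed.
KEYS = ("codec_name", "width", "height", "pix_fmt")


def signature(stream):
    """Reduce an ffprobe stream dictionary to the fields being compared."""
    return tuple(stream.get(k) for k in KEYS)


def plan_merge(streams_by_file):
    """Return (majority parameters, sorted list of files that need converting)."""
    counts = Counter(signature(s) for s in streams_by_file.values())
    target, _ = counts.most_common(1)[0]
    to_convert = sorted(
        name for name, s in streams_by_file.items() if signature(s) != target
    )
    return dict(zip(KEYS, target)), to_convert
```

The first return value describes the "master" format to aim for when choosing encoder options, and the second lists the files that would have to be re-encoded before the final merge.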