To get the dimensions of a file, I can do:
$ mediainfo '--Inform=Video;%Width%x%Height%' ~/Desktop/lawandorder.mov
1920x1080
However, if I give it a URL instead of a file, it just returns (none):
$ mediainfo '--Inform=Url;%Width%x%Height%' 'http://url/lawandorder.mov'
(none)
How would I correctly pass a url to MediaInfo?
You can also use curl | head to partially download the file before running mediainfo.
Here's an example of getting the dimensions of a 12 MB file from the web, where only a small portion (less than 10 KB) from the start needs to be downloaded:
curl --silent http://www.jhepple.com/support/SampleMovies/MPEG-2.mpg \
| head --bytes 10K > temp.mpg
mediainfo '--Inform=Video;%Width%x%Height%' temp.mpg
To do this, I needed to recompile MediaInfo from source using the --with-libcurl option:
$ ./CLI_Compile.sh --with-libcurl
$ cd MediaInfo/Project/GNU/CLI
$ make install
Then I used this command to get the video dimensions over HTTP:
$ mediainfo '--Inform=Video;%Width%x%Height%' 'http://url/lawandorder.mov'
Note that this took a considerable amount of time to return results. I'd recommend using ffmpeg if the file is not local.
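As a point of comparison, ffprobe (which ships with ffmpeg) can read the dimensions straight from an HTTP URL; here is a sketch using the same placeholder URL as above:
ffprobe -v error -select_streams v:0 \
    -show_entries stream=width,height -of csv=s=x:p=0 \
    'http://url/lawandorder.mov'
# prints e.g. 1920x1080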
I've seen many similar questions, but none of them really works for me.
I have a simple script that downloads the latest two video episodes from one channel, so I can watch them later on mobile in offline mode. It uses the yt-dlp command with the -w switch so it doesn't overwrite a video that already exists. I would like to colorize "Destination" so I can see the file name more clearly, and "has already been downloaded" when the file has already been downloaded.
So my script looks like this:
cd /tmp
EPI=$(curl -s https://rumble.com/c/robbraxman | grep -m 2 "href=/\v" | awk '{print $5}' | awk -F\> '{print $1}')
for i in ${EPI//href=/http://rumble.com}; do
    printf "\e[1;38;5;212m Downloading \e[1;38;5;196m $i\e[m \n"
    yt-dlp -f mp4-480p -w $i
done
The output of yt-dlp looks like this:
[Rumble] Extracting URL: http://rumble.com/v2820uk-let-me-show-you-brax.me-privacy-focused-app.html
[Rumble] v2820uk-let-me-show-you-brax.me-privacy-focused-app.html: Downloading webpage
[RumbleEmbed] Extracting URL: https://rumble.com/embed/v25g3g0
[RumbleEmbed] v25g3g0: Downloading JSON metadata
[info] v25g3g0: Downloading 1 format(s): mp4-480p
[download] Let Me Show You Brax.Me - Privacy Focused App [v25g3g0].mp4 has already been downloaded
[download] 100% of 20.00MiB
[Rumble] Extracting URL: http://rumble.com/v2820uk-let-me-show-you-brax.me-privacy-focused-app.html
[Rumble] v2820uk-let-me-show-you-brax.me-privacy-focused-app.html: Downloading webpage
[RumbleEmbed] Extracting URL: https://rumble.com/embed/v25g3g0
[RumbleEmbed] v25g3g0: Downloading JSON metadata
[info] v25g3g0: Downloading 1 format(s): mp4-480p
[download] Destination: Let Me Show You Brax.Me - Privacy Focused App [v25g3g0].mp4
[download] 3.8% of 525.67MiB at 5.79MiB/s ETA 01:27^C
So I would probably need to pipe the output somehow and then color it. yt-dlp itself colors the progress while downloading. I've tried putting grc in front of yt-dlp, but it didn't color anything.
You can colorize specific things you want using grep:
yt-dlp 'URL' | grep --color -P "^|(Destination.*|.*has already been downloaded)"
Basically, ^ matches every line, so all lines are still printed; but since that match contains no visible characters, those lines are not colored.
Then you just add the parts you want colored after the | in the regex. The --color flag shouldn't be necessary; I'm just adding it to make sure that is not the issue.
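Applied to the loop in the question, a sketch could look like the following (GNU grep assumed; the GREP_COLORS value is just an example, and piping may change how yt-dlp renders its progress line):
for i in ${EPI//href=/http://rumble.com}; do
    printf "\e[1;38;5;212m Downloading \e[1;38;5;196m $i\e[m \n"
    # ms= sets the highlight color for matched text; 01;38;5;212 is the same
    # pink used in the printf above
    yt-dlp -f mp4-480p -w "$i" \
        | GREP_COLORS='ms=01;38;5;212' grep --color=always -P "^|(Destination.*|.*has already been downloaded)"
done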
On most Linux systems, a file name is limited to a maximum of 255 bytes.
getconf -a | grep -i name_max
NAME_MAX 255
_POSIX_NAME_MAX 255
LOGNAME_MAX 256
TTY_NAME_MAX 32
TZNAME_MAX 6
_POSIX_TZNAME_MAX 6
CHARCLASS_NAME_MAX 2048
HOST_NAME_MAX 64
LOGIN_NAME_MAX 256
I find that some YouTube video titles are longer than 255 bytes. How can I download such a video and keep the long name unchanged as the downloaded file's name?
youtube-dl $url doesn't work for videos with such long names.
Have a look at the following youtube-dl options:
--id
--output
--restrict-filenames
--id limits the filename to the video id, --output lets you specify a template for naming the output file, and --restrict-filenames ensures filenames are script- and shell-friendly.
Have a look at the help section entitled OUTPUT TEMPLATE to see how templates work; an example is:
$ youtube-dl --output '%(title)s.%(ext)s' BaW_jenozKc --restrict-filenames
You may also find the --get-filename option useful. It will show you the filename that will be used without actually downloading it.
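If you would rather keep as much of the long title as possible while staying under the 255-byte limit, the output template accepts printf-style precision, which truncates a field; here is a sketch (the 200-character cut-off is an arbitrary choice):
# Truncate the title to 200 characters and append the video id to keep the
# name unique; --restrict-filenames keeps it ASCII, so characters ~= bytes.
youtube-dl --restrict-filenames --output '%(title).200s-%(id)s.%(ext)s' "$url"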
I'm trying to create a video quiz, that will contain small parts of other videos, concatenated together (with the purpose, that people will identify from where these short snips are taken from).
For this purpose I created a file that contains the URL of the video, the starting time of the "snip", and its length. For example:
https://www.youtube.com/watch?v=5-j6LLkpQYY 00:00 01:00
https://www.youtube.com/watch?v=b-DqO_D1g1g 14:44 01:20
https://www.youtube.com/watch?v=DPAgWKseVhg 12:53 01:00
This means the first part should be taken from the first URL, from its beginning, and last one minute; the second part should be taken from the second URL, starting at 14:44 (minutes:seconds), and last one minute and 20 seconds; and so forth.
Then all these parts should be concatenated to a single video.
I'm trying to write a script that does this (I use Ubuntu and am fluent in several scripting languages). I tried the youtube-dl command-line package and ffmpeg, but I couldn't find the right options to achieve what I need.
Any suggestions will be appreciated.
Assuming the list of videos is in foo.txt and the output video is foo.mp4, this bash one-liner should do the job:
eval $(cat foo.txt | while read u s d; do echo "cat <(youtube-dl -q -o - $u | ffmpeg -v error -hide_banner -i - -ss 00:$s -t 00:$d -c copy -f mpegts -);"; done | tee /dev/tty) | ffmpeg -i - -c copy foo.mp4
This uses a little trick with process substitution and eval to avoid intermediate files, the mpegts container to enable the simple concat protocol, and tee /dev/tty just for debugging.
I tested this with youtube-dl 2018.09.26-1 and ffmpeg 1:4.0.2-3.
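For readability, the same idea can also be written without eval; here is an untested sketch of the equivalent loop:
#!/bin/bash
# Read "URL start duration" triples from foo.txt, cut each snippet as an
# MPEG-TS stream on stdout, and concatenate the streams into foo.mp4.
while read -r u s d; do
    youtube-dl -q -o - "$u" \
        | ffmpeg -v error -hide_banner -i - -ss "00:$s" -t "00:$d" \
              -c copy -f mpegts -
done < foo.txt | ffmpeg -i - -c copy foo.mp4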
Is there an easy/efficient way to get the duration of about 20k videos stored in an S3 bucket?
So far I have tried mounting the bucket in OS X using ExpanDrive and running a bash script that uses mediainfo, but I always get an "Argument list too long" error.
This is the script:
#! /bin/bash
# get video length of file.
for MP4 in `ls *mp4`
do
mediainfo $MP4 | grep "^Duration" | head -1 | sed 's/^.*: \([0-9][0-9]*\)mn *\([0-9][0-9]*\)s/00:\1:\2/' >> results.txt
done
# END
ffprobe can read videos from various sources. HTTP is also supported, which should help you as it lifts the burden of transferring all the files to your computer.
ffprobe -i http://org.mp4parser.s3.amazonaws.com/examples/Cosmos%20Laundromat%20faststart.mp4
Even if your S3 bucket is not public, you can easily generate signed URLs, which allow time-limited access to an object, if security is a concern.
Use the Bucket GET (list objects) API to list all files in the bucket and then run ffprobe, with appropriate filtering, on each file.
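A sketch of how that could be scripted (my-bucket and the 600-second expiry are placeholders, and this assumes the AWS CLI is installed and configured):
# List every .mp4 key, presign a short-lived URL for each, and let ffprobe
# read just enough of the file to report the duration in seconds.
aws s3api list-objects-v2 --bucket my-bucket \
        --query "Contents[?ends_with(Key, '.mp4')].Key" --output text \
    | tr '\t' '\n' \
    | while read -r key; do
        url=$(aws s3 presign "s3://my-bucket/$key" --expires-in 600)
        printf '%s\t' "$key"
        ffprobe -v error -show_entries format=duration \
            -of default=noprint_wrappers=1:nokey=1 "$url"
      done >> results.txt
Note that for MP4s whose moov atom sits at the end of the file (not "faststart"), ffprobe may have to fetch considerably more than the first few kilobytes.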
This answers your question but the problem you are having is well explained by Rambo Ramone's answer.
Try using xargs instead of the for loop. The backticks run the command and insert its output at that spot, and 20k file names are probably too much for your shell.
ls *.mp4 | xargs mediainfo | grep "^Duration" | head -1 | sed 's/^.*: \([0-9][0-9]*\)mn *\([0-9][0-9]*\)s/00:\1:\2/' >> results.txt
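Note that piping everything through a single grep | head -1 gives you one duration for the whole batch; if you want one line per file, a per-file variant using mediainfo's --Inform option might look like this (an untested sketch; Duration/String3 reports the duration as HH:MM:SS.mmm):
# Print "filename<TAB>duration", one line per .mp4, without hitting ARG_MAX.
find . -maxdepth 1 -name '*.mp4' -print0 \
    | xargs -0 -I{} sh -c 'printf "%s\t" "$1"; mediainfo --Inform="General;%Duration/String3%" "$1"' _ {} \
    >> results.txt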
If mounting the S3 bucket and running mediainfo against a video file to retrieve its metadata (including the duration) results in a complete download of the video from S3, then that is probably a bad way to do this, especially if you're going to do it again and again.
For new files being uploaded to S3, I would pre-calculate the duration (using mediainfo or whatever) and upload the calculated duration as S3 object metadata.
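A sketch of that pre-calculation step for a new upload (the bucket and file names are placeholders):
# MediaInfo's General;%Duration% parameter reports the duration in
# milliseconds; store it as user metadata alongside the object.
dur_ms=$(mediainfo --Inform="General;%Duration%" video.mp4)
aws s3 cp video.mp4 s3://my-bucket/video.mp4 --metadata duration-ms="$dur_ms"
The value then comes back as an x-amz-meta-duration-ms header on a HEAD request, so the duration can be read later without touching the video data.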
Or you could use a Lambda function that executes when a video is uploaded and have it read the relevant part of the video file, extract the duration header, and store it back in the S3 object metadata. For existing files, you could programmatically invoke the Lambda function against the existing S3 objects. Or you could simply do the upload process again from scratch, triggering the Lambda.
I want to extract just the first filename from a remote zip archive without downloading the entire zip. In particular, I'm trying to get the build number of dartium (link to zip file). Since the file is quite large, I don't want to download the entire thing.
If I download the entire thing, unzip -l reports the first file as being: 0 2013-04-07 12:18 dartium-lucid64-inc-21033.0/. I want to get just this filename so I can parse out the 21033 portion as the build number.
I was doing this (total hack):
_url="https://storage.googleapis.com/dartium-archive/continuous/dartium-lucid64.zip"
curl -s $_url | head -c 256 | sed -n "s:.*dartium-lucid64-inc-\([0-9]\+\).*:\1:p"
It was working when I had my shell in ASCII mode, but I recently switched it to UTF-8 and it seems sed is now honoring that, which breaks my script.
I thought about hacking it by doing:
export LANG=
curl -s ...
But that seemed like an even bigger hack.
Is there a better way?
First, you can request a byte range with curl.
Next, use strings to extract printable strings from the binary stream.
Finally, add q after p in the sed script to quit after the first occurrence is found:
curl -s $_url -r0-256 | strings | sed -n "s:.*dartium-lucid64-inc-\([0-9]\+\).*:\1:p;q"
Or this:
curl -s $_url -r0-256 | strings | sed -n "/dartium-lucid64/{s:.*-\([^-]\+\)\/.*:\1:p;q}"
This should be a bit faster and more reliable. It also extracts the full version, including the sub-version (if you need it).