Youtube-dl download script debug - bash

I'm dropping the very bad idea of keeping this post updated. The new place for this script is:
https://gist.github.com/Wogol/66e9936b6d49cc5fecca59eaeca1ca2e
I'm trying to create a macOS .command script (it should also work under GNU/Linux) that uses youtube-dl and is simple to use. I have fixed it so it downloads the description, thumbnail, subtitles and json, creates the folder structure, and also saves the video ID, uploader and upload date.
ISSUES WITH THE MAIN SCRIPT:
FIXED (13th August) The problem I struggled with was the choice between Audio & Video and Audio only. For some reason only audio worked in the script. The download command for video & audio didn't work, but if I pasted that same command line ("the DEBUG output line") into a terminal window it worked. Scratching my head.
Youtube-dl gives me this message:
ERROR: requested format not available
FIXED (31st August) Get the maximum resolution of the video working. I had found out how to force mp4 and how to force maximum resolution, but not how to combine them.
ISSUES WITH INFORMATION FILE:
I am also creating an info file with the title, channel name, release date and description. I'm now struggling to get the video information from the .json file and youtube-dl exported into the info.txt file.
FIXED (5th September) textfile=""$folder"info.txt" was not working. It gave this error (that is where I want to add the youtube-dl folder):
ytdl.command: line 104: ~/Downloads/ytdl/dog_vids/info.txt: No such file or directory
FIXED (5th September) Find the youtube-dl folder and get it to work with grep.
Something like this:
youtube-dl --simulate --SHOW_THE_OUTPUT_PATH -o "$folder"'/%(title)s/%(title)s - (%(id)s) - %(uploader)s - %(upload_date)s.%(ext)s' 'https://www.youtube.com/watch?v=dQw4w9WgXcQ'
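For what it's worth, the --SHOW_THE_OUTPUT_PATH placeholder above is made up; the real youtube-dl option that prints the would-be output path is --get-filename. A minimal sketch:

```shell
# --get-filename resolves the output template and prints the resulting
# path without downloading anything; $folder is assumed to be set already.
youtube-dl --get-filename \
  -o "$folder"'/%(title)s/%(title)s - (%(id)s) - %(uploader)s - %(upload_date)s.%(ext)s' \
  'https://www.youtube.com/watch?v=dQw4w9WgXcQ'
```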
FIXED (5th September) In the grep commands I referred to the json file as "*.json", since there will only be one per directory, but I don't like that solution. (This could be answered by the point above.)
FIXED (5th September) How do I stop grep from grabbing the " characters? It now adds them before and after everything.
FIXED (5th September) How do I get the tags information from the json file? Tags look like this:
"tags": ["music", "video", "classic"]
FIXED (5th September) Run the info-file part of the script in the background while the video is downloading?
CURRENT VERSION TRYING TO GET IT WORKING
(12 august)
textfile=""$folder"info.txt"
echo TITLE >> ~/Downloads/ytdl/dog_vids/info.txt
youtube-dl -e $url >> ~/Downloads/ytdl/dog_vids/info.txt
echo \ >> ~/Downloads/ytdl/dog_vids/info.txt
echo CHANNEL >> $textfile
echo \ >> $textfile
echo CHANNEL URL >> $textfile
echo \ >> $textfile
echo UPLOAD DATE >> $textfile
echo \ >> $textfile
echo URL >> $textfile
echo $url >> $textfile
echo \ >> $textfile
echo TAGS >> $textfile
echo \ >> $textfile
echo DESCRIPTION >> $textfile
youtube-dl --get-description $url >> $textfile
EXPERIMENT FUTURE VERSION - EXTRACTING INFORMATION FROM JSON FILE
This isn't a working script; it shows how I want it to work, with $textfile, $ytdlfolder and $jsonfile.
url='https://www.youtube.com/watch?v=dQw4w9WgXcQ'
textfile="${folder}${YOUTUBE_DL_PATH}info.txt"
ytdlfolder="${folder}${YOUTUBE_DL_PATH}"
jsonfile="${folder}${YOUTUBE_DL_JSON_FILE}"
echo TITLE >> "$textfile"
grep -o '"title": *"[^"]*"' "$jsonfile" | grep -o '"[^"]*"$' >> "$textfile"
echo >> "$textfile"
echo CHANNEL >> "$textfile"
grep -o '"uploader": *"[^"]*"' "$jsonfile" | grep -o '"[^"]*"$' >> "$textfile"
echo >> "$textfile"
echo "CHANNEL URL" >> "$textfile"
grep -o '"uploader_url": *"[^"]*"' "$jsonfile" | grep -o '"[^"]*"$' >> "$textfile"
echo >> "$textfile"
echo "UPLOAD DATE" >> "$textfile"
grep -o '"upload_date": *"[^"]*"' "$jsonfile" | grep -o '"[^"]*"$' >> "$textfile"
echo >> "$textfile"
echo TAGS >> "$textfile"
grep -o '"tags": *\[[^]]*\]' "$jsonfile" >> "$textfile"
echo >> "$textfile"
echo URL >> "$textfile"
echo "$url" >> "$textfile"
echo >> "$textfile"
echo DESCRIPTION >> "$textfile"
youtube-dl --get-description "$url" >> "$textfile"
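If jq were available (nothing above requires it, so this is only a sketch), the whole chain of greps could collapse into a single call; the field names are the ones youtube-dl writes into its .json file:

```shell
# One jq invocation writes every section of the info file.
# Assumes jq is installed and $jsonfile / $textfile are set as above.
jq -r '"TITLE", .title, "",
       "CHANNEL", .uploader, "",
       "CHANNEL URL", .uploader_url, "",
       "UPLOAD DATE", .upload_date, "",
       "TAGS", (.tags | join(", ")), "",
       "DESCRIPTION", .description' "$jsonfile" >> "$textfile"
```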
THE SCRIPT:
12 august.
Moved the url prompt to the top so that when users paste the url they get the video's title back, so they know they have the right video.
Added max resolution 1920x1080. (Does not work.)
13 august.
Downloading Audio & Video works.
31 august.
Fixed forcing mp4 and a max height of 1080.
5 september.
Finally a working script. Read more about it here (or scroll down):
Youtube-dl download script debug
2020-09-17
Folders can now have spaces in them.
2020-09-22
Select menus are now one column.
Minor fixes.
All the bugs are now fixed; the remaining issues are only optimizations.
#! /bin/bash
################################################################################
# Script Name: Youtube-dl Easy Download Script
# Description: Easy to use script to download YouTube videos with a couple of
# options.
#
# What this script does:
# - Downloads video in MP4 with highest quality and max resolution 1920x1080.
# - Downloads thumbnail and subtitles.
# - Lets the user choose where to download and whether to get video or
#   audio only.
# - Creates a folder with the same name as the video title and puts all
#   files there.
# - Creates a .txt file with information about the video.
#
#
# Author: Wogol - Stackoverflow.com, Github.com
# License: The GNU General Public License v3.0 - GNU GPL-3
#
#
# Big thanks to the people at youtube-dl GitHub and Stack Overflow. Without
# their help this would never ever been possible for me.
#
# Special thanks to:
# Reino # Stack Overflow
#
# #####
#
# Software required: youtube-dl, xidel, printf
#
# macOS: 1. Install Homebrew: https://brew.sh
# 2. Terminal command: brew install youtube-dl xidel
#
# Linux: Depends on the package manager your distribution uses.
#
# #####
#
# Version history:
# 2020-09-22
# - Select menus are now one column.
# - Minor fixes.
# - All the bugs are now fixed; the remaining issues are only optimizations.
#
# 2020-09-17
# - Folders can now have spaces in them.
#
# 2020-09-05
# - First working version.
#
# #####
#
# Issues left:
# - In the beginning there is a confirmation that shows the title of the
#   video so the user knows they got the correct video. It takes youtube-dl
#   a couple of seconds, so to speed up the script it is DISABLED by default.
#
# - I have found out that the script doesn't need xidel to get the json
#   information; youtube-dl can get it too. I don't know how to use
#   youtube-dl --dump-json to get the same result.
#
# - To get the path to the .txt file the script uses youtube-dl. This
#   pauses the script for a few seconds. It would be best to get the path
#   somehow without connecting to YouTube again, by reusing the output
#   from youtube-dl ... or by running it in the background while the
#   video downloads.
#
################################################################################
clear
# - WELCOME MESSAGE -
echo
COLUMNS=$(tput cols)
title="-= Youtube-dl Easy Download Script =-"
printf "%*s\n" $(((${#title}+$COLUMNS)/2)) "$title"
# - PASTE URL -
echo -e "\n*** - Paste URL address and hit RETURN. Example:\nhttps://www.youtube.com/watch?v=dQw4w9WgXcQ --OR-- https://youtu.be/dQw4w9WgXcQ\n"
read url
# - VIDEO TITLE -
# So users know they have the correct URL.
#echo -e "\nThe video is: (This takes 3-4 seconds, or more ...)"
#youtube-dl -e $url
#echo
# - DOWNLOAD LOCATION -
# DIRECTORY MUST END WITH SLASH: /
echo -e "\n\n*** - Choose download folder:\n"
COLUMNS=0
PS3='Choose: '
select directory in "$HOME/Downloads/ytdl/Rick Astley/" "$HOME/Downloads/ytdl/Never Gonna Give You Up/" "$HOME/Downloads/ytdl/Other Rick Videos/" ; do
echo -e "\nOption $REPLY selected. Download directory is:\n $directory"
# - AUDIO/VIDEO SETTINGS -
echo -e "\n\n*** - Choose download settings:\n"
COLUMNS=0
PS3='Choose: '
options=("Audio & Video" "Audio only")
select settingsopt in "${options[@]}"
do
case $settingsopt in
"Audio & Video")
av="-f bestvideo[ext=mp4][height<=1080]+bestaudio[ext=m4a]/best[ext=mp4]/best --merge-output-format mp4"
;;
"Audio only")
av="-f bestaudio[ext=m4a]/bestaudio"
;;
esac
echo -e "\nOption $REPLY selected:\n $settingsopt"
# - THE DOWNLOAD SCRIPT -
echo -e "\n\n*** - Starting download:\n"
youtube-dl $av --write-thumbnail --all-subs --restrict-filenames -o "$directory%(title)s/%(title)s.%(ext)s" "$url"
# - INFORMATION FILE -
textfile=$(youtube-dl --get-filename --restrict-filenames -o "$directory%(title)s/%(title)s.txt" "$url")
xidel -s "$url" -e '
let $json:=json(
//script/extract(.,"ytplayer.config = (.+?\});",1)[.]
)/args,
$a:=json($json/player_response)/videoDetails,
$b:=json($json/player_response)/microformat
return (
"- TITLE -",
$a/title,"",
"- CHANNEL -",
$a/author,"",
"- CHANNEL URL -",
$b//ownerProfileUrl,"",
"- UPLOAD DATE -",
$b//publishDate,"",
"- URL -",
$json/loaderUrl,"",
"- TAGS -",
$a/keywords,"",
"- DESCRIPTION -",
$a/shortDescription
)
' --printed-json-format=compact >> "$textfile"
# - THE END -
echo
COLUMNS=$(tput cols)
ending="Download Complete!"
printf "%*s\n\n" $(((${#ending}+$COLUMNS)/2)) "$ending"
exit
done
done
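On the --dump-json issue noted in the script's header comments: in principle the xidel block could be replaced by one youtube-dl call piped into jq. This is a sketch, not a tested drop-in; jq is assumed to be installed, the field names are taken from youtube-dl's JSON output, and webpage_url stands in for the loaderUrl the xidel query reads:

```shell
# One network round-trip: --dump-json emits the video's metadata as JSON,
# and jq reshapes it into the same sections the xidel query writes.
youtube-dl --dump-json "$url" | jq -r '
    "- TITLE -",       .title,        "",
    "- CHANNEL -",     .uploader,     "",
    "- CHANNEL URL -", .uploader_url, "",
    "- UPLOAD DATE -", .upload_date,  "",
    "- URL -",         .webpage_url,  "",
    "- TAGS -",        (.tags | join(", ")), "",
    "- DESCRIPTION -", .description' >> "$textfile"
```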

Finally got the script working.
I have had a lot of help from many people, but big thanks to Reino for his help in this thread:
Grep command questions - Grep text from program output?
The script still has issues and it can be optimized, but I don't know how to fix that. This is the first bash script I have created.
The goals with this was to create a script that:
Simple and easy to use.
No terminal commands.
Initial sorting in different directories.
Video or only audio.
MP4 with a max resolution of 1920x1080, because everything supports that out of the box.
Text file with additional information about the video.
These are features I miss in programs like Downie (macOS) and ClipGrab.
So that other people can use this script and help fix it in the future, I tried to create a GitHub page ... not my cup of tea, so to say.
Script is in the first post on this page.
Youtube-dl download script debug

I fixed the Audio & Video download; it now works. The problem was the ' characters in the av line. I removed them and now it works fine. I also updated the av line based on the youtube-dl manual.
Not working:
av="-f 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/bestvideo+bestaudio' --merge-output-format mp4"
Working:
av="-f bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best --merge-output-format mp4"

I have now fixed the script so it forces mp4 and a max height of 1080.
-f bestvideo[ext=mp4][height<=1080]+bestaudio[ext=m4a]/best[ext=mp4]/best --merge-output-format mp4
Now on to the remaining issues.

Related

youtube-dl get channel / playlist name and pass it to bash one line script

I'm trying to download an entire YouTube channel, and that worked.
But the directories get names like the ones below, so I have to rename them all manually.
I need a way to pass the channel / playlist name and id to the script instead of fetching the url.
Script I used :
# get working/beginning directory
l=$(pwd);
clear;
# get playlists data from channel list
youtube-dl -j --flat-playlist \
"https://www.youtube.com/channel/UC-QDfvrRIDB6F0bIO4I4HkQ/playlists" \
|cut -d ' ' -f4 \
|cut -c 2-73 \
|while IFS= read -r line;
do
# loop: do with every playlist link
# make a directory named by the playlist identifier in url
mkdir "${line:38:80}";
# change directory to new directory
cd "$l/${line:38:80}";
# download playlist
youtube-dl -f mp4 "$line";
# print playlist absolute dir to user
pwd;
# change directory to beginning directory
cd $l;
done;
Names of directories :
.
├── PLxl69kCRkiI0oIqgQW3gWjDfI-e7ooTUF
├── PLxl69kCRkiI0q0Ib8lm3ZJsG3HltLQDuQ
├── PLxl69kCRkiI1Ebm-yvZyUKnfoi6VVNdQ7
├── ...
└── PLxl69kCRkiI3u-k02uTpu7z4wzYLOE3sq
This is not working :
https://github.com/ytdl-org/youtube-dl/issues/23442
# any playlist is seen as private
youtube-dl -J \
https://m.youtube.com/playlist?list=PL3GeP3YLZn5jOiHM8Js1_S0p_5HeS7TbY \
| jq -r '.title'
How to use youtube-dl from a python program?
I need it for a bash script, not for Python.
Edit: simply explained
How do I get the channel name from youtube-dl in bash and use it instead of the list id for the file name in this script?
Consider the following:
#!/usr/bin/env bash
# if we don't delete slashes from titles there are serious security issues here
slash=/
# note that this url, being in quotes, needs no backslash escaping.
url='https://www.youtube.com/playlist?list=PLXmMXHVSvS-CoYS177-UvMAQYRfL3fBtX'
# First, get the name for the whole playlist
playlist_title=$(youtube-dl -J --flat-playlist "$url" | jq -r .title) || exit
# ...and, for the rest of the script, work under a directory named by that title
mkdir -p -- "${playlist_title//$slash/}" || exit
cd "${playlist_title//$slash/}" || exit
# Finally, loop over the individual videos and download them one at a time.
# ...arguably, you could tell youtube-dl to do this itself; call it an exercise.
youtube-dl -j --flat-playlist "$url" | # one JSON document per playlist entry
jq -r '[.id, .title] | @tsv' | # write id, then title, tab-separated
while IFS=$'\t' read -r id title; do ( # read that id and title from jq
# because of the ()s, this is a subshell; exits just go to the next item
# ...which is also why exec doesn't end the entire script.
dir=${title//$slash/} # take slashes out of the title to form directory
mkdir -p -- "$dir" || exit
cd -- "$dir" || exit # if cd fails, do not download anything
exec youtube-dl "$id" # exec is a minor perf optimization; consume subshell
); done
Note:
We're using jq to convert the JSON to a more safely-readable format that we can parse without byte offsets.
The extra backslashes were removed from the URL to prevent the 404 error described in the comments to the question.
Putting the body of the loop in a subshell with parenthesis means that the cd inside that subshell is automatically reversed when the parenthesized section is exited.
We don't trust titles not to contain slashes -- you don't need someone naming their video /etc/sudoers.d/hi-there or otherwise placing files in arbitrarily-chosen places.
Note the use of "$dir" instead of just $dir. That's important; see shellcheck warning SC2086.

tshark to split pcap file based on MAC address

I have around 7 PCAP files and I would like to split them based on MAC address, placing the output into separate files, with a new directory for each PCAP file based on its title. My current approach (see below) is obviously not the best, and it still needs the loop part.
#!/bin/bash
#Name of PCAP file to analyse
pcap_file="tcpdump_2019-06-21_213044.pcap"
#MAC address to filter
mac="00:17:88:48:89:21"
mkdir /2019-06-21/
for steam in $pcap_file;
do
/usr/bin/tshark -r "$pcap_file" -Y "eth.addr eq 00:17:88:48:89:21" -w "$mac.pcap"
done
#!/bin/bash
pcap_file=(tcpdump_2019-07-01_000001.pcap tcpdump_2019-06-26_120301.pcap)
macs=( 00:17:88:71:78:72 )
devices=(phillips_1 phillips_2)
for pcf in ${pcap_file[*]}
do
echo "$pcap_file" >&2
/usr/bin/tshark -r "$pcf" -Y "eth.addr eq $macs" -w "$devices.pcap"
done
Something like:
#!/bin/bash
# usage "$0" pcap_file1 pcap_file2 ...
#macs=( 00:17:88:48:89:21 00:17:88:48:89:22 00:17:88:48:89:23 )
ips=( 192.168.202.68 192.168.202.79 192.168.229.153 192.168.23.253 )
for pcf in "$#"
do
dir="`basename "$pcf" | sed -r 's/(tcpdump_)(.*)(_[0-6]{6}.pcap)/\2/'`"
mkdir "$dir"
cd "$dir" || exit # into the newly created child dir
pcf="`realpath -e "$pcf"`" # make sure the file can be found from the new directory
#for mac in "${macs[@]}"
for ip in "${ips[@]}"
do
#echo "$mac" >&2
echo "$ip" >&2
#/usr/bin/tshark -r "$pcf" -Y "eth.addr eq $mac" -w "$mac.pcap"
/usr/bin/tshark -r "$pcf" -Y "ip.addr == $ip" -w "$ip.pcap"
done
cd .. # back to the parent dir
done
Where in your case you would use the commented out lines.
I used IPs to test, since I couldn't find an appropriate file to test MACs on. I used the file maccdc2012_00000.pcap.gz found here: https://www.netresec.com/?page=MACCDC (note: my example takes a long time to finish on that large file).

youtube-dl problems (scripting)

Okay, so I've got this small problem with a bash script that I'm writing.
This script is supposed to be run like this:
bash script.sh https://www.youtube.com/user/<channel name>
OR
bash script.sh https://www.youtube.com/user/<random characters that make up a youtube channel ID>
It downloads an entire YouTube channel to a folder named
<uploader>{<uploader_id>}/
Or, at least it SHOULD...
The problem I'm getting is that the archive.txt file youtube-dl creates is not created in the same directory as the videos; it's created in the directory the script is run from.
Is there a grep or sed command that I could use to get the archive.txt file to the video folder?
Or maybe create the folder FIRST, then cd into it, and run the command from there?
I dunno
Here is my script:
#!/bin/bash
pwd
sleep 1
echo "You entered: $1 for the URL"
sleep 1
echo "Now downloading all videos from URL "$1""
youtube-dl -iw \
--no-continue $1 \
-f bestvideo+bestaudio --merge-output-format mkv \
-o "%(uploader)s{%(uploader_id)s}/[%(upload_date)s] %(title)s" \
--add-metadata --download-archive archive.txt
exit 0
I ended up solving it with this:
uploader="$(youtube-dl -i -J $URL --playlist-items 1 | grep -Po '(?<="uploader": ")[^"]*')"
uploader_id="$(youtube-dl -i -J $URL --playlist-items 1 | grep -Po '(?<="uploader_id": ")[^"]*')"
uploaderandid="$uploader{$uploader_id}"
echo "Uploader: $uploader"
echo "Uploader ID: $uploader_id"
echo "Folder Name: $uploaderandid"
echo "Now downloading all videos from URL "$URL" to the folder "$DIR/$uploaderandid""
Basically I had to parse the JSON with grep, since the youtube-dl devs said that supporting -o style variables anywhere else would clog up the code and make it bloated.
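As a hedged alternative to the double youtube-dl call above: if jq is installed, one -J call could be cached and parsed twice. The .entries[0] path is an assumption about how -J nests playlist items:

```shell
# Fetch the JSON once, then let jq pick out both fields.
# Assumes jq is installed; .entries[0] is where -J appears to nest
# the first playlist item.
json=$(youtube-dl -i -J "$URL" --playlist-items 1)
uploader=$(jq -r '.entries[0].uploader' <<<"$json")
uploader_id=$(jq -r '.entries[0].uploader_id' <<<"$json")
echo "Folder Name: $uploader{$uploader_id}"
```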

How can I get actions logs in the script?

I am new at this so please be patient.
I have a script that I did not write (I'm not that advanced, maybe one day) and I need it to output logs in /var/log.
Can anyone please help with it?
I need to know that all the actions in that script are completed.
Here is the script:
#!/bin/bash
# copy the old database to a timestamped copy
TIMESTAMP=$(date +%d)
REPORT_EMAIL=user@domain.com
# backup Spamassassin bayes db
sa-learn -p /usr/mailcleaner/etc/mailscanner/spam.assassin.prefs.conf --siteconfigpath /usr/mailcleaner/share/spamassassin --backup >/var/mailcleaner/spool/spamassassin/spamass_rules.bak
# backup Bogofilter bayes db
cp -a /root/.bogofilter/wordlist.db "/root/.bogofilter/$TIMESTAMP-wordlist.db"
if [ -f "/root/.bogofilter/$TIMESTAMP-wordlist.db.gz" ]
then
rm -f "/root/.bogofilter/$TIMESTAMP-wordlist.db.gz"
fi
gzip "/root/.bogofilter/$TIMESTAMP-wordlist.db"
# get the spam and ham from the imap mailbox for the spamassassin and bogofilter db's
/opt/mailcleaner/scripts/imap-sa-learn.pl
if [ $? -ne 0 ]
then
(
echo "Subject: Bogofilter database update $(hostname) failed"
ls -l /var/mailcleaner/spool/bogofilter/database/
) | /usr/sbin/sendmail $REPORT_EMAIL
exit 1
fi
# copy the database to the right location
cp /root/.bogofilter/wordlist.db /var/mailcleaner/spool/bogofilter/database/wordlist.db
# If slave(s) Mailcleaner exists, ssh copy dbs to the slave(s)
#scp /root/.bogofilter/wordlist.db mailcleaner2.domain.com:/var/mailcleaner/spool/bogofilter/database/
#scp /var/mailcleaner/spool/spamassassin/spamass_rules.bak mailcleaner2.domain.com:/var/mailcleaner/spool/spamassassin/
# get the spam and ham counts from bogofilter - this just prints how many spam and ham you collected so far...
/opt/bogofilter/bin/bogoutil -w /var/mailcleaner/spool/bogofilter/database/wordlist.db .MSG_COUNT
Again thanks for the help.
Raj
It's easiest just to use output redirection when you run the script:
./the_script.sh > /var/log/the_script.log 2>&1
The 2>&1 is to capture stderr (the error stream) as well as stdout (the output stream).
If you want to write a log file and see the output on the console, use tee:
./the_script.sh 2>&1 | tee /var/log/the_script.log
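If editing the script itself is an option, the redirection can also live inside it, so it logs even when run (say, from cron) with no redirection on the command line. A sketch; the /var/log path comes from the question, adjust as needed:

```shell
#!/bin/bash
# Everything printed after the exec line, stderr included, is appended
# to $LOG. Use  exec > >(tee -a "$LOG") 2>&1  instead (bash only) if the
# output should also stay visible on the console.
LOG=/var/log/the_script.log
exec >>"$LOG" 2>&1
echo "backup run started: $(date)"
```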

creating a file downloading script with checksum verification

I want to create a shell script that reads from a .diz file, where information about the various source files needed to compile a certain piece of software (ImageMagick in this case) is stored. I am using Mac OS X Leopard 10.5 for these examples.
Basically I want an easy way to maintain these .diz files that hold the information for up-to-date source packages. I would just need to update the .diz files with urls, version information and file checksums.
Example line:
libpng:1.2.42:libpng-1.2.42.tar.bz2?use_mirror=biznetnetworks:http://downloads.sourceforge.net/project/libpng/00-libpng-stable/1.2.42/libpng-1.2.42.tar.bz2?use_mirror=biznetnetworks:9a5cbe9798927fdf528f3186a8840ebe
script part:
while IFS=: read app version file url md5
do
echo "Downloading $app Version: $version"
curl -L -v -O $url 2>> logfile.txt
$calculated_md5=`/sbin/md5 $file | /usr/bin/cut -f 2 -d "="`
echo $calculated_md5
done < "files.diz"
Actually I have more than one question concerning this.
How do I best calculate and compare the checksums? I wanted to store md5 checksums in the .diz file and compare them by string comparison, "cut"ting the hash out of the output.
Is there a way to tell curl another filename to save to? (In my case the filename gets ugly: libpng-1.2.42.tar.bz2?use_mirror=biznetnetworks.)
I seem to have issues with the backticks that should direct the output of the piped md5 and cut into the variable $calculated_md5. Is the syntax wrong?
Thanks!
The following is a practical one-liner:
curl -s -L <url> | tee <destination-file> |
sha256sum -c <(echo "a748a107dd0c6146e7f8a40f9d0fde29e19b3e8234d2de7e522a1fea15048e70 -") ||
rm -f <destination-file>
wrapping it up in a function taking 3 arguments:
- the url
- the destination
- the sha256
download() {
  curl -s -L "$1" | tee "$2" | sha256sum -c <(echo "$3 -") || rm -f "$2"
}
while IFS=: read app version file url md5
do
echo "Downloading $app Version: $version"
#use -o for output file. define $outputfile yourself
curl -L -v $url -o $outputfile 2>> logfile.txt
# use $(..) instead of backticks.
calculated_md5=$(/sbin/md5 "$file" | /usr/bin/cut -f 2 -d "=")
# compare md5
case "$calculated_md5" in
"$md5" )
echo "md5 ok"
echo "do something else here";;
esac
done < "files.diz"
My curl has a -o (--output) option to specify an output file. There's also a problem with your assignment to $calculated_md5. It shouldn't have the dollar sign at the front when you assign to it. I don't have /sbin/md5 here so I can't comment on that. What I do have is md5sum. If you have it too, you might consider it as an alternative. In particular, it has a --check option that works from a file listing of md5sums that might be handy for your situation. HTH.
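To illustrate the md5sum --check route mentioned above: it reads "HASH  FILENAME" lines (two spaces between them) on stdin and verifies each file, which maps nicely onto the .diz fields. A sketch using the $md5 and $file variables from the read loop; note md5sum is GNU coreutils and is not stock on OS X, where /sbin/md5 lives instead:

```shell
# Build a checksum line on the fly and let md5sum verify the file;
# the exit status says whether the downloaded file matched.
if printf '%s  %s\n' "$md5" "$file" | md5sum -c - ; then
    echo "md5 ok"
else
    echo "md5 MISMATCH for $file"
fi
```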
