Files are not downloaded into the folders with names extracted from the URL - bash

OK, I have these URLs:
https://www.ppppppppppp.com/it/yyyy/911-omicidio-al-telefono/stagione-1-appesa-a-un-filo
https://www.ppppppppppp.com/it/yyyy/avamposti-dispacci-dal-confine/stagione-1-cerignola
https://www.ppppppppppp.com/it/yyyy/belle-da-morire/stagione-1-bellezza-stalking
I am trying to create folders with these names:
911-omicidio-al-telefono
avamposti-dispacci-dal-confine
belle-da-morire
extracting each name from its URL.
For example, I would like the file from the URL
https://www.ppppppppppp.com/it/yyyy/911-omicidio-al-telefono/stagione-1-appesa-a-un-filo
to be downloaded directly inside the folder whose name is extracted from the URL:
911-omicidio-al-telefono
but this does not work: no folder names are extracted, and each file is downloaded outside its URL-derived folder.
To solve this problem I created a script.sh with this code:
#!/bin/bash
# Extract the folder name from the URL
url=$1
folder_name=$(echo "$url" | cut -d "/" -f6)
echo "folder_name: $folder_name"
if [[ "$folder_name" == "NA" ]]
then
echo "Can't extract folder name from $url"
exit
fi
# Create the folder if it doesn't exist
mkdir -p "$folder_name"
echo "file_path: $file_path"
# Download the video and audio files
ffmpeg -i "$file_path.fdash-video=6157520.mp4" -i "$file_path.fdash-audio_eng=160000.m4a" -c copy "$file_path.mp4"
# Move the file to the correct folder and rename it with .mp4 extension
mv "$file_path.mp4" "$folder_name/$file_path.mp4"
and then from the bash terminal I call it this way:
yt-dlp --referer "https://ppppppppppp.com/" --add-header "Cookie:COOKIE" --batch-file links_da_scaricare.txt -o '%(playlist)s/%(title)s.%(ext)s' --exec "~/script.sh {}"
I use Cygwin and script.sh is in C:\cygwin64\home\Administrator, but I also tested on Ubuntu and the problem is the same: it creates a folder called NA and downloads inside that folder.
All files are downloaded into the same NA folder and not into their own folders; in other words, they are not downloaded into the folders whose names are extracted from the URLs the files come from.
EDIT
I used ShellCheck to fix the code of script.sh, and now the script itself has no issues:
#!/bin/bash
url=$1
file_path=$2
# Extract the folder name from the URL
folder_name=$(echo "$url" | cut -d "/" -f4)
echo "folder_name: $folder_name"
# Create the folder if it doesn't exist
mkdir -p "$folder_name"
echo "The script is running and creating folder: $folder_name" > ~/script.log
# Move the file to the correct folder and rename it with .mp4 extension
mv "$file_path" "$folder_name/$folder_name.mp4"
but when I run this command from the Cygwin terminal,
yt-dlp --referer "https://pppppppppp.com" --add-header "Cookie:COOKIE" --batch-file links_da_scaricare.txt -o '%(playlist)s/%(title)s.%(ext)s' --exec "C:\cygwin64\home\Administrator\script.sh {} $file_path"
the NA folder is still created and no other folders are, so files are downloaded only into NA and not into their own folders.

I think you are trying to do something like this:
#!/bin/bash
# QUESTION: https://stackoverflow.com/questions/75088710/files-are-not-downloaded-in-the-folders-having-names-extracted-from-the-url

# Extract the folder name from the URL
#url=$1
url="https://www.ppppppppppp.com/it/yyyy/911-omicidio-al-telefono/stagione-1-appesa-a-un-filo"
folder_name=$(echo "$url" | cut -d "/" -f6)

### Missing
if [ "${folder_name}" = "" ]; then folder_name="NA"; fi

echo "folder_name: $folder_name"
if [[ "$folder_name" == "NA" ]]
then
    echo "Can't extract folder name from $url"
    exit
fi

### Missing
file_path="$(pwd)/${folder_name}"

# Create the folder if it doesn't exist
mkdir -p "${file_path}"
echo "file_path: ${file_path}"

download="arbitrary_name"
wget -O "${file_path}/${download}" "${url}"

# Download the video and audio files
ffmpeg -i "${file_path}/${download}.fdash-video=6157520.mp4" -i "${file_path}/${download}.fdash-audio_eng=160000.m4a" -c copy "${file_path}/${download}.mp4"
Some of that I can't verify myself, because I don't have an account to access Prime Video at
https://www.primevideo.com/detail/0S0UEN2OCD7CTY5TQF3N6KB1ET/ref=atv_dp_season_select_s1
Also, I don't think you can download the video and audio separately.
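Note also that, as far as I know, yt-dlp replaces {} in --exec with the path of the downloaded file, not the source page URL, and NA is the placeholder yt-dlp substitutes for an unavailable output-template field such as %(playlist)s; that would explain both why cut never finds the series segment and where the NA folder comes from. If you do have the page URL in hand, here is a minimal sketch (untested) that derives the series folder with bash parameter expansion instead of cut, assuming the fixed https://<host>/it/yyyy/<series>/<episode> layout from the question:
#!/bin/bash
# Sketch: extract the <series> segment from a URL shaped like
# https://<host>/it/yyyy/<series>/<episode>
url="https://www.ppppppppppp.com/it/yyyy/911-omicidio-al-telefono/stagione-1-appesa-a-un-filo"
path="${url#*//*/}"        # strip scheme and host -> it/yyyy/<series>/<episode>
path="${path#*/}"          # drop "it"
path="${path#*/}"          # drop "yyyy"
folder_name="${path%%/*}"  # keep only the <series> segment
echo "$folder_name"        # -> 911-omicidio-al-telefono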

Related

How to unzip a zip file inside another zip file?

I have multiple zip files inside a folder, and within each of these zip files there is another zip file. I would like to unzip both the outer and the inner zip files and create their own directories.
Here is the structure:
Workspace
    customer1.zip
        application/app1.zip
    customer2.zip
        application/app2.zip
    customer3.zip
        application/app3.zip
    customer4.zip
        application/app4.zip
As shown above, inside the Workspace we have multiple zip files, and within each of these zip files there exists another zip file, application/app.zip. I would like to unzip app1, app2, app3, and app4 into new folders, using the same name as the parent zip file for each result. I tried the following answer, but it unzips just the outer level:
sh '''
for zipfile in ${WORKSPACE}/*.zip; do
    exdir="${zipfile%.zip}"
    mkdir "$exdir"
    unzip -d "$exdir" "$zipfile"
done
'''
Btw, I am running this command inside my Jenkins pipeline.
No idea about Jenkins, but what you need is a recursive function.
recursiveUnzip.sh
#!/bin/dash
recursiveUnzip () { # $1=directory
    local path="$(realpath "$1")"
    for file in "$path"/*; do
        if [ -d "$file" ]; then
            recursiveUnzip "$file"
        elif [ -f "$file" -a "${file##*.}" = 'zip' ]; then
            # unzip -d "${file%.zip}" "$file" # variation 1
            unzip -d "${file%/*}" "$file" # variation 2
            rm -f "$file" # comment this if you want to keep the zip files.
            recursiveUnzip "${file%.zip}"
        fi
    done
}
recursiveUnzip "$1"
Then call the script like this:
./recursiveUnzip.sh <directory>
In your case, probably like this:
./recursiveUnzip.sh "$WORKSPACE"

BASH Script for creating multiple directories, moving files, and then renaming said files

I am trying to make a bash script that creates a directory with the same name as each file in a given directory, moves each file into its respective directory, and then renames the files.
Basically, a quantum chemistry program that I use requires the input files to be named "ZMAT". So, if I have multiple jobs, I currently need to manually create directories and then move the ZMAT files into them (I can only run one job per folder).
When I run my code, I get "binary operator expected". I am not sure what this means. Some help, please.
Here is what I have so far:
#!/bin/bash
if [ -e *.ZMAT ];
then
    echo "CFOUR Job Detected"
    for INPFILE in *.ZMAT; do
        BASENAME=$(basename $INPFILE)
        INPFILE=$BASENAME.ZMAT
        OUTFILE=$BASENAME.out
        XYZFILE=$BASENAME.xyz
        ERRORFILE=$BASENAME.slu
        if [ ! -e $ERRORFILE ];
        then
            # Create folder in scratch directory with the basename
            mkdir /scratch/CFOUR/$BASENAME
            # Move the file to its directory
            mv -f $INPFILE /scratch/CFOUR/$BASENAME
            # cd to the new directory
            cd /scratch/CFOUR/$BASENAME
            # Change the file name to just ZMAT
            mv -f $INPFILE ZMAT
            echo "Submitting CFOUR Job"
            # Submit to scheduler
            #RUN_COMMAND="sbatch -J $BASENAME _CFOUR_MRCC_SLURM.SUB"
            #eval $RUN_COMMAND
        else
            echo "Error File Detected - Not Submitting Job"
        fi
    done
fi
An alternative would be to create symlinks to the original files. As you said, each ZMAT symlink would need to be in its own directory. The upside is that the original data doesn't move, so there is less risk of breaking it, and the tool you want to use should read the symlinks as if they were the files it is looking for.
This one-liner creates an out directory in the current folder that you can subsequently move wherever you want; you could also create it directly where you want it by replacing "out" with an absolute path:
for i in *.ZMAT; do mkdir -p "out/$i"; ln -s "$PWD/$i" "out/$i/ZMAT"; done
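By the way, the "binary operator expected" message comes from [ -e *.ZMAT ]: when the glob matches more than one file, test receives the extra names as operands around a missing binary operator. A glob-safe existence check, as a sketch:
#!/bin/bash
# Collect the matches in an array instead of passing a glob to test;
# with nullglob the array is simply empty when nothing matches.
shopt -s nullglob
zmats=( *.ZMAT )
if (( ${#zmats[@]} > 0 )); then
    echo "CFOUR Job Detected"
fi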
I believe I have solved my problem. Here is the new script, which appears to be working fine. Any input is welcome though!
#!/bin/bash
SUBDIR=$(pwd)
for i in *.ZMAT; do
    BASENAME=$(basename "$i" .ZMAT)
    INPFILE=$BASENAME.ZMAT
    OUTFILE=$BASENAME.out
    XYZFILE=$BASENAME.xyz
    ERRORFILE=$BASENAME.slu
    if [ ! -e "$ERRORFILE" ];
    then
        mkdir /scratch/CFOUR/$BASENAME            # Create scratch folder
        cp "$INPFILE" /scratch/CFOUR/$BASENAME    # Copy input to scratch
        cd /scratch/CFOUR/$BASENAME               # cd to scratch folder
        mv -f "$INPFILE" ZMAT                     # Change input name
        echo "Submitting CFOUR Job"
        # Submit to scheduler
        #RUN_COMMAND="sbatch -J $BASENAME _CFOUR_MRCC_SLURM.SUB"
        #eval $RUN_COMMAND
        cd "$SUBDIR"                              # Go back to SUBDIR
    else
        echo "Error File Already Exists"
    fi
done

Is there a way I can take user input and make it into a file?

I am not able to find a way to make bash create a file with the same name as the file the user dragged into the terminal.
read -p 'file: ' file
if [ "$file" -eq "" ]; then
cd desktop
mkdir
fi
I am trying to make this part of the script take the name of the file they dragged in. For example, given /Users/admin/Desktop/test.app, it should cd into it, copy the "contents" file, make another folder with the same name (test.app in this example), paste the contents file into that folder, and delete the old file.
From your .app example, I assume you are using macOS. You will therefore need to test this script yourself, since I don't have macOS, but I think it does what you want. Execute it as bash script.sh and it will give you your desired directory test.app/contents in the current working directory.
#! /bin/bash
read -rp 'file: ' file
if [ -e "$file" ]; then
if [ -e "$file"/contents ]; then
base=$(basename "$file")
mkdir "$base"
cp -R "$file"/contents "$base"
rm -rf "$file"
else
echo "The specified file $file has no directory 'contents'."
fi
else
echo "The specified file $file does not exist."
fi

Access to zipped files without unzipping them

I have a zip file that contains a tar.gz file. I would like to access the content of the tar.gz file without unzipping the outer archive.
I can list the files in the zip file, but of course when trying to untar one of those files, tar says "Cannot open: No such file or directory", since the file does not exist on disk:
for file in $archiveFiles;
#do echo ${file: -4};
do
    if [[ $file == README.* ]]; then
        echo "skipping readme, not relevant"
    elif [[ $file == *.tar.gz ]]; then
        echo "this is a tar.gz, must extract"
        tarArchiveFiles=`tar -tzf $file`
        for tarArchiveFile in $tarArchiveFiles;
        do echo $tarArchiveFile
        done;
    fi
done;
Is it possible to extract it "on the fly", without storing it temporarily? I have the impression that this is doable in Python.
You can't do it without unzipping (obviously), but I assume what you mean is without unzipping to the filesystem.
unzip has -c and -p options, which both unzip to stdout. -c also prints each filename; -p just dumps the raw unzipped file data to stdout.
So:
unzip -p zipfile.zip path/within/zip.tar.gz | tar zxf -
Or if you want to list the contents of the tarfile:
unzip -p zipfile.zip path/within/zip.tar.gz | tar ztf -
If you don't know the path of the tar file within the zip file, you'd need to write something more sophisticated that consumes the output of unzip -c and recognises the filename lines in the output. It may well be better to write something in a "proper" language in this case. Python has a very flexible ZipFile library, and most mainstream languages have something similar.
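Note that tar can also pull a single member out of the piped archive, so you can go straight to one inner file without storing anything; with hypothetical paths:
unzip -p zipfile.zip path/within/zip.tar.gz | tar zxf - some/inner/file.txt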
You can pipe an individual member of a zip file to stdout with the -p option.
In your code, change
tarArchiveFiles=`tar -tzf $file`
to
tarArchiveFiles=`unzip -p zipfile $file | tar -tzf -`
replace "zipfile" with the name of the zip archive where you sourced $archiveFiles from

youtube-dl problems (scripting)

Okay, so I've got this small problem with a bash script that I'm writing.
This script is supposed to be run like this:
bash script.sh https://www.youtube.com/user/<channel name>
OR
bash script.sh https://www.youtube.com/user/<random characters that make up a youtube channel ID>
It downloads an entire YouTube channel to a folder named
<uploader>{<uploader_id>}/
Or, at least, it SHOULD...
The problem I'm getting is that the archive.txt file that youtube-dl creates is not created in the same directory as the videos; it's created in the directory the script is run from.
Is there a grep or sed command I could use to get the archive.txt file into the video folder?
Or maybe create the folder FIRST, then cd into it, and run the command from there?
I dunno.
Here is my script:
#!/bin/bash
pwd
sleep 1
echo "You entered: $1 for the URL"
sleep 1
echo "Now downloading all videos from URL "$1""
youtube-dl -iw \
--no-continue $1 \
-f bestvideo+bestaudio --merge-output-format mkv \
-o "%(uploader)s{%(uploader_id)s}/[%(upload_date)s] %(title)s" \
--add-metadata --download-archive archive.txt
exit 0
I ended up solving it with this:
uploader="$(youtube-dl -i -J $URL --playlist-items 1 | grep -Po '(?<="uploader": ")[^"]*')"
uploader_id="$(youtube-dl -i -J $URL --playlist-items 1 | grep -Po '(?<="uploader_id": ")[^"]*')"
uploaderandid="$uploader{$uploader_id}"
echo "Uploader: $uploader"
echo "Uploader ID: $uploader_id"
echo "Folder Name: $uploaderandid"
echo "Now downloading all videos from URL "$URL" to the folder "$DIR/$uploaderandid""
Basically, I had to parse the JSON with grep, since the youtube-dl devs said that making -o style template variables available in other options would clog up the code and make it bloated.
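To tie it together, here is a sketch (untested) of how the parsed name can be used so that archive.txt lands next to the videos, following the "create the folder first, then cd into it" idea from the question; $URL and $uploaderandid come from the snippet above:
mkdir -p "$uploaderandid"
cd "$uploaderandid" || exit 1
# Run youtube-dl from inside the target folder, so the videos and
# archive.txt both end up here.
youtube-dl -iw --no-continue "$URL" \
    -f bestvideo+bestaudio --merge-output-format mkv \
    -o "[%(upload_date)s] %(title)s" \
    --add-metadata --download-archive archive.txt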
