How to use Bash to delete unwanted files/folders

I'm trying to use a bash script to delete some unwanted files that share the same name across different directories, e.g. text1.txt exists in multiple directories and I wish to remove it from every directory it exists in.
I need the script to delete the unwanted files and then also delete the directory in which that file 'text1.txt' exists, so if it exists in a folder named 'TextFiles' I need that folder to be deleted too.
This is my current code I'm working on:
for files in "/*"
do
rm file1.txt file2.txt file3.txt
I'm a bit curious about whether the "/*" will look into all directories, and whether the 'do' is working to remove the files stated.
Also, after utilising 'rm' to remove specific files, how do I delete the directory they exist in?
Many thanks!

Before I start, I have to note that the rm command can do some nasty things to your system: automating it can lead to unintended data loss (system or personal files and folders) if used carelessly.
Now that I said that, imagine the following file structure:
bhuiknei#debian:~/try$ tree
.
├── dir1
│   └── this.txt
└── dir2
    ├── dir3
    │   ├── this
    │   └── this.txt
    ├── notthis.txt
    └── this.txt
3 directories, 5 files
To find and filter specific files, find and grep are your friends. The "-w" option makes grep match whole words only (so notthis.txt is not picked up):
bhuiknei#debian:~/try$ find . | grep -w this.txt
./dir1/this.txt
./dir2/dir3/this.txt
./dir2/this.txt
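Side note: find can also do this filtering on its own with -name, which avoids grep's regex semantics (the dot in this.txt is a regex wildcard and would also match names like thisAtxt). A minimal alternative, not from the original answer:
$ find . -name "this.txt"
The rest of the answer works the same either way.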
Now that we have all paths for the files lined up, these can be piped into a while loop where we can delete the files one-by-one. Then the empty directories can be deleted in a second step.
I would not suggest deleting the containing folders forcibly as they might contain other files and folders too.
The following script does the trick:
#!/bin/bash

# Exit if no file name was given
[[ $# -ne 1 ]] && { echo "Specify a filename to delete in all sub folders"; exit 1; }

# Delete all files matching the input parameter
echo "Deleting all files named ${1} in current and sub-directories."
find . | grep -w "$1" | \
while IFS= read -r LINE; do
    rm -v "$LINE"
done

# Delete only the empty folders
rmdir -v *
exit 0
And the result:
bhuiknei#debian:~/try$ tree
.
├── dir1
│   └── this.txt
├── dir2
│   ├── dir3
│   │   ├── this
│   │   └── this.txt
│   ├── notthis.txt
│   └── this.txt
└── script
3 directories, 6 files
bhuiknei#debian:~/try$ ./script this.txt
Deleting all files named this.txt in current and sub-directories.
removed './dir1/this.txt'
removed './dir2/dir3/this.txt'
removed './dir2/this.txt'
rmdir: removing directory, 'dir1'
rmdir: removing directory, 'dir2'
rmdir: failed to remove 'dir2': Directory not empty
rmdir: removing directory, 'script'
rmdir: failed to remove 'script': Not a directory
bhuiknei#debian:~/try$ tree
.
├── dir2
│   ├── dir3
│   │   └── this
│   └── notthis.txt
└── script
2 directories, 3 files
Also a side note: I didn't test what happens if the working directory is different from where the script is located, so make sure to run it from the parent directory, or add some protection. Working with absolute paths can be a solution.
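For example, one possible guard (my own sketch, not part of the original answer) is to have the script change into its own directory before doing anything:
#!/bin/bash
# Work from the directory containing this script, so the find/rm/rmdir
# calls always operate on the intended tree no matter where it is called from.
cd "$(dirname "$0")" || exit 1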
Good luck!

You know the name of the file, and so you can utilise this in a loop, parsing the output of find with parameter expansion, like so:
find /path -name "file1.txt" | while read -r var
do
    echo "rm -Rf ${var%/file1.txt}"   # echo the command
    # rm -Rf "${var%/file1.txt}"      # execute the command once the list looks as expected
done
${var%/file1.txt} strips the trailing /file1.txt from each path that find outputs, leaving only the containing directory. rm -Rf will then force removal of that directory along with the file.
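To see the expansion in isolation (a quick illustration; ${var%pattern} removes the shortest matching suffix):
$ var=./dir1/file1.txt
$ echo "${var%/file1.txt}"
./dir1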
Alternatively you can use find's native -printf to print only the directory without the file:
find /path -name "file1.txt" -printf "%h\n" | while read -r var
do
    echo "rm -Rf $var"   # echo the command
    # rm -Rf "$var"      # execute the command once the list looks as expected
done
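Note that -printf is a GNU find feature. Where it is unavailable (e.g. BSD/macOS find), dirname gives the same directory list; a sketch under that assumption:
# Print the parent directory of each match, then review before deleting
find /path -name "file1.txt" -exec dirname {} \; | while read -r var
do
    echo "rm -Rf $var"
done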

Related

How can I process a file that has different names depending on the folder (in other words, get the name first)?

Let's suppose there is a folder with several subfolders. In each subfolder there is a file that has a different name depending on the folder. For example:
basefolder
|________f1_1_1: video_1_1_1.mp4
|________f1_2_1: video_1_2_1.mp4
|
|_ .....
I want to write a shell script that does some processing on these files.
So I have:
search_dir=/path/to/the/basefolder/
for entry in "$search_dir"*/
do
    echo "$entry"
    #ls "$entry" #<--------HERE
    echo "========================"
done
As you can see I can list the subfolders.
I want to do something like
process video_1_1_1.mp4 video_1_1_1_out.mp4
but the file name varies.
Yes, I see that I can perhaps use the entry variable to compose the name of the file, but what if the files don't follow this pattern and the only thing I know is that they start with "video"?
Is there a way to get the name of the file in the folder so as to use it later?
Consider this file tree:
$ tree /tmp/test
/tmp/test
├── one
│   ├── one-1.mp4
│   ├── one-2.mp4
│   ├── one-3.mp4
│   ├── video-1.mp4
│   └── video-2.mp4
└── two
    ├── two-1.mp4
    ├── two-2.mp4
    ├── two-3.mp4
    ├── video-1.mp4
    └── video-2.mp4
2 directories, 10 files
You can use a recursive glob to find all the .mp4 files in that tree (the ** pattern requires bash's globstar option, enabled with shopt -s globstar):
$ for fn in "/tmp/test/"**/*".mp4"; do echo "$fn"; done
/tmp/test/one/one-1.mp4
/tmp/test/one/one-2.mp4
/tmp/test/one/one-3.mp4
/tmp/test/one/video-1.mp4
/tmp/test/one/video-2.mp4
/tmp/test/two/two-1.mp4
/tmp/test/two/two-2.mp4
/tmp/test/two/two-3.mp4
/tmp/test/two/video-1.mp4
/tmp/test/two/video-2.mp4
Or just the ones starting with video:
$ for fn in "/tmp/test/"**/"video-"*".mp4"; do echo "$fn"; done
/tmp/test/one/video-1.mp4
/tmp/test/one/video-2.mp4
/tmp/test/two/video-1.mp4
/tmp/test/two/video-2.mp4
Instead of echo you can process the files.
If the processing involves more than one file at a time, you can use xargs.
You can also use find:
$ find "/tmp/test/" -iname "video*.mp4" -type f
/tmp/test//one/video-1.mp4
/tmp/test//one/video-2.mp4
/tmp/test//two/video-1.mp4
/tmp/test//two/video-2.mp4
Then you would construct a pipe to xargs or use find -exec:
$ find [ what ] -print0 | xargs -0 process # xargs way
$ find [ what ] -exec process {} + # modern find
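Tying it back to the in/out naming from the question, a loop body could derive each output name from the input with parameter expansion (a sketch; process stands for the asker's placeholder command, and globstar is assumed to be available):
#!/bin/bash
shopt -s globstar nullglob
for fn in /path/to/the/basefolder/**/video*.mp4; do
    # Insert _out before the extension: video_1_1_1.mp4 -> video_1_1_1_out.mp4
    process "$fn" "${fn%.mp4}_out.mp4"
done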

Moving several files to different folders with same name

I have data for several stations, separated by the station they are from and the day they were recorded, so for station 1, for example, I have multiple folders called 2019.001, 2019.002, etc., and inside these folders I have the files (all with the same name) ending with HHZ. What I have done is get these files from each of the stations and put them in another folder, renaming them to carry the name of the folder above while keeping the name of the station; afterwards I created the folders corresponding to their names. My actual question is how to move the files that correspond to the same day, e.g. 2019.001.station1 and 2019.001.station2, to the folder 2019.001.
dir0=$(pwd)
mkdir -p data || exit 1

for pathname in $dir0/stam/*/*HHZ; do
    cp "$pathname" "data/$( basename "$( dirname "$pathname" )" )STAMHHZ"
done

for pathname in $dir0/macu/*/*HHZ; do
    cp "$pathname" "data/$( basename "$( dirname "$pathname" )" )MACUHHZ"
done

cd $dir0/data
mkdir 2019.0{10..31}
mkdir 2019.00{1..9}
It would also be nice if there were another way of executing the part of the code where I collect the files, so that I can generalize it to several stations; I am only working with two stations right now, but in the future I'll work with more.
Here is the tree to where the data is
macu
├── 2019.001
│   └── MACUHHZ
├── 2019.002
│   └── MACUHHZ
├── 2019.003
And
stam
├── 2019.001
│   └── STAMHHZ
├── 2019.002
│   └── STAMHHZ
├── 2019.003
│   └── STAMHHZ
So ideally the final situation would be:
data
├── 2019.001
│   ├── 2019.001MACUHHZ
│   └── 2019.001STAMHHZ
And so on
The script below creates the wanted file structure. The top-level directories from which you want to copy data (in your example macu and stam) should be added to the top_dirs variable. You can also change it to use a wildcard, or read them from a file, etc.
The basic idea is simple: For each top level directory, for each data directory, create the corresponding directory in data, and for each file, copy the file.
pushd and popd are used as a simple hack to make the * wildcards do what we want. $dir0 contains the root folder of the operation, so we always know where data is.
set -e is used to exit immediately if there is an error.
#!/bin/bash
set -e

top_dirs=( macu stam )

dir0="$(pwd)"
mkdir -p data

for dir in "${top_dirs[@]}" ; do
    pushd "$dir" >/dev/null
    for datadir in * ; do
        mkdir -p "$dir0/data/$datadir"
        pushd "$datadir" >/dev/null
        for file in *HHZ ; do
            cp "$file" "$dir0/data/$datadir/$datadir$file"
        done
        popd >/dev/null
    done
    popd >/dev/null
done
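To generalize to more stations, the wildcard variant mentioned above could look like this (my own sketch; it assumes all station directories sit next to data in the working directory):
# Collect every subdirectory except data itself into top_dirs
top_dirs=()
for d in */ ; do
    [[ $d == data/ ]] && continue
    top_dirs+=( "${d%/}" )
done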

deleting intermediary folders

Maybe one of you guys has something like this at hand already? I tried to use robocopy on Windows, but to no avail. I also tried to write a bash script on Linux with find etc., but gave up on that one as well ^^ A Google search brought no solution either, unfortunately. I need this for my private photo library.
Solution could be linux or windows based, both are fine. Any ideas?
I would like to get rid of hundreds of 'intermediary folders'.
I define an 'intermediary folder' as a folder that contains nothing other than exactly one sub-folder. Example:
folder 1
    file in folder 1
    folder 2 <-- intermediary folder: contains exactly one sub-folder, nothing else
        folder 3
            file in folder 3
What I would like to end up with is:
folder 1
    file in folder 1
    folder 3
        file in folder 3
I do not need the script to be recursive (removing several layers of intermediary folders at once), I'll just run it several times.
Even cooler would be if the script could rename folder 3 in the above example to 'folder 2 - folder 3', but I can live without this feature I guess.
I guess one of you Linux experts has a one-liner handy for that? ^^
Thank you very much!
Take a look at this code:
#!/usr/bin/env bash
shopt -s nullglob
while IFS= read -rd '' dir; do
    f=("$dir"/*)
    if ((${#f[@]}==1)) && [[ -d $f ]]; then
        mv -t "${dir%/*}" "$f" || continue
        rm -r "$dir"
    fi
done < <(find folder1 -depth -mindepth 1 -type d -print0)
Explanation:
shopt -s nullglob: allows filename patterns which match no files to expand to a null string
find ... -depth: makes find traverse the file system in a depth-first order
find ... -mindepth 1: processes all directories except the starting-point
find ... -type d: finds only directories
find ... -print0: prints the directories separated by a null character \0 (to correctly handle possible newlines in filenames)
while IFS= read ...: loops over all the directories (the output of find)
f=("$dir"/*): creates an array with all files in the currently processed directory
((${#f[@]}==1)) && [[ -d $f ]]: true if there is exactly one entry and it is a directory
mv -t "${dir%/*}" "$f": moves the only subdirectory one directory above
mv ... || continue: mv can fail if the subdirectory already exists in the directory above; || continue skips such a directory
rm -r "$dir": removes the processed directory
Test run:
$ tree folder1
folder1
├── file1
├── folder2
│   └── folder3
│       └── file3
├── folder4
│   ├── file4a
│   ├── file4b
│   └── file4c
└── folder5
    └── folder6
        ├── file6
        └── folder7
            └── folder8
                └── folder9
                    ├── dir9
                    └── file9
$ ./script
$ tree folder1
folder1
├── file1
├── folder3
│   └── file3
├── folder4
│   ├── file4a
│   ├── file4b
│   └── file4c
└── folder6
    ├── file6
    └── folder9
        ├── dir9
        └── file9

Compress all files of certain file types in subfolders in one file per subfolder using shell script or AppleScript

I am looking for a way to archive all files of certain file types in one zip file per subfolder.
My folder structure is as follows:
/path/to
└── TopLevel
    ├── SubLevel1
    │   ├── SubSubLevel1
    │   ├── SubSubLevel2
    │   └── SubSubLevel3
    ├── SubLevel2
    │   ├── SubSubLevel1
    │   ├── SubSubLevel2
    │   └── SubSubLevel3
    ├── SubLevel3
    │   ├── SubSubLevel1
    │   └── SubSubLevel2
    └── SubLevel4
In each folder, subfolder, or sub-subfolder there are files of the types *.abc, *.xyz and also *.001 through *.999, and I want to compress all these files into one zip file per folder, i.e. all files of the specified types in folder "SubSubLevel1" of "SubLevel1" of "TopLevel" should be packaged into one file named "SubSubLevel1_data.zip" inside the "SubSubLevel1" folder. All other files in these folders, which do not match the search criteria described above, should be kept unzipped in the same directory.
I have found some ideas here or here, but both approaches are based on a different way of archiving the files, and I have so far not found a way to adapt them to my needs, since I am not very experienced with shell scripting. I have also tried to get a solution with AppleScript, but there I face the problem of how to get all files in the folder with a number as the extension (*.001 through *.999). With RegEx I would do something like ".abc|.xyz|.\d\d\d", which would cover my search for certain file types, but I am also not sure how to implement the result of a grep in AppleScript.
I guess someone out there must have an idea how to address my archiving issue. Thanks in advance for your suggestions.
After some playing around I came up with the following solution:
#!/bin/bash
shopt -s nullglob
find -E "$PWD" -type d -maxdepth 1 -regex ".*201[0-5][0-1][0-9].*" -print0 | while IFS="" read -r -d "" thisFolder ; do
    echo "The current folder is: $thisFolder"
    to_archive=( "$thisFolder"/*.[Aa][Bb][Cc] "$thisFolder"/*.[Xx][Yy][Zz] "$thisFolder"/*.[0-9][0-9][0-9] )
    if [ ${#to_archive[@]} != 0 ]
    then
        7z a -mx=9 -uz1 -x!.DS_Store "$thisFolder"/"${thisFolder##*/}"_data.7z "${to_archive[@]}" && rm "${to_archive[@]}"
    fi
    find "$thisFolder" -type d -mindepth 1 -maxdepth 1 -print0 | while IFS="" read -r -d "" thisSubFolder ; do
        echo "The current subfolder is: $thisSubFolder"
        to_archive=( "$thisSubFolder"/*.[Aa][Bb][Cc] "$thisSubFolder"/*.[Xx][Yy][Zz] "$thisSubFolder"/*.[0-9][0-9][0-9] )
        if [ ${#to_archive[@]} != 0 ]
        then
            7z a -mx=9 -uz1 -x!.DS_Store "$thisSubFolder"/"${thisSubFolder##*/}"_data.7z "${to_archive[@]}" && rm "${to_archive[@]}"
        fi
    done
done
My script has two nested loops to iterate through the folders and subfolders. With find I look for a regex pattern in order to only back up folders from 2010-2015. All files matching the specified extensions inside the folders are compressed into one target archive per folder.
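One portability note: find -E is the BSD/macOS way of enabling extended regular expressions. On GNU find the equivalent folder selection would be (a sketch; the rest of the script stays the same):
find "$PWD" -maxdepth 1 -type d -regextype posix-extended \
    -regex ".*201[0-5][0-1][0-9].*" -print0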

Prevent CP hardlink by copying files into subfolder

I have a folder structure like this
a/
-b/
-c.txt
-d.txt
-backups/
I want to move the contents of folder a into backups so the folder structure is this.
a/
-b/
-c.txt
-d.txt
-backups/
    -b/
    -c.txt
    -d.txt
Here are the commands I have used so far.
for d in a/*/ ; do
    mkdir -p ${d}backups/
    cp -ra ${d}* backups
done
I make the folder backups, then I try to copy the content into the backups folder. However, I get an error that cp cannot copy a folder onto itself. How can I do this?
Thank you!
a
├── b
├── backups
├── c.txt
└── d.txt
2 directories, 2 files
Enable extglob by running shopt -s extglob, then execute cp -r !(backups) backups. The following will be the result:
a
├── b
├── backups
│   ├── b
│   ├── c.txt
│   └── d.txt
├── c.txt
└── d.txt
3 directories, 4 files
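In script form the same approach could look like this (a sketch; it assumes the structure from the question, with the current directory containing a/):
#!/bin/bash
# extglob must be enabled before the line using !() is parsed
shopt -s extglob
cd a || exit 1
cp -r !(backups) backups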
It is trying to copy "backups" into "backups", so you need to make sure you exclude "backups" from the a/*/ pattern.
You should probably use find to find files with a given pattern and exclude the "backups" directory. With find you can do -not -name backups.
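A sketch of that find-based approach (my own illustration; it assumes you run it from inside a/ and that backups/ already exists):
# Copy every top-level entry except backups itself into backups/
find . -mindepth 1 -maxdepth 1 -not -name backups -exec cp -ra {} backups/ \;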
