Recursively deleting all "*.foo" files with corresponding "*.bar" files - bash

How can I recursively delete all files ending in .foo which have a sibling file of the same name but ending in .bar? For example, consider the following directory tree:
.
├── dir
│   ├── dir
│   │   ├── file4.bar
│   │   ├── file4.foo
│   │   └── file5.foo
│   ├── file2.foo
│   ├── file3.bar
│   └── file3.foo
├── file1.bar
└── file1.foo
In this example file1.foo, file3.foo, and file4.foo would be deleted since there are sibling file{1,3,4}.bar files. file{2,5}.foo should be left alone, leaving this result:
.
├── dir
│   ├── dir
│   │   ├── file4.bar
│   │   └── file5.foo
│   ├── file2.foo
│   ├── file3.bar
└── file1.bar

Remember to first take a backup before you try this find and rm command.
Use this find:
find . -name "*.foo" -execdir bash -c '[[ -f "${1%.*}.bar" ]] && rm "$1"' - '{}' \;

while IFS= read -r FILE; do
rm -f "${FILE%.bar}".foo
done < <(exec find -type f -name '*.bar')
Or
find -type f -name '*.bar' | sed -e 's|\.bar$|.foo|' | xargs rm -f
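Both variants above pass file names through line-oriented tools, so a name containing a newline (or, in the xargs pipeline, spaces or quotes) can be split in the wrong place. A minimal sketch that lets find hand the names to a shell as arguments instead; the function name remove_foo_with_bar is made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: delete NAME.foo wherever a sibling NAME.bar exists
# below the given root (default: current directory).
remove_foo_with_bar() {
    local root=${1:-.}
    # find passes batches of *.bar paths as positional parameters, so the
    # output of find is never parsed and unusual file names stay intact.
    find "$root" -type f -name '*.bar' -exec bash -c '
        for bar in "$@"; do
            rm -f -- "${bar%.bar}.foo"
        done
    ' _ {} +
}
```

Called as remove_foo_with_bar . it should behave like the loop above; as with the other commands, try it on a copy of the tree first.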

In bash 4.0 and later, and in zsh:
shopt -s globstar # Only needed by bash
for f in **/*.foo; do
[[ -f ${f%.foo}.bar ]] && rm ./"$f"
done
In zsh, you can define a selective pattern that matches files ending in .foo only if there is a corresponding .bar file, so that rm is invoked only once, rather than once per file.
rm ./**/*.foo(e:'[[ -f ${REPLY%.foo}.bar ]]':)

Related

For loop with if-else statement to unzip files

Compared to other solutions (eg. using the "find" tool), I would like to get a solution that uses the common:
for z in *zip
do
unzip $z
done
however, my parent zip file often has 2, 3, 4 or 5 levels of zipped files. So, for a 2-levels parent zip file I can do:
for z in *zip
do
unzip $z
for x in *zip
do
unzip $x
done
done
which is a pretty basic (or archaic) solution. I was wondering if an if-else test could be included inside it (e.g., [[ -a *zip ]]) to automatically pick up all zip files without looping again over the ones that have already been unzipped. The tricky part is that the parent zip file must remain in the current directory (I could use cd or mkdir to move it, but there may be a cleaner way).
Any hints are much welcomed,
If it is acceptable to extract the files into a directory hierarchy that
mirrors the zip files, would you please try:
#!/bin/bash
#
# return nonexistent dirname such as "name(1)"
#
newname() {
local name=$1 # dirname which already exists
local new # new dirname
local i=1 # number for the suffix
while :; do
new=$(printf "%s(%d)" "$name" "$i")
[[ ! -e $new ]] && break # break if the newname is unique
(( i++ )) # increment for the next loop
done
echo "$new" # return the new name
}
#
# extract zip files into a directory whose name is the basename of the zip file
#
extract() {
local zip # zip filename
for zip in "$@"; do
if [[ -f $zip ]]; then # check the existence of the file
local dir=${zip%.zip} # generate the dirname removing extension
if [[ -e $dir ]]; then # if the dirname collides
dir=$(newname "$dir") # then give a new name
fi
unzip "$zip" -d "$dir"
extract "$dir"/*.zip # do it recursively
fi
done
}
extract *.zip
For instance, if you have following files:
level1.zip contains file1, file2, file3 and level2.zip.
level2.zip contains file4 and file5.
then the extracted result will have a structure like:
.
|-- level1
| |-- file1
| |-- file2
| |-- file3
| |-- level2
| | |-- file4
| | +-- file5
| +-- level2.zip
+-- level1.zip
[Explanations]
The extract function unzips the zip files recursively, creating
new directories with the basename of the zip files.
The newname function is used to avoid filename collisions if
the directory name to create already exists.
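The collision handling can be tried in isolation, without any zip files. The sketch below repeats the newname function from the script and calls it in a scratch directory; the directory names level1 and level1(1) are only example data:

```shell
#!/bin/bash
# newname as in the script above: print the first "name(N)" that does not exist yet.
newname() {
    local name=$1   # dirname which already exists
    local new       # candidate dirname
    local i=1       # number for the suffix
    while :; do
        new=$(printf '%s(%d)' "$name" "$i")
        [[ ! -e $new ]] && break   # stop at the first unused name
        (( i++ ))
    done
    echo "$new"
}

demo=$(mktemp -d)                        # scratch directory for the demonstration
mkdir "$demo/level1" "$demo/level1(1)"   # simulate two earlier extractions
newname "$demo/level1"                   # prints a path ending in level1(2)
```

Note that newname always appends a suffix, so extract only calls it after it has found that the plain directory name is taken.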
I am wondering why you think find is not a for loop, while it loops over directories faster.
task
tree
.
├── a
│   ├── b
│   │   ├── c
│   │   │   ├── d
│   │   │   │   └── hello.zip
│   │   │   └── hello.zip
│   │   └── hello.zip
│   └── hello.zip
└── hello.zip
4 directories, 5 files
find them
mapfile -t zips < <(find -name \*.zip)
unzip them
for zip in "${zips[@]}"; do unzip "$zip" -d "${zip%/*}"; done
done
tree
.
├── a
│   ├── b
│   │   ├── c
│   │   │   ├── d
│   │   │   │   ├── hello.txt
│   │   │   │   └── hello.zip
│   │   │   ├── hello.txt
│   │   │   └── hello.zip
│   │   ├── hello.txt
│   │   └── hello.zip
│   ├── hello.txt
│   └── hello.zip
├── hello.txt
└── hello.zip
4 directories, 10 files

A bash script to rename files from different directories at once

I have a directory named /all_images containing a large number of directories, each named dish_<num> as shown below. Inside each dish directory there is one image named rgb.png. How can I rename each image file to the name of its directory?
Before
|
├── dish_1
│ └── rgb.png
├── dish_2
│ └── rgb.png
├── dish_3
│ └── rgb.png
├── dish_4
│ └── rgb.png
└── dish_5
└── rgb.png
After
|
├── dish_1
│ └── dish_1.png
├── dish_2
│ └── dish_2.png
├── dish_3
│ └── dish_3.png
├── dish_4
│ └── dish_4.png
└── dish_5
└── dish_5.png
WARNING: Make sure you have backups before running code you got someplace on the Internet!
find /all_images -name rgb.png -execdir sh -c 'mv rgb.png "$(basename "$PWD").png"' \;
where
find /all_images will start looking from the directory "/all_images"
-name rgb.png will look anywhere for anything named "rgb.png"
optionally use -type f to restrict results to only files
-execdir in every directory where you got a hit, execute the following:
sh -c shell script
mv move, or "rename" in this case
rgb.png file named "rgb.png"
"$(basename "$PWD").png" output of basename "$PWD", which is the last component of $PWD (the current directory), with ".png" appended; the quotes keep directory names containing spaces intact
\; terminates the command passed to -execdir
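A sketch of the same rename as a plain bash loop, assuming a find that supports -print0 (GNU and BSD find do); rename_to_dirname is a hypothetical function name. It derives the new name with parameter expansion instead of basename and does not start a shell per file:

```shell
#!/bin/bash
# Hypothetical helper: rename every rgb.png below root to <parent directory name>.png.
rename_to_dirname() {
    local root=$1 img dir
    while IFS= read -r -d '' img; do
        dir=${img%/*}                        # directory that holds rgb.png
        mv -- "$img" "$dir/${dir##*/}.png"   # dish_1/rgb.png -> dish_1/dish_1.png
    done < <(find "$root" -type f -name rgb.png -print0)
}
```

Usage would be rename_to_dirname /all_images. The sketch assumes every rgb.png sits at least one directory below the root.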
If you want to benefit from your multi-core processor, consider using xargs instead of find -execdir to process files concurrently.
Here is a solution composed of find, xargs, mv, basename and dirname.
find all_images -type f -name rgb.png |
xargs -P0 -I# sh -c 'mv # $(dirname #)/$(basename $(dirname #)).png'
find all_images -type f -name rgb.png prints a list of file paths whose filename is exactly rgb.png.
xargs -P0 -I# CMD... executes CMD in a parallel mode with # replaced by path names from find command. Please refer to man xargs for more information.
-P maxprocs
Parallel mode: run at most maxprocs invocations of utility at once. If maxprocs is set to 0, xargs will run as many processes as possible.
dirname all_images/dash_4/rgb.png becomes all_images/dash_4
basename all_images/dash_4 becomes dash_4
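Because # is pasted into the text of the sh -c script, a path containing spaces or quotes would be parsed again by the shell. A hedged variant that passes each path as a positional parameter instead, assuming an xargs with -0 and -P (GNU and BSD xargs have both); the function name rename_parallel is illustrative:

```shell
#!/bin/bash
# Hypothetical helper: parallel rename of rgb.png to <parent directory name>.png.
rename_parallel() {
    find "$1" -type f -name rgb.png -print0 |
        xargs -0 -P0 -n1 sh -c '
            [ -n "$1" ] || exit 0        # nothing to do if xargs got no input
            dir=$(dirname "$1")
            mv -- "$1" "$dir/$(basename "$dir").png"
        ' sh
}
```

For example, rename_parallel all_images. Each path arrives in the inner shell as "$1", so it is never interpreted as shell code.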
Demo
mkdir all_images && seq 5 |
xargs -I# sh -c 'mkdir all_images/dash_# && touch all_images/dash_#/rgb.png'
tree
find all_images -type f -name rgb.png |
xargs -P0 -I# sh -c 'mv # $(dirname #)/$(basename $(dirname #)).png'
tree
Output
.
└── all_images
├── dash_1
│   └── rgb.png
├── dash_2
│   └── rgb.png
├── dash_3
│   └── rgb.png
├── dash_4
│   └── rgb.png
└── dash_5
└── rgb.png
.
└── all_images
├── dash_1
│   └── dash_1.png
├── dash_2
│   └── dash_2.png
├── dash_3
│   └── dash_3.png
├── dash_4
│   └── dash_4.png
└── dash_5
└── dash_5.png
6 directories, 5 files

Changing names (version numbers) in nested folders with find and mv

TL;DR: I want all the 2.6s to say 2.7
lib
└── python2.6
└── site-packages
├── x
│   ├── x.py
│   ├── x.pyc
│   ├── __init__.py
│   ├── __init__.pyc
│   └── test
│   ├── __init__.py
│   └── __init__.pyc
└── x-0.2.0-py2.6.egg-info
├── dependency_links.txt
├── entry_points.txt
├── PKG-INFO
├── requires.txt
├── SOURCES.txt
└── top_level.txt
What I've tried:
find . -type d -name "*2.6*" -exec bash -c 'mv "$1" "${1/2.6/2.7}"' -- {} \;
Obviously this doesn't work because it sees the main folder, moves that, and then sees the nested folder and tries to move it, but it no longer exists in that spot and says no such file or directory
Is there a good way to do nested find and moves? In this case, I can just run the command twice and that would technically work, but it feels dirty.
Also, I know this could screw up the versioning of the package, or that I could do
find . -type d -name "*python2.6*" -exec bash -c 'mv "$1" "${1/2.6/2.7}"' -- {} \;
find . -type d -name "*py2.6*" -exec bash -c 'mv "$1" "${1/2.6/2.7}"' -- {} \;
But I'm more interested in learning if bash has a method to solve this in general than how to deal with this narrow scenario.
You can go depth first and substitute only in the basename:
find lib -depth -type d -name "*2.6*" -exec \
bash -c 'basename="${1##*/}" && mv "$1" "${1%/*}/${basename/2.6/2.7}"' -- {} \;
If you run it with an echo as:
find lib -depth -type d -name "*2.6*" -exec \
bash -c 'bn="${1##*/}" && echo mv "$1" "${1%/*}/${bn/2.6/2.7}"' -- {} \;
on a tree created with:
mkdir -p lib/python2.6/site-packages/{x/test,x-0.20-py2.6.egg-info}
i.e., on:
lib/
└── python2.6
└── site-packages
├── x
│   └── test
└── x-0.20-py2.6.egg-info
You get:
mv lib/python2.6/site-packages/x-0.20-py2.6.egg-info lib/python2.6/site-packages/x-0.20-py2.7.egg-info
mv lib/python2.6 lib/python2.7
Remove the echos, and the moves should proceed error-free.
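The same idea wrapped in a function, as a sketch: rename_dirs_depth_first and its three arguments are hypothetical names, and -mindepth 1 (available in GNU and BSD find) is added so that the starting directory itself is never renamed.

```shell
#!/bin/bash
# Hypothetical wrapper: below root, rename every directory whose name contains
# "old" so that the first occurrence becomes "new", deepest directories first.
rename_dirs_depth_first() {
    local root=$1 old=$2 new=$3
    find "$root" -mindepth 1 -depth -type d -name "*$old*" -exec bash -c '
        old=$1 new=$2 path=$3
        base=${path##*/}               # last path component only
        newbase=${base/"$old"/$new}    # substitute inside the basename
        mv -- "$path" "${path%/*}/$newbase"
    ' _ "$old" "$new" {} \;
}
```

On the tree created with the mkdir -p command above, rename_dirs_depth_first lib 2.6 2.7 should leave lib/python2.7/site-packages/x-0.20-py2.7.egg-info in place.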

Find, move, and create empty file in place of file that was moved

I'm looking to make a crontab that will search through a directory and all subdirectories and find all files with extension *.mkv then move them to a different directory and create an empty file with the same name and extension in place of the original file.
So it would look like this:
find *.mkv in subdirectories of /home/user/directoryA/~
move *.mkv to /home/user/directoryB/
create empty *.mkv with same filename as the original in place of file in /home/user/directoryA/~
What would be the best way to accomplish this?
The process isn't too difficult if you recognize that when forming your new directory names, your old base directory will simply be a substring within the new directory name. Bash provides a parameter expansion with substring replacement that is tailor made for this process.
Essentially, you find each file below your source directory with the *.mkv extension, you use parameter expansion with substring replacement to form the new full-filename containing your destination directory, (e.g. nffn="${ffn/$srcdir/$destdir}", where ffn is short for full-filename and nffn short for new full-filename)
With your new full-filename formed containing the updated path, it is just a matter of making sure the destination directory exists before moving the file. mkdir -p is perfect here as it will create the full path, and will not complain if the directory already exists. You simply use a parameter expansion with substring removal to isolate the new directory from the new full-filename to pass to mkdir -p, and finally, you check that mkdir -p succeeds or you handle the error, e.g.
## create new directory, handle error if create fails
mkdir -p "${nffn%/*}" || {
echo "error: creating '${nffn%/*}'" >&2
exit 1
}
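The path mapping can be checked on its own before any file is moved. The sketch below uses made-up example paths and swaps in a slightly different technique: ${ffn/$srcdir/$destdir} treats $srcdir as a pattern and replaces its first match anywhere in the string, whereas stripping a quoted prefix and prepending the destination only touches a literal leading $srcdir.

```shell
#!/bin/bash
# Example values only; none of these paths need to exist.
srcdir=/home/user/directoryA
destdir=/home/user/directoryB
ffn=$srcdir/dir1/a.mkv

nffn=$destdir${ffn#"$srcdir"}   # strip the source prefix, prepend the destination
newdir=${nffn%/*}               # directory part, suitable for mkdir -p

printf '%s\n' "$nffn" "$newdir"
# /home/user/directoryB/dir1/a.mkv
# /home/user/directoryB/dir1
```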
Putting all the pieces together, you can do what you are attempting with a short script similar to the following.
#!/bin/bash
## source and destination directories, file pattern
# (note: to change destdir, two arguments required
# to change patrn, three arguments required)
srcdir="${1:-/home/david/dev/src-c/tmp/debug/AAA}"
destdir="${2:-/home/david/dev/src-c/tmp/debug/BBB}"
patrn="${3:-*.mkv}"
while IFS= read -r ffn; do ## loop over each full-filename
nffn="${ffn/$srcdir/$destdir}" ## form new full-filename
## create new directory, handle error if create fails
mkdir -p "${nffn%/*}" || {
echo "error: creating '${nffn%/*}'" >&2
exit 1
}
mv "$ffn" "$nffn" ## move full-filename to new full-filename
touch "$ffn" ## touch full-filename for zero original
done < <(find "$srcdir" -name "$patrn")
(note: you can pass the directories and file pattern as positional parameters, but if you pass more than one, you must pass every parameter that precedes it (or you could implement getopts))
Initial Directories AAA & BBB
$ tree AAA
AAA
├── a.mkv
├── b.mkv
├── dir1
│   ├── a.mkv
│   └── b.mkv
├── dir2
│   ├── a.mkv
│   └── b.mkv
├── dir3
│   ├── a.mkv
│   └── b.mkv
└── dira
├── a.mkv
└── b.mkv
$ tree BBB
BBB [error opening dir]
Final Directories AAA & BBB
$ bash mvemptydir.sh
$ tree AAA
AAA
├── a.mkv
├── b.mkv
├── dir1
│   ├── a.mkv
│   └── b.mkv
├── dir2
│   ├── a.mkv
│   └── b.mkv
├── dir3
│   ├── a.mkv
│   └── b.mkv
└── dira
├── a.mkv
└── b.mkv
$ tree BBB
BBB
├── a.mkv
├── b.mkv
├── dir1
│   ├── a.mkv
│   └── b.mkv
├── dir2
│   ├── a.mkv
│   └── b.mkv
├── dir3
│   ├── a.mkv
│   └── b.mkv
└── dira
├── a.mkv
└── b.mkv
Look things over and let me know if you have further questions.
you can write a script like this :
#!/bin/bash
cd /[ADDRESS]
find . -name '*.mkv' > /tmp/find_result.txt
mv `cut -f1 /tmp/find_result.txt` /backup/
touch `cut -f1 /tmp/find_result.txt`
1- go to the directory in which you want to find these files
2- find all .mkv files and write the result to a file (/tmp/find_result.txt in this example)
3- move all files listed in /tmp/find_result.txt to your desired directory (/backup in this example)
4- finally, create empty files with the same names (as listed in /tmp/find_result.txt)
You can add this script to crontab.
You could use a loop to do this for each file matching your criteria!
for f in `find . -name "*.mkv"`; do
mv "$f" /home/user/directoryB/
touch "$f"
done
If you wanted to get fancy you could put this into a script and accept directoryA/B as arguments:
for f in `find "$1" -name "*.mkv"`; do mv "$f" "$2"; touch "$f"; done
and run as ./script.sh /home/user/directoryA/~ /home/user/directoryB/
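Looping over the output of find in backticks splits file names on whitespace, and a second run (for example from cron) would pick up the empty placeholders and move them over the real files. A minimal sketch that addresses both points, assuming a find with -print0 and a destination outside the source tree; move_and_stub is a hypothetical name. As in the loop above, the destination is flat, so equally named files from different subdirectories would overwrite each other.

```shell
#!/bin/bash
# Hypothetical helper: move every non-empty *.mkv below src into dest and
# leave an empty file with the same name where each one used to be.
move_and_stub() {
    local src=$1 dest=$2 f
    mkdir -p -- "$dest" || return 1
    # -size +0 skips zero-byte files, i.e. placeholders left by an earlier run.
    while IFS= read -r -d '' f; do
        mv -- "$f" "$dest/" && touch -- "$f"
    done < <(find "$src" -type f -name '*.mkv' -size +0 -print0)
    return 0
}
```

It could be called as move_and_stub /home/user/directoryA /home/user/directoryB.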

Script to remove oldest files of type in each directory?

Much research has turned up similar questions, yet nothing close enough to give me an idea of how to accomplish part of my task. I'll try to keep this clear and short, while explaining the situation and desired result. My structure would be as follows:
-mobile
--Docs
--Downloads
--SomeFile
----this.is.crazy_0.0.1-1_named-silly.txt
----dont.touch.me.pdf
----leave.me.alone.png
----this.is.crazy_0.0.1-2_named-silly.txt
----this.is.crazy_0.0.1-3_named-silly.txt <---- file to keep
--SomeFileA
----this.is.crazy_0.0.1-1_also-silly.txt
----this.is.crazy_0.0.1-2_also-silly.txt
----dont.touch.me.either.pdf
----leave.me.alone.too.png
----this.is.crazy_0.0.1-3_also-silly.txt
----this.is.crazy_0.0.1-11_also-silly.txt <----file to keep
The first part of my script to find the .txt files ignores every directory that is constant in this working directory and prints them to a list (which is a completely ugly hack and most likely a hindrance compared to the way most would accomplish this task). "SomeFileB" and "SomeFileC" could come along with the same file structure and I'd like to catch them in this script as well.
The idea is to keep the newest .txt file in each directory according to its time stamp which obviously isn't in the filename. The files to keep will continue to change of course. To clarify the question again, how to go about keeping the newest .txt file in each variable directory with variable crazy name, according to timestamp which isn't in the filename? Hopefully I've been clear enough for help. This script should be in bash.
I don't have the current code in front of me right now (as I said, it's ugly), but here's a snippet of what I have: find /path/to/working/directory -maxdepth 0 -not -path "*Docs*" -not -path "*Downloads*" -name "*.txt" >list
Assuming the question was understood correctly, the task could be expressed as:
Recursively remove all files *.txt except the newest in each respective directory
#!/bin/bash
# Find all directories from top of tree
find a -type d | while read -r dir; do
# skip $dir if doesn't contain any files *.txt
ls "$dir"/*.txt &>/dev/null || continue
# list *.txt by timestamp, skipping the newest file
ls -t "$dir"/*.txt | awk 'NR>1' | while read -r file; do
rm "$file"
done
done
Assuming this directory tree, where a.txt is always the newest:
$ tree -t a
a
├── otherdir
├── b
│   ├── d e
│   │   ├── a.txt
│   │   ├── b.txt
│   │   ├── c.txt
│   │   ├── bar.txt
│   │   └── foo.pdf
│   ├── c
│   │   ├── a.txt
│   │   ├── b.txt
│   │   └── c.txt
│   ├── a.txt
│   ├── b.txt
│   ├── c.txt
│   └── foo.pdf
├── a.txt
├── b.txt
└── c.txt
This is the result after running the script:
$ tree -t a
a
├── b
│   ├── c
│   │   └── a.txt
│   ├── d e
│   │   ├── a.txt
│   │   └── foo.pdf
│   ├── a.txt
│   └── foo.pdf
├── otherdir
└── a.txt
Change rm "$file" to echo rm "$file" to check what would be removed before running "for real"
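The script above reads file names from the output of ls, which goes wrong for names that contain a newline. A sketch of an alternative that compares modification times with bash's -nt test instead; keep_newest_txt is a hypothetical name, and -print0 assumes GNU or BSD find. If two files carry the same timestamp, the one that sorts first in the glob is kept.

```shell
#!/bin/bash
# Hypothetical helper: in every directory below root, delete all *.txt files
# except the one with the most recent modification time.
keep_newest_txt() {
    local root=$1 dir f newest
    while IFS= read -r -d '' dir; do
        newest=
        # First pass: find the newest *.txt in this directory.
        for f in "$dir"/*.txt; do
            [[ -f $f ]] || continue
            if [[ -z $newest || $f -nt $newest ]]; then
                newest=$f
            fi
        done
        [[ -n $newest ]] || continue      # directory has no *.txt files
        # Second pass: remove every other *.txt.
        for f in "$dir"/*.txt; do
            if [[ -f $f && $f != "$newest" ]]; then
                rm -- "$f"
            fi
        done
    done < <(find "$root" -type d -print0)
    return 0
}
```

As with the script above, replacing rm with echo rm gives a dry run.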
