Bash - Find files from a specific directory - bash

I am new to bash (but not to programming). I have a bash script that looks for all .txt files in a project
for i in `find . -name "*.txt"`;
do
basename= "${i}"
cp ${basename} ./dest
done
However, I would like to get the .txt files only from a specific subdirectory. For example, this is my project structure:
project/
├── controllers/
│   ├── a/
│   │   ├── src/
│   │   │   ├── xxx
│   │   │   └── xxx
│   │   └── files/
│   │       ├── abc.txt
│   │       └── xxxx
│   └── b/
│       ├── src/
│       │   ├── xxx
│       │   └── xxx
│       └── files/
│           ├── abcd.txt
│           └── xxxx
├── lib
└── tests
I would like to get .txt files only from controllers/a/files and controllers/b/files. I tried replacing find . -name "*.txt" with find ./controllers/*/files/*txt, it works fine, but errors out on GitHub actions with No such file or directory found. So I'm looking for a more robust way of finding .txt files from the subdirectory without having to hardcode the path in the for loop. Is that possible?

You can use brace expansion for the search directory, e.g.
find ./project/controllers/{a,b} -type f -name "*.txt"
This selects only files below ./project/controllers/a and ./project/controllers/b.
Additionally, the basename variable is no longer needed in your script, which also cures the error caused by the ' ' (space) to the right of the '=' sign. In bash it is common to feed a while loop through process substitution rather than iterate over the output of find with a for loop, e.g.
while read -r fname; do
    # basename="${fname}" # note! no ' ' on either side of =
    cp -ua "$fname" ./dest
done < <(find ./project/controllers/{a,b} -type f -name "*.txt")
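If file names may contain spaces or newlines, the same loop can read null-delimited names. This is a minimal sketch, assuming bash and a find that supports -print0; the temp-directory tree and the file name "my notes.txt" are made up for the demo:

```bash
#!/usr/bin/env bash
# Hypothetical demo tree; the layout mirrors the question.
root=$(mktemp -d)
mkdir -p "$root/project/controllers/a/files" \
         "$root/project/controllers/b/files" "$root/dest"
touch "$root/project/controllers/a/files/my notes.txt" \
      "$root/project/controllers/b/files/abcd.txt"
cd "$root" || exit 1

copied=0
# IFS= keeps leading/trailing whitespace, -r keeps backslashes,
# -d '' makes read split on the NUL bytes emitted by -print0.
while IFS= read -r -d '' fname; do
    cp -- "$fname" ./dest/
    copied=$((copied + 1))
done < <(find ./project/controllers/{a,b} -type f -name "*.txt" -print0)

echo "copied $copied file(s)"
```

Because the loop is fed by process substitution rather than a pipe, it runs in the current shell and the counter is still set after the loop ends.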
Edit Based On Comment of Many Paths
If you have many controllers, not just a and b, then using the -path option instead of the -name option can provide a solution, e.g.
find . -path "./project/controllers/*/files/*.txt" -type f
This selects any .txt file below a files directory that sits inside any directory under controllers.
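To copy the matches without a shell loop, find can invoke cp itself. This is a minimal sketch on a made-up tree in a temp directory. Note that * in a -path pattern also matches /, so .txt files in subdirectories of files would be included too:

```bash
# Hypothetical demo tree; skip.txt sits in src/ and must not be copied.
root=$(mktemp -d)
mkdir -p "$root/project/controllers/a/files" "$root/project/controllers/a/src" \
         "$root/project/controllers/b/files" "$root/dest"
touch "$root/project/controllers/a/files/abc.txt" \
      "$root/project/controllers/b/files/abcd.txt" \
      "$root/project/controllers/a/src/skip.txt"
cd "$root" || exit 1

# find passes each matched path to cp as a single argument,
# so names containing spaces need no extra quoting here.
find ./project/controllers -path "*/files/*.txt" -type f -exec cp -- {} ./dest/ \;

ls ./dest
```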

It seems to me what you need is a simple cp command,
cp project/controllers/*/files/*.txt ./dest/
if you want to copy only the files with .txt extension under the directory files (but not in its subdirectories, if any)

find ./controllers/*/files/*txt does not really make sense to me, as it tells find to start at every path matching this pattern rather than to search below controllers. If the glob matches nothing, the shell passes the literal pattern to find, which then fails with No such file or directory.
You can do a
find controllers -name src -prune -o -name "*.txt" -print
to exclude txt files in the src directories. Another possibility would be to do
shopt -s nullglob
for file in controllers/*/files/*.txt
do
...
done
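A filled-in sketch of the nullglob loop, run on a made-up tree that contains no .txt files. Without nullglob, bash would pass the unmatched pattern through literally and cp would fail with No such file or directory, which may be what happened on CI. The cp body and the counter are assumptions added for illustration:

```bash
#!/usr/bin/env bash
# Hypothetical demo: the files directory exists but holds no .txt files.
root=$(mktemp -d)
mkdir -p "$root/controllers/a/files" "$root/dest"
cd "$root" || exit 1

shopt -s nullglob    # an unmatched glob expands to nothing
count=0
for file in controllers/*/files/*.txt; do
    cp -- "$file" ./dest/
    count=$((count + 1))
done

echo "copied $count file(s)"
```

With nullglob set, the loop body simply never runs when nothing matches, so the script exits cleanly instead of erroring.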

Related

How to copy all files having the same name into another directory, with unique filenames, using bash?

Say I have a directory structure like this:
├── Directory 1
│ ├── img.png
├── Directory 2
│ ├── imgx.png
├── Directory 3
│ ├── img.png
...
...
├── Directory n
│ ├── img.png
├── Images
I want to find and copy all the img.png files into the directory Images. Is there any way to:
Find all the img.png files (not all directories have this file)
Copy them into Images giving them unique filenames in the process (for example 2_img.png if it is being copied from Directory 2).
You're looking for something like this:
for dir in 'Directory '*; do
    src=$dir/img.png
    dest=Images/${dir#* }_img.png
    if test -f "$src"; then
        echo cp "$src" "$dest"
    fi
done
Remove echo if the output looks good.

How can I get all directories that include more than one file with specific extension?

I'm trying to get directories names of those which contain more than one file with .tf extension.
Supposing this directory:
.
├── docs
│   ├── README.md
│   └── diagram.png
├── project
│   └── main.py
├── Makefile
├── terraform
│   ├── environments
│   │   ├── prod
│   │   │   └── main.tf
│   │   └── staging
│   │       └── main.tf
│   └── module
│       ├── ecs.tf
│       ├── rds.tf
│       ├── s3.tf
│       ├── security_group.tf
│       ├── sqs.tf
│       └── variable.tf
├── tests
│   └── test_main.py
└── .terraform
    └── ignore_me.tf
I expect terraform/module as a result.
I tried all solutions at https://superuser.com/questions/899347/find-directories-that-contain-more-than-one-file-of-the-same-extension but nothing worked as expected.
The solutions at that link almost do what you want, but you could try something like this:
find . -type d -exec bash -O nullglob -c 'a=("$1"/*.tf); (( ${#a[@]} > 1 ))' bash {} \; -print
The accepted answer at that link should also give you the expected output if you remove the -c.
See:
How can I check if a directory is empty or not
Understanding the -exec option of find
In pure bash:
#!/bin/bash
shopt -s globstar dotglob
for dir in ./ **/; do
    tf_files=("$dir"*.tf)
    (( ${#tf_files[*]} > 1 )) && echo "${dir%?}"
done
There are many ways to achieve this. Here is one that filters using awk
$ find /path/to/root -type d -printf "\0%p\001" -o -type f -iname '*.tf' -printf 'c' | awk 'BEGIN{RS=ORS="\0";FS="\001"}(length($2)>1){print $1}'
The idea is to build a set of records of two fields.
The record separator is the null character \0
The field separator is the one-character \001
This is to avoid problems with special filenames.
The find command will then print the directory name in the first field and fill the second field with the character c for every matching file. If we would use newlines and spaces, it would look like
dir1 c
dir2 cc
dir3
dir4 cccc
The awk code then just filters the records on the number of characters in the second field.
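If directory names contain no newlines, a shorter pipeline may be enough: print the directory of every matching file, then keep the directories that appear more than once. This is a sketch on a made-up tree in a temp directory, not a replacement for the null-safe approach above:

```bash
# Hypothetical demo tree: module/ has two .tf files, prod/ has one.
root=$(mktemp -d)
mkdir -p "$root/terraform/module" "$root/terraform/environments/prod"
touch "$root/terraform/module/ecs.tf" "$root/terraform/module/rds.tf" \
      "$root/terraform/environments/prod/main.tf"
cd "$root" || exit 1

# dirname emits one line per .tf file; uniq -d keeps only lines
# that occur more than once in the sorted list.
result=$(find . -type f -name '*.tf' -exec dirname {} \; | sort | uniq -d)

echo "$result"
```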

A bash script to rename files from different directories at once

I have a directory named /all_images containing a large number of directories, each named dish_<num> as shown below. Inside each dish directory there is one image named rgb.png. How can I rename each image file to the name of its directory?
Before
|
├── dish_1
│ └── rgb.png
├── dish_2
│ └── rgb.png
├── dish_3
│ └── rgb.png
├── dish_4
│ └── rgb.png
└── dish_5
└── rgb.png
After
|
├── dish_1
│ └── dish_1.png
├── dish_2
│ └── dish_2.png
├── dish_3
│ └── dish_3.png
├── dish_4
│ └── dish_4.png
└── dish_5
└── dish_5.png
WARNING: Make sure you have backups before running code you got someplace on the Internet!
find /all_images -name rgb.png -execdir sh -c 'mv rgb.png "$(basename "$PWD").png"' \;
where
find /all_images will start looking from the directory "/all_images"
-name rgb.png will look anywhere for anything named "rgb.png"
optionally use -type f to restrict results to only files
-execdir in every directory where you got a hit, execute the following:
sh -c shell script
mv move, or "rename" in this case
rgb.png file named "rgb.png"
"$(basename "$PWD").png" output of "basename $PWD", which is the last section of $PWD (the current directory), with ".png" appended; the quotes keep directory names containing spaces intact
\; terminates the -execdir command
If you want to benefit from a multi-core processor, consider using xargs instead of find -execdir to process files concurrently.
Here is a solution composed of find, xargs, mv, basename and dirname.
find all_images -type f -name rgb.png |
xargs -P0 -I# sh -c 'mv # $(dirname #)/$(basename $(dirname #)).png'
find all_images -type f -name rgb.png prints a list of file paths whose filename is exactly rgb.png.
xargs -P0 -I# CMD... executes CMD in a parallel mode with # replaced by path names from find command. Please refer to man xargs for more information.
-P maxprocs
Parallel mode: run at most maxprocs invocations of utility at once. If maxprocs is set to 0, xargs will run as many processes as possible.
dirname all_images/dash_4/rgb.png becomes all_images/dash_4
basename all_images/dash_4 becomes dash_4
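The pipeline above substitutes # directly into the sh -c string, so a path containing spaces or quotes can be misparsed. Here is a sketch of a variant that passes each path as a positional argument instead, assuming find and xargs support -print0 and -0 (the GNU and BSD versions do). The "dish 2" directory is made up to show a name with a space:

```bash
# Hypothetical demo tree in a temp directory.
root=$(mktemp -d)
mkdir -p "$root/all_images/dish_1" "$root/all_images/dish 2"
touch "$root/all_images/dish_1/rgb.png" "$root/all_images/dish 2/rgb.png"
cd "$root" || exit 1

# Each NUL-delimited path arrives in the inline script as "$1";
# the trailing "sh" only fills $0.
find all_images -type f -name rgb.png -print0 |
    xargs -0 -P0 -n1 sh -c 'd=$(dirname "$1"); mv -- "$1" "$d/$(basename "$d").png"' sh

ls all_images/*
```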
Demo
mkdir all_images && seq 5 |
xargs -I# sh -c 'mkdir all_images/dash_# && touch all_images/dash_#/rgb.png'
tree
find all_images -type f -name rgb.png |
xargs -P0 -I# sh -c 'mv # $(dirname #)/$(basename $(dirname #)).png'
tree
Output
.
└── all_images
├── dash_1
│   └── rgb.png
├── dash_2
│   └── rgb.png
├── dash_3
│   └── rgb.png
├── dash_4
│   └── rgb.png
└── dash_5
└── rgb.png
.
└── all_images
├── dash_1
│   └── dash_1.png
├── dash_2
│   └── dash_2.png
├── dash_3
│   └── dash_3.png
├── dash_4
│   └── dash_4.png
└── dash_5
└── dash_5.png
6 directories, 5 files

Duplicating a nested set of folders in bash/fish

I have a folder full of other folders. Within each of these folders, there also exists another folder that I want to duplicate, but with a new name (same for all copies).
For example:
├── application
│   └── foo
│       └── bar
│           └── redacted.txt
│
├── something_different
│   └── foo
│       └── bar
│           └── RobotoMono.ttf
So every top level folder has a "foo/bar/" folder. I'd like to clone the "bar" folder (and the contents) so there is a "bar2" folder under each "foo" folder.
Then it would look like this:
├── application
│   └── foo
│       ├── bar
│       │   └── redacted.txt
│       └── bar2
│           └── redacted.txt
│
├── something_different
│   └── foo
│       ├── bar
│       │   └── RobotoMono.ttf
│       └── bar2
│           └── RobotoMono.ttf
I can successfully get the list with "find". Here is what I have tried:
find . -name bar -exec cp -r '{}' '{}'/bar2 \;
find . -name bar | xargs cp -r /bar2
And of course these don't work and leave some nice looping that was fun to clean up. Thank you for reading and explaining what I'm doing incorrectly, or whether I'm even close to what I should be doing.
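First of all, the single quotes around the braces are harmless: the shell removes them before find sees the argument. The real problem is the destination '{}'/bar2, which lies inside bar itself, so cp tries to copy the directory into its own subdirectory. Copying to one level above "bar" seems to work (bash):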
#> mkdir -p application/foo/bar && touch application/foo/bar/redacted.txt
#> find application -name bar -exec cp -r {} {}/../bar2 \;
#> ls -l application/foo/ | awk '{ print $NF }'
./
../
bar/
bar2/
#> ls -l application/foo/bar2/ | awk '{ print $NF }'
./
../
redacted.txt
find . \
-path '.*/foo/bar' \
-type d \
-exec sh -c 'p="${0%/*}"; n="${0##*/}"; cp -rp -- "$p/$n" "$p/${n}2"' {} \;
-exec sh -c Executes the inline shell script
Here is the content of the inline shell script with added comments:
# Extract the path before /bar from argument 0
p="${0%/*}"
# Extract the trailing name bar or anything else from argument 0
n="${0##*/}"
# Perform a recursive copy with preserved permissions
# from the source to the destination with suffix2
cp -rp -- "$p/$n" "$p/${n}2"
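For a fixed layout like the one in the question, a plain glob loop may be enough. This is a minimal sketch, assuming every top-level folder holds foo/bar and that bar2 does not exist yet (if it does, cp -r would place a copy of bar inside it). The temp-directory tree is made up for the demo:

```bash
# Hypothetical demo tree mirroring the question's layout.
root=$(mktemp -d)
mkdir -p "$root/application/foo/bar" "$root/something_different/foo/bar"
touch "$root/application/foo/bar/redacted.txt" \
      "$root/something_different/foo/bar/RobotoMono.ttf"
cd "$root" || exit 1

for src in */foo/bar; do
    [ -d "$src" ] || continue     # skip the literal pattern if nothing matched
    cp -rp -- "$src" "${src}2"    # bar -> bar2, next to the original
done

ls application/foo something_different/foo
```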

Bash - Combine files in separated sub folders

So I'm looking for a way to cat .html files in multiple subfolders, but by keeping them in their place.
Actual situation:
$ Folder1
.
├── Subfolder1
│   ├── File1.html
│   └── File2.html
└── Subfolder2
    ├── File1.html
    └── File2.html
Desired outcome:
$ Folder1
.
├── Subfolder1
│   ├── Mergedfile1.html
│   ├── File1.html
│   └── File2.html
└── Subfolder2
    ├── Mergedfile2.html
    ├── File1.html
    └── File2.html
So far I've come up with this:
find . -type f -name *.html -exec cat {} + > Mergedfile.html
But this combines all the files of all the subfolders of Folder1, while I want to keep them separated.
Thanks a lot!
You can loop on all subfolders with a for statement:
for i in Folder1/Subfolder*; do
    # ${i##*[!0-9]} keeps only the trailing digits of the folder name
    cat "$i"/File*.html > "$i/Mergedfile${i##*[!0-9]}.html"
done
As AK_ suggested, you can also use find with -exec:
find Folder1/ -mindepth 1 -maxdepth 1 -type d -exec sh -c 'cat "$1"/File*.html > "$1"/Mergedfile.html' sh {} \;

Resources