Bash - Directory dependent script

I am trying to run a Python script in a directory and, using bash, apply this script to each of its subdirectories.
I found a script on Unix Stack Exchange that does it for one set of subdirectories here, but I want it to work recursively for all sub-directories.
The problem is that I have a single wav.py in the parent directory but none in the sub-directories.
for d in ./*/ ; do (cd "$d" && python3 $1 SA1.wav); done
As you can see, $1 (wav.py) is the path to my Python file, set when I call the bash script. I would also like the path to be relative to how many levels of the subdirectory tree I have traversed. I know I can use an absolute path, but it will cause issues later on, so I'd like to avoid it.
E.g. for one level:
for d in ./*/ ; do (cd "$d" && python3 "../$1" SA1.wav); done
and for two levels:
for d in ./*/ ; do (cd "$d" && python3 "../../$1" SA1.wav); done
Sorry if this seems trivial. I'm still new to bash.
Additional Info:
This is my full directory path:
root@Chiku-Y700:/mnt/e/Code/Python - WorkSpace/timit/TIMIT/TEST/DR1# bash recursive.sh wav.py suit rag
The full command I'm trying to run is:
python3 $1 SA1.wav $2 SA2.wav $3
$2 and $3 are unrelated to any directory info.
I get:
python3: can't open file '/mnt/e/Code/Python': [Errno 2] No such file or directory
This error came 12 times for 11 subdirectories.

Let's look at your command, with wav.py being $1:
for d in ./*/ ; do (cd "$d" && python3 $1 SA1.wav); done
Can we reduce the complexity by making wav.py executable and giving it a shebang, so that you can call it directly? Then you can move it into your PATH, or temporarily add its location to PATH. It is generally a good habit for a script not to depend on the place from which it is invoked, and in particular not to require that it live in the same directory it is called from.
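For example, a minimal sketch (assuming wav.py targets Python 3): give wav.py this first line
#!/usr/bin/env python3
and make it executable once:
chmod +x wav.py
With that in place, the PATH trick is simply: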
PATH=$PWD:$PATH
for d in ./*/ ; do (cd "$d" && wav.py SA1.wav); done
The input data should not be subject to the same directory restriction either, so that you can call the script from any directory, with the data in an arbitrary directory, too:
for d in ./*/ ; do wav.py "$d/SA1.wav"; done
You probably produce an output file which is written to the current directory. In that case, either derive the output dir from the input dir, if that is always what you want, or let the user specify an output dir; a sensible default might still be the input dir or the current dir. Or you write to stdout and redirect the output to a file located wherever you choose.
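For instance, a sketch of the "user-specified output dir, defaulting to the input's dir" idea (result.txt is a made-up output name for illustration):
infile="$1"
outdir="${2:-${infile%/*}}"    # default: the directory containing the input (assumes the path contains a slash)
wav.py "$infile" > "$outdir/result.txt"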
But your full command is:
python3 $1 SA1.wav $2 SA2.wav $3
That's fine for simple commands, but maybe you can name these parameters in a meaningful way:
pyprog="$1"
samplerate="$2"
log="$3"
python3 "$pyprog" SA1.wav "$samplerate" SA2.wav "$log"
or, as done before:
"$pyprog" SA1.wav "$samplerate" SA2.wav "$log"
Then John1024's solution might work:
find . -type d -execdir "$pyprog" SA1.wav "$samplerate" SA2.wav "$log" ";"
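Note that -execdir runs the command from inside each matched directory, so a relative $pyprog would no longer resolve there. A sketch of one way around that (assuming GNU realpath is available) is to make the path absolute first:
pyprog=$(realpath "$1")    # an absolute path survives the directory changes
find . -type d -execdir python3 "$pyprog" SA1.wav "$samplerate" SA2.wav "$log" ";"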
If changing pyprog is not an option, there is a second approach to solve the problem:
Write a wrapper script, which takes the directory to work in as a parameter and test it with different depths of directories.
Then call that wrapper by find:
find . -type d -exec ./wrapper.sh {} ";"
The wrapper.sh should start with:
#!/bin/bash
#
#
directory="$1"
and use it where needed.
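A fuller sketch of such a wrapper (the PYPROG, SAMPLERATE and LOG variables are illustrative; they would have to be exported by the calling script, with PYPROG holding an absolute path, since the wrapper changes directory):
#!/bin/bash
# wrapper.sh - run the command inside the directory given as $1
directory="$1"
cd "$directory" || exit 1
python3 "$PYPROG" SA1.wav "$SAMPLERATE" SA2.wav "$LOG"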
Btw.: I would also rename Python - WorkSpace to Python-WorkSpace (even better: python-workspace), because blanks in file and path names always cause trouble.

Related

Recursively read folders and execute a command on each of them if a file exists in that folder

I have this command, which is fully functional, and I really like it.
for d in ./*/ ; do (cd "$d" && ../Process.sh); done
It goes through all the subfolders and runs the Process.sh inside each folder.
What I need is for it to run Process.sh only if a "kill_by_pid" file exists in the folder; if "kill_by_pid" does not exist in a specific folder, that folder should be skipped and the loop should just move on to the next one.
What about using find and -execdir, for example:
find . -type f -name "kill_by_pid" -execdir ../Process.sh \;
This will call the Process.sh script only from directories where the file kill_by_pid exists; ../Process.sh refers to the script in the parent directory, as in your original loop.
Try this:
for d in ./*/ ; do
    if [ -f "$d"/kill_by_pid ]; then
        (cd "$d" && ../Process.sh)
    fi
done
A slightly different approach:
for f in ./*/kill_by_pid; do
    pushd -- "${f%/*}"    # strip the filename, leaving the containing directory
    ../Process.sh
    popd
done
This finds each kill_by_pid file, then changes into the directory that contains it ("${f%/*}" strips the filename from the match). Using pushd/popd is just to demonstrate a way of changing and restoring the current directory that doesn't require a subshell, in case Process.sh needs to execute in the current shell session.

Find all duplicate subdirectories in directory

I need to make a shell script that "lists all identical sub-directories (recursively) under the current working directory."
I'm new to shell scripts. How do I approach this?
To me, this means:
- for each directory, starting in some starting directory, compare it by name to every other directory;
- if the other directory has the same name, check size;
- if the size is the same too, recursively compare the contents of each directory item by item, maybe by md5sum(?), and continue to do so for each subdirectory within the directories (recursively?);
- then continue by recursively calling this on every subdirectory encountered;
- then repeat for every other directory in the directory structure.
It would have been the most complicated program I'd have ever written, so I assume I'm just not aware of some shell command to do most of it for me?
I.e., how should I have approached this? All the other parts were about googling until I discovered the shell command that did 90% of it for me.
(For a previous assignment that I wasn't able to finish, took a zero on this part, need to know how to approach it in the future.)
I'd be surprised to hear that there is a special Unix tool, or a special usage of a standard Unix tool, that does exactly what you describe. Maybe your understanding of the task is more complex than what the task giver intended. Maybe "identical" was meant as something concerning linking; but normally, hardlinking directories is not allowed, so that probably isn't meant either.
Anyway, I'd approach this task by creating checksums for all nodes in your tree, i. e. recursively:
- for a directory, take the names of all entries and their checksums (recursion) and compute a checksum of them;
- for a plain file, compute a checksum of its contents;
- for symlinks and special files (devices, etc.), consider what you want (I'll leave this out).
After creating checksums for all elements, search for duplicates (by sorting a list of all and searching for consecutive lines).
A quick solution could be like this:
#!/bin/bash

dirchecksum() {
    if [ -f "$1" ]; then
        checksum=$(md5sum < "$1")
    elif [ -d "$1" ]; then
        checksum=$(
            find "$1" -maxdepth 1 -printf "%P " \( ! -path "$1" \) \
                 -exec bash -c 'dirchecksum "$0"' {} \; |
                md5sum
        )
    fi
    echo "$checksum"
    echo "$checksum $1" 1>&3
}
export -f dirchecksum

list=$(dirchecksum "$1" 3>&1 1>/dev/null)

lastChecksum=''
while read -r checksum _ path; do
    if [ "$checksum" = "$lastChecksum" ]; then
        echo "duplicate found: $path = $lastPath"
    fi
    lastChecksum=$checksum
    lastPath=$path
done < <(sort <<< "$list")
This script uses two tricks which might not be clear, so I'll mention them:
To pass a shell function to find -exec, one can export -f it (done right after the function definition) and then call bash -c ... to execute it.
The shell function has two output streams: one for returning the resulting checksum (via stdout, i.e. fd 1), and one for giving out each checksum found along the way (via fd 3).
The sorting at the end uses the list written to fd 3 as its input.
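For example, saved as dupdirs.sh (the file name is only for illustration), it could be run on the current directory like this:
bash dupdirs.sh .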
Maybe something like this:
$ find -type d -exec sh -c "echo -n {}\ ; sh -c \"ls -s {}; basename {}\"|md5sum " \; | awk '$2 in a {print "Match:"; print a[$2], $1; next} a[$2]=$1{next}'
Match:
./bar/foo ./foo
find all directories: find -type d, output:
.
./bar
./bar/foo
./foo
ls -s {}; basename {} will print the simplified directory listing and the basename of the directory listed; for example, for directory foo: ls -s foo; basename foo
total 0
0 test
foo
Those cover the files in each dir, their sizes, and the dir name. That output is piped to md5sum, and the hash is printed along with the dir:
. 674e2573b49826d4e32dfe81d9680369 -
./bar 4c2d588c5fa9781ad63ad8e86e575e01 -
./bar/foo ff8d1569685be86366f18ea89851db35 -
./foo ff8d1569685be86366f18ea89851db35 -
That will be sent to awk:
$2 in a {              # hash seen before: same contents
    print "Match:"     # separate hits in output
    print a[$2], $1    # print the matching dirs
    next               # continue with the next record
}
a[$2]=$1 {next}        # store only the first dir seen for each hash
Test dir structure:
$ mkdir -p test/foo; mkdir -p test/bar/foo; touch test/foo/test; touch test/bar/foo/test
$ find test/
test/
test/bar
test/bar/foo
test/bar/foo/test # touch test
test/foo
test/foo/test # touch test

Do actions in each folder from current directory via terminal

I'm trying to run a series of commands on a list of files in multiple directories located directly under the current directory.
An example hierarchy is as follows:
/tmp
|-1
| |-a.txt
| |-b.txt
| |-c.txt
|-2
| |-a.txt
| |-b.txt
| |-c.txt
From the /tmp directory, sitting at my prompt, I'm trying to run a command against each a.txt file, renaming it to d.txt.
How do I get it to go into each directory and rename the file? I've tried the following and it won't work:
for i in ./*; do
    mv "$i" "$(echo $i | sed -e 's/a.txt/d.txt/')"
done
It just doesn't go into each directory. I've also tried to get it to create files, or folders under each hierarchy just one folder deep from the current directory, but it won't work using this:
for x in ./; do
    mkdir -p cats
done
OR
for x in ./; do
    touch $x/cats.txt
done
Any ideas ?
Place the below script in your base directory:
#!/bin/bash
# Move 'a.txt's to 'd.txt's recursively
mover()
{
    CUR_DIR=$(dirname "$1")
    mv "$1" "$CUR_DIR/d.txt"
}
export -f mover

find . -type f -name "a.txt" -exec bash -c 'mover "$0"' {} \;
and execute it.
Note:
If you wish to be a bit more innovative and generalize the script, you could accept the directory to search as a parameter and pass it on to find.
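A sketch of that generalization (replacing the script's last line; it defaults to the current directory when no argument is given):
find "${1:-.}" -type f -name "a.txt" -exec bash -c 'mover "$0"' {} \;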
> for i in ./*; do
As per your own description, this will assign ./1 and then ./2 to i. Neither of those matches any of the actual files. You want
for i in ./*/*; do
As a further aside, the shell is perfectly capable of replacing simple strings using glob patterns. This also coincidentally fixes the problem with not quoting $i when you echo it.
mv "$i" "${i%/a.txt}/d.txt"

How to cd into multiple directories, zip it (naming it after the directory name)?

So I have a bash script that cds into a directory, executes a command and then exits and enters a new directory again:
for d in ./*/ ; do (cd "$d" && somecommand); done
From here.
Unfortunately I'm not sure how to zip up the directory it is in (maybe using something such as 7z). It was a long shot but I tried this command and it didn't work (I didn't expect the asterisk to take the name of the directory...but I hoped):
7z a -r *.zip
I don't suppose anyone has any suggestions?
The variable $d contains the name of the directory (among other things):
for d in ./*/ ; do (
    cd "$d"
    dirname=${d%/}          # remove trailing /
    dirname=${dirname##*/}  # remove everything up to the last /
    7z a -r "$dirname".zip
)
done
I'm assuming that your 7z command was correct.
Perhaps you're looking for something like this:
for d in *; do
    test -d "$d" && zip -r "$d.zip" "$d"
done
That examines all files in the working directory whose names do not begin with '.' (for d in *). For those that are directories (test -d "$d") it zips the directory contents, recursively, as members of a directory. The zip files are left in the original working directory (the parent of all the directories that get zipped), but they could as easily be put into the subdirectories.

Collapse nested directories in bash

Often after unzipping a file I end up with a directory containing nothing but another directory (e.g., mkdir foo; cd foo; tar xzf ~/bar.tgz may produce nothing but a bar directory in foo). I wanted to write a script to collapse that down to a single directory, but if there are dot files in the nested directory it complicates things a bit.
Here's a naive implementation:
mv -i $1/* $1/.* .
rmdir $1
The only problem here is that it'll also try to move . and .. and ask overwrite ./.? (y/n [n]). I can get around this by checking each file in turn:
IFS=$'\n'
for file in $1/* $1/.*; do
    if [ "$file" != "$1/." ] && [ "$file" != "$1/.." ]; then
        mv -i $file .
    fi
done
rmdir $1
But this seems like an inelegant workaround. I tried a cleaner method using find:
for file in $(find $1); do
    mv -i $file .
done
rmdir $1
But find $1 will also give $1 as a result, which gives an error of mv: bar and ./bar are identical.
While the second method seems to work, is there a better way to achieve this?
Turn on the dotglob shell option, which allows your pattern to match files beginning with a dot:
shopt -s dotglob
mv -i "$1"/* .
rmdir "$1"
First, consider that many tar implementations provide a --strip-components option that allows you to strip off that first path component. Not sure whether there is a single first path?
tar -tf yourball.tar | awk -F/ '!s[$1]++{print $1}'
will show you all the first-level contents. If there is only that one directory, then
tar --strip-components=1 -xf yourball.tar
will extract the contents of that directory in the tar into the current directory.
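A sketch combining the two steps, stripping the leading directory only when the archive really has a single top-level entry:
top=$(tar -tf yourball.tar | awk -F/ '!s[$1]++{print $1}')
if [ "$(wc -l <<< "$top")" -eq 1 ]; then
    tar --strip-components=1 -xf yourball.tar
else
    tar -xf yourball.tar
fi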
So that's how you can avoid the problem altogether. But it's also a solution to your immediate problem. Having already extracted the files, so that you have
foo/bar/stuff
foo/bar/.otherstuff
you can do
tar -cf- foo | tar --strip-components=2 -C final_destination -xf-
The --strip-components feature is not part of the POSIX specification for tar, but it is on both the common GNU and OSX/BSD implementations.
