Programmatically performing the same operation in multiple same-level subfolders - shell

I want to write a shell script that would perform a single operation - pdflatex - on multiple files of the same type in same-level subfolders. Here's a simplified version of the directory structure:
/
|-- /a
|     a1.tex
|-- /b
|     b1.tex
|-- /c
      c1.tex
What I'd like to do is have the script launch from /, and perform pdflatex on all the .tex files without having to manually include all of those subdirs in the actual script file.
The algorithm is simple enough on paper:
Go to directory /
Find all directories matching regex given in script (may also use wildcard expression)
Go down n levels, again following regex as needed
Perform pdflatex on all .tex files in current directory
... But I'm not sure how to implement this in a Unix shell script.
Do shell scripts allow for this kind of operation at all? If so, what would an implementation look like?

find is what you seem to be looking for:
find / -mindepth 1 -maxdepth 1 -type f -exec pdflatex {} \;
This would execute pdflatex on all regular files directly under the starting directory (/ here).
In order to perform the action only on files n levels down, replace 1 with n+1 in both the -mindepth and -maxdepth options above.
In order to restrict this to filenames matching *.tex, say
find / -mindepth 1 -maxdepth 1 -type f -name "*.tex" -exec pdflatex {} \;
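For the layout in the question, where the .tex files sit one level below the starting directory, that works out to -mindepth 2 -maxdepth 2. A minimal sketch (the path is a placeholder; also note that pdflatex writes its output into the current working directory, so -execdir, which runs the command from each file's own directory, may be preferable where it is supported):
cd /path/to/project/root    # the directory referred to as "/" in the question
find . -mindepth 2 -maxdepth 2 -type f -name "*.tex" -execdir pdflatex {} \;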

The only thing I would add to what others have said is a couple of parameters, since the question suggests turning this into a small, self-contained program.
pdflatexdirs
#!/bin/bash
pattern='*.tex'
if [ -n "$1" ]; then           # a pattern was given: look for <pattern>.tex
    pattern="${1}.tex"
fi
maxdepth=' '
if [ -n "$2" ]; then           # a depth was given: limit the search with -maxdepth
    maxdepth=" -maxdepth $2 "
fi
# borrowed from @devnull ;)
find / -mindepth 1${maxdepth}-type f -name "$pattern" -exec pdflatex {} \;
Then chmod u+x pdflatexdirs
Now call it as
pdflatexdirs [optional search pattern] [optional maxdepth]
With this script:
The default search pattern here is *.tex
If you want to specify maxdepth, you must also provide a search pattern
When you do provide a search pattern, .tex is appended to whatever you pass
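A couple of example invocations (illustrations only, assuming the script is executable and on your PATH):
pdflatexdirs              # default: compile every *.tex that the find command locates
pdflatexdirs report 2     # only files named report.tex, searching at most 2 levels deep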

Related

Find Command Exclude Hidden files when using empty flag

I am looking for a way to use the find command to tell if a folder has no files in it. I have tried using the -empty flag, but since I am on macOS the system files the OS places in the directory such as .DS_Store cause find to not consider the directory empty. I have tried telling find to ignore .DS_Store but it still considers the directory not empty because that file is present.
Is there a way to have find exclude certain files from what it considers -empty? Also is there a way to have find return a list of directories with no visible files?
The -empty predicate is rather simple: it is true for a directory only if it has no entries other than . or ..
Kind of an ugly solution, but you can use -exec to run another find in each directory which will implement your criteria for deciding what directories you want to include.
Below:
the outer find will execute sh -c for each directory in /starting/point
sh will execute another find with different criteria.
the inner find will print the first match and then quit
read will consume the output (if any) of the inner find. read will have an exit status of 0 only if the inner find printed at least one line, non-zero otherwise
if there was no output from the inner find, the outer find's -exec predicate will evaluate to false
since -exec is followed by -o, the following -print action will be executed only for those directories which do not match the inner find's criteria
find /starting/point \
    -type d \( \
        -exec sh -c \
            'find "$1" -mindepth 1 -maxdepth 1 ! -name ".*" -print -quit | read' \
            sh {} \; \
        -o -print \
    \)
Also note that find FOLDER -empty is somewhat tricky: it will still produce output even when FOLDER contains files, as long as those files are themselves empty, because empty files match -empty too.
Maybe not exactly what was asked, but I prefer the brute force approach if I want to avoid a no-match error on using FOLDER/*. In tcsh:
ls -d FOLDER/* >& /dev/null
if !($status) COMMANDS FOLDER/* ...
A variation of this might be usable here (like also using
ls -d FOLDER/.* | wc -l
and drawing the desired conclusions from the combined results).
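For comparison, a similar brute-force check could be sketched in bash (not part of the original answer; FOLDER stands for the directory being tested):
shopt -s nullglob dotglob
all=( "$FOLDER"/* )        # every entry, hidden ones included (globs never match . or ..)
shopt -u dotglob
visible=( "$FOLDER"/* )    # non-hidden entries only
echo "$FOLDER: ${#all[@]} entries, ${#visible[@]} visible"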

How to count files only in specific subdirectories located deeply in the hierarchy?

I need to count all session files sess_* located in tmp directories (Debian machine) and know the path to each tmp directory together with its count.
All parent directories are in /somepath/to/clientsDirs.
The directory structure for one client is
../ClientDirX/webDirYX/someDirZx
../ClientDirX/webDirYX/someDirZy
../ClientDirX/webDirYX/tmp
../ClientDirX/webDirYX/someDirZz
../ClientDirX/webDirYX/...
../ClientDirX/webDirYX/someDirZN
../ClientDirX/webDirYY/someDirZx
../ClientDirX/webDirYY/someDirZy
../ClientDirX/webDirYY/tmp
../ClientDirX/webDirYY/someDirZz
../ClientDirX/webDirYY/...
../ClientDirX/webDirYY/someDirZN
All someDirZ and tmp directories have a varying number of subdirectories. Session files are in the tmp dir only, not in its subdirectories. A single tmp dir can contain more than a million sess_* files, so the solution needs to be very time-efficient.
X, YY, etc. in the directory names are always numbers, but they are not consecutive, e.g.:
ClientDir1/webDir3/*
ClientDir4/webDir31/*
ClientDir4/webDir35/*
ClientDir18/webDir2/*
Could you please help me count all sess_* files in each tmp dir by command line or bash script?
EDIT: answer changed after the question was revised.
The whole task is divided into 3 parts.
I changed the directory names to simpler ones.
1. Build a list of tmp directories to search (first script)
#!/bin/bash
find /var/log/clients/sd*/wd*/ -maxdepth 1 -type d -name "tmp" >list
explanation
-type d only searches for directories
-maxdepth 1 specifies the maximum search depth
-name specifies the name of the items sought
>list redirects the result to the list file
* is so-called shell globbing; in this case it means any string of characters
We perform this step separately and save the result to a file for two reasons: first, the execution time will be significant; second, the list of clients does not change very often, so it makes no sense to rebuild it every time.
2. Iterate over the list items in bash (see the final script)
3. Search for sess_* files in the tmp directory without descending into subdirectories
find /path/to/tmp -maxdepth 1 -type f -name "sess_*" -exec printf "1" \; |wc -c
explanation
-type f only searches for files
-exec executes an arbitrary command, in this case printf
\; is the required terminator of the -exec command and must be preceded by a space!
-exec printf is used because not every version of find has printf built in, so this also works on busybox or outside of the GNU world
If your find has printf, use it instead of -exec (-printf "1")
For more, see man find
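For example, with GNU find the same count can be done without -exec (the -printf variant mentioned above):
find /path/to/tmp -maxdepth 1 -type f -name "sess_*" -printf "1" | wc -c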
Finally the second script:
#!/bin/bash
for x in $(cat list)
do
    printf '%s \t' "$x"
    find "$x" -maxdepth 1 -type f -name "sess_*" -exec printf "1" \; | wc -c
done
Example result:
/var/log/clients/sd1/wd1/tmp 3
/var/log/clients/sd2/wd1/tmp 62
EDIT:
Note that in some versions of GNU find (e.g. 4.7.0-git), changing the order of -maxdepth 1 and -type f makes the program throw a warning or not work at all. It seems that these versions do not use the getopt mechanism for some reason. Other versions of find do not seem to have this problem.

Using cp -ur, but update only directories

I have two directories that I want to contain mostly the same information.
I wrote the basic script
#!/bin/bash
# Update the two folders
cp -vur ../catkin_ws/src .
cp -vur . ../catkin_ws/src
Now I want to change this to only update the directories and their contents, but not other files in the top-level directory, like the bash script itself.
If that is not possible, is there a way to exclude certain files during the update?
Suppose you want to synchronize ../catkin_ws/src with the current directory, and the current script is located in the current directory. As far as I understand, you want to synchronize only the top-level directories including their contents, but not other types of nodes possibly located at the top level, i.e. directly within ../catkin_ws/src, or ./.
Then it is easily done with the find command:
src_dir="../catkin_ws/src"
find "$src_dir" -mindepth 1 -maxdepth 1 -type d \
-exec cp -vru {} . \;
find ./ -mindepth 1 -maxdepth 1 -type d \
-exec cp -vru {} "$src_dir" \;
where {} stands for the next directory found.
If you want to filter further, you may use extra options such as -name, -path, or -regex. For example, the following skips directory x:
find "$src_dir" -mindepth 1 -maxdepth 1 -type d \
! -name 'x' \
-exec cp -vru {} . \;
where ! causes the following expression to evaluate to true if that expression is false, i.e. it acts as a logical NOT operator.
The drawback of the above commands is that the cp command is launched for each folder sequentially (due to \;). If you want to run a single cp command for all sources, you may use an approach described in this answer; a sketch follows below.
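One way to do that (a sketch, assuming GNU cp, which provides the -t/--target-directory option) is to let find batch the sources with + and give cp its destination up front:
find "$src_dir" -mindepth 1 -maxdepth 1 -type d \
    -exec cp -vru -t . {} +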
P.S.
I didn't try to suggest a better way to synchronize the directories, only a way to fix your current approach.

How to rename multiple directories in bash using a symbol pattern

I am very new to bash so please don't overcomplicate the answer!
I have roughly 200 sub-directories each named similarly to this. (I think they are sub directories. They live within another directory at least.)
XMMXCS J083454.8+553420.58
I need to bulk rename all of these directories and change the '+' in the directory name to '-'.
To change the names of my directory I have tried:
find . -depth -type d -name + -exec sh -c 'mv "${0}" "${0%/+}/-"' {} \;
and
find . -name + -type d -execdir mv {} - \
However I think this isn't working because + and - aren't letter characters.
How do I get around this?
Everything I have found online relates to renaming files as opposed to directories, and if anyone knows how to get round this without having to rename them all manually it would be very appreciated.
I have tried this previous question, but the syntax doesn't work for me; the folders are all called the same thing after running it.
Rename multiple directories matching pattern
Thanks
You can have a script like this.
#!/bin/bash
DIR='.' ## Change to the directory you want.
for SDIR in "$DIR"/*; do
    [[ -d $SDIR ]] || continue           ## Skip if it's not a directory
    BASE=${SDIR##*/}                     ## Get the base name (remove the directory part)
    NEW_NAME=${BASE//+/-}                ## Create a new name based on $BASE with + chars changed to -
    echo mv -- "$SDIR" "$DIR/$NEW_NAME"  ## Rename. Remove echo once you're sure it does the right thing.
done
Then run bash script.sh.
Your original syntax was pretty close, try something like this
find -mindepth 1 -maxdepth 1 -type d -name '*+*' -exec bash -c 'mv "${0}" "${0//+/-}"' {} \;
Issues
-depth performs a depth-first (post-order) traversal, but it seems like you only want directories one level deep
You need to match globs that contain +, so *+* and not just + (quoting is needed so the glob is processed by find and not by the shell)
With "${0%/+}/-" you seem to be mixing up a few syntaxes; ${0//SUBSTRING/REPLACEMENT} replaces all instances of SUBSTRING with REPLACEMENT

Unix find: list of files from stdin

I'm working in Linux & bash (or Cygwin & bash).
I have a huge--huge--directory structure, and I have to find a few needles in the haystack.
Specifically, I'm looking for these files (20 or so):
foo.c
bar.h
...
quux.txt
I know that they are in a subdirectory somewhere under ..
I know I can find any one of them with
find . -name foo.c -print
This command takes a few minutes to execute.
How can I print the names of these files with their full directory name? I don't want to execute 20 separate finds--it will take too long.
Can I give find the list of files from stdin? From a file? Is there a different command that does what I want?
Do I have to first assemble a command line for find with -o using a loop or something?
If your directory structure is huge but not changing frequently, it is good to run
cd /to/root/of/the/files
find . -type f -print > ../LIST_OF_FILES.txt   # and the next one is sometimes handy too
find . -type d -print > ../LIST_OF_DIRS.txt
After that you can find anything really fast (with grep, sed, etc.) and update the file lists only when the tree changes. (It is a simplified replacement if you don't have locate.)
So,
grep '/foo.c$' LIST_OF_FILES.txt #list all foo.c in the tree..
When you want to find a list of files, you can try the following:
fgrep -f wanted_file_list.txt < LIST_OF_FILES.txt
or directly with the find command
find . -type f -print | fgrep -f wanted_file_list.txt
The -f option for fgrep means "read the patterns from a file", so you can easily grep the input for multiple patterns...
You shouldn't need to run find twenty times.
You can construct a single command with multiple filename specifiers:
find . \( -name 'file1' -o -name 'file2' -o -name 'file3' \) -exec echo {} \;
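If you would rather not type the -o chain by hand, it can also be assembled from a list; a rough sketch (names.txt is a hypothetical file with one filename per line, assumed non-empty):
args=()
while IFS= read -r name; do
    [ "${#args[@]}" -gt 0 ] && args+=(-o)   # separate the -name tests with -o
    args+=(-name "$name")
done < names.txt
find . \( "${args[@]}" \) -print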
Is the locate(1) command an acceptable answer? Nightly it builds an index, and you can query the index quite quickly:
$ time locate id_rsa
/home/sarnold/.ssh/id_rsa
/home/sarnold/.ssh/id_rsa.pub
real 0m0.779s
user 0m0.760s
sys 0m0.010s
I gave up executing a similar find command in my home directory at 36 seconds. :)
If nightly doesn't work, you could run the updatedb(8) program by hand once before running locate(1) queries. /etc/updatedb.conf (updatedb.conf(5)) lets you select specific directories or filesystem types to include or exclude.
Yes, assemble your command line.
Here's a way to process a list of files from stdin and assemble your (FreeBSD) find command to use extended regular expression matching (n1|n2|n3).
For GNU find you may have to use one of the following options to enable extended regular expression matching:
-regextype posix-egrep
-regextype posix-extended
echo '
foo\\.c
bar\\.h
quux\\.txt
' | xargs bash -c '
IFS="|";
find -E "$PWD" -type f -regex "^.*/($*)$" -print
echo find -E "$PWD" -type f -regex "^.*/($*)$" -print
' arg0
# note: "$*" uses the first character of the IFS variable as array item delimiter
(
IFS='|'
set -- 1 2 3 4 5
echo "$*" # 1|2|3|4|5
)
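For reference, roughly the same thing with GNU find might look like this (an untested sketch, not from the original answer; -regextype posix-extended takes the place of the BSD -E flag):
echo '
foo\\.c
bar\\.h
quux\\.txt
' | xargs bash -c '
IFS="|";
find "$PWD" -regextype posix-extended -type f -regex "^.*/($*)$" -print
' arg0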
