Dynamic find in shell scripting - shell

Here is the problem I'm working on.
I have a bunch of files in different directories, all under one root dir. I have to scan all these folders and pick up the files in them while avoiding some specific sub-dirs. The folders that I have to avoid will come from a config file. For this purpose I prepared this find command (with some help):
find ${ROOT}/au* -type f -not \( -path '*auction123/incoming*' -o -path '*autobase/incoming*' \)
Here ROOT will also come from the config file. Can I also provide a placeholder for this part,
-path '*auction123/incoming*' -o -path '*autobase/incoming*'
so that I can add any number of folders to be ignored in the find? I have read in some places that find with grep is a better option.

You can do:
find ${ROOT}/au* -type f | grep -v -f files_containing_list_of_ignore_directories
Where files_containing_list_of_ignore_directories lists each directory you wish to ignore on a separate line (without the * wildcards).
(This is what files_containing_list_of_ignore_directories should look like:)
auction123/incoming
autobase/incoming
...
Explanation:
find ${ROOT}/au* -type f: recursively find all files under any directories that match au* in ${ROOT}
| is the pipe operator in shell: it means "take whatever is output to stdout from the previous command (in this case, find) and feed that output to stdin of the next command (in this case, grep)
grep: invoke grep, a tool used for searching
-v: Use the --invert-match option for grep. Basically it means to "filter out" anything that is matched.
-f files_containing_list_of_ignore_directories: Supply a file that holds the list of patterns you want grep to match (and filter out in this case).
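If both ROOT and the ignore list come from a config file, wiring everything together could look like the following minimal sketch (the file names scan.conf and ignore_dirs.txt are assumptions for illustration, not part of the original question):
#!/bin/bash
# scan.conf is assumed to contain a line like: ROOT=/data/root
# ignore_dirs.txt holds one directory fragment per line, e.g. auction123/incoming
source scan.conf
# -F makes grep treat each line of the ignore file as a literal string,
# so dots in the paths are not interpreted as regex metacharacters
find "${ROOT}"/au* -type f | grep -vF -f ignore_dirs.txt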

Related

How to delete all files in a dir except ones with a certain pattern in their name?

I have a lot of kernel .deb files from my custom kernels. I would like to write a bash script that would delete all the old files except the ones associated with the currently installed kernel version. My script:
#!/bin/bash
version='uname -r'
$version
dir=~/Installed-kernels
ls | grep -v '$version*' | xargs rm
Unfortunately, this deletes all files in the dir.
How can I get the currently installed kernel version and set said version as a parameter? Each .deb I want to keep contains the kernel version (5.18.8) but has other strings in its name (linux-headers-5.18.8_5.18.8_amd64.deb).
Edit: I am only deleting .deb files inside the noted directory. The current list of file names in the tree are
linux-headers-5.18.8-lz-xan1_5.18.8-lz-1_amd64.deb
linux-libc-dev_5.18.8-lz-1_amd64.deb
linux-image-5.18.8-lz-xan1_5.18.8-lz-1_amd64.deb
This can be done as a one-liner, though I've preserved your variables:
#!/bin/bash
version="$(uname -r)"
dir="$HOME/Installed-kernels"
find "$dir" -maxdepth 1 -type f -not -name "*$version*" -print0 |xargs -0 rm
To set a variable to the output of a command, you need either $(…) or `…`, ideally wrapped in double-quotes to preserve spacing. A tilde isn't always interpreted correctly when passed through variables, so I expanded that out to $HOME.
The find command is much safer to parse than the output of ls, plus it lets you better filter things. In this case, -maxdepth 1 will look at just that directory (no recursion), -type f seeks only files, and -not -name "*$version*" removes paths or filenames that match the kernel version (which is a glob, not a regex—you'd otherwise have to escape the dots). Also note those quotes; we want find to see the asterisks, and without the quotes, the shell will expand the glob prematurely. The -print0 and corresponding -0 ensure that you preserve spacing by delimiting entries with null characters.
You can remove the prompts regarding read-only files with rm -f.
If you also want to delete directories, remove the -type f part and add -r to the end of that final line.
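Before wiring anything up to rm, it is worth previewing what would be deleted; dropping the xargs stage turns the command into a harmless listing (a quick sanity-check sketch):
find "$dir" -maxdepth 1 -type f -not -name "*$version*" -print
Only once that output looks right would you put the -print0 | xargs -0 rm part back.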

Check if file is in a folder with a certain name before proceeding

So, I have this simple script which converts videos in a folder into a format which the R4DS can play.
#!/bin/bash
scr='/home/user/dpgv4/dpgv4.py';mkdir -p 'DPG_DS'
find '../Exports' -name "*1080pnornmain.mp4" -exec python3 "$scr" {} \;
The problem is, some of the videos are invalid and won't play, and I've moved those videos to a different directory inside the Exports folder. What I want to do is check to make sure the files are in a folder called new before running the python script on them, preferably within the find command. The path should look something like this:
../Exports/(anything here)/new/*1080pnornmain.mp4
Please note that the (anything here) text does not indicate a single directory; it could be something like foo/bar, foo/b/ar, f/o/o/b/a/r, etc.
You cannot use -name because the search is on the path now. My first solution was:
find ./Exports -path '**/new/*1080pnornmain.mp4' -exec python3 "$scr" {} \;
But, as @dan pointed out in the comments, it is wrong because it uses the globstar wildcard (**) unnecessarily:
This checks if /new/ is somewhere in the preceding path, it doesn't have to be a direct parent.
So, the star is not enough here. Another possibility, using find only, could be this one:
find ./Exports -regex '.*/new/[^\/]*1080pnornmain.mp4' -exec python3 "$scr" {} \;
This regex matches:
any number of nested folders before new with .*/new
any character (except / to leave out further subpaths) + your filename with [^\/]*1080pnornmain.mp4
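A quick way to see the difference is a throwaway fixture in a scratch directory (hypothetical names, for illustration only):
mkdir -p Exports/a/new/x
touch Exports/a/new/clip1080pnornmain.mp4 Exports/a/new/x/deep1080pnornmain.mp4
find ./Exports -path '*/new/*1080pnornmain.mp4'            # matches both files
find ./Exports -regex '.*/new/[^\/]*1080pnornmain.mp4'     # matches only the direct child of new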
Performance could degrade given that it uses regular expressions.
Generally, instead of using find's -exec action with \;, which spawns one process per file, you can pipe find's output to xargs; with null-delimited output this also copes with filenames containing spaces or newlines:
find ./Exports -regex '.*/new/[^\/]*1080pnornmain.mp4' -print0 | xargs -0 -I '{}' python3 "$scr" '{}'
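If the Python script can accept several filenames per invocation (an assumption about dpgv4.py), find's own batching form avoids the pipe entirely:
find ./Exports -regex '.*/new/[^\/]*1080pnornmain.mp4' -exec python3 "$scr" {} +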

Find Command Exclude Hidden files when using empty flag

I am looking for a way to use the find command to tell if a folder has no files in it. I have tried using the -empty flag, but since I am on macOS, the system files the OS places in the directory, such as .DS_Store, cause find to not consider the directory empty. I have tried telling find to ignore .DS_Store, but it still considers the directory non-empty because that file is present.
Is there a way to have find exclude certain files from what it considers -empty? Also is there a way to have find return a list of directories with no visible files?
The -empty predicate is rather simple: it's true for a directory only if it has no entries other than . or ...
Kind of an ugly solution, but you can use -exec to run another find in each directory which will implement your criteria for deciding what directories you want to include.
Below:
the outer find will execute sh -c for each directory in /starting/point
sh will execute another find with different criteria.
the inner find will print the first match and then quit
read will consume the output (if any) of the inner find. read will have an exit status of 0 only if the inner find printed at least one line, non-zero otherwise
if there was no output from the inner find, the outer find's -exec predicate will evaluate to false
since -exec is followed by -o, the following -print action will be executed only for those directories which do not match the inner find's criteria
find /starting/point \
-type d \( \
-exec sh -c \
'find "$1" -mindepth 1 -maxdepth 1 ! -name ".*" -print -quit | read' \
sh {} \; \
-o -print \
\)
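As a worked example (a hypothetical fixture): given
mkdir -p demo/empty demo/hidden_only demo/full
touch demo/hidden_only/.DS_Store demo/full/file.txt
running the command above with /starting/point replaced by demo prints demo/empty and demo/hidden_only, since neither contains a visible (non-dot) entry, but not demo/full or demo itself.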
Also note that 'find FOLDER -empty' is somewhat tricky as an emptiness test: it prints empty files as well as empty directories, so it will produce matches even when FOLDER contains files, as long as those files are themselves empty.
Maybe not exactly what was asked, but I prefer the brute force approach if I want to avoid a no-match error on using FOLDER/*. In tcsh:
ls -d FOLDER/* >& /dev/null
if !($status) COMMANDS FOLDER/* ...
A variation of this might be usable here (like also using
ls -d FOLDER/.* | wc -l
and drawing the desired conclusions from the combined results).
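For completeness, a rough bash counterpart of the same brute-force idea, using nullglob instead of checking the exit status of ls (a sketch; COMMANDS stands in for whatever you want to run):
shopt -s nullglob              # an unmatched glob expands to nothing instead of itself
files=(FOLDER/*)
if (( ${#files[@]} > 0 )); then
    COMMANDS "${files[@]}"
fi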

Find, unzip and grep the content of multiple files in one step/command

First I asked a question here: Unzip a file and then display it in the console in one step
It works and helped me a lot. (please read)
Now I have a second issue. I do not have a single zipped log file; I have a lot of them in different folders, which I need to find first. The files all have the same name. For example:
/somedir/server1/log.gz
/somedir/server2/log.gz
/somedir/server3/log.gz
and so on...
What I need is a way to:
find all the files like: find /somedir/server* -type f -name log.gz
unzip the files like: gunzip -c log.gz
use grep on the content of the files
Important! The whole thing should be done in one step.
I cannot first store the extracted files in the filesystem because it is a readonly filesystem. I need somehow to connect, with pipes, the output from one command to the input of the next.
Before, the log files were in text format (.txt), so I did not have to unzip them first. In that case it was easy, e.g.:
find /somedir/server* -type f -name log.txt | xargs grep "term"
Now I have to deal with zipped files. That means, after I find the files, I first need to somehow unzip them and then send the contents to grep.
With one file I do:
gunzip -c /somedir/server1/log.gz | grep term
But for multiple files I don't know how to do it. For example, how do I pass the output of find to gunzip and then to grep?
Also, if there is another way / "best practice" to do that, it is welcome :)
find lets you invoke a command on the files it finds:
find /somedir/server* -type f -name log.gz -exec gunzip -c '{}' + | grep ...
From the man page:
-exec command {} +
This variant of the -exec action runs the specified command on
the selected files, but the command line is built by appending
each selected file name at the end; the total number of
invocations of the command will be much less than the number
of matched files. The command line is built in much the same
way that xargs builds its command lines. Only one instance of
{} is allowed within the command, and (when find is being
invoked from a shell) it should be quoted (for example, '{}')
to protect it from interpretation by shells. The command is
executed in the starting directory. If any invocation with
the + form returns a non-zero value as exit status, then
find returns a non-zero exit status. If find encounters an
error, this can sometimes cause an immediate exit, so some
pending commands may not be run at all. This variant of -exec
always returns true.
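As an aside, many systems also ship zgrep, a wrapper that runs grep on compressed files directly; where it is available, the same job can be written as (a sketch):
find /somedir/server* -type f -name log.gz -exec zgrep -H 'term' {} +
The -H makes the matching filename appear next to each hit, which helps when several logs are searched at once.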

bash find command for directories without permission denied errors

I want to find all files with the name java in folders whose path contains /current/jre/bin/, and without the many permission-denied errors.
So I thought find / -type d '*/current/jre/bin/*' 2>/dev/null should do the job.
But it returns nothing. I also tried it without the *, with -wholename (with and without *), with an additional -name, with -name but without -type d, and some other commands.
If I instead search for the java files with find / -name 'java' 2>/dev/null I receive eleven paths, of which I only need three.
Putting '*/current/jre/bin/*' after -type d confuses find: it cannot tell whether the string is another path to search or part of the expression. If you removed the 2>/dev/null you would see the error find: paths must precede expression.
Instead, use a pipe to grep:
find / -name 'java' 2>/dev/null | grep '/current/jre/bin/'
The proper way to say "the path must contain this" is with -path:
find / -type d -path '*/current/jre/bin/*' 2>/dev/null
Specifying a bare string among the predicates is an error, which you would easily have discovered had you not redirected error messages to /dev/null. Even then, the command returning immediately despite supposedly scanning the entire file tree should be a dead giveaway.
Pro tip: also add -xdev to the options, to keep find from descending into other filesystems such as /dev or /proc. If your files are split across multiple partitions, you will then need to specify each partition you want to search in the path list before the predicates.
(The general syntax is find path1 path2 path3 ... -list -of -predicates.)
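Putting it together, here is a sketch that returns the java files themselves rather than their directories, stays on the listed filesystems, and searches two partitions (treating /usr as a separate mount point is an assumption for illustration):
find / /usr -xdev -type f -name java -path '*/current/jre/bin/*' 2>/dev/null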
