How to use find utility with logical operators and post processing - shell

Is there a way to find all directories that contain an executable file whose name matches part of the parent directory's name?
Situation
/distribution/software_a_v1.0.0/software_a
/distribution/software_a_v1.0.1/software_a
/distribution/software_a_v1.0.2/config.cfg
I need result
/distribution/software_a_v1.0.0/software_a
/distribution/software_a_v1.0.1/software_a
I've gotten only so far
find /distribution -maxdepth 1 -type d #and at depth 2 -type f -perm /u=x and binary name matches directory name, minus version

Another way using awk:
find /path -type f -perm -u=x -print | awk -F/ '{ rec=$0; sub(/_v[0-9].*$/,"",$(NF-1)); if( $NF == $(NF-1) ) print rec }'
The awk part is based on your sample and stated condition ... name matches directory name, minus version. Modify it if needed.

I would use grep:
find /distribution -mindepth 2 -maxdepth 2 -type f -perm -u=x | grep -E '^/distribution/(software_[[:alnum:]]+)_v[0-9]+\.[0-9]+\.[0-9]+/\1$'

I don't know if this is the most efficient, but here's one way you could do it, using just bash...
for f in /distribution/*/*
do
if [[ -f "${f}" && -x "${f}" ]] # it's a file and executable
then
b="${f##*/}" # get just the filename
[[ "${f}" == /distribution/"${b}"*/"${b}" ]] && echo "${f}"
fi
done
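The "directory name minus version" condition can also be tested directly with parameter expansion. The following is a minimal sketch, not taken from any of the answers: the function name `matching_executables` is hypothetical, and it assumes the version suffix always starts with `_v` followed by a digit, as in the sample paths.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print executables two levels below ROOT whose name
# equals the parent directory name once a trailing "_v<digit>..." suffix
# is stripped (assumption based on the sample layout in the question).
matching_executables() {
    local root f dir
    root=$1
    for f in "$root"/*/*; do
        [ -f "$f" ] && [ -x "$f" ] || continue
        dir=${f%/*}      # path of the parent directory
        dir=${dir##*/}   # name of the parent directory
        if [ "${dir%_v[0-9]*}" = "${f##*/}" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

Calling `matching_executables /distribution` would then list only the files whose names match their versioned parent directory.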

Related

Select parent directory if non-unique directory is found

Hello I am trying to figure out how I can parse directories using built-in bash functionality.
The directory structure would look something like.
/home/mikal/PluginSDK/vendor_name1/ver1/plugin_name/plugin-config.json
/home/mikal/PluginSDK/vendor_name1/ver2/plugin_name/plugin-config.json
/home/mikal/PluginSDK/vendor_name2/ver1/plugin_name/plugin-config.json
/home/mikal/PluginSDK/vendor_name3/plugin_name/plugin-config.json
So far I have narrowed down to the name of the plugin which covers most of what I needed for the rest of the script.
find /home/mikal/PluginSDK -type f -name plugin-config.json | sed -r 's|/[^/]+$||' | awk -F "/" '{print $NF}'
The problem that I am running into is when the same vendor has different versions of a plugin available for the same release. We may not always want to run a newer version of the plugin due to compatibility or performance, so having these show as something like ver1-plugin_name would be preferable. I can't find anything that would pick out the non-unique plugin/version so that I can make an array with all of the options.
This is the entirety of what I have written right now for this section of the script I am writing to make configuration changes to the system.
options=()
while IFS= read -r line; do
options+=( "$line" )
done < <( find /home/mikal/PluginSDK -type f -name plugin-config.json | sed -r 's|/[^/]+$||' | awk -F "/" '{print $NF}' )
select opt_number in "${options[@]}" "Quit";
do
if [[ $opt_number == "Quit" ]];
then
echo "Quitting"
break;
else
find /home/mikal/PluginSDK -type f -name plugin-config.json -exec sed -i 's/"preferred": true/"preferred": false/g' {} +
find "/home/mikal/PluginSDK/${options[$((REPLY-1))]}" -type f -name plugin-config.json -exec sed -i 's/"preferred": false/"preferred": true/g' {} +
break;
fi
done
Desired output for the entire thing would be something like.
1.) Ver1-Plugin_name
2.) Ver2-Plugin_name
3.) Plugin_name
4.) Plugin_name
5.) Quit
I apologize if my formatting is bad. First time posting.
Maybe
lst=( Quit
$( find /home/mikal/PluginSDK -type f -name plugin-config.json |
awk -F/ '{ if (7==NF) { print $6 } else { print $6"-"$7 } }' ) )
select opt_number in "${lst[@]}"
. . .
You might want to see BashFAQ 20 if your filenames could contain anything unusual, such as embedded spaces.
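The awk answer above hard-codes field numbers that depend on the depth of /home/mikal/PluginSDK. A sketch that works relative to whatever root is passed in might look like this; the function name `plugin_labels` is hypothetical, and it assumes the layout root/vendor/[version/]plugin/plugin-config.json from the question and filenames without embedded newlines.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one menu label per plugin-config.json found
# under ROOT, in the form "version-plugin" or just "plugin".
plugin_labels() {
    local root path rel
    root=${1%/}
    find "$root" -type f -name plugin-config.json |
    while IFS= read -r path; do
        rel=${path#"$root"/}             # vendor/[version/]plugin/plugin-config.json
        rel=${rel%/plugin-config.json}   # vendor/[version/]plugin
        rel=${rel#*/}                    # [version/]plugin
        printf '%s\n' "$rel" | tr '/' '-'
    done | LC_ALL=C sort
}
```

In bash the labels could then be loaded with `mapfile -t options < <(plugin_labels /home/mikal/PluginSDK)` before the `select` loop. Note that the labels drop the vendor name, so mapping a choice back to a path would need extra bookkeeping.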

How to get list of certain strings in a list of files using bash?

The title is maybe not really descriptive, but I couldn't find a more concise way to describe the problem.
I have a directory containing different files which have a name that e.g. looks like this:
{some text}2019Q2{some text}.pdf
So the filenames have somewhere in the name a year followed by a capital Q and then another number. The other text can be anything, but it won't contain anything matching the format year-Q-number. There will also be no numbers directly before or after this format.
I can work something out to get this from one filename, but I actually need a 'list' so I can do a for-loop over this in bash.
So, if my directory contains the files:
costumerA_2019Q2_something.pdf
costumerB_2019Q2_something.pdf
costumerA_2019Q3_something.pdf
costumerB_2019Q3_something.pdf
costumerC_2019Q3_something.pdf
costumerA_2020Q1_something.pdf
costumerD2020Q2something.pdf
I want a for loop that goes over 2019Q2, 2019Q3, 2020Q1, and 2020Q2.
EDIT:
This is what I have so far. It extracts the substrings, but the output still contains duplicates. Since I'm already inside the loop, I don't see how to remove them.
find original/*.pdf -type f -print0 | while IFS= read -r -d '' line; do
echo $line | grep -oP '[0-9]{4}Q[0-9]'
done
# list all filenames that end with .pdf from the folder original
find original -maxdepth 1 -name '*.pdf' -type f -printf "%f\n" |
# extract the pattern
sed -E 's/.*([0-9]{4}Q[0-9]).*/\1/' |
# iterate
while IFS= read -r file; do
echo "$file"
done
I used -printf "%f\n" to print just the filename instead of the full path. GNU sed has a -z option that you can use with -print0 (or -printf "%f\0").
Given how you wanted to do this, if your filenames contain no newlines, there is no need to loop over the list in bash (as a rule of thumb, try to avoid while read line; it's very slow):
find original -maxdepth 1 -name '*.pdf' -type f | grep -oP '[0-9]{4}Q[0-9]'
or with a zero-separated stream:
find original -maxdepth 1 -name '*.pdf' -type f -print0 |
grep -zoP '[0-9]{4}Q[0-9]' | tr '\0' '\n'
If you want to remove duplicate elements from the list, pipe it to sort -u.
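Putting the pieces together, the deduplicated list can feed the for loop the question asks for. This is a minimal sketch assuming GNU find (for -printf) and filenames without newlines; the function name `unique_quarters` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each distinct <year>Q<n> token found in the
# names of the .pdf files directly inside DIR, one per line, sorted.
unique_quarters() {
    # %f prints only the basename, so digits in DIR itself are never matched
    find "$1" -maxdepth 1 -type f -name '*.pdf' -printf '%f\n' |
        grep -oE '[0-9]{4}Q[0-9]' |
        LC_ALL=C sort -u
}

# Usage (the tokens contain no whitespace, so word splitting is safe here):
#   for q in $(unique_quarters original); do echo "processing $q"; done
```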
Try this, in bash:
~ > $ ls
costumerA_2019Q2_something.pdf costumerB_2019Q2_something.pdf
costumerA_2019Q3_something.pdf other.pdf
costumerA_2020Q1_something.pdf someother.file.txt
~ > $ for x in `(ls)`; do [[ ${x} =~ [0-9]Q[1-4] ]] && echo $x; done;
costumerA_2019Q2_something.pdf
costumerA_2019Q3_something.pdf
costumerA_2020Q1_something.pdf
costumerB_2019Q2_something.pdf
~ > $ (for x in *; do [[ ${x} =~ ([0-9]{4}Q[1-4]).+pdf ]] && echo ${BASH_REMATCH[1]}; done;) | sort -u
2019Q2
2019Q3
2020Q1

Print the content of all the files in the newest directory in BASH [duplicate]

Is there any sort option available in find command to get directory with least access date/time
find . -type d -printf "%A@ %p\n" | sort -n | tail -n 1 | cut -d " " -f 2-
If you prefer the filename without leading path, replace %p by %f.
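The same sort-by-timestamp pipeline can be wrapped in a small function. The sketch below is an illustration rather than the answer's exact command: the name `newest_dir` is hypothetical, it requires GNU find, it only looks at immediate subdirectories, and it uses modification time (%T@) rather than access time, since access times can be changed by the act of reading.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the immediate subdirectory of DIR with the
# most recent modification time (GNU find required for -printf).
newest_dir() {
    find "$1" -mindepth 1 -maxdepth 1 -type d -printf '%T@ %p\n' |
        sort -n |               # oldest first, newest last
        tail -n 1 |             # keep the newest entry
        cut -d ' ' -f 2-        # drop the timestamp, keep the path
}
```

Swapping `%T@` for `%A@` would rank by access time instead, and replacing `tail -n 1` with `head -n 1` would return the oldest directory.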
The Linux command below displays the access and modification times along with the size:
stat -f
find -type d -printf '%T+ %p\n' | sort | head -1
source
find -type d -printf '%T+ %p\n' | sort
This sounds like more of a job for ls:
ls -ultd *|grep ^d
The problem with using find, at least on my system (cygwin/bash), is that find accesses the dirs, so all access-times result in current time, defeating your apparent purpose.
A simple shell script will also do:
unset -v oldest
for i in "$dir"/*; do
[ "$i" -ot "$oldest" -o "$oldest" = "" ] && oldest="$i"
done
note: to find the oldest directory use "$dir"/*/ above (thanks Cyrus) and -type d below with the find command.
In bash if you need a recursive solution, then you can rewrite it as a while loop with process substitution using find
unset -v oldest
while IFS= read -r i; do
[ "$i" -ot "$oldest" -o "$oldest" = "" ] && oldest="$i"
done < <(find "$dir" -type f)

How to make this script grep only the 1st line

for i in USER; do
find /home/$i/public_html/ -type f -iname '*.php' \
| xargs grep -A1 -l 'GLOBALS\|preg_replace\|array_diff_ukey\|gzuncompress\|gzinflate\|post_var\|sF=\|qV=\|_REQUEST'
done
It's ignoring the -A1. The end result I want is for it to show me files that contain any of the matching words, but only on the first line of the script. If there is a better, more efficient, less resource-intensive way, that would be great as well, as this will be run on very large shared servers.
Use awk instead:
for i in USER; do
find /home/$i/public_html/ -type f -iname '*.php' -exec \
awk 'FNR == 1 && /GLOBALS|preg_replace|array_diff_ukey|gzuncompress|gzinflate|post_var|sF=|qV=|_REQUEST/ { print FILENAME }' {} +
done
This will print the current input file if the first line matches. It's not ideal, since it will read all of each file. If your version of awk supports it, you can use
awk '/GLOBALS|.../ { print FILENAME } {nextfile}'
The nextfile command will execute for the first line, effectively skipping the rest of the file after awk tests if it matches the regular expression.
The following code is untested:
for i in USER; do
find /home/$i/public_html/ -type f -iname '*.php' | while read -r; do
head -n1 "$REPLY" | grep -q 'GLOBALS\|preg_replace\|array_diff_ukey\|gzuncompress\|gzinflate\|post_var\|sF=\|qV=\|_REQUEST' \
&& echo "$REPLY"
done
done
The idea is to loop over each find result, explicitly test the first line, and print the filename if a match was found. I don't like it though because it feels so clunky.
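A variation on the same idea lets find hand the files to a small inline script in batches, so only the first line of each file is read and the find output is never parsed line by line. This is a sketch, not one of the posted answers: the function name `first_line_matches` is hypothetical, and the pattern is an extended regular expression (so alternatives are separated by `|` rather than `\|`).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every *.php file under ROOT whose first line
# matches the extended regular expression PATTERN.
first_line_matches() {
    local pattern root
    pattern=$1
    root=$2
    find "$root" -type f -iname '*.php' -exec sh -c '
        pattern=$1
        shift
        for f do
            # head stops after one line, so the rest of the file is not read
            if head -n 1 "$f" | grep -Eq -- "$pattern"; then
                printf "%s\n" "$f"
            fi
        done' sh "$pattern" {} +
}
```

For example, `first_line_matches 'GLOBALS|preg_replace|gzinflate' /home/USER/public_html` would list the matching files.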
for j in $(find /home/$i/public_html/ -type f -iname '*.php');
do result=$(head -n 1 "$j" | grep "$stuff");
[[ "x$result" != x ]] && echo "$j: $result";
done
You'll need a little more effort to skip leading blank lines. fgrep will save resources.
A little perl would bring great improvement, but it's hard to type it on a phone.
Edit:
On a less cramped keyboard, inserted less brief solution.

copy list of filenames in a textfile in bash

I need to copy a list of filenames into a text file. I am trying this:
#!/bin/sh
mjdstart=55133
mjdend=56674
datadir=/nfs/m/ir1/ssc/evt
hz="h"
for mjd in $(seq $mjdstart $mjdend); do
find $datadir/ssc"${hz}"_allcl_mjd"${mjd}".evt -maxdepth 1 -type f -printf $datadir'/%f\n' > ssc"${hz}".list
done
I tried also:
find $datadir/ssc"${hz}"_allcl_mjd"${mjd}".evt -maxdepth 1 -type f -printf $datadir'/%f\n' | split -l999 -d - ssc"${hz}".list
Or other combinations, but clearly I am missing something: the textfile is empty. Where is my mistake?
Use >> (append) instead of > (overwrite) otherwise you will have output of last command only:
> ssc"${hz}".list
for mjd in $(seq $mjdstart $mjdend); do
find $datadir/ssc"${hz}"_allcl_mjd"${mjd}".evt -maxdepth 1 -type f -printf $datadir'/%f\n' >> ssc"${hz}".list
done
You don't need to use find here, as you simply have a range of specific file names whose existence you are checking for:
#!/bin/sh
mjdstart=55133
mjdend=56674
datadir=/nfs/m/ir1/ssc/evt
hz="h"
for mjd in $(seq $mjdstart $mjdend); do
fname="$datadir/ssc${hz}_allcl_mjd${mjd}.evt"
[ -f "$fname" ] && printf '%s\n' "$fname"
done > "ssc$hz.list"
You are using find incorrectly. The first argument is the directory in which it should search. Also, using > overwrites your list file on every iteration. Use >> to append:
find $datadir -maxdepth 1 -type f -name "ssc${hz}_allcl_mjd${mjd}.evt" >> ssc"${hz}".list
