How to find basename of path via pipe - bash

This doesn't work:
find "$all_locks" -mindepth 1 -maxdepth 1 -type d | basename
Apparently basename cannot read from stdin; in any case, basename requires at least one argument.

To apply a command to every result of a piped operation, xargs is your friend. As it says on the man page I linked...
xargs reads items from the standard input, delimited by blanks (which
can be protected with double or single quotes or a backslash) or
newlines, and executes the command (default is /bin/echo) one or more
times with any initial-arguments followed by items read from standard
input.
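In this case that means it will take each result from your find command and run basename <find result> ad nauseam, until find has completed its search. Because basename treats a second argument as a suffix to strip rather than as another path, tell xargs to pass one path per invocation with -n 1. I believe what you want is going to look a lot like this:
find "$all_locks" -mindepth 1 -maxdepth 1 -type d | xargs -n 1 basename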
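A minimal sketch of the xargs approach on a throwaway directory (the names "lock a" and "lock_b" are made up for illustration). It assumes a find and xargs that support -print0 and -0, as the GNU and BSD versions do, and it passes -n 1 because basename treats a second operand as a suffix rather than as another path:

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory standing in for "$all_locks".
all_locks=$(mktemp -d)
mkdir "$all_locks/lock a" "$all_locks/lock_b"

# -print0 / -0 keep names containing spaces intact;
# -n 1 makes xargs run basename once per path.
names=$(find "$all_locks" -mindepth 1 -maxdepth 1 -type d -print0 |
  xargs -0 -n 1 basename | LC_ALL=C sort)
printf '%s\n' "$names"

rm -rf "$all_locks"
```

Without -print0 and -0, the directory "lock a" would be split at the space and reach basename as two separate words.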

Since -mindepth and -maxdepth are GNU extensions, using another one such as -printf will not make it less portable.
find "$all_locks" -mindepth 1 -maxdepth 1 -type d -printf '%f\n'

The problem here is that basename doesn't read from stdin, so a plain pipe is not useful. I would modify your command a little. Let me know if it serves the purpose.
find "$all_locks" -mindepth 1 -maxdepth 1 -type d -exec basename {} \;

Related

Script to find recursively the number of files with a certain extension

We have a highly nested directory structure, where we have a directory, let's call it 'my Dir', appearing many times in our hierarchy. I am interested in counting the number of "*.csv" files in all directories named 'my Dir' (yes, there is a whitespace in the name). How can I go about it?
I tried something like this, but it does not work:
find . -type d -name "my Dir" -exec ls "{}/*.csv" \; | wc -l
If you want to count the number of files matching the pattern '*.csv' under "my Dir", then:
don't ask for -type d; ask for -type f
don't ask for -name "my Dir" if you really want -name '*.csv'
don't try to ls *.csv on each match, because if there are N csv files in a directory, you would potentially count each one N times
also beware of embedding {} in -exec code!
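If a per-directory command is needed anyway, one hedged sketch of the safer pattern is to hand {} to a small sh -c script as a positional argument instead of splicing it into the script text. The attempt in the question cannot work as written because -exec does not run a shell, so the quoted "{}/*.csv" is never expanded as a glob. The directory layout below is invented for the demonstration:

```shell
#!/usr/bin/env bash
# Hypothetical layout: two directories named "my Dir" at different depths.
top=$(mktemp -d)
mkdir -p "$top/a/my Dir" "$top/b/c/my Dir"
touch "$top/a/my Dir/one.csv" "$top/b/c/my Dir/two.csv" "$top/b/c/my Dir/three.csv"

# The directory arrives in the inner shell as "$1", so odd characters in its
# name are never parsed as shell code; the glob is expanded by that inner shell.
per_dir=$(find "$top" -type d -name 'my Dir' \
  -exec sh -c 'ls "$1"/*.csv' sh {} \; | wc -l)
echo "csv files listed: $per_dir"

rm -rf "$top"
```

A "my Dir" directory without any csv file would make ls print an error here, and file names containing newlines would still be over-counted by wc -l.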
For counting files from find, I like to use a trick I learned from Stéphane Chazelas on U&L; for example, from: Counting files in Linux:
find "my Dir" -type f -name '*.csv' -printf . | wc -c
This requires GNU find, as -printf is a GNU extension to the POSIX standard.
It works by looking within "my Dir" (from the current working directory) for files that match the pattern; for each matching file, it prints a single dot (period); that's all piped to wc, which counts the number of characters (periods) that find produced, i.e. the number of matching files.
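To see why counting dots is more robust than counting lines, here is a small sketch on an invented directory containing one file whose name includes a newline. It assumes GNU find for -printf:

```shell
#!/usr/bin/env bash
top=$(mktemp -d)
mkdir -p "$top/my Dir"
nl='
'
touch "$top/my Dir/plain.csv" "$top/my Dir/odd${nl}name.csv" "$top/my Dir/skip.txt"

# One dot per matching file, regardless of what the name contains.
dots=$(find "$top/my Dir" -type f -name '*.csv' -printf . | wc -c)

# Counting lines of printed names over-counts the name with a newline.
lines=$(find "$top/my Dir" -type f -name '*.csv' | wc -l)

echo "dots=$dots lines=$lines"
rm -rf "$top"
```

There are two csv files, yet the line-based count reports three because one name spans two lines.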
You would exclude all paths that are not under 'my Dir':
find . -type f -not '(' -not -path '*/my Dir/*' -prune ')' -name '*.csv'
Another solution is to use the -path predicate to select your files.
find . -path '*/my Dir/*.csv'
Counting the number of occurrences could be a simple matter of piping to wc -l, though this will obviously produce the wrong result if some of the files contain newlines in their names. (This is slightly pathological, but definitely something you want to cover in production code.) A common arrangement is to just print a newline for every found file, instead of its name.
find . -path '*/my Dir/*.csv' -printf '.\n' | wc -l
(The -printf predicate is not in POSIX but it's not hard to replace with an -exec or similar.)
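One possible -exec replacement, offered as a sketch rather than as the answerer's own wording: run the external printf utility once per match. This avoids -printf but starts a process per file, so it is slower on large trees. The paths are invented for the example:

```shell
#!/usr/bin/env bash
top=$(mktemp -d)
mkdir -p "$top/x/my Dir" "$top/y/other"
touch "$top/x/my Dir/a.csv" "$top/x/my Dir/b.csv" "$top/y/other/c.csv"

# -exec needs no {} here: each match simply triggers one printf call,
# which writes a dot and a newline.
count=$(find "$top" -path '*/my Dir/*.csv' -exec printf '.\n' \; | wc -l)
echo "$count"

rm -rf "$top"
```

Because the file names themselves are never printed, newlines inside names cannot distort the count.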

How do I use find command with pipe in bash?

The directory structure looks like
home
--dir1_foo
----subdirectory.....
--dir2_foo
--dir3_foo
--dir4_bar
--dir5_bar
I'm trying to use 'find' command to get directories containing specific strings first, (in this case 'foo'), then use 'find' command again to retrieve some directories matching conditions.
So, I first tried
#!/bin/bash
for dir in `find ./ -type d -name "*foo*" `;
do
for subdir in `find $dir -mindepth 2 -type d `;
do
[Do some jobs]
done
done
This script works fine.
Then I thought that using only one loop with a pipe, like below, would also work, but it does not:
#!/bin/bash
for dir in `find ./ -type d -name "*foo*" | find -mindepth 2 -type d `;
do
[Do some jobs]
done
and actually this script works the same as
for dir in `find -mindepth 2 -type d`;
do
[Do some jobs]
done
which means that the first find command is ignored.
What is the problem?
What your script is doing is not good practice and has a lot of potential pitfalls. See BashFAQ: Why you don't read lines with "for" to understand why.
You can use xargs with -0 to read null-delimited names and run another find command on each, without needing the for loop:
find ./ -type d -name "*foo*" -print0 | xargs -0 -I{.} find {.} -mindepth 2 -type d
The string following -I in xargs acts as a placeholder: each item received from the previous pipeline stage is substituted for it in the next command. The -print0 option is a GNU extension (also available in BSD find) and is a safe way to handle file or directory names containing spaces or other shell metacharacters.
So with the above command in place, if you want to act on the output of the second command, use process substitution with a while loop:
while IFS= read -r -d '' f; do
echo "$f"
# Your other actions can be done on "$f" here
done < <(find ./ -type d -name "*foo*" -print0 | xargs -0 -I{.} find {.} -mindepth 2 -type d -print0)
As for the reason your find pipeline doesn't work: the second find never reads the first find's output, because find does not read from stdin. You need either xargs or -execdir, though the latter is not an option I would recommend.
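A short sketch of both points on an invented tree: find never reads standard input, so piping into it changes nothing, whereas xargs turns each piped name into a starting point for the second find. The directory names mirror the question but are otherwise made up:

```shell
#!/usr/bin/env bash
home=$(mktemp -d)   # stand-in for the "home" directory in the question
mkdir -p "$home/dir1_foo/sub/deep" "$home/dir4_bar/sub/deep"

# Piping into find has no effect: both commands print the same list.
piped=$(cd "$home" && printf 'ignored\n' | find . -mindepth 2 -type d | LC_ALL=C sort)
plain=$(cd "$home" && find . -mindepth 2 -type d | LC_ALL=C sort)

# With xargs, only directories below the *foo* matches are searched.
chained=$(cd "$home" && find . -mindepth 1 -type d -name '*foo*' -print0 |
  xargs -0 -I{} find {} -mindepth 2 -type d)
echo "$chained"

rm -rf "$home"
```

Here chained holds only the directory under dir1_foo, while piped and plain both also include the one under dir4_bar.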

Building up a command string for find

I'm trying to parse the Android source directory and I need to extract all the directory names, excluding certain patterns. As you can see below, for now I have included only one directory in the exclude list, but I will be adding more.
The find command doesn't exclude the directory named 'docs'.
The commented-out line works, but the other one doesn't. For easy debugging, I included -mindepth and -maxdepth, which I will remove later.
Any comments or hints on why it doesn't work?
#! /bin/bash
ANDROID_PATH=$1
root=/
EXCLUDES=( doc )
cd ${root}
for dir in "${EXCLUDES[@]}"; do
exclude_name_cmd_string=${exclude_name_cmd_string}$(echo \
"-not -name \"${dir}*\" -prune")
done
echo -e ${exclude_name_cmd_string}
custom_find_cmd=$(find ${ANDROID_PATH} -mindepth 1 -maxdepth 1 \
${exclude_name_cmd_string} -type d)
#custom_find_cmd=$(find ${ANDROID_PATH} -mindepth 1 -maxdepth 1 \
# -not -name "doc*" -prune -type d)
echo ${custom_find_cmd}
Building up a command string with possibly-quoted arguments is a bad idea. You get into nested quoting levels and eval and a bunch of other dangerous/confusing syntactic stuff.
Use an array to build the find; you've already got the EXCLUDES in one.
Also, the repeated -not and -prune seems weird to me. I would write your command as something like this:
excludes=()
for dir in "${EXCLUDES[@]}"; do
excludes+=(-name "${dir}*" -prune -o)
done
find "${ANDROID_PATH}" -mindepth 1 -maxdepth 1 "${excludes[@]}" -type d -print
The upshot is, you want the argument to -name to be passed to find as a literal wildcard that find will expand, not a list of files returned by the shell's expansion, nor a string containing literal quotation marks. This is very hard to do if you try to build the command as a string, but trivial if you use an array.
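The array approach can be tried end to end on a scratch directory. This is a sketch under assumptions: the directory names and the second exclude prefix (out) are invented, and src stands in for "$ANDROID_PATH":

```shell
#!/usr/bin/env bash
src=$(mktemp -d)    # stand-in for "$ANDROID_PATH"
mkdir "$src/docs" "$src/out" "$src/build" "$src/frameworks"

EXCLUDES=( doc out )   # hypothetical list of name prefixes to skip
excludes=()
for dir in "${EXCLUDES[@]}"; do
  # Each element stays a separate word; "doc*" reaches find unexpanded.
  excludes+=( -name "${dir}*" -prune -o )
done

kept=$(cd "$src" && find . -mindepth 1 -maxdepth 1 \
  "${excludes[@]}" -type d -print | LC_ALL=C sort)
printf '%s\n' "$kept"

rm -rf "$src"
```

Adding another prefix to EXCLUDES extends the find expression without touching any quoting.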
Friends don't let friends build shell commands as strings.
When I run your script (named fin.sh) as:
bash -x fin.sh $HOME/tmp
one of the lines of trace output is:
find /Users/jleffler/tmp -mindepth 1 -maxdepth 1 -not -name '"doc*"' -prune -type d
Do you see the single quotes around the double quotes? That's bash trying to be helpful. I'm guessing that your "doesn't work" problem is that you still get directories under doc* included in the output; other than that, it seems to work for me.
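The effect in that trace can be reproduced without find. In this sketch the string and variable names are illustrative; it shows that quotes stored inside a variable are ordinary characters after expansion, not shell syntax:

```shell
#!/usr/bin/env bash
cmd_string='-not -name "doc*" -prune'

set -f                 # turn off globbing so doc* is not expanded in this demo
set -- $cmd_string     # unquoted on purpose: word splitting only
set +f

# The third word still carries its double quotes as literal characters.
third=$3
printf 'third word is <%s>\n' "$third"
```

find therefore receives a pattern that begins and ends with a double quote character, which no directory name here matches.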
How to fix that?
...it seems you've found a way to fix that...I'm not sure I'd trust it with a Bourne shell (but the Korn shell seems to agree with Bash), but it looks like it might work with Bash. I'm pretty sure this is something that changed during the last 30 years or so, but it is hard to prove that; getting hands on the old code is not easy.
I also wonder whether you need repeated -prune options if you have repeated excluded directories; I'm not sufficiently familiar with -prune to be sure.
Found the problem. It's with the escape sequence in the exclude_name_cmd_string.
Correct syntax should have been
exclude_name_cmd_string=${exclude_name_cmd_string}$(echo \
"-not -name ${dir}* -prune")

Explain how many processes created?

Could someone explain how many processes are created in each case for the commands below? I don't understand it:
The following three commands have roughly the same effect:
rm $(find . -type f -name '*.o')
find . -type f -name '*.o' | xargs rm
find . -type f -name '*.o' -exec rm {} \;
First command: exactly 2 processes, 1 for rm and the other for find.
Second command: 3 or more processes, 1 for find, another for xargs, and one or more for rm. xargs reads standard input, and if it reads more items than can be passed as parameters to a single program (there is a maximum size named ARG_MAX), it runs rm again with the remaining items.
Third command: many processes, 1 for find and one rm for each file ending in .o.
In my opinion, option 2 is the best, because it handles the maximum parameter limit correctly and doesn't spawn too many processes. However, I prefer to use it like this (with GNU find and xargs):
find . -type f -name '*.o' -print0 | xargs -0 rm
This terminates each filename with a \0 instead of a newline, since filenames in UNIX can legally contain newlines. This also handles spaces in filenames (much more common) correctly.
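A rough way to observe the difference, sketched with echo standing in for rm so nothing is deleted. The three .o files are invented; with so few files xargs needs only one invocation:

```shell
#!/usr/bin/env bash
objs=$(mktemp -d)
touch "$objs/a.o" "$objs/b.o" "$objs/c.o"

# echo stands in for rm; every invocation prints exactly one line,
# so counting lines counts how often the command was started.
via_subst=$(echo $(find "$objs" -type f -name '*.o') | wc -l)
via_xargs=$(find "$objs" -type f -name '*.o' -print0 | xargs -0 echo | wc -l)
via_exec=$(find "$objs" -type f -name '*.o' -exec echo {} \; | wc -l)

echo "subst=$via_subst xargs=$via_xargs exec=$via_exec"
rm -rf "$objs"
```

Note that echo in the first form is a shell builtin, so that line counts invocations rather than processes; with rm each invocation would be a separate process.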

Piping find to find

I want to pipe a find result to a new find. What I have is:
find . -iname "2010-06*" -maxdepth 1 -type d | xargs -0 find '{}' -iname "*.jpg"
Expected result: Second find receives a list of folders starting with 2010-06, second find returns a list of jpg's contained within those folders.
Actual result: "find: ./2010-06 New York\n: unknown option"
Oh darn. I have a feeling it concerns the format of the output that the second find receives as input, but my only idea was to suffix -print0 to first find, with no change whatsoever.
Any ideas?
You need 2 things. -print0, and more importantly -I{} on xargs, otherwise the {} doesn't do anything.
find . -iname "2010-06*" -maxdepth 1 -type d -print0 | xargs -0 -I{} find '{}' -iname '*.jpg'
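A sketch of this fix on an invented photo tree, assuming GNU or BSD find and xargs. The folder with a space in its name is the case that broke the original pipeline:

```shell
#!/usr/bin/env bash
photos=$(mktemp -d)
mkdir "$photos/2010-06 New York" "$photos/2010-07 Paris"
touch "$photos/2010-06 New York/a.jpg" "$photos/2010-06 New York/b.JPG" \
      "$photos/2010-07 Paris/c.jpg"

# -print0 and -0 agree on NUL as the separator, so the space in
# "2010-06 New York" survives; -I{} puts each name where find expects a path.
jpgs=$(cd "$photos" && find . -maxdepth 1 -type d -iname '2010-06*' -print0 |
  xargs -0 -I{} find {} -iname '*.jpg' | LC_ALL=C sort)
printf '%s\n' "$jpgs"

rm -rf "$photos"
```

Placing -maxdepth before the other tests also avoids the warning GNU find prints when an option follows a test.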
Useless use of xargs.
find 2010-06* -iname "*.jpg"
At least GNU find accepts multiple paths to search in. -maxdepth and -type d are implicitly assumed.
How about
find . -iwholename "./2010-06*/*.jpg"
etc?
Although you did say that you specifically want this find + pipe problem to work, it's inefficient to fork an extra find command. Since you are specifying -maxdepth as 1, you are not traversing subdirectories. So just use a for loop with shell expansion.
for file in *2010-06*/*.jpg
do
echo "$file"
done
If you want to find all jpg files inside each of the 2010-06* folders recursively, there is also no need to use multiple finds or xargs
for directory in 2010-06*/
do
find "$directory" -iname "*.jpg" -type f
done
Or just
find 2010-06* -type f -iname "*.jpg"
Or even better, if you have bash 4 and above
shopt -s globstar
shopt -s nullglob
for file in 2010-06*/**/*.jpg
do
echo "$file"
done
