An alias in .bashrc fails vs command line succeeds - bash

I run this from a user's home dir to show me the most recent files while omitting the shell profile files:
find ./ -type f -printf "%T# %p\n"|grep -vP "/\.(bash|emacs|gtkrc|kde/|zshrc)" |sort -n| tail -10|cut -f2- -d" "|while read EACH; do ls -l "$EACH"; done;
This works, but just not as well when placed in my .bashrc as an alias:
alias recentfiles='find ./ -type f -printf "%T# %p\n"|grep -vP "/\.\(bash|emacs|gtkrc|kde/|zshrc\)"|sort -n| tail -10|cut -f2- -d" "|while read EACH; do ls -l "$EACH"; done;'
In the image you see the results without any filtering, followed by the desired result using grep -v for filtering, which works on the command line. The final result only partially succeeds in weeding out those files.
I have tried using bash_ and [b]ash. Not even bas (which fails to even get .basin) works?! And I can also use macs or acs AND still get .emacs omitted, so obviously the syntax in my alias is not respecting the /. either. It's not a problem with reserved words, as I originally thought.
I DO get the expected results if I place my original command as is in a file and then use the alias that way:
alias recentfiles='. /root/mycommands/recentfiles'
Can someone explain or point me to a reference to understand what is at play here? I wouldn't know what phrase with the proper terms to search on.

This should fix your problems:
alias recentfiles='find ./ -type f -printf "%T# %p\n"|grep -vP "/\.(bash|emacs|gtkrc|kde/|zshrc)"|sort -n| tail -10|cut -f2- -d" "|while read EACH; do ls -l "$EACH"; done;'
The issue is with grep -P: -P selects Perl-compatible regular expressions, and in Perl regexes grouping parentheses are written without backslashes. So (bash|emacs|...) instead of \(bash|emacs|...\). I really doubt the escaped version worked outside of .bashrc either, unless you have some alias for grep which makes it behave differently there.
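To see why the escaped version only partially filters, run both patterns over a few sample names (a minimal illustration):
printf '%s\n' ./.bashrc ./.emacs ./.zshrc | grep -vP "/\.(bash|emacs|zshrc)"
# no output: all three names are filtered
printf '%s\n' ./.bashrc ./.emacs ./.zshrc | grep -vP "/\.\(bash|emacs|zshrc\)"
# ./.bashrc
# ./.zshrc
# with literal parens the alternatives are "/.(bash", "emacs" and "zshrc)",
# so only names containing a plain "emacs" get filtered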
As others have said in the comments, your filtering is inefficient. Better to rewrite your command with:
find ./ \( -name ".bash*" -o -name ".emacs*" -o -name .gtkrc -o -name .kde -o -name .zshrc \) -prune -o \( -type f -printf "%T# %p\n" \) |sort -n| tail -10|cut -f2- -d" "| tr "\n" "\0" | xargs -0 ls -l;
This way it will not waste time searching for files inside .emacs.d/ or .kde/ only to throw them away; it prunes the search immediately. Also, xargs -0 ls -l is much shorter and clearer than the while loop.
To avoid issues with filenames that contain newlines, it is better to use \0 (NUL) separators, which can never appear in a file name:
find ./ \( -name ".bash*" -o -name .emacs -o -name .gtkrc -o -name .kde -o -name .zshrc \) -prune -o \( -type f -printf "%T# %p\0" \) |sort -n -z | tail -z -n -10| cut -z -f2- -d" " | xargs -0 ls -l

Part 1: Fixing The Issue
Use a function instead.
There are several major issues with aliases:
Because you pass the alias body inside quotes when creating it, the text is parsed differently than it would be when typed directly at the command line.
Because aliases are simple prefix substitution, they don't have their own arguments ($1, $2, etc.); they don't have a call stack; debugging mechanisms like PS4=':$BASH_SOURCE:$LINENO+'; set -x can't tell you which file code from an alias originated in; etc.
Aliases are an interactive feature; POSIX doesn't mandate that shells support them at all, and they're turned off by default during script execution.
Functions solve all these problems.
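As a quick illustration of the argument-handling difference (a toy example, not part of the fix):
alias greet='echo "hello, $1"'   # $1 here is the *caller's* $1, usually empty
greet world                      # prints "hello,  world": the alias body runs, then "world" is tacked on
unalias greet
greet() { echo "hello, $1"; }    # a function gets real positional parameters
greet world                      # prints "hello, world"
With that in mind, here is the rewrite as a function: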
recentfiles() {
    find ./ \
        '(' -name '.bash*' -o -name '.emacs*' -o -name .gtkrc -o -name .kde -o -name .zshrc ')' -prune \
        -o -type f -printf "%T# %p\0" |
    sort -nz |
    tail -z -n -10 |
    while read -d' ' _ && IFS= read -r -d '' file; do
        printf '%s\0' "$file"
    done |
    xargs -0 ls -ld --
}
Note that I also made several other changes:
Instead of using \n as a separator, the above code uses \0. This is because newlines can occur in filenames; a file whose name contained newlines could look like any number of files, with any arbitrary sizes it wanted, to the rest of your pipeline. (Unfortunately, POSIX doesn't require that sort and tail support NUL delimiters, so the -z options used above are GNUisms.)
Instead of using grep -v to remove dotfiles, I used the -prune option to find. This is particularly important for directories like .kde, since it stops find from spending the time and I/O bandwidth to recurse down directories for which you intend to throw the results away anyhow.
For documentation of the importance of the IFS= and -r arguments used in the while read loop, see BashFAQ #1. Both of these improve behavior in the presence of unusual filenames: clearing IFS prevents leading and trailing whitespace from being stripped, and passing -r prevents literal backslashes from being elided. A quick demonstration follows this list.
Instead of grep -P -- a GNU extension which is only available if grep was compiled with libpcre support -- my first cut (prior to moving to find -prune) switched to grep -E, which is adequately expressive, much more widely available, and lends itself to higher performance implementations.
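To see what IFS= and -r buy you, compare the two reads on a name with surrounding spaces and a backslash:
printf ' a\\b \n' | { read x; printf '[%s]\n' "$x"; }           # [ab] - whitespace stripped, backslash elided
printf ' a\\b \n' | { IFS= read -r x; printf '[%s]\n' "$x"; }   # [ a\b ] - the text survives intact
For reference, that grep -E first cut might have looked like this (a sketch; it still has the newlines-in-filenames problem the NUL-based version above avoids):
find ./ -type f -printf "%T# %p\n" | grep -vE "/\.(bash|emacs|gtkrc|kde/|zshrc)" | sort -n | tail -10 | cut -f2- -d" " | while IFS= read -r EACH; do ls -ld "$EACH"; done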
Part 2: Explaining The Issue
Running your alias after set -x, we see:
+ find ./ -type f -printf '%T# %p\n'
+ grep -vP '/\.\(bash|emacs|gtkrc|kde/|zshrc\)'
+ sort -n
+ tail -10
+ cut -f2- '-d '
+ read EACH
By contrast, running the command it was intended to wrap, we see:
+ find ./ -type f -printf '%T# %p\n'
+ grep -vP '/\.(bash|emacs|gtkrc|kde/|zshrc)'
+ sort -n
+ tail -10
+ cut -f2- '-d '
+ read EACH
In the command itself, there are no literal backslashes before ( and ).

Related

"find | xargs | ls" not running ls on filenames from find

So I have a directory with files and sub-directories in it. I want to get all the files recursively and then list them in long format, sorted by the modified date. Here's what I came up with.
find . -type f | xargs -d "\n" | ls -lt
However this only lists the files in the current directory and not the sub-directories. I don't understand why, given that the following prints out all the files.
find . -type f | xargs -d "\n" | cat
Any help appreciated.
xargs can only start ls if it's passed ls as an argument. When you pipe from xargs into ls, only one copy of ls is started -- by the parent shell -- and it isn't given any of the filenames from find | xargs as arguments -- instead they're on its stdin, but ls never reads its stdin, so it doesn't even know that they're there.
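You can check this easily; both of the following just list the current directory, no matter what is fed to them, because ls never looks at its stdin:
printf '%s\n' /etc/hosts | ls
ls </dev/null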
Thus, you need to remove the | character:
# Does what you specified in the common case, but buggy; don't use this
# (filenames can contain newlines!)
# ...also, xargs -d is GNU-only
find . -type f | xargs -d '\n' ls -lt
...or, better:
# uses NUL separators, which cannot exist inside filenames
# also, while a non-POSIX extension, this is supported in both GNU and BSD xargs
find . -type f -print0 | xargs -0 ls -lt
...or, even better than that:
# no need for xargs at all here; find -exec can do the same thing
# -exec ... {} + is POSIX-mandated functionality since 2008
find . -type f -exec ls -lt {} +
Much of the content in this answer is also covered in the Actions, Complex Actions, and Actions in Bulk sections of Using Find, which is well worth reading.

Solution for find -exec if single and double quotes already in use

I would like to recursively go through all subdirectories and remove the oldest two PDFs in each subfolder named "bak":
Works:
find . -type d -name "bak" \
-exec bash -c "cd '{}' && pwd" \;
Does not work, as the double quotes are already in use:
find . -type d -name "bak" \
-exec bash -c "cd '{}' && rm "$(ls -t *.pdf | tail -2)"" \;
Any solution to the double quote conundrum?
In a double quoted string you can use backslashes to escape other double quotes, e.g.
find ... "rm \"\$(...)\""
If that is too convoluted use variables:
cmd='$(...)'
find ... "rm $cmd"
However, I think your find -exec has more problems than that.
Using {} inside the command string "cd '{}' ..." is risky. If there is a ' inside the file name, things will break and might execute unexpected commands.
$() will be expanded by bash before find even runs. So ls -t *.pdf | tail -2 will only be executed once in the top directory . instead of once for each found directory. rm will (try to) delete the same file for each found directory.
rm "$(ls -t *.pdf | tail -2)" will not work if ls lists more than one file. Because of the quotes both files would be listed in one argument. Therefore, rm would try to delete one file with the name first.pdf\nsecond.pdf.
I'd suggest
cmd='cd "$1" && ls -t *.pdf | tail -n2 | sed "s/./\\\\&/g" | xargs rm'
find . -type d -name bak -exec bash -c "$cmd" -- {} \;
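The sed "s/./\\\\&/g" stage puts a backslash in front of every single character (the doubled backslashes are just double-quote escaping), so that xargs's own whitespace and quote processing cannot split or mangle a name. For example:
printf '%s\n' 'a b.pdf' | sed 's/./\\&/g'
# \a\ \b\.\p\d\f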
You have a more fundamental problem; because you are using the weaker double quotes around the entire script, the $(...) command substitution will be interpreted by the shell which parses the find command, not by the bash shell you are starting, which will only receive a static string containing the result from the command substitution.
If you switch to single quotes around the script, you get most of it right; but that would still fail if the file name you find contains a double quote (just like your attempt would fail for file names with single quotes). The proper fix is to pass the matching files as command-line arguments to the bash subprocess.
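A sketch of that fix: the directory name is handed to bash as a positional parameter (the _ fills in $0), so no quoting of the name is ever needed:
find . -type d -name bak -exec bash -c 'cd "$1" && ls -t *.pdf | tail -n 2 | xargs -r rm --' _ {} \;
(This still pipes ls into rm, so it shares the fragile-parsing caveat mentioned below.)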
But a better fix still is to use -execdir so that you don't have to pass the directory name to the subshell at all:
find . -type d -name "bak" \
-execdir bash -c 'ls -t *.pdf | tail -2 | xargs -r rm' \;
This could still fail in funny ways because you are parsing the output of ls, which is inherently fragile.
You are explicitly asking for find -exec. Usually I would just use find -exec or find -delete, but in your case only two files per directory should be deleted, so the only method is to run a subshell. Socowi already gave a nice solution; however, if your file names contain no tabs or newlines, another workaround is a find | while read loop.
This will sort files by mtime
find . -type d -iname 'bak' |
while read -r dir; do
    find "$dir" -maxdepth 1 -type f -iname '*.pdf' -printf "%T+\t%p\n" |
        sort | head -n2 | cut -f2- |
        while read -r file; do
            rm "$file"
        done
done
The above find while read loop as "one-liner"
find . -type d -iname 'bak' | while read -r dir; do find "$dir" -maxdepth 1 -type f -iname '*.pdf' -printf "%T+\t%p\n" | sort | head -n2 | cut -f2- | while read -r file; do rm "$file"; done; done;
A find | while read loop can also handle NUL-terminated file names. However, head cannot handle NUL terminators, so I improved on the other answers and made it work with nontrivial file names (GNU tools + bash only).
Replace realpath with rm once you have verified the output:
#!/bin/bash
rm_old () {
    find "$1" -maxdepth 1 -type f -iname \*.$2 -printf "%T+\t%p\0" | sort -z | sed -zn 's,\S*\t\(.*\),\1,p' | grep -zim$3 \.$2$ | xargs -0r realpath
}
export -f rm_old
find -type d -iname bak -execdir bash -c 'rm_old "{}" pdf 2' \;
However, bash -c might still be exploitable; to make it more secure, let stat %N do the quoting:
#!/bin/bash
rm_old () {
    local dir="$1"
    # we don't like eval
    # eval "dir=$dir"
    # this works like eval
    dir="${dir#?}"
    dir="${dir%?}"
    dir="${dir//"'$'\t''"/$'\011'}"
    dir="${dir//"'$'\n''"/$'\012'}"
    dir="${dir//$'\047'\\$'\047'$'\047'/$'\047'}"
    find "$dir" -maxdepth 1 -type f -iname \*.$2 -printf '%T+\t%p\0' | sort -z | sed -zn 's,\S*\t\(.*\),\1,p' | grep -zim$3 \.$2$ | xargs -0r realpath
}
find -type d -iname bak -exec stat -c'%N' {} + | while read -r dir; do rm_old "$dir" pdf 2; done

bash function grep --exclude-dir not working

I have the following function defined in my .bashrc, but for some reason the --exclude-dir option is not excluding the .git directory. Can anyone see what I've done wrong? I'm using Ubuntu 13.10 if that helps.
function fif # find in files
{
    pattern=${1?" Usage: fif <word_pattern> [files pattern]"};
    files=${2:+"-iname \"$2\""};
    grep "$pattern" --color -n -H -s $(find . $files -type f) --exclude-dir=.git --exclude="*.min.*"
    return 0;
}
Make sure not to include a trailing slash when you specify the directory to exclude. For example:
Do this:
$ grep -r --exclude-dir=node_modules firebase .
NOT this:
$ grep -r --exclude-dir=node_modules/ firebase .
(This answer is not applicable to the OP, but it may be helpful for others who find --exclude-dir not working - it worked for me.)
Run man grep on your system and see which version you have. Your version of grep may not support --exclude-dir.
You're really better off using find to find the files you want, then use grep to parse them:
$ find . -name '.git' -type d -prune \
-o -name "*.min.*" -prune \
-o -type f -exec grep --color -n -H "$pattern" {} \;
I'm not a fan of the recursive grep. Its syntax has become bloated, and it's really unnecessary. We have a perfectly good tool for finding files that match a particular criterion, thank you.
In the find program, the -o separates out the various clauses. If a file has not been filtered out by a previous -prune clause, it is passed to the next one. Once you've pruned out all of the .git directories and all of the *.min.* files, you pass the results to the -exec clause, which executes your grep command on that one file.
Some people prefer it this way:
$ find . -name '.git' -type d -prune \
-o -name "*.min.*" -prune \
-o -type f -print0 | xargs -0 grep --color -n -H "$pattern"
The -print0 prints out all of the found files separated by NUL characters. The xargs -0 will read in that list of files and pass them to the grep command. The -0 tells xargs that the file names are NUL-separated, not whitespace-separated. Some versions of xargs take --null instead of the -0 parameter.

In-line text replacement using sed, shell, or some other means

I want to pass two parameters to a program, a file name and a modified version of the file name. The situation is I have a bunch of .html.erb files in a directory tree, and I want invoke html2haml on them with the original filename and a new output filename with the haml extension, like so:
html2haml thing.html.erb thing.html.haml
Here's my current best attempt at this:
find . -name "*.html.erb" -exec echo {} `echo {} | sed "s/.erb/.haml/g"` \;
(after I'm done testing I'll replace echo with html2haml and run it again)
However it doesn't work. The result of the expression inside backticks is the unmodified string.
Here are some experiments I tried which DO behave as expected (to test if my syntax and levels of escaping/quotes were correct):
1. echo myfile.foo | sed 's/foo/foo2/g'
2. find . -name "*.html.erb" -exec echo {} `echo xyz | sed "s/y/Y/g"` \;
3. find . -name "*.html.erb" -exec echo {} `echo {} hello` \;
4. find . -name "*.html.erb" -exec echo {} `echo {}` \;
The fact that these all behave as expected suggests to me that I am getting some small thing wrong in the syntax, and that it is indeed possible to do this with a one-liner.
If this is impossible, it might be because of a misunderstanding about "when" find inserts its results on each invocation. Example #3 above suggests to me that it does it exactly when I need/expect it to (because I'm successfully concatenating each individual result string with "hello").
If you have gsed:
find . -name \*.erb -print0 | gsed -z 'p;s/.erb$/.haml/' | xargs -0 -n2 html2haml
If you don't have gsed and only have sed, this will work, but only if none of your file names have whitespace.
find . -name \*.erb -print | sed 'p;s/.erb$/.haml/' | xargs -n2 html2haml
Discussion about these and other techniques follows:
I have different versions of sed; my GNU sed is called gsed. If your sed is GNU sed, use sed instead of gsed. You can check your sed with sed --version; if it prints something like:
sed (GNU sed) 4.2.2
Copyright (C) 2012 Free Software Foundation, Inc.
then you have GNU sed.
For example, given a tree where find shows these files:
$ find . -name \*foo -print
./a/test.foo
./b/c/test.foo
./b/te st.foo #<- note the filename with space
./b/test.foo
the NUL-separated pipeline produces:
$ find . -name \*foo -print0 | gsed -z 'p;s/foo$/foo2/' | xargs -0 -n2 echo bar
bar ./a/test.foo ./a/test.foo2
bar ./b/c/test.foo ./b/c/test.foo2
bar ./b/te st.foo ./b/te st.foo2
bar ./b/test.foo ./b/test.foo2
Without additional scripts or functions. ;)
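The heart of the trick is the p;s/// script: p prints each incoming name unchanged, the s/// then rewrites it, and sed's automatic print at the end of the cycle emits the modified copy, so every file yields an original/new pair for xargs -n2 to consume:
printf '%s\n' a.erb b.erb | sed 'p;s/\.erb$/.haml/'
# a.erb
# a.haml
# b.erb
# b.haml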
Or you can replace the sed with perl, so that the next command
find . -name \*foo -print0 | perl -n0le 'print;s/foo/foo2/;print' | xargs -0 -n2 echo bar
produces the same result:
bar ./a/test.foo ./a/test.foo2
bar ./b/c/test.foo ./b/c/test.foo2
bar ./b/te st.foo ./b/te st.foo2
bar ./b/test.foo ./b/test.foo2
IF you REALLY want to do it within one find, try:
find . -name \*html.erb -exec sh -c 'echo html2haml "{}" "$(echo "{}" | sed 's/\.erb/\.haml/')"' \;
or, eliminating the two useless echos, the final command:
find . -name \*html.erb -exec sh -c 'html2haml "{}" "$(sed 's/\.erb/\.haml/'<<<"{}")"' \;
What about a loop?
find . -name "*.html.erb" | while read file
do
haml_file=${file%.erb}.haml
html2haml $file $haml_file
done
The ${var%glob} syntax takes the shell variable ${var} and removes from the end of its value the smallest portion that matches glob.
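For example:
file=thing.html.erb
echo "${file%.erb}.haml"    # prints thing.html.haml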
If you know that the filename ends with .foo, then you can use:
do_something "$filename" "${filename%.foo}.foo2"
(In the unlikely case that you really want to just put a 2 on the end, you could of course just use "${filename}2". But I assume the foo and foo2 are to be substituted with less similar strings.)
If you want to invoke do_something from find, your best bet would be to pass it only one filename (or, better, a number of filenames each of them representing a single operation). For example:
-- do_something.sh
#!/bin/bash
# This is the definition of what you want to do.
# It is called as `bar old_filename new_filename`
bar() {
    # For example
    mv "$1" "$2"
}

for filename in "$@"; do
    bar "$filename" "${filename%.foo}.foo2"
done
-- find command:
find . -type f -name '*.foo' -exec do_something.sh {} +
If you really need to use sed (for something that you can't even do with the bash replace syntax, ${var/pattern/substitution}), then set up do_something as above, but replace the line inside the for loop with, for example:
bar "$filename" "$(sed -r 's/([^.]+)\.([^.]+)$/\2.\1/' <<<"$filename")"
Explanation: The above sed expression (gnu-specific) flips the last two extensions around, so it would change some.file.html.en into some.file.en.html. -r causes gnu sed to use extended regex format, which I find more readable. <<< is a bashism which expands the word following it and feeds it into stdin, somewhat similar to echo "$filename" | sed ... but without creating another subprocess.
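To try the substitution standalone:
filename=some.file.html.en
sed -r 's/([^.]+)\.([^.]+)$/\2.\1/' <<<"$filename"    # prints some.file.en.html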
You can call your find like this:
find . -name "*.html.erb" -print0 -print0|xargs -0 -J % html2haml % | sed 's/\.erb$/.haml/'
This will result in executing:
html2haml thing.html.erb thing.html.haml

Find, grep, and execute - all in one?

This is the command I've been using for finding matches (queryString) in php files, in the current directory, with grep, case insensitive, and showing matching results in line:
find . -iname "*php" -exec grep -iH queryString {} \;
Is there a way to also pipe just the file name of the matches to another script?
I could probably run the -exec command twice, but that seems inefficient.
What I'd love to do on Mac OS X is then actually to "reveal" that file in the finder. I think I can handle that part. If I had to give up the inline matches and just let grep show the files names, and then pipe that to a third script, that would be fine, too - I would settle.
But I'm actually not even sure how to pipe the output (the matched file names) to somewhere else...
Help! :)
Clarification
I'd like to reveal each of the files in a Finder window - so I'm probably not going to use the -q flag and stop at the first one.
I'm going to run this in the console, ideally I'd like to see the inline matches printed out there, as well as being able to pipe them to another script, like oascript (applescript, to reveal them in the finder). That's why I have been using -H - because I like to see both the file name and the match.
If I had to settle for just using -l so that the file name could more easily be piped to another script, that would be OK, too. But after looking at the reply below from @Charlie Martin, I think xargs could be helpful here in doing both at the same time with a single find and a single grep command.
I did say bash, but I don't really mind if this needs to be run as /bin/sh instead - I don't know too much about the differences yet, but I do know there are some important ones.
Thank you all for the responses, I'm going to try some of them at the command line and see if I can get any of them to work and then I think I can choose the best answer. Leave a comment if you want me to clarify anything more.
Thanks again!
You bet. The usual thing is something like
$ find /path -name pattern -print | xargs command
So you might for example do
$ find . -name '*.[ch]' -print | xargs grep -H 'main'
(Quiz: why -H?)
You can carry on with this further; for example, you might use
$ find . -name '*.[ch]' -print | xargs grep -H 'main' | cut -d ':' -f 1
to get the vector of file names for files that contain 'main', or
$ find . -name '*.[ch]' -print | xargs grep -H 'main' | cut -d ':' -f 1 |
xargs growlnotify -
to have each name become a Growl notification.
You could also do
$ grep pattern `find /path -name pattern`
or
$ grep pattern $(find /path -name pattern)
(in bash(1) at least these are equivalent) but you can run into limits on the length of a command line that way.
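One practical difference between the two command-substitution forms: $( ) nests without extra escaping, while backticks need backslashes (a toy example):
echo $(basename $(dirname /tmp/sub/file))    # sub
echo `basename \`dirname /tmp/sub/file\``    # sub, but the inner backticks must be escaped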
Update
To answer your questions:
(1) You can do anything in bash that you can do in sh. The one thing I've mentioned that would be any different is the use of $(command) in place of backticks around command, and that works in the version of sh on Macs. csh, zsh, ash, and fish are different.
(2) I think merely doing $ open $(dirname arg) will open a Finder window on the containing directory.
It sounds like you want to open all *.php files that contain querystring from within a Terminal.app session.
You could do it this way:
find . -name '*.php' -exec grep -li 'querystring' {} \; | xargs open
With my setup, this opens MacVim with each file on a separate tab. YMMV.
Replace -H with -l and you will get a list of those filenames that matched the pattern.
if you have bash 4, simply do
grep pattern /path/**/*.php
the ** operator is like
grep pattern `find -name \*.php -print`
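Note that ** only recurses when bash's globstar option is enabled; it is off by default, and without it ** behaves like an ordinary *:
shopt -s globstar
grep pattern /path/**/*.php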
find /home/aaronmcdaid/Code/ -name '*.cpp' -exec grep -q -iH boost {} \; -exec echo {} \;
The first change I made is to add -q to your grep command. This is "Exit immediately with zero status if any match is found".
The good news is that this speeds up grep when a file has many matching lines - you don't care how many matches there are. But that means we need another -exec on the end to actually print the filenames when grep has been successful.
The grep result will be sent to stdout, so another -exec predicate is probably the best solution here.
Pipe to another script:
find . -iname "*.php" | myScript
File names will come into the stdin of myScript 1 line at a time.
You can also use xargs to form/execute commands to act on each file:
find . -iname "*.php" | xargs ls -l
act on files you find that match:
find . -iname "*.php" | xargs grep -l pattern | myScript
act on files that don't match the pattern:
find . -iname "*.php" | xargs grep -L pattern | myScript
In general, using multiple -execs with grep -q will be FAR faster than piping, since find puts an implied short-circuiting -a between each juxtaposed pair of expressions that isn't separated by an explicit operator. The main problem here is that you want something to happen if grep matches something AND for the matches to be printed. If the files are reasonably sized, then this should be faster (because grep -q exits after finding a single match):
find . -iname "*php" -exec grep -iq queryString {} \; -exec grep -iH queryString {} \; -exec otherprogram {} \;
If the files are particularly big, encapsulating it in a shell script may be faster than running multiple grep commands:
find . -iname "*php" -exec bash -c \
'out=$(grep -iH queryString "$1"); [[ -n $out ]] && echo "$out" && exit 0 || exit 1' \
bash {} \; -print
Also note, if the matches are not particularly needed, then
find . -iname "*php" -exec grep -iq queryString {} \; -exec otherprogram {} \;
will virtually always be faster than a piped solution like
find . -iname "*php" -print0 | xargs -0 grep -iH queryString | ...
Additionally, you should really have -type f in all cases, unless you want to catch directories named *php.
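For example, the earlier command with that fix applied:
find . -type f -iname "*php" -exec grep -iq queryString {} \; -exec otherprogram {} \;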
Regarding the question of which is faster: if you actually care about the minuscule time difference (perhaps because you are trying to save your processor some work), prefix each command with time and see which one performs better.
