Weird issue when running grep with the --include option - bash

Here is the code at the bash shell. How is the file mask supposed to be specified, if not this way? I expected both commands to find the search expression, but it's not happening. In this example, I know in advance that I prefer to restrict the search to python source code files only, because unqualified searches are silly time wasters.
So, this works as expected:
grep -rni '/home/ga/projects' -e 'def Pr(x,u,v)'
/home/ga/projects/anom/anom.py:27:def Pr(x,u,v): blah, blah, ...
but this won't work:
grep --include=\*.{py} -rni '/home/ga/projects' -e 'def Pr(x,u,v)'
I'm using GNU grep version 2.16.

--include=\*.{py} looks like a broken attempt to use brace expansion (an unquoted {...} expression).
However, for brace expansion to occur in bash (and ksh and zsh), you must either have:
a list of at least 2 items, separated with ,; e.g. {py,txt}, which expands to 2 arguments, py and txt.
or, a range of items formed from two end points, separated with ..; e.g., {1..3}, which expands to 3 arguments, 1, 2, and 3.
Thus, with a single item, simply do not use brace expansion:
--include=\*.py
If you did have multiple extensions to consider, e.g., *.py as well as *.pyc files, here's a robust form that illustrates the underlying shell features:
'--include=*.'{py,pyc}
Here:
Brace expansion is applied, because {...} contains a 2-item list.
Since the {...} directly follows the literal (single-quoted) string --include=*., the results of the brace expansion include the literal part.
Therefore, 2 arguments are ultimately passed to grep, with the following literal content:
--include=*.py
--include=*.pyc
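To see exactly which arguments the shell hands to grep, you can print them with printf instead of running grep. This is only a minimal sketch for bash; the variable names single and multi are made up for illustration:

```bash
# Single-item braces are not expanded, so the braces stay in the argument.
single=$(printf '%s\n' --include=\*.{py})

# A comma-separated list expands into one argument per item,
# each prefixed with the adjacent quoted literal.
multi=$(printf '%s\n' '--include=*.'{py,pyc})

printf '%s\n' "$single" "$multi"
```

The first line of output still contains {py}, which is the literal glob grep ends up receiving in the failing command.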

Your command fails because of the braces '{}'. A single-item brace is not expanded by the shell, so grep receives the braces literally and looks for them in the file name. You can create a file such as 'myscript.{py}' to convince yourself: it will appear in the results.
The correct option parameter would be '*.py' or the equivalent \*.py. Either way will protect it from being (mis)interpreted by the shell.
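If you want to check this yourself, here is a rough sketch that builds a throwaway directory and compares the two forms. It assumes a grep that supports -r and --include (GNU grep does); the file names and the variables good and odd are invented for the demonstration:

```bash
tmpdir=$(mktemp -d)
printf 'def Pr(x,u,v): pass\n' > "$tmpdir/anom.py"
printf 'def Pr(x,u,v): pass\n' > "$tmpdir/notes.txt"
printf 'def Pr(x,u,v): pass\n' > "$tmpdir/myscript.{py}"

# Quoted glob: only files ending in .py are searched.
good=$(grep -rl --include='*.py' -e 'def Pr(x,u,v)' "$tmpdir")

# Literal braces: only the oddly named file matches the glob.
odd=$(grep -rl --include=\*.{py} -e 'def Pr(x,u,v)' "$tmpdir")

printf '%s\n' "${good##*/}" "${odd##*/}"
rm -rf "$tmpdir"
```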
On the other hand, I would recommend using the find command for such jobs:
find /home/ga/projects -regex '.*\.py$' -exec grep -e "def Pr(x,u,v)" {} +
That will protect you from hard-to-understand shell behaviour.

Try like this (using quotes to be safe; also better readability than backslash escaping IMHO):
grep --include='*.py' ...
Your \*.{py} pattern reaches grep with the braces intact, and grep does not support brace syntax in --include globs at all. Please see the comments below for the full investigation regarding this. For the record, blame this answer for the resulting brace wars ;)
By the way, brace expansion generally works fine in Bash. See mklement0's answer for more details.
Ack. As an alternative, you might consider switching to ack instead from now on. It's a tool just like grep, but fully optimized for programmers.
It's a great fit for what you are doing. A nice quote about it:
Every once in a while something comes along that improves an idea so much, you can't ignore it. Such a thing is ack, the grep replacement.

Related

Bash pattern matching 'or' using shell parameter expansion

hashrate=${line//*:/}
hashrate=${hashrate//H\/s/}
I'm trying to unify this regex replace into a single command, something like:
hashrate=${line//*:\+/H\/s/}
However, this last option doesn't work. I also tried with \|, but it doesn't seem to work and I haven't found anything useful in bash manuals and documentation. I need to use ${} instead of sed, even if using it solves my problem.
The alternation for shell patterns (assuming extended globbing is enabled with shopt -s extglob) is @(pattern|pattern...). For your case:
${line//@(*:|H\/s)}
The trailing / is optional if you just remove a pattern instead of replacing it.
Notice that because of the double slash, //, all occurrences of the patterns will be removed, one at a time. If you used *(...) (see randomir's answer), consecutive patterns would be removed all in one go. Unless you have a giant string, the difference should be negligible. (If you have giant strings, you don't want to use globbing anyway, as it's not optimized for this kind of thing.)
If you enable extended globbing (extglob via shopt), you can use the *(pattern1|pattern2|...) operator to match zero or more glob patterns:
hashrate="${line//*(*:|H\/s)/}"
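As a quick check of the extglob alternation, here is a small sketch using the @(...) form. The sample value of line is an assumption, since the question does not show the real input:

```bash
shopt -s extglob                 # enable @(...), *(...) and friends

line='hashrate:1234H/s'          # hypothetical input line

# Remove everything up to the colon, and the trailing unit, in one expansion.
hashrate=${line//@(*:|H\/s)}

echo "$hashrate"
```

With this sample input the result is 1234. If your real lines have a space after the colon, that space will remain in the result and needs its own handling.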

Is there a linter for fish like there is for bash with shellcheck?

For sh/bash/zsh there is https://github.com/koalaman/shellcheck; however, it won't support fish (https://github.com/koalaman/shellcheck/issues/209). Are there any linters for fish?
To my knowledge, there is not (and obviously this is impossible to prove).
And if someone were to create such a thing, there'd need to be consensus about what the "typical beginner's syntax issues" and "semantic problems that cause a shell to behave strangely and counter-intuitively" are.
Fish doesn't have many of POSIX sh's warts (as it was written as a reaction to them). Some examples from the shellcheck README:
echo $1 # Unquoted variables
Fish's quoting behavior is quite different - in particular, there is no word splitting on variables, so unquoted variables usually do what you want.
v='--verbose="true"'; cmd $v # Literal quotes in variables
This is presumably an (unsuccessful) attempt to defeat word splitting, which isn't necessary.
This example nicely illustrates the issue - there are multiple decades worth of sh scripts. The flaws and unintuitive behaviors are really well known. So well known in fact, that the common-but-incorrect workarounds are known as well. That's just not the case for fish.
(Obviously, other examples do apply to fish as well, especially the "Frequently misused commands" section.)
Some things in fish that I know new users often trip over:
Unquoted variables expand to one argument per element in the list (since every variable is one). That includes zero if the list is empty, which is an issue with test - e.g. test -n $var will return 0 because fish's test builtin is one of the few parts that are POSIX-compatible (since POSIX demands test with one argument returns 0). Double-quote if you always need one argument.
{} expands to nothing and {x} expands to "x", which means find -exec needs quoting, as do some git commit-ishes (HEAD@{4}). (edit: This has since been changed, {} expands to {} and {x} expands to {x} unless x has a comma or other expansion, so HEAD@{4} works)
fish -n or --no-execute "does not execute any commands, only performs syntax checking", so you could do something like what I am doing here:
for f in **/*.fish; do fish -n "$f"; done

In bash, how do I force variable never to be interpreted as a list?

In my bash scripts, I regularly use file paths which may contain spaces:
FOO=/path\ with\ spaces/
Later, if I want to use FOO, I have to wrap it in quotes ("$FOO") or it will be interpreted as a list (/path, with, spaces/). Is there a better way to force a variable never to be interpreted as a list? It is cumbersome to have to constantly quote-wrap.
No. You must always use quotes or bash will word-split (except in [[, but that is a special case).
You can also change the internal field separator, IFS, as in:
ORIGIFS="$IFS"
IFS=$(echo -en "\n\b")
# do stuff...
IFS="$ORIGIFS"
However, this affects all situations where bash looks to do field splitting, which might be more broad than you'd like.
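To make the difference visible, you can count how many arguments a command actually receives. A minimal sketch, where count_args is a hypothetical helper and IFS is set with $'\n' rather than the echo form above:

```bash
count_args() { echo "$#"; }      # prints the number of arguments it was given

FOO=/path\ with\ spaces/

unquoted=$(count_args $FOO)      # word-split on spaces
quoted=$(count_args "$FOO")      # kept as one argument

ORIGIFS=$IFS
IFS=$'\n'                        # split on newlines only
newline_ifs=$(count_args $FOO)
IFS=$ORIGIFS

echo "$unquoted $quoted $newline_ifs"
```

This prints 3 1 1: the unquoted expansion becomes three words under the default IFS, while quoting or a newline-only IFS keeps the path in one piece.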

Tricky brace expansion in shell

When using a POSIX shell, the following
touch {quick,man,strong}ly
expands to
touch quickly manly strongly
Which will touch the files quickly, manly, and strongly, but is it possible to dynamically create the expansion? For example, the following illustrates what I want to do, but does not work because of the order of expansion:
TEST=quick,man,strong #possibly output from a program
echo {$TEST}ly
Is there any way to achieve this? I do not mind constricting myself to Bash if need be. I would also like to avoid loops. The expansion should be given as complete arguments to any arbitrary program (i.e. the program cannot be called once for each file, it can only be called once for all files). I know about xargs but I'm hoping it can all be done from the shell somehow.
... There is so much wrong with using eval. What you're asking is only possible with eval, BUT what you might want is easily possible without having to resort to bash bug-central.
Use arrays! Whenever you need to keep multiple items in one datatype, you need (or, should use) an array.
TEST=(quick man strong)
touch "${TEST[@]/%/ly}"
That does exactly what you want without the thousand bugs and security issues introduced and concealed in the other suggestions here.
The way it works is:
"${foo[@]}": Expands the array named foo by expanding each of its elements, properly quoted. Don't forget the quotes!
${foo/a/b}: This is a type of parameter expansion that replaces the first a in foo's expansion by a b. In this type of expansion you can use % to signify the end of the expanded value, sort of like $ in regular expressions.
Put all that together and "${foo[@]/%/ly}" will expand each element of foo, properly quote it as a separate argument, and replace each element's end by ly.
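If the list really does arrive as a comma-separated string (as in the question), one way to get it into an array without eval is read -a. This is a sketch under that assumption; the names csv, parts and names are illustrative:

```bash
csv=quick,man,strong                 # e.g. output from a program

# Split on commas into an array; IFS is changed only for this one command.
IFS=, read -r -a parts <<< "$csv"

# Append "ly" to the end of every element.
names=("${parts[@]/%/ly}")

printf '%s\n' "${names[@]}"
```

The resulting array can then be passed to any program in a single call, for example touch "${names[@]}".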
In bash, you can do this:
#!/bin/bash
TEST=quick,man,strong
eval echo $(echo {$TEST}ly)
#eval touch $(echo {$TEST}ly)
That last line is commented out but will touch the specified files.
Zsh can easily do that:
TEST=quick,man,strong
print ${(s:,:)^TEST}ly
The variable content is split at commas, then each element is distributed to the string around the braces:
quickly manly strongly
Taking inspiration from the answers above:
$ TEST=quick,man,strong
$ touch $(eval echo {$TEST}ly)

Search and replace in Shell

I am writing a shell (bash) script and I'm trying to figure out an easy way to accomplish a simple task.
I have some string in a variable.
I don't know if this is relevant, but it can contain spaces, newlines, because actually this string is the content of a whole text file.
I want to replace the last occurrence of a certain substring with something else.
Perhaps I could use a regexp for that, but there are two things that confuse me:
I need to match from the end, not from the start
the substring that I want to scan for is fixed, not variable.
For truncating at the start: ${var#pattern}
For truncating at the end: ${var%pattern}
For general replacement: ${var/pattern/repl}
The patterns are 'filename'-style globs, and in the last form the pattern can be prefixed with # or % to match only at the start or end (respectively).
It's all in the (long) bash manpage. Check the "Parameter Expansion" section.
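These expansions can be combined to replace only the last occurrence of a fixed substring. The following is a sketch, not a definitive solution; replace_last is a hypothetical helper, and the sample string is borrowed from elsewhere in this thread:

```bash
replace_last() {
    local var=$1 search=$2 repl=$3
    case $var in
        *"$search"*)
            # ${var%"$search"*}  : text before the last occurrence
            # ${var##*"$search"} : text after the last occurrence
            # Quoting "$search" makes it match literally, not as a glob.
            printf '%s' "${var%"$search"*}$repl${var##*"$search"}"
            ;;
        *)
            printf '%s' "$var"       # no occurrence, return unchanged
            ;;
    esac
}

MyString="stuff ttto be edittted"
NewString=$(replace_last "$MyString" ttt xxx)
echo "$NewString"
```

This prints stuff ttto be edixxxed. Because glob patterns also match across newlines, the same approach works when the variable holds a whole file, though command substitution will strip trailing newlines from the result.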
An expression like this
s/match string here$/new string/
should do the trick: s is for substitute, the / characters separate the parts of the command, and $ is the end-of-line marker. You can try this in vi to see if it does what you need.
I would look up the man pages for awk or sed.
Javier's answer is shell specific and won't work in all shells.
The sed answers that MrTelly and epochwolf alluded to are incomplete and should look something like this:
MyString="stuff ttto be edittted"
NewString=`echo "$MyString" | sed -e 's/\(.*\)ttt\(.*\)/\1xxx\2/'`
The reason this works without having to use the $ to mark the end is that the first '.*' is greedy and will attempt to gather up as much as possible while allowing the rest of the regular expression to be true.
This sed command should work fine in any shell context used.
Usually when I get stuck with Sed I use this page,
http://sed.sourceforge.net/sed1line.txt

Resources