Problem using grep inside a bash script that runs remotely on a server - bash

I'm using a script that runs remotely on a server via ssh.
Inside the script I'm using the line below:
ls | grep -oP "\d{4} -\d{2}-\d{2}"
On my local machine, which runs Ubuntu, the script works fine.
But when I try to run it remotely I get this:
grep: invalid option -- 'P'
BusyBox v1.24.1 multi-call binary.
Usage: grep [-HhnlLoqvsriwFE] [-m N] [-A/B/C N] PATTERN/-e PATTERN/...-f file [FILE]...
The first thing I thought was an alias problem, so I tried
type grep
The output is: grep is /bin/grep - I think this is OK.
What worries me is BusyBox (I do not know what it is), but could this be the problem?

You may use [0-9] / [[:digit:]] instead of \d with POSIX BRE (no option) or ERE (-E option):
grep -o "[0-9]\{4\} -[0-9]\{2\}-[0-9]\{2\}"
grep -oE "[0-9]{4} -[0-9]{2}-[0-9]{2}"
Note that in the first command you need to escape the braces: in a POSIX BRE, unescaped { and } match literal brace characters, while escaped braces form a range (interval) quantifier. In the second command, POSIX ERE is enabled with -E and the behavior is reversed: escaped braces are literal characters, and unescaped braces are quantifiers.
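As a quick sanity check of the brace behavior (a minimal sketch; it assumes a grep with the -o and -E options, which the BusyBox usage line above shows it has):
$ echo 'aa' | grep -o "a\{2\}"
aa
$ echo 'aa' | grep -oE "a{2}"
aa
The first (BRE) form only repeats because the braces are escaped; the second (ERE) form only repeats because they are not.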

Related

Grep with a regex character range that includes the NULL character

When I include the NULL character (\x00) in a regex character range in BSD grep, the result is unexpected: no characters match. Why is this happening?
Here is an example:
$ echo 'ABCabc<>/ă' | grep -o [$'\x00'-$'\x7f']
Here I expect all characters up until the last one to match; however, the result is no output (no matches).
Alternatively, when I start the character range from \x01, it works as expected:
$ echo 'ABCabc<>/ă' | grep -o [$'\x01'-$'\x7f']
A
B
C
a
b
c
<
>
/
Also, here are my grep and BASH versions:
$ grep --version
grep (BSD grep) 2.5.1-FreeBSD
$ echo $BASH_VERSION
3.2.57(1)-release
On BSD grep, you may be able to use this:
LC_ALL=C grep -o '[[:print:][:cntrl:]]' <<< 'ABCabc<>/ă'
A
B
C
a
b
c
<
>
/
Or you can just install GNU grep with the Homebrew package manager and run:
grep -oP '[[:ascii:]]' <<< 'ABCabc<>/ă'
Noting that $'...' is a shell quoting construct, this,
$ echo 'ABCabc<>/ă' | grep -o [$'\x00'-$'\x7f']
would try to pass a literal NUL character as part of the command line argument to grep. That's impossible to do in any Unix-like system, as the command line arguments are passed to the process as NUL-terminated strings. So in effect, grep sees just the arguments -o and [.
You would need to create some pattern that matches the NUL byte without including it literally. But I don't think grep supports the \000 or \x00 escapes itself. Perl does, though, so this prints the input line with the NUL:
$ printf 'foo\nbar\0\n' |perl -ne 'print if /\000/'
bar
As an aside, at least GNU grep doesn't seem to like that kind of range expression, so if you were to use that, you'd have to do something different. In the C locale, [[:cntrl:][:print:]] might perhaps work to match the characters from \x01 to \x7f, but I didn't check comprehensively.
The manual for grep has some descriptions of the classes.
Note also that [$'\x00'-$'\x7f'] has an unquoted pair of [ and ] and so is a shell glob. This isn't related to the NUL byte, but if you had files that match the glob (any one-letter names, if the glob works on your system -- it doesn't on my Linux), or had failglob or nullglob set, it would probably give results you didn't want. Instead, quote the brackets too: $'[\x00-\x7f]'.

execute grep regular expression involving `|` using ssh on remote host

I am trying to run a grep command involving the regular expression '|' on a server using ssh.
ssh rpatil@192.168.1.5 grep -E "GapEvent|GapFilled" "$logFile" > $server-$testName.log
Now the '|' in the command is being treated as a pipe, and the error "no command GapFilled" is raised.
I tried 'GapEvent|GapFilled' and '(GapEvent|GapFilled)'.
So how should the regular expression "GapEvent|GapFilled" be written so that | is not treated as a pipe?
You need two levels of quotes since the command line is evaluated twice (once locally when ssh is executed and once when grep is executed on the remote side). You can use one of these patterns:
"'a|b'"
'"a|b"'
"\"a|b\""
Escape the | like this \|
grep -E "GapEvent\|GapFilled" "$logFile" file
Simply use two expressions:
grep -E -e "GapEvent" -e "GapFilled" "$logFile"
-E is no longer needed here, since there is no alternation in the patterns; -F (fixed strings) may also be preferable.

Use literal (*) in grep pattern within shell script

I'm trying to evaluate a grep expression inside of a shell script, and that grep uses a literal asterisk (*), but that asterisk appears to be expanded by my bash instead of remaining a literal asterisk:
branch_description=$(git branch --list -vv | grep "^\*")
What can I do to run grep in this context and let it receive a literal asterisk in its PATTERN argument?
A solution is to use the ASCII octal code:
branch_description=$(git branch --list -vv | grep "^\052")
See
man 7 ascii
The problem is that bash is interpreting the \ and stripping it away, because it's inside double quotes. Changing to
branch_description=$(git branch --list -vv | grep '^\*')
will do what you want. See the section on QUOTING in the bash manual.
You can use single quotes around the grep pattern to avoid expansion of * by the shell:
branch_description=$(git branch --list -vv | grep '^\*')
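Another option that avoids the backslash entirely is a bracket expression; [*] matches a literal asterisk in both BRE and ERE (a sketch equivalent to the commands above):
branch_description=$(git branch --list -vv | grep '^[*]')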

Is there an easy way to pass a "raw" string to grep?

grep can't be fed "raw" strings when used from the command-line, since some characters need to be escaped to not be treated as literals. For example:
$ grep '(hello|bye)' # WON'T MATCH 'hello'
$ grep '\(hello\|bye\)' # GOOD, BUT QUICKLY BECOMES UNREADABLE
I was using printf to auto-escape strings:
$ printf '%q' '(some|group)\n'
\(some\|group\)\\n
This produces a bash-escaped version of the string, and using backticks, this can easily be passed to a grep call:
$ grep `printf '%q' '(a|b|c)'`
However, it's clearly not meant for this: some characters in the output are not escaped, and some are unnecessarily so. For example:
$ printf '%q' '(^#)'
\(\^#\)
The ^ character should not be escaped when passed to grep.
Is there a cli tool that takes a raw string and returns a bash-escaped version of the string that can be directly used as pattern with grep? How can I achieve this in pure bash, if not?
If you want to search for an exact string,
grep -F '(some|group)\n' ...
-F tells grep to treat the pattern as is, with no interpretation as a regex.
(This is often available as fgrep as well.)
If you are attempting to get grep to use Extended Regular Expression syntax, the way to do that is to use grep -E (aka egrep). You should also know about grep -F (aka fgrep) and, in newer versions of GNU grep, grep -P.
Background: The original grep had a fairly small set of regex operators; it was Ken Thompson's original regular expression implementation. A new version with an extended repertoire was developed later, and for compatibility reasons, got a different name. With GNU grep, there is only one binary, which understands the traditional, basic RE syntax if invoked as grep, and ERE if invoked as egrep. Some constructs from egrep are available in grep by using a backslash escape to introduce special meaning.
Subsequently, the Perl programming language has extended the formalism even further; this regex dialect seems to be what most newcomers erroneously expect grep, too, to support. With grep -P, it does; but this is not yet widely supported on all platforms.
So, in grep, the following characters have a special meaning: ^$[]*.\
In egrep, the following characters also have a special meaning: ()|+?{}. (The braces for repetition were not in the original egrep.) The grouping parentheses also enable backreferences with \1, \2, etc.
In many versions of grep, you can get the egrep behavior by putting a backslash before the egrep specials. There are also special sequences like \<\>.
In Perl, a huge number of additional escapes like \w \s \d were introduced. In Perl 5, the regex facility was substantially extended, with non-greedy matching *? +? etc, non-grouping parentheses (?:...), lookaheads, lookbehinds, etc.
... Having said that, if you really do want to convert egrep regular expressions to grep regular expressions without invoking any external process, try ${regex/pattern/substitution} for each of the egrep special characters; but recognize that this does not handle character classes, negated character classes, or backslash escapes correctly.
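A minimal sketch of that parameter-expansion idea (a hypothetical ere_to_bre helper; it only handles |, parentheses, + and ?, and does not handle braces, character classes, or patterns that already contain backslashes):
ere_to_bre() {
    local re=$1
    re=${re//\|/\\|}    # alternation:  |  ->  \|
    re=${re//\(/\\(}    # group open:   (  ->  \(
    re=${re//\)/\\)}    # group close:  )  ->  \)
    re=${re//\+/\\+}    # one or more:  +  ->  \+  (GNU BRE extension)
    re=${re//\?/\\?}    # optional:     ?  ->  \?  (GNU BRE extension)
    printf '%s\n' "$re"
}
ere_to_bre '(hello|bye)' then prints \(hello\|bye\), converting the readable ERE from above into the BRE form that plain grep expects.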
When I use grep -E with user provided strings I escape them with this
ere_quote() {
    # put a backslash in front of every ERE metacharacter (and backslash) in the input
    sed 's/[][\.|$(){}?+*^]/\\&/g' <<< "$*"
}
example run
ere_quote ' \ $ [ ] ( ) { } | ^ . ? + *'
# output
# \\ \$ \[ \] \( \) \{ \} \| \^ \. \? \+ \*
This way you may safely insert the quoted string in your regular expression.
e.g. if you wanted to find each line starting with the user-provided content, even when the user provides a tricky string such as .*
userdata=".*"
grep -E -- "^$(ere_quote "$userdata")" <<< ".*hello"
# if you have colors in grep you'll see only ".*" in red
I think that the previous answers are not complete, because they miss one important thing, namely strings which begin with a dash (-). So while this won't work:
echo "A-B-C" | grep -F "-B-"
This one will:
echo "A-B-C" | grep -F -- "-B-"
quote() {
    # wrap every character except \ and ^ in [...], then backslash-escape \ and ^
    sed 's/[^\^]/[&]/g;s/[\^]/\\&/g' <<< "$*"
}
Usage: grep [OPTIONS] "$(quote [STRING])"
This function has some substantial benefits:
quote is independent from the regex flavor. You can use quote's output in
grep (or grep -G; BRE, the default)
grep -E (ERE)
grep -P (PCRE)
sed (-E) "s/$(quote [STRING])/.../" (as long as you don't use \, [, or ] instead of /).
quote even works in corner cases that are not directly quoting related, for instance
Leading - are quoted so that they aren't misinterpreted as options by grep.
Trailing spaces are quoted so that they aren't removed by $(...).
quote only fails if [STRING] contains linebreaks. But in general there is no fix for this since tools like grep and sed may not support linebreaks in their search pattern (even if they are written as \n).
Also, there is the drawback that the quoted output usually is three times longer than the unquoted input.
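For example (a small illustration of the function above; the bracket-wrapped string is exactly what the sed command produces):
$ quote 'a.*b'
[a][.][*][b]
$ echo 'xa.*by' | grep -E "$(quote 'a.*b')"
xa.*by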
Just want to comment on the example below, which shows that the substring "-B" is interpreted by grep as a command-line option, so the command fails.
echo "A-B-C" | grep -F "-B-"
grep has a special option for this case:
-e PATTERNS, --regexp=PATTERNS
    Use PATTERNS as the patterns. If this option is used multiple times or is combined with the -f (--file) option, search for all patterns given. This option can be used to protect a pattern beginning with “-”.
So a fix for the issue is:
echo "A-B-C" | grep -F -e "-B-" -

Replace comma with newline in sed on MacOS?

I have a file of strings that are comma separated. I'm trying to replace the commas with a new line. I've tried:
sed 's/,/\n/g' file
but it is not working. What am I missing?
Use tr instead:
tr , '\n' < file
Use an ANSI-C quoted string $'string'
You need to get a backslash followed by a literal newline through to sed.
In bash at least, $'...' strings replace \n with a real newline, but then you have to double the backslash so that sed sees a backslash before that newline, e.g.
echo "a,b" | sed -e $'s/,/\\\n/g'
Note this will not work on all shells, but will work on the most common ones.
sed 's/,/\
/g'
works on Mac OS X.
If your sed usage tends to be entirely substitution expressions (as mine tends to be), you can also use perl -pe instead
$ echo 'foo,bar,baz' | perl -pe 's/,/,\n/g'
foo,
bar,
baz
MacOS is different; there are a few ways to solve this problem with sed on a Mac.
First, use \'$'\n'' in place of \n; this works on MacOS:
sed 's/,/\'$'\n''/g' file
Second, just use a backslash followed by a literal newline:
sed 's/,/\
/g' file
P.S. Pay attention to which parts are inside the single quotes.
Third, use GNU sed (gsed) in place of the Mac sed.
Apparently \r is the key!
$ sed 's/, /\r/g' file3.txt > file4.txt
Transformed this:
ABFS, AIRM, AMED, BOSC, CALI, ECPG, FRGI, GERN, GTIV, HSON, IQNT, JRCC, LTRE,
MACK, MIDD, NKTR, NPSP, PME, PTIX, REFR, RSOL, UBNT, UPI, YONG, ZEUS
To this:
ABFS
AIRM
AMED
BOSC
CALI
ECPG
FRGI
GERN
GTIV
HSON
IQNT
JRCC
LTRE
MACK
MIDD
NKTR
NPSP
PME
PTIX
REFR
RSOL
UBNT
UPI
YONG
ZEUS
This works on MacOS Mountain Lion (10.8), Solaris 10 (SunOS 5.10) and RHE Linux (Red Hat Enterprise Linux Server release 5.3, Tikanga)...
$ sed 's/{pattern}/\^J/g' foo.txt > foo2.txt
... where the ^J is done by doing ctrl+v+j. Do mind the \ before the ^J.
PS, I know the sed in RHEL is GNU, the MacOS sed is FreeBSD based, and although I'm not sure about the Solaris sed, I believe this will work pretty much with any sed. YMMV tho'...
To make it complete, this also works:
echo "a,b" | sed "s/,/\\$(echo -e '\n\r')/"
Though I am late to this post, just updating my findings. This answer is only for Mac OS X.
$ sed 's/new/
> /g' m1.json > m2.json
sed: 1: "s/new/
/g": unescaped newline inside substitute pattern
In the above command I tried Shift+Enter to add the newline, which didn't work. So this time I tried escaping the "unescaped newline", as the error message suggests.
$ sed 's/new/\
> /g' m1.json > m2.json
Worked! (in Mac OS X 10.9.3)
$ echo $PATH | sed -e $'s/:/\\\n/g'
/usr/local/sbin
/Library/Oracle/instantclient_11_2/sdk
/usr/local/bin
...
Works for me on Mojave
Just to clarify: the man page of sed on OSX (10.8; Darwin Kernel Version 12.4.0) says:
[...]
Sed Regular Expressions
    The regular expressions used in sed, by default, are basic regular expressions (BREs, see re_format(7) for more information), but extended (modern) regular expressions can be used instead if the -E flag is given. In addition, sed has the following two additions to regular expressions:
    1. In a context address, any character other than a backslash (``\'') or newline character may be used to delimit the regular expression. Also, putting a backslash character before the delimiting character causes the character to be treated literally. For example, in the context address \xabc\xdefx, the RE delimiter is an ``x'' and the second ``x'' stands for itself, so that the regular expression is ``abcxdef''.
    2. The escape sequence \n matches a newline character embedded in the pattern space. You cannot, however, use a literal newline character in an address or in the substitute command.
[...]
so I guess one has to use tr (as mentioned above) or the nifty
sed "s/,/^M
/g"
note: you have to type <ctrl>-v,<return> to get '^M' in vi editor
The sed shipped with macOS Mojave dates from 2005, so one solution is to install GNU sed:
brew install gnu-sed
then using gsed will do what you wish:
gsed 's/,/\n/g' file
If you prefer to keep typing sed, just run sudo sh -c 'echo /usr/local/opt/gnu-sed/libexec/gnubin > /etc/paths.d/brew', as suggested by brew info gnu-sed. Restart your terminal, and sed on the command line will then be gsed.
FWIW, the following line works on Windows and replaces the semicolons in my path variable with newlines. I'm using the tools installed under my Git bin directory.
echo %path% | sed -e $'s/;/\\n/g' | less
I have found another command that also works:
find your_filename.txt -type f -exec sed -i 's/,/\n/g' {} \;
