execute grep regular expression involving `|` using ssh on remote host - bash

I am trying to run a grep command whose regular expression contains '|' on a server over ssh.
ssh rpatil@192.168.1.5 grep -E "GapEvent|GapFilled" "$logFile" > $server-$testName.log
However, the '|' in the command is treated as a pipe, and the error "no command GapFilled" is raised.
I also tried 'GapEvent|GapFilled' and '(GapEvent|GapFilled)'.
How should the regular expression "GapEvent|GapFilled" be written so that | is not treated as a pipe?

You need two levels of quotes, since the command line is evaluated twice: once by the local shell when ssh is executed, and once by the remote shell before grep is run. You can use one of these patterns:
"'a|b'"
'"a|b"'
"\"a|b\""

Escape the | as \| so that the remote shell passes it to grep literally instead of treating it as a pipe:
grep -E "GapEvent\|GapFilled" "$logFile"

Simply use two expressions:
grep -E -e "GapEvent" -e "GapFilled" "$logFile"
-E is no longer needed here, since neither pattern uses extended syntax. -F (fixed strings) would also work.
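The double evaluation can be reproduced without a server. This is only a sketch: remote is a hypothetical stand-in for ssh that, like ssh, joins its arguments with spaces and hands the resulting string to a second shell. The log lines are made up.

```shell
#!/bin/sh
# Hypothetical stand-in for ssh: join the arguments with spaces and let a
# second shell parse the resulting string, as the remote shell would.
remote() { sh -c "$*"; }

logFile=$(mktemp)
printf '%s\n' 'GapEvent 1' 'Heartbeat' 'GapFilled 2' > "$logFile"

# The outer double quotes are consumed by the local shell; the inner single
# quotes survive and protect | from the second shell.
quoted=$(remote grep -E "'GapEvent|GapFilled'" "$logFile")

# Two -e expressions contain no | at all, so one level of quoting is enough.
two_exprs=$(remote grep -e GapEvent -e GapFilled "$logFile")

rm -f "$logFile"
printf '%s\n' "$quoted"
```

Both variables should end up holding the GapEvent and GapFilled lines; with only one level of quotes the second shell would see a bare | and start a pipeline.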

Related

Problem using grep inside a bash script that run remote on a server

I'm using a script that runs remotely on a server via ssh.
Inside the script I'm using the line below:
ls | grep -oP "\d{4} -\d{2}-\d{2}"
On my local machine, which runs Ubuntu, the script works fine.
But when I try to run it remotely I get this:
grep: invalid option -- 'P'
BusyBox v1.24.1 multi-call binary.
Usage: grep [-HhnlLoqvsriwFE] [-m N] [-A/B/C N] PATTERN/-e PATTERN/...-f file [FILE]...
The first thing I suspected was an alias problem, so I tried
type grep
Output is: grep is /bin/grep, which I think is OK.
What worries me is BusyBox (I do not know what it is). Could this be the problem?
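As the usage message shows, the BusyBox grep on the server has no -P option, so \d is not available. You can use [0-9] or [[:digit:]] instead of \d with POSIX BRE (no option) or ERE (-E option):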
grep -o "[0-9]\{4\} -[0-9]\{2\}-[0-9]\{2\}"
grep -oE "[0-9]{4} -[0-9]{2}-[0-9]{2}"
Note that in the first command you need to escape the braces, since unescaped { and } match literal brace characters in a POSIX BRE regex; when escaped, they are interval (range) quantifiers. In the second command, POSIX ERE is enabled with -E and the behavior is reversed: escaped braces are literal characters, unescaped braces are quantifiers.
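A quick way to check that the two dialects agree is to run both forms on the same input. The file name below is made up for illustration, and the pattern is a date without the embedded space:

```shell
#!/bin/sh
# Hypothetical sample name; any string containing a YYYY-MM-DD date will do.
sample='backup-2016-03-01.tar'

# BRE: braces must be escaped to act as interval quantifiers.
bre=$(printf '%s\n' "$sample" | grep -o '[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}')

# ERE (-E): unescaped braces are the quantifiers.
ere=$(printf '%s\n' "$sample" | grep -oE '[0-9]{4}-[0-9]{2}-[0-9]{2}')

printf 'BRE: %s\nERE: %s\n' "$bre" "$ere"
```

Both commands use only the -o and -E options, which appear in the BusyBox usage line quoted in the question.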

Error on sed script - extra characters after command

I've been trying to create a sed script that reads a list of phone numbers and only prints ones that match the following schemes:
+1(212)xxx-xxxx
1(212)xxx-xxxx
I'm an absolute beginner, but I tried to write a sed script that would print this for me using the -n -r flags (the contents of which are as follows):
/\+1\(212\)[0-9]{3}-[0-9]{4}/p
/1\(212\)[0-9]{3}-[0-9]{4}/p
If I run this in sed directly, it works fine (i.e. sed -n -r '/\+1\(212\)[0-9]{3}-[0-9]{4}/p' sample.txt prints matching lines as expected). This does NOT work in the sed script I wrote; instead sed says:
sed: -e expression #1, char 2: extra characters after command
I could not find a good solution, this error seems to have so many causes and none of the answers I found apply easily here.
EDIT: I ran it with sed -n -r script.sed sample.txt
sed can not automatically determine whether you intended a parameter to be a script file or a script string.
To run a sed script from a file, you have to use -f:
$ echo 's/hello/goodbye/g' > demo.sed
$ echo "hello world" | sed -f demo.sed
goodbye world
If you neglect the -f, sed will try to run the filename as a command, and the delete command is not happy to have emo.sed after it:
$ echo "hello world" | sed demo.sed
sed: -e expression #1, char 2: extra characters after command
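Applied to the phone-number script from the question, the fix might look like the sketch below. The sample numbers are invented, and -E is used in place of -r. One assumption worth flagging: the two patterns from the question are merged into a single pattern with an optional +, because the second pattern also matches the +1 lines, so keeping both would print those lines twice.

```shell
#!/bin/sh
dir=$(mktemp -d)

# Script file: one pattern, with the leading + made optional.
cat > "$dir/script.sed" <<'EOF'
/\+?1\(212\)[0-9]{3}-[0-9]{4}/p
EOF

# Invented sample data: two matching numbers and one from another area code.
printf '%s\n' '+1(212)555-0100' '1(212)555-0101' '1(646)555-0102' > "$dir/sample.txt"

# -f tells sed that script.sed is a script file, not a script string.
result=$(sed -n -E -f "$dir/script.sed" "$dir/sample.txt")

rm -rf "$dir"
printf '%s\n' "$result"
```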
Of the various unix tools out there, two use BRE as their default regex dialect. Those two tools are sed and grep.
In most operating systems, you can use egrep or grep -E to tell that tool to use ERE as its dialect. A smaller (but still significant) number of sed implementations will accept a -E option to use ERE.
In BRE mode, however, you can still create grouped atoms, and you do it by escaping the parentheses. That's why your initial expression is failing -- the parentheses are NOT special by default in BRE, but you're MAKING THEM SPECIAL by preceding them with backslashes.
The other thing to keep in mind is that if you want sed to execute a script from a command line argument, you should use the -e option.
So:
$ cat ph.txt
+1(212)xxx-xxxx
1(212)xxx-xxxx
212-xxx-xxxx
$ grep '^+\{0,1\}1([0-9]\{3\})' ph.txt
+1(212)xxx-xxxx
1(212)xxx-xxxx
$ egrep '^[+]?1\([0-9]{3}\)' ph.txt
+1(212)xxx-xxxx
1(212)xxx-xxxx
$ sed -n -e '/^+\{0,1\}1([0-9]\{3\})/p' ph.txt
+1(212)xxx-xxxx
1(212)xxx-xxxx
$ sed -E -n -e '/^[+]?1\([0-9]{3}\)/p' ph.txt
+1(212)xxx-xxxx
1(212)xxx-xxxx
Depending on your OS, you may be able to get a full list of how this works from man re_format.

match multiple conditions with GNU sed

I'm using sed to replace values in other bash scripts, such as:
somedata="$(<somefile.sh)"
somedata=`sed 's/ ==/==/g' <<< $somedata` # [space]== becomes ==
somedata=`sed 's/== /==/g' <<< $somedata` # ==[space] becomes ==
The same for ||, &&, !=, etc. I think steps should be reduced with the right regex match. The operator does not need surrounding spaces, but may have a space before and after, only before, or only after. Is there a way to handle all of these with one sed command?
There are many other conditions not mentioned also. The script takes more time to execute than desired.
The goal is to reduce the overall execution time so I am hoping to reduce the number of commands used with clever regex to match multiple conditions.
I'm also considering tr, awk or perl - whichever is fastest?
With GNU sed, you can use the | (or) operator:
$ sed -r 's/ *(&&|\|\|) */\1/g' <<< "foo && bar || baz"
foo&&bar||baz
 *(&&|\|\|) *: matches zero or more spaces, followed by any of the |-separated strings, followed by zero or more spaces
the matched operator is captured and written back using the backreference \1
Edit:
As pointed out in comments, you can use the -E flag with GNU sed in place of -r. Your command will be more portable:
sed -E 's/ *(\&\&|\|\|) */\1/g'
As GNU sed also supports \| alternation operator with Basic Regular Expressions, you can use it for better readability:
sed 's/ *\(&&\|||\) */\1/g'
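The same alternation can be extended to cover the other operators mentioned in the question in a single pass. The operator list here (==, !=, &&, ||) is an assumption about what the script needs, not a complete set:

```shell
#!/bin/sh
# Strip optional spaces around ==, !=, && and || in one substitution.
input='a == b && c != d || e'
result=$(printf '%s\n' "$input" | sed -E 's/ *(==|!=|&&|\|\|) */\1/g')
printf '%s\n' "$result"
```

More operators can be added as further alternatives inside the group; only characters that are special in ERE, such as |, need a backslash.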
You can chain multiple sed substitutions with the -e flag:
$ echo -n "test data here" | sed -e 's/test/TEST/' \
-e 's/data/HERE/' \
-e 's/here/DATA/'
TEST HERE DATA
You can use a sed script file (-f option) together with the -i option (in-place replacement, so there is no need to store the content in a variable):
sed -i -f mysedfile somefile.sh
mysedfile may contain expressions, one per line:
s/ *&& */\&\&/g
s/ *== */==/g
(Or use the -e option to pass several expressions, but if you have a lot of them it will quickly become unreadable.)
BTW: the -i option creates a temporary file in the directory of the processed file; if the operation succeeds, the original file is deleted and the temporary file is renamed to the original file name. From the manual:
When the end of the file is reached, the temporary file is renamed
to the output file's original name. The extension, if supplied,
is used to modify the name of the old file before renaming the
temporary file, thereby making a backup copy.
so there's no I/O overhead with that option. No need at all to store in a variable.
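A self-contained run of the script-file approach might look like this. The file contents are made up, and -i.bak (with the extension attached) is used so that a backup copy is kept:

```shell
#!/bin/sh
dir=$(mktemp -d)

# Made-up input standing in for somefile.sh.
printf '%s\n' 'x == y && z' > "$dir/somefile.sh"

# One substitution per line; & must be escaped in the replacement part.
cat > "$dir/mysedfile" <<'EOF'
s/ *&& */\&\&/g
s/ *== */==/g
EOF

# Edit in place, keeping the original as somefile.sh.bak.
sed -i.bak -f "$dir/mysedfile" "$dir/somefile.sh"

edited=$(cat "$dir/somefile.sh")
backup=$(cat "$dir/somefile.sh.bak")
rm -rf "$dir"
printf 'edited: %s\nbackup: %s\n' "$edited" "$backup"
```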

How to use variables in sed command

I have a file called "text_file1.txt" and the content of the file is
"subject= /C=US/O=AAA/OU=QA/OU=12345/OU=TESTAPP/"
What I want to achieve is for the content to look like this:
"subject= /C=US/O=AAA/$$$QA/###12345/###TESTAPP/"
When I execute the code below:
#! /bin/ksh
OU1="QA"
OU2=12345
OU3="TESTAPP"
`sed -i "s/OU=$OU1/$$$\${OU1}/g" text_file1.txt`
`sed -i "s/OU=$OU2/###\${OU2}/g" text_file1.txt`
`sed -i "s/OU=$OU3/###\${OU3}/g" text_file1.txt`
content=`cat text_file1.txt`
echo "content:$content"
I get output like this:
content:subject= /C=US/O=Wells Fargo/2865528655{OU1}/###12345/###TESTAPP/CN=03032015_CUST_2131_Unix_CLBLABB34C02.wellsfargo.com
Only this command is not working as expected: sed -i "s/OU=$OU1/$$$\${OU1}/g" text_file1.txt. Can anyone suggest what is wrong?
Thanks in advance.
Two things play into this:
You have to escape $ (i.e., use \$) in doubly-quoted shell strings if you want a literal $, and
\ does not retain its literal meaning when it comes before a $ inside backticks (that is to say, inside backticks, \$ becomes just $).
When you write
`sed -i "s/OU=$OU1/$$$\${OU1}/g" text_file1.txt`
because the command is in backticks, you spawn a subshell with the command
sed -i "s/OU=$OU1/$$$${OU1}/g" text_file1.txt
Since $$$$ is inside a doubly-quoted string, variable expansion takes place, and it is expanded as two occurrences of $$ (the process ID of the shell that's doing the expansion). This means that the code sed sees is ultimately
s/OU=QA/1234512345{OU1}/g
...if the process ID of the spawned subshell is 12345.
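The backtick behaviour is easy to see in isolation. In this sketch an ordinary variable (name, chosen only for illustration) is used instead of $$, so that the result does not depend on a process ID:

```shell
#!/bin/sh
name=world

# Inside backticks, \$ is reduced to $ before the inner command is parsed,
# so the inner command is: printf '%s' "$name"
in_backticks=`printf '%s' "\$name"`

# Inside $( ), the backslash survives, so the inner command still has \$
# and the dollar sign stays literal.
in_dollar_paren=$(printf '%s' "\$name")

printf '%s\n' "$in_backticks" "$in_dollar_paren"
```

The first variable holds the expanded value and the second holds the literal text $name, which is why the same sed command behaves differently once it is wrapped in backticks.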
In this particular case, you don't need the command substitution (the backticks), so you could write
sed -i "s/OU=$OU1/\$\$\$${OU1}/g" text_file1.txt
However, using shell variables in sed code is always a problem. Consider, if you will, what would happen if OU1 had the value /; e rm -Rf * # (hint: GNU sed has an e instruction that runs shell commands). For this reason, I would always prefer awk to do substitutions that involve shell variables:
cp text_file1.txt text_file1.txt~
awk -v OU1="$OU1" '{ gsub("OU=" OU1, "$$$" OU1) } 1' text_file1.txt~ > text_file1.txt
This avoids code injection problems by not treating OU1 as code.
If you have GNU awk 4.1 or later,
awk -v OU1="$OU1" -i inplace '{ gsub("OU=" OU1, "$$$" OU1) } 1' text_file1.txt
can do the whole thing without a (visible) temporary file.
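Both variants can be tried on a single line of input without touching any file. This sketch feeds the sample line from the question through a pipe instead of using -i:

```shell
#!/bin/sh
OU1="QA"
line='subject= /C=US/O=AAA/OU=QA/OU=12345/OU=TESTAPP/'

# sed: every literal $ is escaped for the double-quoted shell string.
with_sed=$(printf '%s\n' "$line" | sed "s/OU=$OU1/\$\$\$${OU1}/g")

# awk: OU1 is passed in as data with -v, so it is never parsed as sed code.
with_awk=$(printf '%s\n' "$line" | awk -v OU1="$OU1" '{ gsub("OU=" OU1, "$$$" OU1) } 1')

printf '%s\n' "$with_sed" "$with_awk"
```

Note that gsub still treats its first argument as a regular expression, so a value of OU1 containing regex metacharacters would need escaping in the awk version as well.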
Does this help as a start?
echo ''
OU1="QA"
echo "subject= /C=US/O=AAA/OU=${OU1}/OU=12345/OU=TESTAPP/" \
| sed -e "s|/OU=${OU1}/|/OU=\$\$\$${OU1}/|g"
The result is:
subject= /C=US/O=AAA/OU=$$$QA/OU=12345/OU=TESTAPP/
(You are mixing up the different uses of the $ sign.)
You must be careful when putting $ inside double quotes.
sed -i "s/OU=$OU1/"'$$$'"${OU1}/g" text_file1.txt
Example:
$ OU1="QA"
$ echo 'OU=QA' | sed "s/OU=$OU1/"'$$$'"${OU1}/g"
$$$QA

Using egrep to search for IP address by octets from a shell variable

I'm writing a BASH script that outputs iptables -L -n and searches for the existence of an IP address. I'm stuck with how to use this with egrep. Roughly:
CHECK=$(iptables -L -n | egrep $the_string)
which "looks" like it would work, but it doesn't have an end delimiter $ so it would match:
25.24.244
and
25.24.24
When I really just need to match for 25.24.24 only.
I tried escaping this but the $ creates issues with the regular expression.
At least this is the only means I've found to search for the IP in the iptables system. It doesn't appear to have any query mechanism itself (puzzling).
I am probably missing something very simple here, and just need a pointer or two :-)
Thanks.
You should backslash-escape each ., because an unescaped . matches any character in a regex:
iptables -L -n | grep "25\.24\.24$"
(There is no need for egrep here.)
The $ at the end of the regular expression works as expected:
the_ip=25.24.24
the_string=$(echo $the_ip | sed 's/\./\\\./g')
iptables -L -n | egrep "$the_string$"
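The escaping and the end anchor can be checked without root access by substituting a few made-up lines for the iptables output. The sample lines below are assumptions, not real iptables -L -n output:

```shell
#!/bin/sh
the_ip=25.24.24

# Turn every . into \. so that it only matches a literal dot.
the_string=$(printf '%s\n' "$the_ip" | sed 's/\./\\./g')

# Made-up lines standing in for the iptables listing.
sample=$(printf '%s\n' 'ACCEPT all -- 25.24.244' 'ACCEPT all -- 25.24.24' 'ACCEPT all -- 25x24y24')

# \$ keeps the end-of-line anchor literal inside the double quotes.
CHECK=$(printf '%s\n' "$sample" | grep -E "${the_string}\$")

printf '%s\n' "$CHECK"
```

Only the line ending in exactly 25.24.24 should be left: the escaped dots reject 25x24y24 and the anchor rejects 25.24.244.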
