How to grep a line starting with a single quote, then a letter followed by a fixed number of digits? - shell

I want to grep lines that start with a single quote, followed by one letter and exactly 5 digits.
e.g. given
'A12345'
'A123456'
the output should be only the line where the letter is followed by exactly 5 digits,
i.e. 'A12345'
My approach, which doesn't work:
grep -E '[A-Z]''[0-9]{5}'

# PCRE allows \xHH escapes
$ grep -P '^\x27[A-Z][0-9]{5}\x27' ip.txt
'A12345'
# double quotes can also be used here since there's no clash
$ grep -E "^'[A-Z][0-9]{5}'" ip.txt
'A12345'
# this is same as ^' followed by [A-Z][0-9]{5} and then another '
$ grep -E "^'"'[A-Z][0-9]{5}'"'" ip.txt
'A12345'
Here's an example where double quotes can be problematic. See https://mywiki.wooledge.org/Quotes and Difference between single and double quotes in Bash for more details.
$ echo '1a$(z)b' | grep -E "a$(z)b"
z: command not found
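As a hedged recap of the quoting options above, the sketch below runs the double-quoted pattern against a small made-up sample. The sample lines, the variable names, the LC_ALL=C setting and the trailing end-of-line anchor are assumptions added for illustration; they are not part of the original answer.

```shell
# Hypothetical sample input: one token per line.
sample=$(printf '%s\n' "'A12345'" "'A123456'" "'B00001'" "A12345")

# LC_ALL=C keeps [A-Z] limited to ASCII uppercase letters.
# The pattern is double-quoted so the single quotes are literal;
# \$ passes the end-of-line anchor to grep untouched by the shell.
matches=$(printf '%s\n' "$sample" | LC_ALL=C grep -E "^'[A-Z][0-9]{5}'\$")

printf '%s\n' "$matches"
```

Only the lines consisting of a quoted letter plus exactly five digits should survive; the six-digit line and the unquoted line are filtered out.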

Related

sed using dollar sign for both environment variable and end-of-line pattern matching [duplicate]

In sed, we use double quotes so that the shell expands variables marked by a dollar sign. What should I do if I also want to use the dollar sign as the end-of-line anchor in the same command?
Here is an example of my command (not working):
cat inputfile | xargs -n4 sh -c 'sed -ne "$1,$2p" bigfile|sed -e "$s/$/\n+--$3-----+/" > $0.outputfile'
I pipe the content of an input file (containing the output filename, the target line numbers (start and end), and a tag to add to each file) to xargs, which breaks the big file into smaller files. Example input file:
dataA
59
88
sometagA
dataB
91
236
sometagB
....
In the second sed command, sed -e "$s/$/\n+--$3-----+/", $3 is a positional parameter supplied by xargs, and I want the other two dollar signs (in $s and $/) to address the last line and the end of the line respectively, as I intend to append a tag to each output file. I cannot switch to single quotes here because the outer xargs command already uses them.
Use backslash to escape a dollar sign from the shell.
xargs -n4 sh -c 'sed -ne "$1,$2p" bigfile |
sed -e "\$s/\$/\n+--$3-----+/" > "$0".outputfile' <inputfile
You only pass four arguments and they are numbered from zero, so I guess you mean $3, not $4.
Notice also how to avoid the useless cat.
The two sed scripts could probably be merged into something like
sed -n "$1,$2!d;$2{;s/\$/+--$3-----+/p;q;}p"
Escape it from the shell to get an end-of-line anchor (the $ that addresses the last line needs the same escape):
sed -ne "$1,$2p" bigfile|sed -e "\$s/\$/\n+--$3-----+/"
Escape it once more for sed to get a literal dollar sign, e.g.:
sed -ne "$1,$2p" bigfile|sed -e "\$s/\$/\n+--$3-----+/" | sed 's/.*/& pays \\$99/'
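The escaping rule can be checked in isolation with a small sketch. The temporary file stands in for bigfile, and the line range 2 to 4 and the tag tagA are made-up values; the tag is appended on the same line (no \n in the replacement) to keep the example portable.

```shell
# Hypothetical stand-in for bigfile: five short lines.
tmp=$(mktemp)
printf '%s\n' one two three four five > "$tmp"

# Inside the single-quoted script, the inner shell expands $0..$3,
# while each \$ reaches sed as a plain $ (last-line address, end anchor).
result=$(sh -c 'sed -n "$1,$2p" "$0" | sed "\$s/\$/ +--$3--+/"' "$tmp" 2 4 tagA)

printf '%s\n' "$result"
rm -f "$tmp"
```

The selected lines are two, three and four, and only the last of them gets the tag appended.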

sed: remove all characters except for last n characters

I am trying to remove every character in a text string except for the last 11 characters. The string is Sample Text_that-would$normally~be,here--pe_-l4_mBY and what I want to end up with is just -pe_-l4_mBY.
Here's what I've tried:
$ cat food
Sample Text_that-would$normally~be,here--pe_-l4_mBY
$ cat food | sed 's/^.*(.{3})$/\1/'
sed: 1: "s/^.*(.{3})$/\1/": \1 not defined in the RE
Please note that the text string isn't really stored in a file, I just used cat food as an example.
OS is macOS High Sierra 10.13.6 and bash version is 3.2.57(1)-release
You can use this sed with a capture group:
sed -E 's/.*(.{11})$/\1/' file
-pe_-l4_mBY
Basic regular expressions (used by default by sed) require both the parentheses in the capture group and the braces in the brace expression to be escaped. ( and { are otherwise treated as literal characters to be matched.
$ cat food | sed 's/^.*\(.\{3\}\)$/\1/'
mBY
By contrast, explicitly requesting sed to use extended regular expressions with the -E option reverses the meaning, with \( and \{ being the literal characters.
$ cat food | sed -E 's/^.*(.{3})$/\1/'
mBY
Try this also:
grep -o -E '.{11}$' food
grep, like sed, accepts an arbitrary number of file name arguments, so there is no need for a separate cat. (See also useless use of cat.)
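You can use tail or parameter expansion:
string='Sample Text_that-would$normally~be,here--pe_-l4_mBY'
# tail -c counts bytes, including the newline that echo appends, hence 12
echo "$string" | tail -c 12
# eleven ? characters: strip the last 11 characters to get the prefix, then remove that prefix
echo "${string#"${string%???????????}"}"
-pe_-l4_mBY
-pe_-l4_mBY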
also with rev/cut/rev
$ echo abcdefghijklmnopqrstuvwxyz | rev | cut -c1-11 | rev
pqrstuvwxyz
man rev => rev - reverse lines characterwise
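To compare the approaches side by side, here is a minimal sketch that parameterises the count. The variable n and the result variable names are introduced only for this example; the string is the one from the question.

```shell
string='Sample Text_that-would$normally~be,here--pe_-l4_mBY'
n=11

# sed with extended regex: capture the last n characters of the line.
with_sed=$(printf '%s\n' "$string" | sed -E "s/.*(.{$n})\$/\1/")

# grep -o prints only the part of the line that matched.
with_grep=$(printf '%s\n' "$string" | grep -oE ".{$n}\$")

# tail -c counts bytes, so add one for the trailing newline.
with_tail=$(printf '%s\n' "$string" | tail -c "$((n + 1))")

printf '%s\n' "$with_sed" "$with_grep" "$with_tail"
```

Note that tail -c works on bytes, so it only agrees with the regex-based variants when every character is a single byte, as it is here.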

Adding double quotes to beginning, end and around commas in bash variable

I have a shell script that accepts a parameter that is comma delimited,
-s 1234,1244,1567
That is passed to a curl PUT json field. Json needs the values in a "1234","1244","1567" format.
Currently, I am passing the parameter with the quotes already in it:
-s "\"1234\",\"1244\",\"1567\"", which works, but the users are complaining that it's too much typing and hard to do. So I'd like to just take a comma delimited list like I had at the top and programmatically stick the quotes in.
Basically, I want a parameter to be passed in as 1234,2345 and end up as a variable that is "1234","2345"
I've read that the easiest approach here is to use sed, but I'm really not familiar with it and all of my efforts are failing.
You can do this in BASH:
$> arg='1234,1244,1567'
$> echo "\"${arg//,/\",\"}\""
"1234","1244","1567"
awk to the rescue!
$ awk -F, -v OFS='","' -v q='"' '{$1=$1; print q $0 q}' <<< "1234,1244,1567"
"1234","1244","1567"
or shorter with sed
$ sed -r 's/[^,]+/"&"/g' <<< "1234,1244,1567"
"1234","1244","1567"
translating this back to awk
$ awk '{print gensub(/([^,]+)/,"\"\\1\"","g")}' <<< "1234,1244,1567"
"1234","1244","1567"
you can use this:
QV=$(echo 1234,2345,56788 | sed -e 's/^/"/' -e 's/$/"/' -e 's/,/","/g')
result:
echo "$QV"
"1234","2345","56788"
This just adds double quotes at the start and end, and replaces every comma with quote/comma/quote globally.
easy to do with sed
$ echo '1234,1244,1567' | sed 's/[0-9]*/"\0"/g'
"1234","1244","1567"
[0-9]* matches zero or more consecutive digits; since * is greedy, it will try to match as many as possible
"\0" double quote the matched pattern, entire match is by default saved in \0
g global flag, to replace all such patterns
In case, \0 isn't recognized in some sed versions, use & instead:
$ echo '1234,1244,1567' | sed 's/[0-9]*/"&"/g'
"1234","1244","1567"
Similar solution with perl
$ echo '1234,1244,1567' | perl -pe 's/\d+/"$&"/g'
"1234","1244","1567"
Note: Using * instead of + with perl will give
$ echo '1234,1244,1567' | perl -pe 's/\d*/"$&"/g'
"1234""","1244""","1567"""
""$
I think this difference between sed and perl is similar to this question: GNU sed, ^ and $ with | when first/last character matches
Using sed:
$ echo 1234,1244,1567 | sed 's/\([0-9]\+\)/\"\1\"/g'
"1234","1244","1567"
ie. replace all strings of numbers with the same strings of numbers quoted using backreferencing (\1).
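For use inside a script, the sed variants above could be wrapped in a small helper. This is only a sketch: the function name quote_csv and the JSON field name ids are hypothetical, and the pattern [^,][^,]* is a portable spelling of "one or more non-comma characters".

```shell
# quote_csv: wrap every comma-separated item of $1 in double quotes.
quote_csv() {
    printf '%s\n' "$1" | sed 's/[^,][^,]*/"&"/g'
}

quoted=$(quote_csv '1234,1244,1567')

# Hypothetical JSON fragment built from the quoted list.
printf '{"ids":[%s]}\n' "$quoted"
```

The script can then accept the plain 1234,1244,1567 form from users and do the quoting itself before calling curl.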

sed or grep to read between a set of parentheses

I'm trying to read a version number from between a set of parentheses, from this output of some command:
Test Application version 1.3.5
card 0: A version 0x1010000 (1.0.0), 20 ch
Total known cards: 1
What I'm looking to get is 1.0.0.
I've tried variations of sed and grep:
command.sh | grep -o -P '(?<="(").*(?=")")'
command.sh | sed -e 's/(\(.*\))/\1/'
and plenty of variations. No luck :-(
Help?
You were almost there! With grep -P, use backslashes, not double quotes, to give the parentheses their literal meaning:
grep -o -P '(?<=\().*(?=\))'
Having GNU grep you can also use the \K escape sequence available in perl mode:
grep -oP '\(\K[^)]+'
\K removes what has been matched so far. In this case the starting ( gets removed from match.
Alternatively you could use awk:
awk -F'[()]' 'NF>1{print $2}'
The command splits input lines using parentheses as delimiters. Once a line has been split into multiple fields (meaning the parentheses were found), the version number is the second field and gets printed.
Btw, the sed command you've shown should be:
sed -ne 's/.*(\(.*\)).*/\1/p'
There are a couple of variations that will work. First with grep and sed:
grep '(' filename | sed 's/^.*[(]\(.*\)[)].*$/\1/'
or with a short shell script:
#!/bin/sh
while read -r line; do
    value=$(expr "$line" : ".*(\(.*\)).*")
    if [ "x$value" != "x" ]; then
        printf "%s\n" "$value"
    fi
done <"$1"
Both return 1.0.0 for your given input file.
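A minimal way to try the sed approach without the real command.sh is to feed it the sample output from the question. The variable names below are made up for the example, and [^)]* is used instead of .* so the capture stops at the first closing parenthesis.

```shell
# Sample output copied from the question; stands in for command.sh.
output='Test Application version 1.3.5
card 0: A version 0x1010000 (1.0.0), 20 ch
Total known cards: 1'

# -n plus the p flag prints only lines where the substitution succeeded.
version=$(printf '%s\n' "$output" | sed -n 's/.*(\([^)]*\)).*/\1/p')

printf '%s\n' "$version"
```

Lines without parentheses produce no output, so the variable ends up holding just the version string.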

understanding SED commands

I need to understand a shell script which uses the following command to fetch directions from a source to a destination using the Google Maps API:
wget --no-parent -O - https://maps.googleapis.com/maps/api/directions/json?origin=$begin\&destination=$finish\&sensor=false > new.txt
Next we fetch the following line of the output:
"html_instructions" : "Head \u003cb\u003enorthwest\u003c/b\u003e"
grep -n html_instructions new.txt > new1.txt
Can somebody please tell me the meaning of using:
sed -e 's/\\u003cb//g'
etc in the following command:
sed -e 's/\\u003cb//g' -e 's/\\u003e//g' -e 's/\\u003c\/b//g' -e 's/\\u003c//g' -e 's/div.*div//g' -e 's/.*://g' -e 's/"//g' -e 's/ "//g' new1.txt > new2.txt
Which outputs Head northwest only.
Thanks in advance!
sed -e 's/\\u003cb//g' -e 's/\\u003e//g' -e 's/\\u003c\/b//g' -e 's/\\u003c//g' -e 's/div.*div//g' -e 's/.*://g' -e 's/"//g' -e 's/ "//g' new1.txt > new2.txt
The string after each -e is a sed command. The sed command s/\\u003cb//g searches for every occurrence of the literal text \u003cb and replaces it with nothing. In the regular expression, \\ matches a single backslash, so the pattern is a backslash followed by u003cb. Since \u003c is the JSON escape for <, this removes the escaped form of <b from the string.
Thus, the first four sed commands remove every occurrence of \u003cb, \u003e (the escaped >), \u003c/b and \u003c, that is, the escaped <b> and </b> tags, from the lines of new1.txt; the final output is sent to new2.txt.
Additionally, s/div.*div//g removes everything from the first "div" on a line through the last "div". The command s/.*://g removes any text from the beginning of the line to the last colon in the line. s/"//g removes every occurrence of the double-quote character. s/ "//g removes every occurrence of a space followed by a double quote.
In general, the sed command s/old/new/ searches for the first occurrence of old on each line and replaces it with new. With a g appended at the end, as in s/old/new/g, it makes the substitution globally: it looks for every occurrence of old and replaces it with new. Adding a lot of power to these commands, old may be a regular expression. Consider s/.*://g. The dot character has the special meaning of "any character at all". The star character means zero or more of the preceding character. Thus the regular expression .*: means zero or more of any character followed by a colon.
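To see the literal-text matching in action, the sketch below applies only the first four substitutions to the sample line from the question. The variable names are made up for the example, and the remaining substitutions are left out on purpose so the intermediate result stays visible.

```shell
# Sample line from the question; single quotes keep the backslashes intact.
line='"html_instructions" : "Head \u003cb\u003enorthwest\u003c/b\u003e"'

# Each \\ in the pattern matches one literal backslash in the input.
stripped=$(printf '%s\n' "$line" |
    sed -e 's/\\u003cb//g' -e 's/\\u003e//g' -e 's/\\u003c\/b//g' -e 's/\\u003c//g')

printf '%s\n' "$stripped"
```

What remains is the key, the colon and the quoted text, which the later s/.*://g and s/"//g commands then reduce to the instruction itself.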
You can take all in one go with awk:
awk -F\" '/html_instructions/ {gsub(/(\\u003(c|cb|e)|\/b)/,x);print $4}'
Head northwest
So whole line should be:
wget --no-parent -O - https://maps.googleapis.com/maps/api/directions/json?origin=$begin\&destination=$finish\&sensor=false | awk -F\" '/html_instructions/ {gsub(/(\\u003(c|cb|e)|\/b)/,x);print $4}'
Head northwest
to get it into a variable
d=$(wget --no-parent -O - https://maps.googleapis.com/maps/api/directions/json?origin=$begin\&destination=$finish\&sensor=false | awk -F\" '/html_instructions/ {gsub(/(\\u003(c|cb|e)|\/b)/,x);print $4}')
echo $d
Head northwest

Resources