Grep last match of returned multi line result and assign to variable - bash

Lets say that I have a command list kittens that returns something in this multi line format in my terminal (in this exact layout):
[ 'fluffy'
'buster'
'bob1' ]
How can I fetch bob1 and assign to a variable for scripting use? Here's my non working try so far.
list kittens | grep "'([^']+)' \]"
I am not overly familiar with grepping on the cli and am running into issues of syntax with quotes and such.

If you know that bob1 will be in the last line, you can capture it like that:
myvar="$(list kittens | tail -n1 | grep -oP "'\K[^']+(?=')")"
This uses tail to find the last line and then grep with a lookahead and a lookbehind in the regular expression to extract the part inside the quotes.
Edit: The above assume that you are using GNU grep (for the -P mode). Here's an alternative with sed:
myvar="$(list kittens | tail -n1 | sed -e "s/^[^']*'//; s/'[^']*$//")"

Could be done by awk alone:
list kittens |awk 'END{gsub(/\047|[[:blank:]]|\]/,"");print $0}'
bob1
Example:
echo "$kit"
[ 'fluffy'
'buster'
'bob1' ]
echo "$kit" |awk 'END{gsub(/\047|[[:blank:]]|\]/,"");print $0}'
bob1
To Assign it to any variable:
var=$(list kittens |awk 'END{gsub(/\047|[[:blank:]]|\]/,"");print $0}'
Explanation:
END{}: End block is used to take data from last line as we are interested only for last line.
gsub: This is awk's inbuilt function for search and replacement tasks. Here white space and double quoted and single quotes are removed. Not that \047 is used for single quote replacement.

Related

How do I cut everything on each line starting with a multi-character string in bash?

How do I cut everything starting with a given multi-character string using a common shell command?
e.g., given:
foo+=bar
I want:
foo
i.e., cut everything starting with +=
cut doesn't work because it only takes a single-character delimiter, not a multi-character string:
$ echo 'foo+=bar' | cut -d '+=' -f 1
cut: bad delimiter
If I can't use cut, I would consider using perl instead, or if there's another shell command that is more commonly installed.
cut only allows single character delimiter.
You may use bash string manipulation:
s='foo+=bar'
echo "${s%%+=*}"
foo
or use more powerful awk:
awk -F '\\+=' '{print $1}' <<< "$s"
foo
'\\+=' is a regex that matches + followed by = character.
You can use 'sed' command to do this:
string='foo+=bar'
echo ${string} | sed 's/+=.*//g'
foo
or if you're using Bash shell, then use the below parameter expansion (recommended) since it doesn't create unnecessary pipeline and another sed process and so is efficient:
echo ${string%%\+\=*}
or
echo ${string%%[+][=]*}

how to grep everything between single quotes?

I am having trouble figuring out how to grep the characters between two single quotes .
I have this in a file
version: '8.x-1.0-alpha1'
and I like to have the output like this (the version numbers can be various):
8.x-1.0-alpha1
I wrote the following but it does not work:
cat myfile.txt | grep -e 'version' | sed 's/.*\?'\(.*?\)'.*//g'
Thank you for your help.
Addition:
I used the sed command sed -n "s#version:\s*'\(.*\)'#\1#p"
I also like to remove 8.x- which I edited to sed -n "s#version:\s*'8.x-\(.*\)'#\1#p".
This command only works on linux and it does not work on MAC. How to change this command to make it works on MAC?
sed -n "s#version:\s*'8.x-\(.*\)'#\1#p"
If you just want to have that information from the file, and only that you can quickly do:
awk -F"'" '/version/{print $2}' file
Example:
$ echo "version: '8.x-1.0-alpha1'" | awk -F"'" '/version/{print $2}'
8.x-1.0-alpha1
How does this work?
An awk program is a series of pattern-action pairs, written as:
condition { action }
condition { action }
...
where condition is typically an expression and action a series of commands.
-F "'": Here we tell awk to define the field separator FS to be a <single quote> '. This means the all lines will be split in fields $1, $2, ... ,$NF and between each field there is a '. We can now reference these fields by using $1 for the first field, $2 for the second ... etc and this till $NF where NF is the total number of fields per line.
/version/{print $2}: This is the condition-action pair.
condition: /version/:: The condition reads: If a substring in the current record/line matches the regular expression /version/ then do action. Here, this is simply translated as if the current line contains a substring version
action: {print $2}:: If the previous condition is satisfied, then print the second field. In this case, the second field would be what the OP requests.
There are now several things that can be done.
Improve the condition to be /^version :/ && NF==3 which reads _If the current line starts with the substring version : and the current line has 3 fields then do action
If you only want the first occurance, you can tell the system to exit immediately after the find by updating the action to {print $2; exit}
I'd use GNU grep with pcre regexes:
grep -oP "version: '\\K.*(?=')" file
where we are looking for "version: '" and then the \K directive will forget what it just saw, leaving .*(?=') to match up to the last single quote.
Try something like this: sed -n "s#version:\s*'\(.*\)'#\1#p" myfile.txt. This avoids the redundant cat and grep by finding the "version" line and extracting the contents between the single quotes.
Explanation:
the -n flag tells sed not to print lines automatically. We then use the p command at the end of our sed pattern to explicitly print when we've found the version line.
Search for pattern: version:\s*'\(.*\)'
version:\s* Match "version:" followed by any amount of whitespace
'\(.*\)' Match a single ', then capture everything until the next '
Replace with: \1; This is the first (and only) capture group above, containing contents between single quotes.
When your only want to look at he quotes, you can use cut.
grep -e 'version' myfile.txt | cut -d "'" -f2
grep can almost do this alone:
grep -o "'.*'" file.txt
But this may also print lines you don't want to: it will print all lines with 2 single quotes (') in them. And the output still has the single quotes (') around it:
'8.x-1.0-alpha1'
But sed alone can do it properly:
sed -rn "s/^version: +'([^']+)'.*/\1/p" file.txt

How do I seperate a link to get the end of a URL in shell?

I have some data that looks like this
"thumbnailUrl": "http://placehold.it/150/adf4e1"
I want to know how I can get the trailing part of the URL, I want the output to be
adf4e1
I was trying to grep when starting with / and ending with " but I'm only a beginner in shell scripting and need some help.
I came up with a quick and dirty solution, using grep (with perl regex) and cut:
$ cat file
"thumbnailUrl": "http://placehold.it/150/adf4e1"
"anotherUrl": "http://stackoverflow.com/questions/3979680"
"thumbnailUrl": "http://facebook.com/12f"
"randortag": "http://google.com/this/is/how/we/roll/3fk19as1"
$ cat file | grep -o '/\w*"$' | cut -d'/' -f2- | cut -d'"' -f1
adf4e1
3979680
12f
3fk19as1
We could kill this with a thousand little cuts, or just one blow from Awk:
awk -F'[/"]' '{ print $(NF-1); }'
Test:
$ echo '"thumbnailUrl": "http://placehold.it/150/adf4e1"' \
| awk -F'[/"]' '{ print $(NF-1); }'
adf4e1
Filter thorugh Awk using double quotes and slashes as field separators. This means that the trailing part ../adf4e1" is separated as {..}</>{adf4e1}<">{} where curly braces denote fields and angle brackets separators. The Awk variable NF gives the 1-based number of fields and so $NF is the last field. That's not the one we want, because it is blank; we want $(NF-1): the second last field.
"Golfed" version:
awk -F[/\"] '$0=$(NF-1)'
If the original string is coming from a larger JSON object, use something like jq to extract the value you want.
For example:
$ jq -n '{thumbnail: "http://placehold.it/150/adf4e1"}' |
> jq -r '.thumbnail|split("/")[-1]'
adf4e1
(The first command just generates a valid JSON object representing the original source of your data; the second command parses it and extracts the desired value. The split function splits the URL into an array, from which you only care about the last element.)
You can also do this purely in bash using string replacement and substring removal if you wrap your string in single quotes and assign it to a variable.
#!/bin/bash
string='"thumbnailUrl": "http://placehold.it/150/adf4e1"'
string="${string//\"}"
echo "${string##*/}"
adf4e1 #output
You can do that using 'cut' command in linux. Cut it using '/' and keep the last cut. Try it, its fun!
Refer http://www.thegeekstuff.com/2013/06/cut-command-examples

shell script cut from variables

The file is like this
aaa&123
bbb&234
ccc&345
aaa&456
aaa$567
bbb&678
I want to output:(contain "aaa" and text after &)
123
456
I want to do in in shell script,
Follow code be consider
#!/bin/bash
raw=$(grep 'aaa' 1.txt)
var=$(cut -f2 -d"&" "$raw")
echo $var
It give me a error like
cut: aaa&123
aaa&456
aaa$567: No such file or directory
How to fix it? and how to cut (or grep or other) from exist variables?
Many thanks!
With GNU grep:
grep -oP 'aaa&\K.*' file
Output:
123
456
\K: ignore everything before pattern matching and ignore pattern itself
From man grep:
-o, --only-matching
Print only the matched (non-empty) parts of a matching line,
with each such part on a separate output line.
-P, --perl-regexp
Interpret PATTERN as a Perl compatible regular expression (PCRE)
Cyrus has my vote. An awk alternative if GNU grep is not available:
awk -F'&' 'NF==2 && $1 ~ /aaa/ {print $2}' file
Using & as the field separator, for lines with 2 fields (i.e. & must be present) and the first field contains "aaa", print the 2nd field.
The error with your answer is that you are treating the grep output like a filename in the cut command. What you want is this:
grep 'aaa.*&' file | cut -d'&' -f2
The pattern means "aaa appears before an &"

Text Manipulation using sed or AWK

I get the following result in my script when I run it against my services. The result differs depending on the service but the text pattern showing below is similar. The result of my script is assigned to var1. I need to extract data from this variable
$var1=HOST1*prod*gem.dot*serviceList : svc1 HOST1*prod*kem.dot*serviceList : svc3, svc4 HOST1*prod*fen.dot*serviceList : svc5, svc6
I need to strip the name of the service list from $var1. So the end result should be printed on separate line as follow:
svc1
svc2
svc3
svc4
svc5
svc6
Can you please help with this?
Regards
Using sed and grep:
sed 's/[^ ]* :\|,\|//g' <<< "$var1" | grep -o '[^ ]*'
sed deletes every non-whitespace before a colon and commas. Grep just outputs the resulting services one per line.
Using gnu grep and gnu sed:
grep -oP ': *\K\w+(, \w+)?' <<< "$var1" | sed 's/, /\n/'
svc1
svc3
svc4
svc5
svc6
grep is the perfect tool for the job.
From man grep:
-o, --only-matching
Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.
Sounds perfect!
As far as I'm aware this will work on any grep:
echo "$var1" | grep -o 'svc[0-9]\+'
Matches "svc" followed by one or more digits. You can also enable the "highly experimental" Perl regexp mode with -P, which means you can use the \d digit character class and don't have to escape the + any more:
grep -Po 'svc\d+' <<<"$var1"
In bash you can use <<< (a Here String) which supplies "$var1" to grep on the standard input.
By the way, if your data was originally on separate lines, like:
HOST1*prod*gem.dot*serviceList : svc1
HOST1*prod*kem.dot*serviceList : svc3, svc4
HOST1*prod*fen.dot*serviceList : svc5, svc6
This would be a good job for awk:
awk -F': ' '{split($2,a,", "); for (i in a) print a[i]}'

Resources