How can I capture all numbers in a string using sed - shell

I tried to use sed to capture the numbers in a string with the following script:
echo '["770001,德邦优化混合","750005,安信平稳增长混合发起A"]' | sed -n 's/.*"\(\d{6}\),/\1/p'
I expected it to print:
770001
750005
But nothing is printed. Why?

If you are OK with awk, the following may help. I have an old version of awk, so I am using --re-interval; with a newer version you may not need it.
echo '["770001,德邦优化混合","750005,安信平稳增长混合发起A"]' |
awk --re-interval '{while(match($0,/[0-9]{6}/)){print substr($0,RSTART,RLENGTH);$0=substr($0,RSTART+RLENGTH+1)}}'
Output will be as follows.
770001
750005
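As for why the original sed prints nothing: sed's regular expressions have no `\d` digit class, and in its default (basic) syntax a bare `{6}` is taken as literal braces, so the pattern never matches. Even with those fixed, a single `s` command with a leading `.*` would keep only one of the numbers. A minimal sketch that sidesteps both issues, assuming a grep that supports `-o` (GNU and BSD grep do; it is not required by POSIX). The variable names `input` and `codes` are only illustrative:

```shell
input='["770001,德邦优化混合","750005,安信平稳增长混合发起A"]'

# [0-9]{6} is an ERE interval (enabled by -E); -o prints each match on its own line.
codes=$(printf '%s\n' "$input" | grep -oE '[0-9]{6}')

printf '%s\n' "$codes"
```

Note that this prints any run of six digits, not only those that directly follow a double quote.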

Related

Extract substring from a variables between two patterns in bash with special characters

I am trying to extract a substring between two patterns in bash from a variable that has special characters in it.
The variable:
MQ_URI=ssl://b-7dda5da6-59a5-4150-8e2f-16534985665-1.mq.us-east-1.amazonaws.com:61617?jms.prefetchPolicy.queuePrefetch=0
What I've tried so far:
echo "$MQ_URI" | sed -E 's/.*ssl:// (.*) :61617.*/\1/'
Got me this in response:
sed: -e expression #1, char 12: unknown option to `s'
Also tried with grep:
echo $MQ_URI | grep -o -P '(?<=ssl://).*(?=:61617jms.prefetchPolicy.queuePrefetch=0)
The output I need is everything between: "ssl://" and ":61617?jms.prefetchPolicy.queuePrefetch=0"
which is : "b-7dda5da6-59a5-4150-8e2f-16534985665-1.mq.us-east-1.amazonaws.com"
Using bash
$ mq_uri=${mq_uri##*/}
$ mq_uri=${mq_uri//:*}
$ echo "$mq_uri"
b-7dda5da6-59a5-4150-8e2f-16534985665-1.mq.us-east-1.amazonaws.com
sed
$ sed -E 's~[^-]*/([^?]*):.*~\1~' <<< "$mq_uri"
b-7dda5da6-59a5-4150-8e2f-16534985665-1.mq.us-east-1.amazonaws.com
grep
$ grep -Po '[^-]*/\K[^:]*' <<< "$mq_uri"
b-7dda5da6-59a5-4150-8e2f-16534985665-1.mq.us-east-1.amazonaws.com
awk
$ awk -F'[/:]' '{print $4}' <<< "$mq_uri"
b-7dda5da6-59a5-4150-8e2f-16534985665-1.mq.us-east-1.amazonaws.com
If this is what you expect
echo "$MQ_URI" | sed -E 's#.*ssl://(.*):61617.*#\1#'
b-7dda5da6-59a5-4150-8e2f-16534985665-1.mq.us-east-1.amazonaws.com
Replace the delimiter with # or any other character not found in the pattern, so the slashes in ssl:// are no longer read as delimiters.
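A short sketch of the two usual ways to keep the slashes in `ssl://` from ending the `s` pattern early, using the URI from the question. The variable names `host_a` and `host_b` are illustrative only, and `-E` assumes GNU or BSD sed:

```shell
MQ_URI='ssl://b-7dda5da6-59a5-4150-8e2f-16534985665-1.mq.us-east-1.amazonaws.com:61617?jms.prefetchPolicy.queuePrefetch=0'

# Option 1: pick a delimiter that does not occur in the pattern.
host_a=$(printf '%s\n' "$MQ_URI" | sed -E 's#.*ssl://(.*):61617.*#\1#')

# Option 2: keep / as the delimiter and escape every literal slash.
host_b=$(printf '%s\n' "$MQ_URI" | sed -E 's/.*ssl:\/\/(.*):61617.*/\1/')

printf '%s\n%s\n' "$host_a" "$host_b"
```

Both commands should give the same host name; the first is usually easier to read.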
With your shown samples and attempts, please try the following code.
##Shell variable named `mq_uri` being created here.
##to be used in following all solutions.
mq_uri="ssl://b-7dda5da6-59a5-4150-8e2f-16534985665-1.mq.us-east-1.amazonaws.com:61617?jms.prefetchPolicy.queuePrefetch=0"
1st solution: Using awk's match function along with the split function.
awk 'match($0,/^ssl:.*:61617\?/){split(substr($0,RSTART,RLENGTH),arr,"[/:]");print arr[4]}' <<<"$mq_uri"
2nd solution: Using GNU grep along with its -oP options and its \K option to get required output.
grep -oP '^ssl:\/\/\K[^:]*(?=:61617\?)' <<<"$mq_uri"
3rd solution: Using match function of awk along with using gsub to Globally substitute values to get required output.
awk 'match($0,/^ssl:.*:61617\?/){val=substr($0,RSTART,RLENGTH);gsub(/^ssl:\/\/|:.*\?/,"",val);print val}' <<<"$mq_uri"
4th solution: Using awk's match function along with its array creation capability in GNU awk.
awk 'match($0,/^ssl:\/\/(.*):61617\?/,arr){print arr[1]}' <<<"$mq_uri"
5th solution: Using a perl one-liner.
perl -pe 's/ssl:\/\/(.*):61617\?.*/\1/' <<<"$mq_uri"

One-liner POSIX command to lowercase string in bash

Problem
I have this command:
sed $((SS - default_scripts))!d customScripts.txt
and it gives me Foo Bar.
I want to convert this to lowercase.
Attempt
When I tried using the | awk '{print tolower($0)}' command on it it returned nothing:
$($(sed $((SS - default_scripts))!d customScripts.txt) | awk '{print tolower($0)}')
Final
Please enlighten me on my typo, or recommend me another POSIX way of converting a whole string to lowercase in a compact manner. Thank you!
The pipe to awk should be inside the same command substitution as sed, so that it processes the output of sed.
$(sed $((SS - default_scripts))!d customScripts.txt | awk '{print tolower($0)}')
You don't need another command substitution around both of them.
Your typo was wrapping everything in $(...) and so first trying to execute the output of just the sed part and then trying to execute the output of the sed ... | awk ... pipeline.
You don't need sed commands nor shell arithmetic operations when you're using awk. If I understand what you're trying to do with this:
$(sed $((SS - default_scripts))!d customScripts.txt) | awk '{print tolower($0)}'
correctly then it'd be just this awk command:
awk -v s="$SS" -v d="$default_scripts" 'BEGIN{n=s-d} NR==n{print tolower($0); exit}' customScripts.txt
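Another POSIX option for the lowercasing step is `tr` with character classes. A minimal sketch, where the sample text `Foo Bar` stands in for the line that sed selects and the variable names are illustrative:

```shell
line='Foo Bar'

# tr maps each uppercase character to its lowercase counterpart.
lower=$(printf '%s\n' "$line" | tr '[:upper:]' '[:lower:]')

printf '%s\n' "$lower"
```

In the original pipeline the same `tr` call would simply replace the awk stage after sed.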

How to extract hostname from OCS ID

I'm fixing a bash script on GLPI that displays machines that have not been updated for more than a month. The input data looks like this:
myhostname01t-2015-03-09-16-47-42
I'd like to get only the name with a regex: myhostname01t
I've created this regex :
(.*)-([0-9]{4}.*)
With sed, how could I get only the name?
Isn't a simple cut enough? It is easier to comprehend than sed.
echo 'myhostname01t-2015-03-09-16-47-42' | cut -d'-' -f1
Since you asked for sed:
$ S="myhostname01t-2015-03-09-16-47-42"
$ sed 's/\(^[^-]\+\).*/\1/' <<<$S
myhostname01t
or even simpler with awk:
$ awk -F- '{print $1}' <<<$S
myhostname01t
Perl:
$ perl -F- -lane 'print $F[0]' <<<$S
myhostname01t
If the aim is to extract the hostname from an OCS ID stored in one variable and put it in another variable then it can be done with pure Bash code:
ocsid='parta-1234-partb-2015-03-09-16-47-42'
hostname=${ocsid%-*-*-*-*-*-*}
printf 'hostname=%s\n' "$hostname"
See "Removing part of a string" in BashFAQ/100 (How do I do string manipulation in bash?) for an explanation of ${ocsid%-*-*-*-*-*-*}.
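If sed is preferred, the same idea (strip exactly the trailing timestamp, so hyphens inside the hostname survive) can be sketched with an anchored pattern. This assumes the suffix always has the form `-YYYY-MM-DD-HH-MM-SS` and a sed that accepts `-E` (GNU or BSD); the variable name `host` is illustrative:

```shell
ocsid='parta-1234-partb-2015-03-09-16-47-42'

# Remove a 4-digit year followed by five 2-digit groups, anchored at end of line.
host=$(printf '%s\n' "$ocsid" | sed -E 's/-[0-9]{4}(-[0-9]{2}){5}$//')

printf 'hostname=%s\n' "$host"
```

Unlike splitting on the first hyphen, this keeps names such as `parta-1234-partb` intact.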

Reading numbers from a text line in bash shell

I'm trying to write a bash shell script, that opens a certain file CATALOG.dat, containing the following lines, made of both characters and numbers:
event_0133_pk.gz
event_0291_pk.gz
event_0298_pk.gz
event_0356_pk.gz
event_0501_pk.gz
What I wanna do is print the numbers (only the numbers) inside a new file NUMBERS.dat, using something like > ./NUMBERS.dat, to get:
0133
0291
0298
0356
0501
My problem is: how do I extract the numbers from the text lines? Is there something to make the script read just the number as a variable, like event_0%d_pk.gz in C/C++?
A grep solution:
grep -oP '[0-9]+' CATALOG.dat >NUMBERS.dat
A sed solution:
sed 's/[^0-9]//g' CATALOG.dat >NUMBERS.dat
And an awk solution:
awk -F"[^0-9]+" '{print $2}' CATALOG.dat >NUMBERS.dat
There are many ways that you can achieve your result. One way would be to use awk:
awk -F_ '{print $2}' CATALOG.dat > NUMBERS.dat
This sets the field separator to an underscore, then prints the second field which contains the numbers.
Awk
awk 'gsub(/[^[:digit:]]/,"")' infile
Bash
while IFS= read -r line; do echo "${line//[!0-9]}"; done < infile
tr
tr -cd '[:digit:]\n' <infile
You can use grep command to extract the number part.
grep -oP '(?<=_)\d+(?=_)' CATALOG.dat
gives output as
0133
0291
0298
0356
0501
Or, more simply:
grep -oP '\d+' CATALOG.dat
You don't need perl mode in grep for this. BREs can do this.
grep -o '[[:digit:]]\+' CATALOG.dat > NUMBERS.dat
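For the `sscanf`-style idea in the question, shell parameter expansion can strip the fixed prefix and suffix around the number. A minimal sketch, assuming every line has the exact form `event_<digits>_pk.gz`; the function name `extract_numbers` is hypothetical:

```shell
# Hypothetical helper: print the digits from each "event_<digits>_pk.gz" line on stdin.
extract_numbers() {
    while IFS= read -r line; do
        num=${line#event_}    # drop the leading "event_"
        num=${num%_pk.gz}     # drop the trailing "_pk.gz"
        printf '%s\n' "$num"
    done
}

printf '%s\n' event_0133_pk.gz event_0291_pk.gz | extract_numbers
```

With the files from the question it would be called as `extract_numbers < CATALOG.dat > NUMBERS.dat`.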

shell command to truncate/cut a part of string

I have a file with the below contents. I got the command to print version number out of it. But I need to truncate the last part in the version file
file.spec:
Version: 3.12.0.2
Command used:
VERSION=($(grep -r "Version:" /path/file.spec | awk '{print ($2)}'))
echo $VERSION
Current output : 3.12.0.2
Desired output : 3.12.0
There is absolutely no need for external tools like awk, sed etc. for this simple task if your shell is POSIX-compliant (which it should be) and supports parameter expansion:
$ cat file.spec
Version: 3.12.0.2
$ version=$(<file.spec)
$ version="${version#* }"
$ version="${version%.*}"
$ echo "${version}"
3.12.0
Try this:
VERSION=($(grep -r "Version:" /path/file.spec| awk '{print ($2)}' | cut -d. -f1-3))
cut splits the string on the field delimiter given with -d; you then select the desired fields with the -f parameter.
You could use this single awk script awk -F'[ .]' '{print $2"."$3"."$4}':
$ VERSION=$(awk -F'[ .]' '{print $2"."$3"."$4}' /path/file.spec)
$ echo $VERSION
3.12.0
Or this single grep
$ VERSION=$(grep -Po 'Version: \K\d+[.]\d+[.]\d+' /path/file.spec)
$ echo $VERSION
3.12.0
But you never need grep and awk together.
If you only grep a single file, -r makes no sense.
Also, based on the output of your command line, this grep should work:
grep -Po '(?<=Version: )(\d+\.){2}\d+' /path/file.spec
gives you:
3.12.0
\K is also nice: it works where a look-behind would need a variable length (available since PCRE 7.2). Another answer covers it, but I find a look-behind easier to read when the length is fixed.
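A single sed call can also do both steps, selecting the `Version:` line and dropping the last dotted component. A minimal sketch, assuming the line looks exactly like `Version: 3.12.0.2`; the sample text is piped in here instead of reading /path/file.spec, and the variable names are illustrative:

```shell
spec_line='Version: 3.12.0.2'

# Capture everything after "Version: " up to, but not including, the last ".component".
version=$(printf '%s\n' "$spec_line" | sed -n 's/^Version: \(.*\)\.[^.]*$/\1/p')

printf '%s\n' "$version"
```

Because of `-n` and the `p` flag, lines that do not start with `Version: ` produce no output.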
