Awk command to cut the url - shell

I want to cut my URL https://jenkins-crumbtest2.origin-ctc-core-nonprod.com/ down to https://origin-ctc-core-nonprod.com. I have tried several ways to handle it:
$ echo https://jenkins-crumbtest2-test.origin-ctc-core-nonprod.com/ | cut -d"/" -f3 | cut -d"/" -f5
jenkins-crumbtest2.origin-ctc-core-nonprod.com
I have 3 possible inputs, and passing any one of them should produce the same expected output.
Input:
1. https://jenkins-crumbtest2-test.origin-ctc-core-nonprod.com/ (or)
2. https://jenkins-crumbtest2.origin-ctc-core-nonprod.com/ (or)
3. https://jenkins-crumbtest2-test-lite.origin-ctc-core-nonprod.com/
Expected Output:
https://origin-ctc-core-nonprod.com
Can someone please help me ?

Could you please try the following? Written and tested with the shown samples only.
awk '{gsub(/:\/\/.*test\.|:\/\/.*crumbtest2\.|:\/\/.*test-lite\./,"://")} 1' Input_file
Or, the same solution in non-one-liner form:
awk '
{
gsub(/:\/\/.*test\.|:\/\/.*crumbtest2\.|:\/\/.*test-lite\./,"://")
}
1
' Input_file
Explanation: Adding detailed explanation for above.
awk ' ##Starting awk program from here.
{
gsub(/:\/\/.*test\.|:\/\/.*crumbtest2\.|:\/\/.*test-lite\./,"://") ##Globally substituting everything from :// up to test. OR crumbtest2. OR test-lite. with :// in the line.
}
1 ##Printing current line here.
' Input_file ##Mentioning Input_file name here.

This awk skips the records that don't have fixed string origin-ctc-core-nonprod.com in them:
awk 'match($0,/origin-ctc-core-nonprod\.com/){print "https://" substr($0,RSTART,RLENGTH)}'
You can use it with: echo string | awk ..., cat file | awk ... or awk ... file.
Explained:
awk ' # using awk
match($0,/origin-ctc-core-nonprod\.com/) { # if fixed string is matched
print "https://" substr($0,RSTART,RLENGTH) # output https:// and fixed string
# exit # uncomment if you want only
}' # one line of output like in sample
Or if you don't need the https:// part, you could just use grep:
grep -om 1 "origin-ctc-core-nonprod\.com"
Then again:
$ var=$(grep -om 1 "origin-ctc-core-nonprod\.com" file) && echo https://$var
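If the part to drop is always the first dot-separated label of the host name (true for all three sample inputs, but an assumption beyond them), plain POSIX parameter expansion is enough and no awk is needed. A minimal sketch; the function name strip_first_label is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: drop the first host label from a URL.
# Assumes the input looks like scheme://label.rest-of-host/ .
strip_first_label() {
  url=${1%/}               # remove one trailing slash, if present
  scheme=${url%%://*}      # text before "://"
  host=${url#*://}         # text after "://"
  printf '%s://%s\n' "$scheme" "${host#*.}"   # cut up to the first dot
}

strip_first_label "https://jenkins-crumbtest2-test.origin-ctc-core-nonprod.com/"
```

Unlike the gsub version, this does not depend on the label ending in test, crumbtest2 or test-lite, and it also removes the trailing slash, so the call above prints https://origin-ctc-core-nonprod.com.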

Related

Extract text between 2 similar or different strings separately in shell script

I want to extract text between each ### separately to compare with a different file. Need to extract all CVE numbers for all docker images to compare from previous report. File looks as shown below. This is a snippet and it has more than 100 such lines. Need to do this via Shell Script. Kindly help.
### Vulnerabilities found in docker image alarm-integrator:22.0.0-150
| CVE | X-ray Severity | Anchore Severity | Trivy Severity | TR |
| :--- | :------------: | :--------------: | :------------: | :--- |
|[CVE-2020-29361](#221fbde4e2e4f3dd920622768262ee64c52d1e1384da790c4ba997ce4383925e)|||Important|
|[CVE-2021-35515](#898e82a9a616cf44385ca288fc73518c0a6a20c5e0aae74ed8cf4db9e36f25ce)|||High|
### Vulnerabilities found in docker image br-agent:22.0.0-154
| CVE | X-ray Severity | Anchore Severity | Trivy Severity | TR |
| :--- | :------------: | :--------------: | :------------: | :--- |
|[CVE-2020-29361](#221fbde4e2e4f3dd920622768262ee64c52d1e1384da790c4ba997ce4383925e)|||Important|
|[CVE-2021-23214](#75eaa96ec256afa7bc6bc3445bab2e7c5a5750678b7cda792e3c690667eacd98)|||Important|
I've tried something like this grep -oP '(?<=\"##\").*?(?=\"##\")' but it doesn't work.
Expected Output:
For alarm-integrator
CVE-2020-29361
CVE-2021-35515
For br-agent
CVE-2020-29361
CVE-2021-23214
With your shown samples, please try following awk code.
awk '
/^##/ && match($0,/docker image[[:space:]]+[^:]*/){
split(substr($0,RSTART,RLENGTH),arr1)
print "For "arr1[3]
next
}
match($0,/^\|\[[^]]*/){
print substr($0,RSTART+2,RLENGTH-2)
}
' Input_file
Explanation: Adding detailed explanation for above awk code.
awk ' ##Starting awk program from here.
/^##/ && match($0,/docker image[[:space:]]+[^:]*/){ ##Checking condition if line starts from ## AND using match function to match regex docker image[[:space:]]+[^:]* to get needed value.
split(substr($0,RSTART,RLENGTH),arr1) ##Splitting matched part in above match function into arr1 array with default delimiter of space here.
print "For "arr1[3] ##Printing string For, a space and the 3rd element of arr1 here.
next ##next will skip all further statements from here.
}
match($0,/^\|\[[^]]*/){ ##using match function to match starting |[ till first occurrence of ] here.
print substr($0,RSTART+2,RLENGTH-2) ##printing matched sub string from above regex.
}
' Input_file ##mentioning Input_file name here.
Using GNU awk (which I assume you have or can get since you're using GNU grep) for the 3rd arg to match():
$ cat tst.awk
match($0,/^###.* ([^:]+):.*/,a) { print "For", a[1] }
match($0,/\[([^]]+)/,a) { print a[1] }
!NF
$ awk -f tst.awk file
For alarm-integrator
CVE-2020-29361
CVE-2021-35515
For br-agent
CVE-2020-29361
CVE-2021-23214
with awk you can do:
awk -v FS=' |[[]|[]]' '/^[#]+/{sub(/:.*$/,"");print "For " $NF} /^\|\[/{print $2} /^$/ {print ""}' file
For alarm-integrator
CVE-2020-29361
CVE-2021-35515
For br-agent
CVE-2020-29361
CVE-2021-23214
We set the field separator FS to ' |[[]|[]]': a space, a [ character, or a ] character.
The first condition-action pair prints For alarm-integrator and For br-agent.
The second condition-action pair prints all the CVE numbers.
Lastly, the third one passes blank lines through.
More readable:
awk -v FS=' |[[]|[]]' '
/^[#]+/{sub(/:.*$/,"");print "For " $NF}
/^\|\[/{print $2}
/^$/ {print ""}
' file
For alarm-integrator
CVE-2020-29361
CVE-2021-35515
For br-agent
CVE-2020-29361
CVE-2021-23214
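Since the stated goal is to compare against a previous report, it may be more convenient to emit one image/CVE pair per line, which sorts and diffs cleanly. A sketch under that assumption, reusing the match() idea from the first answer; list_cves and the file names in the usage note are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print "image<TAB>CVE" for every CVE row, sorted.
# Reads the files given as arguments, or stdin when there are none.
list_cves() {
  awk '
    /^###/ {                        # header line: remember the image name
      image = $NF                   # last word, e.g. alarm-integrator:22.0.0-150
      sub(/:.*/, "", image)         # strip the version tag
      next
    }
    match($0, /^\|\[[^]]*/) {       # table row that starts with |[CVE-...
      print image "\t" substr($0, RSTART + 2, RLENGTH - 2)
    }
  ' "$@" | sort -u
}

list_cves <<'EOF'
### Vulnerabilities found in docker image alarm-integrator:22.0.0-150
| CVE | X-ray Severity | Anchore Severity | Trivy Severity | TR |
| :--- | :------------: | :--------------: | :------------: | :--- |
|[CVE-2020-29361](#221fbde4e2e4f3dd920622768262ee64c52d1e1384da790c4ba997ce4383925e)|||Important|
EOF
```

With two reports, something like list_cves old.md > old.tsv; list_cves new.md > new.tsv; comm -13 old.tsv new.tsv should list the pairs that appear only in the new report.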

Extract a property value from a text file

I have a log file which contains lines like the following one:
Internal (reserved=1728469KB, committed=1728469KB)
I'd need to extract the value contained in "committed", so 1728469
I'm trying to use awk for that
cat file.txt | awk '{print $4}'
However that produces:
committed=1728469KB)
This is still incomplete and would need still some work. Is there a simpler solution to do that instead?
Thanks
Could you please try the following, using awk's match function.
awk 'match($0,/committed=[0-9]+/){print substr($0,RSTART+10,RLENGTH-10)}' Input_file
With GNU grep, using \K (available with the -P option):
grep -oP '.*committed=\K[0-9]*' Input_file
Output will be 1728469 in both above solutions.
1st solution explanation:
awk ' ##Starting awk program from here.
match($0,/committed=[0-9]+/){ ##Using match function to match from committed= till digits in current line.
print substr($0,RSTART+10,RLENGTH-10) ##Printing the sub-string that starts at RSTART+10 and has length RLENGTH-10, i.e. only the digits after committed=.
}
' Input_file ##Mentioning Input_file name here.
Sed is better at simple matching tasks:
sed -n 's/.*committed=\([0-9]*\).*/\1/p' input_file
$ awk -F'[=)]' '{print $3+0}' file
1728469
You can try this:
str="Internal (reserved=1728469KB, committed=1728469KB)"
echo "$str" | awk '{print $3}' | cut -d "=" -f2 | rev | cut -c4- | rev
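When the line is already in a shell variable, parameter expansion can do the extraction without spawning any process. A minimal sketch, assuming the value is the run of digits right after the last committed= on the line; committed_value is a name invented here:

```shell
#!/bin/sh
# Hypothetical helper: print the digits that follow "committed=".
committed_value() {
  v=${1##*committed=}     # drop everything through the last "committed="
  v=${v%%[!0-9]*}         # drop from the first non-digit to the end
  printf '%s\n' "$v"
}

committed_value "Internal (reserved=1728469KB, committed=1728469KB)"
```

For the sample line this prints 1728469. If the line has no committed= at all, the sketch prints whatever leading digits the line happens to start with (usually nothing), so check for the key first if that matters.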

Extract specific substring in shell

I have a file which contains following line:
ro fstype=sd timeout=10 console=ttymxc1,115200 show=true
I'd like to extract the fstype attribute "sd" and store it in a variable.
I did the job using bash
IFS=" " read -r -a args < file
for arg in "${args[@]}"; do
if [[ "$arg" =~ "fstype" ]]; then
id=$(cut -d "=" -f2 <<< "$arg")
echo $id
fi
done
and following awk command in another shell script:
awk -F " " '{print $2}' file | cut -d '=' -f2
Because the position of the 'fstype' argument and the file content can differ, how can I do the same thing in a way that stays compatible across shell scripts?
Could you please try the following.
awk 'match($0,/fstype=[^ ]*/){print substr($0,RSTART+7,RLENGTH-7)}' Input_file
Or, more generally, to handle any string before =, try the following:
awk '
match($0,/fstype=[^ ]*/){
val=substr($0,RSTART,RLENGTH)
sub(/.*=/,"",val)
print val
val=""
}
' Input_file
With sed:
sed 's/.*fstype=\([^ ]*\).*/\1/' Input_file
awk code's explanation:
awk ' ##Starting awk program from here.
match($0,/fstype=[^ ]*/){ ##Using match function to match regex fstype= till first space comes in current line.
val=substr($0,RSTART,RLENGTH) ##Creating variable val which has sub-string of current line from RSTART to till RLENGTH.
sub(/.*=/,"",val) ##Substituting everything till = in value of val here.
print val ##Printing val here.
val="" ##Nullifying val here.
}
' Input_file ##mentioning Input_file name here.
Any time you have tag=value pairs in your data I find it best to start by creating an array (f[] below) that maps those tags (names) to their values:
$ awk -v tag='fstype' -F'[ =]' '{for (i=2;i<NF;i+=2) f[$i]=$(i+1); print f[tag]}' file
sd
$ awk -v tag='console' -F'[ =]' '{for (i=2;i<NF;i+=2) f[$i]=$(i+1); print f[tag]}' file
ttymxc1,115200
With the above approach you can do whatever you like with the data just by referencing it by its name as the index in the array, e.g.:
$ awk -F'[ =]' '{
for (i=2;i<NF;i+=2) f[$i]=$(i+1)
if ( (f["show"] == "true") && (f["timeout"] < 20) ) {
print f["console"], f["fstype"]
}
}' file
ttymxc1,115200 sd
If your data has more than 1 row and there can be different fields on each row (doesn't appear to be true for your data) then add delete f as the first line of the script.
If the key and value can be matched by the regex fstype=[^ ]*, you can use grep with the -o option, which prints only the matched part.
$ grep -o 'fstype=[^ ]*' file
fstype=sd
In addition, \K can be used with the -P option (note that -P is only available in GNU grep).
Whatever is matched to the left of \K is not shown by -o.
Therefore, the expression below extracts only the value.
$ grep -oP 'fstype=\K[^ ]*' file
sd
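For a variant that needs neither awk nor GNU grep, POSIX parameter expansion can pick out a key=value pair regardless of its position. A sketch, assuming pairs are separated by single spaces and values contain no spaces; get_param is a hypothetical name:

```shell
#!/bin/sh
# Hypothetical helper: get_param KEY LINE prints the value of KEY=... in LINE.
get_param() {
  padded=" $2"                           # leading space so KEY must start a word
  rest=${padded#*" $1="}                 # text after the first " KEY="
  [ "$rest" != "$padded" ] || return 1   # KEY= not found
  printf '%s\n' "${rest%% *}"            # keep up to the next space
}

line='ro fstype=sd timeout=10 console=ttymxc1,115200 show=true'
id=$(get_param fstype "$line") && echo "$id"
```

With the sample line this prints sd. To read the line from a file, something like get_param fstype "$(cat file)" should work for single-line files.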

Copy numbers at the beginning of each line to the end of line

I have a file that contains lines like these. I want to edit them and put the result in passageiros.txt:
a82411:x:1015:1006:Adriana Morais,,,:/home/a82411:/bin/bash
a60395:x:1016:1006:Afonso Pichel,,,:/home/a60395:/bin/bash
a82420:x:1017:1006:Afonso Alves,,,:/home/a82420:/bin/bash
a69225:x:1018:1006:Afonso Alves,,,:/home/a69225:/bin/bash
a82824:x:1019:1006:Afonso Carreira,,,:/home/a82824:/bin/bash
a83112:x:1020:1006:Aladje Sanha,,,:/home/a83112:/bin/bash
a82652:x:1022:1006:Alexandre Ferreira,,,:/home/a82652:/bin/bash
a83063:x:1023:1006:Alexandre Feijo,,,:/home/a83063:/bin/bash
a82540:x:1024:1006:Ana Santana,,,:/home/a82540:/bin/bash
With the following code I'm able to get something like this:
cat /etc/passwd |grep "^a[0-9]" | cut -d ":" -f1,5 | sed "s/a//" | sed "s/,//g" > passageiros.txt
sed -e "s/$/:::a/" -i passageiros.txt
82411:Adriana Morais:::a
60395:Afonso Pichel:::a
82420:Afonso Alves:::a
69225:Afonso Alves:::a
82824:Afonso Carreira:::a
83112:Aladje Sanha:::a
82652:Alexandre Ferreira:::a
83063:Alexandre Feijo:::a
82540:Ana Santana:::a
So my goal is to create something like this:
82411:Adriana Morais:::a82411#
60395:Afonso Pichel:::a60395#
82420:Afonso Alves:::a82420#
69225:Afonso Alves:::a69225#
82824:Afonso Carreira:::a82824#
83112:Aladje Sanha:::a83112#
82652:Alexandre Ferreira:::a82652#
83063:Alexandre Feijo:::a83063#
82540:Ana Santana:::a82540#
How can I do this?
Could you please try the following.
awk -F'[:,]' '{val=$1;sub(/[a-z]+/,"",$1);print $1,$5,_,_,val"#"}' OFS=":" Input_file
Explanation: Adding explanation for above code too.
awk -F'[:,]' ' ##Starting awk script here and making field separator as colon and comma here.
{ ##Starting main block here for awk.
val=$1 ##Creating a variable val whose value is first field.
sub(/[a-z]+/,"",$1) ##Using sub for substituting any kind of lowercase letters a to z in first field with NULL here.
print $1,$5,_,_,val"#" ##Printing 1st, 5th field and printing 2 NULL variables and printing variable val with #.
} ##Closing block for awk here.
' OFS=":" Input_file ##Mentioning OFS value as colon here and mentioning Input_file name here.
EDIT: Adding @Aserre's solution too here.
awk -F'[:,]' '{print substr($1, 2),$5,_,_,$1"#"}' OFS=":" Input_file
You may use the following awk:
awk 'BEGIN {FS=OFS=":"} {sub(/^a/, "", $1); gsub(/,/, "", $5); print $1, $5, _, _, "a" $1 "#"}' file > passageiros.txt
Details
BEGIN {FS=OFS=":"} sets the input and output field separator to :
sub(/^a/, "", $1) removes the first a from Field 1
gsub(/,/, "", $5) removes all , from Field 5
print $1, $5, _, _, "a" $1 "#" prints only the necessary fields to the output.
You can use just one sed:
grep '^a' file | cut -d: -f1,5 | sed 's/a\([^:]*\)\(.*\)/\1\2:::a\1#/;s/,,,//'
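The same transformation can also be written as a plain shell loop, since /etc/passwd-style lines split cleanly on colons. A sketch, assuming the display name is the part of field 5 before the first comma (the answers above delete all commas instead, which gives the same result for the sample data); to_passageiros is a hypothetical name:

```shell
#!/bin/sh
# Hypothetical helper: read passwd-style lines on stdin and print
# NUMBER:NAME:::LOGIN# for logins of the form a<digit>...
to_passageiros() {
  while IFS=: read -r login pw uid gid gecos rest; do
    case $login in
      a[0-9]*) printf '%s:%s:::%s#\n' "${login#a}" "${gecos%%,*}" "$login" ;;
    esac
  done
}

printf '%s\n' 'a82411:x:1015:1006:Adriana Morais,,,:/home/a82411:/bin/bash' | to_passageiros
```

The call above prints 82411:Adriana Morais:::a82411#. For the real data, to_passageiros < /etc/passwd > passageiros.txt would be the likely invocation.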

Extract the last three columns from a text file with awk

I have a .txt file like this:
ENST00000000442 64073050 64074640 64073208 64074651 ESRRA
ENST00000000233 127228399 127228552 ARF5
ENST00000003100 91763679 91763844 CYP51A1
I want to get only the last 3 columns of each line.
As you can see, there are sometimes empty lines between two lines, and these must be ignored. Here is the output that I want to produce:
64073208 64074651 ESRRA
127228399 127228552 ARF5
91763679 91763844 CYP51A1
I tried this:
awk '/a/ {print $1- "\t" $-2 "\t" $-3}' file.txt
It does not return what I want. Do you know how to correct the command?
The following awk may help you with this.
awk 'NF{print $(NF-2),$(NF-1),$NF}' OFS="\t" Input_file
Output will be as follows.
64073208 64074651 ESRRA
127228399 127228552 ARF5
91763679 91763844 CYP51A1
EDIT: Adding an explanation of the command. (NOTE: the annotated version below is for explanation only; run the command above to get the results.)
awk 'NF ###Checking condition NF here (NF is a built-in awk variable holding the number of fields in the current line of Input_file).
###If the line is not empty, i.e. it has at least one field, then do the following.
{
print $(NF-2),$(NF-1),$NF ###Printing $(NF-2), the 3rd-last field of the current line, then $(NF-1), the 2nd-last field, and $NF, the last field.
}
' OFS="\t" Input_file ###Setting OFS (output field separator) to TAB and mentioning the Input_file here.
You can use sed too
sed -E '/^$/d;s/.*\t(([^\t]*[\t|$]){2})/\1/' infile
With some piping:
$ cat file | tr -s '\n' | rev | cut -f 1-3 | rev
64073208 64074651 ESRRA
127228399 127228552 ARF5
91763679 91763844 CYP51A1
First, cat the file to tr to squeeze out repeated \ns and get rid of empty lines. Then reverse the lines, cut the first three fields and reverse again. You could replace the useless cat with the first rev.
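If the number of trailing columns may change, the awk answer generalizes with a loop. A sketch, assuming whitespace-separated input and tab-separated output as in the awk answer; last_cols is a made-up name, and lines with fewer than N fields are printed whole:

```shell
#!/bin/sh
# Hypothetical helper: last_cols N [FILE...] prints the last N fields of
# every non-empty line, joined by tabs.
last_cols() {
  n=$1
  shift
  awk -v n="$n" '
    NF {                                      # skip empty lines
      start = (NF > n) ? NF - n + 1 : 1       # first field to keep
      out = $start
      for (i = start + 1; i <= NF; i++) out = out "\t" $i
      print out
    }
  ' "$@"
}

printf 'ENST00000000233 127228399 127228552 ARF5\n\n' | last_cols 3
```

Called as last_cols 3 file.txt, this should give the same result as the NF-based one-liner, while last_cols 2 file.txt would keep only the end coordinate and the gene name.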

Resources