Grep only 2 portions in a line - bash

I have the following line. I can grep one part but am struggling to also grep the second portion.
Line:
html:<TR><TD>PICK_1</TD><TD>36.0000</TD><TD>1000000</TD><TD>26965</TD><TD>100000000</TD><TD>97074000</TD><TD>2926000</TD><TD>2.926%</TD><TD>97.074%</TD></TR>
I want to have the following results after grepping this line.
PICK_1 97.074%
Currently I am just grepping the first portion with the following command:
grep -Po "<TR><TD>[A-Z0-9_]+" test.txt
Appreciate any help on how I can go about doing this. Thanks.

Use awk with a custom field separator:
awk -F'[<>TDR/]+' '{ print $2, $(NF-1) }' file
This splits the line on things that look like one or more opening or closing <TD> or <TR> tags, and prints the second and second-last field.
Warning: this will break on almost every input except the one that you've shown, since awk, grep and friends are designed for processing text, not HTML.
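One way to make the field-separator approach a little less fragile is to split on whole tags instead of on the individual characters <, >, T, D, R and /, so that a cell such as TRADE_1 is not cut apart. This is only a sketch: it assumes one table row per line and no markup nested inside the cells, and first_and_last_cell is a hypothetical helper name that does not come from the question.

```shell
# Hypothetical helper: print the first and last <TD> cell of each row read from stdin.
# Assumes one <TR>...</TR> row per line and no markup nested inside the cells.
first_and_last_cell() {
    # FS is a regex matching a run of whole <TR>, </TR>, <TD> or </TD> tags.
    # $1 is whatever precedes the row ("html:" here) and $NF is the empty
    # string after the closing tags, so the cells are $2 .. $(NF-1).
    awk -F'(</?T[DR]>)+' 'NF > 2 { print $2, $(NF-1) }'
}

first_and_last_cell <<'EOF'
html:<TR><TD>PICK_1</TD><TD>36.0000</TD><TD>2.926%</TD><TD>97.074%</TD></TR>
EOF
```

For the shortened sample row above this should print PICK_1 97.074%. It is still text processing, so the warning about real-world HTML applies here as well.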

If you always have the same number of fields delimited by "TD" tags, you can try with this (dirty) awk:
awk -F'[<TD>|</TD>]' '{print $8 " " $80}'
Or this combination of column and awk:
column -t -s "</TD>" | awk -F' ' '{print $3 " " $11}'
Or with sed instead of column:
sed -e 's/<TD>/ /g' | awk -F' ' '{print $3 " " $11}'

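Try providing each pattern after its own -e option (add -E so that + means "one or more"). Without -o, grep prints the whole matching line rather than only the matched portions:
grep -E -e PICK_1 -e "<TR><TD>[A-Z0-9_]+" test.txt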

awk -F'[<>]' '{print $5,$(NF-4)}' file
PICK_1 97.074%
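Since the question asks for both portions from one command, a single sed substitution with two capture groups may also be worth trying. The sketch below assumes every row has at least two cells and ends in </TD></TR>; first_and_last_td is a hypothetical helper name.

```shell
# Hypothetical helper: capture the first and the last <TD> cell of a row with sed.
# Assumes at least two cells per row and that the row ends in </TD></TR>.
first_and_last_td() {
    # \1 = text of the first cell, \2 = text of the last cell.
    # The greedy .* in the middle skips every cell in between.
    sed -n 's/.*<TR><TD>\([^<]*\)<\/TD>.*<TD>\([^<]*\)<\/TD><\/TR>.*/\1 \2/p'
}

printf '%s\n' 'html:<TR><TD>PICK_1</TD><TD>36.0000</TD><TD>97.074%</TD></TR>' |
    first_and_last_td
```

With the -n option and the p flag, lines that do not look like a table row produce no output at all.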

Related

Extract a property value from a text file

I have a log file which contains lines like the following one:
Internal (reserved=1728469KB, committed=1728469KB)
I'd need to extract the value contained in "committed", so 1728469
I'm trying to use awk for that
cat file.txt | awk '{print $4}'
However that produces:
committed=1728469KB)
This is still incomplete and would need some more work. Is there a simpler way to do that?
Thanks
Could you please try the following, using awk's match function:
awk 'match($0,/committed=[0-9]+/){print substr($0,RSTART+10,RLENGTH-10)}' Input_file
With GNU grep, using its \K option:
grep -oP '.*committed=\K[0-9]*' Input_file
The output is 1728469 for both solutions above.
1st solution explanation:
awk ' ##Starting awk program from here.
match($0,/committed=[0-9]+/){ ##Using match function to match from committed= till digits in current line.
print substr($0,RSTART+10,RLENGTH-10) ##Printing the sub string that starts at RSTART+10 and is RLENGTH-10 characters long, i.e. the digits after committed=.
}
' Input_file ##Mentioning Input_file name here.
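The constant 10 above is simply the length of "committed=". A sketch that computes that offset from the key, so the same idea also works for reserved, could look like this. value_of is a hypothetical helper name, and the key is assumed to contain no regex metacharacters.

```shell
# Hypothetical helper: print the digits that follow "<key>=" on each matching line.
# Assumes the key contains no regex metacharacters.
value_of() {
    awk -v key="$1" '
        match($0, key "=[0-9]+") {
            skip = length(key) + 1      # number of characters in "<key>="
            print substr($0, RSTART + skip, RLENGTH - skip)
        }'
}

echo 'Internal (reserved=1728469KB, committed=1728469KB)' | value_of committed
```

For the sample line this should print 1728469.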
Sed is better at simple matching tasks:
sed -n 's/.*committed=\([0-9]*\).*/\1/p' input_file
$ awk -F'[=)]' '{print $3}' file
1728469KB
You can try this:
str="Internal (reserved=1728469KB, committed=1728469KB)"
echo "$str" | awk '{print $3}' | cut -d "=" -f2 | rev | cut -c4- | rev
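If the line is already in a shell variable, plain POSIX parameter expansion can do the same job without starting any external process. This is a sketch under the assumption that the value is always followed by KB; committed_value is a hypothetical helper name.

```shell
# Hypothetical helper: extract the number after "committed=" using only the shell.
# Assumes the value is always followed by the unit "KB".
committed_value() {
    value=${1##*committed=}     # strip everything up to and including "committed="
    value=${value%%KB*}         # strip "KB" and everything after it
    printf '%s\n' "$value"
}

committed_value "Internal (reserved=1728469KB, committed=1728469KB)"
```

For the sample string this should print 1728469.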

Shell script to match a string and print the next string on aix machine

I have the following line as input.
Parsing events:hostname='tom';Ipaddress='10.10.10.1';situation_name='sgd_abc_app_a';type='General';
There are many such fields in a line, separated by semicolons (the line always starts with Parsing events:).
I want to extract only sgd_abc_app_a, i.e. the value that belongs to situation_name.
Thanks
Kulli
Try
sed -n 's/^.*situation_name=//p' input_file| awk -F "'" '{print $2}'
For your request, it would work no matter the position of situation_name
$ awk '/situation_name/{match($0,/situation_name=[^;]+/); print substr($0,RSTART+16,RLENGTH-17)}' file
sgd_abc_app_a
awk solution:
s="Parsing events: hostname='tom';Ipaddress='10.10.10.1';situation_name='sgd_abc_app_a';type='General';"
awk -F'[=;]' '{ gsub("\047","",$6); print $6 }' <<< "$s"
Or with sed:
sed -n "s/^Parsing events:.*situation_name='\([^']*\).*/\1/p" <<< "$s"
The output:
sgd_abc_app_a
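The fixed field number $6 only works while situation_name stays in the third position. A sketch that looks the key up by name instead, using only split, index and substr, is shown below. field_value is a hypothetical helper name, and the sketch assumes an awk that accepts -v and values that contain no semicolons.

```shell
# Hypothetical helper: print the value of one key from a line of key='value'; pairs.
# Assumes values contain no ";" and that awk supports -v.
field_value() {
    awk -v key="$1" '{
        sub(/^Parsing events: */, "")            # drop the fixed line prefix
        n = split($0, pair, ";")                 # one key=value pair per element
        for (i = 1; i <= n; i++) {
            eq = index(pair[i], "=")
            if (eq > 0 && substr(pair[i], 1, eq - 1) == key) {
                value = substr(pair[i], eq + 1)
                gsub("\047", "", value)          # remove the single quotes
                print value
            }
        }
    }'
}

printf '%s\n' "Parsing events:hostname='tom';Ipaddress='10.10.10.1';situation_name='sgd_abc_app_a';type='General';" |
    field_value situation_name
```

For the sample line this should print sgd_abc_app_a, and the same function can be called with hostname or type.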

How to print the csv file excluding first column till end using awk

I have a csv file with dynamic columns.
I've tried awk -F , 'NF>1' resul1.txt, but it still prints all columns.
Since the number of columns varies, it is quite difficult to print from the second column to the end by listing the fields in a print statement.
Try this awk command:
awk -F, '{$1=""}1' input.txt | awk -vOFS=, '{$1=$1}1' > output.txt
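The first awk empties the 1st field and prints the rebuilt line, which is now space-separated.
The second awk re-splits that line on whitespace and joins the remaining fields with commas. Note that this breaks if any field is empty or contains spaces.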
Try the substr function:
substr(string, start [, length ])
Return a length-character-long substring of string, starting at character number start. The first character of a string is character number one. For example, substr("washington", 5, 3) returns "ing".
awk -F, '{print substr($0,length($1)+1+length(FS))}' file
You can use cut:
cut -d',' -f2- yourfile.csv > output.csv
Explanation:
-d - setting delimiter to ,
-f - fields to print
2- - from field 2 to the end of the line
With awk:
awk -F, '{sub(/^[^,]*,/,"",$0);}1' OFS=, yourfile.csv > output.csv
With sed:
sed -i.bak 's/^[^,]*,//' yourfile.csv
-i.bak - in-place edit, keeping a backup of the original with the .bak suffix
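For completeness, the field-by-field version in awk is a short loop from field 2 to NF. The sketch below assumes plain comma-separated data with no quoted fields that contain commas; drop_first_column is a hypothetical helper name.

```shell
# Hypothetical helper: print every comma-separated column except the first.
# Assumes simple CSV with no quoted fields that contain commas.
drop_first_column() {
    awk -F, '{
        out = ""
        for (i = 2; i <= NF; i++)
            out = out (i > 2 ? "," : "") $i     # re-join with commas
        print out
    }'
}

printf '%s\n' 'id,name,city' '1,Ann,Oslo' | drop_first_column
```

For the two sample lines this should print name,city and Ann,Oslo. Empty fields and fields that contain spaces are passed through unchanged.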

awk - insert quotation marks around the first column

I am writing an awk statement and I want to put quotation marks around the first column of text (one at the beginning and one at the end of the first column).
Example
before
https://otrs.com/ID=24670 2014060910001178
after
"https://otrs.com/ID=24670" 2014060910001178
so far I have
awk '{ print $2"\""$2"\""$0 }'F1 request.txt > request1.txt
but that prints a repeat of the second value and I just want the quotes to go around the first column.
Thanks for your help
Through sed,
$ echo 'https://otrs.com/ID=24670 2014060910001178' | sed 's/^\([^ ]*\)\(.*\)$/"\1"\2/g'
"https://otrs.com/ID=24670" 2014060910001178
Through awk,
$ echo 'https://otrs.com/ID=24670 2014060910001178' | awk '{gsub(/^/,"\"",$1);gsub(/$/,"\"",$1);}1'
"https://otrs.com/ID=24670" 2014060910001178
Another for awk:
awk '{ $1 = "\"" $1 "\"" }1' request.txt > request1.txt
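One thing to keep in mind with the awk versions is that assigning to $1 makes awk rebuild the line with single spaces between the fields. If the original spacing matters, a substitution on the whole line avoids that. The sketch below assumes the line does not start with whitespace; quote_first_column is a hypothetical helper name.

```shell
# Hypothetical helper: wrap the first column in double quotes, keeping the
# original spacing of the rest of the line.
# Lines that start with whitespace are left unchanged.
quote_first_column() {
    # "&" in the replacement stands for the matched text, i.e. the first
    # run of non-blank characters at the start of the line.
    awk '{ sub(/^[^ \t]+/, "\"&\"") } 1'
}

printf '%s\n' 'https://otrs.com/ID=24670 2014060910001178' | quote_first_column
```

For the sample line this should print "https://otrs.com/ID=24670" 2014060910001178.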

awk - split only by first occurrence

I have a line like:
one:two:three:four:five:six seven:eight
and I want to use awk to get $1 to be one and $2 to be two:three:four:five:six seven:eight
I know I can get it by doing sed before. That is to change the first occurrence of : with sed then awk it using the new delimiter.
However replacing the delimiter with a new one would not help me since I can not guarantee that the new delimiter will not already be somewhere in the text.
I want to know if there is an option to get awk to behave this way
So something like:
awk -F: '{print $1,$2}'
will print:
one two:three:four:five:six seven:eight
I will also want to do some manipulations on $1 and $2 so I don't want just to substitute the first occurrence of :.
Without any substitutions
echo "one:two:three:four:five" | awk -F: '{ st = index($0,":");print $1 " " substr($0,st+1)}'
The index function finds the first occurrence of ":" in the whole string, so in this case the variable st is set to 4. I then use the substr function to grab the rest of the string starting from position st+1; if no length is supplied, it goes to the end of the string. The output is:
one two:three:four:five
If you want to do further processing you could always set the string to a variable for further processing.
rem = substr($0,st+1)
Note this was tested on Solaris AWK but I can't see any reason why this shouldn't work on other flavours.
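Since the question mentions further manipulation of both parts, here is a sketch that keeps them in two variables instead of printing them straight away. The toupper call is only a stand-in for whatever processing is needed, and split_first_colon is a hypothetical helper name.

```shell
# Hypothetical helper: split each line at the first ":" into head and rest,
# then process the two parts separately.
split_first_colon() {
    awk '{
        pos = index($0, ":")
        if (pos == 0) { print; next }       # no colon: pass the line through
        head = substr($0, 1, pos - 1)
        rest = substr($0, pos + 1)
        print toupper(head), rest           # toupper is just a sample manipulation
    }'
}

echo "one:two:three:four:five:six seven:eight" | split_first_colon
```

For the sample line this should print ONE two:three:four:five:six seven:eight.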
Something like this?
echo "one:two:three:four:five:six" | awk '{sub(/:/," ")}1'
one two:three:four:five:six
This replaces the first : with a space.
You can then later get the two parts into $1 and $2:
echo "one:two:three:four:five:six" | awk '{sub(/:/," ")}1' | awk '{print $1,$2}'
one two:three:four:five:six
Or do it in the same awk; after the substitution awk re-splits the line, so $1 and $2 come out the way you want:
echo "one:two:three:four:five:six" | awk '{sub(/:/," ");$1=$1;print $1,$2}'
one two:three:four:five:six
EDIT:
Using a different separator, you can get the first part as field $1 and the rest in $2 like this:
echo "one:two:three:four:five:six seven:eight" | awk -F\| '{sub(/:/,"|");$1=$1;print "$1="$1 "\n$2="$2}'
$1=one
$2=two:three:four:five:six seven:eight
Unique separator
echo "one:two:three:four:five:six seven:eight" | awk -F"#;#." '{sub(/:/,"#;#.");$1=$1;print "$1="$1 "\n$2="$2}'
$1=one
$2=two:three:four:five:six seven:eight
The closest you can get is with GNU awk's FPAT:
$ awk '{print $1}' FPAT='(^[^:]+)|(:.*)' file
one
$ awk '{print $2}' FPAT='(^[^:]+)|(:.*)' file
:two:three:four:five:six seven:eight
$2 will include the leading delimiter, but you can use substr to fix that:
$ awk '{print substr($2,2)}' FPAT='(^[^:]+)|(:.*)' file
two:three:four:five:six seven:eight
So putting it all together:
$ awk '{print $1, substr($2,2)}' FPAT='(^[^:]+)|(:.*)' file
one two:three:four:five:six seven:eight
Storing the results of the substr back in $2 will allow further processing on $2 without the leading delimiter:
$ awk '{$2=substr($2,2); print $1,$2}' FPAT='(^[^:]+)|(:.*)' file
one two:three:four:five:six seven:eight
A solution that should work with mawk 1.3.3:
awk '{n=index($0,":");s=$0;$1=substr(s,1,n-1);$2=substr(s,n+1);print $1}' FS='\0'
one
awk '{n=index($0,":");s=$0;$1=substr(s,1,n-1);$2=substr(s,n+1);print $2}' FS='\0'
two:three:four five:six:seven
awk '{n=index($0,":");s=$0;$1=substr(s,1,n-1);$2=substr(s,n+1);print $1,$2}' FS='\0'
one two:three:four five:six:seven
Just throwing this on here as a solution I came up with where I wanted to split the first two columns on : but keep the rest of the line intact.
Comments inline.
echo "a:b:c:d::e" | \
awk '{
split($0,f,":"); # split $0 into array of fields `f`
sub(/^([^:]+:){2}/,"",$0); # remove first two "fields" from `$0`
print f[1],f[2],$0 # print first two elements of `f` and edited `$0`
}'
Returns:
a b c:d::e
In my input I didn't have to worry about the first two fields containing escaped : characters. If that were a requirement, this solution wouldn't work as expected.
Amended to match the original requirements:
echo "a:b:c:d::e" | \
awk '{
split($0,f,":");
sub(/^([^:]+:)/,"",$0);
print f[1],$0
}'
Returns:
a b:c:d::e
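The two variants above differ only in how many leading fields are split off, so they can be folded into one function that takes the count as an argument. The sketch below uses index and substr instead of a regex, which also copes with empty leading fields. split_first_n is a hypothetical helper name, and the delimiter is hard-coded as ":".

```shell
# Hypothetical helper: split off the first N colon-separated fields and keep
# the rest of the line intact. Usage: split_first_n N < input
split_first_n() {
    awk -v n="$1" '{
        rest = $0
        out = ""
        for (i = 1; i <= n; i++) {
            pos = index(rest, ":")
            if (pos == 0) break                     # fewer than n delimiters
            out = out substr(rest, 1, pos - 1) " "  # next leading field
            rest = substr(rest, pos + 1)            # everything after that ":"
        }
        print out rest
    }'
}

echo "a:b:c:d::e" | split_first_n 2
echo "a:b:c:d::e" | split_first_n 1
```

For the sample input the two calls should print a b c:d::e and a b:c:d::e, matching the results shown above.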
