AWK - Print complete input string after comparison - bash

I have a file a.text:
hello world
my world
hello universe
I want to print the complete string if the second word is "world":
[root#sc-rdops-vm18-dhcp-57-128:/var/log] cat a | awk -F " " '{if($2=="world") print $1}'
hello
my
But the output which I want is:
[root#sc-rdops-vm18-dhcp-57-128:/var/log] cat a | awk -F " " '{if($2=="world") print <Something here>}'
hello world
my world
Any pointers on how I can do this?
Thanks in advance.

awk '{if ($2=="world") {print}}' file
Output:
hello world
my world

First off, since you are writing a single if statement, you can use awk's 'pattern { action }' form, like so:
awk -F " " '$2=="world" { print <Something here> }'
To print the entire line you can use print $0
awk -F " " '$2=="world"{print $0}' file
which can be written as
awk -F " " '$2=="world"{print}' file
But {print} is the default action, so it can be omitted after the filter like this:
awk -F " " '$2=="world"' file
Or even without the -F option, since the space is the default FS value
awk '$2=="world"' file

If you want / have to use awk to solve your problem:
awk '$0~/world/' file.txt
If a line (i.e., $0) matches the regular expression world anywhere (i.e., ~ /world/), the entire line is printed.
If you only want to check the second column for world:
awk '$2 == "world"' file.txt
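The difference between the two checks shows up when "world" appears somewhere other than the second field. A minimal sketch, assuming one extra hypothetical input line ("world peace") that is not in the question's file; the variable names are made up for illustration:

```shell
# Sample input: the three lines from the question plus one hypothetical line.
input='hello world
my world
hello universe
world peace'

# Regex match against the whole line: also selects "world peace".
anywhere=$(printf '%s\n' "$input" | awk '$0 ~ /world/')

# Exact string comparison on the second field only.
second=$(printf '%s\n' "$input" | awk '$2 == "world"')

printf 'whole line:\n%s\n\nsecond field:\n%s\n' "$anywhere" "$second"
```

The first command prints three lines, the second only the two lines whose second word is exactly "world".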


Add length of following line to current line in bash

I have a small sample data set test1.faa
>PROKKA_00001_A1#hypothetical#protein
MTIALHLTAVLAFAALAGCGANDSDPGPGGVTVSEARALDQAAEMLEKRGRSPADENAEQAERLRREQAQARTPGQPPEQALQQDGASAPE
>PROKKA_00002_A1#Cystathionine#beta-lyase
MHRFGGMVTAILKGGLDDARRFLERCELFALAESLGGVESLIEHPAIMTHASVPREIREALGISDGLVRLSVGIEDADDLLAELETALA
>PROKKA_00003_A1#hypothetical#protein
MVPIVSAAPVFTLLLTVAVFRRERLTAGRIAAVAVVVPSVILIALGH
and I would like to append the length of the following line to the header line, followed by that next line, such as
>PROKKA_00001_A1#hypothetical#protein_92
MTIALHLTAVLAFAALAGCGANDSDPGPGGVTVSEARALDQAAEMLEKRGRSPADENAEQAERLRREQAQARTPGQPPEQALQQDGASAPE
I tried to do this with awk, but it returns the following error:
awk: >PROKKA_00001_A1#hypothetical#protein: No such file or directory
I assume it is related to the > at the beginning? But I need it in the output file.
This is the code I tried:
#!/bin/bash
cat test1.faa | while read line
do
headerline=$(awk '/>/{print $0}' $line)
echo -e "this is the headerline \n ${headerline}"
fastaline=$(awk '!/>/{print $0}' $line)
echo -e "this is the fastaline \n ${fastaline}"
fastaline_length=$(awk -v linelength=$fastaline '{print length(linelength)}')
echo -e "this is length of fastaline \n ${fastaline_length}"
echo "${headerline}_${fastaline_length}"
echo $fastaline
done
Any suggestions on how to do this?
Could you please try the following (assuming your actual Input_file looks the same as the sample shown).
awk '/^>/{value=$0;next} {print value"_"length($0) ORS $0;value=""}' Input_file
This awk command would do what you want:
awk '
/^>/ {
getline next_line
print $0 "_" length(next_line)
print next_line
}
' test1.faa
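The error in the question comes from awk treating its non-option arguments as file names, so `awk '...' $line` tries to open a file named after the header text. Below is a minimal sketch of the one-pass approach from the answers, run on shortened, made-up sequences. The sample data and the names `hdr` and `result` are assumptions for illustration, and it assumes each sequence occupies exactly one line.

```shell
# Hypothetical two-record FASTA sample with short sequences.
fasta='>seq1#hypothetical#protein
MTIAL
>seq2#beta-lyase
MHRFGGMV'

# Remember each header, then print it with the length of the next line appended.
result=$(printf '%s\n' "$fasta" | awk '
  /^>/ { hdr = $0; next }           # header line: store it, print nothing yet
       { print hdr "_" length($0)   # sequence line: header plus its length
         print }                    # then the sequence itself
')
printf '%s\n' "$result"
```

Feeding the whole file to a single awk process also avoids starting awk several times per input line, as the shell loop in the question does.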

Shell script to match a string and print the next string on aix machine

I have a following line as input.
Parsing events:hostname='tom';Ipaddress='10.10.10.1';situation_name='sgd_abc_app_a';type='General';
Each line contains many fields like this, separated by semicolons (and each line starts with "Parsing events:").
I want to extract only sgd_abc_app_a, the value that belongs to situation_name.
Thanks
Kulli
Try
sed -n 's/^.*situation_name=//p' input_file| awk -F "'" '{print $2}'
For your request, it would work no matter the position of situation_name
$ awk '/situation_name/{match($0,/situation_name=[^;]+/); print substr($0,RSTART+16,RLENGTH-17)}' file
sgd_abc_app_a
awk solution:
s="Parsing events: hostname='tom';Ipaddress='10.10.10.1';situation_name='sgd_abc_app_a';type='General';"
awk -F'[=;]' '{ gsub("\047","",$6); print $6 }' <<< "$s"
Or with sed:
sed -n "s/^Parsing events:.*situation_name='\([^']*\).*/\1/p" <<< "$s"
The output:
sgd_abc_app_a
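If the position of situation_name can vary, another option is to split on ";" and scan the fields for the key. This is a sketch under the assumption that every field has the form key='value'. The variable names `line`, `value` and `v` are made up for illustration, and it avoids here-strings so it does not depend on bash.

```shell
line="Parsing events:hostname='tom';Ipaddress='10.10.10.1';situation_name='sgd_abc_app_a';type='General';"

# Split on ";" and look for the field whose key is situation_name.
value=$(printf '%s\n' "$line" | awk -F';' '
  {
    for (i = 1; i <= NF; i++) {
      if ($i ~ /(^|:)situation_name=/) {
        v = $i
        sub(/^.*situation_name=/, "", v)   # drop everything up to the "="
        gsub("\047", "", v)                # drop the single quotes (octal 047)
        print v
      }
    }
  }
')
printf '%s\n' "$value"
```

The `(^|:)` alternative in the test covers the case where situation_name is the first field, directly after the "Parsing events:" prefix.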

Can I have multiple awk actions without inserting newlines?

I'm a newbie with very small and specific needs. I'm using awk to parse something and I need to generate uninterrupted lines of text assembled from several pieces in the original text. But awk inserts a newline in the output whenever I use a semicolon.
Simplest example of what I mean:
Original text:
1 2
awk command:
{ print $1; print $2 }
The output will be:
1
2
The thing is that I need the output to be a single line, and I also need to use the semicolons, because I have to do multiple actions on the original text, not all of them print.
Also, using ORS=" " causes a whole lot of different problems, so it's not an option.
Is there any other way that I can have multiple actions in the same line without newline insertion?
Thanks!
The newlines in the output have nothing to do with the semicolons that separate statements in your script. They appear because print outputs the arguments you give it followed by the contents of ORS, and the default value of ORS is a newline.
You may want some version of either of these:
$ echo '1 2' | awk '{printf "%s ", $1; printf "%s ", $2; print ""}'
1 2
$
$ echo '1 2' | awk -v ORS=' ' '{print $1; print $2; print "\n"}'
1 2
$
$ echo '1 2' | awk -v ORS= '{print $1; print " "; print $2; print "\n"}'
1 2
$
but it's hard to say without knowing more about what you're trying to do.
At least scan through the book Effective Awk Programming, 4th Edition, by Arnold Robbins to get some understanding of the basics before trying to program in awk or you're going to waste a lot of your time and learn a lot of bad habits first.
You have better control of the output if you use printf, e.g.
awk '{ printf "%s %s\n",$1,$2 }'
awk '{print $1 $2}'
is a solution in this case. Note that it prints the two fields concatenated, with no separator between them.
TL;DR
You're getting newlines because print sends ORS (the output record separator, a newline by default) to standard output after each print statement. You can format the output in a variety of other ways, but the key is generally to invoke only a single print or printf statement regardless of how many fields or values you want to print.
Use Commas
One way to do this is to use a single call to print using commas to separate arguments. This will insert OFS between the printed arguments. For example:
$ echo '1 2' | awk '{print $1, $2}'
1 2
Don't Separate Arguments
If you don't want any separation in your output, pass all the values to a single print statement without commas, so that awk concatenates them. For example:
$ echo '1 2' | awk '{print $1 $2}'
12
Formatted Strings
If you want more control than that, use formatted strings using printf. For example:
$ echo '1 2' | awk '{printf "%s...%s\n", $1, $2}'
1...2
$ echo "1 2" | awk '{print $1 " " $2}'
1 2
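Since the question mentions several actions per record, not all of which print, one option is to accumulate the pieces in a variable and print once at the end. A minimal sketch; the variable names `out`, `n` and `result` and the multiply-by-ten step are made up for illustration:

```shell
result=$(echo '1 2' | awk '{
  out = $1              # first piece, nothing is printed yet
  n = $2 * 10           # an action that does not print
  out = out " " $2      # append the second piece
  out = out " " n       # append a computed piece
  print out             # single print, so a single trailing newline (ORS)
}')
printf '%s\n' "$result"
```

This prints `1 2 20` on one line. The semicolons or newlines between statements do not affect the output; only print and printf write anything.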

Gawk Line removal, Splitter is :

Is it possible to move certain columns from one .txt file into another .txt file?
I have a .txt that contains:
USERID:ORDER#:IP:PHONE:ADDRESS:POSTCODE
USERID:ORDER#:IP:PHONE:ADDRESS:POSTCODE
With gawk I want to extract ADDRESS & POSTCODE columns into another .txt, so for this given file the output should be:
ADDRESS1:POSTCODE1
ADDRESS2:POSTCODE2
etc.
This is a classic AWK transform. You want to use "-F :" to specify that the input is delimited by ":" and print a new ":" on output:
awk -F: '{ print $5 ":" $6 }' <input.txt >output.txt
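An equivalent way, shown here as a sketch, is to set the output field separator OFS to ":" so that the comma in print inserts it. The sample lines are taken from the input shown in another answer; the variable name `result` is made up for illustration.

```shell
result=$(printf '%s\n' \
  'USERID:ORDER#:IP:PHONE:ADDRESS1:POSTCODE1' \
  'USERID:ORDER#:IP:PHONE:ADDRESS2:POSTCODE2' |
  awk -F: -v OFS=: '{ print $5, $6 }')   # OFS is inserted where the comma is
printf '%s\n' "$result"
```

Setting OFS once keeps the separator in one place if more columns are added later.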
Try this:
awk -F: '{printf "%s:%s ",$5,$6}' ex.txt
input is
USERID:ORDER#:IP:PHONE:ADDRESS1:POSTCODE1
USERID:ORDER#:IP:PHONE:ADDRESS2:POSTCODE2
output is (on one line if I understand correctly)
ADDRESS1:POSTCODE1 ADDRESS2:POSTCODE2
The only flaw is that the output ends with a trailing space and does not end with a newline.
This can be fixed with the slightly more complex (but still readable):
awk -F: 'BEGIN {z=0;} {if (z==1) { printf " "; } ; z=1; printf "%s:%s",$5,$6} END{printf"\n"}' ex.txt
awk -F: 'NR==1 {print $5"1:"$6"1"};NR==2 {print $5"2:"$6"2"}' file
ADDRESS1:POSTCODE1
ADDRESS2:POSTCODE2

Passing variables into awk from bash

I am writing a shell script in which I have to print certain columns of a file, so I am trying to use awk. The column numbers are calculated in the script. nprop is the variable of a for loop that runs from 1 to 8.
avg=1+3*$nprop
awk -v a=$avg '{print $a " " $a+1 " " $a+2}' $filename5 >> neig5.dat
I have tried the following also:
awk -v a=$avg '{print $a " " $(a+1) " " $(a+2) }' $filename5 >> neig5.dat
This results in printing the first three columns all the time.
avg=1+3*$nprop
This will set $avg to the literal string 1+3*4 if $nprop is 4, for instance. You should evaluate that expression instead:
avg=$(( 1+3*$nprop ))
And use the version of the awk script with parentheses.
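Both points can be checked in isolation. The sketch below assumes nprop is 4 and uses a made-up five-field input line; the variable names `literal`, `first` and `fields` are hypothetical. awk converts the string 1+3*4 to the number 1 (it uses only the leading numeric part), which is why the first three columns were printed every time.

```shell
nprop=4

literal=1+3*$nprop            # plain assignment: the string "1+3*4"
avg=$(( 1 + 3 * nprop ))      # arithmetic expansion: the number 13

# What awk makes of the unevaluated string when it is used as a number.
first=$(awk -v a="$literal" 'BEGIN { print a + 0 }')

# In awk, $a+1 parses as ($a)+1, while $(a+1) selects the next field.
fields=$(echo '10 20 30 40 50' | awk -v a=4 '{ print $a+1, $(a+1) }')

printf '%s\n%s\n%s\n%s\n' "$literal" "$avg" "$first" "$fields"
```

The last line of output is `41 50`: the first value is field 4 plus one, the second is field 5.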
This single awk script is a translation of what you want:
awk '{for(i=4;i<=25;i+=3)printf "%s %s %s ",$i,$(i+1),$(i+2);print ""}'
You don't need to parse your file 8 times in a shell loop; just parse it once with awk.
Use a BEGIN{ } block to create a couple of awk variables:
avg=$((1+3*$nprop))
awk -v a=$avg 'BEGIN{ap1=a+1;ap2=a+2} {print $a " " $ap1 " " $ap2}' $filename5 >> neig5.dat
awk -v n="$nprop" 'BEGIN{x=3*n} {a=x; print $++a, $++a, $++a}' file
If you just want your seed value (nprop) to increment on every pass of the file and process the file 8 times, get rid of your external loop and just do this:
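awk 'BEGIN{for (i=2;i<=8;i++) ARGV[ARGC++] = ARGV[1]} FNR==1{pass++} {a=3*pass; print $++a, $++a, $++a}' file
FNR==1 is true on the first line of each pass over the file, so pass counts the passes from 1 to 8. In GNU awk you can use ARGIND instead of pass.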
