Passing variables into awk from bash - bash

I am writing a shell script in which I have to print certain columns of a file, so I am trying to use awk. The column numbers are calculated in the script; nprop is the variable of a for loop and runs from 1 to 8.
avg=1+3*$nprop
awk -v a=$avg '{print $a " " $a+1 " " $a+2}' $filename5 >> neig5.dat
I have tried the following also:
awk -v a=$avg '{print $a " " $(a+1) " " $(a+2) }' $filename5 >> neig5.dat
This results in printing the first three columns all the time.

avg=1+3*$nprop
This sets avg to the literal string 1+3*4 if nprop is 4, for instance. awk then converts that string to the number 1, which is why you always get columns 1 to 3. You should evaluate the expression instead:
avg=$(( 1+3*$nprop ))
And use the version of the awk script with parentheses.
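A minimal sketch of the corrected approach, run for just two values of nprop. The 27-column sample file and the variable names tmp, line and result are made up for illustration:

```shell
# Build a one-line sample file with 27 columns: c1 c2 ... c27 (illustrative data).
tmp=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 27; i++) printf "c%d%s", i, (i < 27 ? " " : "\n") }' > "$tmp"

result=""
for nprop in 1 2; do
    # $(( ... )) evaluates the arithmetic; a plain assignment would store the text "1+3*1".
    avg=$(( 1 + 3 * nprop ))
    # $(a+1) means "field number a+1"; $a+1 would mean "the value of field a, plus 1".
    line=$(awk -v a="$avg" '{ print $a, $(a+1), $(a+2) }' "$tmp")
    result="$result$line;"
done
rm -f "$tmp"
echo "$result"
```

With nprop set to 1 and 2 this picks columns 4 to 6 and then 7 to 9.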

This single awk script is a translation of what you want:
awk '{for(i=4;i<=25;i+=3)printf "%s %s %s ",$i,$(i+1),$(i+2);print ""}'
You don't need to parse your file 8 times in a shell loop; just parse it once with awk.

Use a BEGIN{ } block to create a couple of awk variables:
avg=$((1+3*$nprop))
awk -v a="$avg" 'BEGIN{ap1=a+1;ap2=a+2} {print $a " " $ap1 " " $ap2}' "$filename5" >> neig5.dat

awk -v n="$nprop" 'BEGIN{x=3*n} {a=x; print $++a, $++a, $++a}' file
If you just want your seed value (nprop) to increment on every pass of the file and process the file 8 times, get rid of your external loop and just do this:
awk 'BEGIN{for (i=2;i<=8;i++) ARGV[ARGC++] = ARGV[1]} FNR==1{n++} {a=3*n; print $++a, $++a, $++a}' file
In GNU awk you can use ARGIND in place of the counter n.
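A sketch of the multi-pass idea on a small made-up file, reading it twice instead of eight times. It tracks the pass number with a counter n that goes up whenever FNR resets to 1, and it appends to the argument list with ARGV[ARGC++]; both are choices of this sketch, as are the sample data and the names tmp and out:

```shell
# Sample file: two lines, ten columns each (illustrative data).
tmp=$(mktemp)
printf '%s\n' 'a1 a2 a3 a4 a5 a6 a7 a8 a9 a10' 'b1 b2 b3 b4 b5 b6 b7 b8 b9 b10' > "$tmp"

# Queue the same file once more so awk reads it twice in total.
# n counts passes: it is incremented each time a pass starts (FNR == 1).
out=$(awk 'BEGIN { ARGV[ARGC++] = ARGV[1] }
           FNR == 1 { n++ }
           { a = 3 * n; print $(a+1), $(a+2), $(a+3) }' "$tmp")
rm -f "$tmp"
printf '%s\n' "$out"
```

The first pass prints columns 4 to 6 of every line, the second pass columns 7 to 9.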

Related

Can I have multiple awk actions without inserting newlines?

I'm a newbie with very small and specific needs. I'm using awk to parse something and I need to generate uninterrupted lines of text assembled from several pieces in the original text. But awk inserts a newline in the output whenever I use a semicolon.
Simplest example of what I mean:
Original text:
1 2
awk command:
{ print $1; print $2 }
The output will be:
1
2
The thing is that I need the output to be a single line, and I also need to use the semicolons, because I have to do multiple actions on the original text, not all of them print.
Also, using ORS=" " causes a whole lot of different problems, so it's not an option.
Is there any other way that I can have multiple actions in the same line without newline insertion?
Thanks!
The newlines in the output have nothing to do with using semicolons to separate statements in your script. They appear because print outputs the arguments you give it followed by the contents of ORS, and the default value of ORS is a newline.
You may want some version of either of these:
$ echo '1 2' | awk '{printf "%s ", $1; printf "%s ", $2; print ""}'
1 2
$
$ echo '1 2' | awk -v ORS=' ' '{print $1; print $2; print "\n"}'
1 2
$
$ echo '1 2' | awk -v ORS= '{print $1; print " "; print $2; print "\n"}'
1 2
$
but it's hard to say without knowing more about what you're trying to do.
At least scan through the book Effective Awk Programming, 4th Edition, by Arnold Robbins to get some understanding of the basics before trying to program in awk or you're going to waste a lot of your time and learn a lot of bad habits first.
You have better control of the output if you use printf, e.g.
awk '{ printf "%s %s\n",$1,$2 }'
awk '{print $1 $2}'
is the solution in this case (it prints the two fields with nothing between them).
TL;DR
You're getting newlines because print sends ORS (a newline by default) to standard output after each print statement. You can format the output in a variety of other ways, but the key is generally to invoke only a single print or printf statement regardless of how many fields or values you want to print.
Use Commas
One way to do this is to use a single call to print using commas to separate arguments. This will insert OFS between the printed arguments. For example:
$ echo '1 2' | awk '{print $1, $2}'
1 2
Don't Separate Arguments
If you don't want any separation in your output, just pass all the arguments to a single print statement. For example:
$ echo '1 2' | awk '{print $1 $2}'
12
Formatted Strings
If you want more control than that, use formatted strings using printf. For example:
$ echo '1 2' | awk '{printf "%s...%s\n", $1, $2}'
1...2
$ echo "1 2" | awk '{print $1 " " $2}'
1 2

AWK - Print complete input string after comparison

I have a file a.text:
hello world
my world
hello universe
I want to print the complete string if the second word is "world":
[root#sc-rdops-vm18-dhcp-57-128:/var/log] cat a | awk -F " " '{if($2=="world") print $1}'
hello
my
But the output which I want is:
[root#sc-rdops-vm18-dhcp-57-128:/var/log] cat a | awk -F " " '{if($2=="world") print <Something here>}'
hello world
my world
Any pointers on how I can do this?
Thanks in advance.
awk '{if ($2=="world") {print}}' file
Output:
hello world
my world
First off, since you are writing a single if statement, you can use the awk 'filter{commands;}' pattern, like so
awk -F " " '$2=="world" { print <Something here> }'
To print the entire line you can use print $0
awk -F " " '$2=="world"{print $0}' file
which can be written as
awk -F " " '$2=="world"{print}' file
But {print} is the default action, so it can be omitted after the filter like this:
awk -F " " '$2=="world"' file
Or even without the -F option, since the space is the default FS value
awk '$2=="world"' file
If you want / have to use awk to solve your problem:
awk '$0~/world/' file.txt
If a line (i.e., $0) matches the string "world" (i.e., ~/world/) the entire line is printed
If you only want to check the second column for world:
awk '$2 == "world"' file.txt

Modify content inside quotation marks, BASH

Good day to all,
I was wondering how to modify the content inside quotation marks while leaving everything outside them unmodified.
Input line:
,,,"Investigacion,,, desarrollo",,,
Output line:
,,,"Investigacion, desarrollo",,,
Initial try:
sed 's/\"",,,""*/,/g'
But nothing happens, thanks in advance for any clue
The idiomatic awk way to do this is simply:
$ awk 'BEGIN{FS=OFS="\""} {sub(/,+/,",",$2)} 1' file
,,,"Investigacion, desarrollo",,,
or if you can have more than one set of quoted strings on each line:
$ cat file
,,,"Investigacion,,, desarrollo",,,"foo,,,,bar",,,
$ awk 'BEGIN{FS=OFS="\""} {for (i=2;i<=NF;i+=2) sub(/,+/,",",$i)} 1' file
,,,"Investigacion, desarrollo",,,"foo,bar",,,
This approach works because everything up to the first " is field 1, everything from there to the second " is field 2, and so on, so the text between each pair of quotes lands in the even-numbered fields. It can only fail if you have newlines or escaped double quotes inside your fields, but that would affect every other possible solution too, so you'd need to add cases like that to your sample input if you want a solution that handles them.
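To see the field numbering for yourself, this small sketch prints each field next to its index when the separator is a double quote. The variable names line and fields are only for illustration:

```shell
line=',,,"Investigacion,,, desarrollo",,,"foo,,,,bar",,,'

# Split on the double quote and print every field with its index.
# The quoted text ends up in fields 2 and 4; the odd fields hold what is outside the quotes.
fields=$(printf '%s\n' "$line" |
         awk -F'"' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }')
printf '%s\n' "$fields"
```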
Using a language that has built-in CSV parsing capabilities like perl will help.
perl -MText::ParseWords -ne '
print join ",", map { $_ =~ s/,,,/,/; $_ } parse_line(",", 1, $_)
' file
,,,"Investigacion, desarrollo",,,
Text::ParseWords is a core module so you don't need to download it from CPAN. Using the parse_line method we set the delimiter and a flag to keep the quotes. Then just do simple substitution and join the line to make your CSV again.
Using egrep, sed and tr:
s=',,,"Investigacion,,, desarrollo",,,'
r=$(egrep -o '"[^"]*"|,' <<< "$s"|sed '/^"/s/,\{2,\}/,/g'|tr -d "\n")
echo "$r"
,,,"Investigacion, desarrollo",,,
Using awk:
awk '{ p = ""; while (match($0, /"[^"]*,{2,}[^"]*"/)) { t = substr($0, RSTART, RLENGTH); gsub(/,+/, ",", t); p = p substr($0, 1, RSTART - 1) t; $0 = substr($0, RSTART + RLENGTH); }; $0 = p $0 } 1'
Test:
$ echo ',,,"Investigacion,,, desarrollo",,,' | awk ...
,,,"Investigacion, desarrollo",,,
$ echo ',,,"Investigacion,,, desarrollo",,,",,, "' | awk ...
,,,"Investigacion, desarrollo",,,", "

issue with OFS in awk

I have a string containing this (field separator is the percentage sign), stored in a variable called data
201%jkhjfhn%kfhngjm%mkdfhgjdfg%mkdfhgjdfhg%mkdhfgjdhfg%kdfhgjgh%kdfgjhgfh%mkfgnhmkgfnh%k,gnhjkgfn%jkdfhngjdfng
I'm trying to print out that string, replacing the percentage sign with a pipe, but it seems harder than I thought:
echo ${data} | awk -F"%" 'BEGIN {OFS="|"} {print $0}'
I know I'm very close to it just not close enough.
I see that code as:
1 echo the variable value into an awk session
2 set field separator as "%"
3 set as output field separator "|"
4 print the line
Try this :
echo "$data" | awk -F"%" 'BEGIN {OFS="|"} {$1=$1; print $0}'
From awk manual
Finally, there are times when it is convenient to force awk to rebuild the entire
record, using the current value of the fields and OFS. To do this, use the seemingly
innocuous assignment:
$1 = $1 # force record to be reconstituted
print $0 # or whatever else with $0
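A short side-by-side sketch of the behaviour described in that excerpt, using a shortened made-up value for data; the names unchanged and rebuilt are only for illustration:

```shell
data='201%abc%def'

# No field is assigned, so $0 is printed exactly as it was read and OFS is never used.
unchanged=$(printf '%s\n' "$data" | awk -F'%' -v OFS='|' '{ print $0 }')

# Assigning to a field makes awk rebuild $0 by joining the fields with OFS.
rebuilt=$(printf '%s\n' "$data" | awk -F'%' -v OFS='|' '{ $1 = $1; print $0 }')

printf '%s\n%s\n' "$unchanged" "$rebuilt"
```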
Another lightweight way, using only tr, if you are looking for an alternative to awk:
tr '%' '|' <<< "$data"
Sputnick gave you the awk solution, but you don't actually need awk at all, just use your shell:
echo "${data//%/|}"

rearrange data

If I have a list of data in a text file separated by newlines, is there a way to put something at the start of each line, then the data, then something else, then the data again?
EG a field X would become new X = X;
Can you do this with bash or sed or just unix tools like cut?
EDIT:
I am trying to get "ITEM_SITE_ID :{$row['ITEM_SITE_ID']} " .
I am using this line awk '{ print "\""$1 " {:$row['$1']} " }'
And I get this "ITEM_SITE_ID {:$row[]}
What have I missed?
I think the problem is that the single quotes inside your awk program end the shell's single-quoted string; a single quote cannot be escaped inside single quotes.
With sed:
sed "s/\(.*\)/\1 = \1;/"
Or in your case:
sed "s/\(.*\)/\"\1 :{\$row['\1']}\"/"
And with bash:
while read line
do
echo "\"$line :{\$row['$line']}\""
done
And actually you can do it in awk using bash's $'...' strings:
awk $'{ print "\\"" $1 " :{$row[\'" $1 "\']}\\"" }'
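Another option, offered here as a sketch rather than something taken from the answers above, is to pass the single quote to awk as a variable so the program text itself contains none. The variable name q is chosen for illustration:

```shell
# q holds a literal single quote, so the single-quoted awk program needs no escaping.
out=$(printf '%s\n' 'ITEM_SITE_ID' |
      awk -v q="'" '{ print "\"" $1 " :{$row[" q $1 q "]}\"" }')
printf '%s\n' "$out"
```

This works in any POSIX shell, not only in bash.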
Awk is often the perfect tool for tasks like this. For your specific example:
awk '{ print "new " $1 " = " $1 ";" }'
