BASH: Cannot awk with a variable in a while loop - bash

I have a problem when trying to run awk on input from read inside a while loop.
This is my code:
#!/bin/bash
read -p "Please enter the Array LUN ID (ALU) you wish to query, separated by a comma (e.g. 2036,2037,2045): " ARRAY_LUNS
LUN_NUMBER=`echo $ARRAY_LUNS | awk -F "," '{ for (i=1; i<NF; i++) printf $i"\n" ; print $NF }' | wc -w`
echo "you entered $LUN_NUMBER LUN's"
s=0
while [ $s -lt $LUN_NUMBER ];
do
s=$[$s+1]
LUN_ID=`echo $ARRAY_LUNS | awk -F, '{print $'$s'}' | awk -v n1="$s" 'NR==n1'`
echo "NR $s :"
echo "awk -v n1="$s" 'NR==n1'$LUN_ID"
done
No matter which awk options I try, I can't get it to display more than the first entry before the comma. It looks to me as if the loop has trouble counting the variable s upwards. But on the other hand, the code line:
LUN_ID=`echo $ARRAY_LUNS | awk -F, '{print $'$s'}' | awk -v n1="$s" 'NR==n1'`
works just fine on its own! Any idea how to solve this? Another way of handling my read input would be fine as well.

#!/bin/bash
typeset -a ARRAY_LUNS
IFS=, read -r -p "Please enter the Array LUN ID (ALU) you wish to query, separated by a comma (e.g. 2036,2037,2045): " -a ARRAY_LUNS
LUN_NUMBER="${#ARRAY_LUNS[@]}"
echo "you entered $LUN_NUMBER LUNs"
for((s=0;s<LUN_NUMBER;s++))
do
echo "LUN id $s: ${ARRAY_LUNS[s]}"
done
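The same splitting can be tried without typing anything at a prompt. This is only a sketch: the variable input is a hypothetical stand-in for what the user would enter, and luns is an illustrative array name, not something from the original script.

```shell
#!/bin/bash
# Hypothetical sample input standing in for the interactive prompt.
input="2036,2037,2045"

# Split on commas into an array; -a takes the array name as its argument.
IFS=, read -r -a luns <<< "$input"
count=${#luns[@]}

echo "you entered $count LUNs"
for ((i = 0; i < count; i++)); do
    echo "LUN id $i: ${luns[i]}"
done
```

Setting IFS only on the read command keeps the change local to that one command, so the rest of the script still splits on whitespace.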
Why does your awk code not work?
The problem is not the counter. It is the last awk command in the pipe, i.e.
awk -v n1="$s" 'NR==n1'.
This awk code tries to print the first line when s is 1, the second line when s is 2, the third line when s is 3, and so on... But how many lines are printed by echo $ARRAY_LUNS? Just ONE... there is no second line, no third line... just ONE line and just ONE line is printed.
That line contains all LUN_IDs in ONE LINE, i.e, one LUN_ID next to another LUN_ID, like this way:
34 45 21 223
NOT this way
34
45
21
223
Those LUN_IDs are fields printable by awk using $1, $2, $3, ... and so on.
Therefore, if you want your code to run fine, just remove that last command in the pipe:
LUN_ID=$(echo "$ARRAY_LUNS" | awk -F, '{print $'$s'}')
For any further questions, please read this awk guide first.
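A quick way to convince yourself of this is to count lines and fields separately. The snippet below is a sketch with a hypothetical sample value in ARRAY_LUNS; it also passes the field index with -v rather than splicing the shell variable into the awk program, which is one alternative to the quoting trick above.

```shell
#!/bin/bash
# Hypothetical sample input standing in for the interactive read.
ARRAY_LUNS="2036,2037,2045"

# echo prints a single line, so awk sees exactly one record (NR ends at 1).
line_count=$(echo "$ARRAY_LUNS" | awk 'END { print NR }')
# The IDs are fields of that one record.
field_count=$(echo "$ARRAY_LUNS" | awk -F, '{ print NF }')
echo "lines: $line_count, fields: $field_count"

# Select the s-th field; n is an awk variable set from the shell value of s.
s=2
LUN_ID=$(echo "$ARRAY_LUNS" | awk -F, -v n="$s" '{ print $n }')
echo "LUN $s: $LUN_ID"
```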

Related

How to read two lines in shell script (In single iteration line1, line2 - Next iteration it should take line2,line3.. so on)

In my shell script, one of the variable contains set of lines. I have a requirement to get the two lines info at single iteration in which my awk needs it.
var contains:
12 abc
32 cdg
9 dfk
98 nhf
43 uyr
5 ytp
Here, In a loop I need line1, line2[i.e 12 abc \n 32 cdg] content and next iteration needs line2, line3 [32 cdg \n 9 dfk] and so on..
I tried to achieve by
while IFS= read -r line
do
count=`echo ${line} | awk -F" " '{print $1}'`
id=`echo ${line} | awk -F" " '{print $2}'`
read line
next_id=`echo ${line} | awk -F" " '{print $2}'`
echo ${count}
echo ${id}
echo ${next_id}
## Here, I have some other logic of awk..
done <<< "$var"
It reads line1 and line2 in the first iteration, but in the second iteration it reads line3 and line4. I need it to read line2 and line3 in the second iteration. Can anyone please help with this?
Thanks in advance..
Don't use a shell loop that spawns 3 separate subshells for awk per iteration when a single call to awk will do. It will be orders of magnitude faster for large input files.
You can group the messages as desired, just by saving the first line in a variable, skipping to the next record and then printing the line in the variable and the current record through the end of the file. For example, with your lines in the file named lines, you could do:
awk 'FNR==1 {line=$0; next} {print line"\n"$0"\n"; line=$0}' lines
Example Use/Output
$ awk 'FNR==1 {line=$0; next} {print line"\n"$0"\n"; line=$0}' lines
12 abc
32 cdg

32 cdg
9 dfk

9 dfk
98 nhf

98 nhf
43 uyr

43 uyr
5 ytp
(the additional line-break was simply included to show separation, the output format can be changed as desired)
You can add a counter if desired and output the count via the END rule.
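As a sketch of that counter idea, the variant below keeps the previous line in prev, counts each pair in pairs, and reports the total from the END rule. The names prev and pairs and the " + " separator are illustrative choices, and the input is the sample data from the question piped in from a variable instead of a file.

```shell
#!/bin/bash
# Sample lines from the question, held in a variable instead of a file.
var=$(printf '12 abc\n32 cdg\n9 dfk\n98 nhf\n43 uyr\n5 ytp')

result=$(printf '%s\n' "$var" | awk '
    FNR == 1 { prev = $0; next }                   # remember the first line, print nothing yet
    { print prev " + " $0; prev = $0; pairs++ }    # emit previous and current line, then shift
    END { print "pairs: " (pairs + 0) }            # + 0 prints 0 when there was no pair
')
echo "$result"
```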
The solution depends on what you want to do with the two lines.
My first thought was something like
sed '2,$ s/.*/&\n&/' <<< "${yourvar}"
But this won't help much when you must process two lines (I think | xargs -L2 won't help).
When you want them in a loop, try
while IFS= read -r line; do
if [ -n "${lastline}" ]; then
echo "Processing lines starting with ${lastline:0:2} and ${line:0:2}"
fi
lastline="${line}"
done <<< "${yourvar}"
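If the loop needs count, id and next_id as in the question, read can split each line into fields directly, so no awk call is needed per line. This is a sketch under that assumption; prev_count and prev_id are hypothetical names, and the input is shortened to three of the sample lines.

```shell
#!/bin/bash
# Three of the sample lines from the question.
var=$(printf '12 abc\n32 cdg\n9 dfk')

result=$(
    prev_count= prev_id=
    # read splits each line on whitespace into count and id.
    while read -r count id; do
        if [ -n "$prev_id" ]; then
            # The previous line supplies count and id, the current line supplies next_id.
            echo "$prev_count $prev_id $id"
        fi
        prev_count=$count
        prev_id=$id
    done <<< "$var"
)
echo "$result"
```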

Awk add a variable to part of a column string

Objective
add "67" to column 1 of the output file with 67 being the variable ($iv) classified on the difference between 2 dates.
File1.csv
display,dc,client,20572431,5383594
display,dc,client,20589101,4932821
display,dc,client,23030494,4795549
display,dc,client,22973424,5844194
display,dc,client,21489000,4251031
display,dc,client,23150347,3123945
display,dc,client,23194965,2503875
display,dc,client,20578983,1522448
display,dc,client,22243554,920166
display,dc,client,20572149,118865
display,dc,client,23077785,28077
display,dc,client,21811100,5439
Current Output 3_file1.csv
BOB-UK-,display,dc,client,20572431,5383594,0.05,269.18
BOB-UK-,display,dc,client,20589101,4932821,0.05,246.641
BOB-UK-,display,dc,client,23030494,4795549,0.05,239.777
BOB-UK-,display,dc,client,22973424,5844194,0.05,292.21
BOB-UK-,display,dc,client,21489000,4251031,0.05,212.552
BOB-UK-,display,dc,client,23150347,3123945,0.05,156.197
BOB-UK-,display,dc,client,23194965,2503875,0.05,125.194
BOB-UK-,display,dc,client,20578983,1522448,0.05,76.1224
BOB-UK-,display,dc,client,22243554,920166,0.05,46.0083
BOB-UK-,display,dc,client,20572149,118865,0.05,5.94325
BOB-UK-,display,dc,client,23077785,28077,0.05,1.40385
BOB-UK-,display,dc,client,21811100,5439,0.05,0.27195
TOTAL,,,,,33430004,,1671.5
Desired Output 3_file1.csv
BOB-UK-67,display,dc,client,20572431,5383594,0.05,269.18
BOB-UK-67,display,dc,client,20589101,4932821,0.05,246.641
BOB-UK-67,display,dc,client,23030494,4795549,0.05,239.777
BOB-UK-67,display,dc,client,22973424,5844194,0.05,292.21
BOB-UK-67,display,dc,client,21489000,4251031,0.05,212.552
BOB-UK-67,display,dc,client,23150347,3123945,0.05,156.197
BOB-UK-67,display,dc,client,23194965,2503875,0.05,125.194
BOB-UK-67,display,dc,client,20578983,1522448,0.05,76.1224
BOB-UK-67,display,dc,client,22243554,920166,0.05,46.0083
BOB-UK-67,display,dc,client,20572149,118865,0.05,5.94325
BOB-UK-67,display,dc,client,23077785,28077,0.05,1.40385
BOB-UK-67,display,dc,client,21811100,5439,0.05,0.27195
TOTAL,,,,,33430004,,1671.5
Current Code
#!/bin/sh
set -eu
de=$(date +"%d-%m-%Y" -d "1 month ago")
ds="15-04-2014"
iv=$(awk -vdate1=$de -vdate2=$ds 'BEGIN{split(date1, A,"-");split(date2, B,"-");year_diff=A[3]-B[3];if(year_diff){months_diff=A[2] + 12 * year_diff - B[2] + 1;} else {months_diff=A[2]>B[2]?A[2]-B[2]+1:B[2]-A[2]+1};print months_diff}')
for f in $(find *.csv); do
awk -F"," -v OFS=',' '{print "BOB-UK-"$iv,$0,0.05}' $f > "1_$f.csv" ##PROBLEM LINE##
awk -F"," -v OFS=',' '{print $0,$6*$7/1000}' "1_$f.csv" > "2_$f.csv" ##calculate price
awk -F"," -v OFS=',' '{print $0}; {sum+=$6}{sum2+=$8} END {print "TOTAL,,,,," (sum)",,"(sum2)}' "2_$f.csv" > "3_$f.csv" ##calculate total
done
Issue
When I run the first awk line (Marked as "## PROBLEM LINE##") the loop doesn't change column $1 to include the "67" after "BOB-UK-". This should be done with the print "BOB-UK-"$iv but instead it doesn't do anything. I suspect this is due to the way print works in awk but I haven't been able to work out a way to treat it within this row. Does anyone know if this is possible or do I need to create a new row to achieve this?
You have to pass the variable's value to awk. awk does not inherit variables from the shell and does not expand $variable the way the shell does. It is a separate tool with its own language.
awk -v iv="$iv" -F"," -v OFS=',' '{print "BOB-UK-"iv,$0,0.05}' "$f"
Tested in repl with the input provided.
for f in $(find *.csv)
is a useless use of find. A plain glob does the same job:
for f in *.csv
Also note that you are creating 1_$f.csv, 2_$f.csv and 3_$f.csv files in the current directory in your loop, so the next time you run your script there will be 4 times more .csv files to iterate through. Dunno if that's relevant.
How does $iv work in awk?
The $<number> is the field number <number> from the line in awk. So for example the $1 is the first field of the line in awk. The $2 is the second field. The $0 is special and it is the whole line.
The $iv expands to $ + the value of iv. So for example:
echo a b c | awk '{iv=2; print $iv}'
will output b, as the $iv expands to $2 then $2 expands to the second field from the input - ie. b.
An uninitialized variable in awk evaluates to 0 in a numeric context (and to the empty string otherwise). So $iv is treated as $0 in your awk line, and it expands to the whole line.
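Both behaviours can be compared side by side on one row of the sample data. This is a sketch: the value 67 is hard-coded here rather than computed from dates, and with_v and without_v are illustrative names.

```shell
#!/bin/bash
# One row from File1.csv and a hard-coded stand-in for the computed value.
line="display,dc,client,20572431,5383594"
iv=67

# iv is handed to awk with -v, so the awk variable iv holds 67.
with_v=$(echo "$line" | awk -F, -v OFS=, -v iv="$iv" '{ print "BOB-UK-" iv, $0, 0.05 }')

# No -v here: inside awk, iv is unset, so $iv means $0 (the whole line).
without_v=$(echo "$line" | awk -F, -v OFS=, '{ print "BOB-UK-" $iv, $0, 0.05 }')

echo "$with_v"
echo "$without_v"
```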

how to pass in a variable to awk commandline

I'm having some trouble passing bash script variables into awk command-line.
Here is pseudocode:
for FILE in $INPUT_DIR/*.txt; do
filename=`echo $FILE | sed -n 's/^.*\(chr[0-9A-Z]*\).*.vcf$/\1/p'`
OUTPUT_FILE=$OUTPUT_DIR/$filename.snps.txt
egrep -v "^#" $FILE | awk '{print $2,$4,$5}' > $OUTPUT_FILE
done
The final line where I awk the columns, I would like it to be flexible or user input. For example, the user could want columns 6,7,and 8 as well, or column 133 and 138, or column 245 through 248. So how do I custom this so I can have that 'print $2 .... $5' be a user input thing? For example the user would run this script like : bash script.sh input_dir output_dir [user inputs whatever string of columns], and then I would get those columns in the output. I tried passing it in, but I guess I'm not getting the syntax right.
With awk, you should define the variable with -v before using it. This is better than the quote-splicing method (awk '{print $'$var'}'):
awk -v var1="$col1" -v var2="$col2" 'BEGIN {print var1,var2 }'
Where $col1 and $col2 would be the input variables.
Maybe you can try an input variable as string with "$2,$4,$5" and print this variable to get the values (I am not sure if this works)
The following test works for me:
A="\$3" ; ls -l | awk "{ print $A }"
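Another option, sketched here as an assumption rather than something from the answers above, is to pass the column list as one comma-separated string and let awk split it. The names cols and want are hypothetical, and the sample line stands in for a row of the real input.

```shell
#!/bin/bash
# Hypothetical user-supplied column list, e.g. taken from "$3" in the real script.
cols="2,4,5"

result=$(echo "a b c d e f" | awk -v cols="$cols" '
    BEGIN { n = split(cols, want, ",") }       # want[1..n] holds the requested column numbers
    {
        out = ""
        for (i = 1; i <= n; i++)
            out = out (i > 1 ? OFS : "") $(want[i])
        print out
    }
')
echo "$result"
```

Because the list travels as data, the awk program itself never changes, which avoids building awk code out of user input.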

Can I have multiple awk actions without inserting newlines?

I'm a newbie with very small and specific needs. I'm using awk to parse something and I need to generate uninterrupted lines of text assembled from several pieces in the original text. But awk inserts a newline in the output whenever I use a semicolon.
Simplest example of what I mean:
Original text:
1 2
awk command:
{ print $1; print $2 }
The output will be:
1
2
The thing is that I need the output to be a single line, and I also need to use the semicolons, because I have to do multiple actions on the original text, not all of them print.
Also, using ORS=" " causes a whole lot of different problems, so it's not an option.
Is there any other way that I can have multiple actions in the same line without newline insertion?
Thanks!
The newlines in the output are nothing to do with you using semicolons to separate statements in your script, they are because print outputs the arguments you give it followed by the contents of ORS and the default value of ORS is newline.
You may want some version of either of these:
$ echo '1 2' | awk '{printf "%s ", $1; printf "%s ", $2; print ""}'
1 2
$
$ echo '1 2' | awk -v ORS=' ' '{print $1; print $2; print "\n"}'
1 2
$
$ echo '1 2' | awk -v ORS= '{print $1; print " "; print $2; print "\n"}'
1 2
$
but it's hard to say without knowing more about what you're trying to do.
At least scan through the book Effective Awk Programming, 4th Edition, by Arnold Robbins to get some understanding of the basics before trying to program in awk or you're going to waste a lot of your time and learn a lot of bad habits first.
You have better control of the output if you use printf, e.g.
awk '{ printf "%s %s\n",$1,$2 }'
awk '{print $1 $2}'
Is the solution in this case
TL;DR
You're getting newlines because print sends ORS (the output record separator, a newline by default) to standard output after each print statement. You can format the output in a variety of other ways, but the key is generally to invoke only a single print or printf statement regardless of how many fields or values you want to print.
Use Commas
One way to do this is to use a single call to print using commas to separate arguments. This will insert OFS between the printed arguments. For example:
$ echo '1 2' | awk '{print $1, $2}'
1 2
Don't Separate Arguments
If you don't want any separation in your output, just pass all the arguments to a single print statement. For example:
$ echo '1 2' | awk '{print $1 $2}'
12
Formatted Strings
If you want more control than that, use formatted strings using printf. For example:
$ echo '1 2' | awk '{printf "%s...%s\n", $1, $2}'
1...2
$ echo "1 2" | awk '{print $1 " " $2}'
1 2

awk line break with printf

I have a simple shell script, shown below, and I want to put a line break after each line returned by it.
#!/bin/bash
vcount=`db2 connect to db_lexus > /dev/null; db2 list tablespaces | grep -i "Tablespace ID" | wc -l`
db2pd -d db_lexus -tablespaces | grep -i "Tablespace Statistics" -A $vcount | awk '{printf ($2 $7)}'
The output is:
Statistics:IdFreePgs0537610230083224460850d
and I want the output to be something like that:
Statistics:
Id FreePgs
0 5376
1 0
2 3008
3 224
4 608
5 0
Is that possible to do with shell scripting?
Your problem can be reduced to the following:
$ cat infile
11 12
21 22
$ awk '{ printf ($1 $2) }' infile
11122122
printf is for formatted printing. I'm not even sure if the behaviour of above usage is defined, but it's not how it's meant to be done. Consider:
$ awk '{ printf ("%d %d\n", $1, $2) }' infile
11 12
21 22
"%d %d\n" is an expression that describes how to format the output: "a decimal integer, a space, a decimal integer and a newline", followed by the numbers that go where the %d are. printf is very flexible, see the manual for what it can do.
In this case, we don't really need the power of printf, we can just use print:
$ awk '{ print $1, $2 }' infile
11 12
21 22
This prints the first and second field, separated by a space [1], and print does add a newline without us telling it to.
[1] More precisely, "separated by the value of the output field separator OFS", which defaults to a space and is printed wherever we use , between two arguments. Forgetting the comma is a popular mistake that leads to no space between the record fields.
It looks like you just want to print columns 2 and 7 of whatever is passed to AWK. Try changing your AWK command to
awk '{print $2, $7}'
This will also add a line break at the end.
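If aligned columns are wanted, printf with a width does that too. The rows below are hypothetical filler laid out only so that the interesting values sit in fields 2 and 7; they are not real db2pd output.

```shell
#!/bin/bash
# Hypothetical rows; only fields 2 and 7 matter for this sketch.
sample=$(printf 'a Id c d e f FreePgs\na 0 c d e f 5376\na 1 c d e f 0')

# %-3s left-aligns field 2 in a 3-character column; \n ends each output line.
result=$(printf '%s\n' "$sample" | awk '{ printf "%-3s %s\n", $2, $7 }')
echo "$result"
```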
I realize you are asking about how to do something in a shell script, but it would certainly be a LOT easier to get this from the database using SQL:
#!/bin/bash
export DB2DBDFT=db_lexus
db2 "select tbsp_id, tbsp_free_pages \
from table(mon_get_tablespace('',-2)) as T \
order by tbsp_id"

Resources