How can I send a parameter to awk using a shell script - shell

I have this file
myfile
a b c d e 1
b c s d e 1
a b d e f 2
d f g h j 2
awk 'if $6==$variable {print $0}' myfile
How can I use this code in a shell script that gets $variable as a parameter from the user at the command prompt?

You can use awk's -v flag. And since awk prints matching records by default, you can try, for example:
variable=1
awk -v var="$variable" '$6 == var' file.txt
Results:
a b c d e 1
b c s d e 1
EDIT:
The command is essentially the same, wrapped up in shell. You can use it in a shell script with multiple arguments, invoked like this: script.sh 2 j
Contents of script.sh:
command=$(awk -v var_one="$1" -v var_two="$2" '$6 == var_one && $5 == var_two' file.txt)
echo -e "$command"
Results:
d f g h j 2

This is question 24 in the comp.unix.shell FAQ (http://cfajohnson.com/shell/cus-faq-2.html#Q24), but the most commonly used alternatives, with the most common reasons to pick between the two, are:
-v var=value '<script>' file1 file2
if you want the variable to be populated in the BEGIN section
or:
'<script>' file1 var=value file2
if you do not want the variable to be populated in the BEGIN section and/or need to change the variable's value between files
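The difference between the two placements can be seen with a quick test (a sketch using a throwaway sample file; the path /tmp/awkdemo.txt is purely illustrative):

```shell
# Build a small sample file like the one in the question (illustrative path).
printf 'a b c d e 1\nb c s d e 1\na b d e f 2\n' > /tmp/awkdemo.txt

# With -v, the variable is already visible inside BEGIN:
awk -v var=1 'BEGIN { print "BEGIN sees: " var } $6 == var' /tmp/awkdemo.txt

# With a file-list assignment, BEGIN runs before var is set,
# but the main loop still sees it for the files that follow the assignment:
awk 'BEGIN { print "BEGIN sees: " var } $6 == var' var=1 /tmp/awkdemo.txt
```

The first command prints "BEGIN sees: 1" before the matching records; the second prints an empty value in BEGIN, yet still filters the records correctly.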


Conditional print based on array content in bash or awk

I have an input file with the following contents:
SS SC
a 1
b 2
d 5
f 7
I have an input bash array as follows:
echo "${input[*]}"
a b c d e f
I need to create an output to:
1. Print all the elements of the array in the 1st column.
2. In the second column, print 0 or 1, based on the presence of the element.
To explain: in the input array called input, I have a,b,c,d,e,f. Now a is present in the input file, so the output should be a 1, whereas c is missing in the input file, so the output should be c 0.
Eg: Expected result:
SS RESULT
a 1
b 1
c 0
d 1
e 0
f 1
I tried to split the bash array in an attempt to iterate over it, but it prints for each line (the way awk works), and it's getting too difficult to handle.
awk -v par="${input[*]}" 'BEGIN{ n = split(par, a, " ")} {for(i=0;i<=n;i++){printf "%s\n", a[i]}}' input
I am able to do this (missing the header) with a bash for loop and some grep. But I'm hoping awk would be shorter, as I need to put this in a YAML file, so I need to keep it short.
for item in ${input[@]}; do
if ! grep -qE "^${item}" input ;then
echo "$item 0";
else
echo "$item 1";
fi;
done
a 1
b 1
c 0
d 1
e 0
f 1
Using awk to store the values in the first column of the file in an associative array and then see if the elements of the array exist in it:
#!/usr/bin/env bash
input=(a b c d e f)
awk 'BEGIN { print "SS", "RESULT" }
FNR == NR { vals[$1] = 1; next }
{ print $0, $0 in vals }
' input.txt <(printf "%s\n" "${input[@]}")
Or doing the same thing in pure Bash:
#!/usr/bin/env bash
input=(a b c d e f)
declare -A vals
while read -r v _; do
vals[$v]=1
done < input.txt
echo "SS RESULT"
for v in "${input[@]}"; do
if [[ -v vals[$v] ]]; then
printf "%s 1\n" "$v"
else
printf "%s 0\n" "$v"
fi
done
The following code snippet demonstrates how it can be achieved in Perl:
use strict;
use warnings;
use feature 'say';
my @array = qw/a b c d e f/;
my %seen;
$seen{(split)[0]}++ while <DATA>;
say 'SS RESULT';
say $_, ' ', $seen{$_} ? 1 : 0 for @array;
__DATA__
SS SC
a 1
b 2
d 5
f 7
Output
SS RESULT
a 1
b 1
c 0
d 1
e 0
f 1

linux shell diff two files to get new line

I have two files and I want to get the new lines by comparing them. I know I can use 'diff newfile oldfile' to get the new lines, but the output will include "<" and diff information which I don't want to have.
For example, now I have an oldfile:
a
b
c
and a newfile
a
b
c
d
e
f
the result of the 'diff newfile oldfile' will be
4,6d3
< d
< e
< f
but the result I want to have is
d
e
f
So how can I get this output? I have searched many diff options but don't have any ideas.
Thank you in advance.
Similar to this question, you can use comm for this purpose.
comm -13 file1 file2
This will print only the lines of file2 that don't exist in file1.
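Note that comm expects both inputs to be lexically sorted; if they are not, you can sort them on the fly with process substitution (a sketch, assuming the files fit comfortably through sort):

```shell
# comm requires sorted input; sort both files on the fly so the
# original files can stay in any order.
comm -13 <(sort oldfile) <(sort newfile)
```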
Native diff solution:
diff --changed-group-format='%<' --unchanged-group-format='' new.txt old.txt
The output:
d
e
f
You could also use awk:
$ awk 'NR==FNR{a[$0];next} ($0 in a==0)' oldfile newfile
d
e
f
or grep if the files are not that big (mind the partial matches):
$ grep -v -f oldfile newfile
d
e
f
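To guard against the partial matches mentioned above, the -F (fixed strings) and -x (whole-line match) flags can be added (a sketch):

```shell
# -F treats each pattern as a literal string, -x only accepts
# whole-line matches, so "a" in oldfile no longer hides "ab" in newfile.
grep -vFx -f oldfile newfile
```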
or join (inputfiles need to be ordered):
$ join -v 2 oldfile newfile
d
e
f

Concatenation of two columns from the same file

From a text file
file
a d
b e
c f
how are the tab-delimited columns concatenated into one column?
a
b
c
d
e
f
Currently I use awk to output the columns to two files that I then concatenate using cat. But there must be a better one-line command?
For a generalized approach:
$ f() { awk '{print $'$1'}' file; }; f 1; f 2
a
b
c
d
e
f
If the file is tab-delimited, perhaps simply with cut (the inverse operation of paste):
$ cut -f1 file.t; cut -f2 file.t
This simple awk command should do the job:
awk '{print $1; s=s $2 ORS} END{printf "%s", s}' file
a
b
c
d
e
f
You can use process substitution; that would eliminate the need to create a file for each column.
$ cat file
a d
b e
c f
$ cat <(awk '{print $1}' file) <(awk '{print $2}' file)
a
b
c
d
e
f
$
OR
As per the comment, you can just combine multiple commands and redirect their output to a file like this:
$ cat file
a d
b e
c f
$ (awk '{print $1}' file; awk '{print $2}' file) > output
$ cat output
a
b
c
d
e
f
$
Try this: without reading the file twice and without any calls to external commands, a single awk comes to the rescue. This also assumes that your Input_file is the same as the shown sample.
awk '{VAL1=VAL1?VAL1 ORS $1:$1;VAL2=VAL2?VAL2 ORS $2:$2} END{print VAL1 ORS VAL2}' Input_file
Explanation: we simply create a variable named VAL1, which holds $1's value and keeps concatenating new values onto its own, while VAL2 holds $2's value and keeps concatenating in the same way. In the END section of awk, the values of VAL1 and VAL2 are printed.
You can combine bash commands with ; to get a single stream:
$ awk '{print $1}' file; awk '{print $2}' file
a
b
c
d
e
f
Use process substitution if you want that to be as if it were a single file:
$ txt=$(awk '{print $1}' file; awk '{print $2}' file)
$ echo "$txt"
a
b
c
d
e
f
Or for a Bash while loop:
$ while read -r line; do echo "line: $line"; done < <(awk '{print $1}' file; awk '{print $2}' file)
line: a
line: b
line: c
line: d
line: e
line: f
If you're using Notepad++, you could replace all tab characters with the newline string "\r\n".
Another approach:
for i in $(seq 1 2); do
awk '{print $'$i'}' file
done
output:
a
b
c
d
e
f

How to multiply AWK output

I have a file data.csv that reads:
A
B
C
and I want the output of the code to be multiplied n times:
A
B
C
A
B
C
Here is an example of a line I've been trying and what it returns:
awk '{for (i=0; i<3 ;i++){ print $1}}' input.csv
A
A
A
B
B
B
C
C
C
The same happens with cat and other tools.
$ awk -v n=3 'BEGIN{ for (i=1;i<n;i++) {ARGV[ARGC]=ARGV[1]; ARGC++} } 1' file
A
B
C
A
B
C
A
B
C
Note that the above only stores the name of the file n times, not the contents of the file and so it'd work for any file of any size as it uses negligible memory.
This would do:
for i in {1..3}; do cat data.csv; done
It won't work with pipes, though.
Thanks for the comments
You can use cat and printf:
cat $(printf "%0.sfile " {1..3})
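This works because the %0.s conversion consumes one argument from the brace expansion but prints zero characters of it, leaving only the literal "file " text, emitted once per argument (a quick check):

```shell
# %0.s eats each argument from {1..3} without printing it;
# the literal "file " that follows is emitted once per argument.
printf "%0.sfile " {1..3}; echo
```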
Here is a single efficient one-liner: yes data | head -3 | xargs cat
$ cat data
A
B
C
$ yes data | head -3 | xargs cat
A
B
C
A
B
C
A
B
C
$
head -3 => here 3 indicates n, the number of repetitions.
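The repeat count can be parameterised as a shell variable (a sketch; data is the file to repeat, as in the answer above):

```shell
n=3
# yes prints the filename forever; head keeps the first n copies;
# xargs hands the n filenames to a single cat invocation.
yes data | head -n "$n" | xargs cat
```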
Or using an awk solution:
$ cat data
A
B
C
$ awk 'BEGIN{i=0} {a[i]=$0;i++} END {for(i=0;i<=3;i++) for(j=0;j<=NR;j++) print a[j]}' data | sed '/^$/d'
A
B
C
A
B
C
A
B
C
A
B
C
$
Try this:
seq 2 | xargs -Inone cat input.csv
Probably the shortest:
cat input.csv{,,}
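The {,,} here is a brace expansion with three empty alternatives, so the shell expands the filename three times before cat ever runs; echo makes the expansion visible (a quick check):

```shell
# Each empty alternative in {,,} produces one copy of the surrounding word.
echo input.csv{,,}
```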
Supposing you're writing a shell script, why use awk?
for i in `seq 3`; do
cat data.csv
done
If you want to do this using pipes, e.g. with awk, you'll need to store the file data in memory or save it temporarily to disk. For example:
cat data.csv | \
awk '{a = a $0 "\n"} END { for (i=0; i<3 ;i++){ printf "%s",a; }}'
for (( c=1; c<=3; c++ ))
do
cat Input_file.csv
done
With sed and the hold/pattern space:
In this given situation with only single letters, respectively A, B, C:
If you want to print once:
cat temp | sed 's/\(.*\)/\1/;N;N;H'
output:
[anthony@aguevara ~]$ cat temp | sed 's/\(.*\)/\1/;N;N;H'
A
B
C
[anthony@aguevara ~]$
Twice (just append a semicolon/capital G to the end):
cat temp | sed 's/\(.*\)/\1/;N;N;H;G'
output:
[anthony@aguevara ~]$ cat temp | sed 's/\(.*\)/\1/;N;N;H;G'
A
B
C
A
B
C
[anthony@aguevara ~]$
Three times (another G):
cat temp | sed 's/\(.*\)/\1/;N;N;H;G;G'
output:
[anthony@aguevara ~]$ cat temp | sed 's/\(.*\)/\1/;N;N;H;G;G'
A
B
C
A
B
C
A
B
C
[anthony@aguevara ~]$
and so on.
File (with no blank lines in it):
[anthony@aguevara ~]$ cat temp
A
B
C
[anthony@aguevara ~]$

Search for a column by name in awk

I have a file that has many columns. Let us say "Employee_number", "Employee_name", "Salary". I want to display all entries in a column by giving all or part of the column name. For example, if my input is "name", I want all the employee names printed. Is it possible to do this in a simple manner using awk?
Thanks
Given a script getcol.awk as follows:
BEGIN {
colname = ARGV[1]
ARGV[1] = ""
getline
for (i = 1; i <= NF; i++) {
if ($i ~ colname) {
break;
}
}
if (i > NF) exit
}
{print $i}
... and the input file test.txt:
apple banana candy deer elephant
A B C D E
A B C D E
A B C D E
A B C D E
A B C D E
A B C D E
A B C D E
... the command:
$ awk -f getcol.awk b <test.txt
... gives the following output:
B
B
B
B
B
B
B
Note that the output text does not include the first line of the test file, which is treated as a header.
A simple one-liner will do the trick:
$ cat file
a b c
1 2 3
1 2 3
1 2 3
$ awk -v c="a" 'NR==1{for(i=1;i<=NF;i++)n=$i~c?i:n;next}n{print $n}' file
1
1
1
$ awk -v c="b" 'NR==1{for(i=1;i<=NF;i++)n=$i~c?i:n;next}n{print $n}' file
2
2
2
$ awk -v c="c" 'NR==1{for(i=1;i<=NF;i++)n=$i~c?i:n;next}n{print $n}' file
3
3
3
# no column d so no output
$ awk -v c="d" 'NR==1{for(i=1;i<=NF;i++)n=$i~c?i:n;next}n{print $n}' file
Note: as in your requirement you want name to match employee_name, just be aware that if you give employee you will get the last column matching employee; this is easily changed, however.
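One easy change, for example, is to keep the first matching header instead of the last by breaking out of the loop after the first hit (a sketch along the lines of the one-liner above):

```shell
# Stop at the first header field matching c rather than the last one.
awk -v c="name" 'NR==1 { for (i=1; i<=NF; i++) if ($i ~ c) { n = i; break }; next } n { print $n }' file
```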
