This is my script:
for i in *.locs
do
awk -v start=$(head -n 1 ${i}) -v end=$(tail -n 1 ${i})
BEGIN {
sum = 0;
count = 0;
range_start = -1;
range_end = -1;
}
{
irow = int($1)
ival = $2 + 0.0
if (irow >= start && end >= irow) {
if (range_start == -1) {
range_start = NR;
}
sum = sum + ival;
count++;
}
else if (irow > end) {
if (range_end == -1) {
range_end = NR - 1;
}
}
}
END {
echo "${i}"
print "start =", range_start, "end =", range_end, "mean =", sum / count
}
done
Which gives me this error:
line 15: syntax error near unexpected token `}'
line 15: `}'
But when I first use awk to generate the variables start and end, and then pass the script with -f myscript.sh,
I don't get an error.
What am I missing?
Thanks in advance
You need to either quote the entire awk script, or escape the dollar signs so that the shell does not expand them as positional parameters before awk is called. (Adding single quotes takes care of the other problem, which is that without a line continuation character, the awk command itself ends at the end of the line and the rest of the script is parsed as incorrect bash code):
awk -v start="$(head -n 1 "$i")" -v end="$(tail -n 1 "$i")" '
BEGIN {
...
'
Characters within your awk program are being interpreted by the shell. You can pass your program as a command-line argument to awk, but it must be enclosed in 'single-quotes' to prevent the shell from interpreting it.
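Putting both answers together, a corrected loop might look roughly like the sketch below. The sample file and the start/end values (2 and 4) are assumptions made up for illustration, since the real .locs format is not shown, and range_mean is a hypothetical helper name. It also replaces the echo inside END with print, because echo is a shell command and not an awk statement, and it passes the input file to awk explicitly.

```shell
#!/bin/sh
# Sketch only: the sample data and the start/end values are assumptions.
tmpdir=$(mktemp -d)
printf '1 10\n2 20\n3 30\n4 40\n5 50\n' > "$tmpdir/sample.locs"

range_mean() {
    # $1 = file, $2 = first row to include, $3 = last row to include
    awk -v start="$2" -v end="$3" -v name="$1" '
    BEGIN { sum = 0; count = 0; range_start = -1; range_end = -1 }
    {
        irow = int($1)
        ival = $2 + 0.0
        if (irow >= start && irow <= end) {
            if (range_start == -1) range_start = NR
            sum += ival
            count++
        } else if (irow > end && range_end == -1) {
            range_end = NR - 1
        }
    }
    END {
        print name                  # awk has no echo; use print
        if (count > 0)              # avoid dividing by zero on empty ranges
            print "start =", range_start, "end =", range_end, "mean =", sum / count
    }' "$1"                         # the whole program is one single-quoted argument
}

for i in "$tmpdir"/*.locs; do
    range_mean "$i" 2 4
done
```

Because the program sits inside single quotes, the shell leaves $1 and $2 alone and awk sees them as field references.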
I am converting a CSV file into a table format, and I wrote an AWK script and saved it as my.awk. Here is my script:
#AWK for test
awk -F , '
BEGIN {
aa = 0;
}
{
hdng = "fname,lname,salary,city";
l1 = length($1);
l13 = length($13);
if ((l1 > 2) && (l13 == 0)) {
fname = substr($1, 2, 1);
l1 = length($3) - 4;
lname = substr($3, l1, 4);
processor = substr($1, 2);
#printf("%s,%s,%s,%s\n", fname, lname, salary, $0);
}
if ($0 ~ ",,,,")
aa++
else if ($0 ~ ",fname")
printf("%s\n", hdng);
else if ((l1 > 2) && (l13 == 0)) {
a++;
}
else {
perf = $11;
if (perf ~/^[0-9\.\" ]+$/)
type = "num"
else
type = "char";
if (type == "num")
printf("Mr%s,%s,%s,%s,,N,N,,\n", $0,fname,lname, city);
}
}
END {
} ' < life.csv > life_out.csv
How can I run this script on a Unix server? I tried to run this my.awk file by using this command:
awk -f my.awk life.csv
The file you show is a shell script, not an awk program, so try sh my.awk.
If you want to use awk -f my.awk life.csv > life_out.csv, remove the awk -F , ' line and the last line from the file, and add FS="," to the BEGIN block.
If you put #!/bin/awk -f on the first line of your AWK script, it is easier to run. Plus, editors like Vim and ... will recognize the file as an AWK script and apply syntax highlighting. :)
#!/bin/awk -f
BEGIN {} # Begin section
{} # Loop section
END{} # End section
Change the file to be executable by running:
chmod ugo+x ./awk-script
and you can then call your AWK script like this:
$ echo "something" | ./awk-script
Put the part from BEGIN to END{} in a file and name it something like my.awk.
Then execute it like this:
awk -f my.awk life.csv >output.txt
I also see that the field separator is a comma. You can set that in the BEGIN block of the .awk file with FS=",".
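As a rough illustration of that layout, the sketch below writes a tiny stand-alone awk program and runs it with awk -f. The file names demo.awk and demo.csv, the sample rows, and the field-swapping rule are placeholders invented for this example; they are not the asker's real logic.

```shell
#!/bin/sh
# Sketch only: demo.awk, demo.csv and the swap rule are hypothetical placeholders.
tmpdir=$(mktemp -d)

# A pure awk program: no "awk -F , '" wrapper and no shell redirections inside.
cat > "$tmpdir/demo.awk" <<'EOF'
BEGIN { FS = "," }          # same effect as passing -F , on the command line
{ print $2 "," $1 }         # placeholder rule: swap the first two fields
EOF

printf 'john,smith\njane,doe\n' > "$tmpdir/demo.csv"

# Input file and output redirection are given on the command line instead.
awk -f "$tmpdir/demo.awk" "$tmpdir/demo.csv" > "$tmpdir/demo_out.csv"
cat "$tmpdir/demo_out.csv"
```

The point of the split is that the .awk file holds only awk code, while file names and redirections stay with the shell.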
I have a bash script that greps and sorts information from /etc/passwd here
export FT_LINE1=13
export FT_LINE2=23
cat /etc/passwd | grep -v "#" | awk 'NR%2==1' | cut -f1 -d":" | rev | sort -r | awk -v l1="$FT_LINE1" -v l2="$FT_LINE2" 'NR>=l1 && NR<=l2' | tr '\n' ',' | sed 's/, */, /g'
The result is this list
sstq_, sorebrek_brk_, soibten_, sirtsa_, sergtsop_, sec_, scodved_, rlaxcm_, rgmecived_, revreswodniw_, revressta_,
How can I replace the last comma with a dot (.)? I want it to look like this:
sstq_, sorebrek_brk_, soibten_, sirtsa_, sergtsop_, sec_, scodved_, rlaxcm_, rgmecived_, revreswodniw_, revressta_.
You can add:
| sed 's/,$/./'
(where $ means "end of line").
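One caveat, stated as an assumption about the real output: if the earlier sed 's/, */, /g' step leaves a space after the final comma, the pattern ,$ will not match. A slightly more tolerant variant also accepts trailing spaces. The shortened list below is made-up sample data.

```shell
#!/bin/sh
# Sketch only: "list" is shortened sample data with a trailing ", ".
list='sstq_, sorebrek_brk_, soibten_, '

# ", *$" matches the last comma plus any spaces that follow it.
fixed=$(printf '%s' "$list" | sed 's/, *$/./')
printf '%s\n' "$fixed"
```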
There are way too many pipes in your command, and some of them can be removed.
As explained in the comments, cat <FILE> | grep is a bad habit. In general, cat <FILE> | cmd should be replaced by cmd <FILE> or cmd < FILE, depending on what kind of arguments the command accepts.
When processing a file that is a few GB in size, you will already feel the difference.
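A small sketch of that replacement, using a made-up three-line file in place of /etc/passwd: both forms produce the same output, but the second starts one process fewer and lets grep read the file directly.

```shell
#!/bin/sh
# Sketch only: the temporary file stands in for /etc/passwd.
tmp=$(mktemp)
printf '# comment\nroot:x:0\n_www:x:70\n' > "$tmp"

with_cat=$(cat "$tmp" | grep -v '#')     # extra cat process and an extra pipe
without_cat=$(grep -v '#' "$tmp")        # grep opens the file itself

printf '%s\n' "$without_cat"
```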
That said, you can do the whole job without a single pipe by using awk, for example (asort is a GNU awk extension, so this needs gawk):
awk -v l1="$FT_LINE1" -v l2="$FT_LINE2" 'function reverse(s){p=""; for(i=length(s); i>0; i--){p=p substr(s,i,1);}return p;}BEGIN{cmp=0; FS=":"; ORS=","}!/#/{cmp++;if(cmp%2==1) a[cmp]=reverse($1);}END{n=asort(a);for(i=n;i>0;i--){pos=n-i+1;if(pos>=l1 && pos<=l2){if(pos==l2 || i==1){ORS=".";}print a[i];}}}' /etc/passwd
Explanation:
# BEGIN rule(s)
BEGIN {
cmp = 0 # counts the kept lines, since NR cannot be used directly
FS = ":" # field separator
ORS = "," # output record separator
}
# Rule(s)
! /#/ { # for lines that do not contain this character
cmp++
if (cmp % 2 == 1) {
a[cmp] = reverse($1) # store the reversed first field in an array
}
}
# END rule(s)
END {
n = asort(a) # sort the array; it is then walked in reverse order
for (i = n; i > 0; i--) {
pos = n - i + 1 # 1-based position in the reverse-sorted list
# apply your range conditions
if (pos >= l1 && pos <= l2) {
if (pos == l2 || i == 1) { # last element to print: end with a dot instead of a comma
ORS = "."
}
print a[i] # print the array element
}
}
}
# Functions, listed alphabetically
#if the reverse operation is necessary then you can use the following function that will reverse your strings.
function reverse(s)
{
p = ""
for (i = length(s); i > 0; i--) {
p = p substr(s, i, 1)
}
return p
}
If you don't need the reverse part, you can just remove it from the awk script.
In the end, not a single pipe is used!!!
I have a text file and I'm trying to validate one particular column (column 5). If that column contains only the values ACT, LFP, TST, or EPO, the file should go on to further processing; if it contains any other value, the script should terminate.
Code
cat test.txt \
| awk -F '~' -v ERR="/a/x/ERROR" -v NAME="/a/x/z/" -v WRKD="/a/x/b/" -v DATE="23_09_16" -v PD="234" -v FILE_NAME="FILENAME" \
'{ if ($5 != "ACT" || $5 != "LFP" || $5 != "EPO" || $5 != "TST")
system("mv "NAME" "ERR);
system("rm -f"" "WRKD);
print DATE" " PD " " "[" FILE_NAME "]" " ERROR: Panel status contains invalid value due to this file move to error folder";
print DATE" " PD " " "[" FILE_NAME "]" " INFO: Script is exited";
system("exit");
}' >>log.txt
Text file: test.txt (note: this file should be processed successfully)
161518~CHEM~ACT~IRPMR~ACT~UD
010282~CHEM~ACT~IRPMR~ACT~UD
162794~CHEM~ACT~IRPMR~LFP~UD
030767~CHEM~ACT~IRPMR~LFP~UD
Text file: test1.txt (note: this file should not be processed successfully; it contains one invalid value)
161518~CHEM~ACT~IRPMR~**ACT1**~UD
010282~CHEM~ACT~IRPMR~ACT~UD
162794~CHEM~ACT~IRPMR~TST~UD
030767~CHEM~ACT~IRPMR~LFP~UD
awk to the rescue!
Let's assume the following input file:
010282~CHEM~ACT~IRPMR~ACT~UD
121212~CHEM~ACT~IRPMR~ZZZ~UD
162794~CHEM~ACT~IRPMR~TST~UD
020202~CHEM~ACT~IRPMR~YYY~UD
030767~CHEM~ACT~IRPMR~LFP~UD
987654~CHEM~ACT~IRPMR~EPO~UD
010101~CHEM~ACT~IRPMR~XXX~UD
123456~CHEM~ACT~IRPMR~TST~UD
1) This example illustrates how to check for invalid lines/records in the input file:
#!/bin/awk -f
BEGIN {
FS = "~"
s = "ACT,LFP,TST,EPO"
n = split( s, a, "," )
}
{
for( i = 1; i <= n; i++ )
if( a[i] == $5 )
next
print "Unexpected value # line " NR " [" $5 "]"
}
# eof #
Testing:
$ awk -f script.awk -- input.txt
Unexpected value # line 2 [ZZZ]
Unexpected value # line 4 [YYY]
Unexpected value # line 7 [XXX]
2) This example illustrates how to filter out (remove) invalid lines/records from the input file:
#!/bin/awk -f
BEGIN {
FS = "~"
s = "ACT,LFP,TST,EPO"
n = split( s, a, "," )
}
{
for( i = 1; i <= n; i++ )
{
if( a[i] == $5 )
{
print $0
next
}
}
}
# eof #
Testing:
$ awk -f script.awk -- input.txt
010282~CHEM~ACT~IRPMR~ACT~UD
162794~CHEM~ACT~IRPMR~TST~UD
030767~CHEM~ACT~IRPMR~LFP~UD
987654~CHEM~ACT~IRPMR~EPO~UD
123456~CHEM~ACT~IRPMR~TST~UD
3) This example illustrates how to display the invalid lines/records from the input file:
#!/bin/awk -f
BEGIN {
FS = "~"
s = "ACT,LFP,TST,EPO"
n = split( s, a, "," )
}
{
for( i = 1; i <= n; i++ )
if( a[i] == $5 )
next
print $0
}
# eof #
Testing:
$ awk -f script.awk -- input.txt
121212~CHEM~ACT~IRPMR~ZZZ~UD
020202~CHEM~ACT~IRPMR~YYY~UD
010101~CHEM~ACT~IRPMR~XXX~UD
Hope it Helps!
Without getting into the calls to system, this will show you an answer.
awk -F"~" '{ if (! ($5 == "ACT" || $5 == "LFP" || $5 == "EPO" || $5 == "TST")) print $0}' data.txt
output
161518~CHEM~ACT~IRPMR~**ACT1**~UD
This version tests whether $5 matches at least one item in the list. If it doesn't (the ! in front of the parenthesized || chain negates the result), it prints the record as an error.
Of course, $5 can match only one item from that list at a time, but that is all you need.
By contrast, when you say
if ($5 != "ACT" || $5 != "LFP" ...)
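You're creating a logic test that is always true. If $5 is "LFP", then $5 != "ACT" is already true, so the whole || chain succeeds and the remaining tests are never evaluated; if $5 is "ACT", then $5 != "LFP" is true instead. No value can equal all four strings at once, so every record is treated as invalid. To keep the != form, join the tests with && instead of ||.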
IHTH
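To turn that test into a go/no-go decision for the whole file, one option is to let awk report the result through its exit status and keep the mv/rm/logging in the shell. This is a sketch under assumptions: is_valid is a hypothetical helper name, and the two sample files are cut-down versions of test.txt and test1.txt. Note that system("exit") in the original only ends the child shell that system starts; it does not stop awk or the calling script.

```shell
#!/bin/sh
# Sketch only: is_valid and the sample files are hypothetical.
tmpdir=$(mktemp -d)
printf '161518~CHEM~ACT~IRPMR~ACT~UD\n162794~CHEM~ACT~IRPMR~LFP~UD\n' > "$tmpdir/good.txt"
printf '161518~CHEM~ACT~IRPMR~ACT1~UD\n010282~CHEM~ACT~IRPMR~ACT~UD\n' > "$tmpdir/bad.txt"

is_valid() {
    # Exit status 0 if every column-5 value is allowed, 1 otherwise.
    awk -F '~' '
    $5 != "ACT" && $5 != "LFP" && $5 != "EPO" && $5 != "TST" { bad = 1; exit }
    END { exit bad + 0 }
    ' "$1"
}

for f in "$tmpdir/good.txt" "$tmpdir/bad.txt"; do
    if is_valid "$f"; then
        echo "$f: OK, continue processing"
    else
        echo "$f: ERROR, invalid panel status"   # move the file / write the log here
    fi
done
```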
So I am trying to write a bash script to check whether all values in a data set are within a certain margin of the average.
So far:
#!/bin/bash
cat massbuild.csv
while IFS=, read col1 col2
do
x=$(grep "$col2" $col1.pdb | grep "HETATM" | awk '{ sum += $7; n++ } END { if (n > 0) print sum / n; }')
i=$(grep "$col2" $col1.pdb | grep "HETATM" | awk '{print $7;}')
if $(($i > $[$x + 15])); then
echo "OUTSIDE THE RANGE!"
fi
done < massbuild.csv
I have broken it down into components to test, and found that the values of x and i are read correctly, but it seems that adding 15 to x, or the comparison with i, doesn't work.
I have read around online and I am stumped =/
Without sample input and expected output we're just guessing but MAYBE this is the right starting point for your script (untested, of course, since no in/out provided):
#!/bin/bash
awk -F, '
NR==FNR {
file = $1 ".pdb"
if ( !(file in file2col2s) ) {
ARGV[ARGC++] = file
}
file2col2s[file] = (file in file2col2s ? file2col2s[file] FS : "") $2
next
}
FNR==1 { split(file2col2s[FILENAME],col2s) }
/HETATM/ {
for (i=1;i in col2s;i++) {
col2 = col2s[i]
if ($0 ~ col2) {
sum[FILENAME,col2] += $7
cnt[FILENAME,col2]++
}
}
}
END {
for (file in file2col2s) {
split(file2col2s[file],col2s)
for (i=1;i in col2s;i++) {
col2 = col2s[i]
print sum[file,col2]
print cnt[file,col2]
}
}
}
' massbuild.csv
Does this help?
a=4; b=0; if [ "$a" -lt "$(( $b + 5 ))" ]; then echo "a < b + 5"; else echo "a >= b + 5"; fi
Ref: http://www.tldp.org/LDP/abs/html/comparison-ops.html
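One thing to keep in mind with the [ ... -lt ... ] and $(( ... )) forms: bash arithmetic is integer-only, so they fail if x or i contains a decimal point. If the values in column 7 are floating-point numbers (an assumption, since no sample data was given), the comparison can be handed to awk instead. The helper name outside_range and the numbers below are made up for illustration.

```shell
#!/bin/sh
# Sketch only: outside_range and the sample values are hypothetical.
outside_range() {
    # $1 = value, $2 = mean, $3 = margin; succeeds if value > mean + margin
    awk -v v="$1" -v m="$2" -v d="$3" 'BEGIN { exit !(v > m + d) }'
}

x=12.5    # assumed mean
i=30.2    # assumed single value

if outside_range "$i" "$x" 15; then
    echo "OUTSIDE THE RANGE!"
fi
```

If i holds several values, as it would when the second awk prints $7 for every matching line, the check would need to run once per value, for example in a for loop.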
How can I easily (quick and dirty) change, say, 10 random lines of a file with a simple shell script?
I thought about abusing ed and generating random commands and line ranges, but I'd like to know if there is a better way.
awk 'BEGIN{srand()}
{ lines[++c]=$0 }
END{
while(d<10){
RANDOM = int(1 + rand() * c)
if( !( RANDOM in r) ) {
r[RANDOM]
print "do something with " lines[RANDOM]
++d
}
}
}' file
or if you have the shuf command
shuf -n 10 "$file" | while read -r line
do
sed -i "s/$line/replacement/" "$file"
done
Playing off @Dennis' version, this will always modify exactly 10 lines (provided the file has at least 10).
Picking the random line numbers into a separate array could create
duplicates and, consequently, fewer than 10 modifications.
file=~/testfile
c=$(wc -l < "$file")
awk -v c=$c '
BEGIN {
srand();
count = 10;
}
{
if (c*rand() < count) {
--count;
print "do something with " $0;
} else
print;
--c;
}
' "$file"
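A quick way to convince yourself of the "exactly 10" property is to run the same selection logic on a generated file and count the modified lines. The sketch below assumes a 100-line file built with seq, and uses a CHANGED prefix as a stand-in for the real modification.

```shell
#!/bin/sh
# Sketch only: the 100-line input and the CHANGED marker are assumptions.
tmp=$(mktemp)
seq 1 100 > "$tmp"
c=$(wc -l < "$tmp")

changed=$(awk -v c="$c" '
BEGIN { srand(); count = 10 }
{
    # Pick this line with probability count/c: lines still wanted / lines left.
    if (c * rand() < count) { --count; print "CHANGED " $0 }
    else print
    --c
}' "$tmp" | grep -c '^CHANGED')

echo "$changed lines modified"
```

Once the number of remaining lines equals the number still wanted, c * rand() is always below count, so every remaining line is taken and the total cannot fall short.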
This seems to be quite a bit faster:
file=/your/input/file
c=$(wc -l < "$file")
awk -v c=$c 'BEGIN {
srand();
for (i=0;i<10;i++) lines[i] = int(1 + rand() * c);
asort(lines);
p = 1
}
{
if (NR == lines[p]) {
++p
print "do something with " $0
}
else print
}' "$file"
I