Why is my awk sub command failing? - shell

When I run
df -hl | grep '/dev/disk1' | awk '{sub(/%/, \"\");print $5}'
I'm getting the following error:
awk: syntax error at source line 1
context is
{sub(/%/, >>> \ <<< "\");}
awk: illegal statement at source line 1
I can't seem to find any documentation on awk sub.
df -hl | grep '/dev/disk1'
returns
/dev/disk1 112Gi 94Gi 18Gi 85% 24672655 4649071 84% /
As I understand, it should return the percentage of disk space used.
It should return 85 from the input
/dev/disk1 112Gi 94Gi 18Gi 85% 24699942 4621784 84% /

This will fix the command as you supplied it.
df -hl | grep '/dev/disk1' | awk '{sub( /%/, ""); print $5 }'
No need to escape the double quotes.
Of course you don't need to use grep here either.
df -hl | awk '/disk1/ { sub( /%/, "", $5); print $5}'
Notice that you can supply the target for the substitution as a third argument to sub.
The sub function is described in the String-Manipulation Functions section of the gawk manual.
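To see both forms of sub side by side without depending on the local df output, here is a minimal sketch that feeds awk the sample line from the question. The variable names line, pct_default and pct_target are only illustrative.

```shell
# Sample line copied from the question, so the result does not depend on df.
line='/dev/disk1 112Gi 94Gi 18Gi 85% 24672655 4649071 84% /'

# Two-argument form: sub() edits $0 (first match only) and awk re-splits the fields.
pct_default=$(printf '%s\n' "$line" | awk '{ sub(/%/, ""); print $5 }')

# Three-argument form: only field 5 is modified.
pct_target=$(printf '%s\n' "$line" | awk '{ sub(/%/, "", $5); print $5 }')

echo "$pct_default $pct_target"
```

Both forms give the same number here because the first % on the line happens to be the one in field 5. The three-argument form does not rely on that coincidence.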

Perhaps you can reduce it to just df and awk (this needs GNU df, since --output is a GNU coreutils option):
df --output=pcent /dev/disk1 | awk '/ /{printf("%d\n", $1)}'

Related

How do I remove the header in the df command?

I'm trying to write a bash command that will sort all volumes by the amount of data they have used and tried using
df | awk '{print $1 | "sort -r -k3 -n"}'
Output:
map
devfs
Filesystem
/dev/disk1s5
/dev/disk1s2
/dev/disk1s1
But this also shows the header called Filesystem.
How do I remove that?
For your specific case, i.e. using awk, @codeforester's answer (using awk's NR variable, the number of records read so far) is the best.
In a more general case, in order to remove the first line of any output, you can use the tail -n +N option in order to output starting with line N:
df | tail -n +2 | other_command
This will remove the first line in df output.
Skip the first line, like this:
df | awk 'NR>1 {print $1 | "sort -r -k3 -n"}'
I normally use one of these options, if I have no reason to use awk:
df | sed 1d
The 1d option to sed says delete the first line, then print everything else.
df | tail -n+2
The -n+2 option to tail says to start output at line 2 and print everything until end of input.
I suspect sed is faster than awk or tail, but I can't prove it.
EDIT
If you want to use awk, this will print every line except the first:
df | awk '{if (FNR>1) print}'
FNR is the record number within the current input file; with a single input stream it is simply the line number. If it is greater than 1, the input line is printed.
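The header-skipping variants above (awk, sed and tail) can be compared on a fixed input. This is a small sketch using a made-up three-line sample in place of real df output. The sample text and the variable names are assumptions for illustration only.

```shell
# Hypothetical df-like sample: one header line followed by two data lines.
sample='Filesystem 512-blocks Used Available
/dev/disk1s1 1000 300 700
devfs 400 400 0'

via_awk=$(printf '%s\n' "$sample" | awk 'NR > 1')     # print records after the first
via_sed=$(printf '%s\n' "$sample" | sed 1d)           # delete line 1, print the rest
via_tail=$(printf '%s\n' "$sample" | tail -n +2)      # start output at line 2

printf '%s\n' "$via_awk"
```

All three variables should hold the same two data lines, so the choice is mostly a matter of which tool is already in the pipeline.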
Count the lines in the output of df with wc, then subtract one and let tail print that many lines to get a headerless df:
n=$(df | wc -l)
n=$((n - 1))
df | tail -n "$n"
OK, I see others posted one-liners. Here is mine:
DF_HEADERLESS=$(n=$(df | wc -l); df | tail -n "$((n - 1))")
And for formatted output, let printf loop over it:
printf "%s\t%s\t%s\t%s\t%s\t%s\n" ${DF_HEADERLESS} | awk '{print $1 | "sort -r -k3 -n"}'
This might help with GNU df and GNU sort:
df -P | awk 'NR>1{$1=$1; print}' | sort -r -k3 -n | awk '{print $1}'
With GNU df and GNU awk:
df -P | awk 'NR>1{array[$3]=$1} END{PROCINFO["sorted_in"]="@ind_num_desc"; for(i in array){print array[i]}}'
Documentation: 8.1.6 Using Predefined Array Scanning Orders with gawk
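Note that in the command from the question, print $1 | "sort -r -k3 -n" sends only the first column to sort, so there is no third column left to sort on. A portable sketch of the sort-then-project order follows, run on a made-up sample instead of real df output. The sample data and the name by_used are assumptions.

```shell
# Hypothetical df-like sample; column 3 is the amount used.
sample='Filesystem 512-blocks Used Available
/dev/disk1s1 1000 300 700
devfs 400 400 0
map 0 0 0'

# Drop the header, sort whole lines by column 3 (numeric, descending),
# and only then reduce each line to the filesystem name.
by_used=$(printf '%s\n' "$sample" | awk 'NR > 1' | sort -k3,3nr | awk '{ print $1 }')

printf '%s\n' "$by_used"
```

With real data, replace the printf with df -P so that each filesystem stays on a single line.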
Removing something from a command output can be done very simply, using grep -v, so in your case:
df | grep -v "Filesystem" | ...
(You can do your awk at the ...)
If you are not sure about the capitalization, add -i to match case-insensitively:
df | grep -i -v "FiLeSyStEm" | ...
(The switching caps/small caps are meant as a clarification joke :-) )

Adding custom column in the output with awk

I am reading file utilization on the server with below command.
How can I add the hostname in my output as a first column?
Thanks in advance
df -h | grep % | awk '{OFS="\t";print $6,$5}'
Output:
/apps/inf9b2b 43%
/apps/dbclients 13%
/apps/inf9 77%
This is a simple application of the question "How do I use shell variables in an awk script?":
df -h | awk -v hostname="$(hostname)" '/%/ {OFS="\t"; print hostname, $6, $5}'
Note that there's no need for an external grep -- just make your pattern match a condition of the awk statement.
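Here is a self-contained sketch of the -v approach on fixed input. The host name web01 and the df-like sample are made up for illustration; in real use host would come from $(hostname) and the input from df -h. The NR > 1 test is an addition, since the df header line (Use%) also contains a percent sign.

```shell
# Hypothetical values standing in for $(hostname) and the output of df -h.
host='web01'
sample='Filesystem Size Used Avail Use% Mounted on
/dev/sda1 50G 21G 29G 43% /apps/inf9b2b
/dev/sda2 20G 15G 5G 77% /apps/inf9'

# -v copies the shell value into the awk variable "host" before the program runs.
report=$(printf '%s\n' "$sample" |
    awk -v host="$host" 'BEGIN { OFS = "\t" } NR > 1 && /%/ { print host, $6, $5 }')

printf '%s\n' "$report"
```

Setting OFS once in a BEGIN block avoids reassigning it for every input line.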
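You can also hard-code the name. The following prints the literal string hostname in the first column, so replace it with the actual host name:
df -h | grep % | awk '{OFS="\t";print "hostname\t" $6,$5}'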

Bash string replace on command result

I have a simple bash script which is getting the load average using uptime and awk, for example
LOAD_5M=$(uptime | awk -F'load averages:' '{ print $2}' | awk '{print $2}')
However this includes a ',' at the end of the load average
e.g.
0.51,
So I have then replaced the comma with a string replace like so:
LOAD_5M=${LOAD_5M/,/}
I'm not an awk or bash whiz kid, so while this gives me the result I want, I am wondering if there is a more succinct way of writing this, either by:
Using awk to get the load average without the comma, or
Stripping the comma in a single line
You can do that in same awk command:
uptime | awk -F 'load averages?: *' '{split($2, a, ",? "); print a[2]}'
1.32
The 5 min load is available in /proc/loadavg. You can simply use cut:
cut -d' ' -f2 /proc/loadavg
With awk you can issue:
awk '{print $2}' /proc/loadavg
If you are not working on Linux, the file /proc/loadavg will not be present. In this case I would suggest using sed, like this:
uptime | sed 's/.*, \(.*\),.*,.*/\1/'
uptime | awk -F'load average:' '{ print $2}' | awk -F, '{print $2}'
0.38
(My uptime output has 'load average:' singular)
The load average numbers are always the last 3 fields in the 'uptime' output so:
IFS=' ,' read -a uptime_fields <<<"$(uptime)"
LOAD_5M=${uptime_fields[@]: -2:1}
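The "last three fields" observation can also be used directly in awk. The sketch below takes the second-to-last field and strips any comma from it, so it works for both comma-separated and space-separated output. The two sample lines and the function name five_min_load are made up for illustration, and the sketch assumes a locale that uses a period as the decimal separator.

```shell
# Hypothetical helper: print the 5-minute load from an uptime line on stdin.
five_min_load() {
    awk '{ v = $(NF - 1); gsub(/,/, "", v); print v }'
}

# Made-up sample lines: one comma-separated, one space-separated.
comma_line=' 10:14:32 up 3 days,  1:02,  2 users,  load average: 0.51, 0.38, 0.30'
space_line='10:14  up 3 days,  1:02, 2 users, load averages: 1.32 1.45 1.50'

load_comma=$(printf '%s\n' "$comma_line" | five_min_load)
load_space=$(printf '%s\n' "$space_line" | five_min_load)

echo "$load_comma $load_space"
```

In a script this would be used as LOAD_5M=$(uptime | five_min_load).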

SSH call inside ruby, using %x

I am trying to make a single line ssh call from a ruby script. My script takes a hostname, and then sets out to return the hostname's machine info.
return_value = %x{ ssh #{hostname} "#{number_of_users}; #{number_of_processes};
#{number_of_processes_running}; #{number_of_processes_sleeping}; "}
Where the variables are formatted like this.
number_of_users = %Q(users | wc -w | cat | awk '{print "Number of Users: "\$1}')
number_of_processes = %Q(ps -el | awk '{print $2}' | wc -l | awk '{print "Number of Processes: "$1}')
I have tried %q, %Q, and plain double quotes, and I cannot get awk to print anything before the output. I either get this error (if I include the colon)
awk: line 1: syntax error at or near :
or if I don't include the slash in front of $1 I just get empty output for that line. Is there any solution for this? I thought it might be because I was using %q, but it even happens with just double quotes.
Use backticks to capture the output of the command and return the output as a string:
number_of_users = `users | wc -w | cat | awk '{print "Number of Users:", $1}'`
puts number_of_users
Results on my system:
48
But you can improve your pipeline:
users | awk '{ print "Number of Users:", NF }'
ps -e | awk 'END { print "Number of Processes:", NR }'
So the solution to this problem is:
%q(users | wc -w | awk '{print \"Number of Users: \"\$1}')
Where you have to use %q, not %, not %Q, and not ""
You must backslash double quotes and the dollar sign in front of any awk variables
If somebody could improve upon this answer by explaining why, that would be most appreciated
Though as Steve pointed out I could have improved my code using users | awk '{ print \"Number of Users:\", NF }'
In which case there is no need to backslash the NF.
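One likely explanation is that the command string sits inside double quotes in the ssh call, so the local shell processes it before the remote side ever sees it. Unescaped inner double quotes end the outer quoting, and an unescaped $1 is expanded locally. The plain-shell sketch below illustrates only the $1 part, with sh -c standing in for ssh, since both hand a single string to a second shell. The stand-in, the sample words and the variable names are assumptions, not the original Ruby setup.

```shell
# Give the local shell a visible $1 so its expansion is easy to spot.
set -- 2

# Unescaped: the outer shell replaces $1 with 2 before awk ever runs,
# so awk sees  { print "Users: " 2 }.
unescaped=$(sh -c "echo alice bob | awk '{ print \"Users: \" $1 }'")

# Escaped: \$1 reaches awk untouched and refers to the first input field.
escaped=$(sh -c "echo alice bob | awk '{ print \"Users: \" \$1 }'")

echo "$unescaped"
echo "$escaped"
```

NF needs no backslash because it contains no character that the shell treats specially inside double quotes.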

bash awk first 1st column and 3rd column with everything after

I am working on the following bash script:
# contents of dbfake file
1 100% file 1
2 99% file name 2
3 100% file name 3
#!/bin/bash
# cat out data
cat dbfake |
# select lines containing 100%
grep 100% |
# print the first and third columns
awk '{print $1, $3}' |
# echo out id and file name and log
xargs -rI % sh -c '{ echo %; echo "%" >> "fake.log"; }'
exit 0
This script works ok, but how do I print everything in column $3 and then all columns after?
You can use cut instead of awk in this case:
cut -f1,3- -d ' '
awk '{ $2 = ""; print }' # remove col 2
If you don't mind a little whitespace:
awk '{ $2="" }1'
But you can also drop the useless use of cat (UUOC) and the grep:
< dbfake awk '/100%/ { $2="" }1' | ...
If you'd like to trim that whitespace:
< dbfake awk '/100%/ { $2=""; sub(FS "+", FS) }1' | ...
For fun, here's another way using GNU sed:
< dbfake sed -r '/100%/s/^(\S+)\s+\S+(.*)/\1\2/' | ...
All you need is:
awk 'sub(/.*100% /,"")' dbfake | tee "fake.log"
Others responded in various ways, but I want to point out that using xargs to multiplex output is a rather bad idea.
Instead, why don't you:
awk '$2=="100%" { sub("100%[[:space:]]*",""); print; print >>"fake.log"}' dbfake
That's all. You don't need grep, you don't need multiple pipes, and definitely you don't need to fork shell for every line you're outputting.
You could do awk ...; print}' | tee fake.log, but there is not much point in forking tee, if awk can handle it as well.
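Putting the pieces together, here is a sketch that prints the first column plus everything from the third column onward, and appends the same text to a log from inside awk. It recreates the dbfake sample from the question in a temporary directory. The names tmp, log_file, shown and logged are illustrative assumptions.

```shell
# Recreate the sample file from the question in a scratch directory.
tmp=$(mktemp -d)
cat > "$tmp/dbfake" <<'EOF'
1 100% file 1
2 99% file name 2
3 100% file name 3
EOF

shown=$(awk -v log_file="$tmp/fake.log" '
$2 == "100%" {
    id = $1
    rest = $0
    sub(/^[^ ]+ +[^ ]+ +/, "", rest)   # drop the first two columns
    print id, rest                     # to standard output
    print id, rest >> log_file         # and appended to the log
}' "$tmp/dbfake")

logged=$(cat "$tmp/fake.log")
rm -rf "$tmp"

printf '%s\n' "$shown"
```

Working on a copy of $0 keeps the original spacing inside the file name, which assigning an empty string to $2 would not.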

Resources