How do I remove the header in the df command? - bash

I'm trying to write a bash command that will sort all volumes by the amount of data they have used and tried using
df | awk '{print $1 | "sort -r -k3 -n"}'
Output:
map
devfs
Filesystem
/dev/disk1s5
/dev/disk1s2
/dev/disk1s1
But this also shows the header called Filesystem.
How do I remove that?

For your specific case, i.e. using awk, #codeforester answer (using awk NR (Number of Records) variable) is the best.
In a more general case, in order to remove the first line of any output, you can use the tail -n +N option in order to output starting with line N:
df | tail -n +2 | other_command
This will remove the first line in df output.

Skip the first line, like this:
df | awk 'NR>1 {print $1 | "sort -r -k3 -n"}'

I normally use one of these options, if I have no reason to use awk:
df | sed 1d
The 1d option to sed says delete the first line, then print everything else.
df | tail -n+2
the -n+2 option to tail say start looking at line 2 and print everything until End-of-Input.
I suspect sed is faster than awk or tail, but I can't prove it.
EDIT
If you want to use awk, this will print every line except the first:
df | awk '{if (FNR>1) print}'
FNR is the File Record Number. It is the line number of the input. If it is greater than 1, print the input line.

Count the lines from the output of df with wc and then substract one line to output a headerless df with tail ...
LINES=$(df|wc -l)
LINES=$((${LINES}-1))
df | tail -n ${LINES}
OK - I see oneliner - Here is mine ...
DF_HEADERLESS=$(LINES=$(df|wc -l); LINES=$((${LINES}-1));df | tail -n ${LINES})
And for formated output lets printf loop over it...
printf "%s\t%s\t%s\t%s\t%s\t%s\n" ${DF_HEADERLESS} | awk '{print $1 | "sort -r -k3 -n"}'

This might help with GNU df and GNU sort:
df -P | awk 'NR>1{$1=$1; print}' | sort -r -k3 -n | awk '{print $1}'
With GNU df and GNU awk:
df -P | awk 'NR>1{array[$3]=$1} END{PROCINFO["sorted_in"]="#ind_num_desc"; for(i in array){print array[i]}}'
Documentation: 8.1.6 Using Predefined Array Scanning Orders with gawk

Removing something from a command output can be done very simply, using grep -v, so in your case:
df | grep -v "Filesystem" | ...
(You can do your awk at the ...)
When you're not sure about caps, small caps, you might add -i:
df | grep -i -v "FiLeSyStEm" | ...
(The switching caps/small caps are meant as a clarification joke :-) )

Related

Find unique URLs in a file

Situation
I have many URLs in a file, and I need to find out how many unique URLs exist.
I would like to run either a bash script or a command.
myfile.log
/home/myfiles/www/wp-content/als/xm-sf0ab5df9c1262f2130a9b313192deca4-f0ab5df9c1262f2130a9b313192deca4-c23c5fbca96e8d641d148bac41017635|https://public.rgfl.org/HS/PowerPoint%20Presentations/Health%20and%20Safety%20Law.ppt,18,17
/home/myfiles/www/wp-content/als/xm-s4bf050d47df5bfaf0486a50a8528cb16-4bf050d47df5bfaf0486a50a8528cb16-c23c5fbca96e8d641d148bac41017635|https://public.rgfl.org/HS/PowerPoint%20Presentations/Health%20and%20Safety%20Law.ppt,15,14
/home/myfiles/www/wp-content/als/xm-sad122bf22152ba4823a520cc2fe59f40-ad122bf22152ba4823a520cc2fe59f40-c23c5fbca96e8d641d148bac41017635|https://public.rgfl.org/HS/PowerPoint%20Presentations/Health%20and%20Safety%20Law.ppt,17,16
/home/myfiles/www/wp-content/als/xm-s3c0f031eebceb0fd5c4334ecef15292d-3c0f031eebceb0fd5c4334ecef15292d-c23c5fbca96e8d641d148bac41017635|https://public.rgfl.org/HS/PowerPoint%20Presentations/Health%20and%20Safety%20Law.ppt,12,11
/home/myfiles/www/wp-content/als/xm-sff661e8c3b4f94957926d5434d0ad549-ff661e8c3b4f94957926d5434d0ad549-c23c5fbca96e8d641d148bac41017635|https://quality.gha.org/Portals/2/documents/HEN/Meetings/nursesinstitute/062013/nursesroleineliminatingharm_moddydunning.pptx,17,16
/home/myfiles/www/wp-content/als/xm-s32c41ec2a5440ad220008b9abfe9add2-32c41ec2a5440ad220008b9abfe9add2-c23c5fbca96e8d641d148bac41017635|https://quality.gha.org/Portals/2/documents/HEN/Meetings/nursesinstitute/062013/nursesroleineliminatingharm_moddydunning.pptx,19,18
/home/myfiles/www/wp-content/als/xm-s28787ca2f4372ddb3616d3fd53c161ab-28787ca2f4372ddb3616d3fd53c161ab-c23c5fbca96e8d641d148bac41017635|https://quality.gha.org/Portals/2/documents/HEN/Meetings/nursesinstitute/062013/nursesroleineliminatingharm_moddydunning.pptx,22,21
/home/myfiles/www/wp-content/als/xm-s89a7b68158e38391da9f0de1e636c0d5-89a7b68158e38391da9f0de1e636c0d5-c23c5fbca96e8d641d148bac41017635|https://quality.gha.org/Portals/2/documents/HEN/Meetings/nursesinstitute/062013/nursesroleineliminatingharm_moddydunning.pptx,13,12
/home/myfiles/www/wp-content/als/xm-sc4b14e10f6151995f21334061ff1d139-c4b14e10f6151995f21334061ff1d139-c23c5fbca96e8d641d148bac41017635|https://royalmechanical.files.wordpress.com/2011/06/hy-wire-car-2.pptx,13,12
/home/myfiles/www/wp-content/als/xm-se589d47d163e43fa0c0d68e824e2c286-e589d47d163e43fa0c0d68e824e2c286-c23c5fbca96e8d641d148bac41017635|https://royalmechanical.files.wordpress.com/2011/06/hy-wire-car-2.pptx,19,18
/home/myfiles/www/wp-content/als/xm-s52f897a623c539d09bfb988bfb153888-52f897a623c539d09bfb988bfb153888-c23c5fbca96e8d641d148bac41017635|https://royalmechanical.files.wordpress.com/2011/06/hy-wire-car-2.pptx,14,13
/home/myfiles/www/wp-content/als/xm-sccf27a904c5b88e96a3522b2e1180fed-ccf27a904c5b88e96a3522b2e1180fed-c23c5fbca96e8d641d148bac41017635|https://royalmechanical.files.wordpress.com/2011/06/hy-wire-car-2.pptx,18,17
/home/myfiles/www/wp-content/als/xm-s6874bf9d589708764dab754e5af06ddf-6874bf9d589708764dab754e5af06ddf-c23c5fbca96e8d641d148bac41017635|https://royalmechanical.files.wordpress.com/2011/06/hy-wire-car-2.pptx,17,16
/home/myfiles/www/wp-content/als/xm-s46c55ec8387dbdedd7a83b3ad541cdc1-46c55ec8387dbdedd7a83b3ad541cdc1-c23c5fbca96e8d641d148bac41017635|https://royalmechanical.files.wordpress.com/2011/06/hy-wire-car-2.pptx,19,18
/home/myfiles/www/wp-content/als/xm-s08cfdc15f5935b947bbaa93c7193d496-08cfdc15f5935b947bbaa93c7193d496-c23c5fbca96e8d641d148bac41017635|https://royalmechanical.files.wordpress.com/2011/06/hydro-power-plant.ppt,9,8
/home/myfiles/www/wp-content/als/xm-s86e267bd359c12de262c0279cee0c941-86e267bd359c12de262c0279cee0c941-c23c5fbca96e8d641d148bac41017635|https://royalmechanical.files.wordpress.com/2011/06/hydro-power-plant.ppt,15,14
/home/myfiles/www/wp-content/als/xm-s5aa60354d134b87842918d760ec8bc30-5aa60354d134b87842918d760ec8bc30-c23c5fbca96e8d641d148bac41017635|https://royalmechanical.files.wordpress.com/2011/06/hydro-power-plant.ppt,14,13
Desired Result:
Unique Urls: 4
cut -d "|" -f 2 file | cut -d "," -f 1 | sort -u | wc -l
Output:
4
See: man cut, man sort
An awk solution would be
awk '{sub(/^[^|]*\|/,"");gsub(/,[^,]*/,"");i+=a[$0]++?0:1}END{print i}' file
4
If you happen to use GNU awk then below would also give you the same result
awk '{i+=a[gensub(/.*(http[^,]*).*/,"\\1",1)]++?0:1}END{print i}' file
4
Or even short as pointed out in this cracker comment by #cyrus
awk -F '[|,]' '{i+=!a[$2]++} END{print i}' file
4
which uses awk multiple field separator functionality with more idiomatic awk.
Note: See the [ awk manual ] for more info.
Parse with sed, and since file appears to be already sorted,
(with respect to URLs), just run uniq, and count it:
echo Unique URLs: $(sed 's/^.*|\([^,]*\),.*$/\1/' file | uniq | wc -l)
Use GNU grep to extract URLs:
echo Unique URLs: $(grep -o 'ht[^|,]*' file | uniq | wc -l)
Output (either method):
Unique URLs: 4
tr , '|' < myfile.log | sort -u -t '|' -k 2,2 | wc -l
tr , '|' < myfile.log translates all commas into pipe characters
sort -u -t '|' -k 2,2 sorts unique (-u), pipe delimited (-t '|'), in the second field only (-k 2,2)
wc -l counts the unique lines

How to count duplicates in Bash Shell

Hello guys I want to count how many duplicates there are in a column of a file and put the number next to them. I use awk and sort like this
awk -F '|' '{print $2}' FILE | sort | uniq -c
but the count (from the uniq -c) appears at the left side of the duplicates.
Is there any way to put the count on the right side instead of the left, using my code?
Thanks for your time!
Though I believe you shouls show us your Input_file so that we could create a single command or so for this requirement, since you have't shown Input_file so trying to solve it with your command itself.
awk -F '|' '{print $2}' FILE | sort | uniq -c | awk '{for(i=2;i<=NF;i++){printf("%s ",$i)};printf("%s%s",$1,RS)}'
You can just use awk to reverse the output like below:
awk -F '|' '{print $2}' FILE | sort | uniq -c | awk {'print $2" "$1'}
awk -F '|' '{print $2}' FILE | sort | uniq -c| awk '{a=$1; $1=""; gsub(/^ /,"",$0);print $0,a}'
You can use awk to calculate the amount of duplicates, so your command can be simplified as followed,
awk -F '|' '{a[$2]++}END{for(i in a) print i,a[i]}' FILE | sort
Check this command:
awk -F '|' '{c[$2]++} END{for (i in c) print i, c[i]}' FILE | sort
Use awk to do the counting is enough. If you do not want to sort by browser, remove the pipe and sort.

How to process string in using awk in shell script

I am very new to shell scripting and have to do so many tasks around it. I am trying to learn as fast a possible but some times shell scripting makes a task look very easy and at other times it just toys with me. And I am facing similar situation now.
I have a command which gives me an output like this.
File Dependents
----------------------------------------------------------------------------
<File> is a requisite of <Dependents>
Path: /usr/lib/obj
Java 1.0.0.0 analysis 0.0.0.2
runtime 1.2.0.0
client 1.2.0.0
framework 6.1.9.100
sguide 1.9.10.0
sysmgt 6.1.9.100
dsm 6.1.9.200
Path: /etc/obj
Java 1.0.0.0 analysis 1.2.0.2
runtime 2.0.0.0
client3 6.1.9.0
sysmgt 6.1.9.0
dsm2 6.1.9.0
Now I want to get the list of dependencies into an array for further processing. This is what I am able to do so far:
<command> | cut -f1 | grep '[a-z]' | grep -v File | grep -v : | awk '{ print $1}'
output is:
Java<<< I want this to be analysis
runtime
client
framework
sguide
sysmgt
dsm
Java<<< want this to be analysis
runtime
client3
sysmgt
dsm2
I have to capture these two lists in two separate arrays.
Can someone please help me in achieving this output in an elegant way. I don't want to butcher this code with my brute force method involving lot of conditions and comparisions.
awk to the rescue!
$ arr1=$(command ... | awk -v c=1 '!NF{f=0} f && s==c{print $1} /Java/{f=1; s++; if(s==c) print $(NF-1)}')
$ arr2=$(command ... | awk -v c=2 '!NF{f=0} f && s==c{print $1} /Java/{f=1; s++; if(s==c) print $(NF-1)}')
$ echo $arr1
analysis runtime client framework sguide sysmgt dsm
$ echo $arr2
analysis runtime client3 sysmgt dsm2
perhaps better if you run the command once and split the results into two arrays.
Explanation
awk -v c=1 set awk variable c to 1 (describes group instance number)
'!NF{f=0} if there are no fields (empty line) reset f
f && s==c{print $1} if f is set and counter equals to c print the first field
/Java/{f=1; s++; when pattern matched to Java, set f and increment counter and
...if(s==c) print $(NF-1)}' if counter matches c print the penultimate field.
You can fix your solution by removing the substring with Java first:
command | sed 's/Java [^ ]*//' | cut -f1 | grep '[a-z]' | grep -v File | grep -v : | awk '{ print $1}'
When you use awk, you can better use the full strength of awk. Just say you want the print the second last field of any line with a number:
command | awk '/[0-9]/ { print $(NF-1) }'
This is better than trying to use sed (do you have tabs or spaces?)
command | sed -n '/[0-9].[0-9]/ s/^.* \([^ ]*\) .*/\1/p'
A funny solution is using rev to revert your text. That way cut can find the second field.
command | grep '[0-9].[0-9]' | rev | cut -d " " -f2 | rev
For people who only read the last line, I will repeat the awk solution:
command | awk '/[0-9]/ { print $(NF-1) }'

bash awk first 1st column and 3rd column with everything after

I am working on the following bash script:
# contents of dbfake file
1 100% file 1
2 99% file name 2
3 100% file name 3
#!/bin/bash
# cat out data
cat dbfake |
# select lines containing 100%
grep 100% |
# print the first and third columns
awk '{print $1, $3}' |
# echo out id and file name and log
xargs -rI % sh -c '{ echo %; echo "%" >> "fake.log"; }'
exit 0
This script works ok, but how do I print everything in column $3 and then all columns after?
You can use cut instead of awk in this case:
cut -f1,3- -d ' '
awk '{ $2 = ""; print }' # remove col 2
If you don't mind a little whitespace:
awk '{ $2="" }1'
But UUOC and grep:
< dbfake awk '/100%/ { $2="" }1' | ...
If you'd like to trim that whitespace:
< dbfake awk '/100%/ { $2=""; sub(FS "+", FS) }1' | ...
For fun, here's another way using GNU sed:
< dbfake sed -r '/100%/s/^(\S+)\s+\S+(.*)/\1\2/' | ...
All you need is:
awk 'sub(/.*100% /,"")' dbfake | tee "fake.log"
Others responded in various ways, but I want to point that using xargs to multiplex output is rather bad idea.
Instead, why don't you:
awk '$2=="100%" { sub("100%[[:space:]]*",""); print; print >>"fake.log"}' dbfake
That's all. You don't need grep, you don't need multiple pipes, and definitely you don't need to fork shell for every line you're outputting.
You could do awk ...; print}' | tee fake.log, but there is not much point in forking tee, if awk can handle it as well.

Storing the grep output in loop

This grep commands prints the numbers (the count of groups merged)
grep "merged" sombe_conversion_PSTN.sh.sql.log | awk '{print $1}' | sed 's/ //g'
The Output is as follows:
1000000
41474
41543
83410
83153
83085
82861
82904
82715
41498
41319
I need to add the data from second to last row of output and store it in a variable and
first element in a different variable.
for example :
var_num=1000000
sum_others=663962
How do i loop and add the variables?
Do it twice. If your list of numbers is in the file output, do
$ var_num=$(cat output | head -1)
$ sum_others=$(cat output | sed '1d' | awk '{s += $1} END {print s}')

Resources