Sorting file contents numerically by field

I am trying to write a BASH script to sort the contents of a file numerically according to a specific field in the file.
The file is /etc/group. All of its fields are colon-separated (:). I have to sort the contents of /etc/group numerically based on the 3rd field.
Example line: daemon:*:1:root
What I'm trying so far:
#!/bin/bash
sort /etc/group -n | cut -f 3-3 -d ":" /etc/group
This is getting me really close, but it only prints out a sorted list of 3rd field values (since cut literally cuts out the rest of the line). I'm trying to keep the rest of the line but still have it sorted by the 3rd field contents.

You can use sort -t like this:
sort -t : -nk3 /etc/group
-t : tells sort to use : as the field delimiter
-nk3 tells sort to compare numerically, using a key that starts at field 3
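As a minimal sketch of the same command on data that is safe to experiment with, the snippet below writes a few group-style lines to a temporary file and sorts them on the third field. The sample entries and the variable names sample and sorted are made up for illustration. It uses -k3,3n rather than -nk3 so that the key ends at field 3 instead of running to the end of the line.

```shell
# Hypothetical sample data in /etc/group format (name:password:GID:members).
sample=$(mktemp)
printf '%s\n' 'staff:*:20:root' 'daemon:*:1:root' 'wheel:*:0:root' 'kmem:*:2:root' > "$sample"

# -t :    use ":" as the field delimiter
# -k3,3n  sort numerically on field 3 only; whole lines are kept
sorted=$(sort -t : -k3,3n "$sample")
printf '%s\n' "$sorted"

rm -f "$sample"
```

Because sort reorders whole lines, no cut step is needed. Every field stays attached to its line.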

Related

Using the result of grep to sort files

I need to sort a file based on the results of grep. Example:
cat cuts.txt | grep -P '(?<=[+]).*(?=[+])'
text +124+ text
text +034+ text
text +334+ text
How do I sort the lines in ascending order based on what grep found?

Try the following, written and tested with the samples shown, assuming you need to sort by increasing value of the +digits+ token. Since the +digits+ values could appear anywhere in a line, this is a generic solution.
grep -P '(?<=[+]).*(?=[+])' Input_file |
awk '
match($0,/\+[0-9]+\+/){
print substr($0,RSTART,RLENGTH),$0
}
' | sort -k1.2,1n | cut -d' ' -f2-
Output will be as follows.
text +034+ text
text +124+ text
text +334+ text
Explanation: grep's output is passed to awk, which uses a regex to find the +digits+ value in each line and prints that matched value followed by the whole line. This makes sorting easy, because the value to sort on is now always the 1st field. After sorting, cut keeps everything from the 2nd field onwards, since the 1st field was added by awk only to help sort and is not wanted in the final output.
There is also no need for a separate cat command; grep can read Input_file directly.
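The decorate, sort, undecorate idea described above can also be sketched with a numeric key, which matters when the +digits+ values differ in width. This is only an illustration under assumptions: the input lines, the tab used as helper delimiter, and the variable names input and sorted are not from the original answer.

```shell
# Hypothetical input; the +digits+ tokens deliberately differ in width.
input=$(mktemp)
printf '%s\n' 'text +124+ text' 'text +5+ text' 'text +34+ text' > "$input"

# Decorate:   prefix each matching line with the bare number and a tab.
# Sort:       compare numerically on that first tab-separated field.
# Undecorate: drop the helper field again with cut (tab is cut's default delimiter).
sorted=$(awk 'match($0, /\+[0-9]+\+/) {
             print substr($0, RSTART + 1, RLENGTH - 2) "\t" $0
         }' "$input" | sort -t "$(printf '\t')" -k1,1n | cut -f2-)
printf '%s\n' "$sorted"

rm -f "$input"
```

A plain lexical sort would place +124+ before +5+; the n modifier on the key avoids that.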

How to sort by numbers that are part of a filename in bash?

I'm trying to assign a variable in bash to the file in this directory with the largest number before the '.tar.gz' and I'm drawing a complete blank on the best way to approach this:
ls /dirname | sort
daily-500-12345.tar.gz
daily-500-12345678.tar.gz
daily-500-987654321.tar.gz
weekly-200-1111111.tar.gz
monthly-100-8675309.tar.gz

sort -Vrt - -k3,3
-V Natural sort
-r Reverse, so you can use head -1 to get the first line only
-t - Use hyphen as field separator
-k3,3 Sort using only the third field
Output:
daily-500-987654321.tar.gz
daily-500-12345678.tar.gz
monthly-100-8675309.tar.gz
weekly-200-1111111.tar.gz
daily-500-12345.tar.gz
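To get from that sorted list to the variable the question asks for, the first line can be captured with command substitution. A minimal sketch, assuming a sort that supports -V (GNU coreutils does), file names without newlines, and no extra hyphens before the number; the variable names names and largest are illustrative.

```shell
# File names taken from the listing in the question, fed in via printf
# so the example does not depend on a real directory.
names=$(printf '%s\n' \
    'daily-500-12345.tar.gz' \
    'daily-500-12345678.tar.gz' \
    'daily-500-987654321.tar.gz' \
    'weekly-200-1111111.tar.gz' \
    'monthly-100-8675309.tar.gz')

# Reverse version sort on the third hyphen-separated field, then keep line 1.
largest=$(printf '%s\n' "$names" | sort -Vrt - -k3,3 | head -n 1)
printf '%s\n' "$largest"
```

Against a real directory, the printf would be replaced by the ls /dirname from the question.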

Sort and remove duplicates based on column

I have a text file:
$ cat text
542,8,1,418,1
542,9,1,418,1
301,34,1,689070,1
542,9,1,418,1
199,7,1,419,10
I'd like to sort the file based on the first column and remove duplicates using sort, but things are not going as expected.
Approach 1
$ sort -t, -u -b -k1n text
542,8,1,418,1
542,9,1,418,1
199,7,1,419,10
301,34,1,689070,1
It is not sorting based on the first column.
Approach 2
$ sort -t, -u -b -k1n,1n text
199,7,1,419,10
301,34,1,689070,1
542,8,1,418,1
It removes the 542,9,1,418,1 line but I'd like to keep one copy.
It seems that the first approach removes duplicates but does not sort correctly, whereas the second one sorts correctly but removes more than I want. How should I get the correct result?

The problem is that when you give sort a key, -u judges uniqueness by that key alone rather than by the whole line. Once the line 542,8,1,418,1 has been output, sort treats the next two lines starting with 542 as duplicates and filters them out.
Your best bet would be to either sort all columns:
sort -t, -nk1,1 -nk2,2 -nk3,3 -nk4,4 -nk5,5 -u text
or
use awk to filter duplicate lines and pipe it to sort.
awk '!_[$0]++' text | sort -t, -nk1,1

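When sorting on a key, you must provide the end of the key as well; otherwise the key extends from that field to the end of the line.
The following sorts on the first field only (note that with -u it keeps one line per distinct first-field value):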
sort -t, -u -k1,1n text
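If the aim is to drop only exact duplicate lines while still ordering by the first column, the two steps can be separated. A minimal sketch using the question's sample data; the temporary file and the variable names text and result are assumptions for the demonstration.

```shell
# Sample data from the question, written to a temporary file.
text=$(mktemp)
printf '%s\n' '542,8,1,418,1' '542,9,1,418,1' '301,34,1,689070,1' \
              '542,9,1,418,1' '199,7,1,419,10' > "$text"

# Step 1: sort -u with no key compares entire lines, so only exact
#         duplicates are dropped.
# Step 2: order the surviving lines numerically by the first field only.
result=$(sort -u "$text" | sort -t, -k1,1n)
printf '%s\n' "$result"

rm -f "$text"
```

This keeps both 542,8,1,418,1 and a single copy of 542,9,1,418,1, which is the result the question asks for.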

How to sort the output obtained with grep -c?

I use the following 'grep' command to get the count of the string alert in each of my files at the given path:
grep 'alert' -F /usr/local/snort/rules/* -c
How do I sort the resulting output in a desired order, say ascending, descending, or ordered by name? An answer specific to these cases is sufficient.
You may freely suggest a command other than grep as well.

Pipe it into sort. Assuming your filenames contain no colons, use the "-t" option to specify the colon as the field separator. Use -n for numerical sorting.
Example:
grep 'alert' -F /usr/local/snort/rules/* -c | sort -t: -n -k2
should split lines into fields separated by ":", use the second field for sorting, and treat this as numbers (so 21 is actually later than 3).
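The three orderings the question mentions can be sketched as follows. The rule files here are invented stand-ins created in a temporary directory, not the real Snort rules, and the variable names are illustrative.

```shell
# Hypothetical rule files; names and contents are invented for the demo.
dir=$(mktemp -d)
printf 'alert a\nalert b\nalert c\n' > "$dir/web.rules"
printf 'alert a\n'                   > "$dir/dns.rules"
printf 'pass x\n'                    > "$dir/ftp.rules"

# grep -c prints one "filename:count" line per file.
counts=$(cd "$dir" && grep -c -F 'alert' *.rules)

# Ascending by count, descending by count, and by file name.
ascending=$(printf '%s\n' "$counts" | sort -t: -k2,2n)
descending=$(printf '%s\n' "$counts" | sort -t: -k2,2nr)
by_name=$(printf '%s\n' "$counts" | sort -t: -k1,1)
printf '%s\n' "$descending"

rm -rf "$dir"
```

Putting the n and r modifiers on the key itself (-k2,2nr) keeps them scoped to the count field.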

Sorting a CSV file from greatest to least based on a number appearing in a column

I have a CSV file like this:
bear,1
fish,20
tiger,4
I need to sort it from greatest to least number, based on what is found in the second column, e.g.:
fish,20
tiger,4
bear,1
How can the file be sorted in this way?

sort -t, -k2,2 -n -r filename
will do what you want.
-t, specifies the field separator to be a comma
-k2,2 specifies the field to sort on (field 2)
-r specifies a reverse sort
-n specifies a numeric sort
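A short sketch on the question's three rows shows why the numeric option matters. The temporary file and the variable names csv, sorted and lexical are assumptions made for the example.

```shell
# Sample rows from the question.
csv=$(mktemp)
printf '%s\n' 'bear,1' 'fish,20' 'tiger,4' > "$csv"

# Numeric, reversed, on field 2 only: greatest to least.
sorted=$(sort -t, -k2,2nr "$csv")

# Same key without n compares the text "20" and "4" character by character,
# so "4" ranks above "20".
lexical=$(sort -t, -k2,2r "$csv")

printf '%s\n' "$sorted"
rm -f "$csv"
```

Here sorted holds fish, tiger, bear in that order, while lexical puts tiger ahead of fish.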
