I'm trying to assign a variable in bash to the file in this directory with the largest number before the '.tar.gz' and I'm drawing a complete blank on the best way to approach this:
ls /dirname | sort
daily-500-12345.tar.gz
daily-500-12345678.tar.gz
daily-500-987654321.tar.gz
weekly-200-1111111.tar.gz
monthly-100-8675309.tar.gz
sort -Vrt - -k3,3
-V Natural sort
-r Reverse, so you can use head -n 1 to get the first line only
-t - Use hyphen as field separator
-k3,3 Sort using only the third field
Output:
daily-500-987654321.tar.gz
daily-500-12345678.tar.gz
monthly-100-8675309.tar.gz
weekly-200-1111111.tar.gz
daily-500-12345.tar.gz
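To get the winner into a variable, the sorted list can be wrapped in a command substitution. This is only a minimal sketch: it feeds the sample names from the question through printf instead of running ls /dirname, the variable names (names, latest) are made up for illustration, and it swaps -V for a numeric key (-k3,3nr) so that it also runs with sort implementations that lack version sort.

```shell
# Sample names from the question; in a real script this would be: ls /dirname
names='daily-500-12345.tar.gz
daily-500-12345678.tar.gz
daily-500-987654321.tar.gz
weekly-200-1111111.tar.gz
monthly-100-8675309.tar.gz'

# Sort on the third hyphen-separated field, numerically and in reverse
# (largest first), then keep only the first line.
latest=$(printf '%s\n' "$names" | sort -t- -k3,3nr | head -n 1)

echo "$latest"
```

This assumes the file names contain no newlines and always have the number in the third hyphen-separated field.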
I am trying to write a BASH script to sort the contents of a file numerically according to a specific field in the file.
The file is /etc/group. All of its fields are colon-separated (:). I have to sort the contents of /etc/group numerically based on the 3rd field.
Example field: daemon:*:1:root
What I'm trying so far:
#!/bin/bash
sort /etc/group -n | cut -f 3-3 -d ":" /etc/group
This is getting me really close, but it only prints out a sorted list of 3rd field values (since cut literally cuts out the rest of the line). I'm trying to keep the rest of the line but still have it sorted by the 3rd field contents.
You can use sort -t like this:
sort -t : -nk3 /etc/group
-t : tells sort to use : as the field delimiter
-nk3 tells sort to sort numerically on a key that starts at field #3 (write -k3,3n to limit the key to field #3 alone)
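A quick way to see the effect without touching the real file is to run the same sort on a few sample lines. In this sketch only daemon:*:1:root comes from the question; the other entries and the variable names (sample, sorted) are hypothetical. The key is written as -k3,3n so that it ends at field 3.

```shell
# Hypothetical /etc/group-style lines; only the daemon entry is from the question.
sample='staff:*:20:root
daemon:*:1:root
wheel:*:0:root
nobody:*:100:'

# -t: sets the delimiter, -k3,3n sorts numerically on field 3 only.
# Whole lines are kept; only their order changes.
sorted=$(printf '%s\n' "$sample" | sort -t: -k3,3n)

printf '%s\n' "$sorted"
```

The lines come out in the order 0, 1, 20, 100 of their third field, with the rest of each line intact.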
I have a list of files in a folder.
The names are:
1-a
100-a
2-b
20-b
3-x
and I want to sort them like
1-a
2-b
3-x
20-b
100-a
The files are always a number, followed by a dash, followed by anything.
I tried a ls with a col and sort and it works, but I wanted to know if there's a simpler solution.
Forgot to mention: This is bash running on Mac OS X.
Some ls implementations, GNU coreutils' ls among them, support the -v option (natural sort of (version) numbers within text):
% ls -v
1-a 2-b 3-x 20-b 100-a
or:
% ls -v1
1-a
2-b
3-x
20-b
100-a
Use sort to define the fields.
sort -s -t- -k1,1n -k2 filenames.txt
The -t- option tells sort to treat - as the field separator in input items. -k1,1n instructs sort to sort on the first field numerically; -k2 uses the remaining fields as a second key in case the first fields are equal. -s keeps the sort stable (although you could omit it, since the entire input string is being used in one key or the other).
(Note: I'm assuming the file names do not contain newlines, so that something like ls > filenames.txt is guaranteed to produce a file with one name per line. You could also use ls | sort ... in that case.)
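As a small check of the command above, here it is applied to the five names from the question. This is a sketch that pipes the names in with printf rather than reading filenames.txt; the variable names (names, sorted) are placeholders.

```shell
# The file names from the question, one per line.
names='1-a
100-a
2-b
20-b
3-x'

# Field 1 (before the dash) is compared numerically; the rest breaks ties.
sorted=$(printf '%s\n' "$names" | sort -s -t- -k1,1n -k2)

printf '%s\n' "$sorted"
```

The result is 1-a, 2-b, 3-x, 20-b, 100-a, one per line.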
I have a text file:
$ cat text
542,8,1,418,1
542,9,1,418,1
301,34,1,689070,1
542,9,1,418,1
199,7,1,419,10
I'd like to sort the file based on the first column and remove duplicates using sort, but things are not going as expected.
Approach 1
$ sort -t, -u -b -k1n text
542,8,1,418,1
542,9,1,418,1
199,7,1,419,10
301,34,1,689070,1
It is not sorting based on the first column.
Approach 2
$ sort -t, -u -b -k1n,1n text
199,7,1,419,10
301,34,1,689070,1
542,8,1,418,1
It removes the 542,9,1,418,1 line but I'd like to keep one copy.
It seems that the first approach removes duplicates but does not sort correctly, whereas the second one sorts correctly but removes more than I want. How can I get the correct result?
The problem is that when you give sort a key, -u judges uniqueness on that key alone rather than on the whole line. Since the line 542,8,1,418,1 is output first, sort sees the next two lines starting with 542 as duplicates and filters them out.
Your best bet would be to either sort all columns:
sort -t, -nk1,1 -nk2,2 -nk3,3 -nk4,4 -nk5,5 -u text
or use awk to filter out duplicate lines and pipe the result to sort:
awk '!_[$0]++' text | sort -t, -nk1,1
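For reference, this is the awk variant run against the sample data from the question. It is a sketch: the data is supplied through a variable instead of the file text, the array is named seen instead of _, and LC_ALL=C is an added assumption to keep the tie-breaking order between the two 542 lines predictable.

```shell
# The contents of the file from the question.
data='542,8,1,418,1
542,9,1,418,1
301,34,1,689070,1
542,9,1,418,1
199,7,1,419,10'

# awk prints a line only the first time it is seen (whole-line dedupe);
# sort then orders the survivors numerically on the first column.
result=$(printf '%s\n' "$data" | awk '!seen[$0]++' | LC_ALL=C sort -t, -k1,1n)

printf '%s\n' "$result"
```

Both 542,8,1,418,1 and 542,9,1,418,1 survive, because duplicates are judged on the whole line rather than on the sort key.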
When sorting on a key, you must specify where the key ends as well; otherwise sort uses all the following fields as part of the key too.
The following should work:
sort -t, -u -k1,1n text
"sort" correctly reports these two lines are out of order:
> echo "a b\na a" | sort -c
sort: -:2: disorder: a a
How do I tell sort to compare only the first field of each line? I tried:
> echo "a b\na a" | sort -c -k1
sort: -:2: disorder: a a
but it failed, as above.
Can I make sort compare the first field of each line only, or must I
use something like sed to trim the lines before comparing them?
EDIT: I'm using "sort (GNU coreutils) 7.2". I tried using a different field separator but it didn't help:
> echo "a b\na a" | sort -k1 -c -t" "
sort: -:2: disorder: a a
although I'm pretty sure space is the default separator anyway.
The following works as expected:
echo "a b\na a" | sort -s -c -k1,1
There were two problems with your sort invocation:
The argument to -k is a key definition that specifies a start and an end position. If the end position is omitted, it defaults to the end of the line, not to the start field. -k1,1 specifies both, telling sort not to include the second field in the comparison.
sort is not stable by default, which means it may change the relative order of lines whose keys compare equal. Quoting the documentation:
Finally, as a last resort when all keys compare equal, sort compares
entire lines as if no ordering options other than --reverse (-r)
were specified. The --stable (-s) option disables this
"last-resort comparison" so that lines in which all fields compare
equal are left in their original relative order.
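The two points can be checked separately by looking at the exit status of sort -c. This is a rough sketch: it uses printf instead of echo so that \n becomes a newline in any shell, and the variable names (with_stable, without_stable) are made up for illustration.

```shell
# With -s, only the first field ("a" on both lines) is compared,
# so the check passes and sort -c exits with status 0.
if printf 'a b\na a\n' | sort -s -c -k1,1 2>/dev/null; then
    with_stable=ordered
else
    with_stable=disordered
fi

# Without -s, the equal keys fall through to the last-resort whole-line
# comparison, where "a b" sorts after "a a", so the check fails.
if printf 'a b\na a\n' | sort -c -k1,1 2>/dev/null; then
    without_stable=ordered
else
    without_stable=disordered
fi

echo "sort -s -c -k1,1: $with_stable"
echo "sort -c -k1,1:    $without_stable"
```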