I have written a bash script that needs to work identically on Linux and macOS and that relies on the sort command. I am piping the output of git tag -l to sort to get a list of all the version tags in the correct semantic order. GNU sort offers -V, which makes this automagic, but macOS sort does not support this option, so I need to figure out how to accomplish this sort order without it.
6.3.1.1
6.3.1.10
6.3.1.11
6.3.1.2
6.3.1.3
...
needs to be sorted as
6.3.1.1
6.3.1.2
6.3.1.3
...
6.3.1.10
6.3.1.11
You can use additional features of git tag to get a list of tags matching a pattern and sorted properly for version tag ordering (typically no leading zeros):
$ git tag --sort v:refname
v0.0.0
v0.0.1
v0.0.2
v0.0.3
v0.0.4
v0.0.5
v0.0.6
v0.0.7
v0.0.8
v0.0.9
v0.0.10
v0.0.11
v0.0.12
From $ man git-tag:
--sort=<type>
Sort in a specific order. Supported type is "refname"
(lexicographic order), "version:refname" or "v:refname"
(tag names are treated as versions). Prepend "-" to reverse
sort order. When this option is not given, the sort order
defaults to the value configured for the tag.sort variable
if it exists, or lexicographic order otherwise. See
git config(1).
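As a minimal sketch of how this option might be used in a script, the highest version tag can be picked by reversing the version sort and keeping the first line. The helper name latest_tag and the v* pattern are illustrative assumptions, not part of git, and the sketch assumes a git recent enough to support --sort.

```shell
# latest_tag is a hypothetical helper, not a git command.
# Usage (inside a repository): latest_tag 'v*'
latest_tag() {
  # Default to all tags when no glob pattern is given.
  pattern=${1:-*}
  # "-v:refname" sorts tag names as versions, highest first;
  # head keeps only the top entry.
  git tag --sort=-v:refname -l "$pattern" | head -n 1
}
```

Because the sorting happens inside git, this behaves the same on Linux and macOS regardless of which sort is installed.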
You can download coreutils from http://rudix.org/packages/index.html
It contains GNU sort, which supports the sort -V syntax.
sed 's/\b\([0-9]\)\b/0\1/g' versions.txt | sort | sed 's/\b0\([0-9]\)/\1/g'
To explain why this works, consider the first sed command by itself. With your input as versions.txt, the first sed command adds a leading zero onto single-digit version numbers, producing:
06.03.01.01
06.03.01.02
06.03.01.03
06.03.01.10
06.03.01.11
The above can be sorted normally. After that, it is a matter of removing the added characters. In the full command, the last sed command removes the leading zeros to produce the final output:
6.3.1.1
6.3.1.2
6.3.1.3
6.3.1.10
6.3.1.11
This works as long as version numbers are 99 or less. If you have version numbers over 99 but less than 1000, the command gets only slightly more complicated:
sed 's/\b\([0-9]\)\b/00\1/g ; s/\b\([0-9][0-9]\)\b/0\1/g' versions.txt | sort | sed 's/\b0\+\([0-9]\)/\1/g'
As I don't have a Mac, the above were tested on Linux.
UPDATE: In the comments, Jonathan Leffler says that even though word boundary (\b) is in Mac regex docs, Mac sed doesn't seem to recognize it. He suggests replacing the first sed with:
sed 's/^[0-9]\./0&/; s/\.\([0-9]\)$/.0\1/; s/\.\([0-9]\)\./.0\1./g; s/\.\([0-9]\)\./.0\1./g'
So, the full command might be:
sed 's/^[0-9]\./0&/; s/\.\([0-9]\)$/.0\1/; s/\.\([0-9]\)\./.0\1./g; s/\.\([0-9]\)\./.0\1./g' versions.txt | sort | sed 's/^0// ; s/\.0/./g'
This handles version numbers up to 99.
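The same padding idea can be sketched without \b and without the two-digit limit by letting awk build a zero-padded sort key, sorting on it, and then cutting the key off again. The function name pad_sort is an assumption made for illustration; the sketch also assumes purely numeric, dot-separated components of at most nine digits and no tab characters in the input.

```shell
# pad_sort is a hypothetical name; it reads dotted numeric versions on stdin.
pad_sort() {
  awk -F. '{
    key = ""
    # Zero-pad every component to 10 digits so string order equals numeric order.
    for (i = 1; i <= NF; i++) key = key sprintf("%010d", $i) "."
    # Emit the padded key, a tab, and then the untouched original line.
    printf "%s\t%s\n", key, $0
  }' | LC_ALL=C sort | cut -f2-   # sort on the key, then drop it again
}

printf '%s\n' 6.3.1.1 6.3.1.10 6.3.1.11 6.3.1.2 6.3.1.3 | pad_sort
```

Since the original line is carried along unchanged, nothing has to be un-padded afterwards.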
The standard sort that comes installed on OS X can sort by fields using a separator. So you can sort the version numbers and any suffixes.
This will sort by suffix first and then by the X.Y.Z parts:
sort -s -t- -k 2,2n | sort -t. -s -k 1,1n -k 2,2n -k 3,3n -k 4,4n
It can also sort the -N-g<sha> version format produced by the git describe --tags command:
0.11.1
0.11.4
0.11.9-1-ge6b0c59
0.12.0
0.12.1
0.12.2-1-g2d0a334
0.13.0
0.13.0-1-g7711b16
0.13.0-2-g32f91bd
0.13.0-3-g83e21c5
0.14.1-alpha
0.14.1
0.14.2
The -3-g83e21c5 above is an example of a suffix that the git describe --tags command automatically appends to the latest tag to signify the number of commits since the tag (3) and the abbreviated Git SHA hash of the most recent commit (83e21c5).
To reverse the sort into descending order do this: sort -s -t- -k 2,2nr | sort -t. -s -k 1,1nr -k 2,2nr -k 3,3nr -k 4,4nr
Or you can define a shell function around it.
version_sort() {
# read stdin, sort by version number descending, and write stdout
# assumes X.Y.Z version numbers
# this will sort tags like pr-3001, pr-3002 to the END of the list
# and tags like 2.1.4 BEFORE 2.1.4-gitsha
sort -s -t- -k 2,2nr | sort -t. -s -k 1,1nr -k 2,2nr -k 3,3nr -k 4,4nr
}
Or write it into a small file named version-sort and put it in a directory on your PATH. Be sure to run chmod +x on the file:
#!/usr/bin/env bash
sort -s -t- -k 2,2nr | sort -t. -s -k 1,1nr -k 2,2nr -k 3,3nr -k 4,4nr
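For the purely numeric four-part tags in the original question, the field-based idea can be sketched in ascending order without the suffix pass. The function name version_sort_asc is an illustrative assumption; only -t, -k and the n modifier are used.

```shell
# version_sort_asc is a hypothetical name; it reads versions on stdin.
# Assumes at most four dot-separated numeric components and no suffixes.
version_sort_asc() {
  # Compare each dot-separated field numerically, left to right.
  sort -t. -k1,1n -k2,2n -k3,3n -k4,4n
}

printf '%s\n' 6.3.1.1 6.3.1.10 6.3.1.11 6.3.1.2 6.3.1.3 | version_sort_asc
```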
brew install coreutils
If coreutils is installed, you should have gsort on your Mac:
gsort --version
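Since the script has to run on both platforms, one possible approach is a small wrapper that prefers gsort when it exists, falls back to the system sort if that accepts -V, and only then uses a field-based sort. The function name vsort is an assumption for illustration, and the last branch assumes plain numeric versions with at most four components.

```shell
# vsort is a hypothetical wrapper; it reads versions on stdin.
vsort() {
  if command -v gsort >/dev/null 2>&1; then
    # GNU sort installed under the g prefix (for example via Homebrew coreutils).
    gsort -V
  elif printf '1\n' | sort -V >/dev/null 2>&1; then
    # The system sort itself understands -V; the probe uses its own input.
    sort -V
  else
    # Fallback: numeric comparison of up to four dot-separated fields.
    sort -t. -k1,1n -k2,2n -k3,3n -k4,4n
  fi
}

printf '%s\n' 6.3.1.1 6.3.1.10 6.3.1.11 6.3.1.2 6.3.1.3 | vsort
```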
I have a list of buildnumbers which I get from my buildserver, like this:
1.0.0.b1
1.0.0.b10
1.0.0.b11
1.0.0.b12
1.0.0.b13
1.0.0.b14
1.0.0.b15
1.0.0.b16
1.0.0.b17
1.0.0.b18
1.0.0.b19
1.0.0.b2
1.0.0.b20
1.0.0.b21
1.0.0.b22
1.0.0.b3
1.0.0.b4
1.0.0.b5
1.0.0.b6
1.0.0.b7
1.0.0.b8
1.0.0.b9
Now I need to sort this so that the highest build number is at the bottom, like this:
1.0.0.b1
1.0.0.b2
1.0.0.b3
1.0.0.b4
1.0.0.b5
1.0.0.b6
1.0.0.b7
1.0.0.b8
1.0.0.b9
1.0.0.b10
1.0.0.b11
1.0.0.b12
1.0.0.b13
1.0.0.b14
1.0.0.b15
1.0.0.b16
1.0.0.b17
1.0.0.b18
1.0.0.b19
1.0.0.b20
1.0.0.b21
1.0.0.b22
On Linux with GNU sort this is easy: just use sort -V
But this also has to work on macOS, which I have no experience with; from testing I know that -V does not work there.
I tried with
sort -t . -k 1,1n -k 2,2n -k 3,3n -k 4,4n
but no luck there.
I want to have it sorted by Version/buildnumber, e.g.
1.1.3.b5 is higher than 1.0.3.b66
What have I missed here? Can you please help me? Also, unfortunately, installing Homebrew coreutils is not an option.
thank you,
br Alex
I assume your real full list won't have all b versions. You'll need to split field 4 into two keys; one for the alpha part and one for the numeric part.
$: sort -t. -k1,1n -k2,2n -k3,3n -k4.1,4.1 -k4.2n vnums
1.0.0.a5
1.0.0.a10
1.0.0.a13
1.0.0.a19
1.0.0.b1
1.0.0.b6
1.0.0.b8
1.0.0.b9
1.0.0.b12
1.0.0.b14
1.0.0.b17
1.0.0.b20
1.0.0.b21
1.0.0.b22
1.0.0.c3
1.0.0.c7
1.0.0.c15
1.0.0.c16
1.0.0.d2
1.0.0.d4
1.0.0.d11
1.0.0.d18
1.0.3.b66
1.1.3.b5
Note the limiting of the alpha column of field 4 to a single character.
Perl one-liner
Assuming that no number has more than 6 digits (or some other fixed size), sprintf "%06d", $& left-pads every number with zeros so that a plain string sort orders them correctly:
perl -e 'sub v{"@_"=~s/\d+/sprintf"%06d",$&/ger}print sort{v($a)cmp v($b)} <>' inputfile
Treat the fourth key as composed of subfields:
sort -t. -k1n,1 -k2n,2 -k3n,3 -k4.1,4.1 -k4.2n
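As a quick check of the split-key idea against the ordering the question asks for, here is a minimal sketch. The function name build_sort is an assumption for illustration, and it keeps the restriction mentioned above that the alphabetic prefix of field 4 is exactly one character.

```shell
# build_sort is a hypothetical name; it reads build numbers on stdin.
build_sort() {
  # Fields 1-3: numeric. Field 4: first character as text,
  # then the rest of the field (from character 2) as a number.
  sort -t. -k1,1n -k2,2n -k3,3n -k4.1,4.1 -k4.2n
}

printf '%s\n' 1.0.0.b10 1.1.3.b5 1.0.0.b2 1.0.3.b66 1.0.0.a13 | build_sort
```

With this input, 1.1.3.b5 ends up after 1.0.3.b66, as required.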
I have a largish file with lines like this: (^I represents a tab, $ end-of-line)
2^IElaeocarpus williamsianus^I48$
4^I$
6^I$
8^I$
10^I$
12^I$
14^IElaeocarpus hookerianus^I73$
16^IElaeocarpus kirtonii^I111$
20^I$
22^ITetratheca juncea^I66$
42^IMalagasy giant rat^I401$
and I want to sort the lines so that those with the highest number in the 3rd field (i.e. after the 2nd tab) come first, i.e.
42^IMalagasy giant rat^I401$
16^IElaeocarpus kirtonii^I111$
14^IElaeocarpus hookerianus^I73$
22^ITetratheca juncea^I66$
2^IElaeocarpus williamsianus^I48$
4^I$
6^I$
8^I$
10^I$
12^I$
20^I$
(I don't care about the order of the lines with no field 3). So I assumed something like the following would work
sort -r -t $'\t' -k 3,3n myfile
but it doesn't (GNU sort, OS X 10.9). I feel I'm being stupid. What's the correct incantation?
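You need to add the reverse modifier to your -k key rather than passing -r as a global option: a key that carries its own ordering options (here n) does not inherit global ones such as -r.
So something along these lines should do the trick: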
sort -t $'\t' -k 3,3nr myfile
It seems that what you want isn't n, but g.
sort -t $'\t' test.txt -k 3.2gr
The dot specifies in the key at which character to start comparing.
As favoretti pointed out, what you want to reverse is by that column, so you apply the modifier there.
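A minimal sketch of the per-key modifier on tab-separated data like that in the question. The function name sort_by_third_desc is an assumption for illustration; printf is used to produce the tab so the sketch does not depend on the $'\t' syntax.

```shell
# sort_by_third_desc is a hypothetical name; it takes files or reads stdin.
sort_by_third_desc() {
  tab=$(printf '\t')
  # "nr" is attached to the key itself: numeric and reversed for field 3 only.
  sort -t "$tab" -k3,3nr "$@"
}

printf '2\tElaeocarpus williamsianus\t48\n4\t\n16\tElaeocarpus kirtonii\t111\n' |
  sort_by_third_desc
```

Lines with an empty third field compare as zero under n, so with the reversed key they land at the end.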
I've been trying to sort two files and get the output.
say for file 1:
102310863||7097881||6845123||271640||06007709532577||||
102310875||7092992||6840818||023740||10034500635650||||
and file 2:
102310863||7097881||6845193||271640||06007709532577||||
102310875||7092992||6840808||023740||10034500635650||||
The desired output is:
102310863||7097881||6845123||271640||06007709532577||||
102310863||7097881||6845193||271640||06007709532577||||
102310875||7092992||6840818||023740||10034500635650||||
102310875||7092992||6840808||023740||10034500635650||||
I've been trying to use the sort command
sort -t \| -n -k1,1 t1.txt t2.txt
but it is giving me the output
102310863||7097881||6845123||271640||06007709532577||||
102310863||7097881||6845193||271640||06007709532577||||
102310875||7092992||6840808||023740||10034500635650||||
102310875||7092992||6840818||023740||10034500635650||||
which is not what I want because original file order is not preserved.
Is there any other way of doing it to get the desired output?
Using the -s flag performs a stable sort.
sort -s -t \| -k1,1 t1.txt t2.txt
From man sort:
-s, --stable
stabilize sort by disabling last-resort comparison
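A minimal sketch of the stable merge, wrapped in a function so it can be reused. The name merge_stable is an assumption for illustration; lines whose first field is equal keep the order in which they were read, that is, the order of the files on the command line.

```shell
# merge_stable is a hypothetical name.
# Usage: merge_stable t1.txt t2.txt
merge_stable() {
  # -s disables the last-resort whole-line comparison, so lines with
  # equal first fields stay in input order (first file before second).
  sort -s -t '|' -k1,1 "$@"
}
```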
I have a list of files in a folder.
The names are:
1-a
100-a
2-b
20-b
3-x
and I want to sort them like
1-a
2-b
3-x
20-b
100-a
The files are always a number, followed by a dash, followed by anything.
I tried a ls with a col and sort and it works, but I wanted to know if there's a simpler solution.
Forgot to mention: This is bash running on a Mac OS X.
Some ls implementations, GNU coreutils' ls is one of them, support the -v (natural sort of (version) numbers within text) option:
% ls -v
1-a 2-b 3-x 20-b 100-a
or:
% ls -v1
1-a
2-b
3-x
20-b
100-a
Use sort to define the fields.
sort -s -t- -k1,1n -k2 filenames.txt
The -t tells sort to treat - as the field separator in input items. -k1,1n instructs sort to first sort on the first field numerically; -k2 sorts using the remaining fields as the second key in case the first fields are equal. -s keeps the sort stable (although you could omit it since the entire input string is being used in one field or another).
(Note: I'm assuming the file names do not contain newlines, so that something like ls > filenames.txt is guaranteed to produce a file with one name per line. You could also use ls | sort ... in that case.)
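A minimal sketch of the same keys applied to the names from the question. The function name sort_numbered is an assumption for illustration, and it shares the assumption that names contain no newlines.

```shell
# sort_numbered is a hypothetical name; it reads one name per line on stdin.
sort_numbered() {
  # Field 1 (before the first dash) numerically, then the rest as text.
  sort -t- -k1,1n -k2
}

printf '%s\n' 1-a 100-a 2-b 20-b 3-x | sort_numbered
# In a directory, something like: printf '%s\n' * | sort_numbered
```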
I have a file with floats with exponents and I want to sort them. AFAIK 'sort -g' is what I need. But it seems like it sorts floats throwing away all the exponents. So the output looks like this (which is not what I wanted):
$ cat file.txt | sort -g
8.387280091e-05
8.391373668e-05
8.461754562e-07
8.547354437e-05
8.831553093e-06
8.936111118e-05
8.959458896e-07
This brings me to two questions:
Why doesn't 'sort -g' work as I expect?
How can I sort my file using bash commands?
The problem is that in some countries the locale settings can break this by using , instead of . as the decimal separator at the system level. Check by typing locale in a terminal. There should be an entry
LC_NUMERIC=en_US.UTF-8
If the value is anything else, change it to the above by editing the locale file
sudo gedit /etc/default/locale
That's it. You can also override the locale temporarily, for a single command, by doing
LC_ALL=C sort -g file.dat
LC_ALL=C is shorter to write in terminal, but putting it in the locale file might not be preferable as it could alter some other system-wide behavior such as maybe time format.
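A minimal sketch of the per-command override, wrapped in a function so the rest of the script keeps its own locale. The name sort_floats is an assumption for illustration, and it assumes a sort that supports -g.

```shell
# sort_floats is a hypothetical name; it takes files or reads stdin.
sort_floats() {
  # LC_ALL=C applies to this one sort invocation only, so "." is
  # treated as the decimal separator regardless of the system locale.
  LC_ALL=C sort -g "$@"
}

printf '%s\n' 8.387280091e-05 8.461754562e-07 8.831553093e-06 | sort_floats
```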
Here's a neat trick:
$ sort -te -k2,2n -k1,1n test.txt
8.461754562e-07
8.959458896e-07
8.831553093e-06
8.387280091e-05
8.391373668e-05
8.547354437e-05
8.936111118e-05
The -te splits each number into two fields at the e that separates the mantissa from the exponent. The -k2,2n key sorts by exponent first, then the -k1,1n key sorts by mantissa.
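This needs only -t, -k and -n, so it should work with any version of the sort command. It does assume that every mantissa is normalized (one digit before the decimal point) and that all values have the same sign.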
Your method is absolutely correct
cat file.txt | sort -g
If the above code is not working, then try this:
sed 's/\./0000000000000/g' file.txt | sort -g | sed 's/0000000000000/\./g'
Convert '.' to '0000000000000', sort, and then substitute '.' back in. I chose '0000000000000' as the placeholder to avoid collisions with digit sequences in the input.
You can choose a different placeholder yourself.