Shell script to sort debian version numbers (line_5.4.3-2) [duplicate] - bash

I have a text file with entries representing build tags with version and build number in the same format as debian packages like this:
nimbox-apexer_1.0.0-12
nimbox-apexer_1.1.0-2
nimbox-apexer_1.1.0-1
nimbox-apexer_1.0.0-13
Using a shell script I need to sort the above list by 'version-build' and get the last line, which in the above example is nimbox-apexer_1.1.0-2.

Get the latest build with:
cat file.txt | sort -V | tail -n1
To capture it in a variable:
BUILD=$(cat file.txt | sort -V | tail -n1)
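The same idea can be sketched without the extra cat. This assumes GNU sort (the -V option is not part of POSIX) and uses made-up sample tags that include a two-digit minor version:

```shell
# Hypothetical sample data; 1.10.0 must rank above 1.9.0.
tags='nimbox-apexer_1.0.0-12
nimbox-apexer_1.10.0-9
nimbox-apexer_1.9.0-1
nimbox-apexer_1.1.0-2'

# Version-sort the tags and keep the last (highest) one.
BUILD=$(printf '%s\n' "$tags" | sort -V | tail -n 1)
printf '%s\n' "$BUILD"
```

With a real file, sort -V file.txt | tail -n 1 does the same job.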

sort -n -t "_" -k2.3 file | tail -1

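cat file.txt | cut -d_ -f 2 | sed "s/-/./g" | sort -t . -k 1,1n -k 2,2n -k 3,3n -k 4,4n
In each -k F,Fn the key starts and ends at field F and the n asks for a numeric comparison; the number after the comma is the field where the key ends, not a character count. Note that this prints only the version part, without the package name.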

With GNU sort:
sort --version-sort file | tail -n -1
The tail -1 form is obsolete syntax; scripts meant to be portable should use tail -n 1 (GNU tail treats tail -n -1 the same way).

I haven't been able to find a simple way to do this. I've been looking at code that sorts IP addresses, which is similar to my problem, and trying to reduce my situation to that one. This is what I have come up with. Please tell me there is a simpler, better way!
sed 's/^[^0-9]*\([0-9]*\)\.\([0-9]*\)\.\([0-9]*\)-\([0-9]*\)/\1.\2.\3.\4 &/' list.txt | \
sort -t . -n -k 1,1 -k 2,2 -k 3,3 -k 4,4 | \
sed 's/^[^ ]* \(.*\)/\1/' | \
tail -n 1
So starting with this data:
nimbox-apexer_11.9.0-2
nimbox-apexer_1.10.0-9
nimbox-apexer_1.9.0-1
nimbox-apexer_1.0.0-12
nimbox-apexer_1.1.0-2
nimbox-apexer_1.1.0-1
nimbox-apexer_1.0.0-13
The first sed turns my problem into an IP-sorting problem, keeping the original line so the change can be reversed at the end:
11.9.0.2 nimbox-apexer_11.9.0-2
1.10.0.9 nimbox-apexer_1.10.0-9
1.9.0.1 nimbox-apexer_1.9.0-1
1.0.0.12 nimbox-apexer_1.0.0-12
1.1.0.2 nimbox-apexer_1.1.0-2
1.1.0.1 nimbox-apexer_1.1.0-1
1.0.0.13 nimbox-apexer_1.0.0-13
The sort orders the lines using the first four numbers, which in my case represent major.minor.release.build:
1.0.0.12 nimbox-apexer_1.0.0-12
1.0.0.13 nimbox-apexer_1.0.0-13
1.1.0.1 nimbox-apexer_1.1.0-1
1.1.0.2 nimbox-apexer_1.1.0-2
1.9.0.1 nimbox-apexer_1.9.0-1
1.10.0.9 nimbox-apexer_1.10.0-9
11.9.0.2 nimbox-apexer_11.9.0-2
The last sed removes the prefix that was added for sorting:
nimbox-apexer_1.0.0-12
nimbox-apexer_1.0.0-13
nimbox-apexer_1.1.0-1
nimbox-apexer_1.1.0-2
nimbox-apexer_1.9.0-1
nimbox-apexer_1.10.0-9
nimbox-apexer_11.9.0-2
Finally tail gets the last line which is the one I need.
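For reuse, the decorate, sort, undecorate steps above can be wrapped in a function. This is only a sketch: latest_tag is a made-up name, and it assumes every line looks like name_major.minor.release-build with no digits in the name part.

```shell
# latest_tag (hypothetical name): print the line with the highest version.
# 1. sed prefixes each line with "major.minor.release.build ".
# 2. sort compares the four dot-separated fields numerically.
# 3. sed drops the prefix again; tail keeps the last line.
latest_tag() {
  sed 's/^[^0-9]*\([0-9]*\)\.\([0-9]*\)\.\([0-9]*\)-\([0-9]*\)/\1.\2.\3.\4 &/' "$@" |
    sort -t . -k 1,1n -k 2,2n -k 3,3n -k 4,4n |
    sed 's/^[^ ]* //' |
    tail -n 1
}

# Same sample data as above, fed on standard input.
latest=$(printf '%s\n' \
  nimbox-apexer_11.9.0-2 nimbox-apexer_1.10.0-9 nimbox-apexer_1.9.0-1 \
  nimbox-apexer_1.0.0-12 nimbox-apexer_1.1.0-2 nimbox-apexer_1.1.0-1 \
  nimbox-apexer_1.0.0-13 | latest_tag)
printf '%s\n' "$latest"
```

The function also accepts file names, as in latest_tag list.txt.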

Related

How to find most frequent string in file

I have a question about a bash script. Let's say there is a file which contains lines; each line has a path to a file and a date. The problem is how to find the most frequent path.
Thanks in advance.
Here's a suggestion
$ cut -d' ' -f1 file.txt | sort | uniq -c | sort -rn | head -n1
# \____________________/   \__/   \_____/   \______/   \______/
# select the file column   sort   print     sort on    print top
#                          files  counts    count      result
Example use:
$ cat file.txt
/home/admin/fileA jan:17:13:46:27:2015
/home/admin/fileB jan:17:13:46:27:2015
/home/admin/fileC jan:17:13:46:27:2015
/home/admin/fileA jan:17:13:46:27:2015
/home/admin/fileA jan:17:13:46:27:2015
$ cut -d' ' -f1 file.txt | sort | uniq -c | sort -rn | head -n1
3 /home/admin/fileA
You can strip the count (3) from the final result with one more filter.
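One possible way to drop the count is a trailing awk rather than cut, since uniq -c pads its counts with leading spaces. This is a sketch: most_frequent_path is a made-up name, and it assumes the path is the first space-separated field and contains no whitespace.

```shell
# most_frequent_path (hypothetical name): print only the most frequent path.
most_frequent_path() {
  cut -d' ' -f1 "$@" | sort | uniq -c | sort -rn | head -n 1 | awk '{print $2}'
}

# Same sample lines as in the example above, fed on standard input.
top=$(printf '%s\n' \
  '/home/admin/fileA jan:17:13:46:27:2015' \
  '/home/admin/fileB jan:17:13:46:27:2015' \
  '/home/admin/fileC jan:17:13:46:27:2015' \
  '/home/admin/fileA jan:17:13:46:27:2015' \
  '/home/admin/fileA jan:17:13:46:27:2015' | most_frequent_path)
printf '%s\n' "$top"
```

If two paths are tied for first place, which one is printed depends on the sort order of the tied lines.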
Reverse the lines, cut the beginning (the date), reverse them again, then sort and count unique lines:
cat file.txt | rev | cut -b 22- | rev | sort | uniq -c
If you're absolutely sure you won't have whitespace in your paths, you can avoid rev altogether:
cat file.txt | cut -d " " -f 1 | sort | uniq -c
If the output is too long to inspect visually, aioobe's suggestion of following this with sort -rn | head -n1 will serve you well.
It's worth noticing, as aioobe mentioned, that many unix commands optionally take a file argument. By using it, you can avoid the extra cat command in the beginning, by supplying its argument to the next command:
cat file.txt | rev | ... vs rev file.txt | ...
While I personally find the first option easier both to remember and to understand, the second is preferred by many (most?) people, as it saves system resources (specifically, the memory and references used by an additional process) and can perform better in some specific use cases. Wikipedia's cat article discusses this in detail.

Sort text file using bash sort

I'm trying to sort the following file by date with earliest to latest:
$NAME DIA
# Date,Open,High,Low,Close,Volume,Adj Close
01-10-2014,169.91,169.98,167.42,167.68,11019000,167.68
29-04-2014,164.62,165.27,164.49,165.00,4581400,163.40
17-10-2013,152.11,153.59,152.05,153.48,9916600,150.26
06-09-2013,149.70,149.97,147.77,149.09,9001900,145.68
02-11-2012,132.56,132.61,130.47,130.67,5141300,125.01
01-11-2012,131.02,132.44,130.97,131.98,3807400,126.27
sort -t- -k3 -k2 -k1 DIA.txt gets the year right but scrambles the month and day.
Any help would be greatly appreciated.
This seems to produce correct output
sort -s -t- -k3,3 -k2,2 -k1,1
output:
$ sort -s -t- -k3,3 -k2,2 -k1,1 dia.txt
# Date,Open,High,Low,Close,Volume,Adj Close
01-11-2012,131.02,132.44,130.97,131.98,3807400,126.27
02-11-2012,132.56,132.61,130.47,130.67,5141300,125.01
06-09-2013,149.70,149.97,147.77,149.09,9001900,145.68
17-10-2013,152.11,153.59,152.05,153.48,9916600,150.26
29-04-2014,164.62,165.27,164.49,165.00,4581400,163.40
01-10-2014,169.91,169.98,167.42,167.68,11019000,167.68
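One caveat: with -t- the third field runs to the end of the line, so -k3,3 compares the prices as well as the year, and the month and day keys are consulted only when the rest of the line is identical. A minimal sketch that limits the year key to its first four characters is shown below. sort_by_date is a made-up name, the rows are invented so that price order and date order disagree, and -s is a GNU/BSD extension rather than POSIX.

```shell
# sort_by_date (hypothetical name): order DD-MM-YYYY,... rows earliest first.
# -k3.1,3.4 uses only characters 1-4 of field 3 (the year) as the first key;
# -s keeps lines whose keys are all equal (such as headers) in input order.
sort_by_date() {
  sort -s -t- -k3.1,3.4 -k2,2 -k1,1 "$@"
}

# Made-up rows: the latest date has the lowest price.
rows='02-03-2014,100
01-03-2014,200
15-01-2014,300'
sorted=$(printf '%s\n' "$rows" | sort_by_date)
printf '%s\n' "$sorted"
```

Lines without any hyphen, such as the two header lines, have empty keys and therefore sort to the top.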
I would try changing the date format first.
sed -r "s/(..)-(..)-(....)/\\3-\\2-\\1/" DIA.txt | sort
You can also change it back after sorting the lines.
sed -r "s/(..)-(..)-(....)/\\3-\\2-\\1/" DIA.txt | sort | sed -r "s/(....)-(..)-(..)/\\3-\\2-\\1/"
sort's -k flag takes a start[,end] pair that gives the range of one key, but the flag can be repeated, and each later key resolves ties left by the earlier ones (here between rows with the same year, then the same month). Because -t- makes field 3 run to the end of the line, the year key is limited to its first four characters:
sort -t'-' -k3.1,3.4 -k2,2 -k1,1 d

How to filter pipeline data according to column?

I wrote the following pipeline:
for i in `ls c*.txt | sort -V`; do echo $i; grep -v '#' ${i%???}_c_new.txt | grep -v 'seq-name' | cut -f 6 | grep -o '[0-9]*' | awk '{s+=$1} END {print s}'; done
Now I want to take the 6th column (cut -f 6 and the code after it) only from those lines whose 13th column matches a certain grep, namely:
cut -f 13 | grep -o '^A$'
That is, I look at the 13th column and, if the grep matches, I keep the line and run the rest of the code on it, summing the numbers in the 6th column.
How can I do that? Thanks.
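Write a grep command that takes the uncut lines and filters on the 13th field, for example
grep -E '^(\S+\s+){12}A(\s|$)'
and then pipe the result to cut -f 6 and so on. The pattern is anchored at the start of the line so that it counts fields from the first one; \S and \s are GNU grep extensions, and the pattern assumes that none of the first twelve fields is empty.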
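An alternative sketch does the field test in awk, which compares the 13th field directly instead of counting fields with a regular expression. sum_field6_if_A and row are made-up helper names; the input is assumed to be tab-separated, as cut -f expects, and grep -o is used as in the question.

```shell
# sum_field6_if_A (hypothetical name): keep rows whose 13th tab-separated field
# is exactly "A", pull the digit runs out of field 6, and add them up.
# "s + 0" makes the result 0 rather than empty when nothing matches.
sum_field6_if_A() {
  awk -F '\t' '$13 == "A"' "$@" | cut -f 6 | grep -o '[0-9]*' |
    awk '{s += $1} END {print s + 0}'
}

# row (hypothetical helper): build a 13-field test line from field 6 and field 13.
row() { printf '%s\t' a b c d e "$1" g h i j k l; printf '%s\n' "$2"; }

total=$( { row 10 A; row 5 B; row x7 A; } | sum_field6_if_A )
printf '%s\n' "$total"
```

Inside the original loop, the call would take the place of the grep, cut and awk stages that follow the two grep -v filters.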

How can I sort file names by version numbers?

In the directory "data" are these files:
command-1.9a-setup
command-2.0a-setup
command-2.0c-setup
command-2.0-setup
I would like to sort the files to get this result:
command-1.9a-setup
command-2.0-setup
command-2.0a-setup
command-2.0c-setup
I tried this
find /data/ -name 'command-*-setup' | sort --version-sort --field-separator=- -k2
but the output was
command-1.9a-setup
command-2.0a-setup
command-2.0c-setup
command-2.0-setup
The only way I found that gave me my desired output was
tree -v /data
How could I get with sort the output in the wanted order?
Edit: It turns out that Benoit was sort of on the right track and Roland tipped the balance
You simply need to tell sort to consider only field 2 (add ",2"):
find ... | sort --version-sort --field-separator=- --key=2,2
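A small sketch of that command on the four file names from the question, fed through printf instead of find so that it runs anywhere with GNU sort:

```shell
# The names from the question, in arbitrary order.
names='command-2.0a-setup
command-1.9a-setup
command-2.0-setup
command-2.0c-setup'

# --key=2,2 stops the key at the end of field 2, so "-setup" is not compared.
sorted_names=$(printf '%s\n' "$names" |
  sort --version-sort --field-separator=- --key=2,2)
printf '%s\n' "$sorted_names"
```

With --key=2 the key would run to the end of the line, and 2.0-setup would be compared against 2.0a-setup as a whole.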
Original Answer: ignore
If none of your filenames contain spaces between the hyphens, you can try this:
find ... | sed 's/.*-\([^-]*\)-.*/\1 \0/;s/[^0-9] /.&/' | sort --version-sort --field-separator=- --key=2 | sed 's/[^ ]* //'
The first sed command makes the lines look like this (I added "10" to show that the sort is numeric):
1.9.a command-1.9a-setup
2.0.c command-2.0c-setup
2.0.a command-2.0a-setup
2.0 command-2.0-setup
10 command-10-setup
The extra dot makes the letter suffixed version number sort after the version number without the suffix. The second sed command removes the prefixed version number from each line.
There are lots of ways this can fail.
If you give sort a key that starts at the second field (-k2), the key runs to the end of the line, so the third field is compared as well.
In your case, run sort --version-sort without any other argument, maybe this will suit better.
Looks like this works:
find /data/ -name 'command-*-setup' | sort -t - -V -k 2,2
Not with sort, but it works:
tree -ivL 1 /data/ | perl -nlE 'say if /\Acommand-[0-9][0-9a-z.]*-setup\z/'
-v: sort the output by version
-i: makes tree not print the indentation lines
-L level: max display depth of the directory tree
Another way to do this is to pad your numbers.
This example pads all numbers to 8 digits.
Then, it does a plain alphanumeric sort.
Then, it removes the pad.
$ pad() { perl -pe 's/(\d+)/0000000$1/g' | perl -pe 's/0*(\d{8})/$1/g'; }
$ unpad() { perl -pe 's/0*([1-9]\d*|0)/$1/g'; }
$ cat files | pad | sort | unpad
command-1.9a-setup
command-2.0-setup
command-2.0a-setup
command-2.0c-setup
command-10.1-setup
To get some insight into how this works, let's look at the padded sorted result:
$ cat files | pad | sort
command-00000001.00000009a-setup
command-00000002.00000000-setup
command-00000002.00000000a-setup
command-00000002.00000000c-setup
command-00000010.00000001-setup
You'll see that with all the numbers nicely padded to 8 digits, the alphanumeric sort puts the filenames into their desired order.
Old post, but... ls -l --sort=version may be of assistance (although for OP's example the sort is the same as done by ls -l in a RHEL 7.2):
command-1.9a-setup
command-2.0a-setup
command-2.0c-setup
command-2.0-setup
YMMV, I guess.
$ cat files
command-1.9a-setup
command-2.0c-setup
command-10.1-setup
command-2.0a-setup
command-2.0-setup
$ cat files | sort -t- -k2,2 -n
command-1.9a-setup
command-2.0-setup
command-2.0a-setup
command-2.0c-setup
command-10.1-setup
$ tac files | sort -t- -k2,2 -n
command-1.9a-setup
command-2.0-setup
command-2.0a-setup
command-2.0c-setup
command-10.1-setup
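One caveat with -n, sketched here with two made-up file names: numeric comparison reads 1.10 as the decimal 1.1, so it ranks below 1.9. With GNU sort, -V compares each dot-separated component as an integer instead. LC_ALL=C is set only to pin the decimal point to ".".

```shell
# Hypothetical names whose minor versions are 9 and 10.
versions='command-1.9-setup
command-1.10-setup'

# Numeric sort: 1.10 is read as 1.1, so 1.9 ends up last.
numeric_last=$(printf '%s\n' "$versions" | LC_ALL=C sort -t- -k2,2 -n | tail -n 1)

# Version sort (GNU): 10 > 9, so 1.10 ends up last.
version_last=$(printf '%s\n' "$versions" | sort -t- -k2,2 -V | tail -n 1)

printf 'numeric: %s\nversion: %s\n' "$numeric_last" "$version_last"
```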
I have files in a folder and need to list their names sorted by the number they contain, e.g.:
abc_dr-1.txt
hg_io-5.txt
kls_er_we-3.txt
sd-4.txt
sl_rt_we_yh-2.txt
I need to sort them based on number.
So I used this to sort.
ls -1 | sort -t '-' -nk2
It gave me the files sorted by number.

How to reverse lines of a text file?

I'm writing a small shell script that needs to reverse the lines of a text file. Is there a standard filter command to do this sort of thing?
My specific application is that I'm getting a list of Git commit identifiers, and I want to process them in reverse order:
git log --pretty=oneline work...master | grep -v DEBUG: | cut -d' ' -f1 | reverse
The best I've come up with is to implement reverse like this:
... | cat -b | sort -rn | cut -f2-
This uses cat to number every line, then sort to sort them in descending numeric order (which ends up reversing the whole file), then cut to remove the unneeded line number.
The above works for my application, but may fail in the general case because cat -b only numbers nonblank lines.
Is there a better, more general way to do this?
In GNU coreutils, there's tac(1)
There is a command for your purpose:
tail -r file.txt
Prints the lines of file.txt in reverse order!
The -r flag is non-standard, may not work on all systems, works e.g. on macOS.
Beware: the number of lines it can handle may be limited. It mostly works, but check the result carefully when working with huge files.
Answer is not 42 but tac.
Edit: a slower and more memory-consuming alternative using sed:
sed 'x;1!H;$!d;x'
and even longer
perl -e'print reverse<>'
Similar to the sed example above, using perl - maybe more memorable (depending on how your brain is wired):
perl -e 'print reverse <>'
"cat -b only numbers nonblank lines"
If that's the only issue you want to avoid, then why not use "cat -n" to number all the lines?
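A minimal sketch of that variant, wrapped in a function named reverse_lines (a made-up name). It assumes a cat that supports -n, which GNU and BSD cat do although POSIX does not require it:

```shell
# reverse_lines (hypothetical name): print the input lines in reverse order.
# cat -n numbers every line, blank ones included, and puts a tab after the
# number; sort -rn orders by that number descending; cut -f2- removes it again.
reverse_lines() {
  cat -n "$@" | sort -rn | cut -f2-
}

# The blank middle line survives the round trip.
reversed=$(printf 'first\n\nthird\n' | reverse_lines)
printf '%s\n' "$reversed"
```

Because cut -f2- keeps everything after the first tab, tabs inside the lines themselves are preserved.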
: "#(#)$Id: reverse.sh,v 1.2 1997/06/02 21:45:00 johnl Exp $"
#
# Reverse the order of the lines in each file
awk '{ printf("%d:%s\n", NR, $0) }' "$@" |
sort -t: -k1,1nr |
sed 's/^[0-9][0-9]*://'
Works like a charm for me...
In this case, just use --reverse:
$ git log --reverse --pretty=oneline work...master | grep -v DEBUG: | cut -d' ' -f1
rev <name of your text file.txt>
Note that rev reverses the characters within each line rather than the order of the lines. It also reads standard input:
echo <whatever you want to type>|rev
awk '{a[i++]=$0}END{for(;i-->0;)print a[i]}'
Faster than sed, and it works on embedded devices such as OpenWrt.
