Sort by an ID column and date column (MM/DD/YYYY) - bash

I'm trying to sort a .txt file by both an ID column and a date column, but the date sort part is not working as I need it to.
Data:
|855986|03/01/1980|100|
|855986|06/01/1979|120|
|868566|01/01/1999|560|
|855986|05/01/2015|856|
|868566|09/01/2000|560|
What I need output to look like:
|855986|06/01/1979|120|
|855986|03/01/1980|100|
|855986|05/01/2015|856|
|868566|01/01/1999|560|
|868566|09/01/2000|560|
Here's my current code, which sorts the ID and month correctly, but seems to ignore the year portion of the date:
sort -t '|' -k 1 -b -k 2.7,2.10 -k 2.1,2.2 file.txt

You are pretty close. However, the date is actually field 3: every line starts with |, so field 1 is empty and the ID is field 2. Because the dates are MM/DD/YYYY, sort on the ID first, then on the year, the month and the day:
sort -b -t '|' -k 2,2 -k 3.7,3.10 -k 3.1,3.2 -k 3.4,3.5 file
|855986|06/01/1979|120|
|855986|03/01/1980|100|
|855986|05/01/2015|856|
|868566|01/01/1999|560|
|868566|09/01/2000|560|
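If counting character offsets inside the date feels fragile, another option is to decorate each line with a sortable key, sort on that, and strip the key again. The sketch below is only an illustration under some assumptions: the variable names (data, tab, sorted) are made up for the example, the lines contain no tab characters, and the dates are always MM/DD/YYYY.

```shell
# Sample rows from the question.
data='|855986|03/01/1980|100|
|855986|06/01/1979|120|
|868566|01/01/1999|560|
|855986|05/01/2015|856|
|868566|09/01/2000|560|'

# A literal tab, used as the separator for the helper columns.
tab=$(printf '\t')

# Decorate each line with "ID<TAB>YYYYMMDD<TAB>", sort numerically on
# those two helper columns, then cut the helper columns off again.
sorted=$(printf '%s\n' "$data" |
  awk -F'|' '{ split($3, d, "/"); print $2 "\t" d[3] d[1] d[2] "\t" $0 }' |
  sort -t "$tab" -k 1,1n -k 2,2n |
  cut -f 3-)

printf '%s\n' "$sorted"
```

The helper key turns 03/01/1980 into 19800301, so a plain numeric sort orders the dates chronologically within each ID.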

Related

How can I sort by multiple fields in the terminal?

I'm looking for a way to sort my list (IP addresses in CIDR notation) in the Linux terminal. My input list looks like this:
1.0.0.0/24
1.0.4.0/22
1.0.16.0/24
1.0.64.0/18
1.0.128.0/17
1.1.1.0/24
1.1.8.0/24
1.1.20.0/24
1.1.64.0/19
1.1.103.0/24
1.1.104.0/21
1.1.112.0/20
1.1.128.0/17
1.2.4.0/24
1.2.11.0/24
1.2.128.0/17
1.3.33.0/24
1.3.34.0/24
1.3.101.0/24
1.4.128.0/17
1.5.0.0/16
1.6.0.0/17
1.6.128.0/18
1.8.18.0/24
1.9.0.0/16
1.10.72.0/23
1.10.128.0/17
1.11.0.0/16
1.16.0.0/18
1.18.116.0/22
I use the sort command but can't get the order I want (first by prefix length, then by IP address). This is the command I ran:
$ sort -t '/' -k 2,2n -k1,1n input > output
Output after sort command:
180.0.0.0/10
183.0.0.0/10
183.192.0.0/10
196.64.0.0/10
208.192.0.0/10
219.0.0.0/10
220.0.0.0/10
221.0.0.0/10
221.192.0.0/10
222.0.0.0/10
223.64.0.0/10
1.128.0.0/11
1.224.0.0/11
2.0.0.0/11
2.96.0.0/11
8.224.0.0/11
13.64.0.0/11
14.32.0.0/11
14.64.0.0/11
20.0.0.0/11
23.192.0.0/11 <---
23.32.0.0/11 <---
27.160.0.0/11
27.192.0.0/11
27.64.0.0/11
31.224.0.0/11
35.160.0.0/11
35.224.0.0/11
36.192.0.0/11
37.160.0.0/11
39.32.0.0/11
39.64.0.0/11
The problem is that, within a prefix, it sorts by the first octet only. Any help, or a better way to solve it?
I solved it like this: first change the slash (/) to a dot (.), then sort the list, and finally change the last dot back to a slash so the output has the same format as the original file.
Swap / for . with sed:
sed -e 's/\//./g' input > output
Sort the list on multiple fields with sort (prefix first, then the four octets):
sort -t '.' -k 5,5n -k 1,1n -k 2,2n -k 3,3n -k 4,4n output > output_sorted
Finally, change the last dot (.) back to a slash:
sed 's/\./\//4' output_sorted > origin_sorted_file
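The three steps can also be chained into a single pipeline without intermediate files. This is a rough sketch, not a definitive version: the variable names cidrs and sorted are invented for the example, the sample addresses are picked from the listings above, and tr stands in for the first sed call.

```shell
# A few addresses taken from the listings in the question.
cidrs='23.192.0.0/11
23.32.0.0/11
1.0.4.0/22
1.0.0.0/24
183.0.0.0/10'

# 1. turn "/" into "." so the prefix length becomes field 5
# 2. sort numerically on the prefix, then on each octet
# 3. turn the fourth dot back into "/"
sorted=$(printf '%s\n' "$cidrs" |
  tr '/' '.' |
  sort -t '.' -k 5,5n -k 1,1n -k 2,2n -k 3,3n -k 4,4n |
  sed 's/\./\//4')

printf '%s\n' "$sorted"
```

With this input, 23.32.0.0/11 now comes before 23.192.0.0/11, which is the pair the question marks as out of order.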
With sort from GNU coreutils version 6.10 and newer (*) you can just use something like
sort -t '.' -k 1,1n -k 2,2n -k 3,3n -k 4,4n input > output
With -n, sort ignores trailing characters in a field that are not part of a number, so the "/24" suffix in the fourth field does not get in the way. Note that this orders the list by address only; it does not put the prefix length first.
(*) Note: 6.10 (from 2008) was just the oldest version of sort / coreutils I had access to.
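To see what the per-octet numeric keys buy you, it may help to compare them with a plain byte-wise sort on a few lines from the question. The variable names below are made up for the illustration, and LC_ALL=C is set only to make the plain sort deterministic across systems.

```shell
# Four addresses from the question's input list.
ips='1.1.103.0/24
1.1.20.0/24
1.0.128.0/17
1.1.8.0/24'

# Plain sort compares characters, so "103" sorts before "20" and "8".
lexical=$(printf '%s\n' "$ips" | LC_ALL=C sort)

# Numeric keys compare each octet as a number; the "/24" after the
# last octet is ignored by -n.
numeric=$(printf '%s\n' "$ips" | LC_ALL=C sort -t '.' -k 1,1n -k 2,2n -k 3,3n -k 4,4n)

printf 'lexical:\n%s\n\nnumeric:\n%s\n' "$lexical" "$numeric"
```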

Command line: retrieving specific column from CSV file

I have a CSV file called articles.csv with headers as follows:
article_id, article_title, article_shares, article_date.
The first row of data is found with $ cat articles.csv | sed "1 d", which returns: "895", "Trump, Clinton, America. Who will win, who will lose?", "100", "01/05/2016".
I want to return the fourth column of data (the date of the article), so I use the following code:
$ cat articles.csv | sed "1 d" | cut -d , -f 4
However, I don't get the date; I get America. Who will win instead. How do I get the fourth column, regardless of the fact that some columns contain commas?
A quick and dirty solution:
... | awk -F'",' '{print $4}'
A slow but clean solution:
... | ruby -ne $'require "csv"; print CSV.parse($_)[0][3]'
Note: CSV format should not have spaces between fields, so change your record to:
"895","Trump, Clinton, America. Who will win, who will lose?","100","01/05/2016"
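Another quick option, if every field is wrapped in double quotes and no field contains an escaped quote (both are assumptions about the file, not something a general CSV guarantees), is to split on the quote character itself. The contents of column n then land in awk field 2n, so the fourth column is $8. The variable names row and article_date are invented for this sketch.

```shell
# The data row from the question, with its original spacing.
row='"895", "Trump, Clinton, America. Who will win, who will lose?", "100", "01/05/2016"'

# Split on double quotes: even-numbered fields hold the column values,
# odd-numbered fields hold the separators between them.
article_date=$(printf '%s\n' "$row" | awk -F'"' '{ print $8 }')

printf '%s\n' "$article_date"
```

Because the separators end up in the odd-numbered fields, this works whether or not there are spaces after the commas. For files with embedded quotes or multi-line fields, a real CSV parser is the safer choice.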

Sort lines by group and column

I have a CSV file separated by semicolons, containing lines as shown below. I need to sort it by the first and third columns, respecting the groups of lines defined by the value of the first column.
booke;book;2
booke;booke;1
booke;bookede;6
booke;bookedes;8
booke;booker;4
booke;bookes;7
booke;booket;3
booking;booking;1
booking;bookingen;2
booking;bookingens;3
booking;bookinger;7
booking;bookingerne;5
booking;bookingernes;6
booking;bookingers;8
booking;bookings;4
Expected output:
booke;booke;1
booke;book;2
booke;booket;3
booke;booker;4
booke;bookede;6
booke;bookes;7
booke;bookedes;8
booking;booking;1
booking;bookingen;2
booking;bookingens;3
booking;bookings;4
booking;bookingerne;5
booking;bookingernes;6
booking;bookinger;7
booking;bookingers;8
I tried it with sort -t';' -k3,3n -k1,1 but that sorts the whole file by the third column.
What about using two sorts in a pipeline:
sort -t ';' -k 3,3n | sort -t ';' -k 1,1 -s
The -s in the second command is necessary to enable a stable sort. Otherwise it could destroy the ordering produced by the first command (the third column).
EDIT: however, as @BenjaminW. points out in his comment, you can use multiple -k flags; you only specified them in the wrong order. By performing a sort:
sort -t ';' -k 1,1 -k 3,3n
it takes the first column as the primary sort key and the third as the secondary one.
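Both variants can be checked against a handful of rows from the question. This is a small sketch for illustration: rows, one_pass and two_pass are hypothetical variable names, and LC_ALL=C is set so that the text comparison on the first column behaves the same on every system.

```shell
# A few rows from the question, deliberately out of order.
rows='booke;book;2
booke;booke;1
booking;bookings;4
booking;booking;1
booke;booket;3'

# Single sort: first column as primary key, third column (numeric) as secondary.
one_pass=$(printf '%s\n' "$rows" | LC_ALL=C sort -t ';' -k 1,1 -k 3,3n)

# Two sorts: order by the third column, then stable-sort by the first.
two_pass=$(printf '%s\n' "$rows" |
  LC_ALL=C sort -t ';' -k 3,3n |
  LC_ALL=C sort -s -t ';' -k 1,1)

printf '%s\n' "$one_pass"
```

For this input the two approaches give the same result; the single sort is simply shorter and reads the data once.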

Unix sort command with blank value

I am using the sort command below to sort on two fields in descending order, and the second field can be blank in some cases.
sort -k 1.1,1.2n -brn -k 1.5,1.6 -o
sample data:
112321
112422
112526
1124
112623
output must be as below
1124
112526
112623
112422
112321
Can you please help me out with a solution, thanks!!!!
Do it as two separate commands and concatenate the results.
{ grep -v '^.....' input | sort -k 1.1,1.2n;
grep '^.....' input | sort -k 1.1,1.2n -brn -k 1.5,1.6; } > output
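Here is the same idea run end to end on the sample data, as a self-contained sketch. The temporary file and the variable names input and result are assumptions made for the example; the two grep/sort commands are the ones from the answer.

```shell
# Write the sample data from the question to a temporary file.
input=$(mktemp)
printf '%s\n' 112321 112422 112526 1124 112623 > "$input"

result=$(
  {
    # Lines shorter than five characters have a blank second field.
    grep -v '^.....' "$input" | sort -k 1.1,1.2n
    # The remaining lines are ordered by characters 5-6, numerically, descending.
    grep '^.....' "$input" | sort -k 1.1,1.2n -brn -k 1.5,1.6
  }
)
rm -f "$input"

printf '%s\n' "$result"
```

The key 1.1,1.2n carries its own option, so it does not pick up the global -r; only the second key, which has no options of its own, is sorted in reverse.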

bash sort on column but do not sort same columns

My file contains:
9827259163,0,D
9827961481,0,D
9827202228,0,A
9827529897,5,D
9827529897,0#1#5#8,A
9827700249,0#1,A
9827700249,1#2,D
9883219029,0,A
9861065312,0,A
I want to sort on the first column only; records that have the same value in the first column should not be sorted any further and should keep their original order.
$ sort -t, -k1,1 test
9827202228,0,A
9827259163,0,D
9827529897,0#1#5#8,A
9827529897,5,D
9827700249,0#1,A
9827700249,1#2,D
9827961481,0,D
9861065312,0,A
9883219029,0,A
but what I expect is:
9827202228,0,A
9827259163,0,D
9827529897,5,D
9827529897,0#1#5#8,A
9827700249,0#1,A
9827700249,1#2,D
9827961481,0,D
9861065312,0,A
9883219029,0,A
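There are two records each for 9827529897 and 9827700249, and those should not be sorted any further.
Please suggest a command for the bash shell.
Add the -s option, which makes the sort stable: lines whose keys compare equal keep their input order instead of being compared as whole lines.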
sort -st, -k1,1 test
output:
9827202228,0,A
9827259163,0,D
9827529897,5,D
9827529897,0#1#5#8,A
9827700249,0#1,A
9827700249,1#2,D
9827961481,0,D
9861065312,0,A
9883219029,0,A
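The difference is easy to see on three of the records. This is a minimal sketch: the variable names are made up, LC_ALL=C pins the byte-wise tie-break used by the default sort, and -s is assumed to be available (it is in GNU and BSD sort, but it is not part of POSIX).

```shell
# Two records share the key 9827529897; their input order is 5 first, then 0#1#5#8.
records='9827529897,5,D
9827529897,0#1#5#8,A
9827202228,0,A'

# Default: ties on the key are broken by comparing the whole line.
default_sort=$(printf '%s\n' "$records" | LC_ALL=C sort -t, -k1,1)

# Stable: ties on the key keep their input order.
stable_sort=$(printf '%s\n' "$records" | LC_ALL=C sort -s -t, -k1,1)

printf 'default:\n%s\n\nstable:\n%s\n' "$default_sort" "$stable_sort"
```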

Resources