bash sort on column but do not sort same columns - bash

My file contains:
9827259163,0,D
9827961481,0,D
9827202228,0,A
9827529897,5,D
9827529897,0#1#5#8,A
9827700249,0#1,A
9827700249,1#2,D
9883219029,0,A
9861065312,0,A
I want to sort it on the first column only. If records have the same value in the first column, they should not be sorted any further and should keep their original relative order. This is what I get:
$ sort -t, -k1,1 test
9827202228,0,A
9827259163,0,D
9827529897,0#1#5#8,A
9827529897,5,D
9827700249,0#1,A
9827700249,1#2,D
9827961481,0,D
9861065312,0,A
9883219029,0,A
But what I expect is:
9827202228,0,A
9827259163,0,D
9827529897,5,D
9827529897,0#1#5#8,A
9827700249,0#1,A
9827700249,1#2,D
9827961481,0,D
9861065312,0,A
9883219029,0,A
There are two records each for 9827529897 and 9827700249, so those records should not be reordered relative to each other.
Please suggest a command for the bash shell.

Add the -s option, which makes the sort stable: records that compare equal on the key keep their input order.
sort -st, -k1,1 test
Output:
9827202228,0,A
9827259163,0,D
9827529897,5,D
9827529897,0#1#5#8,A
9827700249,0#1,A
9827700249,1#2,D
9827961481,0,D
9861065312,0,A
9883219029,0,A
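The effect of -s can be seen side by side in a minimal sketch. It assumes a sort that supports -s (GNU coreutils does), uses three rows copied from the question, and sets LC_ALL=C only to keep the tie-breaking comparison predictable. The variable names are illustrative.

```shell
# Three rows from the question; two of them share the key 9827529897.
data='9827529897,5,D
9827529897,0#1#5#8,A
9827202228,0,A'

# Without -s, rows with equal keys are reordered by a whole-line comparison.
unstable=$(printf '%s\n' "$data" | LC_ALL=C sort -t, -k1,1)

# With -s, rows with equal keys keep their input order.
stable=$(printf '%s\n' "$data" | LC_ALL=C sort -s -t, -k1,1)

printf 'without -s:\n%s\n\nwith -s:\n%s\n' "$unstable" "$stable"
```

In the second result the 5,D row stays ahead of the 0#1#5#8,A row, as it was in the input.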

Related

Sort by multiple conditions ascending and descending in bash

I have the following issue. I have a file containing name,surname,age,mood. I need to sort this file by age (descending). If the age is the same, then sort by surname (ascending).
I use this: cat $FILE |sort -r -n -t"," -k3,3 -k2,2 > "$HOME"/people.txt, but -r sorts both keys in descending order. How can I sort by surname ascending, please?
By default, sort orders in ascending order, and the -r flag reverses that. The -r flag can also be attached to an individual -k key, which lets you mix ascending and descending keys, e.g.:
$ cat raw.dat
1,2,4,5
1,2,7,5
1,2,9,5
1,2,3,5
1,3,7,5
1,1,7,5
Sort by column #3 (descending) and then column #2 (ascending):
$ sort -t"," -k3,3nr -k2,2n raw.dat
1,2,9,5
1,1,7,5
1,2,7,5
1,3,7,5
1,2,4,5
1,2,3,5
NOTES:
Thanks to Ted Lyngmo for adding the n flag to handle numeric values properly.
If the data could contain a mix of characters and numerics, the n flag may need to be replaced depending on the desired sort method (e.g., V for version sort).
The key takeaway is that quite a few of the sort flags can be applied to an individual -k key.
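Applied to the layout described in the question (name,surname,age,mood), the per-key flags might look like the sketch below. The four rows are made-up sample data, and LC_ALL=C is set only to keep the surname comparison predictable.

```shell
# Hypothetical rows in the question's layout: name,surname,age,mood
people='Ann,Young,30,happy
Bob,Adams,25,sad
Cid,Baker,30,calm
Dee,Clark,25,glad'

# Key 1: field 3, numeric and reversed (n and r apply to this key only).
# Key 2: field 2, plain ascending comparison.
sorted=$(printf '%s\n' "$people" | LC_ALL=C sort -t, -k3,3nr -k2,2)

printf '%s\n' "$sorted"
```

Because the key on field 2 carries no r flag and no global -r is given, surnames stay in ascending order within each age.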

Sorting multiple columns in ascending order

Source:
10,10,7.17,1.077383,0.00428382
10,12,7.45,1.177068,0.00390197
10,4,6.86,1.184806,0.00489828
10,6,6.98,1.106846,0.00463645
10,8,7.09,1.106254,0.00451672
12,10,6.71,1.224453,0.00506310
12,12,6.96,1.141856,0.00446641
12,4,6.41,1.510563,0.00590838
12,6,6.51,1.187841,0.00548915
12,8,6.62,1.217152,0.00532222
Desired result
10,4,6.86,1.184806,0.00489828
10,6,6.98,1.106846,0.00463645
10,8,7.09,1.106254,0.00451672
10,10,7.17,1.077383,0.00428382
10,12,7.45,1.177068,0.00390197
12,4,6.41,1.510563,0.00590838
12,6,6.51,1.187841,0.00548915
12,8,6.62,1.217152,0.00532222
12,10,6.71,1.224453,0.00506310
12,12,6.96,1.141856,0.00446641
How do I sort the CSV on the first two columns so that I get the desired result in ascending order?
10,4
10,6
10,8
10,12
sort -k1,2 -n -t, didn't work as expected
10,4,6.86,1.184806,0.00489828
10,6,6.98,1.106846,0.00463645
10,8,7.09,1.106254,0.00451672
12,4,6.41,1.510563,0.00590838
12,6,6.51,1.187841,0.00548915
12,8,6.62,1.217152,0.00532222
You can see that 10,10,7.17,1.077383,0.00428382 is missing
sort -k1,1 -k2,2 -n -t, worked fine
More info : https://unix.stackexchange.com/questions/78925/how-to-sort-by-multiple-columns
To answer your question you should use this:
sort -t, -k1,1n -k2,2n yourFile.csv
The problem with your command is that -k1,2 defines a single key that runs from the start of field 1 to the end of field 2, so both fields are considered together (e.g. 10,10 and 10,12) instead of one after the other. Writing it as -k1,2n does not solve this, and the result may also depend on your locale.
If you try
LC_ALL=C sort -t, -k1,2n yourFile.csv
you will get something like:
10,10,7.17,1.077383,0.00428382
10,12,7.45,1.177068,0.00390197
10,4,6.86,1.184806,0.00489828
10,6,6.98,1.106846,0.00463645
10,8,7.09,1.106254,0.00451672
12,10,6.71,1.224453,0.00506310
12,12,6.96,1.141856,0.00446641
12,4,6.41,1.510563,0.00590838
12,6,6.51,1.187841,0.00548915
12,8,6.62,1.217152,0.00532222
(ordered by first two fields 'concatenated').
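The difference between one spanning key and two separate keys can be reproduced on a few rows. This is a small sketch using shortened rows based on the question's data, with LC_ALL=C so that ties fall back to a byte-wise comparison; the variable names are illustrative.

```shell
# Shortened rows based on the question's data.
rows='10,10,7.17
10,4,6.86
12,8,6.62
10,8,7.09
12,4,6.41'

# One key spanning fields 1-2: only the leading number (10 or 12) is compared
# numerically, and ties fall back to a byte-wise comparison of the whole line.
spanning=$(printf '%s\n' "$rows" | LC_ALL=C sort -t, -k1,2n)

# Two separate numeric keys: field 1 first, then field 2.
separate=$(printf '%s\n' "$rows" | LC_ALL=C sort -t, -k1,1n -k2,2n)

printf 'spanning key:\n%s\n\nseparate keys:\n%s\n' "$spanning" "$separate"
```

With the spanning key, 10,10 lands before 10,4; with separate keys it comes after 10,8.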

Sort lines by group and column

I have a CSV file separated by semicolons, which contains lines as shown below. I need to sort it by the first and third columns, respecting the groups of lines defined by the value of the first column.
booke;book;2
booke;booke;1
booke;bookede;6
booke;bookedes;8
booke;booker;4
booke;bookes;7
booke;booket;3
booking;booking;1
booking;bookingen;2
booking;bookingens;3
booking;bookinger;7
booking;bookingerne;5
booking;bookingernes;6
booking;bookingers;8
booking;bookings;4
Expected output:
booke;booke;1
booke;book;2
booke;booket;3
booke;booker;4
booke;bookede;6
booke;bookes;7
booke;bookedes;8
booking;booking;1
booking;bookingen;2
booking;bookingens;3
booking;bookings;4
booking;bookingerne;5
booking;bookingernes;6
booking;bookinger;7
booking;bookingers;8
I tried it with sort -t; -k3,3n -k1,1 but the result is sorted by the third column across the entire file.
What about using two sorts in a pipeline:
sort -t ';' -k 3,3n | sort -t ';' -k 1,1 -s
The -s in the second sort is necessary to make that sort stable. Otherwise it could destroy the ordering produced by the first sort (on the third column).
EDIT: however, as @BenjaminW. points out in his comment, you can use multiple -k flags; you only specified them in the wrong order. By performing a sort:
sort -t ';' -k 1,1 -k 3,3n
it takes the first column as the primary sort key and the third as the secondary one.
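Both approaches can be checked against each other on a handful of rows. The sketch below uses five rows taken from the question, assumes a sort that supports -s, and sets LC_ALL=C for predictable comparisons; the variable names are illustrative.

```shell
# Five rows taken from the question, in scrambled order.
words='booking;bookings;4
booke;booker;4
booking;booking;1
booke;book;2
booke;booke;1'

# Two passes: sort on column 3 first, then a stable sort on column 1.
two_pass=$(printf '%s\n' "$words" \
  | LC_ALL=C sort -t ';' -k 3,3n \
  | LC_ALL=C sort -t ';' -k 1,1 -s)

# One pass: column 1 as primary key, column 3 (numeric) as secondary key.
one_pass=$(printf '%s\n' "$words" | LC_ALL=C sort -t ';' -k 1,1 -k 3,3n)

printf '%s\n' "$one_pass"
[ "$two_pass" = "$one_pass" ] && echo "both approaches agree"
```

The single invocation is usually the simpler choice, since it reads the input once and does not depend on -s.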

awk for sort lines in file

I have a file that needs to be sorted on a fixed-width column, i.e. characters 5 to 10 of each line.
example file:
0120456789bcdc hsdsjjlofk
01204567-9 __abc __hsdsjjjiejks
01224-6777 abcddd hsdsjjjpsdpf
012645670- abccccd hsdsjjjopp
I tried awk -v FIELDWIDTHS="4 10" '{print|"$2 sort -n"}' file but it does not give proper output.
You can use sort for this; the key -k1.5,1.10 runs from character 5 to character 10 of the first field:
$ sort -k1.5,1.10 file
01224-6777 abcddd hsdsjjjpsdpf
01204567-9 __abc __hsdsjjjiejks
012645670- abccccd hsdsjjjopp
0120456789bcdc hsdsjjlofk
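To see why the lines come out in this order, it can help to print the six key characters next to the sorted result. The sketch below reuses the example file's lines; forcing LC_ALL=C is an assumption made here so that the hyphen sorts before the digits by byte value, which is the order shown above.

```shell
# The example file's lines.
lines='0120456789bcdc hsdsjjlofk
01204567-9 __abc __hsdsjjjiejks
01224-6777 abcddd hsdsjjjpsdpf
012645670- abccccd hsdsjjjopp'

# Characters 5-10 of each line, i.e. the text that the sort key covers.
keys=$(printf '%s\n' "$lines" | cut -c5-10)

# Sort on characters 5 through 10 of field 1, comparing bytes (C locale).
sorted=$(printf '%s\n' "$lines" | LC_ALL=C sort -k1.5,1.10)

printf 'keys:\n%s\n\nsorted:\n%s\n' "$keys" "$sorted"
```

In other locales punctuation may be weighted differently, so the relative order of keys that contain a hyphen can change.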

How to alphabetize all of the rows in a CSV file, column by column?

I have a three-column CSV file, like this:
chips#food#f
pizza#food#f
tiger#animal#a
fish#animal#a
marshmallow#food#f
New Years#festivals#f
I need to alphabetize the rows, first by column 3, then column 2, then column 1. The output would be:
fish#animal#a
tiger#animal#a
New Years#festivals#f
chips#food#f
marshmallow#food#f
pizza#food#f
How can I sort the data in this manner?
Some columns contain UTF-8 data.
You can try the sort command:
$ sort -t# -k3,3 -k2,2 -k1,1 input.csv
fish#animal#a
tiger#animal#a
New Years#festivals#f
chips#food#f
marshmallow#food#f
pizza#food#f
