File lines sorting according to index - bash

I have a file with the structure below. I want to sort its lines by the second column, where columns are delimited by colons, in bash, but I do not even know how to start. Could you help me sort it?
input.txt
a:3:foo
b:1:bar
c:2:goo
output.txt
b:1:bar
c:2:goo
a:3:foo

With sort
sort -nk2 -t: input.txt
Redirect the output to the new file.
sort -nk2 -t: input.txt > output.txt
See man sort
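As a hedged variation on the command above: -k2 on its own starts the key at field 2 and runs to the end of the line, so writing the key as -k2,2n keeps the comparison confined to the second field. The sketch below assumes the sample data from the question; the temp file and the variable names are only for illustration.

```shell
# Scratch copy of the question's sample data (the temp file is hypothetical).
tmp=$(mktemp)
printf '%s\n' 'a:3:foo' 'b:1:bar' 'c:2:goo' > "$tmp"

# -t: sets the field separator.
# -k2,2n starts and ends the key at field 2 and compares it numerically.
sorted=$(sort -t: -k2,2n "$tmp")
printf '%s\n' "$sorted"

rm -f "$tmp"
```

For this input both forms give the same order; the explicit end position matters once later fields could influence the comparison.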

Related

Bash: sort rows within a file by timestamp

I am new to bash scripting and I have written a script to match regex and output lines to print to a file.
However, each line contains multiple columns, one of which is the timestamp column, which appears in the form YYYYMMDDHHMMSSTTT (to millisecond) as shown below.
20180301050630663,ABC,,,,,,,,,,
20180301050630664,ABC,,,,,,,,,,
20180301050630665,ABC,,,,,,,,,,
20180301050630666,ABC,,,,,,,,,,
20180301050630667,ABC,,,,,,,,,,
20180301050630668,ABC,,,,,,,,,,
20180301050630663,ABC,,,,,,,,,,
20180301050630665,ABC,,,,,,,,,,
20180301050630661,ABC,,,,,,,,,,
20180301050630662,ABC,,,,,,,,,,
My code is written as follows:
awk -F "," -v OFS="," '{if($2=="ABC"){print}}' < $i >> "$filename"
How can I modify my code such that it can sort the rows by timestamp (YYYYMMDDHHMMSSTTT) in ascending order before printing to file?
You can use a very simple sort command, e.g.
sort yourfile
If you want to ensure sort only looks at the timestamp, you can tell sort to use only the first comma-separated field as the sorting criterion, e.g.
sort -t, -k1,1 yourfile
Example Use/Output
With your data saved in a file named log, you could do:
$ sort -t, -k1,1 log
20180301050630661,ABC,,,,,,,,,,
20180301050630662,ABC,,,,,,,,,,
20180301050630663,ABC,,,,,,,,,,
20180301050630663,ABC,,,,,,,,,,
20180301050630664,ABC,,,,,,,,,,
20180301050630665,ABC,,,,,,,,,,
20180301050630665,ABC,,,,,,,,,,
20180301050630666,ABC,,,,,,,,,,
20180301050630667,ABC,,,,,,,,,,
20180301050630668,ABC,,,,,,,,,,
Let me know if you have any problems.
Just add a pipeline.
awk -F "," '$2=="ABC"' < "$i" |
sort -n >> "$filename"
In the general case, to sort on column 234, try sort -t, -k234,234n
Notice also the quoting around "$i", like you already have around "$filename", and the simplification of the Awk script.
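The pipeline above can be wrapped up as a small function. This is only a sketch: the function name filter_abc_sorted and the three-row sample are made up for illustration, and the key is written as -k1,1n so that only the timestamp field is compared.

```shell
# Hypothetical helper: print rows of file $1 whose 2nd field is ABC,
# ordered by the timestamp in field 1.
filter_abc_sorted() {
    awk -F, '$2 == "ABC"' "$1" | sort -t, -k1,1n
}

# Small made-up sample in the same shape as the question's data.
log=$(mktemp)
printf '%s\n' \
    '20180301050630663,ABC,x' \
    '20180301050630661,XYZ,y' \
    '20180301050630662,ABC,z' > "$log"

result=$(filter_abc_sorted "$log")
printf '%s\n' "$result"
rm -f "$log"
```

In the original loop this would be called as filter_abc_sorted "$i" >> "$filename".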
If you are using gawk you can do:
$ awk -F "," -v OFS="," '$2=="ABC"{a[$1]=$0} # Keep lines whose second field is "ABC"; a repeated timestamp overwrites the earlier line
END{ # set the traversal order
PROCINFO["sorted_in"] = "@ind_num_asc"
for (e in a) print a[e] # traverse the array of lines
}' file
An alternative is to use sed and sort:
sed -n '/^[0-9]*,ABC,/p' file | sort -t, -k1 -n
Keep in mind that both of these methods are unrelated to the shell used. Bash is just executing the tools (sed, awk, sort, etc) that are otherwise part of the OS.
Bash itself could do the sort in pure Bash but it would be long and slow.

bash sort a list starting at the end of each line

I have a file containing file paths and filenames that I want to sort starting at the end of the string.
My file contains a list such as below:
/Volumes/Location/Workers/Andrew/2015-08-12_Andrew_PC/DOCS/3177109.doc
/Volumes/Location/Workers/Andrew/2015-09-17_Andrew_PC/DOCS/2130419.doc
/Volumes/Location/Workers/Bill/2016-03-17_Bill_PC/DOCS/1998816.doc
/Volumes/Location/Workers/Charlie/2016-07-06_Charlie_PC/DOCS/4744123.doc
I want to sort this list such that the filenames will be sequential, this will help find duplicates based on filename regardless of path.
The list should appear like this:
/Volumes/Location/Workers/Bill/2016-03-17_Bill_PC/DOCS/1998816.doc
/Volumes/Location/Workers/Andrew/2015-09-17_Andrew_PC/DOCS/2130419.doc
/Volumes/Location/Workers/Andrew/2015-08-12_Andrew_PC/DOCS/3177109.doc
/Volumes/Location/Workers/Charlie/2016-07-06_Charlie_PC/DOCS/4744123.doc
Here's a way to do this:
sed -e 's|^.*/\(.*\)$|\1\t\0|' list.txt | sort | cut -f 2-
This uses sed to insert a copy of the filename to the beginning of each line so that we can sort the list with sort. Then we remove the stuff that we added in the first step.
This should work as long as every path has the same number of components:
sort -t/ -k7 input_file
The following sorts on the last /-separated field, however many fields each line has.
First it prepends the last field to the start of each line and then sorts. The prepended field is then removed by the second awk. Note that this assumes the paths contain no spaces.
awk -F'/' '{ $0= $NF " " $0;print $0 |"sort -k1"}' file |awk '{print $2}'
/Volumes/Location/Workers/Bill/2016-03-17_Bill_PC/DOCS/1998816.doc
/Volumes/Location/Workers/Andrew/2015-09-17_Andrew_PC/DOCS/2130419.doc
/Volumes/Location/Workers/Andrew/2015-08-12_Andrew_PC/DOCS/3177109.doc
/Volumes/Location/Workers/Charlie/2016-07-06_Charlie_PC/DOCS/4744123.doc
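The decorate, sort, undecorate idea used in the answers above can also be written with a tab as the temporary separator, which tolerates spaces in paths and paths of different depths. A minimal sketch, assuming no path contains a tab character; the function name sort_by_basename and the sample paths are hypothetical, and LC_ALL=C is set only to make the ordering byte-wise and predictable.

```shell
# Hypothetical helper: sort the lines of file $1 by their last /-separated field.
sort_by_basename() {
    # Decorate: prepend the file name and a tab to each line.
    awk -F/ '{ printf "%s\t%s\n", $NF, $0 }' "$1" |
        # Sort on the prepended file name only.
        LC_ALL=C sort -t "$(printf '\t')" -k1,1 |
        # Undecorate: drop the prepended field again.
        cut -f2-
}

# Made-up paths with differing depths.
list=$(mktemp)
printf '%s\n' \
    '/a/b/3177109.doc' \
    '/x/1998816.doc' \
    '/p/q/r/2130419.doc' > "$list"

by_name=$(sort_by_basename "$list")
printf '%s\n' "$by_name"
rm -f "$list"
```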

Sort text file with cat and sort concatenation

I got a txt file with some content looking like
stuff,stuff,2012-12-12
morestuff,morestuff,2012-09-09
evenmorestuff,yeah,2012-08-02
and I want to use cat and sort to get them reverse ordered by the date as an output on my command-line by concatenation.
Not sure why you think you need to cat a file into sort, but here are 2 options:
cat yourFile | sort -t, -k3r
sort -t, -k3r yourFile
To test this I did
echo "stuff,stuff,2012-12-12
morestuff,morestuff,2012-09-09
evenmorestuff,yeah,2012-08-02" \
| sort -t, -k3r
output
stuff,stuff,2012-12-12
morestuff,morestuff,2012-09-09
evenmorestuff,yeah,2012-08-02
And finally, you can overwrite your existing file using the -o option like
sort -t, -o yourFile -k3r yourFile
Thanks to @karakfa for reminding me of your requirement for a reverse-order sort. This is accomplished by adding an r to the key specification, hence -k3r.
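Because the sample above is already in descending date order, a shuffled copy makes the effect of the r modifier easier to see. This sketch assumes the dates stay in YYYY-MM-DD form, so a plain string comparison orders them chronologically; the variable name reversed is illustrative, and LC_ALL=C is set only for predictable byte-wise comparison.

```shell
# Shuffled copy of the sample rows, sorted on field 3 in reverse order.
reversed=$(printf '%s\n' \
    'evenmorestuff,yeah,2012-08-02' \
    'stuff,stuff,2012-12-12' \
    'morestuff,morestuff,2012-09-09' |
    LC_ALL=C sort -t, -k3,3r)
printf '%s\n' "$reversed"
```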
IHTH

Sorting through shell, awk, bash?

I am trying to learn bash/shell *nix commands /scripting.
So rather than writing a python program, I thought of trying it out using bash/awk etc but am having a hard time.
I have a huge text file (it's actually CSV):
id_1, id_2, some attributes.
I want to sort this file based on id_2.
How do I do this?
Thanks
Use the --key option for sort.
For example, the following sorts input.csv on the second field (using comma as a field separator) and writes the output to output.csv.
sort --key=2,2 -t',' input.csv > output.csv
p.s. Don't forget to use the -n option if you're doing a numerical sort.
For more info, see the man page for sort.
You can use -k option of sort(1)
-k, --key=POS1[,POS2]
start a key at POS1, end it at POS2 (origin 1)
sort -t, -k2 filename.csv
I don't have a shell to verify, but basically you need to specify the separator and the sort key
Check out the cut command:
cat file.csv | cut -d";" -f 2 | sort
I assumed your csv is semicolon-separated, but you can change the delimiter. Note that this prints only the sorted second column, not the whole rows.
Save under a different name:
cat file.csv | cut -d";" -f 2 | sort > newfile.txt
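To illustrate why the -n option mentioned above matters, here is a small sketch comparing a string sort with a numeric sort on the second field. The three rows are made up and the ids are assumed to be plain integers; LC_ALL=C is used for the string sort only so that the result is byte-wise and predictable.

```shell
# Made-up rows: id_1,id_2,attribute
csv=$(mktemp)
printf '%s\n' '10,30,x' '11,4,y' '12,200,z' > "$csv"

# String comparison on field 2: "200" sorts before "30", which sorts before "4".
lexical=$(LC_ALL=C sort -t, -k2,2 "$csv")

# Numeric comparison on field 2: 4, 30, 200.
numeric=$(sort -t, -k2,2n "$csv")

printf 'lexical:\n%s\nnumeric:\n%s\n' "$lexical" "$numeric"
rm -f "$csv"
```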

Sorting with unix tools and multiple columns

I am looking for the easiest way to solve this problem. I have a huge data set that I cannot load into Excel, in this format:
This is a sentence|10
This is another sentence|5
This is the last sentence|20
What I want to do is sort this from least to greatest based on the number.
cat MyDataSet.txt | tr "|" "\t" | ???
Not sure what the best way is to do this. I was thinking about using awk to switch the columns and then do a sort, but I was having trouble doing it.
Help me out please
sort -t\| -k2n dataset.txt
That should do it: -t sets the field separator and -k selects the sort key.
You usually don't need cat to send the file to a filter. That said, you can use the sort filter.
sort -t "|" -k 2 -n MyDataSet.txt
This sorts the MyDataSet.txt file using the | character as field separator and sorting numerically according to the second field (the number).
Have you tried sort -n?
$ sort -n inputFile
This is another sentence|5
This is a sentence|10
This is the last sentence|20
you could switch the columns with awk too
$ awk -F"|" '{print $2"|"$1}' inputFile
10|This is a sentence
5|This is another sentence
20|This is the last sentence
combining awk and sort:
$ awk -F"|" '{print $2"|"$1}' inputFile | sort -n
5|This is another sentence
10|This is a sentence
20|This is the last sentence
Per the comments, if you have numbers in the sentence, specify the separator and the key explicitly:
$ sort -n -t"|" -k2 inputFile
This is another sentence|5
This is a sentence|10
This is the last sentence|20
this is a sentence with a number in it 2|22
and of course you could redirect it to a new file:
$ awk -F"|" '{print $2"|"$1}' inputFile | sort -n > outFile
Try this sort command:
sort -n -t '|' -k2 file.txt
Sort by number, change the separator and grab the second group using sort.
sort -n -t'|' -k2 dataset.txt
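The answers above converge on the same command; a compact sketch with the key limited to the second field (-k2,2n) is shown here, using the sample lines from this thread in shuffled order. The temp file and the variable name by_number are only for illustration.

```shell
# Sample lines from the thread, including one with a digit inside the sentence.
data=$(mktemp)
printf '%s\n' \
    'This is a sentence|10' \
    'this is a sentence with a number in it 2|22' \
    'This is another sentence|5' \
    'This is the last sentence|20' > "$data"

# Quote the pipe so the shell does not treat it as a pipeline operator.
by_number=$(sort -t '|' -k2,2n "$data")
printf '%s\n' "$by_number"
rm -f "$data"
```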

Resources