How to sort a file based on its last column in Unix using the sort command?

a 1
b 2 4
c 3
d 4 5 7
e 4 6
f 5
How can we use sort to print the output below, in which the lines are ordered by the last column?
a 1
c 3
b 2 4
f 5
e 4 6
d 4 5 7
We can achieve the result using awk:
$ awk '{print $NF,$0}' file.txt | sort -n | cut -f2- -d' '
a 1
c 3
b 2 4
f 5
e 4 6
d 4 5 7

Could you please try the following and let me know if it helps?
rev Input_file | sort -nk1.1 | rev
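Because rev also reverses the digits inside each number, the rev pipeline can mis-order values with more than one digit (10 becomes 01). A decorate-sort-undecorate pipeline avoids that. The following is only a minimal sketch: the function name sort_by_last_column is hypothetical, and it assumes the input lines contain no tab characters, since a tab separates the temporary key from the line.

```shell
# Sketch: sort lines numerically by their last whitespace-separated field.
# sort_by_last_column is a hypothetical helper name, not a standard tool.
sort_by_last_column() {
    # 1. prefix each line with its last field and a tab (decorate)
    # 2. sort numerically on that first field only
    # 3. drop the prefix again (cut splits on tabs by default)
    awk '{ print $NF "\t" $0 }' "$@" | sort -k1,1n | cut -f2-
}

printf '%s\n' 'a 10' 'b 2 4' 'c 3' | sort_by_last_column
```

Lines whose last fields compare equal fall back to sort's whole-line comparison, so their relative order may differ from the input order.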

Related

Finding a pattern, then executing a line change only after the pattern

I have a file of the like:
H 1 2 3 4
H 1 2 3 4
C 1 2 3 4
$END
$EFRAG
COORD=CART
FRAGNAME=H2ODFT
O 1 2 3 4
H 1 2 3 4
H 1 2 3 4
FRAGNAME=H2ODFT
O 1 2 3 4
H 1 2 3 4
H 1 2 3 4
I want to remove the column "1" only from the lines after the $EFRAG line, and also add a label to the O, H, H atoms. My expected output is:
H 1 2 3 4
H 1 2 3 4
C 1 2 3 4
$END
$EFRAG
COORD=CART
FRAGNAME=H2ODFT
Oa 2 3 4
Hb 2 3 4
Hc 2 3 4
FRAGNAME=H2ODFT
Oa 2 3 4
Hb 2 3 4
Hc 2 3 4
I'm new to coding in bash, and I'm not quite sure where to start.
I was thinking of piping a grep command to a sed command, but I'm not sure how that syntax would look. I'm also trying to learn awk, but that syntax is even more confusing to me. I'm currently reading a book on its capabilities.
Any thoughts or ideas would be greatly appreciated!
Use the following awk processing:
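awk '$0 ~ /\$EFRAG/ {
    start = 1                  # marker denoting the needed block
    split("a b c", suf)        # auxiliary array of suffixes
}
start {
    if (/^FRAGNAME/) idx = 1   # a new sub-block restarts the suffixes
    if (/^[OH]/) {             # line starts with O or H
        $2 = ""                # blank the second column
        $1 = $1 suf[idx++]     # append the suffix to the atom label
        sub(/ +/, " ")         # collapse the gap left by the emptied column
    }
}1' test.txt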
H 1 2 3 4
H 1 2 3 4
C 1 2 3 4
$END
$EFRAG
COORD=CART
FRAGNAME=H2ODFT
Oa 2 3 4
Hb 2 3 4
Hc 2 3 4
FRAGNAME=H2ODFT
Oa 2 3 4
Hb 2 3 4
Hc 2 3 4
If ed is available and acceptable, put something like the following in a script file, script.ed (name it whatever you like):
/\$EFRAG$/;$g/^O /s/^\([^ ]*\) [^ ]* \(.*\)$/\1a \2/\
/^H /s/^\([^ ]*\) [^ ]* \(.*\)$/\1b \2/\
/^H /s/^\([^ ]*\) [^ ]* \(.*\)$/\1c \2/
,p
Q
Now run
ed -s file.txt < script.ed
Change Q to w if in-place editing is required.
Remove the ,p to silence the output.
This might work for you (GNU sed):
sed -E '1,/\$EFRAG/b;/^O/{N;N;s/^(O) \S+(.*\nH) \S+(.*\nH) \S+/\1a\2b\3c/}' file
Do not process lines from the start of the file until after encountering one containing $EFRAG.
If a line begins O, append the next two lines and then using pattern matching and back references, format those lines accordingly.
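For readers who find the one-liners hard to follow, the same idea can be spelled out step by step in awk. This is a minimal sketch under stated assumptions: the function name relabel_after_efrag is hypothetical, each fragment is assumed to hold at most three O/H lines (labels a, b, c), and fields are assumed to be separated by single spaces.

```shell
# Sketch: after the $EFRAG line, drop column 2 from O/H lines and
# append a, b, c to the atom label, restarting at each FRAGNAME line.
# relabel_after_efrag is a hypothetical helper name.
relabel_after_efrag() {
    awk '
        /\$EFRAG/           { seen = 1 }   # lines after this one are eligible
        seen && /^FRAGNAME/ { idx = 0 }    # a new fragment restarts the labels
        seen && /^[OH] / {
            $1 = $1 substr("abc", ++idx, 1)      # O -> Oa, H -> Hb, H -> Hc
            line = $1                            # rebuild the line without field 2
            for (i = 3; i <= NF; i++) line = line " " $i
            $0 = line
        }
        { print }
    ' "$@"
}

relabel_after_efrag <<'EOF'
C 1 2 3 4
$END
$EFRAG
FRAGNAME=H2ODFT
O 1 2 3 4
H 1 2 3 4
H 1 2 3 4
EOF
```

Rebuilding the line from the remaining fields, rather than assigning an empty string to the second field, keeps a single space between the label and the next column.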

How to align rows from two different files by matching values?

I need help aligning two files by matching the values in column 2 of file 1 against column 1 of file 2.
file 1:
1 d 3
2 e 4
5 o 1
file 2:
e 6
o 5
d 8
I want to get
1 d 3 d 8
2 e 4 e 6
5 o 1 o 5
Try using the join command:
join -1 2 -2 1 -o 1.1,1.2,1.3,2.1,2.2 <(sort -k2,2 file1) <(sort -k1,1 file2)
output:
1 d 3 d 8
2 e 4 e 6
5 o 1 o 5
join requires both inputs to be sorted on the join field (column 2 of file1, column 1 of file2). Yours weren't, so the command sorts them on the fly.
If both files have exactly the same keys (and number of lines), you can use paste:
paste -d' ' <(sort -k2,2 file1) <(sort -k1,1 file2)
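If sorting is not wanted, an awk lookup table is another option and keeps the rows in file1's original order. This is a minimal sketch: the function name align_by_key is hypothetical, and it assumes the keys in the second file are unique and that the second file fits in memory.

```shell
# Sketch: for each line of the first file, append the line of the second
# file whose column 1 equals the first file's column 2. No sorting needed.
# align_by_key is a hypothetical helper name.
align_by_key() {
    # awk reads "$2" first (NR == FNR) and indexes it, then scans "$1"
    awk 'NR == FNR { row[$1] = $0; next }
         $2 in row  { print $0, row[$2] }' "$2" "$1"
}

dir=$(mktemp -d)
printf '%s\n' '1 d 3' '2 e 4' '5 o 1' > "$dir/file1"
printf '%s\n' 'e 6' 'o 5' 'd 8' > "$dir/file2"
align_by_key "$dir/file1" "$dir/file2"
rm -r "$dir"
```

Lines of the first file with no matching key are silently dropped, which mirrors the default inner-join behaviour of join.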

unix command: how to get top n records

I want to get the top n records using a Unix command:
e.g.
input:
1 a
2 b
3 c
4 d
5 e
output(get top 3):
5 e
4 d
3 c
Currently I am doing:
cat myfile.txt | sort -k1nr | head -3 > my_output.txt
It works fine but when the file gets large, it becomes very slow.
It is slow because it sorts the file completely, while what I need is just the top 3 records.
Is there any command I can use to get the top 3 records?
perl -ane '
BEGIN {@top = ([-1]) x 3}
if ($F[0] > $top[0][0]) {
    @top = sort {$a->[0] <=> $b->[0]} @top[1,2], [$F[0], $_];
}
END {print for reverse map {$_->[1]} @top}
' << END_DATA
1 a
2 b
3 c
4 d
5 e
END_DATA
5 e
4 d
3 c
Have you tried changing the order of your command?
Like this.
sort -k1nr myfile.txt | head -3 > my_output.txt
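A single pass that keeps only the n best records seen so far avoids sorting the whole file. The following awk version is a rough sketch rather than a tuned solution: the function name top_n is hypothetical, the key is assumed to be a number in column 1, and n is assumed to be small, because each record is compared against up to n stored entries.

```shell
# Sketch: print the n records with the largest numeric first column,
# largest first, reading the input only once.
# top_n is a hypothetical helper name. Usage: top_n N [file...]
top_n() {
    n=$1; shift
    awk -v n="$n" '
        {
            key = $1 + 0; line = $0
            # walk the descending list; insert the record where it belongs
            # and carry the displaced entry down to the next slot
            for (i = 1; i <= n; i++) {
                if (i > count || key > keys[i]) {
                    tk = keys[i]; tl = lines[i]
                    keys[i] = key; lines[i] = line
                    if (i > count) { count = i; break }   # filled a free slot
                    key = tk; line = tl
                }
            }
        }
        END { for (i = 1; i <= count; i++) print lines[i] }
    ' "$@"
}

printf '%s\n' '1 a' '2 b' '3 c' '4 d' '5 e' | top_n 3
```

Memory use stays proportional to n rather than to the file size, and records with equal keys keep their input order.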

Add to the end of a predetermined line using sed in bash

I have a file in the format:
C 1 1 2
H 2 2 1
C 3 1 2
C 3 3 2
H 2 3 1
I need to add " f" to the end of specific lines, for example the third line, so the output would be:
C 1 1 2
H 2 2 1
C 3 1 2 f
C 3 3 2
H 2 3 1
From Googling, it seems that I need to use sed, but I couldn't find any examples on how to do specifically what I want.
Thanks in advance.
You are looking for this article on sed. Specifically, the section on restricting to a line number. An example:
sed '3 s/$/ f/' < yourFile
awk 'NR==3{$0=$0" f"}1' your_file
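When the line number or the text to append comes from a variable, passing both to awk with -v avoids quoting problems inside a sed expression. This is a small sketch; the function name append_to_line is hypothetical, and it assumes the suffix contains no backslashes, since awk interprets escape sequences in -v values.

```shell
# Sketch: append SUFFIX to line LINE_NUMBER of the input.
# append_to_line is a hypothetical helper name.
# Usage: append_to_line LINE_NUMBER SUFFIX [file]
append_to_line() {
    line_no=$1; suffix=$2; shift 2
    awk -v n="$line_no" -v suffix="$suffix" '
        NR == n { $0 = $0 suffix }   # only the chosen line is changed
        { print }
    ' "$@"
}

printf '%s\n' 'C 1 1 2' 'H 2 2 1' 'C 3 1 2' 'C 3 3 2' 'H 2 3 1' | append_to_line 3 ' f'
```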

Split specific column(s)

I have this kind of records:
1 2 12345
2 4 98231
...
I need to split the third column into sub-columns to get this (separated by single-space for example):
1 2 1 2 3 4 5
2 4 9 8 2 3 1
Can anybody offer me a nice solution in sed, awk, etc.? Thanks!
EDIT: the size of the original third column may vary record by record.
Awk
% echo '1 2 12345
2 4 98231
...' | awk '{
gsub(/./, "& ", $3)
print
}
'
1 2 1 2 3 4 5
2 4 9 8 2 3 1
...
[Tested with GNU Awk 3.1.7]
This takes every character (/./) in the third column ($3) and replaces (gsub()) it with itself followed by a space ("& ") before printing the entire line.
Sed solution:
sed -e 's/\([0-9]\)/\1 /g' -e 's/ \+/ /g'
The first sed expression replaces every digit with the same digit followed by a space. The second expression replaces every block of spaces with a single space, thus handling the double spaces introduced by the previous expression. With non-GNU seds you may need to use two sed invocations (one for each -e).
Using awk substr and printf:
[srikanth@myhost ~]$ cat records.log
1 2 12345 6 7
2 4 98231 8 0
[srikanth@myhost ~]$ awk '{ len=length($3); for(i=1; i<=NF; i++) { if(i==3) { for(j = 1; j <= len; j++){ printf "%s ", substr($3,j,1); } } else { printf "%s ", $i; } } printf("\n"); }' records.log
1 2 1 2 3 4 5 6 7
2 4 9 8 2 3 1 8 0
You can use this for more than three column records as well.
Using perl:
perl -pe 's/([0-9])(?! )/$1 /g' INPUT_FILE
Test:
[jaypal:~/Temp] cat tmp
1 2 12345
2 4 98231
[jaypal:~/Temp] perl -pe 's/([0-9])(?! )/$1 /g' tmp
1 2 1 2 3 4 5
2 4 9 8 2 3 1
Using gnu sed:
sed 's/[0-9]/& /3g' INPUT_FILE
Test:
[jaypal:~/Temp] sed 's/[0-9]/& /3g' tmp
1 2 1 2 3 4 5
2 4 9 8 2 3 1
Using gnu awk:
gawk '{print $1,$2,gensub(/./,"& ","G", $NF)}' INPUT_FILE
Test:
[jaypal:~/Temp] gawk '{print $1,$2,gensub(/./,"& ","G", $NF)}' tmp
1 2 1 2 3 4 5
2 4 9 8 2 3 1
If you don't mind extra spaces, this is a succinct version:
sed 's/[0-9]/& /g'
but if you need to squeeze the extra spaces, chain another expression (note the two spaces before the *):
sed 's/[0-9]/& /g;s/  */ /g'
Note this is compatible with the original sed, so it will run on any UNIX-like system.
$ awk -F '' '$1=$1' data.txt | tr -s ' '
1 2 1 2 3 4 5
2 4 9 8 2 3 1
This might work for you:
echo -e "1 2 12345\n2 4 98231" | sed 's/\B\s*/ /g'
1 2 1 2 3 4 5
2 4 9 8 2 3 1
Most probably GNU sed only.
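Several of the answers above also touch digits outside the third column or leave a trailing space. Here is a hedged sketch of a variant that rewrites only the chosen column and works for any number of columns. The function name split_column is hypothetical, and because it assigns to a field, awk rejoins the line with single spaces, so the original spacing is not preserved.

```shell
# Sketch: split the characters of one column into space-separated
# sub-columns, leaving the other columns untouched.
# split_column is a hypothetical helper name. Usage: split_column COLUMN [file...]
split_column() {
    col=$1; shift
    awk -v col="$col" '
        col <= NF {                      # skip lines that lack the column
            out = ""
            len = length($col)
            for (j = 1; j <= len; j++)
                out = out (j > 1 ? " " : "") substr($col, j, 1)
            $col = out                   # assigning a field rebuilds the line
        }
        { print }
    ' "$@"
}

printf '%s\n' '1 2 12345 6 7' '2 4 98231' | split_column 3
```

The guard on NF leaves short lines, such as a literal "..." placeholder, unchanged instead of padding them with empty fields.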
