Sort in shell scripting unique in multiple columns - sorting

I have a scenario as below.
In a file say file1.txt, I have
A 1
A 2
A 3
B 5
B 2
C 9
C 10
I would like to sort and get results like below.
A 3
B 5
C 10
I tried
sort file1.txt -k1,1 -kn2
But it didn't work.
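One possible approach, offered as a minimal sketch rather than a definitive answer: sort by the first column, then by the second column numerically in descending order, and keep only the first line seen for each key. The function name max_per_key is hypothetical; the input is assumed to be whitespace-separated with the key in column 1 and a number in column 2.

```shell
# Hypothetical helper: for each key in column 1, print the row
# whose column 2 is numerically largest.
max_per_key() {
  # -k1,1   : sort by the key column only
  # -k2,2nr : break ties by column 2, numeric, descending
  # awk keeps the first line it sees for each key, which is the maximum.
  sort -k1,1 -k2,2nr "$1" | awk '!seen[$1]++'
}
```

Called as max_per_key file1.txt on the sample above, this should print A 3, B 5 and C 10, one per line.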

Related

Sorting/ordering values from smallest to biggest in an array

I have a formula like this : =ArrayFormula(sort(INDEX($B$1:$B$10,MATCH(E1,$A$1:$A$10,0))))
in columns A:B:
a 1
b 2
c 3
d 4
e 5
f 6
g 7
h 8
i 9
j 10
and
the data to convert in E:H
a c f e
f a c b
b a c d
I get the following results using the above formula
in columns L:O:
1 3 6 5
6 1 3 2
2 1 3 4
My desired output is like this:
1 3 5 6
1 2 3 6
1 2 3 4
I'd like to arrange the numbers in each row from smallest to biggest. I can do this with additional helper cells, but if possible I'd like to get the same result without any additional cells. Can I get a little help please? Thanks.
To sort by row, use SORT with BYROW. But unfortunately, nested array results aren't supported in BYROW. So, we need to JOIN and SPLIT the resulting array.
=ARRAYFORMULA(SPLIT(BYROW(your_formula,LAMBDA(row,JOIN("🌆",SORT(TRANSPOSE(row))))),"🌆"))
Here's another way using Makearray with Index to get the current row and Small to get the smallest, next smallest etc. within the row:
=ArrayFormula(makearray(3,4,lambda(r,c,small(index(vlookup(E1:H3,A1:B10,2,false),r,0),c))))
Or you could change the order (might be a little faster) as you don't need to vlookup the entire array, just the current row:
=ArrayFormula(makearray(3,4,lambda(r,c,small(vlookup(index(E1:H3,r,0),A1:B10,2,false),c))))
It's interesting (to me at any rate) that you can interrogate the row and column number of the current cell using Map or Scan, so this is also possible:
=ArrayFormula(map(E1:H3,lambda(cell,small(vlookup(index(E1:H3,row(cell),0),A1:B10,2,false),column(cell)-column(E:E)+1))))
Thanks to @JvdV for this insight (which may be obvious to some but wasn't to me) shown here in Excel.
try:
=INDEX(TRIM(SPLIT(FLATTEN(QUERY(QUERY(QUERY(SPLIT(FLATTEN(E1:H3&"×​"&ROW(E1:H3)), "​"),
"select max(Col1) group by Col1 pivot Col2"), "offset 1", 0),,9^9)), "×")))
or if you want numbers:
=INDEX(IFNA(VLOOKUP(TRIM(SPLIT(FLATTEN(QUERY(QUERY(QUERY(SPLIT(FLATTEN(E1:H3&"×​"&ROW(E1:H3)), "​"),
"select max(Col1) group by Col1 pivot Col2"), "offset 1", 0),,9^9)), "×")), A:B, 2, 0)))

Increase the numbers in apl

I have the following data:
a b c d
5 9 6 0
3 1 3 2
Characters in the first row, numbers in the second row.
How do I get the character corresponding to the highest number in the second row, and how do I increase the corresponding number in the second row? (For example, here, column b has the highest number, 9, so increase that number by 10%.)
I use Dyalog version 17.1.
With:
⎕←data←3 4⍴'a' 'b' 'c' 'd' 5 9 6 0 3 1 3 2
a b c d
5 9 6 0
3 1 3 2
You can extract the second row with:
2⌷data
5 9 6 0
Now grade it descending, that is, find the indices that would sort it from highest to lowest:
⍒2⌷data
2 3 1 4
The first number is the column we're looking for:
⊃⍒2⌷data
2
Now we can use this to extract the character from the first row:
data[⊂1,⊃⍒2⌷data]
b
But we only need the column index, not the actual character. The full index of the number we want to increase is:
2,⊃⍒2⌷data
2 2
Extracting the data to see that we got the right index:
data[⊂2,⊃⍒2⌷data]
9
Now we can either create a new array with the target value increased by 10%:
1.1×@(⊂2,⊃⍒2⌷data)⊢data
a b c d
5 9.9 6 0
3 1 3 2
Or change it in-place:
data[⊂2,⊃⍒2⌷data]×←1.1
data
a b c d
5 9.9 6 0
3 1 3 2

Filter on google sheets minimum across columns

Hi, I am trying to get some data displayed using the FILTER function in Google Sheets.
What I want is the minimum value across 3 columns in each row.
Is this possible?
For example:
A 1 6 10
B 3 5 9
C 4 4 8
D 5 3 7
A 2 1 6
Filter on A should give:
A 1
A 1
Filter on B should give:
B 3
I would really like to use filter function but =filter({A:A,min(B:D)},A:A="A") doesn't work.
Maybe, if your three (labelled) columns are A, B and C:
=filter(A2:C2,A2:C2=min(A2:C2))
but in that case filter would be overkill.

awk split one column into multiple columns

How can I split one column of data into multiple columns based on column values using awk?
Example file and desired output is below. My bash version is 3.2.52(1).
$ cat examplefile
A
1
B
2
B
3
C
10
C
11
C
13
A
4
B
5
B
6
B
7
C
14
Desired output:
$ cat outputfile
A B C
1 2 10
null B C
null 3 11
null null C
null null 13
A B C
4 5 14
null B null
null 6 null
null B null
null 7 null
Or, forgetting about the null values, how can I obtain two columns as in outputfile2?
cat examplefile2
A
1
B
2
B
3
cat outputfile2
A B
1 2
B
3
You can get it:
awk 'BEGIN{l=1;ll="";} {if (l) {ll=$0;l=0;} else {if (length(a[ll])>0) {a[ll]=a[ll]","ll","$0;} else {a[ll]=ll","$0;}l=1;}} END{for (k in a){print a[k];}}' examplefile
It works for any number of classes (A,B,C...).
The output is (the line order may vary, because awk does not define the iteration order of for (k in a)):
A,1,A,4
B,2,B,3,B,5,B,6,B,7
C,10,C,11,C,13,C,14
If you want it as columns, just have a quick look at the following post:
An efficient way to transpose a file in Bash
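As a variant of the grouping idea above, here is a minimal sketch that pairs each label line with the value line after it and then collects the values per label in first-seen order. The function name group_pairs and the label:value,value output format are assumptions made for illustration; they are not the layout asked for in the question.

```shell
# Hypothetical helper: group values by label, keeping labels in first-seen order.
group_pairs() {
  # paste - - joins every two consecutive input lines into "label value".
  paste -d ' ' - - < "$1" | awk '
    {
      if ($1 in vals) {
        vals[$1] = vals[$1] "," $2     # append to an existing label
      } else {
        order[++n] = $1                # remember when the label first appeared
        vals[$1] = $2
      }
    }
    END {
      # Walk the labels in first-seen order instead of relying on for-in order.
      for (i = 1; i <= n; i++) print order[i] ":" vals[order[i]]
    }'
}
```

Run on examplefile, this should print A:1,4 then B:2,3,5,6,7 then C:10,11,13,14, which can be transposed into columns afterwards.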

Natural Join of different tables

Could you please explain to me how to do a NATURAL JOIN on these two relations (one having 5 rows and the other 3)?
1st relation
A C
3 3
6 4
2 3
3 5
7 1
2nd relation
B C D
5 1 6
1 5 8
4 3 9
In your question you have two separate relations, which have one attribute (i.e. column) in common: C.
A natural join will combine all tuples in both relations with that attribute in common. You will end up with the results:
A B C D
7 5 1 6
3 4 3 9
2 4 3 9
3 1 5 8
This can be performed in SQL by using the code @Matthew posted.
Something like:
SELECT * FROM 1stRelation NATURAL JOIN 2ndRelation
It will do the same thing as an inner join using the explicit column names, i.e.:
SELECT * from 1stRelation as x INNER JOIN 2ndRelation as z ON x.C=z.C
Personally, I prefer not to use natural joins, except in the possible case where I am not aware of the table structure in advance but know the tables should be able to be joined.
Basically you do a CROSS JOIN, i.e. you combine every row from the 1st relation with every row of the 2nd relation. Then you have two C columns. Now you eliminate every row where the two C values are not equal and merge the two C columns into one.
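The cross-join-then-filter description can be sketched outside SQL with awk, assuming each relation is stored as a whitespace-separated file without a header row (columns A C in the first file, B C D in the second). The function name natural_join_on_c is hypothetical, and this is an illustration of the idea, not how a database engine implements it.

```shell
# Hypothetical helper: natural join of "A C" rows ($1) with "B C D" rows ($2) on C.
natural_join_on_c() {
  awk '
    # First pass (second relation): remember every B, C, D row.
    NR == FNR { b[NR] = $1; c[NR] = $2; d[NR] = $3; n = NR; next }
    # Second pass (first relation): pair the row with every stored row
    # (the cross join) and keep only pairs whose C values are equal.
    {
      for (i = 1; i <= n; i++)
        if ($2 == c[i]) print $1, b[i], $2, d[i]   # merged columns: A B C D
    }' "$2" "$1"
}
```

For the two relations in the question this should yield the same four tuples as the table above, in the row order of the first relation.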
