Multiple sorting conditions in DolphinDB - sorting

Suppose I have a table as follows:
id=`A`B`A`B`B`B`A
item= 10 1 1 3 5 10 6
t=table(id,item)
id item
-- ----
A 10
B 1
A 1
B 3
B 5
B 10
A 6
For example, I want to sort the table with two conditions: first, by the most commonly occurring item in column item, then by the highest number in column item.
How can I sort like this:
id item
--- ----
A 10
B 10
A 1
B 1
A 6
B 5
B 3
Is there any way to go about this? Thanks!

Try this:
t1=table(id,item);
update t1 set count=count(item) context by item;
select * from t1 order by count desc, item desc;

Related

Sorting/ordering values from smallest to biggest in an array

I have a formula like this : =ArrayFormula(sort(INDEX($B$1:$B$10,MATCH(E1,$A$1:$A$10,0))))
in columns A:B:
a 1
b 2
c 3
d 4
e 5
f 6
g 7
h 8
i 9
j 10
and
the data to convert in E:H
a c f e
f a c b
b a c d
I get the following results using the above formula
in columns L:O:
1 3 6 5
6 1 3 2
2 1 3 4
My desired output is like this:
1 3 5 6
1 2 3 6
1 2 3 4
I'd like to arrange the numbers from smallest to biggest in value. I can do this with additional helper cells. but if possible i'd like to get the same result without any additional cells. can i get a little help please? thanks.
To sort by row, use SORT BYROW. But unfortunately, nested array results aren't supported in BYROW. So, we need to JOIN and SPLIT the resulting array.
=ARRAYFORMULA(SPLIT(BYROW(your_formula,LAMBDA(row,JOIN("🌆",SORT(TRANSPOSE(row))))),"🌆"))
Here's another way using Makearray with Index to get the current row and Small to get the smallest, next smallest etc. within the row:
=ArrayFormula(makearray(3,4,lambda(r,c,small(index(vlookup(E1:H3,A1:B10,2,false),r,0),c))))
Or you could change the order (might be a little faster) as you don't need to vlookup the entire array, just the current row:
=ArrayFormula(makearray(3,4,lambda(r,c,small(vlookup(index(E1:H3,r,0),A1:B10,2,false),c))))
It's interesting (to me at any rate) that you can interrogate the row and column number of the current cell using Map or Scan, so this is also possible:
=ArrayFormula(map(E1:H3,lambda(cell,small(vlookup(index(E1:H3,row(cell),0),A1:B10,2,false),column(cell)-column(E:E)+1))))
Thanks to #JvdV for this insight (which may be obvious to some but wasn't to me) shown here in Excel.
try:
=INDEX(TRIM(SPLIT(FLATTEN(QUERY(QUERY(QUERY(SPLIT(FLATTEN(E1:H3&"×​"&ROW(E1:H3)), "​"),
"select max(Col1) group by Col1 pivot Col2"), "offset 1", 0),,9^9)), "×")))
or if you want numbers:
=INDEX(IFNA(VLOOKUP(TRIM(SPLIT(FLATTEN(QUERY(QUERY(QUERY(SPLIT(FLATTEN(E1:H3&"×​"&ROW(E1:H3)), "​"),
"select max(Col1) group by Col1 pivot Col2"), "offset 1", 0),,9^9)), "×")), A:B, 2, 0)))

Filter Google Sheets' pivot table by comparison with column

I want to filter a pivot table in the following set up:
My Table:
Key Value1 Value2
1 23 a
2 33 b
3 1 c
4 5 d
My pivot table (simplified):
Key SUM of Value1 COUNTA of Value2
1 23 1
2 33 1
3 1 1
4 5 1
Grand Total 62 4
I now want to filter the pivot table by the values in this list:
Keys
1
2
4
So the resulting pivot table should look like this:
Key SUM of Value1 COUNTA of Value2
1 23 1
2 33 1
4 5 1
Grand Total 61 3
I thought this should be possible by using a custom formula in the pivot filter but it seems I have no way of using the current cell in the pivot e.g. to make a lookup.
I created a simple example of this setup here: https://docs.google.com/spreadsheets/d/1GlQDYtW8v8ri5L68RhryTZxwTikV_NXZQlccSI6_7pU/edit?usp=sharing
paste this formula in Filters!B1:
=ARRAYFORMULA(IFERROR(VLOOKUP(A1:A, Table!A1:C, {2,3}, 0), ))
and create a resulting pivot table from there:
demo spreadsheet

SAS grouping algorithm

I have the following mock up table
#n a b group
1 1 1 1
2 1 2 1
3 2 2 1
4 2 3 1
5 3 4 2
6 3 5 2
7 4 5 2
I am using SAS for this problem. In column group, the rows that are interconnected through a and b are grouped. I will try to explain why these rows are in the same group
row 1 to 2 are in group 2 since they both have a = 1
row 3 is in group 2 since b = 2 in row 2 and 3 and row 2 is in group 1
row 3 and 4 are in group 1 since a = 2 in both rows and row 3 is in group 1
The overall logic is that if a row x contains the same value of a or b as row y, row x also belongs to the same group as y is a part of.
Following the same logic, row 5,6 and 7 are in group 2.
Is there any way to make an algorithm to find these groups?
Case I:
Grouping defined as to be item linkage within contiguous rows.
Use the LAG function to examine both variables prior values. Increase the group value if both have changed. For example
group + ( a ne lag(a) and b ne lag(b) );
Case II:
Grouping determined from pair item slot value linkages over all data.
From grouping pairs by either key
General statement of problem:
-----------------------------
Given: P = p{i} = (p{i,1},p{i,2}), a set of pairs (key1, key2).
Find: The distinct groups, G = g{x}, of P,
such that each pair p in a group g has this property:
key1 matches key1 of any other pair in g.
-or-
key2 matches key2 of any other pair in g.
Demonstrates
… an iterative way using hashes.
Two hashes maintain the groupId assigned to each key value.
Two additional hashes are used to maintain group mapping paths.
When the data can be passed without causing a mapping, then the groups have
been fully determined.
A final pass is done, at which point the groupIds are assigned to each
pair and the data is output to a table.

CONNECT BY for two tables with two JOINS

I have 3 tables:
two with hierarchical structures
(like "dimensions" of recursive type of hierarchy);
one with summing data (like "facts" with X column).
They are here:
DIM1 (ID1, PARENT2, NAME1)
DIM2 (ID2, PARENT2, NAME2)
FACTS (ID1, ID2, X)
Example of DIM1 table:
-- 1 0 DIM1
---- 2 1 DIM1-A
------ 3 2 DIM1-A-A
-------- 4 3 DIM1-A-A-A
-------- 5 3 DIM1-A-A-B
------ 6 2 DIM1-A-B
-------- 7 6 DIM1-A-B-A
-------- 8 6 DIM1-A-B-B
------ 9 2 DIM1-A-C
---- 10 1 DIM1-B
------ 11 10 DIM1-B-C
------ 12 10 DIM1-B-D
---- 13 1 DIM1-C
Example of DIM2 table:
-- 1 0 DIM2
---- 2 1 DIM2-A
------ 3 2 DIM2-A-A
-------- 4 3 DIM2-A-A-A
-------- 5 3 DIM2-A-A-B
-------- 6 3 DIM2-A-B-C
------ 7 2 DIM2-A-B
---- 8 1 DIM2-B
---- 9 1 DIM2-C
Example of FACTS table:
1 1 100
1 2 30
1 3 500
-- ................
13 9 200
And I would like to create the only SELECT where I will specify the parent for DIM1 (for example ID1=2 for DIM1-A) and parent for DIM2 (for example ID2=2 for DIM2-A) and SELECT will generate a report like this:
Name_of_1 Name_of_2 Sum_of_X
--------- --------- ----------
DIM1-A-A DIM2-A-A (some sum)
DIM1-A-A DIM2-A-B (some sum)
DIM1-A-B DIM2-A-A (some sum)
DIM1-A-B DIM2-A-B (some sum)
DIM1-A-C DIM2-A-A (some sum)
DIM1-A-C DIM2-A-B (some sum)
I would like to use CONNECT BY phrase, START WITH phrase, SUM phrase, GROUP BY phrase, and OUTER or INNER (?) JOIN. I need no other extensions of Oracle 10.2.
In other words: only with "classic" SQL and
only Oracle extensions for hierarchy queries.
Is it possible?
I tried some experiments with question in
Mixing together Connect by, inner join and sum with Oracle
(where is a very nice solution but only for one
dimension table ("Tasks"), but I need to JOIN two dimension tables to one facts table), but I was not successful.
"Some sum" is not very descriptive, so I don't see why do you need CONNECT BY at all.
SELECT dim1.name, dim2.name, x
FROM (
SELECT id1, id2, SUM(x) AS x
FROM facts
GROUP BY
id1, id2
) f
JOIN dim1
ON dim1.id = f.id1
JOIN dim2
ON dim2.id = f.id2
I think what you're trying to do is get the sum of the value in the facts table for all of the children of the specified rows grouped by the topmost children. This would mean that in your example above, the results for the first row would be the sum any intersections of (DIM1-A-A, DIM1-A-A-A, DIM1-A-A-B) and (DIM2-A-A, DIM2-A-A-A, DIM2-A-A-B, DIM3-A-A-C) found in the FACTS table. With that assumption, I have come to the following solution:
SELECT root_name1, root_name2, SUM(X)
FROM ( SELECT CONNECT_BY_ROOT(name1) AS root_name,
id1
FROM dim1
CONNECT BY parent1 = PRIOR id1
START WITH parent1 = 2) d1
CROSS JOIN
( SELECT CONNECT_BY_ROOT(name2) AS root_name,
id2
FROM dim2
CONNECT BY parent2 = PRIOR id2
START WITH parent2 = 2) d2
LEFT OUTER JOIN
facts
ON d1.id1 = facts.id1
AND d2.id2 = facts.id2
GROUP BY root_name1, root_name2
(This also assumes that the columns of FACTS are named ID1, ID2, and X.)

Fix numbering of many item groups in table

I have a table with these columns
create table demo (
ID integer not null,
INDEX integer not null,
DATA varchar2(10)
)
Example data:
ID INDEX DATA
1 1 A
1 3 B
2 1 C
2 1 D
2 5 E
I want:
ID INDEX DATA
1 1 A
1 2 B -- 3 -> 2
2 1 C or D -- 1 -> 1 or 2
2 2 D or C -- 1 -> 2 or 1
2 3 E -- 5 -> 3 (closing the gap)
Is there a way to renumber the INDEX per ID using a single SQL update? Note that the order of items should be preserved (i.e. the item "E" should still be after the items "C" and "D" but the order of "C" and "D" doesn't really matter.
The goal is to be able to create a primary key over ID and INDEX.
If that matters, I'm on Oracle 10g.
MERGE
INTO demo d
USING (
SELECT rowid AS rid, ROW_NUMBER() OVER (PARTITION BY id ORDER BY index) AS rn
FROM demo
) d2
ON (d.rowid = d2.rid)
WHEN MATCHED THEN
UPDATE
SET index = rn

Resources