I was wondering what the conversion would be for this algebra πA,D (R ⋈ σB=8 (S)) - relational-algebra

The exercise question asks:
Consider relations R(A, B, C) and S(A, B, D) containing the
following tuples:
R:          S:
A B C       A B D
-----       -----
6 8 7       5 8 7
6 6 7       6 6 7
7 8 6       6 8 6
What is produced by the expression
πA,D(R⋈σB=8 (S))
And gives the answer as:
A D
----
6 6
Why is this?
I understand that π is the projection, so it will only output columns A and D. The first thing I don't understand is why it is not AAD in the new table, as there are two A's. The second thing I don't understand is what the selection criterion means.

Selection B=8 from S will give
A B D
5 8 7
6 8 6
Join with R will give
A B C D
6 8 7 6
since A=6 and B=8 in both the 1st row of R and the 2nd row of the selected S table.
Projecting that result onto A and D gives the answer.

Let's work from the inside out. First, consider σB=8 (S). This is a selection. We use S as our source, but we only let through tuples that match the B=8 condition. So let's label this new relation T(A, B, D):
A B D
-------
5 8 7
6 8 6
The 6,6,7 tuple didn't get selected since its B value doesn't equal 8.
Now let's consider R⋈T. This is a natural join between R and the relation T above. A natural join matches tuples on all columns that have the same name in both relations, here A and B. We don't get two As or two Bs in the result because a) they're always equal, and b) tuple elements are distinguished by name, so you cannot have multiple elements with the same name. So, we produce U(A,B,C,D):
A B C D
----------
6 8 7 6
(Because only tuple (6,8,7) from R and (6,8,6) from T have matching A and B values).
Finally, we project to only retain A and D from U. Hopefully I don't have to explain that.
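The three steps above (select, natural join, project) can be sketched in Python. This is only an illustration under assumptions: relations are represented as lists of dicts, and the helper names `select`, `natural_join` and `project` are made up for this example rather than taken from any library.

```python
# Relations as lists of dicts keyed by attribute name (an illustrative choice).
R = [{"A": 6, "B": 8, "C": 7}, {"A": 6, "B": 6, "C": 7}, {"A": 7, "B": 8, "C": 6}]
S = [{"A": 5, "B": 8, "D": 7}, {"A": 6, "B": 6, "D": 7}, {"A": 6, "B": 8, "D": 6}]

def select(rel, pred):
    """Selection: keep only the tuples for which pred is true."""
    return [t for t in rel if pred(t)]

def natural_join(left, right):
    """Natural join: pair tuples that agree on every shared attribute name."""
    out = []
    for l in left:
        for r in right:
            shared = l.keys() & r.keys()
            if all(l[k] == r[k] for k in shared):
                # Shared attributes are equal, so merging keeps one copy of each.
                out.append({**l, **r})
    return out

def project(rel, attrs):
    """Projection: keep only attrs, dropping duplicate rows."""
    seen, out = set(), []
    for t in rel:
        row = tuple(t[a] for a in attrs)
        if row not in seen:
            seen.add(row)
            out.append(row)
    return out

T = select(S, lambda t: t["B"] == 8)   # sigma B=8 (S)
U = natural_join(R, T)                 # R joined with T on A and B
result = project(U, ["A", "D"])        # pi A,D
print(result)  # [(6, 6)]
```

Because the merged dict has a single key per attribute name, the join result cannot contain A or B twice, which mirrors point b) above.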

Related

Sorting/ordering values from smallest to biggest in an array

I have a formula like this : =ArrayFormula(sort(INDEX($B$1:$B$10,MATCH(E1,$A$1:$A$10,0))))
in columns A:B:
a 1
b 2
c 3
d 4
e 5
f 6
g 7
h 8
i 9
j 10
and
the data to convert in E:H
a c f e
f a c b
b a c d
I get the following results using the above formula
in columns L:O:
1 3 6 5
6 1 3 2
2 1 3 4
My desired output is like this:
1 3 5 6
1 2 3 6
1 2 3 4
I'd like to arrange the numbers in each row from smallest to biggest. I can do this with additional helper cells, but if possible I'd like to get the same result without any additional cells. Can I get a little help, please? Thanks.
To sort by row, use SORT inside BYROW. Unfortunately, nested array results aren't supported in BYROW, so we need to JOIN each sorted row into a single string and then SPLIT the resulting array.
=ARRAYFORMULA(SPLIT(BYROW(your_formula,LAMBDA(row,JOIN("🌆",SORT(TRANSPOSE(row))))),"🌆"))
Here's another way using Makearray with Index to get the current row and Small to get the smallest, next smallest etc. within the row:
=ArrayFormula(makearray(3,4,lambda(r,c,small(index(vlookup(E1:H3,A1:B10,2,false),r,0),c))))
Or you could change the order (might be a little faster) as you don't need to vlookup the entire array, just the current row:
=ArrayFormula(makearray(3,4,lambda(r,c,small(vlookup(index(E1:H3,r,0),A1:B10,2,false),c))))
It's interesting (to me at any rate) that you can interrogate the row and column number of the current cell using Map or Scan, so this is also possible:
=ArrayFormula(map(E1:H3,lambda(cell,small(vlookup(index(E1:H3,row(cell),0),A1:B10,2,false),column(cell)-column(E:E)+1))))
Thanks to @JvdV for this insight (which may be obvious to some but wasn't to me) shown here in Excel.
try:
=INDEX(TRIM(SPLIT(FLATTEN(QUERY(QUERY(QUERY(SPLIT(FLATTEN(E1:H3&"×​"&ROW(E1:H3)), "​"),
"select max(Col1) group by Col1 pivot Col2"), "offset 1", 0),,9^9)), "×")))
or if you want numbers:
=INDEX(IFNA(VLOOKUP(TRIM(SPLIT(FLATTEN(QUERY(QUERY(QUERY(SPLIT(FLATTEN(E1:H3&"×​"&ROW(E1:H3)), "​"),
"select max(Col1) group by Col1 pivot Col2"), "offset 1", 0),,9^9)), "×")), A:B, 2, 0)))
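For readers who want to check the expected output outside the spreadsheet, the logic the formulas above implement (look up each letter, then sort within each row) can be sketched in Python. The names `lookup`, `data` and `sorted_rows` are illustrative only; this is not a translation of any particular formula.

```python
# Columns A:B from the question: letters a..j map to 1..10.
lookup = dict(zip("abcdefghij", range(1, 11)))

# The data to convert (E1:H3 in the question).
data = [
    ["a", "c", "f", "e"],
    ["f", "a", "c", "b"],
    ["b", "a", "c", "d"],
]

# Look up every cell, then sort each row independently.
sorted_rows = [sorted(lookup[key] for key in row) for row in data]

for row in sorted_rows:
    print(*row)
```

Each row is sorted on its own, which is the same per-row behaviour that BYROW or MAKEARRAY provides in the sheet.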

Grouping connected pairs of values

I have a list containing unique pairs of values x and y; for example:
x y
-- --
1 A
2 A
3 A
4 B
5 A
5 C
6 D
7 D
8 C
8 E
9 B
9 F
10 C
10 G
I want to divide this list of pairs as follows:
Group 1
1 A
2 A
3 A
5 A
5 C
8 C
10 C
8 E
10 G
Group 2
4 B
9 B
9 F
Group 3
6 D
7 D
Group 1 contains
all pairs where y = 'A' (1-A, 2-A, 3-A, 5-A)
any additional pairs where x = any of the x's above (5-C)
any additional pairs where y = any of the y's above (8-C, 10-C)
any additional pairs where x = any of the x's above (8-E, 10-G)
The pairs in Group 2 can't be reached in such a manner from any pairs in Group 1, nor can the pairs in Group 3 be reached from either Group 1 or Group 2.
As suggested in Group 1, the chain of connections can be arbitrarily long.
I'm exploring solutions using Perl, but any sort of algorithm, including pseudocode, would be fine. For simplicity, assume that all of the data can fit in data structures in memory.
[UPDATE] Because I need to apply this approach to 5.3 billion pairs, scalability is important to me.
Pick a starting point. Find all points reachable from it, removing them from the master list. Repeat for all added points, until no more can be reached. Move to the next group, starting with another remaining point. Continue until you have no more remaining points.
pool = [(1 A), (2 A), (3 A), (4 B), ... (10 G)]
group_list = []
group = []
pos = 0
while pool is not empty
    group = [ pool[0] ]    // start with next available point
    remove pool[0] from pool    // so it is not added to the group a second time
    pos = -1
    while pos+1 < size(group)    // while there are new points in the group
        pos += 1
        group_point = group[pos]    // grab next available point
        for point in pool    // find all remaining points reachable
            if point and group_point have a coordinate in common
                remove point from pool
                add point to group
    // we've reached closure with that starting point
    add group to group_list
return group_list
You can think of the letters and numbers as nodes of a graph, and the pairs as edges. Divide this graph into connected components in linear time.
The connected component with 'A' forms group 1. The other connected components form the other groups.
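One possible way to find those connected components is a union-find (disjoint-set) structure, sketched below in Python. This is a swapped-in technique rather than necessarily what the answer had in mind (a BFS or DFS over the graph would also work), and it runs in near-linear rather than strictly linear time. The function name `group_pairs` is made up for this example, and an in-memory dict like this is an assumption that would need rethinking for billions of pairs.

```python
from collections import defaultdict

def group_pairs(pairs):
    """Group (x, y) pairs into connected components using union-find."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        root = node
        while parent[root] != root:      # walk up to the root
            root = parent[root]
        while parent[node] != root:      # path compression
            parent[node], node = root, parent[node]
        return root

    for x, y in pairs:
        # Tag each value with its column so an x and a y can never collide.
        root_x, root_y = find(("x", x)), find(("y", y))
        if root_x != root_y:
            parent[root_x] = root_y      # each pair is an edge: merge components

    groups = defaultdict(list)
    for x, y in pairs:
        groups[find(("x", x))].append((x, y))
    return list(groups.values())

pairs = [(1, "A"), (2, "A"), (3, "A"), (4, "B"), (5, "A"), (5, "C"), (6, "D"),
         (7, "D"), (8, "C"), (8, "E"), (9, "B"), (9, "F"), (10, "C"), (10, "G")]
groups = group_pairs(pairs)
for number, group in enumerate(groups, start=1):
    print("Group", number, group)
```

The x values and y values are the nodes and each pair is an edge, as described above; every pair ends up in the group of the component its x value belongs to.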

Filter on google sheets minimum across columns

Hi, I am trying to get some data displayed using the FILTER function in Google Sheets.
What I want is the minimum value across 3 columns within each row.
Is this possible?
For example:
A 1 6 10
B 3 5 9
C 4 4 8
D 5 3 7
A 2 1 6
Filter on A should give:
A 1
A 1
Filter on B should give:
B 3
I would really like to use filter function but =filter({A:A,min(B:D)},A:A="A") doesn't work.
Maybe, if your three (labelled) columns are A, B and C:
=filter(A2:C2,A2:C2=min(A2:C2))
but in that case filter would be overkill.
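To pin down what the question asks for, here is a small Python sketch of the intended result: keep the rows whose label matches, and report the minimum of the three numbers in each kept row. It is not a Sheets formula, and the name `filter_row_min` is hypothetical.

```python
# The example data from the question: a label followed by three numbers.
rows = [
    ("A", 1, 6, 10),
    ("B", 3, 5, 9),
    ("C", 4, 4, 8),
    ("D", 5, 3, 7),
    ("A", 2, 1, 6),
]

def filter_row_min(rows, key):
    """Keep rows whose label equals key; pair the label with that row's minimum."""
    return [(label, min(values)) for label, *values in rows if label == key]

print(filter_row_min(rows, "A"))  # [('A', 1), ('A', 1)]
print(filter_row_min(rows, "B"))  # [('B', 3)]
```

The minimum is taken per row, not over the whole column range, which is the distinction the attempted formula in the question misses.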

How to compute a natural join?

Table R (A, C) contains the following entries:
A C
3 3
6 4
2 3
3 5
7 1
Table S (B, C, D) contains the following entries:
B C D
5 1 6
1 5 8
4 3 9
Calculate the natural join of R and S. Which rows would be in the result? Each resulting row has the schema (A, B, C, D).
Please help!!!
Got the answer by looking at this. So your answer should be: {(3,4,3,9),(2,4,3,9),(3,1,5,8),(7,5,1,6)}
A B C D
3 4 3 9
2 4 3 9
3 1 5 8
7 5 1 6

Natural Join of different tables

Could you please explain to me how to do a NATURAL JOIN on these two relations (one having 5 rows and the other 3)?
1st relation
A C
3 3
6 4
2 3
3 5
7 1
2nd relation
B C D
5 1 6
1 5 8
4 3 9
In your question you have two separate relations, which have one attribute (i.e. column) in common: C.
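A natural join will combine every pair of tuples, one from each relation, that have the same value for that common attribute. The tuple (6, 4) in the 1st relation has no partner with C = 4, so it drops out. You will end up with the results: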
A B C D
7 5 1 6
3 4 3 9
2 4 3 9
3 1 5 8
This can be performed in SQL by using the code @Matthew posted.
Something like:
SELECT * FROM 1stRelation NATURAL JOIN 2ndRelation
It will do the same thing as an inner join using the explicit column names, except that with SELECT * the inner join returns the shared column C twice. I.e.:
SELECT * from 1stRelation as x INNER JOIN 2ndRelation as z ON x.C=z.C
Personally, I prefer not to use natural joins, except possibly when I don't know the table structure in advance but know the tables should be joinable.
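The two queries can be tried against an in-memory SQLite database from Python's standard `sqlite3` module. This is a sketch under assumptions: the tables are named `R` and `S` here (hypothetical names, since an identifier such as `1stRelation` starts with a digit and would need quoting), and the behaviour shown is SQLite's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R (A INTEGER, C INTEGER);
    CREATE TABLE S (B INTEGER, C INTEGER, D INTEGER);
    INSERT INTO R VALUES (3, 3), (6, 4), (2, 3), (3, 5), (7, 1);
    INSERT INTO S VALUES (5, 1, 6), (1, 5, 8), (4, 3, 9);
""")

# NATURAL JOIN: joins on the shared column name C and keeps a single C.
natural = conn.execute("SELECT * FROM R NATURAL JOIN S")
natural_cols = [col[0] for col in natural.description]
natural_rows = natural.fetchall()

# Explicit INNER JOIN: same matching rows, but SELECT * keeps both C columns.
inner = conn.execute("SELECT * FROM R AS x INNER JOIN S AS z ON x.C = z.C")
inner_cols = [col[0] for col in inner.description]
inner_rows = inner.fetchall()

print(natural_cols, natural_rows)
print(inner_cols, inner_rows)
conn.close()
```

Both queries match the same four pairs of rows; the difference is only in how many times C shows up in the output columns.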
Basically, you do a CROSS JOIN, i.e. you combine every row from the 1st relation with every row of the 2nd relation. You then have two C columns. Now you eliminate every row where the two C values are not equal and merge the two C columns into a single column C.
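That cross-join-then-filter recipe can be written out step by step in Python. It is a sketch for illustration only, with relations held as lists of tuples and variable names (`cross`, `matching`, `joined`) chosen for this example.

```python
from itertools import product

R = [(3, 3), (6, 4), (2, 3), (3, 5), (7, 1)]   # schema (A, C)
S = [(5, 1, 6), (1, 5, 8), (4, 3, 9)]          # schema (B, C, D)

# Step 1: cross join, every row of R paired with every row of S (5 * 3 rows).
cross = list(product(R, S))

# Step 2: eliminate the pairs whose two C values differ.
matching = [(r, s) for r, s in cross if r[1] == s[1]]

# Step 3: merge the two equal C columns into one, giving schema (A, B, C, D).
joined = [(a, b, c, d) for (a, c), (b, _same_c, d) in matching]

print(joined)  # [(3, 4, 3, 9), (2, 4, 3, 9), (3, 1, 5, 8), (7, 5, 1, 6)]
```

The row (6, 4) from the 1st relation disappears in step 2 because no row of the 2nd relation has C = 4.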
