Relational Algebra Cross Join (Cross Product) and Natural Join - relational-algebra

When do I use operators Cross Join (Cross Product) and Natural Join in a relational algebra statement?

Some versions of the relational algebra have relation headings that are sets of (unordered, uniquely named) attributes. Then (relational "Cartesian") PRODUCT aka CROSS JOIN (aka, wrongly, CROSS PRODUCT) is defined only when the input relations share no attribute names but otherwise acts like NATURAL JOIN. So its role is to confirm that you expect that there are no shared attribute names.
(Some versions of the relational algebra have relation headings that are not sets; attributes can be ordered and/or multiple attributes can have the same name. Usually PRODUCT outputs an attribute for every input attribute. If there's a NATURAL JOIN then its result will be like first doing PRODUCT, then RESTRICTing on equality of pairs of same-named attributes, then PROJECTing out one attribute of each pair. So PRODUCT works for any two inputs, and NATURAL JOIN might be undefined when an input has duplicate attribute names, but they will give the same result when there are no shared attribute names.)
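A minimal Python sketch of the first (set-of-attributes) version, assuming a relation is modelled as a list of dicts keyed by attribute name. The helper names natural_join and product and the sample rows are illustrative assumptions, not part of any library.

```python
def natural_join(r, s):
    """Pair up tuples that agree on every shared attribute name."""
    result = []
    for t in r:
        for u in s:
            shared = t.keys() & u.keys()
            if all(t[a] == u[a] for a in shared):
                merged = {**t, **u}
                if merged not in result:  # relations are sets: no duplicates
                    result.append(merged)
    return result


def product(r, s):
    """PRODUCT: defined only when the headings share no attribute names."""
    r_attrs = set().union(*(t.keys() for t in r))
    s_attrs = set().union(*(u.keys() for u in s))
    if r_attrs & s_attrs:
        raise ValueError(f"shared attribute names: {sorted(r_attrs & s_attrs)}")
    return natural_join(r, s)


# Hypothetical sample relations with disjoint headings.
people = [{"name": "Ann"}, {"name": "Bob"}]
colours = [{"colour": "red"}, {"colour": "blue"}]
pairs = product(people, colours)  # every person paired with every colour
```

In this sketch product(people, people) raises an error, while natural_join(people, people) simply returns the same two tuples, which is the "confirm that you expect no shared attribute names" role described above.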
As to why you would compose any particular relational algebra query:
Every table/relation has a statement parameterized by
columns/attributes. (Its "characteristic predicate".) The rows/tuples
that make the statement true go in the table/relation. First find the
statements for the given tables/relations:
// customer [Cust-Name] has £[Balance] in account [Acc-No] at branch [Branch]
Deposit (Branch, Acc-No, Cust-Name, Balance)
// customer [Cust-Name] loan [Loan-No] balance is £[Balance] at branch [Branch]
Loan(Branch, Loan-No, Cust-Name, Balance)
Now put these given statements together to get a statement that only
the rows we want satisfy. Use AND, OR, AND NOT, AND condition. Keep or
drop names. Use a new name if you need one.
customer [Cust-Name] loan [Loan-No] balance is £[Loan-Balance] at branch [Branch]
AND customer [Cust-Name] has £[Balance] in account [Acc-No] at branch [Branch]
Now to get the algebra replace:
every statement by its table/relation
every AND of table/relation statements by ⋈ (natural join)
every OR of table/relation statements (which must have the same columns/attributes) by ∪ (union)
every AND NOT of statements (which must have the same columns/attributes) by \ (difference)
every AND condition by σ condition
every Keeping of names by π names to keep (projection) (and every Dropping of names by π of the names that remain)
every column/attribute renaming in a given statement by ρ (rename).
∩ (intersection) and × (product) are special cases of ⋈ (∩ for both
sides the same columns/attributes and × for no shared
columns/attributes).
(ρ Loan-Balance/Balance Loan) ⋈ Deposit
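To see what the final expression computes, here is a small Python sketch, assuming a relation is a list of dicts keyed by column name. The sample rows and the helper names rename and natural_join are invented for illustration.

```python
def rename(relation, old, new):
    """ρ new/old: give attribute `old` the name `new` in every tuple."""
    return [{(new if k == old else k): v for k, v in t.items()} for t in relation]


def natural_join(r, s):
    """⋈: merge tuples that agree on all shared attribute names."""
    out = []
    for t in r:
        for u in s:
            if all(t[a] == u[a] for a in t.keys() & u.keys()):
                merged = {**t, **u}
                if merged not in out:
                    out.append(merged)
    return out


# Hypothetical sample data.
deposit = [
    {"Branch": "Leeds", "Acc-No": 1, "Cust-Name": "Ann", "Balance": 500},
    {"Branch": "York", "Acc-No": 2, "Cust-Name": "Bob", "Balance": 80},
]
loan = [
    {"Branch": "Leeds", "Loan-No": 7, "Cust-Name": "Ann", "Balance": 3000},
    {"Branch": "Leeds", "Loan-No": 8, "Cust-Name": "Bob", "Balance": 900},
]

# (ρ Loan-Balance/Balance Loan) ⋈ Deposit
result = natural_join(rename(loan, "Balance", "Loan-Balance"), deposit)
```

Only Ann's rows match, because Bob's loan and deposit are at different branches. Without the rename, Balance would also be a shared name, so the join would additionally require the loan balance to equal the deposit balance and this sample would give no rows.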

Related

How to subtract values from two columns in RelaX (online relational algebra calculator)

Is there any way to subtract values from two different columns using RelaX (a relational algebra online calculator)? I have tried using projection, group by, as well as a few examples I saw here on SO. I am trying to subtract the average wage from the value of the wage of employees.
The RelaX projection operator takes a list of expressions giving the column values of each row returned. Those expressions can be just column names but they don't have to be. (As with an SQL select clause.)
From the help link:
projection
Expressions can be used to create more complex statements using one or more columns of a single row.
pi c.id, lower(username)->user, concat(firstname, concat(' ', lastname))->fullname (
ρ c ( Customer )
)
Value expressions
With most operators you can use a value-expression which connects one or more columns of a single row to calculate a new value. This is possible for:
the projection creating a new column (make sure to give the column a name)
the selection any expression evaluating to boolean can be used
for the joins any expression evaluating to boolean can be used; note that the rownum() expression always represents the index of the lefthand relation
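The idea of a projection that computes a new column from the columns of a single row can be sketched in Python as follows. This is not RelaX syntax; the employee rows, the column names and the helper extended_project are assumptions made for illustration. The average is first attached to every row (as a product with a one-row relation would do), and the projection then subtracts it row by row.

```python
from statistics import mean

# Hypothetical sample data.
employee = [
    {"name": "Ann", "wage": 3000},
    {"name": "Bob", "wage": 2000},
    {"name": "Cy", "wage": 1000},
]


def extended_project(relation, **columns):
    """Build each output row from expressions over a single input row."""
    return [{out: expr(row) for out, expr in columns.items()} for row in relation]


# One-row "relation" holding the aggregate, joined onto every employee row.
avg_row = {"avg_wage": mean(row["wage"] for row in employee)}
with_avg = [{**row, **avg_row} for row in employee]

diff = extended_project(
    with_avg,
    name=lambda row: row["name"],
    diff=lambda row: row["wage"] - row["avg_wage"],
)
```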
PS RelaX is properly a query language, not an algebra. Its "value expressions" are not evaluated to a value before the call. That raises the question of how we would implement a language using an algebra.
From Is multiplication allowed in relational algebra?:
Some so-called "algebras" are really languages because the expressions don't only represent the results of operators being called on values. Although it is possible for an algebra to have operand values that represent expressions and/or relation values that contain names for themselves.
The projection that takes attribute expressions raises the question of its implementation given an algebra with projection only on a relation value and attribute names. This is important in an academic setting because a question may be wanting you to actually figure out how to do that, or because the difficulty of a question is dependent on the operators available. So find out what algebra you are supposed to use.
We can introduce an operator on attribute values when we only have basic relation operators taking attribute names and relation values. Each such operator can be associated with a relation value that has an attribute for each operand and an attribute for the result. The relation holds the tuples where the result value is equal to the result of the operator called on the operand values. (The result is functionally dependent on the operands.)
From Relational Algebra rule for column transformation:
Suppose you supply the division operator on values of a column in the form of a constant base relation called DIVIDE holding tuples where dividend/divisor=quotient. I'll use the simplest algebra, with headings that are sets of attribute names. Assume we have input relation R with column c & average A. We want the relation like R but with each column c value set to its original value divided by A.
/* rows where
EXISTS dividend [R(dividend) & DIVIDE(dividend, A, c)]
*/
PROJECT c (
RENAME c\dividend (R)
NATURAL JOIN
RENAME quotient\c (
PROJECT dividend, quotient (SELECT divisor=A (DIVIDE))))
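A small Python sketch of the same construction, assuming a finite stand-in for DIVIDE that holds only the rows needed for the sample data (the real relation would be conceptually infinite). The values in R and the average A are made up for illustration.

```python
R = [{"c": 10}, {"c": 20}, {"c": 30}]
A = 20  # average of column c, assumed to be computed beforehand

# Finite stand-in for the constant relation DIVIDE:
# one row per (dividend, divisor) with quotient = dividend / divisor.
DIVIDE = [
    {"dividend": n, "divisor": d, "quotient": n / d}
    for n in (10, 20, 30)
    for d in (10, 20)
]

# SELECT divisor=A, PROJECT dividend and quotient, RENAME quotient to c.
scaled = [
    {"dividend": row["dividend"], "c": row["quotient"]}
    for row in DIVIDE
    if row["divisor"] == A
]

# RENAME c to dividend on R, NATURAL JOIN on dividend, PROJECT c.
result = [
    {"c": s["c"]}
    for r in R
    for s in scaled
    if r["c"] == s["dividend"]
]
```

No arithmetic happens in the join itself: the division is looked up in the DIVIDE relation, which is the point of treating an operator as a relation.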
From Relational algebra - recode column values:
To introduce specific values into a relational algebra expression you have to have a way to write table literals. Usually the necessary operators are not made explicit, but on the other hand algebra exercises frequently use some kind of notation for example values.

Difference between natural join and simple join on common attribute in algebra

I have a confusion.
Suppose there are two relations with a common attribute A.
Now is
(R natural join S)=(R join S where join condition A=A)?
Natural join returns one common column A.
Does a simple join return two columns with the same name (A, A), or one common column A, given that relational algebra is defined in set theory?
There's an example of a Natural Join here. As @Renzo says, there are many variants. And SQL is different again. So I'll keep to what Wikipedia shows.
Most important: the join condition applies to all attributes in common between the two arguments. So you need to say "two relations with A being their only common attribute". The only common attribute is DeptName in that wikipedia example. There can be many common attributes, in general.
Yes, joining means forming tuples in the result by pairing tuples from the argument that have same values in the corresponding common attributes. So you have same value with same attribute name. It would be pointless repeating both attributes in the result, because you'd be repeating the values. The example shows there's a single attribute DeptName in the result.
Beware that different dialects of Relational Algebra use different symbols and notations. So the bare bowtie (⋈) for Natural Join can be suffixed with a boolean condition, making a theta-join (θ-join) or equi-join -- see that example. The boolean condition is between differently-named attributes, and might use any comparison operator. So both attribute names and their values appear in the result.
Set theory operations apply because each tuple is a set of name-value pairs. The result tuples are the union of a tuple from each argument -- providing that union is a valid tuple. That is, providing same-named n-v pairs have same value.
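The set-theoretic reading can be made concrete with a short Python sketch, assuming a tuple is modelled as a frozenset of (name, value) pairs. The names is_valid_tuple and natural_join and the sample relations are illustrative assumptions.

```python
def is_valid_tuple(pairs):
    """A tuple is valid if no attribute name occurs with two different values."""
    names = [name for name, _ in pairs]
    return len(names) == len(set(names))


def natural_join(r, s):
    """Union every pair of tuples and keep the unions that are valid tuples."""
    return {t | u for t in r for u in s if is_valid_tuple(t | u)}


# Hypothetical relations whose only common attribute is A.
R = {frozenset({("A", 1), ("B", "x")}), frozenset({("A", 2), ("B", "y")})}
S = {frozenset({("A", 1), ("C", True)}), frozenset({("A", 3), ("C", False)})}

joined = natural_join(R, S)
```

Because the union is a set of pairs, the matching pair ("A", 1) appears once in the result tuple, which is why the natural join has a single column A rather than two.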

Algorithm to group people based on their preference

I need an algorithm to group people in tables based on their preference. Each person votes by ranking the tables from favorite to worst.
For example, if there are 4 tables in total, one person's vote looks like:
Alice{ table1 => 2, table2 => 4, table3=>1, table4=>3}
which means she would like to be put on table3 and really dislikes table2
The conditions are:
Everyone must be in a group
All groups must have the same number of people (tolerance of 1)
Maximize the global 'happiness'
Trying to sort this out I defined happiness as points, each person will have happiness 10 if they will be put on their favorite table, 6 on their second choice, 4 on the third and 1 on the last.
happiness[10, 6, 4, 1]
The global happiness is the sum of each person's happiness.
One way to solve this is to use integer linear programming.
There are many solvers out there for ILP, for example SCIP (http://scip.zib.de/).
You would have a binary variable x_ij for each assignment, i.e.
x_ij = 1 if person i was assigned to table j (and 0 if it was not).
Your goal is to maximize total happiness, i.e. the sum of the weights w_ij multiplied by x_ij, where w_ij is the happiness of person i at table j.
Now you have to write some constraints to ensure that:
each person is assigned to exactly one table, i.e. the sum of x_ij over j is equal to one for each i.
all tables have a similar number of persons (you can determine the possible range for the number of persons beforehand), i.e. the sum of x_ij over i is in the defined range for each j.
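For a handful of people the model can be checked by brute force before reaching for a solver. The Python sketch below is not an ILP solver: it enumerates every assignment and applies the same objective and constraints. Apart from Alice's vote, the votes are invented sample data, and the name best_assignment is illustrative.

```python
from itertools import product

HAPPINESS = [10, 6, 4, 1]  # points for 1st, 2nd, 3rd and 4th choice

# votes[person][table] = rank given to that table (1 = favourite).
votes = {
    "Alice": {"table1": 2, "table2": 4, "table3": 1, "table4": 3},
    "Bob": {"table1": 1, "table2": 2, "table3": 3, "table4": 4},
    "Carol": {"table1": 1, "table2": 3, "table3": 2, "table4": 4},
    "Dave": {"table1": 4, "table2": 3, "table3": 2, "table4": 1},
}
tables = ["table1", "table2", "table3", "table4"]


def best_assignment(votes, tables, tolerance=1):
    people = list(votes)
    best_total, best = -1, None
    # Each candidate picks one table per person, so the constraint
    # "each person is assigned to exactly one table" holds by construction.
    for choice in product(tables, repeat=len(people)):
        sizes = [choice.count(t) for t in tables]
        if max(sizes) - min(sizes) > tolerance:
            continue  # table sizes outside the allowed range
        total = sum(HAPPINESS[votes[p][t] - 1] for p, t in zip(people, choice))
        if total > best_total:
            best_total, best = total, dict(zip(people, choice))
    return best_total, best


total, assignment = best_assignment(votes, tables)
```

The search space grows as (number of tables) to the power of (number of people), so this is only useful for validating a solver model on tiny inputs.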

Algorithm for conditioned migration

Literal description of the problem:
Given a number of schools, each school holds a number of teachers according to its needs. At the end of the school year, some teachers ask to change their positions (the schools where they're currently teaching) according to an ordered list (i.e. change me to school1; if not possible, school2; if not possible, school3; etc.). Remember that each school must have EXACTLY the number of teachers it needs (not more, not less).
Each teacher has an importance number that is unique, so that if two or more teachers ask for the same school at the same time, the one having the higher importance number will get the desired school.
If we are unable to migrate a teacher according to his list, then we keep him at his initial school, answering: "sorry, your demand is not affordable for this year".
How can we afford this "migration"?
(P.S.: by "afford" I mean change the position of each teacher to the best desired school, best according to his list.)
I/ Algebraic modeling of the problem
Given E={e1..en}, n>=0, a set of positive integers (e for entity)
Given L={l1..lm}, m>=0, a set of positive integers (l for location)
Given P:E --> L, a function (P for position).
Given C:L --> ℕ*, a function (C for capacity).
Given U:L --> ℕ, a function (U for used) defined by: U(l)=card({e | P(e)=l})
Given A:L --> ℕ, a function (A for available) defined by: C(l)=A(l)+U(l) for any l in L.
Let D:E --> L^k, where 0 < k <= m, D(e)=(l1,l2,..,li), a function (D for destinations).
(That is, each entity has an ordered non-empty list of locations (destinations) it is willing to move to.)
Let I:E --> ℝ+, an injection (I for importance). (That is, each entity has a unique importance number I(e).)
II/Rules of Migration:
The task is to find the new function P' (positioning) that satisfies the following:
1- P'(e) belongs to {l1,l2,..,li} where (l1,l2,..,li)=D(e)
2- If P'(e)=ls and P'(e)=lt are two possible solutions, where D(e) = (l1,...,ls,...,lt,...,li), then we must keep the solution that matches the destinations' order (i.e. ls in this case) and exclude the other one.
3-If A(l) = 1 and P'(e1)=l and P'(e2)=l are two possible solutions, where I(e1)>I(e2) then we must keep the solution that matches the importance's order (i.e in this case P'(e1)=l) and exclude the other one.
4- If none of the desired destinations is possible, then P'(e)=P(e)
This can be formulated as bipartite matching (or, for efficiency, integer max flow to avoid duplicating the identical positions). Make a graph with a node for each teacher and a node for each position. Put edges between teachers and their current assignments, as well as everything that ranks above their current assignment. Find a maximum matching; if it is not perfect, then the problem is unsolvable without moving a teacher against their preference list.
Otherwise, for each teacher in descending order of importance, determine the best assignment that is feasible and commit to it. There is a linear-time algorithm that, given a bipartite graph with a perfect matching, determines whether there is another matching containing a particular edge (orient the matching edges, the non-matching edge the other way, and look for an augmenting path).
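A compact Python sketch of this approach, under several assumptions: each school is expanded into as many seats as its capacity, the number of teachers equals the total number of seats, and feasibility is re-checked by rebuilding a matching from scratch rather than with the linear-time augmenting-path test mentioned above. The names can_place_all and migrate and the sample data are hypothetical.

```python
def can_place_all(teachers, allowed, capacity):
    """Augmenting-path bipartite matching (Kuhn's algorithm).

    Returns True if every teacher can be given a seat in a school
    from allowed[teacher] without exceeding any school's capacity.
    """
    seats = [(school, i) for school, n in capacity.items() for i in range(n)]
    owner = {}  # seat -> teacher currently matched to it

    def augment(teacher, seen):
        for seat in seats:
            if seat[0] in allowed[teacher] and seat not in seen:
                seen.add(seat)
                if seat not in owner or augment(owner[seat], seen):
                    owner[seat] = teacher
                    return True
        return False

    return all(augment(t, set()) for t in teachers)


def migrate(current, wishes, importance, capacity):
    teachers = list(current)
    # Each teacher may end up in any wished school or stay where they are.
    allowed = {t: set(wishes[t]) | {current[t]} for t in teachers}
    final = {}
    # Most important teacher first; commit to the best school that still
    # leaves a valid placement for everybody else.
    for t in sorted(teachers, key=importance.get, reverse=True):
        for school in wishes[t] + [current[t]]:
            trial = {**allowed, t: {school}}
            if can_place_all(teachers, trial, capacity):
                allowed, final[t] = trial, school
                break
    return final


# Hypothetical example: three one-seat schools.
current = {"t1": "A", "t2": "C", "t3": "B"}
wishes = {"t1": ["B"], "t2": ["B"], "t3": ["A"]}
importance = {"t1": 3, "t2": 2, "t3": 1}
capacity = {"A": 1, "B": 1, "C": 1}
placement = migrate(current, wishes, importance, capacity)
```

Here t1 and t2 both ask for school B; t1 has the higher importance and gets it, t2 stays at C, and t3 moves into the seat t1 vacated at A.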

Representation of parent-child relationship in java

I have a set of tables that are related (parent child relationships).
I need a solution wherein I can quickly find if two tables are related.
Also, if they are related, I need to find out if the relationship is a parent-child relationship or a child-parent relationship.
My solution:
Store the relationship details in the form of a matrix.
Say there are three tables T1, T2 and T3. T1 has two children T2 and T3.
Then I can represent the relationship as
{{0,1,1},
{-1,0,0},
{-1,0,0}}
The first row and first column represent T1.
The second row and second column represent T2.
The third row and third column represent T3.
To find the relationship between T1 and T2 you go to the first row and second column. The value is 1. This shows that T1 is the parent and T2 is the child.
A -1 would indicate that the first table is the child and the second table is the parent.
A 0 would indicate that the two tables are not related.
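The matrix scheme described above can be sketched as follows (in Python, with hypothetical names TABLES, INDEX, RELATION and relationship):

```python
TABLES = ["T1", "T2", "T3"]
INDEX = {name: i for i, name in enumerate(TABLES)}

# RELATION[i][j]: 1 = table i is the parent of table j,
#                -1 = table i is the child of table j,
#                 0 = the two tables are not related.
RELATION = [
    [0, 1, 1],
    [-1, 0, 0],
    [-1, 0, 0],
]


def relationship(a, b):
    """Constant-time lookup of how table `a` relates to table `b`."""
    value = RELATION[INDEX[a]][INDEX[b]]
    return {1: "parent-child", -1: "child-parent", 0: "unrelated"}[value]
```

For example, relationship("T1", "T2") looks at the first row and second column and reports a parent-child relationship.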
Is there a better solution to this problem?
