Is this natural join operation used correctly? (Relational Algebra) - relational-algebra

I was given the following task by my professor (the "R-E Modell" diagram is not reproduced here):
Assume the companies may be located in several cities. Find all companies located in every city in which “Small Bank Corporation” is
located.
Now the professor's solution is the following:
s ← Π city (σ company_name=’Small Bank Corporation’ (company))
temp1 ← Π comp_id, company_name (company)
temp2 ← Π comp_id, company_name ((temp1 × s) − company)
result ← Π company_name (temp1 − temp2)
I found a completely different solution myself, using a natural join, which seems much simpler:
What I tried to do was use the natural join operation, which we defined as joining relations r and s on their common attributes. I first get all city names with a projection on a selection of all companies with the company_name "Small Bank Corporation". Then I join that table of city names with the company table, so that I get all company entries that have one of those city names.
company ⋈ Π city (σ company_name=”Small Bank Corporation” (company))
My question now is whether my solution is also valid, since it seems a little too trivial.

Yours isn't the same.
My answer here says how to query relationally. It uses a version of the relational algebra where headings are sets of attribute names. My answer here summarizes it:
Every query expression has an associated (characteristic)
predicate--statement template parameterized by attributes. The tuples
that make the predicate into a true proposition--statement--are in
the relation.
We are given the predicates for expressions that are relation names.
Let query expression E have predicate e. Then:
R ⨝ S has predicate r and s
R ∪ S has predicate r or s
R - S has predicate r and not s
σ p (R) has predicate r and p
π A (R) has predicate exists non-A attributes of R [r]
When we want the tuples satisfying a certain predicate we find a way
to express that predicate in terms of relation operator
transformations of given relation predicates. The corresponding query
returns/calculates the tuples.
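The operator-to-predicate correspondence above can be sketched in Python. This is a minimal illustration, not part of the original answer: the row representation, the helper names (`rel`, `natural_join`, `select`, `project`) and the sample data are all assumptions, and the `company(comp_id, company_name, city)` schema is inferred from the professor's solution.

```python
# A relation is modeled as a set of rows; each row is a frozenset of
# (attribute, value) pairs so that rows are hashable. This representation
# and all helper names are assumptions made for illustration.

def rel(*rows):
    return {frozenset(r.items()) for r in rows}

def natural_join(r, s):
    # predicate: r AND s (rows must agree on the common attributes)
    out = set()
    for a in r:
        for b in s:
            da, db = dict(a), dict(b)
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                out.add(frozenset({**da, **db}.items()))
    return out

def select(r, p):
    # predicate: r AND p
    return {row for row in r if p(dict(row))}

def project(r, attrs):
    # predicate: EXISTS the attributes not in attrs [r]
    return {frozenset((k, v) for k, v in row if k in attrs) for row in r}

# Union is r | s (r OR s); difference is r - s (r AND NOT s).

# Hypothetical sample data.
company = rel(
    {"comp_id": 1, "company_name": "Small Bank Corporation", "city": "Oslo"},
    {"comp_id": 1, "company_name": "Small Bank Corporation", "city": "Bergen"},
    {"comp_id": 2, "company_name": "Acme", "city": "Oslo"},
)

# Rows where: FOR SOME comp_id & company_name,
# the company is in city AND company_name = 'Small Bank Corporation'
sbc_cities = project(
    select(company, lambda t: t["company_name"] == "Small Bank Corporation"),
    {"city"},
)
```

Each helper mirrors one line of the list: the Python condition inside it is the predicate transform for that operator.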
Your solution
company ⋈ Π city (σ company_name=”Small Bank Corporation” (company))
is rows where
company company_id named company_name is in city
AND FOR SOME company_id & company_name [
company company_id named company_name is in city
AND company_name=”Small Bank Corporation”]
ie
company company_id named company_name is in city
AND FOR SOME company_id [
company company_id named ”Small Bank Corporation” is in city]
ie
company company_id named company_name is in city
AND some company named ”Small Bank Corporation” is in city
You are returning rows that have more columns than just company_name. But your companies are not the requested companies.
Projecting your rows on company_name gives rows where
some company named company_name is in some city
AND some company named ”Small Bank Corporation” is in that city
After that I joined the table with the city names with the company table, so that I get all company entries which have the city names in it.
That isn't clear about what you get. However the companies in your rows are those in at least one of the SBC cities. The request was for those in all of the SBC cities:
companies located in every city in which “Small Bank Corporation” is located
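The difference between "at least one" and "every" can be checked on a small example. The sketch below is only an illustration: the sample rows are hypothetical, and the `company(comp_id, company_name, city)` layout is assumed from the professor's solution.

```python
# Hypothetical data; rows are (comp_id, company_name, city).
company = {
    (1, "Small Bank Corporation", "Oslo"),
    (1, "Small Bank Corporation", "Bergen"),
    (2, "Acme", "Oslo"),
    (3, "Globex", "Oslo"),
    (3, "Globex", "Bergen"),
    (3, "Globex", "Tromso"),
}

# s <- PROJECT city (SELECT company_name = 'Small Bank Corporation' (company))
s = {city for (_, name, city) in company if name == "Small Bank Corporation"}

# The join from the question, projected on company_name:
# companies located in AT LEAST ONE of those cities.
in_some = {name for (_, name, city) in company if city in s}

# The professor's expression, step by step.
temp1 = {(cid, name) for (cid, name, _) in company}
# (company, city) combinations that would have to exist but do not
missing = {(cid, name, city) for (cid, name) in temp1 for city in s} - company
temp2 = {(cid, name) for (cid, name, _) in missing}
# companies located in EVERY one of those cities
in_every = {name for (_, name) in temp1 - temp2}
```

With this data the join keeps Acme, which is only in Oslo, while the professor's expression drops it.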
The links I gave tell you how to compose queries but also convert between query result specifications & relational algebra expressions returning a result.
When you see a query for rows matching "every" or "all" of some other rows you can expect that that part of your query involves relational-division or some related idiom. The exact algebra depends on what is intended by the--frequently poorly/ambiguously expressed--requirements. Eg whether "companies located in every city in which" is supposed to be no companies (division) or all companies (related idiom) when there are no such cities. (The normal mathematical interpretation of your assignment is the latter.) Eg whether they want companies in exactly all such cities or at least all such cities.
(It helps to avoid "all" & "every" after "find" & "return", where it is redundant anyway.)
Database Relational Algebra: How to find actors who have played in ALL movies produced by “Universal Studios”?
How to understand u=r÷s, the division operator, in relational algebra?
How to find all pizzerias that serve every pizza eaten by people over 30?

Related

Find the names of students who are not enrolled in any course - Students, Faculty, Courses, Offerings, Enrolled

Given the database below, project the names of the students who are not enrolled in a course using relational algebra.
Students(snum, sname, major, standing, age, gpa)
Faculty(fid, fname, deptid)
Courses(cnum, cname, course_level, credits)
Offerings(onum, cnum, day, starttime, endtime, room, max_occupancy, fid)
Enrolled(snum, onum)
I can get the snum of all students not enrolled in a course with:
π snum Students - π snum Enrolled
But how do I project the sname of the student with the snums that I find?
Every base table holds the rows that make a true proposition (statement) from some (characteristic) predicate (statement template parameterized by columns). The designer gives the predicates. The users keep the tables updated.
-- rows where student [snum] is named [sname] and has major [major] and ...
Students
-- rows where student [snum] is enrolled in offering [onum]
Enrolled
Every query result holds the rows that make a true proposition from some predicate. The predicate of a relation expression is combined from the predicates of its argument expressions depending on its predicate nonterminal. The DBMS evaluates the result.
/* rows where
student [snum] is named [sname] and has major [major] and ...
AND student [snum] is enrolled in offering [onum]
*/
Students ⨝ Enrolled
AND gives NATURAL JOIN, AND condition gives RESTRICT condition, EXISTS columns gives PROJECT other columns. OR & AND NOT with the same columns on both sides give UNION & MINUS. Etc.
/* rows where
THERE EXISTS sname, major, standing, age & gpa SUCH THAT
student [snum] is named [sname] and has major [major] and ...
*/
π snum Students
/* rows where
THERE EXISTS onum SUCH THAT
student [snum] is enrolled in offering [onum]
*/
π snum Enrolled
/* rows where
( THERE EXISTS sname, major, standing, age & gpa SUCH THAT
student [snum] is named [sname] and has major [major] and ...
AND NOT
THERE EXISTS onum SUCH THAT
student [snum] is enrolled in offering [onum]
)
AND student [snum] is named [sname] and has major [major] and ...
*/
(π snum Students - π snum Enrolled) ⨝ Students
You can project out any columns that you don't want from that.
(Notice that we don't need to know constraints to query.)
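The final expression can be traced on a few rows. This is a small sketch with hypothetical data, and it keeps only the snum and sname columns of Students for brevity.

```python
# Hypothetical sample rows.
students = {(1, "Ann"), (2, "Bob"), (3, "Cy")}   # (snum, sname)
enrolled = {(1, 10), (1, 11), (3, 10)}           # (snum, onum)

# PROJECT snum Students - PROJECT snum Enrolled
not_enrolled = {snum for (snum, _) in students} - {snum for (snum, _) in enrolled}

# ... JOIN Students, then PROJECT sname
names = {sname for (snum, sname) in students if snum in not_enrolled}
```

The join back to Students is what recovers sname for each snum that survives the difference.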
Relational algebra for banking scenario
Forming a relational algebra query from an English description
Is there any rule of thumb to construct SQL query from a human-readable description?

Forming a relational algebra query from an English description

I am preparing for an upcoming test in my school.
While I was going through some example questions, I got stuck with one particular question.
Passenger {p_id, p_name, p_nation} with key {p_id}
Flight {f_no, f_date, f_orig, f_dest} with key {f_no, f_date}
Trip {p_id, f_no, f_date, class} with key {p_id, f_no, f_date}
and foreign keys [p_id] ⊆ Passenger[p_id] and [f_no, f_date] ⊆ Flight[f_no, f_date]
The question asks:
Consider classes that passengers have occupied on flights from Narita.
Write in relational algebra: What are the ids of passengers who have
flown from Narita in each of these classes at least once?
What I did so far is:
-- rename class to class' in Trip and join with Trip
Q1 = Trip JOIN RENAME class\class' (Trip)
-- select those Q1 tuples where class = class'
Q2 = RESTRICT class = class' (Q1)
-- Project for those who traveled in different classes more than once
Q3 = PROJECT p_id (Q1 - Q2)
Q3 will show me (if I've done it correctly) all the ids of passengers who traveled more than once in different classes.
Can someone help me to get further from this point?
This is as far as I got.
The Q3 you calculate actually holds passengers who traveled in more than one class on the same flight number on the same day. Moreover, according to the constraints there aren't any such passengers. Here's why:
According to your code Q1 is
/* (tuples where)
p_id took f_no on f_date in class
AND p_id took f_no on f_date in class'
*/
Trip JOIN RENAME class\class' Trip
For Q1, passenger p_id took f_no on f_date in class and (for that flight number and date) in class'. (Note that under common sense, with a person only able to fly a trip in one class at a time, if class <> class' then they must have flown multiple trips with the same flight number on the same date, in different classes.)
Q1 - Q2 is just SELECT class <> class' Q1. So Q3 holds ids of passengers who traveled with different classes with the same flight number on the same date. But those people aren't relevant to a sensible interpretation of your overall query "passengers who have flown from Narita in each of these classes at least once".
But anyway since {f_no, f_date} is a CK (candidate key) of Flight, there's only one flight for a given flight number and date, so no passengers can have flown the same flight number & date more than once. So Q3 is empty anyway.
Forming a relational algebra query from an English description
Always characterize a relation--the value of a given one or of a query (sub)expression--via a statement template--predicate--parameterized by attributes. The relation holds the tuples that make it into a statement--proposition--that is true of the situation.
You must have been given the predicate for each base relation. Eg:
-- (tuples where) p_id took f_no on f_date in class
Trip
Then you need to express your query (sub)expression predicates in terms of the base predicates so that the (sub)expression relations can be calculated in terms of the base relations:
Consider classes that passengers have occupied on flights from Narita.
/* (tuples where)
FOR SOME p_id, f_no, f_date, f_orig & f_dest,
p_id took f_no on f_date in class
AND f_no flew on f_date from f_orig to f_dest
AND f_orig = 'Narita'
*/
PROJECT class SELECT f_orig = 'Narita' (Trip JOIN Flight)
The predicate of r JOIN s is predicate-of-r AND predicate-of-s. The predicate of SELECT c r is predicate-of-r AND c. Every relation operator has such a predicate transform. The predicate of PROJECT some-attributes-of-r r is FOR SOME other-attributes-of-r predicate-of-r. The predicate of RENAME a\a' r is predicate-of-r with (appropriate occurrences of) a replaced by a'.
To query, find some predicate equivalent to your desired predicate, then replace its parts by corresponding relation expressions. See this.
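As a rough sketch of how those predicate transforms compose, the following Python computes the classes used on flights from Narita and then takes one further step. The sample rows are hypothetical, and the final "in each of these classes" step is one possible reading of the question, not something stated in this answer.

```python
# Hypothetical sample data.
flight = {  # (f_no, f_date, f_orig, f_dest)
    ("NH1", "2020-01-01", "Narita", "Sydney"),
    ("NH2", "2020-01-02", "Narita", "Paris"),
    ("QF9", "2020-01-03", "Sydney", "Narita"),
}
trip = {  # (p_id, f_no, f_date, class)
    (1, "NH1", "2020-01-01", "economy"),
    (1, "NH2", "2020-01-02", "business"),
    (2, "NH1", "2020-01-01", "economy"),
    (2, "QF9", "2020-01-03", "business"),
}

# Trip JOIN Flight, SELECT f_orig = 'Narita', PROJECT p_id, class
from_narita = {
    (p, c)
    for (p, f, d, c) in trip
    for (f2, d2, orig, _) in flight
    if (f, d) == (f2, d2) and orig == "Narita"
}

# PROJECT class: classes that passengers have occupied on flights from Narita
classes = {c for (_, c) in from_narita}

# Assumed reading of "in each of these classes": a passenger qualifies
# when no (passenger, class) combination is missing.
passengers = {p for (p, _) in from_narita}
missing = {(p, c) for p in passengers for c in classes} - from_narita
result = passengers - {p for (p, _) in missing}
```

Passenger 2 flew business only on a flight that departed from Sydney, so that trip does not count towards the Narita classes.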
Constraints & querying
We must know the predicates in order to query. The constraints (including FDs, CKs, PKs and FKs) are truths in every situation/state that can arise, expressed in terms of the predicates. We only need to know constraints when querying if the query's predicate can only be phrased in terms of base predicates because those constraints hold. Eg given Trip & Flight but no constraints we can't query for "classes that passengers have occupied on flights from Narita", ie the classes in tuples where:
p_id took f_no on f_date in class from f_orig to f_dest
The closest we can get is (Trip JOIN Flight):
p_id took f_no on f_date in class
AND f_no flew on f_date from f_orig to f_dest
but that doesn't necessarily tell us what class(es) were used on what flights. But if {f_no, f_date} is unique in Flight, which is implied by {f_no, f_date} being a CK of Flight, then the two predicates mean the same thing (ie have the same truth value for every tuple & situation).
On the other hand, since we can express that query given the CK constraint, we don't also need to be told that {f_no, f_date} is a FK from Trip to Flight. The FK says that if some passenger took f_no on f_date in some class then f_no flew on f_date from some origin to some destination and that {f_no, f_date} is a CK of Flight. So a Trip {f_no, f_date} is a Flight {f_no, f_date}. But whether or not that first conjunct of the FK also holds, or any other constraint also holds, the query returns the tuples satisfying its predicate.

Need a query that will satisfy two conditions from two tables

Table a has two fields, field1 and field2, and table b has two fields, field3 and field4.
where
tablea.field1 >= 4 and tableb.field3 = 'male'
Is something like the above query possible? I've tried something like this in my database; there are no errors and I get results, but it checks whether both conditions are true separately.
I'm going to try to be a bit clearer. I can't give out the query outright, as much as I would like to (university reasons), so I'll explain: table 1 has several columns of information, one of which is the number of kids; table 2 has more information on said kids, like gender.
So I'm having trouble creating a query that checks that a parent has 2 kids, and specifically two male kids, thus creating a relationship between the parent table and the kids table.
CREATE TABLE parent
(pID NUMBER,
numberkids INTEGER);
CREATE TABLE kids
(kID NUMBER,
father NUMBER,
mother NUMBER,
gender VARCHAR(7));
select
p.pid
from
kids k
inner join parent pm on pm.pid = k.mother
inner join parent pf on pf.pid = k.father,
parent p
where
p.numberkids >= 2 and k.gender = 'male'
/
This query checks that the parent has 2 or more kids and that the kid's gender is male, but I need it to check whether, of a parent's kids, 2 or more are male (in short, whether the parent has 2 or more male kids).
Sorry for the long-winded explanation. I modified the tables and the query from the ones I'm actually going to use (so there might be some mistakes, but the original query works, just not how I want, as explained above). Any help would be greatly appreciated.
The best thing to do would be to take the numberKids column out of the parent table ... you'll find it very difficult to maintain.
Anyway, something like this might do the trick:
SELECT p.pID
FROM parent p INNER JOIN kids k
ON p.pID IN (k.father, k.mother)
WHERE k.gender = 'male'
GROUP BY p.pID
HAVING COUNT(*) >= 2;
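To try the query above without a database server, it can be run against an in-memory SQLite database from Python. This is only a sketch: SQLite stands in for whatever DBMS the question uses, the inserted rows are hypothetical, and an ORDER BY is added so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (pID NUMBER, numberkids INTEGER);
    CREATE TABLE kids (kID NUMBER, father NUMBER, mother NUMBER, gender VARCHAR(7));
    -- hypothetical rows: parents 1 and 2 have two sons,
    -- parents 3 and 4 have one son and one daughter
    INSERT INTO parent VALUES (1, 2), (2, 2), (3, 2), (4, 2);
    INSERT INTO kids VALUES
        (10, 1, 2, 'male'),
        (11, 1, 2, 'male'),
        (12, 3, 4, 'male'),
        (13, 3, 4, 'female');
""")

# The query from the answer, with ORDER BY added for a stable result.
rows = conn.execute("""
    SELECT p.pID
    FROM parent p INNER JOIN kids k
      ON p.pID IN (k.father, k.mother)
    WHERE k.gender = 'male'
    GROUP BY p.pID
    HAVING COUNT(*) >= 2
    ORDER BY p.pID
""").fetchall()

parents_with_two_sons = [pid for (pid,) in rows]
```

Note that the numberkids column plays no part in the result; the count comes entirely from the kids rows.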

Relational Database - Finding instructors who taught the most courses in 2009

We have the following schema:
instructor(ID, name, dept_name, salary)
teaches(ID, course_id, sec_id, semester, year)
Find instructors who taught the most courses in 2009. Can someone please help me? I'm confused how to write this out in relational algebra.
This must be homework ;-) So I'll give you some hints...
Since I haven't done tuple relational calculus since college (http://en.wikipedia.org/wiki/Relational_algebra), here is an approximation in sql,
select instructor.ID, instructor.name, count(teaches.ID)
from instructor
join teaches on teaches.ID = instructor.ID
group by ...
having count(teaches.ID) >= ...
Leaving you to fill in the group by and >= values.
Think about how you calculate how many courses each teacher teaches,
select teaches.ID, count(*)
from teaches
group by teaches.ID
This might help: MySQL count maximum number of rows
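The counting idea in the hints can be sketched outside SQL as follows. This is an illustration under assumptions: the rows are hypothetical, and it counts teaches rows per instructor (as count(*) does) rather than distinct courses.

```python
from collections import Counter

# Hypothetical rows: (ID, course_id, sec_id, semester, year)
teaches = {
    (1, "CS-101", 1, "Fall", 2009),
    (1, "CS-315", 1, "Spring", 2009),
    (2, "BIO-101", 1, "Summer", 2009),
    (2, "BIO-301", 1, "Summer", 2010),
    (3, "FIN-201", 1, "Spring", 2009),
    (3, "FIN-301", 1, "Fall", 2009),
}

# Count rows per instructor, restricted to 2009 (the GROUP BY step).
counts = Counter(i for (i, _, _, _, year) in teaches if year == 2009)

# Keep every instructor whose count equals the maximum (ties included).
most = max(counts.values())
top = {i for i, n in counts.items() if n == most}
```

Comparing each count against the maximum, instead of sorting and taking the first row, keeps all instructors that tie for first place.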

Can someone explain me how the cartesian product works in relational algebra

here it says
Selection and cross product
Cross product is the costliest operator to evaluate. If the input relations have N and M rows, the result will contain NM rows. Therefore it is very important to do our best to decrease the size of both operands before applying the cross product operator.
suppose that we have 2 relations
first relation is called Student and has 3 attributes, thus
student
|a |b |c |
------------
|__|___|___|
|__|___|___|
|__|___|___|
second relation is university and again with 3 attributes
university
|e |f |g |
------------
|__|___|___|
|__|___|___|
|__|___|___|
we have 3 rows for each relation, so after applying the cross product operation we will get a relation which has 3*3 = 9 rows
now, I don't understand, why 9 and not 3?
won't the final relation be
final relation
|a |b |c |e |f |g |
----------------------
|__|___|___|____|__|__|
|__|___|___|____|__|__|
|__|___|___|____|__|__|
doesn't this have 3 rows again?
Thanks
If the rows in Student are row1, row2 and row3, and the rows in University are row4, row5 and row6, then the cartesian product will contain
row1row4, row1row5, row1row6, row2row4, row2row5, row2row6, row3row4, row3row5, row3row6
Each possible combination of rows. That's how it is defined. Nothing more to it.
Except for your remark "Therefore it is very important to do our best to decrease the size of both operands before applying the cross product operator.". It is important to realise that there do exist optimizers which are able to "rewrite" certain algebra operations. It is certainly not the case that the onus is always on the query writer to determine the "most appropriate way of combining restrictions with other operations". In fact, "moving restrictions to the inside as far as possible" is one of the things industrial optimizers are actually very good at.
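The enumeration of row pairs listed above can be reproduced with a few lines of Python. This is just a sketch that uses the row labels as stand-ins for real rows.

```python
from itertools import product

# Labels standing in for the rows of each relation.
student = ["row1", "row2", "row3"]
university = ["row4", "row5", "row6"]

# Every student row paired with every university row.
pairs = [s + u for s, u in product(student, university)]
```

Here `len(pairs)` is 3 * 3 = 9, one entry per combination, in the same order as the list above.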
Just imagine that you have two tables one with the students and one with the universities, when you do a Cartesian query against a relational database you will get a row for every student which in turn is joined to every university.
Select *
From students,
universities;
OR
SELECT * FROM students CROSS JOIN universities
I know this has little to do with algebra, but since you're on Stack Overflow :D
There is no common attribute to link between student and university so each row in student is matched to each row in university, 3 * 3 = 9
|a|e|
|a|f|
|a|g|
|b|e|
|b|f|
|b|g|
|c|e|
|c|f|
|c|g|
Therefore 9
