Does natural join return a tuple that is not common to both relations but present in only one of the relations? - relational-algebra

Student
sID   sName
-------------
014   Robert
023   Daniel
015   Tomiwa
016   Israel
Apply
sID   cNAME
----------------
014   Crawford
009   Harvard
018   Cambridge
029   Oxford
The assignment is to query for the IDs and names of students who didn’t apply anywhere.
To find the students that did not apply anywhere, we create a relation named Difference, defined as [sID from relation Student] - [sID from relation Apply].
Difference
sID
---
023
015
016
The relation Difference has only one column, sID, which holds the IDs of students who didn’t apply anywhere.
If a natural join is used to combine relation Student and relation Difference, does the resulting relation include a tuple whose sID is present in Student but absent from Difference?
In other words, does the query (Student NaturalJoin Difference) return
sID   sName
-------------
023   Daniel
015   Tomiwa
016   Israel
or a new relation still including the tuple of a student that applied?
sID   sName
-------------
014   Robert
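A natural join keeps only the pairs of tuples that agree on every common attribute, so an sID that appears in Student but not in Difference has nothing to pair with and drops out. A minimal sketch in Python can check this against the tables above. Relations are modelled as lists of dicts, and the helper names (project, difference, natural_join) are made up for this illustration, not taken from any library.

```python
# Relations modelled as lists of dicts (attribute name -> value).
student = [
    {"sID": "014", "sName": "Robert"},
    {"sID": "023", "sName": "Daniel"},
    {"sID": "015", "sName": "Tomiwa"},
    {"sID": "016", "sName": "Israel"},
]
apply_ = [
    {"sID": "014", "cNAME": "Crawford"},
    {"sID": "009", "cNAME": "Harvard"},
    {"sID": "018", "cNAME": "Cambridge"},
    {"sID": "029", "cNAME": "Oxford"},
]

def project(rel, attrs):
    """Projection: keep only the given attributes, dropping duplicates."""
    out = []
    for row in rel:
        t = {a: row[a] for a in attrs}
        if t not in out:
            out.append(t)
    return out

def difference(r, s):
    """Set difference: tuples of r that do not appear in s."""
    return [row for row in r if row not in s]

def natural_join(r, s):
    """Natural join: merge tuples that agree on all common attributes."""
    out = []
    for a in r:
        for b in s:
            common = a.keys() & b.keys()
            if all(a[k] == b[k] for k in common):
                merged = {**a, **b}
                if merged not in out:
                    out.append(merged)
    return out

diff = difference(project(student, ["sID"]), project(apply_, ["sID"]))
result = natural_join(student, diff)
for row in result:
    print(row["sID"], row["sName"])
```

Under these assumptions the join yields only 023 Daniel, 015 Tomiwa and 016 Israel; the tuple for 014 is not included, because 014 has no match in Difference.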

Related

Is this natural join operation used correctly? (Relational Algebra)

I have been given the following task by my professor (the E-R model was provided as an image):
Assume the companies may be located in several cities. Find all companies located in every city in which “Small Bank Corporation” is located.
Now the professor's solution is the following:
s ← Π city (σ company_name=’Small Bank Corporation’ (company))
temp1 ← Π comp_id, company_name (company)
temp2 ← Π comp_id, company_name ((temp1 × s) − company)
result ← Π company_name (temp1 − temp2)
I found a completely different solution myself, using a natural join, which seems much simpler:
What I tried was to use the natural join operation, which we defined as joining relations r and s on their common attributes. I first get all the city names by projecting city from a selection of the companies with the company_name "Small Bank Corporation". After that I join the table of city names with the company table, so that I get all company entries that have those city names in them.
company ⋈ Π city (σ company_name=”Small Bank Corporation” (company))
My question now is whether my solution is also valid, since it seems a little bit too trivial.
Yours isn't the same.
My answer here says how to query relationally. It uses a version of the relational algebra where headings are sets of attribute names. My answer here summarizes it:
Every query expression has an associated (characteristic)
predicate--statement template parameterized by attributes. The tuples
that make the predicate into a true proposition--statement--are in
the relation.
We are given the predicates for expressions that are relation names.
Let query expression E have predicate e. Then:
R ⨝ S has predicate r and s
R ∪ S has predicate r or s
R - S has predicate r and not s
σ p (R) has predicate r and p
π A (R) has predicate exists non-A attributes of R [r]
When we want the tuples satisfying a certain predicate we find a way
to express that predicate in terms of relation operator
transformations of given relation predicates. The corresponding query
returns/calculates the tuples.
Your solution
company ⋈ Π city (σ company_name=”Small Bank Corporation” (company))
is rows where
company company_id named company_name is in city
AND FOR SOME company_id & company_name [
company company_id named company_name is in city
AND company_name=”Small Bank Corporation”]
ie
company company_id named company_name is in city
AND FOR SOME company_id [
company company_id named ”Small Bank Corporation” is in city]
ie
company company_id named company_name is in city
AND some company named ”Small Bank Corporation” is in city
You are returning rows that have more columns than just company_name. But your companies are not the requested companies.
Projecting your rows on company_name gives rows where
some company named company_name is in some city
AND some company named ”Small Bank Corporation” is in that city
After that I joined the table with the city names with the company
table, so that I get all company entrys which have the city names in
it.
That isn't clear about what you get. However the companies in your rows are those in at least one of the SBC cities. The request was for those in all of the SBC cities:
companies located in every city in which “Small Bank Corporation” is located
The links I gave tell you how to compose queries but also convert between query result specifications & relational algebra expressions returning a result.
When you see a query for rows matching "every" or "all" of some other rows you can expect that that part of your query involves relational-division or some related idiom. The exact algebra depends on what is intended by the--frequently poorly/ambiguously expressed--requirements. Eg whether "companies located in every city in which" is supposed to be no companies (division) or all companies (related idiom) when there are no such cities. (The normal mathematical interpretation of your assignment is the latter.) Eg whether they want companies in exactly all such cities or at least all such cities.
(It helps to avoid "all" & "every" after "find" & "return", where it is redundant anyway.)
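The gap between "at least one of the SBC cities" and "all of the SBC cities" can be seen on a small example. The sketch below is only an illustration in Python: the rows in company are made-up sample data, and relations are modelled as sets of (comp_id, company_name, city) tuples. It evaluates both the join from the question and the professor's four-step expression.

```python
from itertools import product

# Hypothetical sample data: (comp_id, company_name, city)
company = {
    (1, "Small Bank Corporation", "Berlin"),
    (1, "Small Bank Corporation", "Munich"),
    (2, "Acme", "Berlin"),
    (2, "Acme", "Munich"),
    (2, "Acme", "Hamburg"),
    (3, "Initech", "Berlin"),
}

# s <- project city (select company_name = 'Small Bank Corporation' (company))
s = {(city,) for (_, name, city) in company if name == "Small Bank Corporation"}

# The question's join: company rows whose city is one of the SBC cities.
joined = {row for row in company if (row[2],) in s}
at_least_one = {name for (_, name, _) in joined}

# The professor's solution.
temp1 = {(cid, name) for (cid, name, _) in company}
# (temp1 x s) - company: (company, SBC city) combinations that do NOT exist
missing = {(cid, name, city) for (cid, name), (city,) in product(temp1, s)} - company
temp2 = {(cid, name) for (cid, name, _) in missing}
in_every_city = {name for (_, name) in temp1 - temp2}

print(sorted(at_least_one))
print(sorted(in_every_city))
```

With this sample data the join also returns Initech, which is in Berlin but not in Munich, while the professor's expression returns only Acme and Small Bank Corporation itself.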
Database Relational Algebra: How to find actors who have played in ALL movies produced by “Universal Studios”?
How to understand u=r÷s, the division operator, in relational algebra?
How to find all pizzerias that serve every pizza eaten by people over 30?

Find the names of students who are not enrolled in any course - Students, Faculty, Courses, Offerings, Enrolled

Given the database below, project the names of the students who are not enrolled in a course using relational algebra.
Students(snum, sname, major, standing, age, gpa)
Faculty(fid, fname, deptid)
Courses(cnum, cname, course_level, credits)
Offerings(onum, cnum, day, starttime, endtime, room, max_occupancy, fid)
Enrolled(snum, onum)
I can get the snum of all students not enrolled in a course with:
π snum Students - π snum Enrolled
But how do I project the sname of the student with the snums that I find?
Every base table holds the rows that make a true proposition (statement) from some (characteristic) predicate (statement template parameterized by columns). The designer gives the predicates. The users keep the tables updated.
-- rows where student [snum] is named [sname] and has major [major] and ...
Students
-- rows where student [snum] is enrolled in offering [onum]
Enrolled
Every query result holds the rows that make a true proposition from some predicate. The predicate of a relation expression is combined from the predicates of its argument expressions depending on its predicate nonterminal. The DBMS evaluates the result.
/* rows where
student [snum] is named [sname] and has major [major] and ...
AND student [snum] is enrolled in offering [onum]
*/
Students ⨝ Enrolled
AND gives NATURAL JOIN, AND condition gives RESTRICT condition, EXISTS columns gives PROJECT other columns. OR & AND NOT with the same columns on both sides give UNION & MINUS. Etc.
/* rows where
THERE EXISTS sname, major, standing, age & gpa SUCH THAT
student [snum] is named [sname] and has major [major] and ...
*/
π snum Students
/* rows where
THERE EXISTS onum SUCH THAT
student [snum] is enrolled in offering [onum]
*/
π snum Enrolled
/* rows where
( THERE EXISTS sname, major, standing, age & gpa SUCH THAT
student [snum] is named [sname] and has major [major] and ...
AND NOT
THERE EXISTS onum SUCH THAT
student [snum] is enrolled in offering [onum]
)
AND student [snum] is named [sname] and has major [major] and ...
*/
(π snum Students - π snum Enrolled) ⨝ Students
You can project out any columns that you don't want from that.
(Notice that we don't need to know constraints to query.)
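As a rough check of the last expression, here is a small sketch in Python. The rows are made-up sample data, and Students is cut down to just (snum, sname) to keep it short; the other columns would simply ride along in the join.

```python
# Hypothetical sample data.
students = {(1, "Ann"), (2, "Bob"), (3, "Cy")}   # (snum, sname)
enrolled = {(1, 101), (3, 102), (3, 103)}        # (snum, onum)

# project snum (Students) - project snum (Enrolled)
not_enrolled = {snum for (snum, _) in students} - {snum for (snum, _) in enrolled}

# (...) natural join Students: keep the Students rows whose snum survived
joined = {(snum, sname) for (snum, sname) in students if snum in not_enrolled}

# project sname
names = {sname for (_, sname) in joined}
print(names)
```

Joining the one-column difference back to Students is what recovers sname: the join can only match on snum, so it acts as a filter on Students.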
Relational algebra for banking scenario
Forming a relational algebra query from an English description
Is there any rule of thumb to construct SQL query from a human-readable description?

Oracle display value replacement of flattened, delimited foreign key values

I am working on a data export for a painfully denormalized COTS product and am hung up on how to plug display values into my selection for columns that contain a delimited string of foreign keys.
Assume the following sets of data for example.
DEPARTMENTS table:
Key  Value
---------------------------------
1    Finance
2    Human Resources
3    Public Affairs
4    Information Technology
PERSONNEL table:
PK   FName  LName  Departments
-------------------------------------------------
111  Marty  Graw   1|~*~|3|~*~|
222  Rick   Shaw   2|~*~|4|~*~|
333  Jean   Poole  4|~*~|2|~*~|3|~*~|1|~*~|
Desired output from select:
FName  LName  Departments
-----------------------------------------------------------------------------------
Marty  Graw   Finance, Public Affairs
Rick   Shaw   Human Resources, Information Technology
Jean   Poole  Information Technology, Human Resources, Public Affairs, Finance
I've found examples of how to deal with delimited strings but nothing that really seems to fit this particular scenario. Ideally I'd like to figure out how I could do it without having to create functions etc. as my permissions are pretty limited.
This will not preserve the original order of the IDs, but if that's not important then this will work:
select DISTINCT
       p.fname
      ,p.lname
      ,LISTAGG(d.value, ', ')
         WITHIN GROUP (ORDER BY d.value)
         OVER (PARTITION BY p.pk)
       AS departments_list
from personnel p
left join departments d
  on INSTR('|~*~|'||p.departments||'|~*~|'
          ,'|~*~|'||d.key||'|~*~|') > 0;
SQL Fiddle: http://sqlfiddle.com/#!4/d292e/3/0
EDIT
If you really need them listed in the same order as the IDs, you can use this variant:
select DISTINCT
p.fname
,p.lname
,LISTAGG(d.value, ', ')
WITHIN GROUP (
ORDER BY INSTR('|~*~|'||p.departments||'|~*~|'
,'|~*~|'||d.key||'|~*~|'))
OVER (PARTITION BY p.pk) AS departments_list
from personnel p
left join departments d
on INSTR('|~*~|'||p.departments||'|~*~|'
,'|~*~|'||d.key||'|~*~|') > 0;
http://sqlfiddle.com/#!4/d292e/4
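To sanity-check the exported rows outside the database, the same lookup can be sketched in a few lines of Python. This is not the SQL approach above, just an independent way to compute the desired output from the sample data in the question; the function name department_names is made up for this example.

```python
DELIM = "|~*~|"

# Sample data from the question.
departments = {
    "1": "Finance",
    "2": "Human Resources",
    "3": "Public Affairs",
    "4": "Information Technology",
}
personnel = [
    ("111", "Marty", "Graw", "1|~*~|3|~*~|"),
    ("222", "Rick", "Shaw", "2|~*~|4|~*~|"),
    ("333", "Jean", "Poole", "4|~*~|2|~*~|3|~*~|1|~*~|"),
]

def department_names(raw):
    """Split the delimited key string and look up each key, keeping the original order."""
    keys = [k for k in raw.split(DELIM) if k]  # the trailing delimiter leaves an empty piece
    return ", ".join(departments[k] for k in keys if k in departments)

for pk, fname, lname, raw in personnel:
    print(fname, lname, department_names(raw))
```

Splitting on the full delimiter compares whole keys, which is the same reason the SQL wraps both sides in the delimiter before calling INSTR: a key such as 1 should not match inside 11.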

Where two or more values match condition?

I have been asked this question:
List the county names and the surnames of the representatives where representatives in the same county have the same surname.
and I have the following table:
***REPRESENTATIVE***
REPI SURNAME    FIRSTNAME  COUNTY     CONS
---- ---------- ---------- ---------- ----
R100 Gorege     Larry      kent       CON1
R101 shneebly   john       kent       CON2
R102 shneebly   steve      kent       CON3
I can't seem to figure out the correct way to ask Oracle to display a surname that exists more than twice where the surnames are in the same county.
I know how to ask WHERE something = something, but that doesn't ask what I want to know.
It sounds like you want to use the HAVING clause after doing a GROUP BY
SELECT surname, county, count(*)
FROM your_table
GROUP BY surname, county
HAVING count(*) > 1;
If you really mean "more than twice" as you wrote, you'd want HAVING count(*) > 2, but then none of your sample data would be returned.
In words, this SQL statement says
Group the data into buckets by surname and county. Each distinct combination of surname and county is a separate bucket.
Count the number of rows in each bucket
Return those buckets where there are at least two rows
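The three steps above can also be sketched in Python, using the sample rows from the question. This is only an illustration of the bucket-and-count idea, not how Oracle evaluates the query.

```python
from collections import Counter

# Sample rows from the question: (repi, surname, firstname, county, cons)
representatives = [
    ("R100", "Gorege", "Larry", "kent", "CON1"),
    ("R101", "shneebly", "john", "kent", "CON2"),
    ("R102", "shneebly", "steve", "kent", "CON3"),
]

# GROUP BY surname, county: one bucket per distinct (surname, county) pair,
# with count(*) as the bucket's value.
counts = Counter((surname, county) for (_, surname, _, county, _) in representatives)

# HAVING count(*) > 1: keep only buckets holding at least two rows.
shared = [(surname, county, n) for (surname, county), n in counts.items() if n > 1]
print(shared)
```

Changing the filter to n > 2 leaves the list empty for this data, matching the note about "more than twice".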

Can someone explain me how the cartesian product works in relational algebra

here it says
Selection and cross product
Cross product is the costliest operator to evaluate. If the input relations have N and M rows, the result will contain NM rows. Therefore it is very important to do our best to decrease the size of both operands before applying the cross product operator.
suppose that we have 2 relations
first relation is called Student and has 3 attributes, thus
student
|a |b |c |
------------
|__|___|___|
|__|___|___|
|__|___|___|
second relation is university and again with 3 attributes
university
|e |f |g |
------------
|__|___|___|
|__|___|___|
|__|___|___|
we have 3 rows for each relation, so after applying the cross product operation we will get a relation which has 3*3 = 9 rows
now, I don't understand, why 9 and not 3?
won't the final relation be
final relation
|a |b |c |e |f |g |
---------------------
|__|___|___|__|___|___|
|__|___|___|__|___|___|
|__|___|___|__|___|___|
doesn't this have 3 rows again?
Thanks
If the rows in Student are row1, row2 and row3, and the rows in University are row4, row5 and row6, then the cartesian product will contain
row1row4, row1row5, row1row6, row2row4, row2row5, row2row6, row3row4, row3row5, row3row6
Each possible combination of rows. That's how it is defined. Nothing more to it.
Except for your remark "Therefore it is very important to do our best to decrease the size of both operands before applying the cross product operator.". It is important to realise that there do exist optimizers which are able to "rewrite" certain algebra operations. It is certainly not the case that the onus is always on the query writer to determine the "most appropriate way of combining restrictions with other operations". In fact, "moving restrictions to the inside as far as possible" is one of the things industrial optimizers are actually very good at.
Just imagine that you have two tables, one with the students and one with the universities. When you run a Cartesian query against a relational database, you get a row for every student, each joined in turn to every university.
Select *
From students,
universities;
OR
SELECT * FROM students CROSS JOIN universities
I know this has little to do with algebra, but since you're on Stack Overflow :D
There is no common attribute to link between student and university so each row in student is matched to each row in university, 3 * 3 = 9
|a|e|
|a|f|
|a|g|
|b|e|
|b|f|
|b|g|
|c|e|
|c|f|
|c|g|
Therefore 9
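The row count is easy to verify with a short sketch in Python. The cell values below are placeholders, since the question only gives empty rows; itertools.product pairs every student row with every university row.

```python
from itertools import product

# Three placeholder rows per relation; attributes (a, b, c) and (e, f, g).
student = [("a1", "b1", "c1"), ("a2", "b2", "c2"), ("a3", "b3", "c3")]
university = [("e1", "f1", "g1"), ("e2", "f2", "g2"), ("e3", "f3", "g3")]

# Cross product: concatenate each student row with each university row.
cross = [s + u for s, u in product(student, university)]

print(len(cross))     # number of rows: N * M
print(len(cross[0]))  # number of attributes: 3 + 3
for row in cross:
    print(row)
```

The width of the result is the sum of the two headings (six attributes), while the height is the product of the two row counts (nine rows). Pasting the tables side by side, as in the question, would only pair row 1 with row 1, row 2 with row 2, and so on.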

Resources