Is having union in σ the same as having two queries? - relational-algebra

I have this question:
What are the names of Employees in Boston or Chicago?
With these relations:
employees(id, name) and workIn(id, city)
Where the id in both relations refers to the same thing (the id of the employee)
The query I wrote was:
Π name (σ city="Boston" U city="Chicago"(employees ⋈ workIn))
The solution given to the question was:
Π name (σ city="Boston"(employees ⋈ workIn)) U
Π name (σ city="Chicago"(employees ⋈ workIn))
Would the two queries return the same result? Or is my query just wrong?
If my query is wrong, what would the difference be in values returned?

Your query is wrong, since you are using the union operator (U) between two logical conditions, city="Boston" U city="Chicago", which does not make sense: union is a set operator, not a logical operator.
The logical operator to use in a condition is "or" (written ∨), which makes a compound condition true when either of the two components is true (or both are true, though that is not possible here).
So a correct expression is:
Π name (σ city="Boston" ∨ city="Chicago"(employees ⋈ workIn))
and this is equivalent to the expression with the Union:
Π name (σ city="Boston"(employees ⋈ workIn)) U
Π name (σ city="Chicago"(employees ⋈ workIn))
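To see the equivalence concretely, here is a minimal sketch using Python sets of tuples to stand in for relations (the sample data is invented for illustration):

# Relations as sets of tuples -- invented sample data.
employees = {(1, "Ann"), (2, "Bob"), (3, "Carla")}
workIn    = {(1, "Boston"), (2, "Chicago"), (3, "Denver")}

# Natural join on id.
joined = {(eid, name, city)
          for (eid, name) in employees
          for (wid, city) in workIn
          if eid == wid}

# Selection with a disjunction (the corrected query), then projection.
q1 = {name for (eid, name, city) in joined
      if city == "Boston" or city == "Chicago"}

# Union of two selected-and-projected subqueries (the given solution).
q2 = ({name for (eid, name, city) in joined if city == "Boston"} |
      {name for (eid, name, city) in joined if city == "Chicago"})

assert q1 == q2   # both return {"Ann", "Bob"}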

Related

Prolog - Print truth table for first order logic

I'm having some difficulty printing a truth table for first-order logic.
The tasks is:
Write a Prolog predicate table(F, U) that receives as input a
well-formed formula F of propositional logic and
is satisfied by the rows of the truth table of formula F
which have value U. Formula F is expressed by
the function symbols not/1, and/2 and or/2,
the constants t (for true) and f (for false),
round brackets where needed, and free variables as
propositional symbols.
Example
?- table(and(P, or(Q, P)), U).
P=t, Q=f, U=t;
P=t, Q=t, U=t;
P=f, Q=f, U=f;
P=f, Q=t, U=f;
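The exercise asks for Prolog, but the intended semantics can be sketched in a few lines of Python (the encoding of formulas as nested tuples is my own assumption, chosen just to illustrate the enumeration):

from itertools import product

# Evaluate a formula built from not/and/or, the constants 't'/'f',
# and variable names, under an assignment env of variable -> bool.
def eval_formula(f, env):
    if f == 't': return True
    if f == 'f': return False
    if isinstance(f, str): return env[f]      # a propositional variable
    op, *args = f
    if op == 'not': return not eval_formula(args[0], env)
    if op == 'and': return all(eval_formula(a, env) for a in args)
    if op == 'or':  return any(eval_formula(a, env) for a in args)

# Collect the free variables of a formula.
def variables(f, acc=None):
    acc = set() if acc is None else acc
    if isinstance(f, tuple):
        for a in f[1:]: variables(a, acc)
    elif f not in ('t', 'f'):
        acc.add(f)
    return acc

# Print the rows of the truth table of f that have value u.
def table(f, u):
    vs = sorted(variables(f))
    for vals in product([True, False], repeat=len(vs)):
        env = dict(zip(vs, vals))
        if eval_formula(f, env) == u:
            print(env)

# table(('and', 'P', ('or', 'Q', 'P')), True) prints the two rows
# with P = t, matching the example output above.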

Translate to RA: bi-implication/equivalence

(No this isn't one of those translate SQL to RA questions ;-) I have a formula in First-Order Logic that I want to express in RA. That ought to be easy: Codd's 1972 approach in the Relational Completeness paper is to show each FOL operator can be equivalently expressed in RA.
Given relation SP:
Heading {S# CHAR, P# CHAR, QTY INT}
Key {S#, P#}
Characteristic predicate SP(s, p, q) = 'Supplier s supplies Part p in quantity q.'
Express: 'Supplier 'S1' and Supplier 'S2' supply exactly the same set of Parts (disregarding quantities).'
Formula:
∀p. (∃q1. SP('S1', p, q1) ) ⇔ (∃q2. SP('S2', p, q2) )
Note that in the case of S1 supplying no parts at all, this formula is true just in case S2 also supplies no parts.
This is a Yes/No question (the formula has no free variables); so I'd expect the RA expression must result in a relation with no attributes, returning an empty relation if the two Suppliers do not supply the same set of Parts (formula evaluates to False); otherwise the non-empty relation with no attributes (formula evaluates to True).
To explain a bit further: usually queries return a list of something -- such as the list of Parts supplied by S1, disregarding quantities: SP WHERE (S# = 'S1') {P#} (or in Greek π{P#}(σS# = 'S1'(SP))). For a Yes/No question, we're interested only in whether the query returns something vs nothing, e.g. does Supplier S1 supply Part P456?: SP WHERE (S# = 'S1' AND P# = 'P456') {} (π{}(σS# = 'S1'(σP# = 'P456'(SP)))).
You'll notice I'm using a variant of RA: Tutorial D from Date & Darwen. This is easier to read and typeset than Codd's original RA (I've also included the Greek characters and subscripts form). I'll limit myself to Tutorial D operators that correspond to Codd's RA.
I can do the inverse of the query I want: 'Are there any Parts Supplied by S1 but not by S2, or vice versa?'
Firstly a couple of shorthands for common subexpressions
WITH S1P := SP WHERE (S# = 'S1'){P#},
S2P := SP WHERE (S# = 'S2'){P#} :
( S1P MINUS S2P )
UNION
( S2P MINUS S1P );
For those who prefer Greek:
S1P := π{P#}(σS# = 'S1'(SP))
S2P := π{P#}(σS# = 'S2'(SP))
(S1P \ S2P) ∪ (S2P \ S1P)
This'll return an empty result just in case the two Suppliers supply exactly the same set of Parts. So all that remains to do is project that result on to no attributes, and flip empty result to non-empty and vice versa. But Codd's RA doesn't have a way to express that flip, AFAICT.
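To make the symmetric-difference query concrete, here is a small Python sketch with relations as sets of tuples (the sample data is invented):

# SP as a set of (S#, P#, QTY) tuples -- invented sample data.
SP = {('S1', 'P1', 10), ('S1', 'P2', 20), ('S2', 'P1', 5)}

S1P = {p for (s, p, q) in SP if s == 'S1'}   # π{P#}(σS# = 'S1'(SP))
S2P = {p for (s, p, q) in SP if s == 'S2'}   # π{P#}(σS# = 'S2'(SP))

diff = (S1P - S2P) | (S2P - S1P)   # (S1P \ S2P) ∪ (S2P \ S1P)
# Non-empty just in case the two Suppliers do NOT supply the same Parts.
print(diff)   # {'P2'} for this sample data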
Applying Codd's 1972 method to the formula, the outermost operation is a forall quantifier, so convert that to a negation of an existential:
¬∃p. ¬( (∃q1. SP('S1', p, q1) ) ⇔ (∃q2. SP('S2', p, q2) ) )
But now the outermost operation is negation. Codd's method only allows negation to appear nested inside conjunction.
I'm stuck. Any ideas?
There is no RA expression that answers the question, if we limit ourselves to RA operators and semantics per Codd's 1972 specification.
Even if we add the operators commonly included in RA these days, such as those covered in Wikipedia (Rename aka ρ, Extend for calculated columns, Grouping/Aggregation, Outer Joins), we can't answer the question as posed.
From the discussion, arguably, the desired result (a degree-zero relation) is not countenanced by Codd. I say "arguably" because Codd never rigorously defines 'relation'. There's Codd 1970 footnote 1: "R is a subset of the Cartesian product S1 x S2 x ... x Sn."; but no lower bound is given for n. Clearly it's intended to include the degenerate 'product' for n = 1; then why not allow zero?
For example SQL does not support degree-zero tables. SQL does support pseudo-extending a would-be degree-zero table with a dummy column:
SELECT 'Yes' AS Dummy FROM SP WHERE...
Even allowing that, I claim the question as posed can't be answered in SQL. (Consider the case where SP is empty: then the two Suppliers do supply the same set of Products, viz. the empty set; but the FROM SP ... can only return an empty relation.)
Various non-standard operators or primitives have been suggested (see Comments on q and on other answers). AFAICT there is no authoritative reference that 'blesses' any particular approach. For example, the Alice Book seems not to consider relations of degree zero.
To briefly survey the possible operators/primitives. (Any one of these is expressively equivalent to any other, in the sense that if we have one we can define the others in terms of it -- except for the last.)
Those returning true/false:
Relational comparison: subset ⊆, which can be used to define equality of relations ==. (These require the operands to be 'Union Compatible'.)
IS_EMPTY( ) (which appears in Tutorial D).
The difficulty with returning true/false is that there are no such primitives in RA. (RA operators are usually described as "closed over relations".) Alternatively these operators could return a degree-zero relation; but then why not go to that directly?
Those returning a degree-zero relation:
A complement operation, valid only when applied to a degree-zero relation. (This is the "flip" operation discussed in the q.)
Make Dee a primitive -- that is, the non-empty degree-zero relation. Then Dum =df Dee MINUS Dee; and in general the complement of r (which must be degree-zero) =df Dee MINUS r.
Provide primitive(s) to express a relation literal/constant value, just as most programming languages support expressing numeric or String literals, or more complex data structures. Then Dum/Dee are just two amongst the many relation constants.
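As an illustration of the last option, degree-zero relations can be modelled directly in ordinary code: a relation is a set of tuples, and the only tuple of degree zero is the empty tuple (), so there are exactly two degree-zero relation values, Dum and Dee. A rough Python sketch of the 'flip', under that modelling assumption:

# The only degree-zero tuples is (), so the only two values are:
DUM = frozenset()        # empty: stands for False
DEE = frozenset({()})    # non-empty: stands for True

def project_to_zero(rel):
    """π{}: project any relation onto no attributes."""
    return DEE if rel else DUM

def complement_zero(rel):
    """The 'flip', valid only on degree-zero relations: Dee MINUS rel."""
    return DEE - rel

# The original question, using the symmetric difference from above:
SP = {('S1', 'P1', 10), ('S2', 'P1', 5)}
S1P = {p for (s, p, q) in SP if s == 'S1'}
S2P = {p for (s, p, q) in SP if s == 'S2'}
diff = {(p,) for p in (S1P ^ S2P)}
print(complement_zero(project_to_zero(diff)) == DEE)
# True just in case both Suppliers supply the same set of Parts --
# including when SP is empty, the case SQL cannot express.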

Relational Algebra: Natural Join having the same result as Cartesian product

I am trying to understand what will be the result of performing a natural join
between two relations R and S, where they have no common attributes.
By following the definition below, I thought the answer might be an empty set:
Natural Join definition.
My line of thought was that, because the condition in the selection (σ) is not met, the projection of all of the attributes won't take place.
When I asked my lecturer about this, he said that the output will be the same as doing a Cartesian product between R and S.
I can't seem to understand why; I would appreciate any help.
Natural join combines a cross product and a selection into one
operation. It performs a selection forcing equality on those
attributes that appear in both relation schemes. Duplicates are
removed as in all relation operations.
There are two special cases:
• If the two relations have no attributes in common, then their
natural join is simply their cross product.
• If the two relations have more than one attribute in common,
then the natural join selects only the rows where all pairs of
matching attributes match.
Notation: r ⋈ s
Let r and s be relation instances on schema R and S
respectively.
The result is a relation on schema R ∪ S which is
obtained by considering each pair of tuples tr from r and ts from s.
If tr and ts have the same value on each of the attributes in R ∩ S, a
tuple t is added to the result, where
– t has the same value as tr on R
– t has the same value as ts on S
Example:
R = (A, B, C, D)
S = (E, B, D)
Result schema = (A, B, C, D, E)
r ⋈ s is defined as:
πr.A, r.B, r.C, r.D, s.E (σr.B = s.B ∧ r.D = s.D (r × s))
The definition of the natural join you linked can be broken down as:
1. First take the Cartesian product.
2. Then select only those rows in which attributes of the same name have the same value.
3. Now apply a projection so that all attributes have distinct names.
If the two tables have no attributes with the same name, we jump straight to step 3, and therefore the result will indeed be the Cartesian product.
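A quick way to convince yourself is to run the definition directly: with no shared attributes, the selection condition is vacuously true, so every pair of tuples survives. A rough Python sketch (relations as lists of dicts; the names are invented):

# Natural join per the definition above: keep each pair of tuples that
# agrees on all shared attributes, merged into a single tuple.
def natural_join(r, s):
    common = set(r[0]) & set(s[0]) if r and s else set()
    return [{**tr, **ts}
            for tr in r for ts in s
            if all(tr[a] == ts[a] for a in common)]

R = [{'A': 1, 'B': 2}, {'A': 3, 'B': 4}]
S = [{'E': 'x'}, {'E': 'y'}]   # no attributes in common with R

# common is empty, so the all(...) test is vacuously True and the
# result is every combination: exactly the Cartesian product (4 rows).
print(natural_join(R, S))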

How to find the intersection of two NFA

In DFA we can do the intersection of two automata by doing the cross product of the states of the two automata and accepting those states that are accepting in both the initial automata.
Union is performed similarly. However, although I can easily do the union of two NFAs using ε-transitions, how do I do their intersection?
You can use the cross-product construction on NFAs just as you would on DFAs. The only change is in how you handle ε-transitions. Specifically, for each state (qi, rj) in the cross-product automaton, you add an ε-transition from that state to each state (qk, rj) where there's an ε-transition in the first machine from qi to qk, and to each state (qi, rk) where there's an ε-transition in the second machine from rj to rk.
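As a sketch of just that ε-handling (eps1/eps2, mapping each state to its set of ε-successors, are assumed names chosen for illustration):

# For each product state (qi, rj), an epsilon-move of either component
# machine becomes an epsilon-move that changes only that component.
def product_epsilon_moves(Q1, Q2, eps1, eps2):
    moves = {}
    for qi in Q1:
        for rj in Q2:
            moves[(qi, rj)] = (
                {(qk, rj) for qk in eps1.get(qi, set())} |
                {(qi, rk) for rk in eps2.get(rj, set())})
    return moves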
Alternatively, you can always convert the NFAs into DFAs and then compute the cross product of those DFAs.
Hope this helps!
We can also use De Morgan's laws: A ∩ B = (A′ ∪ B′)′
Taking the union of the complements of the two NFAs is comparatively simpler, especially if you are used to the ε-transition method of union.
There is a huge mistake in templatetypedef's answer.
The product automaton of two NFAs L1 and L2:
New states Q = the product of the states of L1 and L2.
Now the transition function:
a is a symbol in the union of both automata's alphabets
delta((q1, q2), a) = delta_L1(q1, a) × delta_L2(q2, a)
which means you take the Cartesian product of the set that results from delta_L1(q1, a) with the set that results from delta_L2(q2, a).
The problem in templatetypedef's answer is that the product states (qk, rk) are not mentioned.
Probably a late answer, but since I ran into a similar problem today I felt like sharing it. Realise the meaning of intersection first: here it means that a given string e should be accepted by both automata.
Consider the following automata:
m1 accepting the language {w | w contains '11' as a substring}
m2 accepting the language {w | w contains '00' as a substring}
Intuitively, m = m1 ∩ m2 is the automaton accepting the strings containing both '11' and '00' as substrings. The idea is to simulate both automata simultaneously.
Let's now formally define the intersection.
m = (Q, Σ, Δ, q0, F)
Let's start by defining the states for m; this is, as mentioned above the Cartesian product of the states in m1 and m2. So, if we have a1, a2 as labels for the states in m1, and b1, b2 the states in m2, Q will consist of following states: a1b1, a2b1, a1b2, a2b2. The idea behind this product construction is to keep track of where we are in both m1 and m2.
Σ most likely remains the same; however, in some cases the alphabets differ, and then we just take the union of the alphabets of m1 and m2.
q0 is now the state in Q containing both the start state of m1 and the start state of m2. (a1b1, to give an example.)
F contains a state s if and only if both states mentioned in s are accept states of m1 and m2 respectively.
Last but not least, Δ: we define it again in terms of the Cartesian product, as follows: Δ(a1b1, E) = Δ(m1)(a1, E) × Δ(m2)(b1, E), as also mentioned in one of the answers above (if I am not mistaken). The intuitive idea behind this construction for Δ is just to tear a1b1 apart and consider the states a1 and b1 in their original automata. Now we 'iterate' over each possible symbol, let's pick E for example, and see where it brings us in the original automaton. After that, we glue these results together using the Cartesian product. If Δ(m1)(a1, E) is non-empty but Δ(m2)(b1, E) is empty, then the edge will not exist in m; otherwise we'll have some kind of product construction.
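Putting the pieces together, here is a small Python sketch of this product construction (ε-transitions omitted for brevity; the five-tuple representation with delta as a dict from (state, symbol) to a set of states is an assumption made for illustration):

from itertools import product

# NFA = (states, alphabet, delta, start, accepting), where delta maps
# (state, symbol) -> set of successor states.
def intersect(n1, n2):
    Q1, A1, d1, s1, F1 = n1
    Q2, A2, d2, s2, F2 = n2
    sigma = A1 | A2                       # union of the alphabets
    Q = set(product(Q1, Q2))              # Cartesian product of states
    delta = {}
    for (a, b), c in product(Q, sigma):
        # Delta((a, b), c) = delta1(a, c) x delta2(b, c); empty if
        # either component set is empty.
        delta[((a, b), c)] = set(product(d1.get((a, c), set()),
                                         d2.get((b, c), set())))
    start = (s1, s2)
    F = {(a, b) for (a, b) in Q if a in F1 and b in F2}
    return Q, sigma, delta, start, F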
An alternative to constructing the product automaton is allowing more complicated acceptance criteria. Ordinarily, an NFA accepts an input string when it has reached any one of a set of accepting final states. That can be extended to boolean combinations of states. Specifically, you construct the automaton for the intersection like you do for the union, but consider the resulting automaton to accept an input string only when it is in (what corresponds to) accepting final states in both automata.

DPLL algorithm definition

I am having some problems understanding the DPLL algorithm and I was wondering if anyone could explain it to me because I think my understanding is incorrect.
The way I understand it is: I take some set of literals, and if every clause is true the model is true, but if some clause is false then the model is false.
I recursively check the model by looking for a unit clause; if there is one, I set the value for that unit clause to make it true, then update the model by removing all clauses that are now true and removing all literals which are now false.
When there are no unit clauses left, I choose any other literal and try assigning the values that make it true and that make it false; in each case I again remove all clauses which are now true and all literals which are now false.
DPLL requires the problem to be stated in conjunctive normal form, that is, as a set of clauses, each of which must be satisfied.
Each clause is a set of literals {l1, l2, ..., ln}, representing the disjunction of those literals (i.e., at least one literal must be true for the clause to be satisfied).
Each literal l asserts that some variable is true (x) or that it is false (~x).
If any literal is true in a clause, then the clause is satisfied.
If all literals in a clause are false, then the clause cannot be satisfied, and hence the current assignment cannot be extended to a solution.
A solution is an assignment of true/false values to the variables such that every clause is satisfied. The DPLL algorithm is an optimised search for such a solution.
DPLL is essentially a depth first search that alternates between three tactics. At any stage in the search there is a partial assignment (i.e., an assignment of values to some subset of the variables) and a set of undecided clauses (i.e., those clauses that have not yet been satisfied).
(1) The first tactic is Pure Literal Elimination: if an unassigned variable x only appears in its positive form in the set of undecided clauses (i.e., the literal ~x doesn't appear anywhere) then we can just add x = true to our assignment and satisfy all the clauses containing the literal x (similarly if x only appears in its negative form, ~x, we can just add x = false to our assignment).
(2) The second tactic is Unit Propagation: if all but one of the literals in an undecided clause are false, then the remaining one must be true. If the remaining literal is x, we add x = true to our assignment; if the remaining literal is ~x, we add x = false to our assignment. This assignment can lead to further opportunities for unit propagation.
(3) The third tactic is to simply choose an unassigned variable x and branch the search: one side trying x = true, the other trying x = false.
If at any point we end up with an unsatisfiable clause then we have reached a dead end and have to backtrack.
There are all sorts of clever further optimisations, but this is the core of almost all SAT solvers.
Hope this helps.
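The tactics above translate almost directly into code. Here is a minimal sketch (clauses as lists of signed integers, -2 meaning ~x2; pure literal elimination is left out to keep it short, and this is illustrative rather than efficient):

# Minimal DPLL: clauses are lists of non-zero ints; -v means 'v is false'.
def dpll(clauses, assignment=None):
    assignment = dict(assignment or {})

    # Unit propagation: a clause whose only undecided literal must be true.
    changed = True
    while changed:
        changed = False
        remaining = []
        for clause in clauses:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue                      # clause already satisfied
            undecided = [l for l in clause if abs(l) not in assignment]
            if not undecided:
                return None                   # falsified clause: dead end
            if len(undecided) == 1:
                l = undecided[0]
                assignment[abs(l)] = l > 0    # forced assignment
                changed = True
            remaining.append(clause)
        clauses = remaining

    if not clauses:                           # every clause satisfied
        return assignment

    # Branch: pick an unassigned variable and try both values.
    v = next(abs(l) for c in clauses for l in c if abs(l) not in assignment)
    for value in (True, False):
        result = dpll(clauses, {**assignment, v: value})
        if result is not None:
            return result
    return None                               # dead end: backtrack

# Example: (x1 v x2) ^ (~x1 v x3) ^ (~x2 v ~x3)
print(dpll([[1, 2], [-1, 3], [-2, -3]]))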
The Davis–Putnam–Logemann–Loveland (DPLL) algorithm is a backtracking-based search algorithm for deciding the satisfiability of propositional logic formulae in conjunctive normal form, also known as the satisfiability problem or SAT.
Any boolean formula can be expressed in conjunctive normal form (CNF), which means a conjunction of clauses, i.e. ( … ) ^ ( … ) ^ ( … )
where a clause is a disjunction of (possibly negated) boolean variables, i.e. (A v B v C’ v D)
an example of a boolean formula expressed in CNF is
(A v B v C) ^ (C’ v D) ^ (D’ v A)
and solving the SAT problem means finding a combination of values for the variables in the formula that satisfies it, like A=1, B=0, C=0, D=0
This is an NP-complete problem. In fact, it was the first problem proven to be NP-complete, by Stephen Cook and, independently, Leonid Levin.
A particular type of SAT problem is 3-SAT, a SAT in which all clauses have exactly three literals.
The DPLL algorithm is a way to solve the SAT problem (whose practical cost depends on the hardness of the input) that recursively creates a tree of potential solutions.
Suppose you want to solve a 3-SAT problem like this
(A v B v C) ^ (C’ v D v B) ^ (B v A’ v C) ^ (C’ v A’ v B’)
if we enumerate the variables like A=1, B=2, C=3, D=4 and use negative numbers for negated variables (like A’ = -1), then the same formula can be written in Python like this
[[1,2,3],[-3,4,2],[2,-1,3],[-3,-1,-2]]
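With this encoding, checking whether a complete assignment satisfies the formula takes only a couple of lines; a tiny illustrative helper:

formula = [[1, 2, 3], [-3, 4, 2], [2, -1, 3], [-3, -1, -2]]

# A literal l is true when its sign matches the value of variable abs(l).
def satisfies(formula, assignment):
    return all(any(assignment[abs(l)] == (l > 0) for l in clause)
               for clause in formula)

# A=1, B=0, C=1, D=1 -- the solution found at the end of the walkthrough.
print(satisfies(formula, {1: True, 2: False, 3: True, 4: True}))   # True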
now imagine creating a tree in which each node contains a partial solution. In our example we also depicted, next to each node, a vector of the clauses satisfied by that partial solution
the root node is [-1,-1,-1,-1], which means that no values, neither 0 nor 1, have yet been assigned to the variables
at each iteration:
we take the first unsatisfied clause, then
if there are no more unassigned variables we can use to satisfy that clause, then there can't be valid solutions in this branch of the search tree and the algorithm shall return None
otherwise we take the first unassigned variable and set it such that it satisfies the clause, and start recursively from step 1. If the inner invocation of the algorithm returns None, we flip the value of the variable so that it does not satisfy the clause, and set the next unassigned variable so as to satisfy the clause. If all three variables have been tried, or there are no more unassigned variables for that clause, it means there are no valid solutions in this branch and the algorithm shall return None
See the following example:
from the root node we choose the first variable (A) of the first clause (A v B v C) and set it so that it satisfies the clause, so A=1 (second node of the search tree)
then we continue with the second clause and pick the first unassigned variable (C), setting it so that it satisfies the clause, which means C=0 (third node on the left)
we do the same thing for the third clause (B v A’ v C) and set B to 1
when we try to do the same thing for the last clause, we realize we no longer have unassigned variables and the clause is false. We then have to backtrack to the previous position in the search tree. We change the value we assigned to B and set B to 0. Then we look for another unassigned variable that can satisfy the third clause, but there is none. So we have to backtrack again, to the second node
once there, we have to flip the assignment of the first variable (C) so that it no longer satisfies the second clause, and set the next unassigned variable (D) so as to satisfy it (i.e. C=1 and D=1). This also satisfies the third clause, which contains C.
The last clause to satisfy, (C’ v A’ v B’), has one unassigned variable, B, which can then be set to 0 in order to satisfy the clause.
At this link http://lowcoupling.com/post/72424308422/a-simple-3-sat-solver-using-dpll you can also find the Python code implementing it
