Finding tuples only if they exist in all occurrences of a constraint - relational-algebra

Database (all entries are integers):
ID | BUDGET
1 | 20
8 | 20
10 | 20
5 | 4
9 | 4
10 | 4
1 | 11
9 | 11
Suppose my constraint is a budget of >= 10.
I want to return only the ID 1 in this case. How do I go about it?
I've tried taking the cross product of the relation with itself after selecting budget >= 10, and returning the rows where id1 = id2 and budget1 <> budget2, but that does not work when there is only one budget that is >= 10 (example below).
ID | BUDGET
1 | 20
8 | 20
10 | 20
1 | 4
5 | 4
9 | 4
10 | 4
9 | 4
If I do what I did for the first example, nothing is returned, because budget1 <> budget2 results in an empty table.
EDIT1: I can only use relational algebra to solve the problem, so SQL's EXISTS, WHERE and COUNT keywords can't be used.
Edit2: Only project, select, rename, set difference, set union, left join, right join, full inner join, natural joins, set intersection and cross product allowed

The question is not completely clear to me. If you want to return all the IDs for which there is a budget greater than or equal to 10, and no budget less than 10, the expression is simply the following:
π(ID)(σ(BUDGET>=10)(R)) - π(ID)(σ(BUDGET<10)(R))
If, on the other hand, you want all the IDs that appear with every budget that is present in the relation and greater than or equal to 10, then we must use the ÷ operator:
R ÷ π(BUDGET)(σ(BUDGET>=10)(R))
From your comment, the second case is the correct one. Let’s see how to compute the division from its definition (applied to two generic relations R(A) and S(B)):
R ÷ S = πA-B(R) - πA-B((πA-B(R) x S) - R)
where R is the original relation, and
S = π(BUDGET)(σ(BUDGET>=10)(R)),
that is:
BUDGET
------
20
11
Starting from the inner expression:
πA-B(R) is equal to πID(R) =
ID
--
1
5
8
9
10
then (πA-B(R) x S) is:
ID BUDGET
---------
1 20
1 11
5 20
5 11
8 20
8 11
9 20
9 11
10 20
10 11
then ((πA-B(R) x S) - R) is:
ID BUDGET
---------
5 20
5 11
8 11
9 20
10 11
then πA-B((πA-B(R) x S) - R) is:
ID
--
5
8
9
10
and, finally, subtracting this relation from πA-B(R) we obtain the result:
ID
--
1
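For anyone who wants to check the steps above mechanically, here is a minimal sketch in Python (not relational algebra itself, just sets standing in for relations). It assumes R is held as a set of (ID, BUDGET) tuples; the function name divide and the variable names are my own and not part of any library.

```python
# Relation R as a set of (ID, BUDGET) tuples, copied from the first example.
R = {(1, 20), (8, 20), (10, 20), (5, 4), (9, 4), (10, 4), (1, 11), (9, 11)}


def divide(r, s):
    """Return r ÷ s using only projection, cross product and set difference."""
    ids = {i for (i, _) in r}                      # πID(R)
    candidates = {(i, b) for i in ids for b in s}  # πID(R) x S
    missing = candidates - r                       # (πID(R) x S) - R
    disqualified = {i for (i, _) in missing}       # IDs that lack some budget in S
    return ids - disqualified                      # πID(R) - disqualified IDs


# S = π(BUDGET)(σ(BUDGET>=10)(R))
S = {b for (_, b) in R if b >= 10}
result = divide(R, S)
print(result)  # {1}
```

Each line of divide corresponds to one of the intermediate tables shown above, so printing candidates and missing is an easy way to compare against them.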

Related

How to find max level in each path in a Oracle hierarchical SQL?

I would like to know how to find the max level number within a given path in an Oracle hierarchical query.
For example: the CONNECT BY clause starts with root 1 and the data has the relation below.
parent_id node_id votes
NULL 1 -
1 2 10
2 3 12
3 4 11
1 20 5
20 30 20
20 40 4
40 50 22
Here the first 3 records belong to one path with max level 3.
The next 2 records belong to another path with max level 2.
The last 2 records belong to another path with max level 3.
I need output showing the max level and the minimum votes within each distinct path:
parent_id node_id LEVEL MAX_LEVL MIN_VOTE
1 2 1 3 10
2 3 2 3 10
3 4 3 3 10
1 20 1 2 5
20 30 2 2 5
1 20 1 3 4
20 40 2 3 4
40 50 3 3 4
            1
            |
     --------------
     |            |
     2            20
     |            |
     3      --------------
     |      |            |
     4      30           40
                         |
                         50
Thanks,
Guru
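Not Oracle SQL, but in case it helps to pin down the expected output: a minimal Python sketch that walks every root-to-leaf path in the sample data and builds one row per edge per path. The helper name leaf_paths and the tuple layout are assumptions made for illustration only, and the root is hard-coded to 1 as in the example.

```python
# Edges copied from the question: (parent_id, node_id, votes).
edges = [(1, 2, 10), (2, 3, 12), (3, 4, 11), (1, 20, 5),
         (20, 30, 20), (20, 40, 4), (40, 50, 22)]

# Adjacency list: parent -> [(child, votes), ...] in input order.
children = {}
for parent, node, votes in edges:
    children.setdefault(parent, []).append((node, votes))


def leaf_paths(node, path):
    """Yield every path from `node` down to a leaf as a list of edges."""
    if node not in children:
        yield path
        return
    for child, votes in children[node]:
        yield from leaf_paths(child, path + [(node, child, votes)])


rows = []
for path in leaf_paths(1, []):
    max_level = len(path)                      # depth of the leaf on this path
    min_vote = min(v for _, _, v in path)      # smallest vote along this path
    for level, (parent, node, _) in enumerate(path, start=1):
        rows.append((parent, node, level, max_level, min_vote))

for row in rows:
    print(row)
```

Note that the edge 1 -> 20 appears twice in the result, once for each leaf below node 20, which matches the expected output in the question.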

Take out elements from a vector that meets certain condition

I have two vectors, A = [1,3,5] and B = [1,2,3,4,5,6,7,8,9,10]. I want to get C=[2,4,6,7,8,9,10] by extracting the elements of B that are not in A.
I don't want to use loops, because this is a simplified problem from a real data simulation. In the real case A and B are huge, but A is included in B.
Here are two methods,
C=setdiff(B,A)
but if values are repeated in B they will only come up once in C, or
C=B(~ismember(B,A))
which will preserve repeated values in B.
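The difference between the two is easier to see on a vector with repeats. Here is a rough Python equivalent (Python rather than MATLAB, so treat it as an analogy), assuming a made-up B that contains duplicates; the names unique_diff and masked are only illustrative.

```python
A = [1, 3, 5]
B = [1, 2, 2, 3, 4, 5, 6, 6]  # hypothetical B with repeated values

exclude = set(A)

# Analogous to setdiff(B, A): distinct values of B that are not in A, sorted.
unique_diff = sorted(set(B) - exclude)

# Analogous to B(~ismember(B, A)): keeps B's order and its repeated values.
masked = [b for b in B if b not in exclude]

print(unique_diff)  # [2, 4, 6]
print(masked)       # [2, 2, 4, 6, 6]
```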
One approach with unique, sort and diff -
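C = [A B];                              % concatenate, A first
[~,~,idC] = unique(C);                  % label each element by its distinct value
[sidC,id_idC] = sort(idC(:).');         % group equal labels; (:).' forces a row vector
start_id = id_idC(diff([0 sidC])==1);   % first position of each distinct value in C
out = C(start_id(start_id>numel(A)))    % keep values whose first occurrence lies in B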
Sample runs -
Case #1 (Sample from question):
A =
1 3 5
B =
1 2 3 4 5 6 7 8 9 10
out =
2 4 6 7 8 9 10
Case #2 (Bit more generic case):
A =
11 15 14
B =
19 14 6 8 9 11 15
out =
6 8 9 19

Binary heap insertion, don't understand for loop

In Weiss' "Data Structures and Algorithms In Java", he explains the insert algorithm for binary heaps thusly
public void insert( AnyType x )
{
    if( currentSize == array.length - 1 )
        enlargeArray( array.length * 2 + 1 );

    // Percolate up
    int hole = ++currentSize;
    for( array[0] = x; x.compareTo( array[ hole / 2 ] ) < 0; hole /= 2 )
        array[ hole ] = array[ hole / 2 ];
    array[ hole ] = x;
}
I get the principle of moving a hole up the tree, but I don't understand how he's accomplishing it with this syntax in the for loop... What does the initializer array[0] = x; mean? It seems he's overwriting the root value? It seems like a very contrived piece of code. What's he doing here?
First off, I got a response from Mark Weiss and his email basically said the code was correct (full response at the bottom of this answer).
He also said this:
Consequently, the minimum item is in array index 1 as shown in findMin. To do an insertion, you follow the path from the bottom to the root.
Index 1? Hmmm... I then had to go back and re-read larger portions of the chapter and when I saw figure 6.3 it clicked.
The array is 0-based, but the elements that are considered part of the heap are stored from index 1 onwards. Illustration 6.3 looks like this:
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|   | A | B | C | D | E | F | G | H | I | J |   |   |   |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
  0   1   2   3   4   5   6   7   8   9  10  11  12  13
The value placed at element 0 is a sentinel that makes the loop terminate.
Thus, with the above tree, let's see how the insert function works. H below marks the hole.
First we place x into the 0th element (outside the heap), and place the hole at the next available element in the array.
                                              H
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| x | A | B | C | D | E | F | G | H | I | J |   |   |   |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
  0   1   2   3   4   5   6   7   8   9  10  11  12  13
Then we bubble up (percolate) the hole, moving the values up from "half the index" until we find the right spot to place the x.
If we look at figure 6.5 and 6.6, let's place the actual values into the array:
                          H/2                            H
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
| 14 | 13 | 21 | 16 | 24 | 31 | 19 | 68 | 65 | 26 | 32 |    |    |    |
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
  0    1    2    3    4    5    6    7    8    9    10   11   12   13
Notice that we placed 14, the value to insert, into index 0, but this is outside the heap, our sentinel value to ensure the loop terminates.
Then we compare the value x with the value at hole / 2, which now is 11 / 2 = 5. x is less than 31, so we move 31 down into the hole and move the hole up:
           H/2             H <---------------------------
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
| 14 | 13 | 21 | 16 | 24 | 31 | 19 | 68 | 65 | 26 | 32 | 31 |    |    |
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
  0    1    2    3    4    5    6    7    8    9    10   11   12   13
                           |                             ^
                           +--------- move 31 -----------+
We compare again, 14 is again less than 21 (5 / 2 = 2), so once more:
      H/2   H <------------
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
| 14 | 13 | 21 | 16 | 24 | 21 | 19 | 68 | 65 | 26 | 32 | 31 |    |    |
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
  0    1    2    3    4    5    6    7    8    9    10   11   12   13
            |              ^
            +-- move 21 ---+
Now, however, 14 is not less than 13 (hole / 2 --> 2 / 2 = 1), so we've found the right spot for x:
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
| 14 | 13 | 14 | 16 | 24 | 21 | 19 | 68 | 65 | 26 | 32 | 31 |    |    |
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
  0    1    2    3    4    5    6    7    8    9    10   11   12   13
            ^
            x
As you can see, if you look at illustrations 6.6 and 6.7, this matches the expected behavior.
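To replay the walkthrough, here is a small Python sketch of the same percolate-up loop. It is my own transliteration and not Weiss's code: the function name insert, the plain list standing in for the array, and the assumption that a free slot already exists (so there is no enlargeArray step) are all mine.

```python
def insert(array, current_size, x):
    """Insert x into a 1-based min-heap stored in `array`; return the new size.

    array[0] is reserved for the sentinel. Assumes array already has a
    free slot at index current_size + 1.
    """
    hole = current_size + 1
    array[0] = x  # sentinel: x is never less than itself, so the loop stops at the root
    while x < array[hole // 2]:
        array[hole] = array[hole // 2]  # parent moves down into the hole
        hole //= 2                      # hole moves up to the parent's index
    array[hole] = x
    return current_size + 1


# Values from the example above; index 0 is outside the heap.
heap = [None, 13, 21, 16, 24, 31, 19, 68, 65, 26, 32, None, None, None]
size = insert(heap, 10, 14)
print(heap)
# [14, 13, 14, 16, 24, 21, 19, 68, 65, 26, 32, 31, None, None]
```

Inserting a value smaller than the current root (for example 5) is the case the sentinel exists for: the hole reaches index 1, the comparison against array[0] (which is x itself) fails, and the loop stops without an extra bounds check.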
So while the code isn't wrong, there is one little snag that is perhaps outside the scope of the book.
If the type of x being inserted is a reference type, you will in the current heap have 2 references to the same object just inserted. If you then immediately delete the object from the heap, it looks (but look where looking like got us in the first place...) like the 0th element will still retain the reference, prohibiting the garbage collector from doing its job.
To make sure there's no hidden agenda here, here is the complete answer from Mark:
Hi Lasse,
The code is correct.
The binary heap is a complete binary tree in which on any path from a
bottom to the root, values never increase. Consequently the minimum
item is at the root. The array representation places the root at
index 1, and for any node at index i, the parent is at i/2 (rounded
down) (the left child is at 2i and the right child at 2i+1, but that
is not needed here).
Consequently, the minimum item is in array index 1 as shown in
findMin. To do an insertion, you follow the path from the bottom to
the root.
In the for loop:
hole /= 2 expresses the idea of moving the hole to the parent.
x.compareTo( array[ hole / 2 ]) < 0 expresses the idea that we stay in
the loop as long as x is smaller than the parent.
The problem is that if x is a new minimum, you never get out of the
loop safely (technically you crash trying to compare x and array[0]).
You could put in an extra test to handle the corner case.
Alternatively, the code gets around that by putting x in array[0] at
the start, and since the "parent" of node i is i/2, the "parent" of
the root which is in index 1 can be found in index 0. This guarantees
the loop terminates if x is the new minimum (and then places x, which
is the new minimum in the root at index 1).
A longer explanation is in the book... but the basic concept here is
that of using a sentinel (or dummy) value to avoid extra code for
boundary cases.
Regards,
Mark Weiss
The array initialiser looks wrong. If it were array[hole] = x;, then the whole thing makes perfect sense.
It first puts the value in the lowest rank of the tree (the entry after the current size), then it looks in the entry 'above it' by looking at (int) hole/2.
It keeps moving it up until the comparator tells it to stop. I think that this is a slight misuse of the syntax of a for loop, since it feels like it's really a while(x.compare(hole/2) < 0) type loop.

Oracle Sql query to count time span with certain criteria

I am writing an Oracle SQL query to count the total number of rows whose time difference is greater than 2, but my attempt counts all the rows from the query instead of just the rows that meet that criterion. Does anybody have an idea of what I am missing, or a better approach? Thanks.
This is my query
select DC.CUST_FIRST_NAME, DC.CUST_LAST_NAME, oi.customer_id, oi.order_timestamp,
       oi.order_timestamp - LAG(oi.order_timestamp) OVER (ORDER BY oi.order_timestamp) AS "Difference(In Days)",
       (select Count('Elapsed Order Difference')
          from demo_orders oi,
               demo_customers dc
         where OI.CUSTOMER_ID = DC.CUSTOMER_ID
         group by 'Elapsed Order Difference'
        having count('Elapsed Order Difference') > 3
       ) Total
  from demo_orders oi,
       demo_customers dc
 where OI.CUSTOMER_ID = DC.CUSTOMER_ID
Results
CUST_FIRST_NAME  CUST_LAST_NAME  CUSTOMER_ID  ORDER_TIMESTAMP        Difference(In Days)  TOTAL
Eugene           Bradley         7            8/14/2013 5:59:11 PM                        10
William          Hartsfield      2            8/28/2013 5:59:11 PM   14                   10
Edward "Butch"   OHare           4            9/8/2013 5:59:11 PM    11                   10
Edward           Logan           3            9/10/2013 5:59:11 PM   2                    10
Edward           Logan           3            9/20/2013 5:59:11 PM   10                   10
Albert           Lambert         6            9/25/2013 5:59:11 PM   5                    10
Fiorello         LaGuardia       5            9/30/2013 5:59:11 PM   5                    10
William          Hartsfield      2            10/8/2013 5:59:11 PM   8                    10
John             Dulles          1            10/14/2013 5:59:11 PM  6                    10
Eugene           Bradley         7            10/17/2013 5:59:11 PM  3                    10
This is untested, but I think it might give you what you're after.
with raw_data as (
  select
    dc.cust_first_name, dc.cust_last_name,
    oi.customer_id, oi.order_timestamp,
    oi.order_timestamp - LAG(oi.order_timestamp) OVER
      (ORDER BY oi.order_timestamp) AS "Difference(In Days)",
    case
      when oi.order_timestamp - LAG(oi.order_timestamp)
        over (ORDER BY oi.order_timestamp) > 2 then 1
      else 0
    end as gt2
  from
    demo_orders oi,
    demo_customers dc
  where
    oi.customer_id = dc.customer_id
)
select
  cust_first_name, cust_last_name,
  customer_id, order_timestamp,
  "Difference(In Days)",
  sum (gt2) over (partition by 1) as total
from raw_data
When you do Count('Elapsed Order Difference') above, you are counting every row, no matter what. You could have put count ('frog') or count (*) and have gotten the same result. The having count > 3 was already satisfied since the count of all rows was 10.
In general, I'd try to avoid using a scalar subquery as a column in a query as you have in your example. I'm not saying it's never a good idea, but I would argue that there is usually a better way to do it. With 10 rows, you'll hardly notice a performance difference, but as your datasets grow, this can create issues.
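As a sanity check of the logic (not of the SQL itself, which I haven't run), here is a small Python sketch that mimics LAG and the CASE/SUM over the order dates from the question. It assumes the time of day is identical on every row, as in the posted results, so only the dates are used; the variable names are mine.

```python
from datetime import datetime

# Order dates copied from the question's result set.
stamps = ["8/14/2013", "8/28/2013", "9/8/2013", "9/10/2013", "9/20/2013",
          "9/25/2013", "9/30/2013", "10/8/2013", "10/14/2013", "10/17/2013"]
orders = sorted(datetime.strptime(s, "%m/%d/%Y") for s in stamps)

# LAG equivalent: pair each order with the previous one; the first row has no gap.
gaps = [None] + [(cur - prev).days for prev, cur in zip(orders, orders[1:])]

# CASE WHEN gap > 2 THEN 1 ELSE 0 END, summed over all rows.
total = sum(1 for g in gaps if g is not None and g > 2)

print(gaps)   # [None, 14, 11, 2, 10, 5, 5, 8, 6, 3]
print(total)  # 8
```

Only the gap of 2 days and the first row (which has no previous order) are excluded, which is why the total is 8 rather than 10.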
Expected output:
fn  ln  id  order date  dif  total
E   B   7   8/14/2014        8
W   H   2   8/28/2014   14   8
E   O   4   9/8/2014    11   8
E   L   3   9/10/2014   2    8
E   L   3   9/20/2014   10   8
A   L   6   9/25/2014   5    8
F   L   5   9/30/2014   5    8
W   H   2   10/8/2014   8    8
J   D   1   10/14/2014  6    8
E   B   7   10/17/2014  3    8

Oracle numbering update

I am trying to do some SQL work with Oracle.
I have a table that contains text data, and an ordered list of numbers, 1-27, giving the order in which that text should appear.
So:
Bonjour | 1
mon nom | 2
Jean P. | 3
Hello J | 4
Je suis | 5
is John | 6
Now I have to reorder the list numbers:
move 5 to number 2, and shift 2 to 3, 3 to 4 and 4 to 5, BUT NOT 5 to 6 or 6 to anything. Remember, the list runs to number 27.
So I'll have:
Bonjour | 1
mon nom | 2
Jean P. | 3
Hello J | 4
My name | 5
is John | 6
Does anyone know of a good way to go about doing this?
Something like:
update foobar
   set sort_nr = case
                   when sort_nr = 5 then 2
                   when sort_nr = 2 then 3
                   when sort_nr = 3 then 4
                   when sort_nr = 4 then 5
                   else sort_nr
                 end
 where sort_nr in (2,3,4,5);
The else part in the case is not strictly necessary. But in case you forget the where clause, it prevents accidental wrong updates.
Here is an SQLFiddle example: http://sqlfiddle.com/#!4/52d50/1
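The reason a single CASE works is that every WHEN is evaluated against the old value of sort_nr, so the moves don't chain into each other. A small Python sketch of the same mapping, using the six sample rows from the question (the names rows, mapping and updated are just for illustration):

```python
# Sample rows from the question: (text, sort_nr).
rows = [("Bonjour", 1), ("mon nom", 2), ("Jean P.", 3),
        ("Hello J", 4), ("Je suis", 5), ("is John", 6)]

# Same mapping as the CASE expression; numbers not listed stay unchanged.
mapping = {5: 2, 2: 3, 3: 4, 4: 5}

# Each new number is computed from the OLD number, like a single UPDATE.
updated = sorted(((text, mapping.get(nr, nr)) for text, nr in rows),
                 key=lambda row: row[1])

for text, nr in updated:
    print(text, "|", nr)
```

Running the four moves as four separate UPDATE statements would not behave this way, because the row moved from 5 to 2 would then be caught again by the 2 to 3 step.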
