ORACLE SQL one to many relationship? - oracle

Let's see if I can try to explain what I'm trying to do.
I have a table where each row has a begin date (begin_date) and an end date (end_date). This table has a key (person_key).
I have another table where that person_key can have multiple entries, each with a begin_date, an end_date and an associated value.
Example:
Table 1
key begin_date end_date
123 1/1/2016 1/31/2016
123 2/1/2016 2/29/2016
123 3/1/2016 3/31/2016
Table 2
key begin_date end_date value
123 1/15/2016 2/16/2016 X
123 2/17/2016 12/31/2099 Y
What I want to be able to do in SQL is write a query that will produce the following results:
Table 3
key begin_date end_date value
123 1/1/2016 1/31/2016 X
123 2/1/2016 2/16/2016 X
123 2/17/2016 2/29/2016 Y
123 3/1/2016 3/31/2016 Y
This may be too involved for a simple solution, but I'm just looking for some guidance!

Related

Oracle -- Datatype of column which can store value "13:45"

We need to store a value "13:45" in the column "Start_Time" of an Oracle table.
Value can be read as 45 minutes past 13:00 hours
Which datatype should be used when creating the table? Also, once queried, we would like to see only the value "13:45".
I would make it easier:
create table t_time_only (
time_col varchar2(5),
time_as_interval INTERVAL DAY TO SECOND invisible
generated always as (to_dsinterval('0 '||time_col||':0')),
constraint check_time
check ( VALIDATE_CONVERSION(time_col as date,'hh24:mi')=1 )
);
The check constraint lets you validate input strings:
SQL> insert into t_time_only values('25:00');
insert into t_time_only values('25:00')
*
ERROR at line 1:
ORA-02290: check constraint (CHECK_TIME) violated
And the invisible virtual generated column lets you do simple arithmetic:
SQL> insert into t_time_only values('15:30');
1 row created.
SQL> select trunc(sysdate) + time_as_interval as res from t_time_only;
RES
-------------------
2020-09-21 15:30:00
Your best option is to store the data in a DATE type column. If you are going to be doing any comparisons against the times (querying, sorting, etc.), you will want to make sure that all of the times use the same day. It doesn't matter which day, as long as they are all the same.
CREATE TABLE test_time
(
time_col DATE
);
INSERT INTO test_time
VALUES (TO_DATE ('13:45', 'HH24:MI'));
INSERT INTO test_time
VALUES (TO_DATE ('8:45', 'HH24:MI'));
Test Query
SELECT time_col,
TO_CHAR (time_col, 'HH24:MI') AS just_time,
24 * (time_col - LAG (time_col) OVER (ORDER BY time_col)) AS difference_in_hours
FROM test_time
ORDER BY time_col;
Test Results
TIME_COL JUST_TIME DIFFERENCE_IN_HOURS
____________ ____________ ______________________
01-SEP-20 08:45
01-SEP-20 13:45 5
Table Definition using INTERVAL
create table tab
(tm INTERVAL DAY(1) to SECOND(0));
Input value as literal
insert into tab (tm) values (INTERVAL '13:25' HOUR TO MINUTE );
Input value dynamically
insert into tab (tm) values ( (NUMTODSINTERVAL(13, 'hour') + NUMTODSINTERVAL(26, 'minute')) );
Output
you may either EXTRACT the hour and minute
EXTRACT(HOUR FROM tm) int_hour,
EXTRACT(MINUTE FROM tm) int_minute
or use formatted output with a trick by adding some fixed DATE
to_char(DATE'2000-01-01'+tm,'hh24:mi') int_format
which gives
13:25
13:26
Please see this answer for other formatting options (HH24:MI).
The INTERVAL definition used here may store seconds as well. If this is not acceptable, add a CHECK constraint, e.g. as follows (adjust as required):
tm INTERVAL DAY(1) to SECOND(0)
constraint "wrong interval" check (tm <= INTERVAL '23:59' HOUR TO MINUTE and EXTRACT(SECOND FROM tm) = 0 )
This rejects the following as invalid input
insert into tab (tm) values (INTERVAL '13:25:30' HOUR TO SECOND );
-- ORA-02290: check constraint (X.wrong interval) violated

Finding a conditional min date on tableau

I have a database with lots of records that each have a date.
I want to show the minimum date after which there is no different booking type.
With these fields:
Date, Booking type, Id
ID | Date | Booking type
123 | 01/04 | A
123 | 01/05 | B
123 | 01/06 | A
123 | 01/07 | A
So I would only want to show the record on date 01/06 for ID 123, booking type 'A', as there has not been a different booking type after that.
At the moment I can only get date 01/04 for booking type A, ID 123.
First, calculate how many distinct booking types there are for each ID and Date:
{FIXED [ID], [Date]: COUNTD([Booking Type])}
Once that is done, you can add a filter where SUM(....) > 1
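For comparison, the rule the question describes (per ID, the earliest date of the last unbroken run of a single booking type) can be sketched outside Tableau. This Python is only an illustration under assumptions: the function name and the (id, date, booking_type) tuple layout are hypothetical, and the date strings are assumed to sort chronologically as written.

```python
from itertools import groupby
from operator import itemgetter


def start_of_final_run(rows):
    """Map each id to (date, booking_type) where its last unbroken run begins.

    rows: iterable of (id, date, booking_type); dates must sort chronologically.
    """
    result = {}
    ordered = sorted(rows, key=itemgetter(0, 1))
    for key, group in groupby(ordered, key=itemgetter(0)):
        records = list(group)
        last_type = records[-1][2]
        run_start = records[-1][1]
        # Walk backwards while the booking type stays the same.
        for _, day, booking_type in reversed(records):
            if booking_type != last_type:
                break
            run_start = day
        result[key] = (run_start, last_type)
    return result


# Sample data copied from the question.
rows = [
    (123, "01/04", "A"),
    (123, "01/05", "B"),
    (123, "01/06", "A"),
    (123, "01/07", "A"),
]
final_runs = start_of_final_run(rows)
print(final_runs)  # {123: ('01/06', 'A')}
```

In Tableau itself the same idea would need a table calculation or a level-of-detail expression that compares each row with the last booking type per ID; the sketch only pins down the expected result.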

Creating an oracle table with auto increment feature

How can I create an Oracle table with an auto-increment column such that whenever an existing value is inserted again its counter is incremented, and otherwise a new count is started?
For instance, say I have a column with a phone number and a status.
There should be another column named counter that carries the auto-increment behaviour.
Whenever an existing phone number is inserted again, its counter must be incremented; if a new value is inserted, the counter should start at an initial value for that number.
It depends on how you want to insert the data. If you are going to be inserting many rows at the same time, then try a MERGE statement.
Join with the phone number, if found increment the counter column value else set the counter to 1.
If you are going to be inserting one row at a time then this is best done in the code that performs an insert.
EDIT: I did not think this through. Now that I have, I think it is unnecessary to use a counter column.
If you are going to insert phone numbers multiple times anyway, why don't you simply count each phone number? It doesn't have to be stored.
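Counting on demand instead of storing a counter might look like the sketch below. It uses Python's built-in sqlite3 module purely as a stand-in for Oracle so that it runs anywhere; the table name phone_log, its columns, and the sample numbers are all hypothetical.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for Oracle.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE phone_log (phone_number TEXT NOT NULL, status TEXT)")

# Insert every occurrence as its own row; no counter column is maintained.
conn.executemany(
    "INSERT INTO phone_log (phone_number, status) VALUES (?, ?)",
    [("555-0100", "new"), ("555-0101", "new"), ("555-0100", "repeat")],
)

# Derive the count per phone number whenever it is needed.
counts = dict(
    conn.execute(
        "SELECT phone_number, COUNT(*) FROM phone_log GROUP BY phone_number"
    )
)
print(counts)
```

The GROUP BY query is plain SQL and should carry over to Oracle unchanged; only the connection and table setup are specific to this sketch.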
You can't create a table like that.
You can, however, put your own logic in the place where you INSERT new rows, i.e. not in the table itself. You can also go the route of a TRIGGER.
Additionally, you may wish to simply have your ID be a unique GUID that gets generated and create this duplicate counter whenever it is necessary, using ROW_NUMBER() OVER like EMP_ID in this example from the oracle website:
SELECT department_id, last_name, employee_id, ROW_NUMBER()
OVER (PARTITION BY department_id ORDER BY employee_id) AS emp_id
FROM employees;
DEPARTMENT_ID LAST_NAME EMPLOYEE_ID EMP_ID
------------- ------------------------- ----------- ----------
10 Whalen 200 1
20 Hartstein 201 1
20 Fay 202 2
30 Raphaely 114 1
30 Khoo 115 2
30 Baida 116 3
30 Tobias 117 4
30 Himuro 118 5
30 Colmenares 119 6
For auto increment, you can create a sequence as below.
CREATE SEQUENCE name_of_sequence
START WITH 1
INCREMENT BY 1;
For the second part of your question, you can define a trigger that automatically populates the primary key value using the above sequence.

Parent key not found (oracle)

I want to insert 400000 random rows into a table with an Oracle procedure.
I got a 'violated - parent key not found' error.
create or replace
PROCEDURE TESTDATA AS
X INTEGER;
BEGIN
FOR X IN 1..400000 LOOP
INSERT INTO SALES
SELECT CAST(DBMS_RANDOM.VALUE(1, 10) as INTEGER) as "CUSTOMER_ID",
CAST(DBMS_RANDOM.VALUE(1, 2) as INTEGER) as "PRODUCT_ID",
TO_DATE(TRUNC(DBMS_RANDOM.VALUE(TO_CHAR(TO_DATE('1-jan-2000'),'J'),TO_CHAR(TO_DATE('21-dec-2012'),'J'))),'J') as "SALES_DATE",
CAST(DBMS_RANDOM.VALUE(1, 100) as INTEGER) as "PIECES"
FROM DUAL;
END LOOP;
END TESTDATA;
-- -----------------------------------------------------
-- Table SALES
-- -----------------------------------------------------
CREATE TABLE SALES (
CUSTOMER_ID INTEGER,
PRODUCT_ID INTEGER,
SALES_DATE DATE,
PIECES INTEGER,
PRIMARY KEY (CUSTOMER_ID, PRODUCT_ID, SALES_DATE),
FOREIGN KEY (CUSTOMER_ID) REFERENCES CUSTOMER (CUSTOMER_ID),
FOREIGN KEY (PRODUCT_ID) REFERENCES PRODUCT (PRODUCT_ID)
);
I hope that anyone could help me.
Roman
Your create table statement includes foreign keys on the CUSTOMER and PRODUCT tables. Your insert statement uses random values to populate CUSTOMER_ID and PRODUCT_ID. It is highly unlikely that random values will match existing keys in the referenced tables, so it's unsurprising that you get foreign key violations.
As to how you fix it, well that depends on what you actually want to achieve. As you're populating your table with random numbers you clearly don't care about the data, so you might as well drop the foreign keys. Alternatively you can use primary key values from the referenced tables in the insert statement.
"How can I use the primary key values from the referenced tables?"
You have twenty permutations of PRODUCT and CUSTOMER. So you will need to thrash through them 20000 times to generate 400000 records.
"In the customer table I have inserted 10 rows (1-10) and in the product table 2 rows (1-2)"
Here's a version of your procedure which loops through Products and Customers to generate random combinations of them. There's a third outer loop which allows you to produce as many sets of twenty combos as your heart desires.
create or replace procedure testdata
( p_iters in pls_integer := 1)
as
type sales_nt is table of sales%rowtype;
recs sales_nt;
rec sales%rowtype;
tot_ins simple_integer := 0;
begin
recs := new sales_nt();
dbms_random.seed(val=>to_number(to_char(sysdate,'sssss')));
<< iterz >>
for idx in 1..p_iters loop
<< prodz >>
for c_rec in ( select customer_id from customer
order by dbms_random.value )
loop
<< custz >>
for p_rec in ( select product_id from product
order by dbms_random.value )
loop
rec.customer_id := c_rec.customer_id;
rec.product_id := p_rec.product_id;
rec.sales_date := date '2000-01-01' + dbms_random.value(1, 6000);
rec.pieces := dbms_random.value(1, 100);
recs.extend();
recs(recs.count()) := rec;
end loop custz;
end loop prodz;
forall idx in 1..recs.count()
insert into sales
values recs(idx);
tot_ins := tot_ins + sql%rowcount;
recs.delete();
end loop iterz;
dbms_output.put_line('test records generated = '||to_char(tot_ins));
end testdata;
/
Proof of the pudding:
SQL> set serveroutput on
SQL> exec testdata(2)
test records generated = 40
PL/SQL procedure successfully completed.
SQL> select * from sales;
CUSTOMER_ID PRODUCT_ID SALES_DAT PIECES
----------- ---------- --------- ----------
9 2 21-FEB-02 42
9 1 10-AUG-05 63
7 1 23-FEB-12 54
7 2 21-NOV-12 80
1 2 06-NOV-15 56
1 1 08-DEC-09 47
4 2 08-JUN-10 58
4 1 19-FEB-09 43
2 2 04-SEP-02 64
2 1 09-SEP-15 69
6 2 20-FEB-08 60
...
2 1 11-JAN-16 19
3 2 03-FEB-10 58
3 1 25-JUL-10 66
9 2 26-FEB-16 70
9 1 15-MAY-14 90
6 2 03-APR-05 60
6 1 21-MAY-15 19
40 rows selected.
SQL>
note
As @mathguy points out, the OP's code constrains the range of random values to fit the range of primary keys in the referenced tables. However, I will leave this answer in place because its general approach is both safer (guaranteed to always match a referenced primary key) and less brittle (can cope with inserting or deleting PRODUCT or CUSTOMER records).
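The general idea (draw keys from the parent tables rather than from a numeric range) can also be sketched in a few lines. This is not the PL/SQL above, just a hedged illustration: it uses Python's built-in sqlite3 module as a stand-in for Oracle, keeps the table and column names from the question, and gives each row a distinct date (an assumption of this sketch) so the composite primary key cannot collide.

```python
import random
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY);
    CREATE TABLE product  (product_id  INTEGER PRIMARY KEY);
    CREATE TABLE sales (
        customer_id INTEGER REFERENCES customer (customer_id),
        product_id  INTEGER REFERENCES product (product_id),
        sales_date  TEXT,
        pieces      INTEGER,
        PRIMARY KEY (customer_id, product_id, sales_date)
    );
""")
conn.executemany("INSERT INTO customer VALUES (?)", [(i,) for i in range(1, 11)])
conn.executemany("INSERT INTO product VALUES (?)", [(i,) for i in range(1, 3)])

# Read the keys that actually exist instead of guessing a numeric range.
customer_ids = [r[0] for r in conn.execute("SELECT customer_id FROM customer")]
product_ids = [r[0] for r in conn.execute("SELECT product_id FROM product")]

start = date(2000, 1, 1)
rows = [
    (
        random.choice(customer_ids),
        random.choice(product_ids),
        (start + timedelta(days=i)).isoformat(),  # distinct date per row
        random.randint(1, 100),
    )
    for i in range(1000)
]
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
inserted = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(inserted)
```

Because every customer_id and product_id is sampled from the parent tables, the foreign keys hold no matter which keys exist at the time the data is generated.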

Why is my date dimension table useless? (Confusion over PostgreSQL storage...)

I have looked over this about 4 times and am still perplexed with these results.
Take a look at the following (which I originally posted here)
Date dimension table --
-- Some output omitted
DROP TABLE IF EXISTS dim_calendar CASCADE;
CREATE TABLE dim_calendar (
id SMALLSERIAL PRIMARY KEY,
day_id DATE NOT NULL,
year SMALLINT NOT NULL, -- 2000 to 2024
month SMALLINT NOT NULL, -- 1 to 12
day SMALLINT NOT NULL, -- 1 to 31
quarter SMALLINT NOT NULL, -- 1 to 4
day_of_week SMALLINT NOT NULL, -- 0 (Sunday) to 6 (Saturday)
day_of_year SMALLINT NOT NULL, -- 1 to 366
week_of_year SMALLINT NOT NULL, -- 1 to 53
CONSTRAINT con_month CHECK (month >= 1 AND month <= 12),
CONSTRAINT con_day_of_year CHECK (day_of_year >= 1 AND day_of_year <= 366), -- 366 allows for leap years
CONSTRAINT con_week_of_year CHECK (week_of_year >= 1 AND week_of_year <= 53),
UNIQUE(day_id)
);
INSERT INTO dim_calendar (day_id, year, month, day, quarter, day_of_week, day_of_year, week_of_year) (
SELECT ts,
EXTRACT(YEAR FROM ts),
EXTRACT(MONTH FROM ts),
EXTRACT(DAY FROM ts),
EXTRACT(QUARTER FROM ts),
EXTRACT(DOW FROM ts),
EXTRACT(DOY FROM ts),
EXTRACT(WEEK FROM ts)
FROM generate_series('2000-01-01'::timestamp, '2024-01-01', '1day'::interval) AS t(ts)
);
/* ==> [ INSERT 0 8767 ] */
Tables for testing --
DROP TABLE IF EXISTS just_dates CASCADE;
DROP TABLE IF EXISTS just_date_ids CASCADE;
CREATE TABLE just_dates AS
SELECT a_date AS some_date
FROM some_table;
/* ==> [ SELECT 769411 ] */
CREATE TABLE just_date_ids AS
SELECT d.id
FROM just_dates jd
INNER JOIN dim_calendar d
ON d.day_id = jd.some_date;
/* ==> [ SELECT 769411 ] */
ALTER TABLE just_date_ids ADD CONSTRAINT jdfk FOREIGN KEY (id) REFERENCES dim_calendar (id);
Confusion --
pocket=# SELECT pg_size_pretty(pg_relation_size('dim_calendar'));
pg_size_pretty
----------------
448 kB
(1 row)
pocket=# SELECT pg_size_pretty(pg_relation_size('just_dates'));
pg_size_pretty
----------------
27 MB
(1 row)
pocket=# SELECT pg_size_pretty(pg_relation_size('just_date_ids'));
pg_size_pretty
----------------
27 MB
(1 row)
Why is a table consisting of a bunch of smallints the same size as a table consisting of a bunch of dates? And I should mention that before, when dim_calendar.id was a normal SERIAL, it gave the same 27MB result.
Also, and more importantly -- WHY does a table with 769411 records with a single smallint field have a size of 27MB, which is > 32bytes/record???
P.S. Yes, I will have billions (or at a minimum hundreds of millions) of records, and am trying to add performance and space optimizations wherever possible.
EDIT
This might have something to do with it, so throwing it out there --
pocket=# select count(id) from just_date_ids group by id;
count
--------
409752
359659
(2 rows)
In tables with one or two columns, the biggest part of the size is always the Tuple Header.
Have a look here: http://www.postgresql.org/docs/current/interactive/storage-page-layout.html. It explains how the data is stored. I'm quoting the part of that page that is most relevant to your question:
All table rows are structured in the same way. There is a fixed-size header (occupying 23 bytes on most machines), followed by an optional null bitmap, an optional object ID field, and the user data.
This mostly explains the question
WHY does a table with 769411 records with a single smallint field have a size of 27MB, which is > 32bytes/record???
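To make the arithmetic concrete, here is a rough back-of-the-envelope estimate in Python. It is a sketch under stated assumptions, not an exact model of PostgreSQL: a 64-bit build (8-byte MAXALIGN), the default 8 kB page with a 24-byte page header, a 4-byte line pointer per row, and no dead space. The function name is made up for illustration.

```python
import math


def estimate_heap_bytes(n_rows, data_bytes, page_size=8192, page_header=24,
                        tuple_header=23, line_pointer=4, maxalign=8):
    """Estimate (bytes per row incl. line pointer, total heap bytes)."""

    def align(n, boundary):
        # Round n up to the next multiple of boundary.
        return -(-n // boundary) * boundary

    header = align(tuple_header, maxalign)             # 23 -> 24 bytes
    tuple_size = align(header + data_bytes, maxalign)  # padded tuple on the page
    per_row = tuple_size + line_pointer
    rows_per_page = (page_size - page_header) // per_row
    pages = math.ceil(n_rows / rows_per_page)
    return per_row, pages * page_size


per_row, total = estimate_heap_bytes(769411, data_bytes=2)  # one smallint
print(per_row, total, round(total / 1024**2, 1))  # 36 27893760 26.6
```

Under these assumptions a 2-byte smallint and a 4-byte date or integer both round up to the same 32-byte tuple, so all three variants of the table come out at roughly 26.6 MiB, consistent with the 27 MB reported above.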
The other part of your question has to do with the byte alignment of Postgres data. Smallints are aligned at 2-byte offsets, but ints (and dates, of course; a date is an int4 after all) are aligned at 4-byte offsets. So the order in which the table columns are declared plays a significant role.
A table with smallint, date, smallint needs 12 bytes for user data (not counting the overhead), while declaring smallint, smallint, date will need only 8 bytes. See a great (and surprisingly not accepted) answer here: Calculating and saving space in PostgreSQL
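The effect of column order can be checked with a small calculation. The following Python is a simplified sketch, not PostgreSQL's actual code: it only knows the three fixed-width types mentioned here, and the final_align parameter is an assumption of the sketch (4 reproduces the figures in this answer; 8 would model the MAXALIGN padding of a typical 64-bit build).

```python
# (size, alignment) in bytes for the fixed-width types discussed here
TYPES = {"smallint": (2, 2), "integer": (4, 4), "date": (4, 4)}


def align(n, boundary):
    """Round n up to the next multiple of boundary."""
    return -(-n // boundary) * boundary


def data_bytes(columns, final_align=4):
    """User-data size for the given column order, including padding."""
    offset = 0
    for name in columns:
        size, alignment = TYPES[name]
        offset = align(offset, alignment)  # pad so the column starts aligned
        offset += size
    return align(offset, final_align)      # trailing padding


mixed = data_bytes(["smallint", "date", "smallint"])
packed = data_bytes(["smallint", "smallint", "date"])
print(mixed, packed)  # 12 8
```

With final_align=8 the first ordering grows to 16 bytes while the second stays at 8, so grouping the two smallints together saves a full alignment slot per row.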
