Oracle - Unique constraint over nullable fields

I would like to gather some ideas on possible solutions for defining a unique constraint over nullable columns in Oracle.
Let's take a table of customers:
ID (PK), FIRST_NAME and LAST_NAME are self-explanatory.
EXT_CODE is unique, visible in the application, and used for synchronization with third-party systems. It is an external ID that is delivered once by the other system and then remains unchanged for the row's whole lifetime.
Example: update clients set first_name = 'ABC' where ext_code = 'ABC'
+---+-----------+-----------+-----------+
|ID |FIRST_NAME |LAST_NAME |EXT_CODE |
+---+-----------+-----------+-----------+
|1 |Peter |Pletan |ABC |
|2 |John |Dollar |DEF |
|3 |Mia |Zin |GHI |
|4 |Jasper |Blau |NULL |
|5 |George |Khan |NULL |
-----------------------------------------
Until now everything is OK: EXT_CODE is unique in this table, so exactly one row is matched when an update from the external system is requested. A client with ext_code = NULL cannot be maintained from the external system, because a predicate like ext_code = NULL never matches anything. There can be only one client with a given EXT_CODE, but any number of clients without one (the column is nullable).
Now comes the difficult part.
I decided that this table should store data for several independent customers, so I added a new column called CUSTOMER_CODE.
This code virtually splits the table into separate spaces, so that every customer sees only her own data.
For this purpose, Oracle VPD (Virtual Private Database) has been introduced:
1. Every customer uses her own Oracle user.
2. On logon, the customer code is loaded.
3. The predicate WHERE CUSTOMER_CODE = 'my_code' (loaded in step 2) is appended to every query.
The modified table might look as follows:
+---------------+---+-----------+-----------+-----------+
|CUSTOMER_CODE |ID |FIRST_NAME |LAST_NAME |EXT_CODE |
+---------------+---+-----------+-----------+-----------+
|C1 |1 |Peter |Pletan |ABC |
|C1 |2 |John |Dollar |DEF |
|C1 |3 |Mia |Zin |GHI |
|C1 |4 |Jasper |Blau |NULL |
|C1 |5 |George |Khan |NULL |
|C2 |6 |Paul |Walker |1 |
|C2 |7 |Simon |Sleeper |2 |
|C2 |8 |Lian |Driver |3 |
|C2 |9 |Cor |Pilot |NULL |
|C2 |10 |Martin |Oldman |NULL |
---------------------------------------------------------
That is the general overview. When customer C1 logs in, she sees only rows 1-5, while C2 sees only rows 6-10.
Here come the issues.
Due to the UNIQUE constraint on EXT_CODE, customers C1 and C2 cannot have the same ext_code; the following two rows already break the constraint:
+---------------+---+-----------+-----------+-----------+
|CUSTOMER_CODE |ID |FIRST_NAME |LAST_NAME |EXT_CODE |
+---------------+---+-----------+-----------+-----------+
|C1 |1 |Peter |Pletan |ABC |
|C2 |2 |John |Dollar |ABC |
That is easily fixed by replacing UNIQUE(EXT_CODE) with UNIQUE(CUSTOMER_CODE, EXT_CODE), but this causes another issue: I can no longer have two rows with an empty ext_code, because (C1, NULL) and (C1, NULL) are the same from Oracle's point of view. The rows with ID 4 and 5 are an example. I could have such rows before CUSTOMER_CODE was introduced.
What are my options now?
1. Function-based index (dropping the unique constraint): an index that sets both values to null if any of them is null, so the row doesn't get indexed at all => might be a solution, but indexes are not deferrable, unlike unique constraints.
2. Trigger: checks the data and throws an exception (only if both values are not null).
3. Make ext_code NOT NULL and place a regular unique constraint over the combination (ext_code, customer_code) => not a viable option.
4. Other ideas: I would like to hear from you.

You haven't said which version of Oracle you're using, but from 11g you can use a virtual column with a unique constraint:
alter table customer add (unq_col varchar2(24) -- or necessary size
generated always as (case when ext_code is null then null
else customer_code||'~'||ext_code end));
alter table customer add (constraint unq_col_con unique (unq_col));
The generated column can be built any way you consider safe - with a delimiter if you can identify a character that can never be in one of the columns, or padding, or whatever is suitable.
Then trying to duplicate a code within a customer fails:
update customer set ext_code = 'ABC' where ext_code = 'DEF'
Error report -
SQL Error: ORA-00001: unique constraint (SCHEMA.UNQ_COL_CON) violated
00001. 00000 - "unique constraint (%s.%s) violated"
But with a different customer is OK:
update customer set ext_code = 'ABC' where ext_code = '1';
1 row updated.
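The delimiter caveat above can be checked outside the database. The following is a minimal Python sketch, not Oracle code: unq_col and find_violation are hypothetical helper names that mimic the CASE expression of the virtual column and a single-column unique check that skips NULL keys.

```python
from typing import Optional, Sequence, Tuple

Row = Tuple[str, Optional[str]]  # (customer_code, ext_code)


def unq_col(customer_code: str, ext_code: Optional[str],
            delimiter: str = "~") -> Optional[str]:
    """Mimic the virtual column: NULL when ext_code is NULL, else the joined key."""
    if ext_code is None:
        return None
    return customer_code + delimiter + ext_code


def find_violation(rows: Sequence[Row]) -> Optional[Tuple[Row, Row]]:
    """Return the first pair of rows with the same non-NULL key, or None."""
    seen = {}
    for row in rows:
        key = unq_col(row[0], row[1])
        if key is None:
            continue  # NULL keys are never compared by the unique constraint
        if key in seen:
            return seen[key], row
        seen[key] = row
    return None


# Same code for two customers plus several NULL codes: no violation.
ok_rows = [("C1", "ABC"), ("C2", "ABC"), ("C1", None), ("C1", None)]
print(find_violation(ok_rows))

# Different (customer, code) pairs that collide because the data contains '~'.
tricky_rows = [("A~B", "C"), ("A", "B~C")]
print(find_violation(tricky_rows))
```

The second case reports a violation even though the two pairs are different, which is why the delimiter has to be a character that can never appear in either column.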

Related

How should I index a FULLNAME field in Oracle when I need to query by first and last name?

I have a rather large table (34 GB, 77M rows) which contains payment information. The table is partitioned by payment date because users usually care about small ranges of dates so the partition pruning really helps queries to return quickly.
The problem is that I have a user who wants to find out all payments that have ever been made to certain people.
Names are stored in columns NAME1 and NAME2, which are both VARCHAR2(40 Byte) and hold free-form full name data. For example, John Q Public could appear in either column as:
John Q Public
John Public
Public, John Q
or even embedded in the middle of the field, like "Estate of John Public"
Right now, the way the query is set up is to look for
NAME1||NAME2 LIKE '%JOHN%PUBLIC%' OR NAME1||NAME2 LIKE '%PUBLIC%JOHN%' and as you can imagine, the performance sucks.
Is this a job for Oracle Text? How else could I better index the atomic bits of the columns so that the user can search by first/last name?
Database Version: Oracle 12c (12.1.0.2.0)

Create a multi-column index on both names and modify your query to use an INDEX FAST FULL SCAN operation.
Traversing a b-tree index is a great way to quickly find a small amount of data. Unfortunately the leading wildcards ruin that access path for your query. However, Oracle has multiple ways of reading data from an index. The INDEX FAST FULL SCAN operation simply reads all of the index blocks in no particular order, as if the index was a skinny table. Since the average row length of your table is 442 bytes, and the two columns use at most 80 bytes, reading all the names in the index may be much faster than scanning the entire table.
But the index alone probably isn't enough. You need to change the concatenation into multiple OR expressions.
Sample schema:
--Create payment table and index on name columns.
create table payment
(
id number,
paydate date,
other_data varchar2(400),
name1 varchar2(40),
name2 varchar2(40)
);
create index payment_idx on payment(name1, name2);
--Insert 100K sample rows.
insert into payment
select level, sysdate + level, lpad('A', 400, 'A'), level, level
from dual
connect by level <= 100000;
--Insert two rows with relevant values.
insert into payment values(0, sysdate, 'other data', 'B JOHN B PUBLIC B', 'asdf');
insert into payment values(0, sysdate, 'other data', 'asdf', 'C JOHN C PUBLIC C');
commit;
--Gather stats to help optimizer pick the right plan.
begin
dbms_stats.gather_table_stats(user, 'payment');
end;
/
Original expression uses a full table scan:
explain plan for
select name1, name2
from payment
where NAME1||NAME2 LIKE '%JOHN%PUBLIC%' OR NAME1||NAME2 LIKE '%PUBLIC%JOHN%';
select * from table(dbms_xplan.display);
Plan hash value: 684176532
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9750 | 4056K| 1714 (1)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| PAYMENT | 9750 | 4056K| 1714 (1)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("NAME1"||"NAME2" LIKE '%JOHN%PUBLIC%' OR "NAME1"||"NAME2"
LIKE '%PUBLIC%JOHN%')
New expression uses a faster INDEX FAST FULL SCAN operation:
explain plan for
select name1, name2
from payment
where
NAME1 LIKE '%JOHN%PUBLIC%' OR
NAME1 LIKE '%PUBLIC%JOHN%' OR
NAME2 LIKE '%JOHN%PUBLIC%' OR
NAME2 LIKE '%PUBLIC%JOHN%';
select * from table(dbms_xplan.display);
Plan hash value: 1655289165
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 18550 | 217K| 152 (3)| 00:00:01 |
|* 1 | INDEX FAST FULL SCAN| PAYMENT_IDX | 18550 | 217K| 152 (3)| 00:00:01 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("NAME1" LIKE '%JOHN%PUBLIC%' AND "NAME1" IS NOT NULL AND
"NAME1" IS NOT NULL OR "NAME1" LIKE '%PUBLIC%JOHN%' AND "NAME1" IS NOT NULL
AND "NAME1" IS NOT NULL OR "NAME2" LIKE '%JOHN%PUBLIC%' AND "NAME2" IS NOT
NULL AND "NAME2" IS NOT NULL OR "NAME2" LIKE '%PUBLIC%JOHN%' AND "NAME2" IS
NOT NULL AND "NAME2" IS NOT NULL)
This solution should definitely be faster than a full table scan. How much faster depends on the average name size and the name being searched. And depending on the query you may want to add additional columns to keep all the relevant data in the index.
Oracle Text is also a good option, but that feature feels a little "weird" in my opinion. If you're not already using text indexes you might want to stick with normal indexes to simplify administrative tasks.
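One thing worth checking before switching expressions: the concatenated predicate and the per-column predicates are not strictly equivalent. The sketch below uses Python's sqlite3 module only as a runnable stand-in for Oracle (an assumption; LIKE and || behave the same way for these non-NULL sample values), and row 3 is a hypothetical case where the two words are split across NAME1 and NAME2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payment (id INTEGER, name1 TEXT, name2 TEXT)")
conn.executemany(
    "INSERT INTO payment VALUES (?, ?, ?)",
    [
        (1, "B JOHN B PUBLIC B", "asdf"),  # both words in name1
        (2, "asdf", "C JOHN C PUBLIC C"),  # both words in name2
        (3, "JOHN Q", "PUBLIC"),           # hypothetical: words split across columns
    ],
)

# Original predicate: search the concatenation of both columns.
concatenated_sql = """
    SELECT id FROM payment
    WHERE name1 || name2 LIKE '%JOHN%PUBLIC%'
       OR name1 || name2 LIKE '%PUBLIC%JOHN%'
    ORDER BY id
"""

# Rewritten predicate: search each column separately.
per_column_sql = """
    SELECT id FROM payment
    WHERE name1 LIKE '%JOHN%PUBLIC%' OR name1 LIKE '%PUBLIC%JOHN%'
       OR name2 LIKE '%JOHN%PUBLIC%' OR name2 LIKE '%PUBLIC%JOHN%'
    ORDER BY id
"""

concat_ids = [row[0] for row in conn.execute(concatenated_sql)]
column_ids = [row[0] for row in conn.execute(per_column_sql)]
print("concatenated:", concat_ids)
print("per column:  ", column_ids)
```

If a name is always stored whole in one column, as the question describes, both forms return the same rows. If names can ever be split across the two columns, the per-column form misses them.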

Fetch all names into multi row block

I have tables that look like this:
tbl1
+---------+
|c_no |
+---------+
|1 |
+---------+
tbl2
+----------+---------+
|tbl1_c_no |s_name |
+----------+---------+
|1 |A |
|1 |D |
+----------+---------+
My form:
◘ The 1st block's base table usage is tbl1.
◘ C_NO field is auto generated using sequence. (required).
◘ S_GR is just an unbound item. (not required).
◘ The 2nd block's base table usage is tbl2 and is multiple row.
◘ S_NAME. (required)
◘ 1st block is like the parent of 2nd block.
◘ 1st and 2nd block is linked using c_no and tbl1_c_no
For example if I wanted to add some data, it's like this:
Then press F10 for saving:
tbl1 will be:
+---------+
|c_no |
+---------+
|1 |
|2 |
+---------+
tbl2 will be:
+----------+---------+
|tbl1_c_no |s_name |
+----------+---------+
|1 |A |
|1 |D |
|2 |B |
|2 |C |
|2 |E |
+----------+---------+
And my problem is that I wanted to fetch s_names from my 3rd table into 2nd block.
tbl3
+----------+---------+
|s_gr |s_name |
+----------+---------+
|80 |F |
|85 |G |
|84 |H |
|84 |I |
|80 |J |
+----------+---------+
Like this:
Then, after leaving the S_GR field, the form should fetch every S_NAME from tbl3 where S_GR = 80 into the 2nd block.

You can create two blocks:
For the 1st one, to have a block with no base table, create it manually: select the Data Blocks node, click the create icon (a green plus sign) and name the block blk_no. Then add a field s_no on the canvas.
For the 2nd one, use the Data Block Wizard and choose Table or View as the type of the block. There, select both columns of the table tbl1 (s_no and name) as Database Items.
Forms should then invoke the Layout Wizard automatically by default; there, choose only the name column as displayed and leave s_no hidden. Name the block blk_names. This is a base-table block, and the Data Source Name of the block blk_names is the table tbl1.
By the way, set the Number of Records Displayed property to 10 as an example, and rename the field name to snames as in your question.
Set the block's WHERE Clause (in the Database node) to s_no = :blk_no.s_no in the Property Palette. After that, create a KEY-NEXT-ITEM trigger on the s_no field with the code:
go_block('blk_names');
execute_query;
At runtime you can enter an integer value (say 1) for s_no and populate the names field by pressing the Enter key (the records with A and D will appear).
A button might be added with a WHEN-BUTTON-PRESSED trigger having the code:
go_block('blk_names');
delete tbl2;
first_record;
while :blk_names.s_no is not null
loop
insert into tbl2 values(:snames);
next_record;
end loop;
commit;
to populate and re-populate the table tbl2 (in this case tbl2 is populated with the records A and D).
P.S. To suppress the message
FRM-40352: Last Record of Query retrieved
add an ON-MESSAGE trigger at the form level with the code:
if message_code = 40352 then
null;
end if;

T-SQL: Sort by column with not every row contains data

I have sample data as follows:
taskname |skillname |user |Partition
--------------------------------------
taskAAAA |skill1111 |user3 |1
|skill2222 | |1
taskBBBB |skill1111 |user2 |2
taskCCCC |skill3333 |user1 |3
taskDDDD |skill1111 |user4 |4
|skill2222 | |4
If two skills belong to a task, the taskname and user are not repeated in the taskname and user columns.
I managed to assign the same partition number to the rows of the same task. But I need to sort by user in ascending order while keeping the records of each partition together. The result in this case would be as follows:
taskname |skillname |user |Partition
--------------------------------------
taskCCCC |skill3333 |user1 |3
taskBBBB |skill1111 |user2 |2
taskAAAA |skill1111 |user3 |1
|skill2222 | |1
taskDDDD |skill1111 |user4 |4
|skill2222 | |4
Can anyone help me?

ANSI SQL supports NULLS LAST:
order by user nulls last
Not all databases support this construct. It is easily replaced by a two-key sort:
order by (case when user is not null then 1 else 2 end), -- "NULLS LAST"
user

The first option is to use ORDER BY with NULLS LAST:
select * from table order by user NULLS LAST
If your SQL dialect doesn't support NULLS LAST, you can sort on the expression user IS NULL first:
select * from table order by user IS NULL, user
If the user field is null, the expression IS NULL returns 1, otherwise 0. So in an ascending sort the rows with a non-null value (0) come first and the rows with a null value (1) come last.
After that, the rows are sorted by the value of the user field.
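Both answers above move every blank row to the end, which is not quite the layout the question asks for, where each blank row stays under its own task. A possible sketch, run through Python's sqlite3 module as a stand-in for SQL Server, is to sort first by the user that owns each partition. It assumes the blank cells are NULL, and the names tasks, usr, part and owner are made up for the example (user is a reserved word in T-SQL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks (taskname TEXT, skillname TEXT, usr TEXT, part INTEGER)"
)
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?, ?, ?)",
    [
        ("taskAAAA", "skill1111", "user3", 1),
        (None, "skill2222", None, 1),
        ("taskBBBB", "skill1111", "user2", 2),
        ("taskCCCC", "skill3333", "user1", 3),
        ("taskDDDD", "skill1111", "user4", 4),
        (None, "skill2222", None, 4),
    ],
)

# MAX ignores NULLs, so "owner" is the single non-NULL user of each partition.
# Sort by that owner, keep each partition together, and put the row that
# carries the user before the blank rows of the same partition.
sorted_sql = """
    SELECT t.taskname, t.skillname, t.usr, t.part
    FROM tasks t
    JOIN (SELECT part, MAX(usr) AS owner
          FROM tasks
          GROUP BY part) o ON o.part = t.part
    ORDER BY o.owner,
             t.part,
             CASE WHEN t.usr IS NULL THEN 1 ELSE 0 END,
             t.skillname
"""

result = conn.execute(sorted_sql).fetchall()
for row in result:
    print(row)
```

The query uses only a grouped subquery and a CASE expression, so it should carry over to T-SQL largely unchanged, but that is untested here.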

Oracle Insert Into Child & Parent Tables

I have a table - let's call it MASTER - with a lot of rows in it. Now, I had to create another table called 'MASTER_DETAILS', which will be populated with data from another system. Such data will be accessed via a DB link.
MASTER has a FK to MASTER_DETAIL (1 -> 1 Relationship).
I created a SQL to populate the MASTER_DETAILS table:
INSERT INTO MASTER_DETAILS(ID, DETAIL1, DETAILS2, BLAH)
WITH QUERY_FROM_EXTERNAL_SYSTEM AS (
SELECT IDENTIFIER,
FIELD1,
FIELD2,
FIELD3
FROM TABLE#DB_LINK
--- DOZENS OF INNERS AND OUTER JOINS HERE
) SELECT MASTER_DETAILS_SEQ.NEXTVAL,
QES.FIELD1,
QES.FIELD2,
QES.FIELD3
FROM MASTER M
INNER JOIN QUERY_FROM_EXTERNAL_SYSTEM QES ON QES.IDENTIFIER = M.ID
--- DOZENS OF JOINS HERE
Approach above works fine to insert all the values into the MASTER_DETAILS.
Problem is:
In the approach above, I cannot insert the value of MASTER_DETAILS_SEQ.CURRVAL into the MASTER table. So I create all the entries into the DETAILS table but I don't link them to the MASTER table.
Does anyone see a way out of this problem using only an INSERT statement? I wish I could avoid creating a complex script with loops and everything else to handle this problem.
Ideally I want to do something like this:
INSERT INTO MASTER_DETAILS(ID, DETAIL1, DETAILS2, BLAH) AND MASTER(MASTER_DETAILS_ID)
WITH QUERY_FROM_EXTERNAL_SYSTEM AS (
SELECT IDENTIFIER,
FIELD1,
FIELD2,
FIELD3
FROM TABLE#DB_LINK
--- DOZENS OF INNERS AND OUTER JOINS HERE
) SELECT MASTER_DETAILS_SEQ.NEXTVAL,
QES.FIELD1,
QES.FIELD2,
QES.FIELD3
FROM MASTER M
INNER JOIN QUERY_FROM_EXTERNAL_SYSTEM QES ON QES.IDENTIFIER = M.ID
--- DOZENS OF JOINS HERE,
SELECT MASTER_DETAILS_SEQ.CURRVAL FROM DUAL;
I know such approach does not work on Oracle - but I am showing this SQL to demonstrate what I want to do.
Thanks.

If there is really a 1-to-1 relationship between the two tables, then they could arguably be a single table. Presumably you have a reason to want to keep them separate. Perhaps the master is a vendor-supplied table you shouldn't touch and the detail is extra data; but then you're changing the master anyway by adding the foreign key field. Or perhaps the detail will be reloaded periodically and you don't want to update the master table; but then you have to update the foreign key field anyway. I'll assume you're required to have a separate table, for whatever reason.
If you put a foreign key on the master table that refers to the primary key on the detail table, you are restricted to it only ever being a 1-to-1 relationship. If that really is the case then conceptually it shouldn't matter which way the relationship is built - which table has the primary key and which has the foreign key. And if it isn't then your model will break when your detail table (or the remote query) comes back with two rows related to the same master - even if you're sure that won't happen today, will it always be true? The pluralisation of the name master_details suggests that might be expected. Maybe. Having the relationship the other way would prevent that being an issue.
I'm guessing you decided to put the relationship that way round so you can join the tables using the detail's key:
select m.column, md.column
from master m
join master_details md on md.id = m.detail_id
... because you expect that to be the quickest way, since md.id will be indexed (implicitly, as a primary key). But you could achieve the same effect by adding the master ID to the details table as a foreign key:
select m.column, md.column
from master m
join master_details md on md.master_id = m.id
It is good practice to index foreign keys anyway, and as long as you have an index on master_details.master_id then the performance should be the same (more or less, other factors may come in to play but I'd expect this to generally be the case). This would also allow multiple detail records in the future, without needing to modify the schema.
So as a simple example, let's say you have a master table created and populated with some dummy data:
create table master(id number, data varchar2(10),
constraint pk_master primary key (id));
create sequence seq_master start with 42;
insert into master (id, data)
values (seq_master.nextval, 'Foo ' || seq_master.nextval);
insert into master (id, data)
values (seq_master.nextval, 'Foo ' || seq_master.nextval);
insert into master (id, data)
values (seq_master.nextval, 'Foo ' || seq_master.nextval);
select * from master;
ID DATA
---------- ----------
42 Foo 42
43 Foo 43
44 Foo 44
The changes you've proposed might look like this:
create table detail (id number, other_data varchar2(10),
constraint pk_detail primary key(id));
create sequence seq_detail;
alter table master add (detail_id number,
constraint fk_master_detail foreign key (detail_id)
references detail (id));
insert into detail (id, other_data)
select seq_detail.nextval, 'Bar ' || seq_detail.nextval
from master m
-- joins etc
;
... plus the update of the master's foreign key, which is what you're struggling with, so let's do that manually for now:
update master set detail_id = 1 where id = 42;
update master set detail_id = 2 where id = 43;
update master set detail_id = 3 where id = 44;
And then you'd query as:
select m.data, d.other_data
from master m
join detail d on d.id = m.detail_id
where m.id = 42;
DATA OTHER_DATA
---------- ----------
Foo 42 Bar 1
Plan hash value: 2192253142
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 22 | 2 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 1 | 22 | 2 (0)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID| MASTER | 1 | 13 | 1 (0)| 00:00:01 |
|* 3 | INDEX UNIQUE SCAN | PK_MASTER | 1 | | 0 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| DETAIL | 3 | 27 | 1 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | PK_DETAIL | 1 | | 0 (0)| 00:00:01 |
------------------------------------------------------------------------------------------
If you swap the relationship around the changes become:
create table detail (id number, master_id number, other_data varchar2(10),
constraint pk_detail primary key(id),
constraint fk_detail_master foreign key (master_id)
references master (id));
create index ix_detail_master_id on detail (master_id);
create sequence seq_detail;
insert into detail (id, master_id, other_data)
select seq_detail.nextval, m.id, 'Bar ' || seq_detail.nextval
from master m
-- joins etc.
;
No update of the master table is needed, and the query becomes:
select m.data, d.other_data
from master m
join detail d on d.master_id = m.id
where m.id = 42;
DATA OTHER_DATA
---------- ----------
Foo 42 Bar 1
Plan hash value: 4273661231
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 19 | 2 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 1 | 19 | 2 (0)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID| MASTER | 1 | 10 | 1 (0)| 00:00:01 |
|* 3 | INDEX UNIQUE SCAN | PK_MASTER | 1 | | 0 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| DETAIL | 1 | 9 | 1 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN | IX_DETAIL_MASTER_ID | 1 | | 0 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------
The only real difference in the plan is that you now have a range scan instead of a unique scan; if you're really sure it's 1-to-1 you could make the index unique but there's not much benefit.
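As a rough end-to-end check of the reversed relationship, here is a sketch using Python's sqlite3 module as a stand-in for Oracle. It is a translation under assumptions: SQLite has no sequences, so the detail id is generated by INTEGER PRIMARY KEY instead of seq_detail, and the UNIQUE on master_id is the optional 1-to-1 enforcement mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE master (id INTEGER PRIMARY KEY, data TEXT);

    -- The detail row points at its master; the master table is never altered.
    CREATE TABLE detail (
        id         INTEGER PRIMARY KEY,
        master_id  INTEGER NOT NULL UNIQUE REFERENCES master (id),
        other_data TEXT
    );

    INSERT INTO master (id, data)
    VALUES (42, 'Foo 42'), (43, 'Foo 43'), (44, 'Foo 44');

    -- One INSERT ... SELECT both creates the details and links them.
    INSERT INTO detail (master_id, other_data)
    SELECT m.id, 'Bar for ' || m.id
    FROM master m;
""")

joined = conn.execute("""
    SELECT m.data, d.other_data
    FROM master m
    JOIN detail d ON d.master_id = m.id
    WHERE m.id = 42
""").fetchall()
print(joined)

# With the UNIQUE constraint, a second detail for the same master is rejected.
try:
    conn.execute("INSERT INTO detail (master_id, other_data) VALUES (42, 'second')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```

Dropping the UNIQUE keyword would allow several detail rows per master later on without any other schema change.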
SQL Fiddle of this approach.

Will this type of pagination scale?

I need to paginate on a set of models that can/will become large. The results have to be sorted so that the latest entries are the ones that appear on the first page (and then, we can go all the way to the start using 'next' links).
The query to retrieve the first page is the following, 4 is the number of entries I need per page:
SELECT "relationships".* FROM "relationships" WHERE ("relationships".followed_id = 1) ORDER BY created_at DESC LIMIT 4 OFFSET 0;
Since this needs to be sorted and since the number of entries is likely to become large, am I going to run into serious performance issues?
What are my options to make it faster?
My understanding is that an index on 'followed_id' will simply help the where clause. My concern is on the 'order by'

Create an index that contains these two fields in this order: (followed_id, created_at).
Now, how large is the "large" we are talking about here? If it will be on the order of millions of rows, how about something like the following?
Create an index on the keys followed_id, created_at, id. (This might change depending on the fields in the select, where and order by clauses; I have tailored it to your question.)
SELECT relationships.*
FROM relationships
JOIN (SELECT id
FROM relationships
WHERE followed_id = 1
ORDER BY created_at
LIMIT 10 OFFSET 10) itable
ON relationships.id = itable.id
ORDER BY relationships.created_at
An explain would yield this:
+----+-------------+---------------+------+---------------+-------------+---------+------+------+-----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+------+---------------+-------------+---------+------+------+-----------------------------------------------------+
| 1 | PRIMARY | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Impossible WHERE noticed after reading const tables |
| 2 | DERIVED | relationships | ref | sample_rel2 | sample_rel2 | 5 | | 1 | Using where; Using index |
+----+-------------+---------------+------+---------------+-------------+---------+------+------+-----------------------------------------------------+
If you examine it carefully, the sub-query containing the order, limit and offset clauses operates on the index directly instead of the table, and only then joins with the table to fetch the 10 records.
It makes a difference once your query asks for something like limit 10 offset 10000. A plain query has to read and skip the first 10000 records from the table before returning the next 10. This trick should restrict that traversal to just the index.
An important note: I tested this in MySQL. Other databases might have subtle differences in behavior, but the concept should still hold.
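To see that the join-on-ids form returns the same page as the plain LIMIT/OFFSET query, here is a small sketch using Python's sqlite3 module as a stand-in (the answer itself was tested on MySQL). The follower_id column, the index name, the integer created_at values and the helper names are assumptions made for the example, and the sort is descending to match the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE relationships (
        id          INTEGER PRIMARY KEY,
        follower_id INTEGER,
        followed_id INTEGER,
        created_at  INTEGER
    )
""")
conn.execute("""
    CREATE INDEX idx_rel_followed_created
    ON relationships (followed_id, created_at, id)
""")
# 300 sample rows; every third one follows user 1, newer rows have larger created_at.
conn.executemany(
    "INSERT INTO relationships (follower_id, followed_id, created_at) VALUES (?, ?, ?)",
    [(n, n % 3, 1000 + n) for n in range(1, 301)],
)


def page_plain(conn, followed_id, per_page, page_no):
    """Plain pagination: sort and skip on the full rows."""
    sql = """
        SELECT * FROM relationships
        WHERE followed_id = ?
        ORDER BY created_at DESC, id DESC
        LIMIT ? OFFSET ?
    """
    return conn.execute(sql, (followed_id, per_page, per_page * page_no)).fetchall()


def page_deferred(conn, followed_id, per_page, page_no):
    """Pick the ids of the page from the index first, then join back for full rows."""
    sql = """
        SELECT r.*
        FROM relationships r
        JOIN (SELECT id
              FROM relationships
              WHERE followed_id = ?
              ORDER BY created_at DESC, id DESC
              LIMIT ? OFFSET ?) p ON p.id = r.id
        ORDER BY r.created_at DESC, r.id DESC
    """
    return conn.execute(sql, (followed_id, per_page, per_page * page_no)).fetchall()


first_page = page_deferred(conn, 1, 4, 0)
second_page = page_deferred(conn, 1, 4, 1)
print([row[0] for row in first_page])
print([row[0] for row in second_page])
```

Adding id as a tie-breaker in the ORDER BY keeps pages stable when two rows share the same created_at; whether the inner query is actually served from the index alone depends on the database and is worth confirming with its explain output.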

You can index these fields, but it depends:
You can (mostly) assume that created_at is already ordered, so that might be unnecessary. But that depends more on your app.
Anyway, you should index followed_id (unless it's the primary key).
