How should I index a date column when some rows have null values?
We need to select rows within a date range as well as rows with null dates.
We use Oracle 9.2 and higher.
Options I found:
Using a bitmap index on the date column
Using an index on the date column plus an index on a state field whose value is 1 when the date is null
Using an index on the date column plus another column that is guaranteed to be not null
My thoughts on the options:
On 1: the column has too many distinct values for a bitmap index to be useful
On 2: I would have to add a field just for this purpose and change the query whenever I want to retrieve the null-date rows
On 3: it looks tricky to add a column to an index when it is not really needed there
What is the best practice for this case?
Thanks in advance
Some resources I have read:
Oracle Date Index
When does Oracle index null column values?
Edit
Our table has 300,000 records. 1,000 to 10,000 records are inserted and deleted every day. 280,000 records have a null delivered_at date. The table is a kind of picking buffer.
Our structure (translated to English) is:
create table orders
(
orderid VARCHAR2(6) not null,
customerid VARCHAR2(6) not null,
compartment VARCHAR2(8),
externalstorage NUMBER(1) default 0 not null,
created_at DATE not null,
last_update DATE not null,
latest_delivery DATE not null,
delivered_at DATE,
delivery_group VARCHAR2(9),
fast_order NUMBER(1) default 0 not null,
order_type NUMBER(1) default 0 not null,
produkt_group VARCHAR2(30)
)
In addition to Tony's excellent advice, there is also an option to index your column in such a way that you don't need to adjust your queries. The trick is to add a constant value to the index.
A demonstration:
Create a table with 10,000 rows out of which only 6 contain a NULL value for the a_date column.
SQL> create table mytable (id,a_date,filler)
2 as
3 select level
4 , case when level < 9995 then date '1999-12-31' + level end
5 , lpad('*',1000,'*')
6 from dual
7 connect by level <= 10000
8 /
Table created.
First I'll show that if you just create an index on the a_date column, the index is not used when you use the predicate "where a_date is null":
SQL> create index i1 on mytable (a_date)
2 /
Index created.
SQL> exec dbms_stats.gather_table_stats(user,'mytable',cascade=>true)
PL/SQL procedure successfully completed.
SQL> set autotrace on
SQL> select id
2 , a_date
3 from mytable
4 where a_date is null
5 /
ID A_DATE
---------- -------------------
9995
9996
9997
9998
9999
10000
6 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=72 Card=6 Bytes=72)
1 0 TABLE ACCESS (FULL) OF 'MYTABLE' (Cost=72 Card=6 Bytes=72)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
720 consistent gets
0 physical reads
0 redo size
285 bytes sent via SQL*Net to client
234 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
6 rows processed
720 consistent gets and a full table scan.
Now change the index to include the constant 1, and repeat the test:
SQL> set autotrace off
SQL> drop index i1
2 /
Index dropped.
SQL> create index i1 on mytable (a_date,1)
2 /
Index created.
SQL> exec dbms_stats.gather_table_stats(user,'mytable',cascade=>true)
PL/SQL procedure successfully completed.
SQL> set autotrace on
SQL> select id
2 , a_date
3 from mytable
4 where a_date is null
5 /
ID A_DATE
---------- -------------------
9995
9996
9997
9998
9999
10000
6 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=6 Bytes=72)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'MYTABLE' (Cost=2 Card=6 Bytes=72)
2 1 INDEX (RANGE SCAN) OF 'I1' (NON-UNIQUE) (Cost=2 Card=6)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
6 consistent gets
0 physical reads
0 redo size
285 bytes sent via SQL*Net to client
234 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
6 rows processed
6 consistent gets and an index range scan.
Regards,
Rob.
"Our table has 300,000 records....
280,000 records have a null
delivered_at date. "
In other words, almost the entire table satisfies a query that searches on where DELIVERED_AT is null. An index is completely inappropriate for that search; a full table scan is by far the best approach.
If you have an Enterprise Edition license and you have the CPUs to spare, using a parallel query would reduce the elapsed time.
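For illustration, a minimal sketch of such a query against the orders table from the question (the parallel degree of 4 is an arbitrary assumption; match it to the CPUs you can spare):
select /*+ full(o) parallel(o, 4) */ *  -- hint a parallel full table scan
from orders o
where delivered_at is null;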
Do you mean that your queries will be like this?
select ...
from mytable
where (datecol between :from and :to
or datecol is null);
It would only be worth indexing the nulls if they were relatively few in the table - otherwise a full table scan may be the most efficient way to find them. Assuming it is worth indexing them you could create a function-based index like this:
create index mytable_fbi on mytable (case when datecol is null then 1 end);
Then change your query to:
select ...
from mytable
where (datecol between :from and :to
or case when datecol is null then 1 end = 1);
You could wrap the case in a function to make it slicker:
create or replace function isnull (p_date date) return varchar2
DETERMINISTIC
is
begin
return case when p_date is null then 'Y' end;
end;
/
create index mytable_fbi on mytable (isnull(datecol));
select ...
from mytable
where (datecol between :from and :to
or isnull(datecol) = 'Y');
I made sure the function returns NULL when the date is not null so that only the null dates are stored in the index. Also I had to declare the function as DETERMINISTIC. (I changed it to return 'Y' instead of 1 merely because to me the name "isnull" suggests it should; feel free to ignore my preference!)
To avoid the table lookup as well (the TABLE ACCESS BY INDEX ROWID step in the plans above), include the selected column in the index so the query can be answered from the index alone:
create index i1 on mytable (a_date, id);
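A quick sketch to verify, following the autotrace approach from the demonstration above: because id is declared NOT NULL, every row appears in this index, and since both selected columns are in it the plan should show an INDEX (RANGE SCAN) with no TABLE ACCESS step.
set autotrace on
select id
, a_date
from mytable
where a_date is null;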
Related
I'm trying to insert data into a partitioned table, but I don't know what I'm doing wrong. It raises this error: ORA-14400: inserted partition key does not map to any partition.
The dba_tab_partitions view shows the information below:
1 PDIA_98_20091023 0
2 PDIA_98_20091022 0
3 PDIA_98_20091021 0
4 PDIA_98_20091020 0
5 PDIA_98_20091019 0
Please help me.
select partition_name, column_name, high_value, partition_position
from ALL_TAB_PARTITIONS a, ALL_PART_KEY_COLUMNS b
where a.table_name = 'YOUR_TABLE' and a.table_name = b.name;
This query lists the column used as the partition key and the boundary values allowed (high_value). Make sure the values you insert fall below a partition's high_value. Otherwise, if a catch-all partition (DEFAULT for list, MAXVALUE for range) is defined, the row goes there.
EDIT:
I presume your table DDL is something like this:
CREATE TABLE HE0_DT_INF_INTERFAZ_MES
(
COD_PAIS NUMBER,
FEC_DATA NUMBER,
INTERFAZ VARCHAR2(100)
)
partition BY RANGE(COD_PAIS, FEC_DATA)
(
PARTITION PDIA_98_20091023 VALUES LESS THAN (98,20091024)
);
This creates a partition on multiple columns that holds values less than the composite boundary (98, 20091024);
that is, rows where COD_PAIS < 98, or where COD_PAIS = 98 and FEC_DATA < 20091024.
Combinations and results:
COD_PAIS  FEC_DATA  RESULT
98        20091024  FAIL
98        20091023  PASS
99        ********  FAIL
97        ********  PASS
< 98      ********  PASS
So the INSERT below fails with ORA-14400, because (98, 20091024) in the INSERT is equal to the boundary in the DDL but not less than it.
SQL> INSERT INTO HE0_DT_INF_INTERFAZ_MES(COD_PAIS, FEC_DATA, INTERFAZ)
  2  VALUES(98, 20091024, 'CTA');
INSERT INTO HE0_DT_INF_INTERFAZ_MES(COD_PAIS, FEC_DATA, INTERFAZ)
*
ERROR at line 1:
ORA-14400: inserted partition key does not map to any partition
But when I attempt (97, 20091024), it goes through:
SQL> INSERT INTO HE0_DT_INF_INTERFAZ_MES(COD_PAIS, FEC_DATA, INTERFAZ)
2 VALUES(97, 20091024, 'CTA');
1 row created.
The same error occurs when a range partition for a date column is missing: if the last partition's boundary is the end of 2020 and you insert a 2021 value, ORA-14400 is raised.
To fix it, add a 2021 partition to the table (TABLE_NAME and PARTITION_NAME are placeholders):
ALTER TABLE TABLE_NAME ADD PARTITION PARTITION_NAME VALUES LESS THAN (TO_DATE('2022-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')) NOCOMPRESS;
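As a sketch (again with TABLE_NAME as a placeholder), you can first list the existing boundaries to see where the new partition must go:
select partition_name, high_value, partition_position
from user_tab_partitions
where table_name = 'TABLE_NAME'
order by partition_position;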
First, some basics.
Java 6
OJDBC6
Oracle 10.2.0.4 (also the same result in 11g version)
I am seeing a SQL statement behave differently when executed from Java with the OJDBC6 client than from the tool SQL Gate, which probably uses a native/OCI driver. For some reason the optimizer chooses a hash join for the statement executed from Java but not for the other.
Here is the table:
CREATE TABLE DPOWNERA.XXX_CHIP (
xxxCH_ID NUMBER(22) NOT NULL,
xxxCHP_ID NUMBER(22) NOT NULL,
xxxSP_ID NUMBER(22) NULL,
xxxCU_ID NUMBER(22) NULL,
xxxFT_ID NUMBER(22) NULL,
UEMTE_ID NUMBER(38) NULL,
xxxCH_CHIPID VARCHAR2(30) NOT NULL
)
The index:
ALTER TABLE DPOWNERA.XXX_CHIP ADD
(
CONSTRAINT IX_AK1_XXX_CHIPV2
UNIQUE ( XXXCH_CHIPID )
USING INDEX
TABLESPACE DP_DATA01
PCTFREE 10
INITRANS 2
MAXTRANS 255
STORAGE (
INITIAL 128 K
NEXT 128 K
MINEXTENTS 1
MAXEXTENTS UNLIMITED
)
);
Here is the SQL I used:
SELECT *
FROM (SELECT m2.*,
rownum rnum
FROM (SELECT m_chip.xxxch_id,
m_chip.xxxch_chipid
FROM xxx_chip m_chip
ORDER BY m_chip.xxxch_chipid) m2
WHERE rownum < 101)
WHERE rnum >= 1;
And finally excerpts from the explain plan:
SQL Tool Query:
OPERATION OBJECT_NAME COST CARDINALITY CPU_COST
---------------- ------------------- ----- ----------- ----------
SELECT STATEMENT NULL 2 10 11740
VIEW NULL 2 10 11740
COUNT NULL NULL NULL NULL
VIEW NULL 2 10 11740
NESTED LOOPS NULL 2 10 11740
TABLE ACCESS XXX_CHIP 1 1000000 3319
INDEX IX_AK1_XXX_CHIPV2 1 10 2336
TABLE ACCESS XXX_CUSTOMER 1 1 842
INDEX IX_PK_XXX_CUSTOMER 1 1 105
Java Query, OJDBC Thin client:
OPERATION         OBJECT_NAME         COST  CARDINALITY CPU_COST
----------------  ------------------- ----- ----------- ----------
SELECT STATEMENT NULL 15100 100 1538329415
VIEW NULL 15100 100 1538329415
COUNT NULL NULL NULL NULL
VIEW NULL 15100 1000000 1538329415
SORT NULL 15100 1000000 1538329415
HASH JOIN NULL 1639 1000000 424719850
VIEW index$_join$_004 3 3 2268646
HASH JOIN NULL NULL NULL NULL
INDEX IX_AK1_XXX_CUSTOMER 1 3 965
INDEX IX_PK_XXX_CUSTOMER 1 3 965
TABLE ACCESS xxx_CHIP 1614 1000000 320184788
So I am at a loss as to why the optimizer chooses the hash join.
My guess is that the VARCHAR2 column is treated differently.
I found an answer, and it was simpler than I thought. It all has to do with the VARCHAR2 datatype of the index column. My database was set to language and territory "en", "US", but locally I have another language and region. Therefore the optimizer rightly discarded the index, since it wasn't built with the same linguistic settings the client session was using.
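A sketch of one way to see this: run the query below from both clients and compare. A non-BINARY NLS_SORT (or NLS_COMP) means a linguistic sort, which a normal binary b-tree index on a VARCHAR2 column cannot satisfy for an ORDER BY.
select parameter, value
from nls_session_parameters
where parameter in ('NLS_SORT', 'NLS_COMP', 'NLS_LANGUAGE', 'NLS_TERRITORY');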
To test it, I started Eclipse with some extra -D parameters in my eclipse.ini file:
-Duser.language=en
-Duser.country=US
-Duser.region=US
Then, in the Data Source Explorer in Eclipse, I created a connection, ran my statement, and it worked like a charm.
The lesson learned is to always make sure the client and database are language-compatible. We will probably switch the database to UTF-8 so it is the same for every installation; otherwise you have to configure it per installation, depending on country and language.
Hope this will help someone. If the answer was unclear please post a comment.
In my VB application I want an auto-generated alphanumeric ID, like prd100. How can I increment it using Oracle as the backend?
Any particular reason it needs to be alphanumeric? If it can just be a number, you can use an Oracle sequence.
But if you want just a random string, you could use the dbms_random function.
select dbms_random.string('U', 20) str from dual;
So you could probably combine these 2 ideas (in the code below, the sequence is called oid_seq):
SELECT dbms_random.string('U', 20) || '_' || to_char(oid_seq.nextval) FROM dual
There are two parts to your question. The first is how to create an alphanumeric key. The second is how to get the generated value.
So the first step is to determine the source of the alpha and the numeric components. In the following example I use the USER function and an Oracle sequence, but you will have your own rules. I put the code to assemble the key in a trigger which is called whenever a row is inserted.
SQL> create table t1 (pk_col varchar2(10) not null, create_date date)
2 /
Table created.
SQL> create or replace trigger t1_bir before insert on t1 for each row
2 declare
3 n pls_integer;
4 begin
5 select my_seq.nextval
6 into n
7 from dual;
8 :new.pk_col := user||trim(to_char(n));
9 end;
10 /
Trigger created.
SQL>
The second step requires the RETURNING INTO clause to retrieve the generated key. I am using SQL*Plus for this example. I confess to having no idea how to wire this syntax into VB. Sorry.
SQL> var new_pk varchar2(10)
SQL> insert into t1 (create_date)
2 values (sysdate)
3 returning pk_col into :new_pk
4 /
1 row created.
SQL> print new_pk
NEW_PK
--------------------------------
APC61
SQL>
Finally, a word of warning.
Alphanumeric keys are a suspicious construct. They reek of "smart keys", which are, in fact, dumb. A smart key is a value that contains multiple parts. At some point you will find yourself wanting to retrieve all rows where the key starts with 'PRD', which means using SUBSTR() or LIKE. Even worse, someday the definition of the smart key will change, and you will have to cascade a complicated update to your table and its referencing foreign keys. A better idea is to use a surrogate key (a number) and define the alphanumeric "key" as separate columns with a composite UNIQUE constraint to enforce the business rule.
SQL> create table t1 (id number not null
2 , alpha_bit varchar2(3) not null
3 , numeric_bit number not null
4 , create_date date
5 , constraint t1_pk primary key (id)
6 , constraint t1_uk unique (alpha_bit, numeric_bit)
7 )
8 /
Table created.
SQL>
I'm using Oracle on the database server, from an XP client, using VB6 and ADO. In one transaction I insert one record into a parent table, which has a trigger and a sequence to create a unique record ID; that record ID is then used for the relationship to a child table for a variable number of inserts into the child table. For performance, this is sent in one execute command from my client app. For instance (simplified example):
declare Recordid int;
begin
insert into ParentTable (_field list_) Values (_data list_);
Select ParentTableSequence.currVal into Recordid from dual;
insert into ChildTable (RecordID, _field list_) Values (Recordid, _data list_);
insert into ChildTable (RecordID, _field list_) Values (Recordid, _data list_);
... multiple, variable number of additional ChildTable inserts
commit;
end;
This is working fine. My question is: I also need to return to the client the Recordid that was created for the inserts. On SQL Server, I can add something like a select from SCOPE_IDENTITY() after the commit to return a recordset to the client with the unique ID.
But how can I do something similar with Oracle? It doesn't have to be a recordset; I just need that long integer value. I've tried a number of things based on results from searching the net, but have failed to find a solution.
These two lines can be compressed into a single statement:
-- insert into ParentTable (field list) Values (data list);
-- Select ParentTableSequence.currVal into Recordid from dual;
insert into ParentTable (field list) Values (data list)
returning ID into Recordid;
If you want to pass the ID back to the calling program you will need to define your program as a stored procedure or function, returning Recordid as an OUT parameter or a RETURN value respectively.
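For instance, a minimal sketch of such a procedure (the procedure name, some_field, and the parent table's ID column are placeholder assumptions, not your real names):
create or replace procedure insert_parent_and_child
  (p_data     in  varchar2  -- placeholder for the real field list
  ,p_recordid out number)
is
begin
  insert into ParentTable (some_field)
  values (p_data)
  returning id into p_recordid;

  insert into ChildTable (recordid, some_field)
  values (p_recordid, p_data);
end insert_parent_and_child;
/
From ADO the procedure can then be called with an output parameter (adParamOutput) to read the generated ID after execution.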
Edit
MarkL commented: "This is more of an Oracle PL/SQL question than anything else, I believe."
I confess that I know nothing about ADO, so I don't know whether the following example will work in your case. It involves building some infrastructure that allows us to pass an array of values into a procedure. The following example creates a new department, promotes an existing employee to manage it, and assigns two new hires.
SQL> create or replace type new_emp_t as object
2 (ename varchar2(10)
3 , sal number (7,2)
4 , job varchar2(10));
5 /
Type created.
SQL>
SQL> create or replace type new_emp_nt as table of new_emp_t;
2 /
Type created.
SQL>
SQL> create or replace procedure pop_new_dept
2 (p_dname in dept.dname%type
3 , p_loc in dept.loc%type
4 , p_mgr in emp.empno%type
5 , p_staff in new_emp_nt
6 , p_deptno out dept.deptno%type)
7 is
8 l_deptno dept.deptno%type;
9 begin
10 insert into dept
11 (dname, loc)
12 values
13 (p_dname, p_loc)
14 returning deptno into l_deptno;
15 update emp
16 set deptno = l_deptno
17 , job = 'MANAGER'
18 , mgr = 7839
19 where empno = p_mgr;
20 forall i in p_staff.first()..p_staff.last()
21 insert into emp
22 (ename
23 , sal
24 , job
25 , hiredate
26 , mgr
27 , deptno)
28 values
29 (p_staff(i).ename
30 , p_staff(i).sal
31 , p_staff(i).job
32 , sysdate
33 , p_mgr
34 , l_deptno);
35 p_deptno := l_deptno;
36 end pop_new_dept;
37 /
Procedure created.
SQL>
SQL> set serveroutput on
SQL>
SQL> declare
2 dept_staff new_emp_nt;
3 new_dept dept.deptno%type;
4 begin
5 dept_staff := new_emp_nt(new_emp_t('MARKL', 4200, 'DEVELOPER')
6 , new_emp_t('APC', 2300, 'DEVELOPER'));
7 pop_new_dept('IT', 'BRNO', 7844, dept_staff, new_dept);
8 dbms_output.put_line('New DEPTNO = '||new_dept);
9 end;
10 /
New DEPTNO = 70
PL/SQL procedure successfully completed.
SQL>
The primary keys for both DEPT and EMP are assigned through triggers. The FORALL syntax is a very efficient way of inserting records (it also works for UPDATE and DELETE). This could be written as a FUNCTION to return the new DEPTNO instead, but it is generally considered better practice to use a PROCEDURE when inserting, updating or deleting.
That would be my preferred approach but I admit it's not to everybody's taste.
Edit 2
With regard to performance, bulk operations using FORALL will definitely perform better than a series of individual inserts. In SQL, set operations are always preferable to record-by-record processing. However, if we are dealing with only a handful of records each time, it can be hard to notice the difference.
Building a PL/SQL collection (what you might think of as a temporary table in SQL Server) can be expensive in terms of memory. This is especially true if many users run the code, because it comes out of session-level memory, not the Shared Global Area. When dealing with a large number of records, it is better to populate the array in chunks, perhaps using the BULK COLLECT syntax with a LIMIT clause.
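A sketch of that chunked pattern, using the EMP demo table from above:
declare
  type emp_tt is table of emp%rowtype;
  l_emps emp_tt;
  cursor c_emp is select * from emp;
begin
  open c_emp;
  loop
    -- fetch at most 500 rows at a time, capping session memory use
    fetch c_emp bulk collect into l_emps limit 500;
    exit when l_emps.count = 0;
    for i in 1 .. l_emps.count loop
      null; -- placeholder for the per-row processing
    end loop;
  end loop;
  close c_emp;
end;
/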
The Oracle online documentation set is pretty good; the PL/SQL Developer's Guide has a whole chapter on Collections.
It seems vsize() and length() return the same results. Does anyone know of a practical example of when to use vsize instead of length?
select object_name, vsize(object_name), length(object_name) from user_objects;
Result:
OBJECT_NAME           VSIZE(OBJECT_NAME)  LENGTH(OBJECT_NAME)
/468ba408_LDAPHelper                  20                   20
/de807749_LDAPHelper                  20                   20
A4201_A4201_UK                        14                   14
A4201_PGM_FK_I                        14                   14
A4201_PHC_FK_I                        14                   14
Well, LENGTH() takes a character argument (CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB) whereas VSIZE() takes just about any data type, so if you pass LENGTH() a non-character data type there has to be an implicit conversion.
LENGTH is also sensitive to character sets and, through that implicit conversion, to session settings such as NLS_DATE_FORMAT:
drop table daa_test;
create table daa_test as select sysdate dt from dual;
alter session set nls_date_format = 'YYYY-MM-DD';
select vsize(dt) from daa_test;
select length(dt) from daa_test;
alter session set nls_date_format = 'YYYY-MM-DD HH24:mi:ss';
select vsize(dt) from daa_test;
select length(dt) from daa_test;
... giving ...
drop table daa_test succeeded.
create table succeeded.
alter session set succeeded.
VSIZE(DT)
----------------------
7
1 rows selected
LENGTH(DT)
----------------------
10
1 rows selected
alter session set succeeded.
VSIZE(DT)
----------------------
7
1 rows selected
LENGTH(DT)
----------------------
19
1 rows selected
VSIZE is really of use, IMHO, in understanding the internal storage requirements of data.
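For example, VSIZE exposes Oracle's variable-length NUMBER storage, which LENGTH (seeing only the character form) cannot:
-- 1 and 1e125 each pack into 2 bytes (an exponent byte plus one mantissa
-- byte), while 123456789 needs 6; LENGTH would just count displayed digits.
select vsize(1) v1, vsize(123456789) v2, vsize(1e125) v3 from dual;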
see: http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1897591221788