Vertica Primary Key Strange Behavior [duplicate]

I wrote this simple piece of code in Vertica 7.1.2:
select reenable_duplicate_key_error();
create table Person(id int PRIMARY KEY, firstname varchar(20));
insert into Person select 1, 'test1' union all select 1, 'test2' union all select 1, 'test3';
Now if I do a
select * from Person;
I see
 id | firstname
----+-----------
  1 | test1
  1 | test2
  1 | test3
(3 rows)
so it seems that marking the column as a primary key has no effect.

This is expected and documented behavior. Vertica does not enforce uniqueness on load (imagine ingesting 500 GB and then having to roll back because of a PK violation). You can run ANALYZE_CONSTRAINTS before committing, or upgrade to 7.2, where enforcement of primary keys can be enabled. It is still important to declare keys for referential integrity.
See my blog post on other ways to enforce uniqueness on load.
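For example, on 7.1 you could load the data and then check the constraint before committing; a minimal sketch against the Person table above (ANALYZE_CONSTRAINTS returns one row per violating value, so an empty result means the data is safe to commit):
insert into Person select 1, 'test1' union all select 1, 'test2';
select analyze_constraints('Person');
-- if the result is not empty, roll back instead of committing
rollback;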
Update: As of 7.2, Vertica can automatically enforce primary and unique key constraints. See the documentation for more information.

Related

Oracle - Preventing duplicates based on two columns [duplicate]

Say that I have a table in an Oracle 11g database that was defined like this:
CREATE TABLE LAKES.DEPARTMENTAL_READINGS
(
  ID NUMBER NOT NULL,
  DEPT_ID INTEGER NOT NULL,
  READING_DATE DATE NOT NULL,
  VALUE NUMBER(22,1)
);
And the data in the table looks like this:
ID (PK)   DEPT_ID   CREATION_DATE   VALUE
------------------------------------------
1         101       10/12/2016      3.0
2         102       10/12/2016      2.5
3         103       10/12/2016      3.3
4         101       10/13/2016      3.4
5         102       10/13/2016      2.7
6         103       10/13/2016      4.0
As you can see, I have one entry for each date for each department ID, and there should be no more than one. We have MERGE statements handling our data-import scripts, so most of this is prevented when data is pulled in. However, there's no telling who may continue to write scripts for this application, and we want to be as stringent as possible. Is there a way to set constraints to prevent duplicate data from being entered for each dept_id/creation_date combination?
You can create a composite primary key on those two columns together. Any insert that would duplicate the combination will then be rejected with an error.
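Since the table above already has ID as its primary key, a composite unique constraint over the two columns has the same effect. A minimal sketch (the constraint name is illustrative, and the column is READING_DATE as in the DDL above):
ALTER TABLE LAKES.DEPARTMENTAL_READINGS
  ADD CONSTRAINT DEPT_READING_DATE_UQ UNIQUE (DEPT_ID, READING_DATE);
Any insert or update that repeats an existing DEPT_ID/READING_DATE pair will then fail with ORA-00001.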

Get the last inserted row ID in Trafodion

I want to get the row ID or record ID of the last inserted record in a table in Trafodion.
Example:
1 | John
2 | Michael
When executing an INSERT statement, I want to get back the created ID, which in this example would be 3.
Could anyone tell me how to do that in Trafodion, or is it not possible?
Are you using a sequence generator to generate unique ids for this table? Something like this:
create table idcol (a largeint generated always as identity not null,
                    b int,
                    primary key(a desc));
Either way, with or without sequence generator, you could get the highest key with this statement:
select max(a) from idcol;
The problem is that this statement could be very inefficient. Trafodion has a built-in optimization to read the min of a key column, but it doesn't use the same optimization for the max value, because HBase didn't have a reverse scan until recently. We should make use of the reverse scan; please feel free to file a JIRA. To make this more efficient with the current code, I added DESC to the primary key declaration. With a descending key, getting the max key will be very fast:
explain select max(a) from idcol;
However, having the data grow from higher to lower values might cause issues in HBase; I'm not sure whether this is a problem or not.
Here is yet another solution: Use the Trafodion feature that allows you to select the inserted data, showing you the inserted values right away:
select * from (insert into idcol(b) values (11),(12),(13)) t(a,b);
A                    B
-------------------- -----------
                   1          11
                   2          12
                   3          13
--- 3 row(s) selected.

DB2 duplicate key error when inserting, BUT working after select count(*)

I have an issue that is unknown to me, and I don't know the logic/cause behind it. When I try to insert a record into a table I get a DB2 error saying:
[SQL0803] Duplicate key value specified: A unique index or unique constraint *N in *N
exists over one or more columns of table TABLEXXX in SCHEMAYYY. The operation cannot
be performed because one or more values would have produced a duplicate key in
the unique index or constraint.
That is quite a clear message to me. But judging by the records that are already in there, inserting my new record would not actually produce a duplicate key. When I do a SELECT COUNT(*) from SCHEMAYYY.TABLEXXX and then try to insert the record, it works flawlessly.
How can it be that after performing the SELECT COUNT(*) I can suddenly insert the record? Is there some sort of index associated with the table that might cause issues because it is out of sync? I didn't design the data model, so I don't have deep knowledge of the system yet.
The original DB2 SQL is:
-- Generate SQL
-- Version: V6R1M0 080215
-- Generated on: 19/12/12 10:28:39
-- Relational Database: S656C89D
-- Standards Option: DB2 for i
CREATE TABLE TZVDB.PRODUCTCOSTS (
  ID INTEGER GENERATED BY DEFAULT AS IDENTITY (
    START WITH 1 INCREMENT BY 1
    MINVALUE 1 MAXVALUE 2147483647
    NO CYCLE NO ORDER
    CACHE 20 ) ,
  PRODUCT_ID INTEGER DEFAULT NULL ,
  STARTPRICE DECIMAL(7, 2) DEFAULT NULL ,
  FROMDATE TIMESTAMP DEFAULT NULL ,
  TILLDATE TIMESTAMP DEFAULT NULL ,
  CONSTRAINT TZVDB.PRODUCTCOSTS_PK PRIMARY KEY( ID ) ) ;
ALTER TABLE TZVDB.PRODUCTCOSTS
ADD CONSTRAINT TZVDB.PRODCSTS_PRDCT_FK
FOREIGN KEY( PRODUCT_ID )
REFERENCES TZVDB.PRODUCT ( ID )
ON DELETE RESTRICT
ON UPDATE NO ACTION;
I'd like to see the statements... but since this question is a year old... I won't hold my breath.
I'm thinking the problem may be the
GENERATED BY DEFAULT
and that, instead of passing NULL for the identity column, you're accidentally passing zero or some other duplicate value the first time around.
Either always pass NULL, pass a non-duplicate value, or switch to GENERATED ALWAYS.
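For example, a minimal sketch against the PRODUCTCOSTS table above (the column values are made up): leave the identity column out of the insert list so DB2 generates it.
INSERT INTO TZVDB.PRODUCTCOSTS (PRODUCT_ID, STARTPRICE, FROMDATE, TILLDATE)
VALUES (42, 19.99, CURRENT TIMESTAMP, NULL);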
Look at preceding messages in the joblog for specifics as to what caused this. I don't understand how the INSERT can suddenly work after the COUNT(*). Please let us know what you find.
Since it shows *N (i.e. n/a) as the name of the index or constraint, this suggests to me that it is not a standard DB2 object, and therefore may be a "logical file" [LF] defined with DDS rather than SQL, with a key structure different from what you were doing your COUNT(*) on.
Your shop may have better tools to view keys on dependent files, but the method below will work anywhere.
If your table might not be the actual "physical file", check this using Display File Description, DSPFD TZVDB.PRODUCTCOSTS, in a 5250 ("green screen") session.
Use the Display Database Relations command, DSPDBR TZVDB.PRODUCTCOSTS, to find what files are defined over your table. You can then DSPFD on each of these files to see the definition of the index key. Also check there that each of these indexes is maintained *IMMED, rather than *REBUILD or *DELAY. (A wild longshot guess as to a remotely possible cause of your strange anomaly.)
You will find the DB2 for i message finder in the IBM i 7.1 Information Center, or in the Information Center for other releases.
Is it a paging issue? We seem to get -0803 on inserts occasionally when a row is being held for update and it locks a page that probably contains the index that is needed for the insert. This is only a guess, but it appears to me that is what is happening.
I know it is an old topic, but this is what Google showed me in the first place.
I had the same issue yesterday, and it caused me a lot of headache. I did the same as above: checked the table definitions, keys, existing items...
Then I found out the problem was with my INSERT statement. It was trying to insert two identical records at once, but as the constraint prevented the commit, I could not find anything in the database.
Advice: review your INSERT statement carefully! :)
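A hypothetical illustration of that mistake against the PRODUCTCOSTS table above (the values are made up): both rows carry the same explicit ID, so the statement violates the primary key all by itself, and because nothing gets committed the table still looks as if the key were free.
INSERT INTO TZVDB.PRODUCTCOSTS (ID, PRODUCT_ID, STARTPRICE, FROMDATE, TILLDATE)
VALUES (7, 42, 19.99, CURRENT TIMESTAMP, NULL),
       (7, 43, 24.50, CURRENT TIMESTAMP, NULL);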

Declarative integrity constraint between rows without pivot

I have a situation like the following join table:
A_ID   B_ID
1      27
1      314
1      5
I need to put a constraint on the table that will prevent a duplicate group from being entered. In other words:
A_ID   B_ID
2      27
2      314
2      5
should fail, but
A_ID   B_ID
3      27
3      314
should succeed, because it's a distinct group.
The 2 ways I've thought of are:
Pivot the table in a materialized view based upon the order and put a unique key on the pivot fields. I don't like this because in Oracle I have to limit the number of rows in a group, due both to the pivoting rules and to the 32-column index limitation (I thought of a way around this second problem, but still).
Create some unique hash value on the combination of the B_IDs and make that unique. Maybe I'm not enough of a mathematician, but I can't think of a way to do this that doesn't limit the number of values that I can use for B_ID.
I feel like there's something obvious I'm missing here, like I could just add some sort of an ordering column and set a different unique key, but I've done quite a bit of reading and haven't come up with anything. It might also be that the data model I inherited is flawed, but I can't think of anything that would give me similar flexibility.
Firstly, a regular constraint can't work.
If the set with A_ID of 1 exists, and then session 1 inserts a record with A_ID 2 and B_ID of 27, session 2 inserts (2,314) and session 3 inserts (2,5), then none of those would see a conflict to cause a constraint violation. Triggers won't work either. Equally, if a set existed of (6,99), then it would be difficult for another session to create a new set of (6,99,300).
The MV with 'refresh on commit' could work, preventing the last session from successfully committing. I'd look more at the hashing option, summing up the hashed B_IDs for each A_ID:
select table_name, sum(ora_hash(column_id)), count(*)
from user_tab_columns
group by table_name
While hash collisions are possible, they are very unlikely.
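Applied to your join table, a rough sketch (assuming it is called GROUP_MEMBERS with columns A_ID and B_ID; the name is made up): compute one hash per group, so two identical groups produce the same hash/count pair, which a uniqueness check over those two values can catch.
select a_id, sum(ora_hash(b_id)) as group_hash, count(*) as member_count
from group_members
group by a_id;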
If you are on 11g, check out LISTAGG too.
select table_name, listagg(column_id||':') within group (order by column_id)
from user_tab_columns
group by table_name

Why is Oracle spewing bad table metadata?

I'm using DBVisualizer to extract DDL from an Oracle 10.2 DB. I'm getting odd instances of repeated columns in constraints, or repeated constraints in the generated DDL. At first I chalked it up to a bug in DBVisualizer, but I tried using Apache DDLUtils against the DB and it started throwing errors which investigation revealed to be caused by the same problem. The table metadata being returned by Oracle appears to have multiple entries for some FK constraints.
I can find no reference to this sort of thing from my google searches and I was wondering if anyone else had seen the same thing. Is this a bug in the Oracle driver, or does the metadata contain extra information which is being dropped when my tools access it, resulting in confusion on the part of the tools...
Here is an example of the (truncated) DDL output from DBVisualizer:
CREATE TABLE ARTIST
(
  ID INTEGER NOT NULL,
  FIRST_NAME VARCHAR2( 128 ),
  LAST_NAME VARCHAR2( 128 ),
  CONSTRAINT ARTIST_ID_PK PRIMARY KEY( ID ),
  CONSTRAINT ARTIST_CONTENT_ID_FK FOREIGN KEY( ID, ID, ID ) REFERENCES CMS_CONTENT( CONTENT_ID, CONTENT_ID, CONTENT_ID )
  -- note the multiple instances of ID and CONTENT_ID in the above line
  -- rest assured there is nothing bizarre about the foreign table CMS_CONTENT
)
I'm attempting to find a Java example which can show the behaviour, and will update the question when I have a concrete example.
You can try the built-in Oracle DBMS_METADATA.GET_DDL('TABLE','ARTIST') and see if that resolves the issue (i.e. whether it is a bug in the tools or in the DB).
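For example (GET_DDL defaults to the current schema; an optional third argument names the owning schema explicitly):
SELECT DBMS_METADATA.GET_DDL('TABLE', 'ARTIST') FROM DUAL;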
You can look at the data dictionary views too. In this case, ALL_CONSTRAINTS and ALL_CONS_COLUMNS.
select ac.owner, ac.constraint_name, ac.table_name, ac.r_owner, ac.r_constraint_name,
acc.column_name, acc.position
from all_constraints ac join all_cons_columns acc on
(ac.owner = acc.owner and ac.constraint_name = acc.constraint_name)
where ac.table_name = 'ARTIST'
and ac.constraint_type = 'R'
I'd suspect that it is a bug in the tools, and they've missed a join on the owning schema and you are picking up the same table/constraint but in another user's schema.
As far as I can see, dbvis (6.5.7) uses its own code when you use the 'DDL' tab, and it uses dbms_metadata when using the 'DDL with Storage' tab.
Does this make a difference for you ?
Ronald
