Log PostgreSQL sequence value that is missing in main table - Spring

Gaps in the id column occur because we use sequences in PostgreSQL as the default value of the table's primary key. When a transaction rolls back, the sequence value does not roll back, so the id column is not sequential. I have tried to eliminate the gaps, but it seems to impact performance. What I am thinking right now is to log those missing IDs along with the error message and request payload. However, I am struggling to get hold of the ID that the sequence generated when the DB transaction rolls back.
Please suggest an approach to get the missing sequence ID from the DB when the transaction is rolled back, so we can log it.
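For illustration, a minimal sketch of why the gap appears, assuming a hypothetical table orders whose id defaults to a sequence:
CREATE TABLE orders (
    id      bigserial PRIMARY KEY,   -- default value comes from an implicit sequence
    payload text
);
BEGIN;
INSERT INTO orders (payload) VALUES ('will be rolled back');  -- nextval() on the sequence is consumed here
ROLLBACK;                                                     -- the row is undone ...
INSERT INTO orders (payload) VALUES ('committed');            -- ... but the sequence does not rewind,
                                                              -- so this row gets id = 2 and id = 1 stays a gap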

Related

replicat failing on insert with no data found

Failing to understand in which situation an insert will fail with a 'no data found' error. Any insights, please.
Oracle GoldenGate Delivery for Oracle process started, group REPA discard file opened: 2020-08-21 18:32:07.326069
Current time: 2020-08-21 18:32:08
Discarded record from action ABEND on error 1403
No data found
Aborting transaction on /zfssa/gg_02/ogg/dirdat/REPA/EX beginning at seqno 473 rba 425209949
error at seqno 473 rba 425214669
Problem replicating SRC.TABLE to TGT.TABLE.
Record not found
Mapping problem with insert record (target format) SCN:3329198919.29.23.78560...
A NO DATA FOUND error usually points to an inconsistency problem. The REPLICAT is basically an application doing data manipulation using SQL statements. If it attempts to perform a DML operation and the database rejects it, it is normally because of an inconsistency between the attempted DML and the records related to it.
For example, attempting to delete a row which does not exist will fail with a database error. Aside from an Oracle GoldenGate bug, this usually points to target database inconsistencies. In other words, the target database is in a state that the customer did not expect it to be in.
Determine why your target database is not in the state that you expect it to be in. Numerous reasons can cause this and below are some of them. This list is by no means exhaustive.
Possible Causes
The use of parameters to ignore previous errors such as HANDLECOLLISIONS, REPERROR with IGNORE or DISCARD options.
Primary keys or key columns that are different between source and target database. They might be the same columns but the type and/or size are different.
The target database is manipulated by an application program.
The target replicat is MAPped by filters and selection, or the extract DML operations have filters. This will be based on your business needs and may or may not be intentional.
Non-primary key table updates. If all the columns are used for replication there are cases whereby more than one update can occur making subsequent DML operations fail.
Non-primary key table updates where KEYCOLS are used. These keys may not be unique. To test the uniqueness of the selected keys, run a query on the source database based on these KEYCOLS and sort them (see the sketch after this list).
The language and character set in use (NLS, double-byte or multibyte charsets) are different and may cause unexpected conversions done automatically by the database. Use the SETENV parameter to change the language, and set it before the USERID parameter.
Your source database and target database are of a different type, for example Oracle to MSSQL, and the conversion done on the primary keys or key columns is not what you expected.
There are other specific configurations, patches, features, default database behaviors and so on. Search the My Oracle Support (MOS) Knowledge base for the database error number, example: "ORA-01403" under the Oracle GoldenGate core product. Review these knowledge solutions to see if they are related to your issue.
In rare situations, this may be an Oracle GoldenGate bug in that a particular DML was not captured or the values were incorrectly interpreted. Please submit an SR if you think this is the case. You will need to provide in addition to the target replicat reports and materials stated above, all extract reports on the source machine and GoldenGate trails.
Duplicate mappings in the replicat parameter file combined with the ALLOWDUPTARGETMAP parameter.
Incorrect use of the Extract parameter THREADOPTIONS PROCESSTHREADS. This can cause it to miss data.
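One way to do the uniqueness check described in the KEYCOLS item above (a hedged sketch; the table and column names are hypothetical placeholders for your KEYCOLS):
-- replace keycol1/keycol2 with the columns listed in KEYCOLS
SELECT keycol1, keycol2, COUNT(*)
FROM   src_schema.src_table
GROUP  BY keycol1, keycol2
HAVING COUNT(*) > 1
ORDER  BY keycol1, keycol2;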
Possible Solutions
What do you need to investigate a replicat database issue?
For a start you will need the replicat parameter file, report file and discard file.
Report file: Contains all warnings, errors, tables that are already mapped, columns mapped or unmapped and all run time environment settings.
Discard file: Displays in detail the issue with mapping the table that generated the database error: the columns, their values, and the position of the record in the GoldenGate trail.
Parameter file: Usually the parameters are within the report file but this will be useful if the report file has been rolled over (REPORTROLLOVER parameter).
What are the next steps?
Query your target database based on the above data. Depending on the database error, the report file and/or discard file may contain the exact SQL statement used. Nevertheless, you should be able to construct the appropriate query. This is to confirm that the replicat has indeed reported the correct database error.
For example, if the Oracle DB error was ORA-01403 (no data found), your query should select the row by the primary key or key columns as specified. Your query should return the same result the replicat saw.
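For instance, a hedged sketch of such a confirmation query, with hypothetical table and key column names taken from the discard file:
-- the key values come from the discard file entry that caused the abend
SELECT *
FROM   tgt_schema.tgt_table
WHERE  pk_col1 = :value_from_discard
AND    pk_col2 = :value_from_discard;
-- zero rows returned confirms the ORA-01403 (no data found) the replicat reported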
Fixing the replicat.
The first thing to consider is whether you can ignore this error for now and resolve the situation later on. If your business allows you to do so, then you may either exclude this table altogether (TABLEEXCLUDE) or simply skip this error (REPERROR with DISCARD). If you skip the error, start the replicat with REPERROR, run it for a short while, stop the replicat, remove the REPERROR, and then restart the replicat.
Fixing the database.
No matter what the reason is, you will need to resync the table that caused this issue.
Configuration Issue
If you have duplicate mappings in the replicat parameter file and the ALLOWDUPTARGETMAP parameter is used, DML will be applied twice. This leads to an ORA-01403 error on delete operations and an ORA-00001 error on insert operations. Remove the duplicate mapping to fix the issue.
In summary, there are a lot of possibilities for a 'no data found' error in a replicat process in GoldenGate.
This could also depend on the type of replicat you are using. Many of them perform queries inside the database prior to actually performing a DML statement. The error you have can often result from a lack of permissions in the target database, or from using something like Database Vault or another technology that manipulates how DML is performed.

Oracle deadlock output when caused by foreign keys

We have a multi-threaded batch job ending up in deadlock. I am getting conflicting answers from our DBAs as to what actually causes the deadlock.
Caused by: java.sql.SQLException: ORA-00060: deadlock detected while waiting for resource
The error output references the SQL for inserting into Table A. Every row going into Table A should be unique. Table A has foreign keys to two other tables, both of which are indexed and have primary keys made up of two columns. Many rows in Table A can point to the same FK in the parent tables. Our code handles FK errors by inserting into the parent tables and then retrying the insert into Table A.
The sql in the trace log refers to the Table A insert sql (does not show param binding values). Does this mean definitively that there are two identical sql statements trying to be inserted into Table A in which case our prior logic is not thread-safe somewhere? Or could it really be that there are two inserts both referencing an unsatisfied FK? And the deadlock occurs from our error handling in trying to insert into the parent table. If so, would the sql in the trace not then reference the parent table sql?
Or perversely, does the original insert attempt put a lock on the row and then after handling the error, does the second attempt of the insert cause the deadlock? Any further debugging assistance?
There's not much info to work with, but my guess would be that two threads are trying to insert the same rows into one of the 'two other tables' at the same time. An idea on debugging is below.
Use a trigger on Table A and on the other two tables (3 triggers in total) that writes the inserted data to logging tables in an autonomous transaction that commits. This way you can see the uncommitted inserts that were rolled back due to the deadlock (the rows that exist in the logging tables but not in the actual tables are the ones that were rolled back).
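A minimal sketch of one such logging trigger, using hypothetical table and column names (TABLE_A with columns ID, FK1, FK2):
CREATE TABLE table_a_log (
  logged_at TIMESTAMP DEFAULT SYSTIMESTAMP,
  id_value  NUMBER,
  fk1_value NUMBER,
  fk2_value NUMBER
);
CREATE OR REPLACE TRIGGER trg_table_a_log
AFTER INSERT ON table_a
FOR EACH ROW
DECLARE
  PRAGMA AUTONOMOUS_TRANSACTION;   -- commits independently of the inserting transaction
BEGIN
  INSERT INTO table_a_log (id_value, fk1_value, fk2_value)
  VALUES (:NEW.id, :NEW.fk1, :NEW.fk2);
  COMMIT;                          -- the log row survives even if the insert later rolls back
END;
/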
HTH, KR

Codeigniter ORA-00001: unique constraint

This error: Message: oci_execute (): ORA-00001: unique constraint (SCHEMA_NAME.NAME CONSTRAINT) violated
I wonder if there is a way to handle this error more simply and generically.
Otherwise I will have to go through each function of the models to check the data before inserting, to avoid duplicates and not raise the error noted above.
Does anyone know a simple way to solve this?
Thanks.
There is a hint that can be specified that will allow the statement to succeed without inserting the duplicated data. It can be useful for replication or bulk data loading where the job may attempt to insert the same data multiple times. I wouldn't recommend it as part of a user application.
"The IGNORE_ROW_ON_DUPKEY_INDEX hint applies only to single-table INSERT operations. It is not supported for UPDATE, DELETE, MERGE, or multitable insert operations. IGNORE_ROW_ON_DUPKEY_INDEX causes the statement to ignore a unique key violation for a specified set of columns or for a specified index. When a unique key violation is encountered, a row-level rollback occurs and execution resumes with the next input row."

Operations on certain tables won't finish

We have a table TRANSMISSIONS(ID, NAME) which behaves funny in the following ways:
The statement to add a foreign key in another table referencing TRANSMISSIONS.ID won't finish
The statement to add a column to TRANSMISSIONS won't finish
The statement to disable/drop a unique constraint won't finish
The statement to disable/drop a trigger won't finish
TRANSMISSIONS' primary key is ID, and there is also a unique constraint on NAME - therefore there are indexes on ID and NAME. We also have a trigger which generates values for column ID from a sequence, so that INSERT statements do not need to provide a value for ID.
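For reference, such an ID trigger typically looks something like this (a hedged sketch; the trigger and sequence names here are hypothetical):
CREATE OR REPLACE TRIGGER trg_transmissions_id
BEFORE INSERT ON transmissions
FOR EACH ROW
WHEN (NEW.id IS NULL)
BEGIN
  SELECT transmissions_seq.NEXTVAL INTO :NEW.id FROM dual;   -- fill ID only when not supplied
END;
/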
Besides TRANSMISSIONS, there are two more tables behaving like this. For other tables, the above-mentioned statements work fine.
The database is used in an application with Hibernate and due to an incorrect JPA configuration we produced high values for ID during a time. Note that we use the trigger only for "manual" INSERT statements and that Hibernate produces ID values itself, also using the sequence.
The first thought was that the problems were due to the high IDs but we have the problems also with tables that never had such high IDs.
Anyway, we suspected that the indexes might be fragmented somehow and ran ALTER INDEX TRANSMISSIONS_PK SHRINK SPACE COMPACT, which ran through but had no effect.
We also wanted to run ALTER TABLE TRANSMISSIONS SHRINK SPACE COMPACT, which didn't work because we first needed to run ALTER TABLE TRANSMISSIONS ENABLE ROW MOVEMENT, which never finished.
We have another instance of the database which does not behave in such a funny way. So we think it might be that, in the course of running the application, the database somehow got into an inconsistent state.
Does someone have any suggestions as to what might have gone out of control/into an inconsistent state?
More hints:
There are no locks present on any objects in the database (according to the information in v$lock and v$locked_object; see the sketch after these hints)
We tried all these statements in SQL Developer and also using SQL*Plus (command-line tool).
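A hedged sketch of the kind of lock check referred to in the hints above:
-- sessions currently holding DML locks on the affected table
SELECT lo.session_id, lo.oracle_username, o.object_name, lo.locked_mode
FROM   v$locked_object lo
JOIN   dba_objects o ON o.object_id = lo.object_id
WHERE  o.object_name IN ('TRANSMISSIONS');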

Identical Oracle db setups: exception on just one of them

edit: Look to the end of this question for what caused the error and how I found out.
I have a very strange exception thrown at me from Hibernate when I run an app that does batch inserts of data into an Oracle database. The error comes from the Oracle database, ORA-00001, which "means that an attempt has been made to insert a record with a duplicate (unique) key. This error will also be generated if an existing record is updated to generate a duplicate (unique) key."
The error is weird because I have created the same table (exactly the same definition) on another machine where I do NOT get the same error when I use it through my app. AND all the data gets inserted into the database, so nothing is really rejected.
There has to be something different between the two setups, but the only thing I can see that is different is the banner output that I get when issuing
select * from v$version where banner like 'Oracle%';
The database that gives me trouble:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Prod
The one that works:
Oracle Database 10g Release 10.2.0.3.0 - 64bit Production
Table definitions, input, and the app I wrote is the same for both. The table involved is basically a four column table with a composite id (serviceid, date, value1, value2) - nothing fancy.
Any ideas on what can be wrong? I have started out clean several times, dropping both tables to start on equal grounds, but I still get the error from the database.
Some more of the output:
Caused by: java.sql.BatchUpdateException: ORA-00001: unique constraint (STATISTICS.PRIMARY_KEY_CONSTRAINT) violated
at oracle.jdbc.driver.DatabaseError.throwBatchUpdateException(DatabaseError.java:367)
at oracle.jdbc.driver.OraclePreparedStatement.executeBatch(OraclePreparedStatement.java:8728)
at org.hibernate.jdbc.BatchingBatcher.doExecuteBatch(BatchingBatcher.java:70)
How I found out what caused the problem
Thanks to APC and ik_zelf I was able to pinpoint the root cause of this error. It turns out the Quartz scheduler was wrongly configured for the production database (where the error turned up).
For the job running against the non-failing Oracle server I had <cronTriggerExpression>0/5 * * * * ?</cronTriggerExpression>, which ran the batch job every five seconds. I figured that once a minute was sufficient for the other Oracle server, and set the Quartz scheduler up with * */1 * * * ?. This turned out to be wrong: the leading * is in the seconds field, so instead of running every minute, this ran every second!
Each job took approximately 1.5-2 seconds, so two or more jobs were running concurrently, causing simultaneous inserts on the server. So instead of inserting 529 elements, I was getting anywhere from 1000 to 2000 inserts. Changing the cron trigger expression to the same as the other one, running every five seconds, fixed the problem.
To find out what was wrong I had to enable SQL logging in hibernate.cfg.xml and disable the primary key constraint on the table.
-- To catch exceptions
-- to find the offending rows run the following query
-- SELECT * FROM MY_TABLE, EXCEPTIONS WHERE MY_TABLE.rowid = EXCEPTIONS.row_id;
create table exceptions(row_id rowid,
owner varchar2(30),
table_name varchar2(30),
constraint varchar2(30));
-- This table was set up
CREATE TABLE MY_TABLE
(
LOGDATE DATE NOT NULL,
SERVICEID VARCHAR2(255 CHAR) NOT NULL,
PROP_A NUMBER(10,0),
PROP_B NUMBER(10,0),
CONSTRAINT PK_CONSTRAINT PRIMARY KEY (LOGDATE, SERVICEID)
);
-- Removed the constraint to see what was inserted twice or more
alter table my_table
disable constraint PK_CONSTRAINT;
-- Enable this later on to find rows that offend the constraints
alter table my_table
enable constraint PK_CONSTRAINT
exceptions into exceptions;
You have a unique compound constraint. ORA-00001 means that you have two or more rows which have duplicate values in ServiceID, Date, Value1 and/or Value2. You say the input is the same for both databases. So either:
you are imagining that your program is hurling ORA-00001
you are mistaken that the input is the same in both runs.
The more likely explanation is the second one: one or more of your key columns is populated by an external source or default value (e.g. a code table for ServiceId or SYSDATE for the date column). In your failing database this automatic population is failing to provide a unique value. There can be any number of reasons why this might be so, depending on what mechanism(s) you're using. Remember that in a unique compound key NULL entries count. That is, you can have any number of records (NULL, NULL, NULL, NULL) but only one (42, NULL, NULL, NULL).
It is hard for us to guess what the actual problem might be, and almost as hard for you (although you do have the advantage of being the code's author, which ought to grant you some insight). What you need are some trace statements. My preferred solution would be to use Bulk DML Exception Handling, but then I am a PL/SQL fan. Hibernate allows you to hook some logging into your programs: I suggest you switch it on. Code is a heck of a lot easier to debug when it has decent instrumentation.
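As a rough illustration of the bulk DML exception handling mentioned above, a hedged PL/SQL sketch against the MY_TABLE definition shown earlier:
DECLARE
  TYPE t_rows IS TABLE OF my_table%ROWTYPE;
  l_rows     t_rows := t_rows();
  dml_errors EXCEPTION;
  PRAGMA EXCEPTION_INIT(dml_errors, -24381);   -- "error(s) in array DML"
BEGIN
  -- ... fill l_rows with the batch to be inserted ...
  FORALL i IN 1 .. l_rows.COUNT SAVE EXCEPTIONS
    INSERT INTO my_table VALUES l_rows(i);
EXCEPTION
  WHEN dml_errors THEN
    FOR i IN 1 .. SQL%BULK_EXCEPTIONS.COUNT LOOP
      DBMS_OUTPUT.PUT_LINE('Row ' || SQL%BULK_EXCEPTIONS(i).ERROR_INDEX ||
                           ' failed with ORA-' || SQL%BULK_EXCEPTIONS(i).ERROR_CODE);
    END LOOP;
END;
/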
As a last resort, disable the constraint before running the batch insert. Afterwards re-enable it like this:
alter table t42
enable constraint t42_uk
exceptions into my_exceptions
/
This will fail if you have duplicate rows but crucially the MY_EXCEPTIONS table will list all the rows which clash. That at least will give you some clue as to the source of the duplication. If you don't already have an exceptions table you will have to run a script: $ORACLE_HOME/rdbms/admin/utlexcptn.sql ( you may need a DBA to gain access to this directory).
tl;dr
insight requires information: instrument your code.
The one that has problems is an EE and the other looks like an SE database. I expect that the first is on quicker hardware. If that is the case, and your date column is filled using SYSDATE, it could very well be that the time resolution is not enough and you get duplicate date values. If the other columns of your data are also not unique, you get ORA-00001.
It's a long shot but at first sight I would look into this direction.
Can you use an exception table to identify the data? See Reporting Constraint Exceptions
My guess would be the service id. Whatever service_id hibernate is using for the 'fresh' insert has already been used.
Possibly the table is empty in one database but populated in another
I'm betting though that the service_id is sequence generated and the sequence number is out of sync with the data content. So you have the same 1000 rows in the table but doing
SELECT service_id_seq.nextval FROM DUAL
in one database gives a lower number than in the other. I see this a lot where the sequence has been created (e.g. out of source control) and data has been imported into the table from another database.
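A hedged sketch of how to check for this, using the hypothetical service_id / service_id_seq names from this answer:
-- compare what is already in the table with what the sequence will hand out next
SELECT MAX(service_id) AS max_in_table FROM the_table;
SELECT service_id_seq.NEXTVAL AS next_from_seq FROM dual;   -- note: this consumes one sequence value
-- if next_from_seq is lower than (or equal to) max_in_table, the sequence will
-- generate ids that already exist and the insert fails with ORA-00001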
