Why does Spring Data JDBC run a SELECT statement when inserting?

I have a deadlock that baffles me. It happens in my Spring application when inserting into a table without an active transaction. I use the save method on SimpleJdbcRepository. This question is not about solving the deadlock but about why it happens.
My best guess so far is that, for some reason, save starts a transaction, runs at least one SELECT statement, somehow acquires locks, and runs into a deadlock. But that would actually require at least two SELECT statements, and it shouldn't run any when inserting. Also, I cannot reproduce it locally; the JDBC logs only show the INSERT statement.
Does anyone have a clue why Spring Data JDBC runs these extra SELECT statements?
Bonus question: how do I get rid of them?
Here is the stack trace from the deadlock exception. As far as I can tell, the save call triggers the SELECT that triggers the deadlock:
org.postgresql.util.PSQLException: ERROR: deadlock detected
Detail: Process 315008 waits for ShareLock on transaction 11699360; blocked by process 315268.
Process 315268 waits for ShareLock on transaction 11699361; blocked by process 315008.
Hint: See server log for query details.
Where: while locking tuple (5,18) in relation "journey"
SQL statement "SELECT 1 FROM ONLY "my_schema"."user" x WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x"
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2676)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2366)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:356)
at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:496)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:413)
at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:190)
at org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:152)
at com.zaxxer.hikari.pool.ProxyPreparedStatement.executeUpdate(ProxyPreparedStatement.java:61)
at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeUpdate(HikariProxyPreparedStatement.java)
at org.springframework.jdbc.core.JdbcTemplate.lambda$update$3(JdbcTemplate.java:992)
at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:651)
at org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:991)
at org.springframework.jdbc.core.namedparam.NamedParameterJdbcTemplate.update(NamedParameterJdbcTemplate.java:356)
at org.springframework.jdbc.core.namedparam.NamedParameterJdbcTemplate.update(NamedParameterJdbcTemplate.java:340)
at org.springframework.data.jdbc.core.convert.IdGeneratingInsertStrategy.execute(IdGeneratingInsertStrategy.java:67)
at org.springframework.data.jdbc.core.convert.DefaultDataAccessStrategy.insert(DefaultDataAccessStrategy.java:117)
at org.springframework.data.jdbc.core.JdbcAggregateChangeExecutionContext.executeInsertRoot(JdbcAggregateChangeExecutionContext.java:77)
at org.springframework.data.jdbc.core.AggregateChangeExecutor.execute(AggregateChangeExecutor.java:66)
at org.springframework.data.jdbc.core.AggregateChangeExecutor.lambda$execute$0(AggregateChangeExecutor.java:52)
at java.base/java.util.ArrayList.forEach(Unknown Source)
at org.springframework.data.relational.core.conversion.DefaultAggregateChange.forEachAction(DefaultAggregateChange.java:127)
at org.springframework.data.jdbc.core.AggregateChangeExecutor.execute(AggregateChangeExecutor.java:52)
at org.springframework.data.jdbc.core.JdbcAggregateTemplate.store(JdbcAggregateTemplate.java:360)
at org.springframework.data.jdbc.core.JdbcAggregateTemplate.save(JdbcAggregateTemplate.java:161)
at org.springframework.data.jdbc.repository.support.SimpleJdbcRepository.save(SimpleJdbcRepository.java:78)

Not a complete answer, but too much for a comment:
If you have a TransactionManager, the save method will run in a transaction.
Spring Data JDBC isn't issuing a SELECT statement. You can tell from the frame at org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:991) that it actually executes an update/insert.
So the query you are seeing is probably created by Postgres itself.
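The statement quoted in the error (SELECT 1 FROM ONLY ... FOR KEY SHARE OF x) has the shape of the check PostgreSQL runs internally to enforce a foreign key when a row is inserted into the referencing table. The following is a minimal sketch of how two such inserts could deadlock. The tables parent and child and the two-session interleaving are assumptions made up for illustration, not the asker's actual schema or workload.

```sql
-- Hypothetical schema: child references parent through a foreign key.
CREATE TABLE parent (id integer PRIMARY KEY);
CREATE TABLE child  (id serial PRIMARY KEY,
                     parent_id integer NOT NULL REFERENCES parent (id));
INSERT INTO parent (id) VALUES (1), (2);

-- Session A: take a FOR UPDATE row lock on parent 1
BEGIN;
SELECT * FROM parent WHERE id = 1 FOR UPDATE;

-- Session B: take a FOR UPDATE row lock on parent 2
BEGIN;
SELECT * FROM parent WHERE id = 2 FOR UPDATE;

-- Session A: the foreign key check needs FOR KEY SHARE on parent 2,
-- which conflicts with B's FOR UPDATE lock, so this INSERT waits for B
INSERT INTO child (parent_id) VALUES (2);

-- Session B: the foreign key check needs FOR KEY SHARE on parent 1,
-- so this INSERT waits for A; PostgreSQL detects the cycle and
-- aborts one of the two statements with "deadlock detected"
INSERT INTO child (parent_id) VALUES (1);
```

If something like this is what happens, the client only ever sends the INSERT, which would be consistent with JDBC logs that show nothing else: the server runs the SELECT itself as part of the constraint check.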

Related

Oracle locks: how do they differ?

What is the difference between the two errors below? As far as I understand, both happen when a lock is involved. In which scenarios does each one occur?
ORA-04021: timeout occurred while waiting to lock object
and
ORA-00054: resource busy and acquire with NOWAIT specified
Example for ORA-04021 might be this: there's a package in your schema. It contains a procedure which does some job that takes 15 minutes to finish. Someone runs that procedure. Meanwhile, you'd want to fix something in that package, so you edit its code and want to compile it. Well, you can't - it is being used so you'll have to wait until it is released. Oracle tells you that timeout occurred while you're waiting to lock the package and compile it.
Example for ORA-00054: there's a table. You update some values in it but don't commit (or roll back), as you have to do something else as well. In another session, another user wants to alter one of the table's columns (for example, enlarge its size). ALTER will then raise ORA-00054, which says that the table is busy (you're updating it in another session, right?), so you'll have to wait until the transaction commits (or rolls back).
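The ORA-00054 scenario can be sketched as two sessions. The table demo_emp is a hypothetical name used only for illustration, and the sketch assumes the session's ddl_lock_timeout is at its default of 0, so the DDL does not wait for the lock.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE demo_emp (id NUMBER PRIMARY KEY, name VARCHAR2(20));
INSERT INTO demo_emp VALUES (1, 'Scott');
COMMIT;

-- Session 1: change a row and leave the transaction open
UPDATE demo_emp SET name = 'Tiger' WHERE id = 1;
-- (no COMMIT or ROLLBACK yet)

-- Session 2: the DDL needs an exclusive lock on the table and is
-- expected to fail with ORA-00054 while session 1's transaction is open
ALTER TABLE demo_emp MODIFY (name VARCHAR2(50));

-- Session 1: end the transaction; the ALTER in session 2 can then be retried
COMMIT;
```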

Select * from table@dblink in PL/SQL Developer

When I run the query select * from table@dblink in PL/SQL Developer, the transaction commit/rollback icons are activated, but if I then use Fetch last page, these icons are disabled. Why is this happening?
Querying over a db_link flips the 'we have a transaction' switch in the data dictionary.
In most tools, you'll get a prompt for COMMIT or an indicator of an open transaction whenever you query against a DB_LINK.
That's because you're doing 'something' that's not clear to us in a different database. Your 'SELECT' could have side effects which require a COMMIT/ROLLBACK, or as Tom would say
'If you are distributed, you would want to commit to finish off anything that was implicitly started on the remote site.'
I think that PL/SQL Developer is trying to remove useless transactions to help avoid session errors. It seems that whenever you press the "Fetch last page" button, PL/SQL Developer runs commit write batch if the statement contains a database link, if there are no transactions currently open in the session, and if the statement does not include FOR UPDATE.
Those are a lot of weird conditions, but they seem to ensure that the program won't commit when it shouldn't. I assume PL/SQL Developer is using commit write batch to use less resources than a normal commit. That guess is based on the number returned by this query increasing when I hit the button. (There's another statistic for user commits, and that number does not increase.)
select value
from v$mystat
join v$statname on v$mystat.statistic# = v$statname.statistic#
where lower(display_name) = 'commit batch performed';
This behavior is a little odd, but it could help prevent some errors in the session. For example, if you later try to run alter session enable parallel dml the session would throw the error ORA-12841: Cannot alter the session parallel DML state within a transaction. By committing the (worthless) transaction, you avoid some of those errors.

PostgreSQL vs. Oracle default transaction management

In PostgreSQL, if you encounter an error in a transaction (for example, when your insert statement violates a unique constraint), the whole transaction is aborted, you cannot commit it, and no rows are inserted:
database=# begin;
BEGIN
database=# insert into table (id, something) values ('1','whatever');
INSERT 0 1
database=# insert into table (id, something) values ('1','whatever');
ERROR: duplicate key value violates unique constraint "table_id_key"
Key (id)=(1) already exists.
database=# insert into table (id, something) values ('2','whatever');
ERROR: current transaction is aborted, commands ignored until end of transaction block
database=# rollback;
database=# select * from table;
id | something |
-----+------------+
(0 rows)
You can change that in psql by setting ON_ERROR_ROLLBACK to "on" or "interactive". After that, you can run multiple inserts, ignore the errors, commit, and end up with only the successfully inserted rows in the table once the transaction ends.
database=# \set ON_ERROR_ROLLBACK interactive
In Oracle, continuing after a failed statement is the default transaction management behaviour, which surprises me. Isn't this completely counterintuitive and dangerous?
When I start a transaction, I want to be sure that all the statements were successful. What if my multiple inserts make up some kind of object or data structure? I end up completely unaware of the state of the data in my database and have to check it after the commit.
If one of the inserts fails, I want to be sure that the other inserts will be rolled back, or not even evaluated after the first error, which is exactly how it's done in PostgreSQL.
Why does Oracle manage transactions this way by default, and why is it considered good practice?
For example, one commenter here writes:
This is a very neat feature.
I don't understand this, though: "Normally, any error you make will
throw an exception and cause your current transaction to be marked as
aborted. This is sane and expected behavior..."
No, it's really not. Oracle doesn't work this way, nor does MySQL. I
have no experience with MSSQL or DB2 but I'll bet a dollar each they
don't work this way either. There no intuitive reason why a syntax
error, or any other error for that matter, should abort a transaction.
I can only assume there's either some limitation deep in the Postgres
guts that requires this behavior, or that it conforms to some obscure
part of the SQL standard that everyone else sensibly ignores. There's
certainly no API / UX reason why it should work this way.
We really shouldn't be too proud of any workarounds we've developed
for this pathological behavior. It's like IT Stockholm Syndrome.
Doesn't it even violate the definition of a transaction?
Transactions provide an "all-or-nothing" proposition, stating that
each work-unit performed in a database must either complete in its
entirety or have no effect whatsoever.
I agree with you. I think it's a mistake not to abort the whole tx. But people are used to that, so they think it's reasonable and correct. Like people who use MySQL think that the DBMS should accept 0000-00-00 as a date, or people using Oracle expect that '' IS NULL.
The idea that there's a clear distinction between a syntax error and something else is flawed.
If I write
BEGIN;
CREATE TABLE new_customers (...);
INSET INTO new_customers (...)
SELECT ... FROM customers;
DROP TABLE customers;
COMMIT;
I don't care that it's a typo resulting in a syntax error that caused me to lose my data. I care that the transaction didn't successfully execute all its statements but still committed.
It'd be technically feasible to allow soft rollback in PostgreSQL before any rows are actually written by a statement - probably before we even enter the executor. So failures in the parse and parameter binding phases could allow the tx not to be aborted. We have a statement memory context we could use to clean up.
However, once the statement starts changing rows, it's doing so on disk with the same transaction ID as the prior statements in the tx. So you can't roll it back without rolling back the whole tx. To allow statement rollback Pg needs to assign a new subtransaction ID. That costs resources. You can do it explicitly with SAVEPOINTs when you want to, and internally that's what psql is doing. In theory we could allow the server to do this implicitly for each statement to implement statement rollback, just at a performance cost. But I doubt any patch implementing this would get committed, at least not without a LOT of argument, because most of the PostgreSQL team are (IMO reasonably) not fond of "whoops, that broke but we'll continue anyway" transaction semantics.
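The explicit SAVEPOINT approach mentioned above can be sketched as follows. The table demo_items and the savepoint name are hypothetical and chosen only for illustration; the unique constraint mirrors the example in the question.

```sql
-- Hypothetical table; the UNIQUE constraint mirrors the question's example.
CREATE TABLE demo_items (id text UNIQUE, something text);

BEGIN;
INSERT INTO demo_items (id, something) VALUES ('1', 'whatever');

-- Mark a point to return to before running a statement that may fail
SAVEPOINT before_risky_insert;
INSERT INTO demo_items (id, something) VALUES ('1', 'whatever');  -- fails: duplicate key
ROLLBACK TO SAVEPOINT before_risky_insert;  -- undoes only the failed statement

-- The transaction is usable again, so this insert is accepted
INSERT INTO demo_items (id, something) VALUES ('2', 'whatever');
COMMIT;

SELECT id FROM demo_items ORDER BY id;  -- rows '1' and '2'
```

Here the decision to continue after the error is made explicitly by whoever writes the transaction, rather than applied implicitly to every statement.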

What is the reason for ORA-00054 error?

From Oracle's documentation:
ORA-00054 resource busy and acquire with NOWAIT specified
Cause: Resource interested is busy.
Action: Retry if necessary.
In our code we issue a SELECT FOR UPDATE NOWAIT command to lock the row we are about to update.
Right now, the logic is that if the command returns SQL error 54, we assume that another user is trying to update the same record. Is this logic valid?
From Oracle's documentation it looks more like if the DB is overwhelmed then this might also cause this error to be thrown.
What are the possible reasons for this error, when we are only using the above SQL command?
The SELECT ... FOR UPDATE attempts to acquire an RS (Row Share) lock on the table and an X (eXclusive) lock on the row. If another session has an exclusive lock on the table (e.g. while creating an index) or an exclusive lock on the row (update, delete, or select for update), then the query will wait for the other transaction to release the lock (generally by commit or rollback) unless you have specified NOWAIT.
So one possibility is to not specify NOWAIT.
I don't recognise the situation where the database might throw this error due to being "overwhelmed".
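A minimal two-session sketch of the row-lock case follows. The table demo_orders and the id value are hypothetical, used only for illustration.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE demo_orders (id NUMBER PRIMARY KEY, status VARCHAR2(20));
INSERT INTO demo_orders VALUES (42, 'NEW');
COMMIT;

-- Session 1: lock the row and keep the transaction open
SELECT * FROM demo_orders WHERE id = 42 FOR UPDATE;

-- Session 2: ask for the same row lock but refuse to wait;
-- raises ORA-00054 because session 1 still holds the row lock
SELECT * FROM demo_orders WHERE id = 42 FOR UPDATE NOWAIT;

-- Session 2, alternative: wait up to 5 seconds before giving up;
-- raises ORA-30006 if the lock is still held when the timeout expires
SELECT * FROM demo_orders WHERE id = 42 FOR UPDATE WAIT 5;
```

In this sketch the error in session 2 is caused purely by the lock held in session 1, not by any load on the database.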

Finding all statements involved in a deadlock from an Oracle trace file?

As I understand it, the typical case of a deadlock involving row-locking requires four SQL statements. Two in one transaction to update row A and row B, and then a further two in a separate transaction to update the same rows, and require the same locks, but in the reverse order.
Transaction 1 gets the lock on row A before transaction 2 can request it, transaction 2 gets the lock on row B before transaction 1 can get it, and neither can get the remaining required lock. One of the two transactions has to be rolled back so that the other can complete.
When I review an Oracle trace file after a deadlock, it only seems to highlight two queries. These seem to be the last one out of each transaction.
How can I identify the other statements involved in each transaction, or is this missing in an Oracle trace file?
I can include relevant bits of the specific trace file if required.
You're correct, in a typical row-level deadlock, you'll have session 1 execute sql_a that will lock row 1. Then session 2 will execute sql_b that will lock row 2. Then session 1 will execute sql_c to attempt to lock row 2, but session 2 has not committed, and so session 1 starts waiting. Finally, session 2 comes along, and it issues sql_d, attempting to lock row 1, but, since session 1 holds that lock, it starts waiting. Three seconds later, the deadlock is detected, and one of the sessions will catch ORA-00060 and the trace file is written.
In this scenario, the trace file will contain sql_c and sql_d, but not sql_a or sql_b.
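The four statements from this scenario can be sketched as follows. The table demo_accounts and its values are hypothetical; the labels sql_a to sql_d match the names used above.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE demo_accounts (id NUMBER PRIMARY KEY, balance NUMBER);
INSERT INTO demo_accounts VALUES (1, 100);
INSERT INTO demo_accounts VALUES (2, 100);
COMMIT;

-- Session 1, sql_a: locks row 1
UPDATE demo_accounts SET balance = balance - 10 WHERE id = 1;

-- Session 2, sql_b: locks row 2
UPDATE demo_accounts SET balance = balance - 10 WHERE id = 2;

-- Session 1, sql_c: needs row 2, so it waits for session 2
UPDATE demo_accounts SET balance = balance + 10 WHERE id = 2;

-- Session 2, sql_d: needs row 1, so it waits for session 1;
-- the cycle is a deadlock and one of the waiting statements fails with ORA-00060
UPDATE demo_accounts SET balance = balance + 10 WHERE id = 1;
```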
The problem is that information just really isn't available anywhere. Consider that you execute a DML, it starts a transaction if one doesn't exist, generates a bunch of undo and redo, and the change is made. But, once that happens, the session is no longer associated with that SQL statement. There's really no clean way to go back and find that information.
sql_c and sql_d, on the other hand, are the statements that were associated with those sessions when the deadlock occurred, so, clearly, Oracle can identify them, and include that in the trace file.
So, you're correct, the information about sql_a and sql_b is not in the trace, and it's really not readily available.
Hope that helps.
