How do JTA/JTS handle transaction timeouts? - spring

Below is my understanding of how JTA/JTS handle a transaction timeout, but I cannot find any document or material to back it up. Is my understanding right? Do you know of any material that covers this issue?
The application server iterates through all transactions to check for timeouts. If a transaction times out, the application server marks it rollback-only and logs the details, but it neither throws an exception nor interrupts the transaction at that moment. When the transaction thread later attempts to access another transactional resource (such as JDBC or JMS), the resource, which implements the JTA interfaces, checks the rollback flag before going further, and only then is a RollbackException thrown.
==========
Test Case 1:
Set transaction timeout to 10 secs
I. Transaction begin
II. Sleep 20 secs
III. System out "Sleep end"
Result: The timeout occurs at the 10th second and the server logs the timeout details, but no exception is thrown. "Sleep end" is still printed.
==========
Test Case 2:
Set transaction timeout to 10 secs
I. Transaction begin
II. Sleep 20 secs
III. Access db 1st time
IV. Access db 2nd time
V. System out "Sleep end"
Result: The timeout occurs at the 10th second and the server logs the timeout details, but no exception is thrown at that point. An exception is thrown on the first DB access, so "Sleep end" is not printed.
==========
Test Case 3:
Set transaction timeout to 10 secs
I. Transaction begin
II. Access db and db deadlock
Result: The timeout occurs at the 10th second and the server logs the timeout details. No exception is thrown and the transaction thread stays stuck, so transaction timeout control cannot handle the DB deadlock. I am quite confused by this.
In my understanding, the behavior above should be the same when using Spring transaction management (JTA) and EJB. Am I right?
Thanks for your help!

Tested, and the results confirm that my understanding is correct.
Summary of the results:
• Transaction timeout control only affects transactional activities (e.g. accessing the DB or sending a JMS message).
• The application server does not interrupt the transaction thread immediately when the timeout occurs; it only logs the details. A timeout exception is thrown when the transaction commits or when the next transactional activity is attempted.
• A DB deadlock cannot be handled by transaction timeout control. However, DB2 has a deadlock detection mechanism that releases the deadlock and rolls back the transaction in some cases.
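For reference, a minimal sketch of Test Case 2 using the JTA UserTransaction API; the class name, the DataSource JNDI name and the exact exception raised on the first DB access are illustrative and depend on the application server:

import javax.annotation.Resource;
import javax.sql.DataSource;
import javax.transaction.UserTransaction;
import java.sql.Connection;

public class TimeoutProbe {

    @Resource
    private UserTransaction utx;

    @Resource(name = "jdbc/MyDS")           // hypothetical JNDI name
    private DataSource dataSource;

    public void run() throws Exception {
        utx.setTransactionTimeout(10);      // 10 second timeout
        utx.begin();                        // I. Transaction begin
        Thread.sleep(20_000);               // II. Sleep 20 secs - no exception here, the server
                                            //     only marks the transaction rollback-only
        try (Connection con = dataSource.getConnection()) {   // III. Access db 1st time
            // expected: the JTA-aware resource sees the rollback-only flag and fails here
        }
        System.out.println("Sleep end");    // not reached in Test Case 2
    }
}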

Related

JMeter : Check JDBC connection open or close after each thread execution

I am trying to check the JDBC connection status after each thread execution, i.e. whether it is closed or open.
In my thread group there are three things
JDBC connection configuration
JDBC request (select * from employee)
JSR223 PostProcessor
Script :
def connection = org.apache.jmeter.protocol.jdbc.config.DataSourceElement.getConnection('ConnectionString')
log.info('*************Connection closed: '+ connection.isClosed())
The above script logs the connection status after each thread execution when the loop count is 1. The problem is that as soon as I change the loop count to >= 2, it starts throwing the error below.
Problem in JSR223 script, JSR223 PostProcessor
javax.script.ScriptException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object
And when I remove the PostProcessor and increase the loop count, it works fine.
Logs :
2023-02-19 16:15:33,599 INFO o.a.j.t.JMeterThread: Thread started: DB Thread Group 2-1
2023-02-19 16:15:38,054 DEBUG o.a.j.p.j.AbstractJDBCTestElement: executing jdbc:SELECT * FROM EMPLOYEE
2023-02-19 16:15:38,623 INFO o.a.j.e.J.JSR223 PostProcessor: *************Connection closed: false
2023-02-19 16:15:58,637 ERROR o.a.j.e.JSR223PostProcessor: Problem in JSR223 script, JSR223 PostProcessor
javax.script.ScriptException: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object, borrowMaxWaitDuration=PT10S
at org.codehaus.groovy.jsr223.GroovyScriptEngineImpl.eval(GroovyScriptEngineImpl.java:320) ~[groovy-jsr223-3.0.11.jar:3.0.11]
at org.codehaus.groovy.jsr223.GroovyCompiledScript.eval(GroovyCompiledScript.java:71) ~
In the JDBC configuration are you using a connection pool to request connections?
What your test shows is that the JSR223 script is closing the connection, which is probably a good thing from a coding perspective, but the next iteration of the loop tries to execute a request with a closed Connection and blammo. If you switch from raw connections to a connection pool, then when the JSR223 script closes the connection it is simply returned to the pool and remains open for the next iteration of the loop. You'll typically have to switch to the DataSource API for this, but it's a minor tweak to the script.
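A minimal sketch of that idea, assuming a pooling DataSource such as Apache Commons DBCP2 (which JMeter itself ships with); the URL and credentials are placeholders:

import org.apache.commons.dbcp2.BasicDataSource;
import java.sql.Connection;

BasicDataSource pool = new BasicDataSource();
pool.setUrl("jdbc:mysql://localhost:3306/test");   // placeholder URL
pool.setUsername("user");
pool.setPassword("secret");

try (Connection con = pool.getConnection()) {
    // use the connection...
}   // close() here only returns the connection to the pool,
    // so the next loop iteration can borrow it again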
I can think of 2 possible reasons:
Either your database is down/overloaded/not reachable via JDBC
Or your connection pool settings need to be tweaked, i.e. the maximum number of connections and/or the wait time needs to be increased.
In general I don't think your approach is correct, as per JavaDoc:
This method generally cannot be called to determine whether a connection to a database is valid or invalid. A typical client can determine that a connection is invalid by catching any exceptions that might be thrown when an operation is attempted.
So you might want to increase the debug logging verbosity for JMeter, your JDBC driver and the java.sql namespace instead.
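If you do need a programmatic check, a minimal sketch following that JavaDoc advice is to call Connection.isValid() and catch exceptions, reusing the connection and log variables from the script above (the 5 second timeout is illustrative):

import java.sql.SQLException;

boolean usable;
try {
    usable = connection.isValid(5);   // driver-level liveness check, 5 second timeout
} catch (SQLException e) {
    usable = false;                   // any exception means the connection is not usable
}
log.info("Connection usable: " + usable);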

Running out of SQL connections with Quarkus and hibernate-reactive-panache

I've got a Quarkus app which uses hibernate-reactive-panache to run some queries, then processes the result and returns JSON via a REST call.
For each REST call 5 DB queries are executed; the last one loads about 20k rows:
public Uni<GraphProcessor> loadData(GraphProcessor graphProcessor) {
    return myEntityRepository.findByDateLeaving(graphProcessor.getSearchDate())
        .select().where(graphProcessor::filter)
        .onItem().invoke(graphProcessor::onNextRow).collect().asList()
        .onItem().invoke(g -> log.info("loadData - end"))
        .replaceWith(graphProcessor);
}

// In myEntityRepository
public Multi<MyEntity> findByDateLeaving(LocalDate searchDate) {
    LocalDateTime startDate = searchDate.atStartOfDay();
    return MyEntity.find("#MyEntity.findByDate",
            Parameters.with("startDate", startDate).map())
        .stream();
}
This all works fine for the first 4 times but on the 5th call I get
11:12:48:070 ERROR [org.hibernate.reactive.util.impl.CompletionStages:121] (147) HR000057: Failed to execute statement [$1select <ONE OF THE QUERIES HERE>]: $2could not load an entity: [com.mycode.SomeEntity#1]: java.util.concurrent.CompletionException: io.vertx.core.impl.NoStackTraceThrowable: Timeout
at <16 internal lines>
io.vertx.sqlclient.impl.pool.SqlConnectionPool$1PoolRequest.lambda$null$0(SqlConnectionPool.java:202) <4 internal lines>
at io.vertx.sqlclient.impl.pool.SqlConnectionPool$1PoolRequest.lambda$onEnqueue$1(SqlConnectionPool.java:199) <15 internal lines>
Caused by: io.vertx.core.impl.NoStackTraceThrowable: Timeout
I've checked https://quarkus.io/guides/reactive-sql-clients#pooled-connection-idle-timeout and configured
quarkus.datasource.reactive.idle-timeout=1000
That itself did not make a difference.
I then added
quarkus.datasource.reactive.max-size=10
I was able to run 10 Rest calls before getting the timeout again. On a pool setting of max-size=20 I was able to run it 20 times. So it does look like each Rest call will use up a SQL connection and not release it again.
Is there something that needs to be done to manually release the connection or is this simply a bug?
The problem was with using @Blocking on a reactive REST method.
See https://github.com/quarkusio/quarkus/issues/25138 and https://quarkus.io/blog/resteasy-reactive-smart-dispatch/ for more information.
So if you have a REST method that returns e.g. Uni or Multi, DO NOT use @Blocking on it. I initially had to add it because I received an exception telling me that the thread cannot block, which was due to some CPU-intensive calculations. Adding @Blocking made that exception go away (in dev mode; another problem popped up in native mode) but caused this SQL pool issue.
The real solution was to use emitOn to switch to a worker thread for the CPU-intensive method:
.emitOn(Infrastructure.getDefaultWorkerPool())
.onItem().transform(processor::cpuIntensiveMethod)
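For context, a hedged sketch of how this might look in the REST resource; the path, return type GraphResult, the GraphProcessor constructor and cpuIntensiveMethod are illustrative, and Infrastructure is io.smallrye.mutiny.infrastructure.Infrastructure:

@GET
@Path("/graph")
public Uni<GraphResult> graph() {
    return loadData(new GraphProcessor(LocalDate.now()))    // illustrative constructor
        // leave the Vert.x event loop before the heavy computation,
        // instead of annotating the endpoint with @Blocking
        .emitOn(Infrastructure.getDefaultWorkerPool())
        .onItem().transform(GraphProcessor::cpuIntensiveMethod);
}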

KafkaConsumer poll() behavior understanding

Trying to understand (new to Kafka) how the poll loop in Kafka works.
Use case: 25 records on the topic, max poll size is set to 5.
max.poll.interval.ms = 5000 // 5 seconds
max.poll.records = 5
Sequence of tasks:
Poll the records from the topic.
Process the records in a for loop.
Some processing logic decides whether each record passes or fails.
If the logic passes, the record's offset is added to a map.
The map is then committed using a commitSync call.
If it fails, the loop breaks and whatever succeeded before that point is committed. The problem starts after this.
The next poll just keeps moving on in batches of 5 even after the error; is this expected?
What we basically expect is that the loop breaks, the offsets of the successfully processed messages are committed, and then the next poll continues from the failed message (a rough sketch of this loop is shown below).
For example, in the first batch of 5 polled messages, offsets 1 and 2 succeed and are committed, then the 3rd fails. Yet the poll call keeps moving on to the next batches, like 5-10, 10-15. If there is an error in between, we expect it to stop at that point: the next poll should start from offset 3 in the first case, or if it fails at offset 8 in the 2nd batch, the next poll should start from offset 8, not from the next max-poll batch. IF IT MATTERS, THIS IS A SPRING BOOT PROJECT and enable.auto.commit is false.
I have tried finding this in the documentation but with no luck.
I also tried tweaking max.poll.interval.ms, but it did not help.
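For clarity, here is a minimal sketch of the loop described above, as a fragment inside the consumer's poll loop (process() and the variables are illustrative), committing only the offsets that were processed successfully:

Map<TopicPartition, OffsetAndMetadata> toCommit = new HashMap<>();
ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
for (ConsumerRecord<String, String> record : records) {
    try {
        process(record);    // hypothetical business logic that may throw
        toCommit.put(new TopicPartition(record.topic(), record.partition()),
                     new OffsetAndMetadata(record.offset() + 1));
    } catch (Exception e) {
        break;              // stop processing the batch on the first failure
    }
}
if (!toCommit.isEmpty()) {
    consumer.commitSync(toCommit);   // commit only the successful offsets
}
// Note: without an explicit seek(), the next poll() still returns the records
// after this batch, which is exactly the behavior observed above.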
EDIT: Not accepting the answer because there is no direct solution for a custom consumer. Keeping this for informational purposes.
max.poll.interval.ms is milliseconds, not seconds so it should be 5000.
Once the records have been returned by the poll (and offsets not committed), they won't be returned again unless you restart the consumer or perform seek() operations on the consumer to reset the offset to the unprocessed ones.
The Spring for Apache Kafka project provides a SeekToCurrentErrorHandler to perform this task for you.
If you are using the consumer yourself (which it sounds like), you must do the seeks.
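For the Spring-managed case, a hedged configuration sketch, assuming Spring for Apache Kafka 2.x (from 2.8 on, DefaultErrorHandler replaces this class):

@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
        ConsumerFactory<String, String> consumerFactory) {
    ConcurrentKafkaListenerContainerFactory<String, String> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory);
    // on a listener exception, seeks the unprocessed records so the next poll redelivers them
    factory.setErrorHandler(new SeekToCurrentErrorHandler());
    return factory;
}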
You can manually seek to the beginning offset of the poll for all the assigned partitions on failure. I am not sure how to do this with the Spring consumer.
Sample code for seeking offset to beginning for normal consumer.
In the code below I am getting the records list per partition and then getting the offset of the first record to seek to.
import scala.jdk.CollectionConverters._

def seekBack(records: ConsumerRecords[String, String]): Unit =
  records.partitions().asScala.foreach { partition =>
    val partitionedRecords = records.records(partition)
    val offset = partitionedRecords.get(0).offset() // offset of the first record returned for this partition
    consumer.seek(partition, offset)                // next poll() re-reads the whole batch
  }
One caveat: doing this unconditionally in production is bad, since you only want to seek back when the error is transient; otherwise you will end up retrying infinitely.

Jco Adapter pooling performance deadlock?

We're running an enterprise-scale SAP application with front-end Spring Boot clients connecting via the JCo adapter 3.0 on Oracle VM, using the connection pool (size 100). We're experiencing unsystematic long-running requests (> 10 s) that are not visible in the SAP application server log, i.e. the bottleneck does not appear to be on the SAP side.
Looking at the trace files (level 4) for an example request we can see that the time seems lost when the adapter thread tries to get the client from the pool (other threads continue execution, removed the irrelevant threads for clarity):
[20:05:50:259]: [JCoAPI] JCoContext.isStateful(P-foo-CPIC0) in session ID Client-53-1 returns false
[20:05:50:259]: [JCoAPI] JCoContext.begin(P-foo-CPIC0) in session ID Client-53-1
[20:05:50:259]: [JCoAPI] Started context for session Client-53-1
[20:05:50:259]: [JCoAPI] JCoContext.begin() for destination PFOO_200 (P-foo-CPIC0) on context with id Client-53-1; current state counter is 1
[20:05:50:259]: [JCoAPI] destination PFOO_200 destinationID=P-foo-CPIC0 executes Z_foo sessionID=Client-53-1, threadID=0x35
[20:05:50:259]: [JCoAPI] Context.getConnection on destination PFOO_200 (state: destination = STATEFUL, default = STATELESS)
[20:05:50:259]: [JCoAPI] PoolingFactory.getClient() on pool P-foo-CPIC0
--> time lost here
[20:06:20:840]: [JCoAPI] PoolingFactory.getClient() returns handle [3/84977415]
[20:06:20:840]: [JCoAPI] Context.getConnection on destination PFOO_200 nothing found in the context - got client from ConnectionManager [3/84977415]
[20:06:20:840]: [JCoAPI] JCoClient before execute(Z_foo) on handle [3/84977415]
[20:06:20:840]: [JCoRFC] Executing function Z_foo on handle [3/84977415]
[20:06:20:866]: [JCoAPI] JCoClient after execute(Z_foo) on handle [3/84977415] returns after 26 ms
[20:06:20:866]: [JCoAPI] Context.releaseConnection on destination PFOO_200 [3/84977415]
[20:06:20:867]: [JCoAPI] JCoContext.end(P-foo-CPIC0) in session ID Client-53-1
[20:06:20:867]: [JCoAPI] PoolingFactory.releaseClient() handle [3/84977415] into pool P-foo-CPIC0 [pool size: 3, peak limit: 100, waiting threads: 0, currently used: 1]
[20:06:20:879]: [JCoAPI] Finished context for session Client-53-1
[20:06:20:879]: [JCoAPI] JCoContext.end() for destination PFOO_200 (P-foo-CPIC0) on context with id Client-53-1; current state counter is 0
For a typical request the step is handled in milliseconds.
Are there any known limitations or configurations regarding pool handling for the Jco adapter, either on adapter or on SAP side?
Update: we're on JCo adapter 3.0.16 and will double-check 3.0.17 now. DNS seems unlikely since we're monitoring dig/nslookup and they run without delays.
Which JCo patch level do you use?
Did you try to update to the latest JCo patch level 3.0.17 first?
In your time gap the RFC connection will be opened and the RFC logon will be done, if the pool is empty at that time. Did you have a closer look with a higher trace level, or did you have a look into the RFC trace?
This can be anything from not having a free dialog work process at ABAP side, to SAP system database issues (required for the RFC logon authentication checks), slow response times from the SAP message server (if using load balanced logons), SNC handshake issues (if using SNC) or general network issues with the DNS (try using the IP address instead of a hostname).
Another point worth checking: you say your connection pool has size 100. Is it possible that your program has more than 100 threads? Then it may happen from time to time that all connections are busy in other threads, and the current thread has to wait until a function call in another thread completes and a connection is returned to the pool.
(How long a thread waits on an empty pool can be customized via the "pool wait time" parameter.)
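For reference, a hedged sketch of the pool-related destination properties (property names as documented for JCo 3.x; the values are illustrative):

import java.util.Properties;

Properties props = new Properties();
props.setProperty("jco.destination.pool_capacity", "100");         // idle connections kept open in the pool
props.setProperty("jco.destination.peak_limit", "100");            // max connections used simultaneously
props.setProperty("jco.destination.max_get_client_time", "30000"); // pool wait time in ms before a JCoException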

How to set timeout on a SQL statement with Hibernate

What I have tried...
Using Criteria:
criteria.setTimeout(1);
=> Zero effect
Using HQL:
return getSession().createQuery("from User").setTimeout(1).list();
=> Zero effect
Even tried it at the transaction level:
@Transactional(timeout = 1, readOnly = true, propagation = Propagation.REQUIRES_NEW)
Some effect: after the query is executed an exception is thrown, but the transaction is not aborted after 1 second.
This is running on a Jetty server with Atomikos TM.
How can I specify a timeout on a specific query so that the call is aborted after that timeout (which will not be 1 second, this is just to test quicker)?
UPDATE
When running the same test but bypassing Atomikos/Hibernate/Spring (i.e. using plain JDBC), the timeout does work (approximately at least; it will not be exactly 1 second: on H2 it came really close, on Oracle it was sometimes over 2 seconds). The result was as expected: SQLException: ORA-01013: user requested cancel of current operation.
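For comparison, this is roughly the plain-JDBC variant that did honor the timeout; the dataSource variable and the query are illustrative, and Statement.setQueryTimeout() is the same value Hibernate places on the PreparedStatement (see the update below):

try (Connection con = dataSource.getConnection();
     PreparedStatement ps = con.prepareStatement("select * from USERS")) {   // illustrative query
    ps.setQueryTimeout(1);          // seconds; the driver cancels the statement when exceeded
    try (ResultSet rs = ps.executeQuery()) {
        // process rows...
    }
} catch (SQLException e) {
    // e.g. ORA-01013: user requested cancel of current operation
}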
UPDATE
When debugging Hibernate I can see that the timeout is set on the PreparedStatement, but it seems to have no effect.
