How to preserve a database connection while a thread is blocked - Oracle

The database I'm integrating with is configured to drop any connection that has been idle (not used for a while). Since I'm using Spring Batch in a persistent configuration, each running thread always holds an active database connection.
One of my Spring Batch jobs depends on data from an external web service that takes a long time to respond. As a result, the database connection is already lost by the time I get the result.
I tried using a TaskScheduler to register a heartbeat query (select 1 from dual) before the web request starts. It runs every 5 minutes to keep the connection alive, but even though the query executes periodically, I suspect it runs on a separate connection because it runs on another thread.
Does anyone have an alternative suggestion for keeping the connection alive while the thread is blocked?
I use JPA's EntityManager for the heartbeat query.

If you use Spring (Spring Boot in particular), you are probably also using HikariCP, Spring Boot's default connection pool. JDBC 4.0 and later define the Connection.isValid() method, so you do not have to run an SQL query to check whether a connection is alive.
Moreover, there is one more mechanism you can use: TCP keepalive.
If you insert the stanza ENABLE=BROKEN into your JDBC URL, the Oracle JDBC driver enables the TCP keepalive feature on the TCP connection:
jdbc:oracle:thin:@(DESCRIPTION=(ENABLE=BROKEN)(ADDRESS=(PROTOCOL=tcp)(PORT=1521)(HOST=myhost))(CONNECT_DATA=(SERVICE_NAME=orcl)))
The Linux kernel will then send keepalive probes over the TCP connection even while your thread is blocked.
Beware: the delay before the first probe and the probe frequency are determined by Linux kernel parameters.
# cat /proc/sys/net/ipv4/tcp_keepalive_time
7200
# cat /proc/sys/net/ipv4/tcp_keepalive_intvl
75
# cat /proc/sys/net/ipv4/tcp_keepalive_probes
9
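By default, the first keepalive probe (a TCP segment carrying 0 bytes of data) is sent only after 2 hours of idle time (7200 seconds), whereas Cisco/Juniper firewalls usually cut off an idle TCP connection after one hour. To be useful here, tcp_keepalive_time must be set lower than the firewall's idle timeout.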
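To see why the defaults do not help here, the arithmetic can be sketched in Java. This is only an illustrative calculation under assumptions: the class and method names are made up for this example, the one-hour idle timeout is the figure quoted above, and the dead-peer formula (keepalive time plus interval times probes) is the usual approximation of Linux behaviour rather than an exact guarantee.

```java
import java.time.Duration;

// Hypothetical helper: models the three Linux keepalive parameters shown above.
final class KeepaliveTiming {

    // Idle time before the first probe is sent (tcp_keepalive_time).
    static Duration firstProbeAfter(long keepaliveTimeSec) {
        return Duration.ofSeconds(keepaliveTimeSec);
    }

    // Approximate idle time after which an unresponsive peer is declared dead:
    // tcp_keepalive_time + tcp_keepalive_intvl * tcp_keepalive_probes.
    static Duration deadPeerDetectedAfter(long timeSec, long intvlSec, long probes) {
        return Duration.ofSeconds(timeSec + intvlSec * probes);
    }

    // True if the first probe goes out before a middlebox with the given
    // idle timeout would drop the connection.
    static boolean probeBeatsIdleTimeout(long keepaliveTimeSec, Duration idleTimeout) {
        return firstProbeAfter(keepaliveTimeSec).compareTo(idleTimeout) < 0;
    }

    public static void main(String[] args) {
        Duration firewallIdle = Duration.ofHours(1); // assumption taken from the text

        System.out.println(firstProbeAfter(7200));                     // PT2H
        System.out.println(deadPeerDetectedAfter(7200, 75, 9));        // PT2H11M15S
        System.out.println(probeBeatsIdleTimeout(7200, firewallIdle)); // false
        System.out.println(probeBeatsIdleTimeout(600, firewallIdle));  // true
    }
}
```

With the defaults, the first probe goes out after two hours, long after a one-hour idle timeout has already cut the connection. Lowering tcp_keepalive_time (600 seconds here is an arbitrary example value) puts the first probe well inside that window.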

Related

Golang ssh client timeout not working as expected

I am writing a Golang ssh/sftp client, using the golang.org/x/crypto/ssh package, that connects to an sftp server which is slow at accepting connections and writing files. I need to set a connection timeout and an SO timeout (as we do with the Java JSch library).
First, to get a connection timeout I used ssh.ClientConfig.Timeout, but it only worked for nanosecond and microsecond values, not for milliseconds and above, and I need to set 5 seconds. According to the API doc, I also assume ssh.ClientConfig.Timeout applies only to creating the TCP socket connection and does not cover the ssh handshake.
Then I tried net.Conn.SetDeadline(), but it covers everything end to end: connection creation, writing the file, and closing the connection. Since that is not what I want either, I tried net.Conn.SetWriteDeadline(), which looks like an SO timeout (applied at the TCP packet level). However, the timeout error does not appear right after the duration has elapsed; it only shows up after the server's late reply or when the next write operation starts.
Can someone please show the correct way to set a connection timeout and an SO timeout with the Golang ssh package, or tell me whether this is supported at all?

client-mode="true" and retryInterval on the inbound adapter with Client Connection factory

In the Spring documentation (32.6 TCP Adapters) it is mentioned that when we use clientMode = "true", the inbound adapter is responsible for the connection to the external server.
I have created a flow in which the TCP adapter, with a client connection factory, connects to the external server. The code for the flow is:
IntegrationFlow flow = IntegrationFlows.from(
        Tcp.inboundAdapter(
                Tcp.nioClient(hostConnection.getIpAddress(), Integer.parseInt(hostConnection.getPort()))
                    .serializer(customSerializer)
                    .deserializer(customSerializer)
                    .id(hostConnection.getConnectionNumber()))
            .clientMode(true)
            .retryInterval(1000)
            .errorChannel("testChannel")
            .id(hostConnection.getConnectionNumber() + "adapter"))
    .enrichHeaders(f -> f.header("CustomerCode", hostConnection.getConnectionNumber()))
    .channel(directChannel())
    .handle(Jms.outboundAdapter(ConnectionFactory())
        .destination(hostConnection.getConnectionNumber()))
    .get();
theFlow = this.flowContext.registration(flow).id(hostConnection.getConnectionNumber() + "outflow").register();
I create multiple flows by iterating over the list of connections, running the above code in a for loop and registering each flow in the flowContext with a unique ID.
My clients are created successfully and establish their connections as the topology requires.
Issue:
I counted the client connections: 7 client connections (7 integration flows) are created successfully and initiate their connections on their own.
When I create the 8th client connection (the 8th flow is created and registered successfully), .clientMode(true) stops working: the client does not initiate the connection again after the first failure. It tries once; if that attempt succeeds there is no issue, but if it fails it never retries.
The other 7 clients, which were created successfully, also stop initiating connections on their own once they get disconnected.
Note: there is no issue with the flows themselves; only the TCP adapters stop initiating connections.
I know the flow is created and registered correctly because when I run the control bus command #adapter_id.retryConnection(), it connects to the server.
I don't understand what is wrong with my flow such that connections are no longer initiated after a particular count (seven), or whether there is a limit on the number of clients that can be created.
One possibility is the taskScheduler's thread pool is exhausted - that shouldn't happen with the above configuration, but it depends on what else is in the application. Take a thread dump (e.g. jstack) to see what the taskScheduler threads are doing.
See the documentation for information about how to configure the threads in the scheduler. However, if increasing the pool size solves the problem, you should still figure out which task(s) are occupying scheduler threads for long periods.
Also turn on DEBUG logging to see if it provides any clues.
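If attaching jstack is inconvenient, a similar snapshot can be taken from inside the JVM. The following is a minimal sketch, not part of Spring Integration: the class name is hypothetical, and the "task-scheduler-" thread name prefix is an assumption about how the scheduler threads are named in your application, so adjust it to whatever your thread dump actually shows.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: lists what threads with a given name prefix are doing.
final class SchedulerThreadDump {

    // Returns thread name -> "STATE at topmost stack frame" for matching live threads.
    static Map<String, String> snapshot(String namePrefix) {
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            if (!thread.getName().startsWith(namePrefix)) {
                continue;
            }
            StackTraceElement[] stack = entry.getValue();
            String top = stack.length > 0 ? stack[0].toString() : "(no frames)";
            result.put(thread.getName(), thread.getState() + " at " + top);
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for a scheduler thread stuck in a long task.
        Thread busy = new Thread(() -> {
            try {
                Thread.sleep(2000);
            } catch (InterruptedException ignored) {
                // exit quietly
            }
        }, "task-scheduler-1");
        busy.setDaemon(true);
        busy.start();

        snapshot("task-scheduler-").forEach((name, what) -> System.out.println(name + ": " + what));
    }
}
```

If every scheduler thread shows up blocked or sleeping inside the same long-running task, that would be consistent with the pool being exhausted, which leaves no thread free to run the adapters' reconnect attempts.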

HornetQ client-failure-check-period

Suppose the client has not received any packets from the server for 30s (the default client-failure-check-period) because of network connection problems.
Will the client now be disconnected from the session/connection?
Suppose I now add this configuration:
<retry-interval>1000</retry-interval>
<retry-interval-multiplier>1.5</retry-interval-multiplier>
<max-retry-interval>60000</max-retry-interval>
<reconnect-attempts>1000</reconnect-attempts>
What will happen now?
Will the client still get disconnected from the session/connection, but only after trying to reconnect 1000 times (until the network is available again)? Or will it skip the disconnect altogether?
Regarding your first question, the HornetQ documentation says, under 17.2. Detecting failure from the client side:
As long as the client is receiving data from the server it will consider the connection to be still alive.
If the client does not receive any packets for client-failure-check-period milliseconds then it will consider the connection failed and will either initiate failover, or call any FailureListener instances (or ExceptionListener instances if you are using JMS) depending on how it has been configured.
Therefore the client will assume that the connection was in fact lost and start its failure processes.
For your second question, the HornetQ documentation says, under 34.3. Configuring reconnection/reattachment attributes:
reconnect-attempts. This optional parameter determines the total number of reconnect attempts to make before giving up and shutting down. A value of -1 signifies an unlimited number of attempts. The default value is 0.
So, yes, the connection will be dropped after 1000 attempts.
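To get a feel for how long 1000 attempts take with the settings from the question, the schedule can be modelled as follows. This is a rough sketch under assumptions, not HornetQ code: the class and method names are invented, and it assumes the wait before each attempt starts at retry-interval, is multiplied by retry-interval-multiplier after every failure, and is capped at max-retry-interval. The real client may round differently or skip the wait before the first attempt.

```java
// Hypothetical model of an exponential retry schedule with a cap.
final class RetryBackoff {

    // Wait in milliseconds before the given attempt (0-based):
    // retryInterval * multiplier^attempt, capped at maxRetryInterval.
    static long delayMillis(long retryInterval, double multiplier, long maxRetryInterval, int attempt) {
        double delay = retryInterval;
        for (int i = 0; i < attempt && delay < maxRetryInterval; i++) {
            delay *= multiplier;
        }
        return (long) Math.min(delay, (double) maxRetryInterval);
    }

    // Sum of the waits for the first `attempts` attempts, in milliseconds.
    static long totalWaitMillis(long retryInterval, double multiplier, long maxRetryInterval, int attempts) {
        long total = 0;
        for (int attempt = 0; attempt < attempts; attempt++) {
            total += delayMillis(retryInterval, multiplier, maxRetryInterval, attempt);
        }
        return total;
    }

    public static void main(String[] args) {
        // Values from the configuration in the question.
        for (int attempt = 0; attempt < 13; attempt++) {
            System.out.println("attempt " + attempt + ": wait "
                    + delayMillis(1000, 1.5, 60000, attempt) + " ms");
        }
        long total = totalWaitMillis(1000, 1.5, 60000, 1000);
        System.out.println("total wait: " + total / 3_600_000.0 + " h");
    }
}
```

Under this model the wait reaches the 60-second cap at the twelfth attempt, and the 1000 attempts add up to roughly 16.5 hours of waiting before the client gives up.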

Connection pool opens more connections than maximum pool size

Hey, I'm using GlassFish open source v4 and I'm having a weird problem.
I have defined a JDBC connection pool to Oracle 11g in the admin console and I've set:
Pool Settings
Initial and Minimum Pool Size: 500
Maximum Pool Size: 1000
Pool Resize Quantity: 750
I've also created a specific user for this connection pool. Yet sometimes when I inspect the open connections in the database, I see more than 1000 of them (the maximum I've seen was 1440).
When this happens, all query attempts fail: some with an OutOfMemory exception, some showing HTTP thread interruptions, and some without any logs at all, just taking a long time.
What I am wondering is how it is possible that GlassFish opens more connections than I've configured it to.
First, try to compare the netstat output on the application server side and the DB server side. You may have some "dangling" connections. Also try to find some documentation about DCD (Dead Connection Detection) in Oracle.
A few years ago I saw situations where a Java application server decided that a connection was dead because it had not responded for a few minutes. The connection was put onto a dead connection list and a new connection was created.
There can also be network issues, for example a firewall between the application server and the DB server.
When a TCP connection has been inactive for one hour, it is cut off on one side, but the DB server does not know about it.
The usual way to investigate this is:
1. compare the output of both netstat runs (application server and DB server)
2. identify the dangling TCP connections
3. translate each TCP connection into the Unix process ID (PID) of the Oracle session process
4. translate the PID into the Oracle session (SID and SERIAL#)
5. kill the session at the Oracle level (alter system kill session ...)
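The first two steps amount to a set difference between the two netstat listings. Here is a minimal sketch of that comparison, assuming the address pairs have already been extracted from the netstat output on both hosts (the parsing is left out, and the class name, method name and sample addresses are made up for illustration). It also assumes no NAT between the two hosts; otherwise the addresses seen on each side will not match.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: finds connections the DB host still holds
// that the application host no longer knows about.
final class DanglingConnections {

    // Each connection is a two-element list: [local address:port, remote address:port],
    // as printed by netstat on the host where it was collected.
    static List<List<String>> danglingOnDbSide(List<List<String>> appSide, List<List<String>> dbSide) {
        Set<List<String>> seenByApp = new HashSet<>();
        for (List<String> conn : appSide) {
            // Swap local and remote so the pair matches the DB host's point of view.
            seenByApp.add(List.of(conn.get(1), conn.get(0)));
        }
        List<List<String>> dangling = new ArrayList<>();
        for (List<String> conn : dbSide) {
            if (!seenByApp.contains(conn)) {
                dangling.add(conn);
            }
        }
        return dangling;
    }

    public static void main(String[] args) {
        // Made-up sample data: the app host knows one connection, the DB host holds two.
        List<List<String>> app = List.of(
                List.of("10.0.0.5:40001", "10.0.0.9:1521"));
        List<List<String>> db = List.of(
                List.of("10.0.0.9:1521", "10.0.0.5:40001"),
                List.of("10.0.0.9:1521", "10.0.0.5:40002"));

        // The second DB-side connection has no counterpart on the app host.
        System.out.println(danglingOnDbSide(app, db));
    }
}
```

The remaining steps (mapping a dangling connection to a PID and then to SID and SERIAL#) depend on the operating system and on Oracle's own views, so they are not modelled here.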

How can I set the timeout on OCILogon2?

When the Oracle 10 databases are up and running fine, OCILogon2() connects immediately. When the databases are turned off or inaccessible due to network issues, it fails immediately.
However, when our DBAs go into emergency maintenance and block incoming connections, it can take 5 to 10 minutes to time out.
This is problematic for me, since I've found that OCILogon2 isn't thread safe and we can only use it serially, and I connect to quite a few Oracle DBs. 3 blocked servers x 5-10 minutes = 15 to 30 minutes of lockup time.
Does anyone know how to set the OCILogon2 connection timeout?
Thanks.
I'm currently playing with OCI, and it seems to me that this is impossible.
The only way I can think of is to use non-blocking mode. In that case you need OCIServerAttach() and OCISessionBegin() instead of OCILogon(). But when I tried this, OCISessionBegin() constantly returned OCI_ERROR with the following error code:
ORA-03123 operation would block
Cause: The attempted operation cannot complete now.
Action: Retry the operation later.
This looks strange, and I don't yet know how to deal with it.
A possible workaround is to run the logon in a separate process, which you can kill after a timeout.
We think we found the right file setting - but it's one of those problems where we have to wait until something rare and horrible occurs before we can verify it :-/
[sqlnet.ora]
SQLNET.OUTBOUND_CONNECT_TIMEOUT=60
From the Oracle docs:
http://download.oracle.com/docs/cd/B28359_01/network.111/b28317/sqlnet.htm#BIIFGFHI
5.2.35 SQLNET.OUTBOUND_CONNECT_TIMEOUT
Purpose
Use the SQLNET.OUTBOUND_CONNECT_TIMEOUT parameter to specify the time, in seconds, for a client to establish an Oracle Net connection to the database instance.
If an Oracle Net connection is not established in the time specified, the connect attempt is terminated. The client receives an ORA-12170: TNS:Connect timeout occurred error.
The outbound connect timeout interval is a superset of the TCP connect timeout interval, which specifies a limit on the time taken to establish a TCP connection. Additionally, the outbound connect timeout interval includes the time taken to be connected to an Oracle instance providing the requested service.
Without this parameter, a client connection request to the database server may block for the default TCP connect timeout duration (approximately 8 minutes on Linux) when the database server host system is unreachable.
The outbound connect timeout interval is only applicable for TCP, TCP with SSL, and IPC transport connections.
Default
None
Example
SQLNET.OUTBOUND_CONNECT_TIMEOUT=10

Resources