Pooling Not Reusing INACTIVE Sessions - Oracle

This is a general question about flow -
Lately we have started getting .NET warnings of "Connection timeout" or "Connection must be open for this operation".
We are working with an Oracle DB, and we have set up a job that runs every 5 seconds and counts how many connections the w3wp processes are holding (both ACTIVE and INACTIVE) by querying gv$session.
The max pool size for each WS (we have 2) is 300, meaning 600 connections in total.
We noticed that we are indeed reaching the 600 sessions before the crash; however, many of those 600 sessions are INACTIVE.
I would expect those sessions to be reused, since they are INACTIVE at the moment.
In addition, the prev_sql_id of most of these INACTIVE sessions is: SELECT PARAMETER, VALUE FROM SYS.NLS_DATABASE_PARAMETERS WHERE PARAMETER IN ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET').
Is this normal behavior?
Furthermore, after recycling, the connection count is of course small (around 30), but a minute later it jumps to 200. Again, the majority are INACTIVE sessions.
What is the best way to understand what these sessions are and how to troubleshoot this?
Thanks!
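One way to start answering the "what are these sessions" question is to group gv$session by connecting machine, program and status, and look at how old the sessions are. The application here is .NET, but the query can be run from any client; below is a minimal sketch using plain JDBC (the connection URL, monitoring credentials and the APP_USER schema name are placeholders, and the Oracle JDBC driver must be on the classpath).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class SessionAudit {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details; querying gv$session requires SELECT
        // privilege on that view.
        String url = "jdbc:oracle:thin:@//dbhost:1521/ORCL";
        String sql =
            "SELECT machine, program, status, COUNT(*) AS session_count, " +
            "       MIN(logon_time) AS oldest_logon " +
            "FROM   gv$session " +
            "WHERE  username = ? " +
            "GROUP  BY machine, program, status " +
            "ORDER  BY session_count DESC";
        try (Connection con = DriverManager.getConnection(url, "monitor_user", "secret");
             PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, "APP_USER"); // the schema the web servers connect as
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.printf("%-25s %-25s %-8s %5d  oldest logon: %s%n",
                            rs.getString("machine"), rs.getString("program"),
                            rs.getString("status"), rs.getInt("session_count"),
                            rs.getTimestamp("oldest_logon"));
                }
            }
        }
    }
}
```

Comparing these counts and logon times with the pool counters on the .NET side should show whether the INACTIVE sessions belong to the two web servers' pools or to something else connecting as the same user.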

Related

How to set idle timeout in quarkus when we limit the maximum number of http connections

https://quarkus.io/guides/http-reference#http-limits-configuration
quarkus.http.limits.max-connections
The maximum number of connections that are allowed at any one time. If this is set it is recommended to set a short idle timeout.
What does this mean exactly, and what is the property to set the idle timeout?
It means that Quarkus will limit the number of open HTTP connections to whatever you have set.
The reason we also recommend setting a low idle timeout via quarkus.http.idle-timeout (it depends on the application, but you probably want something in the low seconds) is that if you have idle connections sitting around and you have hit the maximum number of connections, you can run out of connections very quickly.
P.S. All Quarkus configuration options can be found in the Quarkus configuration reference.
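As a concrete sketch, the two settings discussed above would sit together in application.properties like this (the values are placeholders, not recommendations):

```properties
# Cap the number of concurrent HTTP connections (example value).
quarkus.http.limits.max-connections=500
# Reclaim idle connections quickly so idle clients do not hold the cap
# (example value; tune for your application).
quarkus.http.idle-timeout=10s
```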

H2 database: Does a 60 second write delay have adverse effects on db health?

We're currently using H2 version 199 in embedded mode with default nio file protocol and MVStore storage system. The write_delay parameter is set to 60 seconds.
We run a batch insert/update/delete of about 30,000 statements within 2 seconds (in one transaction), followed by another batch of a couple of hundred statements only 30 seconds later (in a second transaction). The next attempt to open a DB connection (only 2 minutes later) shows that the DB is corrupt:
File corrupted while reading record: null. Possible solution: use the recovery tool [90030-199]
Since the transactions occur within a minute, we wonder whether the write_delay of 60 seconds might be contributing to the issue.
Changing write_delay to 60s (from the default of 0.5s) will definitely increase your risk of lost transactions, and I do not see a good reason for doing it. It should not cause DB corruption, though. More likely some thread interruptions do that, since you are running a web server and who knows what else in the same JVM. Using the async file store might help in that area, and yes, it is stable enough (how much worse can it get for your app than a database corruption, anyway?).
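If you want to experiment with these settings, here is a minimal sketch. It assumes H2 1.4.x, where WRITE_DELAY (in milliseconds) can be appended to the JDBC URL or changed at runtime with SET WRITE_DELAY, and where the async: file-system prefix selects the asynchronous file store mentioned above; the database path is a placeholder.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class H2WriteDelaySketch {
    public static void main(String[] args) throws SQLException {
        // 500 ms is H2's default write delay; the question had it raised to 60000.
        // The "async:" prefix selects the asynchronous file store (assumed
        // available in H2 1.4.199).
        String url = "jdbc:h2:async:~/appdata;WRITE_DELAY=500";
        try (Connection con = DriverManager.getConnection(url, "sa", "");
             Statement st = con.createStatement()) {
            // The write delay can also be changed for the open database at runtime.
            st.execute("SET WRITE_DELAY 500");
        }
    }
}
```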

DBCP2 BasicDataSource Idle connections not getting cleared

I see that idle connections are not getting cleared, and I am not sure what the reason is. My configuration is:
initialSize-10
maxtotal-20
maxidle-10
minidle-0
minEvictableIdleTimeMillis-30min
timeBetweenEvictionRunsMillis-60min
numTestsPerEvictionRun-20
testOnBorrow-true
testOnIdle-true
validationQuery-select 1 from dual
From the various sources I have read, the following is my understanding:
maxTotal - the maximum number of active connections to the DataSource, which is 20 in the above case.
maxIdle - the number of idle connections that can remain in the pool; these are removed by the sweeper. In the above case, a connection is considered idle if it stays unused for 30 min. The sweeper runs every 60 min and checks 20 idle connections, clearing the idle ones. Idle connections exceeding maxIdle would be closed immediately.
Is the above understanding correct?
I am using BasicDataSourceMXBean to print the stats
{"NumActive":"0","NumIdle":"10","isClosed":"false","maxTotal":"20","MaxIdle":"10","MinIdle":"0"}
The idle connections are never getting cleared even though there is no traffic. Is there anything wrong in the above config?
Also, what is minIdle and when should we set it to a non-zero value?
We recently upgraded Hibernate from 3.6.0.Final to 4.3.11.Final, and Spring to 4.2.9 from an older Spring version.
Earlier, the idle connections were getting cleared, but since the upgrade they are not.
Everywhere I have looked, it seems that the property should be testWhileIdle rather than testOnIdle. That setting is false by default, so your idle connections aren't being tested for validity and thus aren't being evicted.
minIdle basically tells the connection pool how many idle connections are permissible. It's my understanding from the documentation that when minIdle is 0, there should be no idle connections.
Typically minIdle defaults to the same value as initialSize.
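To make the above concrete, here is a minimal sketch of a DBCP2 BasicDataSource configured so that the eviction thread actually runs, tests idle connections and clears them. The JDBC URL and credentials are placeholders, and the values mirror the ones in the question.

```java
import org.apache.commons.dbcp2.BasicDataSource;

public class PoolConfigSketch {
    public static BasicDataSource newDataSource() {
        BasicDataSource ds = new BasicDataSource();
        ds.setUrl("jdbc:oracle:thin:@//dbhost:1521/ORCL"); // placeholder
        ds.setUsername("app_user");                        // placeholder
        ds.setPassword("secret");                          // placeholder

        ds.setInitialSize(10);
        ds.setMaxTotal(20);
        ds.setMaxIdle(10);
        ds.setMinIdle(0);

        // The idle-object evictor only runs when this is a positive value.
        ds.setTimeBetweenEvictionRunsMillis(60L * 60 * 1000); // every 60 min
        ds.setMinEvictableIdleTimeMillis(30L * 60 * 1000);    // evictable after 30 min idle
        ds.setNumTestsPerEvictionRun(20);

        // testWhileIdle (there is no "testOnIdle" setter) lets the evictor
        // run the validation query against idle connections.
        ds.setTestWhileIdle(true);
        ds.setTestOnBorrow(true);
        ds.setValidationQuery("select 1 from dual");
        return ds;
    }
}
```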
I see that idle connections are not getting cleared. I am not sure what the reason is.
https://commons.apache.org/proper/commons-dbcp/configuration.html
The testWhileIdle vs. testOnIdle issue that others have pointed out should resolve your question as to why the idle connections remain open. You are correct in assuming that your initialSize=10 connections will be cleared by the eviction sweeper at the 60-minute mark to bring you down to minIdle=0. Why you would want minIdle=0 is a different question. The whole point of connection pooling is really to pre-authenticate, test, and establish your connections so they can sit in the pool "idle" and available to "borrow" by incoming requests. This improves performance by reducing execution time to just the SQL session.
Also, what is minIdle and when should we set it to a non-zero value?
These idle connections are pre-established and kept waiting for your future SQL requests. The minIdle sizing depends on your application, but the default from DBCP2 is 8 and probably not a bad place to start. The idea is to keep enough on hand to keep up with the average demand on the pool. You would set maxIdle to deal with those peak times when you have bursts of traffic. The testWhileIdle=true configuration you have applied will run the validationQuery when the sweeper comes around, but will only test 3 connections per run by default. You can configure numTestsPerEvictionRun to a higher number if you want more to be tested. These "tests" ensure your connections are still in a good state so that you don't grab a "bad" idle connection from the pool during execution.
I suspect that you may be more concerned with "hung" connections rather than "idle" connections. If this is the case, you will want to review the "abandoned" configurations that are designed to destroy "active" connections that have been running longer than X amount of time. removeAbandonedOnMaintenance=true along with removeAbandonedTimeout={numberOfSecondsBeforeEligibleForRemoval}.
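A sketch of those abandoned-connection settings on the same BasicDataSource (the timeout value is illustrative):

```java
import org.apache.commons.dbcp2.BasicDataSource;

public class AbandonedConfigSketch {
    public static void enableAbandonedCleanup(BasicDataSource ds) {
        // Reclaim connections that have been checked out ("active") for longer
        // than removeAbandonedTimeout seconds when the maintenance/eviction
        // thread runs.
        ds.setRemoveAbandonedOnMaintenance(true);
        ds.setRemoveAbandonedTimeout(300); // seconds before a borrowed connection is eligible
        // Log the stack trace of the code that borrowed the abandoned connection.
        ds.setLogAbandoned(true);
    }
}
```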

Bigquery Streaming inserts, persistent or new http connection on every insert?

I am using google-api-ruby-client for streaming data into BigQuery. Whenever there is a request, it is pushed into Redis as a queue, and then a Sidekiq worker tries to insert it into BigQuery. I think this involves opening a new HTTPS connection to BigQuery for every insert.
The way I have it set up is:
Events post every 1 second or when the batch size reaches 1 MB (one megabyte), whichever occurs first. This is per worker, so the BigQuery API may receive tens of HTTP posts per second over multiple HTTPS connections.
This is done using the provided API client by Google.
Now the question: for streaming inserts, which is the better approach?
A persistent HTTPS connection. If so, should it be a global connection that is shared across all requests, or something else?
Opening a new connection for every insert, like we are doing now using google-api-ruby-client.
I think it's much too early to talk about these optimizations. Other context is also missing, such as whether you have exhausted the kernel's TCP connections, how many connections are in the TIME_WAIT state, and so on.
Until the worker pool reaches 1000 connections per second on the same machine, you should stick with the default mode the library offers.
Otherwise, optimizing anything here would need a lot more context and a deep understanding of how this works.
On the other hand, you can batch more rows into the same streaming insert request (see the sketch after the limits below); the limits are:
Maximum row size: 1 MB
HTTP request size limit: 10 MB
Maximum rows per second: 100,000 rows per second, per table.
Maximum rows per request: 500
Maximum bytes per second: 100 MB per second, per table
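To illustrate the batching point: the question uses google-api-ruby-client, but the sketch below uses Google's Java client (com.google.cloud:google-cloud-bigquery) purely as an example of packing up to 500 buffered rows into each streaming insert request instead of one HTTP call per row.

```java
import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.InsertAllRequest;
import com.google.cloud.bigquery.InsertAllResponse;
import com.google.cloud.bigquery.TableId;

import java.util.List;
import java.util.Map;

public class BatchedStreamingInsert {
    private static final int MAX_ROWS_PER_REQUEST = 500; // BigQuery streaming limit

    public static void insertBatched(BigQuery bigquery, TableId table,
                                     List<Map<String, Object>> rows) {
        // Send the buffered rows in chunks of up to 500 rather than one request per row.
        for (int start = 0; start < rows.size(); start += MAX_ROWS_PER_REQUEST) {
            int end = Math.min(start + MAX_ROWS_PER_REQUEST, rows.size());
            InsertAllRequest.Builder request = InsertAllRequest.newBuilder(table);
            for (Map<String, Object> row : rows.subList(start, end)) {
                request.addRow(row);
            }
            InsertAllResponse response = bigquery.insertAll(request.build());
            if (response.hasErrors()) {
                // Inspect per-row errors; retry or log as appropriate.
                response.getInsertErrors().forEach((index, errors) ->
                        System.err.println("Row " + index + " failed: " + errors));
            }
        }
    }
}
```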
Read my other recommendations here: Google BigQuery: Slow streaming inserts performance
I will also try to give some context to better understand the complex situation when ports are exhausted.
Let's say on a machine you have a pool of 30,000 ports and 500 new connections per second (typical):
1 second goes by: you now have 29,500 ports left
10 seconds go by: you now have 25,000
30 seconds go by: you now have 15,000
at 59 seconds you are down to 500
at 60 seconds you get 500 back, and you stay at using 29,500, which keeps rolling along at 29,500. Everyone is happy.
Now say that you're seeing an average of 550 connections a second.
Suddenly there aren't any available ports to use.
So, your first option is to bump up the range of allowed local ports; easy enough, but even if you open it up as much as you can and go from 1025 to 65535, that's still only 64,000 ports; with your 60-second TCP_TIMEWAIT_LEN, you can sustain an average of 1,000 connections a second. Still, no persistent connections are in use.
This port exhaustion is discussed in more detail here: http://www.gossamer-threads.com/lists/nanog/users/158655

Windows Workflow SendReceive Activity timeout issue

I have hosted a state machine workflow as a WCF service, and the workflow is called from ASP.NET code. I used netTcpContextBinding for the workflow hosting. The problem is that if a SendReceive activity within the workflow takes a long time (say 1 minute) to execute, it shows a "transaction aborted" error and terminates. I have already set the binding values for the send, receive, open, and close timeouts to maximum values in both web.config and app.config.
How can I overcome this issue?
A TransactionScope has a default timeout of 60 seconds, so if whatever you are doing in there takes longer, it will time out and abort. You can increase the timeout on the TransactionScope, but quite frankly 60 seconds is already quite long. In most cases you are better off doing any long-running work to collect data before the transaction and keeping your transaction time as short as possible.
