We are getting the error below while executing search requests. We are using Elasticsearch 7.9.2.
java.lang.RuntimeException: Request cannot be executed; I/O reactor status: STOPPED
I am only using the synchronous method to execute the requests, as shown below:
client.search(searchRequest, RequestOptions.DEFAULT)
The first time the issue occurs, it appears to stop the RestHighLevelClient, and all subsequent calls fail with the same error. We need to restart our app to initialize the client again. For the last two days, we have been running into this issue very frequently.
Note: we are not closing the client on every call. We initialize the client during application startup and close it only when the app shuts down.
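For reference, this is roughly how we create and use the client; a minimal sketch, assuming a plain single-node setup (the host, port, and index name are placeholders):

    import java.io.IOException;
    import org.apache.http.HttpHost;
    import org.elasticsearch.action.search.SearchRequest;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.client.RestClient;
    import org.elasticsearch.client.RestHighLevelClient;

    public class SearchClientDemo {
        public static void main(String[] args) throws IOException {
            // Created once at application startup and reused for every request.
            RestHighLevelClient client = new RestHighLevelClient(
                    RestClient.builder(new HttpHost("localhost", 9200, "http")));

            // Per request, synchronous only:
            SearchRequest searchRequest = new SearchRequest("my-index"); // placeholder index
            SearchResponse response = client.search(searchRequest, RequestOptions.DEFAULT);
            System.out.println(response.status());

            // Closed only when the application shuts down.
            client.close();
        }
    }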
Restarting the Java process (in my case a Spring Boot process) fixes the problem.
Had the same issue. It was caused by an OutOfMemoryError, which in turn was caused by other changes in my service.
So increasing the heap-space limit helped.
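For example, when the service is started as a plain jar, the limit can be raised with the standard JVM flags; the values and jar name below are only illustrative:

    java -Xms512m -Xmx2g -jar my-service.jar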
I often get a stuck-thread error while trying to send a JMS message to another managed server within the domain in our production environment.
Initially we thought it might be due to load on the server, but the issue occurs randomly, even at times of low load, while the system sometimes handles high-volume periods just fine.
We have not been able to find the reason for this.
Error Information:
weblogic.jms.client.JMSConnectionFactory.createQueueConnection(JMSConnectionFactory.java:199)
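For context, the sending path looks roughly like the sketch below; the JNDI names and payload are placeholders, and createQueueConnection() is the frame where the thread is reported stuck:

    import javax.jms.Queue;
    import javax.jms.QueueConnection;
    import javax.jms.QueueConnectionFactory;
    import javax.jms.QueueSender;
    import javax.jms.QueueSession;
    import javax.jms.Session;
    import javax.naming.InitialContext;

    public class QueueSendDemo {
        public static void main(String[] args) throws Exception {
            InitialContext ctx = new InitialContext();
            QueueConnectionFactory factory =
                    (QueueConnectionFactory) ctx.lookup("jms/MyConnectionFactory"); // placeholder
            Queue queue = (Queue) ctx.lookup("jms/MyQueue");                        // placeholder

            // The stack trace above points at this call.
            QueueConnection connection = factory.createQueueConnection();
            try {
                QueueSession session =
                        connection.createQueueSession(false, Session.AUTO_ACKNOWLEDGE);
                QueueSender sender = session.createSender(queue);
                sender.send(session.createTextMessage("payload"));
            } finally {
                connection.close();
            }
        }
    }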
I'm trying to handle Couchbase bootstrap failure gracefully and not fail the application startup. The idea is to use "Couchbase as a service", so that if I can't connect to it, I should still be able to return a degraded response. I've been able to somewhat achieve this by using the Couchbase async API; RxJava FTW.
The problem is that when the server is down, the Couchbase Java client goes crazy and keeps trying to connect to the server; from what I can see, the class that does this is ConfigEndpoint, and there is no limit to how many times it retries before giving up. This floods the logs with java.net.ConnectException: Connection refused errors. What I'd like is for it to try a few times and then stop.
Got any ideas that can help?
Edit:
Here's a sample app.
Steps to reproduce the problem:
1. svn export https://github.com/asarkar/spring/trunk/beer-demo.
2. From the beer-demo directory, run ./gradlew bootRun. Wait for the application to start up.
3. From another console, run curl -H "Accept: application/json" "http://localhost:8080/beers". The client request will time out due to the failure to connect to Couchbase, but the Couchbase client will keep flooding the console.
The reason we chose to have the client continue connecting is that Couchbase is typically deployed in high-availability clustered situations. Most people who run our SDK want it to keep trying to work. We do it pretty intelligently, I think, in that we use an exponential backoff and expose tuneables, so it's reasonable out of the box and can be adjusted to your environment.
As to what you're trying to do, one of the tuneables is related to retry. By adjusting the timeout value and the retry strategy, you can keep the client referenceable by the application and simply fail fast if it can't service the request; see the sketch below.
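For example, with the 2.x Java SDK, that tuning would look something like this sketch; the timeout values and host name are illustrative, not recommendations:

    import com.couchbase.client.core.retry.FailFastRetryStrategy;
    import com.couchbase.client.java.CouchbaseCluster;
    import com.couchbase.client.java.env.CouchbaseEnvironment;
    import com.couchbase.client.java.env.DefaultCouchbaseEnvironment;

    public class FastFailDemo {
        public static void main(String[] args) {
            // All values are illustrative; tune them to your environment.
            CouchbaseEnvironment env = DefaultCouchbaseEnvironment.builder()
                    .connectTimeout(3000)                          // ms allowed for the initial connect
                    .kvTimeout(1000)                               // ms allowed for key-value operations
                    .retryStrategy(FailFastRetryStrategy.INSTANCE) // fail immediately instead of retrying
                    .build();

            CouchbaseCluster cluster = CouchbaseCluster.create(env, "couchbase-host");
        }
    }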
The other option is that we do have a way to let your application know which node would handle the request (or null if the bootstrap hasn't been done), and you can use this to implement circuit-breaker-like functionality. For a future release, we're looking at adding circuit breakers directly to the SDK.
All of that said, these are not the normal path, as the intent is that your Couchbase cluster is up, running, and accessible most of the time. Failures trigger failovers through auto-failover, which brings things back to availability. By design, Couchbase trades off some availability for consistency of the data being accessed, with replica reads from exception handlers and other intentionally stale reads for you to buy into if you need them.
Hope that helps and glad to get any feedback on what you think we should do differently.
Solved this issue myself. The client I designed handles the following use cases:
The client startup must be resilient to CB failure/unavailability.
The client must not fail the request, but return a degraded response instead, if CB is not available.
The client must reconnect should a CB failover happen.
I've created a blog post here. I understand it's preferable to copy-paste rather than linking to an external URL, but the content is too big for an SO answer.
Start a separate thread and keep calling ping from it every 10 or 20 seconds. Once CB is down, the pings will start failing; add a check like "if the ping fails 5-6 times in a row, close all the CB connections/resources".
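Roughly like this sketch, where pingCouchbase() and closeCouchbaseResources() are placeholders for your own health check and cleanup code:

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.atomic.AtomicInteger;

    public class CouchbaseWatchdog {

        private final ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
        private final AtomicInteger consecutiveFailures = new AtomicInteger();

        public void start() {
            scheduler.scheduleAtFixedRate(() -> {
                try {
                    pingCouchbase();              // placeholder health check
                    consecutiveFailures.set(0);   // reset the counter on success
                } catch (Exception e) {
                    // After 5 failures in a row, release everything.
                    if (consecutiveFailures.incrementAndGet() >= 5) {
                        closeCouchbaseResources(); // placeholder cleanup
                        scheduler.shutdown();
                    }
                }
            }, 0, 15, TimeUnit.SECONDS);
        }

        private void pingCouchbase() throws Exception { /* ... */ }

        private void closeCouchbaseResources() { /* ... */ }
    }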
I have a problem with starting my Windows service. It's configured to start automatically, and it usually does. Sometimes, however, it doesn't, especially on Windows 8.
The Windows log contains the following error:
The XYZ service failed to start due to the following error: The service did not respond to the start or control request in a timely fashion. A timeout was reached (30000 milliseconds) while waiting for the XYZ service to connect.
This is a .NET 2.0 service.
The standard cause of this problem is an OnStart method that performs a long synchronous operation. That is not the issue this time; in fact, I've placed a file logger at the beginning of the OnStart method, and it looks like it is not invoked at all.
It turned out that the problem was caused by two issues:
the executable file (exe) was digitally signed;
there were Internet connection problems, and acquiring an IP address took a long time.
The two combined caused the service start process to time out, because the certificate validation took too long.
I had to use this on native C Win32 services, and I searched for whether .NET has something similar. Sorry if I'm wrong.
In your OnStart, use the RequestAdditionalTime method to inform the service control manager that the operation will require more time to complete. Documentation here
I started getting a 500 Internal Server Error (ISE) on Heroku and tried enabling debug-level logs. The error is not consistent: it occurs for some requests, while others go through properly. When there is an ISE there is not a single log line (even with debug on) in the web dynos, yet I can see that the 500 response is returned for the request.
From the client side I am seeing the following:
This exception has been logged with id 6gimbegj7
The above line indicates that it is a response from Play! running in production mode.
I attached the New Relic monitoring plugin, which says the exception occurred in NettyDispatcher (the Netty I/O server); no further info.
Any idea what could be the issue?
We have identified the problem here.
In Play! we have a Global.java at the root of the packages, which has methods like
onRequest(), onRequestRoute()
If there is an error in this place and we do not handle the exception, Netty throws a bare 500 response back to the request. This can happen from any part of the application; we caught exceptions in other parts of the application but missed this one.
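The fix was simply to catch exceptions inside these hooks. A sketch of what the guarded Global looks like; the per-request body here is illustrative, not our actual logic:

    import java.lang.reflect.Method;
    import play.GlobalSettings;
    import play.Logger;
    import play.mvc.Action;
    import play.mvc.Http;

    public class Global extends GlobalSettings {

        @Override
        public Action onRequest(Http.Request request, Method actionMethod) {
            try {
                // Per-request logic that may throw lives here.
                Logger.info("Request: " + request.uri());
            } catch (Exception e) {
                // Log and swallow instead of letting the exception escape,
                // which would make Netty return a bare 500.
                Logger.error("onRequest failed for " + request.uri(), e);
            }
            return super.onRequest(request, actionMethod);
        }
    }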
Mimicking the Heroku environment with the help of the Foreman tool helped us narrow down the cause. Oddly, setting the Play framework log level to DEBUG did not produce any logs/stack traces; not sure why.
I am running a listener program on a JMS queue hosted in Sun Java System Application Server 9.1_02 (build b08-p03).
After receiving a message, I extract and log some details into a log file.
I observed that when the listener runs for really long hours, it stops receiving messages. I have to manually stop the program and start it again; it then receives some 200-300 messages and stops again, and I have to restart it to retrieve the next batch.
Why this weird behavior? Can someone throw light on this?
Thank you
Chaitanya
I found the issue myself. I am running the listener as a LoadRunner Java Vuser script. One negative thing about LR is that it does not show any exceptions: the script appears to be running but actually does nothing. I found this out when I ported the script to Eclipse and saw that it was throwing exceptions, essentially because I was not checking for a certain condition.
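For anyone hitting something similar, the fix boiled down to guarding onMessage and logging every exception. A sketch; extractAndLogDetails() and the instanceof check stand in for my actual processing and the condition I had missed:

    import javax.jms.Message;
    import javax.jms.MessageListener;
    import javax.jms.TextMessage;

    public class QueueListener implements MessageListener {

        @Override
        public void onMessage(Message message) {
            try {
                if (message instanceof TextMessage) { // illustrative guard condition
                    extractAndLogDetails((TextMessage) message);
                }
            } catch (Exception e) {
                // Log instead of letting the listener die silently.
                e.printStackTrace();
            }
        }

        private void extractAndLogDetails(TextMessage message) throws Exception {
            // Write selected fields from the message to the log file.
            System.out.println("Received: " + message.getText());
        }
    }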
Happy that I demystified it at last!
Thanks.