Managing ElasticSearch resources in Tomcat - elasticsearch

I have a web app running in Tomcat that needs to connect to ES using the High Level Java API. I am unsure about the best practices for managing the ES resources (client, transport) in that context.
In the past, I would create a brand new client for every request and close it (as well as its transport) when I was done with the request.
But now I read that it's best to use a single client in my app and across all threads (the client is apparently thread-safe).
I can see two issues with that approach.
Issue 1: client timing out
If the single client hasn't been used in a while, it may have timed out. So before I use the client, I need a way to check if the client is still alive. But I can't find clear documentation on how to do that (at least not without pinging the server every time).
Issue 2: can't tell when Tomcat is done with the client
When I run my app as a command-line main() app, I can close the client's transport at the end of that main. But in a Tomcat context, my code has no way of knowing when Tomcat is done with the client and its transport.
I tried all sorts of tricks using finalize() but none of them work consistently. And from what I read, it's unwise to rely on finalize() to close resources, as the JVM offers no guarantee as to when an object will be GCed (if ever!).
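To make the "single client shared across threads" approach concrete, here is a minimal sketch using only the JDK. The SharedClientHolder class and its factory parameter are hypothetical names for illustration, not part of the ES API. The holder creates the client lazily, hands the same instance to every thread, and closes it exactly once. In a servlet container, close() would typically be called from a ServletContextListener's contextDestroyed callback, which the container invokes when the web app is stopped; that wiring is left out here so the sketch has no servlet dependency.

```java
import java.util.function.Supplier;

/** Hypothetical holder for one shared, thread-safe client (illustration only). */
final class SharedClientHolder<T extends AutoCloseable> implements AutoCloseable {
    private final Supplier<T> factory; // how to build the client (assumed to be supplied by the app)
    private T client;                  // guarded by "this"
    private boolean closed;

    SharedClientHolder(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the single shared instance, creating it on first use. */
    synchronized T get() {
        if (closed) {
            throw new IllegalStateException("holder already closed");
        }
        if (client == null) {
            client = factory.get();
        }
        return client;
    }

    /** Closes the client at most once; meant to be called when the web app shuts down. */
    @Override
    public synchronized void close() throws Exception {
        if (closed) {
            return;
        }
        closed = true;
        if (client != null) {
            client.close();
        }
    }
}
```

This replaces finalize() with an explicit shutdown call, so cleanup no longer depends on when (or whether) the garbage collector runs.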
Thx for your guidance.

Related

Spring Boot application not responding after pushing a large number of requests

I have a problem with one of two servers, Server A:
Server A: Red Hat Enterprise Linux Server release 7.2 (Maipo)
Server B: Red Hat Enterprise Linux Server release 7.7 (Maipo)
jdk-8u231 is installed on both servers.
I have a Spring Boot application running on both servers.
Whenever I use JMeter to send 100 concurrent requests to the application on each server, the application running on Server B has no problem.
But on Server A the application stops responding: the process (PID) is still running, but I can't reach the actuator endpoint or the Swagger page, I can't send new requests, and the log file shows nothing from that point on.
Thread dumps and heap dumps show no significant difference.
Could anyone show me how to analyze this problem?
I still have no idea why the problem occurs.
Well, I can only speculate here, but here are some ideas that can help:
There are two possible sources of the issue: the Java application and Linux (plus its network policies, firewalls and so forth).
Since you don't know for sure what happens, try working by elimination.
Create a script that runs 100 concurrent requests. Place the script on Server A (the problematic one) and run it there, against "localhost". If you see that it works, then the issue is not in Java at all. It is probably some network policy or the Linux setup.
Place a log message in the controller of the Java application and examine the log. The log should print the request number among other things, so that you'll be able to tell whether you get stuck after a well-defined number of requests or whether it's always a different number.
Check the configuration of the Spring Boot application. Maybe the number of threads allocated to serve requests by the embedded web server that runs inside the Spring Boot application (assuming you're not using a reactive stack) differs between the two servers. If that pool is exhausted, you won't be able to call REST endpoints, the actuator, etc.
If a JMX connection is available in this setup, connect via JMX and check the Tomcat MBeans (again, assuming there is a Tomcat under the hood) for pretty much the same information as in the previous point (the request thread pool).
You've mentioned thread dumps. Take more than one thread dump: one before you run the JMeter test, one during the run (while everything still works), and one when everything is stuck.
In the thread dumps, check the actual stack traces. Maybe all the threads are stuck working with the database or something similar and can't serve requests, as explained in the point about the web server's thread pool.
Examine the GC logs. Maybe GC is working so hard that you can't really interact with the application.
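The "100 concurrent requests against localhost" script suggested above could be sketched like this. It is a rough illustration, not a JMeter replacement: the ConcurrentLoad class name, the default URL and the timeout values are assumptions, and it uses only JDK 8 classes so it can run on the servers as described. It fires all requests at once and tallies the HTTP status codes, counting any failed request as -1.

```java
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical load script: fires N requests at once and tallies the outcomes. */
final class ConcurrentLoad {

    /** Runs `count` copies of `request` in parallel; returns outcome -> number of occurrences. */
    static Map<Integer, Integer> run(int count, Callable<Integer> request) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(count);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                tasks.add(request);
            }
            Map<Integer, Integer> tally = new TreeMap<>();
            // invokeAll blocks until every task has finished or failed.
            for (Future<Integer> future : pool.invokeAll(tasks)) {
                int outcome;
                try {
                    outcome = future.get();
                } catch (ExecutionException e) {
                    outcome = -1; // request failed (timeout, connection refused, ...)
                }
                tally.merge(outcome, 1, Integer::sum);
            }
            return tally;
        } finally {
            pool.shutdown();
        }
    }

    /** Plain GET that returns the HTTP status code. Timeouts are arbitrary example values. */
    static int get(String url) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(5_000);
        conn.setReadTimeout(30_000);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws Exception {
        // Placeholder URL: substitute a real endpoint of the application under test.
        String url = args.length > 0 ? args[0] : "http://localhost:8080/";
        System.out.println(run(100, () -> get(url)));
    }
}
```

If the tally on Server A shows only successful status codes when run locally but the JMeter run from outside still hangs, that points away from the Java application and toward the network or OS setup.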

SignalR combined with load balancer missing messages

I have 2 web servers (IIS 8.5) behind a hardware firewall and our application uses SignalR for some real-time updates. We are using SQL Server as the backplane to help us work in this load balanced environment. Additionally we are using sticky sessions on the load balancer to help us keep the users on the same web server during their session. When we are running in this hardware configuration we lose at least 1/3 of our messages. Sometimes we get all the expected messages but more often than not we are missing plenty.
When we are running on a single web server all messages are received. Does anyone have any suggestions for troubleshooting this problem? We've turned on logs (both client & server) and nothing looks like it's missing or broken. We're really stumped.
EDIT---
Some additional details that I hope will shed light on the situation.
Server to Client messages are getting lost. Pretty much all our communication is Server to Client.
We are using sticky session just based on IP and limited to 5 minutes but we're losing messages within that 5 minutes.
This is some old SignalR code that has been only minimally touched since SignalR 1 (or even older). We are keeping an in memory list of users along with their connections and we use that list to send notices back to the client. It seems most likely that this is the cause of the troubles but with Sticky sessions the user should be stuck to the same server for at least the 5 minutes right?
This list of users maps Username to connection id. This is useful when our backend services (on another machine) sends a message back with the username not the connection id.
Finally resolved this. There were 2 issues really. The first is that we were using an in memory list of users as mentioned in the edit above. Once we realized that wasn't going to work across machines we removed it. It also led us to the second issue which was how SignalR 2 uses the IUserIdProvider and our call should have been
Clients.User(userId).send(message)
instead of
context.Clients.Client(connection)
This code had existed since we first started using SignalR many years ago and never got properly updated as we upgraded SignalR versions.
Have the same machineKey specified in your web.config on both servers.

How to limit Couchbase client from trying to connect to Couchbase server when it's down?

I'm trying to handle Couchbase bootstrap failure gracefully and not fail the application startup. The idea is to use "Couchbase as a service", so that if I can't connect to it, I should still be able to return a degraded response. I've been able to somewhat achieve this by using the Couchbase async API; RxJava FTW.
Problem is, when the server is down, the Couchbase Java client goes crazy and keeps trying to connect to the server; from what I see, the class that does this is ConfigEndpoint and there's no limit to how many times it tries before giving up. This is flooding the logs with java.net.ConnectException: Connection refused errors. What I'd like, is for it to try a few times, and then stop.
Got any ideas that can help?
Edit:
Here's a sample app.
Steps to reproduce the problem:
svn export https://github.com/asarkar/spring/trunk/beer-demo.
From the beer-demo directory, run ./gradlew bootRun. Wait for the application to start up.
From another console, run curl -H "Accept: application/json" "http://localhost:8080/beers". The client request is going to time out due to the failure to connect to Couchbase, but the Couchbase client is going to flood the console continuously.
The reason we choose to have the client continue connecting is that Couchbase is typically deployed in high-availability clustered situations. Most people who run our SDK want it to keep trying to work. We do it pretty intelligently, I think, in that we do an exponential backoff and have tuneables so it's reasonable out of the box and can be adjusted to your environment.
As to what you're trying to do, one of the tuneables is related to retry. With adjustment of the timeout value and the retry, you can have the client referenceable by the application and simply fast fail if it can't service the request.
The other option is that we do have a way to let your application know what node would handle the request (or null if the bootstrap hasn't been done) and you can use this to implement circuit breaker like functionality. For a future release, we're looking to add circuit breakers directly to the SDK.
All of that said, these are not the normal path as the intent is that your Couchbase Cluster is up, running and accessible most of the time. Failures trigger failovers through auto-failover, which brings things back to availability. By design, Couchbase trades off some availability for consistency of data being accessed, with replica reads from exception handlers and other intentionally stale reads for you to buy into if you need them.
Hope that helps and glad to get any feedback on what you think we should do differently.
Solved this issue myself. The client I designed handles the following use cases:
The client startup must be resilient to CB failure/unavailability.
The client must not fail the request, but return a degraded response instead, if CB is not available.
The client must reconnect should a CB failover happen.
I've created a blog post here. I understand it's preferable to copy-paste rather than linking to an external URL, but the content is too big for an SO answer.
Start a separate thread and keep calling ping on the server every 10 or 20 seconds. Once CB is down, the pings will start failing, so add a check like "if ping fails 5-6 times in a row, close all the CB connections/resources".
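The ping-thread idea from the last answer could look roughly like the sketch below. Everything here is hypothetical scaffolding: PingWatchdog, the ping supplier and the onDown callback are illustrative names, and the sketch deliberately does not call any Couchbase SDK method, so the application would have to supply its own ping and cleanup logic.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical watchdog: trips after N consecutive failed pings. */
final class PingWatchdog {
    private final BooleanSupplier ping; // app-supplied check; returns true when the server answered
    private final Runnable onDown;      // app-supplied reaction, e.g. close connections
    private final int maxFailures;
    private int consecutiveFailures;
    private boolean down;

    PingWatchdog(BooleanSupplier ping, Runnable onDown, int maxFailures) {
        this.ping = ping;
        this.onDown = onDown;
        this.maxFailures = maxFailures;
    }

    /** Performs one check; this is also what the scheduler calls periodically. */
    synchronized void check() {
        boolean ok;
        try {
            ok = ping.getAsBoolean();
        } catch (RuntimeException e) {
            ok = false; // a throwing ping counts as a failure
        }
        if (ok) {
            consecutiveFailures = 0;
            down = false;
            return;
        }
        consecutiveFailures++;
        if (!down && consecutiveFailures >= maxFailures) {
            down = true;
            onDown.run(); // fires once per outage, not on every failed ping
        }
    }

    synchronized boolean isDown() {
        return down;
    }

    /** Starts periodic checks on a daemon thread; the caller shuts the scheduler down. */
    ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "ping-watchdog");
            t.setDaemon(true);
            return t;
        });
        scheduler.scheduleWithFixedDelay(this::check, 0, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

Request handlers could consult isDown() and return the degraded response immediately instead of waiting for a timeout, which is the circuit-breaker-like behavior discussed earlier in the thread.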

Should I be using AJAX or WebSockets.

Oh the joyous question of HTTP vs WebSockets is at it again. However, even after quite a bit of reading through the hundreds of "versus" blog posts, SO questions, etc., I'm still at a complete loss as to what I should be working towards for our application. In this post I will be supplying information on application functionality, and the types of requests/responses used in our application currently.
Currently our application is a sloppy piece of work, thrown together using AngularJS and AJAX requests to an Apache server running PHP, namely XAMPP. With the launch of our application I've noticed that we're having problems with response times when the server is under any kind of load. This probably has something to do with the sloppy architecture of our server, the hardware, and the fact that our MySQL database isn't exactly optimized.
However, with such a loyal fanbase and investors seeing potential in our application and giving us a chance to roll out a 2.0 I've been studying hard into how to turn this application into a powerhouse of low latency scalability. Honestly the best option would be hire someone with experience, but unfortunately I'm a hobbyist, and a one-man-army without much experience.
After some extensive research, I've decided on writing the backend using NodeJS this time. However, I'm having a hard time deciding between HTTP and WebSockets. Here are the types of transactions that are done between the server and the client.
Client sends a request to the server in JSON format. The request has a few different things.
A request id (For processing logic based on the request)
The data associated with the request ID.
The server receives the request, polls the database (if necessary) and then responds to the client in JSON format. Sometimes the server is serving files to the client. Namely images in Base64 format.
Currently the application (When being used) sends a request to the server every time an interface is changed, which on average for our application is once every few seconds. Every action on our interfaces sends another request to the server. The application also sends requests to check for notifications/messages every 8 seconds, (or two seconds depending on if they're on the messaging interface).
Currently, here are the benefits I see of a stateful connection over a stateless connection for our application.
If the connection is stateful, I can eliminate the requests for notifications and messages, as the server can just tell the client whenever one becomes available. This can eliminate x(n)/4 requests per second to the server alone.
Handling something like a disconnection from the server is as simple as attempting to reconnect. As opposed to handling timeouts/errors per request, this would only be handled on the socket.
Additional security can be obtained by removing security keys for database interaction. This should prevent the possibility of hijacking(?) a session_key and using it to manipulate or access another user's data. The session_key is only needed because there is no state in the AJAX setup.
However, I'm someone who started learning programming through TCP game server emulation. So I understand some benefits of a STATEFUL connection, while I don't understand the benefits of a STATELESS connection very much at all. I know they both have their benefits and quirks, but I'm curious what would be the best approach for us.
We're mainly looking for Scalability, as we had a local application launch and managed to bottleneck at nearly 10,000 users in under 48 hours. Luckily I announced this as a BETA and the users are cutting me a lot of slack after learning that I did it all on my own as a learning project. I've disabled registrations while looking into improving the application's front and backend.
IMPORTANT:
If using WebSockets, would we be able to asynchronously download pictures from the server like we can with AJAX? For example, I can make 5 requests to the server using AJAX for 5 different images, and they will all start downloading immediately. Using a stateful connection, would I have to wait for each photo to be streamed before moving to the next request? Would this only bottleneck a single user, or every user that is waiting on a request to be completed?
It all boils down to how your application works and how it needs to scale. I would use bare WebSockets rather than any wrapper, since it is already an easy-to-use API and your hands won't be tied when you need to scale out.
Here are some links that will give you insight, although not concrete answers to your questions because, as I said, it depends on your expectations.
Hard downsides of long polling?
WebSocket/REST: Client connections?
Websockets, and identifying unique peers[PHP]
How HTML5 Web Sockets Interact With Proxy Servers
If your question is "Should I use HTTP over WebSockets?", the answer is: you should not.
Even if it is faster because you don't lose time opening the connection, you also lose everything the HTTP specification gives you: verbs (GET, POST, PATCH, PUT, ...), paths, bodies, responses, and status codes. This seems simple, but you'll have to re-implement all or part of these protocol features.
So you should use Ajax as long as it is a one-off request.
When you need to make an Ajax request every 2 seconds, what you actually need is for the server to send you data, rather than for you to keep asking the server whether anything has changed. That is a sign that you should implement a WebSocket server.

What is ajax-push? Are there caveats to using it on some servers?

Can somebody explain what ajax-push is? From what I understand it involves leaving HTTP connections open for a long time and reconnecting as needed. It seems to be used in chat systems a lot.
I have also heard that when using ajax-push in Java it is important to use something with the NIO connectors or the Grizzly servlet API? Again, I'm just researching what it is exactly.
In normal AJAX (call it pull) you ask the server for something and you get it immediately. This is fine when you want to get some data from the server now. But what if something happens on the server and the server wants to push that event to the client(s)?
Technically this is implemented using so called long polling - the browser opens the HTTP connection and waits for the response. As long as there is nothing interesting on the server side, it waits. But when something happens, the server sends the response and the client receives it immediately. This is a huge advantage over normal polling where you ask the server every few seconds - it generates a lot of traffic and still introduces noticeable latency.
The only problem with this approach is the number of pending HTTP connections. Old-school Java servlet containers aren't quite capable of handling such a large number of connections due to the one-thread-per-connection limitation - they quickly run out of memory. Even though the HTTP threads aren't doing anything (they are waiting for some other part of the system to wake them up and give them the response), they occupy memory.
However there are plenty of solutions nowadays:
Tomcat NIO connectors
Atmosphere Ajax Push/Comet library
Servlet 3.0 #Async (most portable)
Container-specific features, but Servlet 3.0, if available, should be considered superior.
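The long-polling mechanics described above can be illustrated with a small in-memory sketch. This is not servlet code: LongPollChannel is a hypothetical stand-in for one client's pending request, built on a JDK BlockingQueue. The awaitEvent call plays the role of the HTTP request that waits until the server has something to say or a timeout tells the client to reconnect.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical in-memory stand-in for one client's long-poll channel. */
final class LongPollChannel {
    private final BlockingQueue<String> events = new LinkedBlockingQueue<>();

    /** Server side: something happened, so queue it for the waiting client. */
    void publish(String event) {
        events.add(event);
    }

    /**
     * Request-handler side: blocks until an event arrives or the timeout elapses.
     * A null result means "nothing happened, reconnect and wait again".
     */
    String awaitEvent(long timeout, TimeUnit unit) throws InterruptedException {
        return events.poll(timeout, unit);
    }
}
```

Note that awaitEvent holds its calling thread for the whole wait. That is exactly the one-thread-per-connection cost mentioned above, and it is what the NIO connectors and Servlet 3.0 async support are meant to avoid.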

Resources