[Ignite] Problem upgrading Ignite 2.7 services to 2.10... services always getting cancelled? - spring

I completed the code upgrade and removed all deprecated calls from my service code.
This is a Java Spring application, and all services come up (as reported in the log files) when the cluster of 5 nodes starts.
However, when I try to get a service proxy to each service, all services get cancelled because ServiceDeploymentTask attempts to redeploy them! The redeploy fails: it only cancels all of the services and never restarts them. This can be reproduced with both a thick and a thin client (a rough sketch of how I fetch the proxies is below).
Why is Ignite trying to redeploy the services, and why don't the services restart?
Is there something I'm missing in the move to Ignite 2.10?
Finally, why does a Java thin client create a NODE_JOIN event?
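For reference, here is roughly how I fetch the proxies on both client types; the service name, the interface and its ping() method are placeholders, and the configuration is illustrative only:

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.client.IgniteClient;
import org.apache.ignite.configuration.ClientConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class ServiceProxyCheck {

    // Hypothetical interface of one of the deployed services
    public interface MyService {
        String ping();
    }

    public static void main(String[] args) throws Exception {
        // Thick client: join the cluster as a client node, then ask for a non-sticky proxy
        try (Ignite ignite = Ignition.start(new IgniteConfiguration().setClientMode(true))) {
            MyService svc = ignite.services().serviceProxy("myService", MyService.class, false);
            System.out.println(svc.ping());
        }

        // Java thin client: service proxies come from client.services()
        ClientConfiguration cfg = new ClientConfiguration().setAddresses("127.0.0.1:10800");
        try (IgniteClient client = Ignition.startClient(cfg)) {
            MyService svc = client.services().serviceProxy("myService", MyService.class);
            System.out.println(svc.ping());
        }
    }
}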
Thanks in advance.
Greg

Related

Tomcat 8.5 hot deploy loses connection with Elasticsearch

In my test environment, I run Tomcat and use hot deployment. This works most of the time, but there are rare occasions where the connection between Tomcat and Elasticsearch gets lost. In my other application with Tomcat and Elasticsearch (a different Elasticsearch version), I encounter the same issue.
Usually I just restart Tomcat and it establishes the connection with Elasticsearch again. So in my other application I stop Tomcat before deploying the WAR file, then start it again; since then I haven't hit the issue (lost ES connection) in that project. However, in the project that still uses hot deployment, I still get this weird issue. If I do stop/deploy/start Tomcat, the problem does not occur, but I want this project to keep using hot deployment. I would expect Tomcat to re-establish the connection with Elasticsearch, but this is not the case.
Has anyone experienced this issue, and how did you fix hot deployment with Elasticsearch? By the way, I use Spring Data Elasticsearch in one project and the Elasticsearch REST high level client in another, and both have the same issue on hot deploy (it rarely happens, though).
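For context, the REST high level client is wired up roughly like this (host and port are placeholders); registering it with destroyMethod = "close" is meant to make sure the old client is shut down when the previous application context is destroyed during a hot redeploy:

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class EsClientConfig {

    // Spring calls close() on the client when the old context is torn down,
    // so a redeployed context starts with a fresh connection.
    @Bean(destroyMethod = "close")
    public RestHighLevelClient restHighLevelClient() {
        return new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http"))); // placeholder host
    }
}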

How to check if docker Cassandra instance is ready to take connections

I have two docker instances that I launch with docker-compose.
One holds a Cassandra instance
One holds a Spring Boot application that tries to connect to that Cassandra instance.
However, the Spring Boot application always fails, because it tries to connect to a Cassandra instance that is not yet ready to accept connections.
I have tried:
Using restart:always in Docker-compose
This still doesn't always work, because Cassandra might be up 'enough' to no longer crash the Spring Boot application, but not up 'enough' to have successfully created the table/column family. On top of that, this is a very hacky solution.
Using healthcheck
It seems like healthcheck in compose doesn't have restart capabilities
Using a bash script as entrypoint
In the hope that I could use netstat, ping, ... whatever to determine the readiness state of Cassandra.
Right now the only thing that really works is using that same bash script to sleep the process for x seconds and then start the jar. This is even more hacky...
Does anyone have an idea on how to solve this?
Thanks!
Does the Spring Boot service defined in docker-compose.yml use depends_on for the cassandra service? If yes, then the service is started only once the cassandra service is ready.
https://docs.docker.com/compose/compose-file/#depends_on
Take a look at this GitHub repository to find a healthcheck for the cassandra service.
https://github.com/docker-library/healthcheck
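A rough sketch of how the two could be combined, assuming Compose file format 2.1 (where depends_on supports condition: service_healthy); the image names and the cqlsh-based check are illustrative only:

version: "2.1"
services:
  cassandra:
    image: cassandra:3.11
    healthcheck:
      # consider the container healthy once cqlsh can answer a query
      test: ["CMD-SHELL", "cqlsh -e 'DESCRIBE KEYSPACES'"]
      interval: 15s
      timeout: 10s
      retries: 10
  app:
    image: my-spring-boot-app   # placeholder image
    depends_on:
      cassandra:
        condition: service_healthy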
CONCLUSION
After some discussion we found out that docker-compose does not seem to provide functionality for waiting until services are up and healthy, the way Kubernetes and OpenShift do (see comments below). The recommendation is to use a wrapper script (docker-entrypoint.sh) that waits for the dependent service to come up, but this requires binaries in the application image that the actual service shouldn't need, such as the cassandra client binary. Additionally, the service depending on cassandra could never come up if cassandra doesn't, which shouldn't happen.
A main point of microservices is that they have to be resilient to failures: they are not supposed to die, or fail to come up, just because a dependent service is currently unavailable or unexpectedly disappears. "Unexpected" is actually the wrong word in this context, because you should always expect such issues in a distributed environment, and even with docker-compose you will face issues like the ones discussed in this topic. Therefore the microservice should be implemented so that it retries the connection after startup or after an unexpected disconnect.
The following link points to a tutorial that helped me integrate Cassandra properly into a Spring Boot application. It shows how to obtain the Cassandra connection with retry behavior, so the service is resilient to a missing Cassandra database and no longer fails to start. Hope this helps others as well.
https://dzone.com/articles/containerising-a-spring-data-cassandra-application
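Not the tutorial's exact code, but a minimal sketch of the retry idea using the DataStax Java driver 4.x; the contact point, datacenter and keyspace names are placeholders, and a real service would probably bound the retries and use proper logging:

import java.net.InetSocketAddress;
import com.datastax.oss.driver.api.core.CqlSession;

public class CassandraConnector {

    // Keep retrying until Cassandra accepts connections instead of
    // failing the whole application at startup.
    public static CqlSession connectWithRetry() throws InterruptedException {
        while (true) {
            try {
                return CqlSession.builder()
                        .addContactPoint(new InetSocketAddress("cassandra", 9042)) // compose service name (placeholder)
                        .withLocalDatacenter("datacenter1")                        // placeholder
                        .withKeyspace("my_keyspace")                               // placeholder
                        .build();
            } catch (RuntimeException e) {
                System.out.println("Cassandra not ready yet, retrying in 5s: " + e.getMessage());
                Thread.sleep(5_000);
            }
        }
    }
}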

Spring boot restful webservices

My Spring Boot RESTful web service keeps working even though I stopped the Microsoft SQL Server database it uses. How does it work?
There might be a few reasons:
You might be using some kind of cache, so responses still come from the cache even though your DB is down.
You might be calling services that don't require a DB transaction.
Or, if you only mean that your application keeps running, then spring.datasource.continue-on-error=true might be set, or you might have some data source validation properties defined that let the app at least keep running and re-establish a connection whenever the DB comes back.
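For illustration, the kind of properties meant here could look like this (these assume Spring Boot with the default HikariCP pool; exact names depend on the pool in use):

# don't fail startup if database initialization scripts error out
spring.datasource.continue-on-error=true
# let the HikariCP pool start even if the database is unreachable at boot
spring.datasource.hikari.initialization-fail-timeout=-1
# validate connections taken from the pool
spring.datasource.hikari.connection-test-query=SELECT 1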

My Pivotal Cloud Foundry app is crashing often during health checks

I have created a Spring Boot integration app and deployed it to the Pivotal Cloud Foundry (PCF) environment. It works for a couple of days and then starts to crash randomly. I checked the PCF logs and found this information about the crash:
OUTApp instance exited with guid 3c348d47-48c4-403f-950a-29af1efa551d
payload: {"instance"=>"e2122543-214f-4806-62c7-00e1", "index"=>2,
"reason"=>"CRASHED", "exit_description"=>"Instance became unhealthy: Failed
to make HTTP request to '/health' on port 8080: timed out after 1.00
seconds", "crash_count"=>1, "crash_timestamp"=>1511959503256098495,
"version"=>"10cea919-d490-460d-83d6-5132c96ef781"}
My CPU utilization is low, and my memory is not leaking either.
Information about the application deployed in PCF:
The Spring Boot integration app connects to IBM MQ queues, polls for messages, and then calls a couple of web services.
There is also another application, Service Bus, which makes the health check call on the PCF application to check whether the PCF app is available. If Service Bus finds the PCF app available, requests are routed to PCF; otherwise they are processed at the Service Bus end itself.
Please let me know how to find the root cause of the crash and fix it.
Thanks in advance. Please let me know if you need further details.
I have changed the health check type from http to port in the manifest.yml file.
The configuration change in the manifest file is as follows:
health-check-type: port
Now the app is not crashing; it is working fine. Hope this helps.
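For context, the relevant part of the manifest then looks roughly like this (application name and memory are placeholders):

applications:
- name: my-integration-app   # placeholder
  memory: 1G                 # placeholder
  health-check-type: port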

Wildfly Swarm Consul

I am trying to register a Wildfly Swarm REST service with a running Consul agent, but it's not working correctly.
I am able to register a service (I can see it in the Consul UI), but somehow the health checks are not working.
The Swarm server keeps telling me that "sending the check" failed due to "HTTP 405 Method Not Allowed". I can see similar logs in the Consul console, saying that the GET method is not allowed.
I am at a dead end: my application is not working, and neither is the Wildfly Swarm example (same exception). I also configured a CORS filter on both sides just to be sure, but that's not working either.
I am using Wildfly Swarm 2017.10.1 and Consul 1.0.0.
I hope you guys can help.
Best regards
I figured it out myself. Obviously, it wasn't that hard ^^
I checked the version of the Consul Client API used by my Wildfly Swarm version: it's 0.9.16. I downloaded all Consul versions and checked which ones are compatible. I can verify that all Consul versions up to 0.9.3 work.
Consul 1.0.0 has some very critical breaking changes, and I really don't understand why they were not shipped as an HTTP API v2, but that's not the point here.
I highly recommend upgrading the Consul Client API used by the topology-consul fraction to a newer version such as 0.16.5 or 0.17.0.
At the very least, please add a note in the README for the ribbon-consul example about which Consul versions can be used.
