spring boot + hibernate search + elastic search embedded fails to start - elasticsearch

I'm struggling to set up Hibernate Search using the Elasticsearch backend in a Spring Boot setup.
What I have is Spring Boot and the following dependencies:
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
    <version>1.4.0.M3</version>
</dependency>
<dependency>
    <groupId>org.hibernate</groupId>
    <artifactId>hibernate-search-backend-elasticsearch</artifactId>
    <version>5.6.0.Alpha3</version>
</dependency>
What happens is that Hibernate Search initializes before Elasticsearch has finished starting.
Using the following property exposes the REST interface as well:
spring:
  data:
    elasticsearch:
      properties:
        http:
          enabled: true
The result is the following exception:
Caused by: org.apache.http.conn.HttpHostConnectException: Connect to localhost:9200 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused: connect
Now, how do I define a dependency here?
I tried using a custom BeanFactoryPostProcessor to inject a dependency on Elasticsearch, but that seems to be ignored in the auto-configuration scenario.
Is there any way to introduce a wait until Elasticsearch is up?
The setup works when I set the Hibernate Search index_management_strategy to NONE, but then the index is not configured and all custom analyzer annotations are ignored. Everything falls back to the default mappings in Elasticsearch, which cannot be configured in the auto-configuration scenario.
Ideally Elasticsearch should be hosted external to the JVM, but running it embedded is convenient in testing scenarios.
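To make the "wait until Elasticsearch is up" idea concrete, here is a minimal sketch in plain Java. PortWaiter and waitForPort are hypothetical names, not part of Spring Boot or Hibernate Search, and the sketch only checks that the HTTP port accepts TCP connections; it does not check cluster health.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.time.Duration;

/** Hypothetical helper (not part of Spring Boot or Hibernate Search). */
final class PortWaiter {

    private PortWaiter() {
    }

    /**
     * Polls host:port until a TCP connection succeeds or the timeout elapses.
     * Returns true if the port became reachable in time, false otherwise.
     */
    static boolean waitForPort(String host, int port, Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            try (Socket socket = new Socket()) {
                // Short connect timeout so one attempt cannot block for long.
                socket.connect(new InetSocketAddress(host, port), 500);
                return true;
            } catch (IOException e) {
                // Connection refused or timed out: the server is not up yet.
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(200); // back off briefly before the next attempt
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

In the setup described here the call would look like PortWaiter.waitForPort("localhost", 9200, Duration.ofSeconds(30)). It would still have to run before Hibernate Search initializes, so it does not by itself solve the ordering problem.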

I understand this is an issue you're hitting during integration tests.
You could have a look at how we start ES during the integration tests within Hibernate Search itself, using a Maven plugin which makes sure the server is started before the tests:
- https://github.com/hibernate/hibernate-search/blob/5.6.0.Beta1/elasticsearch/pom.xml#L341-L368
N.B. this uses a custom ES configuration, tuned to start quickly even though it's only a single node cluster:
- https://raw.githubusercontent.com/hibernate/hibernate-search/5.6.0.Beta1/elasticsearch/elasticsearchconfiguration/elasticsearch.yml
Hibernate Search uses the Jest client to connect to ES, so it will require you to enable the HTTP connector of ES: let's not confuse this with a NodeClient, which is a different operating mode.
If your question isn't related to automated testing but rather production clusters, then I'd suggest using a Service Orchestrator like Kubernetes.

Thanks to some help from the spring boot team, I was able to solve the issue - solution here.
The problem is that there's no dependency between the EntityManagerFactory bean and the Elasticsearch Client bean so there's no guarantee that Elasticsearch will start before Hibernate. As it happens, Hibernate starts first and then fails to connect to Elasticsearch.
This can be fixed by setting up a dependency between the two beans. An easy way to do that is with a subclass of EntityManagerFactoryDependsOnPostProcessor:
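@Configuration
static class ElasticsearchJpaDependencyConfiguration extends EntityManagerFactoryDependsOnPostProcessor {

    public ElasticsearchJpaDependencyConfiguration() {
        // Make every EntityManagerFactory bean depend on the "elasticsearchClient" bean,
        // so that Elasticsearch is started before Hibernate.
        super("elasticsearchClient");
    }
}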
Now all that is needed is to set the number of replicas to 0 to fix the health status of the cluster in the single-node deployment. This can be done by specifying an additional property in the application.properties file:
spring.data.elasticsearch.properties.index.number_of_replicas=0

I checked the spring-data docs and it looks like you misunderstood this piece (which is admittedly confusing; do the authors not understand the tech underneath?):
By default the instance will attempt to connect to a local in-memory server (a NodeClient in Elasticsearch terms), but you can switch to a remote server (i.e. a TransportClient) by setting spring.data.elasticsearch.cluster-nodes to a comma-separated ‘host:port’ list.
A NodeClient is not a "local server"; it's a special type of ES client. This local client can connect to ES cluster nodes containing data and, as I said in the comment, you don't have any ES data nodes running.
Read this for a better understanding: https://www.elastic.co/guide/en/elasticsearch/guide/current/_transport_client_versus_node_client.html

Related

Using SSM Parameter Store with a Containerised Spring Boot running on EC2 through ECS

I've been working on containerising an application with the intention of using ECS to manage the creation and deployment of the application to EC2.
Guides I've followed include:
The Spring guide for containerising applications
Deploying Spring Boot on ECS
A guide on adding Parameter Store
Another guide on adding Parameter Store
This hasn't panned out entirely, and I believe I've narrowed it down to the Parameter Store config as the source of the issue.
Right now, pom.xml is pretty light-touch, though I've seen more config needed depending on the scenario:
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-bootstrap</artifactId>
    <version>3.1.2</version>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-aws-parameter-store-config</artifactId>
    <version>2.2.6.RELEASE</version>
</dependency>
Running locally, outside of a container, I can access parameters with very minimal bootstrap config:
spring.application.name=PxTest
#I would usually provide this...
#cloud.aws.credentials.profile-name=default
#...but it seems like Spring assumes this.
As soon as I containerise it, it fails. I then added:
cloud.aws.stack.auto=false
cloud.aws.region.static=us-east-1
cloud.aws.region.auto=false
As I understand it, this should help Spring find the appropriate parameters - but I've seen very little in the way of documentation or articles describing passing SSM Parameter Store properties into a Spring Boot application running on an ECS-provisioned EC2.
The error messages I can access on the EC2 talk about Docker entering promiscuous state, followed by blocking state, and repeating. Given it works fine without SSM, I suspect this is the Spring application starting up, failing, and then being retriggered repeatedly.
Running locally, the errors I get are:
com.amazonaws.SdkClientException: Failed to connect to service endpoint:
(short version, at EC2ResourceFetcher readResource)
Caused by: java.net.ConnectException: Connection refused
Factory method 'ssmClient' threw Exception; nested exception is com.amazonaws.SdkClientException: Unable to find a region via the region provider chain. Must provide an explicit region in the builder or setup environment to supply a region.
And then a knock-on exception about not being able to instantiate beans because it can't get to the SSM service.
As above, I am providing a region in bootstrap.properties, but it seems to be ignored locally. Since the application will be part of an application stack when deployed, it seems unlikely to me that the error I get locally is the same one that occurs on AWS.
Has anyone accomplished this before, or have any resources which may be useful on what information I need to pass into the container to allow it to talk to SSM?
I suggest using the built-in support ECS has for injecting secrets from SSM Parameter Store (or Secrets Manager) as environment variables. That way your code is more environment agnostic, and you can run it locally by just setting environment variables instead of connecting it to an external dependency like SSM Parameter Store.
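To illustrate the environment-variable approach, here is a minimal Java sketch. It assumes the ECS task definition injects a Parameter Store value into the container as a variable named DB_PASSWORD. That name and the EnvSettings class are hypothetical; the us-east-1 default is taken from the question.

```java
import java.util.Map;

/** Hypothetical settings holder populated purely from environment variables. */
final class EnvSettings {

    final String region;
    final String dbPassword;

    private EnvSettings(String region, String dbPassword) {
        this.region = region;
        this.dbPassword = dbPassword;
    }

    /**
     * Builds the settings from an environment map.
     * DB_PASSWORD is an assumed variable name; it is required.
     * AWS_REGION is optional and falls back to us-east-1.
     */
    static EnvSettings fromEnvironment(Map<String, String> env) {
        String dbPassword = env.get("DB_PASSWORD");
        if (dbPassword == null || dbPassword.isEmpty()) {
            throw new IllegalStateException("Missing required environment variable DB_PASSWORD");
        }
        String region = env.getOrDefault("AWS_REGION", "us-east-1");
        return new EnvSettings(region, dbPassword);
    }
}
```

In the application this would be called as EnvSettings.fromEnvironment(System.getenv()). Taking the map as a parameter means the same code runs locally or in a test with a hand-built map, without any connection to SSM.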

Testing a Spring Boot Elastic Search application and loading context without starting ES-Instance

Since I updated to Spring Boot 2.5, my application context won't start in the test environment.
We have several test environments. Most tests do not need an Elasticsearch instance. Those that need it share an Elasticsearch test container instance.
Since the update, the creation of repositories causes some kind of query to Elasticsearch. That fails and causes the context not to load.
Is there a way to mock away the Spring Data Elasticsearch part? (Not loading it is not really an option, as most parts of the context need to be loaded.)
Should I be starting an Elasticsearch instance for all integration tests? (That seems like overkill, since few tests actually need it.)
Any ideas are highly appreciated.

Hibernate search quarkus compatibility questions

I'm working on a Quarkus project. I have to connect to an Elasticsearch cluster, and in production there is a MySQL database with data.
I'm thinking about using Hibernate Search but I have some questions.
1- Which version of Hibernate Search does Quarkus use? It is not specified in the pom. Is it 6?
<dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-hibernate-search-orm-elasticsearch</artifactId>
</dependency>
2- Is it compatible with Elasticsearch 7.11.1?
3- In my project I will connect to the MySQL database just once to initialize all the indexes, then the connection is going to be closed. Is this possible, or does Hibernate Search need to be connected to the MySQL database at all times?
4- To initialize the indexes with Hibernate Search, is it mandatory to use Hibernate annotations (for example @Entity and @Column) in the entities?
5- As I said, the connection to the MySQL database is going to be closed after the first indexing. Is there a way to add new records to the index if I get a list of objects from another system (for example, something like a batch)?
Thanks
1- It's Hibernate Search 6; in Quarkus 1.13, it's 6.0.2.Final.
2- Yes, it should be. Our main testing is now against the latest Open Source version of Elasticsearch but we are still testing 7.11.
3- Hibernate Search handles reads/writes and also hydrates your search results from the database, so you should have the MySQL database around. If you are only doing read-only work AND only using projections, not having the database around may be possible, but I don't think it's a supported use case.
4- Yes.
5- You will have to implement it yourself; there's nothing built-in.

Can ElasticSearch be used as a persistent store for Apache Ignite?

I want to know if there's a way to configure Elasticsearch as the data source for Ignite. I searched the web but did not find a solution.
I want to implement this integration for a Java application.
If I understand your idea correctly, there's a way to do it. As far as I can see, Elasticsearch supports SQL table-like data access, available through a JDBC connection. On the Ignite side, we have 3rd party persistence, which uses JDBC to connect to an underlying store. To be honest, I haven't tested it, but I suppose it should work.
I should also mention that you can use GridGain WebConsole to generate a simple Ignite project from an existing JDBC connection. This functionality can be found on the Configuration tab -> Create Cluster Configuration.

Is there a way a Spring bean's afterPropertiesSet method can pick up new settings added after server startup

We are constructing an instance of a Couchbase cluster in a Spring singleton bean's afterPropertiesSet() method by reading the configuration (hosts, ports, connection timeouts, ...). This is working well.
We were using Apache hierarchical configuration for these settings. It has a reload strategy and does not require a server restart after the configuration changes; new settings are reflected within 2 minutes.
Now we have a requirement to update the Couchbase configuration at run time. But since afterPropertiesSet() is a bean lifecycle method and does not execute again, we are not able to achieve what we are looking for.
Right now, we need to restart the server (Tomcat) for the new settings to take effect.
Is there any mechanism in Spring, or any other better approach, to fulfill our requirement (a singleton bean capable of handling configuration changes at run time)?
From a design perspective, please provide your thoughts.
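One possible shape for such a singleton, sketched in plain Java under stated assumptions: instead of building the cluster once in a lifecycle callback, the bean compares the current configuration snapshot on each access and rebuilds the resource when it differs. ReloadableResource is a hypothetical class, not a Spring or Couchbase API; the configuration type is assumed to implement equals, and closing the old resource is left out.

```java
import java.util.Objects;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical holder that rebuilds a resource when its configuration snapshot changes. */
final class ReloadableResource<C, R> {

    private final Supplier<C> configSource; // e.g. reads the reloadable configuration
    private final Function<C, R> factory;   // e.g. builds a cluster connection from it

    private C currentConfig;
    private R currentResource;

    ReloadableResource(Supplier<C> configSource, Function<C, R> factory) {
        this.configSource = Objects.requireNonNull(configSource);
        this.factory = Objects.requireNonNull(factory);
    }

    /** Returns the resource for the latest configuration, rebuilding it only on change. */
    synchronized R get() {
        C latest = configSource.get();
        if (currentResource == null || !Objects.equals(currentConfig, latest)) {
            // A real implementation would also shut down the previous resource here.
            currentResource = factory.apply(latest);
            currentConfig = latest;
        }
        return currentResource;
    }
}
```

The singleton bean itself stays the same object for its whole life; only the resource it hands out is replaced. Checking on every access is the simplest variant. A scheduled check or a change listener on the configuration would avoid the per-call comparison.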
