Elasticsearch - Integration of Elasticsearch with Spring + Mongodb - spring

Currently I am working on an application built with Spring Boot + MongoDB. The application works fine, but in some collections a few fields are static/required while the others are completely dynamic.
Example:
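import java.util.List;
import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.mapping.Document;

@Document(collection = "yyyEntry")
public class YYYEntry {
    @Id
    private String id;
    private String yyyId;
    private String yyyNumber;
    // Dynamic part of the document: arbitrary field/value pairs.
    private List<KeyValueVo> customFields;
    //...
}

public class KeyValueVo {
    private String field;
    private Object value;
}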
Here the field member of KeyValueVo can hold any value, and a document can contain a list of such pairs. I can search documents by these fields in MongoDB, but performance is too low (there can be a very large number of such records), and the use case looks like a full-text search.
For this reason we decided to integrate Elasticsearch with MongoDB for searching. I have integrated Elasticsearch with Spring Boot and it works fine, but I found out that Elasticsearch stores the same data in its own format in its own storage instead of using the data in MongoDB (data duplication).
So my questions are:
1. Can we configure Elasticsearch to use the data of MongoDB? (Remove data duplication)
2. If question 1 is impossible, then how can I store data in both MongoDB and Elasticsearch at the same time using Spring Boot?
3. Is it good to use Elasticsearch with MongoDB?
4. Is it good to use the Elasticsearch MongoDB River for that purpose?
No need for code implementations, I just need simple guidelines. Any help will be highly appreciated.
Thanks,

Related

Spring Data Elasticsearch - How to get mapping for a field

Wondering how to do this in java using spring data elasticsearch library.
GET /my-index-000001/_mapping/field/user
This is not supported by Spring Data Elasticsearch. You'll have to get the mapping for the index and extract the part you need from the returned Map<String, Object>.
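A minimal sketch of the extraction step, assuming the mapping has the usual shape with a top-level "properties" entry. In an application the map would typically come from something like operations.indexOps(IndexCoordinates.of("my-index-000001")).getMapping(); here a hand-built map stands in for it. MappingFields, fieldMapping and sampleMapping are hypothetical names, not part of Spring Data Elasticsearch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

final class MappingFields {

    private MappingFields() {}

    // Returns the mapping of one field; dots in the path descend through nested "properties".
    static Optional<Map<String, Object>> fieldMapping(Map<String, Object> indexMapping, String fieldPath) {
        Map<String, Object> current = indexMapping;
        for (String name : fieldPath.split("\\.")) {
            Object properties = current.get("properties");
            if (!(properties instanceof Map)) {
                return Optional.empty();
            }
            Object field = ((Map<?, ?>) properties).get(name);
            if (!(field instanceof Map)) {
                return Optional.empty();
            }
            @SuppressWarnings("unchecked")
            Map<String, Object> next = (Map<String, Object>) field;
            current = next;
        }
        return Optional.of(current);
    }

    // Hand-built stand-in for the map returned for the whole index (assumed shape).
    static Map<String, Object> sampleMapping() {
        Map<String, Object> user = new LinkedHashMap<>();
        user.put("type", "keyword");
        Map<String, Object> properties = new LinkedHashMap<>();
        properties.put("user", user);
        Map<String, Object> mapping = new LinkedHashMap<>();
        mapping.put("properties", properties);
        return mapping;
    }

    public static void main(String[] args) {
        // Prints the mapping of the "user" field if it is present.
        fieldMapping(sampleMapping(), "user").ifPresent(System.out::println);
    }
}
```

A missing field yields an empty Optional rather than an exception, so callers can decide how to handle fields that are not mapped yet.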

spring-data-elasticsearch search template

We have a few similar queries and I wanted to do some templating based on parameters. Recently I found that Elasticsearch supports search templates, so I'm wondering whether this is supported by spring-data-elasticsearch.
Currently my query looks something like:
final Query query = new NativeSearchQueryBuilder()
        .addAggregation(aggregationBuilder)
        .withPageable(EmptyPage.INSTANCE)
        .withQuery(queryBuilder)
        .build();
I'm wondering if I can somehow pass the template that I've stored in application and get the result from elastic. Or if I can store the template in elastic and get the result based on parameters.
No, Spring Data Elasticsearch currently does not support search templates.
Edit 16.03.2021: search template support has been added to the ReactiveElasticsearchClient in Spring Data Elasticsearch by a pull request from bilak. Thanks for that.

Is it possible to set up the Fuzzy parameter on all indexes data search as the app parameter when the SpringBoot app is requesting ElasticSearch?

I'd like to have a property that adjusts the fuzziness of Elasticsearch search requests for the whole application, i.e. without changing it per @Query of each individual MyEntitySearchRepository. Is there a way to specify this either 1) through some Spring Boot properties that are picked up by Spring Data Elasticsearch, or 2) by using ElasticsearchTemplate and prepopulating it with the fuzzy value from a homegrown Spring Boot property, while the other parts of the queries sent to Elasticsearch still come from the Spring Data definitions (index names, by/in/like parameters)? Is this possible at all, or is the only way for now to set up an individual @Query that forms the request JSON containing the fuzzy parameter, as described there, so that I can only paste in the fuzzy value taken from the homegrown Spring Boot property?
This is at the moment not possible, and I'm not sure if I understand you right: You want to define a global fuzzy setting that should be applied to all queries? On which fields of your document? All String fields?
There is no global fuzzy setting in Elasticsearch itself, so it would be necessary to build custom queries internally.
At the moment the only way to go is with @Query annotated custom repository methods.
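As an illustration of building such a custom query with a configurable fuzziness, here is a minimal sketch that produces the JSON of a match query. It assumes the fuzziness value is read elsewhere from a homegrown property (for example a hypothetical search.fuzziness) and that the resulting string is handed to something like a StringQuery. FuzzyQueries, fuzzyMatch and escape are hypothetical helpers, not part of Spring Data Elasticsearch.

```java
final class FuzzyQueries {

    private FuzzyQueries() {}

    // Builds the JSON of a match query on one field with an explicit fuzziness value.
    static String fuzzyMatch(String field, String text, String fuzziness) {
        return "{\"match\": {\"" + escape(field) + "\": {\"query\": \"" + escape(text)
                + "\", \"fuzziness\": \"" + escape(fuzziness) + "\"}}}";
    }

    // Escapes backslashes and double quotes only; a sketch, not a complete JSON encoder.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        // "AUTO" stands in for the value of the assumed application property.
        System.out.println(fuzzyMatch("name", "jon", "AUTO"));
    }
}
```

The fields to search still have to be chosen per query, since there is no global fuzzy setting to fall back on.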

Batch indexing Spring Data JPA entries to Elastic through Spring Data ElasticSearch

Our current setup is MySQL as main data source through Spring Data JPA, with Hibernate Search to index and search data. We now decided to go to Elastic Search for searching to better align with other features, besides we need to have multiple servers sharing the indexing and searching.
I'm able to set up Elastic using Spring Data ElasticSearch for data indexing and searching easily, through ElasticsearchRepository. But the challenge now is how to index all the existing MySQL records into Elastic Search. Hibernate Search provides an API for this, org.hibernate.search.jpa.FullTextEntityManager#createIndexer, which we use all the time. But I cannot find a handy solution within Spring Data ElasticSearch. Hope somebody can help me out here or provide some pointers.
There is a similar question here, however the solution proposed there doesn't fit my needs very well, as I'd prefer to be able to index a whole object whose fields are mapped to multiple DB tables.
So far I haven't found a better solution than writing my own code to index all JPA entries to ES inside my application, and this worked out fine for me:
Pageable page = PageRequest.of(0, 100); // use new PageRequest(0, 100) on Spring Data versions before 2.0
Page<Instance> curPage;
do {
    curPage = instanceManager.listInstancesByPage(page); // Get data by page from the JPA repo.
    for (Instance instance : curPage.getContent()) {
        instanceElasticSearchRepository.index(instance); // Index one by one into the ES repo.
    }
    page = curPage.nextPageable();
} while (curPage.hasNext()); // The last page is indexed before the loop ends.
The logic is very straightforward. Depending on the quantity of data it might take a while, so breaking the work down into batches and logging progress messages can be helpful.
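The batching and progress messages mentioned above could look roughly like the following sketch. It is plain Java and makes assumptions: the bulkSink parameter stands in for a bulk call such as a repository's saveAll, and BatchIndexer and indexInBatches are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class BatchIndexer {

    private BatchIndexer() {}

    // Hands the items to the sink in batches of at most batchSize and returns the number of batches sent.
    static <T> int indexInBatches(List<T> items, int batchSize, Consumer<List<T>> bulkSink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            // Copy the slice so the sink does not hold a view of the source list.
            bulkSink.accept(new ArrayList<>(items.subList(from, to)));
            batches++;
            // Progress message after each batch.
            System.out.printf("Indexed %d of %d records%n", to, items.size());
        }
        return batches;
    }
}
```

Sending one bulk request per batch instead of one request per record usually reduces the number of round trips to Elasticsearch, at the cost of holding one batch in memory at a time.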

Guidance on how to index Elasticsearch documents using Spring Data

My application uses both Spring Data JPA and Spring Data Elasticsearch.
I plan to first persist the JPA entities, then map them to a slightly different java class (the Elasticsearch document) and finally index that document into the Elasticsearch index.
However, I have a few questions as to how, where and when to index the documents.
Is indexing a time consuming process that should be asynchronous?
What design pattern could help me avoid having problematic code such as the following?
saveAdvertisement method from AdvertisementService:
public void saveAdvertisement(Advertisement jpaAdvertisement) {
    jpaAdvertisementRepository.save(jpaAdvertisement);
    //somehow map the jpa entity to the es document
    elasticSearchTemplate.index(esAdvertisement);
}
whereby I have to have two concerns in the same method:
JPA persist
Elasticsearch indexing
