Save Spring Boot output to Elasticsearch

I have a REST API through which I send and receive messages to and from a Kafka server using Spring Boot. Now I want to save those messages to Elasticsearch. How can I do this?

This is really a system design task, much like setting up the storage architecture for a database.
In short:
First, decide which ES version you want to use, because there are breaking changes between ES 2.x and 7.x, and those differences may affect how you design your storage schema.
Assuming you use the latest ES 7.x, create the index (or indices) in which the data fetched from Kafka will be stored. See https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html
Once the indices exist, learn the basics of the ES high level REST client and low level REST client. The low level REST client provides the basic HTTP connection to the ES cluster, and the high level REST client APIs cover operations such as document CRUD, search, and aggregations on your data. Both are available as Maven dependencies that you can add to your Spring Boot application. See https://www.elastic.co/guide/en/elasticsearch/client/java-rest/master/java-rest-high.html
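To make the last two steps concrete, here is a minimal sketch of the REST calls involved, assuming an ES 7.x cluster at http://localhost:9200. It only uses the JDK's java.net.http types to build the requests, so the wire format stays visible; in a real application the high level REST client would build these calls for you. The class name KafkaToEsRequests, the index name kafka-messages, and the payload and receivedAt fields are illustrative assumptions, not something from the question.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class KafkaToEsRequests {
    // Assumed cluster address and index name; adjust for your setup.
    static final String ES_URL = "http://localhost:9200";
    static final String INDEX = "kafka-messages";

    // ES 7.x mapping (no mapping types): one text field and one date field.
    static String mappingJson() {
        return "{\"mappings\":{\"properties\":{"
                + "\"payload\":{\"type\":\"text\"},"
                + "\"receivedAt\":{\"type\":\"date\"}}}}";
    }

    // PUT /<index> creates the index with the mapping above.
    static HttpRequest createIndex() {
        return HttpRequest.newBuilder(URI.create(ES_URL + "/" + INDEX))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(mappingJson()))
                .build();
    }

    // Escapes quotes, backslashes and control characters for use inside a JSON string.
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    // JSON document for one Kafka message; the date is sent as epoch milliseconds.
    static String documentJson(String payload, long receivedAtMillis) {
        return "{\"payload\":\"" + jsonEscape(payload) + "\",\"receivedAt\":" + receivedAtMillis + "}";
    }

    // PUT /<index>/_doc/<id> stores the message. Deriving the id from
    // topic-partition-offset means a redelivered record overwrites itself
    // instead of creating a duplicate. Topic names are assumed to be URL-safe.
    static HttpRequest indexMessage(String topic, int partition, long offset,
                                    String payload, long receivedAtMillis) {
        String id = topic + "-" + partition + "-" + offset;
        return HttpRequest.newBuilder(URI.create(ES_URL + "/" + INDEX + "/_doc/" + id))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(documentJson(payload, receivedAtMillis)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest create = createIndex();
        HttpRequest doc = indexMessage("orders", 0, 42L, "hello", System.currentTimeMillis());
        System.out.println(create.method() + " " + create.uri());
        System.out.println(doc.method() + " " + doc.uri());
    }
}
```

Inside a Kafka listener you would call indexMessage for each received record and send the request with HttpClient.send(request, HttpResponse.BodyHandlers.ofString()), or do the equivalent through the high level REST client.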

What's the best solution to make Kotlin and Elasticsearch communicate?

Using Google I found two possible solutions so far:
- using Spring Boot with Kotlin
- using this Kotlin client: https://github.com/jillesvangurp/kt-search
I've already finished the Android client application in Kotlin, but now I have to find a way to make this client communicate with Elasticsearch.
What would be the best solution for my problem that I could look up online?
Thanks in advance.
First, if you don't already use Spring in the current application, integrating it (if possible at all) would be a lot of work and most likely not worth it.
An alternative would be to use the officially supported Java client for Elasticsearch (https://www.elastic.co/guide/en/elasticsearch/client/java-api-client/current/index.html).
But keep in mind that if you put the Elasticsearch client in your client application, any credentials it uses will also be in the client application.
They could be accessible to everyone using the Android client, and a user could then perform any request on Elasticsearch. You also couldn't change them without updating the client.
So it is likely better to use a three-tier architecture (https://www.ibm.com/topics/three-tier-architecture) and create an API service that handles the Elasticsearch requests.
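As a rough illustration of that middle tier, the sketch below shows the part of a backend service that turns an app request into an Elasticsearch request. The Android app only sends a search term and a page size; the query shape, the index, and the credentials stay on the server. It is written in Java with only JDK classes; SearchGateway, the products index, the name field, and the ES_USER/ES_PASSWORD environment variables are hypothetical names chosen for the example.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class SearchGateway {
    // Assumed search endpoint; the mobile client never sees this URL.
    static final String ES_SEARCH_URL = "http://localhost:9200/products/_search";
    static final int MAX_SIZE = 50; // upper bound on hits a client may request

    // Escapes quotes, backslashes and control characters for use inside a JSON string.
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') sb.append('\\').append(c);
            else if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
            else sb.append(c);
        }
        return sb.toString();
    }

    // The client controls only the term and the (clamped) size; the query itself is fixed here.
    static String searchBody(String term, int requestedSize) {
        int size = Math.max(1, Math.min(requestedSize, MAX_SIZE));
        return "{\"size\":" + size + ",\"query\":{\"match\":{\"name\":\""
                + jsonEscape(term) + "\"}}}";
    }

    // HTTP Basic credentials, built on the server from server-side configuration.
    static String basicAuth(String user, String password) {
        String raw = user + ":" + password;
        return "Basic " + Base64.getEncoder().encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }

    // Request the API service sends to Elasticsearch on behalf of the app.
    static HttpRequest toElasticsearch(String term, int requestedSize, String user, String password) {
        return HttpRequest.newBuilder(URI.create(ES_SEARCH_URL))
                .header("Content-Type", "application/json")
                .header("Authorization", basicAuth(user, password))
                .POST(HttpRequest.BodyPublishers.ofString(searchBody(term, requestedSize)))
                .build();
    }

    public static void main(String[] args) {
        // Credentials come from the server environment, not from the app (hypothetical variable names).
        String user = System.getenv().getOrDefault("ES_USER", "elastic");
        String password = System.getenv().getOrDefault("ES_PASSWORD", "changeme");
        HttpRequest request = toElasticsearch("red shoes", 200, user, password);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(searchBody("red shoes", 200));
    }
}
```

Because the app can only influence the term and the size, rotating the Elasticsearch credentials or changing the query does not require shipping a new Android build.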

Logstash vs Spring Cloud Data Flow: which one is suitable for data preprocessing?

I'm using Spring Boot along with Elasticsearch to build a search system for my website.
I have some data that I need to push into Elasticsearch. This data (a product, for example) must be processed first: it is passed to another microservice that filters the JSON, adds some fields for better search results, does some calculations, and returns the object I want to store. Is it possible to do this with Logstash, or do I need to use Spring Cloud Data Flow?
What I want to do:
- save a product (product service)
- log the saved product or stream it
- process it before storage (another service)
- save the document (Elasticsearch server)
Thanks in advance.
It obviously depends on various factors, but I can offer some insights on Spring Cloud Data Flow from a technical standpoint.
If you want to build a streaming pipeline in which your filtering apps are connected via a messaging system that carries this flow of data processing, you can check out Spring Cloud Data Flow.
Spring Cloud Data Flow (and the underlying frameworks it builds on, such as Spring Cloud Stream and Spring Cloud Task) provides operational benefits for managing your streaming pipelines, but it may not make sense if you don't need a data pipeline with a messaging system. In that case, you could just stick to a simple Spring Boot app that does the whole filtering itself. Once you start exploring the distribution of these applications, loosely coupled via a messaging system, Spring Cloud Data Flow becomes handy.
Please check out the SCDF guide to learn about its features and recipes, see what SCDF can offer, and choose what fits your case.
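For the simpler case mentioned above, where a single application does the processing itself, the flow "save, process, store" can be sketched as plain function composition. This is only an in-process illustration with JDK types: ProductPipeline, the field names, and the in-memory list standing in for Elasticsearch are all assumptions made for the example. With SCDF, the same processing step would instead sit between two messaging destinations.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

final class ProductPipeline {
    // Processing step: copies the product and adds derived fields for search.
    // The input map is left untouched.
    static Map<String, Object> enrich(Map<String, Object> product) {
        Map<String, Object> doc = new LinkedHashMap<>(product);
        String name = String.valueOf(product.get("name"));
        doc.put("nameLowercase", name.toLowerCase(Locale.ROOT));
        int stock = ((Number) product.getOrDefault("stock", 0)).intValue();
        doc.put("inStock", stock > 0);
        return doc;
    }

    // Wires "process" and "store" into one callback that the product
    // service can invoke right after it has saved a product.
    static Consumer<Map<String, Object>> pipeline(
            Function<Map<String, Object>, Map<String, Object>> processor,
            Consumer<Map<String, Object>> sink) {
        return product -> sink.accept(processor.apply(product));
    }

    public static void main(String[] args) {
        // Stand-in for the Elasticsearch index; a real sink would call the ES client.
        List<Map<String, Object>> indexed = new ArrayList<>();
        Consumer<Map<String, Object>> onProductSaved = pipeline(ProductPipeline::enrich, indexed::add);

        Map<String, Object> product = new LinkedHashMap<>();
        product.put("name", "Red Shoe");
        product.put("stock", 3);
        onProductSaved.accept(product);

        System.out.println(indexed);
    }
}
```

Keeping the processor as a separate function makes it straightforward to move it into its own service later, should the pipeline need to be distributed.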

Partial Document update using Jest Client in elasticsearch

I am curious to know which client I should use for Elasticsearch with the Java API. There are multiple clients, such as Jest, Transport, and the Elasticsearch clients. I also have to perform CRUD operations on ES.
You should use the Java high level REST client, as it is an official Elasticsearch Java client that supports all document CRUD operations.
Jest is not an official client and is not available for the latest ES versions (not even 7.0, while 7.8 has been released, so it is not keeping pace with ES versions).
The transport client is being deprecated in favor of the high level REST client, as mentioned in this official doc.
Please read this thread for more info on all these clients and how they work internally.
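Whichever client you pick, a partial update ends up as the same REST call, so it can help to see it on the wire. The sketch below builds that call with JDK classes only, assuming ES 7.x at http://localhost:9200; PartialUpdate, the products index, and the stock field are hypothetical names, and the field name is assumed to be a plain identifier that needs no escaping.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class PartialUpdate {
    // Assumed cluster address; adjust for your setup.
    static final String ES_URL = "http://localhost:9200";

    // Body of a partial update: only the fields listed under "doc"
    // are merged into the stored document, the rest is left as is.
    static String updateBody(String field, long newValue) {
        return "{\"doc\":{\"" + field + "\":" + newValue + "}}";
    }

    // ES 7.x endpoint for partial updates: POST /<index>/_update/<id>
    static HttpRequest updateRequest(String index, String id, String field, long newValue) {
        return HttpRequest.newBuilder(URI.create(ES_URL + "/" + index + "/_update/" + id))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(updateBody(field, newValue)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = updateRequest("products", "42", "stock", 5);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(updateBody("stock", 5));
    }
}
```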

Elasticsearch high/low level REST client vs Spring RestTemplate

I am in a dilemma over whether to use Spring's RestTemplate or Elasticsearch's own high/low level REST client for searching in ES. Does the ES client provide any advantages, such as HTTP connection pooling or better performance, compared to Spring's RestTemplate? Which of the two takes less time to get a response from the server? Can someone please explain this?
The biggest advantage of using Spring Data Elasticsearch is that you don't have to bother with things like converting your requests, request bodies, and responses between your POJO domain classes and the JSON needed by Elasticsearch. You just use the methods defined in the ElasticsearchOperations interface, which is implemented by the *Template classes.
Or, going one abstraction layer up, use the Repository interfaces that all the Spring Data modules provide to store and search/retrieve your data.
Firstly, this is a very broad question; I'm not sure it suits the SO guidelines.
But my two cents:
- The High Level Client uses the Low Level Client, which does provide connection pooling.
- The High Level Client manages the marshalling and unmarshalling of the Elasticsearch query body and response, so it might be easier to work with the APIs.
- On the other hand, if you are used to querying Elasticsearch by providing a JSON body, you might find it a bit difficult to translate between the JSON body and the Java classes used for creating the query (e.g. when you are using the Kibana console or other REST API tools).
- I generally overcome this by logging the query generated by the Java API so that I can use it with the Kibana console or other REST API tools.
- Regarding efficiency: the choice of library will not affect response times much.
- If you want to use Spring's reactive features and make use of WebClient, the ES libraries do provide support for async search.
Update:
Please check the answer by Wim Van den Brande below. He makes a very valid point about the use of the Transport Client, which has been deprecated in favor of the REST API.
So it will be interesting to see how RestTemplate or Spring Data Elasticsearch update their APIs to replace TransportClient.
One important remark and caveat regarding the use of Spring Data Elasticsearch: currently, Spring Data Elasticsearch doesn't support communication via the High Level REST Client API; it uses the transport client. Please note that the TransportClient is deprecated as of Elasticsearch 7.0.0 and is expected to be removed in Elasticsearch 8.0.
FYI, this has already been confirmed by another post: Elasticsearch Rest Client with Spring Data Elasticsearch

Jaeger with ElasticSearch

I have created a microservice-based architecture using Spring Boot and deployed the application on a Kubernetes/Istio platform.
The different microservices communicate with each other using either JMS (ActiveMQ) or a REST API.
I am getting traces of the REST communication in Istio's Jaeger, but the JMS-based communication is missing in Jaeger.
I am using Elasticsearch to store my application logs.
Is it possible to use the same Elasticsearch as the backend (DB) of Jaeger?
If yes, then I will store tracing-specific logs in Elasticsearch and query them in the Jaeger UI.
I believe you can reuse Elasticsearch for multiple purposes; each would use a different set of indices, so the data stays well separated.
From https://www.jaegertracing.io/docs/1.11/deployment/:
Collectors require a persistent storage backend. Cassandra and Elasticsearch are the primary supported storage backends.
For tying the networking all together, there is a docker-compose example in this question: How to configure Jaeger with elasticsearch?
While this isn't exactly what you asked, it sounds like what you're trying to achieve is seeing tracing for your JMS calls in Jaeger. If that is the case, you could use an OpenTracing solution for JMS or ActiveMQ to report tracing data directly to Jaeger. Here's one potential solution I found with a quick Google search; there may be others.
https://github.com/opentracing-contrib/java-jms
