Elasticsearch multinode environment - elasticsearch

I created a multi-node setup in Elasticsearch. I use the Java API transport client to communicate with the Elasticsearch server.
I created the transport client with only one address (say 192.129.129.12:9300). Any query I send to that single address is communicated to all nodes and returns results. What happens if the node I configured in the transport client (192.129.129.12:9300) fails? Can I still communicate with the other nodes? What is the best way to configure the transport client for a multi-node setup?

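You need to enable the sniff option (client.transport.sniff set to true). With sniffing on, the transport client discovers the other nodes of the cluster through the node it connects to, so it can keep working if that node fails later. It also helps to list more than one node address when creating the client, so it can still connect if the first node is down at startup.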
See http://www.elasticsearch.org/guide/en/elasticsearch/client/java-api/current/client.html#transport-client
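To make the idea behind sniffing concrete, here is a minimal conceptual sketch in Go. It is not the Elasticsearch transport client API: the node and client types, the simulated cluster map, and the 10.0.0.x addresses are all made up for illustration. It only shows why a client that learned about its peers can fail over when the one configured node goes down.

```go
package main

import (
	"errors"
	"fmt"
)

// node is a hypothetical stand-in for one cluster member in a simulated network.
type node struct {
	up    bool
	peers []string // addresses of the other nodes this node would report
}

// client tracks the node addresses it may send requests to.
type client struct {
	cluster map[string]*node // simulated cluster keyed by address, not a real connection pool
	known   []string
}

var errNoNode = errors.New("no known node is reachable")

// firstReachable returns the first known address whose node is up.
func (c *client) firstReachable() (string, error) {
	for _, addr := range c.known {
		if n, ok := c.cluster[addr]; ok && n.up {
			return addr, nil
		}
	}
	return "", errNoNode
}

// sniff asks one reachable node for its peers and adds any new addresses
// to the known list.
func (c *client) sniff() error {
	addr, err := c.firstReachable()
	if err != nil {
		return err
	}
	seen := make(map[string]bool)
	for _, k := range c.known {
		seen[k] = true
	}
	for _, p := range c.cluster[addr].peers {
		if !seen[p] {
			c.known = append(c.known, p)
			seen[p] = true
		}
	}
	return nil
}

func main() {
	cluster := map[string]*node{
		"10.0.0.1:9300": {up: true, peers: []string{"10.0.0.2:9300", "10.0.0.3:9300"}},
		"10.0.0.2:9300": {up: true, peers: []string{"10.0.0.1:9300", "10.0.0.3:9300"}},
		"10.0.0.3:9300": {up: true, peers: []string{"10.0.0.1:9300", "10.0.0.2:9300"}},
	}
	// The client is configured with a single address, as in the question.
	c := &client{cluster: cluster, known: []string{"10.0.0.1:9300"}}
	if err := c.sniff(); err != nil {
		fmt.Println("sniff failed:", err)
		return
	}
	// The configured node fails after the client has sniffed its peers.
	cluster["10.0.0.1:9300"].up = false
	addr, err := c.firstReachable()
	fmt.Println(addr, err) // prints "10.0.0.2:9300 <nil>"
}
```

Without the sniff step, the known list would contain only the configured address, and firstReachable would return an error as soon as that node went down.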

Related

Migrating Elasticsearch to an Elastic cluster on Kubernetes: is there a way to send some of the incoming traffic to the ECK cluster?

We are migrating from Elasticsearch on VMs to an Elastic cluster on Kubernetes (ECK). Is there any way to split the incoming traffic by percentage between the ECK cluster and Elasticsearch on the VMs?
Yes, you can, but it depends on what you use to load balance the traffic and handle the requests.
If you are using Nginx, you can use the split_clients module to divide the traffic by percentage:
http://nginx.org/en/docs/http/ngx_http_split_clients_module.html
If you are using Istio, you can manage it with its traffic management features:
https://istio.io/latest/docs/concepts/traffic-management/
So it mostly depends on what accepts the traffic and how your proxy is set up.
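As a rough illustration of what a percentage split does, the sketch below hashes a client key into one of 100 buckets and sends the lowest buckets to the new cluster. It is only a sketch of the general idea in Go, not how Nginx or Istio implement it; the pickBackend function, the "eck" and "vm" labels, and the choice of FNV hashing are assumptions made for the example.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pickBackend maps a client key to a bucket in [0, 100) and routes the
// lowest eckPercent buckets to the new cluster. The same key always lands
// on the same backend. The function name and labels are hypothetical.
func pickBackend(clientKey string, eckPercent uint32) string {
	h := fnv.New32a()
	h.Write([]byte(clientKey)) // Write on a hash.Hash never returns an error
	if h.Sum32()%100 < eckPercent {
		return "eck"
	}
	return "vm"
}

func main() {
	// Route 10000 synthetic client keys with a 20% share for the ECK cluster.
	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		counts[pickBackend(fmt.Sprintf("client-%d", i), 20)]++
	}
	fmt.Println(counts)
}
```

Hashing a stable key (a client IP or a request header, for example) keeps each client on one backend during the migration, which a purely random split would not.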

Consul Go Client redundant server connection

I'm testing a consul server cluster. I am using the go client for this.
How do I enter multiple servers for the client to connect to?
Optimally it would be something like:
client, err := api.NewClient(api.DefaultConfig())
client.remotes = host_array
Is this a wrong-headed approach to using Consul, and is the expected way for a user to start a client node and then read the locally replicated state?
The Consul API client defaults to 127.0.0.1:8500 because there is an expectation that it will connect to a local Consul Agent running in client mode. The Consul Agent should be your "proxy" to the Consul Servers and maintain the connections with active servers so you don't have to.
https://www.consul.io/docs/internals/architecture.html
https://github.com/hashicorp/consul/issues/3689
An alternate approach could be to utilize a load balancer for a cluster of Consul Servers. Strategies for that are documented here... https://www.hashicorp.com/blog/load-balancing-strategies-for-consul

What is the use of Netty in ElasticSearch?

I am a newbie to Elasticsearch.
I am a web developer with very little networking experience.
I have read the following documents -
https://netty.io/
https://stackoverflow.com/questions/23839437/what-are-the-netty-alternatives-for-high-performance-networking
I wasn't able to understand the purpose of Netty in Elasticsearch. Could anyone explain it to me in layman's terms?
Elasticsearch offers two interfaces to talk to it. One is the HTTP interface and the other one is the transport interface.
The first usually runs on port 9200 and can be accessed via any HTTP capable tool e.g. curl or your favorite browser. The transport interface is used by cluster members to exchange data and state and runs on port 9300 using a custom protocol.
Both interfaces use netty as "socket / NIO" library.

shared node wise queue

I am building a proxy server using Java. The application is deployed in Docker containers (multiple instances).
Below are the requirements I am working on.
Clients send HTTP requests to my proxy server.
The proxy server forwards those requests to the destination node server in the order it received them.
When the destination is not reachable, the proxy server stores those requests and forwards them when it becomes available again.
Similarly, when a request fails, it is retried after "X" time.
I implemented a node-wise queue (a hash map: the key is the node name, the value is the reachability status plus a queue of requests in the order received).
The above solution works well when there is only one instance, but I would like to know how to solve this when there are multiple instances. Is there any shared data structure I can use to solve this issue? ActiveMQ, Redis, Kafka, something of that kind (I am very new to shared memory / processing).
Any help would be appreciated.
Thanks in advance.
Ajay
There is an Open Source REST Proxy for Kafka based on Jetty which you might get some implementation ideas from.
https://github.com/confluentinc/kafka-rest
This proxy doesn't store messages itself, because Kafka clusters are highly available for writes and there are typically at least 3 Kafka nodes available for message persistence. The Kafka client in the proxy can be configured to retry if the cluster is temporarily unavailable for writes.
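For reference, the single-instance node-wise queue described in the question can be sketched roughly as below. This is a minimal in-memory Go sketch, not the asker's Java code: the proxy and nodeQueue types and the send callback are hypothetical, and it is not safe for concurrent use. With multiple proxy instances, this state would have to live in a shared store or broker such as the ones mentioned above, which the sketch does not attempt.

```go
package main

import (
	"errors"
	"fmt"
)

var errUnreachable = errors.New("destination unreachable")

// nodeQueue holds the reachability status of one destination plus its
// pending requests in arrival order.
type nodeQueue struct {
	reachable bool
	pending   []string
}

// proxy keeps one queue per destination node name.
type proxy struct {
	nodes map[string]*nodeQueue
	send  func(node, req string) error // hypothetical delivery function
}

// queueFor returns the queue for a node, creating it on first use.
func (p *proxy) queueFor(node string) *nodeQueue {
	q, ok := p.nodes[node]
	if !ok {
		q = &nodeQueue{reachable: true}
		p.nodes[node] = q
	}
	return q
}

// Submit appends the request to the node's queue and then tries to drain it,
// so a new request is never sent ahead of older pending ones.
func (p *proxy) Submit(node, req string) {
	q := p.queueFor(node)
	q.pending = append(q.pending, req)
	p.Flush(node)
}

// Flush forwards pending requests in order and stops at the first failure,
// leaving the failed request at the head of the queue for a later retry.
func (p *proxy) Flush(node string) {
	q := p.queueFor(node)
	for len(q.pending) > 0 {
		if err := p.send(node, q.pending[0]); err != nil {
			q.reachable = false
			return
		}
		q.pending = q.pending[1:]
	}
	q.reachable = true
}

func main() {
	down := true
	var delivered []string
	p := &proxy{
		nodes: map[string]*nodeQueue{},
		send: func(node, req string) error {
			if down {
				return errUnreachable
			}
			delivered = append(delivered, req)
			return nil
		},
	}
	p.Submit("node-a", "r1") // destination is down, request is kept
	p.Submit("node-a", "r2")
	down = false
	p.Flush("node-a")      // a retry timer would call this after "X" time
	fmt.Println(delivered) // prints "[r1 r2]"
}
```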

Apart from the two-hop operation, what other differences are there between the Elasticsearch Node Client and Transport Client?

This may have been answered before, but I think there are some other key differences between the two native clients worth mentioning. I summarize what I think the key differences are, and I also have some questions at the bottom.
The most common knowledge is that the Node client joins the cluster as a node and that the Transport client requires two-hop operations.
To take it further than "The Node Client connects to the cluster": the Node client actually joins the cluster as a node to create a full-mesh cluster, i.e. there is two-way communication between all the nodes, including clients, so anyone can contact anyone. Better explained here: https://www.found.no/foundation/elasticsearch-networking/
The only difference between a client node and a data node is that the client node does not store data. People should also be aware that the Node client also binds listening ports for the TCP and HTTP transports. This allows it to become a proxy so other clients can connect to it, or to serve plugin sites.
The Transport client is a client in the purest sense. It doesn't join the cluster, it does not do any kind of port binding, it cannot be connected to, and it cannot serve plugin sites.
Also, correct me if I'm wrong, but the Node client can also do scatter-gather.
Does the Transport client consider routing fields? If we pass it a routing field, could we avoid the two hops? I'm guessing no.
Does the Transport client do scatter-gather?

Resources