Reconnection in Elasticsearch Cluster

I have a question about clustering in Elasticsearch, specifically about reconnection within a cluster.
I have two Elasticsearch servers on two different machines within a network. Both Elasticsearch instances are in the same cluster.
In an error scenario the network connection could be broken; I simulate this behaviour by pulling the network cable on one server.
After reconnecting the server to the network, the clustering no longer works: when I put some data into one Elasticsearch instance, the data is not transferred to the other.
Does anybody know if there are any settings that control reconnection?
Best Regards
Thomas

Why not just put all Elasticsearch servers behind a load balancer with a single DNS name? There could be an issue on the server that went down which needs manual intervention; after the problem on the server is corrected, it will automatically become available under the load balancer again.

Did you check whether all nodes joined the cluster again?
You may want to try the following APIs:
Check node status:
http://es-host:9200/_nodes
Check cluster status:
http://es-host:9200/_cluster/health
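For example, with curl (replace es-host with one of your node addresses; 9200 is the default HTTP port):
curl -s 'http://es-host:9200/_cluster/health?pretty'
curl -s 'http://es-host:9200/_nodes?pretty'
A green or yellow status with the expected number_of_nodes indicates that both nodes rejoined; a missing node points to a discovery problem rather than a data problem.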

Related

Elasticsearch Python - No viable nodes were discovered on the initial sniff attempt

I have a cluster of Elasticsearch nodes running on different AWS EC2 instances. They connect internally via a network within AWS, so their network and discovery addresses are set up within this internal network. I want to use the Python elasticsearch library to connect to these nodes from the outside. The EC2 instances have static public IP addresses attached, and the Elasticsearch instances allow HTTPS connections from anywhere. The connection works fine, i.e. I can connect to the instances via browser and via the Python elasticsearch library. However, I now want to set up sniffing, so I set up my Python code as follows:
from elasticsearch import Elasticsearch

self.es = Elasticsearch(
    [f'https://{elastic_host}:{elastic_port}' for elastic_host in elastic_hosts],
    sniff_on_start=True, sniff_on_connection_fail=True, sniffer_timeout=60,
    sniff_timeout=10, ca_certs=ca_location, verify_certs=True,
    http_auth=(elastic_user, elastic_password))
If I remove the sniffing parameters, I can connect to the instances just fine. However, with sniffing, I immediately get elastic_transport.SniffingError: No viable nodes were discovered on the initial sniff attempt upon startup.
http.publish_host in the elasticsearch.yml configuration is set to the public IP address of my EC2 machines, and the /_nodes/_all/http endpoint returns the public IPs as the publish_address (i.e. x.x.x.x:9200).
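For reference, the relevant line in elasticsearch.yml looks like this (the address below is a documentation placeholder; use your instance's actual public IP):
http.publish_host: 203.0.113.10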
After testing with our other microservices, we found out that this problem was related to the elasticsearch-py library rather than our Elasticsearch configuration, as our other microservice, which is Golang based, could perform sniffing with no problem.
After further investigation we linked the problem to this open issue on the elasticsearch-py library: https://github.com/elastic/elasticsearch-py/issues/2005.
The problem is that the authorization headers are not properly passed to the request made to Elasticsearch to discover the nodes. To my knowledge, there is currently no fix that does not involve altering the library itself. However, the error message is clearly misleading.
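As a sanity check, you can query the sniffing endpoint manually with the same credentials (user, password, and CA file below are placeholders); it succeeds when the Authorization header is sent, which is exactly what the library fails to do during its sniff request:
curl -s -u elastic_user:elastic_password --cacert ca.pem 'https://x.x.x.x:9200/_nodes/_all/http?pretty'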

NiFi secure 3-node cluster

I am seeing some errors in my NiFi cluster. I have a 3-node secured NiFi cluster, and I am seeing the errors below on 2 of the nodes:
ERROR [main] org.apache.nifi.web.server.JettyServer Unable to load flow due to:
java.io.IOException: org.apache.nifi.cluster.ConnectionException:
Failed to connect node to cluster due to: java.io.IOException:
Could not begin listening for incoming connections in order to load balance data across the cluster.
Please verify the values of the 'nifi.cluster.load.balance.port' and 'nifi.cluster.load.balance.host'
properties as well as the 'nifi.security.*' properties
See the clustering configuration guide for the list of clustering options you have to configure. For load balancing, you'll need to specify ports that are open in your firewall so that the nodes can communicate. You'll also need to make sure that each host has its node hostname property set and its host ports set, and that there are no firewall restrictions between the nodes and your Apache ZooKeeper cluster. A sketch of the relevant properties is shown below.
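As a rough sketch, each node's nifi.properties would carry something like the following (hostnames and ports here are placeholders; check the admin guide for your version):
nifi.cluster.is.node=true
nifi.cluster.node.address=nifi-node-1.example.com
nifi.cluster.node.protocol.port=11443
nifi.cluster.load.balance.host=nifi-node-1.example.com
nifi.cluster.load.balance.port=6342
nifi.zookeeper.connect.string=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181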
If you want to simplify the setup to play around, you can use the information in the clustering configuration section of the admin guide to set up an embedded ZooKeeper node within each NiFi instance. However, I would recommend setting up an external ZooKeeper cluster. A little more work, but ultimately worth it.
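If you do go the embedded route, the toggle that starts a ZooKeeper server inside each NiFi instance is (assuming a reasonably current NiFi version):
nifi.state.management.embedded.zookeeper.start=true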

Graylog2 'cluster' over VPN

We have 2 locations connected by VPN.
Currently we have 2 independent graylog servers.
We want to create some kind of cluster, so we can reach logs on both sides even if the VPN is down.
We already tried to create an Elasticsearch cluster, but this is not the way:
if the VPN is down, the whole cluster is down and logs are not working on either side.
I found this article: https://www.elastic.co/blog/scaling_elasticsearch_across_data_centers_with_kafka
which describes such a topology, but I have no idea how to configure Apache Kafka so that it acts as a broker for Graylog and as an input for a syslog server.
Any help, another idea, or a link will be much appreciated.

How to connect to Elasticsearch server remotely using load balancer

There might already be a post covering what I am looking for. I have very limited time and got this requirement at the last moment; I need to push the code to QA and set up Elasticsearch with the admin team. Please respond as soon as possible, or share a link to a similar post.
I have a scenario wherein I will have multiple Elasticsearch servers: one hosted in the USA, another in the UK, and one more in India, all within the same network (the company's network) and sharing the same cluster name. I can set multicast to false and use unicast to provide host and IP address information to form the topology; a sketch of those settings follows.
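For example, in each node's elasticsearch.yml (ES 1.x-era settings, matching the TransportClient API used below; hostnames are placeholders):
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["host-usa:9300", "host-uk:9300", "host-india:9300"]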
Now, in my application, I know that I have to use the TransportClient as follows:
import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

Settings settings = ImmutableSettings.settingsBuilder()
        .put("cluster.name", "myClusterName").build();
Client client = new TransportClient(settings)
        .addTransportAddress(new InetSocketTransportAddress("host1", 9300))
        .addTransportAddress(new InetSocketTransportAddress("host2", 9300));
My concerns are the following:
1) As per the above information, the admin team will provide just a single IP address, that of the load balancer, and the load balancer will manage request and response handling; I mean, the load balancer is responsible for redirecting to the respective Elasticsearch server. My question here is: is it okay to use the TransportClient to connect to that host and port number as follows?
new TransportClient(settings)
        .addTransportAddress(new InetSocketTransportAddress("loadbalancer-ip-address", loadBalancerPort)); // note: the port must be an int, not a string
If the load balancer redirects requests to the Elasticsearch servers, what should the load balancer's configuration look like? Do we need to provide all of the Elasticsearch host or IP address details to it, so that at any given point in time, if the current master Elasticsearch server fails, another master is picked?
2) What is the best configuration for 4 nodes or Elasticsearch servers in terms of shards, replicas, etc.?
Would each node have one primary shard and 1 replica? Can this be configured in elasticsearch.yml?
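For reference only (not a recommendation), the per-index defaults the question describes would look like this in elasticsearch.yml, giving 4 primaries spread across the 4 nodes plus one replica of each:
index.number_of_shards: 4
index.number_of_replicas: 1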
Please reply as soon as possible.
Thanks in advance.

Full Clustering in Apache Traffic Server

I followed the steps mentioned in the official documentation for full clustering of multiple ATS instances. I installed 2 instances of ATS on 2 different Ubuntu machines (having the same specs, OS versions and hardware), and both of them act as a reverse proxy for a web service hosted on a Tomcat server on a different machine. I wasn't able to set up the cluster. Here are some of the queries that I have.
They are on the same switch or the same VLAN: the two Ubuntu machines on which I installed ATS are connected to the same switch, and they have the same interface specified in /etc/network/interfaces. Is this enough, or is there something else that has to be done to get clustering working?
Running the command traffic_line -r proxy.process.cluster.nodes: this returned 1 after I ran the traffic_line -x and traffic_line -L commands, but there aren't any additions or changes in the cluster.config file.
Moreover, when I make a query to one of the ATS instances (I have mapped the URLs in the remap.config file), both of them cache the responses locally, and the cache is not shared across nodes.
From this information, can anyone tell me if I am doing something wrong? Let me know if any more info is required.
Are these on virtual machines? I wasted almost 2 days trying to figure out what was wrong when I initially set it up on OpenVZ containers. On a wild guess, I decided to migrate to 2 physical nodes, and it went well. See Apache Traffic Server Clustering not working.
proxy.process.cluster.nodes returning 1 means that it is just a standalone single node; the second node in the cluster has not been discovered.
Try a tcpdump for multicast and broadcast messages. If the other server's IP is not showing up in the discovery packets, the problem is at the network level: netops might have disabled multicast packet forwarding across switches.
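For example (the interface name eth0 is a placeholder for whichever interface the two machines share):
tcpdump -i eth0 -n 'multicast or broadcast'
If you only ever see your own node's IP as the source of the discovery traffic, the other node's announcements are being dropped somewhere between the switches.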
