How to connect to Elasticsearch server remotely using load balancer - elasticsearch

There might already be a post covering what I am looking for. I have very limited time and got this requirement at the last moment; I need to push the code to QA and set up Elasticsearch with the admin team. Please respond as soon as possible, or share a link to a similar post.
I have a scenario where I will have multiple Elasticsearch servers: one hosted in the USA, another in the UK, and one more in India, all within the same (company) network and sharing the same cluster name. I can set multicast to false and use unicast, providing host and IP address information, to form the topology.
Now, in my application, I know that I have to use the TransportClient as follows:
import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
Settings settings = ImmutableSettings.settingsBuilder()
    .put("cluster.name", "myClusterName").build(); // must match cluster.name in elasticsearch.yml
Client client = new TransportClient(settings)
    .addTransportAddress(new InetSocketTransportAddress("host1", 9300))  // transport port (9300), not HTTP port (9200)
    .addTransportAddress(new InetSocketTransportAddress("host2", 9300));
My concerns are the following:
1) As per the above, the admin team will provide just a single IP address, that of the load balancer, and the load balancer will handle request and response routing, i.e. it is responsible for redirecting to the respective Elasticsearch server. My question here is: is it okay to use the TransportClient to connect to that host and port as follows?
new TransportClient(settings)
    .addTransportAddress(new InetSocketTransportAddress("loadbalancer-ip-address", 9300)); // the port must be an int; 9300 is a placeholder for the load balancer's transport port
If the load balancer is to redirect requests to the Elasticsearch servers, what should its configuration look like? Do we need to provide all the Elasticsearch host or IP address details to it, so that at any given point in time, if the current master Elasticsearch server fails, another master will be picked?
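For reference, a minimal sketch of the client-side alternative to a load balancer (hostnames are placeholders): listing every known node and enabling sniffing lets the TransportClient discover the rest of the cluster and skip unreachable nodes on its own.
Settings lbSettings = ImmutableSettings.settingsBuilder()
    .put("cluster.name", "myClusterName")
    .put("client.transport.sniff", true)   // discover the other cluster nodes and drop dead ones
    .build();
Client lbClient = new TransportClient(lbSettings)
    .addTransportAddress(new InetSocketTransportAddress("es-node-usa", 9300))
    .addTransportAddress(new InetSocketTransportAddress("es-node-uk", 9300))
    .addTransportAddress(new InetSocketTransportAddress("es-node-india", 9300));
If a load balancer is used instead, it simply needs the host/IP and transport port (9300) of every node in its backend pool; master election itself happens inside the cluster, not in the load balancer.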
2) What is the best configuration for 4 nodes or Elasticsearch servers in terms of shards, replicas, etc.?
Should each node hold one primary shard and one replica, and can that be configured in elasticsearch.yml?
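Note that shards and replicas are per-index settings rather than per-node ones. A minimal sketch of setting them when creating an index through the Java API (the index name and values are illustrative only), reusing the client built above:
// Hypothetical index with 4 primary shards (one per node) and 1 replica of each shard.
client.admin().indices().prepareCreate("myindex")
    .setSettings(ImmutableSettings.settingsBuilder()
        .put("index.number_of_shards", 4)
        .put("index.number_of_replicas", 1))
    .execute().actionGet();
The defaults index.number_of_shards and index.number_of_replicas can also be set in elasticsearch.yml.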
Please reply as soon as possible.
Thanks in advance.

Related

Can I access the NiFi REST API using localhost instead of the actual node IP address in a NiFi cluster?

For example, I have 3 NiFi nodes in a NiFi cluster. Example hostnames of these nodes:
192.168.12.50:8080(primary)
192.168.54.60:8080
192.168.95.70:8080
I know that I can access the NiFi REST API from all NiFi nodes. I have a GetHTTP processor that gets the cluster summary from the REST API, and this processor runs only on the primary node. I set the "URL" property of this processor to 192.168.12.50:8080/nifi-api/controller/cluster.
But if the primary node goes down, a new primary node will be elected, and I will no longer be able to reach the 192.168.12.50:8080 address from the new primary node, because that node is down. So I will not be able to get the cluster summary from the REST API.
In this case, can I use "localhost:8080/nifi-api/controller/cluster" instead of "192.168.12.50:8080/nifi-api/controller/cluster" on each node in the NiFi cluster?
It depends on a few things... If you are running securely, then you have certificates generated for each node specific to its hostname, and the host in the web request needs to match the host in the certificate, so you can't use localhost in that case.
It also depends on how NiFi's web server is configured. If nifi.web.http.host or nifi.web.https.host has a specific hostname specified, then the web server is only bound to that hostname and may not accept connections with a different hostname. In a default unsecured setup, if you leave nifi.web.http.host blank then it binds to all interfaces.
You may be able to use the expression language function to obtain the hostname of the current node. So you could make the url something like "http://${hostname()}/nifi-api/controller/cluster".

Requiring public IP address for kafka running on EC2

We have kafka and zookeeper installed on a single AWS EC2 instance. We have kafka producers and consumers running on separate ec2 instances which are on the same VPC and have the same security group as that of kafka instance. In the producer or consumer config we are using the internal IP address of the kafka server to connect to it.
But we have noticed that we need to mention the public IP address of the EC2 server as advertised.listeners for letting the producers and consumers connect to the Kafka server:
advertised.listeners=PLAINTEXT://PUBLIC_IP:9092
We also have to whitelist the public IP addresses and open traffic on port 9092 for each of our EC2 servers running producers and consumers.
We want the traffic to flow over internal IP addresses. Is there a way to avoid whitelisting the public IP addresses and opening traffic on port 9092 for each of our servers running a producer or consumer?
If you don't want to open access to all for either one of your servers, I would recommend adding a proper high performance web server like nginx or Apache HTTPD in front of your applications' servers acting as a reverse proxy. This way you could also add SSL encryption and your server stays on a private network while only the web server would be exposed. It’s very easy and you can find many tutorials on how to set it up. Like this one: http://webapp.org.ua/sysadmin/setting-up-nginx-ssl-reverse-proxy-for-tomcat/
Because of the variable nature of the ecosystem that kafka may need to work in, it only makes sense that you are explicit in declaring the locations which kafka can use. The only way to guarantee that external parts of any system can be reached via an ip address is to ensure that you are using external ip addresses.
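To illustrate the client side of this, here is a minimal sketch of a producer pointed at the broker's private address (the IP and topic name are hypothetical). bootstrap.servers is only used for the initial connection; after that, clients talk to whatever address the broker advertises, which is why advertised.listeners has to be reachable by the producers and consumers:
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

Properties props = new Properties();
props.put("bootstrap.servers", "10.0.1.15:9092"); // hypothetical private IP of the Kafka EC2 instance
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

// The broker's advertised.listeners value, not this bootstrap address, decides where
// subsequent requests go; if it advertises a public IP, traffic leaves the private network.
try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
    producer.send(new ProducerRecord<>("test-topic", "key", "value"));
}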

Nifi 1.5.0 Cluster configuration

Does anyone know how to cluster NiFi 1.5.0? I want to use dataflow.mydomain.com, but I get this error when I try to hit the load balancer:
"The request contained an invalid host header [dataflow.mydomain.com] in the request [/nifi/]. Check for request manipulation or third-party intercept."
According to one post that I read, the problem was that the value of nifi.web.http.host had to match the value of the url.
If that's true, I don't understand how a cluster would be possible.
Thanks!
(I'm using a 3-host setup in AWS; the hosts will individually respond if I set nifi.web.http.host to their private IP and access them at http://[ip]/nifi/, but not if I use a load balancer in front of the cluster.)
It is not really an issue of clustering NiFi, it is an issue of accessing it through a load balancer. A cluster does not imply a load balancer.
In the next version of NiFi there will be a new property (nifi.web.proxy.host) where you could put dataflow.mydomain.com and it would let it through.
For now I think you'd have to strip off the host header of each request at your load balancer so that it doesn't get passed on to the NiFi nodes, since that header is what is triggering the rejection. NiFi inspects the headers of the incoming request and sees that the host header has a value that is not the host of NiFi.

Server nodes unable to discover each other in Apache Ignite

I'm trying to set up 3 server nodes on my local machine. Each node starts but is unable to join the cluster.
I can see the below message being logged in the log file for each server node.
Topology snapshot [ver=1, servers=1, clients=0, CPUs=4, heap=0.1GB]
Here is my code to start the server.
IgniteConfiguration config = new IgniteConfiguration();
TcpDiscoverySpi spi = new TcpDiscoverySpi();
TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder(true);
ipFinder.setAddresses(Arrays.asList("192.168.0.3","192.168.0.3:47100..47120"));
spi.setIpFinder(ipFinder);
config.setDiscoverySpi(spi);
// Start Ignite node.
ignite = Ignition.start(config);
Can anyone please suggest what I might be missing here?
Try removing the address without the port and leave only the one that specifies the range:
ipFinder.setAddresses(Arrays.asList("192.168.0.3:47100..47120"));
Struggled with this for hours and could only solve it by doing the following.
1) Ensure that you've set the local interface and port range on which your servers are listening:
TcpDiscoverySpi spi = new TcpDiscoverySpi();
//This address should be accessible to other nodes
spi.setLocalAddress("192.168.0.1");
spi.setLocalPort(48500);
spi.setLocalPortRange(20);
2) Configure your IP finder accordingly. Supposing that the node is to find peers on the same machine, configured similarly (i.e., as per step 1 above):
TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
ipFinder.setAddresses("192.168.0.1:48500..48520");
spi.setIpFinder(ipFinder);
As instances start up in turn, they'll use ports within the configured range, and within that range, they'll discover one another using TCP discovery.
This is the only way I managed to connect 2+ server nodes on the same machine, without using multicast discovery.
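Putting both pieces together, a sketch of a complete node start for this setup (the address and ports are taken from the answer above and assumed to be correct for your machine):
import java.util.Arrays;
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;

TcpDiscoverySpi spi = new TcpDiscoverySpi();
spi.setLocalAddress("192.168.0.1");   // interface the node binds to; must be reachable by peers
spi.setLocalPort(48500);              // first discovery port
spi.setLocalPortRange(20);            // nodes on this machine use ports 48500..48520

TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
ipFinder.setAddresses(Arrays.asList("192.168.0.1:48500..48520")); // where to look for peers
spi.setIpFinder(ipFinder);

IgniteConfiguration config = new IgniteConfiguration();
config.setDiscoverySpi(spi);
Ignite ignite = Ignition.start(config);
Each additional node started with this same configuration binds to the next free port in the range and finds the others through the IP finder.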

Reconnection in Elasticsearch Cluster

I have a question about clustering, specifically about reconnection within a cluster, in Elasticsearch.
I have 2 Elasticsearch servers on 2 different machines within a network. Both Elasticsearch instances are in the same cluster.
In an error scenario the network connection could be broken. I simulate this behaviour by pulling the network cable on one server.
After reconnecting the server to the network, the clustering no longer works: when I index some data into one Elasticsearch node, the data is not transferred to the other node.
Does anybody know if there are any settings related to reconnecting?
Best Regards
Thomas
Why not just put all Elasticsearch servers behind a load balancer with a single DNS name? If there is an issue on a server that goes down and needs manual intervention, then after correcting the problem the server will automatically become available again behind the load balancer.
Did you check if all nodes join the cluster again?
You may want to try the following APIs:
Check nodes status
http://es-host:9200/_nodes
Check cluster status
http://es-host:9200/_cluster/health
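If you are checking from the Java client rather than over HTTP, an equivalent sketch (assuming a connected Client as in the first question) could be:
import org.elasticsearch.action.admin.cluster.health.ClusterHealthResponse;

// Ask the cluster how many nodes it currently sees and whether it is green/yellow/red.
ClusterHealthResponse health = client.admin().cluster().prepareHealth()
    .execute().actionGet();
System.out.println("nodes: " + health.getNumberOfNodes() + ", status: " + health.getStatus());
If both nodes have rejoined, the node count should be 2 again and newly indexed data should be replicated to the other node.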
