If there are 3 etcd nodes, will Apache APISIX still be able to get the configuration if 2 of them fail? Why? - apache-apisix

To use APISIX I have prepared an etcd cluster with three nodes. If two of the nodes fail, can APISIX still get the configuration normally?
Also, if all of the nodes fail, will APISIX still work?
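etcd relies on the Raft protocol, which can only serve consistent reads and writes while a strict majority of its members is healthy. The quorum arithmetic behind the question can be sketched in Go:

```go
package main

import "fmt"

// quorum returns the minimum number of healthy members an
// etcd (Raft) cluster needs: a strict majority of its size.
func quorum(size int) int {
	return size/2 + 1
}

func main() {
	for _, size := range []int{3, 5} {
		fmt.Printf("size=%d quorum=%d tolerates %d failure(s)\n",
			size, quorum(size), size-quorum(size))
	}
}
```

So with 3 nodes the quorum is 2: losing one node is fine, but losing two leaves only one member, the cluster can no longer reach consensus, and any client (APISIX included) can no longer fetch configuration from it.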

Related

When configuring Snapshot for an ElasticSearch cluster, do I do that to every node?

Sorry for what may be an obvious question, but I have a 3-node ElasticSearch cluster and I want it to take a nightly snapshot that is sent to S3 for recovery. I have done this for my test cluster, which is a single node. As I was starting to do it for my 3-node production cluster, I was left wondering: do I have to configure the repository and snapshot on each node separately, or can I just do it on one node via Kibana and have it replicate across the cluster? I looked through the documentation but didn't see anything about this.
Thank you!
Yes, you need to configure it on every node.
First you need to install the repository-s3 plugin on every node; this is explained in the documentation.
After that, you also need to add the access and secret keys to the elasticsearch-keystore of every node (documentation).
The rest of the configuration, creating the repository and setting up the snapshots, is done through Kibana once.
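As an illustration, the one-time repository creation can be done from Kibana's Dev Tools console with a single request; the repository and bucket names below are placeholders:

```
PUT _snapshot/nightly_s3_repo
{
  "type": "s3",
  "settings": {
    "bucket": "my-es-snapshots"
  }
}
```

Because the repository definition lives in the cluster state, this request only needs to be issued once, from any node.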

New broker cannot join the cluster (Kafka)

We are currently configuring our Kafka cluster to use SSL and ACLs.
Our cluster is composed of 3 nodes, and all three of them have the same SSL/TLS setup (I just copied the certificates from one node to the other two) and all other Kafka configurations.
Two of the nodes succeed in joining the cluster, but the last one does not.
Here is the error, from ReplicaFetcherThread:
UnknownTopicOrPartitionException: This server does not host this topic-partition.
I find it weird because all three of them have the same configuration.
The three servers are in Amazon EC2 and have the same security group.
Hope you can help me understand the problem.
This should be a comment, but my reputation is still too low to comment. You say they all have the same configuration. Did you make sure to bump the broker id on the one that's failing?
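For reference, the broker id lives in each broker's server.properties and must be unique per node; if the config files were copied verbatim, two brokers may be claiming the same id. A sketch (the id values are illustrative):

```
# server.properties on each node; the id must differ per broker,
# e.g. node 1 -> 0, node 2 -> 1, node 3 -> 2
broker.id=2
```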

ganglia is missing some metrics of some nodes

I use ganglia to monitor performance-related metrics of cluster nodes.
I installed gmond python modules for richer functionality.
However, some metrics from some nodes are missing (e.g. disk_*_read_bytes_per_sec).
A few nodes work as expected and report the metrics.
But some nodes are missing either disk_*_read_bytes_per_sec or disk_*_write_bytes_per_sec, or both.
If I restart the gmond daemon, some nodes start working correctly again and some stop working again.
I checked the /etc/ganglia/gmond.conf and /etc/ganglia/conf.d/* configuration files. All the computation nodes in the cluster have exactly the same configuration settings. How can they behave so differently? Where should I check first to resolve the problem?
Thanks

kubernetes go client used storage of nodes and cluster

I am a newbie in Go. I want to get the storage statistics of nodes and the cluster in Kubernetes using Go code. How can I get the free and used storage of Kubernetes nodes and the cluster using Go?
This is actually 2 problems:
How do I perform http requests to the Kubernetes master?
See [1] for more details. Tl;dr you can access the apiserver in at least 3 ways:
a. kubectl get nodes (not go)
b. kubectl proxy, followed by a Go HTTP client pointed at the proxy URL
c. Running a pod in a kubernetes cluster
What are the requests I need to do to get node stats?
a. Run kubectl describe node, it should show you resource information.
b. Now run kubectl describe node --v=7, it should show you the REST calls.
I also think you should reformat the title of your question per https://stackoverflow.com/help/how-to-ask, so it reflects what you're really asking.
[1] https://github.com/kubernetes/kubernetes/blob/release-1.0/docs/user-guide/accessing-the-cluster.md
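To make option (b) concrete, here is a hedged Go sketch. It assumes the nodes report an `ephemeral-storage` capacity field (present in recent Kubernetes versions; the release-1.0-era docs linked above expose different fields). To stay self-contained it parses a sample of the JSON that `GET /api/v1/nodes` returns rather than making the HTTP call:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// nodeList mirrors just the fields we need from GET /api/v1/nodes.
type nodeList struct {
	Items []struct {
		Metadata struct {
			Name string `json:"name"`
		} `json:"metadata"`
		Status struct {
			Capacity    map[string]string `json:"capacity"`
			Allocatable map[string]string `json:"allocatable"`
		} `json:"status"`
	} `json:"items"`
}

// storageByNode extracts each node's ephemeral-storage capacity.
func storageByNode(body []byte) (map[string]string, error) {
	var nl nodeList
	if err := json.Unmarshal(body, &nl); err != nil {
		return nil, err
	}
	out := make(map[string]string)
	for _, n := range nl.Items {
		out[n.Metadata.Name] = n.Status.Capacity["ephemeral-storage"]
	}
	return out, nil
}

func main() {
	// In a live cluster, with `kubectl proxy` running, this body
	// would come from:
	//   resp, err := http.Get("http://127.0.0.1:8001/api/v1/nodes")
	sample := []byte(`{"items":[{"metadata":{"name":"node-1"},
		"status":{"capacity":{"ephemeral-storage":"98Gi"},
		"allocatable":{"ephemeral-storage":"92Gi"}}}]}`)
	caps, err := storageByNode(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(caps["node-1"]) // prints 98Gi
}
```

Note that capacity and allocatable are totals, not live usage; free/used figures come from the kubelets' stats endpoints, which the same HTTP approach can query.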

Could not determine the current leader

I'm in a situation in which I have two masters and four slaves in Mesos. All of them are running fine. But when I try to access Marathon I get the 'Could not determine the current leader' error. I have Marathon on both masters (117 and 115).
This is basically what I'm running to get marathon up:
java -jar ./bin/../target/marathon-assembly-0.11.0-SNAPSHOT.jar --master 172.16.50.117:5050 --zk zk://172.16.50.115:2181,172.16.50.117:2181/marathon
Could anyone shed some light over this?
First, I would double-check that you're able to talk to Zookeeper from the Marathon hosts.
Next, there are a few related points to be aware of:
Per the Zookeeper administrator's guide (http://zookeeper.apache.org/doc/r3.1.2/zookeeperAdmin.html#sc_zkMulitServerSetup) you should have an odd number of Zookeeper instances for HA. A cluster size of two is almost certainly going to turn out badly.
For a highly available Mesos cluster, you should run an odd number of masters and also make sure to set the --quorum flag appropriately based on that number. See the details of how to set the --quorum flag (and why it's important) in the operational guide on the Apache Mesos website here: http://mesos.apache.org/documentation/latest/operational-guide
In a highly-available Mesos cluster (#masters > 1) you should let both the Mesos agents and the frameworks discover the leading master using Zookeeper. This lets them rediscover the leading master in case a failover occurs. In your case assuming canonical ZK ports you would set the --zk flag on the Mesos masters to --zk=zk://172.16.50.117:2181,172.16.50.115:2181/mesos (add a third ZK instance, see the first point above). The same value should be used for the --master flags in both the Mesos agents and Marathon, instead of specifying a single master.
It's best to run an odd number of masters in your cluster. To do so, either add another master so you have three or remove one so you have only one.
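Putting those flags together, a hedged sketch of what each of the three masters and Marathon might run; the third ZooKeeper host is a placeholder for the instance you would add:

```
mesos-master --quorum=2 \
  --zk=zk://172.16.50.115:2181,172.16.50.117:2181,<third-zk-host>:2181/mesos

java -jar marathon-assembly-0.11.0-SNAPSHOT.jar \
  --master zk://172.16.50.115:2181,172.16.50.117:2181,<third-zk-host>:2181/mesos \
  --zk zk://172.16.50.115:2181,172.16.50.117:2181,<third-zk-host>:2181/marathon
```

With --quorum=2 out of three masters, the cluster survives one master failure, and because Marathon discovers the leader through ZooKeeper instead of a hard-coded address, a failover no longer produces the 'Could not determine the current leader' error.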

Resources