DataStax OpsCenter issue: dashboard timeout - amazon-ec2

I installed the DataStax Community edition on an EC2 server and it worked fine. After that I tried to add one more server; I see two nodes in the Nodes menu, but on the main dashboard I see the following error:
Error: Call to /Test_Cluster__No_AMI_Parameters/rc/dashboard_presets/ timed out.
One potential root cause I can see is the name of the cluster: I specified something else in cassandra.yaml, but it looks like OpsCenter is still using the original name. Any help would be greatly appreciated.

It was because the cluster name change wasn't made properly. I found it easier to change the cluster name before starting the Cassandra cluster. On top of this, only one instance of opscenterd needs to run per cluster. datastax-agent needs to be running on all nodes in the cluster, but they all need to point to the same opscenterd (the change needs to be made in /var/lib/datastax-agent/conf/address.yaml).
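For reference, a minimal sketch of that address.yaml, assuming the opscenterd host sits at 10.0.0.5 (a hypothetical address); stomp_interface is the setting the agent uses to find opscenterd:

stomp_interface: "10.0.0.5"

Restart the agent afterwards (e.g. sudo service datastax-agent restart) so the change takes effect.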

Related

Cannot find datadog agent connected to elasticsearch

I have an issue where I have multiple host dashboards for the same Elasticsearch server. Each dashboard has its own name and way of collecting data. One is connected to the installed datadog-agent and the other is somehow connected to the Elasticsearch service directly.
The weird thing is that I cannot seem to find a way to turn off the agent connected directly to the ES service, other than turning off the Elasticsearch service completely.
I have tried deleting the datadog-agent completely. This stops the dashboard connected to it from receiving data (of course), but the other dashboard keeps receiving data somehow. I cannot find what is sending this data and therefore am not able to stop it. We have multiple master and data nodes and this is an issue for all of them. The ES version is 7.17.
Another of our clusters is running ES 6.8; we have not made the final configuration of the monitoring for that cluster, but for now it does not have this issue.
Just as extra information: the dashboard connected to the agent has the same name as the host server, while the other only has the internal IP as its host name.
Does anyone have any idea what it is that is running and how to stop it? I have tried almost everything I could think of.
I finally found the reason: all the datadog-agents on the master and data nodes were configured not to use the node name as the host name, and cluster stats was turned on in the Elasticsearch plugin for Datadog. As a result, whenever even one of the datadog-agents in the cluster was running, data kept coming in to the incorrectly named dashboard. Leaving the answer here in case anyone hits the same situation in the future.
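For anyone tracing the same behavior, the relevant settings live in the agent's Elasticsearch integration config (typically conf.d/elastic.d/conf.yaml). A hedged sketch, assuming your integration version supports these options:

init_config:

instances:
  - url: http://localhost:9200
    cluster_stats: false     # don't let a single agent report stats for the whole cluster
    node_name_as_host: true  # report under the node name instead of the internal IP

With cluster_stats left on, any one running agent keeps feeding data for every node, which is exactly the behavior described above.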

Setting up a Sensu-Go cluster - cluster is not synchronizing

I'm having an issue setting up my cluster according to the documentation, as seen here: https://docs.sensu.io/sensu-go/5.5/guides/clustering/
This is a non-https setup to get my feet wet, I'm not concerned with that at the moment. I just want a running cluster to begin with.
I've set up sensu-backend on my three nodes and have configured backend.yml accordingly on all three nodes through an Ansible playbook. However, each node does not discover the other two. It simply shows the following:
For backend1:
=== Etcd Cluster ID: 3b0efc7b379f89be
ID Name Peer URLs Client URLs
────────────────── ─────────────────── ─────────────────────── ───────────────────────
8927110dc66458af backend1 http://127.0.0.1:2380 http://localhost:2379
For backend2 and backend3, it's the same, except it shows those individual nodes as the only nodes in their cluster.
I've tried both the configuration in the docs and the configuration in this GitHub issue: https://github.com/sensu/sensu-go/issues/1890
None of these have panned out for me. I've ensured all the ports are open, so that's not an issue.
When I do a manual sensuctl cluster member-add X X, I get an error message and the sensu-backend process fails. I can't remove the member either, because that prevents the process from starting at all. I have to revert to an earlier snapshot to fix it.
The configs on all machines are the same, except that the IPs and names are adjusted for each machine:
etcd-advertise-client-urls: "http://XX.XX.XX.20:2379"
etcd-listen-client-urls: "http://XX.XX.XX.20:2379"
etcd-listen-peer-urls: "http://0.0.0.0:2380"
etcd-initial-cluster: "backend1=http://XX.XX.XX.20:2380,backend2=http://XX.XX.XX.31:2380,backend3=http://XX.XX.XX.32:2380"
etcd-initial-advertise-peer-urls: "http://XX.XX.XX.20:2380"
etcd-initial-cluster-state: "new" # have also tried existing
etcd-initial-cluster-token: ""
etcd-name: "backend1"
Did you find the answer to your question? I saw that you posted over on the Sensu forums as well.
In any case, the easiest thing to do here would be to stop the cluster, blow away /var/lib/sensu/sensu-backend/etcd/, and reconfigure the cluster. As it stands, the behavior you're seeing suggests the cluster members were started individually first; that is the likely cause of the issue and the reason for blowing the etcd directory away.
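A minimal sketch of that reset, assuming systemd units and the default data directory (adjust paths to your install):

# on all three backends
sudo systemctl stop sensu-backend
sudo rm -rf /var/lib/sensu/sensu-backend/etcd/
# then start them together so they form the cluster from the initial-cluster config
sudo systemctl start sensu-backend

Because etcd persists its member list in that directory, a node that was first started standalone will keep coming up as a single-member cluster until that state is wiped.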

Elasticsearch multinode setup

I want to set up a 3-node Elasticsearch cluster, but I am unable to: I get a connection-refused error on the data machine. The master machine starts fine, but it shows 0 nodes added.
I would recommend reading a tutorial first, such as:
https://www.digitalocean.com/community/tutorials/how-to-set-up-a-production-elasticsearch-cluster-on-ubuntu-14-04
https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery-zen.html
and then asking a precise question here about a specific issue.
About your question, I think you didn't configure discovery.zen.ping.unicast.hosts correctly, so the nodes don't know about each other.
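For illustration, a minimal elasticsearch.yml for one node of a three-node cluster using zen discovery (the IPs and names here are hypothetical, and ES 7+ replaces these settings with discovery.seed_hosts and cluster.initial_master_nodes):

cluster.name: my-cluster
node.name: node-1
network.host: 10.0.0.1
discovery.zen.ping.unicast.hosts: ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
discovery.zen.minimum_master_nodes: 2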
Also, when you post a question, please include:
the Elasticsearch version
the environment (AWS, VM, ...)
a configuration sample
Welcome to SO!

ejabberd Cluster Not Working

I've set up an ejabberd (v15.04) cluster on AWS using 2 Ubuntu images. While I am able to successfully cluster the two (using the join_cluster command from the 2nd node to the 1st node), I am not sure the behavior is as expected... any thoughts would be much appreciated.
To detail the above: two different clients connected to the two nodes separately can communicate with each other. However, when I stop the server on the secondary node, I would still expect the two clients to be able to talk to each other. Instead, the 2nd client simply gets disconnected because the server it was connected to is not running.
Is there something I am possibly overlooking here?
Many thanks!
Join the two nodes with the join_as_master() method; the cluster code is available on GitHub.
For the ejabberd clustering I followed the steps from the link below:
Link: http://chadillac.tumblr.com/post/35967173942/easy-ejabberd-clustering-guide-mnesia-mysql
I did the clustering with no MySQL tables, only the Mnesia database.
Important notes:
1) The ejabberd.yml file should be the same as on the master host.
2) Copy the .erlang.cookie file from the master to the slaves.
3) The slave host name is set in ejabberdctl.cfg and will be different from the one mentioned in the slave's yml file.
4) MySQL runs on an entirely separate machine, so it does not need to be added to the cluster.
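As a rough sketch of the joining step itself, to be run on the slave once the cookie and config match the master (ejabberd@master is a placeholder for your master's actual node name):

ejabberdctl join_cluster 'ejabberd@master'

Mnesia then replicates the clustering tables from the master, so the slave must be able to resolve the master's host name.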

HBase NoServerForRegionException?

I am getting this exception after I haven't communicated with HBase for a while:
org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region because: Connection refused
Is this related to session expiry? If so, how can I extend the session lifetime?
Run bin/hbase hbck and find which machine the root region server is running on. You should see "-ROOT- is okay" in the hbck output. Make sure that all your region servers are up and running; use start regionserver to start one that is down.
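Concretely, something along these lines, assuming you are in the HBase install directory:

bin/hbase hbck                          # consistency report; look for "-ROOT- is okay"
bin/hbase-daemon.sh start regionserver  # start a region server that is down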
I don't think this has anything to do with session lifetime.
Check your cluster to make sure that it is up and working correctly and all region servers are alive. Then check the logs to make sure that they are not reporting some error state.
HBase is complex software; without more detailed information it is very difficult to diagnose what is going on. Often you will discover the problem yourself in the course of collecting that more detailed information.
This error shows that the client is not able to talk to the region server.
Check which region server is associated with the region the client is trying to connect to, and check that it's up.
To identify the region server associated with a region, please go through http://hbase.apache.org/0.94/book/regions.arch.html#regions.arch.assignment
A few factors are at play here.
Note the steps that occur when you try to connect to HBase from a client:
The client connects to ZooKeeper to get the IP of the region server that hosts the ROOT table.
The client caches this IP information so that it doesn't have to contact ZooKeeper again.
Your problem is that your client is trying to connect to ZooKeeper to get that IP, and one of the following things may be going wrong:
Your client is not able to connect to ZooKeeper.
The information about ROOT contained in the znode in ZooKeeper is wrong.
Possible fixes:
Check that your ZooKeeper is working fine.
Delete the znode for HBase in your ZooKeeper and restart the cluster. Don't worry, this won't delete your data.
Once this is done, the client can get the ROOT information and then query the META table without any issue.
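If you do need to clear the stale ROOT location, a hedged sketch using HBase's bundled ZooKeeper shell (the /hbase path is the default parent znode; check zookeeper.znode.parent in your config first):

bin/hbase zkcli
rmr /hbase

Then restart the cluster; the master and region servers repopulate the znodes with the correct addresses on startup.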
