Redis n00b here.
I'm using Redis locally on Windows to test code before pushing to my server.
I used this tutorial to set it up:
https://dingyuliang.me/redis-3-2-create-cluster-windows/
Before "Map slave node to master node" I ran the "cluster reset hard" command for all nodes because for some reason, all my nodeIDs were the same.
The test then ran exactly as in the example. However, after I restart my PC nothing works anymore; I get "Could not connect to Redis at :0: The requested address is not valid in its context." This is how it looks in the console when checking their state:
Everything except one master has no IP or port and shows as disconnected. What is going on and how can I fix it? So far the only fix has been to completely wipe everything and start over. All Redis services are running, and my conf files match the tutorial.
Well, I ended up following the official tutorial (which is written for Linux), and that turned out to work better:
https://redis.io/topics/cluster-tutorial
The fact that I had to hard-reset all the nodes in the other setup showed that something did not go as planned; they all ran on the same redis.conf, which probably caused the strange behaviour. Since the Windows port of Redis is only at v3, you will need an old redis-trib.rb (like this one: https://github.com/beebol/redis-trib.rb/blob/master/redis-trib.rb) to create the cluster. Bottom line: if you can, try to do this on Linux instead; if not, what I did is hopefully good enough for a local dev environment.
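For reference, a minimal sketch of the cluster creation with that redis-trib.rb, assuming six local nodes with one replica per master (the ports are just examples; use the ones from your .conf files):
ruby redis-trib.rb create --replicas 1 127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005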
I have a recently deployed kolla-ansible stable/victoria environment with several services I wanted to try but no longer need (designate, octavia, etc.). What is the "right" way to remove these services? I have attempted:
kolla-ansible -i multinode reconfigure --tags <services>
kolla-ansible -i multinode reconfigure --tags common,haproxy,<services>
kolla-ansible -i multinode deploy --tags <services>
In each case I'm left with still-running containers, leftover configuration artifacts (/etc/kolla/.*.conf) and haproxy config files.
I know it's been a while since you posted this question, but I recently had the same problem and haven't found documentation about this anywhere.
The reason reconfigure and deploy don't do anything even when you set enable_<service> to no is that the Ansible playbooks only run tasks for a given service if its corresponding enable_<service> flag is true. If you look at the output of your commands run with --tags, you'll see that Ansible isn't really doing anything with regard to your disabled service.
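For example, setting the flags in /etc/kolla/globals.yml like this (service names taken from the question) only makes Kolla-Ansible skip those services; it does not clean up anything that is already deployed:
enable_designate: "no"
enable_octavia: "no"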
Since Kolla-Ansible deploys everything with containers, I've found most services can simply be removed by doing the following (a rough command sketch follows this list):
Stop and delete all the containers running the service to be removed
Delete those containers' volumes
Remove the configuration and log files (under /etc/kolla and /var/log/kolla respectively)
Remove databases used by the service you're deleting
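As a very rough sketch of those steps for a single service, using designate from the question as the example (container, volume and database names vary by deployment, so verify them with docker ps / docker volume ls first, and run this on every host that runs the service):
docker ps -a --filter name=designate
docker stop $(docker ps -aq --filter name=designate)
docker rm $(docker ps -aq --filter name=designate)
docker volume ls --filter name=designate    # then docker volume rm the volumes it lists
rm -rf /etc/kolla/designate* /var/log/kolla/designate
docker exec -it mariadb mysql -u root -p -e "DROP DATABASE designate;"    # database name assumed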
You can also remove the HAProxy config files for each service you're removing.
I know this is perhaps not in the spirit of automating OpenStack management with Ansible, but I've done this a few times without too many problems. I would avoid removing core services like Keystone, Neutron, Nova, MariaDB or RabbitMQ, though, because doing that destroys your entire OpenStack deployment anyway.
You can run the cleanup-host and cleanup-containers scripts on the hosts running your containers, but those remove everything related to Kolla-Ansible. If you only want to remove a specific service, you could modify those scripts. I'm aware certain services like Nova, Neutron, Open vSwitch and Zun also reconfigure the host for networking, but I haven't found a reliable way to revert those changes, and cleanup-host/cleanup-containers don't address them either. If you stop and delete the openvswitch containers, Open vSwitch's interfaces go away on the next host reboot, which may be a viable method for you too. Remember that Kolla-Ansible loads the openvswitch kernel module persistently, so that's something else you may want to remove.
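If you go that route, a rough sketch (the container names below are the usual Kolla-Ansible ones, and the exact modules-load.d entry is an assumption — check what actually exists on your hosts):
docker stop openvswitch_vswitchd openvswitch_db
docker rm openvswitch_vswitchd openvswitch_db
ls /etc/modules-load.d/    # find and remove whichever file loads the openvswitch module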
I was also struggling with this scenario recently, and these are the only references I've found:
https://bugs.launchpad.net/kolla-ansible/+bug/1874044
https://review.opendev.org/c/openstack/kolla-ansible/+/504592
Unfortunately, it seems that work on this started some time ago, but not much progress has been made yet.
I'm having an issue setting up my cluster according to the documents, as seen here: https://docs.sensu.io/sensu-go/5.5/guides/clustering/
This is a non-https setup to get my feet wet, I'm not concerned with that at the moment. I just want a running cluster to begin with.
I've set up sensu-backend on my three nodes and configured backend.yml accordingly on all three through an Ansible playbook. However, my cluster does not discover the other two nodes. It simply shows the following:
For backend1:
=== Etcd Cluster ID: 3b0efc7b379f89be
ID Name Peer URLs Client URLs
────────────────── ─────────────────── ─────────────────────── ───────────────────────
8927110dc66458af backend1 http://127.0.0.1:2380 http://localhost:2379
For backend2 and backend3, it's the same, except it shows those individual nodes as the only nodes in their cluster.
I've tried both the configuration in the docs, as well as the configuration in this git issue: https://github.com/sensu/sensu-go/issues/1890
None of these have panned out for me. I've ensured all the ports are open, so that's not an issue.
When I do a manual sensuctl cluster member-add X X, I get an error message and it results in the sensu-backend process failing. I can't remove the member, either, because it causes the entire process to not be able to start. I have to revert to an earlier snapshot to fix it.
The configs on all machines are the same, except that the IPs and names are adjusted for each machine:
etcd-advertise-client-urls: "http://XX.XX.XX.20:2379"
etcd-listen-client-urls: "http://XX.XX.XX.20:2379"
etcd-listen-peer-urls: "http://0.0.0.0:2380"
etcd-initial-cluster: "backend1=http://XX.XX.XX.20:2380,backend2=http://XX.XX.XX.31:2380,backend3=http://XX.XX.XX.32:2380"
etcd-initial-advertise-peer-urls: "http://XX.XX.XX.20:2380"
etcd-initial-cluster-state: "new" # have also tried existing
etcd-initial-cluster-token: ""
etcd-name: "backend1"
Did you find the answer to your question? I saw that you posted over on the Sensu forums as well.
In any case, the easiest thing to do would be to stop the cluster, blow away /var/lib/sensu/sensu-backend/etcd/ and reconfigure the cluster. The behavior you're seeing suggests the cluster members were each started standalone first, which is likely what's causing the issue and is the reason for blowing the etcd directory away.
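Roughly, on each backend node (assuming the packaged systemd service and the default config path):
sudo systemctl stop sensu-backend
sudo rm -rf /var/lib/sensu/sensu-backend/etcd/
(fix the etcd settings in /etc/sensu/backend.yml, then)
sudo systemctl start sensu-backend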
I want to set up a 3-node cluster in Elasticsearch, but I am unable to get it working. I get a "connection refused" error on the data machine; the master machine starts fine, but it shows 0 nodes added.
I would recommend reading a tutorial first, like:
https://www.digitalocean.com/community/tutorials/how-to-set-up-a-production-elasticsearch-cluster-on-ubuntu-14-04
https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery-zen.html
and then asking a precise question here about a specific issue.
As for your question, I think you didn't configure discovery.zen.ping.unicast.hosts correctly, so the nodes don't know about each other.
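As a rough example, each node's elasticsearch.yml would contain something like this (cluster name, node names and IPs are placeholders; this is the pre-7.x zen discovery style that the second link above documents):
cluster.name: my-cluster
node.name: node-1
network.host: 0.0.0.0
discovery.zen.ping.unicast.hosts: ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
discovery.zen.minimum_master_nodes: 2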
Also, when you post a question, please post:
elasticsearch version
environment (AWS, VM, ...)
configuration sample
Welcome to SO!
I have installed HDP Ambari with three nodes in VMs. I restarted one of the three nodes (datanode2), and after that I lost the heartbeat from that node in Ambari. I restarted ambari-agent on all three nodes, but it is still not working. Kindly help me find a solution.
Well, the provided information is not sufficient, but I will try to describe the normal approach I take to debug this.
First, check whether all the ambari-agents are running, using the command ambari-agent status.
Check the logs of both ambari-agent and ambari-server. Normally the logs are available under /var/log/ambari-agent and /var/log/ambari-server. The logs should tell you the exact reason for the lost heartbeat.
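For example (the log file names below are the usual defaults and may differ on your install):
ambari-agent status
tail -n 100 /var/log/ambari-agent/ambari-agent.log
tail -n 100 /var/log/ambari-server/ambari-server.log    (on the Ambari server host)
ambari-agent restart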
The most common reasons for agent failure are connection issues between the machines, a version mismatch, or a corrupt database entry.
I think the log files should help you.
I'm new to EC2. I have successfully created a Windows Server 2012 HPC cluster using Amazon Web Services and hope to run parallel programs on it.
I have successfully run MPJ Express in the multicore configuration. However, I am facing a problem with the cluster configuration using niodev: my head node is not able to connect to the compute node.
I have followed the instructions given at http://mpj-express.org/docs/guides/windowsguide.pdf and have set up all the environment variables.
Screenshot of my error
The IP address I put in the machines file is the private IP of the compute node.
My machines file is in the directory c:\mpj-user.
My compute node has started the MPJ daemon with the same MPJ Express configuration.
I am able to ping the compute node from the head node.
Most of the solutions I found on the Internet are for Ubuntu; I can't really find one for Windows.
Any help or solution is much appreciated.
You need to start the daemons on each compute node separately with the same MPJ Express configuration, and then try to run the hello world program.
If that does not work and there is no network file sharing enabled, use the -src switch. You can find information about this switch in the Windows guide.
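Roughly, the sequence from the Windows guide looks like this (the class name and process count are just examples, and the exact flags may differ between MPJ Express versions, so double-check the guide):
mpjdaemon.bat -boot    (on every node listed in the machines file)
mpjrun.bat -np 2 -dev niodev HelloWorld    (from the head node, in the directory containing the machines file)
(add -src if the nodes do not share a file system)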
Let me know if you still get the problem.