Is it possible to demote a master node without using "repmgr standby clone" and pg_rewind - clone

I am currently using postgresql with log shipping replication. I use a master/slave resource of pacemaker to deal with postgresql failover.
I was asking if there is a way to demote a master, set it as standby and keep synchronized without using "repmgr standby clone" neither pg_rewind.
In fact, I want the old master to be quickly ready to get back to master state and "repmgr standby clone" takes several minutes to recover which is too long.
I see that it is possible to use pg_rewind to synchronize faster but it implies to have wal_log_hints enable, and I afraid that this options will decrease the performances of the master. The master is already too much busy.
I try to just write the recovery.conf in data directory, the master has well turned to slave mode, however it doesn't have upstream:
[root#bkm-01 httpd]# su - postgres -c "/usr/pgsql-9.5/bin/repmgr -f /var/lib/pgsql/repmgr/repmgr.conf cluster show"
Role | Name | Upstream | Connection String
----------+--------|----------|--------------------------------------
* master | node-02 | | host=node-02 user=repmgr dbname=repmgr
standby | node-01 | | host=node-01 user=repmgr dbname=repmgr
I wish it is clear enough, I actually a newbie in database replication. Any help would be appreciated.

I found the solution by myself. In fact the former-master just need to be registered after been demoted. --force should be used if node was previously registered.
[root#node-01 ] su - postgres -c "/usr/pgsql-9.5/bin/repmgr -f /var/lib/pgsql/repmgr/repmgr.conf standby register --force"

Related

How to sub to and monitor a VerneMQ cluster?

I managed to create a cluster of 2 Vernemq nodes.
Both are well communicating and are located to a different server, respectively
A.company.com
B.company.com
Following command shows both nodes running. (ssl and port 8883 are used)
vmq-admin cluster show
+-----------------------+-------+
| Node |Running|
+-----------------------+-------+
|ServerAMQ#10.107.1.25 | true |
|ServerBMQ#10.107.1.26 | true |
+-----------------------+-------+
Everything is running fine actually, eg :
mosquitto_pub -h a.mycompany.com -p 8883 -t 'topic/A/1' -m "foo" -d --cert client.crt --key client.key --cafile ca.crt -d
As i'm subbing A.mycompany.com, does the Cluster distribute to B.mycompany.com ?
Can please someone share some insight on how it works ?
As i don't have any experience on this subject, may i please ask you which is the best way to supervise cluster's activity ?
Graphite (i saw some cluster's related metrics)
http://(oneofthetwoserver):8888/metrics
Regards,
Pierre
A quick way to overview the cluster is: http://{{your_server}}:8888/status.
For a production deployment, I'd use Prometheus metrics in any case.
I'm not sure I get your question regarding broker A and B. If they are clustered, messages published to A will be delivered to a subscriber on broker B, provided the topic subscription matches.

etcd cluster id mistmatch

Hey I have a cluster id mismatch for some reason, i had it on 1 node then disapperead after clearing data dir few times , changing cluster token and node names, but apperead on another
here is the script i use
IP0=10.150.0.1
IP1=10.150.0.2
IP2=10.150.0.3
IP3=10.150.0.4
NODENAME0=node0
NODENAME1=node1
NODENAME2=node2
NODENAME3=node3
# changing these on each box
THISIP=$IP2
THISNODENAME=$NODENAME2
etcd --name $THISNODENAME --initial-advertise-peer-urls http://$THISIP:2380 \
--data-dir /root/etcd-data \
--listen-peer-urls http://$THISIP:2380 \
--listen-client-urls http://$THISIP:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://$THISIP:2379 \
--initial-cluster-token etcd-cluster-2 \
--initial-cluster $NODENAME0=http://$IP0:2380,$NODENAME1=http://$IP1:2380,$NODENAME2=http://$IP2:2380,$NODENAME3=http://$IP3:2380 \
--initial-cluster-state new
I get
2016-11-11 22:13:12.090515 I | etcdmain: etcd Version: 2.3.7
2016-11-11 22:13:12.090643 N | etcdmain: the server is already initialized as member before, starting as etcd member...
2016-11-11 22:13:12.090713 I | etcdmain: listening for peers on http://10.150.0.3:2380
2016-11-11 22:13:12.090745 I | etcdmain: listening for client requests on http://10.150.0.3:2379
2016-11-11 22:13:12.090771 I | etcdmain: listening for client requests on http://127.0.0.1:2379
2016-11-11 22:13:12.090960 I | etcdserver: name = node2
2016-11-11 22:13:12.090976 I | etcdserver: data dir = /root/etcd-data
2016-11-11 22:13:12.090983 I | etcdserver: member dir = /root/etcd-data/member
2016-11-11 22:13:12.090990 I | etcdserver: heartbeat = 100ms
2016-11-11 22:13:12.090995 I | etcdserver: election = 1000ms
2016-11-11 22:13:12.091001 I | etcdserver: snapshot count = 10000
2016-11-11 22:13:12.091011 I | etcdserver: advertise client URLs = http://10.150.0.3:2379
2016-11-11 22:13:12.091269 I | etcdserver: restarting member 7fbd572038b372f6 in cluster 4e73d7b9b94fe83b at commit index 4
2016-11-11 22:13:12.091317 I | raft: 7fbd572038b372f6 became follower at term 8
2016-11-11 22:13:12.091346 I | raft: newRaft 7fbd572038b372f6 [peers: [], term: 8, commit: 4, applied: 0, lastindex: 4, lastterm: 1]
2016-11-11 22:13:12.091516 I | etcdserver: starting server... [version: 2.3.7, cluster version: to_be_decided]
2016-11-11 22:13:12.091869 E | etcdmain: failed to notify systemd for readiness: No socket
2016-11-11 22:13:12.091894 E | etcdmain: forgot to set Type=notify in systemd service file?
2016-11-11 22:13:12.096380 N | etcdserver: added member 7508b3e625cfed5 [http://10.150.0.4:2380] to cluster 4e73d7b9b94fe83b
2016-11-11 22:13:12.099800 N | etcdserver: added member 14c76eb5d27acbc5 [http://10.150.0.1:2380] to cluster 4e73d7b9b94fe83b
2016-11-11 22:13:12.100957 N | etcdserver: added local member 7fbd572038b372f6 [http://10.150.0.2:2380] to cluster 4e73d7b9b94fe83b
2016-11-11 22:13:12.102711 N | etcdserver: added member d416fca114f17871 [http://10.150.0.3:2380] to cluster 4e73d7b9b94fe83b
2016-11-11 22:13:12.134330 E | rafthttp: request cluster ID mismatch (got cfd5ef74b3dcf6fe want 4e73d7b9b94fe83b)
the other memebers are not even running, how that's possible ?
Thank you
For all those who stumble upon this from google:
The error is about peer member ID, that tries to join cluster with same name as another member (probably old instance) that already exists in cluster (with same peer name, but another ID, this is the problem).
you should delete the peer and re-add it like shown in this helpful post:
In order to fix this it was pretty simple, first we had to log into an existing working server on the rest of the cluster and remove server00 from its member list:
etcdctl member remove <UID>
This free's up the ability to allow the new server00 to join but we needed to simply tell the cluster it could by issuing the add command:
etcdctl member add server00 http://1.2.3.4:2380
It you follow the logs on server00 you'll then see that everything spring into life. You can confirm this with the commands:
etcdctl member list
etcdctl cluster-health
Use "etcdctl member list" to find what are the IDs of current members, and find the one which tries to join cluster with wrong ID, then delete that peer from "members" with "etcdctl member remove " and try to rejoin him.
Hope it helps.
I just ran into this same issue, 2 years later. Dmitry's answer is fine but misses what the OP likely did wrong in the first place when setting up an etcd cluster.
Running an etcd instance with "--cluster-state new" at any point, will generate a cluster ID in the data directory. If you try to then/later join an existing cluster, it will use that old generated cluster ID (which is when the mismatch error occurs). Yes, technically the OP had an "old cluster" but more likely, and 100% common, is when someone is trying to stand up their first cluster, they don't notice the procedure has to change. I find that etcd kind of generally fails in providing a good usage model.
So, removing the member (you don't really need to if the new node never joined successfully) and/or deleting the new node's data directory will "fix" the issue, but its how the OP setup the 2nd cluster node that the problem.
Here's an example of the setup nuance: (sigh... thanks for that etcd...)
# On the 1st node (I used Centos7 minimal, with etcd installed)
sudo firewall-cmd --permanent --add-port=2379/tcp
sudo firewall-cmd --permanent --add-port=2380/tcp
sudo firewall-cmd --reload
export CL_NAME=etcd1
export HOST=$(hostname)
export IP_ADDR=$(ip -4 addr show ens33 | grep -oP '(?<=inet\s)\d+(\.\d+){3}')
# turn on etcdctl v3 api support, why is this not default?!
export ETCDCTL_API=3
sudo etcd --name $CL_NAME --data-dir ~/data --advertise-client-urls=http://127.0.0.1:2379,https://$IP_ADDR:2379 --listen-client-urls=https://0.0.0.0:2379 --initial-advertise-peer-urls https://$IP_ADDR:2380 --listen-peer-urls https://$IP_ADDR:2380 --initial-cluster-state new
Ok, the first node is running. The cluster data is in the ~/data directory. In future runs you only need (note that cluster-state isn't needed):
sudo etcd --name $CL_NAME --data-dir ~/data --advertise-client-urls=http://127.0.0.1:2379,https://$IP_ADDR:2379 --listen-client-urls=https://0.0.0.0:2379 --initial-advertise-peer-urls https://$IP_ADDR:2380 --listen-peer-urls https://$IP_ADDR:2380
Next, add your 2nd node's expected cluster name and peer URLs:
etcdctl --endpoints="https://127.0.0.1:2379" member add etcd2 --peer-urls="http://<next node's IP address>:2380"
Adding the member is important. You won't be able to successfully join without doing it first.
# Next on the 2nd/new node
export CL_NAME=etcd1
export HOST=$(hostname)
export IP_ADDR=$(ip -4 addr show ens33 | grep -oP '(?<=inet\s)\d+(\.\d+){3}')
sudo etcd --name $CL_NAME --data-dir ~/data --advertise-client-urls=https://127.0.0.1:2379,https://$IP_ADDR:2379 --listen-client-urls=https://0.0.0.0:2379 --initial-advertise-peer-urls https://$IP_ADDR:2380 --listen-peer-urls https://$IP_ADDR:2380 --initial-cluster-state existing --initial-cluster="etcd1=http://<IP of 1st node>:2380,etcd2=http://$IP_ADD:2380"
Note the annoying extra arguments here. --initial-cluster must have 100% of all nodes in the cluster identified... which doesn't matter after you join the cluster because cluster data will be replicated anyways... Also "--initial-cluster existing" is needed.
Again, after the 1st time the 2nd node runs/joins, you can run it without any cluster arguments:
sudo etcd --name $CL_NAME --data-dir ~/data --advertise-client-urls=http://127.0.0.1:2379,https://$IP_ADDR:2379 --listen-client-urls=https://0.0.0.0:2379 --initial-advertise-peer-urls https://$IP_ADDR:2380 --listen-peer-urls https://$IP_ADDR:2380
Sure, you could keep running etcd with all the cluster settings in there, but they "might" get ignored for whats in the data directory. Remember that if you join a 3rd node, knowledge of the new node member is replicated to the remaining node, and those "initial" cluster settings could be completely false/misleading in the future when your cluster changes. So run your joined nodes with no initial cluster settings unless you are actually joining one.
Also, last bit to impart, you should/must run at least 3 nodes in a cluster, otherwise the RAFT leader election process will break everything. With 2 nodes, when 1 node goes down or they get disconnected, the node will not elect itself and spin in an election loop. Clients can't talk to an etcd service that's in election mode... Great availability! You need a minimum of 3 nodes to handle if 1 goes down.
in my case i got the error
rafthttp: request cluster ID mismatch (got 1b3a88599e79f82b want b33939d80a381a57)
due to incorrect config on one node
two my nodes got in config
env ETCD_INITIAL_CLUSTER="etcd-01=http://172.16.50.101:2380,etcd-02=http://172.16.50.102:2380,etcd-03=http://172.16.50.103:2380"
and one node got
env ETCD_INITIAL_CLUSTER="etcd-01=http://172.16.50.101:2380"
to resolve the problem i stopped etcd on all nodes, edited incorrect config,
deleted /var/lib/etcd/member folder in all nodes , restarted etcd on all nodes and voila !
p.s.
/var/lib/etcd - is the folder where etcd save its data in my case
My --data-dir=/var/etcd/data, remove and recreate it, that works for me. It seems that something of previous etcd cluster I made left in this directory, which may affect the etcd settings.
I have faced the same problem, our leader etcd server went down and after replacing it with new we were getting an error
rafthttp: request sent was ignored (cluster ID mismatch)
It was looking for the old cluster-Id and generating some random local cluster with some misconfiguration.
Followed these steps to fix the issue.
Login to other working cluster and remove unreachable member from
the cluster
etcdctl cluster-health
etcdctl member remove member-id
Login to new server and stop if etcd process is running systemctl etcd2 stop
Remove data from the data directory rm -rf /var/etcd2/data Keep backup of this data somewhere in other folder before deleting it.
Now start your cluster with --initial-cluster-state existing parameter, don't use --initial-cluster-state new if you are already adding server to existing cluster.
Now go back to one of the running etcd server and add this new member to cluster etcdctl member add node0 http://$IP:2380
I have spent a lot of time on debugging this issue and now my cluster is running healthy with all members. Hope this information helps.
Add a new node to a existing etcd cluster.
etcdctl member add <new_node_name> --peer-urls="http://<new_node_ip>:2380"
Attention, if you enable TLS, replace http with https
Run etcd in new node. It is important to add "--initial-cluster-state existing", the purpose is telling new node that join the existing cluster, instead of creating a new cluster.
etcd --name <new_node_name> --initial-cluster-state existing ...
Check the result
etcdctl member list

redis on windows cluster setup

I have downloaded MSOpenTech Redis version 3.x which includes the long awaited clustering feature. My redis database is all working and I can start my cluster on the min 3 nodes required (in cluster mode). Does anyone know how to configure the cluster (it seems no one knows)?
Installing Linux and running the native Linux version is not an option for me sadly.
Any help would be greatly appreciated.
You can follow the Redis Cluster Tutorial and to create the cluster you can use the redis-trib.rb ruby script, for which you need to install Ruby for Windows.
For example:
> C:\Ruby22\Bin\ruby.exe redis-trib.rb create --replicas 1 192.168.1.1:7000 192.168.1.1:7001 192.168.1.1:7002 192.168.1.1:7003 192.168.1.1:7004 192.168.1.1:7005
Did not have the option to install Ruby on Windows but found the manual steps worked for me. The Ruby script seems to do a lot of checking stuff is setup correctly and is the preferred setup route. So Beware, here be dragons.
Set each node to run in Cluster mode. Edit the redis.windows-service.conf file and uncomment
cluster-enabled yes
cluster-config-file nodes-6379.conf
cluster-node-timeout 15000
restart the service.
Run a powershell window and change to the Redis installed folder and start the redis-cli. e.g.
cd "C:\Program Files\Redis"
.\redis-cli.exe
Now you can join other nodes. Run CLUSTER MEET IPADDRESS PORT for each of the other nodes, than the instance you happen to be on. e.g.
CLUSTER MEET 10.10.0.2 6379
After a few seconds running
CLUSTER NODES
Should list all the nodes connected, but all will be set as MASTER.
On each of the other nodes, run CLUSTER REPLICATE MASTERNODEID. Where MASTERNODEID is the hash-looking value next the node declared "myself" on your master when running CLUSTER NODES. e.g.
CLUSTER REPLICATE b7c767ab3ab7c4a926ac2fed937cf140b96764a7
Now allocate slots to each Master. My setup has three instances, only one master.
for ($slot=0;$slot -le 16383;$slot++) {
.\redis-cli.exe -h REDMST CLUSTER ADDSLOTS $slot
}
Reconnect with redis-cli and try and save data. e.g.
SET foo bar
OK
GET foo
"bar"
Phew! Got most this from reading https://www.javacodegeeks.com/2015/09/redis-clustering.html#InstallingRedis which is not Windows specific.
for windows version:
open the command window then type below command
C:\ProgramFiles\redis>FOR /L %i IN (0,1,16383) DO ( redis-cli.exe -p **6380** CLUSTER ADDSLOTS %i )
6380 is port of master node.

Mesos slave node unable to restart

I've setup a Mesos cluster using the CloudFormation templates from Mesosphere. Things worked fine after cluster launch.
I recently noticed that none of the slave nodes are listed in the Mesos dashboard. EC2 console shows the slaves are running & pass health checks. I restarted nodes on cluster but that didn't help.
I ssh'ed into one of the slaves and noticed mesos-slave services are not running. Executed sudo systemctl status dcos-mesos-slave.service but that couldn't start the service.
Looked in /var/log/mesos/ and tail -f mesos-slave.xxx.invalid-user.log.ERROR.20151127-051324.31267 and saw the following...
F1127 05:13:24.242182 31270 slave.cpp:4079] CHECK_SOME(state::checkpoint(path, bootId.get())): Failed to create temporary file: No space left on device
But the output of df -h and free show there is plenty of disk space left.
Which leads me to wonder, why is it complaining about no disk space?
Ok I figured it out.
When running Mesos for a long time or under frequent load, the /tmp folder won't have any disk space left since Mesos uses the /tmp/mesos/ as the work_dir. You see, the filesystem can only hold a certain number of file references(inodes). In my case, slaves were collecting large number of file chuncks from image pulls in /var/lib/docker/tmp.
To resolve this issue:
1) Remove files under /tmp
2) Set a different work_dir location
It is good practice to run
docker rmi -f $(docker images | grep "<none>" | awk "{print \$3}")
this way you will free space by deleting unused docker images

cant' replace dead cassandra node because it doesn't exist in gossip

One of the nodes in a cassandra cluster has died.
I'm using cassandra 2.0.7 throughout.
When I do a nodetool status this is what I see (real addresses have been replaced with fake 10 nets)
[root#beta-new:/opt] #nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 10.10.1.94 171.02 KB 256 49.4% fd2f76ae-8dcf-4e93-a37f-bf1e9088696e rack1
DN 10.10.1.98 ? 256 50.6% f2a48fc7-a362-43f5-9061-4bb3739fdeaf rack1
I tried to get the token ID for the down node by doing a nodetool ring command, grepping for the IP and doing a head -1 to get the initial one.
[root#beta-new:/opt] #nodetool ring | grep 10.10.1.98 | head -1
10.10.1.98 rack1 Down Normal ? 50.59% -9042969066862165996
I then started following this documentation on how to replace the node:
[http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_replace_node_t.html?scroll=task_ds_aks_15q_gk][1]
So I installed cassandra on a new node but did not start it.
Set the following options:
cluster_name: 'Jokefire Cluster'
seed_provider:
- seeds: "10.10.1.94"
listen_address: 10.10.1.94
endpoint_snitch: SimpleSnitch
And set the initial token of the new install as the token -1 of the node I'm trying to replace in cssandra.yaml:
initial_token: -9042969066862165995
And after making sure there was no data yet in:
/var/lib/cassandra
I started up the database:
[root#web2:/etc/alternatives/cassandrahome] #./bin/cassandra -f -Dcassandra.replace_address=10.10.1.98
The documentation I link to above says to use the replace_address directive on the command line rather than cassandra-env.sh if you have a tarball install (which we do) as opposed to a package install.
After I start it up, cassandra fails with the following message:
Exception encountered during startup: Cannot replace_address /10.10.10.98 because it doesn't exist in gossip
So I'm wondering at this point if I've missed any steps or if there is anything else I can try to replace this dead cassandra node?
Has the rest of your cluster been restarted since the node failure, by chance? Most gossip information does not survive a full restart, so you may genuinely not have gossip information for the down node.
This issue was reported as a bug CASSANDRA-8138, and the answer was:
I think I'd much rather say that the edge case of a node dying, and then a full cluster restart (rolling would still work) is just not supported, rather than make such invasive changes to support replacement under such strange and rare conditions. If that happens, it's time to assassinate the node and bootstrap another one.
So rather than replacing your node, you need to remove the failed node from the cluster and start up a new one. If using vnodes, it's quite straightforward.
Discover the node ID of the failed node (from another node in the cluster)
nodetool status | grep DN
And remove it from the cluster:
nodetool removenode (node ID)
Now you can clear out the data directory of the failed node, and bootstrap it as a brand-new one.
Some less known issues of Cassandra dead node replacement has been captured in below link based on my experience:
https://github.com/laxmikant99/cassandra-single-node-disater-recovery-lessons

Resources