How to distribute load across a RabbitMQ cluster? - amazon-ec2

Hi, I created three RabbitMQ servers running in a cluster on EC2.
I want to scale out the RabbitMQ cluster based on CPU utilization, but when I publish messages only one server utilizes CPU; the other RabbitMQ servers do not.
So how can I distribute the load across the RabbitMQ cluster?

RabbitMQ clusters are designed to improve scalability, but the system is not completely automatic.
When you declare a queue on a node in a cluster, the queue is only created on that one node. So, if you have one queue, regardless of which node you publish to, the message will end up on the node where the queue resides.
To properly use RabbitMQ clusters, you need to make sure you do the following things:
have multiple queues distributed across the nodes, such that work is distributed somewhat evenly,
connect your clients to different nodes (otherwise, you might end up funneling all messages through one node), and
if you can, try to have publishers/consumers connect to the node which holds the queue they're using (in order to minimize message transfers within the cluster).
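The first two points can be sketched in Python. This is a minimal illustration, not a definitive implementation: the node hostnames are hypothetical placeholders, the queue-placement helper is pure round-robin logic, and the actual declaration with the pika client is shown only in comments since it requires a live broker.

```python
from itertools import cycle

# Hypothetical cluster nodes; replace with your EC2 hostnames.
NODES = ["rabbit-1", "rabbit-2", "rabbit-3"]

def assign_queues(queue_names, nodes):
    """Round-robin queues across nodes so ownership is spread evenly."""
    assignment = {}
    node_iter = cycle(nodes)
    for q in queue_names:
        assignment[q] = next(node_iter)
    return assignment

queues = [f"work.{i}" for i in range(6)]
placement = assign_queues(queues, NODES)
print(placement)  # each node ends up owning 2 of the 6 queues

# To actually declare each queue on its assigned node with pika
# (untested sketch; needs a running broker and credentials):
#   import pika
#   for q, host in placement.items():
#       conn = pika.BlockingConnection(pika.ConnectionParameters(host=host))
#       conn.channel().queue_declare(queue=q, durable=True)
#       conn.close()
```

Publishers and consumers would then connect to `placement[q]` for the queue they use, which also satisfies the third point.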
Alternatively, have a look at High Availability Queues. They're like normal queues, but the queue contents are mirrored across several nodes. So, in your case, you would publish to one node, RabbitMQ would mirror the publishes to the other nodes, and consumers would be able to connect to either node without worrying about bogging down the cluster with internal transfers.
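For what it's worth, mirroring is enabled by a policy rather than by how the queue is declared. A sketch for classic mirrored queues (the policy name and the `ha.` queue-name prefix are assumptions for the example):

```
# Mirror every queue whose name starts with "ha." across all cluster nodes.
rabbitmqctl set_policy ha-all "^ha\." '{"ha-mode":"all"}'
```

Newer RabbitMQ versions deprecate classic mirrored queues in favor of quorum queues, so check which mechanism your version supports.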

That is not really true. Check out the documentation on that subject.
Messages published to the queue are replicated to all mirrors. Consumers are connected to the master regardless of which node they connect to, with mirrors dropping messages that have been acknowledged at the master. Queue mirroring therefore enhances availability, but does not distribute load across nodes (all participating nodes each do all the work).

Related

Achieve high availability and failover in Artemis Cluster with shared-store HA policy through JGroups protocol

In the documentation of Artemis ActiveMQ it is stated that if high availability is configured for the replication HA policy then you can specify a group of live servers that a backup server can connect to. This is done by configuring group-name in the master and the slave element of the broker.xml. A backup server will only connect to a live server that shares the same node group name.
But in shared-store there is no such concept of a group-name, which confuses me. If I have to achieve high availability through shared-store with JGroups, how can it be done?
Also, when I tried doing it through the replication HA policy with a group-name, the cluster was formed and failover worked, but I got this warning:
2020-10-02 16:35:21,517 WARN [org.apache.activemq.artemis.core.client] AMQ212034: There are more than one servers on the network broadcasting the same node id. You will see this message exactly once (per node) if a node is restarted, in which case it can be safely ignored. But if it is logged continuously it means you really do have more than one node on the same network active concurrently with the same node id. This could occur if you have a backup node active at the same time as its live node. nodeID=220da24b-049c-11eb-8da6-0050569b585d
2020-10-02 16:35:21,517 WARN [org.apache.activemq.artemis.core.client] AMQ212034: There are more than one servers on the network broadcasting the same node id. You will see this message exactly once (per node) if a node is restarted, in which case it can be safely ignored. But if it is logged continuously it means you really do have more than one node on the same network active concurrently with the same node id. This could occur if you have a backup node active at the same time as its live node. nodeID=220da24b-049c-11eb-8da6-0050569b585d
2020-10-02 16:35:25,350 WARN [org.apache.activemq.artemis.core.server] AMQ224078: The size of duplicate cache detection (<id_cache-size/>) appears to be too large 20,000. It should be no greater than the number of messages that can be squeezed into confirmation window buffer (<confirmation-window-size/>) 32,000.
As the name "shared-store" indicates, the live and the backup broker become a logical pair which can support high availability and fail-over because they share the same data store. Because they share the same data store there is no need for any kind of group-name configuration. Such an option would be confusing, redundant, and ultimately useless.
The JGroups configuration (and the cluster-connection more generally) exists because the two brokers need to exchange information with each other about their respective network locations so that the live broker can inform clients how to connect to the backup in case of a failure.
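So a shared-store pair needs no group-name at all. A minimal sketch of the `ha-policy` elements in each `broker.xml` (directory paths, the cluster-connection, and the JGroups discovery/broadcast groups are omitted here; both brokers must point their journal, bindings, paging, and large-message directories at the same shared file system):

```xml
<!-- live broker -->
<ha-policy>
  <shared-store>
    <master>
      <failover-on-shutdown>true</failover-on-shutdown>
    </master>
  </shared-store>
</ha-policy>

<!-- backup broker -->
<ha-policy>
  <shared-store>
    <slave>
      <failover-on-shutdown>true</failover-on-shutdown>
    </slave>
  </shared-store>
</ha-policy>
```

The backup acquires the live role by taking the file lock on the shared journal, which is why a healthy shared-store pair should never broadcast the same node ID concurrently.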
Regarding the WARN message about duplicate node IDs on the network: you might get that warning once, possibly twice, during failover or fail-back, but if you see it more than that then something is wrong. If you're using shared-store it indicates a problem with the locks on the shared file system. If you're using replication it indicates a potential misconfiguration or possibly a split-brain.

Solace application HA across regions

Currently, intra-region, we achieve HA (hot/hot) between applications by using exclusive queues to ensure one application is active and the rest are standby.
How do I achieve the same thing across regions when the appliances are linked via CSPF neighbor links? Since queues are local to an appliance, the approach above doesn't work.
That is not possible using your design of CSPF neighbors - they are meant for direct messages, not guaranteed messages.
Are you able to provide more details about your use case?
Solace can easily do active/standby across regions using Data Center Replication.
Solace can also allow consumers to consume messages from the endpoints on both the active and standby regions by Allowing Clients to Connect to Standby Sites. However this means two consumers will be active - one on the active and one on the standby site.

Should Apache Kafka and Apache Hadoop share the same ZooKeeper instance?

Is it possible to use same ZooKeeper instance for coordinating Apache Kafka and Apache Hadoop clusters? If yes, what would be the appropriate configuration of ZooKeeper?
Thanks!
Yes. As far as my understanding goes, ideally there should be a single ZooKeeper cluster, with dedicated machines, managing the coordination between the different applications in a distributed system. I'll share a few points here.
A ZooKeeper cluster consisting of several servers is typically called an ensemble, and it tracks and shares the state of your applications. For example, Kafka commits offset changes to it so that in case of failure it can identify where to start again. From the doc page:
Like the distributed processes it coordinates, ZooKeeper itself is intended to be replicated over a set of hosts (an ensemble). Whenever a change is made, it is not considered successful until it has been written to a quorum (at least half) of the servers in the ensemble.
Now imagine Kafka and Hadoop each have a dedicated cluster of 3 ZooKeeper servers. If a couple of nodes go down in either of the two clusters, that service fails (ZooKeeper works on simple majority voting, so a 3-node ensemble tolerates 1 node failure while keeping the service alive, but not 2). If instead there is one single cluster of 5 ZooKeeper servers managing both applications, two nodes can go down and the service is still available. Not only does this offer better reliability, it also reduces hardware expenses: instead of managing 6 servers you only have to take care of 5.
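The majority-voting arithmetic behind that comparison can be sketched as:

```python
def tolerated_failures(ensemble_size):
    """ZooKeeper stays available while a strict majority of servers is up,
    so an ensemble of n servers tolerates floor((n - 1) / 2) failures."""
    return (ensemble_size - 1) // 2

# Two dedicated 3-node ensembles: each tolerates only 1 failure.
print(tolerated_failures(3))  # -> 1
# One shared 5-node ensemble: tolerates 2 failures with one fewer server overall.
print(tolerated_failures(5))  # -> 2
```

When sharing an ensemble, the two systems can still be kept in separate namespaces: Kafka's `zookeeper.connect` accepts a chroot suffix (for example `zk1:2181,zk2:2181,zk3:2181/kafka`), so its znodes don't collide with Hadoop's.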

Does websphere MQ create duplicate queues in a clustered environment

If I want to set up Websphere MQ in a distributed environment (across 2 machines in a cluster), will the queues and topics(which I understand as physical storage spaces for messages) be created on both machines?
Or will the queues and topics be created on one machine only but the program (I guess it is called the websphere MQ broker) will be deployed on 2 machines and both the instances will access the same queue and topic.
The cluster concept in WebSphere MQ is different from traditional high availability (HA) clusters. In a traditional HA cluster, two systems access the same storage/data to provide the HA feature. Both systems can be configured to be active at any time and processing requests, or you can have an active/passive HA configuration.
A WebSphere MQ cluster works differently: two queue managers do not share the same storage/data, and each queue manager is unique. WebSphere MQ clustering is more suitable for workload balancing than for HA. You can have a queue with the same name in multiple queue managers in an MQ cluster, and when messages are put, the cluster will load-balance them across all instances of that queue. Note that the messages in each instance of a clustered queue are independent and not shared. If one of the queue managers in the cluster goes down, the messages on that queue manager become unavailable until it comes back.
Are you aiming for workload balance or HA? If your aim is HA, then you could look at the multi-instance queue manager feature of MQ or other HA solutions. If you are aiming for workload balance, then you can go for MQ clustering. You can also mix multi-instance queue managers and MQ clustering to achieve both HA and workload balance.
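For illustration, the workload-balancing setup might look like the following MQSC sketch (the queue, queue manager, and cluster names are made up for the example; it assumes both queue managers already joined the cluster):

```
* Run on QM1: define a local queue shared into cluster INVCLUS.
DEFINE QLOCAL(APP.REQUESTS) CLUSTER(INVCLUS) DEFBIND(NOTFIXED)

* Run on QM2: the same name, a second independent instance of the queue.
DEFINE QLOCAL(APP.REQUESTS) CLUSTER(INVCLUS) DEFBIND(NOTFIXED)
```

With `DEFBIND(NOTFIXED)`, puts to `APP.REQUESTS` can be spread across both instances; each instance still holds its own messages, as described above.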
No, MQ does not create duplicate queues in the cluster unless you create them manually.
Further, check whether your queue manager is a Partial repository or a Full repository for the cluster.
A partial repository will only contain information about its own objects whereas a full repository will have information about the objects of all queue managers in the cluster.
A cluster needs at least one full repository in it, and the partial repositories can use this full repository to access objects of other queue managers.
But the object information in a full repository is just a list. The actual physical object exists only on the queue manager where it was created.

Websphere MQ and High Availability

When I read about HA in WebSphere MQ, I always come to the point where the best practice is to create two Queue Managers handling the same queue and use the out-of-the-box load balancing, so that when one is down, the other takes over its job.
Well, this is great, but what about the messages in the queue that belong to the Queue Manager that went down? I mean, do these messages reside there (when the queue is persistent, of course) until the QM is up and running again?
Furthermore, is it possible to create common storage for these doubled Queue Managers? Then no message would wait for the QM to come up, and every message would be delivered in the proper order. Is this correct?
WebSphere MQ provides different capabilities for HA, depending on your requirements. WebSphere MQ clustering uses parallelism to distribute load across multiple instances of a queue. This provides availability of the service but not for in-flight messages.
Hardware clustering and Multi-Instance Queue Manager (MIQM) are both designs using multiple instances of a queue manager that see a single disk image of that queue manager's state. These provide availability of in-flight messages but the service is briefly unavailable while the cluster fails over.
Using these in combination it is possible to provide recovery of in-flight messages as well as availability of the service across multiple queue instances.
In hardware cluster model the disk is mounted to only one server and the cluster software monitors for failure and swaps the disk, IP address and possibly other resources to the secondary node. This requires a hardware cluster monitor such as PowerHA to manage the cluster.
The Multi-Instance QMgr is implemented entirely within WebSphere MQ and needs no other software. It works by having two running instances of the QMgr pointing to the same NFS 4 shared disk mount. Both instances compete for locks on the files. The first one to acquire a lock becomes the active QMgr. Because there is no hardware cluster monitor to perform IP address takeover this type of cluster will have multiple IP addresses. Any modern version of WMQ allows for this using multi-instance CONNAME where you can supply a comma-separated list of IP or DNS names. Client applications that previously used Client Channel Definition Tables (CCDT) to manage failover across multiple QMgrs will continue to work and CCDT continues to be supported in current versions of WMQ.
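A client channel definition for such a multi-instance queue manager might look like this MQSC sketch (the channel name, hostnames, and port are assumptions; the client tries each address in the list in turn until one accepts the connection):

```
* Client connection channel listing both instances of QM1.
DEFINE CHANNEL(APP.SVRCONN.CLNT) CHLTYPE(CLNTCONN) +
       CONNAME('hosta(1414),hostb(1414)') QMNAME(QM1)
```

Only the instance currently holding the file lock is listening, so whichever address answers is the active instance.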
Please see the Infocenter topic Using WebSphere MQ with high availability configurations for details of hardware cluster and MIQM support.
Client Channel Definition Table files are discussed in the Infocenter topic Client Channel Definition Table file.
