I have 2 servers, server1 and server2.
server1 is the master and server2 is the slave.
Both are running in a clustered environment.
If two messages with the same group ID arrive simultaneously on node 1 and node 2, the nodes don't know which consumer each message should be sent to. The messages therefore end up being processed by different consumers, and sometimes the message that arrived first gets processed later, which is not desirable.
I would like to configure the system so that both nodes agree on which consumer each message should be routed to.
Solution I tried:
Configured server1 with a LOCAL grouping-handler and server2 with a REMOTE one.
Now whenever a message arrives, the LOCAL grouping-handler determines which node the consumer is on, and the message is routed accordingly.
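For reference, the grouping-handler setup described here goes in each broker's broker.xml; a minimal sketch (handler and address names are just examples):

```xml
<!-- server1 (broker.xml): the arbiter that decides which node owns each group -->
<grouping-handler name="my-grouping-handler">
   <type>LOCAL</type>
   <address>jms</address>
   <timeout>5000</timeout>
</grouping-handler>

<!-- server2 (broker.xml): defers grouping decisions to the LOCAL handler -->
<grouping-handler name="my-grouping-handler">
   <type>REMOTE</type>
   <address>jms</address>
   <timeout>5000</timeout>
</grouping-handler>
```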
This works as long as server1 is running fine. However, if server1 goes down, messages are no longer processed.
To fix this I added a backup server to the messaging-activemq subsystem of server1 (backing up server2), and similarly did the same for server2.
/profile=garima/subsystem=messaging-activemq/server=backup:add
I added the same cluster-connection, discovery-group, http-connector, and broadcast-group to this backup server, but this did not fix the failover behavior and messages were not processed on the other node.
Please suggest another approach, or how I can handle the scenario where the server with the LOCAL grouping-handler stops.
The recommended solution for clustered grouping is what you have configured - a backup for the node with the LOCAL grouping-handler. The bottom line here is if there isn't an active node in the cluster with a LOCAL grouping-handler then a decision about what consumer should handle which group simply can't be made. It sounds to me like your backup broker simply isn't working as expected (which is probably a subject for a different question).
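As a point of comparison, a backup in the WildFly messaging-activemq subsystem needs an ha-policy defined on both the live and the backup server resources, not just the backup resource itself. A rough CLI sketch for replication (profile name taken from the question; resource and attribute names may vary by WildFly version):

```
/profile=garima/subsystem=messaging-activemq/server=default/ha-policy=replication-master:add(check-for-live-server=true)
/profile=garima/subsystem=messaging-activemq/server=backup:add
/profile=garima/subsystem=messaging-activemq/server=backup/ha-policy=replication-slave:add
```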
Aside from having a backup you might consider eliminating the cluster altogether. Clusters are a way to improve overall message throughput using horizontal scaling. However, message grouping naturally serializes message consumption for each group which then decreases overall message throughput (perhaps severely depending on the use-case). It may be that you don't need the performance scalability of a cluster since you're grouping messages. Have you performed any benchmarking to determine your performance bottlenecks? If so, was clustering the proven solution to these bottlenecks?
We have this use case for implementing data synchronization between two environments (env for short): an active (and very busy) env and a fail-over env.
The two env have multiple servers with multiple Web Services (WS) on each server but only one Database (DB) per env. The idea is if any of the WS on any of the servers in the active env sends a message, only one WS on one of the servers in the fail-over env receives it and updates the DB accordingly.
We do not care much WHEN the message is delivered. What we care about is:
if RabbitMQ accepts the message, it must deliver it at some point
if RabbitMQ's status prevents it from delivering a message, it should reject it right away
in both cases above, there should be minimal performance impact to the WS
We think we can use a RabbitMQ broker with a quorum queue to make this possible (we did some initial experiments).
But we have a question regarding configuration: can we achieve this in synchronous mode without much performance penalty, or in async mode without running out of resources (threads, memory) waiting for task cancellation?
What would the configuration look like in each case?
The ActiveMQ Artemis documentation states that if high availability is configured with the replication HA policy, you can specify a group of live servers that a backup server can connect to. This is done by configuring group-name in the master and the slave elements of broker.xml. A backup server will only connect to a live server that shares the same node group name.
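A minimal sketch of the replication group-name configuration described in the documentation (the group name itself is just an example):

```xml
<!-- live broker's broker.xml -->
<ha-policy>
   <replication>
      <master>
         <group-name>group-a</group-name>
      </master>
   </replication>
</ha-policy>

<!-- backup broker's broker.xml: pairs only with a live broker in group-a -->
<ha-policy>
   <replication>
      <slave>
         <group-name>group-a</group-name>
      </slave>
   </replication>
</ha-policy>
```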
But in shared-store there is no such concept of a group-name, which confuses me. If I have to achieve high availability through shared-store with JGroups, how can that be done?
Also, when I tried doing it through the replication HA policy with a group-name, the cluster was formed and failover worked, but I got warnings saying:
2020-10-02 16:35:21,517 WARN [org.apache.activemq.artemis.core.client] AMQ212034: There are more than one servers on the network broadcasting the same node id. You will see this message exactly once (per node) if a node is restarted, in which case it can be safely ignored. But if it is logged continuously it means you really do have more than one node on the same network active concurrently with the same node id. This could occur if you have a backup node active at the same time as its live node. nodeID=220da24b-049c-11eb-8da6-0050569b585d
2020-10-02 16:35:21,517 WARN [org.apache.activemq.artemis.core.client] AMQ212034: There are more than one servers on the network broadcasting the same node id. You will see this message exactly once (per node) if a node is restarted, in which case it can be safely ignored. But if it is logged continuously it means you really do have more than one node on the same network active concurrently with the same node id. This could occur if you have a backup node active at the same time as its live node. nodeID=220da24b-049c-11eb-8da6-0050569b585d
2020-10-02 16:35:25,350 WARN [org.apache.activemq.artemis.core.server] AMQ224078: The size of duplicate cache detection (<id_cache-size/>) appears to be too large 20,000. It should be no greater than the number of messages that can be squeezed into confirmation window buffer (<confirmation-window-size/>) 32,000.
As the name "shared-store" indicates, the live and the backup broker become a logical pair which can support high availability and fail-over because they share the same data store. Because they share the same data store there is no need for any kind of group-name configuration. Such an option would be confusing, redundant, and ultimately useless.
The JGroups configuration (and the cluster-connection more generally) exists because the two brokers need to exchange information with each other about their respective network locations so that the live broker can inform clients how to connect to the backup in case of a failure.
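By contrast with replication, a shared-store pair is configured without any group-name; a minimal broker.xml sketch:

```xml
<!-- live broker -->
<ha-policy>
   <shared-store>
      <master>
         <failover-on-shutdown>true</failover-on-shutdown>
      </master>
   </shared-store>
</ha-policy>

<!-- backup broker: pairs with whichever live broker holds the same journal -->
<ha-policy>
   <shared-store>
      <slave>
         <failover-on-shutdown>true</failover-on-shutdown>
      </slave>
   </shared-store>
</ha-policy>
```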
Regarding the WARN message about duplicate node IDs on the network: you might get that message once, possibly twice, during failover or fail-back, but if you see it more than that then something is wrong. If you're using shared-store it indicates a problem with the locks on the shared file system. If you're using replication it indicates a potential misconfiguration or possibly split-brain.
We are running an in-house EAI system using ActiveMQ as the message broker with JDBC persistence.
We have a cold-standby failover setup in which each node has its own database schema (for several reasons).
Now, if the primary goes down and we want to start up the backup, we would like to transfer all undelivered messages at the database level from one node to the other.
A look at the table "ACTIVEMQ_MSGS" made us unsure whether we can do this without drawbacks or side effects:
There is a column "ID" without any DB sequence behind it. Can the backup broker handle this?
The column "MSGID_PROD" contains the host name of the primary server. Is there a problem if the message is processed by a broker with a different name?
There is a column "MSGID_SEQ" (which seems to be "1" all the time). What does this mean? Can we keep it?
Thanks and kind regards,
Michael
I would raise a big red flag about this idea. Yes, in theory you could well succeed with this, but you are not supposed to touch the JDBC data piece by piece.
ActiveMQ has a few different patterns for master/slave HA setups: either a shared store for both the master and the slave, or a replicated store (LevelDB + ZooKeeper).
Even a shared JDBC store could be replicated, but at the database level.
Ok, so you want a setup other than the official ones; fine. There is a way, but it does not involve raw SQL commands.
By "primary goes down", I assume the primary database is still alive to copy data from. Fine. Then have a spare installation of ActiveMQ ready (on a laptop, on the secondary server, or anywhere safe). Configure that instance to connect to the "primary database" and ship all messages over to the secondary node using a "network of brokers". From the "spare" broker, configure a network connection to the secondary broker and make sure you set the "staticBridge" option to true. That will make the "spare" broker hand over all unread messages to the secondary broker. Once the spare broker is done, it can be shut down and the secondary should have all messages. This way, you reuse the logic in whatever ActiveMQ version you have and need not worry about ID sequences and so forth.
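For illustration, the "spare" broker's activemq.xml would contain a static network connector pointing at the secondary broker, roughly like this (host name and the catch-all queue pattern are placeholders):

```xml
<networkConnectors>
  <!-- one-way static bridge: push everything to the secondary broker -->
  <networkConnector name="drain-to-secondary"
                    uri="static:(tcp://secondary-host:61616)"
                    staticBridge="true">
    <staticallyIncludedDestinations>
      <!-- ">" matches all queues -->
      <queue physicalName=">"/>
    </staticallyIncludedDestinations>
  </networkConnector>
</networkConnectors>
```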
I have a clustered WebLogic environment with 2 servers.
The source drops JMS messages in the queues of both the servers.
My service, however, is designed to consume these messages only at a particular time of day, when it is activated by a "trigger.txt" file; a file adapter picks up the file and activates the BPEL process to start consuming JMS messages.
However, the problem is that if the server 1 adapter picks up the trigger.txt file, then JMS messages from only the server 1 queue are consumed; messages on the other server are left untouched, and vice versa.
I want the messages on both servers to be consumed.
Is there any solution to this?
This isn't a WLS JMS issue.
So the solution will lie within your BPEL implementation and your solution of leaving the trigger.txt file behind.
I am assuming you are removing the trigger.txt once it's picked up by a BPEL instance.
You will have to change this logic, e.g. include a timestamp in the trigger file name, so that each BPEL instance can mark internally that it has already picked up a particular file and does not process it again.
Or create 2 files, one for each server, but this will be messy if you add an extra server later on.
The other option is for WLS to redirect the JMS messages to the server which has an active consumer; this would, however, affect your ability to process the JMS messages in parallel on both servers.
Hi, I created three RabbitMQ servers running in a cluster on EC2.
I want to scale out the RabbitMQ cluster based on CPU utilization, but when I publish messages only one server utilizes its CPU and the other RabbitMQ servers do not.
So how can I distribute the load across the RabbitMQ cluster?
RabbitMQ clusters are designed to improve scalability, but the system is not completely automatic.
When you declare a queue on a node in a cluster, the queue is only created on that one node. So, if you have one queue, regardless to which node you publish, the message will end up on the node where the queue resides.
To properly use RabbitMQ clusters, you need to make sure you do the following things:
have multiple queues distributed across the nodes, such that work is distributed somewhat evenly,
connect your clients to different nodes (otherwise, you might end up funneling all messages through one node), and
if you can, try to have publishers/consumers connect to the node which holds the queue they're using (in order to minimize message transfers within the cluster).
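As an illustration of the first two points, publishers and consumers can deterministically spread queues (and hence their home nodes) across the cluster. A minimal sketch, with the node and queue names invented for the example:

```python
import hashlib

def assign_node(queue_name: str, nodes: list[str]) -> str:
    """Map a queue to a cluster node with a stable hash, so every
    publisher and consumer computes the same queue-to-node placement."""
    digest = hashlib.sha256(queue_name.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:4], "big") % len(nodes)]

nodes = ["rabbit@node1", "rabbit@node2", "rabbit@node3"]
for queue in ["orders", "payments", "notifications"]:
    # Declare the queue on (and connect clients to) the node returned here.
    print(queue, "->", assign_node(queue, nodes))
```

A stable hash (rather than Python's built-in `hash()`, which is randomized per process) matters here, because independent clients must all agree on the placement.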
Alternatively, have a look at High Availability Queues. They're like normal queues, but the queue contents are mirrored across several nodes. So, in your case, you would publish to one node, RabbitMQ will mirror the publishes to the other node, and consumers will be able to connect to either node without worrying about bogging down the cluster with internal transfers.
That is not really true. Check out the documentation on that subject.
Messages published to the queue are replicated to all mirrors. Consumers are connected to the master regardless of which node they connect to, with mirrors dropping messages that have been acknowledged at the master. Queue mirroring therefore enhances availability, but does not distribute load across nodes (all participating nodes each do all the work).
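For completeness, classic queue mirroring as quoted above is enabled by a policy rather than at declaration time; for example (queue-name pattern is illustrative):

```
rabbitmqctl set_policy ha-all "^ha\." '{"ha-mode":"all"}'
```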