I plan to have persistent message queues based on some implementation of AMQP and the JMS API. I would like to know whether it is OK (from an architectural point of view) to have messages staying in the queues for hours. A day is the maximum.
I plan to use the message broker as another persistence layer basically. Is this viable?
The technologies that I am evaluating are ActiveMQ, RabbitMQ and Qpid.
The broker's persistence mechanism for message retention is usually file-based or JDBC; either one will work. Is it viable? Sure, it's a feature of the broker, and there is nothing wrong with using it for its intended purpose, assuming temporary message retention is your goal; 1 day is not a big deal.
But if you're planning to retain messages for a day or more, I recommend doing some calculations based on the average message size and the total number of messages per day that may end up sitting in a queue. The default limit on queue depth is usually low, something like 10 MB, and if it is exceeded the broker will probably drop subsequent messages; you want to prevent this from happening. Vendors handle this differently, so check the RabbitMQ and ActiveMQ documentation for specifics and for the configuration parameters that control depth. I know SonicMQ has what's known as the "DeadMessage" queue, a destination for expired or undeliverable messages; other products might have something similar.
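The calculation suggested above can be sketched roughly as follows. This is only a back-of-the-envelope helper: the class name, the method names and the example figures (2 KiB average size, 500,000 messages per day) are made up for illustration, and the 10 MB limit is just the example number from the paragraph above, not a documented default of any particular broker.

```java
// Back-of-the-envelope sizing for messages retained in a queue.
// All input figures used in main() are illustrative assumptions, not measurements.
class QueueSizing {
    /** Bytes needed to hold every message that arrives during the retention window. */
    static long requiredBytes(long avgMessageBytes, long messagesPerDay, double retentionDays) {
        return Math.round(avgMessageBytes * messagesPerDay * retentionDays);
    }

    /** True when the worst-case backlog fits under the configured store limit. */
    static boolean fits(long requiredBytes, long limitBytes) {
        return requiredBytes <= limitBytes;
    }

    public static void main(String[] args) {
        long needed = requiredBytes(2_048, 500_000, 1.0); // 2 KiB average, 500k messages/day, 1 day
        long limit = 10L * 1024 * 1024;                   // the 10 MB example limit from the text
        System.out.println("needed bytes: " + needed);
        System.out.println("fits in limit: " + fits(needed, limit));
    }
}
```

With these assumed numbers the backlog is about 1 GB, far above a 10 MB limit, which is the kind of mismatch worth finding before going to production. Payload size alone also understates the real footprint, since brokers add headers and index overhead per message.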
It's OK to have persistent queues, and it's OK if messages hang around in the queues: clients might be disconnected because of updates, network problems, etc. That's one benefit of queues: they decouple sender from receiver, and the queue is the buffer. However, these use cases are not the normal mode of operation; they are rather exceptional situations.
Using a messaging broker as "another persistence layer" is technically speaking possible, but in this case a database is probably more suitable, because quick message delivery/messaging and long term storage/database are different tools/scenarios. So ask yourself the question: Is it still messaging or is it already a database?
If in your use case the normal message delay (= period between sending and reception) is always beyond an hour, a database might be better, because JMS selectors are normally slower and less comfortable than database queries using where clauses.
There is another aspect: Consider the need for an online backup of your messages in a JMS provider, especially in a HA cluster mode. It might be easier to do this using a database.
I am using ActiveMQ and want to generate alerts for messages which have been sitting in the queue for a very long time. I looked at the "Advisory Message" feature but it has no such provision. It is very important for me to use a solution which does not add too much overhead on ActiveMQ.
Note: This requirement is very different from alerting when a message moves to the DLQ after expiry.
The only real means of reviewing what is in a queue is to browse it, and the broker will place limitations on how far into the contents of the queue you can browse.
A message broker is not a database and you should not try to treat it as such. If you have concerns about things remaining on a queue for too long, then explicit expiration is your most effective tool.
You can build your own tooling to track the advisories around message enqueue and dequeue, but you'd just end up needing to persist that information to make it effective, so going back and reevaluating why you need to do this, and what might be a better choice of architecture, might be appropriate.
If you insist on auditing the contents of the queues, then you'd want to look at the configuration for max browse page size to try to get further into the queue on a browse, but depending on depth this probably won't get you everything you want.
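If you do go the browsing route, the age check itself is simple. The sketch below assumes you have already collected the send timestamps of the browsed messages (for example from Message.getJMSTimestamp() while iterating a QueueBrowser); the class and method names are hypothetical, and it can only report on the part of the queue that the browse actually returned.

```java
import java.util.ArrayList;
import java.util.List;

class StaleMessageCheck {
    /**
     * Returns the indexes of messages that have been queued longer than maxAgeMillis.
     * enqueueTimesMillis holds one send timestamp per browsed message, in epoch milliseconds.
     */
    static List<Integer> staleIndexes(long[] enqueueTimesMillis, long nowMillis, long maxAgeMillis) {
        List<Integer> stale = new ArrayList<>();
        for (int i = 0; i < enqueueTimesMillis.length; i++) {
            if (nowMillis - enqueueTimesMillis[i] > maxAgeMillis) {
                stale.add(i);
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        long now = 10_000_000L;
        long[] browsed = {now - 7_200_000L, now - 60_000L, now - 3_600_001L};
        // Alert threshold of one hour: messages 0 and 2 are older than that.
        System.out.println(staleIndexes(browsed, now, 3_600_000L));
    }
}
```

Running the browse on a timer and alerting on a non-empty result keeps the extra load on the broker to one browse per interval, but it does not remove the browse depth limitation described above.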
I'm looking into a message queue solution where some messages need to be delivered without delay, and other messages need to be delivered at a specified time. The delay is anywhere from hours to a week or two. I have access to a JMS message Queue, but I'm questioning whether it is a good idea to put messages on the queue with long delays.
Is delaying messages a common practice?
Is using the QueueBrowser to peek at the messages and cherry-picking the messages at the right time a viable solution (assuming the message has the delivery date in it)?
Is there another solution, other than putting the messages in the database with a timestamp and periodically querying the database?
JMS 2.0 supports delayed delivery; see section 7.9 of the spec. You can call setDeliveryDelay on the JMSProducer with the number of milliseconds by which you want messages to be delayed. (Note that, confusing as it is, you cannot use the setJMSDeliveryTime method on the Message object.) In JMS 1.1, some JMS implementations support proprietary headers for the same effect.
It's quite a common practice, but it has a major drawback in practical use when the delay is long: there's no (standardized) way to access the delayed messages. The QueueBrowser doesn't return them until their time has come. If you need more control, you're better off polling a database.
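To make the polling alternative concrete, here is a minimal sketch of the "store with a delivery time, poll for what is due" pattern. It uses an in-memory PriorityQueue as a stand-in for the database table, and every name in it is hypothetical; in a real system pollDue would be a query such as "delivery time <= now" run periodically, with the results then sent to the queue for immediate delivery.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class DueMessagePoller {
    /** A stored message and the earliest time (epoch millis) it may be delivered. */
    static final class Scheduled {
        final String body;
        final long deliverAtMillis;
        Scheduled(String body, long deliverAtMillis) {
            this.body = body;
            this.deliverAtMillis = deliverAtMillis;
        }
    }

    // In-memory stand-in for a table ordered by delivery time.
    private final PriorityQueue<Scheduled> store =
            new PriorityQueue<>(Comparator.comparingLong((Scheduled s) -> s.deliverAtMillis));

    void schedule(String body, long deliverAtMillis) {
        store.add(new Scheduled(body, deliverAtMillis));
    }

    /** Removes and returns every message whose delivery time has been reached. */
    List<String> pollDue(long nowMillis) {
        List<String> due = new ArrayList<>();
        while (!store.isEmpty() && store.peek().deliverAtMillis <= nowMillis) {
            due.add(store.poll().body);
        }
        return due;
    }

    /** Messages still waiting; unlike a delayed JMS message, these stay inspectable. */
    int pendingCount() {
        return store.size();
    }
}
```

The point of the design is the last method: pending messages remain visible and can be cancelled or rescheduled, which is exactly what the QueueBrowser cannot offer for delayed messages.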
I am using WebSphere MQ 7, and I have two clients connected to the same QMgr and consuming messages from the same queue, with code like the following:
while (true) {
    // Wait up to one second for the next message; receive() returns null on timeout.
    TextMessage message = (TextMessage) consumer.receive(1000);
    if (message != null) {
        System.out.println("*********************" + message.getText());
    }
}
I found that only one client ever retrieves messages. Is there any way to load-balance message consumption across the two clients? Are there any configuration options on the MQ server side?
When managing queue handles, it is MUCH faster for WMQ to put them in a stack (LIFO) rather than a FIFO queue. So if messages arrive on the queue more slowly than they can be processed, it is possible that an instance will process a message and issue another GET, which WMQ pushes onto the top of the stack. The result is that only one instance will see messages in a low-volume use case.
In larger environments where there are many instances waiting on messages, it is possible that activity will round-robin amongst a portion of those instances while the other instances starve for messages. For example, with 10 GETters on the queue you may see three processing messages and 7 idle.
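The effect is easy to reproduce with a toy model. The sketch below is not WMQ's implementation, only an illustration of the idea under one assumption: each consumer finishes its message and re-issues its GET before the next message arrives. All names are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class WaiterDispatchDemo {
    /**
     * Delivers `messages` messages one at a time to `getters` waiting consumers and
     * returns how many each consumer received. Each consumer finishes its message and
     * re-issues its GET before the next message arrives (the low-volume case).
     */
    static int[] simulate(int getters, int messages, boolean lifo) {
        Deque<Integer> waiting = new ArrayDeque<>();
        for (int id = 0; id < getters; id++) {
            park(waiting, id, lifo);
        }
        int[] received = new int[getters];
        for (int m = 0; m < messages; m++) {
            int id = waiting.pollFirst(); // next waiter chosen by the transport
            received[id]++;
            park(waiting, id, lifo);      // consumer is done and waits again
        }
        return received;
    }

    // LIFO parks the waiter at the head (a stack); FIFO parks it at the tail.
    private static void park(Deque<Integer> waiting, int id, boolean lifo) {
        if (lifo) {
            waiting.addFirst(id);
        } else {
            waiting.addLast(id);
        }
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(simulate(2, 6, true)));
        System.out.println(java.util.Arrays.toString(simulate(2, 6, false)));
    }
}
```

With two getters and six messages, the stack version hands all six messages to the same getter, while the FIFO version gives three to each. Nothing is wrong in the first case; the second getter simply never reaches the top of the stack.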
Although this is considerably faster for MQ, it is confusing to customers who are not aware of how it works internally and so they open PMRs asking this exact question. IBM had to choose among several alternatives:
Adding several code paths to manage by stack for performance when fully loaded, versus manage by FIFO for apparent balancing when lightly loaded. This bloats the code, adds many new decision points to introduce errors and solves a problem that was one of perception rather than reliability or performance.
Educate the customers as to how it works. Of course, once you document it, then you can't change it. The way I found out about this was attending the "WMQ Internals" presentation at IMPACT. It's not in the Infocenter so IBM can change it, but it is available for customers.
Do nothing. Although this is the best result from the code design point of view, the behavior is counter-intuitive. Users need to understand why things do not behave as expected and will waste time trying to find the configuration that results in the desired behavior, or open a PMR.
I don't know for sure that it still works this way but I expect that it does. The way I used to test it was to put many messages on the queue at once and then see how they were distributed. If you drop about 50 messages on the queue in one unit of work, you should see a better distribution between the two instances.
How do you drop 50 messages on the queue at once? First generate them with the applications turned off, or generate them to a spare queue. If you generated them in the target queue, use the Q program to move them to the spare queue. Now start the apps and make sure the queue's IPPROCS count equals however many instances of the app you started. Using Q again, copy all of the messages to the original queue in a single unit of work. Since they all become available on the queue at once, your two app instances should both immediately be passed a message. If you used copy instead of move, you can repeat this as often as required.
Your client is not doing much, so one instance can probably handle the full load. Try implementing a more realistic workload, or, simpler yet, put a Thread.sleep in the client.
I am looking for a message queue with these requirements. I couldn't find one; the closest was maybe the rabbitmq-lvc plugin (but I need the first value in the line to stick and stay in front).
Would anyone know a technology to support these?
message queue is FIFO
if a duplicate message is being enqueued, the message queue itself either rejects or drops it.
For example, producers put these three messages (each with a discriminator value) into the queue in this sequence: M1(discriminator=7654), M2(discriminator=2435), M3(discriminator=7654).
Now I want the message queue to see that M3 has the same discriminator value as M1 and thus drop/reject M3. Consumers receive only: M1, M2.
Thanks
Tom
I don't know the other transports but I know that WebSphere MQ doesn't do this and I believe that the explanation why would apply broadly across the category. I'd be very surprised to find that any messaging transport actually provides this. Here are a few reasons why:
Async messages are supposed to be atomic. Different vendors make their own accommodations for message affinity (a relationship between two or more messages) but as a rule, message affinity is to be avoided. Your use case not only requires the transport to deal with message affinity, but to do so over an indeterminate interval between related messages.
Message payload is a blob. For performance reasons, WMQ doesn't touch message payloads except for things like compression or code page conversion. Anything that requires parsing the message payload is a job for WebSphere Message Broker, DataPower or WebSphere ESB. I would expect any messaging transport which claims to be performant would face similar issues because parsing payloads results in longer code paths and non-linear performance degradation. The exception is message properties but WMQ uses these for selection only and I expect that is generally the case.
Stateless operation. As a transport, the state of the application may be stored in a persistent message but the state of the transport layer should not depend on the state of the application across different units of work. Again, an ESB type of product is best suited when you want to delegate management of some of the application state to the messaging layer and especially when such management spans many units of work.
Assured delivery. WMQ was designed to never lose your persistent message. If the app explicitly sets expiry the message might go away because the sender said it was OK to do so. If the message is non-persistent it might go away, but only in an exceptional condition and, again, because the sender said it was OK to do so. The use case you describe might result in a message going away not because the sender said it was OK, or even because the recipient said it was OK but because of an interaction with some unrelated 3rd party who happened to beat you to the queue with a duplicate value. What if that first message has an invalid header or code page problem and gets rolled back? What if I as an attacker spew out garbage messages with all possible 4-digit values for discriminator?
As I said, I don't know the other messaging products, so there may be something out there which meets your requirement, and if so I'll be interested to read about it. However, in the event that nobody replies, this post may shed some light on the reasons why.
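If you end up implementing this in the application rather than expecting it from the transport, a minimal in-process sketch might look like the one below. It is not a feature of any broker named here, all names are hypothetical, and it assumes a duplicate should be rejected only while the earlier message with the same discriminator is still waiting in the queue. The concerns above (rollbacks, a third party beating you to the queue) apply to it just the same.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

class DedupFifoQueue {
    static final class Msg {
        final String name;
        final int discriminator;
        Msg(String name, int discriminator) {
            this.name = name;
            this.discriminator = discriminator;
        }
    }

    private final Queue<Msg> fifo = new ArrayDeque<>();
    private final Set<Integer> queuedDiscriminators = new HashSet<>();

    /** Enqueues the message unless one with the same discriminator is already waiting. */
    synchronized boolean offer(Msg msg) {
        if (!queuedDiscriminators.add(msg.discriminator)) {
            return false; // duplicate: rejected, the earlier message keeps its place
        }
        fifo.add(msg);
        return true;
    }

    /** Removes the oldest message, or returns null when the queue is empty. */
    synchronized Msg poll() {
        Msg head = fifo.poll();
        if (head != null) {
            queuedDiscriminators.remove(head.discriminator);
        }
        return head;
    }
}
```

Offering M1(7654), M2(2435) and M3(7654) in that order accepts the first two and rejects M3, so a consumer polling the queue sees only M1 and then M2, as in the example from the question.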
Concerning ActiveMQ: I have a scenario where I have one producer which sends small (around 10KB) files to the consumers. Although the files are small, the consumers need around 10 seconds to analyze them and return the result to the producer. I've researched a lot, but I still cannot find answers to the following questions:
How do I make the broker store the files (completely) in a queue?
Should I use ObjectMessage (because the files are small) or blob messages?
Because the consumers are slow at processing, should I lower their prefetchLimit or use a round-robin dispatch policy? Which one is better?
And finally, in the ActiveMQ FAQ, I read this - "If a consumer receives a message and does not acknowledge it before closing then the message will be redelivered to another consumer.". So my question here is, does ActiveMQ guarantee that only 1 consumer will process the message (and therefore there will be only 1 answer to the producer), or not? When does the consumer acknowledge a message (in the default, automatic acknowledge settings) - when receiving the message and storing it in a session, or when the onMessage handler finishes? And also, because the consumers are so slow in processing, should I change some "timeout limit" so the broker knows how much to wait before giving the work to another consumer (this is kind of related to my previous questions)?
Not sure about others, but here are some thoughts.
First: I am not sure what your exact concern is. ActiveMQ does store messages in a data store; all data need NOT reside in memory in any single place (either broker or client). So you should actually be good in that regard; earlier versions did require that all ids needed to fit in memory (not sure if that was resolved), but even that memory usage would be low enough unless you had tens of millions of in-queue messages.
As to ObjectMessage vs blob; raw byte array (blob) should be most compact representation, but since all of these get serialized for storage, it only affects memory usage on client. Pre-fetch mostly helps with access latency; but given that they are slow to process, you probably don't need any prefetching; so yes, either set it to 1 or 2 or disable altogether.
As to guarantees: the best that distributed message queues can guarantee is either at-least-once (with possible duplicates) or at-most-once (no duplicates, but messages can be lost). It is usually better to take at-least-once and have clients do the de-duping using client-provided ids. How acknowledgement is sent is defined by the JMS specification, so you can read more about JMS; this is not ActiveMQ specific.
And yes, you should set the timeout high enough that a worker can typically finish its work, including all network latencies. This can slow down re-transmission of dropped messages (if a worker dies), but it is probably not a problem for you.
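As a rough illustration of client-side de-duping under at-least-once delivery, the sketch below remembers the ids it has finished and skips redeliveries. The class and method names are made up, the id is assumed to be supplied by the producer, and the in-memory set is a simplification: a real consumer would need to persist and eventually expire the ids.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;

class IdempotentConsumer {
    private final Set<String> processedIds = new HashSet<>();
    private final Consumer<String> handler;

    IdempotentConsumer(Consumer<String> handler) {
        this.handler = handler;
    }

    /**
     * Handles a delivery once per message id. Returns true if the payload was processed,
     * false if it was a redelivery of an id that already completed.
     */
    synchronized boolean onDelivery(String messageId, String payload) {
        if (processedIds.contains(messageId)) {
            return false; // duplicate from at-least-once redelivery: skip the work
        }
        handler.accept(payload);      // may throw; the id is then left unrecorded
        processedIds.add(messageId);  // record only after the work has succeeded
        return true;
    }
}
```

Recording the id only after the handler returns means a failed attempt can be retried on redelivery, at the cost of possibly repeating work if the consumer dies between finishing the work and recording the id.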