It is generally discouraged to use the message id returned from the JMS provider as the correlation id with which a message is published onto a queue. How have people generated their correlation ids for a request/response architecture?
Clients can use a unique ID standard like UUID to generate a new ID.
You can set a client-generated correlation ID on the message using the following code:
message.setJMSCorrelationID(UUID.randomUUID().toString());
producer.send(message);
LOG.info("jms-client sent:" + message.getJMSCorrelationID());
Cheers.
Server-side correlation ID generation suffers from two problems though:
One-way protocols (like JMS) have no direct means of returning the
correlation ID back to the client. Another channel could be used but
that complicates things.
Unexpected issues can prevent the client from receiving the
generated ID even though the request has been accepted and
processed on the server. This is why client ID generation should
be considered.
Client-generated correlation IDs
Clients can use a unique ID standard such as UUID to generate a new ID:
message.setJMSCorrelationID(UUID.randomUUID().toString());
Ref: http://blogs.mulesoft.com/dev/anypoint-platform-dev/total-traceability/
Related
I have just been reading up on logging in a microservice architecture and there seems to be this concept of Correlation ID, which is basically passed along to each microservice. This Correlation ID can be useful for logging - you can trace requests across microservices just by searching the Correlation ID.
My question is, how is this usually implemented? Where is this Correlation ID generated? Do you generate some sort of UUID on the Frontend, and pass it along via the X-REQUEST-ID HTTP header?
My second question is: when you receive this Correlation ID in the server, how do you make it accessible to all the functions in the server?
Suppose your server had something like this:
requestHandler(httpRequest) {
    correlationId = httpRequest.header.get(X-REQUEST-ID)
    ...
    function2()
}

function2() {
    ...
    function3()
}

function3() {
    ...
    function4()
}

function4() {
    ...
}
Suppose I wanted to log something in function4() (assume that I want the log to include the Correlation ID as well), do I really have to pass the Correlation ID all the way down from requestHandler() to function4()? Or is there a better approach?
My first thought would be to have some sort of in-memory key-value DB where you can store the Correlation ID as a value, but what would be the key?
Yes, it is usually a UUID. When the frontend sends a request to the API gateway or orchestration layer, the UUID is generated there and added to every subsequent call made to the downstream microservices, so the whole request can be traced.
Whether you make synchronous calls or asynchronous calls via a message broker, you should always embed this ID in the header, and the called service should include it in its logging from the very beginning. Using this correlation ID you can later reconstruct the call flow in your logging client.
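As for making the ID visible in function4() without threading it through every call: one common JVM approach is to stash it in a ThreadLocal when the request arrives (SLF4J's MDC is built on exactly this idea), so deeper functions can read it without a parameter. A minimal sketch, assuming one thread handles the whole request (async hand-offs need the value copied across threads); the class and method names here are mine, not from any framework:

```java
// Minimal correlation-ID holder; SLF4J's MDC offers the same mechanism for logging.
public final class CorrelationContext {
    private static final ThreadLocal<String> ID = new ThreadLocal<>();

    public static void set(String correlationId) { ID.set(correlationId); }
    public static String get() { return ID.get(); }
    public static void clear() { ID.remove(); } // avoid leaks on pooled threads

    public static void main(String[] args) {
        // requestHandler: read the X-REQUEST-ID header (or generate one) and stash it
        CorrelationContext.set(java.util.UUID.randomUUID().toString());
        function4(); // deep in the call chain, no parameter threading needed
        CorrelationContext.clear();
    }

    static void function4() {
        System.out.println("[" + CorrelationContext.get() + "] doing work in function4");
    }
}
```

This also answers the key-value question: the "key" is the current thread, which is why clearing it at the end of the request matters on pooled threads.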
I am thinking about something loosely connected with CQRS: the Request-Reply pattern. With HTTP transport, for example, we put a Request-Id header on the request, at minimum for tracking purposes; in my case, for monitoring across different microservices. If an incoming request contains it, it is rewritten into a Correlation-Id header. As I understand it, this happens at the transport (infrastructure) layer. My question is whether that Request-Id (sometimes called Message-Id) should come from the business layer, for example directly from the command we are executing, with some mechanism doing this auto-magically, e.g. an ICommand interface requiring that an Id be present?
Or is it a completely separate thing that exists only in the infrastructure (transport) layer? If so, how do you correlate the transport ID with the business command ID? At least one log/trace entry has to carry both identifiers, right? Is there a pattern I've missed? And more generally, do you think a CorrelationId belongs in a business command or not?
IMHO concepts such as correlation id, causation id, request id, message id, etc belong to the infrastructure layer as they are not part of the business rules.
However, I've added a metadata attribute to my Command and Event objects to store this kind of information, which helps me manage the correlation and causation relationships between commands and events.
By keeping this metadata attribute in the form of an associative array (hash map, dictionary, or whatever key-value format), you leave your code open to persisting any tracking info you may need in the future without polluting your Application and Domain layers too much.
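A minimal sketch of that metadata idea (class and key names are mine, not from any framework): the envelope carries a free-form key-value map alongside the business payload, and an event caused by a command inherits the correlation ID while recording the command's own message ID as its causation ID:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical envelope: a business payload plus an infrastructure metadata map.
public class Command {
    final String name;
    final Map<String, String> metadata = new HashMap<>();

    Command(String name) { this.name = name; }

    // Metadata for an event caused by this command: same correlation ID,
    // and this command's message ID becomes the event's causation ID.
    static Map<String, String> eventMetadataFor(Command cause) {
        Map<String, String> m = new HashMap<>();
        m.put("correlation_id", cause.metadata.get("correlation_id"));
        m.put("causation_id", cause.metadata.get("message_id"));
        return m;
    }

    public static void main(String[] args) {
        Command cmd = new Command("PlaceOrder");
        String id = UUID.randomUUID().toString();
        cmd.metadata.put("message_id", id);
        cmd.metadata.put("correlation_id", id); // the first message starts the chain
        System.out.println(Command.eventMetadataFor(cmd));
    }
}
```

The business fields of Command stay untouched; everything tracking-related lives in the map, which is the point.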
We're moving from Mandrill to SparkPost. We figured that SparkPost's transmission is the closest thing to Mandrill's send-template message call.
Mandrill responded to those calls with a list of ids and statuses for each email. On the other hand SparkPost returns a single id and summary statistics (number of emails that were sent and number of emails that failed). Is there some way to get those ids and statuses out of the transmission response or at all?
You can get the message IDs for messages sent using the transmissions API in two ways:
Query the message events API, which allows you to filter by recipients, template IDs, campaign IDs, and other values
Use webhooks - messages are sent to your endpoint in batches, and each object in the batch contains the message ID
Which method you choose really depends on your use case. It's essentially poll (message events) vs. push (webhooks). There is no way to get the IDs when you send the transmission because they are sent asynchronously.
Querying the message events API, while a viable option, would needlessly complicate our simple solution. On the other hand we very much want to use webhooks, but not knowing which message they relate to would be troublesome...
The missing link was putting our own id in rcpt_meta. Most of the webhooks we care about do contain rcpt_meta, so we can substitute message_id with that.
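As I understand the API (field names worth verifying against the SparkPost docs), the per-recipient "metadata" object on a transmission is what comes back as rcpt_meta in webhook events. A sketch of building one recipient entry with our own ID embedded:

```java
import java.util.Map;

public class RcptMetaSketch {
    // Sketch of one recipient entry for a SparkPost transmissions API call.
    // The "metadata" object set here per recipient is echoed back as
    // rcpt_meta in webhook events, letting us match events to our own IDs.
    static Map<String, Object> recipient(String email, String ourMessageId) {
        return Map.of(
            "address", Map.of("email", email),
            "metadata", Map.of("our_message_id", ourMessageId));
    }

    public static void main(String[] args) {
        System.out.println(recipient("user@example.com", "order-42"));
    }
}
```

The webhook handler then reads rcpt_meta.our_message_id to correlate the event with the original send.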
I was stuck on this problem too.
Using the rcpt_meta solution would be perfect if substitution worked on rcpt_meta, but that is not the case. So when sending a campaign, I cannot specify all recipients inline and instead have to make a separate API call for every message, which is bad for, say, 10/100k recipients!
However, each transmission_id is then unique per SINGLE recipient, so I have the missing key and rcpt_meta is no longer needed. The key to use when receiving a webhook is the combination of transmission_id AND rcpt_to.
Problem
When my web application updates an item in the database, it sends a message containing the item ID via Camel onto an ActiveMQ queue, the consumer of which will get an external service (Solr) updated. The external service reads from the database independently.
What I want is that if the web application sends another message with the same item ID while the old one is still on queue, that the new message be dropped to avoid running the Solr update twice.
After the update request has been processed and the message with that item ID is off the queue, new request with the same ID should again be accepted.
Is there a way to make this work out of the box? I'm really tempted to drop ActiveMQ and simply implement the update request queue as a database table with a unique constraint, ordered by timestamp or a running insert id.
What I tried so far
I've read this and this page on Stack Overflow. These are the solutions mentioned there:
Idempotent consumers in Camel: Here I can specify an expression that defines what constitutes a duplicate, but that would also prevent all future attempts to send the same message, i.e. update the same item. I only want new update requests to be dropped while they are still on queue.
"ActiveMQ already does duplicate checks, look at auditDepth!": Well, this looks like a good start and definitely closest to what I want, but this determines equality based on the Message ID which I cannot set. So either I find a way to make ActiveMQ generate the Message ID for this queue in a certain way or I find a way to make the audit stuff look at my item ID field instead of the Message ID. (One comment in my second link even suggests using "a well defined property you set on the header", but fails to explain how.)
Write a custom plugin that redirects incoming messages to the deadletter queue if they match one that's already on the queue. This seems to be the most complete solution offered so far, but it feels so overkill for what I perceive as a fairly mundane and every-day task.
PS: I found another SO page that asks the same thing without an answer.
What you want is not message broker functionality. Repeat after me: "A message broker is not a database, a message broker is not a database." Repeat as necessary.
The broker's job is to get messages reliably from point A to point B. The client offers some filtering capability via message selectors, but this is minimal and mainly useful for ensuring that a single client receives only the specific messages it is interested in, rather than ones some other client is in charge of processing.
Your use case calls for a more stateful database centric solution as you've described. Creating a broker plugin to walk the Queue to check for a message is reinventing the wheel and prone to error if the Queue depth is large as ActiveMQ might not even page in all the messages for you based on memory constraints.
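If you do go the application-side route, the "drop while still in flight" rule boils down to a set of in-flight item IDs: add on enqueue, reject if already present, remove when the consumer finishes. A minimal in-process sketch of that idea (a real deployment would back this with a database table and unique constraint, as the question suggests, since producer and consumer usually live in separate processes):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class InFlightGuard {
    private final Set<String> inFlight = ConcurrentHashMap.newKeySet();

    // Returns true if the item ID was accepted for enqueueing, false if a
    // request for the same item is already on the queue (new one is dropped).
    public boolean tryEnqueue(String itemId) {
        return inFlight.add(itemId);
    }

    // Called by the consumer once the Solr update for this item has run.
    public void markProcessed(String itemId) {
        inFlight.remove(itemId);
    }

    public static void main(String[] args) {
        InFlightGuard guard = new InFlightGuard();
        System.out.println(guard.tryEnqueue("item-1"));  // true: accepted
        System.out.println(guard.tryEnqueue("item-1"));  // false: still in flight
        guard.markProcessed("item-1");
        System.out.println(guard.tryEnqueue("item-1"));  // true: accepted again
    }
}
```

With a database, tryEnqueue becomes an INSERT that either succeeds or hits the unique constraint, and markProcessed becomes a DELETE after the update completes.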
I have many instances of my client application. These clients send requests to a server application via messaging and receive a reply. Normally the reply would be sent using a temporary queue.
Unfortunately I have to use the Stomp protocol which has no concept of temporary queues or topics. (Although the message broker has)
What's the best way to ensure only the original requestor receives the reply? Are there any best-practices for this unfortunate situation?
The customary solution when several requestors listen for replies on the same queue is to use correlation IDs to select messages. On the client side it looks like this:
Place a message on the request queue and commit.
Retrieve the JMSMessageID from the outbound message (the value is determined by the broker and updates the message object as a result of the send).
Receive a message from the reply queue specifying the JMSMessageID from the outbound message as the correlation ID in the selector.
Process and commit.
On the server side it looks like this:
Receive a message under syncpoint.
Process the request and prepare the response.
Set the JMSCorrelationID on the response to the value of JMSMessageID from the request.
Send the message.
Commit.
The consumer would set the selector to something like activemq.selector:JMSCorrelationID='<message ID of the request>', using the JMSMessageID value saved from the outbound message.
Since the broker creates a message ID that is supposed to be globally unique, the pattern of using it as the correlation ID prevents collisions that are possible when each requestor is allowed to specify its own value.
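The selector itself is just a string built from the request's JMSMessageID; quoting matters because broker-generated IDs contain colons, and JMS selector syntax requires single quotes around string literals. A small sketch of the client-side construction (the same string works in Stomp's activemq.selector header; the example ID below is made up):

```java
public class SelectorBuilder {
    // Builds a JMS message selector that matches only the reply
    // whose JMSCorrelationID equals the given request message ID.
    static String replySelector(String requestMessageId) {
        return "JMSCorrelationID = '" + requestMessageId + "'";
    }

    public static void main(String[] args) {
        // A broker-assigned message ID typically looks something like this:
        System.out.println(replySelector("ID:broker-1234-5678-1:1:1:1"));
    }
}
```

The client calls this after send() returns (once the broker has assigned the JMSMessageID) and passes the result to its receive call on the reply queue.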
The best way to implement this pattern with JMS (that I've found, anyway) is to create a pre-configured topic for the response messages, and use correlation selectors on the response message so that the client can get the correct one.
In more detail, this means setting a random ID on the request message (using setJMSCorrelationID()), and putting that message on the request Queue. The consumer of that request message processes it, creates the response message, sets the same correlation ID on the response message, and puts it on the response Topic. The client, meanwhile, is listening on the response topic with a selector expression which specifies the correlation ID that it's expecting.
The danger is that the response message is sent before the client can get around to listening for it, although that's probably unlikely. You can try using a pre-configured Queue for the responses rather than a topic, but I've found that topics tend to work more reliably (my JMS provider of choice is HornetQ - your mileage may vary).
All this tells me that JMS is a very poor fit for the request/response model. The API just doesn't support it properly. This is hardly surprising, since that was never a use case for JMS.
Something like a compute grid (Terracotta, Gigaspaces, Infinispan, etc) would likely yield better results, but that's not really an option for you.