How to properly deal with Correlation IDs? - microservices

I have just been reading up on logging in a microservice architecture and there seems to be this concept of Correlation ID, which is basically passed along to each microservice. This Correlation ID can be useful for logging - you can trace requests across microservices just by searching the Correlation ID.
My question is, how is this usually implemented? Where is this Correlation ID generated? Do you generate some sort of UUID on the Frontend, and pass it along via the X-REQUEST-ID HTTP header?
My second question is: when you receive this Correlation ID in the server, how do you make it accessible to all the functions in the server?
Suppose your server had something like this:
requestHandler(httpRequest) {
    correlationId = httpRequest.header.get("X-REQUEST-ID")
    ...
    function2()
}
function2() {
    ...
    function3()
}
function3() {
    ...
    function4()
}
function4() {
    ...
}
Suppose I wanted to log something in function4() (assume that I want the log to include the Correlation ID as well), do I really have to pass the Correlation ID all the way down from requestHandler() to function4()? Or is there a better approach?
My first thought would be to have some sort of in-memory key-value DB where you can store the Correlation ID as a value, but what would be the key?

Yes, it is typically just a UUID. When the frontend sends a request to the API gateway or orchestration layer, the UUID is generated there and added to every subsequent call made to the microservices, so that those calls can be traced.
Whether the calls are synchronous or go asynchronously through a message broker, you should always embed the ID in the header. Each called service writes it to its log at the very start of handling, and later you can use the correlation ID to follow the call flow in your logging client.
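For the second part of the question (reaching the ID from function4() without threading it through every call), one common option is a request-scoped context variable that the logger reads automatically. The answer above does not prescribe this; what follows is only a minimal Python sketch using the standard contextvars and logging modules. The names correlation_id_var, CorrelationIdFilter and request_handler are made up for illustration, and the header is modeled as a plain dict.

```python
import contextvars
import io
import logging
import uuid

# Hypothetical name: holds the ID of the request currently being handled.
correlation_id_var = contextvars.ContextVar("correlation_id", default="-")


class CorrelationIdFilter(logging.Filter):
    """Copy the current correlation ID onto every log record."""

    def filter(self, record):
        record.correlation_id = correlation_id_var.get()
        return True


# Log to an in-memory stream so the sketch is self-contained.
log_stream = io.StringIO()
handler = logging.StreamHandler(log_stream)
handler.setFormatter(logging.Formatter("%(correlation_id)s %(message)s"))
handler.addFilter(CorrelationIdFilter())
logger = logging.getLogger("correlation_demo")
logger.handlers = [handler]
logger.setLevel(logging.INFO)
logger.propagate = False


def function4():
    # No correlation ID parameter needed; the filter adds it to the record.
    logger.info("inside function4")


def function3():
    function4()


def function2():
    function3()


def request_handler(headers):
    # Reuse the incoming ID if present, otherwise generate one (assumption).
    cid = headers.get("X-Request-ID") or str(uuid.uuid4())
    token = correlation_id_var.set(cid)
    try:
        function2()
    finally:
        # Restore the previous value so IDs never leak between requests.
        correlation_id_var.reset(token)
    return cid
```

Calling request_handler({"X-Request-ID": "abc-123"}) writes a log line prefixed with abc-123 even though function4() never received the ID as an argument. In thread-per-request servers a thread-local plays the same role; in Java the usual equivalents are ThreadLocal or the logging framework's MDC.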

Related

springboot endpoint async response

I am working on a Spring Boot REST API.
I have one endpoint which is in charge of doing several treatments.
I would like the result of each treatment to be returned to the caller as soon as that treatment is done,
so the caller does not have to wait for all the treatments to finish before getting a result.
@Async is not exactly what I want, because it runs the whole endpoint in a new thread and returns a result only when all the treatments are done.
I also tried a ThreadPoolExecutor with one thread per treatment, but I still cannot return each thread's result right away. I have to wait for all the threads to finish before returning the result.
So is there a way to handle each treatment separately and return its own result as soon as it is done?
Depending on your architecture you have a few options:
You could send the request as it is now; on the backend side you would generate a new ID for the request and send that ID back to the client. The client could then subscribe to a WebSocket and wait for the backend to push the result related to that ID.
You could use a messaging solution, which would be similar to the previous option but with Kafka, RabbitMQ, ...
You could implement a polling mechanism between the two sides: you would return an ID as in the previous options, and the client would periodically check the status of the request with that ID. When the status is completed, it could fetch the result from another endpoint using the ID.
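To make the polling option more concrete, here is a rough, framework-free sketch in Python (not Spring). Each treatment is submitted to a thread pool and gets its own ID; the client would call a status endpoint and then a result endpoint with that ID. JobStore and its method names are hypothetical, and the HTTP layer is left out entirely.

```python
import threading
import uuid
from concurrent.futures import ThreadPoolExecutor


class JobStore:
    """In-memory registry of running treatments, keyed by a generated ID."""

    def __init__(self, max_workers=4):
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        self._futures = {}
        self._lock = threading.Lock()

    def submit(self, treatment, *args):
        # Start the treatment in the background and hand back its ID at once.
        job_id = str(uuid.uuid4())
        future = self._executor.submit(treatment, *args)
        with self._lock:
            self._futures[job_id] = future
        return job_id

    def status(self, job_id):
        # What a "GET status" endpoint would return for this ID.
        with self._lock:
            future = self._futures[job_id]
        if not future.done():
            return "pending"
        return "failed" if future.exception() is not None else "completed"

    def result(self, job_id):
        # What a "GET result" endpoint would return; blocks if still running.
        with self._lock:
            future = self._futures[job_id]
        return future.result()
```

The endpoint would call submit() once per treatment and return the list of IDs immediately, so each result becomes available to the client as soon as its own treatment finishes. A real service would also need to expire old entries and persist them if it runs on more than one instance.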

How command id differs from infrastructure message id?

I am thinking about something that is loosely connected to CQRS. There is the Request-Reply pattern. With HTTP transport, for example, we put a Request-Id in the header, at least for tracking purposes; in my case, for monitoring between different microservices. If the incoming request contains it, then it is rewritten to the Correlation-Id header. As I see it, this is done on the transport (infrastructure) layer. The question is whether that Request-Id (sometimes named Message-Id) should be supplied by the business layer, for example directly from the command that we are executing, with some mechanism doing this auto-magically, such as ICommand requiring that an Id is present.
Or is it a totally different thing that exists only in the infrastructure (transport) layer? If so, how do you correlate the transport ID with the business command ID? Does at least one log/trace entry have to be written with both identifiers? Is there a pattern that I missed? Moreover, do you think the CorrelationId should be in the business command or not?
IMHO concepts such as correlation id, causation id, request id, message id, etc belong to the infrastructure layer as they are not part of the business rules.
However, I've added a metadata attribute to my Command and Event objects to save this kind of info which helps me to manage the correlation and causation relationship between commands and events.
By having this metadata attribute in the form of an associative array (hash map, dictionary or whatever key-value format), you leave your code open to persisting any tracking info you may need in the future without polluting your Application and Domain layers too much.
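As an illustration of the metadata idea, here is a small Python sketch. The key names (message_id, correlation_id, causation_id) and the helper functions are assumptions made for the example, not something the answer specifies.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class Message:
    """A command or event: business payload plus free-form tracking metadata."""

    payload: Dict[str, Any]
    metadata: Dict[str, str] = field(default_factory=dict)


def new_command(payload, correlation_id=None):
    # The infrastructure layer builds the metadata; the domain only sees payload.
    message_id = str(uuid.uuid4())
    return Message(payload, {
        "message_id": message_id,
        # Reuse the transport's ID if there is one, else start a new chain.
        "correlation_id": correlation_id or message_id,
    })


def follow_up(parent, payload):
    # A message caused by `parent` keeps its correlation ID and records
    # the parent's message ID as its causation ID.
    return Message(payload, {
        "message_id": str(uuid.uuid4()),
        "correlation_id": parent.metadata["correlation_id"],
        "causation_id": parent.metadata["message_id"],
    })
```

With this shape, the transport layer can copy the incoming Request-Id into correlation_id when it builds the command, and a single log entry that prints the metadata dictionary ties the transport ID and the command ID together.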

Maintaining state in distributed application

I am creating an asynchronous application using zeromq, celluloid. I need to maintain the states for different tasks that depend on some response. I can do it by sending data about the state in the response params. But is there any better way to do this?
Use uniquely identified Conditions to wait for replies.
The question is very abstract/vague, so I will do the best I can to give you a functional example. If I understand/guess correctly, this is what you need to do:
Generate a UUID with Celluloid::Internals::UUID.generate before the request is made.
Package the request data, and send the UUID with the data.
Create a Condition, associated with that UUID.
Send the request.
Process the request, keeping the UUID associated with it.
Return the response, sending the UUID back with the reply data.
Process the response on a generic ( non-stated ) level... then signal the Condition with the data returned, locating the Condition in a thread-safe Hash by UUID.
Receive the data you needed within the stated object, by blocking at wait in that stated object... rather than processing the response directly.
The stated object ought to have no awareness of 0MQ, which is by design stateless. It ought to communicate with a layer which makes requests, receives responses, and delivers each response to the object that asked for it, without that stated object losing its state, and without the state information being passed in the request or the response.
This strategy is used heavily by ECell, which is forthcoming. Hopefully you are already using the celluloid-zmq gem for evented 0MQ support.
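The steps above can be sketched outside Celluloid as well. The following is a rough Python analogue that uses threading.Event in place of Celluloid Conditions and a lock-protected dict in place of the thread-safe Hash; PendingReplies and its methods are hypothetical names, and the 0MQ transport is not shown.

```python
import threading
import uuid


class PendingReplies:
    """Maps request UUIDs to waiters so replies reach whoever asked."""

    def __init__(self):
        self._lock = threading.Lock()
        self._waiting = {}  # request_id -> (Event, one-slot list for the reply)

    def register(self):
        # Steps 1 and 3: generate the UUID and create the waiter for it.
        request_id = str(uuid.uuid4())
        with self._lock:
            self._waiting[request_id] = (threading.Event(), [])
        return request_id

    def signal(self, request_id, data):
        # Step 7: the generic reply handler looks up the waiter by UUID.
        with self._lock:
            entry = self._waiting.get(request_id)
        if entry is None:
            return False  # unknown or already-consumed request
        event, box = entry
        box.append(data)
        event.set()
        return True

    def wait(self, request_id, timeout=None):
        # Step 8: the stateful object blocks here instead of parsing replies.
        with self._lock:
            event, box = self._waiting[request_id]
        try:
            if not event.wait(timeout):
                raise TimeoutError(f"no reply for {request_id}")
            return box[0]
        finally:
            with self._lock:
                self._waiting.pop(request_id, None)
```

The caller registers, sends the request with the returned UUID attached, and then blocks in wait(); the transport layer only needs to call signal() with the UUID it finds in each reply.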

Tracking ajax request status in a Flux application

We're refactoring a large Backbone application to use Flux to help solve some tight coupling and event / data flow issues. However, we haven't yet figured out how to handle cases where we need to know the status of a specific ajax request.
When a controller component requests some data from a flux store, and that data has not yet been loaded, we trigger an ajax request to fetch the data. We dispatch one action when the request is initiated, and another on success or failure.
This is sufficient to load the correct data, and update the stores once the data has been loaded. But, we have some cases where we need to know whether a certain ajax request is pending or completed - sometimes just to display a spinner in one or more views, or sometimes to block other actions until the data is loaded.
Are there any patterns that people are using for this sort of behavior in flux/react apps? Here are a few approaches I've considered:
1. Have a 'request status' store that knows whether there is a pending, completed, or failed request of any type. This works well for simple cases like 'is there a pending request for workout data', but becomes complicated if we want to get more granular, e.g. 'is there a pending request for workout id 123'.
2. Have all of the stores track whether the relevant data requests are pending or not, and return that status data as part of the store api - i.e. WorkoutStore.getWorkout would return something like { status: 'pending', data: {} }. The problem with this approach is that it seems like this sort of state shouldn't be mixed in with the domain data as it's really a separate concern. Also, now every consumer of the workout store api needs to handle this 'response with status' instead of just the relevant domain data.
3. Ignore request status - either the data is there and the controller/view act on it, or the data isn't there and the controller/view don't act on it. Simpler, but probably not sufficient for our purposes.
The solutions to this problem vary quite a bit based on the needs of the application, and I can't say that I know of a one-size-fits-all solution.
Often, #3 is fine, and your React components simply decide whether to show a spinner based on whether a prop is null.
When you need better tracking of requests, you may need this tracking at the level of the request itself, or you might instead need this at the level of the data that is being updated. These are two different needs that require similar, but slightly different approaches. Both solutions use a client-side id to track the request, like you have described in #1.
If the component that calls the action creator needs to know the state of the request, you create a requestID and hang on to that in this.state. Later, the component will examine a collection of requests passed down through props to see if the requestID is present as a key. If so, it can read the request status there, and clear the state. A RequestStore sounds like a fine place to store and manage that state.
However, if you need to know the status of the request at the level of a particular record, one way to manage this is to have your records in the store hold on to both a clientID and a more canonical (server-side) id. This way you can create the clientID as part of an optimistic update, and when the response comes back from the server, you can clear the clientID.
Another solution that we've been using on a few projects at Facebook is to create an action queue as an adjunct to the store. The action queue is a second storage area. All of your getters draw from both the store itself and the data in the action queue. So your optimistic updates don't actually update the store until the response comes back from the server.
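To illustrate the requestID approach with a RequestStore, here is a very small language-neutral sketch written in Python rather than JavaScript. The action type names and the fetch_workout action creator are invented for the example, and the action creator calls the store directly where a real Flux app would go through the dispatcher.

```python
import uuid


class RequestStore:
    """Tracks the status of each request by a client-generated request ID."""

    _STATUS_BY_ACTION = {
        "REQUEST_STARTED": "pending",
        "REQUEST_SUCCEEDED": "completed",
        "REQUEST_FAILED": "failed",
    }

    def __init__(self):
        self._requests = {}

    def handle_action(self, action):
        request_id = action["request_id"]
        if action["type"] == "REQUEST_CLEARED":
            # The component has read the final status and no longer needs it.
            self._requests.pop(request_id, None)
        elif action["type"] in self._STATUS_BY_ACTION:
            self._requests[request_id] = self._STATUS_BY_ACTION[action["type"]]

    def get_status(self, request_id):
        return self._requests.get(request_id)


def fetch_workout(store, workout_id):
    """Hypothetical action creator: returns the ID for the caller to keep."""
    request_id = str(uuid.uuid4())
    store.handle_action({
        "type": "REQUEST_STARTED",
        "request_id": request_id,
        "workout_id": workout_id,
    })
    return request_id
```

The component keeps the returned request ID in its own state, shows a spinner while get_status() reports "pending", and sends REQUEST_CLEARED once it has reacted to the final status.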

Creating a JMS Correlation ID

It is generally discouraged to use the message id returned from the JMS provider as the correlation id with which a message is published onto a queue. How have people generated their correlation ids for a request/response architecture?
Clients can use a unique ID standard like UUID to generate a new ID.
Here is a good tutorial for you.
You can set the correlation ID on the message yourself and read it back using the following code.
message.setJMSCorrelationID(UUID.randomUUID().toString());
producer.send(message);
LOG.info("jms-client sent:" + message.getJMSCorrelationID());
Cheers.
Server-side correlation ID generation suffers from two problems though:
One-way protocols (like JMS) have no direct means of returning the correlation ID back to the client. Another channel could be used but that complicates things.
Unexpected issues can prevent the client from receiving the generated ID even though the request has been accepted and processed on the server. This is why client ID generation should be considered.
Client generated correlation IDs
Clients can use a unique ID standard like UUID to generate a new ID:
message.setJMSCorrelationID(UUID.randomUUID().toString());
Ref: http://blogs.mulesoft.com/dev/anypoint-platform-dev/total-traceability/
