Why can actors in the Actor Model have multiple addresses?

Hello, this is a pretty straightforward question: I've seen and read that people write that an address can point to multiple actors. I'm wondering why. What would be the use case for that?
Let's say a router actor has one address; on receiving a message it can dispatch it to multiple actors, but each of its children might still have only one address.
Thank you

First, let's analyze your example: indeed, a message payload can be delivered to multiple actors as you have described (i.e. via a router), as well as via the pub-sub pattern.
I think pub-sub is better for convenience, as there is no need for an additional actor (the router) in between. Another reason is looser coupling in the pub-sub case: to subscribe and to publish a message, you only need to know the address. In the router case, the downstream actors also have to know at least the router's address and a "subscription protocol" (e.g. downstream.send<subscribe_me, message_type>(router_address, downstream.address)), or they have to know the router class instance and, in the worst case, be children of the router (or be owned by it), e.g. router.subscribe(downstream_actor).
The last reason, which might matter depending on the implementation, is that it is not the same message that gets delivered: in the pub-sub model the original message is delivered to multiple actors, whereas in the router case multiple clones of the original message are delivered.
Second, multiple patterns built on actor multi-addressing are possible. Here are a few examples:
The actor's behavior might differ depending on which of its addresses a message of the same type is delivered to. E.g. if an actor owns some finite resource, say memory, and is about to run out of it, it might answer with rejections on its "main" address and still provide the resource on its "critical" address. Without multi-addressing, you have to either share the resource between 2 different actors (discouraged) or have 2 different types of messages (better).
In rotor (disclaimer: I'm the author), one actor can subscribe to another service actor's address for silent side effects, e.g. for auditing or logging the messages that arrive for the service actor.
Message supervising can be implemented thanks to multi-addressing. E.g. in rotor the request-response pattern is implemented the following way: the request is routed via the supervisor, which spawns a timer and a new address (X) and sends a copy of the payload to the original destination; when the response arrives at (X), it is simply forwarded to the requesting actor and the timer is cancelled; however, if the timer triggers first, the supervisor creates a new message with an error code (timeout) and delivers it back to the requesting actor.
Something like NAT is possible: when 2 nodes are connected and there is a need to deliver a message from actorA on node1 to actorB on node2, the message can actually be delivered to the unique address of a NAT actor on node1 that represents actorB, serialized in a special way and sent over the network. The reverse procedure happens on node2. In that case the NAT actor will have multiple addresses (like ports in real routers) and will still be transparent to actorA and actorB.
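To make the first example concrete, here is a minimal Python sketch of one actor that owns a finite resource and exposes two addresses. It is not rotor's API: `Address`, `MemoryActor`, `reserve` and `on_allocate` are hypothetical names invented for illustration, and message delivery is reduced to a plain method call.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Address:
    """Opaque handle that messages are sent to (hypothetical, not rotor's type)."""
    name: str


@dataclass
class MemoryActor:
    """One actor, one finite resource, two addresses with different policies."""
    capacity: int
    reserve: int  # units held back for requests arriving on the critical address
    used: int = 0
    main: Address = field(default_factory=lambda: Address("main"))
    critical: Address = field(default_factory=lambda: Address("critical"))

    def on_allocate(self, address: Address, size: int) -> str:
        # Same message type, but the limit depends on the address it arrived at.
        if address == self.critical:
            limit = self.capacity
        else:
            limit = self.capacity - self.reserve
        if self.used + size > limit:
            return "rejected"
        self.used += size
        return "granted"
```

With `MemoryActor(capacity=10, reserve=2)`, requests on the main address start being rejected once 8 units are in use, while the critical address can still hand out the last 2. The resource stays inside a single actor and only one message type is needed.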

Related

How does a microservice return data to the caller when using a message broker? or a message queue?

I am pretty new to microservices, and I am trying to figure out how to set up a microservice architecture in which my publisher, which emits an event, can receive a response with data from the consumer.
From what I have read about message brokers and message queues, it seems like it's one-way communication. The producer emits an event (or rather, sends a message) which is handled by the message broker, and then the consumer consumes that event and performs some action.
This allows for decoupled code, which is part of what I'm looking for, but I don't understand whether the consumer is able to return any data to the caller.
Say for example I have a microservice that communicates with an external API to fetch data. I want to be able to send a message or emit an event from my front-facing server, which then calls the service that fetches data, parses the data, and then returns that data back to my server1 (front-facing server).
Is there a way to make message brokers or queues bidirectional? Or are they only usable in one direction? I keep reading that message brokers allow services to communicate with each other, but I only find examples in which data flows one way.
Even reading the RabbitMQ documentation hasn't really made it clear to me how I could do this.
In general, when talking about messaging, it's one-way.
When you send a letter to someone you're not opening up a mind-meld so that they telepathically communicate their response to you.
Instead, you include a return address (or some other means of contacting you).
So to map a request-response interaction when communicating with explicit messaging (e.g. via a message queue), the solution is the same: you include some directions which the recipient can/will interpret as "send a response here". That could, for instance, be "publish a message on this queue with this correlation ID".
Your publisher then, after sending this message, subscribes to the queue it's designated and waits for a message with the expected correlation ID.
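As a rough illustration of the return-address idea, here is a sketch that uses Python's in-process `queue.Queue` objects in place of a real broker. The names `request_queue`, `reply_queue`, `consumer` and `call` are made up for this example, and the message is a plain dict; a real broker client would carry the reply queue name and the correlation ID as message properties instead.

```python
import queue
import threading
import uuid

request_queue = queue.Queue()  # stands in for the broker queue the consumer reads
reply_queue = queue.Queue()    # the "return address" named in each request


def consumer() -> None:
    """Handle one request and publish the result where the request says to."""
    msg = request_queue.get()
    result = msg["payload"].upper()  # placeholder for the real work
    msg["reply_to"].put({"correlation_id": msg["correlation_id"], "payload": result})


def call(payload: str, timeout: float = 2.0) -> str:
    """Send a request, then wait for the reply carrying our correlation ID."""
    correlation_id = str(uuid.uuid4())
    request_queue.put(
        {"correlation_id": correlation_id, "reply_to": reply_queue, "payload": payload}
    )
    while True:
        reply = reply_queue.get(timeout=timeout)  # raises queue.Empty on timeout
        if reply["correlation_id"] == correlation_id:
            return reply["payload"]
        # Not ours: a real client would hand it to whoever is waiting for it.
```

Running `consumer` in a thread and calling `call("fetch data")` returns the consumer's result. Note how much of the plumbing (IDs, timeouts, stray replies) is the caller's responsibility.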
Needless to say, this is fairly elaborate: you are, in some sense, reimplementing a decent portion of a session protocol like TCP on top of a datagram protocol like IP (albeit in this case, we may have some stronger reliability guarantees than we'd get from IP). It's worth noting that this sort of request-response interaction intrinsically couples the two parties (we can't really say "sender and receiver": each is the other's audience), so we're basically putting in some effort to decouple the two sides and then some more effort to recouple them.
With that in mind, if the actual business use case calls for a request-response interaction like this, consider implementing it with an actual request-response protocol (e.g. REST over HTTP or gRPC...) and accept that you have this coupling.
Alternatively, if you really want to pursue loose coupling, go for broke and embrace the asynchronicity at the heart of the universe (maybe that way lies true enlightenment?). Have your publisher return success with that correlation ID as soon as it's sent its message. Meanwhile, have a different service track the state of those correlation IDs and expose a query interface (CQRS, hooray!). Your client can then check at any time whether the thing it wanted succeeded, even if its connection to your publisher gets interrupted.
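A very small sketch of that shape, with everything collapsed into one Python process: the dict `status_by_id` stands in for the tracking service's read model, and `submit`, `on_completed` and `query` are hypothetical names. In a real system these would be three separate services connected by the broker.

```python
import uuid

status_by_id = {}  # read model owned by the tracking service (in-memory stand-in)


def submit(payload: str) -> str:
    """Publisher: note the request as pending and return its correlation ID at once."""
    correlation_id = str(uuid.uuid4())
    status_by_id[correlation_id] = {"state": "pending", "result": None}
    # A real publisher would now send {correlation_id, payload} to the broker.
    return correlation_id


def on_completed(correlation_id: str, result: str) -> None:
    """Tracking service: update the read model when a completion event arrives."""
    status_by_id[correlation_id] = {"state": "done", "result": result}


def query(correlation_id: str) -> dict:
    """Query interface the client can poll at any time."""
    return status_by_id.get(correlation_id, {"state": "unknown", "result": None})
```

The client only ever holds the correlation ID, so it can reconnect later and ask again without the publisher keeping any session open for it.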
Queues are the wrong level of abstraction for request-reply. You can build an application out of them, but it would be nontrivial to support and operate.
The solution is to use an orchestration system like temporal.io or AWS Step Functions. These services out of the box provide state management, asynchronous communication, and automatic recovery in case of various types of failures.

What is the Purpose of the DestinationAddress field in the MassTransit Envelope?

When sending a message, MassTransit wraps that payload with an envelope which has a field called destinationAddress. What purpose does this field have?
I found this because I have a number of C# microservices communicating with some node and java based services - so I've been using the minimum payload defined here:
http://masstransit-project.com/MassTransit/advanced/interoperability.html
I've had no problem integrating the two services together I was just wondering what the point was of having the destinationAddress as part of the message itself? Is it just a belts and braces kind of thing to make sure messages don't go on the wrong queue by mistake?
I would have thought that all of this information can be derived since it is literally just built up of a) the message bus host and b) the queue name used when actually sending the message?
Transports have a variety of ways of delivering messages. For instance, publishing a message to a topic would set the destination address to the URI of the topic, but the message may be delivered to a queue (via a subscription, forwarded by the transport) with a different address. In this case, the envelope has the original destinationAddress, whereas the queue would have a different address.
There are also cases where messages may be scheduled, redelivered, faulted, etc., and having that information helps in troubleshooting production systems in cases where the original destination may not be known otherwise.
So, yeah, in the simplest case it seems superfluous, however, it comes in useful down the road when trying to figure out why something doesn't work.
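The topic case can be pictured with a toy model. This is not MassTransit code: `bindings`, `queues` and `publish` are invented for the sketch, and the URIs in the usage note are placeholders. It only shows why the address inside the envelope can differ from the address of the queue the message is finally read from.

```python
from collections import defaultdict

bindings = defaultdict(list)  # topic address -> queue addresses subscribed to it
queues = defaultdict(list)    # queue address -> envelopes delivered there


def publish(topic_address: str, message: dict) -> None:
    """Wrap the payload once, then let the 'transport' forward it to bound queues."""
    envelope = {"destinationAddress": topic_address, "message": message}
    for queue_address in bindings[topic_address]:
        queues[queue_address].append(envelope)
```

After binding a queue such as `rabbitmq://host/billing` to a topic such as `rabbitmq://host/order-submitted` and publishing to the topic, the envelope found in the billing queue still carries the topic's address, which is the piece of information you would otherwise lose.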

In event-driven architecture, is it ok to have all services send their event to a component that forwards it to the proper service?

Let's say I want to set up an event-driven architecture with services A-D where the events propagate as follows
  A
 / \
B   C
   /
  D
In other words,
(1) A publishes an event
(2) Subscribers B and C receive A's event
(3) C publishes an event
(4) Subscriber D receives C's event
One way is to have services B and C directly listen to a queue into which A posts messages. But the issue I see with this is maintenance. Once the system becomes complicated with 1000s of subscriptions, it becomes difficult to have any visibility into how the updates are propagating.
A solution I propose to this problem is to have another service X that knows the tree in the first diagram and is responsible for directing the propagation of events according to that tree. Every service publishes its event to X, and X publishes the event to the listening services. So it's kind of a middleman, like
  A
  |
  X
 / \
B   C
    |
    X
    |
    D
This also makes it easier to track the event propagation.
Are there any downsides to this (other than extra cost associating with twice as much message transferring)?
You’re thinking of events like they are implemented in a Winforms UI where the publisher sends the event directly to the subscriber. That’s not how events work in an EDA architecture. The word “event” has taken on a whole new meaning.
Before we start, you’re jumbling together the ideas of a message and an event when they really need to be kept separate. A message is a request for some action to happen, while an event is notification that something has already happened. The important distinction for this discussion is that a message publisher assumes 1 or more other processes will receive and process the message. If the message is not processed by something, downstream errors will occur. An event has no such assumption and can go unread without adversely affecting anything. Another difference is that once messages are processed they are typically thrown away, whereas events are kept for an extended period (days, or weeks).
With that in mind, the ‘X’ service you talk about already exists (please don’t build one) and is integral to the process – it’s called the bus. There are 2 types of bus; a message bus (think RabbitMQ, MSMQ, ZeroMQ, etc) or event bus (Kafka, Kinesis, or Azure Event Hub). In either case, a publisher puts a message on to the bus and subscribers get it from the bus. You may implement the bus servers as multiple physical buses, but when imagining it think of them all being the same logical bus.
The key point that’s tripping you up, and it’s a subtle difference, is thinking that the message bus has business logic indicating where messages go. The business logic of who gets what message is determined by the subscribers – the message bus is just a holding place for the messages to wait for pickup.
In your example, A publishes an event to the bus with a message type of “MT1”. B and C both tell the bus that they are interested in events of type “MT1”. When the bus receives the request from B and C to be notified of “MT1” messages, the bus creates a queue for B and a queue for C. When A publishes the message, the bus puts a copy in the “B-MT1” queue and a copy in the “C-MT1” queue. Note that the bus doesn’t know why B and C want to receive those messages, only that they’ve subscribed.
These messages sit there until processed by their respective subscribers (the processes can poll or the bus can push the messages, but the key idea is that the messages are held until processed). Once processed, the messages are thrown away.
For C to communicate with D, D will subscribe to messages of type “MT2” and C will publish them to the bus.
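Here is a toy in-memory version of that description, just to show how little the bus needs to know. `Bus`, `subscribe`, `publish` and `receive` are names made up for this sketch, not the API of RabbitMQ or any other product, and the queue naming simply mirrors the "B-MT1" convention used above.

```python
from collections import defaultdict, deque


class Bus:
    """Toy in-memory bus: keeps one queue per (subscriber, message type)."""

    def __init__(self) -> None:
        self.subscribers = defaultdict(list)  # message type -> subscriber names
        self.queues = defaultdict(deque)      # "B-MT1" style name -> waiting messages

    def subscribe(self, subscriber: str, message_type: str) -> None:
        self.subscribers[message_type].append(subscriber)

    def publish(self, message_type: str, body: str) -> None:
        # The bus knows only who subscribed, not why.
        for subscriber in self.subscribers[message_type]:
            self.queues[f"{subscriber}-{message_type}"].append(body)

    def receive(self, subscriber: str, message_type: str):
        """Hand over the oldest waiting message, or None; it is gone once taken."""
        waiting = self.queues[f"{subscriber}-{message_type}"]
        return waiting.popleft() if waiting else None
```

B and C subscribe to "MT1", D subscribes to "MT2", and neither A nor C needs to know who is listening. The routing "tree" from the question exists only as the sum of the subscriptions.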
Constantin’s answer above has a point that this is a single point of failure, but it can be managed with standard network architecture like failover servers, local message persistence, message acknowledgements, etc.
One of your concerns is that with 1000’s of subscriptions it becomes difficult to follow the path, and you’re right. This is an inherent downside of EDA and there’s nothing you can do about it. Eventual consistency is also something the business is going to complain about, but it’s part of the beast and is actually a good thing from a technical perspective because it enables more scalability. The biggest problem I’ve found using the term Eventual Consistency is that the business thinks it means hours or days, not seconds.
BTW, this whole discussion assumes the message publishers and subscribers are different apps. All the same ideas can be applied within the same address space, just with a different bus. If you’re a .NET shop, look at MediatR. For other tech stacks, there are similar solutions that I’m sure Google knows about.
If your main concern is visibility into the propagation of events (which is a very valid concern for debugging and long-term application maintenance of a distributed system), you can use a correlation identifier to trace the generation of messages from the initial event through the entire chain. You don't need to build another layer of orchestration -- let your messaging platform handle that for you.
Most messaging platforms/libraries have the concept built in: e.g., NServiceBus defines a ConversationId field in the message headers, and AMQP defines a correlation-id field in the basic messaging model.
Your system should have some kind of logging that allows you to audit messages -- the correlation ID will allow you to group all messages that result from a single command/request to make debugging distributed logic much simpler.
If you set a GUID in the client requests, you can even correlate actions in the UI to the backend API, right through all the events recursively generated.
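A bare-bones sketch of the idea in Python, assuming a plain list as the audit log and a dict as the message: `emit`, `trace` and the field names are illustrative only and do not correspond to NServiceBus or AMQP structures.

```python
import uuid

audit_log = []  # stands in for whatever message auditing the system already has


def emit(source, event_type, correlation_id=None):
    """Log an event; reuse the caller's correlation ID or start a new chain."""
    message = {
        "id": str(uuid.uuid4()),
        "source": source,
        "type": event_type,
        "correlation_id": correlation_id or str(uuid.uuid4()),
    }
    audit_log.append(message)
    return message


def trace(correlation_id):
    """Everything caused by one original request, in the order it was logged."""
    return [
        (m["source"], m["type"])
        for m in audit_log
        if m["correlation_id"] == correlation_id
    ]
```

The only rule each service has to follow is to copy the incoming correlation ID onto every message it publishes in response. With that in place, `trace` reconstructs the A to C to D path after the fact, without a central service deciding the routing.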
It is OK, but the microservices shouldn't care how they get the messages in the first place. From their point of view the input messages just arrive. You will then be tempted to design your system to depend on some global order of events, which is hard in a distributed scalable system. Resist that temptation and design your system to rely only on local ordering of events (i.e. the ordering in an event stream emitted by an Aggregate in Event Sourcing + DDD).
One downside that I see is that availability and scalability may be hurt. You will have a single point of failure for the entire system: if it fails, everything fails. And when it needs to be scaled up, you will again have problems, because you will then have a distributed messaging system.

Is there a way to reverse the bind on zmq pub/sub?

I have server code on one box that needs to listen in on status coming from another box with about 10 chips with Linux embedded in them. The 10 chips have their own IP addresses, and each will send basically health status to the server, which could (possibly) do something with it.
I would like the server just to passively listen and not have to send a response. So this looks like a job for zmq's pub/sub, where each of the 10 chips has its own publication and the server subscribes to each.
However, the server would need to know the well-known address that each chip bound its publication to. But in the field, these chips can be swapped or replaced with ones that have a different IP address.
Instead, it's safer to have the chips know the server code's IP address.
What I would like is a pub/sub where the receiver is the well-known address. Or a request/response pattern where the clients (the chips) send messages to the server (the requests), but neither the server nor the chips need to send/receive a response.
Now, currently, there are two servers on the separate box. So, if possible I'd like a solution for one server and multiple servers.
Is this possible in zmq? And what pattern would that be?
thanks.
Yes, you can do this exactly the way you'd expect to do so. Just bind on your subscriber, then connect to that subscriber with your publishers. ZMQ doesn't designate which end should be the "server", or more reliable end, and which should be the "client", or more transient end, specifically for this reason, and this is an excellent reason to switch up the normal paradigm.
Edit to address the new clarification--
It should work fine with multiple servers. In general it would work like the following (the order of operations in this case is just to ensure no messages get lost, which is possible if the PUB socket starts sending messages before the SUB is ready):
(1) Spin up server 1. Create SUB socket and bind on address:port.
(2) Spin up server 2. Create SUB socket and bind on address:port.
(3) Spin up a chip. That chip will create a PUB socket and connect on [server 1] address:port and connect on [server 2] address:port.
(4) Repeat step (3) for the other nine chips.
Dual .SUB model
Oh yes, each .PUB-lishing entity may have numerous .SUB-s listening,
so having two <serverNode>-s meets the .PUB/.SUB-primitive Formal Communication Pattern ( one speaks - many listen )
As given above, each of your <serverNode> binds
.bind( aFixServer{A|B}_ipAddress_portNumber )
so as to allow each .PUB-lishing <chipNode> to
.connect( anAprioriKnownServer{A|B}_bindingNode_ipAddress_portNumber )
And both <serverNode{A|B}> then .SUB-s to receive any messages from them.
Multi-Server model
As seen above, the {A|B} grammar is freely extensible to {A|B|C|D|...} so the principal messaging model will stand for any reasonable multi-server extension
Q.E.D.

Can AMQP clients be both a publisher and subscriber?

I'm just starting to research AMQP and I'm wondering if I'd be using it for something it's not designed for. Here's something like what I want to do:
ClientA goes about its business and publishes its state to some exchange (correct me if I use the wrong terms anywhere).
ClientB connects to the same broker and says "What publishers are publishing here? I choose you, ClientA. What is going on?"
ClientA says "My foo is bar and my baz is true."
ClientB says "OK. Set your baz to false."
Edit for a less abstract example:
ClientA talks/listens to a hardware device, say a video projector. When ClientB comes online, it wants to find any projector clients (like ClientA) that are connected, and then to know the status of the projectors (is the lamp on?) and also change the status if it needs to (turn the lamp off). So ClientA is keeping some state (lamp is off) and can send it out when requested, and can also respond to commands from the exchange and convert and pass them to the projector (turn lamp on).
I'm finding it hard to follow your example, but it sounds like you want these A and B types to have back-and-forth conversations with each other. Is that correct?
AMQP is better suited for asynchronous message passing, and to add the kind of point-to-point style you're describing requires that you set up request and reply queues so that clients can both send and receive messages. It's certainly possible to have clients both publish and consume messages.
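To show what "both publish and consume" looks like for the projector case, here is a sketch with Python's standard `queue.Queue` standing in for the two broker queues. It uses no AMQP library; `commands`, `status`, `ProjectorClient` and the command strings are assumptions made for the example.

```python
import queue

commands = queue.Queue()  # stands in for the queue ClientA consumes from
status = queue.Queue()    # stands in for the queue ClientB consumes from


class ProjectorClient:
    """ClientA: keeps the lamp state, consumes commands, publishes status."""

    def __init__(self) -> None:
        self.lamp_on = True

    def handle_next(self) -> None:
        command = commands.get_nowait()  # consume one command
        if command == "lamp_off":
            self.lamp_on = False
        elif command == "lamp_on":
            self.lamp_on = True
        # Publish the current state; this also answers a plain "get_status".
        status.put({"lamp_on": self.lamp_on})
```

ClientB is the mirror image: it publishes to `commands` and consumes from `status`. Neither side blocks waiting for the other, which is the style of interaction the answer above says AMQP is better suited for.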
This is possible, and it would make sense if the different actors in your example are networked devices, because AMQP would provide a loosely coupled way of messaging.
One thing to watch out for is the last abstract line, where client B says "OK, set some attribute". That sounds suspiciously like a scenario where subroutine calls return some value and then the next step takes place. AMQP can certainly simulate that kind of RPC, but it works better when processes can send a message and don't have to wait for completion.
If most of your messaging doesn't involve waiting for turnaround replies, then AMQP sounds like a fit for what you are doing. But if most of your needs are RPC, then it may not be the best choice.
AMQP really shines when there are future possibilities, for instance in your scenario, if you needed to add a couple thousand projectors, 10,000 client Bs, and several other device types that also need to exchange status. The loose coupling of AMQP makes it easy to add other applications to the broker, just by declaring new exchanges.
