Difference between Observer, Pub/Sub, and Data Binding - model-view-controller

What is the difference between the Observer Pattern, Publish/Subscribe, and Data Binding?
I searched around a bit on Stack Overflow and did not find any good answers.
What I have come to believe is that data binding is a generic term and there are different ways of implementing it such as the Observer Pattern or the Pub/Sub pattern. With the Observer pattern, an Observable updates its Observers. With Pub/Sub, 0-many publishers can publish messages of certain classes and 0-many subscribers can subscribe to messages of certain classes.
Are there other patterns of implementing "data binding"?

There are two major differences between Observer/Observable and Publisher/Subscriber patterns:
The Observer/Observable pattern is mostly implemented in a synchronous way, i.e. the observable calls the appropriate method of all its observers when some event occurs. The Publisher/Subscriber pattern is mostly implemented in an asynchronous way (using a message queue).
In the Observer/Observable pattern, the observers are aware of the observable. Whereas, in Publisher/Subscriber, publishers and subscribers don't need to know each other. They simply communicate with the help of message queues.
As you mentioned correctly, data binding is a generic term and it can be implemented using either Observer/Observable or Publisher/Subscriber method. Data is the Publisher/Observable.
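To make the two differences above concrete, here is a minimal sketch rather than a reference implementation. `Subject`, `Broker`, and `drain()` are names made up for illustration; a real message queue would deliver on its own schedule instead of waiting for an explicit `drain()` call.

```javascript
// Observer: the subject holds direct references to its observers
// and calls them synchronously, in the same call stack.
class Subject {
  constructor() {
    this.observers = [];
  }
  attach(observer) {
    this.observers.push(observer);
  }
  notify(value) {
    for (const observer of this.observers) observer(value);
  }
}

// Pub/Sub: a hypothetical in-memory broker sits between the parties.
// publish() only enqueues; delivery happens later, when drain() runs.
class Broker {
  constructor() {
    this.queue = [];
    this.subscribers = new Map(); // topic -> array of handlers
  }
  subscribe(topic, handler) {
    if (!this.subscribers.has(topic)) this.subscribers.set(topic, []);
    this.subscribers.get(topic).push(handler);
  }
  publish(topic, message) {
    this.queue.push({ topic, message });
  }
  drain() {
    while (this.queue.length > 0) {
      const { topic, message } = this.queue.shift();
      for (const handler of this.subscribers.get(topic) || []) handler(message);
    }
  }
}

const log = [];
const subject = new Subject();
subject.attach((v) => log.push(`observer got ${v}`));
subject.notify(1); // delivered immediately

const broker = new Broker();
broker.subscribe("price", (m) => log.push(`subscriber got ${m}`));
broker.publish("price", 2); // queued, not yet delivered
const deliveredBeforeDrain = log.length; // counts only the observer entry
broker.drain(); // now the subscriber runs
```

The publisher and the subscriber only share the topic name `"price"`; neither holds a reference to the other.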

Here's my take on the three:
Data Binding
Essentially, at the core this just means "the value of property X on object Y is semantically bound to the value of property A on object B." No assumptions are made as to how Y knows or is fed changes on object B.
Observer, or Observable/Observer
A design pattern by which an object is imbued with the ability to notify others of specific events - typically done using actual events, which are kind of like slots in the object with the shape of a specific function/method. The observable is the one who provides notifications, and the observer receives those notifications. In .NET, the observable can expose an event and the observer subscribes to that event with an "event handler" shaped hook. No assumptions are made about the specific mechanism by which notifications occur, nor about the number of observers one observable can notify.
Pub/Sub
Another name (perhaps with more "broadcast" semantics) for the Observable/Observer pattern, which usually implies a more "dynamic" flavor - observers can subscribe to or unsubscribe from notifications, and one observable can "shout out" to multiple observers. In .NET, one can use the standard events for this, since events are a form of MulticastDelegate, and so can support delivery of events to multiple subscribers, and also support unsubscription. Pub/Sub has a slightly different meaning in certain contexts, usually involving more "anonymity" between event and eventer, which can be facilitated by any number of abstractions, usually involving some "middle man" (such as a message queue) who knows all parties, while the individual parties don't know about each other.
Data Binding, Redux
In many "MVC-like" patterns, the observable exposes some manner of "property changed notification" that also contains information about the specific property changed. The observer is implicit, usually created by the framework, and subscribes to these notifications via some binding syntax to specifically identify an object and property, and the "event handler" just copies the new value over, potentially triggering any update or refresh logic.
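A rough sketch of that arrangement, under the assumption of a hand-rolled model class. `ObservableModel`, `onPropertyChanged`, and `bind` are hypothetical names, not any framework's API; `bind` plays the part of the framework that creates the implicit observer.

```javascript
// Hypothetical observable model: notifies listeners with the name of the
// property that changed, similar in spirit to "property changed" events.
class ObservableModel {
  constructor(initial) {
    this.values = { ...initial };
    this.listeners = [];
  }
  onPropertyChanged(listener) {
    this.listeners.push(listener);
  }
  set(property, value) {
    if (this.values[property] === value) return; // no change, no event
    this.values[property] = value;
    for (const listener of this.listeners) listener(property, value);
  }
}

// bind() creates the implicit observer that copies the new value onto
// the target whenever the bound source property changes.
function bind(source, sourceProperty, target, targetProperty) {
  target[targetProperty] = source.values[sourceProperty]; // initial sync
  source.onPropertyChanged((property, value) => {
    if (property === sourceProperty) target[targetProperty] = value;
  });
}

const person = new ObservableModel({ name: "Ada", age: 36 });
const label = { text: "" }; // stand-in for a UI element
bind(person, "name", label, "text");

person.set("name", "Grace"); // label.text follows
person.set("age", 37); // a different property: label.text is untouched
```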
Data binding re Redux
An alternative implementation for data binding? Ok, here's a stupid one:
a background thread is started that constantly checks the bound property on an object.
if that thread detects that the value of the property has changed since last check, copy the value over to the bound item.

I am a bit amused that all the answers here were trying to explain the subtle difference between Observer and Pub/Sub patterns without giving any concrete examples. I bet most of the readers still don't know how to implement each one by reading one is synchronous and the other is asynchronous.
One thing to note: the goal of these patterns is to decouple code.
The Observer is a design pattern where an object (known as a subject) maintains a list of objects depending on it (observers), automatically notifying them of any changes to state.
Observer pattern
This means an observable object has a list where it keeps all its observers (which are usually functions), and it can traverse this list and invoke these functions whenever it sees fit.
See this observer pattern example for details.
This pattern is good when you want to listen for any data change on an object and update other UI views correspondingly.
But the con is that an Observable only maintains one array for keeping observers
(in the example, the array is observersList).
It does NOT differentiate how the update is triggered, because it only has one notify function, which triggers all the functions stored in that array.
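Since the linked example is not reproduced here, the following is a minimal sketch of the same idea. `observersList` and `notify` come from the description above; `subscribe`, `unsubscribe`, and the handler names are assumptions made for illustration.

```javascript
// Minimal observable: one list of observer functions, one notify().
function Observable() {
  this.observersList = [];
}
Observable.prototype.subscribe = function (fn) {
  this.observersList.push(fn);
};
Observable.prototype.unsubscribe = function (fn) {
  this.observersList = this.observersList.filter((f) => f !== fn);
};
Observable.prototype.notify = function (data) {
  // Every observer is called, whatever caused the change.
  this.observersList.forEach((fn) => fn(data));
};

const seen = [];
const updateHeader = (data) => seen.push("header:" + data);
const updateFooter = (data) => seen.push("footer:" + data);

const model = new Observable();
model.subscribe(updateHeader);
model.subscribe(updateFooter);
model.notify("v1"); // both observers run
model.unsubscribe(updateFooter);
model.notify("v2"); // only updateHeader runs
```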
If we want to group observer handlers based on different events, we just need to change that observersList to an object like
var events = {
"event1": [handler1, handler2],
"event2": [handler3]
};
See this pubsub example for details.
People call this variation pub/sub: you can trigger different functions based on the events you publish.
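A small sketch of that variation, assuming free functions named `subscribe` and `publish` (the linked example may be shaped differently). The handlers are grouped by event name, as in the `events` object above.

```javascript
// Handlers grouped by event name.
const events = {};

function subscribe(eventName, handler) {
  if (!events[eventName]) events[eventName] = [];
  events[eventName].push(handler);
}

function publish(eventName, data) {
  // Only the handlers registered for this event name run.
  (events[eventName] || []).forEach((handler) => handler(data));
}

const calls = [];
subscribe("event1", (d) => calls.push("handler1:" + d));
subscribe("event1", (d) => calls.push("handler2:" + d));
subscribe("event2", (d) => calls.push("handler3:" + d));

publish("event2", "x"); // handler3 only
publish("event1", "y"); // handler1, then handler2
publish("event3", "z"); // no subscribers: nothing happens
```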

I agree with your conclusion about both patterns, nevertheless, for me, I use Observable when I'm in the same process and I use the Pub/Sub in inter-process scenarios, where all parties only know the common channel but not the parties.
I don't know of other patterns, or let me put it this way: I've never needed another pattern for this task. Even most MVC frameworks and data binding implementations usually use the observer concept internally.
If you're interested in inter-process communication, I recommend you:
"Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions"
This book contains a lot of ideas about how to send messages between processes or classes that can be used even in intra-process communication tasks (it helped me to program in a more loosely coupled way).
I hope this helps!

One concrete difference is that an Observable is always engaged when an observer no longer wants to observe. A subscriber, by contrast, can stop its subscription and the publisher will never be aware of this intent to unsubscribe.


Saga Choreography implementation problems

I am designing and developing a microservice platform based on the specifications of http://microservices.io/
The entire framework integrates through sockets, thus removing the overhead of multiple HTTP requests (as in most REST APIs).
A service registry host receives the registry of multiple microservice hosts, each microservice is responsible for a domain of the business. Another host we call a router (or API gateway) is responsible for exposing the microservices for consumption by third parties.
We will use the structure of Sagas (in choreography style) to distribute the requests, so we have some questions:
Should a microservice issue the event in any process manager or should it be passed directly to the next microservice responsible for the chain of events? (the same logic applies to rollback)
Who should know how to build the Saga chain of events? The first microservice that receives a certain work or the router?
If an event needs to pass a very large volume of data to the next Saga event, how is this done in terms of the request structure? Is it divided into multiple Sagas for example (as a result pagination type)?
I think the main point is that in this router and microservice structure, who is responsible for building the Sagas and propagating their events.
The article Patterns for Microservices — Sync vs. Async does a great job defining many of the terms used here and has animated gifs demonstrating sync vs. async and orchestrated vs. choreographed as well as hybrid setups.
I know the OP answered his own question for his use case, but I want to try and address the questions raised a bit more generally in light of the linked article.
Should a microservice issue the event in any process manager or should it be passed directly to the next microservice responsible for the chain of events?
To use a more general term, a process manager is an orchestrator. A concrete implementation of this may involve a stateful actor that orchestrates a workflow, keeping track of the progress in some way. Since a saga is a workflow itself (composed of both forward and compensating actions), it would be the job of the process manager to keep track of the state of the saga until completion (success or failure). This typically involves the actor sending synchronous calls to services and waiting for some result before going to the next step. Parallel operations can of course be introduced, but the point is that this actor dictates the progression of the saga.
This is fundamentally different from the choreography model. With this model there is no central actor keeping track of the state of a saga, but rather the saga progresses implicitly via the events that each step emits. Arguably, this is a more pure case of an event-driven model since there is no coordination.
That said, the challenge with this model is observing the state at any given point in time. With the orchestration model above, in theory, each actor could be queried for the state of the saga. In this choreographed model, we don't have this luxury, so in practice a correlation ID is added to every message corresponding to (in this case) a saga. If the messages are queryable in some way (the event bus supports it or through some other storage means), then the messages corresponding to a saga could be queried and the saga state could be reconstructed (effectively an event-sourced model).
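As a sketch of that reconstruction, assume the published messages end up in a queryable log. The event names, the `TERMINAL` table, and `sagaState` are all invented for illustration; only the idea of filtering by correlation ID comes from the text above.

```javascript
// Hypothetical event log; in practice this would be whatever store or
// event bus makes published messages queryable.
const eventLog = [
  { correlationId: "saga-1", type: "OrderPlaced" },
  { correlationId: "saga-2", type: "OrderPlaced" },
  { correlationId: "saga-1", type: "PaymentCharged" },
  { correlationId: "saga-2", type: "PaymentFailed" },
  { correlationId: "saga-1", type: "OrderShipped" },
];

// Assumed terminal events for this made-up saga.
const TERMINAL = { OrderShipped: "completed", PaymentFailed: "failed" };

// Rebuilds the state of one saga from the events that carry its ID.
function sagaState(log, correlationId) {
  const steps = log
    .filter((event) => event.correlationId === correlationId)
    .map((event) => event.type);
  const last = steps[steps.length - 1];
  const status = steps.length === 0 ? "unknown" : TERMINAL[last] || "in progress";
  return { steps, status };
}

const saga1 = sagaState(eventLog, "saga-1");
const saga2 = sagaState(eventLog, "saga-2");
```

No component holds the saga state here; it is derived on demand from the messages themselves.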
Who should know how to build the Saga chain of events? The first microservice that receives a certain work or the router?
This is an interesting question by itself and one that I have been thinking about quite a lot. The easiest and default answer would be to hard-code the saga plans and map them to the incoming message types. E.g. message A triggers plan X, message B triggers plan Y, etc.
However, I have been thinking about what a control plane might look like that manages these plans and provides the mechanism for pushing changes dynamically to message handlers and/or orchestrators. The two specific use cases in mind are changes in authorization policies or dynamically adding new steps to a plan.
If an event needs to pass a very large volume of data to the next Saga event, how is this done in terms of the request structure? Is it divided into multiple Sagas for example (as a result pagination type)?
The way I have approached this is to include references to the large data if these are objects such as a file or something. For data that are inherently streams themselves, a parallel channel could be referenced that a consumer could read from once it receives the message. I think the important distinction here is to decouple thinking about the messages driving the workflow from where the data is physically materialized which depends on the data representation.
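One way to picture the reference approach, as a sketch only: the `blobStore`, the `ReportReady` event, and the `dataRef` field are assumptions standing in for a file server or object storage and for whatever message shape you actually use.

```javascript
// Hypothetical blob store standing in for a file server or object storage.
const blobStore = new Map();
let nextBlobId = 1;

function storeBlob(data) {
  const ref = "blob-" + nextBlobId++;
  blobStore.set(ref, data);
  return ref;
}

// The producer stores the payload and publishes only a reference to it.
function makeReportReadyEvent(correlationId, reportRows) {
  return {
    type: "ReportReady",
    correlationId,
    dataRef: storeBlob(reportRows),
  };
}

// The consumer resolves the reference when it handles the event.
function handleReportReady(reportEvent) {
  const rows = blobStore.get(reportEvent.dataRef);
  return rows.length;
}

const bigReport = Array.from({ length: 10000 }, (_, i) => ({ row: i }));
const reportEvent = makeReportReadyEvent("saga-1", bigReport);
const rowCount = handleReportReady(reportEvent);
```

The message that drives the workflow stays small no matter how large the report grows.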
For microservices, every microservice should be responsible for its domain business.
Should a microservice issue the event in any process manager or should it be passed directly to the next microservice responsible for the chain of events? (the same logic applies to rollback)
Events are not passed to the next microservice; they are published, and all microservices interested in the events subscribe to them.
If there is rollback, you should consider orchestration.
Who should know how to build the Saga chain of events? The first microservice that receives a certain work or the router?
The microservice that publishes the event will certainly know how to build it. There is no chain of events, because every microservice interested in the event subscribes to it separately.
If an event needs to pass a very large volume of data to the next Saga event, how is this done in terms of the request structure? Is it divided into multiple Sagas for example (as a result pagination type)?
Only publish the data others may be interested in, not all of it. In most cases, the data is not large, and a message queue can handle it efficiently.

Listening on multiple events

How to deal with correlated events in an Event Driven Architecture? Concretely, what if multiple events must be triggered in order for some action to be performed. For example, I have a microservice that listens to two events foo and bar and only performs an action when both of the events arrive and have the same correlation id.
One way would be to keep an internal data structure inside the microservice that does the bookkeeping, and when everything is satisfied, an appropriate action is triggered. However, the problem with this approach is that the microservice is not immutable anymore.
Is there a better approach?
A classic example is where an order comes in at sales and an event is published. Both Finance and Shipping are subscribed to the event, but shipping is also subscribed to the event coming from finance.
The funny thing is that you have no idea of the order in which the messages arrive. The event from sales might cause a technical error, because the database is offline. It might get queued again or end up in an error queue for operations to retry it. In the meantime the event from finance might arrive. So theoretically the event from sales should arrive first and then the finance event, but in practice it can be the other way around.
There are a number of solutions here, but I've never liked the graphical ones. As a .NET developer I've used K2 and Windows Workflow Foundation in the past, but the most flexible solutions are created in code, not via a graphical interface.
I currently would use NServiceBus or MassTransit for this. As a side note, I currently work at Particular Software, and we make NServiceBus. NServiceBus has Sagas for this kind of work (see its documentation), and you can also read about a presentation on my weblog, including code on GitHub.
The term saga is kind of loaded, but it basically handles long-running (business) processes. Gregor Hohpe calls it a Process Manager.
To summarize what sagas do: they are instantiated by incoming messages and have state. Incoming messages are bound/dispatched to a specific saga instance based on a correlation ID, for example a customer ID or order ID. Once the message (event) is processed, state is stored until a new message arrives, or until the code marks the saga as completed and the state is removed from storage.
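The summary above can be sketched without any framework. This is not NServiceBus or MassTransit code; `sagaStore`, `handle`, and the `foo`/`bar` message types (borrowed from the question) are assumptions, and a real saga framework would persist the state durably between messages.

```javascript
// Hypothetical saga storage keyed by correlation ID.
const sagaStore = new Map();
const performed = [];

function handle(message) {
  const { correlationId, type } = message;
  // An incoming message starts a saga instance or is routed to the existing one.
  const state = sagaStore.get(correlationId) || { foo: false, bar: false };
  if (type === "foo") state.foo = true;
  if (type === "bar") state.bar = true;

  if (state.foo && state.bar) {
    performed.push(correlationId); // the action that needed both events
    sagaStore.delete(correlationId); // saga completed: state is removed
  } else {
    sagaStore.set(correlationId, state); // wait for the other event
  }
}

// Arrival order does not matter, and instances are independent.
handle({ correlationId: "order-1", type: "bar" });
handle({ correlationId: "order-2", type: "foo" });
handle({ correlationId: "order-1", type: "foo" });
```

After these three messages, the saga for `order-1` has completed and been removed, while the one for `order-2` is still waiting for `bar`.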
As said, in the .NET world MassTransit and NServiceBus support this, but there are most likely alternatives in other environments.
If I understand correctly, it looks like you need a CEP (complex event processor), such as WSO2 CEP, which does exactly that.
CEPs can aggregate events and perform actions when certain conditions have been met.

Event-driven architecture and structure of events

I'm new to EDA and I've read a lot about benefits and would probably be interested to apply it during my next project but still haven't understood something.
When raising an event, which pattern is the most suited:
Name the event "CustomerUpdate" and include all information (updated or not) about the customer
Name the event "CustomerUpdate" and include only information that has really been updated
Name the event "CustomerUpdate" and include minimum information (Identifier) and/or a URI to let the consumer retrieve information about this Customer.
I ask the question because some of our events could be heavy and frequent.
Thx for your answers and time.
Name the event "CustomerUpdate"
First let's start with your event name. The purpose of an event is to describe something which has already happened. This is different from a command, which is to issue an instruction for something yet to happen.
Your event name "CustomerUpdate" sounds ambiguous in this respect, as it could be describing something in the past or something in the future.
CustomerUpdated would be better, but even then, Updated is another ambiguous term, and is nonspecific in a business context. Why was the customer updated in this instance? Was it because they changed their payment details? Moved home? Were they upgraded from silver to gold status? Events can be made as specific as needed.
This may seem at first to be overthinking, but event naming becomes especially relevant as you remove data and context from the event payload, moving more toward skinny events (the "option 3" from your question, which I discuss below).
That is not to suggest that it is always appropriate to define events at this level of granularity, only that it is an avenue which is open to you early on in the project which may pay dividends later on (or may swamp you with thousands of event types).
Going back to your actual question, let's take each of your options in turn:
Name the event "CustomerUpdate" and include all information (updated or not) about the customer
Let's call this "pattern" the Fat message.
Fat messages (also called snapshots) represent the state of the described entity at a given point in time with all the event context present in the payload. They are interesting because the message itself represents the contract between service and consumer. They can be used for communicating changes of state between business domains, where it may be preferred that all event context be present during message processing by the consumer.
Pros:
Self consistent - can be consumed entirely without knowledge of other systems.
Simple to consume (upsert).
Cons:
Brittle - the contract between service and consumer is coupled to the message itself.
Easy to overwrite current data with old data if messages arrive in the wrong order (hint: you can mitigate this by using the event sourcing pattern).
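A sketch of a fat-message consumer. Note that it guards against out-of-order arrival with a per-entity `version` number, which is a simpler technique than the event sourcing hint above and is an assumption of this example, as are the field names and `applySnapshot`.

```javascript
// Hypothetical consumer-side store, keyed by customer id.
const customers = new Map();

// A fat message carries the full customer state. The `version` field is an
// assumption of this sketch; it lets the consumer reject stale snapshots.
function applySnapshot(message) {
  const current = customers.get(message.id);
  if (current && current.version >= message.version) return false; // stale
  customers.set(message.id, message); // upsert the whole snapshot
  return true;
}

const v1 = { id: 42, version: 1, name: "Ann", tier: "silver", city: "Oslo" };
const v2 = { id: 42, version: 2, name: "Ann", tier: "gold", city: "Oslo" };

// The newer snapshot arrives first; the older one is then ignored.
const appliedNew = applySnapshot(v2);
const appliedOld = applySnapshot(v1);
```

Without the version check, the late `v1` snapshot would silently downgrade the customer back to silver.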
Name the event "CustomerUpdate" and include only information that has really been updated
Let's call this pattern the Delta message.
Deltas are similar to fat messages in many ways, though they are generally more complex to generate and consume. A good example here is the JSON Patch standard (RFC 6902).
Because they are only a partial description of the event entity, deltas also come with a built-in assumption that the consumer knows something about the event being described. For this reason they may be less suitable for sending outside a business domain, where the event entity may not be well known.
Deltas really shine when synchronising data between systems sharing the same entity model, ideally persisted in non-relational storage (e.g. NoSQL). In this instance an entity can be retrieved, the delta applied, and then persisted again with minimal effort.
Pros:
Smaller than Fat messages.
Excels in use cases involving shared entity models.
Portable (if based on a standard such as JSON Patch or, to a lesser extent, DiffGram).
Cons:
Similar to the Fat message, assumes complete knowledge of the data entity.
Easy to overwrite current data with old data.
Complex to generate and consume (except for specific use cases).
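To illustrate the retrieve, apply, persist cycle, here is a sketch whose operations are shaped like JSON Patch `replace` operations. It handles only top-level paths and only `replace`, so it is not a JSON Patch implementation; `applyDelta` and the sample entity are made up.

```javascript
// Applies a list of { op: "replace", path: "/field", value } operations.
// Only top-level "replace" is handled here; this is not a full JSON Patch
// implementation.
function applyDelta(entity, operations) {
  const updated = { ...entity };
  for (const { op, path, value } of operations) {
    if (op !== "replace") throw new Error("Unsupported op: " + op);
    const field = path.slice(1); // "/tier" -> "tier"
    if (!(field in updated)) throw new Error("Unknown field: " + field);
    updated[field] = value;
  }
  return updated;
}

// The consumer must already hold the entity for the delta to make sense.
const stored = { id: 42, name: "Ann", tier: "silver", city: "Oslo" };
const delta = [{ op: "replace", path: "/tier", value: "gold" }];
const patched = applyDelta(stored, delta);
```

The delta names one field, so everything else about the customer has to come from what the consumer already stored.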
Name the event "CustomerUpdate" and include minimum information (Identifier) and/or a URI to let the consumer retrieve information about this Customer.
Let's call this the Skinny message.
Skinny messages are different from the other message patterns you have defined, in that the service/consumer contract is no longer explicit in the message, but implied in that at some later time the consumer will retrieve the event context. This decouples the contract and the message exchange, which is a good thing.
This may or may not lend itself well to cross-business domain communication of events, depending on how your enterprise is set up. Because the event payload is so small (usually an ID with some headers), there is no context other than the name of the event on which the consumer can base processing decisions; therefore it becomes more important to make sure the event is named appropriately, especially if there are multiple ways a consumer could handle a CustomerUpdated message.
Additionally it may not be good practice to include an actual resource address in the event data - because events are things which have already happened, event messages are generally immutable and therefore any information in the event should be true forever in case the events need to be replayed. In this instance a resource address could easily become obsolete and events would not be re-playable.
Pros:
Decouples service contract from message.
Information about the event contained in the event name.
Naturally idempotent (with time-stamp).
Generally tiny.
Simple to generate and consume.
Cons:
Consumer must make additional call to retrieve event context - requires explicit knowledge of other systems.
Event context may have become obsolete at the point where the consumer retrieves it, making this approach generally unsuitable for some real-time applications.
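A sketch of the skinny style, with an in-memory object standing in for the owning service. `customerService`, the event name `CustomerUpgradedToGold`, and the field names are assumptions; the event carries an identifier rather than a resource address, in line with the replay concern above.

```javascript
// Hypothetical owning service; in practice this would be a remote call.
const customerService = {
  records: new Map([[42, { id: 42, name: "Ann", tier: "gold" }]]),
  getCustomer(id) {
    return this.records.get(id);
  },
};

// A skinny event: a specific name, an identifier, and a timestamp.
const skinnyEvent = {
  type: "CustomerUpgradedToGold",
  customerId: 42,
  occurredAt: "2024-01-01T00:00:00Z",
};

// The consumer decides what to do from the event name alone, then fetches
// the current state, which may be newer than the state when the event fired.
function handleEvent(event) {
  if (event.type !== "CustomerUpgradedToGold") return null;
  const customer = customerService.getCustomer(event.customerId);
  return "Send gold welcome pack to " + customer.name;
}

const action = handleEvent(skinnyEvent);
```

With a generic name such as `CustomerUpdated`, the consumer would have nothing in the payload to tell it which of its possible reactions applies.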
When raising an event, which pattern is the most suited?
I think the answer to this is: it depends on lots of things, and there is probably no one right answer.
Update from comments: Also worth reading, a very old, classic, blog post on messaging: https://learn.microsoft.com/en-gb/archive/blogs/nickmalik/killing-the-command-message-should-we-use-events-or-documents (also here: http://vanguardea.com/killing-the-command-message-should-we-use-events-or-documents/)
Martin Fowler gave a great talk about "The Many Meanings of Event-Driven Architecture" (the content is based on this paper) in which he mentioned the Event-Carried State Transfer pattern.
It seems to be close to your second option "Delta message" with the difference that it doesn't try to describe an entity, but instead describe a named business fact that happened and carry over all the necessary data to understand this fact.
I don't think it matters how you have modeled your persistence layer when it comes to designing domain events. Likewise, I don't think it matters how your consumer has modeled its own persistence layer when designing domain events.
Thus, I don't think it's wise to put as an advantage the fact that you can apply the event as a patch directly on your data (from a consumer point of view), because it pushes the producer to design their events given the persistence model of a consumer.
In that case, I would tend to think that you're designing persistence patches, instead of domain events.
What do you think?

What is the difference between event listeners and subscribers in Symfony2 [duplicate]

I'm working in the Symfony2 framework and wondering when would one use a Doctrine subscriber versus a listener. Doctrine's documentation for listeners is very clear, however subscribers are rather glossed over. Symfony's cookbook entry is similar.
From my point of view, there is only one major difference:
The Listener is signed up specifying the events on which it listens.
The Subscriber has a method telling the dispatcher what events it is listening to
This might not seem like a big difference, but if you think about it, there are some cases when you want to use one over the other:
You can assign one listener to many dispatchers with different events, as they are set at registration time. You only need to make sure every method is in place in the listener
You can change the events a subscriber is registered for at runtime and even after registering the subscriber by changing the return value of getSubscribedEvents (Think about a time where you listen to a very noisy event and you only want to execute something one time)
There might be other differences I'm not aware of though!
I don't know whether it is done accidentally or intentionally, but subscribers have higher priority than listeners - https://github.com/symfony/symfony/blob/master/src/Symfony/Bridge/Doctrine/DependencyInjection/CompilerPass/RegisterEventListenersAndSubscribersPass.php#L73-L98
From doctrine side, it doesn't care what it is (listener or subscriber), eventually both are registered as listeners - https://github.com/doctrine/common/blob/master/lib/Doctrine/Common/EventManager.php#L137-L140
This is what I spotted.
You should use an event subscriber when you want to deal with multiple events in one class. In this Symfony2 doc page article, one may notice that an event listener manages only one event. Say you want to deal with several events for one entity: prePersist, preUpdate, postPersist, etc. With event listeners you would have to write several listeners, one for each event, whereas with an event subscriber you write a single class that handles more than one event. That's the way I use it; I prefer code that is focused on what the business model needs. One example is when you want to handle several lifecycle events globally, but only for a group of your entities. To do that, you can write a parent class that defines those global methods and make your entities inherit from it. Then, in your event subscriber, you subscribe to every event you want (prePersist, preUpdate, postPersist, etc.), check for that parent class, and execute those global methods.
Another important thing: Doctrine EventSubscribers do not allow you to set a priority.
Read more on this issue here
Both allow you to execute something on a particular event pre / post persist etc.
However listeners only allow you to execute behaviours encapsulated within your Entity. So an example might be updating a "date_edited" timestamp.
If you need to move outside the context of your Entity, then you'll need a subscriber. A good example might be for calling an external API, or if you need to use / inspect data not directly related to your Entity.
Here is what the doc is saying about that in 4.1.
As this is globally applied to events, I suppose it's also valid for Doctrine (not 100% sure).
Listeners or Subscribers
Listeners and subscribers can be used in the same application indistinctly. The decision to use either of them is usually a matter of personal taste. However, there are some minor advantages for each of them:
Subscribers are easier to reuse because the knowledge of the events is kept in the class rather than in the service definition. This is the reason why Symfony uses subscribers internally;
Listeners are more flexible because bundles can enable or disable each of them conditionally depending on some configuration value.
From the documentation:
The most common way to listen to an event is to register an event listener with the dispatcher. This listener can listen to one or more events and is notified each time those events are dispatched.
Another way to listen to events is via an event subscriber. An event subscriber is a PHP class that's able to tell the dispatcher exactly which events it should subscribe to. It implements the EventSubscriberInterface interface, which requires a single static method called getSubscribedEvents().
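The distinction can be sketched in JavaScript to match the other examples in this document. This is an illustration of the idea only, not Symfony's or Doctrine's PHP API: the `Dispatcher` class and `AuditSubscriber` are made up, and only the method names `addListener`, `addSubscriber`, and `getSubscribedEvents` are borrowed from the terminology quoted above.

```javascript
class Dispatcher {
  constructor() {
    this.listeners = {}; // event name -> array of callables
  }
  // Listener style: the caller states which event the callable listens to.
  addListener(eventName, callable) {
    (this.listeners[eventName] = this.listeners[eventName] || []).push(callable);
  }
  // Subscriber style: the class itself declares its events and methods.
  addSubscriber(subscriber) {
    const subscribed = subscriber.constructor.getSubscribedEvents();
    for (const [eventName, method] of Object.entries(subscribed)) {
      this.addListener(eventName, (payload) => subscriber[method](payload));
    }
  }
  dispatch(eventName, payload) {
    (this.listeners[eventName] || []).forEach((callable) => callable(payload));
  }
}

const trace = [];

class AuditSubscriber {
  static getSubscribedEvents() {
    return { prePersist: "onPrePersist", postPersist: "onPostPersist" };
  }
  onPrePersist(entity) {
    trace.push("subscriber pre " + entity);
  }
  onPostPersist(entity) {
    trace.push("subscriber post " + entity);
  }
}

const dispatcher = new Dispatcher();
dispatcher.addListener("prePersist", (entity) => trace.push("listener pre " + entity));
dispatcher.addSubscriber(new AuditSubscriber());

dispatcher.dispatch("prePersist", "user");
dispatcher.dispatch("postPersist", "user");
```

In this sketch a subscriber is just a bundle of listeners that carries its own registration data, which is why the knowledge of the events travels with the class.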

When to use events?

At work, we have a huge framework and use events to send data from one part of it to another. I recently started a personal project and I often think to use events to control the interactions of my objects.
For example, I have a Mixer class that plays sound effects, and I initially thought it should receive events to play a sound effect. Then I decided to just make my class static and call it directly in my classes. I have a ton of examples like this one, where I initially think of an implementation with events and then change my mind, telling myself it is too complex for nothing.
So when should I use events in a project? On which occasions do events have a serious advantage over other techniques?
You generally use events to notify subscribers about some action or state change that occurred on the object. By using an event, you let different subscribers react differently, and by decoupling the subscriber (and its logic) from the event generator, the object becomes reusable.
In your Mixer example, I'd have events signal the start and end of playing of the sound effect. If I were to use this in a desktop application, I could use those events to enable/disable controls in the UI.
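A sketch of that suggestion, with the playback itself left out. The `Mixer` class shape, the `started`/`ended` event names, and the `playButton` object are assumptions for illustration.

```javascript
// Hypothetical Mixer that announces when a sound effect starts and ends.
class Mixer {
  constructor() {
    this.handlers = { started: [], ended: [] };
  }
  on(eventName, handler) {
    this.handlers[eventName].push(handler);
  }
  emit(eventName, effect) {
    this.handlers[eventName].forEach((handler) => handler(effect));
  }
  play(effect) {
    this.emit("started", effect);
    // ... actual audio playback would happen here ...
    this.emit("ended", effect);
  }
}

// Stand-in for a UI control; the Mixer knows nothing about it.
const playButton = { enabled: true };
const playLog = [];

const mixer = new Mixer();
mixer.on("started", (effect) => {
  playButton.enabled = false;
  playLog.push("started " + effect);
});
mixer.on("ended", (effect) => {
  playButton.enabled = true;
  playLog.push("ended " + effect);
});

mixer.play("click");
```

The Mixer can be reused in a project with no UI at all, because the reaction to each event lives entirely in the subscriber.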
The difference between Calling a subroutine and raising events has to do with: Specification, Election, Cardinality and ultimately, which side, the initiator or the receiver has Control.
With Calls, the initiator elects to call the receiving routine, and the initiator specifies the receiver. And this leads to many-to-one cardinality, as many callers may elect to call the same subroutine.
With Events on the other hand, the initiator raises an event that will be received by those routines that have elected to receive that event. The receiver specifies what events it will receive from what initiators. This then leads to one-to-many cardinality as one event source can have many receivers.
So the decision between Calls and Events mostly has to do with whether the initiator determines the receiver or the receiver determines the initiator.
It's a tradeoff between simplicity and reusability. Let's take the metaphor of sending an email:
If you know the recipients and they are finite in number and always known to you, it's as simple as putting them in the "To" list and hitting the send button. It's simple, and it's what we do most of the time. This is calling the function directly.
However, in the case of a mailing list, you don't know in advance how many users are going to subscribe to your email. In that case, you create a mailing list program that users can subscribe to, and the email goes automatically to all the subscribed users. This is event modeling.
Now, even though emails are sent to users in both of the above options, you are the better judge of when to send email directly and when to use the mailing list program. Apply the same judgement, and hopefully you will get your answer :)
I worked with a huge code base at my previous workplace and saw that using events can increase the complexity quite a lot, and often unnecessarily.
I often had to reverse engineer existing code in order to fix it or to extend it.
In both cases, it is a lot easier to understand what is going on when you can simply read a list of function calls instead of just seeing an event being raised.
The event forces you to look for usages in order to fully understand what is happening. That is not a problem with modern IDEs, but if you then encounter many functions which also raise events, it quickly becomes complex. I have encountered cases where the order in which functions subscribed to an event mattered, even though most languages don't even guarantee a calling order...
There are cases when it is a really good idea to use events. But before you start eventing, consider the alternative. It is probably easier to read and maintain.
A classic example of the use of events is a UI framework, which provides elements like buttons etc.
You want the function "ButtonPressed()" of the framework to call some of your functions, so that you can react to the user action.
The alternative to an event that you can subscribe to would be, for example, a public bool "buttonPressed", which the UI framework exposes and which you regularly check for being true or false. This is of course very inefficient when there are hundreds of UI elements.
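The cost difference can be sketched by counting the work done in each style. The `Button` class, `onPressed`, and `pollOnce` are hypothetical; only the `buttonPressed` flag comes from the text above.

```javascript
// Hypothetical button that supports both styles.
class Button {
  constructor() {
    this.buttonPressed = false; // polled flag
    this.handlers = []; // event subscribers
  }
  onPressed(handler) {
    this.handlers.push(handler);
  }
  press() {
    this.buttonPressed = true;
    this.handlers.forEach((handler) => handler());
  }
}

const buttons = Array.from({ length: 300 }, () => new Button());

// Event style: work happens only when a button is actually pressed.
let eventReactions = 0;
buttons.forEach((button) => button.onPressed(() => eventReactions++));

// Polling style: every button is checked on every pass, pressed or not.
let checks = 0;
let pollReactions = 0;
function pollOnce() {
  for (const button of buttons) {
    checks++;
    if (button.buttonPressed) {
      pollReactions++;
      button.buttonPressed = false; // reset so the press is handled once
    }
  }
}

pollOnce(); // nothing pressed yet: 300 checks, 0 reactions
buttons[7].press();
pollOnce(); // another 300 checks to find the one press
```

Both styles react to the single press exactly once, but the polling loop needed 600 flag checks to do so, and that number grows with every pass and every element.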
