I often have to use the event model in an application based on Windows messages to notify the main thread of the application about some of the results of other threads. This approach allows you to get rid of blocking between threads.
The documentation says that system message IDs must be unique. But I did not find such a condition for application messages. At the same time, very often on the Internet you can find the opinion that within the application, message identifiers must be unique.
I see no point in being unique within the application. When sending a message, we specify a specific window handle that should process the message. I can use the same message ID in different windows, but it is the window handle that determines the recipient of the message. And it works.
Are there any obvious reasons to keep track of the uniqueness of message IDs in the application?
There is no requirement that all message IDs be unique within an application. In fact, we are using multiply assigned message IDs all the time: WM_USER-based messages are used by window classes to implement class-specific behavior. The most prominent example are the Common Controls that implement control-specific behavior using the [WM_USER..WM_APP) range of message IDs.
If you are calling RegisterClassEx you opt in to using WM_USER-based messages. You are free to reuse any value used by a different window class without risking a collision. It's the combination of the receiving window's class and the message ID that controls the behavior.
If you don't have a receiver (e.g. when calling PostThreadMessage) you would need to make sure that you can uniquely identify a message (and its payload). The easiest way in this case is to use unique message IDs in the WM_APP range.
Related
I am designing some events that will be raised when actions are performed or data changes in a system. These events will likely be consumed by many different services and will be serialized as XML, although more broadly my question also applies to the design of more modern funky things like Webhooks.
I'm specifically thinking about how to describe changes with an event and am having difficulty choosing between different implementations. Let me illustrate my quandry.
Imagine a customer is created, and a simple event is raised.
<CustomerCreated>
<CustomerId>1234</CustomerId>
<FullName>Bob</FullName>
<AccountLevel>Silver</AccountLevel>
</CustomerCreated>
Now let's say Bob spends lots of money and becomes a gold customer, or indeed any other property changes (e.g.: he now prefers to be known as Robert). I could raise an event like this.
<CustomerModified>
<CustomerId>1234</CustomerId>
<FullName>Bob</FullName>
<AccountLevel>Gold</AccountLevel>
</CustomerModified>
This is nice because the schema of the Created and Modified events are the same and any subscriber receives the complete current state of the entity. However it is difficult for any receiver to determine which properties have changed without tracking state themselves.
I then thought about an event like this.
<CustomerModified>
<CustomerId>1234</CustomerId>
<AccountLevel>Gold</AccountLevel>
</CustomerModified>
This is more compact and only contains the properties that have changed, but comes with the downside that the receiver must apply the changes and reassemble the current state of the entity if they need it. Also, the schemas of the Created and Modified events must be different now; CustomerId is required but all other properties are optional.
Then I came up with this.
<CustomerModified>
<CustomerId>1234</CustomerId>
<Before>
<FullName>Bob</FullName>
<AccountLevel>Silver</AccountLevel>
</Before>
<After>
<FullName>Bob</FullName>
<AccountLevel>Gold</AccountLevel>
</After>
</CustomerModified>
This covers all bases as it contains the full current state, plus a receiver can figure out what has changed. The Before and After elements have the exact same schema type as the Created event. However, it is incredibly verbose.
I've struggled to find any good examples of events; are there any other patterns I should consider?
You tagged the question as "Event Sourcing", but your question seems to be more about Event-Driven SOA.
I agree with #Matt's answer--"CustomerModified" is not granular enough to capture intent if there are multiple business reasons why a Customer would change.
However, I would back up even further and ask you to consider why you are storing Customer information in a local service, when it seems that you (presumably) already have a source of truth for customer. The starting point for consuming Customer information should be getting it from the source when it's needed. Storing a copy of information that can be queried reliably from the source may very well be an unnecessary optimization (and complication).
Even if you do need to store Customer data locally (and there are certainly valid reasons for need to do so), consider passing only the data necessary to construct a query of the source of truth (the service emitting the event):
<SomeInterestingCustomerStateChange>
<CustomerId>1234</CustomerId>
</SomeInterestingCustomerStateChange>
So these event types can be as granular as necessary, e.g. "CustomerAddressChanged" or simply "CustomerChanged", and it is up to the consumer to query for the information it needs based on the event type.
There is not a "one-size-fits-all" solution--sometimes it does make more sense to pass the relevant data with the event. Again, I agree with #Matt's answer if this is the direction you need to move in.
Edit Based on Comment
I would agree that using an ESB to query is generally not a good idea. Some people use an ESB this way, but IMHO it's a bad practice.
Your original question and your comments to this answer and to Matt's talk about only including fields that have changed. This would definitely be problematic in many languages, where you would have to somehow distinguish between a property being empty/null and a property not being included in the event. If the event is getting serialized/de-serialized from/to a static type, it will be painful (if not impossible) to know the difference between "First Name is being set to NULL" and "First Name is missing because it didn't change".
Based on your comment that this is about synchronization of systems, my recommendation would be to send the full set of data on each change (assuming signal+query is not an option). That leaves the interpretation of the data up to each consuming system, and limits the responsibility of the publisher to emitting a more generic event, i.e. "Customer 1234 has been modified to X state". This event seems more broadly useful than the other options, and if other systems receive this event, they can interpret it as they see fit. They can dump/rewrite their own data for Customer 1234, or they can compare it to what they have and update only what changed. Sending only what changed seems more specific to a single consumer or a specific type of consumer.
All that said, I don't think any of your proposed solutions are "right" or "wrong". You know best what will work for your unique situation.
Events should be used to describe intent as well as details, for example, you could have a CustomerRegistered event with all the details for the customer that was registered. Then later in the stream a CustomerMadeGoldAccount event that only really needs to capture the customer Id of the customer who's account was changed to gold.
It's up to the consumers of the events to build up the current state of the system that they are interested in.
This allows only the most pertinent information to be stored in each event, imagine having hundreds of properties for a customer, if every command that changed a single property had to raise an event with all the properties before and after, this gets unwieldy pretty quickly. It's also difficult to determine why the change occurred if you just publish a generic CustomerModified event, which is often a question that is asked about the current state of an entity.
Only capturing data relevant to the event means that the command that issues the event only needs to have enough data about the entity to validate the command can be executed, it doesn't need to even read the whole customer entity.
Subscribers of the events also only need to build up a state for things that they are interested in, e.g. perhaps an 'account level' widget is listening to these events, all it needs to keep around is the customer ids and account levels so that it can display what account level the customer is at.
Instead of trying to convey everything through payload xmls' fields, you can distinguish between different operations based on -
1. Different endpoint URLs depending on the operation(this is preferred)
2. Have an opcode(operation code) as an element in the xml file which tells which operation is to used to handle the incoming request.(more nearer to your examples)
There are a few enterprise patterns applicable to your business case - messaging and its variants, and if your system is extensible then Enterprise Service Bus should be used. An ESB allows reliable handling of events and processing.
Problem
When my web application updates an item in the database, it sends a message containing the item ID via Camel onto an ActiveMQ queue, the consumer of which will get an external service (Solr) updated. The external service reads from the database independently.
What I want is that if the web application sends another message with the same item ID while the old one is still on queue, that the new message be dropped to avoid running the Solr update twice.
After the update request has been processed and the message with that item ID is off the queue, new request with the same ID should again be accepted.
Is there a way to make this work out of the box? I'm really tempted to drop ActiveMQ and simply implement the update request queue as a database table with a unique constraint, ordered by timestamp or a running insert id.
What I tried so far
I've read this and this page on Stackoverflow. These are the solutions mentioned there:
Idempotent consumers in Camel: Here I can specify an expression that defines what constitutes a duplicate, but that would also prevent all future attempts to send the same message, i.e. update the same item. I only want new update requests to be dropped while they are still on queue.
"ActiveMQ already does duplicate checks, look at auditDepth!": Well, this looks like a good start and definitely closest to what I want, but this determines equality based on the Message ID which I cannot set. So either I find a way to make ActiveMQ generate the Message ID for this queue in a certain way or I find a way to make the audit stuff look at my item ID field instead of the Message ID. (One comment in my second link even suggests using "a well defined property you set on the header", but fails to explain how.)
Write a custom plugin that redirects incoming messages to the deadletter queue if they match one that's already on the queue. This seems to be the most complete solution offered so far, but it feels so overkill for what I perceive as a fairly mundane and every-day task.
PS: I found another SO page that asks the same thing without an answer.
What you want is not message broker functionality, repeat after me, "A message broker is not a database, A message broker is not a database", repeat as necessary.
The broker's job is get messages reliably from point A to point B. The client offers some filtering capabilities via message selectors but this is minimal and mainly useful in keeping only specific messages that a single client is interested in from flowing there and not others which some other client might be in charge of processing.
Your use case calls for a more stateful database centric solution as you've described. Creating a broker plugin to walk the Queue to check for a message is reinventing the wheel and prone to error if the Queue depth is large as ActiveMQ might not even page in all the messages for you based on memory constraints.
I have a message model and I want it to have several receivers, possibly a lot of them.
I would also like to be able to tell for each receiver if the message was viewed or not (read/unread). Also I would like a receiver to be able to delete the message.
The two possible solutions are the following, for each I have a Message model an User model.
For the first (using the ideas presented here http://www.google.com/events/io/2009/sessions/BuildingScalableComplexApps.html)
I have a MessageReceivers class which has a ListProperty containing the users that will receive the message and set the parent to the message. I query of this with messages = db.GqlQuery('SELECT __key__ FROM MessageReceivers WHERE receivers = :1', user) and the do a db.get([ key.parent() for key in messages ]).
The problem I have which this is that I'm not sure how to store the state of the message: whether it is read or not and a subsequent issue whether the user has new messages. An additional issue would be the overhead of deleting a message (would have to remove user from receivers list property)
For the second: I have a MessageReceiver for each receiver it has links to message and to user and also stores the state (read/unread).
Which of this two approached do you consider that it has a better performance? And in the case of the first do you have any suggestion on handling the status of the message.
I've implement first option in production. Drawback is that ListProperty is limited to 2500 entries if you use custom index. Shameless plug: See my blog bost http://bravenewmethod.wordpress.com/2011/03/23/developing-on-google-app-engine-for-production/
Read state storing. I did this by implementing an entity that stored unread messages up to few months back and then just assumed older ones read. Even simpler is to query the messages in date order, and store the last known message timestamp in entity and assume all older as read. I don't recommended keeping long history in entity with huge list property, because reading and storing such entities can get really slow.
Message deletion is expensive, no way around that.
If you need to store state per message, your best option is to write one entity per recipient, with read state (and anything else, such as flags, etcetera), rather than using the index relation pattern.
Windows messages seems a good way to notify an application on Windows OSes. It actually works well, but few question comes up to my mind:
How to specify structured data to the lparam of the SendMessage routines (like many message codes does)? I mean... of course the parameter is a pointer, but how the process access to it? Maybe is it allocated by a DLL loaded by the processes sending/receiving the message?
Is it possible to share message structured parameters (between sender and receiver)? They are marshalled between the send operation and the peeking operation? If this is the case, it is possible to return data from the caller by modifying the structured parameter? This could be usefull with SendMessage, since it is executed synchronously, instead the PostMessage routine.
Other doubts...
What the differences from PostMessage and SendNotifyMessage?
Is it possible to cause a deadlock in the case an application calls SendMessage to itself while processing the message pump?
If the message is one of the standard window's messages - usually with a message id between 0 and WM_USER, then the systems window message dispatching logic contains code to marshal the struct to the any processes the message is dispatched to.
Messages above WM_USER get no such treatment - and this includes all the common control messages introduced with Windows 95 - you cannot end any of the LVM_* (list view messages) or other new control messages to a control in a different process and get a result back.
WM_COPYDATA was specifically introduced as a general purpose mechanism for user code to marshal arbitrary data between processes - outside of WM_COPYDATA (or reusing other windows standard messages) there is no way to get windows to automatically marshal structured data using the message queue mechanism into another process.
If it is your own code doing the sending AND receiving of messages, you could use a dll to define a shared memory section, instead of sending pointers (the dll might be based differently in each process) sends offsets to the shared memory block.
If you want to exchange structured data with external applications that do not marshal their data (for example to extract data from a list or tree view) then you need to perform dll injection so you can send and process the message from "in-process".
SendNofityMessage is different to PostMessage because PostMessage always puts the message in the message queue, whereas SendNotifyMessage acts like SendMessage for windows in the same process. Then, even if the target window is in another process, the message is dispatched DIRECTLY to the window proc not placed in the posted message queue for retreivel via GetMessage or PeekMessage.
Lastly it is possible to cause a deadlock - however while in a "blocking" sendmessage waiting for another thread to reply, SendMessage will dispatch messages sent (not posted) from other threads - to prevent deadlocks. This mitigates most potential deadlocks, but its still possible to create deadlocks by calling other blocking apis, or going into modal message processing loops.
Your concerns apply primarily in the case of messages sent between processes -- within a process, you can just send a pointer, and the sender just has to ensure the data remains valid until the receiver is finished using it.
Some interprocess messages require you to work with an HGLOBAL (e.g., the clipboard messages). Others require an explicit block size (e.g., with WM_COPYDATA). Still others send data in a pre-specified structure (e.g., a zero-terminated string) to support marshalling.
Generally speaking you can't return a value by modifying the block you received. To return a value, you need to (for example) send a reply message.
SendMessage has some (fairly complex) logic to prevent deadlocks, but you still need to be careful.
At work, we have a huge framework and use events to send data from one part of it to another. I recently started a personal project and I often think to use events to control the interactions of my objects.
For example, I have a Mixer class that play sound effects and I initially thought I should receive events to play a sound effect. Then I decided to only make my class static and call
Mixer.playSfx(SoundEffect)
in my classes. I have a ton of examples like this one where I initially think of an implementation with events and then change my mind, saying to myself it is too complex for nothing.
So when should I use events in a project? In which occasions events have a serious advantage over others techniques?
You generally use events to notify subscribers about some action or state change that occurred on the object. By using an event, you let different subscribers react differently, and by decoupling the subscriber (and its logic) from the event generator, the object becomes reusable.
In your Mixer example, I'd have events signal the start and end of playing of the sound effect. If I were to use this in a desktop application, I could use those events to enable/disable controls in the UI.
The difference between Calling a subroutine and raising events has to do with: Specification, Election, Cardinality and ultimately, which side, the initiator or the receiver has Control.
With Calls, the initiator elects to call the receiving routine, and the initiator specifies the receiver. And this leads to many-to-one cardinality, as many callers may elect to call the same subroutine.
With Events on the other hand, the initiator raises an event that will be received by those routines that have elected to receive that event. The receiver specifies what events it will receive from what initiators. This then leads to one-to-many cardinality as one event source can have many receivers.
So the decision as to Calls or Events, mostly has to do with whether the initiator determines the receiver is or the receiver determines the initiator.
Its a tradeoff between simplicity and re-usability. Lets take an metaphor of "Sending the email" process:
If you know the recipients and they are finite in number that you can always determine, its as simple as putting them in "To" list and hitting the send button. Its simple as thats what we use most of the time. This is calling the function directly.
However, in case of mailing list, you don't know in advance that how many users are going to subscribe to your email. In that case, you create a mailing list program where the users can subscribe to and the email goes automatically to all the subscribed users. This is event modeling.
Now, even though, in both above option, emails are sent to users, you are a better judge of when to send email directly and when to use the mailing list program. Apply the same judgement, hope that you would get your answer :)
Cheers,
Ajit.
I have been working with a huge code base at my previous work place and have seen, that using events can increase the complexity quite a lot and often unnecessarily.
I had often to reverse engineer existing code in order to fix it or to extend it.
In both cases, it is a lot easier to understand what is going on, when you can simply read a list of function calls instead of just seeing the raise of an event.
The event forces you to look for usages in order to fully understand what is happening. Not a problem with modern IDEs, but if you then encounter many functions, which also raise events, it quickly becomes complex. I had encountered cases, where it mattered in what order functions did subscribe to an event, even though most languages don't even gurantee a calling order...
There are cases when it is a really good idea to use events. But before you start eventing, consider the alternative. It is probably easier to read and mantain.
A Classic example for the use of events is a UI framework, which provides elements like buttons etc.
You want the function "ButtonPressed()" of the framework to call some of your functions, so that you can react to the user action.
The alternative to an event that you can subscribe to, would for example be a public bool "buttonPressed", which the UI framework exposes
and which you can regurlary check for beeing true or false. This is of course very ineffecient, when there are hundreds of UI elements.