Implementing thread-safe, parallel processing - task-parallel-library

I am trying to convert an existing process so that it supports multi-threading and concurrency, to make the solution more robust and reliable.
Take the example of an emergency alert system. When a worker clocks-in, a new Recipient object is created with their information and added to the Recipients collection. Conversely, when they clock-out, the object is removed. And in the background, when an alert occurs, the alert engine will iterate through the same list of Recipients (foreach), calling SendAlert(...) on each object.
Here are some of my requirements:
Adding a recipient should not block if an alert is in progress.
Removing a recipient should not block if an alert is in progress.
Adding or removing a recipient should not affect the list of recipients used by an in-progress alert.
I've been looking at the Task and Parallel classes as well as the BlockingCollection and ConcurrentQueue classes but am not clear what the best approach is.
Is it as simple as using a BlockingCollection? After reading a ton of documentation, I'm still not sure what happens if Add is called while I am enumerating the collection.
UPDATE
A colleague referred me to the following article, which describes the ConcurrentBag class and how each operation behaves:
http://www.codethinked.com/net-40-and-system_collections_concurrent_concurrentbag
Based on the author's explanation, it appears that this collection will (almost) serve my purposes. I can do the following:
Create a new collection
var recipients = new ConcurrentBag<Recipient>();
When a worker clocks-in, create a new Recipient and add it to the collection:
recipients.Add(new Recipient());
When an alert occurs, the alert engine can iterate through the collection at that time because GetEnumerator uses a snapshot of the collection items.
foreach (var recipient in recipients)
    recipient.SendAlert(...);
When a worker clocks-out, remove the recipient from the collection:
???
The ConcurrentBag does not provide a way to remove a specific item, and none of the concurrent collections do, as far as I can tell. Am I missing something? Aside from this, ConcurrentBag does everything I need.

ConcurrentBag<T> should definitely be the best-performing class of the bunch for such a case. Enumeration works exactly as your friend describes, so it would serve the scenario you have laid out. However, since you need to remove specific items from this set, the only type that will work for you is ConcurrentDictionary<K, V>. All the other types only offer a TryTake method which, in the case of ConcurrentBag<T>, removes an arbitrary item and, in the case of ConcurrentQueue<T> or ConcurrentStack<T>, removes items only in FIFO or LIFO order.
For broadcasting you would just do:
ConcurrentDictionary<string, Recipient> myConcurrentDictionary = ...;
...
foreach (Recipient recipient in myConcurrentDictionary.Values)
{
    ...
}
The Values property is once again a snapshot of the dictionary at that instant, so the enumeration is unaffected by concurrent changes.
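For the clock-in/clock-out scenario, that means keying each recipient by something unique to the worker; a minimal sketch (workerId is a hypothetical unique id, not something from the question):

var recipients = new ConcurrentDictionary<string, Recipient>();

// Clock-in: add the recipient under the worker's id.
recipients.TryAdd(workerId, new Recipient());

// Clock-out: remove that specific recipient by key.
Recipient removed;
recipients.TryRemove(workerId, out removed);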

I came into work this morning to an e-mail from a friend that gave me the following two answers:
1 - With regard to how the collections in the Concurrent namespace work, most of them are designed to allow additions and removals without blocking, and they are thread-safe even while the collection items are being enumerated.
With a "regular" collection, getting an enumerator (via GetEnumerator) sets a "version" value that is changed by any operation that affects the collection items (such as Add, Remove or Clear). The IEnumerator implementation will compare the version set when it was created against the current version of the collection. If different, an exception is thrown and enumeration ceases.
The Concurrent collections are designed using segments that make it easy to support multi-threading. But in the case of enumerating, they actually create a snapshot copy of the collection at the time GetEnumerator is called, and the enumerator works against this copy. That allows changes to be made to the collection without adverse effects on the enumerator. Of course this means that the enumeration will know nothing of those changes, but it sounds like your use case allows this.
2 - As far as the specific scenario you are describing, I don't believe that a Concurrent collection is needed. You can wrap a standard collection using a ReaderWriterLock and apply the same logic as the Concurrent collections when you need to enumerate.
Here's what I suggest:
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;
using System.Threading;

public class RecipientCollection
{
    private Collection<Recipient> _recipients = new Collection<Recipient>();
    private ReaderWriterLock _lock = new ReaderWriterLock();

    public void Add(Recipient r)
    {
        _lock.AcquireWriterLock(Timeout.Infinite);
        try
        {
            _recipients.Add(r);
        }
        finally
        {
            _lock.ReleaseWriterLock();
        }
    }

    public void Remove(Recipient r)
    {
        _lock.AcquireWriterLock(Timeout.Infinite);
        try
        {
            _recipients.Remove(r);
        }
        finally
        {
            _lock.ReleaseWriterLock();
        }
    }

    public IEnumerable<Recipient> ToEnumerable()
    {
        _lock.AcquireReaderLock(Timeout.Infinite);
        try
        {
            // ToArray copies the items, so callers enumerate a snapshot
            // that later Add/Remove calls cannot affect.
            return _recipients.ToArray();
        }
        finally
        {
            _lock.ReleaseReaderLock();
        }
    }
}
The ReaderWriterLock ensures that operations are only blocked if another operation that changes the collection's contents is in progress; multiple readers can take snapshots concurrently. As soon as a write completes, the lock is released and the next operation can proceed.
Your alert engine would use the ToEnumerable() method to obtain a snapshot copy of the collection at that time and enumerate the copy.
Depending on how often an alert is sent and how often the collection changes, this copying could become an issue. You could implement some kind of version property that changes whenever an item is added or removed, so the alert engine can check whether it needs to call ToEnumerable() again to get the latest version. Or encapsulate this by caching the array inside the RecipientCollection class and invalidating the cache when an item is added or removed.
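A hedged sketch of that caching variant, building on the class above (the benign-race reasoning in the comments is my own assumption about acceptable behavior, not something from the original answer):

private Recipient[] _cache; // null means the snapshot is stale

public void Add(Recipient r)
{
    _lock.AcquireWriterLock(Timeout.Infinite);
    try
    {
        _recipients.Add(r);
        _cache = null; // invalidate the cached snapshot (Remove does the same)
    }
    finally
    {
        _lock.ReleaseWriterLock();
    }
}

public IEnumerable<Recipient> ToEnumerable()
{
    _lock.AcquireReaderLock(Timeout.Infinite);
    try
    {
        // Benign race: two concurrent readers may each rebuild an
        // equivalent array; reference assignment is atomic, so every
        // caller still gets a consistent snapshot.
        var cache = _cache;
        if (cache == null)
        {
            cache = _recipients.ToArray();
            _cache = cache;
        }
        return cache;
    }
    finally
    {
        _lock.ReleaseReaderLock();
    }
}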
HTH

There is much more to an implementation like this than just the parallel processing aspects, durability probably being paramount among them. Have you considered building this using an existing PubSub technology like say... Azure Topics or NServiceBus?

Your requirements strike me as a good fit for the way standard .NET events are triggered in C#. I don't know offhand if the VB syntax gets compiled to similar code or not. The standard pattern looks something like:
public event EventHandler Triggered;

protected void OnTriggered()
{
    // Capture the list so that you don't see changes while the
    // event is being dispatched.
    EventHandler h = Triggered;
    if (h != null)
        h(this, EventArgs.Empty);
}
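In this model, clocking in and out map naturally onto subscribing and unsubscribing; a minimal sketch, assuming Recipient exposes a HandleAlert method matching the EventHandler signature (my name, not from the question):

alerter.Triggered += recipient.HandleAlert; // clock-in
alerter.Triggered -= recipient.HandleAlert; // clock-out

Because multicast delegates are immutable, the invocation list captured in h above is effectively a snapshot: subscribers added or removed during dispatch do not affect the in-progress invocation, which is exactly the isolation the requirements call for.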
Alternatively, you could use an immutable list class to store the recipients. Then when the alert is sent, it will first take the current list and use it as a "snapshot" that cannot be modified by adding and removing while you are sending the alert. For example:
class Alerter
{
    // Start from an empty list; each Add/Remove swaps in a new list instance.
    private ImmutableList<Recipient> recipients = ImmutableList<Recipient>.Empty;

    public void Add(Recipient recipient)
    {
        recipients = recipients.Add(recipient);
    }

    public void Remove(Recipient recipient)
    {
        recipients = recipients.Remove(recipient);
    }

    public void SendAlert()
    {
        // Make a local reference to the current list so
        // you are not affected by any calls to Add/Remove.
        var current = recipients;
        foreach (var r in current)
        {
            // send alert to r
        }
    }
}
You will have to find an implementation of an ImmutableList, but you should be able to find several without too much work. In the SendAlert method as I wrote it, I probably didn't need to make an explicit local to avoid problems, as the foreach loop would have done that itself, but I think the copy makes the intention clearer.
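One caveat, sketched under the assumption that Add and Remove may themselves be called concurrently: the plain field assignments above can lose an update if two writers race between reading and writing the field. With the System.Collections.Immutable package (which postdates this answer), the swap can be retried atomically:

using System.Collections.Immutable;

class Alerter
{
    private ImmutableList<Recipient> recipients = ImmutableList<Recipient>.Empty;

    public void Add(Recipient recipient)
    {
        // Retries the compare-and-swap until no concurrent writer has interfered.
        ImmutableInterlocked.Update(ref recipients, (list, r) => list.Add(r), recipient);
    }

    public void Remove(Recipient recipient)
    {
        ImmutableInterlocked.Update(ref recipients, (list, r) => list.Remove(r), recipient);
    }
}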

Related

Who says that an Action is allowed in Flux?

I'm trying to use Flux architecture into one of my projects.
Some of my Actions have pre-requisites that need to be satisfied in order to allow that Action to be dispatched.
Currently, this pre-requisite checking logic is inside my View code, something like this (pseudocode):
class FooView {
    void OnButtonClick() {
        if (FooStore.IsButtonClickAllowed) {
            Dispatch(ButtonClickAction);
        }
    }
}
This looks awkward to me, because now my View has business logic inside it. I thought about putting this checking code into my Store, but I can't: more than one Store handles this Action, and only one of the Stores knows whether it's valid. So this won't work:
class FooStore {
    void Handle(Action) {
        if (Action is ButtonClickAction) {
            if (IsButtonClickAllowed) {
                FooData.Something();
            } else {
                // Ignore
            }
        }
    }
}

class BarStore {
    void Handle(Action) {
        if (Action is ButtonClickAction) {
            BarData.Something();
        }
    }
}
I can't tell from BarStore whether the ButtonClickAction is allowed, unless I "WaitFor(FooStore)" and then ask it, but that would make every Store that handles this Action contain the same check, leading to something really messy.
So far, my understanding is that an Action in Flux is only dispatched if it's guaranteed to be allowed, i.e. the validity of the Action needs to be checked before it's dispatched. Does this mean the validation needs to live in the View?
The solution to the case of the button example above may be simple: "just hide the button when it's not allowed and it will never happen". But suppose I have an Action that is dispatched when the user hits the spacebar, what should I do? I can't remove the spacebar from the user's keyboard when the Action is not allowed to happen.
P.S. I'm not using React, so the question is purely about the Flux architectural style; I'm not even using JavaScript.
Perhaps it's a matter of degree. I don't consider a simple check to a property managed by the store to be business logic. I consider that to be very simple view logic:
if (FooStore.IsButtonClickAllowed) {
That line seems to have abstracted away all the rules about why the button might not be clickable, and it looks like all that gets managed in the store, which is appropriate.
an Action in Flux is only dispatched if it's guaranteed to be allowed, i.e. the validity of the Action needs to be checked before it's dispatched.
I disagree with this. There are different kinds of validation.
Sometimes very simple validation can be done in a view component. In React, components may employ a small degree of state. Input components are particularly good places to do this. Checking whether the user has typed in something that looks like an email address, for example, can be done in the view component.
Most validation, however, needs to be done against application state or against persistent data. In these cases, you need to send off the action and then let the stores respond to it with business logic.
The action should be like a newspaper, reporting on something that happened in the real world: the user did something, a response came back from the server, etc. Actions don't need to be prevented from happening; they report on what actually happened. Stores do the rest.
When the user does something that violates the validation rules, the stores respond to this by providing error data to the views.
I've created a new component (ActionDispatcher) to handle the validations:
class ActionDispatcher {
    void ButtonClick() {
        if (FooStore.IsButtonClickAllowed) {
            Dispatch(ButtonClickAction);
        }
    }
}
Then, the views (and other action sources) always uses the ActionDispatcher, instead of dispatching the actions directly:
class FooView {
    void OnButtonClick() {
        ActionDispatcher.ButtonClick();
    }
}
I've just organized the validation code into a single location.

AngularDart custom filter call() method required to be idempotent?

The main running example of the Angular Dart tutorial is a Recipe Book app. The exercise at the end of the Chapter 5 on filters and services suggests trying to "create a [custom] filter that will multiply all the amounts [of each ingredient listed] in the recipes" thus allowing a "user to double, triple, or quadruple the recipe." E.g. an ingredient of "1/2 cup of flour" would become "1 cup of flour" when doubled.
I have written such a custom filter: it takes a list of Ingredients (consisting of a quantity and a description) and returns a new list of new Ingredients (with increased quantities), but I am getting the following error:
5 $digest() iterations reached. Aborting!
My question is: what is the required and/or permitted behavior of an AngularDart custom filter call() method? E.g., clearly it is permitted to remove (i.e. filter) elements from its input list, but can it also add new or replace elements? The Dart angular.core NgFilter documentation simply says that a "filter is a class with a call method". I have not found more details.
Extrapolating from the answer to this AngularJS post, it would seem that repeated invocations of call() should (eventually?) yield "the same result". If so, this would be a reasonable constraint.
Yielding "the same result" could mean that call() needs to be idempotent, but in the case of Dart such idempotence should be relative to == (object equivalence) not identical() (object identity), IMHO. I ran a few tests using the following small example to illustrate the issues:
main.dart
import 'package:angular/angular.dart';

class A {}

@NgFilter(name: 'myFilter')
class MutatingCustomFilter {
  final A _a = new A();
  call(List list) => new List.from(list)..add(_a); // runs ok
  // call(List list) => new List.from(list)..add(new A()); // gives error
}

class MyAppModule extends Module {
  MyAppModule() {
    type(MutatingCustomFilter);
  }
}

main() => ngBootstrap(module: new MyAppModule());
index.html excerpt
<ul>
<li ng-repeat="x in [1,2,3] | myFilter">{{x}}</li>
</ul>
If I change the body of class A to be
@override bool operator ==(other) => true;
@override int get hashCode => 1;
which makes all instances of A considered ==, then the second implementation of call() in main.dart (the one with add(new A())) still gives an error (though a different one).
I can see how to solve the tutorial exercise without using a custom filter, but I am trying not to give up on the challenge of writing a filter that works as requested. I am new to Angular and decided to jump in with AngularDart, so any help in explaining the effects of the various flavors of call(), or in finding documentation for the expected behavior of call() (or letting me know if you think such a custom filter simply cannot be written!), would be appreciated.
Too many iterations
When Angular detects a change in the model, it executes a reaction function. The reaction function can further change the model, which would leave the model in an inconsistent state. For this reason we re-run change detection, which can create further changes, and we keep re-running it until the model stabilizes. But how many times should we re-run change detection before giving up? By default it is 5 times. If the model does not stabilize after 5 iterations, we give up. That is what is going on in your case.
Change Detection
When has an object changed? One can use identical or == (equals). Good arguments can be made for each, but we have chosen to use identical because it is fast and consistent. Using == (equals) is tricky and would negatively impact the change-detection algorithm.
Filters and arrays
When a filter that operates on an array executes, it has no choice but to create a new instance of the array. This breaks identical, but luckily the result is fed into ng-repeat, which uses its own algorithm for detecting changes to the array's contents rather than to the array reference itself. So while the array does not have to be identical between runs, its contents must be. Otherwise ng-repeat cannot tell the difference between insertions and changes, which it needs in order to do proper animations.
Your code
The issue with your filter is that it creates new instances on each iteration of the digest loop. These new instances prevent the model from stabilizing, hence the error. (There are plans to solve this issue, but it will be a few weeks before we get there.)
Solution
Your solution attempts to create a filter which consumes the whole array and then creates a new array for the ng-repeat. A different (preferred) solution is to leave the ng-repeat iteration as is and instead place the filter on the binding which creates the qty, applying it there:
<span>{{recipe.qty | myFilter:multiply}}</span>

How do I do cross-entity server-side validation

In my application, I have cross-entity validation logic that requires me to look at the entire change set and I'm doing this using the BeforeSaveEntities override.
I can construct the right logic by examining the saveMap parameter, but what am I supposed to do if I find something invalid?
If I throw an exception, like I would for single-entity validation in the BeforeSaveEntity override, the whole save is aborted and the error is reported to the client. But some of the entities might be valid, so I would want to save those and abort only the invalid parts.
Because BeforeSaveEntities returns a saveMap, I think I should be able to remove the invalid entities from the change set and continue to save the valid entities, but then how do I report the invalid parts to the client?
Is it possible to do a partial save of only the valid entities and at the same time, report a sensible error to the client to describe the parts of the save that failed?
Jay told you the way it is.
I wouldn't hold my breath waiting for Breeze to change because I think yours is a rare scenario and it isn't one we would want to encourage anyway.
But I'm weird and I can't stop thinking what I'd do if I were you and absolutely HAD to do it. I might try something like this.
Warning: this is pseudo-code and I'm making this up. I do not recommend or warrant this
Create a custom MyCustomEFContextProvider that derives from EFContextProvider.
Give it an ErrorEntities property to hold the error object
Override (shadow) the SaveChanges method with another that delegates to the base
public new CustomSaveResult SaveChanges(JObject saveBundle,
    TransactionSettings transactionSettings = null)
{
    var result = base.SaveChanges(saveBundle, transactionSettings);
    // learn about CustomSaveResult below
    return new CustomSaveResult(this.ErrorEntities, result);
}
Catch an invalid entity inside BeforeSaveEntities
Pass it with error message to your custom ErrorEntities property
You get to that property via the EntityInfo instance as in
((MyCustomEFContextProvider) info.ContextProvider).ErrorEntities.Add(new ErrorEntity(info, message));
Remove the invalid entity from the SaveMap so it won't be included in the actual save
Let the save continue
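A hedged sketch of those middle steps inside MyCustomEFContextProvider (Validate and the error message are placeholders of mine, not Breeze APIs):

protected override Dictionary<Type, List<EntityInfo>> BeforeSaveEntities(
    Dictionary<Type, List<EntityInfo>> saveMap)
{
    foreach (var entry in saveMap)
    {
        // Iterate over a copy so we can remove from the live list.
        foreach (var info in entry.Value.ToList())
        {
            if (!Validate(info, saveMap)) // your cross-entity check
            {
                ErrorEntities.Add(new ErrorEntity(info, "failed cross-entity validation"));
                entry.Value.Remove(info); // excluded from the actual save
            }
        }
    }
    return saveMap;
}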
The second line of your override SaveChanges method creates a new instance of your CustomSaveResult from the standard one and returns that to the caller.
public class CustomSaveResult : SaveResult
{
    public List<ErrorEntity> ErrorEntities;

    public CustomSaveResult(List<ErrorEntity> errorEntities, SaveResult result)
    {
        // Copy over everything...
        this.Entities = result.Entities;
        this.KeyMappings = result.KeyMappings;
        this.Errors = result.Errors;
        // ...and now your error stuff.
        this.ErrorEntities = errorEntities;
    }
}
Let's assume the caller is your Web API controller's SaveChanges method. Well you don't have to change a thing but you might make it clear by explicitly returning your custom SaveResult:
readonly MyCustomEFContextProvider _contextProvider = new MyCustomEFContextProvider();
...

[HttpPost]
public CustomSaveResult SaveChanges(JObject saveBundle)
{
    return _contextProvider.SaveChanges(saveBundle);
}
JSON.Net will happily serialize the usual material + your custom ErrorEntities property (be sure to make it serializable!) and send it to the Breeze client.
On the Breeze client you write your own variation on the stock Breeze Web API data service adapter. Yours does almost exactly the same thing as the Breeze version. But, when processing the save payload from the server, it also extracts this extra "error entities" material in the response and does whatever you want to do with it.
I don't know what that will be but now you have it.
See how easy that was? LOL.
Breeze does not currently support a save mechanism that both saves and returns an error at the same time. While possible, this seems a bit baroque.
As you pointed out, you can
1) Throw an exception inside of the BeforeSaveEntities and fail the save. You can even specify which specific entity or entities caused the failure and why. In this case the entire save is aborted.
or
2) Remove 'bad' items from the saveMap within the BeforeSaveEntities and save only a subset of what was passed in. In this case you are performing a partial save.
But we don't support a hybrid of these two. Please add this to the Breeze User Voice if you feel strongly and we can see if other members of the community feel that this would be useful.

How to change a single querystring parameter, possibly via a controller action?

For the last three days I've struggled to find a way to accomplish what I thought was supposed to be a simple thing. Working on it on my own and searching for a solution on the web didn't help. Maybe because I'm not even sure what to look for when I do my research.
I'll try to explain as much as I can here: maybe someone will be able to help me.
I won't say how I've been doing it, because I've tried many ways and none of them worked, for different reasons: I'd rather get fresh advice from you.
In most of the pages of web application, I have two links (but they could be more) like that:
Option A
Option B
This is a partial view, returned by a controller action.
Users can select one or both (all) values, but they can never select none of them: at least one must always be selected.
These links must be accessible in almost all pages, and they are not supposed to redirect to a different page, only to store this information somewhere to be reused when an action needs to filter returned contents: a place always accessible, regardless of the current controller, action, or user (including non-authenticated users) (session? cookie?).
This information is used to filter displayed contents in the whole web application.
So the problem is not how to create the business logic for that, but how (and where) to store this information:
without messing with the querystring (means: keeps the querystring as empty/clean as possible)
without redirecting to other pages (user must get the current page, just with different contents)
allow this information to persists between all views, until user click again to change the option(s)
My aim is to have this information stored in a model that will contains all options and their selection status (on/off), so the appropriates PartialView will know how to display them.
Also, I could send this model to the "thing" that will handle option changes.
Thanks.
UPDATE
Following Paul's advice, I took the Session route:
private List<OptionSelectionModel> _userOptionPreferences;

protected List<OptionSelectionModel> UserOptionPreferences
{
    get
    {
        if (Session["UserOptionPreferences"] == null)
        {
            _userOptionPreferences = Lib.Options.GetOptionSelectionModelList();
        }
        else
        {
            _userOptionPreferences = Session["UserOptionPreferences"].ToString()
                .Deserialize<List<OptionSelectionModel>>();
        }
        if (_userOptionPreferences.Where(g => g.Selected).Count() == 0)
        {
            foreach (var userOptionPreference in _userOptionPreferences)
            {
                userOptionPreference.Selected = true;
            }
        }
        UserOptionPreferences = _userOptionPreferences; // persists via the setter
        return _userOptionPreferences;
    }
    private set
    {
        _userOptionPreferences = value;
        Session["UserOptionPreferences"] = _userOptionPreferences.SerializeObject();
    }
}
Following this, I've overridden (not sure that's the right conjugation of "to override" :) OnActionExecuting():
protected override void OnActionExecuting(ActionExecutingContext filterContext)
{
    GetOptionSelections();
    base.OnActionExecuting(filterContext);
}
GetOptionSelections()...
private void GetOptionSelections()
{
    if (String.IsNullOrEmpty(Request["optionCode"])) return;
    var newOptionCode = Request["optionCode"];
    foreach (var userOptionPreference in UserOptionPreferences)
    {
        if (userOptionPreference.OptionCode == newOptionCode)
            userOptionPreference.Selected = !userOptionPreference.Selected;
    }
}
This code can surely be improved, but right now I just want to make it work, and it doesn't. There are probably other issues too (quite sure, actually), but I believe the main one is that OnActionExecuting is called by every action in a controller that inherits from BaseController, so it keeps toggling userOptionPreferences.Selected on and off. I don't know how to make GetOptionSelections() be called only once per View: something like the old Page_Load, but for MVC.
Last update AKA solution
Ok, using the session way, I've managed to store this information.
The other issue wasn't really on topic for this question; I solved it by creating a new action that takes care of handling the option change, then redirects to the caller URL (using the usual returnUrl parameter, but as an action parameter).
This way, the option change is done only once per call.
The only thing I don't really like is that I can't simply work with the UserOptionPreferences property in place, as that changes only the value in memory, not the session value, so I have to set the property with the new state each time: not a big deal, but not nice either.
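For reference, a hedged sketch of the option-toggling action described above (names are illustrative, and it assumes the action lives on the BaseController so it can reach the property's private setter):

public ActionResult ToggleOption(string optionCode, string returnUrl)
{
    var preferences = UserOptionPreferences;
    foreach (var preference in preferences)
    {
        if (preference.OptionCode == optionCode)
            preference.Selected = !preference.Selected;
    }
    UserOptionPreferences = preferences; // re-assign so the setter persists to Session
    return Redirect(returnUrl);
}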
This is a place to use session.
The session will keep your setting between requests while keeping it out of the url querystring. It seems that you have probably tried this already, but try it again and if you have problems ask again. I think it will be the best way for you to solve this problem.

Reliable and efficient way to handle Azure Table Batch updates

I have an IEnumerable that I'd like to add to an Azure Table in the most efficient way possible. Since every batch write has to be directed to the same PartitionKey, and there is a limit of 100 rows per write...
Does anyone want to take a crack at implementing this the "right" way as referenced in the TODO section? I'm not sure why MSFT didn't finish the task here...
Also, I'm not sure whether error handling will complicate this, or what the correct way to implement it is. Here is the code from the Microsoft patterns & practices team for the Windows Azure "Tailspin Toys" demo:
public void Add(IEnumerable<T> objs)
{
    // todo: Optimize: The Add method that takes an IEnumerable parameter should
    // check the number of items in the batch and the size of the payload before
    // calling the SaveChanges method with the SaveChangesOptions.Batch option.
    // For more information about batches and Windows Azure table storage, see the
    // section, "Transactions in aExpense," in Chapter 5, "Phase 2: Automating
    // Deployment and Using Windows Azure Storage," of the book, Windows Azure
    // Architecture Guide, Part 1: Moving Applications to the Cloud, available at
    // http://msdn.microsoft.com/en-us/library/ff728592.aspx.
    TableServiceContext context = this.CreateContext();
    foreach (var obj in objs)
    {
        context.AddObject(this.tableName, obj);
    }
    var saveChangesOptions = SaveChangesOptions.None;
    if (objs.Distinct(new PartitionKeyComparer()).Count() == 1)
    {
        saveChangesOptions = SaveChangesOptions.Batch;
    }
    context.SaveChanges(saveChangesOptions);
}

private class PartitionKeyComparer : IEqualityComparer<TableServiceEntity>
{
    public bool Equals(TableServiceEntity x, TableServiceEntity y)
    {
        return string.Compare(x.PartitionKey, y.PartitionKey, true,
            System.Globalization.CultureInfo.InvariantCulture) == 0;
    }

    public int GetHashCode(TableServiceEntity obj)
    {
        return obj.PartitionKey.GetHashCode();
    }
}
Well, we (the patterns & practices team) just optimized for showing other things we considered useful. The code above is not really a "general purpose library", but rather a specific method for the sample that uses it.
At the time we thought that adding that extra error handling would not add much, and we decided to keep it simple, but... we might have been wrong.
Anyway, if you follow the link in the //TODO:, you will find another section of a previous guide we wrote that talks a little more about error handling in "complex" storage transactions (not in the "ACID" sense though, as transactions "à la DTC" are not supported in Windows Azure Storage).
Link is this: http://msdn.microsoft.com/en-us/library/ff803365.aspx
The limitations are listed in more detail there:
Only one instance of the entity should be present in the batch
Max 100 entities or 4 MB payload
Same PartitionKey (which is being handled in the code: notice that "batch" is only specified if there's a single Partition key)
etc.
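For what it's worth, a hedged sketch of what honoring those limits could look like in the sample's Add method, grouping by PartitionKey and chunking to 100 entities per transaction (it assumes T derives from TableServiceEntity; the 4 MB payload check is still omitted):

public void Add(IEnumerable<T> objs)
{
    // One entity-group transaction per PartitionKey, at most 100 entities each.
    foreach (var partition in objs.GroupBy(o => o.PartitionKey))
    {
        var chunk = new List<T>(100);
        foreach (var obj in partition)
        {
            chunk.Add(obj);
            if (chunk.Count == 100)
            {
                SaveBatch(chunk);
                chunk.Clear();
            }
        }
        if (chunk.Count > 0)
        {
            SaveBatch(chunk);
        }
    }
}

private void SaveBatch(IEnumerable<T> batch)
{
    TableServiceContext context = this.CreateContext();
    foreach (var obj in batch)
    {
        context.AddObject(this.tableName, obj);
    }
    context.SaveChanges(SaveChangesOptions.Batch);
}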
Adding some extra error handling should not overcomplicate things too much, but it depends on the type of app you are building on top of this and whether you prefer to handle this higher or lower in your app stack. In our example, the app would never expect more than 100 entities anyway, so it would simply bubble the exception up if that situation happened (because it would be truly exceptional). Same with the total size. The use cases implemented in the app make it impossible to have the same entity in the same collection, so again, that should never happen (and if it did, it would simply throw).
All "entity group transactions" limitations are documented here: http://msdn.microsoft.com/en-us/library/dd894038.aspx
Let us know how it goes! I'm also interested to know if other pieces of the guide were useful for you.
