Reliable and efficient way to handle Azure Table Batch updates - linq

I have an IEnumerable<T> that I'd like to add to Azure Table storage in the most efficient way possible. Every batch write has to be directed to the same PartitionKey, with a limit of 100 rows per write...
Does anyone want to take a crack at implementing this the "right" way, as referenced in the TODO section? I'm not sure why MSFT didn't finish the task here...
Also, I'm not sure whether error handling will complicate this, or what the correct way to implement it is. Here is the code from the Microsoft patterns & practices team for the Windows Azure "Tailspin Toys" demo:
public void Add(IEnumerable<T> objs)
{
    // TODO: Optimize: The Add method that takes an IEnumerable parameter should check
    // the number of items in the batch and the size of the payload before calling the
    // SaveChanges method with the SaveChangesOptions.Batch option. For more information
    // about batches and Windows Azure table storage, see the section "Transactions in
    // aExpense" in Chapter 5, "Phase 2: Automating Deployment and Using Windows Azure
    // Storage," of the book Windows Azure Architecture Guide, Part 1: Moving Applications
    // to the Cloud, available at http://msdn.microsoft.com/en-us/library/ff728592.aspx.
    TableServiceContext context = this.CreateContext();

    foreach (var obj in objs)
    {
        context.AddObject(this.tableName, obj);
    }

    var saveChangesOptions = SaveChangesOptions.None;
    if (objs.Distinct(new PartitionKeyComparer()).Count() == 1)
    {
        saveChangesOptions = SaveChangesOptions.Batch;
    }

    context.SaveChanges(saveChangesOptions);
}
private class PartitionKeyComparer : IEqualityComparer<TableServiceEntity>
{
    public bool Equals(TableServiceEntity x, TableServiceEntity y)
    {
        return string.Compare(x.PartitionKey, y.PartitionKey, true,
            System.Globalization.CultureInfo.InvariantCulture) == 0;
    }

    public int GetHashCode(TableServiceEntity obj)
    {
        // Note: this hash is case-sensitive even though Equals above is not,
        // which strictly speaking violates the Equals/GetHashCode contract.
        return obj.PartitionKey.GetHashCode();
    }
}

Well, we (the patterns & practices team) just optimized for showing other things we considered useful. The code above is not really a "general purpose library", but rather a specific method for the sample that uses it.
At that moment we thought that adding that extra error handling would not add much, and we decided to keep it simple, but... we might have been wrong.
Anyway, if you follow the link in the //TODO:, you will find another section of a previous guide we wrote that talks a little more about error handling in "complex" storage transactions (not in the "ACID" sense though, as transactions "a la DTC" are not supported in Windows Azure Storage).
The link is this: http://msdn.microsoft.com/en-us/library/ff803365.aspx
The limitations are listed in more detail there:
Only one instance of the entity should be present in the batch
Max 100 entities or 4 MB payload
Same PartitionKey (which is handled in the code: notice that "batch" is only specified if there's a single PartitionKey)
etc.
Adding some extra error handling should not overcomplicate things too much, but it depends on the type of app you are building on top of this, and on whether you prefer to handle this higher or lower in your app stack. In our example, the app would never expect more than 100 entities, so it would simply bubble the exception up if that situation happened (because it should be truly exceptional). The same goes for the total size. The use cases implemented in the app make it impossible to have the same entity twice in the same collection, so again, that should never happen (and if it did, it would simply throw).
All "entity group transactions" limitations are documented here: http://msdn.microsoft.com/en-us/library/dd894038.aspx
Let us know how it goes! I'm also interested to know if other pieces of the guide were useful for you.


What is the convention around derivative information?

I am working on a service that provides information about a few related entities, somewhat like a database. Suppose that there's calls to retrieve information about a school:
service MySchool {
  rpc GetClassRoom (ClassRoomRequest) returns (ClassRoom);
  rpc GetStudent (StudentRequest) returns (Student);
}
Now, suppose that I want to find out a class room's information, I'd receive a proto that looks like so:
message ClassRoom {
  string id = 1;
  string address = 2;
  string teacher = 3;
}
Sometimes I also want to know all of the students of the classroom. I am struggling to think which is the better design pattern.
Option A) Add an extra rpc like so: rpc GetClassRoomStudents (ClassRoomRequest) returns (ClassRoomStudents), where ClassRoomStudents has a single field, repeated Student students. This technique requires more than one call to get all the information we want (and many calls if we want information for more than one classroom).
Option B) Add an extra repeated Student students field to the ClassRoom proto, and either B') fill it up only when necessary, or B") fill it up whenever the server receives a GetClassRoom call. This may sometimes fetch extra information, or lead to ambiguity about which fields are populated.
I am not sure what's the best / most conventional way of dealing with this. How have some of you dealt with this?
There is no simple answer. It's a tradeoff between simplicity (option A) and performance (option B), and it depends on the situation which solution is best.
In general, I'd recommend going with the simple solution first, unless your measurements demonstrate that it leads to performance issues. At that point, it's easy to add repeated Student students to ClassRoom and a field bool fetch_students [default=false] to ClassRoomRequest. Then clients are free to continue using the simple API, or they can upgrade to the more performant API if they need to.
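For illustration, a minimal sketch of that upgrade path in the question's proto style (the field names and numbers here are mine, not from the original service; in proto3 a bool already defaults to false):

message ClassRoomRequest {
  string id = 1;
  // When true, the server also populates ClassRoom.students.
  // Clients that never set it keep the original, cheaper behavior.
  bool fetch_students = 2;
}

message ClassRoom {
  string id = 1;
  string address = 2;
  string teacher = 3;
  repeated Student students = 4; // only filled when requested
}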
Note that this isn't specific to gRPC; the same issue is seen in REST APIs, and basically almost any request/response model.

retrieving data from arbitrary memory addresses using VSIX

I am working on developing a debugger plugin for Visual Studio using VSIX. My problem is that I have an array of addresses but I cannot point IDebugMemoryBytes2 at a particular address. I use DEBUG_PROPERTY_INFO to get the array of addresses, and I am also able to set the context to the particular addresses in the array using the Add function in IDebugMemoryContext2. However, I need to use the ReadAt function to retrieve n bytes from a specified address (from IDebugMemoryBytes2).
Does anyone have any idea how to retrieve data from arbitrary addresses from memory?
I am adding more information on the same:
I am using the Microsoft Visual Studio Extensibility package to build my debugger plugin. In the application I am trying to debug with this plugin, there is a double pointer, and I need to read its values to process them further in my plugin. There is no way to display all the pointed-to variables in the watch window, so I cannot get the DEBUG_PROPERTY_INFO for each of the blocks of the array that the pointer points to. That is the problem I am trying to address: there is no way for me to read the memory pointed to by this double pointer.
Now as for the events in the debuggee process, since the plugin is for debugging variables, I put a breakpoint at a place where I know this pointer is populated and then come back to the plugin for further evaluation.
As a start, I was somehow able to get the starting address of each of the arrays. But still, I am not able to read x bytes of memory from each of these starting addresses.
For example, if I have int **ptr = // pointing to something
then I have the addresses present in ptr[0], ptr[1], ptr[2], etc. But I need to go to each of these addresses and fetch the memory block they are pointing to.
For this, after much search, I found this link: https://macropolygon.wordpress.com/2012/12/16/evaluating-debugged-process-memory-in-a-visual-studio-extension/ which seems to address exactly my issue.
So, to use the expression evaluator functions, I need an IDebugStackFrame2 object to get the ExpressionContext. To get this object, I need to register for events in the debuggee process, such as breakpoint events. As described in the post, I did:
public int Event(IDebugEngine2 engine, IDebugProcess2 process,
    IDebugProgram2 program, IDebugThread2 thread,
    IDebugEvent2 debugEvent, ref Guid riidEvent, uint attributes)
{
    if (debugEvent is IDebugBreakpointEvent2)
    {
        this.thread = thread;
    }
    return VSConstants.S_OK;
}
And my registration is like:
private void GetCurrentThread()
{
    uint cookie;
    DBGMODE[] modeArray = new DBGMODE[1];

    // Get the Debugger service.
    debugService = Package.GetGlobalService(typeof(SVsShellDebugger)) as IVsDebugger;
    if (debugService != null)
    {
        // Register for debug events.
        // Assumes the current class implements IDebugEventCallback2.
        debugService.AdviseDebuggerEvents(this, out cookie);
        debugService.AdviseDebugEventCallback(this);

        debugService.GetMode(modeArray);
        modeArray[0] = modeArray[0] & ~DBGMODE.DBGMODE_EncMask;
        if (modeArray[0] == DBGMODE.DBGMODE_Break)
        {
            GetCurrentStackFrame();
        }
    }
}
But this doesn't seem to invoke the Event function at all and hence, I am not sure how to get the IDebugThread2 object.
I also tried the other way suggested in the same post:
namespace Microsoft.VisualStudio.Debugger.Interop.Internal
{
    [InterfaceType(ComInterfaceType.InterfaceIsIUnknown), Guid("1DA40549-8CCC-48CF-B99B-FC22FE3AFEDF")]
    public interface IDebuggerInternal11
    {
        [DispId(0x6001001f)]
        IDebugThread2 CurrentThread
        {
            [return: MarshalAs(UnmanagedType.Interface)]
            [MethodImpl(MethodImplOptions.InternalCall, MethodCodeType = MethodCodeType.Runtime)]
            get;
            [param: In, MarshalAs(UnmanagedType.Interface)]
            [MethodImpl(MethodImplOptions.InternalCall, MethodCodeType = MethodCodeType.Runtime)]
            set;
        }
    }
}
private void GetCurrentThread()
{
    debugService = Package.GetGlobalService(typeof(SVsShellDebugger)) as IVsDebugger;
    if (debugService != null)
    {
        IDebuggerInternal11 debuggerServiceInternal = (IDebuggerInternal11)debugService;
        thread = debuggerServiceInternal.CurrentThread;
        GetCurrentStackFrame();
    }
}
But in this method I think I am missing something, because after the line
IDebuggerInternal11 debuggerServiceInternal = (IDebuggerInternal11)debugService;
executes and I check the values of the debuggerServiceInternal variable, I see a System.Security.SecurityException for CurrentThread and CurrentStackFrame (and so, obviously, the next line causes a crash). I googled the error and found I was missing the ComImport attribute on the class. So I added that, and now I get a System.AccessViolationException: "Attempted to read or write protected memory. This is often an indication that other memory is corrupt."
I am new to C# programming as well and hence, it is a bit difficult to grasp many things in short duration. I am lost as to how to proceed further now.
Any help in the same or suggestions to try another way to achieve my objective will be greatly appreciated.
Thanks a lot,
Esash
After much searching, and since I am short on time and need a quick solution, it seems for now that the quickest way to solve this problem is to hack the .natvis files to make them display all the elements of the pointer, and then use the same old way of using the IDebug* interface methods to access and retrieve the memory context for each of the pointer elements. But after posting the same question on the MSDN forums, I think the proper answer to this problem is the one given by Greggs:
"For reading memory, if you want a fast way to do this, you just want the raw memory, and the debug engine of the target is the normal Visual Studio native engine (in other words, you aren't creating your own debug engine), I would recommend referencing Microsoft.VisualStudio.Debugger.Engine. You can then use DkmStackFrame.ExtractFromDTEObject to get the DkmStackFrame object. This will give you the DkmProcess object and you can call DkmProcess.ReadMemory to read memory from the target."
Now, after trying a lot to understand how to implement this, I found that you can accomplish it using DkmProcess.GetProcesses() and doing a ReadMemory on the process returned.
Now there is a question: what if more than one process is returned? Well, I tried attaching many processes to the current debugging session, and attaching many processes to the debuggee process as well, but found that DkmProcess.GetProcesses() gets only the one from which I regained control, and not the other processes I am attached to. I am not sure if this will work in all cases, but it worked this way for me, and it might work for anyone with similar requirements.
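For reference, a hedged sketch of that approach (API names from Microsoft.VisualStudio.Debugger.Engine as I understand them; verify the exact ReadMemory overload against the installed SDK):

using System.Linq;
using Microsoft.VisualStudio.Debugger;

byte[] ReadTargetMemory(ulong address, int byteCount)
{
    // While stopped at a breakpoint, GetProcesses() returned only the
    // process I regained control from, so Single() worked in my tests.
    DkmProcess process = DkmProcess.GetProcesses().Single();

    var buffer = new byte[byteCount];
    process.ReadMemory(address, DkmReadMemoryFlags.None, buffer);
    return buffer;
}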
Using the .natvis files to accomplish this means using IndexListItems for VS2013 and earlier, and CustomListItems for VS2015 and later, and, to make it look prettier, the "no-derived" attribute. There is no way to make the Synthetic tag display only the base address of each variable, so the above attribute is the best way to go, but it is not available in VS2013 and earlier. (The base address might get displayed, but for people who want to go beyond displaying contents and also access the memory context of each pointer element, the Synthetic tag is not the right thing.)
I hope this helps some developer who struggled like me using IDebug* interfaces. For reference, I am also giving the link to the msdn forum where my question was answered.
https://social.msdn.microsoft.com/Forums/en-US/030cef1c-ee79-46e9-8e40-bfc59f14cc34/how-can-i-send-a-custom-debug-event-to-my-idebugeventcallback2-handler?forum=vsdebug
Thanks.

How can the User Interface know which commands are allowed against an Aggregate Root?

The UI is decoupled from the domain, but the UI should try its best to never allow the user to issue commands that are sure to fail.
Consider the following example (pseudo-code):
DiscussionController
    #Security(is_logged)
    #Method('POST')
    #Route('addPost')
    addPostToDiscussionAction(request)
        discussionService.postToDiscussion(
            new PostToDiscussionCommand(request.discussionId, session.myUserId, request.bodyText)
        )

    #Method('GET')
    #Route('showDiscussion/{discussionId}')
    showDiscussionAction(request)
        discussionWithAllThePosts = discussionFinder.findById(request.discussionId)
        canAddPostToThisDiscussion = ???
        // render the discussion to the user, and use `canAddPostToThisDiscussion` to show/hide the form
        // from which the user can send a request to `addPostToDiscussionAction`.
        renderDiscussion(discussionWithAllThePosts, canAddPostToThisDiscussion)

PostToDiscussionCommand
    constructor(discussionId, authorId, bodyText)

DiscussionApplicationService
    postToDiscussion(command)
        discussion = discussionRepository.get(command.discussionId)
        author = collaboratorService.authorFrom(discussion.Id, command.authorId)
        post = discussion.createPost(postRepository.nextIdentity(), author, command.bodyText)
        postRepository.add(post)

DiscussionAggregate
    // originalPoster is the Author that started the discussion
    constructor(discussionId, originalPoster)

    // if the discussion is closed, you can't create a post,
    // *unless* you're the author (OP) that started the discussion
    createPost(postId, author, bodyText)
        if (this.close && !this.originalPoster.equals(author))
            throw "Discussion is closed."
        return new Post(this.discussionId, postId, author, bodyText)

    close()
        if (this.close)
            throw "Discussion already closed."
        this.close = true

    isClosed()
        return this.close
The user goes to /showDiscussion/123 and sees the discussion with the <form> from which he can submit a new post, but only if the discussion is not closed or the current user is the one who started it.
Or, the user goes to /showDiscussion/123 where it's presented as a REST-as-in-HATEOAS API. A hypermedia link to /addPost will be provided, but only if the discussion is not closed or the authenticated user is the one who started that discussion.
How can I provide that knowledge into the UI?
I could code that into the read model,
canAddPostToThisDiscussion = !discussionWithAllThePosts.discussion.isClosed
&& discussionWithAllThePosts.discussion.originalPoster.id == session.currentUserId
but then I need to maintain that logic and keep it in sync with the write model. This is a fairly simple example, but as the state transitions of an aggregate become more complex, it may become really hard to do. I'd like to imagine my aggregates as state machines, with their workflows (like the RESTBucks example). But I don't like the idea of moving that business logic outside my domain model and putting it in a service that both the read side and the write side can use.
Maybe this isn't the best example, but as an aggregate root is basically a consistency boundary, we know that we need to prevent invalid state transitions in its life cycle, and in each transitions to a new state some operations may become illegal and vice versa. So, how can the user interface know what is allowed or not? What are my alternative? How should I approach to this problem? Do you have any example to provide?
How can I provide that knowledge into the UI?
The easiest way is probably to share the domain model's understanding of what is possible with the UI. Ta Da.
Here's a way to think about it -- in the abstract, all of the write model logic has a fairly simple looking shape.
{
    // Notice that these statements are queries
    State currentState = bookOfRecord.getState()
    State nextState = model.computeNextState(currentState, command)

    // This statement is a command
    bookOfRecord.replace(currentState, nextState)
}
Key ideas here: the book of record is the authority of state; everybody else (including the "write model") is working with a stale copy.
What the model represents is a collection of constraints that ensure that the business invariant is satisfied. Over the lifetime of a system, there might be many different sets of constraints, as the understanding of the business changes.
The write model is the authority for which collection of constraints is currently enforced when replacing the state in the book of record. Everybody else is working with a stale copy.
The staleness is something to keep in mind; in a distributed system, any validation you perform is provisional -- unless you have a lock on the state and a lock on the model, either could be changed while your messages are in flight.
This means that your validation is approximate anyway, so you don't need to be too concerned that you have all of the fiddly details right. You assume that your stale copy of the state is approximately right, and your current understanding of the model is approximately right, and if the command is valid given those pre-conditions, then it is checked enough to send.
I don't like the idea to move that business logic outside my domain model, and put it in a service that both the read side and write side can use.
I think the best answer here is "get over it". I get it: having the business logic inside the aggregate root is what the literature tells us to do. But if you continue to refactor, identifying common patterns and separating concerns, you'll see that entities are really just plumbing around a reference to state and a functional core.
AggregateRoot {
    final Reference<State> bookOfRecord;
    final Model<State,Command> theModel;

    onCommand(Command command) {
        State currentState = bookOfRecord.getState()
        State nextState = model.computeNextState(currentState, command)
        bookOfRecord.replace(currentState, nextState)
    }
}
All we've done here is take the "construct the next state" logic, which we used to have scattered throughout the AggregateRoot, and encapsulate it into a separate responsibility boundary. Here it's specific to the root itself, but an equivalent refactoring is to pass it in as an argument.
AggregateRoot {
    final Reference<State> bookOfRecord;

    onCommand(Model<State,Command> theModel, Command command) {
        State currentState = bookOfRecord.getState()
        State nextState = model.computeNextState(currentState, command)
        bookOfRecord.replace(currentState, nextState)
    }
}
In other words, the model, teased out from the plumbing of tracking state, is a domain service. The domain logic within the domain service is just as much a part of the domain model as the domain logic within the aggregate -- the two implementations are dual to one another.
And there's no reason that a read model of your domain shouldn't have access to a domain service.
I don't like the idea of sharing domain knowledge (code) between the write and the read models, as you will have to continuously keep them in sync, and that's really a challenge even if you are the only developer in your company.
But the good news is that you don't have to duplicate anything. If you designed your Aggregate to be pure, with no side effects, as you should (!), you can simply send it the command but without persisting the changes. If the command throws an exception, the command would not succeed; otherwise it would. In the case of CQRS this is even better, as you have a third outcome: idempotent command detection, in which case the command succeeds but has no effect (no events are raised but no exception is thrown either), and the UI might find this interesting.
So, as an example you could have something like this:
DiscussionController
    #Security(is_logged)
    #Method('POST')
    #Route('addPost')
    addPostToDiscussionAction(request)
        discussionService.postToDiscussion(
            new PostToDiscussionCommand(request.discussionId, session.myUserId, request.bodyText)
        )

    #Method('GET')
    #Route('showDiscussion/{discussionId}')
    showDiscussionAction(request)
        discussionWithAllThePosts = discussionFinder.findById(request.discussionId)
        canAddPostToThisDiscussion = discussionService.canPostToDiscussion(request.discussionId, session.myUserId, "some sample body")
        // render the discussion to the user, and use `canAddPostToThisDiscussion` to show/hide the form
        // from which the user can send a request to `addPostToDiscussionAction`.
        renderDiscussion(discussionWithAllThePosts, canAddPostToThisDiscussion)

DiscussionApplicationService
    postToDiscussion(command)
        discussion = discussionRepository.get(command.discussionId)
        author = collaboratorService.authorFrom(discussion.Id, command.authorId)
        post = discussion.createPost(postRepository.nextIdentity(), author, command.bodyText)
        postRepository.add(post)

    canPostToDiscussion(discussionId, authorId, bodyText)
        discussion = discussionRepository.get(discussionId)
        author = collaboratorService.authorFrom(discussion.Id, authorId)
        try
        {
            post = discussion.createPost(postRepository.nextIdentity(), author, bodyText)
            return true
        }
        catch (exception)
        {
            return false
        }
You could even have a method named whyCantPostToDiscussion that would return the exception or the exception message and display it in the UI.
There is only one issue with the code: the call to postRepository.nextIdentity(), because it would increase the next ID every time; you could replace it with something like postRepository.getBiggestIdentity(), which should have no side effect.
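In the same pseudo-code style, a whyCantPostToDiscussion sketch (illustrative only, mirroring canPostToDiscussion above) could look like this:

whyCantPostToDiscussion(discussionId, authorId, bodyText)
    discussion = discussionRepository.get(discussionId)
    author = collaboratorService.authorFrom(discussion.Id, authorId)
    try
    {
        discussion.createPost(postRepository.getBiggestIdentity(), author, bodyText)
        return null // no objection; the command would succeed
    }
    catch (exception)
    {
        return exception.message
    }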
I find it is rare that authorization is actually part of the domain. If it isn't, it makes sense to move that logic out into its own service which the UI and the domain can make use of.
I like to build up a set of rules using the specification pattern. I find it to be a fairly elegant way to build up the rules.
This also plays very well in a CQRS context, as you can run each command through the 'rules engine' before it gets issued to your ARs. If you push queries through a message routing system, you can do the same for queries. I've had a lot of success with this approach.
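As an illustration, here is a minimal sketch of such a rule in the specification pattern (all type and member names are hypothetical, not from the original post):

public interface ISpecification<T>
{
    bool IsSatisfiedBy(T candidate);
}

// "May this user post to this discussion?" expressed as a reusable rule
// that both the UI layer and the command pipeline can evaluate.
public sealed class CanPostToDiscussion : ISpecification<Discussion>
{
    private readonly string userId;

    public CanPostToDiscussion(string userId)
    {
        this.userId = userId;
    }

    public bool IsSatisfiedBy(Discussion discussion)
    {
        return !discussion.IsClosed || discussion.OriginalPosterId == userId;
    }
}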
The response you are looking for is HATEOAS; look no further. You must implement your REST API as truly RESTful (maturity level 3), adhering to hypertext to model the state transitions and returning links to the clients (the UI being one of those). These links represent the actions the user can execute in their context according to the model state. It's simple: if you return a link from the server, you bind it to a button in the UI; if you don't return the link because of business invariants, you don't show the button in the UI. There are many more concepts behind it, such as designing a good API that supports a well-designed domain model behind it, but this is the general idea, and it fits exactly what you want.

Implementing thread-safe, parallel processing

I am trying to convert an existing process so that it supports multi-threading and concurrency, to make the solution more robust and reliable.
Take the example of an emergency alert system. When a worker clocks-in, a new Recipient object is created with their information and added to the Recipients collection. Conversely, when they clock-out, the object is removed. And in the background, when an alert occurs, the alert engine will iterate through the same list of Recipients (foreach), calling SendAlert(...) on each object.
Here are some of my requirements:
Adding a recipient should not block if an alert is in progress.
Removing a recipient should not block if an alert is in progress.
Adding or removing a recipient should not affect the list of recipients used by an in-progress alert.
I've been looking at the Task and Parallel classes as well as the BlockingCollection and ConcurrentQueue classes but am not clear what the best approach is.
Is it as simple as using a BlockingCollection? After reading a ton of documentation, I'm still not sure what happens if Add is called while I am enumerating the collection.
UPDATE
A colleague referred me to the following article, which describes the ConcurrentBag class and how each operation behaves:
http://www.codethinked.com/net-40-and-system_collections_concurrent_concurrentbag
Based on the author's explanation, it appears that this collection will (almost) serve my purposes. I can do the following:
Create a new collection
var recipients = new ConcurrentBag<Recipient>();
When a worker clocks-in, create a new Recipient and add it to the collection:
recipients.Add(new Recipient());
When an alert occurs, the alert engine can iterate through the collection at that time because GetEnumerator uses a snapshot of the collection items.
foreach (var recipient in recipients)
    recipient.SendAlert(...);
When a worker clocks-out, remove the recipient from the collection:
???
The ConcurrentBag does not provide a way to remove a specific item, and as far as I can tell, none of the concurrent classes do. Am I missing something? Aside from this, ConcurrentBag does everything I need.
ConcurrentBag<T> should definitely be the best-performing class of the bunch for a case like yours. Enumeration works exactly as your friend describes, so it should serve the scenario you have laid out. However, since you have to remove specific items from this set, the only type that's going to work for you is ConcurrentDictionary<K, V>. All the other types only offer a TryTake method, which in the case of ConcurrentBag<T> is indeterminate, and in the case of ConcurrentQueue<T> or ConcurrentStack<T> is strictly ordered.
For broadcasting you would just do:
ConcurrentDictionary<string, Recipient> myConcurrentDictionary = ...;
...

foreach (Recipient recipient in myConcurrentDictionary.Values)
{
    ...
}
The enumerator is once again a snapshot of the dictionary in that instant.
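Clock-in and clock-out would then look something like this (keyed by a hypothetical worker id so that the exact entry can be removed later):

using System.Collections.Concurrent;

var recipients = new ConcurrentDictionary<string, Recipient>();

// Clock-in: add an entry for this worker.
recipients.TryAdd(workerId, new Recipient(/* ... */));

// Clock-out: remove exactly that worker's entry.
recipients.TryRemove(workerId, out Recipient removed);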
I came into work this morning to an e-mail from a friend that gives me the following two answers:
1 - With regards to how the collections in the Concurrent namespace work, most of them are designed to allow additions and subtractions from the collection without blocking and are thread-safe even when in the process of enumerating the collection items.
With a "regular" collection, getting an enumerator (via GetEnumerator) sets a "version" value that is changed by any operation that affects the collection items (such as Add, Remove or Clear). The IEnumerator implementation will compare the version set when it was created against the current version of the collection. If different, an exception is thrown and enumeration ceases.
The Concurrent collections are designed using segments that make it very easy to support multi-threading. But in the case of enumerating, they actually create a snapshot copy of the collection at the time GetEnumerator is called, and the enumerator works against this copy. That allows changes to be made to the collection without adverse effects on the enumerator. Of course this means that the enumeration will know nothing of these changes, but it sounds like your use case allows this.
2 - As far as the specific scenario you are describing, I don't believe that a Concurrent collection is needed. You can wrap a standard collection using a ReaderWriterLock and apply the same logic as the Concurrent collections when you need to enumerate.
Here's what I suggest:
public class RecipientCollection
{
    private Collection<Recipient> _recipients = new Collection<Recipient>();
    private ReaderWriterLock _lock = new ReaderWriterLock();

    public void Add(Recipient r)
    {
        _lock.AcquireWriterLock(Timeout.Infinite);
        try
        {
            _recipients.Add(r);
        }
        finally
        {
            _lock.ReleaseWriterLock();
        }
    }

    public void Remove(Recipient r)
    {
        _lock.AcquireWriterLock(Timeout.Infinite);
        try
        {
            _recipients.Remove(r);
        }
        finally
        {
            _lock.ReleaseWriterLock();
        }
    }

    public IEnumerable<Recipient> ToEnumerable()
    {
        _lock.AcquireReaderLock(Timeout.Infinite);
        try
        {
            var list = _recipients.ToArray();
            return list;
        }
        finally
        {
            _lock.ReleaseReaderLock();
        }
    }
}
The ReaderWriterLock ensures that operations are only blocked if another operation that changes the collection's contents is in progress. As soon as that operation completes, the lock is released and the next operation can proceed.
Your alert engine would use the ToEnumerable() method to obtain a snapshot copy of the collection at that time and enumerate the copy.
Depending on how often an alert is sent and how often changes are made to the collection, this could be an issue. But you could implement some type of version property that changes whenever an item is added or removed, and the alert engine could check this property to see if it needs to call ToEnumerable() again to get the latest version. Or encapsulate this by caching the array inside the RecipientCollection class and invalidating the cache when an item is added or removed.
HTH
There is much more to an implementation like this than just the parallel processing aspects, durability probably being paramount among them. Have you considered building this using an existing PubSub technology like say... Azure Topics or NServiceBus?
Your requirements strike me as a good fit for the way standard .NET events are triggered in C#. I don't know offhand whether the VB syntax gets compiled to similar code or not. The standard pattern looks something like:
public event EventHandler Triggered;

protected void OnTriggered()
{
    // capture the list so that you don't see changes while the
    // event is being dispatched.
    EventHandler h = Triggered;
    if (h != null)
        h(this, EventArgs.Empty);
}
Alternatively, you could use an immutable list class to store the recipients. Then when the alert is sent, it will first take the current list and use it as a "snapshot" that cannot be modified by adding and removing while you are sending the alert. For example:
class Alerter
{
    private ImmutableList<Recipient> recipients;

    public void Add(Recipient recipient)
    {
        recipients = recipients.Add(recipient);
    }

    public void Remove(Recipient recipient)
    {
        recipients = recipients.Remove(recipient);
    }

    public void SendAlert()
    {
        // make a local reference to the current list so
        // you are not affected by any calls to Add/Remove
        var current = recipients;
        foreach (var r in current)
        {
            // send alert to r
        }
    }
}
You will have to find an implementation of an ImmutableList, but you should be able to find several without too much work. In the SendAlert method as I wrote it, I probably didn't need to make an explicit local to avoid problems, as the foreach loop would have done that itself, but I think the copy makes the intention clearer.
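As a side note, an ImmutableList<T> implementation ships in the System.Collections.Immutable NuGet package these days. A minimal sketch of the swap (assuming that package), with one caveat worth calling out in a comment:

using System.Collections.Immutable;

class Alerter
{
    // Start from the empty singleton; Add/Remove return new list instances.
    private ImmutableList<Recipient> recipients = ImmutableList<Recipient>.Empty;

    public void Add(Recipient recipient)
    {
        // Caveat: this read-modify-write is not atomic. If Add/Remove can be
        // called concurrently, wrap the swap in a lock or use
        // ImmutableInterlocked.Update(ref recipients, r => r.Add(recipient)).
        recipients = recipients.Add(recipient);
    }
}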

C# lock keyword, I think I'm using this wrong

I recently had a problem with multiple form posting in an ASP.NET MVC application. The situation was basically this: if someone intentionally hammered the submit button, they could force data to be posted multiple times despite validation logic (both server and client side) that was intended to prohibit this. This occurred because their posts would go through before the Transaction.Commit() method could run on the initial request (this is all done in NHibernate).
The MVC action method looked kind of like this:
public ActionResult Create(ViewModelObject model)
{
    if (ModelState.IsValid)
    {
        // ...
        var member = membershipRepository.GetMember(User.Identity.Name);
        // do stuff with member
        // update member
    }
}
There were a lot of solutions proposed, but I found the C# lock statement, and gave it a try, so I altered my code to look like this...
public ActionResult Create(ViewModelObject model)
{
    if (ModelState.IsValid)
    {
        // ...
        var member = membershipRepository.GetMember(User.Identity.Name);
        lock (member)
        {
            // do stuff with member
            // update member
        }
    }
}
It worked! None of my testers can reproduce the bug anymore! We've been hammering away at it for over a day and no one can find any flaw. But I'm not all that experienced with this keyword. I looked it up again to get clarification...
The lock keyword marks a statement block as a critical section by obtaining the mutual-exclusion lock for a given object, executing a statement, and then releasing the lock
Okay, that makes sense. Here is my question.
This was too easy
This solution seemed simple, straightforward, clear, efficient, and clean. It was way too simple. I know better than to think something that complicated has that simple a solution. So I wanted to ask more experienced programmers ...
Is there something bad going on I should be aware of?
No, it's not that easy. Locking only works if the same instance is used.
This will not work:
public IActionResult Submit(MyModel model)
{
    lock (model)
    {
        // will not block, since each post generates its own instance
    }
}
Your example could work. It all depends on whether second-level caching is enabled in NHibernate (and thus whether the same user instance is returned). Note that it will not prevent anything from being posted to the database, just that each post will be saved in sequence.
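If you do want that server-side serialization without depending on caching behavior, one alternative sketch (names mirror the question's snippet; this only serializes requests within a single process, so it won't help across a web farm) is to lock on a dedicated per-user object instead of whatever instance the repository returns:

using System.Collections.Concurrent;

public class SomeController : Controller
{
    // One lock object per user name; lives for the process lifetime.
    private static readonly ConcurrentDictionary<string, object> UserLocks =
        new ConcurrentDictionary<string, object>();

    public ActionResult Create(ViewModelObject model)
    {
        if (ModelState.IsValid)
        {
            var gate = UserLocks.GetOrAdd(User.Identity.Name, _ => new object());
            lock (gate)
            {
                var member = membershipRepository.GetMember(User.Identity.Name);
                // do stuff with member
                // update member
            }
        }
        return View(model); // render as appropriate
    }
}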
Update
Another solution would be to return false from the submit button's click handler after the first press; that prevents the button from submitting the form multiple times.
Here is a jQuery script that will fix the problem for you (it goes through all submit buttons and makes sure they only submit once):
$(document).ready(function() {
    $(':submit').click(function() {
        var $this = $(this);
        if ($this.hasClass('clicked')) {
            alert('You have already clicked on submit, please be patient..');
            return false;
        }
        $this.addClass('clicked');
    });
});
Add it to your layout or to a JavaScript file.
Update2
Note that the jQuery code works in most cases, but remember that any user with a little programming knowledge can use, for instance, HttpWebRequest to spam POSTs to your web server. It's not likely, but it could happen. The point I'm making is that you should not rely on client-side code to handle these problems, since it can be circumvented.
Yeah, it's that easy, but there may be a performance hit. Remember that a Monitor lock restricts that code to be run by only one thread at a time. There is a new thread for each HTTP request, so that means only one of those requests at any given time can access that code. If it's a long-running procedure, or a lot of people are trying to access that part of the site at the same time, you might start to see sluggish responses.
It's that easy, but be careful what object you lock on. It should be the same one for all the threads - for example, it could be a static object.
lock is syntactic sugar for Monitor.Enter and Monitor.Exit in a try/finally, so there is quite a bit going on under the covers.
Also, you should keep an eye out for deadlocks - they can happen when you lock on two or more objects.
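For example, here is a minimal sketch of the classic deadlock shape: two threads taking the same two locks in opposite orders.

using System.Threading;

class DeadlockDemo
{
    private static readonly object lockA = new object();
    private static readonly object lockB = new object();

    static void Thread1()
    {
        lock (lockA)              // thread 1 holds A...
        {
            Thread.Sleep(10);     // widen the race window for the demo
            lock (lockB) { }      // ...and waits for B
        }
    }

    static void Thread2()
    {
        lock (lockB)              // thread 2 holds B...
        {
            Thread.Sleep(10);
            lock (lockA) { }      // ...and waits for A -> deadlock
        }
    }
}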
