Azure Event Grid Blob Storage - prevent double Blob Created events? - azure-eventgrid

I noticed some strange behavior of Blob Storage events during customer testing on Friday, and now I'm wondering whether there are known situations that cause double events (blob created) to fire.
Basically, an external application writes blobs to a container. Most of the blobs fire just one BlobCreated event as usual, but for some reason *.zip blobs (and only those) cause two events to fire close to one another (< 0.5 s apart). The zip file size is usually about 200-250 kB.
I have seen similar issues before, but in those cases the first event always came with
"contentLength": 0
...which also makes them very easy to filter out.
But in this scenario I'm getting two distinct events both with the same (and actual) blob size.
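For illustration, a minimal consumer-side check that skips those zero-length events might look like the sketch below (TypeScript; only the contentLength field comes from the actual Blob Storage event payload, the surrounding handler wiring is hypothetical):

```typescript
// Minimal sketch: ignore BlobCreated events whose payload reports a zero
// content length. The event shape follows the Blob Storage event schema
// referenced in the question; the handler around it is illustrative only.
interface BlobCreatedEvent {
  eventType: string;            // e.g. "Microsoft.Storage.BlobCreated"
  subject: string;              // path to the blob
  data: {
    contentLength?: number;     // 0 on the spurious "empty" events
    url?: string;
  };
}

function shouldProcess(event: BlobCreatedEvent): boolean {
  if (event.eventType !== "Microsoft.Storage.BlobCreated") return false;
  // Skip the zero-length events described above.
  return (event.data.contentLength ?? 0) > 0;
}

export function handleEvents(events: BlobCreatedEvent[]): void {
  for (const event of events.filter(shouldProcess)) {
    console.log(`Processing blob: ${event.subject}`);
    // ... actual processing goes here
  }
}
```

The same condition can also be pushed onto the Event Grid subscription itself via an advanced filter on data.contentLength, so the empty events never reach the handler at all.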
Naturally, this might also be caused by the sending application; I'm trying hard to get the right people online so we can verify and fix it, or possibly reproduce the issue with my own test code.
But is there a more detailed specification available that describes how those events are fired from Blob Storage? I'm also curious whether there are any ways to avoid creating the 0-length event that sometimes seems to occur.
EDIT: Well, I finally got confirmation that this was actually a bug in the sending system, so there is no problem in this case. But still, if someone knows what actually triggers those possible "zero content length" events, that would also help with planning future solutions.

Related

Event sourcing - delete event related files

I'd like to store some of my data in relatively big files (a few GBs per file). I'd like to use event sourcing and save events related to these files, e.g. FileCreated: title, description, timestamp, author, personal, encryptionkey, etc. After a while some of the files won't be needed any longer, and they take up a lot of space, so in order to free up space I need to delete them. Doing so is problematic, because I will still have the history in the event store but not the file in the filesystem. Is there any way to keep integrity and somehow delete both? Or is there a best practice for this problem?
Since I did not get an answer, I will try to answer this myself.
It is possible to remove an event from the history: you create a new event store and filter out the events for the aggregate id you want to get rid of. After you are done, you can switch to the new event store and remove the old one. You probably need to replay the projections as well, so it is very similar to a full migration and takes a lot of time. In the current case that is not a problem, since I only need to do this once a year or so. Another problem with storing this data in the event store is that I either have to stream it from there or duplicate it in order to serve it. The latter is not always a good solution, because sometimes copying takes too long, and to serve the data you need to stream it anyway, otherwise you will run out of memory very fast. So the event store should support streaming attachments.
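As a rough illustration of what I mean by rebuilding the store without one aggregate's events (the store interface and event shape below are made up for the sketch, not from any particular framework):

```typescript
// Sketch only: a hypothetical streaming EventStore interface, used to copy
// everything except one aggregate id into a fresh store.
interface StoredEvent {
  aggregateId: string;
  type: string;          // e.g. "FileCreated"
  payload: unknown;
  timestamp: Date;
}

interface EventStore {
  readAll(): AsyncIterable<StoredEvent>;   // stream, so we never load everything at once
  append(event: StoredEvent): Promise<void>;
}

async function rebuildWithoutAggregate(
  source: EventStore,
  target: EventStore,
  removedAggregateId: string
): Promise<void> {
  for await (const event of source.readAll()) {
    if (event.aggregateId === removedAggregateId) continue; // drop that file's whole history
    await target.append(event);
  }
  // After this: replay projections against `target`, switch over, delete `source`.
}
```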
Another solution is to keep the relatively big data in files and display something like "404 not found", or "file was removed because of X". I see this frequently. In this case it is OK to keep the event in the store, and you can, for example, add a ContentRemoved event where you record the cause. Another option is to hide the removed file so it won't be listed by the app; I guess this is common too. This solution has drawbacks as well: migration is more complex with this approach, because you need to move both the event store and the files, and if you remove a file by accident, you cannot undo it later unless you have the file in a backup. This can be mitigated by delaying the actual file removal by a few days, so you can undo it if you change your mind. Another option is a trash bin, so files are only deleted when the trash is emptied.
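Roughly, that tombstone-style approach could look like this (the event and projection names are just examples I made up for the sketch):

```typescript
// Sketch: a tombstone event recording why the file content was removed.
// Names (ContentRemoved, RemovalCause) are illustrative, not from any framework.
type RemovalCause = "expired" | "user-request" | "legal" | "storage-reclaimed";

interface ContentRemoved {
  type: "ContentRemoved";
  aggregateId: string;
  cause: RemovalCause;
  removedAt: Date;
}

interface FileView {
  available: boolean;
  message?: string;
}

// Projection: once a ContentRemoved event is seen, the read model answers
// requests for that file with a "removed because <cause>" message.
function applyContentRemoved(state: FileView, event: ContentRemoved): FileView {
  return { available: false, message: `File was removed (${event.cause}).` };
}
```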
I think both solutions are worth considering, and which one is better suited probably depends on the actual project.

Delays in updating content controls when more context.sync() are used on WORD for Mac

We update a content control for every character typed in the task pane's input field, so that the user can see live updates in the Word document.
Recently we added functionality for locking content controls, and the flow now looks like this:
The user types a character in an input field
We search for the content control for that input field (involves a context.sync)
Unlock the content control (involves a context.sync)
Update the value in the content control (involves a context.sync)
Lock the content control again (involves a context.sync)
All this works fine in Word for Windows without problems, but it is extremely (visibly) slow in Word for Mac (Apple machines).
How can I overcome the delays happening on Mac?
As Juan mentioned in the comment, there are some important details that the team would need to investigate. Sample code would be good too.
That being said, just looking at what you describe, I think you can dramatically cut down on the context.sync() statements. Unlocking the content control, updating its value, and locking it should all be possible to do in one sync.
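For illustration, here is roughly what that batching could look like (TypeScript; this assumes the content control can be found by its tag, so adjust the lookup to however you actually locate it):

```typescript
// One Word.run batch: find the control, then unlock + update + re-lock
// with a single context.sync() at the end (plus one sync for the lookup).
// Assumes the control is tagged with the input field's name.
async function updateLockedContentControl(tag: string, newText: string) {
  await Word.run(async (context) => {
    const controls = context.document.contentControls.getByTag(tag);
    controls.load("items");
    await context.sync();                      // sync #1: fetch the control

    const control = controls.items[0];
    control.cannotEdit = false;                                // unlock
    control.insertText(newText, Word.InsertLocation.replace);  // update
    control.cannotEdit = true;                                 // lock again

    await context.sync();                      // sync #2: apply all three changes at once
  });
}
```

The key point is that the unlock, the text replacement, and the re-lock are all queued on proxy objects and flushed in a single round-trip, instead of paying for a sync per step.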
I have a bunch of details about minimizing syncs in my book, "Building Office Add-ins using Office.js". Quoting one of the sections from it:
As an add-in author, your job is to minimize the number of context.sync()
calls. Each sync is an extra round-trip to the host application; and when
that application is Office Online, the cost of each of those round-trips adds up
quickly.
If you set out to write your add-in with this principle in mind, you will
find that you need a surprisingly small number of sync calls. In fact, when
writing this chapter, I found that I really needed to rack my brain to come up with a scenario that did need more than two sync calls. The trick for
minimizing sync calls is to arrange the application logic in such a way that
you're initially scraping the document for whatever information you need
(and queuing it all up for loading), and then following up with a bunch
of operations that modify the document (based on the previously-loaded
data). You've seen several examples of this already: one in the intro chapter,
when describing why Office.js is async; and more recently in the "canonical
sample" section at the beginning of this chapter. For the latter, note that the
scenario itself was reasonably complex: reading document data, processing
it to determine which city has experienced the highest growth, and then
creating a formatted table and chart out of that data. However, given the
"time-travel" superpowers of proxy objects, you can still accomplish this task
as one group of read operations, followed by a group of write operations.
Still, there are some scenarios where multiple loads may be required. And in
fact, there may be legitimate scenarios where even doing an extra sync is the
right thing to do – if it saves on loading a bunch of unneeded data. You will
see an example of this later in the chapter.

How to avoid lambda trigger recursive call

I've written a Lambda function that is triggered via an S3 bucket's putObject event. I am modifying the headers of an object after upload: downloading the object and re-uploading it with the appropriate headers. But because the function itself uses putObject to re-upload the object, the Lambda triggers itself.
Three options:
Use a different API to upload your changes than the one that you have an event on. I.e., if your Lambda is triggered by PUT, then use POST to modify the content afterwards (tough to do since POST isn't well supported by the SDKs AFAIK, so this may not be an option).
Track usage and have a small guard at the beginning of your handler to short-circuit if the only changes made to a file are ones you made (a sketch of this guard follows the list below). If you can't programmatically detect the headers you've set, you'll probably need a small DynamoDB table or similar for keeping track of which files you've already touched. This will let you abort immediately and only be charged the minimum 100 ms fee.
Reorganize your project to have an 'ingest' bucket and an output bucket. Unprocessed files are put into the former, modified, and then placed into the latter. This has a number of advantages. The first is that you don't end up with the current situation, so that's a plus. The second is that you don't have whatever process consumes these modified files potentially pulling an unmodified version. The third is that you get better insight into the process: if something goes wrong, it's easy to see which batches of files have undergone which step.
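For option 2, the cheapest guard is often a marker on the object itself. A rough sketch with the AWS SDK for JavaScript v3 (the "processed" metadata key is invented for this example, and the Content-Type is a stand-in for whatever headers you actually want to set):

```typescript
// Guard sketch: skip objects we have already rewritten, marked via metadata.
// Uses @aws-sdk/client-s3 (v3). Error handling trimmed for brevity.
import { S3Client, HeadObjectCommand, CopyObjectCommand } from "@aws-sdk/client-s3";
import type { S3Event } from "aws-lambda";

const s3 = new S3Client({});

export async function handler(event: S3Event): Promise<void> {
  for (const record of event.Records) {
    const bucket = record.s3.bucket.name;
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, " "));

    // Guard: if we already touched this object, bail out before doing any work.
    const head = await s3.send(new HeadObjectCommand({ Bucket: bucket, Key: key }));
    if (head.Metadata?.processed === "true") {
      continue; // our own earlier write triggered this event; ignore it
    }

    // Rewrite the object in place with the desired headers plus the marker.
    await s3.send(
      new CopyObjectCommand({
        Bucket: bucket,
        Key: key,
        CopySource: `${bucket}/${encodeURIComponent(key)}`,
        MetadataDirective: "REPLACE",
        ContentType: "application/octet-stream", // the headers you actually want
        Metadata: { ...head.Metadata, processed: "true" },
      })
    );
  }
}
```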
Overall, I'd recommend option 3 for you, though I know that in my lazier moments I might try to opt for 1 or 2.
Either way, good luck.

RabbitMQ Message size limitation?

I am trying to gauge the performance of RabbitMQ when my message size increases to a few MB. However, even when I send a 32 KB message, I get a "Resource temporarily unavailable" message from the server. There's no error in the log files, and there are no memory-limit errors. How do I go about debugging this issue?
If it's of any help, I'm running this on an EC2 t1.micro instance, so 592 MB of RAM.
According to the bug you linked, someone recently (apparently after you posted the link to the bug) left a comment saying they can reliably reproduce the bug when the message size is >= 15821 bytes.
I would recommend that you see whether that also holds true for you, i.e. whether you can reproduce it at that threshold, and then evaluate whether staying under that amount (thus avoiding the bug documented in the issue above) is a sufficient size for your needs. If not, you may want to try pika (https://github.com/pika/pika) and see if it works better with larger messages (one of the other comments on that bug suggests that pika did work for them with larger message sizes).
Another option that may work, depending on your exact use case, would be to include in the RabbitMQ message payload a key of sorts that allows you to fetch the large blob of data from wherever it's stored (Postgres, MongoDB, etc.) when you consume the message, and thereby avoid the bug altogether. Perhaps not ideal if you really want to encapsulate everything inside the payload, but it may be a feasible workaround.
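Roughly, that "claim check" approach could look like the sketch below (TypeScript with amqplib for Node rather than the rabbitpy library from your question; saveBlob/loadBlob are placeholders for whatever store you use):

```typescript
// Claim-check sketch: put only a small reference on the queue and fetch the
// big payload from external storage on the consumer side.
import amqp from "amqplib";

declare function saveBlob(data: Buffer): Promise<string>;   // returns a blob id (placeholder)
declare function loadBlob(blobId: string): Promise<Buffer>; // fetches it back (placeholder)

const QUEUE = "large-payloads";

export async function publishLarge(data: Buffer): Promise<void> {
  const blobId = await saveBlob(data);                 // big data goes to the blob store
  const conn = await amqp.connect("amqp://localhost");
  const ch = await conn.createConfirmChannel();
  await ch.assertQueue(QUEUE, { durable: true });
  ch.sendToQueue(QUEUE, Buffer.from(JSON.stringify({ blobId }))); // only the small key is queued
  await ch.waitForConfirms();
  await conn.close();
}

export async function consumeLarge(): Promise<void> {
  const conn = await amqp.connect("amqp://localhost");
  const ch = await conn.createChannel();
  await ch.assertQueue(QUEUE, { durable: true });
  await ch.consume(QUEUE, async (msg) => {
    if (!msg) return;
    const { blobId } = JSON.parse(msg.content.toString());
    const data = await loadBlob(blobId);               // fetch the real payload
    // ... process `data` ...
    ch.ack(msg);
  });
}
```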
In terms of debugging, since it appears that this is a bug with rabbitpy itself, I think you would need to debug the actual rabbitpy library if you wanted to proceed on that front. Doable, but perhaps not feasible due to time, etc.

PowerBuilder 12.1 production performance issues causing asynchrony?

We have a legacy PowerBuilder 12.1 Classic application with an Oracle 11g back end, and are experiencing performance issues in production that we cannot reproduce in our test environments.
The window in question has shared grid/freeform DataWindows and buttons to open other response windows, which when closed cause the grid to re-retrieve.
The grid has a very expensive query behind it; several columns receive their values from function calls with some very intense SQL inside. However, it still runs within a couple of seconds, even in production.
The only consistency in when the errors occur is that it seems to be more likely if they attempt to navigate to the other windows quickly. The buttons that open said windows are assuming that a certain instance variable is set with the appropriate value from the row in focus in the grid. However, in this scenario, the instance variable has not yet been set, even though it looks like the row focus change has occurred. This is causing null reference exceptions that shouldn't be possible.
The end users' network connectivity is often sluggish, and their hardware isn't any less capable than ours. I want to blame the network, but I attempted to reproduce this myself in development by intentionally slowing down the SQL so that I could attempt to click a button; however, everything happened as I expected: the button click didn't take effect until after the retrieve and all the other events had finished.
My gut tells me that for some reason things aren't running synchronously when they should, and the only factor I can imagine is the speed of the SQL, whether from the query being slow, or the network being slow, but when I tried reproducing that effect things still happened in the proper sequence. The only suspect code is that the datawindow ancestor posts a user event called ue_post_rfc from rowfocuschanged, and this event does a Yield(). ue_post_rfc is where code goes instead of rowfocuschanged.
Is there any way Yield() would cause these problems, without manifesting itself in test environments, even when SQL is artificially slowed?
While your message may not give enough information to give you a recipe to solve your problem, it does give me a hint towards a common point of hard-to-diagnose failures that I see often in PowerBuilder systems.
The sequence of development events goes something like this
Developer develops code where there is a dependence on one event firing before another event, often a dependence through instance or global variables
This event sequence has been something the developer has observed, but isn't documented as a guaranteed sequence (like the AcceptText() sequence or the Update() sequence are documented)
I find this a lot with posted events, and I'm not talking about event and post-event where post-event is posted from event, but more like between post-ItemChanged and post-GetFocus
Something changes the sequence of events, breaking the code. Things that I've seen change non-guaranteed sequences of events include:
PowerBuilder version change
Operating system change
Hardware change
The application running with other applications taxing the system resources
Whoever is now in charge of solving this has no clue what is going on or how to deal with it, so they start peppering the code with Yield() statements (I've literally seen comments beside a Yield() that said "I don't know why this works, but it solves problem X")
Note that Yield() allows any and all events in the message queue to be processed, while this developer really wants only one particular event to get through
Also note that the commonly-seen-in-my-career DO ... LOOP UNTIL (NOT Yield()) could loop infinitely on a heavily loaded system
Something happens to change the event sequence again
Now when the Yield() occurs, there is a different sequence of messages in the queue to be processed, and not the message the developer had wanted to be processed
Things start failing again
My advice to get rid of this problem (if this is indeed your problem) is to do one of the following:
Get rid of the cross-event dependence
Get rid of event sequence assumptions
Manage the event sequence yourself
Good luck,
Terry
P.S. Here are a couple of quotes from your question that make me think of Yield() (not that I don't love the opportunity to jump all over Yield(), grin)
The only consistency in when the errors occur is that it seems to be
more likely if they attempt to navigate to the other windows quickly.
Seen this when the user tries to initiate (let's say for example) two actions very quickly. If the script from the first action contains a Yield(), the script from the second action will both start and finish before the first action finishes. This can be true of any combination of user actions (e.g. button clicks, menu clicks, tabs, window closings... you coded with the possibility that the window isn't there anymore after the Yield() was done, right? If not, join the 99% of those that code Yield(), don't, and live dangerously) and system events (e.g. GetFocus, Deactivate, Timer)
My gut tells me that for some reason things aren't running
synchronously when they should
You're right. PowerBuilder (unless you force it) runs synchronously. However, if one event is starting before another finishes (see above), then you're going to get behaviours that look like asynchronous behaviours.
There's nothing definitive in what you've said, but you did ask about Yield(). The real kicker to nail this down would be reproducing this with a PBDEBUG trace; you'd see which event(s) are surprising you. However, the amount that PBDEBUG slows things down affects event sequences and queuing, which may or may not be helpful.
