What happens when LINQ updates fail? - linq

Here's the scenario:
Process 1 (P1) - reads in various flat files; one consequence of this is deleting or adding photo URLs in a database table to indicate which photos need to be downloaded
Process 2 (P2) - looks up the photo URLs that need to be downloaded, actually performs the downloads, then marks each record as downloaded
Depending on the amount of data to process, P1 and P2 sometimes run concurrently, and sometimes P1 removes records that P2 has already loaded and prepared to download.
This is not actually a problem - if an image is downloaded but no longer needed, it may well end up being used at a later date anyway, and space is no concern. So the only trouble is that if P2 selects a group of records and some of those records cease to exist during the LINQ update, SubmitChanges() throws an error similar to:
"1 of 4 updates failed."
My question is: what happened when this update failed? As far as I can tell, there are 3 possibilities:
The entire transaction was rolled back
The transaction was not rolled back and all records that could be updated were
The transaction was not rolled back and the 1st record was updated but the 2nd failed so the rest of the updates weren't attempted.
The actual call was as follows - no ConflictMode set:
this.SomeDataContext.SubmitChanges();
How would this call be altered so that any updates that could be executed would and the others ignored? Does the following do the trick:
this.SomeDataContext.SubmitChanges(ConflictMode.ContinueOnConflict);
I don't see anything on MSDN indicating the default ConflictMode for the parameterless call:
http://msdn.microsoft.com/en-us/library/bb292162.aspx
...though the documentation for the single-parameter overload indicates that the default is FailOnFirstConflict:
http://msdn.microsoft.com/en-us/library/system.data.linq.datacontext.submitchanges.aspx
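For illustration, a minimal sketch of what explicitly handling the conflicts could look like with ContinueOnConflict, assuming the standard System.Data.Linq types; skipping rows that were deleted out from under P2 is just one possible policy:

try
{
    this.SomeDataContext.SubmitChanges(ConflictMode.ContinueOnConflict);
}
catch (ChangeConflictException)
{
    // With ContinueOnConflict, every failed row is reported here rather than
    // the submit stopping at the first failure.
    foreach (ObjectChangeConflict conflict in this.SomeDataContext.ChangeConflicts)
    {
        if (conflict.IsDeleted)
        {
            // The row was removed (e.g. by P1) after P2 read it;
            // there is nothing left to update, so skip it.
            continue;
        }
        // For rows that still exist, keep the in-memory changes so a later
        // SubmitChanges() can try again.
        conflict.Resolve(RefreshMode.KeepChanges);
    }
}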

With LINQ to SQL, all changes that are pending when SubmitChanges() is called are saved or rolled back as a single transaction. So, if you have 2 inserts, 2 updates and 2 deletes pending when SubmitChanges() is called, they are all saved or rolled back as a single unit of work.
It is possible to make several SubmitChanges() calls and have all inserts/updates/deletes from all of them treated as a single unit of work by wrapping the SubmitChanges() calls in a TransactionScope, as sketched below.
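A minimal sketch of that approach (the second SubmitChanges() call and the placement of the extra changes are illustrative only):

using (var scope = new System.Transactions.TransactionScope())
{
    this.SomeDataContext.SubmitChanges();   // first batch of pending changes

    // ... queue up further inserts/updates/deletes on the context ...

    this.SomeDataContext.SubmitChanges();   // second batch
    scope.Complete();                       // both batches commit together; if Complete()
                                            // is never reached, everything rolls back
}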

I would have commented but don't have enough reputation.
I came here looking for an answer about the default ConflictMode. Later I found http://msdn.microsoft.com/en-us/library/bb345081(v=vs.90).aspx, which seems to have it documented now.

Related

Flow Triggering Itself (Possibly), Each run hits past IDs that were edited

I am pretty new to Power Automate. I created a flow that triggers when an item is created or modified. It initializes some variables and then runs some switch cases to assign values to each of them. The variables then go into an array, and another variable is incremented to get the total of the array. I then have a conditional to assign a value to a column in the list. I tested the flow by going into the modern view of the list and clicking the save button. This worked a bunch of times, so I sent it for user testing. One of the users edited multiple items by double-clicking into the item, which saves after each column change (which I assume triggers a run of the flow).
The flow seemingly works but seemed to get bogged down at a point, based on the run history. I let it sit overnight and then tested again, and now it shows runs from multiple IDs at a time even though I only edited one specific item.
I had another developer take a look at my flow and he could not spot anything wrong with it. It never had a hard error in testing, only warnings about conditionals possibly causing a loop, but all my conditionals resolve. Pictures included. I am just not sure what caveats I might be missing.
I am currently letting the flow sit to see if it finishes getting caught up. I read about the concurrent run option as well as conditions on the trigger itself. I am curious as to why it seems to run on two (or more) records all at once without me or anyone else editing each one.
You might be able to ignore the updates from the service account/account which is used in the connection of the actions by using the following trigger condition expression:
@not(equals(triggerOutputs()?['body/Editor/Claims'], 'i:0#.f|membership|johndoe#contoso.onmicrosoft.com'))

Parse Server - Saving objects with many fields - Schema Validation takes too long (enforceFieldExists)

We're using ParseServer to migrate a CloudCode based application to Heroku.
Using versions:
parse#1.8.5
parse-server#2.2.16
We noticed (it's hard not to notice) that saving some objects is unreasonably slow. These objects are typically saved a few at a time - between 2 and 6 objects (using Parse.Object.saveAll, which fires a REST call to /1/batch).
Saving each of these objects now takes anything between 4 to 12 seconds. Digging into Parse code, it was easy to see that schema validation is the cause.
SchemaController.validateObject() {
    ...
    SchemaController.enforceFieldExists()
}
We are using triggers for simple validation, but as per the logic in RestWrite.js this causes schema validation to be executed twice - once before the trigger and once after.
The problem lies in the fact that our collection has about 40 fields. SchemaController.enforceFieldExists() loads the entire schema twice while validating each field. Moreover, it always attempts to write to the schema document (again, for each field), only to fail, usually because all the fields are already listed in the schema.
This means we get an overhead of about 240 round trips to the database for each object (40 fields x 2 validation passes x roughly 3 database operations per field), and we typically store up to 5 objects in each invocation. That adds up to over 1,000 round trips to the database, so we easily go beyond the Heroku router timeout limit of 30 seconds.
My questions are:
Is there anything I can do to speed up this validation? (did not find documentation or settings for that)
Is there a fix for this redundant implementation planned or available anywhere?
Can I safely castrate enforceFieldExists() to do nothing, without anything else breaking on me, assuming we don't add fields often? What is this collection (_SCHEMA) used for, other than to draw the tables in the Dashboard UI?
I'm currently thinking about patching this function to do nothing with an npm postinstall script. Does that sound like a good approach?
Appreciate any help on this,
Ron
This is being fixed by this pull request: https://github.com/ParsePlatform/parse-server/pull/2286
and specifically this line:
https://github.com/ParsePlatform/parse-server/pull/2286/files#diff-7d0dd667d7bdafd6ebee06cf70139fa0R555
It skips trying to write the schema if the current field is already present.
This should be released soon.

Posting an update request to ElasticSearch without waiting for completion

I have an ElasticSearch index that stores files, sometimes very large ones. Because the underlying Lucene engine is actually doing a complete replacement each time a document is updated, even if I am just modifying the value of one field, the entire document needs to be updated behind the scenes.
For large, multi-MB files this can take a fairly long time (several hundred ms). Since this is done as part of a web application this is not really acceptable. What I am doing right now is forking the process, so the update is called on a separate thread while the request finishes.
This works, but I'm not really happy with it as a long-term solution, partially because it means that every time I create a new interface to the search engine I'll have to recode the forking logic. It also means I basically can't know whether the request succeeded or some kind of error occurred, without writing additional code to log successful and unsuccessful requests somewhere.
So I'm wondering if there is a lesser-known feature whereby you can post an UPDATE request to ElasticSearch and have it return an acknowledgement without waiting for the update task to actually complete.
If you look at the documentation for Snapshot and Restore you'll see when you make a request you can add wait_for_completion=true in order to have the entire process run before receiving the result.
What I want is the reverse — the ability to add ?wait_for_completion=false to a POST request.
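A sketch of one way to centralize the fire-and-forget approach mentioned above, so the forking logic lives in one place and failures still get logged. This is written in C# only as an assumption about the surrounding application; the class name, base URL and _update endpoint path are placeholders, and older ElasticSearch versions include the document type in the path:

using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

public static class FireAndForgetUpdater
{
    private static readonly HttpClient Http = new HttpClient
    {
        BaseAddress = new Uri("http://localhost:9200")
    };

    public static void UpdateWithoutWaiting(string index, string id, string partialDocJson)
    {
        var body = new StringContent("{\"doc\": " + partialDocJson + "}",
                                     Encoding.UTF8, "application/json");

        // Start the update but do not await it inside the web request;
        // the continuation records the outcome so failures are not silent.
        Http.PostAsync("/" + index + "/_update/" + id, body)
            .ContinueWith(t =>
            {
                if (t.Status != TaskStatus.RanToCompletion || !t.Result.IsSuccessStatusCode)
                {
                    // Replace with real logging for your application.
                    Console.Error.WriteLine("Update of " + index + "/" + id + " did not succeed.");
                }
            });
    }
}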

How to handle multiple asynch downloads

I recently moved my background sync downloads to a view controller and need some advice on how best to handle them asynchronously. I have written all the code to show a progress view as the download occurs, but as you might have guessed it's not that simple. Here's how it works.
The user sees a table view with two entries, one for each database. They can press a button to download the database, and when the download starts that fires off the asynchronous URL connection, etc. This works to a certain extent, however it's not that simple.
Here's what I want it to do:
1. Download the main update URL (works OK).
2. Then download a secondary URL.
3. Then apply the first URL's content to the SQLite store (code written for that).
4. Then apply the second URL's content to the SQLite store (code written for that).
(All the while showing progress to the user.)
When the downloads were synchronous it was easy, as I just waited for each to finish before firing off the next activity, but with the asynchronous method I'm struggling with how to make them wait. Step 3 depends on step 1 finishing and step 4 depends on step 2 finishing, and overall success relies on all of them finishing. Step 4 also needs to wait for step 3 to finish, otherwise the database locks will cause a clash.
The second complication is that if the user presses the second button while the first download is in progress, then steps 3 and 4 will clash if they execute at the same time as the first row's database access.
Has anyone done anything similar, and if so, what strategy did you use to manage the flow of events?
Also, I wanted to wrap this all up in a background task with an expiration handler so it would survive the user pressing the home button... but the delegate methods don't get called when I do that.
OK, here is what I did to fix the problem:
1. Created an NSOperationQueue.
2. Added the URL operations as NSInvocationOperations.
3. Waited until the URL operations were complete (waitUntilAllOperationsAreFinished).
4. Then set the maximum concurrent operation count to 1, which forced the subsequent database operations to execute in series, one after the other, and thus prevented SQLite from locking itself out.

What's a good way to handle "async" commits?

I have a WCF service that uses ODP.NET to read data from an Oracle database. The service also writes to the database, but indirectly, as all updates and inserts are achieved through an older layer of business logic that I access via COM+, which I wrap in a TransactionScope. The older layer connects to Oracle via ODBC, not ODP.NET.
The problem I have is that because Oracle uses a two-phase commit, and because the older business layer is using ODBC rather than ODP.NET, the transaction sometimes returns from TransactionScope.Commit() before the data is actually available for reads from the service layer.
I see a similar post about a Java user having trouble like this as well on Stack Overflow.
A representative from Oracle posted that there isn't much I can do about this problem:
This may be due to the way the OLETx ITransaction::Commit() method behaves. After phase 1 of the 2PC (i.e. the prepare phase), if all is successful, commit can return even if the resource managers haven't actually committed. After all, a successful "prepare" is a guarantee that the resource managers cannot arbitrarily abort after this point. Thus, even though a resource manager couldn't commit because it didn't receive a "commit" notification from the MSDTC (due to, say, a communication failure), the component's commit request returns successfully. If you select rows from the table(s) immediately, you may sometimes see the actual commit occur in the database after you have already executed your select. Your select will therefore not see the new rows, due to consistent read semantics. There is nothing we can do about this in Oracle, as the "commit success after successful phase 1" optimization is part of the MSDTC's implementation.
So, my question is this:
How should I go about dealing with the possible delay (the "async" in the title) in figuring out when the second phase of the 2PC actually occurs, so I can be sure that the data I inserted (indirectly) is actually available to be selected after the Commit() call returns?
How do big systems deal with the fact that the data might not be ready for reading immediately?
I assume that the whole transaction has prepared and a commit outcome has been decided by the transaction manager, so eventually (barring heuristic damage) the resource managers will receive their commit message and complete. However, there are no guarantees as to how long that might take - it could be days; no timeouts apply, and having voted "commit" in the prepare phase, a resource manager must wait to hear the collective outcome.
Under these conditions, the simplest approach is a "request understood, still processing" response. The request has been understood, but you don't actually know the outcome, and that's what you tell the user. Yes, in all sane circumstances the request will complete, but under some conditions operators could actually choose to intervene in the transaction manually (and maybe cause heuristic damage in doing so).
To go one step further, you could start a new transaction and perform some queries to see if the data is there. If you are populating a result screen, you will naturally be doing such a query anyway. The question is what to do if the expected results are not there. So again, tell the user "your recent request is being processed, hit refresh to see if it's complete", or retry automatically (I don't much like auto-retry - I prefer to educate the user that it's effectively an async operation).
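A minimal sketch of the polling variant of that idea (the names here are hypothetical; checkRowVisible stands in for whatever query, run on a fresh connection outside the original TransactionScope, proves that the indirectly inserted data is now readable):

using System;
using System.Threading;

public static class CommitVisibility
{
    // Returns true once the data becomes visible, or false if the timeout
    // elapses first (in which case show a "still processing, refresh later"
    // message rather than an error).
    public static bool WaitForCommitVisibility(
        Func<bool> checkRowVisible,
        TimeSpan timeout,
        TimeSpan pollInterval)
    {
        var deadline = DateTime.UtcNow + timeout;
        while (DateTime.UtcNow < deadline)
        {
            if (checkRowVisible())
            {
                return true;    // the second phase of the 2PC has clearly completed
            }
            Thread.Sleep(pollInterval);
        }
        return false;
    }
}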
