Why is infinite loop protection being triggered on my CRM workflow? - dynamics-crm

Update
I should have added from the outset - this is in Microsoft Dynamics CRM 2011
I know CRM well, but I'm at a loss to explain the behaviour on my current deployment.
Please read the outline of my scenario below and help me understand which of my presumptions / understandings is wrong (and therefore what is causing this error). It isn't consistent with my expectations.
Basic Scenario
Requirement demands that a web service is called every X minutes (it adds pending items to a database index)
I've opted to use a workflow / custom entity trigger model (i.e. I have a custom entity with a plugin registered on its CREATE message. The plugin executes my logic. An accompanying workflow then waits until the "completed" time + [timeout period] expires; on expiry, it creates a new Trigger record and the workflow ends).
The plugin logic works just fine. The workflow concept works fine to a point, but after a period of time the workflow stalls with a failure:
This workflow job was canceled because the workflow that started it included an infinite loop. Correct the workflow logic and try again. For information about workflow logic, see Help.
So in a nutshell - standard infinite loop detection. I understand the concept and why it exists.
Specific deployment
Firstly, I think it's quite safe for us to ignore the content of the plugin code in this scenario. It works fine, it's atomic and hardly touches CRM (to be clear, it is a pre-event plugin which calls the remote web service, awaits a response and then sets the "completed on" date/time attribute on my Trigger record before passing the Target entity back into the pipeline). So long as a Trigger record is created, this code runs and does what it should.
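For reference, a minimal sketch of what such a pre-create plugin might look like - the class name, entity details and web-service call are placeholders, and only the new_completedon attribute name comes from the workflow description below:
using System;
using Microsoft.Xrm.Sdk;

public class TriggerCreatePlugin : IPlugin
{
    public void Execute(IServiceProvider serviceProvider)
    {
        var context = (IPluginExecutionContext)serviceProvider
            .GetService(typeof(IPluginExecutionContext));

        // Pre-create step: "Target" is the Trigger record that is about to be saved
        if (!context.InputParameters.Contains("Target"))
            return;
        var target = context.InputParameters["Target"] as Entity;
        if (target == null)
            return;

        // Call the remote indexing web service and wait for its response (placeholder)
        CallReindexWebService();

        // Stamp the completion time before the record is saved,
        // so the accompanying workflow can time out from this value
        target["new_completedon"] = DateTime.UtcNow;
    }

    private void CallReindexWebService()
    {
        // ... remote web-service call omitted ...
    }
}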
Having discounted the content of the plugin, there might be an issue that I don't appreciate in having the plugin registered on the pre-create step of the entity...
So that leaves the workflow itself. It's a simple one. It runs thusly:
On creation of a new Trigger entity...
it has a Timeout of Trigger.new_completedon + 15 minutes
on timeout, it creates a new Trigger record (with no "completed on" value - this is set by the plugin remember)
That's all - no explicit "end workflow" (though I've just added one now and will set it testing...)
With this set-up, I manually create a new Trigger record and the process spins nicely into action. Roll forward 1 hour 58 minutes (based on the last cycle I ran - remembering that my plugin code may take a minute to finish running): after 7 successful execution cycles (i.e. new workflow jobs being created and completed), the 8th one fails with the aforementioned error.
What I already know (correct me where I'm wrong)
Recursion depth, by default, is set to 8. If a workflow / plugin calls itself 8 times then an infinite loop is detected.
Recursion depth is reset every one hour (or 10 minutes - see "Warnings" in linked blog?)
Recursion depth settings can be set via PowerShell or SDK code using the Deployment Web Service in an on-premise deployment only (via the Set-CrmSetting Cmdlet)
What I don't want to hear (please)
"Change recursion depth settings"
I cannot change the Deployment recursion depth settings as this is not an option in an online scenario - ultimately I will be deploying to CRM Online too.
"Increase the timeout period on your workflow"
This is not an option either - the reindex needs to occur every 15 minutes, ideally sooner.
Update
@Boone suggested below that the recursion depth counter is reset after 60 minutes of inactivity rather than every 60 minutes. Therein lies the first misunderstanding.
While discussing with @alex, I suggested that there may be some persistence of CorrelationId between creating an entity via the workflow and the workflow that ultimately gets spawned... Well, there is. The CorrelationId is the same in both the plugin and the workflow, and in any records that spool from that thread. I am now looking at ways to decouple the CorrelationId (or perhaps the creation of records) from the entity and the workflow.

For the one-hour "reset" to take place you have to have NO activity for an hour. It doesn't reset just one hour after the original. So since you have activity every 15 minutes, it never has a chance to reset. I don't know that this is set in stone anywhere... but that's been my experience.
In CRM 4 it was possible to create a CRM Service (Google creating a CRM service in the child pipeline) and reset the correlation ID (using CorrelationToken.NewToken()). I don't see anything so easy in the 2011 SDK. No idea if this trick worked in the online environment. Is 2011 online backwards compatible with CRM 4 plug-ins?
One thing you could try would be to use the IExecutionContext.CorrelationId to scavenge the asyncoperation (System Job) table. But according to the metadata, the attributes I think might be useful (CorrelationId, CorrelationUpdatedTime, Depth) are NOT valid for update. Maybe you could delete the rows? Even that may not help.
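An untested sketch of that scavenging idea, assuming you already have an IOrganizationService called service and the plugin's IPluginExecutionContext called context:
using Microsoft.Xrm.Sdk.Query;

// Find the System Jobs that share this thread's correlation id
var query = new QueryExpression("asyncoperation");
query.ColumnSet = new ColumnSet("asyncoperationid");
query.Criteria.AddCondition("correlationid", ConditionOperator.Equal, context.CorrelationId);

foreach (var job in service.RetrieveMultiple(query).Entities)
{
    // The correlation attributes are not valid for update, so deleting the
    // completed jobs is the only thing left to try - it may well not help.
    service.Delete("asyncoperation", job.Id);
}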

I doubt this can be solved like this.
I'd suggest a different approach: deploy a simple application alongside CRM and let it call the web service, which in turn can use the XRM endpoints in order to change the records.
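A minimal sketch of that companion-application idea, assuming a plain console application (or Windows service) is acceptable; the web-service call is a placeholder:
using System;
using System.Threading;

class ReindexPoller
{
    static void Main()
    {
        while (true)
        {
            try
            {
                // Call the indexing web service; it can in turn use the XRM
                // endpoints (IOrganizationService) to read and update records.
                CallReindexWebService();
            }
            catch (Exception ex)
            {
                Console.WriteLine("Reindex call failed: " + ex.Message);
            }

            Thread.Sleep(TimeSpan.FromMinutes(15)); // the required interval
        }
    }

    static void CallReindexWebService()
    {
        // ... placeholder for the existing web-service call ...
    }
}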
UPDATE
Or, you can try something like this upon your CRM service initialization in the plugin (dug it up from one of my plugins), leaving your workflow untouched:
// CRM 4 SDK (CrmService / CorrelationToken)
CrmService service = new CrmService();
// initialize the service here, then...
CorrelationToken corToken = new CorrelationToken();
corToken.CorrelationId = context.CorrelationId;
corToken.CorrelationUpdatedTime = context.CorrelationUpdatedTime;
// WILD GUESS: enforce unlimited depth?
corToken.Depth = 0; // THIS WAS: context.Depth;
// updating correlation token
service.CorrelationTokenValue = corToken;
I admit I don't really remember much about this (code dates back to about 2 years ago), but it might help.

Related

Power Automate flow triggered by "When a row is added, modified or deleted" not triggered

I'm new to Microsoft CRM and Power Automate. I'm trying to implement the Quote approval process.
As the first step, I tried to create a simple workflow that triggers when a Quote item is saved/modified, using the Dataverse 'When a row is added, modified or deleted' trigger. I have studied many online tutorials and built a very simple flow that sends an email when a Quote item is saved/modified.
The problem is, it's actually not triggered by any action on the Quote page. I've already spent a couple of days trying to figure out the issue.
This is how I tested it.
Step 01 - Save the flow and test it manually
Step 02 - Insert a new item into the Quote, or modify an existing item.
But nothing happens.
Items you add to the Quote are in fact Quote Product records. This is a separate table named quotedetail.
Your Power Automate flow needs to target the Quote Products table to capture the intended changes.
The reason was Administrative mode.
Administrative mode disables all asynchronous processes. This includes server-side sync (email synchronization) and workflow processes, including flows (Power Automate).
https://functionalthoughts.com/microsoft-dynamics-365-managing-admin-mode-from-power-platform/

Invoking 1 AWS Lambda with API Gateway sequentially

I know there's a question with the same title but my question is a little different: I have a Lambda API - saveInputAPI() - which saves a value into a specified field. Users can invoke this API with different parameters, for example:
saveInput({"adressType",1}); //adressType is a DB field.
or
saveInput({"name","test"}) //name is a DB field.
And of course, this is hosted on AWS, so I'm also using API Gateway. But the problem is that sometimes something like this happens:
As you can see, API No. 19 was invoked first but ended up finishing later
(10:10:16:828) -> (10:10:18:060)
While API No.18 was invoked later but finished sooner...
(10:10:17:611) -> (10:10:17:861)
This leads to a lot of problems in my project, and sometimes the delay between two API calls was up to 10 seconds. The front-end project acts independently, so users don't know what happens behind the scenes: they think they have set addressType to 1 but in reality addressType is still 2. Since this project is large, I cannot change this [using only 1 API to update DB values] design. Is there any way for me to fix this problem? Really appreciate any ideas. Thanks
If updates to the database can't simply be skipped when the last-updated timestamp is more recent than the source event's timestamp, you need to decouple API Gateway and Lambda (sketched below):
API Gateway writes to an SQS FIFO queue.
A Lambda consumes the queue and processes each request.
This ensures the older event is processed first.
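A rough sketch of the producing side, assuming the AWS SDK for .NET and a hypothetical save-input.fifo queue (the queue URL and grouping key are placeholders); the MessageGroupId is what gives strict per-group ordering:
using System;
using System.Threading.Tasks;
using Amazon.SQS;
using Amazon.SQS.Model;

class SaveInputProducer
{
    static readonly AmazonSQSClient Sqs = new AmazonSQSClient();
    // Hypothetical FIFO queue URL - replace with your own
    const string QueueUrl = "https://sqs.us-east-1.amazonaws.com/123456789012/save-input.fifo";

    public static Task EnqueueAsync(string userId, string payloadJson)
    {
        return Sqs.SendMessageAsync(new SendMessageRequest
        {
            QueueUrl = QueueUrl,
            MessageBody = payloadJson,
            // Messages with the same group id are delivered to the consumer in order
            MessageGroupId = userId,
            // Any unique value works if content-based deduplication is disabled
            MessageDeduplicationId = Guid.NewGuid().ToString()
        });
    }
}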
AWS Lambda is asynchronous by design. That means that trying to make it synchronous and predictable is something of a waste.
If your concern is preventing "old" data (in the scheduling sense) from overwriting "fresh" data, then you might consider timestamping each piece of data and applying a constraint like: "to overwrite the target data, the source timestamp has to be newer than the timestamp of the targeted data".
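One way to express that constraint, sketched against a hypothetical SQL table user_profile with a last_updated column (all table, column and parameter names are illustrative):
using System;
using System.Data.SqlClient;

static void SaveAddressType(SqlConnection conn, int recordId, int addressType, DateTime eventTimestamp)
{
    // Only overwrite if this event is newer than what is already stored
    const string sql = @"UPDATE user_profile
                         SET address_type = @value, last_updated = @eventTs
                         WHERE id = @id AND last_updated < @eventTs";

    using (var cmd = new SqlCommand(sql, conn))
    {
        cmd.Parameters.AddWithValue("@value", addressType);
        cmd.Parameters.AddWithValue("@eventTs", eventTimestamp);
        cmd.Parameters.AddWithValue("@id", recordId);

        // 0 rows affected means a newer update already landed; the stale write is skipped
        int rows = cmd.ExecuteNonQuery();
        Console.WriteLine(rows == 0 ? "Skipped stale update" : "Applied update");
    }
}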

Schedule simple GET batch for each second or even less than one second - should I opt for Spring Cloud Task, Spring Batch or springframework.scheduling?

Context: in my country there will be a new Instant Payment scheme previewed for November. Basically, the Central Bank will provide two endpoints: (1) a POST endpoint to which we post a single money transfer, and (2) a GET endpoint from which we get the result of a money transfer sent earlier, possibly completely out of order. Each GET answers back with only one money transfer result, and a response header informs whether there is another result we must GET. It never says how many results are available; if there is a result it comes back in the GET response, which only indicates whether it is the last one or whether more remain for the next GET.
Top limitation: from the moment the final user clicks the Transfer button in his/her mobile app until the final result (success or failure) is shown on the mobile screen, there are at most 10 seconds.
Strategy: I want a schedule which triggers, each second or even more often, a GET to the Central Bank. The scheduler will basically invoke a simple function which:
Calls the Get endpoint
Pushes it to a Kafka or persist in database and
If the answer headers indicate more results are available, starts the same function again.
Issue: Since we are Spring users/followers, I thought my decision was between Spring Batch and org.springframework.scheduling.annotation.SchedulingConfigurer/TaskScheduler. I have used Spring Batch successfully for a while, but never with such a short trigger period (never a 1-second period). I stumbled on a discussion that made me wonder whether, for my case - a very simple task but with a very short period - I should consider Spring Cloud Data Flow or Spring Cloud Task instead of Spring Batch.
According to this answer, "... Spring Batch is ... designed for the building of complex compute problems ... You can orchestrate Spring Batch jobs with Spring Scheduler if you want". Based on that, it seems I shouldn't use Spring Batch because my case isn't complex. The challenging design decision is more about the short-period trigger, and triggering another batch from the current batch, than about transformation, calculation or ETL processing. Nevertheless, as far as I can see, Spring Batch with its tasklets is well designed for restarting, resuming and retrying, and fits a scenario which never finishes, while org.springframework.scheduling seems to be only a way to trigger an event based on a period configuration. Well, this is my feeling based on personal use and study.
According to an answer to someone asking about orchestration of composed tasks, "... you can achieve your design goals using Spring Cloud Data Flow along with the Spring Cloud Task/Spring Batch...". In my case, I don't see composed tasks: the second trigger doesn't depend on the result of the previous one. It sounds more like "chained" tasks than "composed" ones. I have never used Spring Cloud Data Flow, but it seems a nice candidate for managing/viewing the triggered tasks via its console/dashboards. Nevertheless, I didn't find anything stating limitations or rules of thumb for short-period triggers and "chained" triggers.
So my straight question is: what is the currently recommended Spring component for such a short-period trigger? Assuming Spring Cloud Data Flow is used for the manager/dashboard, which Spring component is recommended for triggering in such short-period scenarios? It seems Spring Cloud Task is designed for calling complex functions, Spring Batch seems to add much more than I need, and org.springframework.scheduling.* seems to lack integration with Spring Cloud Data Flow. As an analogy (not a comparison): in AWS, the documentation clearly says not to use CloudWatch for periods of less than one minute; if you want less than one minute, have CloudWatch start, each minute, another scheduler/cron that fires each second. There might be a well-known rule of thumb for a simple task that needs to be triggered each second, or even more often, while taking advantage of the Spring family's approach/concerns/experience.
This may be a stupid answer, but why do you need a scheduler here? Wouldn't a never-ending job achieve the goal?
You start a job; it does a GET request and pushes the result to Kafka.
If the GET response indicates there are more results, it immediately does another GET and pushes that result to Kafka.
If the GET response indicates there are no more results, it sleeps for 1 second and then does the GET request again (a minimal sketch of this loop follows).
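A minimal sketch of that loop, independent of Spring Batch or Spring Scheduling (shown in C# purely for illustration; the endpoint URL and the "more results" header name are assumptions):
using System;
using System.Net.Http;
using System.Threading.Tasks;

class ResultPoller
{
    static readonly HttpClient Http = new HttpClient();

    static async Task Main()
    {
        while (true)
        {
            // 1. Call the GET endpoint (hypothetical URL)
            var response = await Http.GetAsync("https://centralbank.example/results");
            var body = await response.Content.ReadAsStringAsync();

            // 2. Push the result to Kafka or persist it
            PushToKafka(body);

            // 3. Loop immediately if more results remain (header name is an assumption),
            //    otherwise idle for one second before polling again
            bool moreResults = response.Headers.Contains("X-More-Results");
            if (!moreResults)
                await Task.Delay(TimeSpan.FromSeconds(1));
        }
    }

    static void PushToKafka(string message)
    {
        // ... produce to a Kafka topic (or persist to a database) ...
    }
}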

Restful triggering of Camunda process definitions from nodejs

I'm a beginner at Camunda/BPMN and I want to use it to control what is going on in nodejs, most likely using a REST API, at least for now. (Unless folks have a better idea for how nodejs should talk to Camunda.) My goal is to deliver systems where non-programmers can update the business logic in very practical ways.
I'd like to trigger the start of perhaps more than one process by sending a REST message, say to reflect that "a new insurance policy has been sold". That might trigger the instantiation of, say, 2 processes on Monday, but perhaps on Tuesday we add a third, and now the same REST API call should trigger more activity on Wednesday. (I figure it is better for nodejs to know about events but not about the process definitions. After all, my goal is to use Camunda as a sort of business logic server for my application. The less the nodejs code needs to know, the better.)
Which REST API should I be using to express the message that, say "a new insurance policy has been sold"? When I look at:
https://docs.camunda.org/manual/develop/reference/rest/signal/post-signal/
I find it very confusing. What should "name" match in the biz process definitions? I assume I don't need an executionId? I assume I can leave out tenantId?
Would some string in the message match the ID of a start event in one or more process definitions (or what has to match what)?
When I look at a process, is there an easy way to tell what variables I need to supply to start that process running?
Should I perhaps avoid using this event-oriented style of kicking off processes and just use the POST /process-definition/key/{key}/start? It would seem to me to be better form to trigger activity with events or signals or something like that rather than to have my nodejs code know about the specific process definition by name.
Should I be using events or signals in this case?
I gather that the start event should not be a "None Start Event" but I'm not clear on what type of start event TO use if I want automatic triggering based on events or signals or something? Would a "Non-interrupting - Message Start Event" be the right sort? I'm finding this confusing.
Once I have triggered the process to start, what does nodejs need to send to step the process forward from one task in that instance to the next?
Thanks!
In order to instantiate a new workflow instance you have the following possibilities:
Start exactly one instance:
Start a workflow instance by its known "key": https://docs.camunda.org/manual/develop/reference/rest/process-definition/post-start-process-instance/
Start a workflow by a message start event: https://docs.camunda.org/manual/develop/reference/rest/message/post-message/. A message can only start one specific workflow; the relationship has to be unique. The message start event is the one you have to use in your BPMN process model. See also https://docs.camunda.org/manual/develop/reference/bpmn20/events/message-events/. This might indeed be the better approach to make your client independent of the process definition key.
Start multiple instances:
Start a workflow instance by a BPMN signal event: https://docs.camunda.org/manual/develop/reference/rest/signal/post-signal/. The signal name can start many instances at once.
The name of the message or the name of the signal is configured in the BPMN model. Both could work for your use case; a small sketch of the message call follows below.
Once a process instance is started, it will automatically execute the next steps.
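To make the message option concrete, here is a rough sketch of the REST call (in C# purely for illustration; the JSON body is what matters and is the same from nodejs). The engine base URL and the message name NewPolicySold are assumptions; "messageName" must match the Message name configured on the message start event in your BPMN model:
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class CamundaNotifier
{
    static readonly HttpClient Http = new HttpClient();

    // Tell the engine "a new insurance policy has been sold". The process definition
    // whose message start event uses this message name will be instantiated; to fan
    // out to several definitions at once, use the signal endpoint instead.
    public static Task NotifyPolicySoldAsync()
    {
        const string json = @"{
            ""messageName"": ""NewPolicySold"",
            ""processVariables"": {
                ""policyNumber"": { ""value"": ""P-1234"", ""type"": ""String"" }
            }
        }";

        return Http.PostAsync(
            "http://localhost:8080/engine-rest/message",   // assumed engine REST base URL
            new StringContent(json, Encoding.UTF8, "application/json"));
    }
}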
Probably following this example (https://blog.bernd-ruecker.com/use-camunda-without-touching-java-and-get-an-easy-to-use-rest-based-orchestration-and-workflow-7bdf25ac198e) step by step can give you some better idea?

Immediately Display New Metrics

I am using Graphite and Coda Hale Metrics to try and track the number of times particular APIs are called, and also the top 10 callers. I have assigned a metric to each user who calls the API and use Graphite to bring back the top 10.
The problem is, if it is a new user - i.e. a new metric - this will only be displayed in Graphite when the tool is refreshed. Has anyone come across a workaround for this? Is there some way Graphite can automatically detect new meters?
Just to be clear - I can see the top ten API callers for the last 30 minutes.........unless it is a brand new user that has never logged in before.
It seems that graphite-web uses an on-disk index generated by a glorified find command. Another script is available, so you can run it as a cron job to update the metric index file.
Whenever you update the index file, the graphite-web process will detect it and reload it.
Since reloading the index might be heavy for a large number of metrics (1M+), I would advise modifying the update script a bit to conditionally update the file (only if it differs, for instance).
EDIT: after testing, graphite does not seem to call the reloading code
