I need to design a URL Callback Scheduler system for an application with potentially millions of jobs per day; the scheduler will need to do the following:
Provide an API for clients to register a URL callback to be called at a specific date and time. The callback time is between 1 minute and 1 year in the future; in other words, the client can register a callback to be fired 1 minute from now, or a year from now.
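To make the shape of that API concrete, I imagine the registration contract looking roughly like this (a minimal sketch; all names are hypothetical and only illustrate the idea, not an existing design):

import java.time.Instant;

// Hypothetical registration payload: the URL to call and when to call it.
record CallbackRegistration(String callbackUrl, Instant fireAt) {}

// Hypothetical scheduler contract: register returns an id the client can later use to cancel.
interface CallbackScheduler {
    String register(CallbackRegistration registration); // fireAt must be between now + 1 minute and now + 1 year
    void cancel(String registrationId);
}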
My questions are,
1- Is there a design pattern that I can utilize?
2- Are you aware of an open source application that does this?
I've been searching for days to get a clue on how to start but haven't found anything useful; your help is greatly appreciated.
Does someone know what the "sf_max_daily_api_calls" parameter in Heroku mappings does? I don't want to assume it is a daily limit for write operations per object, and I cannot find an explanation.
I tried to open a ticket with Heroku, but in their support ticket form the "Which application?" drop-down is required, and none of the support categories offer anything to choose from there; the only option is "Please choose..."
I tried to find any reference to this field and can't - I can only see it used in Heroku's Quick Start guide, but without an explanation. I have a very busy object I'm working on, read/write, and want to understand any limitations I need to account for.
Salesforce orgs have a rolling 24-hour limit on daily API calls. The limit is generally very generous in test orgs (sandboxes), 5M calls, because you can make stupid mistakes there. In production it's lower. A bit counterintuitive, but it protects their resources and forces you to write optimised code/integrations...
You can see your limit under Setup -> Company Information. There's a formula in the documentation; roughly speaking, you gain more of that limit with every user license you've purchased (more for "real" internal users, less for community users), same as with data storage limits.
Also, every API call is supposed to return the current usage (in a special tag for the SOAP API, in a header for the REST API), so I'm not sure why you'd have to hardcode anything...
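For example, a REST response carries the usage in the Sforce-Limit-Info header; here's a minimal sketch of reading it with java.net.http (the instance URL, API version and token are placeholders, substitute your own):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class LimitCheck {
    public static void main(String[] args) throws Exception {
        String instanceUrl = "https://yourInstance.my.salesforce.com"; // placeholder
        String accessToken = System.getenv("SF_ACCESS_TOKEN");         // placeholder

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(instanceUrl + "/services/data/v52.0/limits"))
                .header("Authorization", "Bearer " + accessToken)
                .GET()
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // Every REST call returns current usage, e.g. "api-usage=18/5000000"
        response.headers().firstValue("Sforce-Limit-Info")
                .ifPresent(usage -> System.out.println("API usage: " + usage));
    }
}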
If you write your operations right, the limit can be very generous. I have no idea how Heroku Connect works internally; ideally you'd spot some mention of "Bulk API 2.0" in its documentation, or try to find out whether it works synchronously or asynchronously.
A normal old-school synchronous update via the SOAP API lets you process 200 records at a time, using 1 API call. The REST Bulk API accepts CSV/JSON/XML of up to 10K records and processes them asynchronously; you poll for an "is it done yet" result... So starting a job, uploading the files, committing the job and then checking, say, only once a minute can easily be 4 API calls, and you can process millions of records before hitting the limit.
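If memory serves, that 4-call flow looks roughly like this with Bulk API 2.0 (a sketch only: error handling is omitted and the job id would really be parsed from the create-job response):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class BulkIngestSketch {
    static final HttpClient HTTP = HttpClient.newHttpClient();
    static final String BASE = "https://yourInstance.my.salesforce.com/services/data/v52.0"; // placeholder
    static final String TOKEN = System.getenv("SF_ACCESS_TOKEN");                            // placeholder

    static HttpResponse<String> call(String method, String path, String contentType, String body) throws Exception {
        HttpRequest.Builder b = HttpRequest.newBuilder()
                .uri(URI.create(BASE + path))
                .header("Authorization", "Bearer " + TOKEN)
                .method(method, body == null
                        ? HttpRequest.BodyPublishers.noBody()
                        : HttpRequest.BodyPublishers.ofString(body));
        if (contentType != null) b.header("Content-Type", contentType);
        return HTTP.send(b.build(), HttpResponse.BodyHandlers.ofString());
    }

    public static void main(String[] args) throws Exception {
        // 1. start the ingest job
        call("POST", "/jobs/ingest", "application/json",
                "{\"object\":\"Account\",\"operation\":\"insert\",\"contentType\":\"CSV\"}");
        String jobId = "...parsed from the response above...";

        // 2. upload the CSV data
        call("PUT", "/jobs/ingest/" + jobId + "/batches", "text/csv", "Name\nAcme\nGlobex\n");

        // 3. commit the job so Salesforce processes it asynchronously
        call("PATCH", "/jobs/ingest/" + jobId, "application/json", "{\"state\":\"UploadComplete\"}");

        // 4. poll for completion, e.g. once a minute
        call("GET", "/jobs/ingest/" + jobId, null, null);
    }
}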
When all else fails (you've exhausted your options, can't optimise any further, can't purchase more user licenses...), I think they sell "packets" of additional API call limit; contact your account representative. But there are lots of things you can try before that, not the least of which is setting up a warning for when you hit, say, a 30% threshold.
Context: in my country a new instant-payment system is planned for a November preview. Basically, the Central Bank will provide two endpoints: (1) a POST endpoint to which we post a single money transfer, and (2) a GET endpoint from which we get the result of a money transfer sent earlier, and the results can come back completely out of order. Each GET answers with a single money-transfer result, and a header indicates whether there is another result we must GET. It never says how many results are available: if there is a result, it comes back in the GET response along with only an indication of whether it is the last one or there are remaining ones for the next GET.
Top constraint: from the moment the end user taps the Transfer button in their mobile app until the final result (success or failure) is shown on their screen, there is a 10-second limit.
Strategy: I want a scheduler that triggers a GET to the Central Bank every second, or even more often. The scheduler will basically invoke a simple function which (see the sketch after this list):
Calls the GET endpoint,
Pushes the result to Kafka or persists it in a database, and
If the response headers indicate that more results are available, starts the same function again.
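In code, I picture that function roughly like this (a sketch using RestTemplate and KafkaTemplate; the endpoint URL, topic name and the "more results" header name are placeholders, the real ones come from the Central Bank's spec, and I loop instead of recursing to avoid stack growth):

import org.springframework.http.ResponseEntity;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Component;
import org.springframework.web.client.RestTemplate;

@Component
public class TransferResultPoller {

    private static final String RESULTS_URL = "https://centralbank.example/transfers/results"; // placeholder
    private static final String MORE_RESULTS_HEADER = "X-More-Results";                        // placeholder

    private final RestTemplate restTemplate = new RestTemplate();
    private final KafkaTemplate<String, String> kafkaTemplate;

    public TransferResultPoller(KafkaTemplate<String, String> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    // One pass: keep GETting while the response header says more results are waiting.
    public void drainResults() {
        boolean more = true;
        while (more) {
            // 1. call the GET endpoint
            ResponseEntity<String> response = restTemplate.getForEntity(RESULTS_URL, String.class);

            // 2. push the result to Kafka (or persist it in a database)
            if (response.getBody() != null) {
                kafkaTemplate.send("transfer-results", response.getBody());
            }

            // 3. if the header says there are remaining results, GET again
            more = "true".equalsIgnoreCase(response.getHeaders().getFirst(MORE_RESULTS_HEADER));
        }
    }
}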
Issue: since we are Spring users/followers, I thought my decision was between Spring Batch and org.springframework.scheduling.annotation.SchedulingConfigurer/TaskScheduler. I have used Spring Batch successfully for a while, but never with such a short trigger period (never a 1-second period). I stumbled on a discussion that made me wonder whether, in my case, a very simple task but with a very short period, I should consider Spring Cloud Data Flow or Spring Cloud Task instead of Spring Batch.
According to this answer, "... Spring Batch is ... designed for the building of complex compute problems ... You can orchestrate Spring Batch jobs with Spring Scheduler if you want". Based on that, it seems I shouldn't use Spring Batch, because my case isn't complex. The challenging design decision is more about the short trigger period, and about triggering another batch from the current one, than about transformation, calculation or an ETL process. Nevertheless, as far as I can see, Spring Batch with its tasklets is well designed for restarting, resuming and retrying, and fits a scenario that never finishes, while org.springframework.scheduling seems to be only a way to trigger an event based on a period configuration. Well, this is my feeling based on personal use and study.
According to an answer to someone asking about orchestration of composed tasks, "... you can achieve your design goals using Spring Cloud Data Flow along with the Spring Cloud Task/Spring Batch...". In my case, I don't see composed tasks: the second trigger doesn't depend on the result of the previous one. It sounds more like "chained" tasks than "composed" ones. I have never used Spring Cloud Data Flow, but it seems a nice candidate for managing/viewing/dashboarding the triggered tasks. Nevertheless, I couldn't find anything documenting limitations or rules of thumb for short-period triggers and "chained" triggers.
So my straight question is: which Spring component is currently recommended for such a short trigger period? Assuming Spring Cloud Data Flow is used for management/dashboards, which Spring component is recommended for triggering in such short-period scenarios? It seems Spring Cloud Task is designed for calling complex functions, Spring Batch seems to add more than I need, and org.springframework.scheduling.* lacks integration with Spring Cloud Data Flow. As an analogy (not a comparison), the AWS documentation clearly says not to use CloudWatch for intervals shorter than one minute; if you want less than a minute, have CloudWatch fire each minute and start another scheduler/cron that runs each second. There might be a well-known rule of thumb for a simple task that needs to be triggered every second, or even more often, while taking advantage of the Spring family's approach/concerns/experience.
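For reference, the plain org.springframework.scheduling option I'm weighing is just this (a minimal sketch; fixedDelay counts from the end of the previous run, so a slow GET never overlaps the next one):

import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.scheduling.annotation.Scheduled;

@Configuration
@EnableScheduling
public class PollingConfig {

    @Scheduled(fixedDelay = 1000) // run 1 second after the previous execution finished
    public void poll() {
        // call the GET / push-to-Kafka function here
    }
}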
This may be a stupid answer, but why do you need a scheduler here? Wouldn't a never-ending job achieve the goal (see the sketch after this list)?
You start a job; it does a GET request and pushes the result to Kafka.
If the GET response indicates there are more results, it immediately does another GET and pushes that result to Kafka.
If the GET response indicates there are no more results, it sleeps for 1 second and then does the GET request again.
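Something along these lines (a rough sketch; the GET and Kafka calls are the same as in the question, so they are stubbed out here):

public class NeverEndingPollJob implements Runnable {

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            boolean more = doGetAndPublish(); // GET + push the result to Kafka
            if (!more) {
                try {
                    Thread.sleep(1000);       // no more results: wait a second, then GET again
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
            // if 'more' was true we loop straight into the next GET
        }
    }

    // Calls the GET endpoint, pushes the body to Kafka, returns whether the header said "more results".
    private boolean doGetAndPublish() {
        // ... RestTemplate / KafkaTemplate calls go here ...
        return false;
    }
}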
All,
I have some client code that needs to execute tasks that might be long running. A user will want to upload a video for processing, which could take a long time and/or possibly fail. Another user might want to upload a small picture, which could finish quickly or also fail. In all cases I need to be able to update the client code with some sort of progress as the job moves along. Is there a Spring 4 solution for this kind of pattern? I have found many pub/sub solutions, but they were all several years old. I am hoping this type of problem is now common enough to have a structured solution.
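For concreteness, this is the kind of thing I'm after: a sketch using Spring's SseEmitter (available from Spring 4.2) to stream progress events to the client. I don't know whether this is the recommended pattern; the job lookup is omitted and the progress loop here just simulates the real processing.

import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.servlet.mvc.method.annotation.SseEmitter;

import java.util.concurrent.Executors;

@RestController
public class JobProgressController {

    @RequestMapping("/jobs/{id}/progress")
    public SseEmitter progress(@PathVariable String id) {
        SseEmitter emitter = new SseEmitter();

        // In real code the long-running job would report its own progress; this thread just simulates it.
        Executors.newSingleThreadExecutor().submit(() -> {
            try {
                for (int pct = 0; pct <= 100; pct += 10) {
                    emitter.send("job " + id + ": " + pct + "%");
                    Thread.sleep(500);
                }
                emitter.complete();
            } catch (Exception e) {
                emitter.completeWithError(e);
            }
        });
        return emitter;
    }
}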
I'm developing a basic messaging system on Parse.com at the moment, and I've noticed in the Events Analytics screen that I'm hitting 30,000+ requests per day. This is a shock considering I'm the only person using the system at the moment. Obviously, with a few users I would blow through my API request limit straight away.
I'm pretty experienced with Parse.com these days, so I'm lean with queries and I'm careful not to put finds, saves, retrieves, etc. in for loops. I also understand that saveAll() on an array of ParseObjects doesn't always limit the request count to 1 (depending on the relationships inside that object).
So how does one track down where the excessive calls are coming from?
I see the above Analytics > Performance > Served Requests data, but how do I drill down to see if cloud code or iOS is the culprit?
My current solution is to effectively unit test each block of Parse code and look at the results in the above screen.
For the benefit of others who may happen upon this thread with the same questions, I found some techniques to hunt down where excessive requests are coming from.
1) Parse's documentation on the APIs themselves is really good, but there isn't a lot of information or guidance for the admin interfaces. Under Analytics -> Explorer -> Make a table, there is an option to download all the requests for a specific day (to import into a spreadsheet). The data isn't very detailed, though, and the dates are epoch timestamps, so it's hard to follow. At least you can see [Request Type, Class, Installation ID], e.g. ["find", "MyParseClass", "Cloud Code"].
2) My other technique was to add custom Analytic events to the code. So in Cloud Code for example, I added the following line to each beforeSave and afterSave event:
Parse.Analytics.track('MyClass_beforeSave', null);
3) Obviously, Parse logs these calls in the Logs window, but given that you can only see the most recent transactions and can't clear them, I found it mostly unhelpful in tracking down the excessive calls.
Update
I should have added from the outset - this is in Microsoft Dynamics CRM 2011
I know CRM well, but I'm at a loss to explain behaviour on my current deployment.
Please read the outline of my scenario to help me understand which of my presumptions / understandings is wrong (and therefore what is causing this error). It's not consistent with my expectations.
Basic Scenario
Requirement demands that a web service is called every X minutes (it adds pending items to a database index)
I've opted to use a workflow / custom entity trigger model (i.e. I have a custom entity with a CREATE plugin registered. The plugin executes my logic. An accompanying workflow starts on creation and waits until the "completed" time + [timeout period] expires; on expiry, it creates a new trigger record and the workflow ends).
The plugin logic works just fine. The workflow concept works fine to a point, but after a period of time the workflow stalls with a failure:
This workflow job was canceled because the workflow that started it included an infinite loop. Correct the workflow logic and try again. For information about workflow logic, see Help.
So in a nutshell - standard infinite loop detection. I understand the concept and why it exists.
Specific deployment
Firstly, I think it's quite safe for us to ignore the content of the plugin code in this scenario. It works fine, it's atomic and hardly touches CRM (to be clear, it is a pre-event plugin which runs the remote web service, awaits a response and then sets the "completed on" date/time attribute on my Trigger record before passing the Target entity back into the pipeline). So long as a Trigger record is created, this code runs and does what it should.
Having discounted the content of the plugin, there might be an issue that I don't appreciate in having the plugin registered on the pre-create step of the entity...
So that leaves the workflow itself. It's a simple one. It runs thusly:
On creation of a new Trigger entity...
it has a Timeout of Trigger.new_completedon + 15 minutes
on timeout, it creates a new Trigger record (with no "completed on" value; that is set by the plugin, remember)
That's all; no explicit "end workflow" step (though I've just added one now and will set it off for testing...)
With this set-up, I manually create a new Trigger record and the process spins nicely into action. Roll forward 1 h 58 min (based on the last cycle I ran, remembering that my plugin code may take a minute to finish running): after 7 successful execution cycles (i.e. new workflow jobs being created and completed), the 8th one fails with the aforementioned error.
What I already know (correct me where I'm wrong)
Recursion depth, by default, is set to 8. If a workflow / plugin calls itself 8 times then an infinite loop is detected.
Recursion depth is reset every one hour (or 10 minutes - see "Warnings" in linked blog?)
Recursion depth settings can be set via PowerShell or SDK code using the Deployment Web Service in an on-premise deployment only (via the Set-CrmSetting Cmdlet)
What I don't want to hear (please)
"Change recursion depth settings"
I cannot change the Deployment recursion depth settings as this is not an option in an online scenario - ultimately I will be deploying to CRM Online too.
"Increase the timeout period on your workflow"
This is not an option either - the reindex needs to occur every 15 minutes, ideally sooner.
Update
@Boone suggested below that the recursion depth is reset after 60 minutes of inactivity rather than every 60 minutes. Therein lies the first misunderstanding.
While discussing with @alex, I suggested that there may be some persistence of CorrelationId between creating an entity via the workflow and the workflow that ultimately gets spawned... Well, there is. The CorrelationId is the same in both the plugin and the workflow and in any records that spool from that thread. I am now looking at ways to decouple the CorrelationId (or perhaps the creation of records) from the entity and the workflow.
For the one-hour "reset" to take place, you have to have NO activity for an hour; it doesn't reset simply one hour after the original. So since you have activity every 15 minutes, it never gets a chance to reset. I don't know that this is set in stone anywhere... but that's my experience.
In CRM 4 it was possible to create a CRM Service (Google "creating a CRM service in the child pipeline") and reset the correlation ID (using CorrelationToken.NewToken()). I don't see anything so easy in the 2011 SDK. I have no idea whether this trick worked in the online environment. Is 2011 Online backwards compatible with CRM 4 plug-ins?
One thing you could try would be to use IExecutionContext.CorrelationId to scavenge the asyncoperation (System Job) table. But according to the metadata, the attributes I think might be useful (CorrelationId, CorrelationUpdatedTime, Depth) are NOT valid for update. Maybe you could delete the rows? Even that may not help.
I doubt it can be solved that way.
I'd suggest a different approach: deploy a simple application alongside CRM and let it call the web service, which in turn can use the XRM endpoints in order to change the records.
UPDATE
Or you can try something like this when you initialize your CRM service in the plugin (I dug it up from one of my plugins), leaving your workflow untouched:
CrmService service = new CrmService();
// initialize the service here, then...
CorrelationToken newToken = new CorrelationToken();
newToken.CorrelationId = context.CorrelationId;
newToken.CorrelationUpdatedTime = context.CorrelationUpdatedTime;
// WILD GUESS: enforce unlimited depth?
newToken.Depth = 0; // THIS WAS: context.Depth;
// update the correlation token on the service
service.CorrelationTokenValue = newToken;
I admit I don't really remember much about this (code dates back to about 2 years ago), but it might help.