I get an "execution time exceeded" exception when calling the RESTlet API from Postman or another external source.
That means your SuiteScript is processing too much data and is possibly hitting the governance limit.
Avoid unnecessary module operations such as the following:
Too many record.load calls.
Nested queries. These will kill your governance.
Processing more than 299 record operations in one RESTlet call. Use a Map/Reduce script instead.
Loading transaction records without applying the 'mainline' filter appropriately. This can bring back thousands of records, resulting in your error.
More than likely you are not returning anything from your RESTlet function.
Is there a way to redefine the database "transactional" boundary on a Spring Batch job?
We have a simple payment processing job that reads x number of payment records, processes them, and marks the records in the database as processed. Currently, the writer does a REST API call (to the payment gateway), processes the API response, and marks the records as processed. We're using a chunk-oriented approach, so the updates aren't flushed to the database until the whole chunk has completed. Since basically the whole read/write is within a transaction, we are starting to see excessive database locks and contention. For example, if the API takes a long time to respond (say 30 seconds), the whole application starts to suffer.
We can obviously reduce the timeout for the API call to a smaller value, but that still doesn't solve the issue of the tables potentially being locked for longer than is desirable. Ideally, we want to keep the database transaction as short-lived as possible. Our thought is that if the "meat" of what the job does can be done outside of the database transaction, we could get around this issue. So, if the API call happens outside of a database transaction, we can afford to let it take a few more seconds to return the response without adding to the lock duration.
Is this the right approach? If not, what would be the recommended way to approach this "simple" job in spring-batch fashion? Are there other batch tools better suited for the task? (if spring-batch is not the right choice).
Open to providing more context if needed.
I don't have a precise answer to all your questions but I will try to give some guidelines.
> Since basically the whole read/write is within a transaction, we are starting to see excessive database locks and contention. For example, if the API takes a long time to respond (say 30 seconds), the whole application starts to suffer.
Since its inception, the term batch processing or processing data in "batches" is based on the idea that a batch of records is treated as a unit: either all records are processed (whatever the term "process" means) or none of the records is processed. This "all or nothing" semantic is exactly what Spring Batch implements in its chunk-oriented processing model. Achieving such a (powerful) property comes with trade-offs. In your case, you need to make a trade-off between consistency and responsiveness.
> We can obviously reduce the timeout for the API call to a smaller value, but that still doesn't solve the issue of the tables potentially being locked for longer than is desirable.
The chunk size is the parameter with the most impact on transaction behaviour. Try reducing the number of records processed within a single transaction and observe the result. There is no best value; finding a good one is an empirical process that also depends on the responsiveness of the API you call while processing a chunk.
> Our thought is that if the "meat" of what the job does can be done outside of the database transaction, we could get around this issue. So, if the API call happens outside of a database transaction, we can afford to let it take a few more seconds to return the response without adding to the lock duration.
A common technique to avoid doing such updates on a live system is to offload the processing to another datastore and then replicate the updates in a single transaction. The idea is to mark records with a given batch id and copy those records to a different datastore (or even a temporary table within the same datastore) that the batch process can use without impacting the live datastore. Once the processing is done (it can be run in parallel to improve performance), the records can be marked as processed in the live system within a single transaction. This is usually very fast and can use the batch id to identify which records to update.
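To make the batch-id idea a bit more concrete, here is a rough sketch. It is plain Python against an in-memory SQLite database, not Spring Batch code, and the `payments` table, its columns, and the `call_gateway` function are all hypothetical stand-ins. The only point it tries to show is that the slow API call runs while no database transaction is open, with one short transaction before it (claim the records) and one after it (mark them processed).

```python
import sqlite3
import uuid


def call_gateway(payment_id):
    """Hypothetical stand-in for the slow payment gateway REST call."""
    return "OK"


def claim_batch(conn, size):
    """Short transaction 1: tag up to `size` unclaimed rows with a new batch id."""
    batch_id = uuid.uuid4().hex
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "UPDATE payments SET batch_id = ? WHERE id IN ("
            "SELECT id FROM payments WHERE batch_id IS NULL AND processed = 0 "
            "ORDER BY id LIMIT ?)",
            (batch_id, size),
        )
    return batch_id


def process_batch(conn, batch_id):
    """Call the gateway outside any transaction, then mark successes in one go."""
    rows = conn.execute(
        "SELECT id FROM payments WHERE batch_id = ?", (batch_id,)
    ).fetchall()
    # Slow work happens here; nothing is locked because no transaction is open.
    succeeded = [pid for (pid,) in rows if call_gateway(pid) == "OK"]
    # Short transaction 2: mark the successful records as processed.
    with conn:
        conn.executemany(
            "UPDATE payments SET processed = 1 WHERE id = ? AND batch_id = ?",
            [(pid, batch_id) for pid in succeeded],
        )
    return len(succeeded)


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE payments ("
        "id INTEGER PRIMARY KEY, batch_id TEXT, processed INTEGER DEFAULT 0)"
    )
    with conn:
        conn.executemany(
            "INSERT INTO payments (id) VALUES (?)", [(i,) for i in range(1, 6)]
        )
    batch_id = claim_batch(conn, 3)
    done = process_batch(conn, batch_id)
    remaining = conn.execute(
        "SELECT COUNT(*) FROM payments WHERE processed = 0"
    ).fetchone()[0]
    return done, remaining
```

With five hypothetical payments and a batch size of three, `demo()` returns `(3, 2)`: three records processed and two left for the next batch. The trade-off is that a crash between the two transactions leaves claimed but unprocessed rows, so a real job would also need a way to find and re-run stale batch ids.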
I'm trying to implement a batch query interface with GraphQL. I can get a request to work synchronously without issue, but I'm not sure how to approach making the result asynchronous. Basically, I want to be able to kick off the query and return a pointer of sorts to where the results will eventually be when the query is done. I'd like to do this because the queries can sometimes take quite a while.
In REST, this is trivial. You return a 202 and return a Location header pointing to where the client can go to fetch the result. GraphQL as a specification does not seem to have this notion; it appears to always want requests to be handled synchronously.
Is there any convention for doing things like this in GraphQL? I very much like the query specification but I'd prefer to not leave the client HTTP connection open for up to a few minutes while a large query is executed on the backend. If anything happens to kill that connection the entire query would need to be retried, even if the results themselves are durable.
What you're trying to do is not easily solved in a spec-compliant way. Apollo introduced the idea of a @defer directive that does pretty much what you're looking for, but it's still an experimental feature. I believe Relay Modern is trying to do something similar.
The idea is effectively the same -- the client uses a directive to mark a field or fragment as deferrable. The server resolves the request but leaves the deferred field null. It then sends one or more patches to the client with the deferred data. The client is able to apply the initial request and the patches separately to its cache, triggering the appropriate UI changes each time as usual.
I was working on a similar issue recently. My use case was to submit a job that creates a report and to provide the result back to the user. Creating a report takes a couple of minutes, which makes it an asynchronous operation. I created a mutation that submits the job to the backend processing system and returns a job ID. I then periodically poll the jobs field using a query to find out the state of the job and, eventually, the results. As the result is a file, I return a link to a different endpoint where it can be downloaded (similar to the approach GitHub uses).
Polling for actual results is working as expected but I guess this might be better solved by subscriptions.
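For anyone who wants to see the shape of the job-ID approach, here is a minimal sketch of the backend side. It is plain Python rather than code for any GraphQL library, and every name in it (`submit_report_job`, `get_job`, `complete_job`, the in-memory `JOBS` store, the field names) is hypothetical. In a real schema, `submit_report_job` would sit behind the mutation resolver and `get_job` behind the polled query field.

```python
import time
import uuid

# Hypothetical in-memory job store; a real backend would persist this.
JOBS = {}


def submit_report_job(params):
    """Plays the role of the mutation: record the job and return its ID at once."""
    job_id = uuid.uuid4().hex
    JOBS[job_id] = {"state": "PENDING", "params": params, "result_url": None}
    return job_id


def complete_job(job_id, result_url):
    """Called by the (hypothetical) background worker when the report is ready."""
    JOBS[job_id].update(state="DONE", result_url=result_url)


def get_job(job_id):
    """Plays the role of the polled query: report state and, once done, a link."""
    job = JOBS[job_id]
    return {"id": job_id, "state": job["state"], "resultUrl": job["result_url"]}


def poll_until_done(job_id, attempts, interval=2.0, sleep=time.sleep):
    """Client-side polling loop; returns the final status, or None on timeout."""
    for _ in range(attempts):
        status = get_job(job_id)
        if status["state"] == "DONE":
            return status
        sleep(interval)
    return None
```

Because the job state lives on the server, a dropped client connection costs nothing: the client simply polls again with the same job ID.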
I am using the Elasticsearch bulk API to index and delete a lot of documents at once. If there is an error for one document, the other documents are still indexed or deleted successfully. This leaves the data in Elasticsearch in an inconsistent state, because in my case the documents are related to each other: if one document's field has some value, there are other documents that should have the same value for that field. I am not sure how to handle such errors from bulk requests. Is it possible to roll back a request in any way? I read similar questions but could not find a solution for such cases. Or, instead of a rollback, is there a way to send the data only if there is no error, or to do something like a dry run of the request?
I'm late to the question but will answer for whoever runs across a similar scenario in the future.
After executing an Elasticsearch (ES) bulk API call, i.e. a BulkRequest, you get back a BulkResponse, which consists of one or more BulkItemResponse objects. BulkItemResponse has a method isFailed() that tells you whether that action failed. In your case, you can traverse all the items in the response, check for failures, and handle the failed items as your requirements dictate.
The code will look something like this for Synchronous execution:
val bulkResponse: BulkResponse = restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT)
bulkResponse.getItems.filter(_.isFailed).foreach { item =>
  // your logic to handle failures, e.g. inspect item.getFailureMessage
}
For Asynchronous execution, you can provide a listener which will be called after the execution is completed. You have to override onResponse() and onFailure() in this case. You can read more about it at https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-document-bulk.html
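If you talk to the _bulk REST endpoint directly instead of going through the Java client, the same per-item check can be done on the JSON response. The sketch below is Python and only works on an already-parsed response body; the `failed_items` helper is a hypothetical name. It relies on the response carrying a top-level "errors" flag and an "items" list in which each failed action contains an "error" object.

```python
def failed_items(bulk_response):
    """Return (position, action, doc_id, reason) for every failed bulk item."""
    if not bulk_response.get("errors"):
        return []  # the top-level flag says every action succeeded
    failures = []
    for position, item in enumerate(bulk_response["items"]):
        # Each item is a one-key dict such as {"index": {...}} or {"delete": {...}}.
        action, result = next(iter(item.items()))
        if "error" in result:
            failures.append(
                (position, action, result.get("_id"), result["error"].get("reason"))
            )
    return failures


# Hand-written sample response, for illustration only.
sample = {
    "took": 30,
    "errors": True,
    "items": [
        {"index": {"_index": "docs", "_id": "1", "status": 201}},
        {
            "index": {
                "_index": "docs",
                "_id": "2",
                "status": 400,
                "error": {"type": "mapper_parsing_exception", "reason": "bad field"},
            }
        },
        {"delete": {"_index": "docs", "_id": "3", "status": 200}},
    ],
}
```

Calling `failed_items(sample)` returns `[(1, "index", "2", "bad field")]`. The position is kept because the items come back in the same order as the actions in the request, which lets you map a failure back to the action that caused it. Note that this only detects failures after the fact; the other actions in the request have already been applied.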
The solution shared above, using the BulkResponse output, is basically for deciding how to handle the next batch of requests. What if I want to stop the batch processing at the position where a request in the batch failed? We are sending bulk events that are related to each other. An example of my issue: given a batch (E1-E10), if the batch fails at E5, I don't want E6-E10 to be processed, because they are related. I want an immediate response in that case.
I'm getting the servingLimitExceeded error message for results within a batch, but not for an entire batch. For example, I may get 100 records responding with this error, and then it starts returning more results, all within a single batch.
If batches are handled internally by the Google API, how can I adjust them to not hit the rate limit? I tried adding a 1-second delay between batches, but that doesn't change this. I also set retries = 3 on the Ruby client, but I don't know if that means it retries a failed batch. I don't think it's retrying individual API calls within the batch, because the back-off should resolve this.
Do I have to record the failed results and create a new batch to recover those separately?
Incidentally, the documented quota limit errors are confusing. There are dailyLimitExceeded and rateLimitExceeded messages but this isn't returning one of those. The servingLimitExceeded description of "The overall rate limit specified for the API has already been reached" is not all that helpful but I'm assuming this is the rate limit that we hit.
Looking at the code, I see that the retries in the Ruby google-api-client only apply to transmission and authorization (401) errors. A 403 (which is what a rate limit error returns) raises a ClientError, which is not retried.
So setting retries on the client object has no bearing on this.
Is there something I can do to address this in the batch?
We received word from the Webmaster team that the API is limited to 20 QPS and that there is currently no way to go higher.
One suggested solution is to make smaller batch requests.
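As a rough illustration of "smaller batches", combined with the earlier idea of recording failed results and sending them again, here is a sketch in Python rather than Ruby. It does not use the Google client library: `send_batch` is a hypothetical callable that sends one batch and returns the items that came back with a rate limit error. The pacing assumes that every call inside a batch counts individually toward the 20 QPS limit, which matches the behaviour described above but is an assumption on my part.

```python
import time


def send_in_small_batches(items, send_batch, batch_size=10, max_qps=20,
                          max_rounds=3, sleep=time.sleep):
    """Send items in small, paced batches and re-queue rate-limited ones.

    `send_batch` is a hypothetical callable: it takes a list of items and
    returns the sub-list that failed with a rate limit error.
    """
    pending = list(items)
    for _ in range(max_rounds):
        if not pending:
            break
        failed = []
        for start in range(0, len(pending), batch_size):
            batch = pending[start:start + batch_size]
            failed.extend(send_batch(batch))
            # Assumption: each item in a batch counts as one query, so pause
            # long enough that the average rate stays at or below max_qps.
            sleep(len(batch) / max_qps)
        pending = failed  # only the failed items go into the next round
    return pending  # items still failing after max_rounds
```

Anything returned at the end is still failing after `max_rounds` attempts and needs separate handling, for example a later run with a longer pause.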
I am using MongoDB to store users' events. There is a document for every user, containing an array of events. The system processes thousands of events a minute and inserts each one of them into MongoDB.
The problem is that I get poor performance for the update operation. Using a profiler, I noticed that WriteResult.getError is the call that incurs the performance impact.
That makes sense: the update is async, but if you want to retrieve the operation result, you need to wait until the operation has completed.
My question: is there a way to keep the update async, but only get an exception if an error occurs? (99.999% of the time there is no error, so the system waits for nothing.) I understand this means the exception will be raised somewhere further down the process flow, but I can live with that.
Any other suggestions?
The application is written in Java so we're using the Java driver, but I am not sure it's related.
Have you created indexes on your records?
A missing index may be the cause of your performance problem.
If you have not done so already, you should create an index on your collection.
For more help, visit http://www.mongodb.org/display/DOCS/Indexes