Webmaster API v3: getting servingLimitExceeded using batch requests

Webmaster API v3: getting servingLimitExceeded using batch requests - google-api

I'm getting the servingLimitExceeded error message for results within batch but not for an entire batch. For example, I may get 100 records responding with this error and then it starts returning more results. All within the a single batch.
If batches are handled internally by Google API, how can I adjust them to not hit the rate limit? I tried adding a 1-second delay between batches but that doesn't change this. I also set retries = 3 on the Ruby client, but I don't know if that means it retries a failed batch. I don't think it's retrying individual API calls within the batch, because the back-off should resolve this.
Do I have to record the failed results and create a new batch to recover those separately?
Incidentally, the documented quota limit errors are confusing. There are dailyLimitExceeded and rateLimitExceeded messages but this isn't returning one of those. The servingLimitExceeded description of "The overall rate limit specified for the API has already been reached" is not all that helpful but I'm assuming this is the rate limit that we hit.
Update
Looking at the code, I see that the retries in the ruby google-api-client only apply to transmission and authorization (401) errors. A 403 (which is what rate limit returns) raises a ClientError which is not retried anyway.
So setting retries on the client object has no bearing on this.
Is there something I can do to address this in the batch?

We received word from Webmaster team that the API is limited to 20QPS and there is currently no way to go higher.
One suggested solution is to make smaller batch requests.

Related

How to configure pull timeout on GCP PubSubTemplate?

From the documentation on GCP it says:
Otherwise, the system may wait (for a bounded amount of time) until at least one message is available, rather than returning no messages.
Is there any way of configuring this "(for a bounded amount of time)"? I am using spring-cloud-gcp.

As mentioned in the document returnImmediately is deprecated.
However you can set the timeout to any Asynchronous Pull by mentioning the timeout value in StreamingPullFuture#result. If you don't set timeout then the result() will block indefinitely, unless an exception is encountered first.
You can go through the Pub/Sub Receiving messages Quickstart to understand better. You can also check this documentation regarding the different types of pull subscriptions.

Kafka Producer is not retrying after Timeout

Intermittently(once or twice in a month) I am seeing the error
org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for cart-topic-0: 5109 ms has passed since batch creation plus linger time
in my logs due to which the corresponding message was not processed by Kafka Producer.
Though all the brokers are up and available I'm not sure why this error is being observed. Even the load is not much during this period.
I have set the retries property value to 10 in Producer configs but still, the message was not been retried. Is there anything else I need to add for the Kafka send method? I have gone through the similar issues raised, but there is no proper conclusion for this error.
Can someone please help on how to fix this.

From the KIP proposal which is now addressed
We propose adding a new timeout delivery.timeout.ms. The window of enforcement includes batching in the accumulator, retries, and the inflight segments of the batch. With this config, the user has a guaranteed upper bound on when a record will either get sent, fail or expire from the point when send returns. In other words we no longer overload request.timeout.ms to act as a weak proxy for accumulator timeout and instead introduce an explicit timeout that users can rely on without exposing any internals of the producer such as the accumulator.
So basically, post this now you can additionally be able to configure a delivery timeout and retries for every async send you execute.

I had an issue where retries were not being obeyed, but in my particular case it was because we were calling the get() method on send for synchronous behaviour. We hadn't realized it would impact retries.
In investigating the issue through various paths I came across the definition of the sorts of errors that are retrial
https://kafka.apache.org/11/javadoc/org/apache/kafka/common/errors/RetriableException.html
What had confused me is that timeout was listed as a retrial one.
I would normally have suggested you would want to look into if the delivery of your batches was taking too long and messages in your buffer were expiring due to increased volume, but you've mentioned that the volume isn't particularly high.
Did you determine if increasing the request.timeout.ms has an impact on the frequency of occurrence? It might be more of a treating the symptom step than the cause.

SparkPost API error: 'Too many requests'

When I run requests series like
https://api.sparkpost.com:443/api/v1/suppression-list/he**0#gmail.com
Sometimes, I get error:
name: 'SparkPostError',
errors: [ { message: 'Too many requests' } ],
statusCode: 429
Too many - this is how many?
How long the server keeps track of period? How long the server resets the counter? How to solve this problem?

https://developers.sparkpost.com/api/index.html#header-rate-limiting
After a quick look at the site...

The answer of support
As mentioned before we do limit requests on our endpoints to prevent abuse and while we can’t reveal the actual limits we do recommend that you wait several seconds before making consecutive requests. If you are making these requests via some type of automated process (script, code, etc.), we highly recommend that you add a wait for several seconds upon seeing this 429 error and reattempt after. Decreasing the frequency of your requests and waiting before reattempting is the only way around this error message.

https://developers.sparkpost.com/api/index.html#header-rate-limiting
Rate Limiting
Note: To prevent abuse, our servers enforce request rate limiting, which may trigger responses with HTTP status code 429.
SparkPost implements rate limiting on the following API endpoints:
/api/v1/message-events
/api/v1/metrics/*
The limits imposed here are dynamic but as a general rule, polling these endpoints more than once in 2 minutes may encounter rate limiting and a 429 status code.

Does the Google Analytics API throttle requests?

Does the Google Analytics API throttle requests?
We have a batch script that I have just moved from v2 to v3 of the API and the requests go through quite well for the first bit (50 queries or so) and then they start taking 4s or so each. Is this Google throttling us?

While Matthew is correct, I have another possibility for you. Google analytics API cashes your requests to some extent. Let me try and explain.
I have a customer / site that I request data from. While testing I noticed some strange things.
the first million rows results would come back with in an acceptable amount of time.
after a million rows things started to slow down we where seeing results returning in 5 times as much time instead of 5 minutes we where waiting 20 minutes or more for the results to return.
Example:
Request URL :
https://www.googleapis.com/analytics/v3/data/ga?ids=ga:34896748&dimensions=ga:date,ga:sourceMedium,ga:country,ga:networkDomain,ga:pagePath,ga:exitPagePath,ga:landingPagePath&metrics=ga:entrances,ga:pageviews,ga:exits,ga:bounces,ga:timeOnPage,ga:uniquePageviews&filters=ga:userType%3D%3DReturning+Visitor;ga:deviceCategory%3D%3Ddesktop&start-date=2014-05-12&end-date=2014-05-22&start-index=236001&max-results=2000&oauth_token={OauthToken}
Request Time (seconds:milliseconds): :0:484
Request URL :
https://www.googleapis.com/analytics/v3/data/ga?ids=ga:34896748&dimensions=ga:date,ga:sourceMedium,ga:country,ga:networkDomain,ga:pagePath,ga:exitPagePath,ga:landingPagePath&metrics=ga:entrances,ga:pageviews,ga:exits,ga:bounces,ga:timeOnPage,ga:uniquePageviews&filters=ga:userType%3D%3DReturning+Visitor;ga:deviceCategory%3D%3Ddesktop&start-date=2014-05-12&end-date=2014-05-22&start-index=238001&max-results=2000&oauth_token={OauthToken}
Request Time (seconds:milliseconds): :7:968
I did a lot of testing stopping and starting my application. I couldn't figure out why the data was so fast in the beginning then slow later.
Now I have some contacts on the Google Analytics Development team the guys in charge of the API. So I made a nice test app, logged some results showing my issue and sent it off to them. With the question Are you throttling me?
They where also perplexed, and told me there is no throttle on the API. There is a flood protection limit that Matthew speaks of. My Developer contact forwarded it to the guys in charge of the traffic.
Fast forward a few weeks. It seams that when we make a request for a bunch of data Google cashes the data for us. Its saved on the server incase we request it again. By restarting my application I was accessing the cashed data and it would return fast. When I let the application run longer I would suddenly reach non cashed data and it would take longer for them to return the request.
I asked how long is data cashed for, answer there was no set time. So I don't think you are being throttled. I think your initial speedy requests are cashed data and your slower requests are non cashed data.
Email back from google:
Hi Linda,
I talked to the engineers and they had a look. The response was
basically that they thinks it's because of caching. The response is
below. If you could do some additional queries to confirm the behavior
it might be helpful. However, what they need to determine is if it's
because you are querying and hitting cached results (because you've
already asked for that data). Anyway, take a look at the comments
below and let me know if you have additional questions or results that
you can share.
Summary from talking to engineer: "Items not already in our cache will
exhibit a slower retrieval processing time than items already present
in the cache. The first query loads the response into our cache and
typical query times without using the cache is about 7 seconds and
with using the cache is a few milliseconds. We can also confirm that
you are not hitting any rate limits on our end, as far as we can tell.
To confirm if this is indeed what's happening in your case, you might
want to rerun verified slow queries a second time to see if the next
query speeds up considerably (this could be what you're seeing when
you say you paste the request URL into a browser and results return
instantly)."
-- IMBA Google Analytics API Developer --

Google's Analytics API does have a rate limit per their docs: https://developers.google.com/analytics/devguides/reporting/core/v3/coreErrors
However they should not caused delayed requests, rather the request should be returned with a response of: 403 userRateLimitExceeded
Description of that error:
Indicates that the user rate limit has been exceeded. The maximum rate limit is 10 qps per IP address. The default value set in Google Developers Console is 1 qps per IP address. You can increase this limit in the Google Developers Console to a maximum of 10 qps.
Google's recommended course of action:
Retry using exponential back-off. You need to slow down the rate at which you are sending the requests.

errorCode=003001; statusCode=403; source=Throttling Policy

Lately, in our app using IPP data services, we have encountered these errors from time to time.
<RestResponse xmlns="http://www.intuit.com/sb/cdm/v2">
<Error RequestId="49f7926a9aa84cfc8289534801dee72d">
<RequestName>ErrorRequest</RequestName>
<ProcessedTime>2012-12-07T10:10:59+00:00</ProcessedTime>
<ErrorCode>3001</ErrorCode>
<ErrorDesc>message=This client has made too many consecutive requests over too short a period of time. Please wait a short amount of time before attempting to submit again; errorCode=003001; statusCode=403; source=Throttling Policy</ErrorDesc>
</Error>
</RestResponse>
Can't find any reference to a "Throttling Policy" or error code "3001" anywhere in the IPP documentation.
Any help in figuring out what the throttling limits are would be appreciated. Are they based around an IP, rate limitation, concurrency limitation, OAuth consumer, OAuth client, something else perhaps?
EDIT: Link to IDN forums about the same issue: https://idnforums.intuit.com/messageview.aspx?catid=69&threadid=18910.

Yes, there is a throttling process in place if > 500 requests per minute by a single user or against a single realm.
You had over 600 requests during the one minute period.
Looks like almost all (all except 32 requests) were individual customer queries… all different customer record ids. Is there a way you can make a single customer list query, filtered if necessary, to get a bunch of customer records in a single request and reduce the number of calls you are making.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

Webmaster API v3: getting servingLimitExceeded using batch requests - google-api

We received word from Webmaster team that the API is limited to 20QPS and there is currently no way to go higher. One suggested solution is to make smaller batch requests.

Related

How to configure pull timeout on GCP PubSubTemplate?

Kafka Producer is not retrying after Timeout

SparkPost API error: 'Too many requests'

Does the Google Analytics API throttle requests?

errorCode=003001; statusCode=403; source=Throttling Policy

Categories

Resources