Does the Google Analytics API throttle requests? - google-api

Does the Google Analytics API throttle requests?
We have a batch script that I have just moved from v2 to v3 of the API and the requests go through quite well for the first bit (50 queries or so) and then they start taking 4s or so each. Is this Google throttling us?

While Matthew is correct, I have another possibility for you. Google analytics API cashes your requests to some extent. Let me try and explain.
I have a customer / site that I request data from. While testing I noticed some strange things.
the first million rows results would come back with in an acceptable amount of time.
after a million rows things started to slow down we where seeing results returning in 5 times as much time instead of 5 minutes we where waiting 20 minutes or more for the results to return.
Example:
Request URL :
https://www.googleapis.com/analytics/v3/data/ga?ids=ga:34896748&dimensions=ga:date,ga:sourceMedium,ga:country,ga:networkDomain,ga:pagePath,ga:exitPagePath,ga:landingPagePath&metrics=ga:entrances,ga:pageviews,ga:exits,ga:bounces,ga:timeOnPage,ga:uniquePageviews&filters=ga:userType%3D%3DReturning+Visitor;ga:deviceCategory%3D%3Ddesktop&start-date=2014-05-12&end-date=2014-05-22&start-index=236001&max-results=2000&oauth_token={OauthToken}
Request Time (seconds:milliseconds): :0:484
Request URL :
https://www.googleapis.com/analytics/v3/data/ga?ids=ga:34896748&dimensions=ga:date,ga:sourceMedium,ga:country,ga:networkDomain,ga:pagePath,ga:exitPagePath,ga:landingPagePath&metrics=ga:entrances,ga:pageviews,ga:exits,ga:bounces,ga:timeOnPage,ga:uniquePageviews&filters=ga:userType%3D%3DReturning+Visitor;ga:deviceCategory%3D%3Ddesktop&start-date=2014-05-12&end-date=2014-05-22&start-index=238001&max-results=2000&oauth_token={OauthToken}
Request Time (seconds:milliseconds): :7:968
I did a lot of testing stopping and starting my application. I couldn't figure out why the data was so fast in the beginning then slow later.
Now I have some contacts on the Google Analytics Development team the guys in charge of the API. So I made a nice test app, logged some results showing my issue and sent it off to them. With the question Are you throttling me?
They where also perplexed, and told me there is no throttle on the API. There is a flood protection limit that Matthew speaks of. My Developer contact forwarded it to the guys in charge of the traffic.
Fast forward a few weeks. It seams that when we make a request for a bunch of data Google cashes the data for us. Its saved on the server incase we request it again. By restarting my application I was accessing the cashed data and it would return fast. When I let the application run longer I would suddenly reach non cashed data and it would take longer for them to return the request.
I asked how long is data cashed for, answer there was no set time. So I don't think you are being throttled. I think your initial speedy requests are cashed data and your slower requests are non cashed data.
Email back from google:
Hi Linda,
I talked to the engineers and they had a look. The response was
basically that they thinks it's because of caching. The response is
below. If you could do some additional queries to confirm the behavior
it might be helpful. However, what they need to determine is if it's
because you are querying and hitting cached results (because you've
already asked for that data). Anyway, take a look at the comments
below and let me know if you have additional questions or results that
you can share.
Summary from talking to engineer: "Items not already in our cache will
exhibit a slower retrieval processing time than items already present
in the cache. The first query loads the response into our cache and
typical query times without using the cache is about 7 seconds and
with using the cache is a few milliseconds. We can also confirm that
you are not hitting any rate limits on our end, as far as we can tell.
To confirm if this is indeed what's happening in your case, you might
want to rerun verified slow queries a second time to see if the next
query speeds up considerably (this could be what you're seeing when
you say you paste the request URL into a browser and results return
instantly)."
-- IMBA Google Analytics API Developer --

Google's Analytics API does have a rate limit per their docs: https://developers.google.com/analytics/devguides/reporting/core/v3/coreErrors
However they should not caused delayed requests, rather the request should be returned with a response of: 403 userRateLimitExceeded
Description of that error:
Indicates that the user rate limit has been exceeded. The maximum rate limit is 10 qps per IP address. The default value set in Google Developers Console is 1 qps per IP address. You can increase this limit in the Google Developers Console to a maximum of 10 qps.
Google's recommended course of action:
Retry using exponential back-off. You need to slow down the rate at which you are sending the requests.

Related

investments/transactions/get endpoint - how long to return data?

I've been testing Plaid's investments transactions endpoint (investments/transactions/get) in development.
I'm encountering issues with highly variable delays for data to be returned (following the product initialization with Link). Plaid states that it takes 1–2 minutes to return investment transaction data, but I've found that in practice, it can be up to several hours before the data is returned.
Anyone else using this endpoint and getting data returned within 1–2 minutes, or is it generally a longer wait?
If it is a longer wait, do you simply wait for the DEFAULT_UPDATE webhook before you retrieve the data?
So far, my experience with their investments/transactions/get has been problematic (missing transactions, product doesn't work as described in their docs, limited sandbox dataset, etc.) so I'm very interested in hearing from anyone with more experience with this endpoint.
Do you find this endpoint generally reliable, and the data provided to be usable, or have you had issues? I've not seen any issues with investments/holdings/get, so I'm hoping that my problems are unusual, and I just need to push through it.
I'm testing in development with my own brokerage accounts, so I know what the underlying transactions are compared to what Plaid is returning to me. My calls are set up correctly, and I can't get a helpful answer from Plaid support.
I took at look at the support issue and it does appear like the problem you're hitting is related to a bug (or two different bugs, in this case).
However, for posterity/anyone else reading this question, I looked it up and the general answer to the question is that the endpoint in the general case is pretty fast -- P95 latency for calling /investments/transactions/get is currently about 1 second (initial calls on an Item will be higher latency as they have more data to fetch and because they are blocked on Plaid's extracting the data for the Item for the first time -- hence the 1-2 minute guidance in the docs).
In addition, Investments updates at some major brokerages are scheduled to happen only overnight after market close, so there might be a delay of 12+ hours between making a trade and seeing that trade be returned by the API.

What is "sf_max_daily_api_calls"?

Does someone know what "sf_max_daily_api_calls" parameter in Heroku mappings does? I do not want to assume it is a daily limit for write operations per object and I cannot find an explanation.
I tried to open a ticket with Heroku, but in their support ticket form "Which application?" drop-down is required, but none of the support categories have anything to choose there from, the only option is "Please choose..."
I tried to find any reference to this field and can't - I can only see it used in Heroku's Quick Start guide, but without an explanation. I have a very busy object I'm working on, read/write, and want to understand any limitations I need to account for.
Salesforce orgs have rolling 24h limit of max daily API calls. Generally the limit is very generous in test orgs (sandboxes), 5M calls because you can make stupid mistakes there. In productions it's lower. Bit counterintuitive but protects their resources, forces you to write optimised code/integrations...
You can see your limit in Setup -> Company information. There's a formula in documentation, roughly speaking you gain more of that limit with every user license you purchased (more for "real" internal users, less for community users), same as with data storage limits.
Also every API call is supposed to return current usage (in special tag for SOAP API, in a header in REST API) so I'm not sure why you'd have to hardcode anything...
If you write your operations right the limit can be very generous. No idea how that Heroku Connect works. Ideally you'd spot some "bulk api 2.0" in the documentation or try to find synchronous vs async in there.
Normal old school synchronous update via SOAP API lets you process 200 records at a time, wasting 1 API call. REST bulk API accepts csv/json/xml of up to 10K records and processes them asynchronously, you poll for "is it done yet" result... So starting job, uploading files, committing job and then only checking say once a minute can easily be 4 API calls and you can process milions of records before hitting the limit.
When all else fails, you exhausted your options, can't optimise it anymore, can't purchase more user licenses... I think they sell "packets" of more API calls limit, contact your account representative. But there are lots of things you can try before that, not the least of them being setting up a warning when you hit say 30% threshold.

JMeter and page views

I'm trying to use data from google analytics for an existing website to load test a new website. In our busiest month over an hour we had 8361 page requests. So should I get a list of all the urls for these page requests and feed these to jMeter, would that be a sensible approach? I'm hoping to compare the page response times against the existing website.
If you need to do this very quickly, say you have less than an hour for scripting, in that case you can do this way to compare that there are no major differences between 2 instances.
If you would like to go deeper:
8361 requests per hour == 2.3 requests per second so it doesn't make any sense to replicate this load pattern as I'm more than sure that your application will survive such an enormous load.
Performance testing is not only about hitting URLs from list and measuring response times, normally the main questions which need to be answered are:
how many concurrent users my application can support providing acceptable response times (at this point you may be also interested in requests/second)
what happens when the load exceeds the threshold, what types of errors start occurring and what is the impact.
does application recover when the load gets back to normal
what is the bottleneck (i.e. lack of RAM, slow DB queries, low network bandwidth on server/router, whatever)
So the options are in:
If you need "quick and dirty" solution you can use the list of URLs from Google Analytics with i.e. CSV Data Set Config or Access Log Sampler or parse your application logs to replay production traffic with JMeter
Better approach would be checking Google Analytics to identify which groups of users you have and their behavioral patterns, i.e. X % of not authenticated users are browsing the site, Y % of authenticated users are searching, Z % of users are doing checkout, etc. After it you need to properly simulate all these groups using separate JMeter Thread Groups and keep in mind cookies, headers, cache, think times, etc. Once you have this form of test gradually and proportionally increase the number of virtual users and monitor the correlation of increasing response time with the number of virtual users until you hit any form of bottleneck.
The "sensible approach" would be to know the profile, the pattern of your load.
For that, it's excellent you're already have these data.
Yes, you can feed it as is, but that would be the quick & dirty approach - while get the data analysed, patterns distilled out of it and applied to your test plan seems smarter.

How to debug performance issue/optimize your meteor app

I just deployed my Meteor app onto a production server on Digital Ocean.
I noticed that for about 7500 documents, it takes about 3-5 seconds to fully fetch the objects (selectively taking only 3 fields) and populate the autocomplete data. I believe it should rather be instantenous for such number of data, so I am curious how I can debug performance issues from here and optimize more. How should I go about debugging performance issues for a Meteor app? I tried seeing the network tab but nothing seems to take more than a second. I am not sure why it takes 3-5 seconds for the search bar with an autocomplete feature to get ready. After a close inspection, populating autocomplete fields is instantaneous, and the time until subscribe function's callback is called is about 3 to 5 seconds.
I've already looked into Kadira, but it reported that everything was complete within milliseconds, so I am confused.
possibly related: Meteor's subscription and sync are slow
After all, is 3-5 seconds for 7800 documents with 2 fields reasonable?
Let me tell you what's really happening here.
Kadira shows the time taken to fetch the data from the server and queue it to the network. So, 500 - 700 ms is reasonable for that.
So, this 3-5 ms latency is the network latency. That means the time taken to send data to the client via the network. It's quite okay for 7500+ documents even with three fields over DDP.
So, my suggestion is to do the search on the server and use something like Search Source for that.
With that, you'll get the only data required to the client. Which reduce the latency and saves the CPU of your app.

errorCode=003001; statusCode=403; source=Throttling Policy

Lately, in our app using IPP data services, we have encountered these errors from time to time.
<RestResponse xmlns="http://www.intuit.com/sb/cdm/v2">
<Error RequestId="49f7926a9aa84cfc8289534801dee72d">
<RequestName>ErrorRequest</RequestName>
<ProcessedTime>2012-12-07T10:10:59+00:00</ProcessedTime>
<ErrorCode>3001</ErrorCode>
<ErrorDesc>message=This client has made too many consecutive requests over too short a period of time. Please wait a short amount of time before attempting to submit again; errorCode=003001; statusCode=403; source=Throttling Policy</ErrorDesc>
</Error>
</RestResponse>
Can't find any reference to a "Throttling Policy" or error code "3001" anywhere in the IPP documentation.
Any help in figuring out what the throttling limits are would be appreciated. Are they based around an IP, rate limitation, concurrency limitation, OAuth consumer, OAuth client, something else perhaps?
EDIT: Link to IDN forums about the same issue: https://idnforums.intuit.com/messageview.aspx?catid=69&threadid=18910.
Yes, there is a throttling process in place if > 500 requests per minute by a single user or against a single realm.
You had over 600 requests during the one minute period.
Looks like almost all (all except 32 requests) were individual customer queries… all different customer record ids. Is there a way you can make a single customer list query, filtered if necessary, to get a bunch of customer records in a single request and reduce the number of calls you are making.

Resources