What is "sf_max_daily_api_calls"? - heroku

Does anyone know what the "sf_max_daily_api_calls" parameter in Heroku mappings does? I do not want to assume it is a daily limit for write operations per object, and I cannot find an explanation.
I tried to open a ticket with Heroku, but their support ticket form has a required "Which application?" drop-down, and none of the support categories offer anything to choose from there; the only option is "Please choose..."
I tried to find any reference to this field and can't - I can only see it used in Heroku's Quick Start guide, but without an explanation. I have a very busy object I'm working on, read/write, and want to understand any limitations I need to account for.

Salesforce orgs have a rolling 24-hour limit on API calls. The limit is generally very generous in test orgs (sandboxes), 5M calls, because you can make stupid mistakes there. In production it's lower. A bit counterintuitive, but it protects their resources and forces you to write optimised code and integrations.
You can see your limit in Setup -> Company Information. There's a formula in the documentation; roughly speaking, you gain more of that limit with every user license you purchase (more for "real" internal users, less for community users), the same as with data storage limits.
Also, every API call is supposed to return the current usage (in a special tag for the SOAP API, in a header for the REST API), so I'm not sure why you'd have to hardcode anything...
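As a rough sketch of reading that usage from a REST response: the header is `Sforce-Limit-Info`, with a value shaped like `api-usage=18/5000`. The function name and sample values below are made up for illustration, and how you get hold of the response headers depends on your HTTP client.

```python
import re

# Matches "api-usage=<used>/<limit>" but not "per-app-api-usage=...",
# which can appear in the same header value.
_API_USAGE = re.compile(r"(?<![\w-])api-usage=(\d+)/(\d+)")


def parse_api_usage(header_value):
    """Return (used, limit) parsed from a Sforce-Limit-Info header value."""
    match = _API_USAGE.search(header_value)
    if match is None:
        raise ValueError(f"no api-usage field in {header_value!r}")
    return int(match.group(1)), int(match.group(2))


used, limit = parse_api_usage("api-usage=18/5000")
print(f"{used} of {limit} calls used ({used / limit:.1%})")
```

Logging this value on every response gives you a running picture of consumption without hardcoding the limit anywhere.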
If you write your operations right, the limit can be very generous. I have no idea how Heroku Connect works. Ideally you'd spot some "Bulk API 2.0" in the documentation, or find out whether it works synchronously or asynchronously.
A normal old-school synchronous update via the SOAP API lets you process 200 records at a time, costing 1 API call. The REST Bulk API accepts CSV/JSON/XML of up to 10K records and processes them asynchronously; you poll for an "is it done yet" result. So starting a job, uploading files, committing the job and then checking only, say, once a minute can easily be 4 API calls, and you can process millions of records before hitting the limit.
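To put numbers on that comparison, here is a back-of-the-envelope calculation using only the figures above (200 records per synchronous call, up to 10K records and roughly 4 calls per bulk job). The function names are mine, and real call counts will vary with how often you poll.

```python
import math

SYNC_RECORDS_PER_CALL = 200    # synchronous SOAP update, per the figures above
BULK_RECORDS_PER_JOB = 10_000  # one bulk upload, per the figures above
BULK_CALLS_PER_JOB = 4         # start, upload, commit, one status poll


def sync_calls(records):
    """API calls needed when updating synchronously, 200 records at a time."""
    return math.ceil(records / SYNC_RECORDS_PER_CALL)


def bulk_calls(records, calls_per_job=BULK_CALLS_PER_JOB):
    """API calls needed when each bulk job carries up to 10K records."""
    return math.ceil(records / BULK_RECORDS_PER_JOB) * calls_per_job


records = 1_000_000
print(sync_calls(records))  # 5000
print(bulk_calls(records))  # 400
```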
When all else fails, you've exhausted your options, can't optimise any further and can't purchase more user licenses... I think they sell "packets" of additional API calls; contact your account representative. But there are lots of things you can try before that, not the least of them being setting up a warning when you hit, say, a 30% threshold.

Related

investments/transactions/get endpoint - how long to return data?

I've been testing Plaid's investments transactions endpoint (investments/transactions/get) in development.
I'm encountering issues with highly variable delays for data to be returned (following the product initialization with Link). Plaid states that it takes 1–2 minutes to return investment transaction data, but I've found that in practice, it can be up to several hours before the data is returned.
Anyone else using this endpoint and getting data returned within 1–2 minutes, or is it generally a longer wait?
If it is a longer wait, do you simply wait for the DEFAULT_UPDATE webhook before you retrieve the data?
So far, my experience with their investments/transactions/get has been problematic (missing transactions, product doesn't work as described in their docs, limited sandbox dataset, etc.) so I'm very interested in hearing from anyone with more experience with this endpoint.
Do you find this endpoint generally reliable, and the data provided to be usable, or have you had issues? I've not seen any issues with investments/holdings/get, so I'm hoping that my problems are unusual, and I just need to push through it.
I'm testing in development with my own brokerage accounts, so I know what the underlying transactions are compared to what Plaid is returning to me. My calls are set up correctly, and I can't get a helpful answer from Plaid support.
I took a look at the support issue, and it does appear that the problem you're hitting is related to a bug (or two different bugs, in this case).
However, for posterity/anyone else reading this question, I looked it up and the general answer to the question is that the endpoint in the general case is pretty fast -- P95 latency for calling /investments/transactions/get is currently about 1 second (initial calls on an Item will be higher latency as they have more data to fetch and because they are blocked on Plaid's extracting the data for the Item for the first time -- hence the 1-2 minute guidance in the docs).
In addition, Investments updates at some major brokerages are scheduled to happen only overnight after market close, so there might be a delay of 12+ hours between making a trade and seeing that trade be returned by the API.
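If you do wait for the webhook before fetching, the check can be as small as the sketch below. It assumes the webhook body is JSON with `webhook_type` and `webhook_code` fields and that investment transaction updates arrive with type `INVESTMENTS_TRANSACTIONS`; please verify both against Plaid's webhook documentation, since this is from my reading of it rather than from the thread.

```python
import json


def should_fetch_investment_transactions(payload):
    """Return True when a webhook payload signals new investment transactions.

    The field names and the INVESTMENTS_TRANSACTIONS type are assumptions;
    DEFAULT_UPDATE is the webhook code mentioned in the question.
    """
    return (
        payload.get("webhook_type") == "INVESTMENTS_TRANSACTIONS"
        and payload.get("webhook_code") == "DEFAULT_UPDATE"
    )


# Hypothetical request body, as a web framework might hand it to you.
body = '{"webhook_type": "INVESTMENTS_TRANSACTIONS", "webhook_code": "DEFAULT_UPDATE", "item_id": "example-item"}'
if should_fetch_investment_transactions(json.loads(body)):
    print("call /investments/transactions/get for this Item")
```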

Fetch third party data in a periodic interval

I've an application with 10M users. The application has access to the user's Google Health data. I want to periodically read/refresh users' data using Google APIs.
The challenge that I'm facing is the memory-intensive task. Since Google does not provide any callback for new data, I'll be doing background sync (every 30 mins). All users would be picked and added to a queue, which would then be picked sequentially (depending upon the number of worker nodes).
Now for 10M users being refreshed every 30 mins, I need a lot of worker nodes.
Each user request takes around 1 sec including network calls.
In 30 mins, one node can process 30 × 60 = 1,800 users
To process 10M users, I need 10M/1,800 ≈ 5.6K nodes
Quite expensive. Both monetary and operationally.
Then I thought of using Lambdas. However, Lambda requires a NAT with an internet gateway to access the public internet. Relatively, it's very cheap.
Want to understand if there's any other possible solution wrt the scale?
Without knowing more about your architecture and the google APIs it is difficult to make a recommendation.
Firstly, I would see if Google offers a bulk export functionality, then batch up the user requests. So instead of making 1 request per user you could make, say, 1 request for 100k users. This would reduce the overhead associated with connecting and with processing/parsing the message metadata.
Secondly, I'd look to see if I could reduce the processing time. For example, an interpreted language like Python is in a lot of cases much slower than a compiled language like C# or Go. Or maybe a library or algorithm can be replaced with something more optimal.
Without more details of your specific setup, it's hard to offer more specific advice.
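To see how much batching could change the sizing, here is the estimate from the question as a small function. The `users_per_request` and `seconds_per_request` parameters are hypothetical knobs: whether Google offers a batched call at all, and how long it would take, are assumptions to check.

```python
import math


def nodes_needed(users, window_seconds, seconds_per_request=1.0, users_per_request=1):
    """Worker nodes needed if each node issues requests one after another."""
    requests = math.ceil(users / users_per_request)
    requests_per_node = window_seconds / seconds_per_request
    return math.ceil(requests / requests_per_node)


# The question's numbers: 10M users, 30-minute window, 1 s per user.
print(nodes_needed(10_000_000, 30 * 60))  # 5556

# Hypothetical: 100 users per request, each batched request taking 5 s.
print(nodes_needed(10_000_000, 30 * 60, seconds_per_request=5.0, users_per_request=100))  # 278
```

Even a modest batch size cuts the node count by an order of magnitude in this model, which is why it's worth checking for a bulk option first.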

Parse.com how to investigate excessive amount of requests

I'm developing a basic messaging system on Parse.com at the moment, and I have noticed in the Events Analytics screen that I'm hitting 30,000+ requests per day. This is a shock considering I'm the only person using the system at the moment. Obviously, with a few users I would blow my API request limit straight away.
I'm pretty experienced with Parse.com these days, so I'm lean with queries and I'm alert to not putting finds, saves, retrieves, etc in for loops. I also understand that saveAll() on an array of ParseObjects doesn't always limit the request count to 1 (depending on relationships inside that object).
So how does one track down where the excessive calls are coming from?
I see the above Analytics > Performance > Served Requests data, but how do I drill down to see if cloud code or iOS is the culprit?
Current solution is to effectively unit test each block of Parse code and look at the results in above screen.
For the benefit of others who may happen upon this thread with the same questions, I found some techniques to hunt down where excessive requests are coming from.
1) Parse's documentation on the APIs themselves is really good, but there isn't a lot of information or guides for the admin interfaces. Under Analytics -> Explorer -> Make a table there is a capability to download all the requests for a specific day (to import into a spreadsheet). The data isn't very detailed, though, and the dates are epoch timestamps, so it is hard to follow. At least you can see [Request Type, Class, Installation ID], e.g. ["find", "MyParseClass", "Cloud Code"].
2) My other technique was to add custom Analytic events to the code. So in Cloud Code for example, I added the following line to each beforeSave and afterSave event:
Parse.Analytics.track('MyClass_beforeSave', null);
3) Obviously, Parse logs these calls in the Logs window, but given that you can only see the most recent transactions and can't clear them, I found it mostly unhelpful in tracking down the excessive calls.
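For technique 1, once the export is loaded into rows, a few lines of Python can do the grouping that the spreadsheet makes awkward. This is only a sketch: the row layout (timestamp first, then request type, class, installation ID) and the assumption that timestamps are in seconds rather than milliseconds are mine, so adjust them to the real file.

```python
from collections import Counter
from datetime import datetime, timezone


def to_utc(epoch_seconds):
    """Convert an epoch timestamp (assumed to be in seconds) to ISO 8601 UTC."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).isoformat()


def top_sources(rows):
    """Count requests per (request type, class, installation ID), busiest first."""
    counts = Counter((rtype, cls, source) for _, rtype, cls, source in rows)
    return counts.most_common()


# Made-up sample rows in the assumed layout.
rows = [
    (1400000000, "find", "MyParseClass", "Cloud Code"),
    (1400000060, "find", "MyParseClass", "Cloud Code"),
    (1400000120, "save", "MyParseClass", "iOS"),
]
for key, count in top_sources(rows):
    print(count, key)
print(to_utc(rows[0][0]))
```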

Does the Google Analytics API throttle requests?

Does the Google Analytics API throttle requests?
We have a batch script that I have just moved from v2 to v3 of the API and the requests go through quite well for the first bit (50 queries or so) and then they start taking 4s or so each. Is this Google throttling us?
While Matthew is correct, I have another possibility for you. The Google Analytics API caches your requests to some extent. Let me try to explain.
I have a customer/site that I request data from. While testing I noticed some strange things:
The first million rows of results would come back within an acceptable amount of time.
After a million rows things started to slow down: results were taking up to 5 times as long, so instead of 5 minutes we were waiting 20 minutes or more for them to return.
Example:
Request URL :
https://www.googleapis.com/analytics/v3/data/ga?ids=ga:34896748&dimensions=ga:date,ga:sourceMedium,ga:country,ga:networkDomain,ga:pagePath,ga:exitPagePath,ga:landingPagePath&metrics=ga:entrances,ga:pageviews,ga:exits,ga:bounces,ga:timeOnPage,ga:uniquePageviews&filters=ga:userType%3D%3DReturning+Visitor;ga:deviceCategory%3D%3Ddesktop&start-date=2014-05-12&end-date=2014-05-22&start-index=236001&max-results=2000&oauth_token={OauthToken}
Request Time (seconds:milliseconds): 0:484
Request URL :
https://www.googleapis.com/analytics/v3/data/ga?ids=ga:34896748&dimensions=ga:date,ga:sourceMedium,ga:country,ga:networkDomain,ga:pagePath,ga:exitPagePath,ga:landingPagePath&metrics=ga:entrances,ga:pageviews,ga:exits,ga:bounces,ga:timeOnPage,ga:uniquePageviews&filters=ga:userType%3D%3DReturning+Visitor;ga:deviceCategory%3D%3Ddesktop&start-date=2014-05-12&end-date=2014-05-22&start-index=238001&max-results=2000&oauth_token={OauthToken}
Request Time (seconds:milliseconds): 7:968
I did a lot of testing, stopping and starting my application. I couldn't figure out why the data was so fast in the beginning and then slow later.
Now, I have some contacts on the Google Analytics development team, the guys in charge of the API. So I made a nice test app, logged some results showing my issue and sent it off to them with the question: are you throttling me?
They were also perplexed, and told me there is no throttle on the API. There is a flood protection limit, which Matthew speaks of. My developer contact forwarded it to the guys in charge of the traffic.
Fast forward a few weeks. It seems that when we make a request for a bunch of data, Google caches the data for us. It's saved on the server in case we request it again. By restarting my application I was accessing the cached data and it would return fast. When I let the application run longer I would suddenly reach non-cached data and the requests would take longer to return.
I asked how long data is cached for; the answer was that there is no set time. So I don't think you are being throttled. I think your initial speedy requests are hitting cached data and your slower requests are hitting non-cached data.
Email back from google:
Hi Linda,
I talked to the engineers and they had a look. The response was
basically that they thinks it's because of caching. The response is
below. If you could do some additional queries to confirm the behavior
it might be helpful. However, what they need to determine is if it's
because you are querying and hitting cached results (because you've
already asked for that data). Anyway, take a look at the comments
below and let me know if you have additional questions or results that
you can share.
Summary from talking to engineer: "Items not already in our cache will
exhibit a slower retrieval processing time than items already present
in the cache. The first query loads the response into our cache and
typical query times without using the cache is about 7 seconds and
with using the cache is a few milliseconds. We can also confirm that
you are not hitting any rate limits on our end, as far as we can tell.
To confirm if this is indeed what's happening in your case, you might
want to rerun verified slow queries a second time to see if the next
query speeds up considerably (this could be what you're seeing when
you say you paste the request URL into a browser and results return
instantly)."
-- IMBA Google Analytics API Developer --
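One way to run the check the engineers suggest (rerun a verified slow query and see whether it speeds up) is to time the same request twice. A minimal sketch: `fetch` is any zero-argument callable that performs your query, and the `speedup` factor of 5 is an arbitrary threshold I picked.

```python
import time


def time_call(fetch):
    """Return how many seconds a single call to fetch() takes."""
    start = time.perf_counter()
    fetch()
    return time.perf_counter() - start


def compare_repeat(fetch, speedup=5.0):
    """Time the same request twice.

    Returns (first_seconds, second_seconds, looks_cached), where looks_cached
    is True when the repeat was at least `speedup` times faster.
    """
    first = time_call(fetch)
    second = time_call(fetch)
    return first, second, second * speedup <= first
```

If the repeat comes back in milliseconds while the first call took seconds, caching is the likelier explanation than throttling.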
Google's Analytics API does have a rate limit per their docs: https://developers.google.com/analytics/devguides/reporting/core/v3/coreErrors
However, hitting it should not cause delayed requests; rather, the request should be rejected with a response of: 403 userRateLimitExceeded
Description of that error:
Indicates that the user rate limit has been exceeded. The maximum rate limit is 10 qps per IP address. The default value set in Google Developers Console is 1 qps per IP address. You can increase this limit in the Google Developers Console to a maximum of 10 qps.
Google's recommended course of action:
Retry using exponential back-off. You need to slow down the rate at which you are sending the requests.
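For reference, exponential back-off usually looks something like the sketch below. `RateLimitError` is a placeholder exception of my own; in real code you would raise or catch whatever your client library produces for a 403 userRateLimitExceeded response, and the attempt count and base delay are arbitrary.

```python
import random
import time


class RateLimitError(Exception):
    """Placeholder for the error your client raises on userRateLimitExceeded."""


def call_with_backoff(request, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call request(), retrying on RateLimitError with exponential back-off.

    Waits base_delay * 2**attempt seconds plus up to one second of random
    jitter between attempts, and re-raises after the final attempt fails.
    """
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt + random.random())
```

The `sleep` parameter is only there so the waiting can be swapped out in tests.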

Pricing: Are push notifications really free?

According to the parse.com pricing page, push notifications are free up to 1 million unique recipients.
API calls are free up to 30 requests / second.
I want to make sure there is no catch here.
An example will clarify: I have 100K subscribed users. I will send weekly push notifications to them. In a month, that will be 4 push "blasts" with 100K recipients each. Is this covered by the free tier? Would this count as 4 API calls, 400K API calls, or some other amount?
100k users is 1/10 the advertised unique recipient limit, so that should be okay.
Remember that there's a 10sec timeout, too. So the only way to blast 100k pushes within the free-tier resource limits is to create a scheduled job that spends about 2 hours (that's a safe rate of 15 req/sec) doing pushes and writing state so you can pick up later where you left off.
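The "about 2 hours" figure is easy to check. This assumes, as the estimate above does, one request per recipient; the function name is just for illustration.

```python
def blast_duration_hours(recipients, requests_per_second):
    """Hours needed to send one request per recipient at a steady rate."""
    return recipients / requests_per_second / 3600


print(round(blast_duration_hours(100_000, 15), 2))  # 1.85
print(round(blast_duration_hours(100_000, 30), 2))  # 0.93
```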
Assuming there's no hidden gotcha (you'll probably need to discover those empirically), I think the only gotcha in plain sight is the fact that the free tier allows only one (1) scheduled job. Any other long-running processing, and there is bound to be some with 100k users, is going to have to share the job, making the what-should-this-single-job-work-on-now logic pretty complex.
You should take a look at the FAQ for Parse.com:
https://www.parse.com/plans/faq
What is considered an API request?
Anytime you make a network call to
Parse on behalf of your app using one of the Parse SDKs or REST API,
it counts as an API request. This does include things like queries,
saves, logins, amongst other kinds of requests. It also includes
requests to send push notifications, although this is seen as a single
request regardless of how many recipients are targeted. Serving Parse
files counts as an API request, including static assets served from
Parse Hosting. Analytics requests do have a special exemption. You can
send us your analytics events any time without being limited by your
app's request limit.

Resources