API times out due to limit and offset - ruby

I'm trying to pull items from one of our clients' Podio instances. It has around 35k records, so I'm setting a limit of 500 and paging through with an offset:
loop do
  current_offset_value = offset.next
  puts "LIMIT: #{LIMIT}, OFFSET: #{current_offset_value}"
  Podio::Item.find_by_filter_id(app_id, view_id, limit: LIMIT, remember: true, offset: current_offset_value).all.each do |item|
    yield item
  end
end
However, the code just hangs after the first two calls and then raises a timeout error:
LIMIT: 500, OFFSET: 0
LIMIT: 500, OFFSET: 500
creates a CSV file from a table (FAILED - 1)
Failures:
SourceTableSync::LocalCsvDumper::CitrixPodio creates a CSV file from a table
Failure/Error:
Podio::Item.find_by_filter_id(app_id, view_id, limit: LIMIT, remember: true, offset: current_offset_value).all.each do |item|
yield item
end
Faraday::TimeoutError:
Net::ReadTimeout with #<TCPSocket:(closed)>

Looks like the Podio API is timing out while fetching 500 items; your items are probably large or have relationships to other apps, and it simply takes too long to fetch them all.
I'd try a smaller number (e.g. 100 or 200) to see if that works :)
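For illustration, here is a minimal sketch of the same loop with a smaller page size and an explicit stop condition. It assumes the app_id, view_id and yield block from the question, and that .all returns a plain array, which may not match your setup exactly:

LIMIT = 200 # smaller page size, so each call has less to fetch before the timeout

offset = 0
loop do
  puts "LIMIT: #{LIMIT}, OFFSET: #{offset}"
  items = Podio::Item.find_by_filter_id(app_id, view_id,
                                        limit: LIMIT, remember: true, offset: offset).all
  break if items.empty? # stop once the view has been exhausted
  items.each { |item| yield item }
  offset += LIMIT
end

If 200 still times out, keep halving the page size until the requests complete reliably.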

Related

Why and how is the quota "critical read requests" exceeded when using batchCreateContacts

I'm programming a contacts export from our database to Google Contacts using the Google People API. I'm making the requests over URL via Google Apps Script.
The code below - using https://people.googleapis.com/v1/people:batchCreateContacts - works for 13 to about 15 single requests, but then Google returns this error message:
Quota exceeded for quota metric 'Critical read requests (Contact and Profile Reads)' and limit 'Critical read requests (Contact and Profile Reads) per minute per user' of service 'people.googleapis.com' for consumer 'project_number:***'.
For speed, I send the requests in batches of 10 parallel requests.
I have the following two questions regarding this problem:
Why, for creating contacts, would I hit a quota regarding read requests?
Given the picture link below, why would sending 2 batches of 10 simultaneous requests (more precisely: 13 to 15 single requests) hit that quota limit anyway?
quota limit of 90 read requests per user per minute as displayed on console.cloud.google.com
Thank you for any clarification!
Further reading: https://developers.google.com/people/api/rest/v1/people/batchCreateContacts
let payloads = [];
let lengthPayloads;
let limitPayload = 200;
/* Break up contacts in payload limits */
contacts.forEach(function (contact, index) /* contacts is an array of objects for the API */
{
  if (!(index % limitPayload))
  {
    lengthPayloads = payloads.push(
      {
        'readMask': "userDefined",
        'sources': ["READ_SOURCE_TYPE_CONTACT"],
        'contacts': []
      }
    );
  }
  payloads[lengthPayloads - 1]['contacts'].push(contact);
});
Logger.log("which makes " + payloads.length + " payloads");

let parallelRequests = [];
let lengthParallelRequests;
let limitParallelRequest = 10;
/* Break up payloads in parallel request limits */
payloads.forEach(function (payload, index)
{
  if (!(index % limitParallelRequest))
    lengthParallelRequests = parallelRequests.push([]);
  parallelRequests[lengthParallelRequests - 1].push(
    {
      'url': "https://people.googleapis.com/v1/people:batchCreateContacts",
      'method': "post",
      'contentType': "application/json",
      'payload': JSON.stringify(payload),
      'headers': { 'Authorization': "Bearer " + token }, /* token is a token of a single user */
      'muteHttpExceptions': true
    }
  );
});
Logger.log("which makes " + parallelRequests.length + " parallelrequests");

let responses;
parallelRequests.forEach(function (parallelRequest)
{
  responses = UrlFetchApp.fetchAll(parallelRequest); /* error occurs here */
  responses = responses.map(function (response) { return JSON.parse(response.getContentText()); });
  responses.forEach(function (response)
  {
    if (response.error)
    {
      Logger.log(JSON.stringify(response));
      throw response;
    }
    else Logger.log("ok");
  });
});
Output of logs:
which makes 22 payloads
which makes 3 parallelrequests
ok (15 times)
(the error message)
I had raised the same issue in Google's issue tracker.
It seems that a single BatchCreateContacts or BatchUpdateContacts call consumes six (6) units of the "Critical Read Requests" quota per request. I still haven't gotten an answer as to why, for creating/updating contacts, we are hitting the limit on critical read requests.
Quota exceeded for quota metric 'Critical read requests (Contact and Profile Reads)' and limit 'Critical read requests (Contact and Profile Reads) per minute per user' of service 'people.googleapis.com' for consumer 'project_number:***'.
There are two types of quotas: project-based quotas and user-based quotas. Project-based quotas are limits placed upon your project itself. User-based quotas are more like flood protection: they limit the number of requests a single user can make over a period of time.
When you send a batch request with 10 requests in it, it counts as ten requests, not as a single batch request. If you are trying to run these in parallel, then you are definitely going to overflow the requests-per-minute-per-user quota.
Slow down; this is not a race.
Why, for creating contacts, would I hit a quota regarding read requests?
I would chalk it up to a bad error message.
Given the picture link below, why would sending 13 to 15 requests hit that quota limit anyway? (There are 3 read requests before this code.) Quota limit of 90 read requests per user per minute as displayed on console.cloud.google.com.
Well, you are sending 13 * 10 = 130 requests per minute, which would exceed the per-minute limit. There is also no way of knowing how fast your system is actually running; it depends upon what else the server is doing at the time it gets your requests and which minute they are actually recorded in.
My advice is to just respect the quota limits and not try to understand why; there are too many variables on Google's servers to be able to track down what exactly a minute is. You could send 100 requests in 10 seconds and then try to send another 100 in 55 seconds and you will get the error; you could also get the error after 65 seconds, depending upon when they hit the server and when the server finished processing your initial 100 requests.
Again, slow down.
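For illustration, one way to slow the loop from the question down in Apps Script is simply to pause between the fetchAll batches. This is only a sketch: it assumes the parallelRequests array built above, and the 60-second pause is an arbitrary choice meant to let the per-minute quota window reset, not something the answer specifies.

parallelRequests.forEach(function (parallelRequest, index)
{
  if (index > 0) Utilities.sleep(60 * 1000); /* wait a minute between batches of 10 requests */
  let responses = UrlFetchApp.fetchAll(parallelRequest);
  responses.forEach(function (response)
  {
    Logger.log(response.getContentText()); /* inspect each response for quota errors */
  });
});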

StackExchange.Redis - Redis Timeout On Batch

I'm trying to insert 1M hash keys into Redis using batch insertion.
When I do that, several thousand keys are not inserted, and I get a RedisTimeoutException.
Here is my code:
IDatabase db = RedisDB.Instance;
List<Task> tasks = new List<Task>();
var batch = db.CreateBatch();
foreach (var itemKVP in items)
{
    HashEntry[] hash = RedisConverter.ToHashEntries(itemKVP.Value);
    tasks.Add(batch.HashSetAsync(itemKVP.Key, hash));
}
batch.Execute();
Task.WaitAll(tasks.ToArray());
And then I get this exception:
RedisTimeoutException: Timeout awaiting response (outbound=463KiB, inbound=10KiB, 100219ms elapsed, timeout is 5000ms), command=HMSET, next: HMSET *****:*****:1390194, inst: 0, qu: 0, qs: 110, aw: True, rs: DequeueResult, ws: Writing, in: 0, in-pipe: 1045, out-pipe: 0, serverEndpoint: 10.7.3.36:6379, mgr: 9 of 10 available, clientName: DataCachingService:DEV25S, IOCP: (Busy=0,Free=1000,Min=8,Max=1000), WORKER: (Busy=4,Free=32763,Min=8,Max=32767), Local-CPU: 0%, v: 2.0.601.3402 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts)
I read the article, but I didn't succeed in solving the problem.

How to do pagination and search based on status with the rturk gem for MTurk?

We are working with the rturk gem to fetch HITs from MTurk in Ruby.
rturk gem : https://github.com/ryantate/rturk
I am able to get HITs from MTurk using the code below:
hits = RTurk::Hit.all
puts "#{hits.size} hits. \n"
but it gives me only 100 HITs. How do I get the pagination object and the total count of HITs using the rturk gem?
How can we manage pagination with rturk?
Using the above code gives all results, but I want to search HITs based on specific states.
How can we search for a HIT by its state?
Amazon Mechanical Turk has a max page size of 100 (details here), so that's why the rturk gem is faithfully returning 100 HITs.
The RTurk::SearchHITs operation does not seem to support searching by status. Instead, you can use the RTurk::GetReviewableHITs operation to do something like this:
get_reviewable_hits = RTurk::GetReviewableHITs.new status: 'Reviewable', page_size: 10, page_number: 2
=> #<RTurk::GetReviewableHITs:0x007faacb523668 #status="Reviewable", #page_number=2, #page_size=10>
request = get_reviewable_hits.request
request.total_num_results
=> 91
request.num_results
=> 10
hits = request.hit_ids.map { |id| RTurk::Hit.new(id) }
hits_with_details_and_assignments = request.hit_ids.map { |id| RTurk::Hit.find(id) }
Here we're:
filtering by status Reviewable
setting the page size to 10 to demonstrate pagination, as I only had 91 reviewable HITs handy
requesting page 2 of 9 for illustration
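To walk every page rather than just one, a loop along these lines should work (a sketch that assumes the same RTurk setup and the total_num_results / hit_ids accessors shown above; the page size of 100 matches MTurk's maximum):

page = 1
hit_ids = []
loop do
  request = RTurk::GetReviewableHITs.new(status: 'Reviewable', page_size: 100, page_number: page).request
  hit_ids.concat(request.hit_ids)
  break if page * 100 >= request.total_num_results # stop once the last page has been fetched
  page += 1
end
hits = hit_ids.map { |id| RTurk::Hit.new(id) } # or RTurk::Hit.find(id) for full details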
Hope this helps. Cheers.

Typeahead remote call is triggering even if there is data in local/prefetch

The remote call is triggering even if there are values in the prefetch/local data.
Sample Code:
var jsonObj = ["Toronto", "Montreal", "Calgary", "Ottawa", "Edmonton", "Peterborough"];
$('input.countries-cities').typeahead([
  {
    name: 'Canada',
    local: jsonObj,
    remote: {
      url: 'http://localhost/typeahead/ajaxcall.php?q=QUERY',
      cache: true
    },
    limit: 3,
    minLength: 1,
    header: '<h3>Canada</h3>'
  }
]);
What I expect is for the remote call to be triggered only if there are no matches in local. But each time I type a location, the remote call gets triggered. Any help will be highly appreciated.
I know this question is a couple of months old, but I ran into a similar issue and found the answer.
The problem is that your limit is set to 3 and your search is turning up fewer results than your limit, thus triggering the remote call. If you had set your limit to 1, you wouldn't get a remote call unless there were no results.
Not a great design IMO, since you probably still want to see 3 results if there are 3 results. And worse, say your local/prefetch results only return 1 result... if your remote returns the same result, it will be duplicated in your list. I haven't found a solution to that problem yet.
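For illustration, that behaviour would mean initializing the dataset like this, a sketch based on the question's config with only the limit changed:

$('input.countries-cities').typeahead([
  {
    name: 'Canada',
    local: jsonObj,
    remote: {
      url: 'http://localhost/typeahead/ajaxcall.php?q=QUERY',
      cache: true
    },
    limit: 1, /* remote is only queried when local matches fall below this limit */
    minLength: 1,
    header: '<h3>Canada</h3>'
  }
]);

The trade-off, as noted above, is that you only ever see one suggestion even when more local matches exist.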
In bloodhound.js, replace
matches.length < this.limit ? cacheHit = ...
with
matches.length < 1 ? cacheHit = ...

Ruby: begin, sleep, retry: where to put incrementer

I have a method 'rate_limited_follow' that takes my Twitter user account and follows all the users in an array 'users'. Twitter has strict rate limits, so the method deals with that contingency by sleeping for 15 minutes and then retrying. (I didn't write this method; I got it from the Twitter Ruby gem API.) You'll notice that it checks the number of attempts against MAX_ATTEMPTS.
My users array has about 400 users that I'm trying to follow. It adds 15 users at a time (when the rate limit seems to kick in), then sleeps for 15 minutes. Since I set the MAX_ATTEMPTS constant to 3 (just to test it), I expected it to stop trying once it had added 45 users (3 times 15), but it has gone past that, continuing to add 15 users roughly every fifteen minutes. So it seems as if num_attempts is somehow staying below 3, even though it has gone through this cycle more than 3 times. Is there something I don't understand about the code? Once 'sleep' finishes and it hits 'retry', where does it start again? Is there some reason num_attempts isn't incrementing?
Calling the method in the loop
>> users.each do |i|
?> rate_limited_follow(myuseraccount, i)
>> end
Method definition with constant
MAX_ATTEMPTS = 3

def rate_limited_follow(account, user)
  num_attempts = 0
  begin
    num_attempts += 1
    account.twitter.follow(user)
  rescue Twitter::Error::TooManyRequests => error
    if num_attempts <= MAX_ATTEMPTS
      sleep(15 * 60) # minutes * 60 seconds
      retry
    else
      raise
    end
  end
end
Each call to rate_limited_follow resets your number of attempts - or, to rephrase, you are keeping track of attempts per user rather than attempts over your entire array of users.
Hoist the initialization of num_attempts out of rate_limited_follow, so that it isn't reset on each call, and you'll have the behavior that you're looking for.
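For illustration, a minimal sketch of that change (as an assumption on my part, the counter here is bumped only when the rate limit is actually hit, so it counts sleeps across the whole run rather than individual follow calls):

MAX_ATTEMPTS = 3
@num_attempts = 0 # initialized once, outside the method, so calls don't reset it

def rate_limited_follow(account, user)
  account.twitter.follow(user)
rescue Twitter::Error::TooManyRequests
  @num_attempts += 1 # one more rate-limit hit across the whole run
  if @num_attempts <= MAX_ATTEMPTS
    sleep(15 * 60) # minutes * 60 seconds
    retry
  else
    raise
  end
end

users.each { |user| rate_limited_follow(myuseraccount, user) }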
