Best practice to avoid Heroku H12 timeout error for a Django REST API that uses OpenAI Completions models

I use Django, deployed on Heroku, to make requests to OpenAI completion models as the backend of my application. If the prompt is large and the request takes more than 30 seconds, I get an H12 timeout error response.
What best practices can you advise for overcoming this problem?

Related

Heroku Webhook Fails Sometimes

I'm creating a chatbot with Dialogflow and I have a webhook hosted on Heroku (so that I can use Python scripts). The webhook works fine most of the time. However, when I haven't used it in a while, it will always fail on the first use with a request timeout. Has anyone else come across this issue? Is there a way to wake up the web server before running the script I have written?
Heroku's free dynos will sleep after 30 minutes of inactivity.
Preventing them from sleeping is easy: use any of their paid plans.
See https://www.heroku.com/pricing
Once you use a Hobby dyno, your app will no longer sleep and you shouldn't get request timeouts.
Alternatively, you can benchmark what's taking so long to boot your app. With a faster boot time, the first request would still be slow but wouldn't hit the timeout.
Heroku times out requests after 30 seconds.

Google API Key giving Query over limit error

We have a web application which was working fine until yesterday. But since yesterday afternoon, all the keys in one of our projects in the Google API Console started giving an OVER_QUERY_LIMIT error.
We cross-checked and the quotas for that project and API are still not exhausted. Can anybody help me understand what may have caused this?
Even after a day's use, the API keys are still giving the same error.
Just to give more information, we are using the Geocoding API and the Distance Matrix API in our application.
If you exceed the usage limits you will get an OVER_QUERY_LIMIT status code as a response. This means that the web service will stop providing normal responses and switch to returning only status code OVER_QUERY_LIMIT until more usage is allowed again. This can happen:
Within a few seconds, if the error was received because your application sent too many requests per second.
Within the next 24 hours, if the error was received because your application sent too many requests per day. The daily quotas are reset at midnight, Pacific Time.
This screencast provides a step-by-step explanation of proper request throttling and error handling, which is applicable to all web services.
Upon receiving a response with status code OVER_QUERY_LIMIT, your application should determine which usage limit has been exceeded. This can be done by pausing for 2 seconds and resending the same request. If the status code is still OVER_QUERY_LIMIT, your application is sending too many requests per day. Otherwise, your application is sending too many requests per second.
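A minimal Python sketch of that check against the Geocoding API, assuming the requests library (the function name and error handling are illustrative, not part of Google's client):

```python
import time
import requests

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"

def geocode(address, api_key):
    """Geocode an address and tell per-second bursts apart from the daily quota."""
    params = {"address": address, "key": api_key}
    data = requests.get(GEOCODE_URL, params=params).json()

    if data["status"] != "OVER_QUERY_LIMIT":
        return data

    # Pause for 2 seconds and resend the same request.
    time.sleep(2)
    data = requests.get(GEOCODE_URL, params=params).json()

    if data["status"] == "OVER_QUERY_LIMIT":
        # Still over the limit: the daily quota is exhausted, so retrying
        # won't help until the midnight Pacific Time reset.
        raise RuntimeError("Daily quota exceeded")

    # The first error was only a per-second burst; this response is usable.
    return data
```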
Note: It is also possible to get the OVER_QUERY_LIMIT error:
From the Google Maps Elevation API when more than 512 points per request are provided.
From the Google Maps Distance Matrix API when more than 625 elements per request are provided.
Applications should ensure these limits are not reached before sending requests.
Documentation usage limits

How to deal with excessive requests on heroku

I am experiencing a spike in traffic once every 60-90 minutes that slows my Heroku app to a crawl for the duration of the spike. NewRelic reports response times of 20-50 seconds per request, with 99% of that down to the Heroku router queuing requests. The request count goes from an average of around 50-100 rpm up to 400-500 rpm.
Looking at the logs, it looks to me like a scraping bot or spider trying to access a lot of content pages on the site. However it's not all coming from a single IP.
What can I do about it?
My sysadmin / devops skills are pretty minimal.
Guy
Have your host-based firewall throttle those requests. Depending on your setup, you can also add Nginx into the mix, which can throttle requests too.
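If the app happens to be Django, an application-level fallback is a small throttling middleware on top of the configured cache backend. This is only a minimal sketch, assuming a shared cache such as Redis or Memcached; the class name and limits are illustrative, and the firewall/Nginx approach above is still preferable for heavy bot traffic:

```python
# settings.py: add "path.to.ThrottleMiddleware" to MIDDLEWARE (path is illustrative).
from django.core.cache import cache
from django.http import HttpResponse

MAX_REQUESTS = 60      # allowed requests per client IP per window; tune to your traffic
WINDOW_SECONDS = 60

class ThrottleMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        key = "throttle:%s" % self._client_ip(request)
        cache.add(key, 0, timeout=WINDOW_SECONDS)   # create the counter if missing
        if cache.incr(key) > MAX_REQUESTS:
            # Ask aggressive clients (and bots) to back off.
            return HttpResponse("Too many requests", status=429)
        return self.get_response(request)

    @staticmethod
    def _client_ip(request):
        # Behind the Heroku router the client address arrives in X-Forwarded-For;
        # REMOTE_ADDR only holds the router's address.
        forwarded = request.META.get("HTTP_X_FORWARDED_FOR", "")
        if forwarded:
            return forwarded.split(",")[-1].strip()
        return request.META.get("REMOTE_ADDR", "unknown")
```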

OpenSSL::SSL::SSLErrorWaitReadable read would block

Currently seeing this error from our background workers on Heroku when posting text/plain queries to the Intuit v3 API. https://quickbooks.api.intuit.com/v3/company/123456789/query/
Ruby 2.1.2 and Rails 4.0.10
OAuth 0.4.5 https://github.com/oauth-xx/oauth-ruby
There is an error_message: "Intuit's API timed out. They may be over capacity. Please try again later". I'm not sure whether this is just an intermittent error that we should ignore, or an issue on the Intuit side, or a problem with Ruby/OpenSSL on Heroku.
This error isn't happening a lot, but we have one customer who reports being unable to sync for a week.
Heroku's timeout is 30 seconds, and Intuit's API may be taking longer than that. I suggest you try the same request from a local machine, or from AWS or another cloud provider if you are familiar with one. Another thing you could do is run the API call from the command line in a Heroku console.

HTTP streaming on Heroku (upload lots of data)

I have an app hosted on Heroku that saves a lot of data to the database (it takes about 70 seconds).
Heroku shows the H12 timeout error page for any request that takes longer than 30 seconds. How can I display an informational message while the upload is in progress instead of showing the H12 error?
I have been looking for an example of this, but without much success... I just found some notes saying that I have to send some control string from the server at regular intervals (e.g. every 15 seconds), but I couldn't find a specific example of how to do that...
Any advice on how to do that?
Thanks in advance.
It is a poor practice to have your users wait for 70 seconds for a request to complete on any platform. Heroku just enforces this best practice by implementing the 30-second timeout. So the real question is how to better architect the application.
Heroku has an article on implementing background workers which are designed to solve this very problem: https://devcenter.heroku.com/articles/queueing
The basic approach is to have the web request schedule a background job (using Delayed Job, Queue Classic, Resque, etc.) and immediately respond to the user with some indicator of progress. Then a dyno running a background worker does the heavy lifting of saving the info to the db. When it's done, it flips a flag in the db or another storage mechanism, which tells the web client that the job is now complete.
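For the Django stack in the question at the top of this page, the same pattern might look roughly like this. It is only a sketch, assuming Celery as the queue; the Upload model, its status field, and the save_all_the_data helper are hypothetical:

```python
# tasks.py -- runs on a worker dyno, outside the 30-second request window.
from celery import shared_task

from .models import Upload   # hypothetical model holding the data and a status flag

@shared_task
def process_upload(upload_id):
    upload = Upload.objects.get(pk=upload_id)
    upload.save_all_the_data()                 # the slow ~70-second part
    upload.status = "done"                     # flip the flag the client polls for
    upload.save(update_fields=["status"])

# views.py -- runs on the web dyno; only schedules the job and reports progress.
from django.http import JsonResponse

from .models import Upload
from .tasks import process_upload

def start_upload(request):
    upload = Upload.objects.create(status="pending")
    process_upload.delay(upload.pk)            # returns immediately, well under 30s
    return JsonResponse({"id": upload.pk, "status": upload.status})

def upload_status(request, upload_id):
    upload = Upload.objects.get(pk=upload_id)
    return JsonResponse({"id": upload.pk, "status": upload.status})
```

The client then polls the status endpoint (or listens on a websocket) and shows the in-progress message until the flag flips to "done".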
Running a background worker does require another dyno. If you're looking to avoid that expense, you can look into Girl Friday, which many report having success with.
Hope that helps.

Resources