Trying to get Heroku to run some Puppeteer jobs. Locally it works; it's slow, but it works. Monitoring the memory in OS X Activity Monitor, it doesn't get above 50MB. But when I deploy this script to Heroku, I get a "Memory quota exceeded" error every time, and the memory footprint is much larger.
Looking at the logs, I'm getting the message:
Process running mem=561M(106.5%)
Error R14 (Memory quota exceeded)
Restarting
State changed from up to starting
Either Activity Monitor is not reporting the memory correctly, or something is going wrong only when the script runs on Heroku. I can't imagine why a scrape of 25 pages would use 561M.
Also, even though the Puppeteer scripts are wrapped in try/catch, the memory error is crashing the dyno and restarting it. By the time the dyno restarts, the browser has hung up, so the restarting does little good. Is there a way to catch 'most' errors on Heroku but throw when there is an R14 memory error?
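For context, the job is structured roughly like the sketch below (simplified; scrapePage and the URL list are stand-ins for the real per-page logic):
const puppeteer = require('puppeteer');

// Simplified sketch of the job. scrapePage is a placeholder for the real
// scraping logic; --no-sandbox is commonly needed for Chrome on Heroku.
async function run(urls) {
  const browser = await puppeteer.launch({ args: ['--no-sandbox'] });
  try {
    for (const url of urls) {
      const page = await browser.newPage();
      try {
        await page.goto(url);
        await scrapePage(page);
      } catch (err) {
        // 'most' errors are caught here so one bad page doesn't kill the job
        console.error(`Failed on ${url}:`, err);
      } finally {
        await page.close(); // free the page before moving on
      }
    }
  } finally {
    await browser.close(); // don't leave Chrome running if the job bails out
  }
}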
I had a similar issue. What I discovered is that if you are not closing the browser, you will immediately get an R14 error. What I recommend:
Make sure you use a single browser instance and multiple contexts instead of multiple browsers.
Make sure you close the contexts after you call pdf.
If you are processing large pages you need to scale your Heroku instance; you don't have a choice. Unfortunately, you have to pay $50 for 1 GB of memory on Heroku...
Here is some ugly code, but it shows that the context is closed after calling the pdf function. (A cleaner async/await sketch follows the snippet.)
// Assumes this runs inside new Promise((resolve, reject) => { ... }) and that
// bufferToStream, fs, html, options and path are defined in the enclosing scope.
browser.createIncognitoBrowserContext().then((context) => {
  context.newPage().then((page) => {
    page.setContent(html).then(() => {
      page.pdf(options).then((pdf) => {
        let inputStream = bufferToStream(pdf);
        let outputStream = fs.createWriteStream(path);
        inputStream.pipe(outputStream).on("finish", () => {
          // Closing the context releases the page's memory back to the dyno.
          context.close().then(() => {
            resolve();
          }).catch(reject);
        });
      }).catch(reject);
    }).catch(reject);
  }).catch(reject);
}).catch(reject);
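For what it's worth, the same flow reads more cleanly with async/await. This is only a sketch: it assumes the same bufferToStream helper and the html, options and path values used above.
const fs = require('fs');

// Sketch of the same single-browser / per-context pattern, with the context
// closed in a finally block so its memory is released even when rendering fails.
async function renderPdf(browser, html, options, path) {
  const context = await browser.createIncognitoBrowserContext();
  try {
    const page = await context.newPage();
    await page.setContent(html);
    const pdf = await page.pdf(options);
    await new Promise((resolve, reject) => {
      bufferToStream(pdf)                 // assumed helper from the snippet above
        .pipe(fs.createWriteStream(path))
        .on('finish', resolve)
        .on('error', reject);
    });
  } finally {
    await context.close();
  }
}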
Related
I'm new to Heroku and wondering about its terminology.
I host a project that requires seeding to populate a database with tens of thousands of rows. To do this I employ a web dyno to extract information from APIs across the web.
As my dyno is running, I get memory notifications saying that the dyno has exceeded its memory limits (the specific Heroku errors are R14 and R15).
I am not sure whether this merely means that my seeding process (web dyno) is running too fast and will be throttled, or whether my database itself is too large and must be reduced.
R14 and R15 errors are only thrown by runtime dynos. For reference, Heroku Postgres databases do not run on dynos. If you're hitting R14/R15 errors, it means that the seed data you're pulling down is likely exhausting your memory quota. You'll need to either decrease the size of the data, or batch it: write each batch to Postgres, then clean up before proceeding.
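As a rough sketch of that batching idea (written in Node.js purely for illustration; fetchBatch and insertRows are hypothetical stand-ins for your API client and your Postgres writes):
// Hypothetical sketch: pull one small page of API data at a time, write it to
// Postgres, drop the references, then fetch the next page.
async function seed() {
  const BATCH_SIZE = 500;  // small enough that one batch fits comfortably in memory
  let offset = 0;
  while (true) {
    const rows = await fetchBatch(offset, BATCH_SIZE);
    if (rows.length === 0) break;    // no more data
    await insertRows(rows);          // persist immediately instead of accumulating
    offset += rows.length;
    // Nothing from this batch is kept around, so memory stays roughly flat
    // instead of growing with the total number of rows.
  }
}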
I am using Windows Azure SDK 2.2 and have created an Azure cloud service that uses an in-role cache.
I have 2 instances of the service running under normal conditions.
When the service scales (up to 3 instances, or back down to 2 instances), I get lots of DataCacheExceptions. These are often accompanied by Azure DB connection failures from the process going on behind the cache. (If I don't find the entry I want in the cache, I get it from the DB and put it into the cache. All standard stuff.)
I have implemented retry processes on the cache gets and puts, and I use the ReliableSqlConnection object with a retry process for the DB connection using the Transient Fault Handling Application Block.
The retry process uses a fixed interval retrying every second for 5 tries.
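For illustration, the policy is equivalent to something like the following (shown in JavaScript only to make the policy concrete; the real code uses the Transient Fault Handling block). Setting exponential to true would be the 'less aggressive' alternative asked about below.
// Illustrative only: a fixed-interval retry (1 second, 5 tries), with an
// optional exponential-backoff variant.
async function withRetry(operation, { tries = 5, delayMs = 1000, exponential = false } = {}) {
  for (let attempt = 1; attempt <= tries; attempt++) {
    try {
      return await operation();
    } catch (err) {
      if (attempt === tries) throw err; // out of retries: surface the error
      const wait = exponential ? delayMs * 2 ** (attempt - 1) : delayMs;
      await new Promise((resolve) => setTimeout(resolve, wait));
    }
  }
}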
The failures are typically:
Microsoft.ApplicationServer.Caching.DataCacheException: ErrorCode:SubStatus:There is a temporary failure. Please retry later
Any idea why the scaling might cause these exceptions?
Should I try a less aggressive retry policy?
Any help appreciated.
I have also noticed that I am getting a high cache miss rate (> 70%), and that when the system is struggling there is high cpu utilisation (> 80%).
Well, I haven't been able to find out any reason for the errors I am seeing, but I have 'fixed' the problem, sort of!
Looking at the last few days' processing stats, it is clear the high cpu usage corresponds with the cloud service having 'problems'. I have changed the service to use two medium instances instead of two small instances.
This seems to have solved the problem, and the service has been running quite happily, low cpu usage, low memory usage, no exceptions.
So, whilst I still haven't discovered the source of the problems, I seem to have overcome them by providing a bigger environment for the service to run in.
Late news! I noticed this morning that from about 06:30 the cpu usage started to climb, along with the time the service took to do its processing. Errors started appearing, and I had to restart the service at 10:30 to get things back to 'normal'. Also, when restarting the service, the OnRoleRun process threw loads of DataCacheExceptions before it started running again, 45 minutes later.
Now all seems well again, and I will monitor for the next hours/days...
There seems to be no explanation for this: remote desktop to the instances shows no exceptions in the event log, and other logging is not showing application problems, so I am still stumped.
My app runs on Heroku with unicorn and uses sucker_punch to send a small quantity of emails in the background without slowing the web UI. This has been working pretty well for a few weeks.
I changed the unicorn config to the Heroku recommended config. The recommended config includes an option for the number of unicorn processes, and I upped the number of processes from 2 to 3.
Apparently that was too much. The sucker_punch jobs stopped running. I have log messages that indicate when they are queued and I have messages that indicate when they start processing. The log shows them being queued but the processing never starts.
My theory is that I exceeded memory by going from 2 to 3 unicorns.
I did not find a message anywhere indicating a problem.
Q1: Should I expect to find a failure message somewhere? Something like "attempting to start sucker_punch -- oops, not enough memory"?
Q2: Any suggestions on how I can be notified of a failure like this in the future?
Thanks.
If you are indeed exceeding dyno memory, you should find R14 or R15 errors in your logs. See https://devcenter.heroku.com/articles/error-codes#r14-memory-quota-exceeded
A more likely problem, though, given that you haven't found these errors, is that something within the perform method of your sucker_punch worker is throwing an exception. I've found sucker_punch tasks to be a pain to debug because it appears the lib swallows all exceptions silently. Try instantiating your task and calling perform on it from a Rails console to make sure that it behaves as you expect.
For example, you should be able to do this without causing an exception:
task = YourTask.new
task.perform :something, 55
I have a web app on Heroku which is constantly using around 300% of the allowed RAM (512 MB). I see my logs full of Error R14 (Memory quota exceeded) [an entry every second]. Although in bad condition, my app still works.
Apart from degraded performance, are there any other consequences I should be aware of (like Heroku charging extra for anything related to this issue, scheduled jobs failing, etc.)?
To the best of my knowledge Heroku will not take action even if you continue to exceed the memory requirements. However, I don't think the availability of the full 1 GB of overage (out of the 1.5 GB that you are consuming) is guaranteed, or is guaranteed to be physical memory at all times. Also, if you are running close to 1.5 GB, then you risk going over the hard 1.5 GB limit at which point your dyno will be terminated.
I also get the following every time I run a specific task on my Heroku app and check heroku logs --tail:
Process running mem=626M(121.6%)
Error R14 (Memory quota exceeded)
My solution would be to check out Celery and Heroku's documentation on this.
Celery is an open source asynchronous task queue, or job queue, which makes it very easy to offload work out of the synchronous request lifecycle of a web app onto a pool of task workers to perform jobs asynchronously.
I'm running an eventmachine process on heroku, and it seems to be hitting their memory limit of 512MB after an hour or so. I start seeing messages like this:
Error R14 (Memory quota exceeded)
Process running mem=531M(103.8%)
I'm running a lot of events through the reactor, so I'm thinking maybe the reactor is getting backed up (I'm imagining it as a big queue)? But there could be some other reason, I'm still fairly new to eventmachine.
Are there any good ways to profile eventmachine and get some stats on it? As a simple example, I was hoping to see how many events were scheduled in the queue, to see whether it was getting backed up and keeping them all in memory. But if anyone has other suggestions I'd really appreciate it.
Thanks!
I use eventmachine extensively and have never run into any memory leak inside the reactor, so your best bet is that the Ruby code is the culprit, but without knowing more about your application it is hard to give you a real answer.
The only queue I can think of right now is the thread pool: each time you use the defer method, the block is either given to a free thread or queued waiting for one. I suppose that if all your threads are blocked waiting for something, the queue could grow and use all the available memory.
The leak turned out to be in Mongoid's identity_map (nothing to do with EventMachine). Setting Mongoid.identity_map_enabled = false at the beginning of the event machine process resolved it.