Every ten seconds, I get errors like these in my OS X El Capitan error logs (with line breaks added here for clarity):
3/7/17 09:15:34.104 com.apple.xpc.launchd[1]:
(org.postfix.master[98071]) Service exited with abnormal code: 1
3/7/17 09:15:34.104 com.apple.xpc.launchd[1]:
(org.postfix.master) Service only ran for 1 seconds. Pushing respawn out by 9 seconds.
I don’t think I’m using postfix for anything, and I don’t think I want to be. What’s the best way to handle this? I’m fine with disabling postfix altogether if there’s no reason to have it running, and also fine with leaving it running if it’s needed and/or can be made not to spam the console.
I'm currently upgrading an application to use Octane. At a glance I thought this was working great until I used K6 to do some load testing.
It might be worth mentioning at this point that I'm using Docker.
When running a 60s test with 20 VUs I can see all the requests hitting (via CLI logs), then at about 99% the logs stop and the test finishes with request timeout errors (99% success).
If I try to visit the application locally, the requests sit in a constant pending state (they never terminate).
If I enter the container and try to interact with the application, it simply does nothing (like the requests, constantly hanging). I can't even exit the container at this point. If I try to kill the container I can't; I have to restart Docker.
If I cut the VUs down to 10, the test finishes successfully. If I run the test again, it dies again at about the halfway point.
It's as if once the application hits x amount of requests, it dies.
Also note that I have run these tests many times before installing Octane and they ran fine.
Has anyone else using Octane experienced the above?
Regards
I've got a phantomjs app (http://css.benjaminbenben.com) running on Heroku - it works well for some time, but then I have to run heroku restart because its requests start timing out.
I'm looking for a stop-gap solution (I've gone from around 6 to 4500 daily visitors over the last week), and I was considering exiting the process after it had served a set number of requests to fire a restart.
Will this work? And would this be considered bad practice?
(in case you're interested, the app source is here - https://github.com/benfoxall/wtcss)
It'd work, as long as you don't crash within (I think) 10 minutes of the last crash. If crashes are too frequent, the process will stay down.
It's not bad practice, but it's not great practice. You should figure out what is causing your server to hang, of course.
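If you do go the restart route, the mechanism is just a request counter that forces the process to exit once a threshold is crossed, so Heroku brings the dyno back up. Here's a rough sketch of the pattern as Rack middleware in Ruby (your app is Node, so treat this purely as an illustration of the idea; the class name and limit are made up):
require "thread"

# Illustration only: crash the process on purpose after serving N requests,
# relying on Heroku to restart the dyno (subject to the back-off noted above).
class RestartAfterRequests
  def initialize(app, limit = 5000)
    @app     = app
    @limit   = limit
    @count   = 0
    @exiting = false
    @lock    = Mutex.new
  end

  def call(env)
    response = @app.call(env)
    @lock.synchronize do
      @count += 1
      if @count >= @limit && !@exiting
        @exiting = true
        Thread.new do
          sleep 1                           # let the in-flight response flush
          Process.kill("TERM", Process.pid) # exit; Heroku restarts the dyno
        end
      end
    end
    response
  end
end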
My app runs on Heroku with unicorn and uses sucker_punch to send a small quantity of emails in the background without slowing the web UI. This has been working pretty well for a few weeks.
I changed the unicorn config to the Heroku recommended config, which includes an option for the number of unicorn processes, and I upped the number of processes from 2 to 3.
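For reference, a simplified sketch of that kind of config (not my exact file; the relevant knob is worker_processes, which I changed from 2 to 3):
# config/unicorn.rb (simplified)
worker_processes Integer(ENV["WEB_CONCURRENCY"] || 3)  # was 2
timeout 15
preload_app true

after_fork do |server, worker|
  # each unicorn worker re-opens its own database connection after forking
  defined?(ActiveRecord::Base) and ActiveRecord::Base.establish_connection
end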
Apparently that was too much. The sucker_punch jobs stopped running. I have log messages that indicate when they are queued and I have messages that indicate when they start processing. The log shows them being queued but the processing never starts.
My theory is that I exceeded memory by going from 2 to 3 unicorns.
I did not find a message anywhere indicating a problem.
Q1: Should I expect to find a failure message somewhere? Something like "attempting to start sucker_punch -- oops, not enough memory"?
Q2: Any suggestions on how I can be notified of a failure like this in the future?
Thanks.
If you are indeed exceeding dyno memory, you should find R14 or R15 errors in your logs. See https://devcenter.heroku.com/articles/error-codes#r14-memory-quota-exceeded
A more likely problem, though, given that you haven't found these errors, is that something within the perform method of your sucker punch worker is throwing an exception. I've found sucker punch tasks to be a pain to debug because it appears the lib swallows all exceptions silently. Try instantiating your task and calling perform on it from a rails console to make sure that it behaves as you expect.
For example, you should be able to do this without causing an exception:
task = YourTask.new
task.perform :something, 55
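And if it does turn out to be a swallowed exception, one way to get notified in future (your Q2) is to rescue inside perform, log, and re-raise or hand the exception to whatever error tracker you use. A sketch, with made-up job and mailer names:
class EmailJob
  include SuckerPunch::Job

  def perform(user_id)
    UserMailer.welcome_email(user_id).deliver
  rescue => e
    # without this rescue the exception vanishes inside the worker thread
    Rails.logger.error("EmailJob failed for user #{user_id}: #{e.class}: #{e.message}")
    # Airbrake.notify(e) / Honeybadger.notify(e) if you use an error tracker
    raise
  end
end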
A cron job that had been running successfully for years suddenly started dying after about 80% completion. I'm not sure if it is because the collection with results has been steadily growing and reached some critical size (it does not seem all that big to me) or for some other reason. I am not sure how to debug this. I found the user on whom the job died and tried to run the job for just that user; I got a CURSOR_NOT_FOUND message after 2 hours. Yesterday it died after 3 hours of running for all users. I am still using an old mongoid (2.0.0.beta) because of multiple dependencies and lack of time to change it, but mongo is up to date (I know about the bug in versions before 1.1.2).
I found two similar questions, but neither of them is applicable: in one case they used Moped, which was not production ready, and in the other the problem was in pagination.
I am getting this error message:
MONGODB cursor.refresh() for cursor xxxxxxxxx
rake aborted!
Query response returned CURSOR_NOT_FOUND. Either an invalid cursor was specified, or the cursor may have timed out on the server.
Any suggestions?
A "cursor not found" error from MongoDB is typically an indication that the cursor timed out (after 10 minutes of inactivity) but it could potentially indicate that the client code has become confused and is using a stale or closed cursor or has corrupted the cursor somehow. If the 3 hour runtime included a lot of busy time on the client in between calls to MongoDB, that might give the server time to timeout the cursor.
You can specify a no-timeout option on the cursor to see if it is a server timeout of your cursor that is causing your problem.
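With the old Ruby driver underneath mongoid 2.x, that looks something like the following (model and method names are placeholders; as I recall, the driver insists on the block form when the timeout is disabled so it can close the cursor for you):
# drop down to the raw driver collection and disable the server-side cursor timeout
User.collection.find({}, :timeout => false) do |cursor|
  cursor.each do |doc|
    process(doc)  # whatever the rake task does per document
  end
end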
Nothing is being written to the Apache error log, and I cannot find any scheduled tasks that might be causing the problem. The restart occurs around the same time: three times over the past week at 12:06 am, and also in the 3-4 am time frame.
I am running Apache 2.2.9 on Windows Server 2003.
The same behavior was happening prior to the past week, except that an error was being written to the Apache error log indicating that the MaxRequestsPerChild limit was being reached. I found this article,
http://httpd.apache.org/docs/2.2/platform/windows.html
which suggests setting MaxRequestsPerChild to 0. I did that and the error stopped appearing in the error log, but the restart behavior continued, although not as frequently.