Nothing is being written to the Apache error log, and I cannot find any scheduled tasks that might be causing the problem. The restarts occur at consistent times: three times over the past week at 12:06 am, and also in the 3-4 am time frame.
I am running Apache 2.2.9 on Windows Server 2003.
The same behavior was happening before the past week, when an error was being written to the Apache error log indicating that the MaxRequestsPerChild limit had been reached. I found this article,
http://httpd.apache.org/docs/2.2/platform/windows.html
which suggests setting MaxRequestsPerChild to 0. I did that and the error is no longer reported in the error log, but the restarts continued, although less frequently.
Related
My queue jobs all run fairly seamlessly on our production server, but about every 2-3 months I start getting a lot of "timeout exceeded"/"too many attempts" exceptions.
Our app uses event sourcing and many events are queued, so needless to say we have a lot of jobs passing through the system (generally 100-200k per day).
I have not found the root cause of the issues yet, but a simple re-deploy through Laravel Envoyer fixes the issue. This is most likely due to the cache:clear command being run.
Currently, the cache is handled by Redis and is on the same server as the app. I was considering moving the cache to its own server/instance but this still does not help me with the root cause.
Does anyone have any ideas what might be going on here and how I can diagnose/fix it? I am guessing the cache is just getting overloaded/running out of space/leaking etc. over time but not really sure where to go from here.
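One way to test the guess that Redis is filling up over time is to look at the memory fields of the Redis INFO output at intervals between deploys. Below is a minimal Python sketch, assuming you already have the INFO output as a dict (most Redis clients can return one). The field names used_memory, maxmemory and evicted_keys are standard INFO fields; the function name and the 90% threshold are my own choices for illustration, not anything from the setup above.

```python
def memory_findings(info, usage_threshold=0.9):
    """Return human-readable warnings derived from a Redis INFO dict."""
    findings = []
    used = info.get("used_memory", 0)
    limit = info.get("maxmemory", 0)  # 0 means no memory limit is configured
    if limit and used / limit >= usage_threshold:
        percent = round(100 * used / limit)
        findings.append("used_memory is at %d%% of maxmemory" % percent)
    if info.get("evicted_keys", 0) > 0:
        # Evictions mean Redis has already been dropping keys to free space
        findings.append("keys are being evicted")
    return findings
```

Logging the result every few hours and comparing it just before and just after a re-deploy would show whether memory use really grows until the failures start.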
Check:
The version of your Redis, and update the predis package
The version of your Laravel
Your server
I hope I gave you some solutions
I have a cloudera 5.x cluster running in Azure. Everything was running fine, and then a few days ago I started getting "NODE_MANAGER_UNEXPECTED_EXITS" health notifications via email every hour.
This seems to happen at the 43rd minute of every hour.
Most of the forums I've come across have suggested OutOfMemory errors, though I'm not seeing any of these in the log files. For good measure I've tried increasing the Java heap space allocation for NodeManager, but this has not solved the problem.
I've stopped all jobs on the cluster - it is essentially sitting idle, but every hour I'm getting these alerts.
Example of the health alert that comes in the email:
NODE_MANAGER_UNEXPECTED_EXITS Role health test bad Critical The health test result for NODE_MANAGER_UNEXPECTED_EXITS has become bad: This role encountered 1 unexpected exit(s) in the previous 5 minute(s). Critical threshold: any.
Any help is greatly appreciated
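To confirm that the exits really follow a fixed schedule (which would point at something cron-like rather than load), the alert times can be tallied by minute of the hour. Here is a small Python sketch; the timestamp format and the sample values are hypothetical and only there to show the idea.

```python
from collections import Counter
from datetime import datetime


def minute_histogram(timestamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Count how many events fall on each minute of the hour (0-59)."""
    return Counter(datetime.strptime(ts, fmt).minute for ts in timestamps)


# Hypothetical alert times, made up for illustration only
alerts = ["2017-03-01 10:43:12", "2017-03-01 11:43:09", "2017-03-01 12:43:15"]
hist = minute_histogram(alerts)
```

If nearly all events land on a single minute, it is worth looking for an hourly job (a cron entry, a monitoring agent, log rotation) that runs at that minute on the affected hosts.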
I have CollabNet Subversion Edge installed on my Windows Server 2008 R2 Standard (x64). I am using only CollabNet Subversion, with Apache configured manually by me.
svn version is 1.8.13 and apache version is 2.4.12.
Authentication: using AD
CPU:4
RAM:16 GB
Problem statement: the server keeps going down because the CPU reaches 100%. When I checked which process was causing this, I could see that httpd.exe was consuming all the CPU; as soon as I kill it, CPU usage drops to zero.
So far I have not managed to identify the exact root cause, but in the error log I found one line which says [mpm_winnt:error] [pid 3448:tid 3040] AH00326: Server ran out of threads to serve requests. Consider raising the ThreadsPerChild setting. After going through the Apache documentation I learned that there is an MPM (multi-processing module) that controls the number of threads per child, so I made the change below in my httpd.conf:
AcceptFilter http none
AcceptFilter https none
<IfModule mpm_winnt_module>
ThreadsPerChild 200
MaxConnectionsPerChild 10000
</IfModule>
I also made one more change after reading some web pages that said LDAP caching can also cause the CPU to reach 100%, so I set the cache size to zero using the line below:
LDAPSharedCacheSize 0
After these two changes my server ran fine for one month.
However, it seems to have a side effect. My users complained that the first fetch from the repository each day takes a long time. I then removed LDAPSharedCacheSize 0 from my httpd.conf, but the very next day the CPU reached 100% again.
Can anybody tell me whether my configuration is wrong, or what I need to modify in my httpd.conf?
Every ten seconds, I get errors like these in my OS X El Capitan error logs (with line breaks added here for clarity):
3/7/17 09:15:34.104 com.apple.xpc.launchd[1]:
(org.postfix.master[98071]) Service exited with abnormal code: 1
3/7/17 09:15:34.104 com.apple.xpc.launchd[1]:
(org.postfix.master) Service only ran for 1 seconds. Pushing respawn out by 9 seconds.
I don’t think I’m using postfix for anything, nor even that I want to. What’s the best way to handle this? I’m fine with disabling postfix altogether if there’s no reason to have it running, and also fine with having it running if it’s needed and/or can be made to not spam the console.
A cron job that had been running successfully for years suddenly started dying at about 80% completion. I am not sure whether it is because the results collection has been growing steadily and reached some critical size (it does not seem all that big to me) or for some other reason. I am not sure how to debug this. I found the user on whom the job died and tried running the job for just that user, and got a CURSOR_NOT_FOUND message after 2 hours. Yesterday it died after 3 hours of running for all users. I am still using an old Mongoid (2.0.0.beta) because of multiple dependencies and lack of time to change it, but mongo is up to date (I know about the bug in versions before 1.1.2).
I found two similar questions but neither of them is applicable. In this case, they used Moped, which was not production ready. And here the problem was in pagination.
I am getting this error message
MONGODB cursor.refresh() for cursor xxxxxxxxx
rake aborted!
Query response returned CURSOR_NOT_FOUND. Either an invalid cursor was specified, or the cursor may have timed out on the server.
Any suggestions?
A "cursor not found" error from MongoDB typically indicates that the cursor timed out (after 10 minutes of inactivity), but it could also indicate that the client code has become confused and is using a stale or closed cursor, or has corrupted the cursor somehow. If the 3-hour runtime included a lot of busy time on the client between calls to MongoDB, that might give the server time to time out the cursor.
You can specify a no-timeout option on the cursor to see whether a server-side timeout of your cursor is causing your problem.
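As an illustration of the no-timeout idea, here is a minimal sketch in Python rather than Ruby/Mongoid, since I am not certain of the exact option name in Mongoid 2.0.0.beta. It assumes a PyMongo-style collection object, whose find() accepts no_cursor_timeout=True; process_all and handle are hypothetical names of my own.

```python
def process_all(collection, handle, batch_size=100):
    """Run handle() on every document with the cursor idle timeout disabled."""
    # no_cursor_timeout=True asks the server not to reap the cursor while the
    # client is busy; such a cursor has to be closed explicitly afterwards.
    cursor = collection.find({}, no_cursor_timeout=True).batch_size(batch_size)
    processed = 0
    try:
        for doc in cursor:
            handle(doc)
            processed += 1
    finally:
        # Always release the server-side cursor, even if handle() raises
        cursor.close()
    return processed
```

If the job completes with the timeout disabled, that points to the server reaping an idle cursor. A smaller batch size can also help, because the client then goes back to the server more often.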