Trying to understand launchd daemon state

I am trying to set up a launchd daemon for the Zabbix agent on macOS 10.13 High Sierra.
First I install the Zabbix agent with:
brew install zabbix --without-server-proxy
Then I create a property list named com.zabbix.zabbix_agentd.plist with this content:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>AbandonProcessGroup</key>
<true/>
<key>GroupName</key>
<string>zabbix</string>
<key>KeepAlive</key>
<dict>
<key>SuccessfulExit</key>
<false/>
</dict>
<key>Label</key>
<string>com.zabbix.zabbix_agentd</string>
<key>ProgramArguments</key>
<array>
<string>/usr/local/sbin/zabbix_agentd</string>
<string>-c</string>
<string>/opt/zabbix/zabbix_agentd.conf</string>
</array>
<key>RunAtLoad</key>
<true/>
<key>StandardErrorPath</key>
<string>/var/log/zabbix/zabbix_agentd.error.log</string>
<key>StandardOutPath</key>
<string>/var/log/zabbix/zabbix_agentd.stdout.log</string>
<key>UserName</key>
<string>zabbix</string>
</dict>
</plist>
I load it with:
sudo launchctl load ./com.zabbix.zabbix_agentd.plist
Now I can see that the daemon has done what I expected with:
ps ax | grep zabbix_agentd | grep -v grep
I see 6 zabbix processes: 1 collector, 3 listeners, 1 active check, and the process that the launch daemon started:
8931 ?? S 0:00.01 /usr/local/sbin/zabbix_agentd -c /opt/zabbix_agentd.conf
But when I run this command:
launchctl print system/com.zabbix.zabbix_agentd | grep state
I get this output:
state = waiting
I expected to see state = running... Why does that command tell me that the daemon is waiting when it has 6 running processes?
Is this "works as designed" or did I do something wrong?

This is sort of "works as designed", but I'd say it's really the result of a philosophical conflict between zabbix and launchd about how daemons should work.
When you run zabbix_agentd, it "daemonizes" itself: it starts the actual daemon as a background subprocess, and then the parent process exits. From that point on, the daemon process (and any subprocesses it starts) runs independently of whatever started it. This is the traditional way unix daemons operate.
launchd, on the other hand, expects the daemons it manages to stay in the foreground and run directly under it. This gives launchd much more ability to monitor and control its daemons than it would have if they detached themselves from it.
This is a common conflict between traditional unix daemons and launchd, and there are two ways to solve it: either get the daemon to run in the foreground (i.e. conform to the launchd way of doing things), or tell launchd not to worry that the daemon seems to have quit. zabbix_agentd doesn't seem to have anything like a --nodaemon option (according to these docs), so you have to adapt launchd (update: newer versions do, see below). The standard way of doing this, which is pretty much what you have in your .plist, is to add AbandonProcessGroup and KeepAlive keys to tell launchd not to panic when (as far as it can tell) the daemon exits. This works, but it means that launchd cannot tell what's actually going on with the daemon, which is why it reports state = waiting even though the agent processes are running.
UPDATE: I was looking at an old version of zabbix_agentd. Stefan spotted that a -f (or --foreground) option was added to zabbix_agentd in version 3.0. With this, I'd recommend adding --foreground to the ProgramArguments array, replacing the KeepAlive dictionary with a simple <true/> (this tells launchd to auto-restart the daemon if it exits for any reason), and removing <key>AbandonProcessGroup</key><true/> (when set, this option stops launchd from killing leftover processes in the job's process group after the main daemon process exits or crashes). The result should look something like this:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>GroupName</key>
<string>zabbix</string>
<key>Label</key>
<string>com.zabbix.zabbix_agentd</string>
<key>ProgramArguments</key>
<array>
<string>/usr/local/sbin/zabbix_agentd</string>
<string>-c</string>
<string>/opt/zabbix/zabbix_agentd.conf</string>
<string>--foreground</string>
</array>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
<key>StandardErrorPath</key>
<string>/var/log/zabbix/zabbix_agentd.error.log</string>
<key>StandardOutPath</key>
<string>/var/log/zabbix/zabbix_agentd.stdout.log</string>
<key>UserName</key>
<string>zabbix</string>
</dict>
</plist>
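To check that launchd now tracks the agent directly, one could reload the job and read the state line again. The following is only a sketch: job_state is a hypothetical helper name, and it assumes that launchctl print still emits a line of the form "state = ...", as in the output shown in the question.

```shell
#!/bin/sh
# job_state (hypothetical helper): read `launchctl print` output on stdin
# and print the value of the first "state = ..." line.
job_state() {
    sed -n 's/^[[:space:]]*state = //p' | head -n 1
}

# Intended use on macOS after editing the plist (commented out so that the
# sketch can be sourced on any system):
#   sudo launchctl unload ./com.zabbix.zabbix_agentd.plist
#   sudo launchctl load ./com.zabbix.zabbix_agentd.plist
#   launchctl print system/com.zabbix.zabbix_agentd | job_state
```

If launchd is supervising the foreground process, the last command should print running rather than waiting.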

Related

Launch Agent Not Starting

I’m trying to make a launch agent that starts a Python script. It should run when there’s a network connection but it doesn’t do that. launchctl list says it’s loaded:
launchctl list | grep test.Flopsey.DiscordMusicBot
- 0 test.Flopsey.DiscordMusicBot
When I start it with launchctl start test.Flopsey.DiscordMusicBot it works fine. The .plist file (which is stored under ~/Library/LaunchAgents) looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>test.Flopsey.DiscordMusicBot</string>
<key>ProgramArguments</key>
<array>
<string>/Library/Frameworks/Python.framework/Versions/3.5/bin/python3.5</string>
<string>/path/to/MusicBot/run.py</string>
</array>
<key>EnvironmentVariables</key>
<dict>
<key>PATH</key>
<string>/bin:/usr/bin:/usr/local/bin</string>
</dict>
<key>StandardOutPath</key>
<string>/path/to/MusicBot/log.log</string>
<key>StandardErrorPath</key>
<string>/path/to/MusicBot/log.log</string>
<key>WorkingDirectory</key>
<string>/path/to/MusicBot</string>
<key>KeepAlive</key>
<dict>
<key>NetworkState</key>
<true/>
</dict>
</dict>
</plist>
I’m new to launchd and I followed this tutorial. I think the solution to my problem is very basic but I have no idea what it could be. I’ve also made another similar agent and it works fine (unlike the music bot it’s only one file).
Update
Thanks to @LCC’s comment I realised that NetworkState no longer works on OS X 10.10 and later. Since the script exits when it can’t connect to the Internet, I can just set KeepAlive to <true/> and set a ThrottleInterval so that launchd restarts the script after a cooldown if it couldn’t connect.
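The change described in the update amounts to two keys in the agent's plist. As a rough sketch, the helper below prints that fragment. keepalive_fragment is a hypothetical name, and the 60-second cooldown in the usage line is an arbitrary example value, not something from the original post.

```shell
#!/bin/sh
# keepalive_fragment (hypothetical helper): print the plist keys that replace
# the NetworkState dictionary. $1 is the ThrottleInterval in seconds.
keepalive_fragment() {
    cat <<EOF
<key>KeepAlive</key>
<true/>
<key>ThrottleInterval</key>
<integer>$1</integer>
EOF
}

# Example: print a fragment with an assumed 60-second cooldown.
keepalive_fragment 60
```

The printed keys would take the place of the KeepAlive dictionary in the plist above; the agent then has to be unloaded and loaded again for the change to take effect.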

launchd does not 'keep alive' processes in El Capitan

I'm trying to make sure a process is always running, even after it quits, crashes or stops in any way. It's a small binary that reads a serial line and writes to a database - nothing too complex. If it fails to read, it quits with exit code 70, and it captures any SIGKILL or SIGTERM events and shuts down its database connections gracefully before actually quitting.
However, the process does NOT launch at load (even though RunAtLoad is set), nor does it restart if it is killed. Here is the plist:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Disabled</key>
<false/>
<key>KeepAlive</key>
<true/>
<key>Label</key>
<string>blah.bloop.tag05</string>
<key>ProgramArguments</key>
<array>
<string>/Users/blah/Desktop/rfid</string>
<string>-f/dev/tty.usbserial-FT32X30YBXB</string>
<string>-n5</string>
<string>-ctcp://127.0.01</string>
<string>-v</string>
<string>-x100000</string>
</array>
<key>RunAtLoad</key>
<true/>
<key>StandardErrorPath</key>
<string>/var/log/rfid.log</string>
<key>StandardOutPath</key>
<string>/var/log/rfid.log</string>
<key>WorkingDirectory</key>
<string>/Users/blah/Desktop</string>
</dict>
</plist>
This plist lives in ~/Library/LaunchAgents (and the user in question can stop and start this process easily enough).
Any thoughts at all? I know there are other processes that are being restarted but I can't for the life of me figure out this one. I thought permissions might be it but these all seem fine :/
To find out whether my daemon program or launchd was at fault, I ran a quick test using tail. It seems that launchd is the problem. Here is the simpler alternative plist:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>KeepAlive</key>
<true/>
<key>Label</key>
<string>test.test</string>
<key>ProgramArguments</key>
<array>
<string>/usr/bin/tail</string>
<string>-f</string>
<string>/var/log/system.log</string>
</array>
<key>RunAtLoad</key>
<true/>
<key>StandardErrorPath</key>
<string>/tmp/test.test.err</string>
<key>StandardOutPath</key>
<string>/tmp/test.test.out</string>
</dict>
</plist>
I launch this process using the following commands:
launchctl load test.plist
launchctl start test.test
I then kill the process from another terminal by sending either SIGKILL or SIGTERM to it with the kill command. launchd fails to restart the process.
I suspect there must be something new in El Capitan that I've missed?

Further experimentation has revealed an answer. It appears that only daemons, such as 'Global Daemons', respect the restart (KeepAlive) and RunAtLoad directives. I notice that there are a couple of plists in /Library/LaunchDaemons, such as those for TeamViewer and other third-party programs.
Interestingly, one can send SIGKILL to processes belonging to Apple, such as ImageCaptureAgent, and the KeepAlive directive is NOT respected. This is an Apple-defined agent whose plist explicitly states that the process should be restarted if it is killed, but it is not.
I wonder if Apple changed this functionality in the newer release?

OSX launchd plist for node forever process

I am trying to write a launchd.plist file for my node server. I am using forever to run my node server. I would like the server to start on boot. I would also like to wait for the mongodb launchd plist to run first.
I installed mongodb using homebrew and it came with a launchd.plist already. I have executed the following:
$ launchctl load ~/Library/LaunchAgents/homebrew.mxcl.mongodb.plist
The plist for mongodb is:
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>homebrew.mxcl.mongodb</string>
<key>ProgramArguments</key>
<array>
<string>/usr/local/opt/mongodb/mongod</string>
<string>run</string>
<string>--config</string>
<string>/usr/local/etc/mongod.conf</string>
</array>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<false/>
<key>WorkingDirectory</key>
<string>/usr/local</string>
<key>StandardErrorPath</key>
<string>/usr/local/var/log/mongodb/output.log</string>
<key>StandardOutPath</key>
<string>/usr/local/var/log/mongodb/output.log</string>
<key>HardResourceLimits</key>
<dict>
<key>NumberOfFiles</key>
<integer>1024</integer>
</dict>
<key>SoftResourceLimits</key>
<dict>
<key>NumberOfFiles</key>
<integer>1024</integer>
</dict>
</dict>
</plist>
If I shut down the computer and restart, mongodb fires up as it should.
However, my node server is not starting. Any ideas? Here is its plist:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>KeepAlive</key>
<dict>
<key>SuccessfulExit</key>
<false/>
</dict>
<key>Label</key>
<string>com.test.app</string>
<key>ProgramArguments</key>
<array>
<string>/usr/local/bin/forever</string>
<string>-a</string>
<string>-l</string>
<string>/var/log/app/app.log</string>
<string>-e</string>
<string>/var/log/app/app_error.log</string>
<string>/data/server/app.js</string>
</array>
<key>RunAtLoad</key>
<true/>
<key>StartInterval</key>
<integer>3600</integer>
</dict>
</plist>
EDIT:
I added logging, and the log file shows this:
env: node: No such file or directory
I think this means that the node binary cannot be found. I can echo $PATH and /usr/local/bin is in my path. I can start node from the terminal. Ideas?

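Adding EnvironmentVariables worked for me. Note that launchd does not expand shell variables such as $PATH inside a plist, so list the directories explicitly:
<key>EnvironmentVariables</key>
<dict>
<key>PATH</key>
<string>/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin</string>
</dict>
You may also need to set WorkingDirectory to your node app's directory (as an absolute path):
<key>WorkingDirectory</key>
<string>/path/to/your/node/app</string>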
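The "env: node: No such file or directory" error suggests that forever is started through env, which then cannot find node on the PATH the job receives. A small sketch for checking this from a terminal follows. found_on_path is a hypothetical helper, and the default PATH value used here (/usr/bin:/bin:/usr/sbin:/sbin) is an assumption about what launchd gives a job that sets no PATH of its own.

```shell
#!/bin/sh
# found_on_path (hypothetical helper): succeed if command $1 can be resolved
# when PATH is set to $2. The subshell keeps the caller's PATH untouched.
found_on_path() {
    ( PATH="$2"; command -v "$1" >/dev/null 2>&1 )
}

# Assumed default PATH of a launchd job; adjust if your system differs.
LAUNCHD_DEFAULT_PATH="/usr/bin:/bin:/usr/sbin:/sbin"

# Compare the assumed default with a PATH that also contains /usr/local/bin.
for candidate in "$LAUNCHD_DEFAULT_PATH" "/usr/local/bin:$LAUNCHD_DEFAULT_PATH"; do
    if found_on_path node "$candidate"; then
        echo "node found with PATH=$candidate"
    else
        echo "node NOT found with PATH=$candidate"
    fi
done
```

If node is only found with the second value, that would explain why the job works from an interactive shell but not under launchd.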

I had this problem too, but I solved it using an Automator app that runs at startup.
Open Automator and choose New Application.
Insert "Run Shell Script" into your workflow.
Use this code in the shell script, changing the paths to your own:
export PATH=/usr/local/bin/:$PATH
cd /path/to/your/nodejs/app
forever start app.js
Go to System Preferences >> Users & Groups and click on the Login Items tab.
Add your Automator app and be happy.
The important part of the solution is the first line of the script (adding your bin directory to the PATH). It would probably also work to add a Startup Item pointing at a bash script (with no Automator app); feel free to try!

Add node before forever:
<key>ProgramArguments</key>
<array>
<string>/usr/local/bin/node</string>
<string>/usr/local/bin/forever</string>
<string>/path/to/app.js</string>
</array>

This is not an answer to the original question, but I was looking for a simple way to start a node server (and keep it running) after reboot. I found pm2 to be much easier to set up than the forever-based solutions above.
# install pm2
npm install pm2 -g
# start server
pm2 start app.js
# start pm2 after reboot (might need sudo)
pm2 startup
http://pm2.keymetrics.io/docs/usage/startup/

I'm not sure when node-launchd came out, but it seems to be the more reliable solution.
The solution in which an Automator workflow is added to the Login Items is also appreciated. However, if the app runs on a server and the server is restarted, I wonder whether the app will be started before a user logs in to the system. I haven't tried it, though.

cannot get real hostname when launching shell script at boot with launchd

I'm trying to run a shell script at boot with launchd, via a plist file in /Library/LaunchDaemons (on 10.8.x, if that matters):
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>testD</string>
<key>ProgramArguments</key>
<array>
<string>/Users/lfriedman/cuda-stuff/sw/gpgpu/build/scripts/testing/testD.sh</string>
</array>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
<key>UserName</key>
<string>lfriedman</string>
<key>AbandonProcessGroup</key>
<false/>
<key>StandardOutPath</key>
<string>/tmp/testD.out</string>
<key>StandardErrorPath</key>
<string>/tmp/testD.err</string>
</dict>
</plist>
Inside the shell script is a call to 'hostname -s'. If I run the script manually, everything works fine. If I run the script via cron, everything works fine. However, when it runs at boot via launchd the value returned from 'hostname -s' is always (erroneously) returned as 'localhost', rather than the actual hostname of the system. If I tweak the plist to run the script at a time other than at bootup, it also does the right thing and returns the actual short hostname of the system. This leads me to think that there's some sort of race condition going on where launchd is firing off its jobs before the network subsystem of the OS is fully running.
Is there some special way to ensure that the OS is "fully" booted before launchd runs a job? Or a way to force a delay inside the plist file before the program is invoked?

Unfortunately, launchd doesn't have a way of setting dependencies, so you'll need to have the delay in your script. In a shell script, an easy way to check for networking is:
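#!/bin/bash
# Example Daemon Starter
# rc.common provides CheckForNetwork, which sets NETWORKUP to -YES- or -NO-.
. /etc/rc.common
CheckForNetwork
while [ "${NETWORKUP}" != "-YES-" ]
do
    sleep 5
    # Clear the cached result so that CheckForNetwork tests the network again.
    NETWORKUP=
    CheckForNetwork
done
# Now do what you need to do.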
You can see more info at the following link:
http://blog.slaunchaman.com/2010/07/01/how-to-run-a-launchdaemon-that-requires-networking/
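If the script only needs a real hostname rather than full networking, the delay loop could poll hostname -s directly. This is a sketch under that assumption: wait_for_hostname and its retry parameters are hypothetical and not part of launchd or rc.common.

```shell
#!/bin/sh
# wait_for_hostname (hypothetical helper): poll `hostname -s` until it
# returns something other than "localhost", then print it.
#   $1 = maximum number of attempts
#   $2 = seconds to sleep between attempts
# Returns 0 on success, 1 if every attempt still saw "localhost".
wait_for_hostname() {
    attempts=$1
    while [ "$attempts" -gt 0 ]; do
        name=$(hostname -s 2>/dev/null)
        if [ -n "$name" ] && [ "$name" != "localhost" ]; then
            printf '%s\n' "$name"
            return 0
        fi
        attempts=$((attempts - 1))
        sleep "$2"
    done
    return 1
}

# Intended use at the top of the daemon script (example values):
#   host=$(wait_for_hostname 24 5) || host=localhost
```

Bounding the number of attempts means the daemon still starts, with a fallback value, if the hostname never becomes available.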

Launchd plist service is not shutting down

I have an application that I want to run as a LaunchDaemon on OS X. My plist looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>KeepAlive</key>
<false/>
<key>Label</key>
<string>my.service</string>
<key>ProgramArguments</key>
<array>
<string>/servertest/MyService</string>
<string>-jvm</string>
<string>server</string>
<string>-nodetach</string>
<string>-outfile</string>
<string>out.txt</string>
<string>-errfile</string>
<string>err.txt</string>
<string>-verbose</string>
<string>-debug</string>
<string>-home</string>
<string>/System/Library/Frameworks/JavaVM.framework/Home</string>
<string>-cp</string>
<string>./lib/hsqldb.jar:./lib/myservice-wrapper.jar:./lib/commons-daemon-1.0.8.jar</string>
<string>my.service.DaemonMac</string>
</array>
<key>RunAtLoad</key>
<true/>
<key>StandardErrorPath</key>
<string>/servertest/stderr.log</string>
<key>StandardOutPath</key>
<string>/servertest/stdout.log</string>
<key>WorkingDirectory</key>
<string>/servertest</string>
</dict>
</plist>
The service starts on load or when I call 'launchctl start my.service'. But it does not stop when I use 'launchctl stop my.service', or even when I unload the plist. It remains in memory and runs as if nothing happened. I have no idea what the reason could be. Please help!

Does your application daemonize itself (i.e. drop into the background)? If so, it's essentially detaching itself from launchd and hence preventing launchd from managing it.
EDIT: if the application isn't daemonizing itself, the next thing to look at is to see what launchd thinks is going on with launchctl list my.service. This should dump a list of properties of the daemon, including its PID (if launchd thinks it's running) and LastExitStatus (whether it exited successfully last time it ran). Does the PID correspond to the actual running process? Does the result change when you try to stop the service (esp. does the PID change to 0)? Does a process listing (e.g. with ps -axj) show any child processes living on after the parent exits?
Oh, and just to make sure: when managing LaunchDaemons, you need to use sudo launchctl (without the sudo, launchctl will try to manage LaunchAgents within your user session).
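The PID comparison suggested above could be scripted roughly as follows. Treat it as a sketch: launchd_pid and is_alive are hypothetical helpers, and the parser assumes that launchctl list my.service prints a line of the form "PID" = 1234; when launchd thinks the job is running.

```shell
#!/bin/sh
# launchd_pid (hypothetical helper): read `launchctl list <label>` output on
# stdin and print the PID launchd has recorded, or nothing if there is none.
# Assumes the output contains a line of the form:  "PID" = 1234;
launchd_pid() {
    sed -n 's/^[[:space:]]*"PID" = \([0-9][0-9]*\);.*$/\1/p' | head -n 1
}

# is_alive (hypothetical helper): succeed if the given PID can be signalled.
# kill -0 sends no signal; run it as root when checking a root-owned daemon.
is_alive() {
    kill -0 "$1" 2>/dev/null
}

# Intended use on macOS for the daemon in the question:
#   pid=$(sudo launchctl list my.service | launchd_pid)
#   if [ -n "$pid" ] && is_alive "$pid"; then echo "launchd tracks PID $pid"; fi
```

If launchd reports no PID while the service is still visible in ps, the process has most likely detached itself from launchd.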

The file that worked for me is below. I've learned that my daemon must not detach itself from the controlling process, and that it has to be "OnDemand" so I can stop it when I want. I did some shaman dances while working on it, and it's not clear to me at which point it began to work, but now it does.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>KeepAlive</key>
<false/>
<key>Label</key>
<string>myserver</string>
<key>OnDemand</key>
<true/>
<key>ProgramArguments</key>
<array>
<string>/opt/MYServer/MYServer</string>
<string>-server</string>
<string>-outfile</string>
<string>/opt/MYServer/out.txt</string>
<string>-errfile</string>
<string>/opt/MYServer/err.txt</string>
<string>-verbose</string>
<string>-debug</string>
<string>-nodetach</string>
<string>-home</string>
<string>/System/Library/Frameworks/JavaVM.framework/Home</string>
<string>-cp</string>
<string>/opt/MYServer/lib/hsqldb.jar:/opt/MYServer/lib/my-wrapper.jar:/opt/MYServer/lib/commons-daemon-1.0.8.jar</string>
<string>my.service.DaemonMac</string>
</array>
<key>RunAtLoad</key>
<true/>
<key>StandardErrorPath</key>
<string>/opt/MYServer/stderr.log</string>
<key>StandardOutPath</key>
<string>/opt/MYServer/stdout.log</string>
<key>WorkingDirectory</key>
<string>/opt/MYServer</string>
</dict>
</plist>
