I have an embedded device which manages its various services using systemd. Our status reporting application is one of these services. It is always on and it automatically restarts on failure (crashes, exceptions, OOM conditions, whatever).
We report an event to our cloud services on device restart (technically application restart) but I'd like to distinguish first start (after reboot) from restart. Is there a mechanism built into systemd which can provide the service restart count, or do I need to roll my own method?
Do you have the journal? If so, you can get the count like this:
journalctl -b -u myservicename.service |grep -c Started
The -b option limits the logs to the current boot; -u limits them to the given service.
grep -c then counts the "Started" lines instead of printing them. The count includes the first start after boot, so the number of restarts is the count minus one.
You can use the following command:
systemctl show foo.service -p NRestarts
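On systemd versions that support this property, it prints NRestarts=<n>, where n is the number of automatic restarts of the service (0 if it has not been restarted). On older versions the property does not exist and the command prints nothing.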
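As a rough sketch of how that value could be used to tell a first start from a restart, the snippet below parses the output of systemctl show. The function names and the unit name foo.service are placeholders, and it assumes a systemd version that exposes NRestarts; when the property is missing it reports "unknown" rather than guessing.

```shell
#!/bin/sh
# Sketch only: function names and "foo.service" are placeholders.

# Extract the number from a "NRestarts=<n>" line as printed by
# `systemctl show <unit> -p NRestarts`. Prints nothing and returns 1
# when the line is missing or malformed (e.g. property not supported).
parse_nrestarts() {
    case "$1" in
        NRestarts=|NRestarts=*[!0-9]*) return 1 ;;
        NRestarts=*) printf '%s\n' "${1#NRestarts=}" ;;
        *) return 1 ;;
    esac
}

# Classify the current start as "first-start", "restart" or "unknown".
start_kind() {
    n=$(parse_nrestarts "$1") || { echo unknown; return 0; }
    if [ "$n" -eq 0 ]; then
        echo first-start
    else
        echo restart
    fi
}

# Usage on the device (not executed here):
#   start_kind "$(systemctl show foo.service -p NRestarts)"
```

The application could call this once at startup and attach the result to the event it reports to the cloud.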
I have a program which is run by systemd with a service file like this:
[Unit]
Description=...
[Service]
Type=notify
ExecStart=/usr/sbin/myprogram
WatchdogSec=1
KillMode=process
KillSignal=SIGTERM
Restart=always
It sends the respective signal to the watchdog regularly. From time to time, the program seems to hang and is terminated by the watchdog, then restarts. Before the watchdog terminates it, I'd like to capture some information from the program by executing a command or running some other script (e.g. run gdb -p <PID> --batch -ex 'thread apply all backtrace'). How would I do this?
Add an ExecStop= to your service:
[Service]
ExecStart=....
ExecStop=/path/to/SomeOtherProgram
....
According to the systemd manual, if ExecStop= is set, systemd runs it first; any processes of the service that are still alive afterwards are then terminated according to KillMode=.
ExecStop=
Commands to execute to stop the service started via
ExecStart=. This argument takes multiple command lines, following the
same scheme as described for ExecStart= above. Use of this setting is
optional. After the commands configured in this option are run, it is
implied that the service is stopped, and any processes remaining for
it are terminated according to the KillMode= setting (see
systemd.kill(5)). If this option is not specified, the process is
terminated by sending the signal specified in KillSignal= when service
stop is requested. Specifier and environment variable substitution is
supported (including $MAINPID, see above).
EDIT
As pointed out in the comments, this solution may not work when the service is terminated by the watchdog (WatchdogSec=) rather than by a normal stop request.
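For the helper itself, here is a minimal sketch built around the gdb command from the question. The function name and the script path in the comment are hypothetical, and the GDB variable is only there so that the debugger can be swapped out. Whether systemd actually invokes ExecStop= on a watchdog timeout is exactly the open point above, so treat this as something to try, not a confirmed fix.

```shell
#!/bin/sh
# Sketch only: dump the backtraces of all threads of a running process.
# GDB can be overridden (e.g. for testing); it defaults to "gdb".

capture_backtrace() {
    pid=$1
    # Refuse anything that is not a plain non-negative integer.
    case "$pid" in
        ''|*[!0-9]*)
            echo "usage: capture_backtrace <PID>" >&2
            return 2
            ;;
    esac
    "${GDB:-gdb}" -p "$pid" --batch -ex 'thread apply all backtrace'
}

# Saved as a script that ends with:  capture_backtrace "$1"
# it could be wired into the unit like this (path is hypothetical):
#   ExecStop=/usr/local/bin/capture-backtrace.sh $MAINPID
```

When run by systemd, the output of the command ends up in the journal of the unit.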
I use GMediaRenderer to send audio via UPNP from a Raspberry Pi. Occasionally, for reasons unknown, I have to SSH into my Pi and send the command sudo service gmediarenderer restart to get it to work properly. I'd like to add a command to crontab or similar that periodically checks whether the service is running properly. I already have a crontab entry that checks whether the service is running, and starts if it isn't. The trouble I'm having is that sometimes, even though the service is running, it doesn't appear to be communicating with UPNP control points. Executing the restart command brings it back, so I assume it is simply the case that the service has crashed but not closed down.
Does anyone know how to programmatically check (preferably using a bash script) whether the GMediaRenderer service is up and running?
I have found a solution to this. The command gssdp-discover returns a list of active renderers. I set up a root crontab job that runs a bash script every minute; the script checks whether a particular renderer is running and restarts gmediarenderer if it isn't found.
The following command will list your active renderers:
gssdp-discover -i wlan0 --timeout=3
Change wlan0 above depending on your specific network connection. In my case, the renderer that I'm interested in is listed as urn:av-openhome-org:service:Info:1 (run the command with and without the renderer active, and look for the one that only appears when running). So, my bash script contains the following:
#!/bin/bash
# Restart gmediarenderer unless the OpenHome renderer shows up as available.
if gssdp-discover -i wlan0 --timeout=3 --target=urn:av-openhome-org:service:Info:1 | grep -q available; then
    echo "OpenHome renderer is already running"
else
    echo "restarting gmediarenderer"
    /etc/init.d/gmediarenderer stop
    /etc/init.d/gmediarenderer start
fi
I've been trying to figure out when my script should start and what to use as the After= parameter.
What I need is to start my service as late as possible, like the last service in the stack. I definitely need /home to be mounted. I can't rely on wpa_supplicant or mdns, since it is not a given that those have been configured on the device.
I've also read the systemd docs but could not figure out which service to put in the After= option of the service file.
After=ABC.service
means your service is ordered after ABC.service, but only if ABC.service is started at all; After= by itself does not pull it in. To make sure ABC.service is actually started too, additionally use
Requires=ABC.service
OR
Alternatively, you can use the approach below.
Have something create a file at a known location on the device once /home is mounted, and start your service only after that file exists:
[Service]
Type=oneshot
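ExecStart=/bin/sh -c 'while [ ! -e /tmp/YOUR_FILE ]; do sleep 0.1 ; done'
The command has to be a shell, because -c passes the loop as a shell command line. It blocks until /tmp/YOUR_FILE exists, so anything ordered after this oneshot unit starts only once the file is there.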
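The polling loop above waits forever if the file never shows up. A minimal sketch of the same idea with a timeout might look like this; the function name is made up, and it assumes a sleep that accepts fractional seconds, as the one-liner already does.

```shell
#!/bin/sh
# Sketch only: wait until a file exists, but give up after a timeout.

# wait_for_file <path> <timeout-seconds>
# Returns 0 as soon as <path> exists, 1 if it does not appear in time.
wait_for_file() {
    path=$1
    tries=$(( $2 * 10 ))        # one poll every 0.1 s
    while [ ! -e "$path" ]; do
        [ "$tries" -gt 0 ] || return 1
        tries=$(( tries - 1 ))
        sleep 0.1
    done
    return 0
}

# Usage (not executed here):
#   wait_for_file /tmp/YOUR_FILE 60 || echo "gave up waiting" >&2
```

A non-zero exit status lets the unit fail visibly instead of hanging during boot.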
Hope this helps.
I'm trying to run Kafka under supervision so that it restarts automatically after a shutdown. But all the examples of running Kafka use shell scripts, and supervisord cannot tell which PID to monitor. Can anyone suggest how to accomplish auto restart of Kafka?
If you are on a Unix or Linux machine, then this is when /etc/inittab comes in handy. Or you might want to use daemontools. I don't know about Windows though.
We are running Kafka under Supervisord (http://supervisord.org/), and it works like a charm. The run command looks like this (as specified in the supervisord.conf file):
command=/usr/local/bin/pidproxy /var/run/kafka.pid /usr/lib/kafka/bin/kafka-server.sh -f -p /var/run/kafka.pid
The -f flag tells Kafka to start in the foreground. If the -p flag is set, the PID of the Kafka process is written to the specified file.
The pidproxy command is part of the Supervisord distribution. Upon receiving a signal such as SIGTERM (SIGKILL itself cannot be caught), it reads the PID from the specified file and forwards the signal to the corresponding process.
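To illustrate the forwarding step, here is a small sketch of the idea in shell. It is not pidproxy's actual code, and the function name is invented for the example.

```shell
#!/bin/sh
# Sketch only: illustrates the idea, not pidproxy's implementation.

# forward_signal <signal-name> <pidfile>
# Read a PID from <pidfile> and send that process the given signal.
forward_signal() {
    sig=$1
    pidfile=$2
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # Only accept a plain number as PID.
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;
    esac
    kill -s "$sig" "$pid"
}

# A wrapper could install it as a handler (not executed here):
#   trap 'forward_signal TERM /var/run/kafka.pid' TERM
```

The point of going through the PID file is that the supervisor only knows the PID of the wrapper, not of the process the start script eventually launches.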
I'm using a BeagleBone, and since it has no built in RTC and battery back up, it loses the date on every reboot. I can easily set the date with the command:
/usr/bin/ntpdate -b -s -u pool.ntp.org
But if the power goes out and back on for the house for example, then the time is lost. The solution that comes with the latest beaglebone Angstrom linux distribution is to put a crontab line in that updates the time every half hour. But I would prefer to just run the command once on powerup.
I tried putting the command above in crontab with an @reboot line, but I believe it ran before the network was configured, or something else failed, since I didn't get the right time after I pulled the power for 5 minutes and reconnected the BeagleBone.
Is there some way to use ifconfig or something like that to run a script from init.d only after network is available?
opkg install ntp-systemd
systemctl enable ntpdate.service
systemctl enable ntpd.service
Edit /etc/ntp.conf and comment out the following lines (there is no hardware clock to fall back on, and the ntpdate service uses the "ntpd -q" command):
#server 127.127.1.0
#fudge 127.127.1.0 stratum 14
Two services are installed:
/lib/systemd/system/ntpd.service:
[Unit]
Description=Network Time Service
After=network.target
[Service]
Type=forking
PIDFile=/run/ntpd.pid
ExecStart=/usr/bin/ntpd -p /run/ntpd.pid
/lib/systemd/system/ntpdate.service:
[Unit]
Description=Network Time Service (one-shot ntpdate mode)
Before=ntpd.service
[Service]
Type=oneshot
ExecStart=/usr/bin/ntpd -q -g -x
RemainAfterExit=yes
ntpd is started after the network is up (After=network.target), so the date should be continuously synchronized. But, as explained in the ntpd man page:
Most operating systems and hardware of today incorporate a
time-of-year (TOY) chip to maintain the time during periods when the
power is off. When the machine is booted, the chip is used to
initialize the operating system time. After the machine has
synchronized to a NTP server, the operating system corrects the chip
from time to time. In case there is no TOY chip or for some reason its
time is more than 1000s from the server time, ntpd assumes something
must be terribly wrong and the only reliable action is for the
operator to intervene and set the clock by hand. This causes ntpd to
exit with a panic message to the system log. The -g option overrides
this check and the clock will be set to the server time regardless of
the chip time. However, and to protect against broken hardware, such
as when the CMOS battery fails or the clock counter becomes defective,
once the clock has been set, an error greater than 1000s will cause
ntpd to exit anyway.
So we need to set the date before starting ntpd. This is done by the ntpdate service, which executes "ntpd -q -g -x" before ntpd.service starts.
From ntpd man page:
-q Exit the ntpd just after the first time the clock is set. This behavior mimics that of the ntpdate program, which is to be retired.
The -g and -x options can be used with this option. Note: The kernel
time discipline is disabled with this option.
Another service installed on the BeagleBone interacts with the date/time:
timestamp.service:
[Unit]
Description=Timestamping service
ConditionPathExists=/etc/timestamp
After=remount-rootfs.service
[Service]
RemainAfterExit=yes
ExecStart=/usr/bin/load-timestamp.sh
ExecStop=/usr/bin/load-timestamp.sh --save
This service stores the current timestamp in /etc/timestamp when it is stopped and sets the date from that timestamp when it is started. So if ntpd isn't installed, the date is set manually, and the BeagleBone is rebooted, the date is only behind by the boot duration.
Do you have the /etc/network/if-post-up.d/ directory on your target system? If so, scripts in that directory should be run when the network comes up. If not, are you using DHCP? Your DHCP client may support running scripts.
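If neither hook is available, another option is to retry the command until it succeeds. Below is a minimal sketch of such a retry helper; the function name and the attempt and delay values in the usage comment are assumptions, and the ntpdate command is the one from the question.

```shell
#!/bin/sh
# Sketch only: retry a command until it succeeds or attempts run out.

# retry <max-attempts> <delay-seconds> <command> [args...]
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while ! "$@"; do
        [ "$attempt" -lt "$max" ] || return 1
        attempt=$(( attempt + 1 ))
        sleep "$delay"
    done
    return 0
}

# Usage at boot (not executed here):
#   retry 30 2 /usr/bin/ntpdate -b -s -u pool.ntp.org
```

This relies on the command failing with a non-zero exit status while the network is still down, which is worth checking for ntpdate on the target system.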