What is the best way to run ntpdate at reboot, only after network is ready - embedded-linux

I'm using a BeagleBone, and since it has no built-in RTC with battery backup, it loses the date on every reboot. I can easily set the date with the command:
/usr/bin/ntpdate -b -s -u pool.ntp.org
But if, for example, the power to the house goes out and comes back on, the time is lost. The solution that comes with the latest BeagleBone Angstrom Linux distribution is a crontab line that updates the time every half hour. I would prefer to run the command just once on power-up.
I tried putting the command above in crontab with an @reboot line, but I believe it ran before the network was configured, or something else failed, since I didn't get the right time after I pulled the power from the BeagleBone for 5 minutes and restored it.
Is there some way to use ifconfig or something like that to run a script from init.d only after network is available?

opkg install ntp-systemd
systemctl enable ntpdate.service
systemctl enable ntpd.service
Edit /etc/ntp.conf and comment out the following lines (there is no hardware clock to fall back on, and the ntpdate service uses the "ntpd -q" command):
#server 127.127.1.0
#fudge 127.127.1.0 stratum 14
Two services are installed:
/lib/systemd/system/ntpd.service:
[Unit]
Description=Network Time Service
After=network.target
[Service]
Type=forking
PIDFile=/run/ntpd.pid
ExecStart=/usr/bin/ntpd -p /run/ntpd.pid
/lib/systemd/system/ntpdate.service:
[Unit]
Description=Network Time Service (one-shot ntpdate mode)
Before=ntpd.service
[Service]
Type=oneshot
ExecStart=/usr/bin/ntpd -q -g -x
RemainAfterExit=yes
ntpd is started after the network is up (After=network.target), so the date should be continuously synchronized. But, as explained in the ntpd man page:
Most operating systems and hardware of today incorporate a
time-of-year (TOY) chip to maintain the time during periods when the
power is off. When the machine is booted, the chip is used to
initialize the operating system time. After the machine has
synchronized to a NTP server, the operating system corrects the chip
from time to time. In case there is no TOY chip or for some reason its
time is more than 1000s from the server time, ntpd assumes something
must be terribly wrong and the only reliable action is for the
operator to intervene and set the clock by hand. This causes ntpd to
exit with a panic message to the system log. The -g option overrides
this check and the clock will be set to the server time regardless of
the chip time. However, and to protect against broken hardware, such
as when the CMOS battery fails or the clock counter becomes defective,
once the clock has been set, an error greater than 1000s will cause
ntpd to exit anyway.
So we need to set the date before starting ntpd and this is done by the ntpdate service by executing "ntpd -q -g -x" before starting ntpd.service.
From ntpd man page:
-q Exit the ntpd just after the first time the clock is set. This behavior mimics that of the ntpdate program, which is to be retired.
The -g and -x options can be used with this option. Note: The kernel
time discipline is disabled with this option.
Another service installed on the BeagleBone interacts with the date/time:
timestamp.service
[Unit]
Description=Timestamping service
ConditionPathExists=/etc/timestamp
After=remount-rootfs.service
[Service]
RemainAfterExit=yes
ExecStart=/usr/bin/load-timestamp.sh
ExecStop=/usr/bin/load-timestamp.sh --save
This service stores the current timestamp in /etc/timestamp when it is stopped and sets the date from that timestamp when it is started. So if ntpd isn't installed, the date is set manually, and the BeagleBone is rebooted, the date is only behind by the boot duration.
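The contents of load-timestamp.sh are not shown here, so the following is only a rough sketch of the save/restore idea, not the Angstrom script. The function names, the epoch-seconds file format, and the `date -s "@N"` syntax (GNU date; some busybox builds accept it too) are all assumptions.

```shell
#!/bin/sh
# Hypothetical sketch of a save/restore helper; NOT the real
# /usr/bin/load-timestamp.sh. Names and file format are assumptions.
TIMESTAMP_FILE="${TIMESTAMP_FILE:-/etc/timestamp}"

# Store the current UTC time as seconds since the epoch.
save_timestamp() {
    date -u +%s > "$TIMESTAMP_FILE"
}

# Set the clock from the stored value, but only when the stored value is
# ahead of the current clock, so a good clock is never moved backwards.
# Setting the clock needs root.
restore_timestamp() {
    [ -r "$TIMESTAMP_FILE" ] || return 1
    saved=$(cat "$TIMESTAMP_FILE")
    now=$(date -u +%s)
    if [ "$saved" -gt "$now" ]; then
        date -u -s "@$saved" >/dev/null
    fi
}
```

In this sketch a service would call restore_timestamp from ExecStart= and save_timestamp from ExecStop=.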

Do you have the /etc/network/if-up.d/ directory on your target system? If so, scripts in that directory should be run when the network comes up. If not, are you using DHCP? Your DHCP client may support running scripts.
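A hook script of that kind could look roughly like the sketch below. The interface can be up before DNS or routing actually works, so it retries a few times. The retry helper, the retry count and the delay are hypothetical choices; the ntpdate command line is the one from the question.

```shell
#!/bin/sh
# Hypothetical network "up" hook sketch.
# retry TRIES DELAY CMD...: run CMD until it succeeds, at most TRIES times,
# sleeping DELAY seconds between attempts. Returns 1 if every attempt fails.
retry() {
    retry_tries=$1
    retry_delay=$2
    shift 2
    retry_n=1
    while ! "$@"; do
        if [ "$retry_n" -ge "$retry_tries" ]; then
            return 1
        fi
        retry_n=$((retry_n + 1))
        sleep "$retry_delay"
    done
    return 0
}

# Example call (commented out so the sketch has no side effects):
# retry 5 10 /usr/bin/ntpdate -b -s -u pool.ntp.org
```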

Related

How to run a specific program before systemd's watchdog stops a service

I have a program which is run by systemd with a service file like this:
[Unit]
Description=...
[Service]
Type=notify
ExecStart=/usr/sbin/myprogram
WatchdogSec=1
KillMode=process
KillSignal=SIGTERM
Restart=always
It sends the respective signal to the watchdog regularly. From time to time, the program seems to hang and is terminated by the watchdog, then restarts. Before the watchdog terminates it, I'd like to capture some information from the program by executing a command or running some other script (e.g. run gdb -p <PID> --batch -ex 'thread apply all backtrace'). How would I do this?
Add an ExecStop= line to your service.
[Service]
ExecStart=....
ExecStop=/path/to/SomeOtherProgram
....
According to the systemd manual, if ExecStop= is set, systemd runs it first; if the process started by ExecStart= is still running afterwards, it is terminated according to KillMode=.
ExecStop=
Commands to execute to stop the service started via
ExecStart=. This argument takes multiple command lines, following the
same scheme as described for ExecStart= above. Use of this setting is
optional. After the commands configured in this option are run, it is
implied that the service is stopped, and any processes remaining for
it are terminated according to the KillMode= setting (see
systemd.kill(5)). If this option is not specified, the process is
terminated by sending the signal specified in KillSignal= when service
stop is requested. Specifier and environment variable substitution is
supported (including $MAINPID, see above).
EDIT
As noted in the comments, this solution may not work when the service is terminated by the watchdog (the WatchdogSec= option in the service file).

ntpd -qg: Use with timeout

Working on a Raspberry Pi 3.
Situation: only one server in /etc/ntp.conf is given and this given address is invalid (no NTP-Server running on that address).
Problem: running ntpd -qg never ends, since there is no timeout option like the one in ntpdate -t 60.
Question: Can one specify a timeout for ntpd? If not, how can you assure the process ends after time x?
For now, on startup the Pi executes a bash script that tries to get the current time from the NTP server given in /etc/ntp.conf and then hangs, since there is no NTP server available at that address. The process keeps running from startup, and I can't call another ntpd until the initial ntpd process is killed.
Any work around?
PS: I would like not to use ntpdate since it is tagged as a retiring package
EDIT:
The RPi3 is located in an isolated network. Online NTP-servers are no option in my case.
There is a timeout command usually shipped with coreutils that allows you to set timeout on any command (even if it does not support it on its own). E.g.
timeout 60 ntpd -qg
This runs ntpd -qg and has it time out after 60s. If the command finishes in time, you get its return value; if the timeout intervenes, you get 124.
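In a boot script it can help to tell those two cases apart. A minimal sketch, assuming GNU coreutils timeout (the run_with_timeout name is made up for illustration):

```shell
#!/bin/sh
# run_with_timeout SECONDS CMD...: run CMD under coreutils `timeout`,
# log a message when the limit was hit, and pass the exit status through.
# The function name is hypothetical.
run_with_timeout() {
    rwt_limit=$1
    shift
    rwt_status=0
    timeout "$rwt_limit" "$@" || rwt_status=$?
    if [ "$rwt_status" -eq 124 ]; then
        echo "timed out after ${rwt_limit}s: $*" >&2
    fi
    return "$rwt_status"
}

# Example call (commented out; needs ntpd and root):
# run_with_timeout 60 ntpd -qg || echo "clock not set, continuing" >&2
```

Because the hung ntpd is killed when the limit is reached, a later ntpd call is no longer blocked by it.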

How to provide a restart count to systemd service

I have an embedded device which manages its various services using systemd. Our status reporting application is one of these services. It is always on and it automatically restarts on failure (crashes, exceptions, OOM conditions, whatever).
We report an event to our cloud services on device restart (technically application restart) but I'd like to distinguish first start (after reboot) from restart. Is there a mechanism built into systemd which can provide the service restart count, or do I need to roll my own method?
Do you have the journal? If you do, then you can get the count like this:
journalctl -b -u myservicename.service |grep -c Started
The -b option limits logs to the current boot; -u limits to the service in argument.
Then you grep for the "Started" line, and tell grep to only give you the number of matches.
You can use the following command:
systemctl show foo.service -p NRestarts
It returns a value if the service is in a restart loop; otherwise it returns nothing.
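To use that from a script, the NRestarts=<n> output line has to be parsed, and the empty case needs a default. A small sketch; the restart_count helper name and the choice to treat empty output as 0 are assumptions:

```shell
#!/bin/sh
# restart_count UNIT: print the unit's NRestarts value, or 0 when systemctl
# prints nothing for that property. Helper name is hypothetical.
restart_count() {
    rc_value=$(systemctl show "$1" -p NRestarts 2>/dev/null | sed -n 's/^NRestarts=//p')
    echo "${rc_value:-0}"
}

# Example call (commented out; needs systemd):
# if [ "$(restart_count myservicename.service)" -eq 0 ]; then
#     echo "first start since boot"
# fi
```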

Elasticsearch Docker stop seems to ignore SIGKILL

I'm trying to use Elasticsearch in Docker for local dev. While I can find containers that work, when docker stop is sent, the containers hang for the default 10s, then docker forcibly kills the container. My assumption here is that ES is either not on PID 1 or other services prevent it from shutting down immediately.
I'm curious if anyone can expand on this, or explain why this is happening more accurately. I'm running numerous tests and 10s+ to shutdown is just annoying when other containers shutdown after 1-2s.
If you don't want to wait the 10 seconds, you can run a docker kill instead of a docker stop. You can also adjust the timeout on docker stop with the -t option, e.g. docker stop -t 2 $container_id to only wait 2 seconds instead of the default 10.
As for why it's ignoring the signal (docker stop sends SIGTERM first and only sends SIGKILL, which cannot be ignored, after the timeout), that may depend on which image you are running (there's more than one for Elasticsearch). However, if PID 1 is a shell like /bin/sh or /bin/bash, it will not pass signals through. If PID 1 is the Elasticsearch process, it may ignore the signal, or 10 seconds may not be long enough for it to fully clean up and shut down.
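If the image starts the server through a wrapper shell script, one common pattern is to end that script with exec, so the server replaces the shell, keeps its PID, and receives the stop signal directly. A sketch only; the start_server name is hypothetical and this is not taken from any particular Elasticsearch image:

```shell
#!/bin/sh
# Hypothetical container entrypoint sketch.
# start_server CMD...: run any setup, then replace the current shell with
# CMD so that CMD keeps the shell's PID (PID 1 when this is the entrypoint).
start_server() {
    # ... setup steps would go here ...
    exec "$@"
}

# Example call (commented out; the path is a placeholder):
# start_server /path/to/server --some-option
```

Nothing after the exec line runs, because the shell process no longer exists. Using the exec (JSON array) form of ENTRYPOINT or CMD in the Dockerfile likewise avoids an extra /bin/sh -c wrapper.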

start-stop-daemon weird behaviour

I'm creating a pallet crate for Elasticsearch. I was stuck on the service not starting; however, after looking at the logs, it seems that it doesn't really have anything to do with pallet. I am using the Elasticsearch apt package for 1.0, which includes an init script. If I run sudo service elasticsearch start, then ES starts with no problems. If pallet does this for me, it records standard out as having started it successfully:
start elasticsearch
* Starting Elasticsearch Server
...done.
However it is not started.
sudo service elasticsearch status
* elasticsearch is not running
I messed around with the init script and I found if I added sleep 1 after starting the daemon then it works correctly with pallet.
start-stop-daemon --start -b --user "$ES_USER" -c "$ES_USER" --pidfile "$PID_FILE" --exec $DAEMON -- $DAEMON_OPTS
#this sleep will allow it to work
#sleep 1
log_end_msg $?
I don't understand what is going on?
I've seen issues like this before, too. It generally comes down to expecting something to have finished before the script finishes, which may not always happen with services since they fork off background tasks that may still get killed when the ssh connection is terminated.
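If you stay with the plain init script, an alternative to a fixed sleep 1 is to wait until the PID file names a live process before reporting success. A sketch under that assumption; wait_for_pid and its default timeout are made up, and whether this cures the pallet symptom in every case is not confirmed:

```shell
#!/bin/sh
# wait_for_pid PIDFILE [TIMEOUT]: poll once per second until PIDFILE names a
# live process; give up after TIMEOUT seconds (default 10).
# Name and default are hypothetical.
wait_for_pid() {
    wfp_file=$1
    wfp_left=${2:-10}
    while [ "$wfp_left" -gt 0 ]; do
        if [ -s "$wfp_file" ] && kill -0 "$(cat "$wfp_file")" 2>/dev/null; then
            return 0
        fi
        sleep 1
        wfp_left=$((wfp_left - 1))
    done
    return 1
}

# Example use in the init script, in place of "sleep 1" (commented out):
# wait_for_pid "$PID_FILE" 10
# log_end_msg $?
```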
For these kinds of things you should use Pallet's built in code for running things under supervision. This also has the advantage of making it very easy to switch from plain init.d to runit or daemontools later, which is especially useful for Elasticsearch because it's a JVM process and nearly any JVM will eventually crash if you let it run long enough.

Resources