Optimizing a systemd service taking too long - systemd

I have this udhcpc service in my system:
[Unit]
Description=uDHCP Client Service
After=network.target
Conflicts=systemd-resolved.service
[Service]
Type=forking
ExecStart=/sbin/udhcpc -p /var/run/udhcpc.brg0.pid -i brg0 -R -b
ExecStop=/bin/sh -c 'test -f /var/run/udhcpc.brg0.pid && kill $(cat /var/run/udhcpc.brg0.pid)'
[Install]
WantedBy=multi-user.target
It's been working well, except systemd-analyze is showing that it is adding about 7 seconds to the boot time:
7.388s udhcpc.service
4.946s dev-mmcblk1p2.device
1.303s uim-sysfs.service
959ms dev-mmcblk1p4.device
752ms dev-mmcblk1p3.device
739ms dev-mmcblk1p1.device
718ms systemd-hwdb-update.service
.
.
And here is the output of systemd-analyze critical-chain:
multi-user.target @15.164s
udhcpc.service @7.773s +7.388s
network.target @7.551s
systemd-networkd.service @6.724s +668ms
systemd-udevd.service @1.854s +87ms
systemd-tmpfiles-setup-dev.service @1.662s +70ms
systemd-sysusers.service @1.353s +229ms
systemd-remount-fs.service @1.044s +238ms
systemd-journald.socket @911ms
-.slice @281ms
I suppose the right way to fix this is to avoid using udhcpc and stick to the mechanism built into systemd, but unfortunately that's not my call. I'd like to at least optimize the boot time though. What are some things I can do?

The "problem" is systemd-networkd, which stops the boot until the network is configured -- and this is what you want to replace, not systemd-resolved.
Network autoconfiguration cannot be made any faster, because, when properly implemented, DHCP needs to check that the address isn't already in use, which involves sending a bunch of ARP packets and waiting for the timeout.
Since you insert your service between "network is configured" and "multi-user boot is complete" targets, you introduce a dependency where there was none before.
Network configuration is normally asynchronous, because any service that fails when the network is unconfigured at start would also fail when the network goes down later.
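One way to keep udhcpc off the critical chain, sketched under assumptions and not tested on this board, is to run it in the foreground as a Type=simple service. With Type=forking, systemd waits for the initial udhcpc process to exit, which with -b happens only once a lease is obtained or the attempts time out. With Type=simple the unit counts as started as soon as the process is spawned. The -f (foreground) flag is BusyBox udhcpc's; everything else is carried over from the unit in the question.

```ini
[Unit]
Description=uDHCP Client Service
After=network.target
Conflicts=systemd-resolved.service

[Service]
# Type=simple: systemd does not wait for a lease before treating the unit as started
Type=simple
# -f keeps udhcpc in the foreground so systemd tracks it directly; no PID file is needed
ExecStart=/sbin/udhcpc -f -i brg0 -R
# No ExecStop: by default systemd stops the service by sending SIGTERM to the main process

[Install]
WantedBy=multi-user.target
```

The trade-off is that units ordered after this one can no longer assume brg0 already has an address when they start.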

Related

Starting an opensplice publisher via systemd does not publish data

I have an opensplice publisher on Ubuntu 20.04 that is started via systemd.
If the publisher starts via systemd, the data is not published, but no errors are reported or present in the opensplice log files.
The publisher works if I run it from a command line or if I stop and restart the service.
The QoS are the same for the publisher and subscriber.
The publisher and subscriber applications are running on different machines.
There are no other participants on the network. All the machines are rebooted and the order of reboot does not change the observed behaviour.
The systemd service is:
[Unit]
Description=Publisher Process
Documentation=
After=network.target
StartLimitIntervalSec=0
[Service]
Type=simple
WorkingDirectory=/opt/publisher/bin
ExecStart=/opt/publisher/bin/publisher.sh
Restart=always
RestartSec=2
[Install]
WantedBy=multi-user.target
The publisher.sh is:
#!/bin/bash
cd /opt/publisher/bin
source bashrc_local
# We just keep running the application (in case of a crash)
while true; do
./publisher
sleep 15
done
I have a workaround that feels a little bit naff.
#!/bin/bash
cd /opt/publisher/bin
source bashrc_local
timeout 30 ./remote_processor
killall remote_processor
# We just keep running the application (in case of a crash)
while true; do
./publisher
sleep 15
done
Any ideas on how I can remove my workaround?
Edit 16 Sept 22
The issue appears to be systemd start order and dependencies as I have run into the same issue with a program publishing data via UDP which is not using DDS.
Changing the dependencies so the services are started just before the user login does not help.
Check your environment variables, as systemd will not run the service with the same environment as your bash console.
In particular, have you set the OSPL_URI variable to point at the config?
If using the commercial version, OSPL_HOME and ADLINK_LICENSE will also need to be set.
Does the PATH variable include your OSPL shared libraries?
These are all set up by running the $OSPL_HOME/release.com script in your bash session.
I tend to manually add the required ones to the service file,
e.g.
Environment=OSPL_URI=file:///opt/ospl.xml
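A fuller sketch of that approach for the [Service] section of the publisher unit. The variable names are the ones listed above. Every path and value is a placeholder and not taken from a real installation, and the use of LD_LIBRARY_PATH is an assumption about how the shared libraries are located on Linux.

```ini
[Service]
# Placeholder values; replace them with what release.com sets in your shell
Environment=OSPL_HOME=/opt/opensplice
Environment=OSPL_URI=file:///opt/ospl.xml
# Assumption: the OpenSplice shared libraries are found via LD_LIBRARY_PATH
Environment=LD_LIBRARY_PATH=/opt/opensplice/lib
# Alternative: keep KEY=VALUE lines in a file (hypothetical path);
# the leading "-" tells systemd to ignore the file if it is missing
EnvironmentFile=-/etc/default/publisher
```

To compare against the shell, systemctl show -p Environment followed by the unit name prints the variables set through Environment= lines, which can be checked against the output of env in the working bash session.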

How to properly override generated systemd unit file to start after a ZFS mount has mounted

I'm using Ubuntu 18.04.4 LTS which uses systemd, but the squid package packaged with this version of Ubuntu is configured to start via init.d. It starts and runs via systemctl start squid.service if I start it manually after the system has booted.
However, I'm using a ZFS mount point ("/media") to store the cache data, and during the boot process squid is starting before this mount point is active. Consequently I'm getting the error "Failed to verify one of the swap directories". Full output of systemctl status squid is here
I'd like to tell systemd to wait until after media.mount has completed in the most minimally invasive way possible (e.g. without modifying the /etc/init.d/squid file that is maintained by the package). To that end I created the /etc/systemd/system/squid.service.d/override.conf file like so:
% cat /etc/systemd/system/squid.service.d/override.conf
[Unit]
Wants=network.target network-online.target nss-lookup.target media.mount
After=network.target network-online.target nss-lookup.target media.mount
[Install]
WantedBy=multi-user.target
But squid is still starting too early.
Is what I want to do possible? Or do I have to bite the bullet and define a native /etc/systemd/system/squid.service file and remove the /etc/init.d/squid init script?

Starting a systemd service with privileges

I would like systemd to manage the tup monitor, so I wrote a service unit:
[Unit]
Description=Monitor source files for changes
[Service]
Type=simple
ExecStart=/usr/bin/tup monitor -f
ExecStop=/usr/bin/tup stop
WorkingDirectory=/some/dir
StandardOutput=journal+console
StandardError=journal+console
[Install]
WantedBy=multi-user.target
Whereas starting the monitor manually works, trying to do it with systemd fails with the following error:
setpgid: Operation not permitted
tup error: Unable to set process group for tup's subprocesses.
The error originates in this file and seems to be due to systemd not giving the process the capabilities it needs, so I went further:
[Unit]
Description=Monitor source files for changes
[Service]
Type=simple
ExecStart=/usr/bin/tup monitor -f
ExecStop=/usr/bin/tup stop
CapabilityBoundingSet=CAP_SETUID CAP_SETGID
AmbientCapabilities=CAP_SETUID CAP_SETGID
WorkingDirectory=/some/dir
StandardOutput=journal+console
StandardError=journal+console
[Install]
WantedBy=multi-user.target
Still, it doesn't work.
How can I tell systemd to keep the permissions tup requires?
systemd does not restrict any permissions on services by default, so I don’t think that’s the problem. The setpgid(2) manpage mentions three possible conditions for EPERM, though:
An attempt was made to move a process into a process group in a different session, or to change the process group ID of one of the children of the calling process and the child was in a different session, or to change the process group ID of a session leader (setpgid(), setpgrp()).
I think the third one is the most likely cause of the error: tup probably expects to be run from a shell, in which case the shell would be the session leader and tup tries to establish its own process group, but systemd places each service in its own session as part of setting up a clean execution environment for the service (see also daemon(7)), so that step fails.
Unless you can somehow disable that step in tup (it doesn’t need to fork, either, or do a number of other things which systemd makes unnecessary), I think the only workaround would be an ugly hack:
ExecStart=/bin/sh -c '/usr/bin/tup monitor -f'
With this, the session leader will be the shell, not tup, so the setpgid should work again.
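One caveat, offered as an assumption about the shell and not verified against tup: some shells, bash among them, exec a lone command passed to -c instead of forking. In that case tup would replace the shell and remain the session leader. A variant that gives the shell something to do after tup, so that it has to fork, might look like this:

```ini
[Service]
Type=simple
WorkingDirectory=/some/dir
# "$$" is systemd's escape for a literal "$", so the shell sees "exit $?".
# The trailing command keeps the shell alive as session leader and passes on tup's exit status.
ExecStart=/bin/sh -c '/usr/bin/tup monitor -f; exit $$?'
ExecStop=/usr/bin/tup stop
```

With this form the main process systemd tracks is the shell, and tup runs as its child in the same session.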
Try this
[Unit]
Description=Monitor source files for changes
After=network.target
[Service]
Type=simple
ExecStart=/usr/bin/tup monitor -f
ExecStop=/usr/bin/tup stop
CapabilityBoundingSet=CAP_SETUID CAP_SETGID
AmbientCapabilities=CAP_SETUID CAP_SETGID
WorkingDirectory=/some/dir
StandardOutput=journal+console
StandardError=journal+console
[Install]
WantedBy=multi-user.target
I've added After=network.target, which orders the tup service after network.target so that it does not start before the network stack has been brought up, avoiding failures caused by starting too early.

Running a binary as systemd service does not work

I have a Go project, and when I compile it I get a typical binary. If I call the binary directly from the command line with the appropriate inputs, it runs perfectly well. I want to make this program part of the systemd ecosystem, and the following is my gobinary.service:
[Unit]
Description=Run Go Service
[Service]
WorkingDirectory=/path/to/directory/
ExecStart=/path/to/directory/binary --config config/service.conf
Restart=always
[Install]
WantedBy=multi-user.target
I can start and stop the service using the usual sudo service gobinary start without error messages, but the program is not actually running: it has a RabbitMQ input queue, and when I send content to the queue it doesn't react. If, however, I run the binary directly with $ ./binary --config/service.conf, everything runs smoothly. How do I troubleshoot this?

Trigger event on AWS EC2 instance stop/terminate

Is there some way to trigger an event (e.g. running a script to push some logs to S3) when an EC2 instance is stopped/terminated?
I have looked into triggering the script using a service in /usr/lib/systemd/system but I haven't had any luck with that yet. I have heard that networking capabilities on the instance can be shutdown before a service is triggered and if true, that could be why the script is not executing correctly.
So the answer is not really AWS specific, but it is working for me now (tested on EC2 instance stopping and terminating).
I've created a systemd service file:
/usr/lib/systemd/system/my_shutdown.service
[Unit]
Description=my_shutdown Service
Before=shutdown.target reboot.target halt.target
Requires=network-online.target network.target
[Service]
KillMode=none
ExecStart=/bin/true
ExecStop=/path/to/my_script.sh
RemainAfterExit=yes
Type=oneshot
[Install]
WantedBy=multi-user.target
Added this service to multi-user.target:
systemctl enable my_shutdown.service
Alternatively you can manually create the symlink:
ln -s /usr/lib/systemd/system/my_shutdown.service /etc/systemd/system/multi-user.target.wants/my_shutdown.service
Started the service and tested by stopping/terminating the instance.
systemctl start my_shutdown.service
My understanding:
Description: a description of our service.
Before: we want our service to stop before these targets are started.
Requires: our service requires that network capabilities are available. These targets must not be stopped before our service starts/stops.
KillMode: none; do not kill our process.
ExecStart: /bin/true; a command that does nothing but return success. Run when our service is started.
ExecStop: the script to run. Run when our service is being stopped.
RemainAfterExit: consider our service active even when all its processes exited.
Type: oneshot; it is expected that the process has to exit before systemd starts follow-up units.
WantedBy: the target we want to add our service to.
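A note on ordering: Requires= on its own does not order units. A hedged variant of the [Unit] section adds After=, relying on the documented behaviour that units are stopped in the reverse of their start order, so a unit ordered after network.target is stopped before the network is shut down. This is a sketch and has not been tested on EC2:

```ini
[Unit]
Description=my_shutdown Service
Before=shutdown.target reboot.target halt.target
Requires=network-online.target network.target
# Ordering is separate from Requires=. Stop order is the reverse of start order,
# so with this line ExecStop runs while the network is still up.
After=network-online.target network.target
```

The [Service] and [Install] sections stay as shown above.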
References:
https://www.freedesktop.org/software/systemd/man/systemd.service.html
https://www.freedesktop.org/software/systemd/man/systemd.kill.html#
https://www.freedesktop.org/software/systemd/man/systemd.special.html
https://www.freedesktop.org/software/systemd/man/systemd.target.html
You can trigger events, such as pushing logs to S3 on specific events, with CloudWatch... Learn more here: https://aws.amazon.com/cloudwatch/

Resources