QNX (Neutrino 6.5.0) uses an open source implementation of ksh as its shell. A lot of the provided scripts, including the system startup scripts, use constructs such as
if ! test /dev/slog -ef /dev/slog; then
    # do something
fi
to check whether a resource manager exists in the filesystem. I've searched and could only find rather dry explanations saying that -ef checks whether the two operands are in fact the same file. Since the filename given on both sides is the same, it seems to reduce to a check that the file exists.
I have checked the behaviour of test -a and test -e (both check for the existence of a file of any type, according to the various docs I've read) and they seem to work as well.
Is there any difference in the checks performed between -ef and -a/-e? Is using -ef some kind of attempt to protect against a race condition in the existence of the file?
Reviewing an strace on Ubuntu Linux reveals no substantial differences between the two forms.
$ strace test /tmp/tmp.geLaoPkXXC -ef /tmp/tmp.geLaoPkXXC
showed this:
mmap(NULL, 7220736, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f11dc80b000
close(3) = 0
stat("/tmp/tmp.geLaoPkXXC", {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
stat("/tmp/tmp.geLaoPkXXC", {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
close(1) = 0
close(2) = 0
...whereas
$ strace test -a /tmp/tmp.geLaoPkXXC
showed this:
fstat(3, {st_mode=S_IFREG|0644, st_size=7220736, ...}) = 0
mmap(NULL, 7220736, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f6b49e2b000
close(3) = 0
stat("/tmp/tmp.geLaoPkXXC", {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
close(1) = 0
close(2) = 0
One stat vs two.
$ ksh --version
version sh (AT&T Research) 93u 2011-02-08
We can't tell from the trace alone exactly how the stat() results are used, so we need to find the difference in the source code.
/* code for -ef */
return (stat (argv[op - 1], &stat_buf) == 0
        && stat (argv[op + 1], &stat_spare) == 0
        && stat_buf.st_dev == stat_spare.st_dev
        && stat_buf.st_ino == stat_spare.st_ino);

/* code for -e/-a */
case 'a': /* file exists in the file system? */
case 'e':
    return stat (argv[pos - 1], &stat_buf) == 0;
So, if the names are the same and two stat() calls on the same name return the same device and inode, then test -a/-e file is equivalent to test file -ef file. The first condition holds by construction, and the second is confirmed by the comments from @tinman.
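To make the reduction concrete, here is a small sketch you can run in any shell whose test supports -ef (the file names a, b and c are just placeholders):
touch a
ln a b          # hard link: b has the same device and inode as a
ln -s a c       # symlink resolving to a
test a -ef b && echo "a and b are the same file"    # true: same inode
test a -ef c && echo "a and c are the same file"    # true: -ef resolves symlinks
test a -ef a && echo "a exists"                     # same name twice: reduces to an existence check
rm -f missing
test missing -ef missing || echo "no such file"     # stat() fails, so -ef is false
In other words, with identical operands -ef can only distinguish "stat() succeeded twice" from "stat() failed", which is the same information -e gives you.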
How can I instruct my bash script not to attempt to re-connect to my rsync daemon if the process lock file already exists (so as to prevent the script from endlessly creating new connections after the first connection has already been made)?
This is an example of one of my rsync-daemon wrapper scripts:
#!/bin/sh
#
#
while [ 1 ]
do
    cputool --load-limit 7.5 -- nice -n -15 rsync -avxP --no-i-r --rsync-path="rsync" --log-file=/var/log/rsync-home.log --exclude 'snap' --exclude 'lost+found' --exclude=".*" --exclude=".*/" 127.0.0.1::home /media/username/external/home-files-only && sync && echo 3 > /proc/sys/vm/drop_caches
    if [ "$?" = "0" ] ; then
        echo "rsync completed normally"
        exit
    else
        echo "Rsync failure. Backing off and retrying..."
        sleep 10
    fi
done
#end of shell script
This is my /etc/rsyncd.conf:
[home]
path = /home/username
list = yes
use chroot = false
strict modes = false
uid = root
gid = root
read only = yes
# Data source information
max connections = 1
lock file = /var/run/rsyncd-home.lock
[prod-bkup]
path = /media/username/external/Server-Backups/Prod/today
list = yes
use chroot = false
strict modes = false
uid = root
gid = root
# Don't allow to modify the source files
read only = yes
max connections = 1
lock file = /var/run/rsyncd-prod-bkup.lock
[test-bkup]
path = /media/username/external/Server-Backups/Test/today
list = yes
use chroot = false
strict modes = false
uid = root
gid = root
# Don't allow to modify the source files
read only = yes
max connections = 1
lock file = /var/run/rsyncd-test-bkup.lock
[VminRoot2]
path = /root/VDI-Files
list = yes
use chroot = false
strict modes = false
uid = root
gid = root
# Don't allow to modify the source files
read only = yes
max connections = 1
lock file = /var/run/rsyncd-VminRoot2.lock
Thanks to @james-brown I now have multiple ways to ensure my script runs once, correctly...
Solution 1 (quick & dirty):
flock -n <lock file> <script>
Or in my case, using this command to execute my cron job:
flock -n /var/run/rsyncd-home.lock /path/to/my_script.sh
Caveat: this leaves your script vulnerable to stale lock files that may prevent execution at the next time interval.
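For comparison, flock can also be taken from inside the script itself by locking an open file descriptor, so no external wrapper is needed; a minimal sketch, assuming util-linux flock and a writable /var/run:
#!/bin/bash
# Open (or create) the lock file on FD 9, then try to take an exclusive
# lock without blocking; if another instance holds it, exit immediately.
exec 9>/var/run/rsyncd-home.lock || exit 1
flock -n 9 || { echo "Another instance is running, exiting"; exit 1; }
# ...rest of the backup loop. The kernel releases the lock when the
# script exits, so a leftover lock file on disk cannot block the next run.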
Solution 2:
So, I used a bullet-proof method (so I think... I invite folks to correct my understanding, if need be)...
First, I ran apt install procmail, then commented out the two lines below in my /etc/rsyncd.conf and ran systemctl restart rsync:
#max connections = 1
#lock file = /var/run/rsyncd-home.lock
From there I edited /usr/local/bin/backupscript.sh as follows:
#!/bin/bash
#
LOCK=/var/run/rsyncd-home.lock

remove_lock()
{
    rm -f "$LOCK"
}

another_instance()
{
    echo "There is another instance running, exiting"
    exit 1
}

# Try once (-r 0) to create the lock; treat a lock older than an hour
# (-l 3600) as stale.
lockfile -r 0 -l 3600 "$LOCK" || another_instance
trap remove_lock EXIT

# new: using rsyncd & perpetual restart
while [ 1 ]
do
    cputool --load-limit 7.5 -- nice -n -15 rsync -avxP --no-i-r --rsync-path="rsync" --log-file=/var/log/rsync-home.log --exclude 'snap' --exclude="Variety Images" --exclude="Downloads/WebDev/Vmin-Vbox" --exclude 'Downloads/WebDev/Win10-Vbox' --exclude="Videos/other" --exclude 'lost+found' --exclude=".*" --exclude=".*/" 127.0.0.1::home /media/username/external/home-files-only && sync && echo 3 > /proc/sys/vm/drop_caches
    if [ "$?" = "0" ] ; then
        echo "rsync completed normally"
        exit
    else
        echo "Rsync failure. Backing off and retrying..."
        sleep 10
    fi
done
#end of shell script
PRESTO:
The script connects to the rsync daemon only once, it re-connects on dropped connections thanks to the while loop, and there is no danger of stale lock files interrupting my backup process at future intervals (i.e. problem solved).
Very useful reference:
https://www.baeldung.com/linux/bash-ensure-instance-running
I have a script which reads commands from a named pipe:
#! /usr/bin/env bash

host_pipe="host-pipe"

# pipe for executing commands
[ -p "$host_pipe" ] || mkfifo -m 0600 "$host_pipe" || exit 1
chmod o+w "$host_pipe"

set -o pipefail
while :; do
    if read -r cmd < "$host_pipe"; then
        if [ "$cmd" ]; then
            printf 'Running: %s \n' "$cmd"
        fi
    fi
done
I run it and test it with these commands:
bash -c "echo 'abcdef' > host-pipe"
bash -c "echo 'abcdef' > host-pipe"
bash -c "echo 'abcdef' > host-pipe"
bash -c "echo 'abcdef' > host-pipe"
And get the strange output:
Running: abcdf
Running: abcdef
Running: abcde
Running: abcdf
Running: ace
Somehow the script can't read the whole string it gets from the pipe. How can I read it reliably?
You must have more than one reader of the named pipe host-pipe running for this to happen.
Check to see if you have a second instance of the script running in the background or possibly in another terminal.
Explanation
You will find that bash will issue reads from the pipe 1 byte at a time. If you are on Linux, you can strace your script. Here is an excerpt:
open("host-pipe", O_RDONLY|O_LARGEFILE) = 3
fcntl64(0, F_GETFD) = 0
fcntl64(0, F_DUPFD, 10) = 10
fcntl64(0, F_GETFD) = 0
fcntl64(10, F_SETFD, FD_CLOEXEC) = 0
dup2(3, 0) = 0
close(3) = 0
ioctl(0, TCGETS, 0xbf99bfec) = -1 ENOTTY (Inappropriate ioctl for device)
_llseek(0, 0, 0xbf99c068, SEEK_CUR) = -1 ESPIPE (Illegal seek)
read(0, "a", 1) = 1
read(0, "b", 1) = 1
read(0, "c", 1) = 1
read(0, "d", 1) = 1
read(0, "e", 1) = 1
read(0, "f", 1) = 1
read(0, "\n", 1) = 1
dup2(10, 0) = 0
fcntl64(10, F_GETFD) = 0x1 (flags FD_CLOEXEC)
close(10) = 0
Once you have more than one process with this consumption pattern, any single process will see lost characters.
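You can reproduce the effect deliberately; a quick sketch (demo-pipe is just a placeholder name, and the exact mangling depends on scheduling):
mkfifo demo-pipe
# Two competing readers on the same FIFO, each doing 1-byte reads.
while read -r line; do echo "reader1: $line"; done < demo-pipe &
while read -r line; do echo "reader2: $line"; done < demo-pipe &
# Each written byte goes to whichever reader's read(2) happens to win,
# so lines come out split between the two readers.
for i in 1 2 3 4 5; do echo abcdef > demo-pipe; done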
I found a strange and totally unexpected behaviour while working with redirection in bash, and even if I managed to work around it, I'd like to know why it happens.
If I run this command N times: { echo wtf > /dev/stdout ; } >> wtf.txt, I expect to see the file filled with N "wtf" lines. What I found in the file is a single line.
My guess is that since the inner command opens /dev/stdout in truncate mode, the truncation hits the underlying file (wtf.txt), which is completely erased each time; but I'd like someone to explain it better and to say whether this is the correct behaviour or a bug.
Just to be clear, the command I actually used was a different one, but the echo example is simpler to understand. The original command needs an output file as an argument, and since I want the output on stdout I passed /dev/stdout as that argument. The same behaviour can be verified with the command openssl rand -hex 4 -out /dev/stdout >> wtf.txt.
Finally, I managed to fix the problem by delegating the append operation to tee, in the following way: { echo wtf > /dev/stdout ; } | tee -a wtf.txt > /dev/null
You can check what happens using strace:
strace -o wtf-trace.txt -ff bash -c '{ (echo wtf) > /dev/stdout; } >> wtf.txt'
This generates two trace files, wtf-trace.txt.12889 and wtf-trace.txt.12890 in my case. Here is what happens in the first process, the shell doing >> wtf.txt:
open("wtf.txt", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3
dup2(3, 1) = 1
close(3) = 0
clone(child_stack=0, .................) = 12890
wait4(-1, [{WIFEXITED(s) .............) = 12890
exit_group(0) = ?
The first process opens or creates "wtf.txt" for appending and gets FD 3. After that it duplicates FD 3 onto FD 1 and closes FD 3. At this point it forks (clone), waits for the child to exit and then exits itself.
The second process, { echo wtf > /dev/stdout ; }, inherits the file on FD 1 (stdout) and does:
open("/dev/stdout", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
dup2(3, 1) = 1
close(3) = 0
fstat(1, {st_mode=S_IFREG|0664, st_size=0, ...}) = 0
write(1, "wtf\n", 4) = 4
exit_group(0) = ?
As you can see, it opens /dev/stdout (note O_TRUNC) and gets FD 3, dup2()s FD 3 onto FD 1, closes FD 3, fstat()s FD 1 and sees a file with a size of 0 (st_size=0), writes to it and exits.
If you do | cat >> instead, the second process gets its FD 1 connected to a pipe, which is not seekable or truncatable...
NB: I show only the relevant lines of the files strace generated.
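The effect is easy to confirm without strace as well; a small sketch (the file name is just an example):
# Direct append: the inner open of /dev/stdout carries O_TRUNC, so
# wtf.txt is truncated on every run and only one line survives.
rm -f wtf.txt
for i in 1 2 3; do { echo wtf > /dev/stdout ; } >> wtf.txt; done
wc -l wtf.txt    # shows 1

# Through a pipe: /dev/stdout now names a pipe, which cannot be
# truncated, and tee -a performs the append on the real file.
rm -f wtf.txt
for i in 1 2 3; do { echo wtf > /dev/stdout ; } | tee -a wtf.txt > /dev/null; done
wc -l wtf.txt    # shows 3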
I have a cronjob that runs a ruby script that collects data from a command-line utility (ipmitool). Sometimes this utility hangs, causing the whole script to hang, causing the cron jobs to stack up...
The line of code that does this is:
'macaddress' => `timeout 5 ipmitool lan print | grep 'MAC Address'`.split(':',2)[1].strip
In the cron job this still causes the script to hang, but when I manually test the following in a ruby script run from a terminal:
ans = `timeout 1 sleep 20 | grep 'hello'`
the shell command terminates properly.
How can I prevent the cron script from hanging?
Edit: here's the strace of the hang (it is stuck in select):
open("/root/.freeipmi/sdr-cache/sdr-cache-xxxx.localhost", O_RDONLY) = 4
mmap(NULL, 2917, PROT_READ, MAP_PRIVATE, 4, 0) = 0x7f0ea2dfd000
ioctl(3, IPMICTL_SEND_COMMAND, 0x7fff74802020) = 0
select(4, [3], NULL, NULL, {60, 0}
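If the hang survives the TERM signal that timeout sends by default, one option worth trying (a sketch, assuming GNU coreutils timeout; the 2-second grace period is arbitrary) is to add a hard kill and detach stdin:
# Send SIGKILL 2 seconds after the initial SIGTERM if ipmitool is still
# alive, and read stdin from /dev/null so nothing can block on input.
timeout -k 2 5 ipmitool lan print < /dev/null | grep 'MAC Address'
Whether this helps depends on why the process ignores the first signal, so it is worth re-checking with strace after the change.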
My question relates to an answer posted by jbarlow to the following question:
redirect COPY of stdout to log file from within bash script itself
I used the suggested script as listed below. I have to use this approach because I don't have access to full bash (as jbarlow points out), since I'm using a buildroot version of busybox.
#!/bin/sh
if [ "$SELF_LOGGING" != "1" ]
then
    PIPE=tmp.fifo
    mkfifo $PIPE

    # Keep PID of this process
    SELF_LOGGING=1 sh $0 $* >$PIPE &
    PID=$!

    tee logfile <$PIPE &

    # Safe to rm pipe because the child processes have references to it
    rm $PIPE
    wait $PID

    # Return same error code as original process
    exit $?
fi
The issue I'm finding is that it appears that something can freeze up from this script. For example, an strace of a frozen script using the above code looks like:
Process 29750 attached - interrupt to quit
open("/tmp/tmp.fifo", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
write(2, "/usr/bin/runStuff", 24) = 24
write(2, ": ", 2) = 2
write(2, "line ", 5) = 5
write(2, "45", 2) = 2
write(2, ": ", 2) = 2
write(2, "can't open ", 11) = 11
write(2, "/tmp/tmp.fifo", 21) = 21
write(2, ": ", 2) = 2
write(2, "no such file", 12) = 12
write(2, "\n", 1) = 1
stat64("/sbin/tee", 0xbff7c20c) = -1 ENOENT (No such file or directory)
stat64("/usr/sbin/tee", 0xbff7c20c) = -1 ENOENT (No such file or directory)
stat64("/bin/tee", 0xbff7c20c) = -1 ENOENT (No such file or directory)
stat64("/usr/bin/tee", {st_mode=S_IFREG|0755, st_size=18956, ...}) = 0
_exit(1) = ?
Process 29750 detached
What it looks like to me (with limited knowledge in this area) is that tee exits and the parent script doesn't die. Is that correct? If so, shouldn't the lack of a readable file cause the script to end? tee is backgrounded, so obviously it has no control over the parent.
As background, there's another process that repeatedly calls this if it dies. So it's possible that using the same file is causing a lockup situation. Or maybe the rm is happening before the fifo is created?
I've considered using 'read' with a timeout, but there can be situations where nothing is logged for hours at a time.
Can the script be modified so that this doesn't lock up and the script will die if/when one of the ends of the fifo dies?
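One way to shrink the window for this kind of race is to give every invocation its own FIFO and to fail fast when any step goes wrong; a sketch, assuming your busybox build includes mktemp -u and the usual trap builtin:
#!/bin/sh
if [ "$SELF_LOGGING" != "1" ]
then
    # A unique name per invocation, so a restarting supervisor can never
    # race against a previous instance over the same pipe.
    PIPE=$(mktemp -u /tmp/logpipe.XXXXXX)
    mkfifo "$PIPE" || exit 1

    # Remove the pipe on exit instead of immediately, so the reader can
    # never try to open a path that was already unlinked.
    trap 'rm -f "$PIPE"' EXIT

    SELF_LOGGING=1 sh "$0" "$@" > "$PIPE" &
    PID=$!

    tee logfile < "$PIPE" &

    wait $PID
    exit $?
fi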