The flock utility man page gives the following usage example:
(
flock -s 200
# ... commands executed under lock ...
) 200>/var/lock/mylockfile
Assuming 200 is the filehandle of the lockfile, is there a possibility that it fails during some run because the same filehandle is already in use by another process? If so, are there any tricks to make sure locking with flock works reliably?
It doesn't matter in the slightest whether another process is also using file descriptor 200. Think about it; every process on the system is entitled to have file descriptors 0, 1, 2 pointing to somewhere, and they do not all point to the same place. All that matters is that your processes won't get upset about file descriptor 200 being used, and very few processes will notice, much less care.
Given that, there aren't any tricks needed - you simply have to ensure that all the processes that need to use the lock file actually do use it.
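As a quick, illustrative check (the file names under /tmp are made up for this example), you can watch two subshells each use descriptor 200 at the same time for different files without interfering, because every process has its own descriptor table:
( echo first >&200 ) 200>/tmp/flock-demo-a &
( echo second >&200 ) 200>/tmp/flock-demo-b &
wait
cat /tmp/flock-demo-a /tmp/flock-demo-b   # prints "first", then "second"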
I currently have the following script, which runs every second via a cron job:
#!/bin/bash
LOCK_FILE=/src/lock-file.lock
exec 99>"$LOCK_FILE"
flock -xn 99 -c "node /src/cron.js" || exit 1
exec 99<&- #close
The code is based on some tutorial sites I found, linked below:
https://dev.to/rrampage/ensuring-that-a-shell-script-runs-exactly-once-3d3f
https://www.baeldung.com/linux/file-locking
After about a few days the script gets stuck on the lock. If I run lslocks I see the output below, proving it is stuck on the lock even after I stop the cron job. This is a problem because the script then stops running every second.
COMMAND PID TYPE SIZE MODE M START END PATH
(undefined) -1 OFDLCK READ 0 0 0
flock 261 FLOCK WRITE 0 0 0 /root/99
As you can see above, the file is still locked and it never seems to unlock for whatever reason. I read that I can use the -u option to unlock, but I also read that this is not good to do and that closing the file with exec 99<&- is preferred. The way the Node.js script is written, it should be impossible for it to get into an infinite loop.
For that matter, file descriptors are very confusing for me to understand as well. What exactly does 99>"$LOCK_FILE" really do to the file /src/lock-file.lock? I can't seem to find any explanation online; I just hear that you are supposed to do it. I'm aware that 99 is supposed to be some file descriptor that doesn't interfere with more common descriptors like 0 and 1, but it is still very unclear to me what it all means, which might be where my problem is.
What is really weird is that it creates a file in my directory called 99. I thought a file descriptor was supposed to be just an integer, like an open port on a socket, not an actual file that gets created. I'm not sure if this is part of the problem either.
My question is, does anyone see anything wrong with the code above that could cause a 'forever' lock situation?
I've found boilerplate flock(1) code which looks promising. Now I want to understand the components before blindly using it.
It seems these functions are using the third form of flock:
flock [-sxun] [-w timeout] fd
The third form is convenient inside shell scripts, and is usually used in the following manner:
(
flock -s 200
# ... commands executed under lock ...
) 200>/var/lock/mylockfile
The piece I'm lost on (from the sample wrapper functions) is this notation
eval "exec $LOCKFD>\"$LOCKFILE\""
or in shorthand from the flock manpage
200>/var/lock/mylockfile
What does that accomplish?
I notice that subsequent flock commands passed a descriptor other than the one used in the initial redirect cause flock to complain:
flock: 50: Bad file descriptor
It seems like flock is using the file descriptors as a map to know which file to operate on. In order for that to work though, those descriptors would have to still be around and associated with the file, right?
After the redirect is finished, and the lock file is created, isn't the file closed, and file descriptors associated with the open file vaporized? I thought file descriptors were only associated with open files.
What's going on here?
200>/var/lock/mylockfile
This creates a file /var/lock/mylockfile which can be written to via file descriptor 200 inside the sub-shell. The number 200 is an arbitrary one. Picking a high number reduces the chance of any of the commands inside the sub-shell "noticing" the extra file descriptor.
(Typically, file descriptors 0, 1, and 2 are used by stdin, stdout, and stderr, respectively. This number could have been as low as 3.)
flock -s 200
Then flock is used to lock the file via the previously created file descriptor. flock only needs an open descriptor on the file; the > in 200> opens (and, if necessary, creates) the lock file for writing. Note that this happens after the redirection above, so the descriptor already exists by the time flock runs.
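Putting the two pieces together, here is a minimal sketch of the whole pattern (the lock file path and the descriptor number are arbitrary choices for illustration; the man page example uses -s for a shared lock, while -x below takes an exclusive one):
#!/bin/bash
(
    # Block until we hold an exclusive lock on the file behind fd 200.
    flock -x 200

    # Only one copy of this block runs at a time across all invocations
    # that point fd 200 at the same lock file.
    echo "doing exclusive work"
    sleep 5
) 200>/tmp/example.lock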
Is a shell script read into memory in its entirety when it is invoked? I ask because I recently made a change to a KornShell (ksh) script that was executing. A short while after I saved my changes, the executing process failed. Judging from the error message, it looked as though the running process had seen some -- but not all -- of my changes. This strongly suggests that when a shell script is invoked, the entire script is not read into memory.
If this conclusion is correct, it suggests that one should avoid making changes to scripts that are running.
$ uname -a
SunOS blahblah 5.9 Generic_122300-61 sun4u sparc SUNW,Sun-Fire-15000
No. Shell scripts are read and executed line by line, or command by command when commands are separated by semicolons, with the exception of constructs such as if ... fi blocks, which are read in as a chunk:
A shell script is a text file containing shell commands. When such a
file is used as the first non-option argument when invoking Bash, and
neither the -c nor -s option is supplied (see Invoking Bash), Bash
reads and executes commands from the file, then exits. This mode of
operation creates a non-interactive shell.
You can demonstrate that the shell waits for the fi of an if block to execute commands by typing them manually on the command line.
http://www.gnu.org/software/bash/manual/bashref.html#Executing-Commands
http://www.gnu.org/software/bash/manual/bashref.html#Shell-Scripts
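A rough way to see both behaviours without typing interactively (bash assumed; the sleeps only exist to make the timing visible) is to feed commands to bash through a pipe and watch when they execute:
# "one" prints immediately, two seconds before "two" is even written to the
# pipe: the shell executes each command as soon as it has read it.
{ echo 'echo one'; sleep 2; echo 'echo two'; } | bash

# Nothing prints for about two seconds here: the shell has already read the
# body of the if block, but waits for the closing fi before executing any of it.
{ echo 'if true; then'; echo '  echo waited for fi'; sleep 2; echo 'fi'; } | bash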
It's funny that most OSes I know of do NOT read the entire content of a script into memory, but run it from disk. Reading it all in first would allow making changes to the script while it is running. I don't understand why that isn't done, given that:
scripts are usually very small (and don't take up much memory anyway)
at some point, as shown in this thread, people will start making changes to a script that is already running anyway
But, acknowledging this, here's something to think about: if you've decided that a script is not running OK (because you are writing/changing/debugging it), do you care about the rest of that run? You can go ahead and make the changes, save them, and ignore all output and actions of the current run.
But... sometimes, depending on the script in question, a subsequent run of the same script (modified or not) can become a problem, since the current/previous run is behaving abnormally. It will typically skip some steps, or suddenly jump to parts of the script it shouldn't. And THAT may be a problem. It may leave "things" in a bad state, particularly if file manipulation/creation is involved.
So, as a general rule: whether or not the OS supports the feature, it's best to let the current run finish and THEN save the updated script. You can already make your changes, just don't save them yet.
It's not like the old days of DOS, where you had only one screen in front of you (one DOS screen) and had to wait for the run to complete before you could open the file again.
No, they are not, and there are many good reasons for that.
One of the things you should keep in mind is that a shell is not an interpreter, even if there are some similarities. Shells are designed to work with a stream of commands, whether from a TTY, a pipe, a FIFO, or even a socket.
The shell reads from its input source line by line until EOF is returned by the kernel.
Most shells have no extra support for interpreting files; they work with a file the same way they would work with a terminal.
In fact, this is considered a nice feature, because it lets you do interesting things like this: How do Linux binary installers (.bin, .sh) work?
You can take a binary file and prepend a shell script to it. You can't do this with an interpreter, because it parses the whole file, or at least tries to and fails. A shell just interprets it line by line and doesn't care about the garbage at the end of the file. You just have to make sure the script's execution terminates before it reaches the binary part.
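As a hedged sketch of that trick (all of the names here are made up, and a real installer would add error checking), the shell part of such a file finds a marker line, streams everything after it to tar, and exits before the shell ever tries to read the binary payload:
#!/bin/sh
# extract.sh: everything after the __PAYLOAD__ line is an appended
# tar.gz archive, not shell code.
PAYLOAD_START=$(awk '/^__PAYLOAD__$/ { print NR + 1; exit }' "$0")
tail -n +"$PAYLOAD_START" "$0" | tar xzf -
exit 0
__PAYLOAD__
The combined installer would then be built with something like cat extract.sh payload.tar.gz > installer.run.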
I am a Bash beginner but I am trying to learn this tool to have a job in computers one of these days.
I am trying to teach myself about file descriptors now. Let me share some of my experiments:
#!/bin/bash
# Some dummy multi-line content
read -d '' colours <<- 'EOF'
red
green
blue
EOF
# File descriptor 3 produces colours
exec 3< <(echo "$colours")
# File descriptor 4 filters colours
exec 4> >(grep --color=never green)
# File descriptor 5 is an unlimited supply of violet
exec 5< <(yes violet)
echo Reading colours from file descriptor 3...
cat <&3
echo ... done.
echo Reading colours from file descriptor 3 again...
cat <&3
echo ... done.
echo Filtering colours through file descriptor 4...
echo "$colours" >&4
echo ... done. # Race condition?
echo Dipping into some violet...
head <&5
echo ... done.
echo Dipping into some more violet...
head <&5
echo ... done.
Some questions spring to mind as I see the output coming from the above:
fd3 seems to get "depleted" after "consumption", is it also automatically closed after first use?
how is fd3 different from a named pipe? (something I have looked at already)
when exactly does the command yes start executing? upon fd declaration? later?
does yes stop (CTRL-Z or other) and restart when more violet is needed?
how can I get the PID of yes?
can I get a list of "active" fds?
very interesting race condition on filtering through fd4, can it be avoided?
will yes only stop when I exec 5>&-?
does it matter whether I close with >&- or <&-?
I'll stop here, for now.
Thanks!
PS: partial (numbered) answers are fine.. I'll put together the different bits and pieces myself.. (although a comprehensive answer from a single person would be impressive!)
fd3 seems to get "depleted" after "consumption", is it also automatically closed after first use?
No, it is not closed. This is due to the way exec works. In the mode in which you have used exec (with redirections but no command), its function is to arrange the shell's own file descriptors as requested by the I/O redirections given to it, and then leave them that way until the script terminates or they are changed again later.
Later, cat receives a copy of this file descriptor 3 as its standard input (file descriptor 0). cat's standard input is implicitly closed when cat exits (or perhaps, though unlikely, cat closes it before it exits, but that doesn't matter). The original copy, the shell's file descriptor 3, remains open, although the underlying pipe has reached EOF and nothing further will be read from it.
how is fd3 different from a named pipe? (something I have looked at already)
The shell's <(some command) syntax (which is not standard Bourne shell syntax and, I believe, is only available in zsh and bash, by the way) might actually be implemented using named pipes. It probably isn't under Linux, because there's a better way (using /dev/fd), but it probably is on other operating systems.
So in that sense, this syntax may or may not be a helper for setting up named pipes.
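You can see which mechanism your shell picked by looking at what the construct expands to (Linux and bash assumed; on a system without /dev/fd you would see a named pipe path instead):
echo <(true)     # on Linux this typically prints something like /dev/fd/63
ls -l <(true)    # shows that entry is just a link to an anonymous pipe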
when exactly does the command yes start executing? upon fd declaration? later?
As soon as the <(yes violet) construct is evaluated (which happens when the exec 5< <(yes violet) is evaluated).
does yes stop (CTRL-Z or other) and restart when more violet is needed?
No, it does not stop. However, it will block soon enough when it starts producing more output than anything reading the other end of the pipe is consuming. In other words, the pipe buffer will become full.
how can I get the PID of yes?
Good question! $! appears to contain it immediately after yes is started. However, there seems to be an intermediate subshell, so you actually get the PID of that subshell. Try <(exec yes violet) to avoid the intermediate process.
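A small sketch of that suggestion (bash-specific, and how $! behaves for process substitutions can vary between bash versions):
exec 5< <(exec yes violet)
YES_PID=$!    # PID of the process substitution, i.e. of yes itself here
echo "yes is running as PID $YES_PID"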
can I get a list of "active" fds?
Not from the shell. But if you're using an operating system like Linux that has /proc, you can just consult /proc/self/fd.
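For example, from within the running script or an interactive shell on Linux:
ls -l /proc/$$/fd    # the shell's own open descriptors; $$ is the shell's PID
# (using /proc/self/fd in this command would show ls's descriptors instead)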
very interesting race condition on filtering through fd4, can it be avoided?
To avoid it, you presumably want to wait for the grep process to complete before proceeding through the script. If you obtain the process ID of that process (as above), I think you should be able to wait for it.
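A hedged sketch of that idea, reusing the fd 4 setup from the question (note that waiting on a process substitution's PID needs a reasonably recent bash; older versions may complain that the PID is not a child of the shell):
exec 4> >(grep --color=never green)
GREP_PID=$!                    # PID of the process substitution
echo "$colours" >&4
exec 4>&-                      # close fd 4 so grep sees EOF and can finish
wait "$GREP_PID" 2>/dev/null   # only continue once grep has really finished
echo ... done.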
will yes only stop when I exec 5>&-?
Yes. yes will keep trying to produce output forever, but once the other end of the pipe is closed it will either get a write error (EPIPE) or a signal (SIGPIPE), which is fatal by default.
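You can check this directly (assuming $YES_PID was captured as shown for the PID question above; the short sleep just gives yes time to hit the broken pipe):
exec 5<&-                          # close the read end of the pipe
sleep 1
ps -o pid,stat,comm -p "$YES_PID"  # yes should now be gone, or show up as defunct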
does it matter whether I close with >&- or <&-?
No. Both syntaxes are available for consistency's sake.
What is the best way of running process in background and receiving its output only when needed?
Intended usage: have a prompt-generating script with heavy initialization be initialized once per session rather than on every prompt redraw. Note: two-way communication is needed: the shell needs to tell the helper when a new prompt is needed and what the last command's status was.
Known solutions:
explicitly created files on the filesystem (FIFO files, UNIX sockets): it would be better to avoid this, as it means I need to choose a file name, make sure it is garbage-collected on exit, and add something to clean up no-longer-used files in case of a crash.
zsh's zpty module: it is a bit of overkill for this job and does not work in bash.
coprocesses: they do not work in bash, and AFAIK only one coprocess per session is allowed.
Bash has supported coprocesses since 4.0, but support for multiple coprocesses is still experimental.
I would have gone with some explicitly created files, naming them ~/.myThing-$HOSTNAME/fifo if they're per user and host. You can use flock to relatively easily determine if the command is still running and optionally start it:
(
    # Give up immediately if another instance already holds the lock on fd 123.
    flock -n 123 || exit 1
    # (Re)create the FIFO(s) here with rm and mkfifo (paths elided in the original).
    rm/mkfifo ..
    # Run the server with the FIFO(s) wired to its stdin/stdout.
    exec yourServer < .. > ..
) 123> ~/".myThing-$HOSTNAME/lockfile"
If the command or server dies, the lock is automatically released and you only have a few zero length files lying around. The next time the server starts, it deletes and sets them up again.
Querying the server would be similar, but exiting if the lock is not in use (and optionally using a wait lock to avoid contention).
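A sketch of that querying side, under the same made-up paths as above: try to take the lock non-blockingly, and if we get it, nobody is serving, so give up (or start the server) instead of talking to the FIFOs:
(
    if flock -n 123; then
        # We got the lock, so no server is holding it: nothing to query.
        echo "server not running" >&2
        exit 1
    fi
    # The lock is held by the running server: talk to it over the FIFOs here.
) 123> ~/".myThing-$HOSTNAME/lockfile"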