What does 200>"$somefile" accomplish? [duplicate] - bash

This question already has an answer here:
How does this canonical flock example work?
(1 answer)
Closed 8 years ago.
I've found boilerplate flock(1) code which looks promising. Now I want to understand the components before blindly using it.
Seems like these functions are using the third form of flock
flock [-sxun] [-w timeout] fd
The third form is convenient inside shell scripts, and is usually used in
the following manner:
(
    flock -s 200
    # ... commands executed under lock ...
) 200>/var/lock/mylockfile
The piece I'm lost on (from the sample wrapper functions) is this notation
eval "exec $LOCKFD>\"$LOCKFILE\""
or in shorthand from the flock manpage
200>/var/lock/mylockfile
What does that accomplish?
I notice that subsequent flock commands passed a value other than the one in the initial redirect cause flock to complain:
flock: 50: Bad file descriptor
It seems like flock is using the file descriptors as a map to know which file to operate on. In order for that to work though, those descriptors would have to still be around and associated with the file, right?
After the redirect is finished, and the lock file is created, isn't the file closed, and file descriptors associated with the open file vaporized? I thought file descriptors were only associated with open files.
What's going on here?

200>/var/lock/mylockfile
This creates a file /var/lock/mylockfile which can be written to via file descriptor 200 inside the sub-shell. The number 200 is an arbitrary one. Picking a high number reduces the chance of any of the commands inside the sub-shell "noticing" the extra file descriptor.
(Typically, file descriptors 0, 1, and 2 are used by stdin, stdout, and stderr, respectively. This number could have been as low as 3.)
flock -s 200
Then flock is used to lock the file via the previously created file descriptor. flock only needs an open descriptor; the > in 200> provided one opened for writing. Note that this happens after the redirection above.
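As a concrete illustration, here is a minimal runnable version of the manpage pattern (the /tmp path is just for the example; it assumes flock(1) from util-linux is available):

```shell
#!/bin/bash
# Minimal sketch of the manpage pattern. The redirection opens fd 200
# before the subshell body runs, so flock can lock through it.
lockfile=/tmp/mylockfile.$$

result=$(
    (
        flock -s 200 || exit 1   # shared lock on fd 200
        echo "locked"            # ...commands executed under lock...
    ) 200>"$lockfile"
)

echo "$result"
rm -f "$lockfile"
```

Once the subshell exits, the shell closes fd 200 and the lock is released automatically.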


How to temporarily override a named file descriptor in bash?

I have a quite unusual problem in one of my bash scripts. I want to do something (in fact I want to create/remove LVs; this is an MWE) like this:
#! /bin/bash
# ...
exec {flock}>/tmp/lock
# Do something with fd ${flock} e.g.
flock -n ${flock} || exit 1
# ...
lvs ${flock}>&-
# ...
The problem is the ${flock}>&-. Why do I want this? The LVM tools complain with a warning about any open file descriptors other than stdin, stdout and stderr. So when I drop this small redirection, the script works but prints a warning message.
Thus I wanted the fd $flock to be closed only for the LVM command. I do not want to close the file, only have it closed for this single command invocation.
In my case $flock is set to 10 (the first free fd greater than or equal to 10, see man bash). However, I do not get the corresponding fd remapped as sketched above. Instead, the expanded 10 is treated as an argument to the lvs command, and it is stdout that gets redirected. Of course this is not what I intend.
If I hardcode 10>&- this works but is very bad style. For now I have switched to hardcoding the fd throughout the file. Nevertheless I would like to know how it would be done correctly.
Don't use the dollar
Your example code has:
lvs ${flock}>&-
The correct syntax is:
lvs {flock}>&-
From the redirection section of the Bash manual:
Each redirection that may be preceded by a file descriptor number may instead be preceded by a word of the form {varname}. In this case, for each redirection operator except >&- and <&-, the shell will allocate a file descriptor greater than or equal to 10 and assign it to {varname}. If >&- or <&- is preceded by {varname}, the value of varname defines the file descriptor to close. If {varname} is supplied, the redirection persists beyond the scope of the command, allowing the shell programmer to manage the file descriptor himself.
(emphasis mine)
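A minimal sketch of the corrected syntax (assumes bash 4.1 or newer for {varname} redirections; the /dev/null target and the subshell standing in for lvs are just for illustration):

```shell
#!/bin/bash
# Sketch: {myfd}>&-, with no dollar sign, closes the fd named by $myfd
# for the command it is attached to.
exec {myfd}>/dev/null            # bash picks an fd >= 10, stores it in myfd
echo "allocated fd $myfd"

# The subshell stands in for lvs; with {myfd}>&- it cannot see the fd,
# so writing to it fails:
( echo hi >&"$myfd" ) {myfd}>&- 2>/dev/null \
    && echo "fd visible" \
    || echo "fd closed for the command"
```

With the dollar sign, ${flock} expands before redirection parsing, so the number becomes an ordinary argument word, which is exactly the failure described in the question.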

Why use "<&3", "3<&-" and "3</file" in a bash while loop? And what does it actually do?

I'm going through a bash script, trying to figure out how it works and potentially patching it. The script in question is this cryptroot script from debian responsible for decrypting block devices at boot. Not being completely at home in bash definitely makes it a challenge.
I found this piece of code and I'm not sure what it does.
if [ -r /conf/conf.d/cryptroot ]; then
    while read mapping <&3; do
        setup_mapping "$mapping" 3<&-
    done 3< /conf/conf.d/cryptroot
fi
My guess is that it reads each line in /conf/conf.d/cryptroot and passes it to setup_mapping. But I don't quite understand how, what the significance of <&3, 3<&- and 3< /conf/conf.d/cryptroot is, and what they do.
When I read lines from a file I usually do something like this:
while read LINE
do COMMAND
done < FILE
Where the output of FILE is directed to read in the while loop and doing COMMAND until the last line.
I also know a little bit about redirection, as in, I sometimes use it to redirect STDOUT and STDERR to stuff like /dev/null for example. But I'm not sure what the redirection to 3 means.
After reading a little bit more about I/O redirection I've something close to an answer, according to tldp.org.
The file descriptors for stdin, stdout, and stderr are 0, 1, and 2,
respectively. For opening additional files, there remain descriptors 3
to 9.
So the 3 is "just" a reference to an open file or:
...simply a number that the operating system assigns to an open file
to keep track of it. Consider it a simplified type of file pointer.
So from what I understand:
The 3< /conf/conf.d/cryptroot opens /conf/conf.d/cryptroot for reading and assigns it to file descriptor 3.
The read mapping <&3 seems to be reading the first line from file descriptor 3, which points to the open file /conf/conf.d/cryptroot.
The setup_mapping "$mapping" 3<&- seems to close file descriptor 3 for setup_mapping. Does this mean it is opened again on every iteration of the loop, pointing at the next line?
If the above is correct my question is why do this rather than the "normal" way? e.g.
while read mapping; do
    setup_mapping "$mapping"
done < /conf/conf.d/cryptroot
What advantage (if any) does the first version provide?
A common problem with
while read LINE
do COMMAND
done < FILE
is that people forget that COMMAND also reads from FILE, potentially consuming data intended for the read in the control of the while loop. To avoid this, a common idiom is to have read use a different file descriptor, which is what <&3 accomplishes. But doing that leaves file descriptor 3 open for COMMAND. That may not be an issue, but it's reasonable to explicitly close it with 3<&-. In short, the construction you're seeing is just a way to avoid having setup_mapping inadvertently read data intended for read.
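A small demonstration of the problem and the fix (the /tmp path is illustrative, and cat plays the role of a setup_mapping that happens to read its stdin):

```shell
#!/bin/bash
# Sketch of the problem the fd-3 idiom avoids. `cat >/dev/null` stands in
# for a loop body (like setup_mapping) that happens to read its stdin.
exec </dev/null                     # keep the demo non-interactive
printf 'a\nb\nc\n' > /tmp/cryptdemo.$$

naive=0
while read -r mapping; do
    cat >/dev/null                  # drains the rest of the file!
    naive=$((naive + 1))
done < /tmp/cryptdemo.$$

fixed=0
while read -r mapping <&3; do
    cat >/dev/null                  # reads the script's stdin, not fd 3
    fixed=$((fixed + 1))
done 3< /tmp/cryptdemo.$$

echo "naive=$naive fixed=$fixed"    # naive=1 fixed=3
rm -f /tmp/cryptdemo.$$
```

In the naive version the body eats lines b and c, so the loop runs once; with the input on fd 3, all three lines reach read.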

Execution order of subshells?

While trying to solve other problems, I have come across the following bash script in Alex B's answer in this question:
#!/bin/bash
(
    # Wait for lock on /var/lock/.myscript.exclusivelock (fd 200) for 10 seconds
    flock -x -w 10 200 || exit 1
    # Do stuff
) 200>/var/lock/.myscript.exclusivelock
I have problems understanding that script. According to flock's manual, the file descriptor (the 200) in flock -x -w 10 200 must relate to an open file.
Where is that descriptor / file opened? If it is the 200>/var/lock/.myscript.exclusivelock which opens the descriptor, that would mean that this part is executed before the subshell, which is the opposite of what I thought when I initially looked at this script.
This leads me to my question: What is the execution order of subshells in bash, in relation to the main script (i.e. the script opening the subshells) as well as in relation to other subshells which the same main script might spawn?
From reading other articles and the bash manual, I believe I have only learned that subshells are executed "concurrently", but I didn't see any statement explaining whether there are exceptions to this (one obvious exception would be when the main script needs the output of a subshell, like echo foo $(cat bar)).
200>, the redirection operator, opens the file using descriptor 200. It is indeed processed before the subshell. That file descriptor is then inherited by the subshell.
There is nothing inherently concurrent about subshells. You may be thinking of pipelines, like a | b | c, where a, b, and c are all commands that run concurrently. The fact that each is run in a subshell (usually a subprocess proper, if they are external commands, but even shell built-ins execute in a subshell) is an implementation detail of the pipeline.
To elaborate,
First, the shell parses this command. It identifies the complex command (...) with an output redirection.
It opens /var/lock/.myscript.exclusivelock in write mode on file descriptor 200.
It executes the subshell, which inherits all open file descriptors, including 200.
In the subshell, it executes flock, which inherits all open file descriptors from its parent, the subshell. It does its thing on file descriptor 200, as requested by its argument.
Once the subshell exits, any file opened by one of its redirection operators is closed by the shell.
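The ordering in steps 2 and 3 can be observed directly (the /tmp path is illustrative):

```shell
#!/bin/bash
# Sketch: the redirection is processed before the subshell body runs,
# so the file already exists when the first command inside it executes.
f=/tmp/orderdemo.$$
rm -f "$f"

order=$( ( [ -e "$f" ] && echo "redirect ran first" ) 200>"$f" )
echo "$order"

rm -f "$f"
```

If the subshell body ran first, the existence test would fail; it doesn't, because the shell opens the file on fd 200 before spawning the subshell.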

Lock a file in bash using flock and lockfile

I spent the better part of the day looking for a solution to this problem and I think I am nearing the brink... What I need to do in bash is: write one script that will periodically read your inputs and write them into a file, and a second script that will periodically print out the complete file, BUT only when something new gets written to it, meaning it will never print the same output twice in a row. The two scripts need to communicate by means of a lock: script 1 will lock the file so that script 2 can't print anything from it, then script 1 will write something new into the file and unlock it (and then script 2 can print the updated file).
The only hints we got were to use flock and lockfile. We didn't get any hints on how to use them, except that the problem MUST be solved with flock or lockfile.
Edit: when I said I was looking for a solution, I meant I tried every single combination of flock with those flags and I just couldn't get it to work.
I will write pseudocode of what I want to do. A thing to note here is that this pseudocode is basically the same as how it is done in C... it's so simple, I don't know why everything has to be so complicated in bash.
script 1:
place a lock on file text.txt ( no one else can read it or write to it)
read input
place that input into file ( not deleting previous text )
remove lock on file text.txt
repeat
script 2:
print out complete text.txt ( but only if it is not locked, if it is locked obviously you cant)
repeat
And since script 2 repeats all the time, it should print the complete text.txt ONLY when something new was written to it.
I have about 100 other commands like flock that I have to learn in a very short time, and I spent a whole day on just this one. It would be kind of you to at least give me a hint. As for the man page...
I tried to do something like flock -x text.txt -c read > text.txt, and every other combination as well, but nothing works. It takes only one command and won't accept arguments. I don't even know why there is an option for a command. I just want it to place a lock on a file, write into it, and then unlock it. In C it only takes flock("text.txt", ..).
Let's look at what this does:
flock -x text.txt -c read > text.txt
First, it opens text.txt for writing (and truncates all of its contents) -- before doing anything else, including calling flock!
Second, it tells flock to get an exclusive lock on the file and run the command read.
However, read is a shell builtin, not an external command -- so it can't be called by a non-shell process at all, mooting any effect that it might otherwise have had.
Now, let's try using flock the way the man page suggests using it:
{
    flock -x 3                       # grab a lock on file descriptor #3
    printf "Input to add to file: "  # prompt user
    read -r new_input                # read input from user
    printf '%s\n' "$new_input" >&3   # write new content to the FD
} 3>>text.txt                        # do all this with FD 3 open to text.txt
...and, on the read end:
{
    flock -s 3    # wait for a read lock
    cat <&3       # read contents of the file from FD 3
} 3<text.txt      # all of this with text.txt open to FD 3
You'll notice some differences from what you were trying before:
The file descriptor used to grab the lock is in append mode (when writing to the end), or in read mode (when reading), so you aren't overwriting the file before you even grab the lock.
We're running the read command (which, again, is a shell builtin, and so can only be run directly by the shell) in the shell itself, rather than telling the flock command to invoke it via the execve syscall (which is, again, impossible).
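Put together, the two snippets can be run as one sketch (single process here, so the locks never actually contend; assumes flock(1) is installed, and the /tmp path is illustrative):

```shell
#!/bin/bash
# Runnable sketch combining the writer and the reader above.
f=/tmp/flockdemo.$$

{
    flock -x 3                  # exclusive lock while appending
    printf '%s\n' "hello" >&3   # write through the locked fd
} 3>>"$f"

got=$(
    {
        flock -s 3              # shared lock while reading
        cat <&3
    } 3<"$f"
)

echo "$got"
rm -f "$f"
```

Run the writer block in one script and the reader block in another, against the same file, to get the two cooperating scripts the assignment asks for.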

Bash: file descriptors

I am a Bash beginner but I am trying to learn this tool to have a job in computers one of these days.
I am trying to teach myself about file descriptors now. Let me share some of my experiments:
#!/bin/bash
# Some dummy multi-line content
read -d '' colours <<- 'EOF'
red
green
blue
EOF
# File descriptor 3 produces colours
exec 3< <(echo "$colours")
# File descriptor 4 filters colours
exec 4> >(grep --color=never green)
# File descriptor 5 is an unlimited supply of violet
exec 5< <(yes violet)
echo Reading colours from file descriptor 3...
cat <&3
echo ... done.
echo Reading colours from file descriptor 3 again...
cat <&3
echo ... done.
echo Filtering colours through file descriptor 4...
echo "$colours" >&4
echo ... done. # Race condition?
echo Dipping into some violet...
head <&5
echo ... done.
echo Dipping into some more violet...
head <&5
echo ... done.
Some questions spring to mind as I see the output coming from the above:
fd3 seems to get "depleted" after "consumption", is it also automatically closed after first use?
how is fd3 different from a named pipe? (something I have looked at already)
when exactly does the command yes start executing? upon fd declaration? later?
does yes stop (CTRL-Z or other) and restart when more violet is needed?
how can I get the PID of yes?
can I get a list of "active" fds?
very interesting race condition on filtering through fd4, can it be avoided?
will yes only stop when I exec 5>&-?
does it matter whether I close with >&- or <&-?
I'll stop here, for now.
Thanks!
PS: partial (numbered) answers are fine.. I'll put together the different bits and pieces myself.. (although a comprehensive answer from a single person would be impressive!)
fd3 seems to get "depleted" after "consumption", is it also automatically closed after first use?
No, it is not closed. This is due to the way exec works. In the mode in which you have used exec (without a command name), its function is to arrange the shell's own file descriptors as requested by the I/O redirections given to it, and then leave them that way until the script terminates or they are changed again later.
Later, cat receives a copy of this file descriptor 3 on its standard input (file descriptor 0). cat's standard input is implicitly closed when cat exits (or perhaps, though unlikely, cat closes it before it exits, but that doesn't matter). The original, the shell's file descriptor 3, remains open, although the underlying file has reached EOF and nothing further will be read from it.
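This can be observed directly (a sketch; the colour data stands in for the script's fd 3):

```shell
#!/bin/bash
# Sketch: the shell's fd 3 survives the first cat; a second cat just
# sees EOF and prints nothing, proving the fd was never closed.
exec 3< <(printf 'red\ngreen\n')

first=$(cat <&3)    # drains the pipe
second=$(cat <&3)   # fd 3 is still open, but already at EOF

echo "first=[$first] second=[$second]"
exec 3<&-           # close it explicitly when done
```

The second cat succeeds (no "bad file descriptor" error); it simply reads nothing, which is the "depleted but not closed" behaviour the question describes.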
how is fd3 different from a named pipe? (something I have looked at already)
The shell's <(some command) syntax (which is not standard Bourne shell syntax and, I believe, is only available in zsh and bash, by the way) might actually be implemented using named pipes. It probably isn't under Linux, because there's a better way (using /dev/fd), but it probably is on other operating systems.
So in that sense, this syntax may or may not be a helper for setting up named pipes.
when exactly does the command yes start executing? upon fd declaration? later?
As soon as the <(yes violet) construct is evaluated (which happens when the exec 5< <(yes violet) is evaluated).
does yes stop (CTRL-Z or other) and restart when more violet is needed?
No, it does not stop. However, it will block soon enough when it starts producing more output than anything reading the other end of the pipe is consuming. In other words, the pipe buffer will become full.
how can I get the PID of yes?
Good question! $! appears to contain it immediately after yes is executed. However there seems to be an intermediate subshell and you actually get the pid of that subshell. Try <(exec yes violet) to avoid the intermediate process.
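A sketch of that suggestion (it relies on bash setting $! after a process substitution, as described above):

```shell
#!/bin/bash
# Sketch: $! right after a process substitution holds the pid of the
# substituted process; `exec` inside avoids an intermediate shell.
exec 5< <(exec yes violet)
yespid=$!

read -r line <&5            # take one line of violet
echo "$line from pid $yespid"

kill "$yespid" 2>/dev/null  # stop yes explicitly rather than via SIGPIPE
exec 5<&-
```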
can I get a list of "active" fds?
Not from the shell. But if you're using an operating system like Linux that has /proc, you can just consult /proc/self/fd.
very interesting race condition on filtering through fd4, can it be avoided?
To avoid it, you presumably want to wait for the grep process to complete before proceeding through the script. If you obtain the process ID of that process (as above), I think you should be able to wait for it.
will yes only stop when I exec 5>&-?
Yes. What will happen then is that yes will continue to try to produce output forever but when the other end of the file descriptor is closed it will either get a write error (EPIPE), or a signal (SIGPIPE) which is fatal by default.
does it matter whether I close with >&- or <&-?
No. Both syntaxes are available for consistency's sake.
