Create file, but fail if it exists, with bash [duplicate] - bash

In the open() system call, opening with O_CREAT | O_EXCL ensures that the file will only be created if it does not exist; the atomicity is guaranteed by the system call. Is there a similar way to create a file in an atomic fashion from a bash script?
UPDATE:
I found two different atomic ways:
Use set -o noclobber. Then you can use the > operator atomically.
Just use mkdir. mkdir is atomic.

A 100% pure bash solution:
set -o noclobber
{ > file ; } &> /dev/null
This command creates a file named file if no file with that name exists. If file already exists, it does nothing (but returns a non-zero exit code).
Pros of > over the touch command:
Doesn't update the timestamp if the file already existed
100% bash builtin
Return code as expected: failure if the file already existed or couldn't be created; success if the file didn't exist and was created
Cons:
You need to set the noclobber option (but that's okay in a script, as long as you're careful with redirections, or you can unset it afterwards).
I guess this solution is really the bash counterpart of the open system call with O_CREAT | O_EXCL.
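For example, a minimal sketch of checking the result (the file name file is just illustrative):
#!/bin/bash
set -o noclobber
if { > file ; } 2> /dev/null; then
    echo "file created"
else
    echo "file already exists (or could not be created)" >&2
fi
set +o noclobber   # optional: turn noclobber back off afterwards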

Here's a bash function using the mv -n trick:
function mkatomic() {
    local f
    f="$(mktemp)"          # create a uniquely named temporary file
    mv -n "$f" "$1"        # -n: never overwrite an existing target
    if [ -e "$f" ]; then   # temp file is still there, so mv refused to move it
        rm "$f"
        echo "ERROR: file exists:" "$1" >&2
        return 1
    fi
}
Examples:
$ mkatomic foo
$ wc -c foo
0 foo
$ mkatomic foo
ERROR: file exists: foo

You could create the file under a randomly generated name, then rename it into place with the desired name (mv -n random desired). With -n the move is refused and the randomly named file is left in place if the target already exists.
Like this:
#!/bin/bash
touch randomFileName
mv -n randomFileName lockFile
if [ -e randomFileName ] ; then
    echo "Failed to acquire lock"
else
    echo "Acquired lock"
fi

Just to be clear, ensuring the file will only be created if it doesn't exist is not the same thing as atomicity. The operation is atomic if and only if, when two or more separate threads attempt to do the same thing at the same time, exactly one will succeed and all others will fail.
The best way I know of to create a file atomically in a shell script follows this pattern (and it's not perfect):
create a file that has an extremely high chance of not existing (using a decent random number selection or something in the file name), and place some unique content in it (something that no other thread would have - again, a random number or something)
verify that the file exists and contains the contents you expect it to
create a hard link from that file to the desired file
verify that the desired file contains the expected contents
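A minimal sketch of that pattern, assuming the temporary file is created next to the desired file so the hard link stays on one filesystem (the names and the token format are illustrative):
#!/bin/bash
target="desired_file"
token="$$-$RANDOM"                          # unique content for this attempt
tmp="$(mktemp "$target.XXXXXX")" || exit 1  # random name next to the target
printf '%s\n' "$token" > "$tmp"

# ln refuses to create the link if the target already exists; re-checking the
# contents afterwards covers the case where another process won the race.
if ln "$tmp" "$target" 2>/dev/null && [ "$(cat "$target")" = "$token" ]; then
    echo "created $target"
else
    echo "ERROR: $target already exists" >&2
fi
rm -f "$tmp"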
In particular, touch is not atomic, since it will create the file if it's not there, or simply update the timestamp. You might be able to play games with different timestamps, but reading and parsing a timestamp to see if you "won" the race is harder than the above. mkdir can be atomic, but you would have to check the return code, because otherwise, you can only tell that "yes, the directory was created, but I don't know which thread won". If you're on a file system that doesn't support hard links, you might have to settle for a less ideal solution.

Another way to do this is to use umask to create the file without write permissions while still opening it for writing, like this:
LOCK_FILE=only_one_at_a_time_please
UMASK=$(umask)
umask 777                       # new files are created with mode 000
echo "$$" > "$LOCK_FILE"        # an existing lock file (mode 000) cannot be opened for writing
umask "$UMASK"
trap "rm '$LOCK_FILE'" EXIT     # remove the lock file when the script exits
If the file is missing, the script will succeed at creating and opening it for writing, despite the file being created without write permissions. If it already exists, the script won't be able to open the file for writing. It would be possible to use exec to open the file and keep the file descriptor around.
rm requires write permission on the directory itself, regardless of the file's own permissions.
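The same idea with an explicit failure check, as a sketch (the assumption being that a failed redirection means another instance already holds the lock):
#!/bin/bash
LOCK_FILE=only_one_at_a_time_please
UMASK=$(umask)
umask 777
if ! echo "$$" > "$LOCK_FILE" 2>/dev/null; then
    umask "$UMASK"
    echo "ERROR: $LOCK_FILE already exists, another instance is running" >&2
    exit 1
fi
umask "$UMASK"
trap 'rm -f "$LOCK_FILE"' EXIT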

touch is the command you are looking for. It updates timestamps of the provided file if the file exists or creates it if it doesn't.

Related

run a program avoiding overwriting output

I have 1000 inputs for a program whose output I have no control over.
I can run the program over each input like below. The program takes an input file (input1, input2, input3, ...) and saves several outputs, but each run overwrites the outputs of the previous one.
for i in {1..3}; do
    myprogram input"$i"
done
I thought I would create 3 folders, put the input files there, and run the program so that it might write the output there, but that was still not successful.
for i in {1..3}; do
    myprogram "$i"/input"$i"
done
Basically I want to run the program so that the output of each run is saved in its own folder before moving on to the next one.
Is there any way to cope with this?
Thanks
If it is overwriting the input file as indicated in your comment, you can save the original input file by copying and renaming/moving it before calling the program. Then if you really want them in a subdirectory, make a directory and move the input and/or output file(s).
for i in {1..3}
do
    cp "infile$i" "outfile$i"
    ./myprogram "outfile$i"
    mkdir "programRun-$i"
    mv "infile$i" "outfile$i" "programRun-$i"
done
If it is leaving the input file alone and just writes to a consistent output file name, then do something like this:
for i in {1..3}
do
    ./myprogram "infile$i"
    mkdir "programRun-$i"
    mv outfile "programRun-$i/outfile-$i"
done
Note that in either case, I'd consider using a different variable than $i to identify each run of the program - perhaps a time/date in YYYYMMDDHHMMSS form, or just a Unix timestamp. Just for organization purposes, that way all output files from a given run are kept together... but whatever fits your needs.
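A sketch of that idea, assuming the second variant above where myprogram writes a file named outfile (the run identifier combines the timestamp with the input number so runs within the same second don't collide):
for i in {1..3}
do
    run="$(date +"%Y%m%d%H%M%S")-$i"     # run identifier: timestamp plus input number
    ./myprogram "infile$i"
    mkdir "programRun-$run"
    mv outfile "programRun-$run/outfile-$run"
done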
If myprogram always creates the same file names, then you could move them off before executing the next loop iteration. In this example the output is assumed to be files called out*.txt.
for i in {1..3}; do ./myprogram input"$i"; mkdir output"$i"; mv out*.txt output"$i"/; done
If the file names created differ, you could create new directories and cd into them prior to executing the application.
for i in {1..3}; do mkdir output"$i"; cd output"$i"; ../myprogram ../input"$i"; cd ..; done

mktemp with extension without specifying file path

Prefacing this with that I found identical questions but none of them have answers that are working for me.
I need to make a temporary .json file (it needs to be json because I'll be working with jq later in the script).
I thought based on the answers to this question that it would be the following, but they're creating files named .json and XXXXXXXX.json respectively.
STACKS=$(mktemp .json)
STACKS=$(mktemp XXXXXXXX.json)
This will need to run on both macOS and a Linux box. I can't specify a path for the file because it will be run both locally and by Jenkins, whose file structures differ. What's the proper syntax?
If you are using OpenBSD mktemp you can do
STACKS="$(mktemp XXXXXX).json"
and then write a trap so the temp files are removed when the script finishes:
function cleanup {
    if [ -f "$STACKS" ] && [[ "$STACKS" =~ ".json"$ ]]; then
        rm -f "$STACKS"
    fi
}
trap cleanup EXIT
So when the script finishes (no matter how), it will try to remove $STACKS if it is a regular file and if its name ends with .json (for extra safety).
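Putting it together, a small usage sketch (the jq invocation and the JSON content are illustrative):
#!/bin/bash
STACKS="$(mktemp XXXXXX).json"

function cleanup {
    if [ -f "$STACKS" ] && [[ "$STACKS" =~ ".json"$ ]]; then
        rm -f "$STACKS"
    fi
}
trap cleanup EXIT

echo '{"name": "demo"}' > "$STACKS"   # the .json file is created on first write
jq '.name' "$STACKS"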

bash is zipping entire home

I am trying to back up all world* folders from /home/mc/server/ and drop the zipped file in /home/mc/backup/
#!/bin/bash
moment=$(date +"%Y%m%d%H%M")
backup="/home/mc/backup/map$moment.zip"
map="/home/mc/server/world*"
zipping="zip -r -9 $backup $map"
eval $zipping
The zipped file is created in the backup folder as expected, but when I unzip it, it contains the entire /home dir. I am running this bash script in two ways:
Manually
Using user's crontab
Finally, if I echo $zipping, it correctly prints the command that I need to trigger. What am I missing? Thank you in advance.
There's no reason to use eval here (and no, justifying it on DRY grounds if you want to both log a command line and subsequently execute it does not count as a good reason IMO.)
Define a function and call it with the appropriate arguments:
#!/bin/bash
moment=$(date +"%Y%m%d%H%M")
zipping () {
    output=$1
    shift
    zip -r -9 "$output" "$@"
}
zipping "/home/mc/backup/map$moment.zip" /home/mc/server/world*
(I'll admit, I don't know what is causing the behavior you report, but it would be better to confirm it is not somehow specific to the use of eval before trying to diagnose it further.)

Locking Files in Bash

I have a problem finding a good concept for locking files in bash.
Basically I want to achieve the following:
Lock File
Read in the data in the file (multiple times)
Do stuff with the data.
Write new stuff to the file (not necessarily to the end)
Unlock that file
Doing this with flock does not seem possible to me, because the file descriptor will just move once to the end of the file.
Creating a temp file also fails, because I might overwrite lines that were already read, which is also not possible.
Edit:
Also note that other scripts I do not control might try to write to that file.
So my question is how can I create a lock in step 1 so it will span over steps 2,3,4 till I unlock it again in step 5?
You can do this with the flock utility. You just need to get flock to use a separate read-only file descriptor, i.e. open the file twice. E.g. to sort a file using an intermediate temporary file:
(
    flock -x -w 10 100 || exit 1   # take an exclusive lock on fd 100, wait up to 10 seconds
    tmp=$(mktemp)
    sort <"$file" >"$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
) 100<"$file"
flock will issue the flock() system call for your file and block if it is already locked. If the timeout is exceeded then the script will just abort with an error code.
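A sketch of holding the lock across steps 1-5 in the same shell using exec (the file name and the descriptor number 200 are illustrative):
#!/bin/bash
file="data.txt"

exec 200<"$file"                 # open a separate read-only descriptor just for locking
flock -x -w 10 200 || exit 1     # step 1: lock the file

mapfile -t lines <"$file"        # step 2: read the data (can be repeated)
# step 3: do stuff with "${lines[@]}" ...
printf '%s\n' "${lines[@]}" "new line" >"$file"   # step 4: write new content

exec 200<&-                      # step 5: closing the descriptor releases the lock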

Bash script — determine if file modified?

I have a Bash script that repeatedly copies files every 5 seconds. But this is a bit of overkill, as usually there is no change.
I know about the Linux command watch, but as this script will be used on OS X computers (which don't have watch, and I don't want to make everyone install MacPorts) I need to be able to check whether a file has been modified or not with straight Bash code.
Should I be checking the file modified time? How can I do that?
Edit: I was hoping to expand my script to do more than just copy the file, if it detected a change. So is there a pure-bash way to do this?
I tend to agree with the rsync answer if you have big trees of files to manage, but you can use the -u (--update) flag to cp to copy the file(s) over only if the source is newer than the destination.
cp -u
Edit
Since you've updated the question to indicate that you'd like to take some additional actions, you'll want to use the -nt check in the [ (test) builtin command:
#!/bin/bash
if [ "$1" -nt "$2" ]; then
    echo "File 1 is newer than file 2"
else
    echo "File 1 is older than file 2"
fi
From the man page:
file1 -nt file2
    True if file1 is newer (according to modification date) than file2, or if file1 exists and file2 does not.
Hope that helps.
OS X has the stat command. Something like this should give you the modification time of a file:
stat -f '%m' filename
The GNU equivalent would be:
stat --printf '%Y\n' filename
You might find it more reliable to detect changes in the file content by comparing the file size (if the sizes differ, the content does) and the hash of the contents. It probably doesn't matter much which hash you use for this purpose: SHA1 or even MD5 is probably adequate, and you might find that the cksum command is sufficient.
File modification times can change without the content changing (think touch file); file modification times can also stay the same even when the content does change (doing this is harder, but you could use touch -r ref-file file to set the modification time of file to match ref-file after editing the file).
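A sketch of that approach using cksum (the file names and the 5-second interval are illustrative assumptions):
#!/bin/bash
src="watched.txt"
dst="backup/watched.txt"
last=""
while sleep 5; do
    sum="$(cksum "$src" 2>/dev/null)"   # CRC and size of the current contents
    if [ "$sum" != "$last" ]; then
        cp "$src" "$dst"                # or any other action to take on change
        last="$sum"
    fi
done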
No. You should be using rsync or one of its frontends to copy the files, since it will detect if the files are different and only copy them if they are.
