Create a backup of a file in bash - bash

I want to write into a file in a bash script but I want to make sure that the file is backed up if it exists and I also want to avoid overwriting any existing backups.
So basically I have $FILE, if this exists, I want to move $FILE to $FILE.bak if it does not already exist, otherwise to $FILE.bak2, $FILE.bak3, etc.
Is there a shell command for this?

Using a function to find the next available name:
#!/usr/bin/env bash
function nextsuffix {
local name="$1.bak"
if [ -e "$name" ]; then
printf "%s" "$name"
else
local -i num=2
while [ -e "$name$num" ]; do
num+=1
done
printf "%s%d" "$name" "$num"
fi
}
mv "$1" "$(nextsuffix "$1")"
If foo.bak already exists, it just loops until a given foo.bakN filename doesn't exist, incrementing N each time.

You can just output to a file with a date.
FILE=~/test
echo "123" >> $FILE.$(date +'%Y%d%m')
If you want the numbers logrotate seems to be most ideal.

cp "$FILE" "$FILE.bak$(( $(grep -Eo '[[:digit:]]+' <(sort -n <(for fil in $FILE.bak*;do echo $fil;done) | tail -1 )) + 1 ))"
Breaking the commands down
sort -n <(for fil in $FILE.bak*;do echo $fil;done) | tail -1
List the last file in the directory which is sorted in numeric form
grep -Eo '[[:digit:]]+' <(sort -n <(for fil in $FILE.bak*;do echo $fil;done) | tail -1 ))
Strip out everything but the digits
(( $(grep -Eo '[[:digit:]]+' <(sort -n <(for fil in $FILE.bak*;do echo $fil;done) | tail -1 )) + 1 ))
Add one to the result

For posterity, my function with changes inspired by #Shawn's answer
backup() {
local file new n=0
local fmt='%s.%(%Y%m%d)T_%02d'
for file; do
while :; do
printf -v new "$fmt" "$file" -1 $((++n))
[[ -e $new ]] || break
done
command cp -vp "$file" "$new"
done
}
I like to cp not mv.

Related

Delete empty files - Improve performance of logic

I am i need to find & remove empty files. The definition of empty files in my use case is a file which has zero lines.
I did try testing the file to see if it's empty However, this behaves strangely as in even though the file is empty it doesn't detect it so.
Hence, the best thing I could write up is the below script which i way too slow given it has to test several hundred thousand files
#!/bin/bash
LOOKUP_DIR="/path/to/source/directory"
cd ${LOOKUP_DIR} || { echo "cd failed"; exit 0; }
for fname in $(realpath */*)
do
if [[ $(wc -l "${fname}" | awk '{print $1}') -eq 0 ]]
then
echo "${fname}" is empty
rm -f "${fname}"
fi
done
Is there a better way to do what I'm after or alternatively, can the above logic be re-written in a way that brings better performance please?
Your script is slow beacuse wc reads every file to the end, which is not needed for your purpose. This might be what you're looking for:
#!/bin/bash
lookup_dir='/path/to/source/directory'
cd "$lookup_dir" || exit
for file in *; do
if [[ -f "$file" && -r "$file" && ! -L "$file" ]]; then
read < "$file" || echo rm -f -- "$file"
fi
done
Drop the echo after making sure it works as intended.
Another version, calling the rm only once, could be:
#!/bin/bash
lookup_dir='/path/to/source/directory'
cd "$lookup_dir" || exit
for file in *; do
if [[ -f "$file" && -r "$file" && ! -L "$file" ]]; then
read < "$file" || files_to_be_deleted+=("$file")
fi
done
rm -f -- "${files_to_be_deleted[#]}"
Explanation:
The core logic is in the line
read < "$file" || rm -f -- "$file"
The read < "$file" command attempts to read a line from the $file. If it succeeds, that is, a line is read, then the rm command on the right-hand side of the || won't be executed (that's how the || works). If it fails then the rm command will be executed. In any case, at most one line will be read. This has great advantage over the wc command because wc would read the whole file.
if ! read < "$file"; then rm -f -- "$file"; fi
could be used instead. The two lines are equivalent.
To check a "$fname" is a file and is empty or not, use [ -s "$fname" ]:
#!/usr/bin/env sh
LOOKUP_DIR="/path/to/source/directory"
for fname in "$LOOKUP_DIR"*/*; do
if ! [ -s "$fname" ]; then
echo "${fname}" is empty
# remove echo when output is what you want
echo rm -f "${fname}"
fi
done
See: help test:
File operators:
...
-s FILE True if file exists and is not empty.
Yet another method
wc -l ~/tmp/* 2>/dev/null | awk '$1 == 0 {print $2}' | xargs echo rm
This will break if any of your files have whitespace in the name.
To work around that, with awk still
wc -l ~/tmp/* 2>/dev/null \
| awk 'sub(/^[[:blank:]]+0[[:blank:]]+/, "")' \
| xargs echo rm
This works because the sub function returns the number of substitutions made, which can be treated as a boolean zero/not-zero condition.
Remove the echo to actually delete the files.

Shell: Add string to the end of each line, which match the pattern. Filenames are given in another file

I'm still new to the shell and need some help.
I have a file stapel_old.
Also I have in the same directory files like english_old_sync, math_old_sync and vocabulary_old_sync.
The content of stapel_old is:
english
math
vocabulary
The content of e.g. english is:
basic_grammar.md
spelling.md
orthography.md
I want to manipulate all files which are given in stapel_old like in this example:
take the first line of stapel_old 'english', (after that math, and so on)
convert in this case english to english_old_sync, (or after that what is given in second line, e.g. math to math_old_sync)
search in english_old_sync line by line for the pattern '.md'
And append to each line after .md :::#a1
The result should be e.g. of english_old_sync:
basic_grammar.md:::#a1
spelling.md:::#a1
orthography.md:::#a1
of math_old_sync:
geometry.md:::#a1
fractions.md:::#a1
and so on. stapel_old should stay unchanged.
How can I realize that?
I tried with sed -n, while loop (while read -r line), and I'm feeling it's somehow the right way - but I still get errors and not the expected result after 4 hours inspecting and reading.
Thank you!
EDIT
Here is the working code (The files are stored in folder 'olddata'):
clear
echo -e "$(tput setaf 1)$(tput setab 7)Learning directories:$(tput sgr 0)\n"
# put here directories which should not become flashcards, command: | grep -v 'name_of_directory_which_not_to_learn1' | grep -v 'directory2'
ls ../ | grep -v 00_gliederungsverweise | grep -v 0_weiter | grep -v bibliothek | grep -v notizen | grep -v Obsidian | grep -v z_nicht_uni | tee olddata/stapel_old
# count folders
echo -ne "\nHow much different folders: " && wc -l olddata/stapel_old | cut -d' ' -f1 | tee -a olddata/stapel_old
echo -e "Are this learning directories correct? [j ODER y]--> yes; [Other]-->no\n"
read lernvz_korrekt
if [ "$lernvz_korrekt" = j ] || [ "$lernvz_korrekt" = y ];
then
read -n 1 -s -r -p "Learning directories correct. Press any key to continue..."
else
read -n 1 -s -r -p "Learning directories not correct, please change in line 4. Press any key to continue..."
exit
fi
echo -e "\n_____________________________\n$(tput setaf 6)$(tput setab 5)Found cards:$(tput sgr 0)$(tput setaf 6)\n"
#GET && WRITE FOLDER NAMES into olddata/stapel_old
anzahl_zeilen=$(cat olddata/stapel_old |& tail -1)
#GET NAMES of .md files of every stapel and write All to 'stapelname'_old_sync
i=0
name="var_$i"
for (( num=1; num <= $anzahl_zeilen; num++ ))
do
i="$((i + 1))"
name="var_$i"
name=$(cat olddata/stapel_old | sed -n "$num"p)
find ../$name/ -name '*.md' | grep -v trash | grep -v Obsidian | rev | cut -d'/' -f1 | rev | tee olddata/$name"_old_sync"
done
(tput sgr 0)
I tried to add:
input="olddata/stapel_old"
while IFS= read -r line
do
sed -n "$line"p olddata/stapel_old
done < "$input"
The code to change only the english_old_sync is:
lines=$(wc -l olddata/english_old_sync | cut -d' ' -f1)
for ((num=1; num <= $lines; num++))
do
content=$(sed -n "$num"p olddata/english_old_sync)
sed -i "s/"$content"/""$content":::#a1/g"" olddata/english_old_sync
done
So now, this need to be a inner for-loop, of a outer for-loop which holds the variable for english, right?
stapel_old should stay unchanged.
You could try a while + read loop and embed sed inside the loop.
#!/usr/bin/env bash
while IFS= read -r files; do
echo cp -v "$files" "${files}_old_sync" &&
echo sed '/^.*\.md$/s/$/:::#a1/' "${files}_old_sync"
done < olddata/staple_old
convert in this case english to english_old_sync, (or after that what is given in second line, e.g. math to math_old_sync)
cp copies the file with a new name, if the goal is renaming the original file name from the content of the file staple_old then change cp to mv
The -n and -i flag from sed was ommited , include it, if needed.
The script also assumes that there are no empty/blank lines in the content of staple_old file. If in case there are/is add an addition test after the line where the do is.
[[ -n $files ]] || continue
It also assumes that the content of staple_old are existing files. Just in case add an additional test.
[[ -e $files ]] || { printf >&2 '%s no such file or directory.\n' "$files"; continue; }
Or an if statement.
if [[ ! -e $files ]]; then
printf >&2 '%s no such file or directory\n' "$files"
continue
fi
See also help test
See also help continue
Combining them all together should be something like:
#!/usr/bin/env bash
while IFS= read -r files; do
[[ -n $files ]] || continue
[[ -e $files ]] || {
printf >&2 '%s no such file or directory.\n' "$files"
continue
}
echo cp -v "$files" "${files}_old_sync" &&
echo sed '/^.*\.md$/s/$/:::#a1/' "${files}_old_sync"
done < olddata/staple_old
Remove the echo's If you're satisfied with the output so the script could copy/rename and edit the files.

Add character to file name if duplicate when moving with bash

I currently use a bash script and PDFgrep to rename files to a certain structure. However, in order to stop overriding if the new file has a duplicate name, I want to add a number at the end of the name. Keep in mind that there may be 3 or 4 duplicate names. What's the best way to do this?
#!/bin/bash
if [ $# -ne 1 ]; then
echo Usage: Renamer file
exit 1
fi
f="$1"
id1=$(pdfgrep -m 1 -i "MR# : " "$f" | grep -oE "[M][0-9][0-9]+") || continue
id2=$(pdfgrep -m 1 -i "Visit#" "$f" | grep -oE "[V][0-9][0-9]+") || continue
{ read today; read dob; read dop; } < <(pdfgrep -i " " "$f" | grep -oE "[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]")
dobsi=$(echo $dob | sed -e 's/\//-/g')
dopsi=$(echo $dop | sed -e 's/\//-/g')
mv -- "$f" "${id1}_${id2}_$(printf "$dobsi")_$(printf "$dopsi")_1.pdf"
Use a loop that checks if the destination filename exists, and increments a counter if it does. Replace the mv line with this:
prefix="${id1}_{id2}_${dob}_${dop}"
counter=0
while true
do
if [ "$counter" -ne 0 ]
then target="${prefix}_${counter}.pdf"
else target="${prefix}.pdf"
fi
if [ ! -e "$target" ]
then
mv -- "$f" "$target"
break
fi
((counter++))
done
Note that this suffers from a TOCTTOU problem, if the duplicate file is created between the ! -f "$target" test and the mv. I thought it would be possible to replace the existence check with using mv -n; but while this won't overwrite the file, it still treats the mv as successful, so you can't test the result to see if you need to increment the counter.

Shell script: check if any files in one directory are newer than any files in another directory

I want to run a command in a shell script if files in one directory have changed more recently than files in another directory.
I would like something like this
if [ dir1/* <have been modified more recently than> dir2/* ]; then
echo 'We need to do some stuff!'
fi
As described in BashFAQ #3, broken down here into reusable functions:
newestFile() {
local latest file
for file; do
[[ $file && $file -nt $latest ]] || latest=$file
done
}
directoryHasNewerFilesThan() {
[[ "$(newestFile "$1"/*)" -nt "$(newestFile "$2" "$2"/*)" ]]
}
if directoryHasNewerFilesThan dir1 dir2; then
echo "We need to do something!"
else
echo "All is well"
fi
If you want to count the directories themselves as files, you can do that too; just replace "$(newestFile "$1"/*)" with "$(newestFile "$1" "$1"/*)", and likewise for the call to newestFile for $2.
Using /bin/ls
#!/usr/bin/ksh
dir1=$1
dir2=$2
#get modified time of directories
integer dir1latest=$(ls -ltd --time-style=+"%s" ${dir1} | head -n 2 | tail -n 1 | awk '{print $6}')
integer dir2latest=$(ls -ltd --time-style=+"%s" ${dir2} | head -n 2 | tail -n 1 | awk '{print $6}')
#get modified time of the latest file in the directories
integer dir1latestfile=$(ls -lt --time-style=+"%s" ${dir1} | head -n 2 | tail -n 1 | awk '{print $6}')
integer dir2latestfile=$(ls -lt --time-style=+"%s" ${dir2} | head -n 2 | tail -n 1 | awk '{print $6}')
#sort the times numerically and get the highest time
val=$(/bin/echo -e "${dir1latest}\n${dir2latest}\n${dir1latestfile}\n${dir2latestfile}" | sort -n | tail -n 1)
#check to which file the highest time belongs to
case $val in
#(${dir1latest}|${dir1latestfile})) echo $dir1 is latest ;;
#(${dir2latest}|${dir2latestfile})) echo $dir2 is latest ;;
esac
It's simple, get times stamps of both the folders in machine format(epoch time) then do simple comparison. that's all

Bash - sometimes creates only empty output

I am trying to create a bash dictionary script that accepts first argument and creates file named after that, then script accepts next arguments (which are files inside same folder) and outputs their content into file (first argument). It also sorts, deletes symbols etc., but main problem is, that sometimes ouptut file is empty (I am passing one non empty file and one non existing file), after deleting and running script few more times it is sometimes empty sometimes not.
#!/bin/bash
numberoffileargs=$(( $# - 1 ))
exitstat=0
counterexit=0
acceptingstdin=0;
> "$1";
#check if we have given input files given
if [ "$#" -gt 1 ]; then
#for cycle going through input files
for i in "${#:2}"
do
#check whether input file is readable
if [ -r "${i}" ]; then
cat "${i}" >> "$1"
#else redirect to standard output
else
exitstat=2
counterexit=$((counterexit + 1))
echo "file does not exist" 1>&2
fi
done
else
echo "stdin code to be done"
acceptingstdin=1
#stdin input to output file
#stdin=$(cat)
fi
#one word for each line, alphabetical sort, alphabet only, remove duplicates
#all lowercase
#sort -u >> "$1"
if [ "$counterexit" -eq "$numberoffileargs" ] && [ "$acceptingstdin" -eq 0 ]; then
exitstat=3
fi
cat "$1" | sed -r 's/[^a-zA-Z\-]+/ /g' | tr A-Z a-z | tr ' ' '\n' | sort -u | sed '/^$/d' > "$1"
echo "$numberoffileargs"
echo "$counterexit"
echo "$exitstat"
exit $exitstat
Here is your script with some syntax improvement. Your trouble came from the fact that the dictionary was both on input and output on your pipeline; I added a temp file to fix it.
#!/bin/bash
(($# >= 1)) || { echo "Usage: $0 dictionary file ..." >&2 ; exit 1;}
dict="$1"
shift
echo "Creating $dict ..."
>| "$dict" || { echo "Failed." >&2 ; exit 1;}
numberoffileargs=$#
exitstat=0
counterexit=0
acceptingstdin=0
if (($# > 0)); then
for i ; do
#check whether input file is readable
if [ -r "${i}" ]; then
cat "${i}" >> "$dict"
else
exitstat=2
let counterexit++
echo "file does not exist" >&2
fi
done
else
echo "stdin code to be done"
acceptingstdin=1
fi
if ((counterexit == numberoffileargs && acceptingstdin == 0)); then
exitstat=3
fi
sed -r 's/[^a-zA-Z\-]+/ /g' < "$dict" | tr '[:upper:]' '[:lower:]' | tr ' ' '\n' |
sort -u | sed '/^$/d' >| tmp$$
mv -f tmp$$ "$dict"
echo "$numberoffileargs"
echo "$counterexit"
echo "$exitstat"
exit $exitstat
The pipeline might be improved.

Resources