Protect against accidental deletion - bash

Today I saw first-hand how part of a colleague's home directory can be accidentally deleted (two hours lost in a critical phase of a project).
I was worried enough to start thinking about the problem and a possible solution.
In his case a file named '~' somehow ended up in a test folder, which he later deleted with rm -rf... when rm got to that file, it was expanded to his home folder (he managed to Ctrl-C almost in time).
A similar problem could happen if one has a file named '*'.
My first thought was to prevent the creation of files with "dangerous names", but that would still not solve the problem, since mv or other corner cases could lead to the same risky situation.
My second thought was creating a listener (I don't know if that's even possible) or an alias of rm that checks the files it is about to process and, if it finds a dangerous one, skips it and prints a message.
Something similar to this:
take all non-option arguments (to get the files to be deleted)
cycle over these items
check whether the current item equals a dangerous name (say '~' or '*'); I don't know if this works, is the item already expanded at this point or not?
if so, echo a message and don't touch the file
otherwise proceed with the iteration
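A rough sketch of those steps as a shell function, purely a guess at what it could look like (the dangerous-name list is only an example, and the shell has usually already expanded ~ and * by the time the function runs):
# Hypothetical wrapper: shadows rm and refuses to touch "dangerous" names.
rm() {
    local keep=() item
    for item in "$@"; do
        case "$item" in
            "$HOME"|"$HOME"/|'~'|'/')
                echo "rm wrapper: refusing to touch '$item'" >&2
                ;;
            *)
                keep+=("$item")
                ;;
        esac
    done
    [ "${#keep[@]}" -gt 0 ] && command rm "${keep[@]}"
}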
Third thought: has anyone already done or dealt with this? :]

There's actually a pretty good justification for having critical files in your home directory checked into source control. As well as protecting against the situation you've just encountered, it's nice being able to version-control .bashrc, etc.

Since the shell expands the parameters before rm ever sees them, you can't really catch 'dangerous' names like that.
You could alias 'rm -rf' to 'rm -rfi' (interactive), but that can be pretty tedious if you actually mean 'rm -rf *'.
You could replace 'rm' with a function that does 'mv "$@" $HOME/.trash' (a plain alias can't rearrange its arguments) and have a separate command to empty the trash, but that might cause problems if you really mean to remove the files, for example because of disk quotas or similar.
Or, you could just keep proper backups or use a file system that allows "undeletion".
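A minimal sketch of that trash idea, using a placeholder command name (del) and a placeholder directory (~/.trash):
del() {
    mkdir -p "$HOME/.trash"
    mv -- "$@" "$HOME/.trash/"
}
# empty it by hand once you are sure: rm -rf "$HOME/.trash"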

Accidents do happen. You can only reduce their impact.
Both version control (regular check-ins) and backups are of vital importance here.
If I can't check in (because it does not work yet), I back up to a USB stick.
And as the deadline approaches, the backup frequency increases, because Murphy strikes at the most inappropriate moment.

One thing I do is always have a file called "-i" in my $HOME.
My other tip is to always use "./*" or find instead of plain "*".
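For anyone wondering, the "-i" file has to be created with a leading ./ (or --) so the dash isn't taken as an option; purely illustrative:
touch ./-i    # the ./ keeps touch from reading -i as an option
# A careless 'rm *' run here now also passes -i, which (GNU) rm treats as the
# interactive flag, so it prompts before deleting anything.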

The version control suggestion gets an upvote from me. I'd recommend that for everything, not just source.
Another thought is a shared drive on a server that's backed up and archived.
A third idea is buying everyone an individual external hard drive that lets them back up their local drive. This is a good thing to do because there are two kinds of hard drives: those that have failed and those that will in the future.

You could also alias rm to a simple script that escapes all special characters, effectively stopping you from using wildcards, and create another alias that runs the real rm without escaping. You would only use the second one when you are really sure; but then again, that's kind of the point of rm -rf.
Another option I personally like is an alias that redirects through a script and then passes everything on to rm. If the script finds any dangerous characters, it prompts you Y/N whether to continue, with N cancelling the operation and Y carrying on as normal.

At one company where I worked, we had a cron job that ran every half hour and copied all the source code files from everyone's home directory to a backup directory structure elsewhere on the system, just using find.
This wouldn't prevent actual deletion but it did minimise the work lost on a number of occasions.
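Something along these lines, where the schedule, the paths, and the *.c pattern are all made up for illustration (cp --parents is GNU coreutils):
# hypothetical crontab entry: every 30 minutes, mirror source files into a backup tree
*/30 * * * *  find /home -name '*.c' -exec cp --parents {} /srv/src-backup \;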

This is pretty odd behaviour really - why is bash expanding twice?
Once * has expanded to
old~
this~
~
then no further substitution should happen!
I bravely tested this on my mac, and it just deleted ~, and not my home directory.
Is it possible your colleague somehow wrote code that expanded it twice?
e.g.
ls | xargs rm -rf

You may disable file name generation (globbing):
set -f
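A quick way to see the effect (set +f turns expansion back on):
set -f     # disable pathname expansion
echo *     # prints a literal *, nothing is expanded
set +f     # re-enable it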
Escaping special chars in file paths could be done with Bash builtins:
filepath='/abc*?~def'
filepath="$(printf "%q" "${filepath}")"
filepath="${filepath//\~/\\~}"
printf "%s\n" "${filepath}"

I use this in my ~/.bashrc
alias rm="rm -i"
rm then prompts before deleting anything, and the alias can be circumvented either with the -f flag or by escaping, e.g.
\rm file
Degrades the problem yes; solves it no.

Related

Protecting scripts from errant clobbering

I spent some time building this handy bash script that accepts input via stdin. I got the idea from the top answer to this question: Pipe input into a script
However, I did something really dumb. I typed the following into the terminal:
echo '{"test": 1}' > ./myscript.sh
I meant to pipe it | to my script instead of redirecting > the output of echo.
Up until this point in my life, I never accidentally clobbered any file in this manner. I'm honestly surprised that it took me until today to make this mistake. :D
At any rate, now I've made myself paranoid that I'll do this again. Aside from marking the script as read-only or making backup copies of it, is there anything else I can do to protect myself? Is it a bad practice in the first place to write a script that accepts input from stdin?
Yes, there is one thing you can do -- check your scripts into a source-code-control repository (git, svn, etc).
bash scripts are code, and any non-trivial code you write should be checked in to source-code-control (and changes committed regularly) so that when something like this happens, you can just restore the most-recently-committed version of the file and continue onwards.
This is a very open-ended question, but I usually put scripts in a global bin folder (~/.bin or so). This lets me invoke them as myscript rather than path/to/myscript.sh, so if I accidentally used > instead of |, it'd just create a file by that name in the current directory - which is virtually never ~/.bin.
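A sketch of that setup; the directory name is just the convention mentioned above and the script name is the one from the question:
mkdir -p "$HOME/.bin"
mv myscript.sh "$HOME/.bin/myscript"
chmod +x "$HOME/.bin/myscript"
# in ~/.bashrc:
export PATH="$HOME/.bin:$PATH"
# 'myscript' now runs from anywhere, and a stray '> myscript' in some other
# directory just creates a local file instead of clobbering the script.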

mktemp vs. umask 066 and touch?

My bash script needs a temp file. Assuming filename conflicts are not an issue, can I say that mktemp is not as good as manually touching a temp file after umask 066?
My assumption is:
mktemp is a separate utility; compared to manually touching a file, it still takes a bit more resources.
I've read something about the ln -s /etc/passwd attack, but that looks like a story from decades ago, when passwords were not shadowed.
Please correct me if my understanding is wrong.
Those two commands are not meant to do the same thing. mktemp creates a file in a flexible way and has features to make sure it uses a unique name. touch will modify the timestamp of a file (or create it if it does not exist), but you supply the name.
If you want to create an empty file for which you already have a name, then use touch; if you are going to write to that file right after, you do not need to create it first, just redirect to it.
But if you really need to make a temporary file and ensure you will not overwrite any other file, touch does nothing for you. It is "lighter", maybe, but useless in this case, and you need mktemp.
The mktemp command was written by Todd C. Miller of OpenBSD to prevent common vulnerabilities in shell scripts. In his own words:
Mktemp is a simple utility designed to make temporary file handling in
shell scripts be safe and simple. Traditionally, people writing
shell scripts have used constructs like:
TFILE=/tmp/foop.$$
which are trivial to attack. If such a script is run as root it may
be possible for an attacker on the local host to gain access to the
root login, corrupt or unlink system files, or do a variety of other
nasty things.
The basic problem is that most shells have no equivalent to open(2)'s
O_EXCL flag. While it is possible to avoid this using temporary
directories, I consider the use of mktemp(1) to be superior both in
terms of simplicity and robustness.
Shadow passwords do not help here. If the script is run as root and it writes to a temporary file in an insecure way, then an attacker could possibly exploit race conditions to modify /etc/passwd or /etc/shadow, or both!
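For comparison, a minimal sketch of the unsafe pattern next to the mktemp one (the trap cleanup is just a common convention, not required):
# Unsafe: predictable name, so an attacker can pre-create it or drop a symlink there.
TFILE=/tmp/foop.$$

# Safer: mktemp creates the file itself, exclusively and with mode 0600.
TFILE=$(mktemp) || exit 1
trap 'rm -f "$TFILE"' EXIT
printf 'scratch data\n' > "$TFILE"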

How to increment directory number in bash?

I have a watchdog implemented in bash that restarts a service under certain conditions and moves the old logs to an "old" directory.
The problem is that I want to move the logs to old_1, old_2, ... if the previous one already exists.
How can I implement this in bash?
You can search for the first non-existing log like this:
#!/bin/bash
num=1
while [[ -f log_$num ]]; do
    let num++
done
echo Fresh new: log_$num
That is a pain to write and to make robust against missing folders (which will break choroba's solution, for instance). This is why most systems that need logs just suffix their names with dates. I encourage you to do the same: it's easier to handle and also easier to retrieve a log afterwards.
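A sketch of that date-suffix approach (names are examples); the timestamps also sort chronologically, which makes finding an old log easy:
dest="old_$(date +%Y%m%d_%H%M%S)"   # e.g. old_20240101_120000
mkdir -p "$dest"
mv -- *.log "$dest"/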

How to get the full pathname of the current shell script?

Is there a less brute-force way to do this?
#!/bin/ksh
THIS_SCRIPT=$(/usr/bin/readlink -f $(echo $0 | /bin/sed "s,^[^/],$PWD/&,"))
echo $THIS_SCRIPT
I'm stuck using ksh but would prefer a solution that works in bash too (which I think this does).
Entry #28 in the bash FAQ:
How do I determine the location of my script? I want to read some config files from the same place.
There are two prime reasons why this issue comes up: either you want to externalize data or configuration of your script and need a way to find these external resources, or your script is intended to act upon a bundle of some sort (eg. a build script), and needs to find the resources to act upon.
It is important to realize that in the general case, this problem has no solution. Any approach you might have heard of, and any approach that will be detailed below, has flaws and will only work in specific cases. First and foremost, try to avoid the problem entirely by not depending on the location of your script!
...
Using BASH_SOURCE
The BASH_SOURCE internal bash variable is actually an array of pathnames. If you expand it as a simple string, e.g. "$BASH_SOURCE", you'll get the first element, which is the pathname of the currently executing function or script.
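A typical way to use it in bash (this yields the directory holding the script, which is what the config-file case usually needs):
# Directory containing the currently running script, per the FAQ's caveats.
script_dir=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd)
echo "$script_dir"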
I've always done:
SCRIPT_PATH=$(cd `dirname ${0}`; pwd)
I've never used readlink before: is it GNU-only? (i.e. will it work on HP-UX, AIX, and Solaris out of the box? dirname and pwd will....)
(edited to add `` which I forgot in original post. d'oh!)
(edit 2 to put on two lines which I've apparently always done when I look at previous scripts I'd written, but hadn't remembered properly. First call gets path, second call eliminates relative path)
(edit 3 fixed typo that prevented single line answer from working, back to single line!)
Why didn't I think to try this before I asked the question?
THIS_SCRIPT=$(/usr/bin/readlink -nf "$0")
Works great.
On macOS I use (edit: this only works if you run the script from the directory where the script actually is!)
my_script=$(pwd)/$(basename $0)

For loop in Bash Script

Let's say we use standard bash to write a for loop that runs srm to securely erase an item on your drive.
Now let's say we set it to iterate 10 times. After the first iteration is done, how can it still work on the file? The file should no longer exist, so how can it erase it? This isn't specific to srm; anything can be run, even something like mv, when the file is no longer available.
It'll run through the loop 10 times, but except on the first iteration, the command you're executing will fail (and return a non-zero exit status). The command will also write out whatever error messages it normally writes (to stdout, stderr or a file).
#!/bin/bash
for i in {1..5}
do
    rm something
done
Now, assuming there's a file called something, you get:
rm: something: No such file or directory
rm: something: No such file or directory
rm: something: No such file or directory
rm: something: No such file or directory
Note that this happens 4 times, not 5, since the first time, rm ran successfully.
You can't. Once srm has returned, the file is gone.
Rather than writing a loop, you will want to adjust the arguments to srm so that it overwrites the data more times before returning.
According to the Wikipedia writeup on srm, the default mode is 35 pass Gutmann. Is that really not sufficient?
srm does the looping for you and then deletes the file; there is no need (or ability) to do what you want from bash. You would have to write something in C/C++ that talks directly to the filesystem using some OS-specific API.
Overkill. Just use shred --remove <file>; it's a dozen times easier.
If you're trying to wipe your whole drive, it's been pretty systematically proven that nothing gets you more bang for the buck than dd and writing your drive with zeroes.
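For completeness, the usual form of that; /dev/sdX is a placeholder, and this is destructive, so check the device name twice:
dd if=/dev/zero of=/dev/sdX bs=1M    # WARNING: overwrites the entire device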
