Advice on using crontab with bash

I want to make a cron job that checks if a folder exists and, if it does, deletes all the contents of that folder. For example, I know that the following will delete the contents of my folder using cron:
0 * * * * cd home/docs/reports/;rm -r *
However, I realized that if the folder is removed (or the wrong file path is given), the cd fails and rm -r * runs in the wrong directory, deleting files it shouldn't have touched on my operating system. To prevent this from happening (again), I want to check for the existence of the folder first, and only then delete its contents. I want to do something like the following, but I'm not sure how to use a bash script with cron.
if [ -d "home/docs/reports/" ]; then
cd home/docs/reports/;rm -r *
fi
I'm new to bash and cron (in case it is not obvious).

I think cron uses /bin/sh to execute commands. sh is typically a subset of bash, and you're not doing anything bash-specific.
Execute the rm command only if the cd command succeeds:
0 * * * * cd home/docs/reports/ && rm -r *
I've tested this, and it works. (Note that testing whether the directory exists first is less reliable; it's possible that the directory exists but you can't cd into it, or that it ceases to exist between the test and the cd command.)
But actually you don't need to use a compound command like that:
0 * * * * rm -r home/docs/reports/*
Still, the && trick, and the corresponding || operator (which executes a second command only if the first one fails), can be very useful for more complicated operations.
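For instance (a hypothetical variation, not part of the original job; the log path is made up), you could log when the cleanup does nothing instead of failing silently:
0 * * * * cd home/docs/reports/ && rm -r * || echo "reports cleanup failed" >> /tmp/reports-cleanup.log
Here the message is written if either the cd or the rm fails, since && and || have equal precedence and chain left to right.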
(Did you mean /home/docs rather than home/docs? The latter will be interpreted relative to your home directory.)
Though this worked ok when I tried it, use it at your own risk. Any time you combine rm -r with wildcards, there's a risk. If possible, test in a directory you're sure you don't care about. And you might consider using rm -rf if you want to be as sure as possible that everything is deleted. Finally, keep in mind that the * wildcard doesn't match files or directories whose names start with ..
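To make that last point concrete (the file names here are invented): if the directory holds report.txt and a hidden file named .cache, then rm -r * removes only report.txt. If hidden entries should go too, a GNU find based cleanup sidesteps globbing entirely:
find /home/docs/reports/ -mindepth 1 -delete
-mindepth 1 keeps find from trying to delete the reports directory itself.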
#include <stddisclaimer.h>
EDIT :
The comments have given me a better understanding of what you're trying to do. These are files that users are going to download shortly after they're created (right?), so you don't want to delete anything less than, say, 5 minutes old.
Assuming you have GNU findutils, you can do something like this:
0 * * * * find /home/docs/reports/* -cmin +5 -delete 2>/dev/null
Using the -delete option to find means you're deleting files and/or directories one at a time, not deleting entire subtrees; the main difference is that an old directory with a new file in it will not be deleted. Applying -delete to a non-empty directory will fail with an error message.
Read the GNU find documentation (info find) for more information on the -cmin and -delete options. Note that -cmin tests the time of the last status change (ctime) of the file, not its creation time (Unix doesn't record file creation times). For your situation, it's likely to be the same.
(If you omit the /* on the path, it will delete the reports directory itself.)
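Before trusting -delete, you can preview what would be removed by running the same find from a shell with -print in place of -delete:
find /home/docs/reports/* -cmin +5 -print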

Wrap your entire command (including the if logic) into a script.sh.
Specify #!/bin/bash at the top of your script.
Make it executable:
chmod +x script.sh
Then specify the full path of the script in your cron job.
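For example, a minimal sketch (the script name and location are made up here; the directory and schedule are the ones from the question, assuming the absolute path /home/docs/reports/):
#!/bin/bash
# /home/docs/clean_reports.sh (hypothetical path)
if [ -d /home/docs/reports/ ]; then
    cd /home/docs/reports/ && rm -r ./*
fi
and the corresponding crontab entry:
0 * * * * /home/docs/clean_reports.sh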

The easiest thing by far is to put SHELL=/bin/bash at the top of your crontab. Works for me.
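For example, the top of the crontab could look like this (the job line is just the one discussed above):
SHELL=/bin/bash
0 * * * * cd home/docs/reports/ && rm -r *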


How to automatically "chmod" a file in PyCharm/IntelliJ with a particular extension?

Background
I'm maintaining a repo in which I create many small bash files containing tiny code snippets.
To create these files, I use the Right Click -> New -> File interaction in PyCharm/IntelliJ quite often.
These files are automatically created with rw-rw-r-- (664) permissions.
So every time, I have to start a terminal, and execute chmod 700 somefile.sh to make them executable.
The Question
It would be wonderful to automatically set the permissions of files with a certain extension in IntelliJ/PyCharm.
I don't want all my files to be executable, just the files with a .sh extension.
Is there a way I can configure this?
A non-elegant and potentially problematic solution would be to have a bash script running in the background, checking the contents of a specific directory for specific filenames and applying chmod 700 to those.
I strongly suggest a special directory - to minimize the risk of this script affecting other/unexpected files.
You'd need to set it to sleep for a few seconds so that it doesn't take up too much of your CPU time.
It could look something like this:
background_chmod.sh:
#!/bin/bash
FILESPEC="*.sh"
DIRPATH=/some/special/path/
while true
do
    # quote the pattern so find, not the shell, expands it
    find "$DIRPATH" -name "$FILESPEC" -type f -exec chmod 700 {} \;
    sleep 10   # tweak this to your requirements
done
You could then start it from your .bashrc file (if you are using bash) so it executes when you log on.
The additional line in your .bashrc would be:
/path_to_script/background_chmod.sh &
Remember to:
chmod +x /path_to_script/background_chmod.sh
IMPORTANT: I have no means of testing the above. Please test this on some test files before implementing! Use at your own risk!

monitor for file then copy to another directory

I'm relatively new to unix scripting, so apologies for the newbie question.
I need to create a script which will permanently run in the background, and monitor for a file to arrive in an FTP landing directory, then copy it to a different directory, and lastly remove the file from the original directory.
The script is running on a Ubuntu box.
The landing directory is /home/vpntest
The file needs to be copied as /etc/ppp/chap-secrets
So far, I've got this
#/home/vpntest/FTP_file_copy.sh
if [ -f vvpn_azure.txt ]; then
cp vvpn_azure.txt /etc/ppp/chap-secrets
rm vvpn_azure.txt
fi
I can run this as root, and it works, but only as a one off (I need it to run permanently in the background, and trigger each time a new file is received in the landing zone.)
If I don't run as root, I get issues with permissions (even if I run it from within the directory /home/vpntest.)
Any help would be much appreciated.
One way to have a check-and-move process running in the background with root permissions is the "polling" approach: run your script from the root user's crontab.
Steps:
Revise your /home/vpntest/FTP_file_copy.sh:
#!/bin/bash
new_file=/home/vpntest/vvpn_azure.txt
if [ -f "$new_file" ]; then
mv "$new_file" /etc/ppp/chap-secrets
fi
Log out. Log in as root user.
Add a cron task to run the script:
crontab -e
If this is a new machine and this is your first time running crontab, you may first get a prompt to choose an editor; just pick one and continue into the editor.
The format is m h dom mon dow command, so if checking every 5 minutes is sufficiently frequent, do:
*/5 * * * * /home/vpntest/FTP_file_copy.sh
Save and close to apply.
Cron will now automatically run the script every 5 minutes in the background, moving the file whenever one is found.
Explanation
Root user, because you mentioned it only worked for you as root.
So we set this in the root user's crontab to have sufficient permissions.
man 5 crontab informs us:
Steps are also permitted after an asterisk, so if you want to say
'every two hours', just use '*/2'.
Thus we write */5 in the first column, which is the minutes column,
to set for "every 5 minutes".
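For comparison (a made-up schedule, only to illustrate a step value in the hours column), minute 0 of every second hour would be:
0 */2 * * * /home/vpntest/FTP_file_copy.sh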
FTP_file_copy.sh:
uses absolute paths, can run from anywhere
re-arranged so one variable new_file can be re-used
good practice to enclose any values being checked within your [ ] test
uses mv to overwrite the destination while removing the file from the source directory
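After saving, you can confirm the entry landed in the root crontab with:
crontab -l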

Copy file without file extension to new folder and rename it

I just during the weekend decided to try out zsh and have a bit of fun with it. Unfortunately I'm an incredible newbie to shell scripting in general.
I have this folder with a file whose filename is a hash (4667e85581f80b6936f8811f0a7493c70eae4ee7) without a file extension.
What I would like to do is copy this file to another folder and rename it to "screensaver.png".
I've tried with the following code:
#!/usr/bin/zsh
KUVVA_CACHE="$HOME/Library/Containers/com.kuvva.Kuvva-Wallpapers/Data/Library/Application Support/Kuvva"
DEST_FOLDER="/Library/Desktop Pictures/Kuvva/$USERNAME/screensaver.png"
for wallpaper in ${KUVVA_CACHE}; do
cp -f ${wallpaper} ${DEST_FOLDER}
done
This returns the following error:
cp: /Users/Morten/Library/Containers/com.kuvva.Kuvva-Wallpapers/Data/Library/Application Support/Kuvva is a directory (not copied).
And when I try to echo the $wallpaper variable instead of doing the cp, it just echoes the folder path.
The name of the file changes every 6 hours, which is why I'm doing the for loop. So I never know what the name of the file will be, but I know that there's always only ONE file in the folder.
Any ideas how I can manage to do this? :)
Thanks a lot!
Morten
It should work with regular filename expansion (globbing).
KUVVA_CACHE="$HOME/Library/Containers/com.kuvva.Kuvva-Wallpapers/Data/Library/Application Support/Kuvva/"
And then copy
cp -f "${KUVVA_CACHE}"* "${DEST_FOLDER}"
You can add the script to your crontab so it will be run at a certain interval. Edit it using 'crontab -e' and add
30 */3 * * * /location/of/your/script
This will run it at minute 30 of every third hour. The first field is minutes; a star means "any". Exit the editor by pressing the Escape key, then Shift+: and type wq and press Enter. These are vi commands.
Don't forget to 'chmod 0755 file-name' the script so it becomes executable.
Here is the script.
#!/bin/zsh
KUVVA_CACHE="$HOME/Library/Containers/com.kuvva.Kuvva-Wallpapers/Data/Library/Application Support/Kuvva"
DEST_FOLDER="/Library/Desktop Pictures/Kuvva/$USERNAME/screensaver.png"
cp "${KUVVA_CACHE}/"* "${DEST_FOLDER}"

Is it safe to alias cp -R to cp?

When I copy something, I always forget the -R, and then I have to go all the way back to add it right after cp.
I want to add this to bash config files.
alias cp="cp -R"
I have not seen anything bad happen. Is it safe to do this?
The only thing I can think of that might cause unexpected behavior with the -R flag is how it interacts with wildcards.
What I mean is: say you want to copy all mp3 files in a directory and every subdirectory with cp -R /path/*.mp3. Even though -R is given, it will not copy mp3s in the subdirectories of /path (if there are any), because the shell expands the wildcard before cp runs, so cp only ever sees the files that match directly inside /path.
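To make that concrete (the destination path /dest/ is invented for illustration):
cp -R /path/*.mp3 /dest/                         # only mp3s directly inside /path are copied
cp -R /path /dest/                               # copies the whole tree, subdirectories and all
find /path -name '*.mp3' -exec cp {} /dest/ \;   # copies mp3s from every level, flattened into /dest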
I wouldn't use aliases for changing the behaviour of normal commands. When you are in a different shell / another computer, the alias will be missing. When a friend wants to help you, he will not know what you did.
Once I had alias rm="rm -i", and I ran rm * right after switching to another shell with su, where the alias (and its safety prompt) did not exist.
And sometimes you want to use cp without the -R option (copy all files in the current directory to another location, but not the subdirectories); will you remember to use /bin/cp in those cases?
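Also worth knowing (standard shell behaviour, nothing specific to this alias): you can bypass an alias for a single invocation without typing out /bin/cp:
\cp file1 file2            # a leading backslash suppresses alias expansion
command cp file1 file2     # the command builtin does the same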

How can I find a directory-diff of millions of files to script maintenance?

I have been working on how to verify that millions of files that were on file system A have in fact been moved to file system B. While working on a system migration, it became evident that all the files needed to be audited to prove that the files had been moved. The files were initially moved via rsync, which does provide logs, although not in a format that is helpful for doing an audit. So, I wrote this script to index all the files on system A:
#!/bin/bash
# Get directories and file list to be used to verify proper file moves have worked successfully.
LOGDATE=`/usr/bin/date +%Y-%m-%d`
FILE_LIST_OUT=/mounts/A_files_$LOGDATE.txt
MOUNT_POINTS="/mounts/AA /mounts/AB"
touch $FILE_LIST_OUT
echo TYPE,USER,GROUP,BYTES,OCTAL,OCTETS,FILE_NAME > $FILE_LIST_OUT
for directory in $MOUNT_POINTS; do
# format: type,user,group,bytes,octal,octets,file_name
gfind $directory -mount -printf "%y","%u","%g","%s","%m","%p\n" >> $FILE_LIST_OUT
done
The file indexing works fine and takes about two hours to index ~30 million files.
On the B side is where we run into issues. I have written a very simple shell script that reads the index file, tests to see if each file is there, and then counts up how many files are there, but it runs out of memory while looping through the 30 million lines of indexed file names. Effectively it runs the little bit of code below inside a while loop, with counters incremented for files found and not found.
if [ -f "$TYPE" "$FILENAME" ] ; then
print file found
++
else
file not found
++
fi
My questions are:
Can a shell script do this type of reporting from such a large list? A 64-bit Unix system ran out of memory while trying to execute this script. I have already considered breaking up the input file into smaller chunks to make it faster. Currently it can
If a shell script is inappropriate, what would you suggest?
You just used rsync, use it again...
--ignore-existing
This tells rsync to skip updating files that already exist on the destination (this does not ignore existing directories, or nothing would get done). See also --existing.
This option is a transfer rule, not an exclude, so it doesn’t affect the data that goes into the file-lists, and thus it doesn’t affect deletions. It just limits the files that the receiver requests to be transferred.
This option can be useful for those doing backups using the --link-dest option when they need to continue a backup run that got interrupted. Since a --link-dest run is copied into a new directory hierarchy (when it is used properly), using --ignore-existing will ensure that the already-handled files don’t get tweaked (which avoids a change in permissions on the hard-linked files). This does mean that this option is only looking at the existing files in the destination hierarchy itself.
That will actually fix any problems (at least in the same sense that any diff list based on file-existence tests could fix them). Using --ignore-existing means rsync only does the file-existence tests (so it will construct the diff list as you request and use it internally). If you just want information on the differences, check --dry-run and --itemize-changes.
Let's say you have two directories, foo and bar. Let's say bar has three files, 1, 2, and 3. Let's say that bar has a directory quz, which has a file 1. The directory foo is empty:
Now, here is the result,
$ rsync -ri --dry-run --ignore-existing ./bar/ ./foo/
>f+++++++++ 1
>f+++++++++ 2
>f+++++++++ 3
cd+++++++++ quz/
>f+++++++++ quz/1
Note, you're not interested in the cd+++++++++ line -- in rsync's itemized output that just means the quz/ directory would be created on the destination. Now, let's add a file called 1 in foo, and use grep to filter out those directory lines:
$ rsync -ri --dry-run --ignore-existing ./bar/ ./foo/ | grep -v '^cd'
>f+++++++++ 2
>f+++++++++ 3
>f+++++++++ quz/1
f is for file. The +++++++++ means the file doesn't exist in the DEST dir.
Here is the bonus: remove --dry-run, and it'll go ahead and make the changes for you.
Have you considered a solution such as kdiff3, which will diff directories of files?
Note the feature for version 0.9.84
Directory-Comparison: Option "Full Analysis" allows to show the number
of solved vs. unsolved conflicts or deltas vs. whitespace-changes in
the directory tree.
There is absolutely no problem reading a 30 million line file in a shell script. The reason why your process failed was most likely that you tried to read the file entirely into memory, e.g. by doing something wrong like for i in $(cat file).
The correct way of reading a file is:
while IFS= read -r line
do
echo "Something with $line"
done < someFile
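Applied to your index, a counting version of that loop could look roughly like this (the comma-separated field order is assumed from the gfind -printf format in your script, and the index file name is a placeholder):
#!/bin/bash
found=0
missing=0
# read one comma-separated line at a time; the last variable soaks up the rest of the line (the path)
while IFS=, read -r type user group bytes mode path
do
    if [ -e "$path" ]; then
        found=$((found + 1))
    else
        missing=$((missing + 1))
    fi
done < /mounts/A_files_YYYY-MM-DD.txt   # placeholder file name
echo "found=$found missing=$missing"
Because it reads one line at a time, memory use stays flat no matter how many lines the index has; skipping the header line and handling commas inside file names are left as refinements.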
A shell script is inappropriate, yes. You should be using a diff tool:
diff -rNq /original /new
If you're not particular about the solution being a script, you could also look into meld, which would let you diff directory trees quite easily and you can also set ignore patterns if you have any.
