I have the following bash script, which I launch from the terminal.
dataset_dir='/home/super/datasets/Carpets_identification/data'
dest_dir='/home/super/datasets/Carpets_identification/augmented-data'
# if dest_dir does not exist -> create it
if [ ! -d "${dest_dir}" ]; then
    mkdir "${dest_dir}"
fi
# for all folders of the dataset
for folder in "${dataset_dir}"/*; do
    curr_folder="${folder##*/}"
    echo "Processing $curr_folder category"
    # get all files
    for item in "${folder}"/*; do
        # if the class dir in dest_dir does not exist -> create it
        if [ ! -d "${dest_dir}/${curr_folder}" ]; then
            mkdir "${dest_dir}/${curr_folder}"
        fi
        # for each file
        if [ -f "${item}" ]; then
            # echo ${item}
            filename=$(basename "$item")
            extension="${filename##*.}"
            filename=$(readlink -e "${item}")
            # get a certain number of patches
            for i in {1..100}
            do
                python cropper.py "${filename}" "${i}" "${dest_dir}"
            done
        fi
    done
done
It needs at least an hour to process all the files.
What happens if I change the '100' to '1000' in the last for loop and launch another instance of the same script?
Will the first process count to 1000, or will it continue to count to 100?
As far as I know, bash does not make the script file read-only while it executes it, so you can edit it. The already running process will count to its original value, 100.
Be careful with the results, though: both instances write to the same output directory, so you have to expect side effects.
"When you make changes to your script, you make the changes on the disk(hard disk- the permanent storage); when you execute the script, the script is loaded to your memory(RAM).
(see https://askubuntu.com/questions/484111/can-i-modify-a-bash-script-sh-file-while-it-is-running )
BUT "You'll notice that the file is being read in at 8KB increments, so Bash and other shells will likely not load a file in its entirety, rather they read them in in blocks."
(see https://unix.stackexchange.com/questions/121013/how-does-linux-deal-with-shell-scripts )
So, in your case, the script is small enough to be read into RAM by the script interpreter in one block, and bash parses the whole for loop before it starts executing it. This means that if you change the value and then execute the script again, the first instance will still have the "old" value.
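A throwaway experiment can confirm this behaviour. The sketch below is only a demo under assumptions: count.sh, the loop bounds 3 and 5, and the log file names are made up for illustration and stand in for the real cropping script. It edits the script on disk while the first instance is still running, then starts a second instance.

```shell
#!/usr/bin/env bash
# Demo with made-up names: edit a script while one instance of it is running.
demo_dir=$(mktemp -d)
script="$demo_dir/count.sh"

# Stand-in for the cropping loop: count slowly to 3.
printf '%s\n' 'for i in {1..3}; do' '    echo "$i"' '    sleep 0.3' 'done' > "$script"

# Start the first instance and wait until it has printed something,
# so we know bash has already read and parsed the loop.
bash "$script" > "$demo_dir/first.log" &
first_pid=$!
until [ -s "$demo_dir/first.log" ]; do sleep 0.1; done

# Change the loop bound on disk, then launch a second instance.
printf '%s\n' 'for i in {1..5}; do' '    echo "$i"' '    sleep 0.3' 'done' > "$script"
bash "$script" > "$demo_dir/second.log"
wait "$first_pid"

# Count how many iterations each instance actually ran.
first_count=$(wc -l < "$demo_dir/first.log")
second_count=$(wc -l < "$demo_dir/second.log")
echo "first instance counted to $first_count, second to $second_count"
rm -rf "$demo_dir"
```

The first instance should keep its original bound while the second one picks up the edited value. Editing text that bash has not parsed yet (for example, commands after a long-running loop) is a different situation and can behave unpredictably.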
I want to run a Bash script every minute (through a CRON entry) to launch a series of Python scripts in a granular (time-wise) fashion.
So far, this is the script I've made:
# set the current date
DATE=`date +%Y-%m-%d`
# set the current system time (HH:MM)
SYSTIME=`date +%H-%M`
# parse all .py script files in the 'daily' folder
for f in daily/*.py; do
    if [[ -f $f ]]; then
        # set the script name
        SCRIPT=$(basename $f)
        # get the script time
        SCRTIME=`echo $SCRIPT | cut -c1-5`
        # execute the script only if its intended execution time and the system time match
        if [[ $SCRTIME == $SYSTIME ]]; then
            # ensure the directory exists
            install -v -m755 -d done/$DATE/failed
            # execute the script
            python3 evaluator.py $f > done/$DATE/daily-$SCRIPT.log &
            # evaluate the result
            if [ $? -eq 0 ]; then
                # move the script to the 'done' folder
                cp $f done/$DATE/daily-$SCRIPT
            else
                # log the failure
                mv done/$DATE/daily-$SCRIPT.log done/$DATE/failed/
                # move the script to the 'retry' folder for retrial
                cp $f retry/daily-$SCRIPT
            fi
        fi
    fi
done
Let's say we have the following files in a folder called daily/ (for daily execution):
daily/08-00-script-1.py
daily/08-00-script-2.py
daily/08-05-script-3.py
daily/09-20-script-4.py
The idea for granular execution is that a CRON task runs this script every minute. I fetch the system time and extract the execution time from each script's file name, and when the two match, the script gets handed over to Python. So far, so good.
I know this script is not right: it starts a subshell for each script to be executed, and, if I read correctly while searching on Google, Bash automatically returns 0 on subshell invocation, so the check that follows is wrong. What I need is for each subshell to execute the code below so that, if the script fails, it gets sent to another folder (retry/), which is handled by another Bash script that runs every 30 minutes for retrial (it's the same script as this one minus the checking part).
So, basically, I need to run this:
# evaluate the result
if [ $? -eq 0 ]; then
    # move the script to the 'done' folder
    cp $f done/$DATE/daily-$SCRIPT
else
    # log the failure
    mv done/$DATE/daily-$SCRIPT.log done/$DATE/failed/
    # move the script to the 'retry' folder for retrial
    cp $f retry/daily-$SCRIPT
fi
For every subshell-ed execution. How can I do this the right way?
Bash returns 0 right away for a command started in the background with &, but if you wait for the command to finish (drop the ampersand, or wait on its PID), you get its real exit status. If python3 relays the exit code of the script, your check will then work. If it still does not catch an error, python3 is not reporting the failure and you need to add some error communication yourself. Redirecting stderr might be helpful, but first verify that your code really does not work.
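One way to keep the jobs parallel and still evaluate each result is to move both the command and the check into a function, then run that function in the background. This is a minimal sketch under assumptions: run_and_sort is a hypothetical helper, mkdir -p replaces the install call from the question, and the demo uses true and false as stand-ins for python3 evaluator.py so that it runs anywhere.

```shell
#!/usr/bin/env bash
# run_and_sort SCRIPT DATE CMD...  (hypothetical helper, not from the question)
# Runs CMD with SCRIPT as its last argument and files the result by exit status.
run_and_sort() {
    local script=$1 date=$2
    shift 2
    local name log
    name=$(basename "$script")
    log="done/$date/daily-$name.log"
    mkdir -p "done/$date/failed" retry
    # The exit status is tested in the same (background) process that ran CMD.
    if "$@" "$script" > "$log" 2>&1; then
        cp "$script" "done/$date/daily-$name"
    else
        mv "$log" "done/$date/failed/"
        cp "$script" "retry/daily-$name"
    fi
}

# Demo in a scratch directory with two fake scripts.
work_dir=$(mktemp -d)
cd "$work_dir" || exit 1
mkdir daily
echo 'print("ok")' > daily/08-00-good.py
echo 'raise SystemExit(1)' > daily/08-00-bad.py

run_and_sort daily/08-00-good.py demo true &    # stand-in for a passing run
run_and_sort daily/08-00-bad.py demo false &    # stand-in for a failing run
wait    # let every background job finish before the script exits
```

In the cron script the call would presumably look like run_and_sort "$f" "$DATE" python3 evaluator.py & inside the loop, with a single wait after it.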
I am trying to write a shell script that writes the output of another script to a file. It should keep writing to that file up to a certain point and then overwrite it, so that the file size stays within a well-bounded range.
while true
do
    ./runscript.sh > test.txt
    sleep 1
done
I have tried to use an infinite loop with sleep so that it keeps overwriting the file.
But it behaves differently: while the command is running, the file size keeps increasing, and when I stop the command, the file size gets reduced.
How can I keep overwriting the same file and keep its size bounded?
Use truncate -s <size> <file> to shrink the file when its size exceeds your limit.
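As a rough illustration of that suggestion, the sketch below checks the size with wc -c and calls truncate only when the limit is exceeded. The names log_file and max_bytes and the 1000-byte limit are assumptions, and a small loop stands in for runscript.sh. The truncate command used here is the one from GNU coreutils.

```shell
#!/usr/bin/env bash
# Sketch with assumed names: cap a log file at max_bytes using truncate.
log_file=$(mktemp)
max_bytes=1000          # assumed limit, pick your own

# Stand-in for ./runscript.sh: append some output to the log.
for i in $(seq 1 500); do
    echo "line $i"
done >> "$log_file"

size_before=$(wc -c < "$log_file")
if [ "$size_before" -gt "$max_bytes" ]; then
    # truncate keeps the first max_bytes bytes and drops everything after them
    truncate -s "$max_bytes" "$log_file"
fi
size_after=$(wc -c < "$log_file")
echo "before: $size_before bytes, after: $size_after bytes"
```

Note that this keeps the oldest data; use truncate -s 0 to start over with an empty file instead. If the producer is still running while the file is truncated, it should append with >> rather than >, since otherwise its next write lands at the old offset.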
I would do it with the script below:
#!/bin/sh
Logfile=test.txt
minimumsize=100000 # size in bytes at which the log is reset
actualsize=$(wc -c <"$Logfile")
# plain [ ] is used because [[ ]] is not guaranteed to exist in /bin/sh
if [ "$actualsize" -ge "$minimumsize" ]; then
    rm -rf "$Logfile"
    sh ./runscript.sh >> test.txt
else
    #current_date_time="`date +%Y%m%d%H%M%S`"; #add this to runscript.sh to track when it was written
    #echo "********Added at :$current_date_time ********" #add this to runscript.sh to track when it was written
    sh ./runscript.sh >> test.txt
fi
I can try the option of generating a new file once the old one is full. … How can I make the script generate the new file and write to it?
The following script, let's call it chop.sh, does that. You use it by piping the output into it and passing the desired file size and file name as arguments, e.g. ./runscript.sh|chop.sh 999999 test.txt.
File=${2?usage: $0 Size File}
Size=$1
while
    set -- `ls -l "$File" 2>/dev/null` # 5th column is file size
    [ "$5" -lt "$Size" ] || mv "$File" "$File"-old
    read -r && echo "$REPLY" >>"$File"
do :
done
The old (full) file would then be named test.txt-old.
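For comparison, here is a sketch of the same rotation idea written as a bash function that measures the file with wc -c instead of parsing ls -l. The function name chop, the 50-byte limit and the scratch directory are assumptions made for the demo; this is a variant, not the chop.sh script itself.

```shell
#!/usr/bin/env bash
# chop SIZE FILE: copy stdin to FILE line by line, moving FILE to FILE-old
# once it has reached SIZE bytes. Hypothetical variant of chop.sh.
chop() {
    local size=$1 file=$2 line
    while IFS= read -r line; do
        # Rotate before writing if the current file is already full.
        if [ -f "$file" ] && [ "$(wc -c < "$file")" -ge "$size" ]; then
            mv "$file" "$file-old"
        fi
        printf '%s\n' "$line" >> "$file"
    done
}

# Demo: 100 short lines with a 50-byte limit.
out_dir=$(mktemp -d)
seq 1 100 | chop 50 "$out_dir/test.txt"
ls -l "$out_dir"
```

Because the size is checked before each line is written, the current file can exceed the limit by at most one line. Only one old generation is kept, as in the original.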
I'm creating a custom script to run backups using Clonezilla. The script is working as expected (so far) but I'm having a little trouble with the variable assignment.
The variable assignment line (maxBackups=4) creates a file called '4' in the current directory and the if test doesn't work properly.
(the way I understand it, the link to the parent directory counts as a directory with this directory counting method.)
What am I doing wrong? I know it is something simple...
Thanks
#!/bin/bash
# Automated usb backup script for Clonezilla
# by DRC
# Begin script
# Store date and time for use in folder name
# to a variable called folderName
folderName=$(date +"%Y-%m-%d--%H-%M")
# mount second partition of USB media for storing
# saved image
mount /dev/sdb2 /home/partimag/
# Determine if there are more than 3 directories
# and terminate the script if there are
cd /home/partimag
maxBackups=4
numberOfBackups=$(find -maxdepth 1 -type d | wc -l)
if [ $numberOfBackups > $maxBackups ]; then
    echo "The maximum number of backups has been reached."
    echo "Please burn the backups to DVD and remove them"
    echo "from the USB key."
    echo "Press ENTER to continue and select 'Poweroff'"
    echo "from the next menu."
    # Wait for the user to press the enter key
    read
# If there are three or less backups, a new backup will be made
else
    /usr/sbin/ocs-sr -q2 -c -j2 -a -z0 -i 2000 -sc -p true savedisk $folderName sda
fi
maxBackups=4 is not creating a file named 4 in your directory. What is creating that file is the if [ $numberOfBackups > $maxBackups ] bit. > is redirection to an output file, since [ is a command, not a keyword. You could try one of these instead:
if [ $numberOfBackups -gt $maxBackups ] # -gt is test's version of greater than
if [[ $numberOfBackups > $maxBackups ]] # double brackets are keywords, not commands, but > here compares strings, so 10 sorts before 4
if (( numberOfBackups > maxBackups )) # arithmetic context doesn't even require $
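The differences between these forms are easy to see in a scratch directory. The sketch below is a demo under assumptions: the values 10 and 4 are picked to expose the string comparison case, and the result variable names are made up.

```shell
#!/usr/bin/env bash
# Demo in a scratch directory; the values 10 and 4 are chosen for illustration.
cd "$(mktemp -d)" || exit 1
numberOfBackups=10
maxBackups=4

# 1. The original test: '>' redirects, so this runs `[ 10 ]` and creates a file named 4.
if [ $numberOfBackups > $maxBackups ]; then redirect_result=true; else redirect_result=false; fi
if [ -e 4 ]; then file_created=yes; else file_created=no; fi

# 2. Numeric comparison with test's -gt operator.
if [ "$numberOfBackups" -gt "$maxBackups" ]; then gt_result=true; else gt_result=false; fi

# 3. [[ ... > ... ]] compares strings: "10" sorts before "4".
if [[ $numberOfBackups > $maxBackups ]]; then string_result=true; else string_result=false; fi

# 4. Arithmetic context compares numbers.
if (( numberOfBackups > maxBackups )); then arith_result=true; else arith_result=false; fi

echo "redirect=$redirect_result file_created=$file_created -gt=$gt_result [[>]]=$string_result (( ))=$arith_result"
```

For a numeric limit like maxBackups, the -gt and (( )) forms are the ones that compare the way the script intends.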
I know you can create a log of the output by typing in script nameOfLog.txt and exit in terminal before and after running the script, but I want to write it in the actual script so it creates a log automatically. There is a problem I'm having with the exec >>log_file 2>&1 line:
The code redirects the output to a log file and a user can no longer interact with it. How can I create a log where it just basically copies what is in the output?
And, is it possible to have it also automatically record the process of files that were copied? For example, if a file at /home/user/Deskop/file.sh was copied to /home/bckup, is it possible to have that printed in the log too or will I have to write that manually?
Is it also possible to record the amount of time it took to run the whole process and count the number of files and directories that were processed or am I going to have to write that manually too?
My future self appreciates all the help!
Here is my whole code:
#!/bin/bash
collect()
{
    find "$directory" -name "*.sh" -print0 | xargs -0 cp -t ~/bckup #xargs handles files names with spaces. Also gives error of "cp: will not overwrite just-created" even if file didn't exist previously
}
echo "Starting log"
exec >>log_file 2>&1
timelimit=10
echo "Please enter the directory that you would like to collect.
If no input in 10 secs, default of /home will be selected"
read -t $timelimit directory
if [ ! -z "$directory" ] #if directory doesn't have a length of 0
then
    echo -e "\nYou want to copy $directory." #-e is so the \n will work and it won't show up as part of the string
else
    directory=/home/
    echo "Time's up. Backup will be in $directory"
fi
if [ ! -d ~/bckup ]
then
    echo "Directory does not exist, creating now"
    mkdir ~/bckup
fi
collect
echo "Finished collecting"
exit 0
To answer the "how to just copy the output" question: use a program called tee and then a bit of exec magic explained here:
redirect COPY of stdout to log file from within bash script itself
Regarding the analytics (time needed, files accessed, etc) -- this is a bit harder. Some programs that can help you are time(1):
time - run programs and summarize system resource usage
and strace(1):
strace - trace system calls and signals
Check the man pages for more info. If you have control over the script, it will probably be easier to do the logging yourself instead of parsing strace output.
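Doing the logging yourself could look roughly like the sketch below. It is an illustration under assumptions, not the linked answer: instead of the exec form (commonly written as exec > >(tee -a log_file) 2>&1), it wraps the work in a hypothetical main function and pipes that through tee, which keeps the output on screen and in the log. The scratch directories and file names are made up; cp -v and cp -t are GNU options, and SECONDS is bash's built-in elapsed-time counter.

```shell
#!/usr/bin/env bash
# Scratch source, destination and log, created only for this demo.
src=$(mktemp -d)
dest=$(mktemp -d)
log_file=$(mktemp)
touch "$src/a.sh" "$src/b c.sh" "$src/notes.txt"

collect() {
    # cp -v prints one line per copied file, so each copy shows up in the log
    find "$1" -name '*.sh' -print0 | xargs -0 cp -v -t "$2"
}

main() {
    SECONDS=0    # bash increments this variable once per second
    echo "Starting log"
    collect "$src" "$dest"
    echo "Copied $(find "$dest" -type f | wc -l) files in ${SECONDS}s"
    echo "Finished collecting"
}

# stdout and stderr go to the terminal and are appended to the log file
main 2>&1 | tee -a "$log_file"
```

Since only stdout and stderr are piped, a read prompt inside main still reaches the user. One trade-off of the pipe form is that main runs in a subshell, so variables it sets are not visible after the pipeline.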
Could someone help me understand the following piece of code, which decides on the start and end dates used to pick data out of a db?
# Get the current time as the stop time.
#
stoptime=`date +"%Y-%m-%d %H:00"`
if test $? -ne 0
then
    echo "Failed to get the date"
    rm -f $1/.optpamo.pid
    exit 4
fi
#
# Read the lasttime file to get the start time
#
if test -f $1/optlasttime
then
    starttime=`cat $1/optlasttime`
    # if the length of the string is zero
    # (lasttime is empty), the file is updated
    # (and I wait for the following hour)
    if test -z "$starttime"
    then
        echo "Empty file lasttime"
        echo $stoptime > $1/optlasttime
        rm -f $1/.optpamo.pid
        exit 5
    fi
else
    # If lasttime does not exist, I create it with the present date
    # and I wait for the following hour
    echo "File lasttime does not exist"
    echo $stoptime > $1/optlasttime
    rm -f $1/.optpamo.pid
    exit 6
fi
Thanks
The script checks whether there's a non-empty file named optlasttime in the directory specified as an argument ($1). If so, its contents are read into starttime and this fragment falls through without exiting (status 0 if nothing else follows). If the file doesn't exist or is empty, the current hour formatted as 2010-01-07 14:00 is written to the file, another file named .optpamo.pid is deleted from the argument directory and the script exits unsuccessfully (status 5 if the file was empty, 6 if it was missing).
This script is obviously a utility being called by some outer process, to which you need to refer for full understanding.
1.) Sets stoptime to the current time, rounded down to the hour
2.) Checks if file $1/optlasttime exists (where $1 is passed into the script)
a.) if $1/optlasttime exists, it reads the contents of the file into starttime (any contents are assumed to be a timestamp); if the file is empty, it writes stoptime to it and exits
b.) if $1/optlasttime does not exist, it populates the $1/optlasttime file with the stoptime and exits.
I copied and pasted a small snippet of this into a file I called temp.ksh
stoptime=`date +"%Y-%m-%d %H:00"`
if test $? -ne 0
then
    echo "Failed to get the date"
    rm -f $1/.optpamo.pid
    exit 4
fi
Then I ran it at the commandline, like so:
zhasper#berens:~$ ksh -x ./temp.ksh
+ date '+%Y-%m-%d %H:00'
+ stoptime='2010-01-08 18:00'
+ test 0 -ne 0
The -x flag to ksh makes it print out each commandline, in full, as it executes. Comparing what you see here with the snippet of shell script above should tell you something about how ksh is interpreting the file.
If you run this over the whole file you should get a good feel for what it's doing.
To learn more, you can read man ksh, or search for ksh scripting tutorial online.
Together, these three things should help you learn a lot more than us simply telling you what the script does.