Update var with value from while loop - bash

I have the following script and I want to overwrite the progress variable with the value from the loop, but I'm struggling to get this working. I've read many threads, but all of them seem to suggest writing the output to a file and then reading it back, which I don't want to do.
#!/bin/bash
function Upload {
find . -type f -name "*.test" -printf "%f\n"| sort -k1 | while read fname; do
progressBar=$(echo "scale=2 ; $progress + $percentage" | bc)
echo "bar: $progressBar"
progress=$progressBar
done
}
progress=0
totalFiles=$(find . -name "*.test" | wc -l)
totalCalc=$(($totalFiles + 1))
percentage=$(echo "scale=2 ; 100 / $totalCalc" | bc)
Upload
echo $progress
How can I get the var outside the loop/subshell and overwrite the main var?

As you already correctly point out in your question, the loop body runs in a child process (each part of a pipeline runs in a subshell), so even if you exported the variable, a change could not be seen in the parent process.
Writing the value to a file and retrieving it in your "main" program is easy if you do it properly:
In your main program (before calling your function), set up the file like this:
export progressBarFile=/tmp/progressBar.$$
Since $$ expands to the process ID of the running script, concurrent runs of your script each use their own file.
In your loop, do a
echo "$progressBar" >"$progressBarFile"
Then, after your function, fetch the value using
progressBar=$(<"$progressBarFile")
In the end, you can erase this file like this:
rm "$progressBarFile"
If you are really paranoid about leaving a temporary file behind, you could use trap to catch a premature abort of your script, and erase the file inside the trap handler.
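The trap idea could look roughly like the sketch below. It is only an illustration, not the asker's script: it assumes mktemp is available and uses it instead of the fixed /tmp/progressBar.$$ name, and the numbers 10, 20, 30 are stand-ins for whatever the real loop computes.

```shell
#!/bin/bash
# Create a private temporary file (mktemp is an assumption here;
# the answer above used /tmp/progressBar.$$ instead).
progressBarFile=$(mktemp)

# Remove the file when the script exits.
trap 'rm -f "$progressBarFile"' EXIT

# The loop is the last part of a pipeline, so it runs in a subshell
# and hands its latest value back through the file.
printf '%s\n' 10 20 30 | while read -r n; do
  echo "$n" > "$progressBarFile"
done

# Back in the parent shell: read the last value the loop wrote.
progress=$(<"$progressBarFile")
echo "$progress"
```

With the trap in place, the explicit rm at the end of the script is no longer needed.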
Another possibility (without using a file) would be to use an array:
files=( $(find . -type f -name "*.test" -printf "%f\n"| sort -k1) )
for fname in "${files[@]}"
do
...
done
In this case, your loop body is not a child process.
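A minimal sketch of the array variant, under a few assumptions: list_files is a hypothetical stand-in for the find | sort pipeline, mapfile (bash 4 or later) replaces the unquoted $(...) so that names containing spaces survive, and integer arithmetic stands in for the bc calls to keep the example self-contained.

```shell
#!/bin/bash
# Hypothetical stand-in for:
#   find . -type f -name "*.test" -printf "%f\n" | sort -k1
list_files() { printf '%s\n' "b file.test" "a.test" "c.test" | sort; }

# mapfile stores one array element per line, so names with spaces stay intact.
mapfile -t files < <(list_files)

progress=0
# Integer stand-in for the bc calculation in the question.
percentage=$(( 100 / (${#files[@]} + 1) ))

# This loop runs in the current shell, so progress keeps its value afterwards.
for fname in "${files[@]}"; do
  progress=$(( progress + percentage ))
done

echo "$progress"
```

Because no pipeline feeds the for loop, the final echo sees the updated value.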

Related

Reading filenames from a structured file to a bash script

I have a file with a structured list of filenames (file1.sh, file2.sh, ...) and would like to loop over the file names inside a bash script.
cat /home/flora/logs/9681-T13:17:07.091363777.org
%rec: dynamic
Ptrn: Gnu
File: /home/flora/comint.rc
+ /home/flora/engine.rc
+ /home/flora/playa.rc
+ /home/flora/edva.rc
+ /home/flora/dyna.rc
+ /home/flora/lin.rc
I have started with
while read -r fl; do
echo "$fl" | grep -oE '[/].+'
done < "$logfl"
But I want to be more specific by matching the File: , then continue reading the rest using + as a continuation character.
Bash doesn't impose a limit on the size of variables (other than available memory). That said, I would start by processing the lines one by one:
#!/bin/bash
while read -r _ f
do
process "$f"
done
where process is whatever function you need to implement.
If you want the names in a variable, use an array like this:
#!/bin/bash
while read -r _ f
do
files+=("$f")
done
In either case, pass the input file to the script with:
your_script < /home/flora/logs/27043-T13:09:44.893003954.log
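The loops above take the second field of every line, including the %rec: and Ptrn: lines. To match File: specifically and then follow the + continuation lines, as the question asks, something like the following sketch might work. The function name read_file_list and the global array files are hypothetical names, and the here-document is a shortened copy of the sample file.

```shell
#!/bin/bash
# Collect the path after "File:" and after each "+" line that directly follows.
# read_file_list and the global array "files" are illustrative names.
read_file_list() {
  local key value in_files=0
  files=()
  while read -r key value; do
    case $key in
      'File:')
        in_files=1
        files+=("$value")
        ;;
      '+')
        # Only accept a continuation while inside a File: record.
        if (( in_files )); then files+=("$value"); fi
        ;;
      *)
        in_files=0
        ;;
    esac
  done
}

read_file_list <<'EOF'
%rec: dynamic
Ptrn: Gnu
File: /home/flora/comint.rc
+ /home/flora/engine.rc
+ /home/flora/playa.rc
EOF

printf '%s\n' "${files[@]}"
```

In a real script the here-document would be replaced by a redirect from the log file, for example read_file_list < "$logfl".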

Want to add the headers in text file in shell script

I want to add a header at the start of the file. I use the following code for that, but it does not add the header. Can you please help me find what I am doing wrong?
start_MERGE_JobRec()
{
FindBatchNumber
export TEMP_SP_FORMAT="Temp_${file_indicator}_SP_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_INSTANCE[0-9][0-9].txt"
export TEMP_SS_FORMAT="Temp_${file_indicator}_SS_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_INSTANCE[0-9][0-9].txt"
export TEMP_SG_FORMAT="Temp_${file_indicator}_SG_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_INSTANCE[0-9][0-9].txt"
export TEMP_GS_FORMAT="Temp_${file_indicator}_GS_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_INSTANCE[0-9][0-9].txt"
export SP_OUTPUT_FILE="RTBCON_${file_indicator}_SP_${ONLINE_DATE}${TIME}_${BATCH_NUMBER}.txt"
export SS_OUTPUT_FILE="RTBCON_${file_indicator}_SS_${ONLINE_DATE}${TIME}_${BATCH_NUMBER}.txt"
export SG_OUTPUT_FILE="RTBCON_${file_indicator}_SG_${ONLINE_DATE}${TIME}_${BATCH_NUMBER}.txt"
export GS_OUTPUT_FILE="RTBCON_${file_indicator}_GS_${ONLINE_DATE}${TIME}_${BATCH_NUMBER}.txt"
#---------------------------------------------------
# Add header at the start for each file
#---------------------------------------------------
awk '{print "recordType|lifetimeId|MSISDN|status|effectiveDate|expiryDate|oldMSISDN|accountType|billingAccountNumber|usageTypeBen|IMEI|IMSI|cycleCode|cycleMonth|firstBillExperience|recordStatus|failureReason"$0}' >> $SP_OUTPUT_FILE
find . -maxdepth 1 -type f -name "${TEMP_SP_FORMAT}" -exec cat {} + >> $SP_OUTPUT_FILE
find . -maxdepth 1 -type f -name "${TEMP_SS_FORMAT}" -exec cat {} + >> $SS_OUTPUT_FILE
find . -maxdepth 1 -type f -name "${TEMP_SG_FORMAT}" -exec cat {} + >> $SG_OUTPUT_FILE
find . -maxdepth 1 -type f -name "${TEMP_GS_FORMAT}" -exec cat {} + >> $GS_OUTPUT_FILE
}
I use awk to add the header but it's not working.
Awk runs its main block once per input line, so it prints nothing until it receives input; with no file argument, it waits for input on standard input.
A common way to force Awk to print something even when there is no input is to put the print statement in a BEGIN block:
awk 'BEGIN { print "something" }' /dev/null
but if you want to prepend a header to all the output files, I don't see why you are using Awk here at all, let alone printing the header in front of every output line. Are you looking for this, instead?
echo 'recordType|lifetimeId|MSISDN|status|effectiveDate|expiryDate|oldMSISDN|accountType|billingAccountNumber|usageTypeBen|IMEI|IMSI|cycleCode|cycleMonth|firstBillExperience|recordStatus|failureReason' |
tee "$SS_OUTPUT_FILE" "$SG_OUTPUT_FILE" "$GS_OUTPUT_FILE" >"$SP_OUTPUT_FILE"
Notice also how we generally always quote shell variables unless we specifically want the shell to perform word splitting and wildcard expansion on their values, and avoid upper case for private variables.
There also does not seem to be any reason to export your variables -- neither Awk nor find pays any attention to them, and there are no other processes here. The purpose of export is to make a variable visible to the environment of subprocesses. You might want to declare them as local, though.
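To illustrate what export does and does not change, here is a small sketch. The variable names demo_plain and demo_shared are made up for the example and are assumed not to exist in the environment already.

```shell
#!/bin/bash
# demo_plain and demo_shared are hypothetical names used only for this demo.
demo_plain="not exported"
export demo_shared="exported"

# A child process receives only the exported variable in its environment.
child_view=$(sh -c 'echo "${demo_plain:-unset}/${demo_shared:-unset}"')
echo "$child_view"    # prints: unset/exported
```

Shell functions run in the current shell, so they see both variables without any export.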
Perhaps break out a second function to avoid all this code repetition, anyway?
merge_individual_job() {
echo 'recordType|lifetimeId|MSISDN|status|effectiveDate|expiryDate|oldMSISDN|accountType|billingAccountNumber|usageTypeBen|IMEI|IMSI|cycleCode|cycleMonth|firstBillExperience|recordStatus|failureReason'
find . -maxdepth 1 -type f -name "$1" -exec cat {} +
}
start_MERGE_JobRec()
{
FindBatchNumber
local id
for id in SP SS SG GS; do
merge_individual_job \
"Temp_${file_indicator}_${id}_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_INSTANCE[0-9][0-9].txt" \
>"RTBCON_${file_indicator}_${id}_${ONLINE_DATE}${TIME}_${BATCH_NUMBER}.txt"
done
}
If FindBatchNumber sets the variable file_indicator, a more idiomatic and less error-prone approach is to have it just echo it, and have the caller assign it:
file_indicator=$(FindBatchNumber)

Integrate bash function return values

I have a function:
something() {
if [ something ]; then
echo "Something.";
return 0;
else
echo "Not something.";
return 1;
fi
}
I call it in a loop, it actually validates some files and counts how many files were valid:
find . -type l | while read line ; do something "$line"; done
I need to count how many files were invalid, that is, how many times the function has returned 1. I was thinking about this:
INVALID=0;
find . -type l | while read line ; do INVALID=$(($INVALID + something "$line")); done
Needless to say, bash doesn't buy it. Note a few things:
The info within something must be printed to stdout.
The info printed does not always correlate with file validity in my code. It's just info for the user.
The return value isn't directly available for arithmetic like that. You can either call the function then access $?, or branch based on the result of the function, like so:
INVALID=0
while IFS= read -r line; do
something "$line" || ((++INVALID))
done < <(find . -type l)
Also note that changes made to variables inside a pipeline are lost when the pipeline finishes. Each part of a pipeline runs in a subshell with its own copy of the variables. You'll need to restructure the loop to run without a pipeline for the changes to $INVALID to stick: change find | loop to loop < <(find).
It's also a good idea to use read -r to disable backslash escapes and clear out $IFS to handle lines with leading whitespace better.
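The difference between the two loop forms, and the $? variant mentioned above, can be seen in a self-contained sketch. The body of something and the helper list_names are hypothetical stand-ins for the real validator and for find . -type l.

```shell
#!/bin/bash
# Hypothetical validator: only names ending in .ok count as valid.
something() {
  case $1 in
    *.ok) echo "Something."; return 0 ;;
    *)    echo "Not something."; return 1 ;;
  esac
}

# Stand-in for: find . -type l
list_names() { printf '%s\n' a.ok b.bad c.bad; }

# Pipeline: the loop runs in a subshell, so the increments are lost.
lost=0
list_names | while IFS= read -r line; do
  something "$line" >/dev/null || lost=$((lost + 1))
done

# Process substitution: the loop runs in the current shell.
INVALID=0
while IFS= read -r line; do
  something "$line" >/dev/null
  INVALID=$((INVALID + $?))   # $? holds the function's return value (0 or 1)
done < <(list_names)

echo "pipeline: $lost, process substitution: $INVALID"
```

With bash's default settings (lastpipe off) this should print "pipeline: 0, process substitution: 2". The output of something is discarded here only to keep the demo quiet.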

Why is this bash for loop slow?

I am trying this code:
for f in jobs/UPDTEST/apples* ; do
nf=`echo $f | sed s:jobs\/::g`
echo $nf | tr '_' ' '
done > jobs
There are 750 apples*-type text files. Since I am only manipulating the file names, I would have thought it should be quick, but it takes about 5 minutes.
Is there an alternative way to do this?
You can use parameter expansions like ${parameter/pattern/string} to get rid of the calls to sed and tr. In your case it could look like:
for f in jobs/UPDTEST/apples*; do
f=${f//jobs\//}
echo "${f//_/ }"
done > jobs
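The same expansions can also be applied to a whole array at once, which removes the loop as well as the external processes. A small sketch, with two made-up file names standing in for the jobs/UPDTEST/apples* glob:

```shell
#!/bin/bash
# Hypothetical sample names; in the real script this would be
#   files=(jobs/UPDTEST/apples*)
files=(jobs/UPDTEST/apples_red_01 jobs/UPDTEST/apples_green_02)

# Strip the leading "jobs/" from every element.
files=("${files[@]#jobs/}")

# Replace every underscore with a space in every element, one name per line.
out=$(printf '%s\n' "${files[@]//_/ }")
echo "$out"
```

For the two sample names this prints "UPDTEST/apples red 01" and "UPDTEST/apples green 02" on separate lines.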
First, running cd jobs beforehand would remove the need for sed.
Second, you don't need tr to substitute characters in the value of a bash variable.
Third, with find you don't need a loop at all.
f=$(cd jobs && find UPDTEST -maxdepth 1 -name 'apples*')
echo "${f//_/ }" > jobs.log
By the way, you can't have a jobs directory and a jobs file in the same directory.

How can I copy all my disorganized files into a single directory? (on linux)

I have thousands of mp3s inside a complex folder structure which resides within a single folder. I would like to move all the mp3s into a single directory with no subfolders. I can think of a variety of ways of doing this using the find command, but one problem will be duplicate file names. I don't want to replace files since I often have multiple versions of the same song. Auto-rename would be best. I don't really care how the files are renamed.
Does anyone know a simple and safe way of doing this?
You could change a path such as a/b/c.mp3 into "a - b - c.mp3" while copying. Here's a solution in Bash:
find srcdir -name '*.mp3' -printf '%P\n' |
while IFS= read -r i; do
j="${i//\// - }"
cp -v "srcdir/$i" "dstdir/$j"
done
And in a shell without ${//} substitution:
find srcdir -name '*.mp3' -printf '%P\n' |
sed -e 'p;s:/: - :g' |
while IFS= read -r i; do
IFS= read -r j
cp -v "srcdir/$i" "dstdir/$j"
done
For a different scheme, GNU's cp and mv can make numbered backups instead of overwriting -- see -b/--backup[=CONTROL] in the man pages.
find srcdir -name '*.mp3' -exec cp -v --backup=numbered {} dstdir/ \;
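In Bash (a runnable version of the same idea; move_to_dir is assumed to lie outside the tree being searched):
find . -name '*.mp3' -print0 | while IFS= read -r -d '' i; do
  new_name=$(basename "$i")
  x=0
  # While the target name is taken, try the next numbered variant.
  while [ -e "move_to_dir/$new_name" ]; do
    x=$((x + 1))
    new_name="$(basename "$i" .mp3)_$x.mp3"
  done
  mv "$i" "move_to_dir/$new_name"
done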
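#!/bin/bash
NEW_DIR=/tmp/new/
# Split find's output on newlines only, so names with spaces survive.
IFS="
"
for a in $(find . -type f); do
  echo "$a"
  new_name=$(basename "$a")
  # Append underscores until the name is free in the target directory.
  while test -e "$NEW_DIR/$new_name"; do
    new_name="${new_name}_"
  done
  cp "$a" "$NEW_DIR/$new_name"
done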
I'd tend to do this in a simple script rather than try to fit it into a single command line.
For instance, in Python, it would be relatively trivial to walk() through the directory, copying each mp3 file found to a different directory with an automatically incremented number.
If you want to get fancier, you could have a dictionary of existing file names, and simply append a number to the duplicates. (the index of the dictionary being the file name, and the value being the number of files found so far, which would become your suffix)
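The answer has a Python dictionary in mind; the same bookkeeping can be sketched in Bash with an associative array (bash 4 or later), which is what the block below does. The names seen, unique_name and result are hypothetical, the sample paths are made up, and the sketch assumes every file name has an extension. It only computes target names and does not copy anything.

```shell
#!/bin/bash
# seen maps each file name to the number of times it has occurred so far.
declare -A seen

# Sets the global "result" to a name that has not been handed out before.
# It must be called directly (not via $(...)), otherwise the update to
# "seen" would happen in a subshell and be lost.
unique_name() {
  local name=$1 base ext n
  n=${seen[$name]:-0}
  seen[$name]=$((n + 1))
  if (( n == 0 )); then
    result=$name
  else
    base=${name%.*}     # part before the last dot
    ext=${name##*.}     # part after the last dot
    result="${base}_$n.$ext"
  fi
}

names=()
for path in a/song.mp3 b/song.mp3 c/other.mp3 d/song.mp3; do
  unique_name "${path##*/}"   # ${path##*/} is the file name without directories
  names+=("$result")
done

printf '%s\n' "${names[@]}"
```

For the sample paths this yields song.mp3, song_1.mp3, other.mp3 and song_2.mp3. A source file that is itself literally named song_1.mp3 could still collide, so a real script would want an existence check before copying.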
find /path/to/mp3s -name '*.mp3' -exec mv \{\} /path/to/target/dir \;
At the risk of many downvotes, a Perl script could be written in a short time to accomplish this.
Pseudocode:
while (-e filename)
change filename to filename . "1";
In Python (to actually move the files, set debug = False):
import os
import re

from_dir = "/from/dir"
to_dir = "/target/dir"
re_ext = r"\.mp3"
debug = True  # dry run: only print what would be moved

for d, _dirs, names in os.walk(from_dir):
    # Keep only names ending in the wanted extension (case-insensitive).
    names = [fn for fn in names if re.match(".*(%s)$" % re_ext, fn, re.I)]
    for fn in names:
        from_fn = os.path.join(d, fn)
        target_fn = os.path.join(to_dir, fn)
        if debug:
            print("MOVE", from_fn, "TO", target_fn)
        elif os.path.exists(target_fn):
            # Never overwrite an existing file in the target directory.
            print("DO NOT MOVE - FILE EXISTS", from_fn)
        else:
            os.rename(from_fn, target_fn)
Since you don't care how the duplicate files are named, utilize the 'backup' option on move:
find /path/to/mp3s -name '*.mp3' -exec mv --backup=numbered {} /path/to/target/dir \;
This will get you:
song.mp3
song.mp3.~1~
song.mp3.~2~

Resources