How can I get my bash script to work? - bash

My bash script doesn't work the way I want it to:
#!/bin/bash
total="0"
count="0"
#FILE="$1" This is the easier way
for FILE in $*
do
# Start processing all processable files
while read line
do
if [[ "$line" =~ ^Total ]];
then
tmp=$(echo $line | cut -d':' -f2)
count=$(expr $count + 1)
total=$(expr $total + $tmp)
fi
done < $FILE
done
echo "The Total Is: $total"
echo "$FILE"
Is there another way to modify this script so that it reads arguments into $1 instead of $FILE? I've tried using a while loop:
while [ $1 != "" ]
do ....
done
Also, when I implement that, the code repeats itself. Is there a way to fix that as well?
Another problem I'm having is that when I have multiple files matching hi*.txt it gives me duplicates. Why? I have files like hi1.txt and hi1.txt~, but the tilde file is 0 bytes, so my script shouldn't be finding anything in it.
What I have works, but could be improved. I appreciate your awk suggestions, but that's currently beyond my level as a Unix programmer.
Strager: The files that my text editor generates automatically contain nothing; they are 0 bytes. But I went ahead and deleted them just to be sure. No, my script is in fact reading everything twice; I suppose it's looping again when it really shouldn't. I've tried to silence that with exit commands, but wasn't successful.
while [ "$1" != "" ]; do
# Code here
# Next argument
shift
done
This code is pretty sweet, but I'm specifying all the possible files at one time. Example: hi[145].txt
If supplied, that would read all three files at once.
Suppose the user enters hi*.txt;
I then get all my hi files read twice and then added again.
How can I code it so that it reads my files (just once) when hi*.txt is specified?
I really think that this is because of not having $1.

It looks like you are trying to add up the totals from the lines labelled 'Total:' in the files provided. It is always a good idea to state what you're trying to do - as well as how you're trying to do it (see How to Ask Questions the Smart Way).
If so, then you're doing it in about as complicated a way as I can imagine. What was wrong with:
grep '^Total:' "$@" |
cut -d: -f2 |
awk '{ sum += $1 }
     END { print sum }'
This doesn't print out "The total is" etc; and it is not clear why you echo $FILE at the end of your version.
You can use Perl or any other suitable program in place of awk; you could do the whole job in Perl or Python - indeed, the cut work could be done by awk:
grep '^Total:' "$@" |
awk -F: '{ sum += $2 }
         END { print sum }'
Taken still further, the whole job could be done by awk:
awk -F: '$1 ~ /^Total/ { sum += $2 }
         END { print sum }' "$@"
The code in Perl wouldn't be much harder and the result might be quicker:
perl -na -F: -e '$sum += $F[1] if m/^Total:/; END { print $sum; }' "$@"
When iterating over the file name arguments provided in a shell script, you should use "$@" in place of $*, as the latter notation does not preserve spaces in file names.
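The difference is easy to demonstrate with a small sketch (the filenames here are made up):

```shell
# Print each argument on its own line, bracketed, to expose word splitting.
show() { for a in "$@"; do printf '[%s]\n' "$a"; done; }

set -- "my file.txt" "other.txt"
show "$@"   # two arguments: [my file.txt] then [other.txt]
show $*     # unquoted $* re-splits on whitespace: [my] [file.txt] [other.txt]
```

With `"$@"` each original argument survives intact; with unquoted `$*` the name containing a space becomes two words.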
Your comment about '$1' is confusing to me. You could be asking to read from the file whose name is in $1 on each iteration; that is done using:
while [ $# -gt 0 ]
do
...process $1...
shift
done
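For completeness, here is a sketch of that idiom applied to the totalling task from the question (assuming the same "Total: N" line format and integer totals, as in the original script; the function name is my own):

```shell
# Sketch: sum the integers after "Total:" across all files given as arguments.
sum_totals() {
    total=0
    while [ $# -gt 0 ]
    do
        while IFS= read -r line
        do
            case $line in
                Total:*) total=$((total + ${line#Total:}));;
            esac
        done < "$1"
        shift
    done
    echo "The Total Is: $total"
}
```

Call it as `sum_totals file1 file2 ...`, or as `sum_totals "$@"` from inside a script.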
HTH!

If you define a function, it'll receive the argument as $1. Why is $1 more valuable to you than $FILE, though?
#!/bin/sh
process() {
echo "doing something with $1"
}
for i in "$@" # Note use of "$@" so filenames with whitespace are not split
do
process "$i"
done

while [ "$1" != "" ]; do
# Code here
# Next argument
shift
done
On your problem with tilde files: those are temporary files created by your text editor. Delete them if you don't want them matched by your glob expression (wildcard). Otherwise, filter them out in your script (not recommended).

Related

use bash to update a files date/time stamp

I would like to use this code snippet to update a file's date and time stamp using the file name:
Example file names:
2009.07.04-03.42.01.mov
2019.06.08-01.12.08.mov
I get the following error "The action “Run Shell Script” encountered an error: “touch: out of range or illegal time specification: [[CC]YY]MMDDhhmm[.SS]”
How would I modify this code snippet?
for if in "$@"
do
date_Time=$(echo "$if" | awk '{ print substr( $0, 1, length($0)-7 ) }' | sed 's/\.//g' | sed 's/-//')
touch -t "$date_Time" "$if"
done
UPDATE (01/05/2022):
I would also like the code to work for the following filename formats...
And file names with no time info (time would default to 12pm):
2009.07.04.mov
2019.06.08.mov
And file names with description info:
2009.07.04-file-description.mov
2019.06.08-video-file info.mp4
2019.06.08-video-old-codec.avi
The error message suggests that you passed in file names which do not match your examples. Perhaps modify your code to display an error message if it is called with no files at all, and to strip the directory part if it is passed paths containing directories.
As an aside, if is a keyword, so you probably don't want to use it as a variable name, even though it is possible.
#!/bin/sh
if [ $# -eq 0 ]; then
echo "Syntax: $0 files ..." >&2
exit 1
fi
for f in "$@"
do
date_Time=$(echo "$f" | awk '{ sub(/.*\//, ""); gsub(/[^0-9]+/, ""); print substr($0, 1, 12) }')
touch -t "$date_Time" "$f"
done
Notice also how I factored out the sed scripts; Awk can do everything sed can do, so I included the final transformation in the main script. (As an aside, sed 's/[-.]//g' would do both in one go; or you could do sed -e 's/\.//g' -e 's/-//' with a single sed invocation.)
If you use Bash, you could simplify this further:
#!/bin/bash
if [ $# -eq 0 ]; then
echo "Syntax: $0 files ..." >&2
exit 1
fi
for f in "$@"
do
base=${f##*/}
dt=${base//[!0-9]/}
dt=${dt:0:12}
case $dt in
[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9])
touch -t "$dt" "$f";;
*) echo "$0: $f did not seem to contain a valid date ($dt)" >&2;;
esac
done
Notice also how the code now warns if it cannot extract exactly 12 digits from the file name. The parameter expansions are somewhat clumsy, but a lot more efficient than calling Awk on each file name separately (and the Awk code wasn't particularly elegant or robust anyway).
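The expansions are easy to verify interactively (the path below is hypothetical):

```shell
f='/some/dir/2009.07.04-03.42.01.mov'  # hypothetical path
base=${f##*/}        # strip leading directories -> 2009.07.04-03.42.01.mov
dt=${base//[!0-9]/}  # drop every non-digit      -> 20090704034201
dt=${dt:0:12}        # keep CCYYMMDDhhmm         -> 200907040342
echo "$dt"
```

Note that `${base//[!0-9]/}` and `${dt:0:12}` are Bash extensions; they are not available in plain /bin/sh.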
Quick and small bash function
setDateFileFromName() {
local _file _dtime
for _file ;do
_dtime="${_file%.*.mov}"
_dtime="${_dtime##*/}"
touch -t ${_dtime//[!0-9]/} "$_file"
done
}
Then
setDateFileFromName '/path/to store dir/'????.??.??-??.??.??.mov
Remark: this works with filenames formatted like your samples. Any change in the filename format will break it!

Reformatting a csv file, script is confused by ' %." '

I'm using bash on cygwin.
I have to take a .csv file that is a subset of a much larger set of settings and shuffle the new csv settings (same keys, different values) into the 1000-plus-line original, making a new .json file.
I have put together a script to automate this. The first step in the process is to "clean up" the csv file by extracting lines that start with "mme " and "sms ". Everything else is to pass through cleanly to the "clean" .csv file.
This routine is as follows:
# clean up the settings, throwing out mme and sms entries
cat extract.csv | while read -r LINE; do
if [[ $LINE == "mme "* ]]
then
printf "$LINE\n" >> mme_settings.csv
elif [[ $LINE == "sms "* ]]
then
printf "$LINE\n" >> sms_settings.csv
else
printf "$LINE\n" >> extract_clean.csv
fi
done
My problem is that this thing stubs its toe on the following string at the end of one entry: 100%." When it's done with the line, it simply elides the %." and the new-line marker following it, and smears the two lines together:
... 100next.entry.keyname...
I would love to reach in and simply escape the % sign by hand, but that's not a realistic option for my use case. Clearly I'm missing something. My suspicion is that I am somehow abusing cat or read in the first line.
If there is some place I should have looked to find the answer before bugging you all, by all means point me in that direction and I'll sod off.
The syntax for printf is:
printf format [argument]...
In the printf format string, anything introduced by % is a format specifier, as described in the link above. What you want is:
while read -r line; do # Replaced LINE with line; all-uppercase variable names are reserved for the system
if [[ "$line" = "mme "* ]] # Here * globs anything that comes next
then
printf "%s\n" "$line" >> mme_settings.csv
elif [[ "$line" = "sms "* ]]
then
printf "%s\n" "$line" >> sms_settings.csv
else
printf "%s\n" "$line" >> extract_clean.csv
fi
done < extract.csv # Avoids the useless use of cat
As pointed out, your problem is expanding a parameter containing a formatting instruction in the formatting argument of printf, which can be solved by using echo instead or moving the parameter to be expanded out of the formatting string, as demonstrated in other answers.
I recommend not looping over your whole file with Bash in the first place, as it's notoriously slow; you're extracting lines starting with certain patterns, which is a job at which grep excels:
grep '^mme ' extract.csv > mme_settings.csv
grep '^sms ' extract.csv > sms_settings.csv
grep -v '^mme \|^sms ' extract.csv > extract_clean.csv
The third command uses the -v option (print lines that don't match) with alternation to exclude lines starting with either mme or sms.
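Run against a tiny made-up extract.csv, the three commands behave like this (GNU grep assumed for the `\|` alternation):

```shell
cd "$(mktemp -d)"   # work in a scratch directory
printf 'mme alpha\nsms bravo\nother charlie\n' > extract.csv
grep '^mme ' extract.csv > mme_settings.csv
grep '^sms ' extract.csv > sms_settings.csv
grep -v '^mme \|^sms ' extract.csv > extract_clean.csv
cat extract_clean.csv   # only the "other" line remains
```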

How do you name output files using an increment after a bash file loop?

I'm trying to treat a bunch of files (five) with an awk command and name the output files using an incrementation.
The input files have complicated names. I know how to reuse the files' basenames to rename the outputs but I want to simplify the file names.
this is my code:
for f in *.txt; do
for i in {1..5}; do
echo processing file $f
awk '
{ if ($1=="something" && ($5=="60" || $5=="61"|| $5=="62"|| $5=="63"|| $5=="64"|| $5=="65"|| $5=="66"|| $5=="67"|| $5=="68"|| $5=="69"|| $5=="70"))
print }' $b.txt>"file_treated"$i.txt
echo processing file $f over
done
done
I understand that the error is in the second line because what I wrote runs the second loop for each value of the first one. I want each value of the first loop to correspond to one value of the second one.
Hope this was clear enough
How about:
i=0
for f in *.txt; do
let i++;
awk '$1=="something" && ($5 >= 60 && $5 <=70)' "$f" > file_treated"${i}".txt
done
I simplified your awk command and straightened out your various quoting issues. I also removed the $b.txt since you were simply recreating $f. I left out the echo lines; if you want progress output, add something like echo "processing file $f" back in.
Use a counter:
i=1
for f in *.txt
do
echo "$f is number $((i++))"
done

bash: how does float arithmetic work?

I'm gonna tear my hair out: I have this script:
#!/bin/bash
if [[ $# -eq 2 ]]
then
total=0
IFS=' '
while read one two; do
total=$((total+two))
done < $2
echo "Total: $total"
fi
It's supposed to add up my gas receipts, which I have saved in a file in this format:
3/9/13 21.76
output:
./getgas: line 9: 21.76: syntax error: invalid arithmetic operator (error token is ".76")
I read online that it's possible to do float math in bash, and I found an example script that works; it has:
function float_eval()
{
local stat=0
local result=0.0
if [[ $# -gt 0 ]]; then
result=$(echo "scale=$float_scale; $*" | bc -q 2>/dev/null)
stat=$?
if [[ $stat -eq 0 && -z "$result" ]]; then stat=1; fi
fi
echo $result
return $stat
}
which looks awesome, and runs no problem
WTF is going on here? I can easily do this in C, but this crap is making me mad.
EDIT: I don't know anything about awk. It looks promising but I don't even know how to run those one-liners you guys posted.
awk '{ sum += $2 } END { printf("Total: %.2f\n", sum); }' "$2"
Add up column 2 (that's the $2 in the awk script) of the file named by shell script argument $2 (rife with opportunities for confusion) and print the result at the end.
I don't [know] anything about awk. It looks promising but I don't even know how to run those one-liners you guys posted.
In the context of your script:
#!/bin/bash
if [[ $# -eq 2 ]]
then
awk '{ sum += $2 } END { printf("Total: %.2f\n", sum); }' "$2"
else
echo "Usage: $0 arg1 receipts-file" >&2; exit 1
fi
Or just write it on the command line, substituting the receipts file name for the $2 after the awk command. Or leave that blank and redirect from the file. Or type the dates and values in. Or, …
Your script demands two arguments, but doesn't use the first one, which is a bit puzzling.
As noted in the comments, you could simplify that to:
#!/bin/bash
exec awk '{ sum += $2 } END { printf("Total: %.2f\n", sum) }' "$@"
Or even use the shebang to full power:
#!/usr/bin/awk -f
{ sum += $2 }
END { printf("Total: %.2f\n", sum) }
The kernel will execute awk for you, and that's the awk script written out as a two-line program. Of course, if awk lives somewhere other than /usr/bin/awk, you have to fix the shebang line; a shell script, by contrast, searches $PATH for awk and will probably find it, so there are advantages to sticking with a shell script. Both these revisions simply sum standard input if no files are specified, or the contents of all the files named on the command line if there are one or more.
In bash you can only operate on integers. The example script you posted uses bc, which is an arbitrary-precision calculator included with most Unix-like OSes. The script prepares an expression and pipes it to bc (the initial scale=... expression configures the number of decimal digits bc should display).
A simplified example would be:
echo -e 'scale=2\n1.234+5.67\nquit' | bc
You could also use awk:
awk 'BEGIN{print 1.234+5.67}'

Parsing a config file in bash

Here's my config file (dansguardian-config):
banned-phrase duck
banned-site allaboutbirds.org
I want to write a bash script that will read this config file and create some other files for me. Here's what I have so far, it's mostly pseudo-code:
while read line
do
# if line starts with "banned-phrase"
# add rest of line to file bannedphraselist
# fi
# if line starts with "banned-site"
# add rest of line to file bannedsitelist
# fi
done < dansguardian-config
I'm not sure if I need to use grep, sed, awk, or what.
Hope that makes sense. I just really hate DansGuardian lists.
With awk:
$ cat config
banned-phrase duck frog bird
banned-phrase horse
banned-site allaboutbirds.org duckduckgoose.net
banned-site froggingbirds.gov
$ awk '$1=="banned-phrase"{for(i=2;i<=NF;i++)print $i >"bannedphraselist"}
$1=="banned-site"{for(i=2;i<=NF;i++)print $i >"bannedsitelist"}' config
$ cat bannedphraselist
duck
frog
bird
horse
$ cat bannedsitelist
allaboutbirds.org
duckduckgoose.net
froggingbirds.gov
Explanation:
In awk, by default, each line is split into fields on whitespace, and each field is accessed as $i where i is the field number: the first field on each line is $1, the second is $2, and so on up to $NF, where NF is the variable containing the number of fields on the given line.
So the script is simple:
Check the first field against our required strings $1=="banned-phrase"
If the first field matched then loop over all the other fields for(i=2;i<=NF;i++) and print each field print $i and redirect the output to the file >"bannedphraselist".
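A quick way to see NF and the field loop in action on one made-up line:

```shell
printf 'banned-phrase duck frog\n' |
awk '{ print "NF=" NF; for (i = 2; i <= NF; i++) print $i }'
# prints: NF=3, then duck, then frog, one per line
```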
You could do
sed -n 's/^banned-phrase *//p' dansguardian-config > bannedphraselist
sed -n 's/^banned-site *//p' dansguardian-config > bannedsitelist
Although that means reading the file twice. I doubt that the possible performance loss matters though.
You can read multiple variables at once; by default they're split on whitespace.
while read command target; do
case "$command" in
banned-phrase) echo "$target" >>bannedphraselist;;
banned-site) echo "$target" >>bannedsitelist;;
"") ;; # blank line
*) echo >&2 "$0: unrecognized config directive '$command'";;
esac
done < dansguardian-config
Just as an example. A smarter implementation would read the list files first, make sure things weren't already banned, etc.
What is the problem with all the solutions that use echo text >> file? As strace will show, at every such step the file is opened, positioned to the end, written, and closed. So if echo text >> file runs 1000 times, there will be 1000 calls each to open, lseek, write, and close. The number of open, lseek, and close calls can be reduced a lot in the following way:
while read key val; do
case $key in
banned-phrase) echo "$val" >&2;;
banned-site) echo "$val";;
esac
done >bannedsitelist 2>bannedphraselist <dansguardian-config
Stdout and stderr are redirected to files and kept open while the loop runs, so each file is opened once and closed once, with no need for lseek. The file cache is also used more effectively this way, since there are no repeated close calls flushing the buffers each time.
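The effect is easy to check against a sample config (a sketch; note that in this scheme stderr carries the phrase list):

```shell
cd "$(mktemp -d)"   # scratch directory
printf 'banned-phrase duck\nbanned-site allaboutbirds.org\n' > dansguardian-config
while read -r key val; do
    case $key in
        banned-phrase) echo "$val" >&2;;
        banned-site)   echo "$val";;
    esac
done > bannedsitelist 2> bannedphraselist < dansguardian-config
cat bannedsitelist    # allaboutbirds.org
```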
while read -r name value
do
if [ "$name" = banned-phrase ]
then
echo "$value" >> bannedphraselist
elif [ "$name" = banned-site ]
then
echo "$value" >> bannedsitelist
fi
done < dansguardian-config
Better to use awk:
awk '$1 ~ /^banned-phrase/{print $2 >> "bannedphraselist"}
$1 ~ /^banned-site/{print $2 >> "bannedsitelist"}' dansguardian-config
