overwrite a file then append - shell

I have a loop in my script that appends a list of email addresses to a file "$CRN". If the script is executed again, it appends to the old list. I want it to overwrite the file with the new list rather than append to the old one. I can submit my whole script if needed. I know I could test whether "$CRN" exists and then remove the file, but I'm interested in other suggestions. Thanks.
for arg in "$@"; do
if ls /students | grep -q "$arg"; then
echo "${arg}@mail.ccsf.edu">>$CRN
((students++))
elif ls /users | grep -q "$arg$"; then
echo "${arg}@ccsf.edu">>$CRN
((faculty++))
fi
done

Better to truncate the file once before the loop, then append inside it:
CRN="/path/to/file"
:> "$CRN"
for arg; do
if printf '%s\n' /students/* | grep -q "$arg"; then
echo "${arg}@mail.ccsf.edu" >> "$CRN"
((students++))
elif printf '%s\n' /users/* | grep -q "${arg}$"; then
echo "${arg}@ccsf.edu" >> "$CRN"
((faculty++))
fi
done
Don't parse ls output; use a bash glob instead. ls is a tool for interactively looking at file information. Its output is formatted for humans and will cause bugs in scripts. Use globs or find instead. Understand why: http://mywiki.wooledge.org/ParsingLs
"Double quote" every expansion, and anything that could contain a special character, e.g. "$var", "$@", "${array[@]}", "$(command)". See http://mywiki.wooledge.org/Quotes http://mywiki.wooledge.org/Arguments and http://wiki.bash-hackers.org/syntax/words
Watch out for false positives: with arg=foo, a file named foobar will also match. Use grep -qw if you want matches only on word boundaries. It's up to you.
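As a minimal sketch of the truncate-then-append idea, the loop can be wrapped in a function and run twice to confirm that the second run replaces the first list. The function name write_list and the sample names are hypothetical and only serve the illustration:

```bash
#!/bin/bash
# write_list FILE NAME... (hypothetical helper)
# Rewrites FILE from scratch on every call, one name per line.
write_list() {
    local out=$1
    shift
    : > "$out"                         # truncate (or create) the file once
    local name
    for name in "$@"; do
        printf '%s\n' "$name" >> "$out"   # append inside the loop
    done
}

tmp=$(mktemp)
write_list "$tmp" alice bob
write_list "$tmp" carol                # second run replaces the old list
cat "$tmp"                             # prints only: carol
rm -f "$tmp"
```

An equivalent option is to redirect the whole loop once with `done > "$CRN"`, which opens and truncates the file a single time instead of reopening it on every iteration.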

Related

Some tips to improve a bash script for count fastq files

Hi guys, I have this bash one-liner that I want to turn into a script:
for i in `ls *.fastq.gz`; do echo $(zcat ${i} | wc -l)/4|bc; done
I would like to make it a script that reads from a data dir and prints the result along with the name of each file.
I tried to put the dir in front of the pattern, as in 'data/*.fastq.gz', but got an error: No such dir exist...
I would like something like this:
name1.fastq.gz 1898516
name2.fastq.gz 2467421
namen.fastq.gz 1234532
I am not experienced in bash.
Could you guys help me out?
Thanks
Take the dir as an argument, but default to the current dir if it's not set.
dir="${1-.}"
Then put it in the glob: "$dir"/*.fastq.gz
As well:
Quote variables and command expansions.
Don't parse ls.
Don't trust echo with arbitrary data (filenames). Use printf instead.
Use an end-of-options flag -- when giving filenames to commands.
I prefer to not have any inline command expansions, but that's just personal preference
Putting it together:
#!/bin/bash
dir="${1-.}"
for file in "$dir"/*.fastq.gz; do
printf '%s ' "$file"
lines="$(zcat -- "$file" | wc -l)"
bc <<< "$lines/4" # Using a here-string (Bash feature)
done
There is no need to shell out to bc for integer math (dividing by 4), or to use 'ls' to enumerate the files. The original version will do with minor changes:
#!/bin/bash
dir="${1-.}"
for i in "$dir"/*.fastq.gz; do
lines=$(zcat "${i}" | wc -l)
printf '%s %d\n' "$i" "$((lines/4))"
done
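One case neither script above covers is a directory with no matching files: by default bash leaves the unmatched glob as a literal string, so zcat is handed a file that does not exist. Here is a hedged sketch of a guard for that, written as a function; the name count_reads is hypothetical, and `gzip -dc` is swapped in for zcat because it behaves the same on gzip files and is available on more systems:

```bash
#!/bin/bash
# count_reads [DIR] (hypothetical helper)
# Prints "<file> <number of reads>" for every .fastq.gz file in DIR,
# defaulting to the current directory.
count_reads() {
    local dir=${1-.} file lines
    for file in "$dir"/*.fastq.gz; do
        # If the glob matched nothing, it stays literal; skip it
        [ -e "$file" ] || continue
        lines=$(gzip -dc -- "$file" | wc -l)
        # A FASTQ record is four lines long
        printf '%s %d\n' "$file" "$((lines / 4))"
    done
}

count_reads "${1-.}"
```

Saved as a script, it would be called with the data directory as its only argument, or with no argument to scan the current directory.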

shell script grep to grep a string

The output is blank for the script below. What is it missing? I am trying to grep for a string.
#!/bin/ksh
file=$abc_def_APP_13.4.5.2
if grep -q abc_def_APP $file; then
echo "File Found"
else
echo "File not Found"
fi
In bash, use the <<< redirection from a string (a 'Here string'):
if grep -q abc_def_APP <<< $file
In other shells, you may need to use:
if echo $file | grep -q abc_def_APP
I put my then on the next line; if you want your then on the same line, then add ; then after what I wrote.
Note that this assignment:
file=$abc_def_APP_13.4.5.2
is pretty odd; it takes the value of an environment variable ${abc_def_APP_13} and adds .4.5.2 to the end (it must be an env var since we can see the start of the script). You probably intended to write:
file=abc_def_APP_13.4.5.2
In general, you should enclose references to variables holding file names in double quotes to avoid problems with spaces etc in the file names. It is not critical here, but good practices are good practices:
if grep -q abc_def_APP <<< "$file"
if echo "$file" | grep -q abc_def_APP
Yuck! Use the shell's string matching
if [[ "$file" == *abc_def_APP* ]]; then ...
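For shells without [[ ]], a case statement gives the same substring test with no external command. A small sketch, assuming the corrected assignment without the leading $; the function name contains is hypothetical:

```bash
#!/bin/sh
# contains STRING SUBSTRING (hypothetical helper)
# Succeeds if SUBSTRING occurs anywhere in STRING.
contains() {
    case $1 in
        *"$2"*) return 0 ;;   # quoting $2 makes it match literally
        *)      return 1 ;;
    esac
}

file=abc_def_APP_13.4.5.2
if contains "$file" abc_def_APP; then
    echo "File Found"
else
    echo "File not Found"
fi
```

Quoting "$2" inside the pattern keeps any glob characters in the substring from being treated as wildcards.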

How to grep a string in until loop in bash?

I work on a script compressing files. I want to do an 'until loop' til' the content of variable matches the pattern. The script is using zenity. This is the major part:
part="0"
pattern="^([0-9]{1}[0-9]*([km])$"
until `grep -E "$pattern" "$part"` ; do
part=$(zenity --entry \
--title="Zip the file" \
--text "Choose the size of divided parts:
(0 = no division, *m = *mb, *k = *kb)" \
--entry-text "0");
if grep -E "$pattern" "$part" ; then
zenity --warning --text "Wrong text entry, try again." --no-cancel;
fi
done
I want it to accept a string of digits ending with 'k' or 'm' (but not both), and to reject strings starting with '0'.
Is the pattern ok?
$ grep -w '^[1-9][0-9]*[km]$' <<< 45k
45k
$ grep -w '^[1-9][0-9]*[km]$' <<< 001023m
$ grep -w '^[1-9][0-9]*[km]$' <<< 1023m
1023m
Don't forget the <<< in your expression, you're not grep'ing a file, but a string. To be more POSIX-compliant, you can also use:
echo 1023m | grep -w '^[1-9][0-9]*[km]$'
But it is kinda ugly.
Edit:
Longer example:
initmessage="Choose the size of divided parts:\n(0 = no division, *m = *mb, *k = *kb)"
errmessage="Wrong input. Please re-read carefully the following:\n\n$initmessage"
message="$initmessage"
while true ; do
part=$(zenity --entry \
--title="Zip the file" \
--text "$message")
if grep -qw '^[1-9][0-9]*[km]$' <<< "$part" ; then
zenity --info --text 'Thank you !'
break
else
message="$errmessage"
fi
done
Also, this is not directly related to the question, but you may want to have a look at Yad, which does basically the same things Zenity does, but has more options. I used it a lot when I had to write Bash scripts, and found it much more useful than Zenity.
You don't want the back-quotes in the until line. You might write:
until grep -E "$pattern" "$part"
do
...body of loop...
done
Or you might add arguments to grep to suppress the output (or send the output to /dev/null). As written, the script tries to execute the output of the grep command and use the success/failure status of that (not the grep per se) as an indication of whether to continue the loop or not.
Additionally, your pattern needs some work. It is:
pattern="^([0-9]{1}[0-9]*([km])$"
There is an unmatched open parenthesis in there. It also looks to me as though it is trying to allow a leading zero. You probably want:
pattern='^[1-9][0-9]*[km]$'
Single quotes are generally safer than double quotes for things like regular expressions.
I just want to check if my variable called part is well-formed after writing it in Zenity entry dialog. I just realised that grep needs a file, but my part is a variable initialised in this script. How to get along now?
In bash, you can use the <<< operator to redirect from a string:
until grep -E "$pattern" <<< "$part"
In most other shells, you'd write:
until echo "$part" | grep -E "$pattern"
This also works in bash, of course.
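If the script only needs to run under bash, the =~ operator of [[ ]] can test the string directly, without starting grep at all. A minimal sketch using the corrected pattern from above; the function name is_valid_size and the sample inputs are made up for the illustration:

```bash
#!/bin/bash
# is_valid_size STRING (hypothetical helper)
# Succeeds if STRING is digits without a leading zero, followed by k or m.
is_valid_size() {
    local re='^[1-9][0-9]*[km]$'
    # Keep the regex in a variable and leave it unquoted on the right-hand
    # side, otherwise bash matches it as a literal string
    [[ $1 =~ $re ]]
}

for s in 45k 1023m 001023m 12 5km; do
    if is_valid_size "$s"; then
        echo "$s: ok"
    else
        echo "$s: rejected"
    fi
done
```

This accepts 45k and 1023m and rejects the other three. Note that the dialog text offers 0 for "no division", which this pattern rejects, so that value would need its own check.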

Check execute command after cheking file type

I am working on a bash script which executes a command depending on the file type. I want to use the "file" command and not the file extension to determine the type, but I am brand new to this scripting stuff, so if someone can help me I would be very thankful! - Thanks!
Here is the script in which I want to include the function:
#!/bin/bash
export PrintQueue="/root/xxx";
IFS=$'\n'
for PrintFile in $(/bin/ls -1 ${PrintQueue}); do
lpr -r ${PrintQueue}/${PrintFile};
done
The point is, all files which are PDFs should be printed with the lpr command, and all others with ooffice -p.
You are going through a lot of extra work. Here's the idiomatic code, I'll let the man page provide the explanation of the pieces:
#!/bin/sh
for path in /root/xxx/* ; do
case `file --brief "$path"` in
PDF*) cmd="lpr -r" ;;
*) cmd="ooffice -p" ;;
esac
# $cmd is left unquoted on purpose so it splits into command and option
$cmd "$path"
done
Some notable points:
using sh instead of bash increases portability and narrows the choices of how to do things
don't use ls when a glob pattern will do the same job with less hassle
the case statement has surprising power
First, two general shell programming issues:
Do not parse the output of ls. It's unreliable and completely useless. Use wildcards, they're easy and robust.
Always put double quotes around variable substitutions, e.g. "$PrintQueue/$PrintFile", not $PrintQueue/$PrintFile. If you leave the double quotes out, the shell performs wildcard expansion and word splitting on the value of the variable. Unless you know that's what you want, use double quotes. The same goes for command substitutions $(command).
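A quick way to see the word splitting described above is to count the arguments a command actually receives. This is only an illustration; count_args and the file name are hypothetical, and the default IFS is assumed:

```bash
#!/bin/bash
# count_args (hypothetical helper): reports how many arguments it received.
count_args() { echo "$#"; }

name='my report.pdf'
count_args $name      # unquoted: split at the space, prints 2
count_args "$name"    # quoted: passed as a single word, prints 1
```

The unquoted form is how a single file name with a space turns into two non-existent files by the time it reaches lpr.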
Historically, implementations of file have had different output formats, intended for humans rather than parsing. Most modern implementations have an option to output a MIME type, which is easily parseable.
#!/bin/bash
print_queue="/root/xxx"
for file_to_print in "$print_queue"/*; do
# -b drops the leading "filename:" so the output starts with the MIME type
case "$(file -b -i "$file_to_print")" in
application/pdf\;*|application/postscript\;*)
lpr -r "$file_to_print";;
application/vnd.oasis.opendocument.*)
ooffice -p "$file_to_print" &&
rm "$file_to_print";;
# and so on
*) echo 1>&2 "Warning: $file_to_print has an unrecognized format and was not printed";;
esac
done
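One way to make this kind of dispatch easier to check is to separate the MIME-to-command decision from the printing itself and do a dry run first. A hedged sketch: the names printer_for and plan_queue are hypothetical, and it assumes a file implementation that supports --mime-type, which prints the type without the charset suffix:

```bash
#!/bin/bash
# printer_for MIME (hypothetical helper)
# Maps a MIME type to the name of the command that should print it.
printer_for() {
    case $1 in
        application/pdf|application/postscript) echo lpr ;;
        application/vnd.oasis.opendocument.*)   echo ooffice ;;
        *)                                      echo unknown ;;
    esac
}

# plan_queue DIR (hypothetical helper)
# Dry run: lists each regular file in DIR with its MIME type and the
# command that would be used, without printing anything.
plan_queue() {
    local dir=$1 f mime
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        mime=$(file -b --mime-type -- "$f")
        printf '%s\t%s\t%s\n' "$f" "$mime" "$(printer_for "$mime")"
    done
}
```

Running `plan_queue /root/xxx` before wiring in lpr and ooffice shows how each file would be classified, which is useful because lpr -r deletes the file after printing.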
#!/bin/bash
PRINTQ="/root/docs"
# Glob the queue directory instead of parsing ls, so no IFS changes are needed
for file in "$PRINTQ"/*
do
# First word of the description, e.g. "PDF" from "PDF document, version 1.4"
type=$(file --brief -- "$file" | awk '{print $1}')
if [ "$type" = "PDF" ]
then
echo "[*] printing $file with LPR"
lpr "$file"
else
echo "[*] printing $file with OPEN-OFFICE"
ooffice -p "$file"
fi
done

How can I use bash to parse out only a section of a variable with different delimiters?

I have a loop in a bash file to show me all of the files in a directory, each as its own variable. I need to take that variable (filename) and parse out only a section of it.
Example:
92378478234978ehbWHATIWANT#98712398712398723
Now, assuming "ehb" and the pound symbol never change, how can I just capture WHATIWANT into its own variable?
So far I have:
#!/bin/bash
for FILENAME in `dir -d *` ; do
done
You can use sed to edit out the parts you don't want.
want=$(echo "$FILENAME" | sed -e 's/.*ehb\(.*\)#.*/\1/')
Or you can use Bash's parameter expansion to strip out the tail and head.
want=${FILENAME%#*}; want=${want#*ehb}
One possibility:
for i in '92378478234978ehbWHATIWANT#98712398712398723' ; do
j=$(echo $i | sed -e 's/^.*ehb//' -e 's/#.*$//')
echo $j
done
produces:
WHATIWANT
Using only the bash shell, with no need for external tools:
$ string=92378478234978ehbWHATIWANT#98712398712398723
$ echo ${string#*ehb}
WHATIWANT#98712398712398723
$ string=${string#*ehb}
$ echo ${string%#*}
WHATIWANT
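The two expansions can be wrapped in a small function so the markers are passed as arguments. This is a sketch under the question's assumption that "ehb" and "#" are fixed; the function name extract_between is hypothetical:

```bash
#!/bin/bash
# extract_between STRING HEAD TAIL (hypothetical helper)
# Prints the text after the first HEAD and before the last TAIL.
extract_between() {
    local s=$1 head=$2 tail=$3
    s=${s#*"$head"}    # remove the shortest prefix ending in HEAD
    s=${s%"$tail"*}    # remove the shortest suffix starting with TAIL
    printf '%s\n' "$s"
}

extract_between '92378478234978ehbWHATIWANT#98712398712398723' ehb '#'
# prints: WHATIWANT
```

Inside the loop from the question this would become `want=$(extract_between "$FILENAME" ehb '#')`. The markers are quoted inside the expansions so they are matched literally rather than as glob patterns.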

Resources