How to retrieve output and number of results from find? - shell

I want to write a shell script that finds some file(s). It should store the results in one variable and the number of matches in another. So far I have a script like this:
...
PATH=`find -name $FILE`
NUM=`find -name $FILE | wc -l`
...
The flaw is that I run the find command twice for the same search. Is there a way to run one command and populate both variables?

You can use your PATH variable in the second assignment:
~$ PATH=$(find .)
~$ NUM=$(find .|wc -l)
~$ echo $NUM
32
~$ NUM=$(echo "$PATH"|wc -l)
~$ echo $NUM
32
Note that PATH should not be used as a variable name: it holds the shell's command search path, and overwriting it stops the shell from finding commands such as find and wc. Also, the $(...) form has superseded backticks for command substitution.
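A minimal sketch of the single-find approach, assuming filenames contain no newlines. The names find_and_count, matches and count are illustrative and not from the question. An empty result is handled separately because piping an empty string through printf or echo still yields one line.

```shell
#!/bin/bash
# Run find once, keep its output, then count the lines of the saved output.
# find_and_count, matches and count are illustrative names (PATH is avoided on purpose).
find_and_count() {
    local dir=$1 name=$2
    matches=$(find "$dir" -name "$name")
    if [ -z "$matches" ]; then
        count=0          # an empty string piped through printf would still count as 1 line
    else
        # $(( )) strips the padding that some wc implementations add
        count=$(( $(printf '%s\n' "$matches" | wc -l) ))
    fi
}

find_and_count . '*.txt'
printf 'found %d file(s)\n' "$count"
```

If filenames may contain newlines, counting lines is unreliable; collecting the results into a bash array is the usual alternative.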

Related

Some tips to improve a bash script for count fastq files

Hi guys, I have this bash one-liner that I would like to turn into a script:
for i in `ls *.fastq.gz`; do echo $(zcat ${i} | wc -l)/4|bc; done
I would like the script to read from a data directory and print the result along with the name of each file.
I tried putting the directory in front of the pattern, as in 'data/*.fastq.gz', but got an error saying that no such directory exists...
I would like output like this:
name1.fastq.gz 1898516
name2.fastq.gz 2467421
namen.fastq.gz 1234532
I am not experienced in bash.
Could you guys help?
Thanks
Take the dir as an argument, but default to the current dir if it's not set.
dir="${1-.}"
Then put it in the glob: "$dir"/*.fastq.gz
As well:
Quote variables and command expansions.
Don't parse ls.
Don't trust echo with arbitrary data (filenames). Use printf instead.
Use an end-of-options flag -- when giving filenames to commands.
I prefer to not have any inline command expansions, but that's just personal preference
Putting it together:
#!/bin/bash
dir="${1-.}"
for file in "$dir"/*.fastq.gz; do
    printf '%s ' "$file"
    lines="$(zcat -- "$file" | wc -l)"
    bc <<< "$lines/4" # Using a here-string (Bash feature)
done
There is no need to shell out to bc for integer math (divide by 4), or to use 'ls' to enumerate the files. The original version will do with minor changes:
#!/bin/bash
dir="${1-.}"
for i in "$dir"/*.fastq.gz; do
    lines=$(zcat "${i}" | wc -l)
    printf '%s %d\n' "$i" "$((lines/4))"
done
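One case neither version covers is a directory with no matching files, where the loop runs once with the literal pattern. A hedged sketch of a guard for that, assuming four lines per FASTQ record as in the question: count_fastq is a hypothetical helper name, and gzip -dc is used as a stand-in for zcat.

```shell
#!/bin/bash
# Sketch: print "<file name> <record count>" for each *.fastq.gz in a directory.
# count_fastq is a hypothetical helper; a FASTQ record is assumed to span 4 lines.
count_fastq() {
    local dir=${1-.} file lines
    for file in "$dir"/*.fastq.gz; do
        [ -e "$file" ] || continue       # glob matched nothing: skip the literal pattern
        lines=$(gzip -dc -- "$file" | wc -l)
        printf '%s %d\n' "${file##*/}" "$((lines / 4))"
    done
}

count_fastq "${1-.}"
```

The ${file##*/} expansion strips the directory part, so the output matches the "name count" layout asked for in the question.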

How to recall a string in shell script

I made a script like this:
#! /usr/bin/bash
a=`ls ../wrfprd/wrfout_d0${i}* | cut -c22-25`
b=`ls ../wrfprd/wrfout_d0${i}* | cut -c27-28`
c=`ls ../wrfprd/wrfout_d0${i}* | cut -c30-31`
d=`ls ../wrfprd/wrfout_d0${i}* | cut -c33-34`
f=$a$b$c$d
echo $f
sed "s/.* startdate=.*/export startdate=${f}/g" ./post_process > post_process2
The echo command works and prints 2008042118, which is what I want, but the line in post_process2 reads export startdate= with no value, as if variable f were empty. I want to produce a line like export startdate=2008042118
First -- don't use ls here -- it's both expensive in terms of performance (compared to globbing, which is performed internal to the shell without starting any external programs), and doesn't guarantee useful output for the full range of possible filenames, making its use in this context inherently bug-prone. A better way to retrieve pieces from a filename, assuming a ksh-derived shell such as bash or zsh, would look like this:
#!/bin/bash
# this is an array, but we're only going to use the first element
file=( "../wrfprd/wrfout_d0${i}"* )
[[ -e $file ]] || { echo "No file found" >&2; exit 1; }
# substring offsets are zero-based, so cut -c22-25 corresponds to ${file:21:4}
f=${file:21:4}${file:26:2}${file:29:2}${file:32:2}
Second, don't use sed to modify code -- doing so requires that your runtime user have permission to modify its own code, and moreover invites injection vulnerabilities. Just write your content out to a data file:
printf '%s\n' "$f" >startdate.txt
...and, in your second script, to read in the value from that file:
# if the shebang is #!/bin/bash
startdate=$(<startdate.txt)
# if the shebang is #!/bin/sh
startdate=$(cat startdate.txt)
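Fixed offsets into the full path break if the directory prefix changes length. A small sketch of an alternative that strips the prefix first: the file name here is a made-up example that follows the wrfout naming pattern implied by the cut columns in the question, and stamp is an illustrative variable name.

```shell
#!/bin/bash
# The file name is a hypothetical example following the wrfout naming pattern.
file='../wrfprd/wrfout_d01_2008-04-21_18:00:00'

# Remove everything up to and including "wrfout_dNN_", leaving the timestamp.
stamp=${file##*/wrfout_d??_}

# Pick year, month, day and hour by zero-based offset within the timestamp.
f=${stamp:0:4}${stamp:5:2}${stamp:8:2}${stamp:11:2}

printf '%s\n' "$f"    # prints 2008042118
```

Because the offsets are relative to the timestamp, the same code works no matter where the wrfprd directory lives.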

variable as shell command

I am writing a shell script that works with files. I need to find files and print them along with some information that is important to me. That's no problem. But then I wanted to add some "features" and make it work with arguments as well. One of the features is ignoring files that match a pattern (like *.c, to ignore all C files). So I set a variable and put the command string into it.
#!/bin/sh
command="grep -Ev \"$2\"" # in 2nd argument is pattern, that will be ignored
echo "find $PWD -type f | $command | wc -l" # printing command
file_num=$(find $path -type f | $command | wc -l) # saving number of files
echo "Number of files: $file_num"
But the command somehow ignores my variable and counts all files. When I type the same command into bash or sh directly, I get a different (correct) number of files. I thought it could be just because of bash, but on another machine, which has ksh, I get the same problem, and changing #!/bin/sh to #!/bin/bash did not help either.
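The shell processes the command line, including its quotes, before the command is executed. When you type grep -Ev "c" interactively, the shell removes the quotes and grep receives the pattern c. In your script, the escaped quotes are stored in $command as literal characters and are not removed when $command is expanded, so grep receives the pattern "c" with the quote characters included.
You can use this command to check what the shell passes along: echo grep -Ev "c".
So just remove the escaped quotes from $command and everything will be OK.
You only need to change the value of command (the pattern is the second argument in your script):
command="grep -Ev $2"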
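An alternative that avoids storing a command in a string at all is to keep only the pattern in a variable and quote it where it is used. This is a sketch under that assumption; count_files is a hypothetical function name, and the pattern is an extended regular expression, as grep -E expects.

```shell
#!/bin/sh
# Sketch: keep the pattern as data and quote it at the point of use,
# instead of building a command string. count_files is a hypothetical name.
count_files() {
    dir=$1
    pattern=$2
    # wc -l may pad its output with spaces; $(( )) reduces it to a plain number
    echo "$(( $(find "$dir" -type f | grep -Ev -- "$pattern" | wc -l) ))"
}

# Count the files under the current directory whose names do not end in .c
count_files "$PWD" '\.c$'
```

Quoting "$pattern" keeps a pattern containing spaces or glob characters intact, which an unquoted $command expansion would not.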

overwrite a file then append

I have a loop in my script that appends a list of email addresses to a file "$CRN". If the script is executed again, it appends to the old list. I want it to overwrite the file with the new list rather than append to the old one. I can submit my whole script if needed. I know I could test whether "$CRN" exists and then remove the file, but I'm interested in other suggestions. Thanks.
for arg in "$@"; do
    if ls /students | grep -q "$arg"; then
        echo "${arg}@mail.ccsf.edu">>$CRN
        ((students++))
    elif ls /users | grep -q "$arg$"; then
        echo "${arg}@ccsf.edu">>$CRN
        ((faculty++))
    fi
Better do this:
CRN="/path/to/file"
# truncate the file once, before the loop
: > "$CRN"
for arg; do
    if printf '%s\n' /students/* | grep -q "$arg"; then
        echo "${arg}@mail.ccsf.edu" >> "$CRN"
        ((students++))
    elif printf '%s\n' /users/* | grep -q "${arg}$"; then
        echo "${arg}@ccsf.edu" >> "$CRN"
        ((faculty++))
    fi
done
Don't parse ls output! Use a bash glob instead. ls is a tool for looking at file information interactively. Its output is formatted for humans and will cause bugs in scripts. Use globs or find instead. Understand why: http://mywiki.wooledge.org/ParsingLs
"Double quote" every expansion, and anything that could contain a special character, e.g. "$var", "$@", "${array[@]}", "$(command)". See http://mywiki.wooledge.org/Quotes http://mywiki.wooledge.org/Arguments and http://wiki.bash-hackers.org/syntax/words
Watch out for false positives: with arg=foo, a file named foobar will also match. Use grep -qw if you want word boundaries. It's up to you.
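A hedged sketch that combines the truncate-then-append idea with an exact match, assuming each login corresponds to one entry directly under the directory. build_list and its parameters are hypothetical names; only the student branch is shown.

```shell
#!/bin/bash
# Sketch with hypothetical names: build_list empties the output file once,
# then appends one address per login that exactly matches a directory entry.
build_list() {
    local out=$1 dir=$2 arg
    shift 2
    : > "$out"                           # truncate (or create) before the loop
    for arg; do
        # -x matches the whole line, -F treats the login as a fixed string
        if printf '%s\n' "$dir"/* | grep -qxF -- "$dir/$arg"; then
            printf '%s@mail.ccsf.edu\n' "$arg" >> "$out"
        fi
    done
}

# Example call (paths as in the question):
# build_list "$CRN" /students "$@"
```

With -x and -F, the login foo no longer matches an entry named foobar, and running the function again replaces the previous list instead of extending it.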

Counting file lines in shell and in a script gives different results

For a bunch of files in a directory I want to get the number of lines for each one, store it
in a variable and do additional stuff. Via shell I can do it without problems if I do
read NLINES <<< $( cat file | wc -l )
but if I do it in a script
#!/bin/bash
for i in `ls *.dat `
do
read NLINES <<< $( cat $i | wc -l )
done
I get
Syntax error: redirection unexpected
Why the difference? How could I fix it?
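I bet the script is not actually being run by bash but by something else. "Syntax error: redirection unexpected" is what dash, the default /bin/sh on Debian and Ubuntu, prints when it meets the <<< here-string, which is a bash feature. If you start the script with sh script.sh, the #!/bin/bash line is ignored. Keep the #!/bin/bash line and run the script directly (./script.sh) or with bash script.sh.
I made this error the other way around, when I tried to use some Debian scripts on Ubuntu, where #!/bin/sh behaved differently from the #!/bin/bash I had assumed.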
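If the script has to run under a plain POSIX sh such as dash, the here-string can be avoided altogether. A minimal sketch, where count_lines is a hypothetical helper name and the .dat glob comes from the question:

```shell
#!/bin/sh
# Sketch: count lines per file without a here-string, so it also runs under dash.
# count_lines is a hypothetical helper name.
count_lines() {
    for f in "$1"/*.dat; do
        [ -e "$f" ] || continue          # no .dat files: skip the literal pattern
        # redirecting into wc keeps the file name out of its output
        nlines=$(( $(wc -l < "$f") ))
        printf '%s %d\n' "${f##*/}" "$nlines"
    done
}

count_lines .
```

Using the glob directly also removes the need to parse ls output in the loop.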
