How to run grep with while loop in shell script? - shell

I'm trying to make shell script that counts 9 letters words consisted of A, G, C, T in B.txt.
First,
9bp_cases.txt contains
AAAAAAAAA
AAAAAAAAG
AAAAAAAAC
AAAAAAAAT
AAAAAAAGA
AAAAAAAGG
AAAAAAAGC
...
//#!/bin/bash
file=/Dataset/4.synTF/2.Sequence/9bp_cases.txt
# CSV 파일이 존재하지 않으면 종료
if [ ! -f "$file" ]; then
echo "CSV 파일이 존재하지 않습니다: $file" >&2
exit 1
fi
cat 9bp_cases.txt | while read line;
do
echo $line
grep -i -o $line B.txt | wc -w
done
The result is like this:
//AAAAAAAAA
0
AAAAAAAAG
0
AAAAAAAAC
0
AAAAAAAAT
0
AAAAAAAGA
0
None of the words is counted correctly.
However, when I run the simple code, it returns result well.
grep -i -o AAAAAAAA B.txt | wc -w
33410
I guess $line after grep is not recognized by the command grep.
Can you please help me?
Thank you.

Related

Need to remove the extra empty lines from the output of shell script

i'm trying to write a code which will print all files taking more than min_size (lets say 10G) in a directory. the problem is output off the below code is all files irrespective of the min_size. i will be getting other details like mtime , owner as well later in the code but this part itself doesnt work fine, whats wrong here ?
#!/bin/sh
if (( $# <3 )); then
echo "$0 dirname min_size count"
exit 1
else
dirname="$1";
min_size="$2";
count="$3";
#shift 3
fi
tmpfile=$(mktemp /lawdump/pulkit/files.XXXXXX)
exec 3> "$tmpfile"
find "${dirname}" -type f -print0 2>&1 | grep -v "Permission denied" | xargs -0 -I {} echo "{}" > "$tmpfile"
for i in `cat tmpfile`
do
x="`du -ah $i | awk '{print $1}' | grep G | sort -nr -k 1`"
size=$(echo $x | sed 's/[A-Za-z]*//g')
if [ size > $min_size ];then
echo $size
fi
done
Note : i know this can be done through find or du but i need to write a shell script to have an email sent out regularly with all the details.

How to check that a file has more than 1 line in a BASH conditional?

I need to check if a file has more than 1 line. I tried this:
if [ `wc -l file.txt` -ge "2" ]
then
echo "This has more than 1 line."
fi
if [ `wc -l file.txt` >= 2 ]
then
echo "This has more than 1 line."
fi
These just report errors. How can I check if a file has more than 1 line in a BASH conditional?
The command:
wc -l file.txt
will generate output like:
42 file.txt
with wc helpfully telling you the file name as well. It does this in case you're checking out a lot of files at once and want individual as well as total stats:
pax> wc -l *.txt
973 list_of_people_i_must_kill_if_i_find_out_i_have_cancer.txt
2 major_acheivements_of_my_life.txt
975 total
You can stop wc from doing this by providing its data on standard input, so it doesn't know the file name:
if [[ $(wc -l <file.txt) -ge 2 ]]
The following transcript shows this in action:
pax> wc -l qq.c
26 qq.c
pax> wc -l <qq.c
26
As an aside, you'll notice I've also switched to using [[ ]] and $().
I prefer the former because it has less issues due to backward compatibility (mostly to do with with string splitting) and the latter because it's far easier to nest executables.
A pure bash (≥4) possibility using mapfile:
#!/bin/bash
mapfile -n 2 < file.txt
if ((${#MAPFILE[#]}>1)); then
echo "This file has more than 1 line."
fi
The mapfile builtin stores what it reads from stdin in an array (MAPFILE by default), one line per field. Using -n 2 makes it read at most two lines (for efficiency). After that, you only need to check whether the array MAPFILE has more that one field. This method is very efficient.
As a byproduct, the first line of the file is stored in ${MAPFILE[0]}, in case you need it. You'll find out that the trailing newline character is not trimmed. If you need to remove the trailing newline character, use the -t option:
mapfile -t -n 2 < file.txt
if [ `wc -l file.txt | awk '{print $1}'` -ge "2" ]
...
You should always check what each subcommand returns. Command wc -l file.txt returns output in the following format:
12 file.txt
You need first column - you can extract it with awk or cut or any other utility of your choice.
How about:
if read -r && read -r
then
echo "This has more than 1 line."
fi < file.txt
The -r flag is needed to ensure line continuation characters don't fold two lines into one, which would cause the following file to report one line only:
This is a file with _two_ lines, \
but will be seen as one.
change
if [ `wc -l file.txt` -ge "2" ]
to
if [ `cat file.tex | wc -l` -ge "2" ]
If you're dealing with large files, this awk command is much faster than using wc:
awk 'BEGIN{x=0}{if(NR>1){x=1;exit}}END{if(x>0){print FILENAME,"has more than one line"}else{print FILENAME,"has one or less lines"}}' file.txt

ksh: shell script to search for a string in all files present in a directory at a regular interval

I have a directory (output) in unix (SUN). There are two types of files created with timestamp prefix to the file name. These file are created on a regular interval of 10 minutes.
e. g:
1. 20140129_170343_fail.csv (some lines are there)
2. 20140129_170343_success.csv (some lines are there)
Now I have to search for a particular string in all the files present in the output directory and if the string is found in fail and success files, I have to count the number of lines present in those files and save the output to the cnt_succ and cnt_fail variables. If the string is not found I will search again in the same directory after a sleep timer of 20 seconds.
here is my code
#!/usr/bin/ksh
for i in 1 2
do
grep -l 0140127_123933_part_hg_log_status.csv /osp/local/var/log/tool2/final_logs/* >log_t.txt; ### log_t.txt will contain all the matching file list
while read line ### reading the log_t.txt
do
echo "$line has following count"
CNT=`wc -l $line|tr -s " "|cut -d" " -f2`
CNT=`expr $CNT - 1`
echo $CNT
done <log_t.txt
if [ $CNT > 0 ]
then
exit
fi
echo "waiitng"
sleep 20
done
The problem I'm facing is, I'm not able to get the _success and _fail in file in line and and check their count
I'm not sure about ksh, but while ... do; ... done is notorious for running off with whatever variables you're using in bash. ksh might be similar.
If I've understand your question right, SunOS has grep, uniq and sort AFAIK, so a possible alternative might be...
First of all:
$ cat fail.txt
W34523TERG
ADFLKJ
W34523TERG
WER
ASDTQ34T
DBVSER6
W34523TERG
ASDTQ34T
DBVSER6
$ cat success.txt
abcde
defgh
234523452
vxczvzxc
jkl
vxczvzxc
asdf
234523452
vxczvzxc
dlkjhgl
jkl
wer
234523452
vxczvzxc
And now:
egrep "W34523TERG|ASDTQ34T" fail.txt | sort | uniq -c
2 ASDTQ34T
3 W34523TERG
egrep "234523452|vxczvzxc|jkl" success.txt | sort | uniq -c
3 234523452
2 jkl
4 vxczvzxc
Depending on the input data, you may want to see what options sort has on your system. Examining uniq's options may prove useful too (it can do more than just count duplicates).
Think you want something like this (will work in both bash and ksh)
#!/bin/ksh
while read -r file; do
lines=$(wc -l < "$file")
((sum+=$lines))
done < <(grep -Rl --include="[1|2]*_fail.csv" "somestring")
echo "$sum"
Note this will match files starting with 1 or 2 and ending in _fail.csv, not exactly clear if that's what you want or not.
e.g. Let's say I have two files, one starting with 1 (containing 4 lines) and one starting with 2 (containing 3 lines), both ending in `_fail.csv somewhere under my current working directory
> abovescript
7
Important to understand grep options here
-R, --dereference-recursive
Read all files under each directory, recursively. Follow all
symbolic links, unlike -r.
and
-l, --files-with-matches
Suppress normal output; instead print the name of each input
file from which output would normally have been printed. The
scanning will stop on the first match. (-l is specified by
POSIX.)
Finaly I'm able to find the solution. Here is the complete code:
#!/usr/bin/ksh
file_name="0140127_123933.csv"
for i in 1 2
do
grep -l $file_name /osp/local/var/log/tool2/final_logs/* >log_t.txt;
while read line
do
if [ $(echo "$line" |awk '/success/') ] ## will check the success file
then
CNT_SUCC=`wc -l $line|tr -s " "|cut -d" " -f2`
CNT_SUCC=`expr $CNT_SUCC - 1`
fi
if [ $(echo "$line" |awk '/fail/') ] ## will check the fail file
then
CNT_FAIL=`wc -l $line|tr -s " "|cut -d" " -f2`
CNT_FAIL=`expr $CNT_FAIL - 1`
fi
done <log_t.txt
if [ $CNT_SUCC > 0 ] && [ $CNT_FAIL > 0 ]
then
echo " Fail count = $CNT_FAIL"
echo " Success count = $CNT_SUCC"
exit
fi
echo "waitng for next search..."
sleep 10
done
Thanks everyone for your help.
I don't think I'm getting it right, but You can't diffrinciate the files?
maybe try:
#...
CNT=`expr $CNT - 1`
if [ $(echo $line | grep -o "fail") ]
then
#do something with fail count
else
#do something with success count
fi

Bash - How to count C source file functions calls

I want to find for each function defined in a C source file how many times it's called and on which line.
Should I search for patterns which look like function definitions in C and then count how many times that function name occurs. If so, how can I do it? regular expressions?
Any help will be highly appreciated!
#!/bin/bash
if [ -r $1 ]; then
#??????
else
echo The file \"$1\" does NOT exist
fi
The final result is: (please report any bugs)
10 if [ -r $1 ]; then
11 functs=`grep -n -e "\(void\|double\|char\|int\) \w*(.*)" $1 | sed 's/^.*\(void\|double\|int\) \(\w*\)(.*$/\2/g'`
12 for f in $functs;do
13 echo -n $f\(\) is called:
14 grep -n $f $1 > temp.txt
15 echo -n `grep -c -v -e "\(void\|double\|int\) $f(.*)" -e"//" temp.txt`
16 echo " times"
17 echo -n on lines:
18 echo -n `grep -v -e "\(void\|double\|int\) $f(.*)" -e"//" temp.txt | sed -n 's/^\([0-9]*\)[:].*/\1/p'`
19 echo
20 echo
21 done
22 else
23 echo The file \"$1\" does not exist
24 fi
This might sort of work. The first bit finds function definitions like
<datatype> <name>(<stuff>)
and pulls out the <name>. Then grep for that string. There are loads of situations where this won't work, but it might be a good place to start if you're trying to make a simple shell script that works on some programs.
functions=`grep -e "\(void\|double\|int\) \w*(.*)$" -f input.c | sed 's/^.*\(void\|double\|int\) \(\w*\)(.*$/\2/g'`
for func in $functions
do
echo "Counting references for $func:"
grep "$func" -f input.c | wc -l
done
You can try with this regex
(^|[^\w\d])?(functionName(\s)*\()
for example to search all printf occurrences
(^|[^\w\d])?(printf(\s)*\()
to use this expression with grep you have to use the option -E, like this
grep -E "(^|[^\w\d])?(printf(\s)*\()" the_file.txt
Final note, what miss with this solution is to skip the occurrences in comment bloks.

Concatenate strings in bash

I have in a bash script:
for i in `seq 1 10`
do
read AA BB CC <<< $(cat file1 | grep DATA)
echo ${i}
echo ${CC}
SORT=${CC}${i}
echo ${SORT}
done
so "i" is a integer, and CC is a string like "TODAY"
I would like to get then in SORT, "TODAY1", etc
But I get "1ODAY", "2ODAY" and so
Where is the error?
Thanks
You should try
SORT="${CC}${i}"
Make sure your file does not contain "\r" that would end just in the end of $CC.
This could well explain why you get "1ODAY".
Try including
|tr '\r' ''
after the cat command
try
for i in {1..10}
do
while read -r line
do
case "$line" in
*DATA* )
set -- $line
CC=$3
SORT=${CC}${i}
echo ${SORT}
esac
done <"file1"
done
Otherwise, show an example of file1 and your desired output
ghostdog is right: with the -r option, read avoids succumbing to potential horrors, like CRLFs. Using arrays makes the -r option more pleasant:
for i in `seq 1 10`
do
read -ra line <<< $(cat file1 | grep DATA)
CC="${line[3]}"
echo ${i}
echo ${CC}
SORT=${CC}${i}
echo ${SORT}
done

Resources