How do I search the same files for multiple strings?
Currently:
#!/bin/bash
for log in filename.log.201[45]-*-*.gz; do
    printf '%s:' "$log"
    zcat "$log" | grep -wc 'dollar for dollars'
done
Desired result:
#!/bin/bash
for log in filename.log.201[45]-*-*.gz; do
    printf '%s:' "$log"
    echo "count for dollar for dollars"
    zcat "$log" | grep -wc 'dollar for dollars'
    echo "count for pollar for pollars"
    zcat "$log" | grep -wc 'pollar for pollars'
done
You can use a nested loop for this one.
for pattern in 'dollar for dollars' 'pollar for pollars'; do
    for log in filename.log.201[45]-*-*.gz; do
        printf '%s:%s:' "$log" "$pattern"
        zcat "$log" | grep -wc "$pattern"
    done
done
You would probably be better off using an actual programming language, like awk.
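For instance, here is a minimal awk sketch (built on the phrases and glob from the question) that counts the matching lines for several phrases in a single pass per file; note that it does plain substring matching per line, so grep's -w word boundaries are not reproduced:
for log in filename.log.201[45]-*-*.gz; do
    zcat "$log" | awk -v file="$log" '
        /dollar for dollars/ { d++ }   # lines containing the first phrase
        /pollar for pollars/ { p++ }   # lines containing the second phrase
        END {
            printf "%s:dollar for dollars:%d\n", file, d
            printf "%s:pollar for pollars:%d\n", file, p
        }'
done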
If you want a count of the total number of occurrences of each pattern (which might be more than the number of lines in which each pattern appears, since a pattern can occur more than once on a line), you could use grep's -o option to output the actual matches, and then build the final report with sort | uniq -c, which counts the occurrences of each unique line in a stream. That also lets you supply multiple patterns to a single grep command, using the -e option:
for log in filename.log.201[45]-*-*.gz; do
    zcat "$log" |
        grep -e "pattern 1" -e "pattern 2" -ow |    # print each whole-word match on its own line
        sort | uniq -c |                            # count the occurrences of each distinct match
        xargs -d\\n printf "${log//%/%%}:%s\n"      # prefix each count line with the log name
done
I'm new to bash, so I'm having trouble doing something very basic.
Through playing with various scripts, I found out that the following script prints the lines that contain "word":
for file in *; do
    cat $file | grep "word"
done
Doing the following:
for file in *; do
    cat $file | grep "word" | wc -l
done
printed, on every iteration, how many times "word" appeared in the file.
How can I implement a counter for all those appearances and, at the end, just echo the counter?
I tried to use a counter that way, but it stayed at 0:
let x+=cat $filename | grep "word"
You can pipe the entire loop to wc -l.
for file in *; do
    cat $file | grep "word"
done | wc -l
This is a useless use of cat. How about:
for file in *; do
    grep "word" "$file"
done | wc -l
Actually, the entire loop is unnecessary if you pass all the file names to grep at once.
grep "word" * | wc -l
Note that if word shows up more than once on the same line, these solutions will only count that line once. If you want to count same-line occurrences separately, you can use -o to print each match on a separate line:
grep -o "word" * | wc -l
The one-liner in John's answer is the way to go. Just to satisfy your curiosity:
sum=0
for f in *; do
    x="$(grep 'word' "$f" | wc -l)"
    echo "x: $x"
    (( sum += x ))
done
echo "sum: $sum"
If the line containing the grep and wc does not yield a number, you are out of luck. That is why you should stick to the other solution, or do a pure bash implementation with things like read, case ... in *word*), or if [[ $line =~ $re_containing_word ]]; then ...
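As a rough sketch of that pure-bash route (assuming, as in the question, that you want the number of lines containing "word" across the regular files in the current directory):
count=0
for f in *; do
    [[ -f $f ]] || continue            # skip anything that is not a regular file
    while IFS= read -r line; do
        case $line in
            *word*) (( count++ )) ;;   # the line contains "word"
        esac
    done < "$f"
done
echo "count: $count"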
wc -l file.txt
outputs number of lines and file name.
I need just the number itself (not the file name).
I can do this
wc -l file.txt | awk '{print $1}'
But maybe there is a better way?
Try this way:
wc -l < file.txt
cat file.txt | wc -l
According to the man page (for the BSD version, I don't have a GNU version to check):
If no files are specified, the standard input is used and no file name is displayed. The prompt will accept input until receiving EOF, or [^D] in most environments.
To do this without the leading space, why not:
wc -l < file.txt | bc
Comparison of Techniques
I had a similar issue attempting to get a character count without the leading whitespace provided by wc, which led me to this page. After trying out the answers here, the following are the results from my personal testing on Mac (BSD Bash). Again, this is for character count; for line count you'd do wc -l. echo -n omits the trailing line break.
FOO="bar"
echo -n "$FOO" | wc -c # " 3" (x)
echo -n "$FOO" | wc -c | bc # "3" (√)
echo -n "$FOO" | wc -c | tr -d ' ' # "3" (√)
echo -n "$FOO" | wc -c | awk '{print $1}' # "3" (√)
echo -n "$FOO" | wc -c | cut -d ' ' -f1 # "" for -f < 8 (x)
echo -n "$FOO" | wc -c | cut -d ' ' -f8 # "3" (√)
echo -n "$FOO" | wc -c | perl -pe 's/^\s+//' # "3" (√)
echo -n "$FOO" | wc -c | grep -ch '^' # "1" (x)
echo $( printf '%s' "$FOO" | wc -c ) # "3" (√)
I wouldn't rely on the cut -f* method in general since it requires that you know the exact number of leading spaces that any given output may have. And the grep one works for counting lines, but not characters.
bc is the most concise, and awk and perl seem a bit overkill, but they should all be relatively fast and portable enough.
Also note that some of these can be adapted to trim surrounding whitespace from general strings, as well (along with echo `echo $FOO`, another neat trick).
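For example, a quick sketch of that trimming idea on an arbitrary padded string (the variable name padded is just for illustration):
padded="   3   "
echo `echo $padded`                 # unquoted word splitting collapses the whitespace: 3
echo "$padded" | awk '{print $1}'   # printing only the first field also gives: 3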
How about
wc -l file.txt | cut -d' ' -f1
i.e. pipe the output of wc into cut, using spaces as the delimiter and picking just the first field.
How about
grep -ch "^" file.txt
Obviously, there are a lot of solutions to this.
Here is another one though:
wc -l somefile | tr -d "[:alpha:][:blank:][:punct:]"
This only outputs the number of lines, but the trailing newline character (\n) is still present; if you don't want that either, replace [:blank:] with [:space:].
Another way to strip the leading whitespace without invoking an external command is to use arithmetic expansion, $((exp)):
echo $(($(wc -l < file.txt)))
The best way would be to first find all the files in the directory and then use awk's NR (Number of Records) variable.
Below is the command:
find <directory path> -type f | awk 'END{print NR}'
Example: find /tmp/ -type f | awk 'END{print NR}'
This works for me, using the normal wc -l and sed to strip any character that is not a number.
wc -l big_file.log | sed -E "s/([a-z\-\_\.]|[[:space:]]*)//g"
# 9249133
So I have an expression that I want to use to extract some lines from a text file and count them. I can grep them as follows:
$ cat medsCounts_totals.csv | grep -E 'NumMeds": 0' | wc -l
Which is fine. Now I want to loop over the string with different values ...
$ for i in {0..10}; do expr="NumMeds\": $i"; echo $expr; done
However, when I try to use $expr:
for i in {0..10}; do expr="NumMeds:\" $i"; cat medsCounts_totals.csv | grep -E "$expr" | wc -l ; done
I get nothing. How do I solve this problem in an elegant manner?
There is a typo in
for i in {0..10}; do expr="NumMeds:\" $i"; cat medsCounts_totals.csv | grep -E "$expr" | wc -l ; done
It should be:
"NumMeds\": $i"
I'm trying to count occurrences of a matching pattern in a variable, but it only returns 1 as the result. Here is what I'm trying to do:
x="HELLO|THIS|IS|TEST"
echo $x | grep -c "|"
Expected result: 3
Actual Result: 1
Do you know why it is returning 1 instead of 3?
Thanks.
grep -c counts matching lines, not matches within a line.
You can use awk to get a count:
x="HELLO|THIS|IS|TEST"
echo "$x" | awk -F '|' '{print NF-1}'
3
Alternatively you can use tr and wc:
echo "$x" | tr -dc '|' | wc -c
3
$ echo "$x" | grep -o '|' | grep -c .
3
grep -c does not count the number of matches. It counts the number of lines that match. By using grep -o, we put the matches on separate lines.
This approach works just as well with multiple lines:
$ cat file
hello|this|is
a|test
$ grep -o '|' file | grep -c .
3
The grep manual says:
grep, egrep, fgrep - print lines matching a pattern
and for the -c flag:
instead print a count of matching lines for each input file
and there is just one line that matches.
You don't need grep for this.
pipe_only=${x//[^|]} # remove everything except | from the value of x
echo "${#pipe_only}" # output the length of pipe_only
Try this:
$ x="HELLO|THIS|IS|TEST"; echo -n "$x" | sed 's/[^|]//g' | wc -c
3
With only one pipe, using perl:
echo "$x" |
perl -lne 'print scalar(() = /\|/g)'
(The empty-list assignment forces /\|/g into list context, so it returns every match; scalar then gives the number of matches.)