Count all the lines in all folders in bash

wc -l file.txt
outputs number of lines and file name.
I need just the number itself (not the file name).
I can do this
wc -l file.txt | awk '{print $1}'
But maybe there is a better way?

Try this way:
wc -l < file.txt

cat file.txt | wc -l
According to the man page (for the BSD version, I don't have a GNU version to check):
If no files are specified, the standard input is used and no file name is displayed. The prompt will accept input until receiving EOF, or [^D] in most environments.

To do this without the leading space, why not:
wc -l < file.txt | bc

Comparison of Techniques
I had a similar issue attempting to get a character count without the leading whitespace provided by wc, which led me to this page. After trying out the answers here, the following are the results from my personal testing on Mac (BSD Bash). Again, this is for character count; for line count you'd do wc -l. echo -n omits the trailing line break.
FOO="bar"
echo -n "$FOO" | wc -c # " 3" (x)
echo -n "$FOO" | wc -c | bc # "3" (√)
echo -n "$FOO" | wc -c | tr -d ' ' # "3" (√)
echo -n "$FOO" | wc -c | awk '{print $1}' # "3" (√)
echo -n "$FOO" | wc -c | cut -d ' ' -f1 # "" for -f < 8 (x)
echo -n "$FOO" | wc -c | cut -d ' ' -f8 # "3" (√)
echo -n "$FOO" | wc -c | perl -pe 's/^\s+//' # "3" (√)
echo -n "$FOO" | wc -c | grep -ch '^' # "1" (x)
echo $( printf '%s' "$FOO" | wc -c ) # "3" (√)
I wouldn't rely on the cut -f* method in general since it requires that you know the exact number of leading spaces that any given output may have. And the grep one works for counting lines, but not characters.
bc is the most concise, and awk and perl seem a bit overkill, but they should all be relatively fast and portable enough.
Also note that some of these can be adapted to trim surrounding whitespace from general strings, as well (along with echo `echo $FOO`, another neat trick).

How about
wc -l file.txt | cut -d' ' -f1
i.e. pipe the output of wc into cut (where delimiters are spaces and pick just the first field)

How about
grep -ch "^" file.txt

Obviously, there are a lot of solutions to this.
Here is another one though:
wc -l somefile | tr -d "[:alpha:][:blank:][:punct:]"
This outputs only the number of lines, but the trailing newline character (\n) is still present. If you don't want that either, replace [:blank:] with [:space:]. Note that any digits in the file name itself would survive the tr and end up in the output.

Another way to strip the leading whitespace without invoking an additional external command is to use arithmetic expansion $((exp)):
echo $(($(wc -l < file.txt)))
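A minimal sketch of the arithmetic-expansion approach wrapped in a function. The helper name line_count and the temporary file are assumptions made for illustration; they do not appear in the answers above.

```shell
#!/usr/bin/env bash
# line_count is a hypothetical helper name, not from the answers above.
line_count() {
    # Redirecting the file to stdin keeps wc from printing the file name;
    # $(( ... )) then drops any padding whitespace around the number.
    echo $(( $(wc -l < "$1") ))
}

tmp=$(mktemp)                        # scratch file for the demonstration
printf 'one\ntwo\nthree\n' > "$tmp"  # three newline-terminated lines
count=$(line_count "$tmp")
echo "$count"
rm -f "$tmp"
```

Because the padding is removed inside the shell, the result can be compared or concatenated directly, regardless of whether the local wc right-aligns its output.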

One approach is to first find all files in the directory and then use awk's NR (number of records) variable. Note that this counts the files that were found, not the lines inside them.
The command is:
find <directory path> -type f | awk 'END{print NR}'
Example: find /tmp/ -type f | awk 'END{print NR}'
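If the goal is the total number of lines across every file under a directory, rather than the number of files, one possible sketch is to concatenate the files and count once. This swaps in find -exec cat, which is not what the answer above uses, and the directory tree below is made up purely for the demonstration.

```shell
#!/usr/bin/env bash
# Build a throwaway directory tree (hypothetical contents) to count.
dir=$(mktemp -d)
mkdir -p "$dir/sub"
printf 'a\nb\n' > "$dir/one.txt"
printf 'c\nd\ne\n' > "$dir/sub/two.txt"

# Number of regular files found (what find | awk 'END{print NR}' reports).
file_count=$(find "$dir" -type f | awk 'END{print NR}')

# Total lines in all of those files: cat them together, count once,
# and strip the padding that some wc implementations add.
total_lines=$(find "$dir" -type f -exec cat {} + | wc -l | tr -d ' ')

echo "files: $file_count, lines: $total_lines"
rm -rf "$dir"
```

Counting once over the concatenated stream avoids having to sum per-file results or filter out the "total" line that wc prints when given several file names.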

This works for me using plain wc -l and sed to strip the characters that are not part of the number. Digits in the file name would survive, so this assumes a name like big_file.log.
wc -l big_file.log | sed -E "s/([a-z\-\_\.]|[[:space:]]*)//g"
# 9249133

Related

Can any one help me to understand this?

I'm new to shell scripting and I Can't understand those lines:
wc -l $x|sed 's/\s\+/|/g'
rc=`echo "$BTEQ_OUT"|grep "RC (return code)"| sed 's/ //g' | cut -d '=' -f2|tr -d "\r\n "`;
When you see a long pipeline, one useful technique for understanding it is to execute it piece by piece:
first, what's in $x?
echo $x
is that the name of a file?
ls -l $x
what does wc do?
wc -l $x
ok, what does the sed part do? (note, \s requires GNU sed)
wc -l $x | sed 's/\s\+/|/g'
Similarly:
echo "$BTEQ_OUT"
echo "$BTEQ_OUT"|grep "RC (return code)"
echo "$BTEQ_OUT"|grep "RC (return code)"| sed 's/ //g'
echo "$BTEQ_OUT"|grep "RC (return code)"| sed 's/ //g' | cut -d '=' -f2
echo "$BTEQ_OUT"|grep "RC (return code)"| sed 's/ //g' | cut -d '=' -f2|tr -d "\r\n ";
wc -l $x|sed 's/\s\+/|/g'
wc is a tool used for counting; with the -l flag it counts the lines in a file or in its standard input.
$x is the variable holding probably a file name to be passed into wc
| called 'pipe' passes the output of the command before as the input into the command after
sed is a stream editor, another tool used to edit text in files or streams.
's/\s\+/|/g' is a sed substitution that globally (g) replaces each run of one or more whitespace characters with a pipe symbol '|'
This program does the following:
Count how many lines are in $x and, in whatever is output, replace each run of whitespace with a pipe symbol.
The fact that they expect multiple outputs from wc -l hints that $x might store more than one file ...
I'd suggest looking into what the other commands are, what they do, and how they interact. They are listed below:
echo
tr
cut
pipe
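The piece-by-piece technique can be tried on the second pipeline with made-up input. The contents of BTEQ_OUT below are an assumption (the real output is not shown in the question); the step1 to step3 variable names are also just for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sample of what $BTEQ_OUT might contain; the real output is unknown.
BTEQ_OUT=$(printf 'Session started\r\nRC (return code) = 8\r\nSession ended\r\n')

step1=$(echo "$BTEQ_OUT" | grep "RC (return code)")  # keep only the RC line
step2=$(echo "$step1" | sed 's/ //g')                # delete every space
step3=$(echo "$step2" | cut -d '=' -f2)              # keep the text after '='
rc=$(echo "$step3" | tr -d "\r\n ")                  # drop CR, LF and spaces

# Brackets make stray whitespace or carriage returns easier to spot.
echo "rc=[$rc]"
```

Storing each stage in its own variable makes it easy to echo any intermediate result and see exactly which command changed what.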

Count of matching word, pattern or value from unix korn shell scripting is returning just 1 as count

I'm trying to get the count of a matching pattern from a variable to check the count of it, but it's only returning 1 as the results, here is what I'm trying to do:
x="HELLO|THIS|IS|TEST"
echo $x | grep -c "|"
Expected result: 3
Actual Result: 1
Do you know why it is returning 1 instead of 3?
Thanks.
grep -c counts lines not matches within a line.
You can use awk to get a count:
x="HELLO|THIS|IS|TEST"
echo "$x" | awk -F '|' '{print NF-1}'
3
Alternatively you can use tr and wc:
echo "$x" | tr -dc '|' | wc -c
3
$ echo "$x" | grep -o '|' | grep -c .
3
grep -c does not count the number of matches. It counts the number of lines that match. By using grep -o, we put the matches on separate lines.
This approach works just as well with multiple lines:
$ cat file
hello|this|is
a|test
$ grep -o '|' file | grep -c .
3
The grep manual says:
grep, egrep, fgrep - print lines matching a pattern
and for the -c flag:
instead print a count of matching lines for each input file
and there is just one line that matches.
You don't need grep for this.
pipe_only=${x//[^|]} # remove everything except | from the value of x
echo "${#pipe_only}" # output the length of pipe_only
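A small sketch that wraps the parameter-expansion approach in a reusable function. The name count_pipes is hypothetical and this relies on bash's ${var//pattern/} expansion, so it is not expected to work in a plain POSIX sh.

```shell
#!/usr/bin/env bash
# count_pipes is a hypothetical helper name used only for this illustration.
count_pipes() {
    # Remove everything except | from the argument, then print what is left's length.
    local pipe_only=${1//[^|]/}
    echo "${#pipe_only}"
}

x="HELLO|THIS|IS|TEST"
pipes=$(count_pipes "$x")
echo "$pipes"
```

No external process is started, which can matter when the count is needed inside a tight loop.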
Try this :
$ x="HELLO|THIS|IS|TEST"; echo -n "$x" | sed 's/[^|]//g' | wc -c
3
With only one pipe with perl:
echo "$x" |
perl -lne 'print scalar(() = /\|/g)'

Remove all chars that are not a digit from a string

I'm trying to make a small function that removes all the chars that are not digits.
123a45a ---> will become ---> 12345
I've come up with:
temp=$word | grep -o [[:digit:]]
echo $temp
But instead of 12345 I get 1 2 3 4 5. How do I get rid of the spaces?
Pure bash:
word=123a45a
number=${word//[^0-9]}
Here's a pure bash solution
var='123a45a'
echo ${var//[^0-9]/}
12345
is this what you are looking for?
kent$ echo "123a45a"|sed 's/[^0-9]//g'
12345
grep & tr
echo "123a45a"|grep -o '[0-9]'|tr -d '\n'
12345
I would recommend using sed or perl instead:
temp="$(sed -e 's/[^0-9]//g' <<< "$word")"
temp="$(perl -pe 's/\D//g' <<< "$word")"
Edited to add: If you really need to use grep, then this is the only way I can think of:
temp="$( grep -o '[0-9]' <<< "$word" \
| while IFS= read -r ; do echo -n "$REPLY" ; done
)"
. . . but there's probably a better way. (It uses grep -o, like your solution, then runs over the lines that it outputs and re-outputs them without line-breaks.)
Edited again to add: Now that you've mentioned that you can use tr instead, this is much easier:
temp="$(tr -cd 0-9 <<< "$word")"
What about using sed?
$ echo "123a45a" | sed -r 's/[^0-9]//g'
12345
As I read it, you are only allowed to use grep and tr, so this can do the trick:
$ echo "123a45a" | grep -o [[:digit:]] | tr -d '\n'
12345
In your case,
temp=$(echo $word | grep -o [[:digit:]] | tr -d '\n')
tr will also work:
echo "123a45a" | tr -cd '[:digit:]'
# output: 12345
Grep returns the result on different lines:
$ echo -e "$temp"
1
2
3
4
5
So you cannot remove those spaces during the filtering, but you can afterwards, since $temp can be transformed like this:
temp=`echo $temp | tr -d ' '`
$ echo "$temp"
12345
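Since the question asks for a small function, here is one possible sketch built on the tr approach from the answers. The function name digits_only is an assumption made for illustration.

```shell
#!/usr/bin/env bash
# digits_only is a hypothetical function name chosen for this sketch.
digits_only() {
    # tr -c complements the set (everything that is NOT 0-9), -d deletes it.
    printf '%s' "$1" | tr -cd '0-9'
    echo    # finish with a newline, since tr removed any that were there
}

word=123a45a
temp=$(digits_only "$word")
echo "$temp"
```

Using printf instead of echo for the input avoids surprises when the string starts with a dash or contains backslashes.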

Get just the integer from wc in bash

Is there a way to get the integer that wc returns in bash?
Basically I want to write the line numbers and word counts to the screen after the file name.
output: filename linecount wordcount
Here is what I have so far:
files=`ls`
for f in $files;
do
    if [ ! -d $f ] #only print out information about files !directories
    then
        # some way of getting the wc integers into shell variables and then printing them
        echo "$f $lines $words"
    fi
done
Most simple answer ever:
wc < filename
Just:
wc -l < file_name
will do the job. But this output includes prefixed whitespace as wc right-aligns the number.
You can use the cut command to get just the first word of wc's output (which is the line or word count):
lines=`wc -l $f | cut -f1 -d' '`
words=`wc -w $f | cut -f1 -d' '`
wc "$file" | awk '{print $4, $2, $1}'
Adjust as necessary for your layout.
It's also nicer to use positive logic ("is a file") over negative ("not a directory"):
[ -f "$file" ] && wc "$file" | awk '{print $4, $2, $1}'
Sometimes wc formats its output differently on different platforms. For example:
In OS X (the count is right-aligned with leading spaces):
$ echo aa | wc -l
       1
In CentOS:
$ echo aa | wc -l
1
So using only cut may not retrieve the number. Instead try tr to delete space characters:
$ echo aa | wc -l | tr -d ' '
The accepted/popular answers do not work on OSX.
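The first and last of the following should be portable on BSD and Linux. The middle one relies on the leading padding that BSD wc prints; with GNU wc, field 2 is the file name.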
wc -l < "$f" | tr -d ' '
OR
wc -l "$f" | tr -s ' ' | cut -d ' ' -f 2
OR
wc -l "$f" | awk '{print $1}'
If you redirect the filename into wc it omits the filename on output.
Bash:
read lines words characters <<< $(wc < filename)
or
read lines words characters <<EOF
$(wc < filename)
EOF
Instead of using for to iterate over the output of ls, do this:
for f in *
which will work if there are filenames that include spaces.
If you can't use globbing, you should pipe into a while read loop:
find ... | while read -r f
or use process substitution
while read -r f
do
something
done < <(find ...)
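Putting the pieces together, the following is a sketch of a while read loop fed by process substitution that prints each file name with its line and word counts. The scratch directory, its files, and the report variable are assumptions for the demonstration; file names containing newlines would still break this loop.

```shell
#!/usr/bin/env bash
# Throwaway directory with hypothetical files, one of them with a space in its name.
dir=$(mktemp -d)
printf 'one two\nthree\n' > "$dir/a file.txt"
printf 'x\n' > "$dir/b.txt"

report=""
while IFS= read -r f; do
    # wc on stdin prints "lines words bytes" with no file name;
    # read splits on whitespace, so any padding is ignored.
    read -r lines words _ < <(wc < "$f")
    report+="${f##*/} $lines $words"$'\n'
done < <(find "$dir" -maxdepth 1 -type f | sort)

printf '%s' "$report"
rm -rf "$dir"
```

Because the loop reads from process substitution rather than a pipe, it runs in the current shell and the report variable is still set after the loop ends.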
If the file is small you can afford calling wc twice, and use something like the following, which avoids piping into an extra process:
lines=$((`wc -l < "$f"`))
words=$((`wc -w < "$f"`))
The $((...)) is the Arithmetic Expansion of bash. It removes any whitespace from the output of wc in this case.
This solution makes more sense if you need only the line count or only the word count.
How about with sed?
wc -l /path/to/file.ext | sed 's/ *\([0-9]* \).*/\1/'
typeset -i a=$(wc -l fileName.dat | xargs echo | cut -d' ' -f1)
Try this for numeric result:
nlines=$( wc -l < $myfile )
Something like this may help:
#!/bin/bash
printf '%-10s %-10s %-10s\n' 'File' 'Lines' 'Words'
for fname in file_name_pattern*; do
    [[ -d $fname ]] && continue   # skip directories
    lines=0
    words=()
    while read -r line; do
        ((lines++))
        words+=($line)            # unquoted on purpose: split the line into words
    done < "$fname"
    printf '%-10s %-10s %-10s\n' "$fname" "$lines" "${#words[@]}"
done
To (1) run wc once, and (2) not assign any superfluous variables, use
read lines words <<< $(wc < $f | awk '{ print $1, $2 }')
Full code:
for f in *
do
if [ ! -d $f ]
then
read lines words <<< $(wc < $f | awk '{ print $1, $2 }')
echo "$f $lines $words"
fi
done
Example output:
$ find . -maxdepth 1 -type f -exec wc {} \; # without formatting
1 2 27 ./CNAME
21 169 1065 ./LICENSE
33 130 961 ./README.md
86 215 2997 ./404.html
71 168 2579 ./index.html
21 21 478 ./sitemap.xml
$ # the above code
404.html 86 215
CNAME 1 2
index.html 71 168
LICENSE 21 169
README.md 33 130
sitemap.xml 21 21
The solutions proposed in the answered question don't work on Darwin kernels.
Please consider the following solutions, which work on all UNIX systems:
print exactly the number of lines of a file:
wc -l < file.txt | xargs
print exactly the number of characters of a file:
wc -m < file.txt | xargs
print exactly the number of bytes of a file:
wc -c < file.txt | xargs
print exactly the number of words of a file:
wc -w < file.txt | xargs
There is a great solution with examples on stackoverflow here
I will copy the simplest solution here:
FOO="bar"
echo -n "$FOO" | wc -c | bc # "3"
Maybe these pages should be merged?
Try this:
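wc `ls` | awk '$NF != "total" { LINE += $1; WC += $2 } END { print "lines: " LINE " words: " WC }'
It keeps a running line count and word count (LINE and WC), increasing them with the values extracted from wc (using $1 for the first column's value and $2 for the second), and finally prints the results. The $NF != "total" condition skips the summary line that wc prints when given several files, which would otherwise double the sums.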
"Basically I want to write the line numbers and word counts to the screen after the file name."
answer=(`wc $f`)
echo -e "${answer[3]}
lines: ${answer[0]}
words: ${answer[1]}
bytes: ${answer[2]}"
Outputs :
myfile.txt
lines: 10
words: 20
bytes: 120
files=`ls`
echo "$files" | wc -l | perl -pe "s#^\s+##"
You have to use input redirection for wc:
number_of_lines=$(wc -l <myfile.txt)
respectively in your context
echo "$f $(wc -l <"$f") $(wc -w <"$f")"
