Find TXT files and show Total Count of records of each file and Size of each file

Find TXT files and show Total Count of records of each file and Size of each file - shell

I need to find row Count and size of each TXT files.
It needs to search all the directories and just show result as :
FileName|Cnt|Size
ABC.TXT|230|23MB
Here is some code:
v_DIR=$1
echo "the directory to cd is "$1
x=`ls -l $0 | awk '{print $9 "|" $5}'`
y=`awk 'END {print NR}' $0`
echo $x '|' $y

Try something like
find -type f -name '*.txt' -exec bash -c 'lines=$(wc -l "$0" | cut -d " " -f1); size=$(du -h "$0" | cut -f1); echo "$0|$lines|$size"' {} \;

Related

Perform a CAT in FOR and SSH

I do not have much experience in shell script, therefore I need your help. I have the following query, I need to make a CAT to the files that I list, but I have not managed to know where to place the command. Thank you:
read date
echo -e "RECORDINGS"
for e in $Rec
do
sshpass -p password ssh user#server find $e "-type f -mtime -10 -exec ls -gGh --full-time {} \;" | cut -d ' ' -f 4,7 | grep $date | awk -F " " '{print $2}'
done

Ignoring that much of the code here is outright dangerous --
find_on_server() {
local e_q
printf -v e_q '%q ' "$1"
sshpass -p password ssh user#server "bash -s $e_q" <<'EOF'
e=$1
find "$e" -type f -mtime -10 -exec ls -gGh --full-time {} \;
EOF
# ^^^ the above line MUST NOT BE INDENTED
}
find_on_server "$e" | cut -d ' ' -f 4,7 | grep $date | awk -F " " '{print $2}'

Bash Script not working when trying to find large ASCII files

So what I'm trying to do is find large ASCII files and then print out the name of the file and then how many lines, but when I start my script it doesn't find anything.
find / -type f -size +2000c -exec file {} \; 2>/dev/null | awk -F':' '/: ASCII text/ {print $1}' | while read FILENAME; do LINES="$(wc -l)"; if [ $LINES > 10000 ]; then echo $FILENAME && echo $LINES; fi; done

what you got wrong?
if [ $LINES > 10000 ] here > goes for a string comparison. To use a numberic comparion -gt must be used as
if [ $LINES -gt 10000 ]

Please try this:
find / -type f -size +2000c -print0 | xargs.exe -0 grep -Z -L -e '[^[:print:]]' 2>/dev/null | xargs -0 awk 'ENDFILE { if (FNR > 10000) { print FILENAME " " FNR } }'
The idea is to filter out binary files with grep and feed awk with the list of filtered files to finally filter out files with line count less or equal to 10000.
btw, it handles files with white space in names gracefully.

To get \n instead of n in echo -e command in shell script

I am trying to get the output for the echo -e command as shown below
Command used:
echo -e "cd \${2}\nfilesModifiedBetweenDates=\$(find . -type f -exec ls -l --time-style=full-iso {} \; | awk '{print \$6,\$NF}' | awk '{gsub(/-/,\"\",\$1);print}' | awk '\$1>= '$fromDate' && \$1<= '$toDate' {print \$2}' | tr \""\n"\" \""\;"\")\nIFS="\;" read -ra fileModifiedArray <<< "\$filesModifiedBetweenDates"\nfor fileModified in \${fileModifiedArray[#]}\ndo\n egrep -w "\$1" "\$fileModified" \ndone"
cd ${2}
Expected output:
cd ${2}
filesModifiedBetweenDates=$(find . -type f -exec ls -l --time-style=full-iso {} \; | awk '{print $6,$NF}' | awk '{gsub(/-/,"",$1);print}' | awk '$1>= '20140806' && $1<= '20140915' {print $2}' | tr "\n" ";")
IFS=; read -ra fileModifiedArray <<< $filesModifiedBetweenDates
for fileModified in ${fileModifiedArray[#]}
do
egrep -w $1 $fileModified
done
Original Ouput:
cd ${2}
filesModifiedBetweenDates=$(find . -type f -exec ls -l --time-style=full-iso {} \; | awk '{print $6,$NF}' | awk '{gsub(/-/,"",$1);print}' | awk '$1>= '20140806' && $1<= '20140915' {print $2}' | tr "n" ";")
IFS=; read -ra fileModifiedArray <<< $filesModifiedBetweenDates
for fileModified in ${fileModifiedArray[#]}
do
egrep -w $1 $fileModified
done
How can i handle "\" in this ?

For long blocks of text, it's much simpler to use a quoted here document than trying to embedded a multi-line string into a single argument to echo or printf.
cat <<"EOF"
cd ${2}
filesModifiedBetweenDates=$(find . -type f -exec ls -l --time-style=full-iso {} \; | awk '{print $6,$NF}' | awk '{gsub(/-/,"",$1);print}' | awk '$1>= '20140806' && $1<= '20140915' {print $2}' | tr "\n" ";")
IFS=; read -ra fileModifiedArray <<< $filesModifiedBetweenDates
for fileModified in ${fileModifiedArray[#]}
do
egrep -w $1 $fileModified
done
EOF

You'd better use printf to have a better control:
$ printf "tr %s %s\n" '"\n"' '";"'
tr "\n" ";"
As you see, we indicate the parameters within double quotes: printf "text %s %s" and then we define what content should be stored in this parameters.
In case you really have to use echo, then escape the \:
$ echo -e 'tr "\\n" ";"'
tr "\n" ";"
Interesting read: Why is printf better than echo?

shell script sum in for loop not working

size=`ls -l /var/temp.* | awk '{ print $5}'`
fin_size=0
for row in ${size} ;
do
fin_size=`echo $(( $row + $fin_size )) | bc`;
done
echo $fin_size
is not working !! echo $fin_size is throwing some garbage minus value.
where I'm mistaking?
(my bash is old and I suppose to work in this only Linux kernel: 2.6.39)

Don't parse ls.
Why not use du as shown below?
du -cb /var/temp.* | tail -1

Because it cannot be stressed enough: Why you shouldn't parse the output of ls(1)
Use e.g. du as suggested by dogbane, or find:
$ find /var -maxdepth 1 -type f -name "temp.*" -printf "%s\n" | awk '{total+=$1}END{print total}'
or stat:
$ stat -c%s /var/temp.* | awk '{total+=$1}END{print total}'
or globbing and stat (unnecessary, slow):
total=0
for file in /var/temp.*; do
[ -f "${file}" ] || continue
size="$(stat -c%s "${file}")"
((total+=size))
done
echo "${total}"

Below should be enough:
ls -l /var/temp.* | awk '{a+=$5}END{print a}'
No need for you to run the for loop.This means:
size=ls -l /var/temp.* | awk '{ print $5}'`
fin_size=0
for row in ${size} ;
do
fin_size=`echo $(( $row + $fin_size )) | bc`;
done
echo $fin_size
The whole above thing can be replaced with :
fin_size=`ls -l /var/temp.* | awk '{a+=$5}END{printf("%10d",a);}'`
echo $fin_size

extract information regarding : size && time && row_count in one line shell script

Hey every one! I am pretty new for shell script and I am stuck
I need to extract information regarding: file_name && size && time && row_count and I want it do in one command line. I tried like this :
ls -l * && wc -l file.txt && du -ks file.txt | cut -f1| awk '{print $5" " $6 " " $7 " "$8 " " $9 " "$1 " "$2}'
but is not working properly
I also tried do in loop but i dont know how extract from there
for file in `ls -ltr /export/home/oracle/dbascripts/scripts`
do
[[ -f $file ]] && echo $file | awk '{print $3}'
done
Then I want to redirect to file like this >> for sql loader purpose.
Thanks in advance!

This could be a start if you have GNU find and GNU coreutils (most Linux distribution will do):
for i in /my/path/*; do
find "$i" ! -type d -printf '%p %TY-%Tm-%Td %TH:%TM:%TS %s '
wc -l <"$i"
done
/my/path/* should be modified to reflect the files you want to probe.
Also keep in mind that this one-liner has a few major issues if any directories are specified. This should be safer in that regard:
for i in *; do
if [[ -d "$i" ]]; then
continue
fi
find "$i" -printf '%p %TY-%Tm-%Td %TH:%TM:%TS %s '
wc -l <"$i"
done
You will want to see the manual page for GNU find to understand this better.
EDIT:
There is at least other faster way, using join and bash process substitution, but it's a bit ugly and somewhat harder to make safe and work the kinks out of.

ExtractInformation()
{
timesep="-"
sep="|"
dot=":"
sec="00"
lcount=`wc -l < $fname`
modf_time=`ls -l $fname`
f_size=`echo $modf_time | awk '{print $5}'`
time_month=`echo $modf_time | awk '{print $6}'`
time_day=`echo $modf_time | awk '{print $7}'`
time_hrmin=`echo $modf_time | awk '{print $8}'`
time_hr=`echo $time_hrmin | cut -d ':' -f1`
time_min=`echo $time_hrmin | cut -d ':' -f2`
time_year=`date '+%Y'`
time_param="DD-MON-YYYY HH24:MI:SS"
time_date=$time_day$timesep$time_month$timesep$time_year" "$time_hrmin$dot$sec
result=$fname$sep$time_date$sep$f_size$sep$lcount$sep$time_param
sqlresult=`echo $result | awk '{FS = "|" ;q=sprintf("%c", 39); print "INSERT INTO SIP_ICMS_FILE_T(f_name, f_date_time,f_size,f_row_count) VALUES (" q $1 q ", TO_DATE("q $2 q,q $5 q "),"$3","$4");";}'`
echo $sqlresult>>data.sql
echo "Reading data....."
}
UploadData()
{
#ss=`sqlplus -s a/a#adb #data.sql
#set serveroutput on
#set feedback off
#set echo off`
echo "loading with sql Loader....."
}
f_data=data.sql
[[ -f $f_data ]] && rm data.sql
for fname in * ;
do
if [[ -f $fname ]] then
ExtractInformation
fi
UploadData
#Zipdata
done

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

Find TXT files and show Total Count of records of each file and Size of each file - shell

I need to find row Count and size of each TXT files. It needs to search all the directories and just show result as : FileName|Cnt|Size ABC.TXT|230|23MB Here is some code: v_DIR=$1 echo "the directory to cd is "$1 x=`ls -l $0 | awk '{print $9 "|" $5}'` y=`awk 'END {print NR}' $0` echo $x '|' $y

Try something like find -type f -name '*.txt' -exec bash -c 'lines=$(wc -l "$0" | cut -d " " -f1); size=$(du -h "$0" | cut -f1); echo "$0|$lines|$size"' {} \;

Related

Perform a CAT in FOR and SSH

Bash Script not working when trying to find large ASCII files

To get \n instead of n in echo -e command in shell script

shell script sum in for loop not working

extract information regarding : size && time && row_count in one line shell script

Categories

Resources