extract information regarding: size && time && row_count in a one-line shell script - bash

Hey everyone! I am pretty new to shell scripting and I am stuck.
I need to extract the following information: file_name && size && time && row_count, and I want to do it in one command line. I tried this:
ls -l * && wc -l file.txt && du -ks file.txt | cut -f1| awk '{print $5" " $6 " " $7 " "$8 " " $9 " "$1 " "$2}'
but it is not working properly.
I also tried doing it in a loop, but I don't know how to extract the information from there:
for file in `ls -ltr /export/home/oracle/dbascripts/scripts`
do
[[ -f $file ]] && echo $file | awk '{print $3}'
done
Then I want to redirect the output to a file (with >>) for SQL*Loader purposes.
Thanks in advance!

This could be a start if you have GNU find and GNU coreutils (most Linux distributions will do):
for i in /my/path/*; do
find "$i" ! -type d -printf '%p %TY-%Tm-%Td %TH:%TM:%TS %s '
wc -l <"$i"
done
/my/path/* should be modified to reflect the files you want to probe.
Also keep in mind that this one-liner has a few major issues if any directories are specified. This should be safer in that regard:
for i in *; do
if [[ -d "$i" ]]; then
continue
fi
find "$i" -printf '%p %TY-%Tm-%Td %TH:%TM:%TS %s '
wc -l <"$i"
done
You will want to see the manual page for GNU find to understand this better.
EDIT:
There is at least one other, faster way, using join and bash process substitution, but it's a bit ugly and somewhat harder to make safe and to work the kinks out of.
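A minimal sketch of that idea (an assumption-laden illustration, not a hardened solution: it needs GNU find and coreutils, file names without spaces or newlines, and /my/path is just a placeholder):
join <(find /my/path -maxdepth 1 ! -type d -printf '%p %TY-%Tm-%Td %TH:%TM:%TS %s\n' | sort) \
     <(wc -l /my/path/* | awk '$2 != "total" {print $2, $1}' | sort)
Both streams are keyed on the file path in their first column, so join merges the metadata from find with the line count from wc in one pass instead of running wc once per file.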

ExtractInformation()
{
timesep="-"
sep="|"
dot=":"
sec="00"
lcount=`wc -l < $fname`
modf_time=`ls -l $fname`
f_size=`echo $modf_time | awk '{print $5}'`
time_month=`echo $modf_time | awk '{print $6}'`
time_day=`echo $modf_time | awk '{print $7}'`
time_hrmin=`echo $modf_time | awk '{print $8}'`
time_hr=`echo $time_hrmin | cut -d ':' -f1`
time_min=`echo $time_hrmin | cut -d ':' -f2`
time_year=`date '+%Y'`
time_param="DD-MON-YYYY HH24:MI:SS"
time_date=$time_day$timesep$time_month$timesep$time_year" "$time_hrmin$dot$sec
result=$fname$sep$time_date$sep$f_size$sep$lcount$sep$time_param
sqlresult=`echo $result | awk -F'|' '{q=sprintf("%c", 39); print "INSERT INTO SIP_ICMS_FILE_T(f_name, f_date_time,f_size,f_row_count) VALUES (" q $1 q ", TO_DATE("q $2 q,q $5 q "),"$3","$4");";}'`
echo "$sqlresult" >> data.sql
echo "Reading data....."
}
UploadData()
{
#ss=`sqlplus -s a/a#adb #data.sql
#set serveroutput on
#set feedback off
#set echo off`
echo "loading with sql Loader....."
}
f_data=data.sql
[[ -f $f_data ]] && rm data.sql
for fname in * ;
do
if [[ -f $fname ]]; then
ExtractInformation
fi
UploadData
#Zipdata
done

Related

bulk write in Unix using shell script

Is there any way to write data to a file in bulk in a shell script, instead of writing it line by line?
In the script below, I want to write the difference between the arrival time and the generation time of each file to a test.csv file.
########################################################
echo "Starting the Execution for Time difference\n";
############################################################
# Functions used across the script
datediff() {
Unixtime=`echo $1 $2 $3 $4`
Filetime=`echo $5 $6 $7 $8`
echo $Unixtime;
echo $Filetime;
d1=`date -d "$Unixtime" +%s`
d2=`date -d "$Filetime" +%s`
echo $d1;
echo $d2;
TIME_DIFF=`expr $d1 - $d2`
TIME_DIFF=`expr $TIME_DIFF / 60`
echo $TIME_DIFF;
echo "$Unixtime,$Filetime,$TIME_DIFF,$9" >> ../test.csv
}
rm -f ../test.csv;
for i in `ls -1 | grep -v 'DelayCheck.s*'`
do
DayMonth=`ls -lrt $i | awk '{print $7" "$6" "}'`
Year=`ls --full-time $i | awk '{print $6}' | cut -c1-4`
HourMin=`ls -lrt $i | awk '{print " "$8}'`
timeA=`echo $DayMonth $Year $HourMin`
FileYearMonDay=`ls -ltr $i | awk '{print $9}' | awk -F'--' '{print $3}' | cut -c2-9`
timeB1=`date -d $FileYearMonDay +'%d %b %Y'`
timeB2=`echo $i | awk -F'--' '{print substr($3,10,13)}' | sed -e 's/../:&/2g'`
timeB=`echo $timeB1 $timeB2`
echo "Time A is $timeA";
echo " Time b is $timeB";
datediff $timeA $timeB $i
done
echo $?;
The script works fine, but the problem is that there are over 100k files, so the script's performance is bad.
I tried to search for a way to write data to the file in bulk, but I didn't find any solution.
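One common way to reduce the per-file cost (a sketch, not a drop-in rewrite of the script above) is to open ../test.csv once by redirecting the whole loop, instead of appending with >> inside datediff for every file, and to avoid calling ls and date several times per file. A minimal illustration of the single-redirection idea, with the generation-time parsing left as a placeholder:
#!/bin/bash
# Sketch: one redirection for the whole loop; assumes GNU date.
for i in *; do
    [ -f "$i" ] || continue
    arrival=$(date -r "$i" +%s)       # modification time in epoch seconds (one date call per file)
    generation=$arrival               # placeholder: parse the real generation time from "$i" here
    printf '%s,%s,%s,%s\n' "$arrival" "$generation" "$(( (arrival - generation) / 60 ))" "$i"
done > ../test.csv
Because the redirection is attached to the loop, the output file is opened only once for the whole run rather than once per line.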

How to pass a variable string to a txt file at the beginning of the text?

I have a problem.
I have a general program, gene.sh,
that for every file (e.g. geneX.csv) makes a directory with the name of the gene (example: geneX/geneX.csv). Next, this program runs another program from inside gene.sh, but that program needs a variable and I don't know how to pass it.
This is the program gene.sh:
#!/bin/bash
# Create a directory for each file *.xlsx and *.csv
for fname in *.xlsx *.csv
do
dname=${fname%.*}
[[ -d $dname ]] || mkdir "$dname"
mv "$fname" "$dname"
done
# For each gene, go inside the directory and run the programs getChromosomicPositions.sh to get the positions, and getHaplotypeStrings.sh to get the variants
for geni in */; do
cd $geni
z=$(tail -n 1 *.csv | tr ';' "\n" | wc -l)
cd ..
cp getChromosomicPositions.sh $geni --->
cp getHaplotypeStrings.sh $geni
cd $geni
export z
./getChromosomicPositions.sh *.csv
export z
./getHaplotypeStrings.sh *.csv
cd ..
done
This is the program getChromosomicPositions.sh:
rm chrPosRs.txt
grep '^Haplotype\ ID' $1 | cut -d ";" -f 4-61 | tr ";" "\n" | awk '{print "select chrom,chromStart,chromEnd,name from snp147 where name=\""$1"\";"}' > listOfQuery.txt
while read l; do
echo $l > query.txt
mysql -h genome-mysql.cse.ucsc.edu -u genome -A -D hg38 --skip-column-names < query.txt > queryResult.txt
if [[ "$(cat queryResult.txt)" == "" ]];
then
cat query.txt |
while read line; do
echo $line | awk '$6 ~/rs/ {print $6}' > temp.txt;
if [[ "$(cat temp.txt)" != "" ]];
then cat temp.txt | awk -F'name="' '{print $2}' | sed -e 's/";//g' > temp.txt;
./getHGSVposHG19.sh temp.txt ---> Here is the problem --->
else
echo $line | awk '{num=sub(/.*:g\./,"");num+=sub(/\".*/,"");if(num==2){print};num=""}' > temp2.txt
fi
done
cat query.txt >> varianti.txt
echo "Missing Data" >> chrPosRs.txt
else
cat queryResult.txt >> chrPosRs.txt
fi
done < listOfQuery.txt
rm query*
Here is the problem:
I need to go into the file temp.txt and automatically put the variable $geni from the program gene.sh at the beginning of the file.
How can I do that?
Why not pass "$geni" as, say, the first argument when invoking your script, and treat the rest of the arguments as your expected .csv files:
./getChromosomicPositions.sh "$geni" *.csv
Alternatively, you can set it as an environment variable for the script, so that it can be used there (or just export it):
geni="$geni" ./getChromosomicPositions.sh *.csv
In any case, once you have it available in the second script, you can do the following.
If it was passed as the first argument:
echo "${1}:$(cat temp.txt | awk -F'name="' '{print $2}' | sed -e 's/";//g')
or, if it was passed as an environment variable:
echo "${geni}:$(cat temp.txt | awk -F'name="' '{print $2}' | sed -e 's/";//g')

Find TXT files and show Total Count of records of each file and Size of each file

I need to find the row count and size of each TXT file.
It needs to search all the directories and just show the result as:
FileName|Cnt|Size
ABC.TXT|230|23MB
Here is some code:
v_DIR=$1
echo "the directory to cd is "$1
x=`ls -l $0 | awk '{print $9 "|" $5}'`
y=`awk 'END {print NR}' $0`
echo $x '|' $y
Try something like
find -type f -name '*.txt' -exec bash -c 'lines=$(wc -l "$0" | cut -d " " -f1); size=$(du -h "$0" | cut -f1); echo "$0|$lines|$size"' {} \;
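(In the bash -c snippet, find substitutes the file name for {}, and bash -c assigns that first argument to $0 of the inline script, which is why the file is referenced as "$0" inside it.)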

shell script sum in for loop not working

size=`ls -l /var/temp.* | awk '{ print $5}'`
fin_size=0
for row in ${size} ;
do
fin_size=`echo $(( $row + $fin_size )) | bc`;
done
echo $fin_size
It is not working! echo $fin_size is printing some garbage negative value.
Where am I making a mistake?
(My bash is old, and I am supposed to work only on this Linux kernel: 2.6.39.)
Don't parse ls.
Why not use du as shown below?
du -cb /var/temp.* | tail -1
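(Here -b reports the apparent size of each file in bytes, -c adds a grand total line, and tail -1 keeps only that total.)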
Because it cannot be stressed enough: Why you shouldn't parse the output of ls(1)
Use e.g. du as suggested by dogbane, or find:
$ find /var -maxdepth 1 -type f -name "temp.*" -printf "%s\n" | awk '{total+=$1}END{print total}'
or stat:
$ stat -c%s /var/temp.* | awk '{total+=$1}END{print total}'
or globbing and stat (unnecessary, slow):
total=0
for file in /var/temp.*; do
[ -f "${file}" ] || continue
size="$(stat -c%s "${file}")"
((total+=size))
done
echo "${total}"
Below should be enough:
ls -l /var/temp.* | awk '{a+=$5}END{print a}'
There is no need for you to run the for loop. This means that this:
size=`ls -l /var/temp.* | awk '{ print $5}'`
fin_size=0
for row in ${size} ;
do
fin_size=`echo $(( $row + $fin_size )) | bc`;
done
echo $fin_size
The whole thing above can be replaced with:
fin_size=`ls -l /var/temp.* | awk '{a+=$5}END{printf("%10d",a);}'`
echo $fin_size

How to quote file name using awk?

I want the output 'filename1','filename2','filename3' ....
I'm using awk, but I have no idea how to print the last quote after the filename.
It is printing ,'filename ===> I need ,'filename'
ls -ltr | grep -v ^d | sed '1d'| awk '{print "," sprintf("%c", 39) $9}'
Thanks in advance!
You can use the find command as:
find . -maxdepth 1 -type f -printf "'%f'," | sed s/,$//
If you have Ruby (1.9+):
ruby -e 'puts Dir["*"].select{|x|test(?f,x)}.join("\47,\47")'
Otherwise:
find . -maxdepth 1 -type f -printf '%f\n' | sed -e ':a N' -e "s#\n#','#" -e 'b a'
Use the printf function http://www.gnu.org/manual/gawk/html_node/Basic-Printf.html
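That answer is only a pointer; a minimal sketch of the idea (reusing the find invocation from the answer above, and assuming file names without newlines) might look like:
find . -maxdepth 1 -type f -printf '%f\n' |
awk '{ printf "%s\047%s\047", sep, $0; sep = "," } END { print "" }'
Here \047 is the octal escape for a single quote, and sep is empty for the first file and a comma afterwards.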
Pure bash (probably POSIX sh, too):
comma=
for file in * ; do
if [ ! -d "$file" ] ; then
if [ ! -z $comma ] ; then
printf ","
fi
comma=1
printf "'%s'" "$file"
fi
done
Files with ' in the name are not accounted for, but nobody else has been handling that either. Presuming that escaping with \ is correct, you could do:
comma=
for file in * ; do
if [ ! -d "$file" ] ; then
if [ ! -z $comma ] ; then
printf ","
fi
comma=1
printf "'%s'" "${file//\'/\'}"
fi
done
But some CSV systems would instead require you to write '' (a doubled quote), which would be:
printf "'%s'" "${file//\'/''}"
Let's pretend that you're processing some other data besides the output of ls.
$ printf "hello\ngoodbye\no'malley\n" | awk '{gsub("\047","\047\\\047\047",$1);printf "%s\047%s\047",comma,$1; comma=","}END{printf "\n"}'
'hello','goodbye','o'\''malley'
This variant works fine, but I think there should be a more elegant way to do it:
ls -1 $1 | cut -d'.' -f1 | awk '{printf "," sprintf("%c", 39) $1 sprintf("%c", 39) "\n" }'| sed '1 s/,*//'
