bulk write in Unix using shell script - shell

Is there a way to write data to a file in bulk from a shell script, instead of appending to the file one line at a time?
In the script below, I want to write the difference between the arrival time and the generation time of each file to test.csv.
########################################################
echo "Starting the Execution for Time difference\n";
############################################################
# Functions used across the script
datediff() {
Unixtime=`echo $1 $2 $3 $4`
Filetime=`echo $5 $6 $7 $8`
echo $Unixtime;
echo $Filetime;
d1=`date -d "$Unixtime" +%s`
d2=`date -d "$Filetime" +%s`
echo $d1;
echo $d2;
TIME_DIFF=`expr $d1 - $d2`
TIME_DIFF=`expr $TIME_DIFF / 60`
echo $TIME_DIFF;
echo "$Unixtime,$Filetime,$TIME_DIFF,$9" >> ../test.csv
}
rm -f ../test.csv;
for i in `ls -1 | grep -v 'DelayCheck.s*'`
do
DayMonth=`ls -lrt $i | awk '{print $7" "$6" "}'`
Year=`ls --full-time $i | awk '{print $6}' | cut -c1-4`
HourMin=`ls -lrt $i | awk '{print " "$8}'`
timeA=`echo $DayMonth $Year $HourMin`
FileYearMonDay=`ls -ltr $i | awk '{print $9}' | awk -F'--' '{print $3}' | cut -c2-9`
timeB1=`date -d $FileYearMonDay +'%d %b %Y'`
timeB2=`echo $i | awk -F'--' '{print substr($3,10,13)}' | sed -e 's/../:&/2g'`
timeB=`echo $timeB1 $timeB2`
echo "Time A is $timeA";
echo " Time b is $timeB";
datediff $timeA $timeB $i
done
echo $?;
The script works, but there are over 100k files, so its performance is poor.
I searched for a way to write the data to the file in bulk but did not find a solution.
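A minimal sketch of one way to write in bulk: redirect the output of the whole loop once, rather than appending with >> inside it. The helper name write_rows and the "name,size" row format are assumptions for illustration only; the real script would print its own time-difference row instead.

```shell
#!/bin/bash
# write_rows is a hypothetical helper: it prints one "name,size" CSV row per
# regular file in the directory given as $1.
write_rows() {
    dir=$1
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # $(( ... )) strips any padding that wc may put around the number
        size=$(( $(wc -c < "$f") ))
        printf '%s,%s\n' "${f##*/}" "$size"
    done
}

# Bulk write: a single redirection covers every row the loop produces,
# so the CSV is opened once instead of once per file.
# Example call (the paths are placeholders):
#   write_rows /path/to/files > ../test.csv
# The same idea works directly on a loop:
#   for f in *; do ...; done > ../test.csv
```

With 100k files, the many ls, awk and date processes started per file are likely to cost more than the appends themselves, so reducing those should help as well.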

Related

How to remove the usage of temp file and read data from the command itself

I have a shell script and I need help to make it efficient. I am using temp files to store and read the data, but I need to read the data in memory.
It collects metrics from the Postgres database using a command and fetches the metrics. My current script fetches the metrics to a temp file, then reads from it.
I want to stop using temp files and use memory instead.
The script works, I just need help to automate more and get rid of reading data from temp files.
INPUT=`mktemp`
#/usr/pgsql-9.5/bin/pgbench -c1 -j1 -t 1000 -S man > $INPUT
TESTTIME=15 #seconds
echo "Waiting $TESTTIME seconds..."
/usr/pgsql-9.5/bin/pgbench -c1 -j1 -T $TESTTIME -r man > $INPUT
OLDIFS=$IFS
IFS=" "
[ ! -f $INPUT ] && { echo "$INPUT file not found"; exit 99; }
tps=`cat $INPUT |awk '/^tps/ {print $3}' |awk -F'.' '{print $1}' |head -n1`
update_l=`cat $INPUT |awk '/UPDATE/ {print $1}' |tail -n1`
select_l=`cat $INPUT |awk '/SELECT/ {print $1}' |tail -n1`
insert_l=`cat $INPUT |awk '/INSERT/ {print $1}' |tail -n1`
echo ${PLOTTER_PREFIX}.tps $tps kv
echo ${PLOTTER_PREFIX}.update_latency $update_l kv
echo ${PLOTTER_PREFIX}.select_latency $select_l kv
echo ${PLOTTER_PREFIX}.insert_latency $insert_l kv
#{ while read line; do
# # statsite_buildData ${PLOTTER_PREFIX}.latency average ${latency average} kv
# echo ${PLOTTER_PREFIX}.${line} kv
# done } < $INPUT
statsite_sendData
#echo $Test
IFS=$OLDIFS
rm -f $INPUT
You can capture the output of the command to a variable, like so:
output=$(/usr/pgsql-9.5/bin/pgbench -c1 -j1 -T $TESTTIME -r man)
Then use echo instead of cat, replacing $INPUT with the quoted variable:
tps=`echo "$output" | awk '/^tps/ {print $3}' | awk -F'.' '{print $1}' |head -n1`
update_l=`echo "$output" | awk '/UPDATE/ {print $1}' | tail -n1`
...
I would also suggest using $() instead of surrounding commands with backticks. So the above would become:
tps=$(echo "$output" | awk '/^tps/ {print $3}' | awk -F'.' '{print $1}' |head -n1)
update_l=$(echo "$output" | awk '/UPDATE/ {print $1}' | tail -n1)
...
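Going one step further, the four pipelines could be folded into a single awk pass over the captured output. This is only a sketch: print_metrics is a hypothetical helper name, and it assumes the same field positions as the pipelines above (third field of the first tps line, first field of the last UPDATE, SELECT and INSERT lines).

```shell
#!/bin/bash
# print_metrics (hypothetical name) reads pgbench-style text on stdin and
# prints the four metric lines, prefixed with the string given as $1.
print_metrics() {
    awk -v prefix="$1" '
        # first tps line only, integer part (like head -n1 plus awk -F.)
        /^tps/ && !seen_tps { split($3, parts, "."); tps = parts[1]; seen_tps = 1 }
        # the last matching line wins (like tail -n1)
        /UPDATE/ { update_l = $1 }
        /SELECT/ { select_l = $1 }
        /INSERT/ { insert_l = $1 }
        END {
            print prefix ".tps", tps, "kv"
            print prefix ".update_latency", update_l, "kv"
            print prefix ".select_latency", select_l, "kv"
            print prefix ".insert_latency", insert_l, "kv"
        }
    '
}

# Usage, with $output captured as shown above:
#   printf '%s\n' "$output" | print_metrics "$PLOTTER_PREFIX"
```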

Bash Shell Issue

currentDate="20160324"
headerDumpFile="header.txt"
#currentDate="$(date +'%Y%m%d')"
printf "Current date in dd/mm/yyyy format %s\n" $currentDate
contId=""
labelList="c12,playlist-play,play,pause,end,playlist-end,heartbeat,ns_st_cl"
params="corporate=abc&user=abc&password=abc&startdate=$currentDate&site=abc&extralabels=$labelList"
url="https://example.com/v1/start?$params"
a=1
while true
do
curl -D $headerDumpFile -v -k -H "Accept-Encoding:gzip" $url > $a.zip
contId= cat $headerDumpFile | grep "X-CS-Continuation-Id:" | awk '{print $NF}'
if [ "$contId" ];then
printf "Breaking the Loop.."
break;
fi
url="https://example.com/v1/start?$params&continuationId=${contId}"
a=$((a + 1))
echo $contId
echo $url
done
When I echo $url, the value of contId in it is blank, but echo $contId appears to print the value correctly. Please suggest a fix.
Perhaps this is what you want to achieve:
contId=$(cat $headerDumpFile | grep "X-CS-Continuation-Id:" | awk '{print $NF}')
Or the simpler:
contId=$(awk '/X-CS-Continuation-Id:/ {print $NF}' $headerDumpFile)
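Note that, contrary to what you assumed, echo $contId isn't displaying anything in your code. What you see is the output of the bogus contId= cat $headerDumpFile | grep "X-CS-Continuation-Id:" | awk '{print $NF}' line: it runs the pipeline with contId set to an empty string in the environment of cat only, so the shell variable is never assigned the result.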
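A small sketch of the difference between the two forms. The helper name extract_cont_id and the sample values are assumptions for illustration; the tr step is there because header dumps written by curl -D use CRLF line endings, which would otherwise leave a carriage return at the end of the value.

```shell
#!/bin/bash
# extract_cont_id (hypothetical name) prints the continuation id found in the
# header dump file given as $1, without the trailing carriage return.
extract_cont_id() {
    awk '/X-CS-Continuation-Id:/ {print $NF}' "$1" | tr -d '\r'
}

contId="old"

# Prefix form: contId is empty only in the environment of this one command.
contId= true
prefix_result=$contId        # the shell variable still holds "old"

# Command substitution: the output of the command is assigned to the variable.
contId=$(printf 'new\n')
subst_result=$contId         # now holds "new"

# Usage in the loop above would look like:
#   contId=$(extract_cont_id "$headerDumpFile")
```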

Make the times from a log file relative to the starttime

I have a logfile with this format:
10:33:56 some event occurs
10:33:57 another event occurs
10:33:59 another one occurs
I want to make the times relative to the start time:
00:00:00 some event occurs
00:00:01 another event occurs
00:00:03 another one occurs
using a bash script. That would let me compare the delays of different executions more easily.
You can save this script as rebase_time.sh:
adddate() {
while IFS= read -r line; do
log_file_hours=`echo $line | awk 'BEGIN{FS="[ [/:]+"}; {print $1}'`
log_file_minutes=`echo $line | awk 'BEGIN{FS="[ [/:]+"}; {print $2}'`
log_file_seconds=`echo $line | awk 'BEGIN{FS="[ [/:]+"}; {print $3}'`
log_date="$log_file_hours:$log_file_minutes:$log_file_seconds"
if [[ -z "$first_date" ]]; then
first_date=$log_date
fi
StartDate=$(date -u -d "$first_date" +"%s")
FinalDate=$(date -u -d "$log_date" +"%s")
diff=$(date -u -d "0 $FinalDate sec - $StartDate sec" +"%H:%M:%S")
echo $diff ${line#$log_date}
done
}
cat "$1" | adddate
and call it this way:
./rebase_time.sh events.log
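For large logs, a possible alternative is to do the arithmetic in a single awk process instead of calling awk and date several times per line. This is a sketch under some assumptions: every line starts with an HH:MM:SS timestamp, all times fall within the same day, and the function name rebase_times is made up for the example.

```shell
#!/bin/bash
# rebase_times (hypothetical name) reads log lines on stdin and rewrites the
# leading HH:MM:SS so that the first line becomes 00:00:00.
rebase_times() {
    awk '{
        split($1, t, ":")
        secs = t[1] * 3600 + t[2] * 60 + t[3]
        if (NR == 1) start = secs            # the first timestamp is the origin
        d = secs - start
        rest = substr($0, length($1) + 1)    # everything after the timestamp
        printf "%02d:%02d:%02d%s\n", int(d / 3600), int(d % 3600 / 60), d % 60, rest
    }'
}

# Usage:
#   rebase_times < events.log
```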

using date variable inside sed command

I am storing date inside a variable and using that in the sed as below.
DateTime=`date "+%m/%d/%Y"`
Plc_hldr1=`head -$i place_holder.txt | tail -1 | awk -F ' ' '{ print $1 }'`
Plc_hldr2=`head -$i place_holder.txt | tail -1 | awk -F ' ' '{ print $2 }'`
sed "s/$Plc_hldr1/$DateTime/;s/$Plc_hldr2/$Total/" html_format.htm >> /u/raskar/test/html_final.htm
While running the sed command I am getting the below error.
sed: 0602-404 Function s/%%DDMS1RT%%/01/02/2014/;s/%%DDMS1C%%/1235/ cannot be parsed.
I suppose this happens because the date contains slashes ('/'):
01/02/2014
I tried with different quotes around the date. How do I make it run?
Change the separator to something else that won't appear in your patterns, for example:
sed "s?$Plc_hldr1?$DateTime?;s?$Plc_hldr2?$Total?"
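A minimal sketch of the same idea with | as the separator, using the placeholder names and values from the error message above. The function name fill_placeholders is an assumption for the example; whichever separator you pick must not appear in the placeholders or in the replacement values.

```shell
#!/bin/bash
# Sample values taken from the error message in the question.
DateTime="01/02/2014"
Total="1235"

# fill_placeholders (hypothetical name) filters stdin, replacing the two
# placeholders. The slashes in $DateTime are harmless because | is the separator.
fill_placeholders() {
    sed "s|%%DDMS1RT%%|$DateTime|;s|%%DDMS1C%%|$Total|"
}

# Usage:
#   fill_placeholders < html_format.htm >> html_final.htm
```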
Not directly related to the question, but you can replace
Plc_hldr1=`head -$i place_holder.txt | tail -1 | awk -F ' ' '{ print $1 }'`
Plc_hldr2=`head -$i place_holder.txt | tail -1 | awk -F ' ' '{ print $2 }'`
by
Plc_hldr1=`sed -n "$i {s/ .*//p;q}" place_holder.txt`
Plc_hldr2=`sed -n "$i {s/[^ ]\{1,\} \{1,\}\([^ ]\{1,\}\) .*/\1/p;q}" place_holder.txt`
and, on AIX with ksh:
sed -n "$i {s/\([^ ]\{1,\} \{1,\}[^ ]\{1,\}\) .*/\1/p;q}" place_holder.txt | read Plc_hldr1 Plc_hldr2

extract information regarding : size && time && row_count in one line shell script

Hey everyone! I am pretty new to shell scripting and I am stuck.
I need to extract the following information about files: file_name, size, time and row_count, and I want to do it in one command line. I tried this:
ls -l * && wc -l file.txt && du -ks file.txt | cut -f1| awk '{print $5" " $6 " " $7 " "$8 " " $9 " "$1 " "$2}'
but it is not working properly.
I also tried doing it in a loop, but I don't know how to extract the fields from there:
for file in `ls -ltr /export/home/oracle/dbascripts/scripts`
do
[[ -f $file ]] && echo $file | awk '{print $3}'
done
Then I want to redirect the output to a file with >> for SQL*Loader.
Thanks in advance!
This could be a start if you have GNU find and GNU coreutils (most Linux distributions will do):
for i in /my/path/*; do
find "$i" ! -type d -printf '%p %TY-%Tm-%Td %TH:%TM:%TS %s '
wc -l <"$i"
done
/my/path/* should be modified to reflect the files you want to probe.
Also keep in mind that this one-liner has a few major issues if any directories are specified. This should be safer in that regard:
for i in *; do
if [[ -d "$i" ]]; then
continue
fi
find "$i" -printf '%p %TY-%Tm-%Td %TH:%TM:%TS %s '
wc -l <"$i"
done
You will want to see the manual page for GNU find to understand this better.
EDIT:
There is at least one other, faster way, using join and bash process substitution, but it's a bit ugly and somewhat harder to make safe and to work the kinks out of.
ExtractInformation()
{
timesep="-"
sep="|"
dot=":"
sec="00"
lcount=`wc -l < $fname`
modf_time=`ls -l $fname`
f_size=`echo $modf_time | awk '{print $5}'`
time_month=`echo $modf_time | awk '{print $6}'`
time_day=`echo $modf_time | awk '{print $7}'`
time_hrmin=`echo $modf_time | awk '{print $8}'`
time_hr=`echo $time_hrmin | cut -d ':' -f1`
time_min=`echo $time_hrmin | cut -d ':' -f2`
time_year=`date '+%Y'`
time_param="DD-MON-YYYY HH24:MI:SS"
time_date=$time_day$timesep$time_month$timesep$time_year" "$time_hrmin$dot$sec
result=$fname$sep$time_date$sep$f_size$sep$lcount$sep$time_param
sqlresult=`echo $result | awk '{FS = "|" ;q=sprintf("%c", 39); print "INSERT INTO SIP_ICMS_FILE_T(f_name, f_date_time,f_size,f_row_count) VALUES (" q $1 q ", TO_DATE("q $2 q,q $5 q "),"$3","$4");";}'`
echo $sqlresult>>data.sql
echo "Reading data....."
}
UploadData()
{
#ss=`sqlplus -s a/a#adb #data.sql
#set serveroutput on
#set feedback off
#set echo off`
echo "loading with sql Loader....."
}
f_data=data.sql
[[ -f $f_data ]] && rm data.sql
for fname in * ;
do
if [[ -f $fname ]]; then
ExtractInformation
fi
UploadData
#Zipdata
done
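The row built in ExtractInformation could also be sketched without parsing ls -l, which shows a year instead of a time for older files and never shows seconds. This assumes GNU date (its -r FILE option prints the file's modification time); the function name file_row is made up for the example, and the output matches the name|date|size|row_count layout and the DD-MON-YYYY HH24:MI:SS mask used above.

```shell
#!/bin/bash
# file_row (hypothetical name) prints "name|mtime|size|line_count" for the
# file given as $1.
file_row() {
    local f=$1 size lines mtime
    size=$(wc -c < "$f")
    lines=$(wc -l < "$f")
    # GNU date: -r FILE uses the file's last modification time.
    # LC_ALL=C makes %b produce English month abbreviations.
    mtime=$(LC_ALL=C date -r "$f" '+%d-%b-%Y %H:%M:%S')
    # $(( ... )) strips any padding that wc may put around the numbers
    printf '%s|%s|%s|%s\n' "$f" "$mtime" "$((size))" "$((lines))"
}

# Usage:
#   for fname in *; do [[ -f $fname ]] && file_row "$fname"; done > rows.txt
```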
