Timestamp Subtraction Using Bash-Utils - bash

Hey, I am trying to calculate the difference between the timestamps in a file using bash/bash-utils, where I always want to subtract the two timestamps while dropping the last fields. This is how the file looks:
14:11:56.953700000,172.20.10.1
14:25:49.233263000,172.20.10.1
Now the issue is that I want to leave that huge fractional number and the IP out of the calculation.
I can put the data in a CSV or any other data file if needed.

Could you please try the following and let me know if it helps you.
awk -F'[.,]' '              # split on "." and "," so $1 is just HH:MM:SS
FNR==1{                     # first line: starting timestamp
  split($1,time,":")
  sec=time[1]*3600+time[2]*60+time[3]
}
FNR==2{                     # second line: ending timestamp
  split($1,time1,":")
  sec1=time1[1]*3600+time1[2]*60+time1[3]
  seconds=(sec1-sec)%60               # leftover seconds
  min=sprintf("%d",(sec1-sec)/60)     # whole minutes
  printf("%s %s\n",min" min",seconds" sec")
}' Input_file
The output will be as follows:
13 min 53 sec
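If you would rather stay in plain shell, a minimal alternative sketch using GNU date could look like this (assuming, as above, that the file has exactly two lines and both timestamps fall on the same day):
t1=$(sed -n 1p Input_file | cut -d, -f1 | cut -d. -f1)   # e.g. 14:11:56
t2=$(sed -n 2p Input_file | cut -d, -f1 | cut -d. -f1)   # e.g. 14:25:49
diff=$(( $(date -d "$t2" +%s) - $(date -d "$t1" +%s) ))
echo "$(( diff / 60 )) min $(( diff % 60 )) sec"
This should print the same "13 min 53 sec".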

Related

Subtracting Start TIME Values from Finish TIME Values and adding the values as a new column in CSV File in Unix

I have a CSV file (output from a SQL query). It has Start Time and Finish Time values in different columns. I need to get the difference between Start Time and Finish Time and generate an HTML report based on that difference. For this, I wanted to include a new column that will hold the result of "Finish Time" - "Start Time". The columns are as below.
The time format is as follows:
START TIME: 2018-11-08 01:45:39.0
FINISH TIME:2018-11-06 06:48:20.0
I used the code below, but I am not sure whether it's returning correct values. Any help on this will be appreciated.
Below are the first 3 lines of my CSV file:
DESCRIPTION,SCHEDULE,JOBID,CLASSIFICATION,STARTTIME,FINISHTIME,NEXTRUNSTART,SYSTEM,CREATIONDATETIME,
DailyClearance,Everyday,XXXXXX, Standard,2018-11-08 01:59:59.0,2018-11-08 02:00:52.0,CAK-456,018-11-08 04:28:18,
Miscellinious,Everyday,XXXXXX, standart,2018-11-08 02:59:59.0,2018-11-08 03:29:39.0,2018-11-09 03:00:00.0,CAT-251,2018-11-08 04:28:18,
And this is my attempt
awk 'NR==1 {$7 = "DIFFMIN"} NR > 1 { $7 = $5 - $6} 1' <inputfile.csv
This might be of assistance to you. The idea is to use GNU awk, which has time functions.
awk 'BEGIN{FS=OFS=","}
(NR==1){print $0 OFS "DURATION"; next}                 # header line: just append a column name
{ tstart = $5; tend = $6
  gsub(/[-:]/," ",tstart); tstart = mktime(tstart)     # "YYYY MM DD hh mm ss" -> epoch seconds
  gsub(/[-:]/," ",tend);   tend   = mktime(tend)
  $(NF+1) = tend - tstart                              # duration in seconds, as a new last column
  print
}' inputfile.csv
This should add the extra column. The time will be expressed in seconds.
The idea is to select the two columns and convert them into seconds since epoch (1970-01-01T00:00:00). This is done using the mktime function which expects a string of the form YYYY MM DD hh mm ss. That is why we first perform a substitution. Once we have the seconds since epoch for the start and end-time, we can just subtract them to get the duration in seconds.
Note: there might be some problems around daylight saving time transitions; this depends on your system's timezone settings.
Note: subsecond accuracy is ignored.
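To see what the substitution and mktime actually do, here is a small standalone sketch using the start and finish values from the first data row of the CSV; the printf at the end also shows one way to render the difference as HH:MM:SS rather than raw seconds:
echo "2018-11-08 01:59:59.0,2018-11-08 02:00:52.0" | awk -F, '{
    t1 = $1; t2 = $2
    gsub(/[-:]/, " ", t1); gsub(/[-:]/, " ", t2)    # "2018 11 08 01 59 59.0" etc.
    d = mktime(t2) - mktime(t1)                     # duration in seconds
    printf "%d seconds (%02d:%02d:%02d)\n", d, d/3600, (d%3600)/60, d%60
}'
which prints:
53 seconds (00:00:53)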

Transpose from row to column

Transposing from rows to columns is the objective, taking into consideration the first column, which is the date.
Input file
72918,111000009,111000009,111000009,111000009,111000009,111000009,111000009,111000009,111000009
72918,2356,2357,2358,2359,2360,2361,2362,2363,2364
72918,0,0,0,0,0,0,0,0,0
72918,0,0,0,0,0,0,1,0,0
72918,1496,1502,1752,1752,1752,1752,1751,974,972
73018,111000009,111000009,111000009,111000009,111000009,111000009,111000009,111000009,111000009
73018,2349,2350,2351,2352,2353,2354,2355,2356,2357
73018,0,0,0,0,0,0,0,0,0
73018,0,0,0,0,0,0,0,0,0
73018,1524,1526,1752,1752,1752,1752,1752,256,250
Output desired
72918,111000009,2356,0,0,1496
72918,111000009,2357,0,0,1502
72918,111000009,2358,0,0,1752
72918,111000009,2359,0,0,1752
72918,111000009,2360,0,0,1752
72918,111000009,2361,0,0,1752
72918,111000009,2362,0,1,1751
72918,111000009,2363,0,0,974
72918,111000009,2364,0,0,972
73018,111000009,2349,0,0,1524
73018,111000009,2350,0,0,1526
73018,111000009,2351,0,0,1752
73018,111000009,2352,0,0,1752
73018,111000009,2353,0,0,1752
73018,111000009,2354,0,0,1752
73018,111000009,2355,0,0,1752
73018,111000009,2356,0,0,256
73018,111000009,2357,0,0,250
Please advise, thanks in advance.
This code seems to do exactly what you need:
awk -F, '
function init_block() { ts=$1; delete a; cnt=0; nf0=NF }
function dump_block() {
  for (f=2; f<=nf0; f++) {
    printf("%s", ts)
    for (r=1; r<=cnt; r++) printf(",%s", a[r,f])
    print ""
  }
}
BEGIN  { ts=-1 }
ts<0   { init_block() }
ts!=$1 { dump_block(); init_block() }
       { cnt++; for (f=1; f<=NF; f++) a[cnt,f]=$f }   # buffer the whole row
END    { dump_block() }' <input.txt >output.txt
It collects rows until the timestamp in the first column changes, then prints the transpose of the block while keeping the same timestamp. The number of fields must be the same within each block for this code to behave correctly; a quick way to check that is shown below.
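Since the script relies on that assumption, this small pre-check (using the same input.txt name) reports any line whose field count differs from the previous line of the same block:
awk -F, 'NR>1 && $1==prev && NF!=pnf { print "field count changes at line " NR " (" NF " vs " pnf ")" }
         { prev=$1; pnf=NF }' input.txt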

Awk printing out smallest and highest number, in a time format

I'm fairly new to linux/bash shell and I'm really having trouble printing two values (the highest and lowest) from a particular column in a text file. The file is formatted like this:
Geoff Audi 2:22:35.227
Bob Mercedes 1:24:22.338
Derek Jaguar 1:19:77.693
Dave Ferrari 1:08:22.921
As you can see, the final column is a timing. I'm trying to use awk to print out the highest and lowest timings in that column. I'm really stumped; I've tried:
awk '{print sort -n < $NF}' timings.txt
However, that didn't even seem to sort anything; I just received this output:
1
0
1
0
...
It kept repeating over and over; it went on for longer, but I didn't want to paste a massive block of it since you get the point after the first couple of iterations.
My desired output would be:
Min: 1:08:22.921
Max: 2:22:35.227
After the question clarifications: if the time field always has the same number of digits in the same places, e.g. h:mm:ss.ss, the solution can be drastically simplified. Namely, we no longer need to convert the time to seconds in order to compare values; a simple string (lexicographical) comparison is enough:
$ awk 'NR==1 {m=M=$3} {$3<m && (m=$3); $3>M && (M=$3)} END {printf("min: %s\nmax: %s\n",m,M)}' file
min: 1:08:22.921
max: 2:22:35.227
The logic is the same as in the more general script further below, just using a simpler string-only comparison for ordering the values (determining the min/max). We can do that because we know all timings conform to the same format: if a < b as strings (for example "1:22:33" < "1:23:00"), we know a is "smaller" than b. (If the values are not consistently formatted, lexicographical comparison alone cannot order them correctly, e.g. "12:00:00" < "3:00:00" as strings even though twelve hours is more than three.)
So, on the first value read (first record, NR==1), we set the initial min/max to the timing read (in the 3rd field). For each record we test whether the current value is smaller than the current min and, if it is, we set a new min; similarly for the max. We use short-circuiting instead of an if to keep the expressions shorter ($3<m && (m=$3) is equivalent to if ($3<m) m=$3). In the END block we simply print the result.
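As a quick check of that caveat about inconsistent formats, string comparison reports 1 (true) here even though twelve hours is more than three:
$ awk 'BEGIN { print ("12:00:00" < "3:00:00") }'
1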
Here's a general awk solution that accepts time strings with a variable number of digits for hours/minutes/seconds per record:
$ awk '{split($3,t,":"); s=t[3]+60*(t[2]+60*t[1]); if (s<min||NR==1) {min=s;min_t=$3}; if (s>max||NR==1) {max=s;max_t=$3}} END{print "min:",min_t; print "max:",max_t}' file
min: 1:08:22.921
max: 2:22:35.227
Or, in a more readable form:
#!/usr/bin/awk -f
{
    split($3, t, ":")
    s = t[3] + 60 * (t[2] + 60 * t[1])
    if (s < min || NR == 1) {
        min = s
        min_t = $3
    }
    if (s > max || NR == 1) {
        max = s
        max_t = $3
    }
}
END {
    print "min:", min_t
    print "max:", max_t
}
For each line, we convert the time components (hours, minutes, seconds) from the third field into seconds, which we can then simply compare as numbers. As we iterate, we keep track of the current min and max values and print them in the END block. The initial values for min and max are taken from the first line (NR==1).
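For example, with the sample data, 1:08:22.921 becomes 22.921 + 60*(8 + 60*1) = 4102.921 seconds and 2:22:35.227 becomes 35.227 + 60*(22 + 60*2) = 8555.227 seconds, which is how those two end up as the min and max respectively.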
Given your statements that the time field is actually a duration and the hours component is always a single digit, this is all you need:
$ awk 'NR==1{min=max=$3} {min=(min<$3?min:$3); max=(max>$3?max:$3)} END{print "Min:", min ORS "Max:", max}' file
Min: 1:08:22.921
Max: 2:22:35.227
You don't want to run sort inside of awk (even with the proper syntax).
Try this:
sort -k3,3 timings.txt | sed -n '1p; $p'
where
sort -k3,3 sorts on the 3rd column; a plain string sort is enough here because every timing has the same h:mm:ss.sss shape (if your real file has a header line, strip it first with sed 1d)
sed -n '1p; $p' prints only the first and the last line, i.e. the rows holding the smallest and the largest timing
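If you want the exact Min:/Max: lines from the question rather than the whole rows, you could tack a small awk step onto the same pipeline (a sketch under the same formatting assumption):
sort -k3,3 timings.txt | sed -n '1p; $p' | awk 'NR==1{print "Min:", $3} NR==2{print "Max:", $3}'
which gives:
Min: 1:08:22.921
Max: 2:22:35.227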

Change date and data cells in .csv file progressively

I have a file that I'm trying to get ready for my boss in time for his manager's meeting tomorrow morning at 8:00 AM (GMT-8). I want to retroactively change the dates in non-consecutive rows of this .csv file (truncated):
,,,,,
,,,,,sideshow
,,,
date_bob,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14
bob_available,531383,531383,531383,531383,531383,531383,531383,531383,531383,531383,531383,531383,531383,531383
bob_used,448312,448312,448312,448312,448312,448312,448312,448312,448312,448312,448312,448312,448312,448312
,,,
date_mel,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14
mel_available,343537,343537,343537,343537,343537,343537,343537,343537,343537,343537,343537,343537,343537,343537
mel_used,636159,636159,636159,636159,636159,636159,636159,636159,636159,636159,636159,636159,636159,636159
,,,
date_sideshow-ws2,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14
sideshow-ws2_available,936239,936239,936239,936239,936239,936239,936239,936239,936239,936239,936239,936239,936239,936239
sideshow-ws2_used,43441,43441,43441,43441,43441,43441,43441,43441,43441,43441,43441,43441,43441,43441
,,,
,,,,,simpsons
,,,
date_bart,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14
bart_available,62559,62559,62559,62559,62559,62559,62559,62559,62559,62559,62559,62559,62559,62559
bart_used,1135117,1135117,1135117,1135117,1135117,1135117,1135117,1135117,1135117,1135117,1135117,1135117,1135117,1135117
,,,
date_homer,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14
homer_available,17799,17799,17799,17799,17799,17799,17799,17799,17799,17799,17799,17799,17799,17799
homer_used,1179877,1179877,1179877,1179877,1179877,1179877,1179877,1179877,1179877,1179877,1179877,1179877,1179877,1179877
,,,
date_lisa,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14
lisa_available,3899,3899,3899,3899,3899,3899,3899,3899,3899,3899,3899,3899,3899,3899
lisa_used,1193777,1193777,1193777,1193777,1193777,1193777,1193777,1193777,1193777,1193777,1193777,1193777,1193777,1193777
In other words a row that now reads:
date_lisa,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14,09-17-14
would desirably read:
date_lisa,09-04-14,09-05-14,09-06-14,09-07-14,09-08-14,09-09-14,09-10-14,09-11-14,09-12-14,09-13-14,09-14-14,09-15-14,09-16-14,09-17-14
I'd like to make the daily available numbers smaller at the beginning and have them get progressively bigger day by day. This means the used rows will also have to start proportionately smaller and then grow day by day in lock step with the available rows.
Not by a large amount, and don't make it look obvious; just a few GB here and there. I plan to make pivot tables and graphs out of this, so it has to vary a little. BTW the numbers are all in MB, as I generated them using df -m.
Thanks in advance if anyone can help me.
The following awk does what you need:
awk -F, -v OFS=, '
/^date/ {
    split($2, date, /-/)                     # date[1]=MM, date[2]=DD, date[3]=YY
    for (i = 2; i <= NF; i++) {
        # count backwards from the last column so the final date keeps its original value
        $i = date[1] "-" sprintf("%02d", date[2] - NF + i) "-" date[3]
    }
}
/available|used/ {
    for (i = 2; i <= NF; i++) {
        $i = int(($i * i) / NF)              # scale each value up toward the original (last) one
    }
}1' csv
Set the input and output field separators to ,.
For all lines that start with date, we split the second column on - to get the month, day and year parts.
We then iterate from the second column to the end of the line and set each column to a newly calculated date, derived from the original date and the total number of fields (one day earlier per column, counting back from the last one).
All other lines remain as-is and get printed along with the modified lines.
This has the caveat of not rolling over month boundaries correctly; see the sketch below.
For the data fields we iterate from the second column to the end of the line and scale each value as int(($i*i)/NF), so the values grow progressively and the last field keeps its original value (for the last field i equals NF, and the expression reduces to $i).
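If the month rollover matters for your data, a GNU awk variant of the date block could use mktime/strftime instead of plain day arithmetic. A minimal sketch (the available/used scaling would stay exactly as above; noon is used as the time of day to sidestep DST edge cases, and 20xx years are assumed):
awk -F, -v OFS=, '
/^date/ {
    split($2, d, /-/)                                        # d[1]=MM, d[2]=DD, d[3]=YY
    base = mktime("20" d[3] " " d[1] " " d[2] " 12 00 00")   # the original date, at noon
    for (i = 2; i <= NF; i++)
        $i = strftime("%m-%d-%y", base - (NF - i) * 86400)   # step back one day per column
}
1' csv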

SPLIT file by Script (bash, cpp) - numbers in columns

I have files with some columns filled with numbers (floats). I need to split these files according to the value in one of the columns (which can be set, say, to the first one). This means that, when my file has columns
a b c
and the value of c fulfils 0.05<=c<=0.1, then create a file named after c and copy into it all the rows that fulfil the c-condition...
Is this possible? I can do something small with bash or awk, and also a little with C++.
I have searched for some solutions, but all I can do is sort the data and read the first number of each line..
I don't know.
Please, very please.
Thank you
Jane
As you mentioned awk, the basic rule in awk is 'match a line (either by default or with a regexp, condition or line number)' AND 'do something because you found a match'.
awk uses values like $1, $2, $3 to indicate which column in the current line of data it is looking at. $0 refers to the whole line. So ...
awk '
BEGIN{
    afile="afile.txt"        # output file names (ready for more conditions)
    bfile="bfile.txt"
    cfile="cfile.txt"
}
{
    # test whether the c value (3rd column) is between 0.05 and 0.1
    if ($3 >= 0.05 && $3 <= 0.1) print $0 > cfile
}' inputData
Note that I am testing the value of the third column (c in your example). You can use $2 to test b column, etc.
If you don't know about the sort of condition test I have included ($3 >= 0.05 && $3 <= 0.1), you'll have some learning ahead of you.
Questions in the form of 1. I have this input, 2. I want this output. 3. (but) I'm getting this output, 4. with this code .... {code here} .... have a much better chance of getting a reasonable response in a reasonable amount of time ;-)
I hope this helps.
P.S. as you appear to be a new user, if you get an answer that helps you please remember to mark it as accepted, and/or give it a + (or -) as a useful answer.
If I understand your requirements correctly:
awk '{print > $3}' file ...
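That creates one output file per distinct value in the third column, named after the value itself. If you instead want one file per value range (as the 0.05<=c<=0.1 example suggests), a minimal binning sketch could look like this (the 0.05 bin width and the bin_*.txt file names are only assumptions):
awk '{
    bin = sprintf("%.2f", int($3 / 0.05) * 0.05)   # 0.05-wide bins; exact boundary values may need extra care
    print > ("bin_" bin ".txt")                    # one file per bin
}' inputData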
