Select first and last element on a column (time format) gnuplot - time

I am running statistics with gnuplot on several files, each with a large number of rows and columns, the first of which is in the time format "%H:%M:%S". I found here (sorry, I cannot find the link) a way to do this easily:
# converts the time string in column COL to a number of seconds
intime(COL) = strptime("%H:%M:%S",strcol(COL))
# finds correlation of time (seconds) in col 1 with the value in col 2
startrange0 = strptime("%H:%M:%S","18:46:27")
endrange0 = strptime("%H:%M:%S","23:59:27")
stats [startrange0:endrange0] 'datafile.txt' u (intime(1)):2 prefix "A"
set print outfilename append
My problem is that I have to open each file and write the exact starting and ending times into the code, which somewhat breaks the automation of the script.
The question is: is there a way to access the first and last elements of the time column (that is, the "18:46:27" and the "23:59:27" in the example), so that the script can read them from each file?
Thanks!
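One possible way to get the two values without editing the script by hand is to read them in a small helper before calling gnuplot. The following is only a sketch in Python, not a gnuplot feature: the function name first_last_times and the sample rows are hypothetical, and it assumes whitespace-separated columns with the time in the first column.

```python
from datetime import datetime

def first_last_times(lines, col=0, fmt="%H:%M:%S"):
    """Return the first and last valid time strings found in column `col`."""
    first = last = None
    for line in lines:
        fields = line.split()
        if len(fields) <= col:
            continue  # skip blank or short lines
        try:
            datetime.strptime(fields[col], fmt)  # only used to validate the field
        except ValueError:
            continue  # skip headers, comments and malformed rows
        if first is None:
            first = fields[col]
        last = fields[col]
    return first, last

# Hypothetical sample rows standing in for one data file
sample = ["# time value", "18:46:27 1.0", "19:00:00 2.5", "", "23:59:27 0.7"]
start, end = first_last_times(sample)
```

The two strings could then be handed to the gnuplot script, for example through gnuplot's -e command-line option, and used in the strptime calls instead of the hard-coded times.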


How to Create a List of Available Times after removing Downtimes from a Period

I have a grid which lists the Period (Start - End), and a list of Downtimes.
The downtimes are then sorted (to ensure chronological order based on the start time of the outage), using the following formula:
=SORT(INDIRECT("B4:C"&SUMPRODUCT(MAX((B4:B12<>"")*ROW(B4:B12)))),1)
After which I am trying to calculate the list of Available times (Uptimes).
Currently I have a mess of inflexible formulas, as follows:
B26 =IF(B14<B15,B14,C15)
C26 =IF(C14<C15,C14,B16)
B27 =C16
C27 =B17
I am searching for either a universal single celled arrayformula, or a formula that can be dragged (down/across), that can calculate the list of Available times (Uptimes).
I am seeking formula solutions that will work in both Excel (Mac 2021) and Google Sheets.
See attached image:
EDIT:
Here is a google sheet that has some example data, and explanatory notes: https://docs.google.com/spreadsheets/d/1t0XImtjP4RKeTdg3L97bjPzateUX2waHPhjf3nmSFIk/edit#gid=2100307022
In Google Sheets, try:
=FILTER(B3:C24; A3:A24="Period")
update
B15:
=SORT(B4:C12)
B25:
=QUERY({C15:C23, B16:B24}, "where Col1 is not null and Col2 is not null")
Here is the solution I created (working in both Excel and Google Sheets) for cutting downtimes from a time period, leaving only the remaining uptimes.
It uses the same cells and ranges as in the question.
The original downtimes do not need to be in order; they are sorted in the intermediate table.
in cell B15:
=IFERROR(SORT(FILTER(B4:C12,(B4:B12<=C3)*(C4:C12>=B3))),"")
Then, create named ranges for the elements to be used in the formulas that will populate the remaining uptimes (this makes the formulas easier to edit as will be noted below):
start is the start time of the original time period (B14)
end is the end time of the original time period (C14)
downstart is the range of the start datetimes of the downtimes (B15:B23)
downend is the range of the end datetimes of the downtimes (C15:C23)
Then, to create/populate the list of the remaining uptimes, for the opening times, enter this formula into the first cell (B25) :
=IFERROR(IF((start>=MIN(downstart))*(end<=MAX(downend)),IF(ROW()=(25+SUMPRODUCT(--(LEN(downstart)>0))-1),"",SMALL(downend,1+ROW()-25)),
IF((start>=MIN(downstart)),SMALL(downend,1+ROW()-25),
IF((end<=MAX(downend)),IF(ROW()=25,start,IF(ROW()=(25+SUMPRODUCT(--(LEN(downstart)>0))),"",SMALL(downend,ROW()-25))),
IF(ROW()=25,start,SMALL(downend,ROW()-25))
))),"")
And for the ending times, enter this formula into the first cell (C25):
=IFERROR(IF((start>=MIN(downstart))*(end<=MAX(downend)),IF(ROW()=(25+SUMPRODUCT(--(LEN(downend)>0))-1),"",SMALL(downstart,2+ROW()-25)),
IF((start>=MIN(downstart)),IF(ROW()=(25+SUMPRODUCT(--(LEN(downend)>0))-1),end,SMALL(downstart,2+ROW()-25)),
IF((end<=MAX(downend)),IF(ROW()=(25+SUMPRODUCT(--(LEN(downend)>0))-1),end,SMALL(downstart,1+ROW()-25)),
IF(ROW()=(25+SUMPRODUCT(--(LEN(downend)>0))),end,SMALL(downstart,1+ROW()-25))
))),"")
Note: In both of these formulas, the number 25 is the row number of the first row where the results are to be populated, so if you have these results starting on a different row, just change the number 25 to your starting row. Due to use of named ranges, no other changes are necessary.
After entering the formulas, drag them both down to fill the remaining results.
For those with Excel 2019 or newer (or Google Sheets), you can use IFS instead. For the opening times, use this (in B25 ):
=IFERROR(IFS(
(start>=MIN(downstart))*(end<=MAX(downend)),(IF(ROW()=(25+SUMPRODUCT(--(LEN(downstart)>0))-1),"",SMALL(downend,1+ROW()-25))),
(start>=MIN(downstart)),SMALL(downend,1+ROW()-25),
(end<=MAX(downend)),(IF(ROW()=25,start,IF(ROW()=(25+SUMPRODUCT(--(LEN(downstart)>0))),"",SMALL(downend,ROW()-25)))),
(start<MIN(downstart))*(end>MAX(downend)),(IF(ROW()=25,start,SMALL(downend,ROW()-25)))
),"")
And, for the ending times use this (in C25):
=IFERROR(IFS(
(start>=MIN(downstart))*(end<=MAX(downend)),(IF(ROW()=(25+SUMPRODUCT(--(LEN(downend)>0))-1),"",SMALL(downstart,2+ROW()-25))),
(start>=MIN(downstart)),(IF(ROW()=(25+SUMPRODUCT(--(LEN(downend)>0))-1),end,SMALL(downstart,2+ROW()-25))),
(end<=MAX(downend)),(IF(ROW()=(25+SUMPRODUCT(--(LEN(downend)>0))-1),end,SMALL(downstart,1+ROW()-25))),
(start<MIN(downstart))*(end>MAX(downend)),(IF(ROW()=(25+SUMPRODUCT(--(LEN(downend)>0))),end,SMALL(downstart,1+ROW()-25)))
),"")
Again, drag them down to populate the remaining results.
Explanation:
Sections of the form IF(ROW()=(25+SUMPRODUCT(--(LEN(downend)>0))-1),xxxxxx check whether the formula is on a particular row, so that a specific result can be returned there.
Sections of the form SMALL(named_range,ROW()-25) use ROW with an offset (25, this example's starting row for the results) to increment the k argument of SMALL on each row.
Both the nested IF and the IFS styled solutions are in the example file linked in the opening Question.
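For readers who want to check the spreadsheet results independently, the underlying logic (subtract sorted downtimes from a period and keep the gaps) can be sketched outside the sheet. This is a minimal Python sketch, not a translation of the formulas above; the function name uptimes is hypothetical, and plain numbers stand in for the datetimes.

```python
def uptimes(start, end, downtimes):
    """Return the (start, end) gaps left in [start, end] after removing downtimes."""
    result = []
    cursor = start  # earliest moment not yet known to be down
    for d_start, d_end in sorted(downtimes):
        if d_end <= cursor or d_start >= end:
            continue  # downtime lies entirely outside the remaining period
        if d_start > cursor:
            result.append((cursor, d_start))  # gap before this downtime
        cursor = max(cursor, d_end)
    if cursor < end:
        result.append((cursor, end))  # gap after the last downtime
    return result

# Hypothetical example: period 8 to 18, unsorted downtimes that overlap both ends
gaps = uptimes(8, 18, [(12, 13), (7, 9), (16, 20)])
```

Because the function only compares and orders values, the same code should also work with datetime objects in place of the numbers.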

Generate new format from a non-system generated report using Power Query

I have an Excel file that contains a report in a non-system-generated format.
I wish to run calculations on it and generate a new output.
The report format is as below:
1) When loading this Excel file in the query, how can I create a new column that copies the first value found (1#51) down into the following records while they are empty, and then, once a new value (1#261) is detected, copies that value into the subsequent null records, and so on until the end?
2) The final aim is to generate a new output to auto match/calculate the money to be assign to different reference. As shown below:-
References A to E share the 3 bank references (28269, 28542 and RMP). I was thinking of reading the same data source twice: first to read columns A to O (QueryRef), and a second time to read columns A and Q to V (QueryBank).
After this, I have no idea how to allocate the amounts from QueryBank to QueryRef based on the sum of Total AR.
Eg,
Total Amt of BankRef 28269, $57,044.67 is sufficient to cover Ref#A $10,947.12
BankRef 28269 still sufficient to cover Ref#B $27,647.60
BankRef 28269 left only $18,449.95 , hence the balance of 28269 be allocate to Ref#C.
Remaining balance of Ref#C will need to use BankRef28542 to cover,i.e. $1,812.29
Ref#D will then be allocated of the remaining balance of BankRef28542, i.e. $4,595.32
Ref#D still left $13,350.03 unallocated, hence this will use BankRef#RMP
Ref#E only need $597.66, and BankRef#RMP is sufficient to cover this.
I am not sure whether the case above can be solved using Power Query, as I am still a newbie at Power Query. Or is this too complicated to handle, so that we need to write a program to automatically match this kind of scenario?
Attached is the sample source file and output :
https://www.dropbox.com/sh/dyecwcdz2qg549y/AACzezsXBwAf8eHUNxxLD1eWa?dl=0
Any advice/opinion/guidance is very much appreciated.
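The allocation described in the example is a greedy pass: each reference is covered from the current bank reference until that one is exhausted, then from the next. The Python sketch below only illustrates that logic and is not a Power Query solution. The function name allocate is hypothetical, the totals for Ref#C, Ref#D and BankRef 28542 are derived from the splits stated above, and the total for RMP is an assumed figure because the question does not give it.

```python
from decimal import Decimal

def allocate(refs, banks):
    """Greedily cover each reference from the bank references, in order."""
    banks = [[name, Decimal(amount)] for name, amount in banks]
    allocations = []
    b = 0  # index of the bank reference currently being drawn from
    for ref, need in refs:
        need = Decimal(need)
        while need > 0 and b < len(banks):
            take = min(need, banks[b][1])
            if take > 0:
                allocations.append((ref, banks[b][0], take))
            need -= take
            banks[b][1] -= take
            if banks[b][1] == 0:
                b += 1  # this bank reference is used up, move to the next
    return allocations

refs = [("A", "10947.12"), ("B", "27647.60"),
        ("C", "20262.24"),   # derived: 18449.95 + 1812.29
        ("D", "17945.35"),   # derived: 4595.32 + 13350.03
        ("E", "597.66")]
banks = [("28269", "57044.67"),
         ("28542", "6407.61"),   # derived: 1812.29 + 4595.32
         ("RMP", "20000.00")]    # assumed total, not given in the question
result = allocate(refs, banks)
```

Decimal is used instead of float so that the balances reach exactly zero when a bank reference is used up.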
Answering question one:
Power Query has a feature called Fill, with Down and Up variants.
For a selected column, Fill Down copies each non-empty value into the empty rows below it until the next non-empty row is found, and so on.
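To make the fill-down behaviour concrete, here is a small Python sketch of the same idea. It is only an illustration, not Power Query code; the function name fill_down is hypothetical, and None stands in for a null cell.

```python
def fill_down(values):
    """Replace each None with the most recent non-None value above it."""
    filled = []
    last = None
    for value in values:
        if value is not None:
            last = value  # a new value starts a new run
        filled.append(last)
    return filled

# Values taken from the question; None marks an empty record
column = ["1#51", None, None, "1#261", None]
filled = fill_down(column)
```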

advanced concatenation of lines based on the specific number of compared columns in csv

This question builds on a previously solved problem.
I have the following type of .csv files (they aren't all sorted, but the column structure is the same):
name1,address1,town1,zip1,email1,web1,,,,category1
name2,address2,town2,zip2,email2,,,,,category2
name3,address3,town3,zip3,email3,,,,,category3_1
name3,address3,town3,zip3,,,,,,category3_2
name3,address3,town3,zip3,,,,,,category3_3
name4,address4,town4,zip4,,,,,,category4_1
name4,address4,town4,zip4,email4,,,,,category4_2
name4,address4,town4,zip4,email4,,,,,category4_3
name4,address4,town4,zip4,,,,,,category4_4
name5,address5,town5,zip5,,,,,,category5_1
name5,address5,town5,zip5,,web5,,,,category5_2
name6,address6,town6,zip6,,,,,,category6
The first 4 columns are always populated; the other columns are not always populated, except the last one (category).
An empty space between the "," delimiters means that there is no data for that particular line or name.
If nameX does not come with addressX but with addressY, it is a different record (not the same line) and should not be concatenated.
I need a script in sed or awk, or maybe bash (although that is a little slower on bigger files of hundreds of MB or more), that takes the first 4 columns (in this case), compares them and, where they match, merges every category with the ";" delimiter while keeping the structure and as much data as possible in the other columns of the matched lines of the .csv file:
name1,address1,town1,zip1,email1,web1,,,,category1
name2,address2,town2,zip2,email2,,,,,category2
name3,address3,town3,zip3,email3,,,,,category3_1;category3_2;category3_3
name4,address4,town4,zip4,email4,,,,,category4_1;category4_2;category4_3;category4_4
name5,address5,town5,zip5,,web5,,,,category5_1;category5_2
name6,address6,town6,zip6,,,,,,category6
If that is not possible, a solution could be to retain the data from the first line of the duplicated data (the one with categoryX_1). Example:
name1,address1,town1,zip1,email1,web1,,,,category1
name2,address2,town2,zip2,email2,,,,,category2
name3,address3,town3,zip3,email3,,,,,category3_1;category3_2;category3_3
name4,address4,town4,zip4,,,,,,category4_1;category4_2;category4_3;category4_4
name5,address5,town5,zip5,,,,,,category5_1;category5_2
name6,address6,town6,zip6,,,,,,category6
Does the .csv have to be sorted before using the script?
Thank you again!
sed -n 's/.*/²&³/;H
$ { g
:cat
s/\(²\([^,]*,\)\{4\}\)\(\([^,]*,\)\{5\}\)\([^³]*\)³\(.*\)\n\1\(\([^,]*,\)\{5\}\)\([^³]*\)³/\1~\3~ ~\7~\5;\9³\6/
t fields
b clean
:fields
s/~\([^,]*\),\([^~]*~\) ~\1,\([^~]*~\)/\1,~\2 ~\3/
t fields
s/~\([^,]*\),\([^~]*~\) ~\([^,]*,\)\([^~]*~\)/\1\3~\2 ~\4/
t fields
s/~~ ~~//g
b cat
:clean
s/.//;s/[²³]//g
p
}' YourFile
POSIX version (so use --posix with GNU sed), and it does not require sorting your file beforehand.
It uses 2 recursive loops after loading the full file into the buffer, adding markers for easier manipulation, and a lot of fun with sed group substitution (hopefully it just stays within the maximum number of groups available).
The first loop appends the category (one line after the other, as needed for the next loop on each field) and builds a big temporary structure of sub-fields (2 groups of fields from the 2 concatenated lines; fields 5 to 9 form 1 group).
The second loop moves the sub-fields back to their original place.
Finally, the markers and the first newline are removed.
This assumes that the characters ²³~ do not occur in the data, because they are used as markers (you can choose other markers and adapt the script accordingly).
Note:
For performance on a file of hundreds of MB, I guess awk will be a lot more efficient.
Sorting the data beforehand may well help performance by reducing the amount of data to manipulate after each category loop.
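As a point of comparison with the sed script, the same merge can be sketched with a dictionary keyed on the first 4 columns. This is a Python sketch rather than the sed or awk the question asked for; the function name merge_rows is hypothetical, and it assumes every row has the same number of columns with the category last. It holds all merged rows in memory and does not need the input to be sorted.

```python
import csv
import io

def merge_rows(rows, key_cols=4):
    """Merge rows sharing the first `key_cols` fields, joining categories with ';'."""
    merged = {}  # dicts keep insertion order, so output follows first appearance
    for row in rows:
        if not row:
            continue  # skip empty lines
        key = tuple(row[:key_cols])
        if key not in merged:
            merged[key] = list(row)
            continue
        kept = merged[key]
        for i in range(key_cols, len(row) - 1):
            if not kept[i]:
                kept[i] = row[i]  # keep the first non-empty value per column
        kept[-1] += ";" + row[-1]  # append this row's category
    return list(merged.values())

# A few rows from the question
data = """name3,address3,town3,zip3,email3,,,,,category3_1
name5,address5,town5,zip5,,,,,,category5_1
name3,address3,town3,zip3,,,,,,category3_2
name5,address5,town5,zip5,,web5,,,,category5_2
name3,address3,town3,zip3,,,,,,category3_3"""
rows = list(csv.reader(io.StringIO(data)))
result = [",".join(r) for r in merge_rows(rows)]
```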
I found that this particular problem is processed faster through a database:
SQL - GROUP BY to combine/concat a column
db: mysql through wamp

Applescript Excel Date cell value not displaying correctly

When I use the following AppleScript (which I found on StackOverflow :)), it works great, except that the cell the script selects is a date and its value is not displayed correctly in AppleScript. Perhaps this will help:
set searchRange to range ("D1:D100")
set foundRange to find searchRange what "string" with match case
set fRow to first row index of foundRange
set myData to value of range ("B" & fRow as text)
The value of the selected cell in column B is 4:14:00 AM but in AppleScript, it returns as 0.176388888889
How can I solve this?
That seems to be the way Excel stores the data internally. Try using string value instead of value. That worked for me.
Your question is answered, but just to add out of interest: Excel stores time as a decimal fraction, and the formatting makes it meaningful. So noon is 0.5, and in theory, 24:00:00 would be 1, but obviously time rolls round to 00:00:00 again. If you multiply your 04:14:00 by 86400 (the number of seconds in a day), it will reveal the time in seconds. So you can see how to do maths with time, and how Excel does maths in the background.
Dates are stored as integers, counting up from 01/01/1900 (1) in Excel's default 1900 date system, where today's date is stored as 41601.
To see the stored value just change the cell format to 'general' (cmd+1, useful shortcut).
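The arithmetic described above can be checked with a short Python sketch that turns the day fraction from the question back into hours, minutes and seconds. The function name fraction_to_hms is hypothetical.

```python
def fraction_to_hms(day_fraction):
    """Convert a fraction of a day into (hours, minutes, seconds)."""
    total_seconds = round(day_fraction * 86400)  # 86400 seconds in a day
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return hours, minutes, seconds

# The value AppleScript returned for the cell showing 4:14:00 AM
hms = fraction_to_hms(0.176388888889)
```

Here hms comes out as (4, 14, 0), which matches the 4:14:00 shown in the cell, and a fraction of 0.5 gives noon.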

Google SpreadSheet - handling MM:SS.sss time formats

I'd like to process the following columns in a Google spreadsheet. The Time column represents the minutes, seconds and milliseconds taken to run 1 km, and I'd like to be able to sum the four values.
Split Time
1 3:13:4
2 3:20:5
3 3:16:1
4 3:26:3
I suspect that I need to convert and split the time column into separate minute and second columns to achieve this goal, but would appreciate any advice that developers may have.
I updated the format of the time column and used the SPLIT / CONTINUE functions
Minutes=SPLIT(B2,":")
Seconds=CONTINUE(C2,1,2)
Total Seconds=(C2*60)+D2
The table now looks like
Split Time minutes Seconds Total Seconds
1 03:13:00 3 13 193
2 03:15:00 3 15 195
3 03:16:00 3 16 196
Still wondering about the most efficient way to convert the Total Seconds value to time.
You can use the LEFT(text, number), MID(text, start, number), and RIGHT(text, number).
In detail:
Minutes = LEFT(B2, 1)
Seconds = MID(B2, 3, 2)
Milliseconds = RIGHT(B2, 2)
You can just use SUM for those values, a la:
=SUM(A1:A4)
Alternatively, you can use functions such as HOUR, MINUTE and SECOND to extract appropriate values if you want more fine-grained control.
Where the source data is as specified (i.e. the Time column represents the minutes, seconds and milliseconds), then to be able to add up to a sensible result (795.013 seconds for the first sample of four), a conversion similar to:
=60*index(split(B2,":"),0,1)+index(split(B2,":"),0,2)+index(split(B2,":"),0,3)/1000
is required.
To convert the total (assumed to be in C6) to the same absurd format as for input (13:15:13):
=int(C6/60)&":"&int(mod(C6,60))&":"&value(mid(C6,find(".",C6)+1,3))
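The two conversions above can be cross-checked outside the sheet. The following Python sketch uses the four sample values from the question and works in whole milliseconds to avoid rounding issues; the function names are hypothetical.

```python
def to_milliseconds(text):
    """Parse 'minutes:seconds:milliseconds' into a total number of milliseconds."""
    minutes, seconds, millis = (int(part) for part in text.split(":"))
    return (minutes * 60 + seconds) * 1000 + millis

def from_milliseconds(total_ms):
    """Format milliseconds back into the same 'minutes:seconds:milliseconds' form."""
    seconds, millis = divmod(total_ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    return f"{minutes}:{seconds}:{millis}"

splits = ["3:13:4", "3:20:5", "3:16:1", "3:26:3"]
total_ms = sum(to_milliseconds(s) for s in splits)
total_text = from_milliseconds(total_ms)
```

For the sample data, total_ms is 795013 (that is, 795.013 seconds) and total_text is 13:15:13, in line with the figures quoted above.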
In your original sheet, in a new column:
=TO_DATE("00:0" & left($B2,4))
Then copy the formula down the column.
This will convert your M:SS (the left 4 characters of your data) to the sheet's system date/time format, for each entry in column B.
You can then sum and format the results as you like.
This assumes there are no leading zeroes on your data. You can add code to check for this, but if your times all have single digits for the minutes value, it won't matter.
