How to count an event by minute in BigQuery - time

Many years ago I knew SQL quite well, but apparently it's been so long that I've lost my skills and knowledge.
I have a number of tables that each track a given event with additional metadata. One piece of metadata is a timestamp in UTC format (2021-08-11 17:27:27.916007 UTC).
Now I need to count how many times the event occurred per minute.
Col 1, Col2
EventName, Timestamp in UTC
I am trying to recall my past knowledge and also how to apply that to BQ. Any help is appreciated.

If I understand correctly, you could transform your timestamp into minutes and then group by it.
SELECT count(*) AS number_events,
FLOOR(UNIX_SECONDS(your_timestamp)/60) AS minute
FROM your_table
GROUP BY FLOOR(UNIX_SECONDS(your_timestamp)/60)
This converts your timestamps to Unix seconds, divides by 60 to get minutes, and applies FLOOR() to drop the decimals left by the division.
If you have multiple types of events in the same table, just add the name of the event to the SELECT and to the GROUP BY.
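The bucketing arithmetic in this answer can be checked outside BigQuery. The following is a minimal Python sketch, not BigQuery code: the function name `count_per_minute` and the sample event name `login` are made up for illustration, and the events are assumed to arrive as (name, timezone-aware datetime) pairs.

```python
import math
from collections import Counter
from datetime import datetime, timezone

def count_per_minute(events):
    """Count events per (event name, minute bucket).

    The bucket is floor(unix_seconds / 60), mirroring
    FLOOR(UNIX_SECONDS(your_timestamp)/60) in the query above.
    """
    counts = Counter()
    for name, ts in events:
        minute_bucket = math.floor(ts.timestamp() / 60)
        counts[(name, minute_bucket)] += 1
    return counts

# Hypothetical sample data: two events in 17:27, one in 17:28.
sample = [
    ("login", datetime(2021, 8, 11, 17, 27, 27, 916007, tzinfo=timezone.utc)),
    ("login", datetime(2021, 8, 11, 17, 27, 59, tzinfo=timezone.utc)),
    ("login", datetime(2021, 8, 11, 17, 28, 0, tzinfo=timezone.utc)),
]
counts = count_per_minute(sample)
```

Multiplying a bucket by 60 gives back the Unix second at which that minute starts, which is handy if a readable timestamp is wanted in the output.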

The first step would be to group by event column.
Then the Timestamp events can be counted.
SELECT Col2_EventName, COUNT(Timestamp)
FROM your_table
GROUP BY 1
Depending on your data, further transformations may be needed, e.g. dropping the seconds from the timestamp and keeping only the full minutes, as done in the answer from Javier Montón.

Related

Impala date subtraction timestamp and get the result in equivalent days irrespective of difference in hours or year or days or seconds

I want to subtract two dates in Impala. I know there is a datediff function in Impala, but how do I deal with two timestamp values? Consider this situation:
select to_date('2022-01-01 15-05-53','yyyy-mm-dd HH24-mi-ss')-to_date('2022-01-01 15-04-53','yyyy-mm-dd HH24-mi-ss') from dual;
There is a 1 minute difference, and Oracle would give the result as 0.000694444 days.
My requirement is to find out whether Impala has any such functionality, where I can subtract two timestamp values in the format 'yyyy-mm-dd HH24-mi-ss' and get the result in equivalent days, regardless of whether the difference is in days, years, hours, minutes or seconds. Any difference should be reflected as an equivalent number of days.
Any other way where I can achieve the same thing, I am open to that as well.
Thank you in advance.
You can use unix_timestamp(timestamp) to convert both fields to Unix time (int) format. This is the number of seconds since 1970-01-01 and is very suitable for calculating date-time differences in seconds. Once you have both values as seconds since 1970-01-01, you can simply subtract one from the other to get the difference.
Your sql should be like this -
select
unix_timestamp(to_timestamp('2022-01-01 15-06-53','yyyy-MM-dd HH-mm-ss')) -
unix_timestamp(to_timestamp('2022-01-01 15-05-53','yyyy-MM-dd HH-mm-ss')
) diff_in_seconds
Once you have the difference in seconds, you can easily convert it to minutes, hours or days, whichever unit you want. For days, divide by 86400 (the number of seconds in a day).
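As a quick check of the arithmetic, here is a small Python sketch (not Impala code) that parses the two timestamps from the question, takes the difference in seconds and divides by 86400. The function name `diff_in_days` is made up for illustration.

```python
from datetime import datetime

# Python equivalent of the 'yyyy-MM-dd HH-mm-ss' pattern used above.
FMT = "%Y-%m-%d %H-%M-%S"
SECONDS_PER_DAY = 86400

def diff_in_days(later, earlier, fmt=FMT):
    """Return the difference between two timestamp strings as fractional days."""
    delta = datetime.strptime(later, fmt) - datetime.strptime(earlier, fmt)
    return delta.total_seconds() / SECONDS_PER_DAY

# The one-minute example from the question.
d = diff_in_days("2022-01-01 15-05-53", "2022-01-01 15-04-53")
```

One minute is 60 / 86400 of a day, which matches the 0.000694444 that Oracle reports in the question.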

Cognos 11 Crosstab - need a value that doesn't have a reference to the column values

Crosstab report works 99%.
About 20 rows, all but one are ok.
5 columns - Company Division.
The rows are things like cost, revenue, revenue 2, etc.
All the rows that work have three attributes I'm using to select them:
Fiscal Year
Period
Solution.
The problem is that there is a table that lists a YTD rate for each period. This table is not division-specific; it's company-wide.
All the tables are linked to the accounting period table that has fiscal year and period. So the overall query limits data to fiscal year (?pFiscalYear?) and period <= ?pPeriod?, based on prompt page results.
The source table has this:
FY_CD  PD_NO  ACT_CURR_RT  ACT_YTD_RT
2018   1      0.36121715   0.36121715
2018   2      0.32471476   0.34255512
2018   3      0.25240906   0.31210183
2018   4      0.33154745   0.31925874
Note the YTD rate is not an average of any of the other numbers.
When I select the ACT_YTD_RT, as a row, I want the ACT_YTD_RT that matches the selected period.
What I get is the average if I set the aggregation to average, or the lowest if I set it to other aggregations. So sometimes it looks right (if I run for period 1, 2 or 3, as the rate kept falling), and sometimes it's wrong (period 4 returns .3121 instead of .3192).
I've tried a number of different methods and can generate garbage data (totals, min, max, average) and crossjoins but can't figure out how to get the value I'm looking for.
I want ACT_YTD_RT where fiscal year = ?pFiscalYear? and period = ?pPeriod?.
I tried a straight if then clause:
if (sourcetable.fiscalYear = ?pFiscalYear?) and (sourcetable.Period = ?pPeriod?) then (ACT_YTD_RT)
but I get an error like this:
'ACT_YTD_RT' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. (SQLSTATE=42000, SQLERRORCODE=8120)
If I create another query that generates the right response and try to include it, I get a crossjoin error that the query I'm referencing is trying to crossjoin several other items in the crosstab query.
A union doesn't work (different number of columns).
Not sure how a join would work since the division doesn't exist in the rate table.
I maybe could create a view in the database that did a crossjoin of the division table and the rate table, add that to the framework and then I wouldn't have a crossjoin since the solution would be in the rate "table" (really view), but that seems wrong somehow.
If I could just write a freaking parameterized query direct to the database I'd be done. But in Cognos 11 crosstabs I can't find a place for a SQL query object. And that shouldn't be necessary.
I've spent hours and hours chasing this in circles.
Anybody have any ideas?
Thanks
Paul
So the earlier problem was that this:
if (sourcetable.fiscalYear = ?pFiscalYear?) and (sourcetable.Period = ?pPeriod?) then (ACT_YTD_RT)
Generated an error like this:
'ACT_YTD_RT' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. (SQLSTATE=42000, SQLERRORCODE=8120)
To fix the above, I had to add a cross join of the division table and the rate table as a view in the database. Then add that to the framework. Then build the data item this way:
total (
if (sourcetable.fiscalYear = ?pFiscalYear?) and (sourcetable.Period = ?pPeriod?) then (ACT_YTD_RT)
)
And now the "total" provides the missing group by. And the crossjoin in the database provides the division information so the crosstab is happy.
I still think there should have been an easier way to do this, but I have a functioning hammer at the moment.

Can I generate the number of business days in a month in Visual Studio?

I have a report that takes sales data from a few tables. I want to add a field that will divide the total sales for the given month by the total number of business days in that same month. Is there a way I can calculate that in an expression? Do I need to create a new table in the database specifically for months and their number of business days? How should I go about this?
Thank you
Intuitively, I would say that you need a simple function and a table.
The table is to host the exceptions like Independence day, labor day, etc.
The function will get two parameters: Month and Year (I'm not providing any sample code since you haven't specified which language you are using).
It will then build a date as yyyy-mm-01 (meaning, the first day of the month). It will then loop from 0 to 30 and:
Create a new date by adding the loop index, in days, to the initial date,
Check that the resulting date is still within the month,
Check that it is a working day (e.g. not a Sunday),
Check that it is not found in the table of exceptions.
If the created date passes all the above tests, you add 1 to the counter.
Though it might look complex, it is not and it will provide you the correct answer regardless of the month (e.g. Feb.) and the year (leap or not).
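Since no language was specified, here is one possible sketch of such a function in Python. Several things are assumptions: the function name `business_days_in_month` is made up, the table of exceptions is represented as a plain set of dates, and Saturday and Sunday are both treated as non-working days. It also asks the calendar for the month length instead of looping to 31 and checking the month, which comes to the same result.

```python
import calendar
from datetime import date

def business_days_in_month(year, month, holidays=frozenset()):
    """Count weekdays in the month that are not listed in `holidays`.

    `holidays` stands in for the table of exceptions (a set of date objects).
    """
    days_in_month = calendar.monthrange(year, month)[1]
    count = 0
    for day in range(1, days_in_month + 1):
        current = date(year, month, day)
        if current.weekday() >= 5:   # 5 = Saturday, 6 = Sunday
            continue
        if current in holidays:      # listed exception, e.g. Independence Day
            continue
        count += 1
    return count

# Hypothetical exception table with a single entry.
holidays = {date(2024, 7, 4)}
july = business_days_in_month(2024, 7, holidays)
```

The report field would then be the month's total sales divided by this count.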

Sum based on specific condition - Oracle

I need your advice on the following query that I have - Let's say that I have a table with all payments that are booked on my current account.
The details of each payment contain the date and hour of the operation. I would like to extract the information in such a way that next to each transaction I have the balance (the sum of the transaction amounts) from the beginning of the day up to the current transaction. The balance is reset to 0 for each day.
I was thinking of joining this table to itself, finding all unique operations from the joined table where the date matches and the hour is less than the currently reviewed operation's hour, and then using SUM on the group.
Still I think that there is much more intelligent solution.
Thanks in advance
Here is a sample of the table. The expected result is in the last column.
My guess is that you just want a rolling sum. Making up column names and table names, you probably want something like this in your projection (your select list). You shouldn't need to do a self-join.
SUM(transaction_amount)
OVER (PARTITION BY account_number, trunc(transaction_date)
ORDER BY transaction_date) rolling_sum
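To make the behaviour of that window concrete, here is a small Python sketch of the same idea for a single account: sort by transaction time, then keep a running total that restarts for each calendar day. The function name `rolling_daily_balance` and the sample amounts are made up. One difference to be aware of: with only an ORDER BY, Oracle's default window frame gives rows with identical transaction_date values the same sum, whereas this sketch accumulates them one by one.

```python
from datetime import datetime
from itertools import groupby

def rolling_daily_balance(transactions):
    """transactions: list of (datetime, amount) for one account.

    Returns (datetime, amount, balance) tuples, where balance is the running
    sum since the start of that calendar day.
    """
    ordered = sorted(transactions, key=lambda t: t[0])
    result = []
    # Grouping by date plays the role of PARTITION BY trunc(transaction_date).
    for _, day_rows in groupby(ordered, key=lambda t: t[0].date()):
        balance = 0
        for ts, amount in day_rows:
            balance += amount
            result.append((ts, amount, balance))
    return result

# Hypothetical sample data: two payments on one day, one on the next.
rows = rolling_daily_balance([
    (datetime(2024, 1, 1, 12, 0), 5),
    (datetime(2024, 1, 1, 9, 0), 10),
    (datetime(2024, 1, 2, 8, 0), 7),
])
```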

Storing recurring time periods in Oracle database

I'm writing monitoring software, where most of the logic will be in an Oracle database and PL/SQL.
When my monitoring is called it should alert about problems. For example, it should raise an alert if:
1. There are fewer than 2 operations per minute on Friday from 22:00 till 23:00
2. There are fewer than 5 operations per minute on 31 January from 22:00 till 23:00
3. There are fewer than 3 operations per minute every day from 10:00 till 12:00
4. There are fewer than 5 operations per minute from Friday 22:00 till Monday 15:00
For example, if my monitoring is called at 22:30 on 31 January, I should compare my operation count to 5.
I was thinking about saving data periods with cron expression format in database. In this case I have to compare SYSDATE (current call date of monitoring function) to cron expression saved in the database.
My questions:
1. How can I find out if SYSDATE falls under cron expression?
2. Is it correct to use cron expressions in this case at all? Can you suggest any other way of storing periods of time?
Don't do it
I am completely with SpaceTrucker: Don't do it in SQL or PL/SQL, do it in Java with either Java 8 date API or JodaTime.
How to do it nevertheless
But even if you shouldn't do it, there might still be a good reason to do it. So here is how:
Table for each instant you want to check
First let's create a table with a row for each second or minute in the interval you want to check. The granularity and the length of your interval depend on the cron expressions you want to allow. Usually one-second granularity for a whole week should be sufficient (about 600,000 rows). If you want to check a whole year, use minutes as the granularity (about 500,000 rows). Both row counts are trivial for a modern database. On my notebook, the corresponding queries return instantly.
CREATE TABLE week AS
SELECT
running_second,
ts,
EXTRACT(SECOND FROM ts) as sec,
EXTRACT(MINUTE FROM ts) as min,
EXTRACT(HOUR FROM ts) as h,
to_char(ts, 'Day') as dow
FROM (
SELECT
level as running_second,
TO_TIMESTAMP_TZ('2015-09-05 00:00:00 0:00',
'YYYY-MM-DD HH24:MI:SS TZH:TZM') +
NUMTODSINTERVAL(level-1, 'SECOND') AS ts
FROM dual CONNECT BY level<=60*60*24*7
)
;
Query for each filter expression
Next, you convert each cron expression to a query. You can either use PL/SQL to transform each cron expression to a where clause, or you can use a generic where clause.
You should get something like this:
SELECT
*
FROM
week
WHERE
h =5
AND min=0
AND sec=0;
or in a generic version:
SELECT
filter_expressions.name, week.ts
FROM
week, filter_expressions
WHERE
(filter_hour is null or h = filter_hour)
AND (filter_min is null or min = filter_min)
AND (filter_sec is null or sec = filter_sec);
(given your filters are stored in a table filter_expressions, that has a column for each constraint type, and each row has either a parameter for the constraint or NULL if the constraint is not applicable).
Store the result in a global temporary table cron_startpoints.
Check for violations
Group the table cron_startpoints to check for constraint violations. You can count, how many matches are there for Friday or midnight or whatever and can check, whether that number is OK for you or not.
It depends on how much flexibility you want. For the examples you provided, a structure like this would be enough:
CREATE TABLE monitoring_periods (
id INTEGER NOT NULL PRIMARY KEY,
monit_month VARCHAR2(2),
monit_day VARCHAR2(2),
monit_day_of_week VARCHAR(3),
monit_time_from INTERVAL DAY TO SECOND,
monit_time_to INTERVAL DAY TO SECOND,
required_ops INTEGER
);
Here are some examples of storing the periods and checking them against SYSDATE. I would avoid storing the cron expression literally as a string, as that would require parsing it at query time. However, the more complex your expressions are (something like '5 4,15,22 */2 * 1-5'), the more complicated the structure needed to store them, so you need to think carefully about your requirements.
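The matching logic against such a table could look roughly like the following Python sketch. It is an illustration under assumptions, not the answer's actual queries: rows are modelled as dictionaries keyed by the column names above, NULL (None) is taken to mean "any", day-of-week values are assumed to be stored as 'MON' to 'SUN', and the function names are made up. A window spanning several days, such as Friday 22:00 till Monday 15:00, does not fit in a single row of this structure and is left out.

```python
from datetime import datetime, timedelta

DAY_NAMES = ["MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"]

def period_matches(period, now):
    """Return True if `now` falls inside the stored period (None means 'any')."""
    if period["monit_month"] is not None and int(period["monit_month"]) != now.month:
        return False
    if period["monit_day"] is not None and int(period["monit_day"]) != now.day:
        return False
    if (period["monit_day_of_week"] is not None
            and period["monit_day_of_week"] != DAY_NAMES[now.weekday()]):
        return False
    since_midnight = timedelta(hours=now.hour, minutes=now.minute, seconds=now.second)
    return period["monit_time_from"] <= since_midnight < period["monit_time_to"]

def required_ops_for(periods, now):
    """Collect the required_ops thresholds of every period matching `now`."""
    return [p["required_ops"] for p in periods if period_matches(p, now)]

def row(month, day, dow, hour_from, hour_to, ops):
    """Build one hypothetical monitoring_periods row."""
    return {"monit_month": month, "monit_day": day, "monit_day_of_week": dow,
            "monit_time_from": timedelta(hours=hour_from),
            "monit_time_to": timedelta(hours=hour_to),
            "required_ops": ops}

# Rules 1 to 3 from the question.
periods = [
    row(None, None, "FRI", 22, 23, 2),
    row("1", "31", None, 22, 23, 5),
    row(None, None, None, 10, 12, 3),
]
# 31 January 2025 is a Friday, so both rule 1 and rule 2 apply at 22:30.
thresholds = required_ops_for(periods, datetime(2025, 1, 31, 22, 30))
```

When several periods match, something still has to decide which threshold wins (for instance the highest, or the most specific rule); that choice depends on the requirements and is not settled here.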
I once had the task of writing difficult date calculations with recurring periods and time windows for 10g. Among those were things like "Tuesday of the second week of the month every 2 months between 8 AM and 2 PM". We decided to use Java stored procedures for this (also because they were already in use for other purposes).
Depending on your Oracle version, you can choose a Joda-Time version that can run within the Oracle database JVM. Also note that Joda-Time 1.6 can be compiled with Java 1.3 (which we had to use).
If you are looking for cron expressions explicitly, then you might also do well using another Java library within the Oracle database JVM. For example, here is one:
CronExpression expression = CronExpression.parser()
.withSecondsField(true)
.withOneBasedDayOfWeek(true)
.allowBothDayFields(false)
.parse("0 15 10 L * ?");
assert expression.matches(dateTime);
However, I think cron is not suited to the task at hand. Cron is a way to specify when to run jobs, whereas you need to observe what happened. For your requirement of at least 2 operations in every minute, the operations could occur at the 1st and 2nd second or at the 1st and 31st second; both are valid, but their cron expressions are very different.
When it comes to storing the time periods, you could also look at ISO 8601 recurring intervals stored as VARCHARs. The duration part of such an interval looks like this:
P1Y2M10DT2H30M
In any case you will need to apply calculations to every row you would like to match. Depending on how many rows there are, you might need some heuristics to filter out results that are far from meeting your criteria.
Thinking a bit more outside the box:
You should question your architecture. The requirements you listed can be represented by state machines. You can feed them the events that occurred, in chronological order. If a state machine reaches an unwanted state, you can just report that. However, I doubt that this can be easily done in pure PL/SQL.
