How to find the earliest date of the occurrence of a value for each year - oracle

I have a table with this structure:
STATION ID | YEAR | MONTH | DAY | RECDATE    | VALUE
123456     | 1950 | 01    | 01  | 01-01-1950 | 95
123456     | 1950 | 01    | 15  | 01-15-1950 | 85
123456     | 1950 | 03    | 15  | 03-15-1950 | 95
123456     | 1951 | 01    | 02  | 01-02-1951 | 35
123456     | 1951 | 01    | 10  | 01-10-1951 | 35
123456     | 1952 | 02    | 12  | 02-12-1952 | 80
123456     | 1952 | 02    | 13  | 02-13-1952 | 80
And so on. There's a TMIN value for this station ID for every day of every year between 1888 and 2022. What I'm trying to figure out is a query that will give me the earliest date in each year that a value between -100 and 100 occurs.
The query
select year, max(value) from table where value between -100 and 100 group by year order by year
gives the year and value. The query
select recdate, min(value) from table group by recdate order by recdate
gives me every recdate with the value.
I have a vague memory of a query that practically partitions the data by a year or a date range so that the query would look at all the 1950 dates and give the earliest date for the value, then all the 1951 dates, and so on. Does anyone remember queries like that?
Thanks for any and all suggestions.

If I understood you correctly, this is your question:
What I'm trying to figure out is a query that will give me the earliest date in each year that a value between -100 and 100 occurs.
Then you posted 2 queries which return something, but I don't see relation to the question. What was their purpose? To me, they look like some random queries one could write against data in that table.
Therefore, back to the question: isn't that just
select min(recdate), --> "earliest date
year --> in each year
from that_table -- that a
where value between -100 and 100 --> value between -100 and 100 occurs"
group by year
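
One way to sanity-check the query above is to run it against the sample rows from the question. The sketch below uses Python's built-in sqlite3 module rather than Oracle, and it assumes RECDATE is stored as an ISO YYYY-MM-DD string so that min() on text sorts chronologically. With a real Oracle DATE column, min() already orders by date. The table name that_table comes from the answer; the column types are assumptions.

```python
import sqlite3

# Sample rows from the question. recdate is rewritten as ISO text (an assumption)
# so that min() compares dates chronologically in SQLite.
rows = [
    (123456, 1950, "1950-01-01", 95),
    (123456, 1950, "1950-01-15", 85),
    (123456, 1950, "1950-03-15", 95),
    (123456, 1951, "1951-01-02", 35),
    (123456, 1951, "1951-01-10", 35),
    (123456, 1952, "1952-02-12", 80),
    (123456, 1952, "1952-02-13", 80),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table that_table "
    "(station_id integer, year integer, recdate text, value integer)"
)
conn.executemany("insert into that_table values (?, ?, ?, ?)", rows)

# Same shape as the answer's query: earliest qualifying date per year.
earliest = conn.execute(
    """
    select year, min(recdate)
    from that_table
    where value between -100 and 100
    group by year
    order by year
    """
).fetchall()

for year, recdate in earliest:
    print(year, recdate)
```

If the goal is the earliest date of one specific value per year, the same query should work with the where clause narrowed to that value.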

Related

Get closest date with id and value Oracle

I ran into a problem and maybe there are experienced guys here to help me figure it out:
I have a table with rows:
ID   | VALUE | DATE
2827 | 0     | 20.07.2022 10:40:01
490  | 27432 | 20.07.2022 10:40:01
565  | 189   | 20.07.2022 9:51:03
200  | 1     | 20.07.2022 9:50:01
731  | 0.91  | 20.07.2022 9:43:21
161  | 13004 | 19.07.2022 16:11:01
This table has about a million records covering roughly 1000 distinct IDs; for a given ID, only the change date and the value differ from row to row.
Whenever the value of an ID changes, a row is added to this table:
ID | Time the value was changed (DATE) | VALUE
My task is to get, for every ID, the value closest to an input date.
For example, if I input the date "20.07.2022 10:00:00",
I want to get, for each ID (1-1000), the row (value, date) with the latest date before "20.07.2022 10:00:00":
ID   | VALUE | DATE
2827 | 0     | 20.07.2022 9:59:11
490  | 27432 | 20.07.2022 9:40:01
565  | 189   | 20.07.2022 9:51:03
200  | 1     | 20.07.2022 9:50:01
731  | 0.91  | 20.07.2022 8:43:21
161  | 13004 | 19.07.2022 16:11:01
What query will be the most optimal and correct in this case?
If you want the data for each ID with the latest change up to, but not after, your input date then you can just filter on that date, and use aggregate functions to get the most recent data in that filtered range:
select id,
max(change_time) as change_time,
max(value) keep (dense_rank last order by change_time) as value
from your_table
where change_time <= <your input date>
group by id
With your previous sample data, using midnight this morning as the input date would give:
select id,
max(change_time) as change_time,
max(value) keep (dense_rank last order by change_time) as value
from your_table
where change_time <= timestamp '2022-07-28 00:00:00'
group by id
order by id
ID | CHANGE_TIME         | VALUE
1  | 2022-07-24 10:00:00 | 900
2  | 2022-07-22 21:51:00 | 422
3  | 2022-07-24 13:01:00 | 1
4  | 2022-07-24 10:48:00 | 67
and using midday today would give:
select id,
max(change_time) as change_time,
max(value) keep (dense_rank last order by change_time) as value
from your_table
where change_time <= timestamp '2022-07-28 12:00:00'
group by id
order by id
ID | CHANGE_TIME         | VALUE
1  | 2022-07-24 10:00:00 | 900
2  | 2022-07-22 21:51:00 | 422
3  | 2022-07-28 11:59:00 | 12
4  | 2022-07-28 11:45:00 | 63
5  | 2022-07-28 10:20:00 | 55
db<>fiddle with some other input dates to show the result set changing.
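
For readers who want to see the filter-then-aggregate logic outside the database, here is a minimal Python sketch of the same idea: drop rows after the cutoff, then keep the latest row per ID. The function name latest_before and the handful of sample rows are hypothetical (the rows are picked to agree with the result tables above). Ties on change_time are broken by the larger value, which is what max(value) keep (dense_rank last ...) does in the query.

```python
from datetime import datetime


def latest_before(rows, cutoff):
    """Return {id: (change_time, value)} for the latest row per id at or before cutoff.

    rows is an iterable of (id, value, change_time) tuples.
    """
    latest = {}
    for row_id, value, change_time in rows:
        if change_time > cutoff:
            continue  # same role as "where change_time <= <input date>"
        candidate = (change_time, value)
        # Tuple comparison: later change_time wins, larger value breaks ties.
        if row_id not in latest or candidate > latest[row_id]:
            latest[row_id] = candidate
    return latest


# Hypothetical sample rows: (id, value, change_time)
rows = [
    (1, 900, datetime(2022, 7, 24, 10, 0)),
    (3, 1, datetime(2022, 7, 24, 13, 1)),
    (3, 12, datetime(2022, 7, 28, 11, 59)),
    (5, 55, datetime(2022, 7, 28, 10, 20)),
]

midnight = latest_before(rows, datetime(2022, 7, 28, 0, 0))
midday = latest_before(rows, datetime(2022, 7, 28, 12, 0))
print(midnight)
print(midday)
```

With a million rows the database query is the better place to do this work; the sketch is only meant to make the semantics explicit.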

How to store multiple values in a variable to be used in a case statement

I am having this issue, and any help would be greatly appreciated.
I have an Oracle database and am working with the following business case:
An employee can work in different job grades during his/her regular hours or during overtime.
I need to calculate each employee's hours per job grade and wage code. The hours and the job grades are in different tables: the table that has job grades doesn't have hours, only time in and time out. After querying the database I get the following result:
Emp_ID | Wage Code | Job grade | Hours | Date
1      | 01        |           | 8     | 2021/06/07
1      | 02        | P         | 2     | 2021/06/07
1      | 08        |           | 8     | 2021/06/08
1      | 01        |           | 6     | 2021/06/09
1      | 01        | E         | 8     | 2021/06/09
1      | 01        |           | 8     | 2021/06/10
1      | 01        |           | 8     | 2021/06/11
1      | 02        |           | 9     | 2021/06/11
Now I get wrong hours when the employee works in different job grade(s).
To overcome this, I need to identify the dates on which the employee worked in a different job grade so I can put that in a case statement.
I used this logic:
Pick the date on which the employee worked in a different job grade, and for that date calculate the hours from table A.
Otherwise, calculate the hours from table B.
The problem is that I can't simply use a variable, because there could be multiple dates.
How can I achieve this? Can I use any other logic?
Thanks,
Here are my tables
TABLE A
Emp_ID | Wage_code | time_in | time_out | Job_grade | Date
01     |           | 8:00    | 16:00    |           | 2021-06-7
01     |           | 16:00   | 18:00    | P         | 2021-06-7
01     |           | 8:00    | 16:00    |           | 2021-06-08
01     |           | 8:00    | 14:00    |           | 2021-06-09
01     |           | 14:00   | 16:00    | E         | 2021-06-09
01     |           | 8:00    | 16:00    |           | 2021-06-10
01     |           | 8:00    | 16:00    |           | 2021-06-11
01     |           | 16:00   | 17:00    |           | 2021-06-11
This table doesn't store wage codes. An empty job_grade means the employee worked in the same job grade.
TABLE B
Emp_ID | Wage_code | Hours | Date
01     | 1         | 8     | 2021-06-7
01     | 2         | 2     | 2021-06-7
01     | 8         | 8     | 2021-06-08
01     | 1         | 8     | 2021-06-09
01     | 1         | 8     | 2021-06-10
01     | 1         | 8     | 2021-06-11
01     | 2         | 2     | 2021-06-11
This table stores wage_codes but no job grade change, just a regular one and hours for each wage_code (1=regular,2=overtime,8=vacation etc..)
My query:
select
  A.emp_id,
  A.job_grade,
  B.Wage_code,
  B.Date,
  case
    when A.job_grade = '' then B.Hours
    else to_char((A.time_out - A.time_in) * 24, 'fm99.90')
  end "Hours"
from A
left join B on A.emp_id = B.emp_id and A.Date = B.Date
With this query I get wrong hours when employee has worked in a different job grade. Because the condition in case statement checks if job grade is empty then calculate hours from Table B. Now e.g. on 06/07, employee has worked in a normal grade as well as in a different job grade.
How can I identify the date on which employee has worked in a different job grade so I can combine it with the job_grade condition in case statement and calculate hours accurately.
Many thanks for your support!!

MDX query count Login occurences over time interval

I'm puzzled as to how to build my fact and dimension tables to produce the following results:
I want to count the number of people logged in during each time interval.
In this case the interval is 30 minutes. It would look like this:
Example: Person1 login at 10:05:00 and logout at 12:10:00
Person2 login at 10:45:00 and logout at 11:25:00
Person3 login at 11:05:00 and logout at 14:01:00
TimeStart TimeEnd People logged
00:00:00 00:30:00 0
00:30:00 01:00:00 0
...
10:00:00 10:30:00 1
10:30:00 11:00:00 2
11:00:00 11:30:00 3
11:30:00 12:00:00 2
12:00:00 12:30:00 2
12:30:00 13:00:00 1
13:00:00 13:30:00 1
13:30:00 14:00:00 1
14:00:00 14:30:00 0
...
23:30:00 00:00:00 0
So I have DimTime and DimDate tables that contain hour, halfhour, quarterhour,
and I have a FactTimestamp table that has the following:
DateLoginID that points to DimDate dateID
DateLogoutID that points to DimDate dateID
TimeLoginID that points to DimTime timeID
TimeLogoutID that points to DimTime timeID
I'd like to know what kind of cube design I would need to achieve that.
I've done it in SQL, if that can help:
--Create tmp table for time interval
CREATE TABLE #tmp(
    StartRange time(0),
    EndRange time(0)
);
--Example date; in the original procedure @Date is presumably a parameter
DECLARE @Date datetime = '2017-07-27';
--Interval set to 30 minutes
DECLARE @Interval int = 30;
-- Example with @Date = 2017-07-27: Set starttime at 2017-07-27 00:00:00
DECLARE @StartTime datetime = DATEADD(HOUR, 0, @Date);
--Set endtime at 2017-07-27 23:59:59
DECLARE @EndTime datetime = DATEADD(SECOND, 59, DATEADD(MINUTE, 59, DATEADD(HOUR, 23, @Date)));
--Populate tmp table with the time interval. from midnight to 23:59:59
;WITH cSequence AS
(
    SELECT
        @StartTime AS StartRange,
        DATEADD(MINUTE, @Interval, @StartTime) AS EndRange
    UNION ALL
    SELECT
        EndRange,
        DATEADD(MINUTE, @Interval, EndRange)
    FROM cSequence
    WHERE DATEADD(MINUTE, @Interval, EndRange) <= @EndTime
)
INSERT INTO #tmp SELECT cast(StartRange as time(0)), cast(EndRange as time(0)) FROM cSequence OPTION (MAXRECURSION 0);
--Insert last record 23:30:00 to 23:59:59
INSERT INTO #tmp (StartRange, EndRange) values ('23:30:00', '23:59:59');
SELECT tmp.StartRange as [Interval], COUNT(ts.TimeIn) as [Operators]
FROM #tmp tmp
JOIN Timestamp ts ON
    --If timeIn is earlier than StartRange OR within the start/end range
    (CAST(ts.TimeIn as time(0)) <= tmp.StartRange OR CAST(ts.TimeIn as time(0)) BETWEEN tmp.StartRange AND tmp.EndRange)
    AND
    --AND If timeOut is later than EndRange OR within the start/end range
    (CAST(ts.[TimeOut] as time(0)) >= tmp.EndRange OR CAST(ts.[TimeOut] as time(0)) BETWEEN tmp.StartRange AND tmp.EndRange)
GROUP BY tmp.StartRange, tmp.EndRange
Really any kind of hint as to how to achieve it in mdx would be greatly appreciated.
Honestly, I wouldn't do it in MDX against that table structure. Even if you succeed in getting an MDX query that returns that value, and surely it can be done, it will most likely be tremendously complex and hard to maintain and debug, and will probably require multiple passes on the fact table to get the numbers, hurting performance.
I think this is a clear cut case for a periodic snapshot table. Pick your granularity, but even at 1 min snapshots you get 1440 points of data per day for each tuple of all other dimensions. If your login/logout table is large you may need to decrease this to keep its size manageable. In the end, you get a table with time_id, count_of_logins, and whatever other keys you need to other dimensions, and the query you need is just a filter on which time periods you want (give me all hours of the day, but filter on only minutes 00 and 30 of each hour) and the count of total number of logged in users is trivial.
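
To make the snapshot idea concrete, here is a rough Python sketch of how such a table could be populated before loading it into the cube. It assumes a session counts toward a bucket if it overlaps the bucket at all; the function name snapshot_counts, the 30-minute default, and the date 2017-07-27 (borrowed from the SQL comment in the question) are illustrative choices, not part of any existing schema.

```python
from datetime import date, datetime, time, timedelta


def snapshot_counts(sessions, day, minutes=30):
    """Count the sessions that overlap each [start, start + minutes) bucket of `day`.

    sessions is a list of (login, logout) datetime pairs.
    Returns {bucket_start_time: count} with one entry per bucket.
    """
    step = timedelta(minutes=minutes)
    start = datetime.combine(day, time(0, 0))
    end_of_day = start + timedelta(days=1)
    counts = {}
    while start < end_of_day:
        end = start + step
        # A session overlaps the bucket if it starts before the bucket ends
        # and ends after the bucket starts.
        counts[start.time()] = sum(
            1 for login, logout in sessions if login < end and logout > start
        )
        start = end
    return counts


day = date(2017, 7, 27)  # assumed date for the example sessions


def at(hour, minute):
    return datetime.combine(day, time(hour, minute))


# The three example sessions from the question.
sessions = [
    (at(10, 5), at(12, 10)),   # Person1
    (at(10, 45), at(11, 25)),  # Person2
    (at(11, 5), at(14, 1)),    # Person3
]

counts = snapshot_counts(sessions, day)
for slot in (time(10, 0), time(10, 30), time(11, 0), time(11, 30), time(12, 0)):
    print(slot, counts[slot])
```

Under this overlap rule the 14:00 bucket counts 1, because Person3 logs out at 14:01, whereas the table in the question shows 0 there. Whichever boundary rule you want, it is decided once when the snapshot is built, and the cube query stays a simple filter and sum.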

Event Study (Extracting Dates in SAS)

I need to analyse abnormal returns for an event study on mergers and acquisitions.
I would like to analyse abnormal returns to acquirers by using event windows. Basically, I would like to extract the acquirers' prices for day -1 (the day before the announcement date), the announcement date itself, and day +1 (the day after the announcement date).
I have two different datasets to extract information from.
The first is a dataset with all the merger and acquisition information that has the information in the following format:
DealNO AcquirerNO TargetNO AnnouncementDate
123 abcd Cfgg 22/12/2010
222 qwert cddfgf 26/12/1998
In addition, I have a 2nd dataset which has all the prices.
ISINnumber Date Price
abcd 21/12/2010 10
abcd 22/12/2010 11
abcd 23/12/2010 11
abcd 24/12/2010 12
qwert 20/12/1998 20
qwert 21/12/1998 20
qwert 22/12/1998 21
qwert 23/12/1998 21
qwert 24/12/1998 21
qwert 25/12/1998 22
qwert 26/12/1998 21
qwert 27/12/1998 23
ISIN number is the same as acquirer no, and that is the matching code.
In the end I would like to have a database something like this:
DealNO AcquirerNO TargetNO AnnouncementDate Acquirerprice(-1day) Acquirerprice(0day) Acquirerprice(+1day)
123 abcd Cfgg 22/12/2010 10 11 12
222 qwert cddfgf 26/12/1998 22 21 23
Do you know how I can get this?
I'd prefer to use sas to run the code, but if you are familiar with any other programs that can get the data like this, please let me know.
Thank you in advance ^_^.
This can be done quite easily with PROC SQL and joining the PRICE dataset three times. Try this (assuming data set names of ANNOUCE and PRICE):
Warning: untested code
/* Assumes Date and AnnouncementDate are SAS date values, */
/* so +1 / -1 means one calendar day.                     */
proc sql;
create table RESULT as
select a.DealNO,
       a.AcquirerNO,
       a.TargetNO,
       a.AnnouncementDate,
       p.Price as acquirerprice_prev,
       c.Price as acquirerprice_cur,
       n.Price as acquirerprice_next
from ANNOUCE a
left join PRICE p on a.AcquirerNO = p.ISINnumber and p.Date = a.AnnouncementDate - 1
left join PRICE c on a.AcquirerNO = c.ISINnumber and c.Date = a.AnnouncementDate
left join PRICE n on a.AcquirerNO = n.ISINnumber and n.Date = a.AnnouncementDate + 1
;
quit;
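
Since the question is open to other tools, the same three-way lookup can be sketched in plain Python. This is only an illustration under stated assumptions: the function name event_window_prices is made up, dates are real date objects, and the window is measured in calendar days, so a -1 or +1 day that falls on a weekend or holiday simply comes back as None.

```python
from datetime import date, timedelta


def event_window_prices(deals, prices):
    """Attach the acquirer's price on day -1, 0 and +1 to each deal.

    deals:  list of (deal_no, acquirer_no, target_no, announcement_date)
    prices: list of (isin, price_date, price)
    """
    # Index prices by (ISIN, date) for constant-time lookups.
    by_key = {(isin, day): price for isin, day, price in prices}
    result = []
    for deal_no, acquirer, target, announced in deals:
        window = [
            by_key.get((acquirer, announced + timedelta(days=offset)))
            for offset in (-1, 0, 1)
        ]
        result.append((deal_no, acquirer, target, announced, *window))
    return result


# Deal 222 and its surrounding prices, taken from the question.
deals = [(222, "qwert", "cddfgf", date(1998, 12, 26))]
prices = [
    ("qwert", date(1998, 12, 24), 21),
    ("qwert", date(1998, 12, 25), 22),
    ("qwert", date(1998, 12, 26), 21),
    ("qwert", date(1998, 12, 27), 23),
]

for row in event_window_prices(deals, prices):
    print(row)
```

For a real event study you would probably want trading-day offsets rather than calendar-day offsets, which means ranking each ISIN's price dates and stepping through that ranking instead of adding timedelta(days=1).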

Processing Timebased values

I have a list of timebased values in the following form:
20/Dec/2011:10:16:29 9
20/Dec/2011:10:16:30 13
20/Dec/2011:10:16:31 13
20/Dec/2011:10:16:32 9
20/Dec/2011:10:16:33 13
20/Dec/2011:10:16:34 14
20/Dec/2011:10:16:35 6
20/Dec/2011:10:16:36 7
20/Dec/2011:10:16:37 16
20/Dec/2011:10:16:38 5
20/Dec/2011:10:16:39 7
20/Dec/2011:10:16:40 15
20/Dec/2011:10:16:41 12
20/Dec/2011:10:16:42 13
20/Dec/2011:10:16:43 11
20/Dec/2011:10:16:44 6
20/Dec/2011:10:16:45 7
20/Dec/2011:10:16:46 9
20/Dec/2011:10:16:47 14
20/Dec/2011:10:16:49 6
20/Dec/2011:10:16:50 11
20/Dec/2011:10:16:51 15
20/Dec/2011:10:16:52 10
20/Dec/2011:10:16:53 16
20/Dec/2011:10:16:54 12
20/Dec/2011:10:16:55 8
The second column contains the value for each second. Values exist for every second of a complete month. I want to sum these values:
Per minute [for seconds 00 - 59]
Per hour [for minutes 00 - 59]
Per day [for hours 0 - 24]
Sounds like a job for Excel and a pivot table.
The trick is to parse the text date/time you have into something Excel can work with; splitting it on the colon will do just that. Assuming the value you have is in cell A2, this formula will convert the text into a real date:
=DATEVALUE(LEFT(A2,SEARCH(":",A2)-1))+TIMEVALUE(RIGHT(A2,LEN(A2)-SEARCH(":",A2)))
Then just create Minute, Hour and Day columns where you subtract out that portion of the date. For example, if the date from the above formula is in C2, the following will subtract out the seconds and give you just up to the minute:
=C2-SECOND(C2)/24/60/60
Then repeat the process for the next two columns to give you the hour and the day:
=D2-MINUTE(D2)/24/60
=E2-HOUR(E2)/24
Then all you have to do is create a pivot table on the data with rows Day, Hour, Minute and value Sum(Value).
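
If a spreadsheet is not practical for a month of per-second data, the same truncate-and-sum idea can be sketched in Python. The function name aggregate is hypothetical, and the parsing assumes each line is exactly "timestamp value" separated by whitespace, with English month abbreviations (the %b directive depends on the locale). The last input line is a made-up extra row, added only to show a second minute bucket.

```python
from collections import defaultdict
from datetime import date, datetime


def aggregate(lines):
    """Sum the values per minute, per hour and per day."""
    per_minute = defaultdict(int)
    per_hour = defaultdict(int)
    per_day = defaultdict(int)
    for line in lines:
        stamp, value = line.split()
        ts = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S")
        amount = int(value)
        # Truncate the timestamp to each level, like the helper columns
        # in the spreadsheet approach.
        per_minute[ts.replace(second=0)] += amount
        per_hour[ts.replace(minute=0, second=0)] += amount
        per_day[ts.date()] += amount
    return per_minute, per_hour, per_day


lines = [
    "20/Dec/2011:10:16:29 9",
    "20/Dec/2011:10:16:30 13",
    "20/Dec/2011:10:16:31 13",
    "20/Dec/2011:10:17:00 5",  # hypothetical row, not in the original data
]

per_minute, per_hour, per_day = aggregate(lines)
for minute, total in sorted(per_minute.items()):
    print(minute, total)
```

The three dictionaries correspond to the Minute, Hour and Day rows of the pivot table.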

Resources