Event Study (Extracting Dates in SAS) - events

I need to analyse abnormal returns for an event study on mergers and acquisitions.
I would like to analyse abnormal returns to acquirers by using event windows. Basically I would like to extract the prices for the acquirers using -1 (the day before the announcement date), announcement date, and +1 (the day after the announcement date).
I have two different datasets to extract information from.
The first is a dataset with all the merger and acquisition information that has the information in the following format:
DealNO AcquirerNO TargetNO AnnouncementDate
123 abcd Cfgg 22/12/2010
222 qwert cddfgf 26/12/1998
In addition, I have a 2nd dataset which has all the prices.
ISINnumber Date Price
abcd 21/12/2010 10
abcd 22/12/2010 11
abcd 23/12/2010 11
abcd 24/12/2010 12
qwert 20/12/1998 20
qwert 21/12/1998 20
qwert 22/12/1998 21
qwert 23/12/1998 21
qwert 24/12/1998 21
qwert 25/12/1998 22
qwert 26/12/1998 21
qwert 27/12/1998 23
ISIN number is the same as acquirer no, and that is the matching code.
In the end I would like to have a database something like this:
DealNO AcquirerNO TargetNO AnnouncementDate Acquirerprice(-1day) Acquirerprice(0day) Acquirerprice(+1day)
123 abcd Cfgg 22/12/2010 10 11 12
222 qwert cddfgf 26/12/1998 22 21 23
Do you know how I can get this?
I'd prefer to use SAS to run the code, but if you are familiar with any other programs that can get the data like this, please let me know.
Thank you in advance ^_^.

This can be done quite easily with PROC SQL by joining the PRICE dataset three times, once for each day in the event window. Try this (assuming data set names of ANNOUNCE and PRICE, and that both date columns are stored as SAS date values):
Warning: untested code
proc sql;
create table RESULT as
select a.dealno,
       a.acquirerno,
       a.targetno,
       a.announcementdate,
       p.price as acquirerprice_prev,
       c.price as acquirerprice_cur,
       n.price as acquirerprice_next
from ANNOUNCE a
/* each join matches on the acquirer and on an offset from that deal's own announcement date */
left join PRICE p on a.acquirerno = p.isinnumber and p.date = a.announcementdate - 1
left join PRICE c on a.acquirerno = c.isinnumber and c.date = a.announcementdate
left join PRICE n on a.acquirerno = n.isinnumber and n.date = a.announcementdate + 1
;
quit;
Note that the offsets are calendar days, so a price will be missing whenever the day before or after the announcement is not a trading day.
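For readers without SAS, the same lookup can be sketched in Python. This is a minimal illustration under stated assumptions, not the answerer's code: the names `deals`, `prices` and `event_window` are hypothetical, the rows are copied from the question's sample data, and the offsets are calendar days rather than trading days.

```python
from datetime import date, timedelta

# Sample rows from the question: (DealNO, AcquirerNO, TargetNO, AnnouncementDate)
deals = [
    (123, "abcd", "Cfgg", date(2010, 12, 22)),
    (222, "qwert", "cddfgf", date(1998, 12, 26)),
]

# Prices keyed by (ISINnumber, Date) so each lookup is a single dict access.
prices = {
    ("abcd", date(2010, 12, 21)): 10,
    ("abcd", date(2010, 12, 22)): 11,
    ("abcd", date(2010, 12, 23)): 11,
    ("abcd", date(2010, 12, 24)): 12,
    ("qwert", date(1998, 12, 25)): 22,
    ("qwert", date(1998, 12, 26)): 21,
    ("qwert", date(1998, 12, 27)): 23,
}


def event_window(deals, prices, offsets=(-1, 0, 1)):
    """Return one row per deal with the acquirer price at each calendar-day offset."""
    rows = []
    for deal_no, acquirer, target, announced in deals:
        # dict.get returns None when no price exists, like a LEFT JOIN with no match.
        window = [prices.get((acquirer, announced + timedelta(days=k))) for k in offsets]
        rows.append((deal_no, acquirer, target, announced, *window))
    return rows


result = event_window(deals, prices)
for row in result:
    print(row)
```

With the sample prices as posted, deal 123 gets 10, 11 and 11, because the price listed for 23/12/2010 is 11. A window measured in trading days would need the prices ranked by date within each ISIN instead of plain date arithmetic.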

Related

How to find the earliest date of the occurrence of a value for each year

I have a table with this structure:
STATION ID  YEAR  MONTH  DAY  RECDATE     VALUE
123456      1950  01     01   01-01-1950  95
123456      1950  01     15   01-15-1950  85
123456      1950  03     15   03-15-1950  95
123456      1951  01     02   01-02-1951  35
123456      1951  01     10   01-10-1951  35
123456      1952  02     12   02-12-1952  80
123456      1952  02     13   02-13-1952  80
And so on. There's a TMIN value for this station ID for every day of every year between 1888 and 2022. What I'm trying to figure out is a query that will give me the earliest date in each year that a value between -100 and 100 occurs.
The query select year, max(value) from table where value between -100 and 100 group by year order by year gives the year and value. The query select recdate, min(value) from table group by recdate order by recdate gives me every recdate with the value.
I have a vague memory of a query that practically partitions the data by a year or a date range so that the query would look at all the 1950 dates and give the earliest date for the value, then all the 1951 dates, and so on. Does anyone remember queries like that?
Thanks for any and all suggestions.
If I understood you correctly, this is your question:
What I'm trying to figure out is a query that will give me the earliest date in each year that a value between -100 and 100 occurs.
Then you posted 2 queries which return something, but I don't see relation to the question. What was their purpose? To me, they look like some random queries one could write against data in that table.
Therefore, back to the question: isn't that just
select min(recdate), --> "earliest date
year --> in each year
from that_table -- that a
where value between -100 and 100 --> value between -100 and 100 occurs"
group by year
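The grouped query can be tried against the sample rows with Python's built-in `sqlite3` module. This is only a sketch: the table name `that_table` follows the answer, the column `station_id` is a hypothetical spelling of "STATION ID", and `recdate` is stored as `YYYY-MM-DD` text (an assumption) so that `min()` on the text column sorts chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table that_table (station_id text, year integer, recdate text, value integer)"
)

# Sample rows from the question; recdate rewritten as YYYY-MM-DD (assumption).
rows = [
    ("123456", 1950, "1950-01-01", 95),
    ("123456", 1950, "1950-01-15", 85),
    ("123456", 1950, "1950-03-15", 95),
    ("123456", 1951, "1951-01-02", 35),
    ("123456", 1951, "1951-01-10", 35),
    ("123456", 1952, "1952-02-12", 80),
    ("123456", 1952, "1952-02-13", 80),
]
conn.executemany("insert into that_table values (?, ?, ?, ?)", rows)

# Earliest date per year on which a value in the requested range occurs.
earliest = conn.execute(
    """
    select year, min(recdate)
    from that_table
    where value between -100 and 100
    group by year
    order by year
    """
).fetchall()
print(earliest)
```

If `recdate` is a real DATE column in the source database, no text conversion is needed and `min(recdate)` compares dates directly.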

How would I add an artificial termination date to the termination date column based on two different dates for the same patient id

I need to figure out a query that will compare two EFFECTIVE dates for a given patient number with different HMOs and determine which is the later date of the two and then populate a TERMINATION date field for only the older of the two effective dates with the last day of the previous month of the newer effective date of the two. This needs to be done across multiple patient, HMO, effective date combinations in a table.
SELECT * FROM tablename
The output is this:
HMO PATIENT EFFECTIVE TERMINATION
16 221135 01-APR-18
18 221135 01-OCT-17
12 251181 01-SEP-16
16 251181 01-MAR-15
12 271126 01-MAR-15
16 271126 01-DEC-16
12 291141 01-DEC-16
16 291141 01-FEB-19
12 391134 09-MAY-13
16 391134 01-APR-18
What I am trying to do via a query or queries is this:
HMO PATIENT EFFECTIVE TERMINATION
16 221235 01-APR-18
18 221235 01-OCT-17 3/31/2018
12 251381 01-SEP-16
16 251381 01-MAR-15 8/31/2016
12 2711126 01-MAR-15 11/30/2016
16 2711126 01-DEC-16
12 292241 01-DEC-16 1/31/2019
16 292241 01-FEB-19
12 391534 09-MAY-13 31-MAR-19
16 391534 01-APR-18
I've tried using a case statement but it is unsurprisingly creating four rows per patient, hmo combo and populating two of the rows with dates and leaving two blank:
SELECT DISTINCT
S.HMO
,S.PATIENT
,S.EFFECTIVE
,CASE WHEN S.EFFECTIVE > E.EFFECTIVE THEN LAST_DAY(ADD_MONTHS(S.EFFECTIVE, -1))
WHEN S.EFFECTIVE < E.EFFECTIVE THEN LAST_DAY(ADD_MONTHS(E.EFFECTIVE, -1))
ELSE NULL END AS TERMINATION
FROM tablename S INNER JOIN tablename E ON S.PATIENT=E.PATIENT
WHERE S.PATIENT =221135
Any ideas or advice would be welcome.
With sample data you posted:
SQL> select * from tablename order by patient, effective;
HMO PATIENT EFFECTIVE TERMINATIO
---------- ---------- ---------- ----------
18 221135 10/01/2017
16 221135 04/01/2018
16 251181 03/01/2015
12 251181 09/01/2016
12 271126 03/01/2015
16 271126 12/01/2016
6 rows selected.
such a MERGE might do:
SQL> merge into tablename a
2 using (select patient, max(effective) max_effective,
3 min(effective) min_effective
4 from tablename
5 group by patient
6 ) x
7 on (a.patient = x.patient)
8 when matched then update set
9 a.termination = last_day(add_months(x.max_effective, -1))
10 where a.effective = x.min_effective;
3 rows merged.
Result is then
SQL> select * from tablename order by patient, effective;
HMO PATIENT EFFECTIVE TERMINATIO
---------- ---------- ---------- ----------
18 221135 10/01/2017 03/31/2018
16 221135 04/01/2018
16 251181 03/01/2015 08/31/2016
12 251181 09/01/2016
12 271126 03/01/2015 11/30/2016
16 271126 12/01/2016
6 rows selected.
SQL>
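The rule the question asks for (terminate the older row on the last day of the month before the newer effective date) can be sketched in Python as follows. The function names are hypothetical and the rows are a subset of the question's sample data. The sketch terminates every row older than the patient's newest one, which is the same thing when each patient has exactly two rows.

```python
from datetime import date, timedelta


def last_day_of_previous_month(d):
    """Last calendar day of the month before d."""
    return d.replace(day=1) - timedelta(days=1)


def termination_dates(rows):
    """rows: (hmo, patient, effective). Returns {(hmo, patient): termination or None}."""
    # First pass: newest effective date per patient.
    newest = {}
    for _hmo, patient, effective in rows:
        if patient not in newest or effective > newest[patient]:
            newest[patient] = effective
    # Second pass: only rows older than the newest one get a termination date.
    result = {}
    for hmo, patient, effective in rows:
        if effective < newest[patient]:
            result[(hmo, patient)] = last_day_of_previous_month(newest[patient])
        else:
            result[(hmo, patient)] = None
    return result


rows = [
    (16, 221135, date(2018, 4, 1)),
    (18, 221135, date(2017, 10, 1)),
    (12, 251181, date(2016, 9, 1)),
    (16, 251181, date(2015, 3, 1)),
    (12, 391134, date(2013, 5, 9)),
    (16, 391134, date(2018, 4, 1)),
]
terminations = termination_dates(rows)
for key, value in terminations.items():
    print(key, value)
```

Subtracting a single day from the newer effective date gives the same answer only when that date is the first of a month; stepping back from the first of the month, as above, also covers mid-month effective dates.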

Reorder factored matrix columns in Power BI

I have a matrix visual in Power BI. The columns are departments and the rows years. The values are counts of people in each department each year. The departments obviously don't have a natural ordering, BUT I would like to reorder them using the total column count for each department in descending order.
For example, if Department C has 100 people total over the years (rows), and all the other departments have fewer, I want Department C to come first.
I have seen other solutions that add an index column, but this doesn't work very well for me because the "count of people" variable is what I want to index by and that doesn't already exist in my data. Rather it's a calculation based on individual people which each have a department and year.
If anyone can point me to an easy way of changing the column ordering/sorting that would be splendid!
| DeptA | DeptB | DeptC
------|-------|-------|-------
1900 | 2 | 5 | 10
2000 | 6 | 7 | 2
2010 | 10 | 1 | 12
2020 | 0 | 3 | 30
------|-------|-------|-------
Total | 18 | 16 | 54
Order: #2 #3 #1
I don't think there is a built-in way to do this like there is for sorting the rows (there should be, though), but here's a possible workaround.
I will assume your source table is called Employees and looks something like this:
Department Year Value
A 1900 2
B 1900 5
C 1900 10
A 2000 6
B 2000 7
C 2000 2
A 2010 10
B 2010 1
C 2010 12
A 2020 0
B 2020 3
C 2020 30
First, create a new calculated table like this:
Depts = SUMMARIZE(Employees, Employees[Department], "Total", SUM(Employees[Value]))
This should give you a short table as follows:
Department Total
A 18
B 16
C 54
From this, you can easily rank the totals with a calculated column on this Depts table:
Rank = RANKX('Depts', 'Depts'[Total])
Make sure your new Depts table is related to the original Employees table on the Department column.
Under the Data tab, use Modeling > Sort by Column to sort Depts[Department] by Depts[Rank].
Finally, replace Employees[Department] with Depts[Department] on your matrix visual. The columns should then appear in descending order of total, which for the sample data is DeptC, DeptA, DeptB.
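To check what the Depts table and its Rank column should contain, the same summarise-then-rank logic can be sketched outside Power BI. This Python snippet is only an illustration using the sample Employees rows; the variable names are hypothetical, and ties are broken by first appearance, whereas RANKX gives tied totals the same rank.

```python
from collections import defaultdict

# Sample Employees rows from the answer: (Department, Year, Value)
employees = [
    ("A", 1900, 2), ("B", 1900, 5), ("C", 1900, 10),
    ("A", 2000, 6), ("B", 2000, 7), ("C", 2000, 2),
    ("A", 2010, 10), ("B", 2010, 1), ("C", 2010, 12),
    ("A", 2020, 0), ("B", 2020, 3), ("C", 2020, 30),
]

# Equivalent of the SUMMARIZE step: one total per department.
totals = defaultdict(int)
for dept, _year, value in employees:
    totals[dept] += value

# Equivalent of the rank column: 1 for the largest total, 2 for the next, and so on.
ordered = sorted(totals, key=totals.get, reverse=True)
rank = {dept: position + 1 for position, dept in enumerate(ordered)}

print(dict(totals))
print(ordered)
print(rank)
```

Sorting Depts[Department] by this rank is what puts the column with the largest total first in the matrix.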

How to group by multiple columns and then transpose in Hive

I have some data that I want to group by on multiple columns, perform an aggregation function on, and then transpose into different columns using Hive.
For example, given this input
Input:
hr type value
01 a 10
01 b 20
01 c 50
01 a 30
02 c 10
02 b 90
02 a 80
I want to produce this output:
Output:
hr a_avg b_avg c_avg
01 20 20 50
02 80 90 10
Where there is one distinct column for each distinct type in my input. a_avg corresponds to the average a value for each hour.
How can I do this in Hive? I am guessing I might need to make use of https://github.com/klout/brickhouse/wiki/Collect-UDFs
So far the best I can think of is to use multiple group-by clauses, but that won't transpose the data into multiple columns.
Any ideas?
You don't necessarily need to use Brickhouse, but it will definitely make it easier. Here is what I'm thinking, something like
select hr
, type_map['a'] a_avg
, type_map['b'] b_avg
, type_map['c'] c_avg
from (
select hr
, collect(type, avg_value) type_map -- Brickhouse collect; creates a map
from (
select hr
, type
, avg( value ) avg_value
from db.table
group by hr, type ) x
group by hr ) y
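The three nested steps of that query (average per hour and type, collect the averages into a map per hour, then read one map key per output column) can be mirrored in plain Python to check the expected numbers. This is a sketch on the question's sample input, not Hive code; the variable names are hypothetical.

```python
from collections import defaultdict

# Sample input from the question: (hr, type, value)
rows = [
    ("01", "a", 10), ("01", "b", 20), ("01", "c", 50), ("01", "a", 30),
    ("02", "c", 10), ("02", "b", 90), ("02", "a", 80),
]

# Step 1 (innermost query): running sum and count per (hr, type).
sums = defaultdict(lambda: [0, 0])
for hr, typ, value in rows:
    acc = sums[(hr, typ)]
    acc[0] += value
    acc[1] += 1

# Step 2 (the collect step): one {type: average} map per hour.
type_map = defaultdict(dict)
for (hr, typ), (total, count) in sums.items():
    type_map[hr][typ] = total / count

# Step 3 (outer query): one column per known type; a missing type yields None.
pivot = [(hr, m.get("a"), m.get("b"), m.get("c")) for hr, m in sorted(type_map.items())]
for row in pivot:
    print(row)
```

As in the Hive query, the list of output columns has to be written out by hand, so a new type in the data needs a new column in the final select.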

Hive: Joining two tables with different keys

I have two tables like the ones below. I want to join them and get the result shown under Expected Output.
The first 3 rows of table 2 do not have an activity ID; the field is just empty.
All fields are tab separated. Category "33" has three descriptions in table 2.
We need to use "Activity ID" to get the right description for category "33", as there are 3 values for it.
Could anyone tell me how to achieve this output?
TABLE: 1
Empid Category ActivityID
44126 33 TRAIN
44127 10 UFL
44128 12 TOI
44129 33 UNASSIGNED
44130 15 MICROSOFT
44131 33 BENEFITS
44132 43 BENEFITS
TABLE 2:
Category ActivityID Categdesc
10 billable
12 billable
15 Non-billable
33 TRAIN Training
33 UNASSIGNED Bench
33 BENEFITS Benefits
43 Benefits
Expected Output:
44126 33 Training
44127 10 Billable
44128 12 Billable
44129 33 Bench
44130 15 Non-billable
44131 33 Benefits
44132 43 Benefits
It's a little difficult to do this in Hive, as there are many limitations. This is how I solved it, but there could be a better way.
I named your tables as below.
Table1 = EmpActivity
Table2 = ActivityMas
The challenge comes from the empty Activity ID fields in Table2. I created a view and used UNION ALL to combine the results of two distinct queries.
Create view actView AS Select * from ActivityMas Where Activityid ='';
SELECT * From (
Select EmpActivity.EmpId, EmpActivity.Category, ActivityMas.categdesc
from EmpActivity JOIN ActivityMas
ON EmpActivity.Category = ActivityMas.Category
AND EmpActivity.ActivityId = ActivityMas.ActivityId
UNION ALL
Select EmpActivity.EmpId, EmpActivity.Category, ActView.categdesc from EmpActivity
JOIN ActView ON EmpActivity.Category = ActView.Category
) u;
You have to wrap the UNION ALL in a top-level SELECT, as UNION ALL is not directly supported in top-level statements, and the subquery needs an alias (u here). This will run 3 MR jobs in total. Below is the result I got.
44127 10 billable
44128 12 billable
44130 15 Non-billable
44132 43 Benefits
44131 33 Benefits
44126 33 Training
44129 33 Bench
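The two branches of that UNION ALL can be checked with a small Python sketch on the sample tables. The variable names are hypothetical, and empty activity IDs are represented as empty strings, matching the view's `Activityid = ''` filter.

```python
# Table 1 (EmpActivity): (Empid, Category, ActivityID)
emp_activity = [
    (44126, 33, "TRAIN"), (44127, 10, "UFL"), (44128, 12, "TOI"),
    (44129, 33, "UNASSIGNED"), (44130, 15, "MICROSOFT"),
    (44131, 33, "BENEFITS"), (44132, 43, "BENEFITS"),
]

# Table 2 (ActivityMas): (Category, ActivityID, Categdesc); "" means no activity ID.
activity_mas = [
    (10, "", "billable"), (12, "", "billable"), (15, "", "Non-billable"),
    (33, "TRAIN", "Training"), (33, "UNASSIGNED", "Bench"),
    (33, "BENEFITS", "Benefits"), (43, "", "Benefits"),
]

# Branch 1 lookup: rows that carry an activity ID, matched on (category, activity).
exact = {(cat, act): desc for cat, act, desc in activity_mas if act}
# Branch 2 lookup (the view): rows with an empty activity ID, matched on category only.
by_category = {cat: desc for cat, act, desc in activity_mas if not act}

result = []
for emp, cat, act in emp_activity:
    if (cat, act) in exact:
        result.append((emp, cat, exact[(cat, act)]))
for emp, cat, act in emp_activity:
    if cat in by_category:
        result.append((emp, cat, by_category[cat]))

for row in sorted(result):
    print(row)
```

Each employee is matched by exactly one branch here because a category in the sample data either has activity-specific rows or a single row with an empty activity ID, never both.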
I'm not sure if I understand your question or your data, but would this work?
select table1.empid, table1.category, table2.categdesc
from table1 join table2
on table1.activityID = table2.activityID;
