ElasticSearch: Want to rollup aggregation to parent level

I have the following records:
++++++++++++++++++++++
rid cid result timestamp
1 2 true t1
1 2 false t2
1 3 false t3
1 3 true t4
1 4 false t5
++++++++++++++++++++++
I need to aggregate them in two steps.
Step 1: for each unique combination of rid and cid, keep only the record with the most recent timestamp.
For example, I should get:
++++++++++latest result per rid and cid combination, by timestamp+++++++++++++
rid cid result timestamp
1 2 false t2
1 3 true t4
1 4 false t5
+++++++++++++++++++++++++
Step 2: once I have the most recent records from step 1, aggregate them at the rid level and get the pass and fail counts for each rid.
++++++final output++++++++
rid: 1
result_false: 2
result_true: 1
+++++++++++++++++
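
The two steps can be sketched in plain Python to pin down the expected result before writing the actual aggregation. This is only an illustration of the logic, not an Elasticsearch query; the function names are hypothetical, and the timestamps t1..t5 are assumed to be increasing, so they are replaced by the integers 1..5.

```python
# Sample records from the question; t1..t5 are assumed to increase, so 1..5 stand in for them.
RECORDS = [
    {"rid": 1, "cid": 2, "result": True, "timestamp": 1},
    {"rid": 1, "cid": 2, "result": False, "timestamp": 2},
    {"rid": 1, "cid": 3, "result": False, "timestamp": 3},
    {"rid": 1, "cid": 3, "result": True, "timestamp": 4},
    {"rid": 1, "cid": 4, "result": False, "timestamp": 5},
]


def latest_per_pair(records):
    """Step 1: for each (rid, cid) pair keep the record with the largest timestamp."""
    latest = {}
    for rec in records:
        key = (rec["rid"], rec["cid"])
        if key not in latest or rec["timestamp"] > latest[key]["timestamp"]:
            latest[key] = rec
    return list(latest.values())


def rollup_by_rid(records):
    """Step 2: count true/false results per rid over the latest records only."""
    counts = {}
    for rec in latest_per_pair(records):
        bucket = counts.setdefault(rec["rid"], {"result_true": 0, "result_false": 0})
        bucket["result_true" if rec["result"] else "result_false"] += 1
    return counts


if __name__ == "__main__":
    print(rollup_by_rid(RECORDS))
```

For the sample data this gives one true and two false results for rid 1, which matches the final output above.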

Related

Hierarchical query get all children as rows

Data:
ID PARENT_ID
1 [null]
2 1
3 1
4 2
Desired result:
ID CHILD_AT_ANY_LEVEL
1 2
1 3
1 4
2 4
I've tried SYS_CONNECT_BY_PATH, but I don't understand how to convert its result into an "inline view" that I can JOIN with the main table.
select connect_by_root(id) id, id child_at_any_level
from table
where level <> 1
connect by prior id = parent_id;
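
To check what the query above is expected to return, the same "every descendant of every node" expansion can be sketched in Python. This is a minimal illustration, assuming the hierarchy has no cycles; the function name descendant_pairs is hypothetical.

```python
from collections import defaultdict

# (ID, PARENT_ID) rows from the question; None stands for [null].
ROWS = [(1, None), (2, 1), (3, 1), (4, 2)]


def descendant_pairs(rows):
    """Return sorted (id, child_at_any_level) pairs, assuming an acyclic hierarchy."""
    children = defaultdict(list)
    for node_id, parent_id in rows:
        if parent_id is not None:
            children[parent_id].append(node_id)

    pairs = []
    for node_id, _ in rows:
        # Walk down from each node, visiting children, grandchildren, and so on.
        stack = list(children[node_id])
        while stack:
            child = stack.pop()
            pairs.append((node_id, child))
            stack.extend(children[child])
    return sorted(pairs)


if __name__ == "__main__":
    for pair in descendant_pairs(ROWS):
        print(*pair)
```

Each starting node plays the role of CONNECT_BY_ROOT, and skipping the node itself corresponds to the level <> 1 filter.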

Get Total count

I want to merge two columns (Sender and Receiver) into one, get the count of each transaction Type per number, and then join another table on the Sender/Receiver number, which is that table's primary id.
Sender Receiver Type Amount Date
773787639 777611388 1 300 2/1/2019
773631898 776806843 4 450 8/20/2019
773761571 777019819 6 369 2/11/2019
774295511 777084440 34 1000 1/22/2019
774263079 776816905 45 678 6/27/2019
774386894 777202863 12 2678 2/10/2019
773671537 777545555 14 38934 9/29/2019
774288117 777035194 18 21 4/22/2019
774242382 777132939 21 1275 9/30/2019
774144715 777049859 30 6309 7/4/2019
773911674 776938987 10 3528 5/1/2019
773397863 777548054 15 35892 7/6/2019
776816905 772345091 6 1234 7/7/2019
777035194 775623065 4 453454 7/20/2019
Second Table
Mobile_number Age
773787639 34
773787632 23
774288117 65
I am trying to get a table like this:
Sender/Receiver Type_1 Type_4 Type_12...... Type_45 Age
773787639 3 2 0 0 23
773631898 1 0 1 2 56
773397863 2 2 0 0 65
772345091 1 1 0 3 32
OK, I have seen your old question; you just need an inner join in the subquery, as follows:
SELECT
SenderReceiver,
COUNT(CASE WHEN Type = 1 THEN 1 END) AS Type_1,
COUNT(CASE WHEN Type = 2 THEN 1 END) AS Type_2,
COUNT(CASE WHEN Type = 3 THEN 1 END) AS Type_3,
...
COUNT(CASE WHEN Type = 45 THEN 1 END) AS Type_45,
Age -- changes here
FROM
( SELECT sr.SenderReceiver, sr.Type, st.Age from -- changes here
(SELECT Sender AS SenderReceiver, Type FROM yourTable
UNION ALL
SELECT Receiver, Type FROM yourTable) sr
join <second_table> st on st.Mobile_number = sr.SenderReceiver -- changes here
) t
GROUP BY
SenderReceiver,
Age; -- changes here
The changes to your previous query are marked with the comment -- changes here.
Please replace <second_table> with the actual name of your table.
Cheers!!
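
As a quick way to try the query shape above, here is a minimal sketch that runs it against an in-memory SQLite database from Python. SQLite only stands in for the real database, the table names transactions and subscribers are hypothetical, and only three of the transaction types are pivoted to keep it short. The rows are a subset of the sample data in the question.

```python
import sqlite3

QUERY = """
SELECT t.SenderReceiver,
       COUNT(CASE WHEN t.Type = 1  THEN 1 END) AS Type_1,
       COUNT(CASE WHEN t.Type = 4  THEN 1 END) AS Type_4,
       COUNT(CASE WHEN t.Type = 18 THEN 1 END) AS Type_18,
       t.Age
FROM (
    SELECT sr.SenderReceiver, sr.Type, st.Age
    FROM (
        SELECT Sender AS SenderReceiver, Type FROM transactions
        UNION ALL
        SELECT Receiver, Type FROM transactions
    ) sr
    JOIN subscribers st ON st.Mobile_number = sr.SenderReceiver
) t
GROUP BY t.SenderReceiver, t.Age
ORDER BY t.SenderReceiver
"""

# Subset of the question's sample rows: (Sender, Receiver, Type) and (Mobile_number, Age).
TRANSACTIONS = [
    (773787639, 777611388, 1),
    (774288117, 777035194, 18),
    (777035194, 775623065, 4),
]
SUBSCRIBERS = [(773787639, 34), (773787632, 23), (774288117, 65)]


def type_counts(transactions, subscribers):
    """Load the rows into an in-memory database and run the pivot query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE transactions (Sender INTEGER, Receiver INTEGER, Type INTEGER)")
    conn.execute("CREATE TABLE subscribers (Mobile_number INTEGER, Age INTEGER)")
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", transactions)
    conn.executemany("INSERT INTO subscribers VALUES (?, ?)", subscribers)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in type_counts(TRANSACTIONS, SUBSCRIBERS):
        print(row)
```

Because the join is an inner join, numbers that have no row in the second table drop out of the result; a LEFT JOIN would keep them with an empty Age.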

HIVE : Failed to breakup Windowing invocations into Groups. Invalid function LAG

I'm working in Hive, and I have a dataset like this:
client_id date nb_pts
1 2016-06-01 1
1 2016-06-02 3
1 2016-06-03 4
2 2016-06-01 2
2 2016-06-02 3
For each client, I need to output the difference between the current nb_pts and the previous nb_pts.
So my output should be:
client_id date nb_pts nb_pts_per_row
1 2016-06-01 1 1 (1-0)
1 2016-06-02 3 2 (3-1)
1 2016-06-03 4 1 (4-3)
2 2016-06-01 2 2 (2-0)
2 2016-06-02 3 1 (3-2)
I've tried to use the LAG function in Hive:
SELECT client_id, date, nb_pts,
nb_pts - (LAG(nb_pts, 1, 0) OVER (PARTITION BY client_id ORDER BY date ROWS 1 PRECEDING)) as nb_pts_per_row
FROM MyTable
But validation failed. It says:
Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies. Underlying error: Expecting left window frame boundary for function LAG((TOK_TABLE_OR_COL nb_pts), 1, 0) org.apache.hadoop.hive.ql.parse.WindowingSpec$WindowSpec#27a007cd as LAG_window_0 to be unbounded.
EDIT (SOLUTION):
It works without ROWS 1 PRECEDING:
SELECT client_id, date, nb_pts,
nb_pts - (LAG(nb_pts, 1, 0) OVER (PARTITION BY client_id ORDER BY date)) as nb_pts_per_row
FROM MyTable
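
LAG already looks back by a row offset within the ordered partition, so no frame clause is needed to get the previous row. The intended semantics of LAG(nb_pts, 1, 0) per client can be sketched in Python as follows; the function name points_per_row is hypothetical, and the dates are assumed to be ISO strings so that they sort correctly as text.

```python
from itertools import groupby
from operator import itemgetter

# (client_id, date, nb_pts) rows from the question.
ROWS = [
    (1, "2016-06-01", 1),
    (1, "2016-06-02", 3),
    (1, "2016-06-03", 4),
    (2, "2016-06-01", 2),
    (2, "2016-06-02", 3),
]


def points_per_row(rows):
    """Emulate nb_pts - LAG(nb_pts, 1, 0) OVER (PARTITION BY client_id ORDER BY date)."""
    ordered = sorted(rows, key=itemgetter(0, 1))  # partition key first, then date
    out = []
    for _, group in groupby(ordered, key=itemgetter(0)):
        previous = 0  # the default value of LAG for the first row of each client
        for client_id, date, nb_pts in group:
            out.append((client_id, date, nb_pts, nb_pts - previous))
            previous = nb_pts
    return out


if __name__ == "__main__":
    for row in points_per_row(ROWS):
        print(*row)
```

The last column comes out as 1, 2, 1 for client 1 and 2, 1 for client 2, as in the expected output above.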

SAS Sorting within group

I would like to sort this data by descending number of events and then by date, latest first, while keeping the rows grouped by ID.
I have tried proc sql:
proc sql;
create table new as
select *
from old
group by ID
order by events desc, date desc;
quit;
The result I currently get is
ID Date Events
1 09/10/2015 3
1 27/06/2014 3
1 03/01/2014 3
2 09/11/2015 2
3 01/01/2015 2
2 16/10/2014 2
3 08/12/2013 2
4 08/10/2015 1
5 09/11/2014 1
6 02/02/2013 1
Although the dates and events are sorted in descending order, the IDs with multiple events are no longer grouped together.
Would it be possible to achieve the below in fewer steps?
ID Date Events
1 09/10/2015 3
1 27/06/2014 3
1 03/01/2014 3
3 01/01/2015 2
3 08/12/2013 2
2 09/11/2015 2
2 16/10/2014 2
4 08/10/2015 1
5 09/11/2014 1
6 02/02/2013 1
Thanks
It looks to me like you're trying to sort by descending events, then by either the earliest or the latest date of each ID (I can't tell which from your explanation), also descending, and then by ID. In your proc sql query, you could calculate the min or max of the Date variable, grouped by events and ID, and then sort the result by descending events, descending min/max date, and ID.
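
The sort key suggested above can be sketched in Python. This is a minimal illustration, assuming the dates are in dd/mm/yyyy format and that the latest date per ID (the max) is the one to use; the function name sort_within_groups is hypothetical. With that assumption ID 2 comes before ID 3, which differs from the sample in the question, where the intended order of those two groups is ambiguous.

```python
from datetime import datetime

# (ID, Date, Events) rows from the question, dates assumed to be dd/mm/yyyy.
ROWS = [
    (1, "09/10/2015", 3), (1, "27/06/2014", 3), (1, "03/01/2014", 3),
    (2, "09/11/2015", 2), (3, "01/01/2015", 2), (2, "16/10/2014", 2),
    (3, "08/12/2013", 2), (4, "08/10/2015", 1), (5, "09/11/2014", 1),
    (6, "02/02/2013", 1),
]


def sort_within_groups(rows):
    """Sort by events desc, latest date of the ID desc, ID, then date desc."""
    parsed = [(id_, datetime.strptime(date, "%d/%m/%Y"), events) for id_, date, events in rows]

    # Latest date per ID: the group-level value that keeps an ID's rows together.
    latest = {}
    for id_, date, _ in parsed:
        if id_ not in latest or date > latest[id_]:
            latest[id_] = date

    ordered = sorted(
        parsed,
        key=lambda r: (-r[2], -latest[r[0]].toordinal(), r[0], -r[1].toordinal()),
    )
    return [(id_, date.strftime("%d/%m/%Y"), events) for id_, date, events in ordered]


if __name__ == "__main__":
    for row in sort_within_groups(ROWS):
        print(*row)
```

Sorting on a per-ID value before the row-level date is what keeps the rows of one ID adjacent; swapping max for min, or descending for ascending, only changes the order of the groups.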

oracle left outer joins not showing null values but displays same value

The problem is with a left outer join: when there are no matching rows in the right-side table, it does not display null values but repeats the previous values instead.
Like this:
The first table contains:
PGMTX_CODE PGMTX_MARKS PGMTX_TOTQSTN
-------------------------------------------
EE 1 5
EE 2 5
EE 3 0
EE 4 0
The second table contains:
PGMTX_CODE PGMTX_MARKS PGMTX_ACTUSEDQST
-------------------------------------------
EE 1 5
So I want a result like this:
PGMTX_MARKS PGMTX_TOTQSTN PGMTX_ACTUSEDQST
--------------------------------------------------
1 5 5
2 5 blank
3 0 blank
4 0 blank
I use a query like this:
SELECT m.PGMTX_MARKS,
m.PGMTX_TOTQSTN,
tlm.PGMTX_ACTUSEDQST
FROM PAPERGEN_MTL_OEX m
LEFT OUTER JOIN PAPERGEN_TLMTL_OEX tlm
ON m.PGMTX_CODE = tlm.PGMTX_CODE
WHERE m.PGMTX_CODE = 'EE'
ORDER BY m.PGMTX_MARKS
But I got a result like this:
PGMTX_MARKS PGMTX_TOTQSTN PGMTX_ACTUSEDQST
--------------------------------------------------
1 5 5
2 5 5
3 0 5
4 0 5
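Your join condition is wrong. Joining on PGMTX_CODE alone matches the single 'EE' row of the second table to every 'EE' row of the first, so it should be:
ON m.PGMTX_CODE = tlm.PGMTX_CODE AND m.PGMTX_MARKS = tlm.PGMTX_MARKS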
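
The effect of the extra join condition can be checked with a small sketch that runs the corrected query against an in-memory SQLite database from Python. SQLite stands in for Oracle here, which is an assumption; the table and column names are the ones from the question, and the function name used_questions is hypothetical.

```python
import sqlite3

SETUP = """
CREATE TABLE PAPERGEN_MTL_OEX (PGMTX_CODE TEXT, PGMTX_MARKS INTEGER, PGMTX_TOTQSTN INTEGER);
CREATE TABLE PAPERGEN_TLMTL_OEX (PGMTX_CODE TEXT, PGMTX_MARKS INTEGER, PGMTX_ACTUSEDQST INTEGER);
INSERT INTO PAPERGEN_MTL_OEX VALUES ('EE', 1, 5), ('EE', 2, 5), ('EE', 3, 0), ('EE', 4, 0);
INSERT INTO PAPERGEN_TLMTL_OEX VALUES ('EE', 1, 5);
"""

QUERY = """
SELECT m.PGMTX_MARKS, m.PGMTX_TOTQSTN, tlm.PGMTX_ACTUSEDQST
FROM PAPERGEN_MTL_OEX m
LEFT OUTER JOIN PAPERGEN_TLMTL_OEX tlm
  ON m.PGMTX_CODE = tlm.PGMTX_CODE
 AND m.PGMTX_MARKS = tlm.PGMTX_MARKS
WHERE m.PGMTX_CODE = ?
ORDER BY m.PGMTX_MARKS
"""


def used_questions(code):
    """Run the corrected left join for one PGMTX_CODE; unmatched rows yield None."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    rows = conn.execute(QUERY, (code,)).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in used_questions("EE"):
        print(row)
```

Only the row with PGMTX_MARKS = 1 finds a match; the other three rows come back with None (NULL) in the last column, which is the "blank" the question asks for.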
