Hive sum query needed? - hadoop

I have data set like below:
PIC_NUMBER|C_DATE|OR_QUANTITY
1|2017-03-01|10
1|2017-03-02|11
1|2017-03-03|12
1|2017-03-04|13
1|2017-03-05|14
1|2017-03-06|15
1|2017-03-07|16
2|2017-03-02|20
2|2017-03-04|13
2|2017-03-05|14
3|2017-03-02|5
3|2017-03-03|6
3|2017-03-05|7
3|2017-03-06|8
3|2017-03-07|9
4|2017-03-01|10
4|2017-03-02|11
4|2017-03-03|12
4|2017-03-04|13
4|2017-03-05|14
4|2017-03-06|15
4|2017-03-07|16
1|2017-03-08|20
1|2017-03-09|21
1|2017-03-10|22
1|2017-03-11|23
1|2017-03-12|24
1|2017-03-13|25
1|2017-03-14|26
2|2017-03-08|30
2|2017-03-09|31
2|2017-03-10|32
2|2017-03-11|33
2|2017-03-12|34
2|2017-03-13|35
2|2017-03-14|36
3|2017-03-08|30
3|2017-03-09|31
3|2017-03-12|34
3|2017-03-14|36
4|2017-03-08|20
4|2017-03-09|21
4|2017-03-10|22
4|2017-03-11|23
4|2017-03-12|24
4|2017-03-13|25
4|2017-03-14|26
I want to sum OR_QUANTITY per PIC_NUMBER so that each row's total excludes the OR_QUANTITY of earlier dates: every row should show the sum for its own C_DATE and all later dates of the same PIC_NUMBER.
Example result set is:
PIC_NUMBER|C_DATE|SUM_OR_QUANTITY
1|2017-03-01|252
1|2017-03-02|242
1|2017-03-03|231
1|2017-03-04|219
1|2017-03-05|206
1|2017-03-06|192
1|2017-03-07|177
2|2017-03-02|278
2|2017-03-04|258
2|2017-03-05|245
3|2017-03-02|166
3|2017-03-03|161
3|2017-03-05|155
3|2017-03-06|148
3|2017-03-07|140
4|2017-03-01|252
4|2017-03-02|242
4|2017-03-03|231
4|2017-03-04|219
4|2017-03-05|206
4|2017-03-06|192
4|2017-03-07|177
1|2017-03-08|161
1|2017-03-09|141
1|2017-03-10|120
1|2017-03-11|98
1|2017-03-12|75
1|2017-03-13|51
1|2017-03-14|26
2|2017-03-08|231
2|2017-03-09|201
2|2017-03-10|170
2|2017-03-11|138
2|2017-03-12|105
2|2017-03-13|71
2|2017-03-14|36
3|2017-03-08|131
3|2017-03-09|101
3|2017-03-12|70
3|2017-03-14|36
4|2017-03-08|161
4|2017-03-09|141
4|2017-03-10|120
4|2017-03-11|98
4|2017-03-12|75
4|2017-03-13|51
4|2017-03-14|26
Can we write recursive functions in Hive for this aggregation?

This will give the desired result (OR_QUANTITY is qualified with a. because both aliases expose that column):
select PIC_NUMBER, val1, sum(OR_QUANTITY) from
( select a.PIC_NUMBER, a.C_DATE, a.OR_QUANTITY,
case when (a.C_DATE >= temp.C_DATE) then temp.C_DATE else null end as val1
from table_name a, table_name temp
where temp.PIC_NUMBER = a.PIC_NUMBER ) temp1
where val1 is not null
group by PIC_NUMBER, val1
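As a quick sanity check, the self-join above can be run against the rows for PIC_NUMBER 3 from the question. This is only a sketch: it uses Python's sqlite3 module as a stand-in for Hive, and it assumes OR_QUANTITY is read from alias a. The ORDER BY is added here just to make the output deterministic.

```python
import sqlite3

# Rows for PIC_NUMBER 3, copied from the question's data set.
rows = [
    (3, "2017-03-02", 5), (3, "2017-03-03", 6), (3, "2017-03-05", 7),
    (3, "2017-03-06", 8), (3, "2017-03-07", 9), (3, "2017-03-08", 30),
    (3, "2017-03-09", 31), (3, "2017-03-12", 34), (3, "2017-03-14", 36),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_name (PIC_NUMBER INTEGER, C_DATE TEXT, OR_QUANTITY INTEGER)"
)
conn.executemany("INSERT INTO table_name VALUES (?, ?, ?)", rows)

# Same shape as the answer's query; a.OR_QUANTITY is an assumption (see lead-in).
query = """
select PIC_NUMBER, val1, sum(OR_QUANTITY) from
( select a.PIC_NUMBER, a.C_DATE, a.OR_QUANTITY,
         case when (a.C_DATE >= temp.C_DATE) then temp.C_DATE else null end as val1
  from table_name a, table_name temp
  where temp.PIC_NUMBER = a.PIC_NUMBER ) temp1
where val1 is not null
group by PIC_NUMBER, val1
order by PIC_NUMBER, val1
"""
result = conn.execute(query).fetchall()

for row in result:
    print("|".join(str(col) for col in row))
```

ISO-formatted date strings compare correctly as text, which is why C_DATE is stored as TEXT in this sketch.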

Related

ORA-00904: "S"."AIR_TIME": invalid identifier

Why does this code report an invalid identifier when SUM is applied to the DISTANCE and AIR_TIME columns?
Without SUM the statement runs successfully, but with SUM I get the error. I need to use SUM in this statement.
MERGE INTO FACT_COMPANY_GROWTH F
USING (SELECT DISTINCT TIME_ID, FLIGHT_KEY, AEROPLANE_KEY, SUM(DISTANCE) AS TOTAL_DISTANCE, SUM(AIR_TIME) AS TOTAL_AIRTIME
FROM TRANSFORM_FLIGHT T
INNER JOIN TRANSFORM_AEROPLANE A
ON T.FK_AEROPLANE_KEY = A.AEROPLANE_KEY
INNER JOIN DIM_TIME D
ON D.YEAR = T.YEAR
AND D.MONTH = T.MONTH
GROUP BY TIME_ID, FLIGHT_KEY, AEROPLANE_KEY) S
ON (F.FK1_TIME_ID = S.TIME_ID
AND F.FK2_FLIGHT_KEY = S.FLIGHT_KEY
AND F.FK3_AEROPLANE_KEY = S.AEROPLANE_KEY
)
WHEN MATCHED THEN
UPDATE SET
F.TOTAL_AIRTIME = S.AIR_TIME,
F.TOTAL_DISTANCE = S.DISTANCE,
F.TOTAL_NO_OF_FLIGHTS = S.FLIGHT_KEY,
F.TOTAL_NO_OF_AEROPLANE = S.AEROPLANE_KEY
WHEN NOT MATCHED THEN
INSERT(FACT_ID, FK1_TIME_ID, FK2_FLIGHT_KEY, FK3_AEROPLANE_KEY, TOTAL_DISTANCE, TOTAL_AIRTIME, TOTAL_NO_OF_FLIGHTS, TOTAL_NO_OF_AEROPLANE)
VALUES
(NULL, S.TIME_ID, S.FLIGHT_KEY, S.AEROPLANE_KEY, S.DISTANCE, S.AIR_TIME, S.FLIGHT_KEY, S.AEROPLANE_KEY);
USING(
SELECT DISTINCT
TIME_ID,
FLIGHT_KEY,
AEROPLANE_KEY,
SUM(DISTANCE) AS TOTAL_DISTANCE,
SUM(AIR_TIME) AS TOTAL_AIRTIME
...) S
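The problem is at UPDATE SET F.TOTAL_AIRTIME = S.AIR_TIME. S defines five columns (TIME_ID, FLIGHT_KEY, AEROPLANE_KEY, TOTAL_DISTANCE, TOTAL_AIRTIME) and none is named AIR_TIME, so the outer statement has to use the aliases. The same applies to S.DISTANCE, both here and in the INSERT ... VALUES list.
UPDATE SET
F.TOTAL_AIRTIME = S.TOTAL_AIRTIME,
F.TOTAL_DISTANCE = S.TOTAL_DISTANCE,
...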
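The rule behind this fix is that an outer statement only sees the column names a subquery exposes. Here is a minimal sketch of that rule, using Python's sqlite3 module rather than Oracle, with a made-up two-column table. SQLite words the error differently from ORA-00904, but the cause is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, cut-down stand-in for TRANSFORM_FLIGHT.
conn.execute("CREATE TABLE transform_flight (flight_key INTEGER, air_time INTEGER)")
conn.executemany(
    "INSERT INTO transform_flight VALUES (?, ?)", [(1, 50), (1, 70), (2, 30)]
)

# The subquery exposes only flight_key and total_airtime.
subquery = """
(SELECT flight_key, SUM(air_time) AS total_airtime
 FROM transform_flight
 GROUP BY flight_key) s
"""

# Referencing the underlying column name through the alias s fails.
try:
    conn.execute(f"SELECT s.air_time FROM {subquery}")
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

# Referencing the alias defined inside the subquery works.
totals = conn.execute(
    f"SELECT s.flight_key, s.total_airtime FROM {subquery} ORDER BY s.flight_key"
).fetchall()

print(error)
print(totals)
```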

SQLRPGLE & JSON_OBJECT CTE Statements -101 Error

This program compiles correctly (we are on V7R3), but at run time it fails with an SQLCOD of -101 and an SQLSTATE of 54011, which states: Too many columns were specified for a table, view, or table function. The JSON being created is very small, so I do not think that is the issue.
The RPGLE code:
dcl-s OutFile sqltype(dbclob_file);
xfil_tofile = '/ServiceID-REFCODJ.json';
Clear OutFile;
OutFile_Name = %TrimR(XFil_ToFile);
OutFile_NL = %Len(%TrimR(OutFile_Name));
OutFile_FO = IFSFileCreate;
OutFile_FO = IFSFileOverWrite;
exec sql
With elm (erpRef) as (select json_object
('ServiceID' VALUE trim(s.ServiceID),
'ERPReferenceID' VALUE trim(i.RefCod) )
FROM PADIMH I
INNER JOIN PADGUIDS G ON G.REFCOD = I.REFCOD
INNER JOIN PADSERV S ON S.GUID = G.GUID
WHERE G.XMLTYPE = 'Service')
, arr (arrDta) as (values json_array (
select erpRef from elm format json))
, erpReferences (refs) as ( select json_object ('erpReferences' :
arrDta Format json) from arr)
, headerData (hdrData) as (select json_object(
'InstanceName' : trim(Cntry) )
from padxmlhdr
where cntry = 'US')
VALUES (
select json_object('header' : hdrData format json,
'erpReferenceData' value refs format json)
from headerData, erpReferences )
INTO :OutFile;
Any help with this would be very much appreciated, this is our first attempt at creating JSON for sending and have not experienced this issue before.
Thanks,
John
I am sorry for the delay in getting back to this issue. It has been corrected, the issue was with the "values" statement.
This is the correct code needed to make it work correctly:
Select json_object('header' : hdrData format json,
'erpReferenceData' value refs format json)
INTO :OutFile
From headerData, erpReferences;

Error with group by expression?

SELECT DISTINCT dha.order_number,
dha.ordered_date,
dha.org_id,
houf.name Business_Unit,
dha.TRANSACTIONAL_CURRENCY_CODE,
dla.ORDERED_QTY,
dla.UNIT_SELLING_PRICE,
esib.item_number,
esit.description,
hca.account_name,
hzps.PARTY_SITE_NAME,
cicev.UNIT_COST_AVERAGE,
MAX (cicev.cost_date) AS MaxDate
FROM DOO_HEADERS_ALL dha,
DOO_PRICE_ADJUSTMENTS dpa,
hr_organization_units_f_tl houf,
DOO_LINES_ALL dla,
DOO_FULFILL_LINES_ALL dfla,
egp_system_items_b esib,
egp_system_items_tl esit,
hz_cust_accounts hca,
HZ_party_sites hzps,
CST_ITEM_COST_ELEMENTS_V cicev
WHERE dha.org_id = houf.organization_id
AND dha.header_id = dla.header_id
AND esib.inventory_item_id = dfla.inventory_item_id
AND dfla.inventory_item_id = esit.INVENTORY_ITEM_ID
AND dfla.header_id = dha.header_id
AND dha.SOLD_TO_PARTY_ID = hca.party_id
AND dha.SOLD_TO_PARTY_ID = hzps.party_id
AND esit.inventory_item_id = cicev.inventory_item_id
GROUP BY (dha.order_number)
All columns that aren't aggregated must be included in the GROUP BY clause, so it should look like this (note that column aliases must be removed):
GROUP BY dha.order_number,
dha.ordered_date,
dha.org_id,
houf.name,
dha.TRANSACTIONAL_CURRENCY_CODE,
dla.ORDERED_QTY,
dla.UNIT_SELLING_PRICE,
esib.item_number,
esit.description,
hca.account_name,
hzps.PARTY_SITE_NAME,
cicev.UNIT_COST_AVERAGE
Apart from that, there's no need for DISTINCT (in SELECT) because GROUP BY will select distinct values anyway.
[EDIT: how to include a new condition into the WHERE clause?]
select ...
from ...
where cicev.cost_date = (select max(cicev1.cost_date)
from CST_ITEM_COST_ELEMENTS_V cicev1
where cicev1.inventory_item_id = cicev.inventory_item_id
)
and ...
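The correlated subquery in the edit keeps, for each item, only the row with the latest cost_date. A small sketch of that filter follows, run through Python's sqlite3 module on a hypothetical item_costs table that stands in for CST_ITEM_COST_ELEMENTS_V. The table name and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for CST_ITEM_COST_ELEMENTS_V.
conn.execute(
    "CREATE TABLE item_costs (inventory_item_id INTEGER, cost_date TEXT, unit_cost REAL)"
)
conn.executemany(
    "INSERT INTO item_costs VALUES (?, ?, ?)",
    [(1, "2020-01-01", 10.0), (1, "2020-02-01", 12.0), (2, "2020-01-15", 5.0)],
)

# Keep only the row whose cost_date is the maximum for its item.
latest = conn.execute(
    """
    SELECT c.inventory_item_id, c.cost_date, c.unit_cost
    FROM item_costs c
    WHERE c.cost_date = (SELECT MAX(c1.cost_date)
                         FROM item_costs c1
                         WHERE c1.inventory_item_id = c.inventory_item_id)
    ORDER BY c.inventory_item_id
    """
).fetchall()

print(latest)
```

With this filter in the WHERE clause, MAX(cost_date) no longer needs to appear in the SELECT list, so the query does not need a GROUP BY for it.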

Suppress ORA-01403: no data found exception

I have the following code
SELECT SUM(nvl(book_value,
0))
INTO v_balance
FROM account_details
WHERE currency = 'UGX';
--Write the balance away
SELECT SUM(nvl(book_value,
0))
INTO v_balance
FROM account_details
WHERE currency = 'USD';
--Write the balance away
Now the problem is, there might not be data in the table for that specific currency, but there might be data for the 'USD' currency. So basically I want to select the sum into my variable and if there is no data I want my stored proc to continue and not throw the 01403 exception.
I don't want to wrap every SELECT INTO statement in a BEGIN ... EXCEPTION ... END block either, so is there some way to suppress the exception and just leave the v_balance variable in an undefined (NULL) state without the need for exception blocks?
select nvl(balance,0)
into v_balance
from
(
select sum(nvl(book_value,0)) as balance
from account_details
where currency = 'UGX'
);
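Note that an aggregate query without a GROUP BY returns exactly one row even when no rows match the WHERE clause, and SUM over no rows is NULL. A SELECT SUM(...) INTO of this shape therefore should not raise ORA-01403 to begin with, and the NVL on the outside only turns that NULL into 0. The sketch below shows the one-row behaviour using Python's sqlite3 module as a stand-in for Oracle. COALESCE plays the role of NVL and the sample row is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account_details (currency TEXT, book_value REAL)")
# Only USD data exists; there are no UGX rows at all.
conn.execute("INSERT INTO account_details VALUES ('USD', 100.0)")

# Aggregate without GROUP BY: one row comes back, holding NULL (None in Python).
rows = conn.execute(
    "SELECT SUM(book_value) FROM account_details WHERE currency = 'UGX'"
).fetchall()

# Wrapping the aggregate converts the NULL into 0.
balance = conn.execute(
    "SELECT COALESCE(SUM(book_value), 0) FROM account_details WHERE currency = 'UGX'"
).fetchone()[0]

print(rows)
print(balance)
```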
SELECT L1.PKCODE L1CD, L1.NAME L1N, L1.LVL L1LVL,
L2.PKCODE L2CD, L2.NAME L2N, L2.LVL L2LVL,
L5.PKCODE L5CD, L5.NAME L5N,
INFOTBLM.OPBAL ( L5.PKCODE, :PSTDT, :PSTUC, :PENUC, :PSTVT, :PENVT ) OPBAL,
INFOTBLM.DEBIT ( L5.PKCODE, :PSTDT,:PENDT, :PSTUC, :PENUC, :PSTVT, :PENVT ) AMNTDR,
INFOTBLM.CREDIT ( L5.PKCODE, :PSTDT,:PENDT, :PSTUC, :PENUC, :PSTVT, :PENVT ) AMNTCR
FROM FSLVL L1, FSLVL L2, FSMAST L5
WHERE L2.FKCODE = L1.PKCODE
AND L5.FKCODE = L2.PKCODE
AND L5.PKCODE Between :PSTCD AND NVL(:PENCD,:PSTCD)
GROUP BY L1.PKCODE , L1.NAME , L1.LVL ,
L2.PKCODE , L2.NAME , L2.LVL ,
L5.PKCODE , L5.NAME
ORDER BY L1.PKCODE, L2.PKCODE, L5.PKCODE

Problem in oracle query

Hi guys,
I have a query in which I need to interchange the values of two fields.
The query is as follows:
SELECT TO_DATE(A.G_LEDGER_DATE,'dd/mm/YYY')as G_LEDGER_DATE,C.ACC_MASTER_NAME,
A.G_LEDGER_REF_NO ,
NVL(CASE WHEN B.G_LEDGER_SECTION = 1 THEN
CASE WHEN
(SELECT COUNT(*)FROM SOSTRANS.ACC_GEN_LEDGER WHERE G_LEDGER_SECTION = B.G_LEDGER_SECTION AND G_LEDGER_ID = B.G_LEDGER_ID)> 1 THEN
B.G_LEDGER_VALUE ELSE A.G_LEDGER_VALUE END END,0) AS G_LEDGER_DR_VALUE,
NVL(CASE WHEN B.G_LEDGER_SECTION = -1 THEN
CASE WHEN
(SELECT COUNT(*) FROM SOSTRANS.ACC_GEN_LEDGER WHERE G_LEDGER_SECTION = B.G_LEDGER_SECTION AND G_LEDGER_ID = B.G_LEDGER_ID)> 1 THEN
B.G_LEDGER_VALUE ELSE A.G_LEDGER_VALUE END END,0) AS G_LEDGER_CR_VALUE,
B.G_LEDGER_SECTION,C.ACC_MASTER_ID,SUBSTR(A.G_LEDGER_REF_NO,0,3) AS Types,'Z' as OrderChar ,
CASE WHEN A.G_LEDGER_REMARK IS NULL THEN B.G_LEDGER_REMARK ELSE A.G_LEDGER_REMARK END AS Narration
FROM SOSTRANS.ACC_GEN_LEDGER A
LEFT OUTER JOIN SOSTRANS.ACC_GEN_LEDGER B ON A.G_LEDGER_ID = B.G_LEDGER_ID
LEFT OUTER JOIN SOSMASTER.ACC_ACCOUNT_MASTER C ON A.ACC_MASTER_ID = C.ACC_MASTER_ID WHERE A.G_LEDGER_CANCEL='N' AND
B.ACC_MASTER_ID = 'MSOS000001' AND
A.ACC_MASTER_ID <> 'MSOS000001' AND
A.G_LEDGER_SECTION <> B.G_LEDGER_SECTION AND
A.G_LEDGER_DATE >= '25/sep/2009' AND
A.G_LEDGER_DATE<='26/sep/2009'
ORDER BY OrderChar,G_LEDGER_DATE
Now I get the output as
... G_LEDGER_DR_VALUE G_LEDGER_CR_VALUE .....
... 2000 0 .....
... 3000 0 .....
... -1000 0 .....
I need any negative value on the G_LEDGER_DR_VALUE side to appear in G_LEDGER_CR_VALUE instead, and if a negative value exists in G_LEDGER_CR_VALUE it should appear in the G_LEDGER_DR_VALUE field.
Can anyone help me solve this?
If I understood your question well, you select a value (that I will call g_ledger_value) that you want to appear in a different column depending on its sign.
This is how I would do it :
SELECT
CASE WHEN t.g_ledger_value>0 THEN t.g_ledger_value ELSE 0 END AS g_ledger_dr_value,
CASE WHEN t.g_ledger_value<0 THEN t.g_ledger_value ELSE 0 END AS g_ledger_cr_value
FROM
(SELECT g_ledger_value FROM mytable) t;
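To see what this produces, here is a small sketch that runs the same two CASE expressions on the three values from the question's sample output (2000, 3000, -1000). It uses Python's sqlite3 module instead of Oracle, and mytable is the placeholder name from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (g_ledger_value INTEGER)")
conn.executemany("INSERT INTO mytable VALUES (?)", [(2000,), (3000,), (-1000,)])

# Positive values land in the DR column, negative values in the CR column.
split = conn.execute(
    """
    SELECT
      CASE WHEN t.g_ledger_value > 0 THEN t.g_ledger_value ELSE 0 END AS g_ledger_dr_value,
      CASE WHEN t.g_ledger_value < 0 THEN t.g_ledger_value ELSE 0 END AS g_ledger_cr_value
    FROM (SELECT g_ledger_value FROM mytable) t
    """
).fetchall()

for dr, cr in split:
    print(dr, cr)
```

The negative value keeps its sign in the CR column. If it should show as a positive amount there, wrapping it in ABS() would be one option.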
It sounds like a combination of SIGN() and CASE is what you need ...
CASE WHEN SIGN(G_LEDGER_DR_VALUE) = -1 then ...
ELSE ...
END
etc
SELECT G_LEDGER_DR_VALUE,
CASE WHEN G_LEDGER_DR_VALUE < 0
THEN G_LEDGER_CR_VALUE
ELSE G_LEDGER_DR_VALUE
END
FROM (...)
Is that what you mean? I suggest calculating the values of CR_VALUE and DR_VALUE in a subquery, and then using a CASE in the wrapping query to return the correct value.
