How to use the Hive Set statement in SAS SQL? - hadoop

I am not familiar with SAS SQL, but one of our users is struggling with the syntax:
PROC SQL;
CONNECT TO IMPALA (USER="&TDUSER" PW="&***" DSN="BIGDATA" DATABASE=abc);
CREATE TABLE SIL_MONITORED AS SELECT * FROM CONNECTION TO IMPALA
(
SELECT DISTINCT a.partyid, a.baselinerecordstatuscode,
       cast(b.cin as decimal(10)) as cin, a.business_date
/* cast(unix_timestamp(a.business_date, "yyyy-MM-dd") as timestamp) as business_date */
FROM abc.baseline_party as a
LEFT JOIN abc.baseline_relationship as b
  ON a.partyid = b.partyid
WHERE a.business_date = (select max(business_date) from abc.baseline_party)
  AND upper(a.baselinerecordstatuscode) = 'MONITORED'
);
The recommendation from the Big Data side is to use the property below to overcome the scratch-limit issue:
set mem_limit=1g
but we aren't sure how to incorporate it on the SAS client side to make it work. In Hue it can be set at the session level, but not in SAS.
He tried the following for the other property (SCRATCH_LIMIT), but it was ignored on the Big Data side:
PROC SQL;
CONNECT TO IMPALA (USER="&TDUSER" PW="&***" DSN="BIGDATAPROD" DATABASE=abc
                   conopts='SCRATCH_LIMIT=200g');
CREATE TABLE SIL_MONITORED AS SELECT * FROM CONNECTION TO IMPALA
What's the right way to make
set mem_limit=1g
work with the above SQL from the SAS side?
Thank you!
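One approach that may work (a sketch, untested): use the explicit pass-through EXECUTE statement to send the SET to Impala over the same connection before the query runs. Whether the option actually sticks for the session depends on the driver keeping that one connection open.

PROC SQL;
CONNECT TO IMPALA (USER="&TDUSER" PW="&***" DSN="BIGDATA" DATABASE=abc);
/* send the session option through the existing connection */
EXECUTE (set mem_limit=1g) BY IMPALA;
CREATE TABLE SIL_MONITORED AS SELECT * FROM CONNECTION TO IMPALA
( /* same query as above */ );
DISCONNECT FROM IMPALA;
QUIT;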

Related

ORA-30483: window functions are not allowed here in ODI Mapping

I am working on an ODI mapping where I calculate "Min(ID) over (partition by device_num, sys_id) as min_id" in an expression component. I used another expression component to filter duplicates using row_number() over (partition by ID order by min_id), followed by a filter component "rownum=1". This results in a "window functions are not allowed here" error.
I understand that I need to run the analytic function on top of the aggregated results, but I am not sure how to achieve this in an ODI mapping (ODI 12c). Can anyone please guide me?
merge into (
  select /*+ */ *
  from target_base.tgt_table
  where (1=1)
) TGT
using (
  select /*+ */
    RESULT2.ID_1 AS ID,
    RESULT2.COL AS MIN_ID
  from (
    SELECT
      RESULT1.ID AS ID,
      RESULT1.DEVICE__NUM AS DEVICE__NUM,
      RESULT1.SYS_ID AS SYS_ID,
      MIN(RESULT1.ID) OVER (PARTITION BY RESULT1.DEVICE__NUM, RESULT1.SYS_ID) AS COL,
      ROW_NUMBER() OVER (PARTITION BY RESULT1.ID ORDER BY MIN(RESULT1.ID) OVER (PARTITION BY RESULT1.DEVICE__NUM, RESULT1.SYS_ID) DESC) AS COL_1
      -- WINDOW FUNCTION ERROR
    FROM (
      select * from union_table
    ) RESULT1
  ) RESULT2
  where (1=1)
    and (RESULT2.COL_1 = 1)
) SRC
on (TGT.ID = SRC.ID)
when matched then update set
  TGT.COMMON_ID = SRC.MIN_ID,
  TGT.REC_UPDATE = SYSDATE
where (
  DECODE(TGT.COMMON_ID, SRC.COMMON_ID, 0, 1) > 0
)
UNION_TABLE has data as per the table below:
ID  device_num  sys_id
1   A           5
2   B           15
3   C           25
4   D           35
5   A           10
5   A           5
6   B           15
6   B           20
7   C           25
7   C           30
8   D           35
8   D           40
Expected output: the IDs where row_number = 1 will be updated in the target.
[screenshot: ODI Mapping]
This is a very complex use case to model in ODI, and the parser might not understand what you are trying to achieve.
My advice would be to write the difficult part of the query manually in SQL and use it as a source in ODI. Here is how to do it:
In the physical design of your mapping, click on your source table. In the property pane, go to the Extract Options. You can then paste your SQL as the value of the option CUSTOMER_TEMPLATE.
Of course it hides a bit of the mapping's logic, so it shouldn't be used everywhere, but for complex use cases like this one it is an easy way to get the job done. I personally always add a memo on mappings with custom SQL so other developers can quickly see it.
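For instance, the hand-written source query could layer the analytics so that ROW_NUMBER() only orders by a column that has already been computed, which is exactly what ORA-30483 complains about. A sketch using the table and column names from the question:

SELECT id, min_id
FROM (
    SELECT id, min_id,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY min_id DESC) AS rn
    FROM (
        -- inner level: compute the analytic MIN first
        SELECT id, device_num, sys_id,
               MIN(id) OVER (PARTITION BY device_num, sys_id) AS min_id
        FROM union_table
    )
)
WHERE rn = 1;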
Try using the IKM Oracle Incremental Update on the target table in place of IKM Oracle Merge:
Physical -> click target table -> Integration Knowledge Module -> Oracle Incremental Update

Error executing view: Function count is not executable in denodo

I have a table with a few columns, as shown below. I would like to get a count of all the records at the week level, but I am unable to count it. I could also use GROUP BY, but I do not want to do that because it gives me too many records. I use Denodo and Oracle 18c.
s_id sub_id week year st_id
24hifew njfhwf 50 2020 ew1eer
939hjefbw newfkhwfe 34 2019 e3eef3
hewfhwe23 67832ghef 44 2018 ewfwf1
Code:
select
    xx.s_id,
    xx.sub_id,
    xx.st_id,
    yy.week,
    yy.year,
    count(*) OVER (PARTITION BY yy.year, yy.week, xx.s_id, xx.sub_id, xx.st_id) as week_l
from xx as xx left join yy as yy
Basically, I am looking for an equivalent to this PARTITION BY query that will run fine.
Error:
finished with error: Error executing view: Function count is not executable
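A workaround sketch (untested; the join condition between xx and yy is not shown in the question, so s_id is used here as a hypothetical key): pre-aggregate the counts in a derived table with GROUP BY, then join them back so every detail row keeps its weekly count, which is equivalent to COUNT(*) OVER (PARTITION BY ...).

SELECT xx.s_id, xx.sub_id, xx.st_id, yy.week, yy.year, cnt.week_l
FROM xx
LEFT JOIN yy ON yy.s_id = xx.s_id              -- hypothetical join key
LEFT JOIN (
    SELECT yy.year, yy.week, xx.s_id, xx.sub_id, xx.st_id, COUNT(*) AS week_l
    FROM xx LEFT JOIN yy ON yy.s_id = xx.s_id  -- same hypothetical key
    GROUP BY yy.year, yy.week, xx.s_id, xx.sub_id, xx.st_id
) cnt ON cnt.year = yy.year AND cnt.week = yy.week
     AND cnt.s_id = xx.s_id AND cnt.sub_id = xx.sub_id AND cnt.st_id = xx.st_id;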

How to create a view from PIVOT SQL

Our Employee benefits have 5 Plan_Types, I need to put all the data into a view that looks like this:
EMPLID   YEAR  SK  PB  VTO  BV  CT  VC
-----------------------------------------
0199990  2017  23  22  5    0   169
0000004  2018  22  0   2    5   65
0199990  2017  5   34  34   0   55
0000004  2018  23  0   19   5   0
-----------------------------------------
Here is the SQL for the above pivot table
SELECT *
FROM (SELECT b.emplid,
             b.empl_rcd,
             EXTRACT (YEAR FROM b.accrual_proc_dt) AS year,
             DECODE (b.plan_type,
                     '50', 'SK',
                     '52', 'PB',
                     '5V', 'VTO',
                     '5Y', 'BV',
                     '5Z', 'CT',
                     '51', 'VC') BANK,
             b.hrs_carryover
             + b.hrs_earned_ytd
             - b.hrs_taken_ytd
             + b.hrs_adjust_ytd
             + b.hrs_bought_ytd
             - b.hrs_sold_ytd
             - b.hrs_taken_unproc
             + b.hrs_adjust_unproc
             + b.hrs_bought_unproc
             - b.hrs_sold_unproc
             BALANCE
      FROM ps_leave_accrual b)
PIVOT (SUM (balance) AS bal
       FOR (bank)
       IN ('SK', 'PB', 'VTO', 'BV', 'CT', 'VC'))
WHERE emplid IN ('0199990', '0000004');
How do I turn this into a view I can use in PS Query? If I put this code into the view SQL, it fails at the PIVOT point with "SQL command not properly ended".
To use non-standard code in views, we have different options:
We create the record definition, put the SQL in it but comment it out, and reference our migration package information. When we build, we build manually as part of the migration process. I've seen folks use DMS to build the views.
We can create a SQL Object and reference it in the view SQL using %SQL(M_CUSTOM_VIEW_SQL); that way the code is migrated with the project.
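For the first option, the manually built view just wraps the pivot query. A sketch (the view name PS_LEAVE_BAL_VW is made up here, and the EMPLID filter is dropped so it can be applied in PS Query instead):

CREATE OR REPLACE VIEW PS_LEAVE_BAL_VW AS
SELECT *
FROM (SELECT b.emplid,
             b.empl_rcd,
             EXTRACT (YEAR FROM b.accrual_proc_dt) AS year,
             DECODE (b.plan_type,
                     '50', 'SK', '52', 'PB', '5V', 'VTO',
                     '5Y', 'BV', '5Z', 'CT', '51', 'VC') bank,
             b.hrs_carryover + b.hrs_earned_ytd - b.hrs_taken_ytd
               + b.hrs_adjust_ytd + b.hrs_bought_ytd - b.hrs_sold_ytd
               - b.hrs_taken_unproc + b.hrs_adjust_unproc
               + b.hrs_bought_unproc - b.hrs_sold_unproc balance
      FROM ps_leave_accrual b)
PIVOT (SUM (balance) AS bal
       FOR (bank)
       IN ('SK', 'PB', 'VTO', 'BV', 'CT', 'VC'));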

QlikView Data Load from Oracle Script

I have a small problem with loading data from an Oracle database into QlikView 11 using the following script:
SET ThousandSep='.';
SET DecimalSep=',';
SET MoneyThousandSep='.';
SET MoneyDecimalSep=',';
SET MoneyFormat='#.##0,00 €;-#.##0,00 €';
SET TimeFormat='hh:mm:ss';
SET DateFormat='DD.MM.YYYY';
SET TimestampFormat='DD.MM.YYYY hh:mm:ss[.fff]';
SET MonthNames='Jan;Feb;Mrz;Apr;Mai;Jun;Jul;Aug;Sep;Okt;Nov;Dez';
SET DayNames='Mo;Di;Mi;Do;Fr;Sa;So';
ODBC CONNECT TO [Oracle X;DBQ=db1.dc.man.lan] (XUserId is X, XPassword is Y);
SQL SELECT *
FROM UC140017."TABLE_1";
SQL SELECT *
FROM UC140017."TABLE_2";
SQL SELECT *
FROM UC140017."TABLE_3";
SQL SELECT *
FROM UC140017."TABLE_4";
SQL SELECT *
FROM UC140017."TABLE_5";
This results in the following output:
Connecting to Oracle X;DBQ=db1.dc.man.lan
Connected
TABLE_1 2.421 lines fetched
TABLE_2 1 lines fetched
TABLE_2 << TABLE_3 2 lines fetched
TABLE_2 << TABLE_4 22 lines fetched
TABLE_2 << TABLE_5 22 lines fetched
There is no reason why TABLE_3, TABLE_4 & TABLE_5 should be joined to TABLE_2. This relationship doesn't exist in the database, and I don't see an option to change this in QlikView. Does anyone know where this is coming from and have suggestions on how to fix it? Thanks!
Best,
Christoph
If Table_2, Table_3, Table_4 and Table_5 have the same number of columns with the same names, QV will auto-concatenate them into one table. To avoid this, you can use the "NoConcatenate" prefix:
SQL SELECT *
FROM UC140017."TABLE_1";
NoConcatenate
SQL SELECT *
FROM UC140017."TABLE_2";
NoConcatenate
SQL SELECT *
FROM UC140017."TABLE_3";
NoConcatenate
SQL SELECT *
FROM UC140017."TABLE_4";
NoConcatenate
SQL SELECT *
FROM UC140017."TABLE_5";
This will force QV to treat all the tables as different tables. Be aware that, if this is the case, after the reload you will end up with a massive synthetic key; one way to avoid that is sketched below.
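If the tables really are unrelated, a sketch of one way to avoid both the auto-concatenation and the synthetic key (the table labels are illustrative): label each load and QUALIFY the fields so every field gets a table-prefixed name.

QUALIFY *;  // prefix every field with its table name so nothing auto-links
Table1:
SQL SELECT * FROM UC140017."TABLE_1";
Table2:
SQL SELECT * FROM UC140017."TABLE_2";
// ... same pattern for TABLE_3 through TABLE_5
UNQUALIFY *;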

Hive: Joining two tables with different keys

I have two tables, shown below. Basically I want to join them and expect the result below.
The first 3 rows of table 2 do not have any activity ID, just empty fields.
All fields are tab separated. Category "33" has three descriptions as per table 2.
We need to make use of "ActivityID" to get the result for the "33" category, as there are 3 values for it.
Could anyone tell me how to achieve this output?
TABLE 1:
Empid Category ActivityID
44126 33 TRAIN
44127 10 UFL
44128 12 TOI
44129 33 UNASSIGNED
44130 15 MICROSOFT
44131 33 BENEFITS
44132 43 BENEFITS
TABLE 2:
Category ActivityID Categdesc
10                  billable
12                  billable
15                  Non-billable
33       TRAIN      Training
33       UNASSIGNED Bench
33       BENEFITS   Benefits
43                  Benefits
Expected Output:
44126 33 Training
44127 10 Billable
44128 12 Billable
44129 33 Bench
44130 15 Non-billable
44131 33 Benefits
44132 43 Benefits
It's a little difficult to do this in Hive, as there are many limitations. This is how I solved it, but there could be a better way.
I named your tables as below.
Table1 = EmpActivity
Table2 = ActivityMas
The challenge comes from the empty ActivityID fields in Table2. I created a view and used UNION to combine the results of two distinct queries.
Create view actView AS Select * from ActivityMas Where ActivityId = '';

SELECT * FROM (
    SELECT EmpActivity.EmpId, EmpActivity.Category, ActivityMas.Categdesc
    FROM EmpActivity JOIN ActivityMas
      ON EmpActivity.Category = ActivityMas.Category
     AND EmpActivity.ActivityId = ActivityMas.ActivityId
    UNION ALL
    SELECT EmpActivity.EmpId, EmpActivity.Category, actView.Categdesc
    FROM EmpActivity JOIN actView
      ON EmpActivity.Category = actView.Category
) unioned;  -- Hive requires an alias on the derived table
You have to use a top-level SELECT clause, as UNION ALL is not directly supported in top-level statements. This runs 3 MR jobs in total. And below is the result I got.
44127 10 billable
44128 12 billable
44130 15 Non-billable
44132 43 Benefits
44131 33 Benefits
44126 33 Training
44129 33 Bench
I'm not sure if I understand your question or your data, but would this work?
select table1.empid, table1.category, table2.categdesc
from table1 join table2
on table1.activityID = table2.activityID;
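A single-pass alternative sketch (untested; it assumes the missing ActivityID values are empty strings, as in the view used in the first answer): join table2 twice, once on the exact (Category, ActivityID) pair and once on Category alone for the rows with no ActivityID, then coalesce the two descriptions.

SELECT t1.Empid, t1.Category,
       COALESCE(exact.Categdesc, bycat.Categdesc) AS Categdesc
FROM table1 t1
LEFT JOIN table2 exact
  ON t1.Category = exact.Category
 AND t1.ActivityID = exact.ActivityID
LEFT JOIN (SELECT * FROM table2 WHERE ActivityID = '') bycat
  ON t1.Category = bycat.Category;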
