Convert Oracle to Hive - SubQuery can contain only 1 item in Select List - oracle

Code and sample data to try on your system:
CREATE TABLE "EMP"
( "DR_SID" NUMBER,
"DR_NAME" VARCHAR2(50 BYTE) COLLATE "USING_NLS_COMP",
"ACTIVE_FLAG" VARCHAR2(1 BYTE) COLLATE "USING_NLS_COMP",
"LAST_UPDATED_TIME" TIMESTAMP (6),
"DATA_SOURCE" VARCHAR2(100 BYTE) COLLATE "USING_NLS_COMP",
"ROW_LIMIT" VARCHAR2(50 BYTE) COLLATE "USING_NLS_COMP",
"VERSION#" NUMBER,
"PARENT_DR_SID" NUMBER
)
;
REM INSERTING into EMP
SET DEFINE OFF;
Insert into EMP (DR_SID,DR_NAME,LAST_UPDATED_TIME,VERSION#,PARENT_DR_SID) values (1,'this should not come1',to_timestamp('18-APR-20 05.05.52.425734000 AM','DD-MON-RR HH.MI.SSXFF AM'),1,1);
Insert into EMP (DR_SID,DR_NAME,LAST_UPDATED_TIME,VERSION#,PARENT_DR_SID) values (2,'come',to_timestamp('19-SEP-20 07.18.56.271199000 AM','DD-MON-RR HH.MI.SSXFF AM'),1,2);
Insert into EMP (DR_SID,DR_NAME,LAST_UPDATED_TIME,VERSION#,PARENT_DR_SID) values (3,'come123',to_timestamp('13-FEB-21 05.05.51.645956000 AM','DD-MON-RR HH.MI.SSXFF AM'),1,3);
Insert into EMP (DR_SID,DR_NAME,LAST_UPDATED_TIME,VERSION#,PARENT_DR_SID) values (4,'come456',to_timestamp('13-FEB-21 05.05.51.951505000 AM','DD-MON-RR HH.MI.SSXFF AM'),1,4);
Insert into EMP (DR_SID,DR_NAME,LAST_UPDATED_TIME,VERSION#,PARENT_DR_SID) values (5,'this should not come2',to_timestamp('18-APR-20 05.05.52.425734000 AM','DD-MON-RR HH.MI.SSXFF AM'),2,1);
Insert into EMP (DR_SID,DR_NAME,LAST_UPDATED_TIME,VERSION#,PARENT_DR_SID) values (6,'this should COME',to_timestamp('18-APR-20 05.05.52.425734000 AM','DD-MON-RR HH.MI.SSXFF AM'),3,1);
SELECT DR_SID, DR_NAME, LAST_UPDATED_TIME, VERSION#, PARENT_DR_SID FROM emp ;
The query below needs to be converted to Hive. Can someone help?
SELECT DR_SID, DR_NAME, LAST_UPDATED_TIME, VERSION#, PARENT_DR_SID FROM emp t
where (version#,parent_dr_sid)
in (select max(version#),parent_dr_sid from emp group by parent_dr_sid)
;
I am trying to find out which record is the latest, so I am using the version# column: the record with the highest version# is the latest, and its earlier versions are old and should not be displayed.
The records are linked to each other through two columns: dr_sid is the primary key, and parent_dr_sid holds the same value on every linked record, showing which old record it belongs to.
You can see this in the sample data: the value 1 appears three times in parent_dr_sid, and all three of those records are linked to dr_sid = 1.
I want the output below. Can the same be done in Hive?
FYI: we can't update the table, so we version the records this way and fetch them this way.
DR_SID  DR_NAME                LAST_UPDATED_TIME                VERSION#  PARENT_DR_SID
1       this should not come1  18-APR-20 05.05.52.425734000 AM  1         1
2       come                   19-SEP-20 07.18.56.271199000 AM  1         2
3       come123                13-FEB-21 05.05.51.645956000 AM  1         3
4       come456                13-FEB-21 05.05.51.951505000 AM  1         4
5       this should not come2  18-APR-20 05.05.52.425734000 AM  2         1
6       this should COME       18-APR-20 05.05.52.425734000 AM  3         1

Use LEFT SEMI JOIN to emulate IN semantics for tuples:
with input as (
select inline(array(
(1,1),
(1,2),
(2,1),
(2, 2)
)) as (c1, c2)
)
, flt as (
select inline(array(
(1,1),
(2, 2)
)) as (f1, f2)
)
select *, split(version(), ' ')[0] as v
from input
left semi join flt
on input.c1 = flt.f1
and input.c2 = flt.f2
input.c1  input.c2  v
1         1         3.1.3000.7.1.7.0-551
2         2         3.1.3000.7.1.7.0-551
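To make the semantics of the semi join concrete, here is a minimal Python sketch (not Hive code) that reproduces the same filtering on the answer's sample tuples. The function name left_semi_join is hypothetical and exists only for illustration.

```python
# Rows from the answer's `input` CTE and the filter tuples from its `flt` CTE.
input_rows = [(1, 1), (1, 2), (2, 1), (2, 2)]
flt_rows = [(1, 1), (2, 2)]


def left_semi_join(left, right):
    """Keep a left row when an equal tuple exists on the right.

    Each left row is emitted at most once, no matter how often it matches,
    which is what distinguishes a semi join from an inner join.
    """
    wanted = set(right)
    return [row for row in left if row in wanted]


result = left_semi_join(input_rows, flt_rows)
print(result)  # [(1, 1), (2, 2)]
```

The surviving rows match the two rows shown in the Hive output above; only the columns of the left side are returned.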

I don't know Hive so these might well be completely useless suggestions; however, see if it helps.
If you can use a subquery in FROM clause, you might do the following:
select e.*
from emp e join (select max(a.create_tm) create_tm, a.open_dt
from emp a group by a.open_dt
) x
on x.create_tm = e.create_tm
and x.open_dt = e.open_dt;
Or, make your subquery return a single column by concatenating values. They look like "time" and "date" (I don't know their datatypes so you might need to apply e.g. TO_CHAR function to these columns; no problem in that, as long as it returns desired result):
select *
from emp
where concat(create_tm, open_dt) in (select concat(max(create_tm), open_dt)
from emp
group by open_dt);

This is working:
SELECT DR_SID, DR_NAME, LAST_UPDATED_TIME, VERSION#--, PARENT_DR_SID
FROM emp t join
(select max(version#) v,parent_dr_sid from emp group by parent_dr_sid) t2
on t.version#=t2.v and t.parent_dr_sid = t2.parent_dr_sid
;
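The join against a grouped subquery can be checked outside Hive. The sketch below is an assumption-laden stand-in: it uses Python's sqlite3 module rather than Hive or Oracle, and it renames version# to version_no (a hypothetical name) to avoid quoting the # character. It only illustrates which rows the join keeps for the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# version_no is a stand-in for the question's version# column.
conn.execute(
    "CREATE TABLE emp (dr_sid INTEGER, dr_name TEXT, "
    "version_no INTEGER, parent_dr_sid INTEGER)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?)",
    [
        (1, "this should not come1", 1, 1),
        (2, "come", 1, 2),
        (3, "come123", 1, 3),
        (4, "come456", 1, 4),
        (5, "this should not come2", 2, 1),
        (6, "this should COME", 3, 1),
    ],
)

# Join each row to the highest version of its parent_dr_sid group.
latest = conn.execute(
    """
    SELECT t.dr_sid, t.dr_name, t.version_no, t.parent_dr_sid
    FROM emp t
    JOIN (SELECT MAX(version_no) AS v, parent_dr_sid
          FROM emp
          GROUP BY parent_dr_sid) t2
      ON t.version_no = t2.v
     AND t.parent_dr_sid = t2.parent_dr_sid
    ORDER BY t.dr_sid
    """
).fetchall()

for row in latest:
    print(row)
```

For the sample data this keeps dr_sid 2, 3, 4 and 6, i.e. one row per parent_dr_sid, and drops the two older versions of parent 1.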

Related

how to split one string column of `(12345)some_string` to two column `12345`, `some_string` in Oracle

As the title says,
how do I split one string column of the form (12345)some_string into two columns, 12345 and some_string, in Oracle?
Note: not every value has the form (12345)some_string. Some values are only some_string without the (12345) prefix; for those, the two columns should be null and some_string.
With sample data you posted, this could be one option (line #5 onward):
SQL> with test (col) as
2 (select '(12345)some_string' from dual union all
3 select 'another_string' from dual
4 )
5 select regexp_substr(col, '\d+') col1,
6 substr(col, instr(col, ')') + 1) col2
7 from test;
COL1               COL2
------------------ ------------------
12345              some_string
                   another_string
SQL>
Assuming the following table:
create table my_table (my_column varchar2(30));
insert into my_table values ('(12345)some_string');
commit;
1) Add a new column to the table
alter table my_table add new_column number;
2) Fill the new column
update my_table set new_column = regexp_substr(my_column, '^\(([0-9]+)\)', 1, 1, NULL, 1);
3) Update the original column
update my_table set my_column = regexp_replace(my_column, '^\([0-9]+\)', '');
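The splitting rule itself (an optional parenthesised run of digits at the start, followed by the rest of the string) can be sketched in Python to check edge cases before running the updates. This is only an illustration of the pattern, not Oracle code; split_prefixed is a hypothetical helper name.

```python
import re

# Optional "(digits)" prefix at the start, then everything that follows.
PREFIXED = re.compile(r"^\((\d+)\)(.*)$", re.DOTALL)


def split_prefixed(value):
    """Return (number_part, text_part); number_part is None without a prefix."""
    match = PREFIXED.match(value)
    if match is None:
        return None, value
    return match.group(1), match.group(2)


print(split_prefixed("(12345)some_string"))  # ('12345', 'some_string')
print(split_prefixed("another_string"))      # (None, 'another_string')
```

Note that the digit class matters: a pattern that only accepts 1 to 9 would not recognise a prefix such as (10345).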

insert all and inner join in oracle

I would like to insert data into two tables that have a one-to-many relationship. For this I need a foreign key, of course.
I think the table1 ID column is ideal as the primary key, but it is always generated automatically by a trigger, for every row. So:
How can I put table1.ID (the auto-generated primary key) into the table2.fkey column in the same insert statement?
INSERT ALL INTO table1 ( --here (before this) generated the table1.id column automatically with a trigger.
table1.food,
table1.drink,
table1.shoe
) VALUES (
'apple',
'water',
'slippers'
)
INTO table2 (
fkey,
color
) VALUES (
table1.id, -- I would like table2.fkey == table1.id this gave me error
'blue'
) SELECT
*
FROM
table1
INNER JOIN table2 ON table1.id = table2.fkey;
The error message:
"00904. 00000 - "%s: invalid identifier""
As suggested by @OldProgrammer, use a sequence:
INSERT ALL INTO table1 ( --here (before this) generated the table1.id column automatically with a trigger.
table1_id,
table1.food,
table1.drink,
table1.shoe
) VALUES (
<sequencename_table1>.nextval,
'apple',
'water',
'slippers'
)
INTO table2 (
table2_id, -- assumed key column for table2, filled from its own sequence
fkey,
color
) VALUES (
<sequencename_table2>.nextval,
<sequencename_table1>.currval, -- returns the current value of a sequence.
'blue'
) SELECT
*
FROM
table1
INNER JOIN table2 ON table1.id = table2.fkey;
Since you're using Oracle 12c, you could use an identity column. Capture the generated table1 ID in a local variable with a RETURNING clause on the insert into table1, then use that variable in the following insert into table2, as shown below:
SQL> create table table1(
  ID integer generated always as identity primary key,
  food varchar2(50), drink varchar2(50), shoe varchar2(50)
);
SQL> create table table2(
  fkey integer references table1(ID),
  color varchar2(50)
);
SQL> declare
  cl_tab table1.id%type;
begin
  insert into table1(food,drink,shoe) values('apple','water','slippers' )
  returning id into cl_tab;
  insert into table2 values(cl_tab,'blue');
end;
/
SQL> select * from table1;
ID FOOD    DRINK   SHOE
-- ------- ------- --------
 1 apple   water   slippers
SQL> select * from table2;
FKEY COLOR
---- --------------------------------------------------
   1 blue
Each time you run the insert statements above between begin and end, table1.ID and table2.fkey are populated with the same integer value. Also, don't forget to commit the inserts if you need these values to be visible throughout the DB (i.e. from other sessions as well).

Copy from VARCHAR field to NUMBER field such that VARCHAR value becomes null after being copied to NUMBER field

I have two tables, Table1 and Table2, both with the same columns TestResult and Testcounts. In Table1 testresult is a varchar, and in Table2 testresult is a number.
I have a string, e.g. "Oracle", as a testresult value of varchar type in Table1, which needs to be inserted into the number-typed testresult of Table2 as null. How can I do this? Any suggestions will be highly appreciated :)
EDIT
Table1 has columns TestResult varchar2(50) and Testcount number, with the values "0.5", "0.6", "0.8", "Oracle" for TestResult and 1, 2, 3, 4 for Testcount.
Table2 has columns TestResult number and Testcount number and is empty. I would like to insert all data from Table1 into Table2, with "Oracle" being inserted as null.
The following will do what you've asked for:
INSERT INTO TABLE2 (TESTRESULT, TESTCOUNTS)
SELECT CASE
WHEN LENGTH(REGEXP_SUBSTR(TESTRESULT, '[0-9.]*')) = LENGTH(TESTRESULT) THEN TESTRESULT
ELSE NULL
END,
TESTCOUNTS
FROM TABLE1
If you only have a single string value that you can't convert to a number, and you want to set that to null, you can just use a case expression to supply the null:
insert into table2 (testresult, testcounts)
select case when testresult = 'Oracle' then null else to_number(testresult) end,
testcounts
from table1;
Demo:
create table table1 (testresult varchar2(10), testcounts number);
insert into table1
select '0.5', 1 from dual
union all select '0.6', 2 from dual
union all select '0.8', 3 from dual
union all select 'Oracle', 4 from dual;
create table table2 (testresult number, testcounts number);
insert into table2 (testresult, testcounts)
select case when testresult = 'Oracle' then null else to_number(testresult) end,
testcounts
from table1;
select * from table2;
TESTRESULT TESTCOUNTS
---------- ----------
        .5          1
        .6          2
        .8          3
                    4
If you are using Oracle 12c Release 2 (or above) you could also just try to convert the string to a number and use the default ... on conversion error clause to substitute null for that, or any other, non-numeric value:
insert into table2 (testresult, testcounts)
select to_number(testresult default null on conversion error), testcounts
from table1;
select * from table2;
TESTRESULT TESTCOUNTS
---------- ----------
        .5          1
        .6          2
        .8          3
                    4
In earlier versions you could do the same thing with a user-defined function that wraps the real to_number() call and returns null on error. Or a regex/translate check similar to what @BobJarvis has shown.
Having multiple rows with null would make the data hard to interpret though, so hopefully you only have this one fixed value...
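The "convert, and fall back to null on error" idea is the same one a wrapper function would implement. As a rough illustration in Python (not PL/SQL), with to_number_or_none as a hypothetical helper name and the question's four sample rows:

```python
def to_number_or_none(text):
    """Return the numeric value of text, or None when it is not a number."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


# (testresult, testcounts) rows from the question's Table1.
table1 = [("0.5", 1), ("0.6", 2), ("0.8", 3), ("Oracle", 4)]

# Rows as they would be inserted into Table2.
table2 = [(to_number_or_none(result), count) for result, count in table1]
print(table2)  # [(0.5, 1), (0.6, 2), (0.8, 3), (None, 4)]
```

One caveat of this sketch: Python's float also accepts strings such as "nan" and "inf", so the accepted inputs are not identical to what Oracle's to_number treats as numeric.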

How to display comma separated descriptions based on comma separated values in Oracle 10g?

I am new to Oracle technology. Earlier I posted 2 posts for the same issue due to lack of understanding the requirement.
Table 1:
MSGID
-----
1,2,3
2,3
4
null
null
Table 2:
MID MSGDESC
---- -------
1 ONE
2 TWO
3 THREE
4 FOUR
Expected output:
XCOL DESC
----- -----
1,2,3 ONE,TWO,THREE
2,3 TWO,THREE
4 FOUR
I am not able to fulfil this requirement. Please provide me one solution.
Note: tables don't have any unique or primary key values. Table 1 has 5000 records and table 2 only has 80 records with descriptions.
create table Table1 (MSGID varchar2(100));
insert into Table1 values ('1,2,3');
insert into Table1 values ('2,3');
insert into Table1 values ('4');
insert into Table1 values (null);
insert into Table1 values (null);
create table Table2 (MID varchar2(100), MSGDESC varchar2(100));
insert into Table2 values ('1','ONE');
insert into Table2 values ('2','TWO');
insert into Table2 values ('3','THREE');
insert into Table2 values ('4','FOUR');
select
msgid as xcol,
"DESC",
col1, col2, ..., col12
from
Table1
left join (
select
msgid,
wm_concat(msgdesc) as "DESC"
from
(
select
msgid,
msgdesc
from
(select distinct msgid from Table1 where ...)
cross join (
select level as occ from dual connect by level <= 100)
)
left join Table2
on mid = regexp_substr(msgid, '[^,]+', 1, occ)
where
occ <= regexp_count(msgid, ',') + 1
order by msgid, occ
)
group by msgid
) using (msgid)
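The query above splits each MSGID on commas, looks every piece up in Table2, and concatenates the descriptions again per MSGID. The same steps can be sketched in Python with the sample data from the question; this is only an illustration of the logic, and the names descriptions and describe are hypothetical.

```python
# Table2 as a lookup from MID to MSGDESC.
descriptions = {"1": "ONE", "2": "TWO", "3": "THREE", "4": "FOUR"}

# Table1.MSGID values, including the two null rows.
msgids = ["1,2,3", "2,3", "4", None, None]


def describe(msgid):
    """Map a comma-separated id list to a comma-separated description list."""
    if msgid is None:
        return None
    parts = msgid.split(",")
    # Ids without a description are skipped, as null values are when aggregating.
    return ",".join(descriptions[p] for p in parts if p in descriptions)


rows = [(msgid, describe(msgid)) for msgid in msgids]
for xcol, desc in rows:
    print(xcol, desc)
```

Unlike the expected output in the question, this sketch also keeps the null rows, which is what the outer left join in the query does as well.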

Oracle: how to drop a subpartition of a specific partition

I am using an oracle 11 table with interval partitioning and list subpartitioning like this (simplified):
CREATE TABLE LOG
(
ID NUMBER(15, 0) NOT NULL PRIMARY KEY
, MSG_TIME DATE NOT NULL
, MSG_NR VARCHAR2(16 BYTE)
) PARTITION BY RANGE (MSG_TIME) INTERVAL (NUMTOYMINTERVAL (1,'MONTH'))
SUBPARTITION BY LIST (MSG_NR)
SUBPARTITION TEMPLATE (
SUBPARTITION login VALUES ('FOO')
, SUBPARTITION others VALUES (DEFAULT)
)
(PARTITION oldvalues VALUES LESS THAN (TO_DATE('01-01-2010','DD-MM-YYYY')));
How do I drop a specific subpartition for a specific month without knowing the (system-generated) name of the subpartition? There is a syntax "alter table ... drop subpartition for (subpartition_key_value , ...)" but I don't see a way to specify the month for which I am deleting the subpartition. The partition administration guide does not give any examples, either. 8-}
You can use the metadata tables to get the specific subpartition name:
SQL> insert into log values (1, sysdate, 'FOO');
1 row(s) inserted.
SQL> SELECT p.partition_name, s.subpartition_name, p.high_value, s.high_value
  FROM user_tab_partitions p
  JOIN
  user_tab_subpartitions s
  ON s.table_name = p.table_name
  AND s.partition_name = p.partition_name
  AND p.table_name = 'LOG';
PARTITION_NAME  SUBPARTITION_NAME  HIGH_VALUE   HIGH_VALUE
--------------- ------------------ ------------ ----------
OLDVALUES       OLDVALUES_OTHERS   2010-01-01   DEFAULT
OLDVALUES       OLDVALUES_LOGIN    2010-01-01   'FOO'
SYS_P469754     SYS_SUBP469753     2012-10-01   DEFAULT
SYS_P469754     SYS_SUBP469752     2012-10-01   'FOO'
SQL> alter table log drop subpartition SYS_SUBP469752;
Table altered.
If you want to drop a partition dynamically, it can be tricky to find it with the ALL_TAB_SUBPARTITIONS view because the HIGH_VALUE column may not be simple to query. In that case you could use DBMS_ROWID to find the subpartition object_id of a given row:
SQL> insert into log values (4, sysdate, 'FOO');
1 row(s) inserted.
SQL> DECLARE
  l_rowid_in ROWID;
  l_rowid_type NUMBER;
  l_object_number NUMBER;
  l_relative_fno NUMBER;
  l_block_number NUMBER;
  l_row_number NUMBER;
BEGIN
  SELECT rowid INTO l_rowid_in FROM log WHERE id = 4;
  dbms_rowid.rowid_info(rowid_in      => l_rowid_in,
                        rowid_type    => l_rowid_type,
                        object_number => l_object_number,
                        relative_fno  => l_relative_fno,
                        block_number  => l_block_number,
                        row_number    => l_row_number);
  dbms_output.put_line('object_number ='||l_object_number);
END;
/
object_number =15838049
SQL> select object_name, subobject_name, object_type
  from all_objects where object_id = '15838049';
OBJECT_NAME     SUBOBJECT_NAME  OBJECT_TYPE
--------------- --------------- ------------------
LOG             SYS_SUBP469757  TABLE SUBPARTITION
As it turns out, the "subpartition for" syntax does indeed work, though that seems to be a secret Oracle does not want to tell you about. :-)
ALTER TABLE TB_LOG_MESSAGE DROP SUBPARTITION FOR
(TO_DATE('01.02.2010','DD.MM.YYYY'), 'FOO')
This deletes the subpartition that would contain MSG_TIME 2010/02/01 and MSG_NR FOO. (It is not necessary that there is an actual row with this exact MSG_TIME and MSG_NR. It throws an error if there is no such subpartition, though.)
Thanks for the post - it was very useful for me.
One observation though on the above script to identify the partition and delete it:
The object_id returned by dbms_rowid.rowid_info is not the object_id of the all_objects table. It is actually the data_object_id. It is observed that usually these ids match. However, after truncating the partitioned table several times, these ids diverged in my database. Hence it might be reasonable to instead use the data_object_id to find out the name of the partition:
select object_name, subobject_name, object_type
from all_objects where data_object_id = '15838049';
From the table description of ALL_OBJECTS:
OBJECT_ID Object number of the object
DATA_OBJECT_ID Object number of the segment which contains the object
http://docs.oracle.com/cd/B19306_01/appdev.102/b14258/d_rowid.htm
In the sample code provided in the above link, DBMS_ROWID.ROWID_OBJECT(row_id) is used instead to derive the same information that is given by dbms_rowid.rowid_info. However, the documentation around this sample mentions that it is a data object number from the ROWID.
Examples
This example returns the ROWID for a row in the EMP table, extracts
the data object number from the ROWID, using the ROWID_OBJECT function
in the DBMS_ROWID package, then displays the object number:
DECLARE
  object_no INTEGER;
  row_id ROWID;
  ...
BEGIN
  SELECT ROWID INTO row_id FROM emp
  WHERE empno = 7499;
  object_no := DBMS_ROWID.ROWID_OBJECT(row_id);
  DBMS_OUTPUT.PUT_LINE('The obj. # is ' || object_no);
  ...
