How to write correct left Join of two tables? - oracle

I want to join two tables. The first table's primary key is of type NUMBER, and the second table's primary key is VARCHAR2(30 BYTE). How do I join them?
I tried this code, but all values from the second table come back as NULL. Why is that?
SELECT a.act_phone_no,a.act_actdevice,a.bi_account_id, a.packag_start_date, c.identification_number,
FROM ACTIVATIONS_POP a
left JOIN customer c
on TO_CHAR(a.act_phone_no) = c.msisdn_voice
first table
act_phone_no bi_account_id
23434 45345
34245 43556
Second table
msisdn_voice identification_number
23434 321113
34245 6547657

It seems that you didn't tell us everything. Query works, if correctly written, on such a sample data:
with
  -- Sample data
  activations_pop (act_phone_no, bi_account_id) as
    (select 23434, 45345 from dual union all
     select 34245, 43556 from dual
    ),
  customer (msisdn_voice, identification_number) as
    (select '23434', 321113 from dual union all
     select '34245', 6547657 from dual
    )
-- query works OK
select a.act_phone_no,
       a.bi_account_id,
       c.identification_number
from activations_pop a join customer c on to_char(a.act_phone_no) = c.msisdn_voice;
ACT_PHONE_NO BI_ACCOUNT_ID IDENTIFICATION_NUMBER
------------ ------------- ---------------------
       23434         45345                321113
       34245         43556               6547657
What could be wrong? Who knows. If you got some result but columns from the CUSTOMER table are empty (NULL?), then they really might be NULL, or you didn't manage to join rows on those columns (left/right padding with spaces?). Does joining on e.g.
on to_char(a.act_phone_no) = trim(c.msisdn_voice)
or
on a.act_phone_no = to_number(c.msisdn_voice)
help?
Consider posting a proper test case (CREATE TABLE and INSERT INTO statements).
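The padding hypothesis above can be reproduced outside Oracle. What follows is a minimal sketch using Python's built-in sqlite3 module instead of Oracle (an assumption, chosen only so the example is self-contained). SQLite's CAST(... AS TEXT) stands in for TO_CHAR, and the trailing space in one msisdn_voice value is invented test data, not something taken from the question.

```python
import sqlite3


def left_join_rows(trim_key: bool):
    """Left-join the two sample tables, optionally trimming the text key."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE activations_pop (act_phone_no INTEGER PRIMARY KEY,
                                      bi_account_id INTEGER);
        CREATE TABLE customer (msisdn_voice TEXT PRIMARY KEY,
                               identification_number INTEGER);
        INSERT INTO activations_pop VALUES (23434, 45345), (34245, 43556);
        -- Hypothetical dirty data: the first key carries a trailing space.
        INSERT INTO customer VALUES ('23434 ', 321113), ('34245', 6547657);
        """
    )
    key = "TRIM(c.msisdn_voice)" if trim_key else "c.msisdn_voice"
    rows = conn.execute(
        f"""
        SELECT a.act_phone_no, a.bi_account_id, c.identification_number
        FROM activations_pop a
        LEFT JOIN customer c
          ON CAST(a.act_phone_no AS TEXT) = {key}
        ORDER BY a.act_phone_no
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(left_join_rows(trim_key=False))  # the padded key fails to match
    print(left_join_rows(trim_key=True))   # trimming restores the match
```

With the untrimmed key, the padded row still appears in the result (it is a left join) but its customer column is NULL, which is the symptom described in the question.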

Are you using Oracle?
Please check the demo below:
SELECT a.act_phone_no, a.bi_account_id, c.identification_number
FROM ACTIVATIONS_POP a
left JOIN customer c
on TO_CHAR(a.act_phone_no) = c.msisdn_voice;
SQLFiddle

Related

Translate hierarchical Oracle query to DB2 query

I work primarily with SAS and Oracle and am still new to DB2. I'm faced with needing a hierarchical query to separate a CLOB into chunks that can be pulled into SAS. SAS has a limit of 32K for character variables, so I can't just pull the dataset in normally.
I found an old Stack Overflow question about the best way to pull a CLOB into a SAS data set, but it is written for Oracle.
Import blob through SAS from ORACLE DB
Since I am new to DB2 and the syntax for this type of join seems very different I was hoping to find someone that could help convert it and explain the syntax. I find the Oracle syntax to be much easier to understand. I'm not sure in DB2 if you would use a CTE recursion like this https://www.ibm.com/support/knowledgecenter/en/SSEPEK_10.0.0/apsg/src/tpc/db2z_xmprecursivecte.html or if you would use hierarchical queries like this https://www.ibm.com/support/knowledgecenter/en/ssw_ibm_i_71/sqlp/rbafyrecursivequeries.htm
Here is the Oracle query.
SELECT
id
, level as chunk_id
, regexp_substr(clob_value, '.{1,32767}', 1, level, 'n') as clob_chunk
FROM (
SELECT id, clob_value
FROM schema.table
WHERE id = 1
)
CONNECT BY LEVEL <= regexp_count(clob_value, '.{1,32767}',1,'n')
order by id, chunk_id;
The table has two fields the id and the clob_value and would look like this.
ID CLOB_VALUE
1 really large clob
2 medium clob
3 another large clob
The idea is that I want the result below. I would only ever do this one row at a time, with id set to whichever row I am processing.
ID CHUNK_ID CLOB
1 1 clob_chunk1of3
1 2 clob_chunk2of3
1 3 clob_chunk3of3
Thanks for any time spent reading and helping.
Here is a solution that should work in DB2 with few changes (but please be advised that I don't know DB2 at all; I am just using Oracle features that are in the SQL Standard, so they should be implemented identically - or almost so - in DB2).
Below I create a table with your sample data; then I show how to chunk it into substrings of length at most 8 characters. Although the strings are short, I defined the column as CLOB and I am using CLOB tools; this should work on much larger CLOBs.
You can make both the chunk size and the id into bind parameters, if needed. In my demo below I hardcoded the chunk size and I show the result for all IDs in the table. In case the CLOB is NULL, I do return one chunk (which is NULL, of course).
Note that touching CLOBs in a query is very expensive; so most of the work is done without touching the CLOBs. I only work on them as little as possible.
PREP WORK
drop table tbl purge; -- If needed
create table tbl (id number, clob_value clob);
insert into tbl (id, clob_value)
select 1, 'really large clob' from dual union all
select 2, 'medium clob' from dual union all
select 3, 'another large clob' from dual union all
select 4, null from dual -- added to check handling
;
commit;
QUERY
with
prep(id, len) as (
select id, dbms_lob.getlength(clob_value)
from tbl
)
, rec(id, len, ord, pos) as (
select id, len, 1, 1
from prep
union all
select id, len, ord + 1, pos + 8
from rec
where len >= pos + 8
)
select id, ord, dbms_lob.substr(clob_value, 8, pos)
from tbl inner join rec using (id)
order by id, ord
;
ID ORD CHUNK
---- ---- --------
1 1 really l
1 2 arge clo
1 3 b
2 1 medium c
2 2 lob
3 1 another
3 2 large cl
3 3 ob
4 1
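For readers who want to try the recursion without an Oracle or DB2 instance, here is a hedged sketch of the same prep/rec structure run against SQLite through Python's sqlite3 module. It is an adaptation, not the answer's code: length() and substr() replace dbms_lob.getlength and dbms_lob.substr, the chunk size is passed as a named parameter, and the function name chunk_rows is made up for illustration.

```python
import sqlite3


def chunk_rows(values, size=8):
    """Split each text value into chunks of at most `size` characters."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, clob_value TEXT)")
    conn.executemany("INSERT INTO tbl VALUES (?, ?)", values)
    rows = conn.execute(
        """
        WITH RECURSIVE
          prep(id, len) AS (
            SELECT id, length(clob_value) FROM tbl
          ),
          rec(id, len, ord, pos) AS (
            -- Anchor: every id gets a first chunk starting at position 1.
            SELECT id, len, 1, 1 FROM prep
            UNION ALL
            -- Step: advance by one chunk while text remains past this chunk.
            SELECT id, len, ord + 1, pos + :size
            FROM rec
            WHERE len >= pos + :size
          )
        SELECT r.id, r.ord, substr(t.clob_value, r.pos, :size) AS chunk
        FROM tbl t JOIN rec r ON r.id = t.id
        ORDER BY r.id, r.ord
        """,
        {"size": size},
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    sample = [(1, "really large clob"), (2, "medium clob"),
              (3, "another large clob"), (4, None)]
    for row in chunk_rows(sample):
        print(row)
```

As in the original, the recursion only touches the lengths; the text column is read once per chunk in the final select, and a NULL value still yields a single NULL chunk.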
Another option is to enable the Oracle compatibility in Db2 and just issue the hierarchical query.
This GitHub repository has background information on SQL recursion in DB2, including the Oracle-style syntax and a side by side example (both work against the Db2 sample database):
-- both queries are against the SAMPLE database
-- and should return the same result
SELECT LEVEL, CAST(SPACE((LEVEL - 1) * 4) || '/' || DEPTNAME
AS VARCHAR(40)) AS DEPTNAME
FROM DEPARTMENT
START WITH DEPTNO = 'A00'
CONNECT BY NOCYCLE PRIOR DEPTNO = ADMRDEPT;
WITH tdep(level, deptname, deptno) as (
SELECT 1, CAST( DEPTNAME AS VARCHAR(40)) AS DEPTNAME, deptno
FROM department
WHERE DEPTNO = 'A00'
UNION ALL
SELECT t.LEVEL+1, CAST(SPACE(t.LEVEL * 4) || '/' || d.DEPTNAME
AS VARCHAR(40)) AS DEPTNAME, d.deptno
FROM DEPARTMENT d, tdep t
WHERE d.admrdept=t.deptno and d.deptno<>'A00')
SELECT level, deptname
FROM tdep;

Tune oracle query with groupby clause

I have a table with lots of cost columns for each key.
TableA
SK1 SK2 Col1 Col2 Col3..... Col50 Flg(Y/N)
1 2 10 20 30 ...... 500 Y
1 2 10 20 30 ...... 500 N
2 2 10 20 30 ...... 500 N
I need to aggregate (sum) all values per key and then check whether any row in the group has Flg = 'Y'; if so, the group is added to a new table, TableB.
Here, the combination (1,2) for (sk1,sk2) from TableA should be returned.
The query I have written selects the list of all columns and groups by the keys.
We have lots of data, so this query takes too long to run. Is there any way to rework it so that it becomes faster?
select
Sk1,
Sk2,
nvl(sum(col3),0),
nvl(sum(col4),0),
.....
nvl(sum(col50),0)
from tableA
group by Sk1,
Sk2
I am using this as part of a larger query in which many other calculations are performed on top of this.
Working out whether any of a grouped set of records contains a 'Y' would be as simple as ...
select ...
from ...
group by ...
having max(flg) = 'Y'
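A small, hedged illustration of that HAVING filter, run on SQLite via Python's sqlite3 module rather than Oracle (an assumption made for self-containment). Only one cost column is modelled, COALESCE stands in for NVL, and the function name groups_with_flag is hypothetical.

```python
import sqlite3


def groups_with_flag(rows):
    """Sum col1 per (sk1, sk2), keeping only groups that contain a 'Y' flag."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tableA (sk1 INTEGER, sk2 INTEGER, col1 INTEGER, flg TEXT)"
    )
    conn.executemany("INSERT INTO tableA VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT sk1, sk2, COALESCE(SUM(col1), 0) AS col1_total
        FROM tableA
        GROUP BY sk1, sk2
        -- 'Y' sorts after 'N', so MAX(flg) is 'Y' when any row in the group has it.
        HAVING MAX(flg) = 'Y'
        ORDER BY sk1, sk2
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    sample = [(1, 2, 10, "Y"), (1, 2, 10, "N"), (2, 2, 10, "N")]
    print(groups_with_flag(sample))
```

The trick relies on the flag holding only 'Y' and 'N'; any value that sorts after 'Y' would break it.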
For now I have created a temporary table and have loaded all the data into it.
If you are using this as part of a large query, did you try the WITH clause?
It could look like this:
WITH SUM_DATA AS (select col1, col2, nvl(sum(col3),0), nvl(sum(col4),0), ..... nvl(sum(col50),0) from tableA group by col1, col2)
SELECT xyz
FROM abc, sum_data
WHERE abc.join_col = sum_data.join_col

UPDATE with JOIN syntax for Oracle Database

First, I execute the following SQL statements.
drop table names;
drop table ages;
create table names (id number, name varchar2(20));
insert into names values (1, 'Harry');
insert into names values (2, 'Sally');
insert into names values (3, 'Barry');
create table ages (id number, age number);
insert into ages values (1, 25);
insert into ages values (2, 30);
insert into ages values (3, 35);
select * from names;
select * from ages;
As a result, the following tables are created.
ID NAME
---------- ----------
1 Harry
2 Sally
3 Barry
ID AGE
---------- ----------
1 25
2 30
3 35
Now, I want to increment the age of Sally by 1, i.e. set it to 31. The following query works fine.
update ages set age = age + 1 where id = (select id from names where name = 'Sally');
select * from ages;
The table now looks like this.
ID AGE
---------- ----------
1 25
2 31
3 35
I want to know if there is a way it can be done by joins. For example, I tried the following queries but they fail.
SQL> update ages set age = age + 1 from ages, names where ages.id = names.id and names.name = 'Sally';
update ages set age = age + 1 from ages, names where ages.id = names.id and names.name = 'Sally'
*
ERROR at line 1:
ORA-00933: SQL command not properly ended
SQL> update ages set age = age + 1 from names join ages on ages.id = names.id where names.name = 'Sally';
update ages set age = age + 1 from names join ages on ages.id = names.id where names.name = 'Sally'
*
ERROR at line 1:
ORA-00933: SQL command not properly ended
The syntax of the UPDATE statement is documented here:
http://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_10007.htm
Pay attention to the ( subquery ) option of dml_table_expression_clause in that syntax.
The subquery is the feature that makes it possible to update a join.
In its simplest form it can be:
UPDATE (
subquery-with-a-join
)
SET cola=colb
Before updating a join, you must know the restrictions listed here:
https://docs.oracle.com/cd/B28359_01/server.111/b28286/statements_8004.htm
The view must not contain any of the following constructs:
A set operator
A DISTINCT operator
An aggregate or analytic function
A GROUP BY, ORDER BY, MODEL, CONNECT BY, or START WITH clause
A collection expression in a SELECT list
A subquery in a SELECT list
A subquery designated WITH READ ONLY
Joins, with some exceptions, as documented in Oracle Database Administrator's Guide
and also the common rules for updatable views (section: Updating a Join View):
http://docs.oracle.com/cd/B19306_01/server.102/b14231/views.htm#sthref3055
All updatable columns of a join view must map to columns of a
key-preserved table. See "Key-Preserved Tables" for a discussion of
key-preserved tables. If the view is defined with the WITH CHECK
OPTION clause, then all join columns and all columns of repeated
tables are not updatable.
We can first create a subquery with a join:
SELECT age
FROM ages a
JOIN names m ON a.id = m.id
WHERE m.name = 'Sally'
This query simply returns the following result:
AGE
----------
30
and now we can try to run an update through this query:
UPDATE (
SELECT age
FROM ages a
JOIN names m ON a.id = m.id
WHERE m.name = 'Sally'
)
SET age = age + 1;
but we get an error:
SQL Error: ORA-01779:cannot modify a column which maps to a non key-preserved table
This error means that one of the above restrictions is not met (key-preserved table).
However, if we add primary keys to our tables:
alter table names add primary key( id );
alter table ages add primary key( id );
then the update works without any error, and the final outcome is:
select * from ages;
ID AGE
---------- ----------
1 25
2 31
3 35

Add next unique value to SQL column

I have two tables which I am trying to join based on two criteria. One of the criteria is that a date from t1 is between a date in t2 and the next date in t2. The other is that the name from t1 matches the name from t2.
I.e. if t2 looks like this:
Record Name Date
1 A1234 2016-01-03 04:58:00
2 A1234 2015-12-15 08:34:00
3 A5678 2016-01-04 03:14:00
4 A1234 2016-01-05 21:06:00
Then:
Any records from t1 for Name A1234 with dates between 2016-01-03 04:58:00 and 2016-01-05 21:06:00 would be joined to record 1.
Any records from t1 for Name A1234 with dates between 2015-12-15 08:34:00 and 2016-01-03 04:58:00 would be joined to record 2
Any records from t1 for A1234 after the date of record 4 would be joined to record 4
Any records from t1 for A5678 would be joined to record 3 because there's only one date.
My initial approach is to use a correlated subquery to find the next date. However, due to a large number of records, I determined this would take over a year to execute because it searches all of t2 for the next later date during each iteration. Original SQLite:
CREATE TABLE outputtable AS SELECT * FROM t1, t2 d
WHERE t1.Name = d.Name AND t1.Date BETWEEN d.Date AND (
SELECT * FROM (
SELECT Date from t2
WHERE t2.Name = d.Name
ORDER BY Date ASC )
WHERE Date > d.Date
LIMIT 1 )
Now, I would like to find the next date only once for all records in t2 and create a new column in t2 that contains the next date. This way, I only search for the next date about 400,000 times instead of 56 billion times, significantly improving my performance.
Thus the output of the query I'm looking for would make t2 look like this:
Record Name Date Next_Date
1 A1234 2016-01-03 04:58:00 2016-01-05 21:06:00
2 A1234 2015-12-15 08:34:00 2016-01-03 04:58:00
3 A5678 2016-01-04 03:14:00 2999-12-31 23:59:59
4 A1234 2016-01-05 21:06:00 2999-12-31 23:59:59
Then I would be able to simply query whether t1.Date is between t2.Date and t2.Next_Date.
How can I build a query that will add the next date to a new column in t2?
Rather than add the new column, you should just be able to use a query like the one below to join the tables:
SELECT
T1.*,
T2_1.*
FROM
T1
INNER JOIN T2 T2_1 ON
T2_1.Name = T1.Name AND
T2_1.some_date < T1.some_date
LEFT OUTER JOIN T2 T2_2 ON
T2_2.Name = T1.Name AND
T2_2.some_date > T2_1.some_date
LEFT OUTER JOIN T2 T2_3 ON
T2_3.Name = T1.Name AND
T2_3.some_date > T2_1.some_date AND
T2_3.some_date < T2_2.some_date
WHERE
T2_3.Name IS NULL
You can do the same with NOT EXISTS, but this method often has better performance.
You can speed up (sub)queries by using proper indexes.
To check which indexes are actually used, use EXPLAIN QUERY PLAN.
Your original query, without any indexes, would be executed by SQLite 3.10.0 like this:
0|0|0|SCAN TABLE t1
0|1|1|SEARCH TABLE t2 AS d USING AUTOMATIC COVERING INDEX (name=?)
0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 1
1|0|0|SCAN TABLE t2
1|0|0|USE TEMP B-TREE FOR ORDER BY
(The "automatic" index is created temporarily just for this query; the optimizer has estimated that this would still be faster than not using any index.)
In this case, you get the best query plan by indexing all columns used for lookups:
create index i1nd on t1(name, date);
create index i2nd on t2(name, date);
0|0|1|SCAN TABLE t2 AS d
0|1|0|SEARCH TABLE t1 USING INDEX i1nd (name=? AND date>? AND date<?)
0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 1
1|0|0|SEARCH TABLE t2 USING COVERING INDEX i2nd (name=? AND date>?)
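The index check can be scripted. Below is a minimal sketch using Python's sqlite3 module that runs EXPLAIN QUERY PLAN on a simplified version of the "next date" lookup, with and without the i2nd index. The simplified query and the function name plan_for_next_date are assumptions for illustration, and the exact wording of the plan text differs between SQLite versions, so the sketch only looks for the index name.

```python
import sqlite3


def plan_for_next_date(with_index: bool):
    """Return the plan detail lines for the 'next date' lookup on t2."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t2 (record INTEGER, name TEXT, date TEXT)")
    if with_index:
        conn.execute("CREATE INDEX i2nd ON t2(name, date)")
    plan = conn.execute(
        """
        EXPLAIN QUERY PLAN
        SELECT date FROM t2
        WHERE name = ? AND date > ?
        ORDER BY date LIMIT 1
        """,
        ("A1234", "2016-01-03 04:58:00"),
    ).fetchall()
    conn.close()
    # The last column of each plan row holds the human-readable detail text.
    return [row[-1] for row in plan]


if __name__ == "__main__":
    for label, flag in (("without index", False), ("with index", True)):
        print(label)
        for line in plan_for_next_date(flag):
            print("  ", line)
```

Running it shows a table scan in the first case and a search that names i2nd in the second, which mirrors the before and after plans quoted above.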
I've used this method on tables with around 1 million rows with success. Obviously, creating an index that covers this query will help performance.
This approach uses RANK to create a value to join against. After creating the RANK in a CTE (I use this for readability reasons, please correct for style or personal preference), use a sub-query to join rnk to rnk + 1; aka the next date.
Here's an example of what the code looks like using your sample values.
IF OBJECT_ID('tempdb..#T2') IS NOT NULL
DROP TABLE #T2
CREATE TABLE #T2
(
Record INT NOT NULL PRIMARY KEY,
Name VARCHAR(10),
[DATE] DATETIME,
)
INSERT INTO #T2
VALUES (1, 'A1234', '2016-01-03 04:58:00'),
(2, 'A1234', '2015-12-15 08:34:00'),
(3, 'A5678', '2016-01-04 03:14:00'),
(4, 'A1234', '2016-01-05 21:06:00');
WITH Rank_Dates
AS (Select *
,rank() OVER(PARTITION BY #t2.name ORDER BY #t2.date DESC) AS rnk
FROM #T2)
select RD1.Record,
RD1.Name,
RD1.DATE,
COALESCE (RD2.DATE, '2999-12-31 23:59:59') AS NEXT_DATE
FROM Rank_Dates RD1
LEFT JOIN Rank_Dates RD2
ON RD1.rnk = RD2.rnk + 1
AND RD1.Name = RD2.Name
ORDER BY RD1.Record -- ORDER BY is optional
;
EDIT: added code output below.
The code above produces the following output.
Record Name DATE NEXT_DATE
1 A1234 2016-01-03 04:58:00.000 2016-01-05 21:06:00.000
2 A1234 2015-12-15 08:34:00.000 2016-01-03 04:58:00.000
3 A5678 2016-01-04 03:14:00.000 2999-12-31 23:59:59.000
4 A1234 2016-01-05 21:06:00.000 2999-12-31 23:59:59.000
On a side note: would using CURRENT_TIMESTAMP in place of the hard-coded '2999-12-31 23:59:59.000' produce a similar result?
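Since the original question was about SQLite, here is a hedged sketch of computing Next_Date there with the LEAD window function. LEAD is not used anywhere in the answers above; it is swapped in as an alternative to the RANK self-join and assumes SQLite 3.25 or later, where window functions are available. The function name with_next_date is made up for illustration.

```python
import sqlite3


def with_next_date(rows):
    """Return t2 rows extended with the next later date for the same name."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t2 (record INTEGER PRIMARY KEY, name TEXT, date TEXT)"
    )
    conn.executemany("INSERT INTO t2 VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT record, name, date,
               -- Next date within the same name; far-future sentinel if none.
               LEAD(date, 1, '2999-12-31 23:59:59')
                 OVER (PARTITION BY name ORDER BY date) AS next_date
        FROM t2
        ORDER BY record
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    sample = [
        (1, "A1234", "2016-01-03 04:58:00"),
        (2, "A1234", "2015-12-15 08:34:00"),
        (3, "A5678", "2016-01-04 03:14:00"),
        (4, "A1234", "2016-01-05 21:06:00"),
    ]
    for row in with_next_date(sample):
        print(row)
```

The dates are stored as ISO-8601 text, so ordering them as strings matches chronological order; that is an assumption about how t2 stores its dates.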

Fastest way of doing field comparisons in the same table with large amounts of data in oracle

I am receiving information in a CSV file from one department to compare with the same information from a different department to check for discrepancies (about three quarters of a million rows of data with 44 columns in each row). After I have the data in a table, I have a program that takes the data and sends reports based on an HQ. I feel like the way I am going about this is not the most efficient. I am using Oracle for this comparison.
Here is what I have:
I have a vb.net program that parses the data and inserts it into an extract table
I run a procedure to do a full outer join on the two tables into a new table with the fields in one department suffixed with '_c'
I run another procedure to compare the old/new data and update 2 different tables with detail and summary information. Here is code from inside the procedure:
DECLARE
CURSOR Cur_Comp IS SELECT * FROM T.AEC_CIS_COMP;
BEGIN
FOR compRow in Cur_Comp LOOP
--If service pipe exists in CIS but not in FM and the service pipe has status of retired in CIS, ignore the variance
If(compRow.pipe_num = '' AND cis_status_c = 'R')
continue
END IF
--If there is not a summary record for this HQ in the table for this run, create one
INSERT INTO t.AEC_CIS_SUM (HQ, RUN_DATE)
SELECT compRow.HQ, to_date(sysdate, 'DD/MM/YYYY') from dual WHERE NOT EXISTS
(SELECT null FROM t.AEC_CIS_SUM WHERE HQ = compRow.HQ AND RUN_DATE = to_date(sysdate, 'DD/MM/YYYY'))
-- Check fields and update the tables accordingly
If (compRow.cis_loop <> compRow.cis_loop_c) Then
--Insert information into the details table
INSERT INTO T.AEC_CIS_DET( Fac_id, Pipe_Num, Hq, Address, AutoUpdatedFl,
DateTime, Changed_Field, CIS_Value, FM_Value)
VALUES(compRow.Fac_ID, compRow.Pipe_Num, compRow.Hq, compRow.Street_Num || ' ' || compRow.Street_Name,
'Y', sysdate, 'Cis_Loop', compRow.cis_loop, compRow.cis_loop_c);
-- Update information into the summary table
UPDATE AEC_CIS_SUM
SET cis_loop = cis_loop + 1
WHERE Hq = compRow.Hq
AND Run_Date = to_date(sysdate, 'DD/MM/YYYY')
End If;
END LOOP;
END;
Any suggestions for an easier way of doing this than an IF statement for each of the 44 columns of the table? (This is run once a week, if it matters.)
Update: Just to clarify, there are 88 columns of data (44 pairs to compare, with one of each pair suffixed with _c). One table lists each differing field in its own row, so one source row can mean 30+ records written to that table. The other table keeps a tally of the number of discrepancies for each week.
First of all, I believe that your task can (and actually should) be implemented with straight SQL. No fancy cursors, no loops, just selects, inserts and updates. I would start by unpivoting your source data (it is not clear whether you have a primary key to join the two sets; I guess you do):
Col0_PK Col1 Col2 Col3 Col4
----------------------------------------
Row1_val A B C D
Row2_val E F G H
Above is your source data. Using UNPIVOT clause we convert it to:
Col0_PK Col_Name Col_Value
------------------------------
Row1_val Col1 A
Row1_val Col2 B
Row1_val Col3 C
Row1_val Col4 D
Row2_val Col1 E
Row2_val Col2 F
Row2_val Col3 G
Row2_val Col4 H
I think you get the idea. Say we have table1 with one set of data and an identically structured table2 with the second set of data. It is a good idea to use index-organized tables.
Next step is comparing rows to each other and storing difference details. Something like:
insert into diff_details(some_service_info_columns_here)
select some_service_info_columns_here_along_with_data_difference
from table1 t1 inner join table2 t2
on t1.Col0_PK = t2.Col0_PK
and t1.Col_name = t2.Col_name
and nvl(t1.Col_value, 'Dummy1') <> nvl(t2.Col_value, 'Dummy2');
And on the last step we update difference summary table:
insert into diff_summary(summary_columns_here)
select diff_row_id, count(*) as diff_count
from diff_details
group by diff_row_id;
This is just a rough draft to show my approach; I'm sure there are many more details that should be taken into account. To summarize, I suggest two things:
UNPIVOT data
Use SQL statements instead of cursors
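The unpivot-then-compare idea can be tried end to end with a small sketch. It uses Python's sqlite3 module instead of Oracle, so the unpivot step is done in Python rather than with the UNPIVOT clause; all table, column and function names here are hypothetical. One deliberate difference: the comparison uses SQLite's NULL-safe IS NOT, which treats two NULLs as equal, whereas the NVL expression with two different dummy values above reports a NULL/NULL pair as a difference.

```python
import sqlite3


def unpivot(rows, pk):
    """Turn wide dict rows into (pk_value, column_name, column_value) tuples."""
    return [(r[pk], col, val) for r in rows for col, val in r.items() if col != pk]


def compare(rows1, rows2, pk="id"):
    """Return (details, summary) for cells that differ between two row sets."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE table1 (pk, col_name, col_value, PRIMARY KEY (pk, col_name));
        CREATE TABLE table2 (pk, col_name, col_value, PRIMARY KEY (pk, col_name));
        CREATE TABLE diff_details (pk, col_name, value1, value2);
        """
    )
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", unpivot(rows1, pk))
    conn.executemany("INSERT INTO table2 VALUES (?, ?, ?)", unpivot(rows2, pk))
    # One set-based statement replaces a per-column IF inside a cursor loop.
    conn.execute(
        """
        INSERT INTO diff_details
        SELECT t1.pk, t1.col_name, t1.col_value, t2.col_value
        FROM table1 t1 JOIN table2 t2
          ON t1.pk = t2.pk AND t1.col_name = t2.col_name
        WHERE t1.col_value IS NOT t2.col_value  -- NULL-safe inequality in SQLite
        """
    )
    details = conn.execute(
        "SELECT pk, col_name, value1, value2 FROM diff_details ORDER BY pk, col_name"
    ).fetchall()
    summary = conn.execute(
        "SELECT pk, COUNT(*) FROM diff_details GROUP BY pk ORDER BY pk"
    ).fetchall()
    conn.close()
    return details, summary


if __name__ == "__main__":
    a = [{"id": 1, "col1": "A", "col2": "B"}, {"id": 2, "col1": "E", "col2": None}]
    b = [{"id": 1, "col1": "A", "col2": "X"}, {"id": 2, "col1": "Z", "col2": None}]
    print(compare(a, b))
```

Adding a 45th column only adds rows to the unpivoted tables; the comparison and summary statements stay unchanged.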
You have several issues in your code:
If(compRow.pipe_num = '' AND cis_status_c = 'R')
continue
END IF
"cis_status_c" is not declared. Is it a variable or a column in AEC_CIS_COMP?
In case it is a column, just put the condition into the cursor query, i.e. SELECT * FROM T.AEC_CIS_COMP WHERE NOT (pipe_num IS NULL AND cis_status_c = 'R'). Note that Oracle treats '' as NULL, so the comparison pipe_num = '' is never true.
to_date(sysdate, 'DD/MM/YYYY')
That's nonsense, you convert a date into a date, simply use TRUNC(SYSDATE)
Anyway, I think you can use three single statements instead of a cursor:
INSERT INTO t.AEC_CIS_SUM (HQ, RUN_DATE)
-- DISTINCT: one summary row per HQ, even if COMP has many rows for that HQ
SELECT DISTINCT comp.HQ, trunc(sysdate)
from T.AEC_CIS_COMP comp
WHERE NOT EXISTS
(SELECT null FROM t.AEC_CIS_SUM WHERE HQ = comp.HQ AND RUN_DATE = trunc(sysdate));
INSERT INTO T.AEC_CIS_DET( Fac_id, Pipe_Num, Hq, Address, AutoUpdatedFl, DateTime, Changed_Field, CIS_Value, FM_Value)
select comp.Fac_ID, comp.Pipe_Num, comp.Hq, comp.Street_Num || ' ' || comp.Street_Name, 'Y', sysdate, 'Cis_Loop', comp.cis_loop, comp.cis_loop_c
from T.AEC_CIS_COMP comp
where comp.cis_loop <> comp.cis_loop_c;
UPDATE AEC_CIS_SUM
SET cis_loop = cis_loop + 1
WHERE Hq IN (Select Hq from T.AEC_CIS_COMP)
AND trunc(Run_Date) = trunc(sysdate);
They are not tested but they should give you a hint how to do it.
