unix: compare two tables where the unique_id in both tables matches - validation

I have 2 tables whose rows match on a unique_id. Comparing the two tables should highlight, per column, the data that differs between them, on the basis of the unique_id. Sample below:
(Sample screenshots of Table A, Table B, and the expected result were attached as images.)
The unique_id should play the key role here: if no matching unique_id is present, the result should show null/empty values for that record.
Any idea how I can solve this?

file: t1.csv
maria;22;us
bryon;23;uk
alex;24;aus
file: t2.csv
maria;22;us
bryon;24;uk
alex;24;aus
file: test.sh
#!/bin/sh
sqlite3 <<EOF
create table t1 (id,a,b);
create table t2 (id,a,b);
.separator ;
.import $1 t1
.import $2 t2
select t1.*,' <-> ', t2.*
from t1
left join t2 on t1.id = t2.id
where t1.a <> t2.a
or t1.b <> t2.b
or t2.id is null;
EOF
how to use:
$ bash test.sh t1.csv t2.csv
bryon;23;uk; <-> ;bryon;24;uk
To also check for rows missing from t1.csv, swap the arguments:
$ bash test.sh t2.csv t1.csv
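Running it twice is needed because the LEFT JOIN only reports mismatches and rows missing from the second file. If you'd rather get both directions in one pass, a sketch (same tables and columns as in test.sh) is to UNION ALL the two LEFT JOINs, which also works on SQLite builds older than 3.39, where FULL OUTER JOIN is not available:
select t1.*, ' <-> ', t2.*
from t1
left join t2 on t1.id = t2.id
where t1.a <> t2.a
or t1.b <> t2.b
or t2.id is null
union all
select t1.*, ' <-> ', t2.*
from t2
left join t1 on t1.id = t2.id
where t1.id is null;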


Error: column reference is ambiguous

I have two tables from which I want to get the data without using joins.
Table1
id ProductVersion productName productDate
1 p1.1 product1 2017-3-11
2 p1.2 product1 2017-3-11
3 p2.1 product2 2017-5-12
4 p2.2 product2 2017-5-12
5 p2.3 product2 2017-5-12
6 p3.1 product3 2017-11-21
7 p3.1 product3 2017-11-21
Table2
tid productVersion comments status AvailableDate
101 p1.1 Good Sold 2017-3-11
102 p1.1 Good Available 2017-3-12
1009 p1.1 Good Available 2017-3-12
4008 p3.1 Average NA 2017-11-11
106 p3.2 Good Sold 2017-5-14
6 p3.1 Average Available 2017-11-12
I have two tables as shown above.
I want to get productVersion,productName,productDate,Comments,status column details from the above two tables.
SQL Query(without joins):
select productversion t1,productName t1,productDate t1,comments t2,status t2 from table1 t1,table2 t2
where t1.productVersion = t2.productversion
Error message:
Error: column reference "productDate" is ambiguous.
Any inputs?
[TL;DR] Your main issue is that you have put the table aliases after the column names, in the position where a column alias is expected; they should instead prefix the column names to identify which table each column belongs to.
Your query is equivalent to:
select productversion AS columnalias1,
productName AS columnalias2,
productDate AS columnalias3,
comments AS columnalias4,
status AS columnalias5
from table1 t1,
table2 t2
where t1.productVersion = t2.productversion
All of your column aliases are therefore either t1 or t2, so you get multiple columns with the same name, which is unlikely to be what you intended. And since both tables have a productVersion column, the query parser does not know which one you mean. You probably want the table aliases before the column names, to identify which table each column is from:
select t1.productversion,
t1.productName,
t1.productDate,
t2.comments,
t2.status
from table1 t1,
table2 t2
where t1.productVersion = t2.productversion
The second problem is that, while you say it is a query "without joins", you are in fact using the legacy Oracle comma-join syntax. Your query can be rewritten with exactly the same functionality in ANSI/ISO join syntax; it is equivalent to:
select t1.productversion,
t1.productName,
t1.productDate,
t2.comments,
t2.status
from table1 t1
INNER JOIN table2 t2
ON ( t1.productVersion = t2.productversion )
If you want something without joins then use UNION ALL:
SELECT productVersion,
productName,
productDate,
NULL AS Comments,
NULL AS status
FROM table1
UNION ALL
SELECT productVersion,
NULL AS productName,
NULL AS productDate,
Comments,
status
FROM table2
But it will not correlate the values in the two tables.
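If you do need the values from the two tables side by side while still avoiding a JOIN, a common workaround is to aggregate the UNION ALL result by the shared key. This is only a sketch: it assumes one row per productVersion in each table, which is not true of the sample data shown above, where MAX would collapse the duplicates:
SELECT productVersion,
       MAX(productName) AS productName,
       MAX(productDate) AS productDate,
       MAX(Comments)    AS Comments,
       MAX(status)      AS status
FROM ( SELECT productVersion, productName, productDate,
              NULL AS Comments, NULL AS status
       FROM table1
       UNION ALL
       SELECT productVersion, NULL, NULL, Comments, status
       FROM table2 ) u
GROUP BY productVersion;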
To refer to a specific table column, you use this syntax:
table_name.column_name
Your query should be:
select t1.productversion, t1.productName, t1.productDate,
t2.comments, t2.status
from table1 as t1
join table2 as t2 on t1.productVersion = t2.productversion

update: set a value based on the value of another column and/or the same column in another row - ORA-01427

I'm trying to set a column to either reset to zero or increment by 1, based on a pass or fail in another column and/or the value of that same column in the previous week's row.
There are two other variable columns which must match those in the previous week's row.
The table is something like:
WEEK | ID1 | ID2 | FLAG | INCREMENT_COUNT
------+-----+-----+------+----------------
I have been trying to get this part of the procedure to work, and the best I've got so far is:
ID_IN and ID_IN3 are passed in the procedure call
OLD_DATE and NEW_DATE are set as the previous week and current week
----------------------------------------------------------------------
update table1
set table1.INCREMENT_COUNT = CASE
WHEN table1.FLAG is null then null
WHEN table1.FLAG = 1 then 0
WHEN table1.FLAG = 0 then (NVL(INCREMENT_COUNT,0)+ 1)
END
where (select INCREMENT_COUNT
from table1
where WEEK=NEW_DATE
and ID1=ID_IN
and exists (select (1)
from table2
where table1.ID2=table2.ID2
and table2.ID3=ID_IN3))
=
(select INCREMENT_COUNT
from table1
where WEEK=OLD_DATE
and ID1=ID_IN
and exists (select (1)
from table2
where table1.ID2=table2.ID2
and table2.ID3=ID_IN3));
When this procedure is called I get the error
ORA-01427: single-row subquery returns more than one row
Additionally, in MySQL I could do something like this and get it working...
update table1 as t01
left join (select t2.ID3, t10.ID2, t10.INCREMENT_COUNT as prev_count
           from table1 as t10
           inner join table2 as t2 on t10.ID2 = t2.ID2
           where t10.ID1 = ID_IN
             and t2.ID3 = ID_IN3
             and t10.WEEK = OLD_DATE) as prev_date
       on t01.WEEK = NEW_DATE
      and prev_date.ID2 = t01.ID2
      and t01.ID1 = ID_IN
set t01.INCREMENT_COUNT = if(t01.FLAG is null, null,
                             if(t01.FLAG, 0, IFNULL(prev_date.prev_count, 0) + 1))
where t01.ID1 = ID_IN
  and t01.WEEK = NEW_DATE
  and prev_date.ID3 = ID_IN3;
Similar to your MySQL example, you can do something like this in Oracle by updating an inline view (note that Oracle only allows updating a join view when it is key-preserved, so this may not work for you depending on your data model). I've put together a crude, basic version from the information given, but you haven't provided enough detail about your data model, and your table/alias/column names make it hard to read...
(More on updating with a subquery here: https://docs.oracle.com/database/121/SQLRF/statements_10008.htm#i2067871)
update
  (select t01.increment_count, t01.flag, prev_date.prev_count
   from table1 t01
   left join (select t2.ID3, t10.ID2, t10.INCREMENT_COUNT as prev_count
              from table1 t10
              inner join table2 t2 on t10.ID2 = t2.ID2
              where t10.ID1 = ID_IN
                and t2.ID3 = ID_IN3
                and t10.WEEK = OLD_DATE) prev_date
          on t01.WEEK = NEW_DATE
         and prev_date.ID2 = t01.ID2
         and t01.ID1 = ID_IN
   where t01.ID1 = ID_IN
     and t01.WEEK = NEW_DATE
     and prev_date.ID3 = ID_IN3)
set INCREMENT_COUNT = case
                        when FLAG is null then null
                        when FLAG = 1 then 0
                        else NVL(prev_count, 0) + 1
                      end;
This error means that one of the subqueries in your WHERE condition returns more than one row.
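To see which rows cause it, a quick diagnostic (a sketch using the column names from the post) is to look for keys that the subquery can match more than once:
select WEEK, ID1, count(*)
from table1
group by WEEK, ID1
having count(*) > 1;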
This seems to have done the job.
Thanks for the help, it got me thinking in a different way.
UPDATE TABLE1 T01
SET INCREMENT_COUNT = CASE
WHEN T01.FLAG IS NULL THEN NULL
WHEN T01.FLAG = 1 THEN 0
WHEN T01.FLAG = 0 THEN (NVL((SELECT INCREMENT_COUNT
FROM TABLE1 T10
WHERE T10.WEEK=OLD_DATE
AND T01.WEEK=NEW_DATE
AND T01.ID2=T10.ID2
AND ID1=ID_IN),0)+ 1)
END
WHERE EXISTS (SELECT (1)
FROM TABLE2
WHERE TABLE1.ID2=TABLE2.ID2
AND TABLE2.ID3=ID_IN3);

Slowly changing dimensions- SCD1 and SCD2 implementation in Hive [closed]

I am looking for an SCD1 and SCD2 implementation in Hive (1.2.1). I am aware of the workaround for loading SCD1 and SCD2 tables prior to Hive 0.14. Here is the link describing that workaround approach: http://hortonworks.com/blog/four-step-strategy-incremental-updates-hive/
Now that Hive supports ACID operations, I just want to know if there is a better or more direct way of loading them.
As HDFS is immutable storage it could be argued that versioning data and keeping history (SCD2) should be the default behaviour for loading dimensions. You can create a View in your Hadoop SQL query engine (Hive, Impala, Drill etc.) that retrieves the current state/latest value using windowing functions. You can find out more about dimensional models on Hadoop in my blog post, e.g. how to handle a large dimension and fact table.
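As a minimal sketch of such a view (the table and column names here are illustrative, not from the question: a history table dim_admin_hist with a business key admin_id and a load timestamp loaded_on):
CREATE VIEW dim_admin_current AS
SELECT *
FROM (
    SELECT h.*,
           ROW_NUMBER() OVER (PARTITION BY admin_id
                              ORDER BY loaded_on DESC) AS rn
    FROM dim_admin_hist h
) t
WHERE rn = 1;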
Well, I worked around it using two temp tables:
drop table if exists administrator_tmp1;
drop table if exists administrator_tmp2;
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
--review_administrator
CREATE TABLE if not exists review_administrator(
admin_id bigint ,
admin_name string,
create_time string,
email string ,
password string,
status_description string,
token string ,
expire_time string ,
granter_user_id bigint ,
admin_time string ,
effect_start_date string ,
effect_end_date string
)
partitioned by (current_row_indicator string comment 'current, expired')
stored as parquet;
--tmp1 is used for saving the original data
CREATE TABLE if not exists administrator_tmp1(
admin_id bigint ,
admin_name string,
create_time string,
email string ,
password string ,
status_description string ,
token string ,
expire_time string ,
granter_user_id bigint ,
admin_time string ,
effect_start_date string ,
effect_end_date string
)
partitioned by (current_row_indicator string comment 'current, expired')
stored as parquet;
--tmp2 saves the SCD data
CREATE TABLE if not exists administrator_tmp2(
admin_id bigint ,
admin_name string,
create_time string,
email string ,
password string ,
status_description string ,
token string ,
expire_time string ,
granter_user_id bigint ,
admin_time string ,
effect_start_date string ,
effect_end_date string
)
partitioned by (current_row_indicator string comment 'current, expired')
stored as parquet;
--insert the original data into tmp1
INSERT OVERWRITE TABLE administrator_tmp1 PARTITION(current_row_indicator)
SELECT
user_id as admin_id,
name as admin_name,
time as create_time,
email as email,
password as password,
status as status_description,
token as token,
expire_time as expire_time,
admin_id as granter_user_id,
admin_time as admin_time,
'{{ ds }}' as effect_start_date,
'9999-12-31' as effect_end_date,
'current' as current_row_indicator
FROM
ks_db_origin.gifshow_administrator_origin
;
--insert scd data into tmp2
--for the data unchanged
INSERT INTO TABLE administrator_tmp2 PARTITION(current_row_indicator)
SELECT
t2.admin_id,
t2.admin_name,
t2.create_time,
t2.email,
t2.password,
t2.status_description,
t2.token,
t2.expire_time,
t2.granter_user_id,
t2.admin_time,
t2.effect_start_date,
t2.effect_end_date as effect_end_date,
t2.current_row_indicator
FROM
administrator_tmp1 t1
INNER JOIN
(
SELECT * FROM review_administrator
WHERE current_row_indicator = 'current'
) t2
ON
t1.admin_id = t2.admin_id
AND t1.admin_name = t2.admin_name
AND t1.create_time = t2.create_time
AND t1.email = t2.email
AND t1.password = t2.password
AND t1.status_description = t2.status_description
AND t1.token = t2.token
AND t1.expire_time = t2.expire_time
AND t1.granter_user_id = t2.granter_user_id
AND t1.admin_time = t2.admin_time
;
--for the data changed , update the effect_end_date
INSERT INTO TABLE administrator_tmp2 PARTITION(current_row_indicator)
SELECT
t2.admin_id,
t2.admin_name,
t2.create_time,
t2.email,
t2.password,
t2.status_description,
t2.token,
t2.expire_time,
t2.granter_user_id,
t2.admin_time,
t2.effect_start_date as effect_start_date,
'{{ yesterday_ds }}' as effect_end_date,
'expired' as current_row_indicator
FROM
administrator_tmp1 t1
INNER JOIN
(
SELECT * FROM review_administrator
WHERE current_row_indicator = 'current'
) t2
ON
t1.admin_id = t2.admin_id
WHERE NOT
(
t1.admin_name = t2.admin_name
AND t1.create_time = t2.create_time
AND t1.email = t2.email
AND t1.password = t2.password
AND t1.status_description = t2.status_description
AND t1.token = t2.token
AND t1.expire_time = t2.expire_time
AND t1.granter_user_id = t2.granter_user_id
AND t1.admin_time = t2.admin_time
)
;
--for the changed data and the new data
INSERT INTO TABLE administrator_tmp2 PARTITION(current_row_indicator)
SELECT
t1.admin_id,
t1.admin_name,
t1.create_time,
t1.email,
t1.password,
t1.status_description,
t1.token,
t1.expire_time,
t1.granter_user_id,
t1.admin_time,
t1.effect_start_date,
t1.effect_end_date,
t1.current_row_indicator
FROM
administrator_tmp1 t1
LEFT OUTER JOIN
(
SELECT * FROM review_administrator
WHERE current_row_indicator = 'current'
) t2
ON
t1.admin_id = t2.admin_id
AND t1.admin_name = t2.admin_name
AND t1.create_time = t2.create_time
AND t1.email = t2.email
AND t1.password = t2.password
AND t1.status_description = t2.status_description
AND t1.token = t2.token
AND t1.expire_time = t2.expire_time
AND t1.granter_user_id = t2.granter_user_id
AND t1.admin_time = t2.admin_time
WHERE t2.admin_id IS NULL
;
--for the data already marked by 'expired'
INSERT INTO TABLE administrator_tmp2 PARTITION(current_row_indicator)
SELECT
t1.admin_id,
t1.admin_name,
t1.create_time,
t1.email,
t1.password,
t1.status_description,
t1.token,
t1.expire_time,
t1.granter_user_id,
t1.admin_time,
t1.effect_start_date,
t1.effect_end_date,
t1.current_row_indicator
FROM
review_administrator t1
WHERE t1.current_row_indicator = 'expired'
;
--populate the dim table
INSERT OVERWRITE TABLE review_administrator PARTITION(current_row_indicator)
SELECT
t1.admin_id,
t1.admin_name,
t1.create_time,
t1.email,
t1.password,
t1.status_description,
t1.token,
t1.expire_time,
t1.granter_user_id,
t1.admin_time,
t1.effect_start_date,
t1.effect_end_date,
t1.current_row_indicator
FROM
administrator_tmp2 t1
;
--drop the two temp table
drop table administrator_tmp1;
drop table administrator_tmp2;
-- --example data
-- --2017-01-01
-- insert into table review_administrator PARTITION(current_row_indicator)
-- SELECT '1','a','2016-12-31','a#ks.com','password','open','token1','2017-12-31',
-- 0,'2017-12-31','2017-01-01','9999-12-31','current'
-- FROM default.sample_07 limit 1;
-- --2017-01-02
-- insert into table administrator_tmp1 PARTITION(current_row_indicator)
-- SELECT '1','a','2016-12-31','a01#ks.com','password','open','token1','2017-12-31',
-- 0,'2017-12-31','2017-01-02','9999-12-31','current'
-- FROM default.sample_07 limit 1;
-- insert into table administrator_tmp1 PARTITION(current_row_indicator)
-- SELECT '2','b','2016-12-31','a#ks.com','password','open','token1','2017-12-31',
-- 0,'2017-12-31','2017-01-02','9999-12-31','current'
-- FROM default.sample_07 limit 1;
-- --2017-01-03
-- --id 1 is changed
-- insert into table administrator_tmp1 PARTITION(current_row_indicator)
-- SELECT '1','a','2016-12-31','a03#ks.com','password','open','token1','2017-12-31',
-- 0,'2017-12-31','2017-01-03','9999-12-31','current'
-- FROM default.sample_07 limit 1;
-- --id 2 is not changed at all
-- insert into table administrator_tmp1 PARTITION(current_row_indicator)
-- SELECT '2','b','2016-12-31','a#ks.com','password','open','token1','2017-12-31',
-- 0,'2017-12-31','2017-01-03','9999-12-31','current'
-- FROM default.sample_07 limit 1;
-- --id 3 is a new record
-- insert into table administrator_tmp1 PARTITION(current_row_indicator)
-- SELECT '3','c','2016-12-31','c#ks.com','password','open','token1','2017-12-31',
-- 0,'2017-12-31','2017-01-03','9999-12-31','current'
-- FROM default.sample_07 limit 1;
-- --now dim table will show you the right SCD.
Here's a detailed implementation of slowly changing dimension type 2 in Hive, using an exclusive join approach.
It assumes that the source sends a complete data file, i.e. old, updated, and new records.
Steps:
1. Load the recent file data into the STG table.
2. Select all the expired records from the HIST table:
select * from HIST_TAB where exp_dt != '2099-12-31'
3. Select all the records that have not changed, using an inner join of HIST and STG and a filter on HIST.column = STG.column, as below (restricting HIST to its current rows):
select hist.* from HIST_TAB hist
inner join STG_TAB stg
on hist.key = stg.key
where hist.exp_dt = '2099-12-31'
and hist.column = stg.column
4. Select all the new and updated records that have changed, from STG_TAB, using an exclusive left join with HIST_TAB, and set the effective and expiry dates as below:
select stg.*, current_date as eff_dt, '2099-12-31' as exp_dt
from STG_TAB stg
left join
(select * from HIST_TAB where exp_dt = '2099-12-31') hist
on hist.key = stg.key
where hist.key is null
or hist.column != stg.column
5. Select all the updated old records from the HIST table using an exclusive left join with the STG table, and set their expiry date as shown below:
select hist.key, hist.column, hist.eff_dt, current_date as exp_dt from
(select * from HIST_TAB where exp_dt = '2099-12-31') hist
left join STG_TAB stg
on hist.key = stg.key
where stg.key is null
or hist.column != stg.column
6. UNION ALL the queries from steps 2-5 and INSERT OVERWRITE the result into the HIST table, as sketched below.
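A sketch of step 6, assuming the simplified schemas HIST_TAB(id, col, eff_dt, exp_dt) and STG_TAB(id, col), where id and col stand for the key and compared columns used in the steps above (in practice you would normally stage this into a temporary table before overwriting HIST_TAB, since it reads the table it overwrites):
INSERT OVERWRITE TABLE HIST_TAB
SELECT id, col, eff_dt, exp_dt
FROM (
    -- step 2: history rows that are already expired
    SELECT id, col, eff_dt, exp_dt
    FROM HIST_TAB
    WHERE exp_dt != '2099-12-31'
    UNION ALL
    -- step 3: current rows that did not change
    SELECT hist.id, hist.col, hist.eff_dt, hist.exp_dt
    FROM HIST_TAB hist
    INNER JOIN STG_TAB stg ON hist.id = stg.id
    WHERE hist.exp_dt = '2099-12-31' AND hist.col = stg.col
    UNION ALL
    -- step 4: new and changed rows, opened today
    SELECT stg.id, stg.col, current_date AS eff_dt, '2099-12-31' AS exp_dt
    FROM STG_TAB stg
    LEFT JOIN (SELECT * FROM HIST_TAB WHERE exp_dt = '2099-12-31') hist
        ON hist.id = stg.id
    WHERE hist.id IS NULL OR hist.col != stg.col
    UNION ALL
    -- step 5: changed or deleted rows, closed today
    SELECT hist.id, hist.col, hist.eff_dt, current_date AS exp_dt
    FROM (SELECT * FROM HIST_TAB WHERE exp_dt = '2099-12-31') hist
    LEFT JOIN STG_TAB stg ON hist.id = stg.id
    WHERE stg.id IS NULL OR hist.col != stg.col
) merged;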
A more detailed implementation of SCD type 2 can be found here:
https://github.com/sahilbhange/slowly-changing-dimension
drop table if exists harsha.emp;
drop table if exists harsha.emp_tmp1;
drop table if exists harsha.emp_tmp2;
drop table if exists harsha.init_load;
show databases;
use harsha;
show tables;
create table harsha.emp (eid int,ename string,sal int,loc string,dept int,start_date timestamp,end_date timestamp,current_status string)
comment "emp scd implementation"
row format delimited
fields terminated by ','
lines terminated by '\n'
;
create table harsha.emp_tmp1 (eid int,ename string,sal int,loc string,dept int,start_date timestamp,end_date timestamp,current_status string)
comment "emp scd implementation"
row format delimited
fields terminated by ','
lines terminated by '\n'
;
create table harsha.emp_tmp2 (eid int,ename string,sal int,loc string,dept int,start_date timestamp,end_date timestamp,current_status string)
comment "emp scd implementation"
row format delimited
fields terminated by ','
lines terminated by '\n'
;
create table harsha.init_load (eid int,ename string,sal int,loc string,dept int)
row format delimited
fields terminated by ','
lines terminated by '\n'
;
show tables;
insert into table harsha.emp select 101 as eid,'aaaa' as ename,3400 as sal,'chicago' as loc,10 as did,from_unixtime(unix_timestamp()) as start_date,from_unixtime(unix_timestamp('9999-12-31 23:59:59','yyyy-MM-dd HH:mm:ss')) as end_date,'current' as current_status from (select '123')x;
insert into table harsha.emp select 102 as eid,'abaa' as ename,6400 as sal,'ny' as loc,10 as did,from_unixtime(unix_timestamp()) as start_date,from_unixtime(unix_timestamp('9999-12-31 23:59:59','yyyy-MM-dd HH:mm:ss')) as end_date,'current' as current_status from (select '123')x;
insert into table harsha.emp select 103 as eid,'abca' as ename,2300 as sal,'sfo' as loc,20 as did,from_unixtime(unix_timestamp()) as start_date,from_unixtime(unix_timestamp('9999-12-31 23:59:59','yyyy-MM-dd HH:mm:ss')) as end_date,'current' as current_status from (select '123')x;
insert into table harsha.emp select 104 as eid,'afga' as ename,3000 as sal,'seattle' as loc,10 as did,from_unixtime(unix_timestamp()) as start_date,from_unixtime(unix_timestamp('9999-12-31 23:59:59','yyyy-MM-dd HH:mm:ss')) as end_date,'current' as current_status from (select '123')x;
insert into table harsha.emp select 105 as eid,'ikaa' as ename,1400 as sal,'LA' as loc,30 as did,from_unixtime(unix_timestamp()) as start_date,from_unixtime(unix_timestamp('9999-12-31 23:59:59','yyyy-MM-dd HH:mm:ss')) as end_date,'current' as current_status from (select '123')x;
insert into table harsha.emp select 106 as eid,'cccc' as ename,3499 as sal,'spokane' as loc,20 as did,from_unixtime(unix_timestamp()) as start_date,from_unixtime(unix_timestamp('9999-12-31 23:59:59','yyyy-MM-dd HH:mm:ss')) as end_date,'current' as current_status from (select '123')x;
insert into table harsha.emp select 107 as eid,'toiz' as ename,4000 as sal,'WA.DC' as loc,40 as did,from_unixtime(unix_timestamp()) as start_date,from_unixtime(unix_timestamp('9999-12-31 23:59:59','yyyy-MM-dd HH:mm:ss')) as end_date,'current' as current_status from (select '123')x;
load data local inpath 'Documents/hadoop_scripts/t3.txt' into table harsha.emp;
load data local inpath 'Documents/hadoop_scripts/t4.txt' into table harsha.init_load;
insert into table harsha.emp_tmp1 select eid,ename,sal,loc,dept,from_unixtime(unix_timestamp()) as start_date,from_unixtime(unix_timestamp('9999-12-31 23:59:59','yyyy-MM-dd HH:mm:ss')) as end_date,'current' as current_status
from harsha.init_load;
insert into table harsha.emp_tmp2
select a.eid,a.ename,a.sal,a.loc,a.dept,from_unixtime(unix_timestamp()) as start_date,from_unixtime(unix_timestamp('9999-12-31 23:59:59','yyyy-MM-dd HH:mm:ss')) as end_date,'updated' as current_status from emp_tmp1 a
left outer join emp b on
a.eid=b.eid and
a.ename=b.ename and
a.sal=b.sal and
a.loc = b.loc and
a.dept = b.dept
where b.eid is null
union all
select a.eid,a.ename,a.sal,a.loc,a.dept,from_unixtime(unix_timestamp()) as start_date,from_unixtime(unix_timestamp('9999-12-31 23:59:59','yyyy-MM-dd HH:mm:ss')) as end_date,'current' as current_status from emp_tmp1 a
left outer join emp b on
a.eid = b.eid and
a.ename=b.ename and
a.sal=b.sal and
a.loc=b.loc and
a.dept=b.dept
where b.eid is not null
union all
select b.eid,b.ename,b.sal,b.loc,b.dept,b.start_date as start_date,from_unixtime(unix_timestamp()) as end_date,'expired' as current_status from emp b
inner join emp_tmp1 a on
a.eid=b.eid
where
a.ename <> b.ename or
a.sal <> b.sal or
a.loc <> b.loc or
a.dept <> b.dept
;
insert into table harsha.emp select eid,ename,sal,loc,dept,start_date,end_date,current_status from emp_tmp2;
Records, including expired ones:
select * from harsha.emp order by eid;
Latest records:
select a.*
from emp a
inner join (select eid, max(start_date) as start_date
            from emp
            where current_status <> 'expired'
            group by eid) b
on a.eid = b.eid and a.start_date = b.start_date;
I use another approach when it comes to managing data with SCDs:
Never update data that already exists inside your historical file or table.
Make sure that new rows are compared to the most recent generation. For instance, the load logic adds control columns: loaded_on, checksum and, if needed, a sequence column that is used when multiple loads occur on the same day. Comparing new data to the most recent generation then uses both the control columns and a key column that exists inside your data, such as a customer or product key.
Now, the magic takes place by computing the checksum of all the involved columns except the control columns, creating a unique fingerprint for each row. The fingerprint (checksum) column is then used to determine whether any columns have changed compared to the most recent generation (the most recent generation being the latest state of the data, based on the key, loaded_on and sequence).
You then know whether a row coming from your daily update is new (there is no previous generation), requires a new row (a new generation) in your historical file or table because something changed, or has no changes at all, in which case no row is needed since there is no difference from the previous generation.
This type of logic can be built using Apache Spark: in a single statement you can ask Spark to concatenate any number of columns of any datatype, then compute a hash value that fingerprints the row.
All together, you can now develop a Spark-based utility that accepts any data source and outputs a well organized, clean, slowly-changing-dimension-aware historical file or table. Last of all: never update, append only!
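A minimal Spark SQL sketch of the fingerprinting step (table and column names here are illustrative, not from the post):
SELECT s.*,
       current_timestamp() AS loaded_on,
       -- fingerprint over the data columns only, never the control columns;
       -- coalesce NULLs explicitly in real use, since concat_ws skips them
       sha2(concat_ws('|',
                      CAST(customer_key AS STRING),
                      CAST(customer_name AS STRING),
                      CAST(city AS STRING)), 256) AS checksum
FROM staging_customers s;
A new generation is appended only when this checksum differs from the checksum of the most recent generation for the same customer_key.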

How to update a table field by using another table and a function

I have two tables and one function.
Table1 contains shop_code, batch_id, registry_id:
shop_code| batch_id|registry_id
123 | 100 |12
124 | 100 |13
125 | 100 |12
Table2 contains shop_code, shop_name:
shop_code| shop_name
123 | need to populate
124 | need to populate
125 | need to populate
Function1 takes registry_id from Table1 as a parameter and returns shop_name.
Table2's shop_name is empty; I want to populate it against each shop_code.
I have tried my best, but all my efforts have been in vain.
It would be great if someone could help. I am using Oracle.
I tried the code below, but it gives an error on the FROM keyword:
update TABLE2 set T2.SHOP_NAME = T.SHOP_NAME
from(
select GET_shop_name(t1.registry_id) as shop_name ,
t1.shop_code shop_code
from TABLE1 T1
) t where t.shop_code = t1.shop_code;
I am not entirely 100% sure if I got your question right, but I believe you want something like
update
    table2 u
set
    shop_name = (
        select
            get_shop_name(t1.registry_id)
        from
            table1 t1
        where
            t1.shop_code = u.shop_code
    );
Can you try this approach: put an inner query in place to get the shop_name value. I have not tested it, but I think the approach will work for you.
update TABLE2 T2
set T2.SHOP_NAME =
    (select GET_shop_name(t1.registry_id) from table1 t1 where t1.shop_code = t2.shop_code)
where T2.shop_name is null
You want the MERGE statement.
Something like this might work:
MERGE INTO TABLE2 t2
USING (
SELECT GET_shop_name(t1.registry_id) AS shop_name ,
t1.shop_code shop_code
FROM TABLE1 T1 ) t1
ON (t2.shop_code = t1.shop_code)
WHEN MATCHED THEN
UPDATE SET t2.shop_name = t1.shop_name
;
You'll have to excuse me if the exact code above doesn't work; I don't have SQL Developer where I am right now to check the syntax details. :)

Comparing two tables, if rows are different, run query in Oracle

Suppose my two tables have the same columns. One column is the ID, and the other one is the text. Is it possible to implement the following pseudocode in PL/SQL?
Compare each row (They will have the same ID)
If anything is different about them
Run a couple of queries: an Update, and an Insert
ElseIf they are the same
Do nothing
Else the row does not exist
So add the row to the table compared on
Is it easy to do this using PL/SQL, or should I create a standalone application to implement this logic?
As your tables have the same columns, by using a NATURAL JOIN you can easily check whether two corresponding rows are identical, without needing to update your code if a column is added to your tables.
In addition, using an OUTER JOIN allows you to find the rows present in one table but not in the other.
So, you can use something like that to achieve your purpose:
for rec in (
    SELECT T.ID ID1,
           U.ID ID2,
           V.EQ
    FROM T
    FULL OUTER JOIN U ON T.ID = U.ID
    FULL OUTER JOIN (SELECT ID, 1 EQ FROM T NATURAL JOIN U) V ON U.ID = V.ID)
loop
    if rec.id1 is null
    then
        null; -- row in U but not in T
    elsif rec.id2 is null
    then
        null; -- row in T but not in U
    elsif rec.eq is null
    then
        null; -- row present in both tables
              -- but content mismatch
    end if;
end loop;
Else the row does not exist
So add the row to the table compared on
Does this condition mean that rows can be missing from both tables? If only from one, then:
insert into t1 (id, text)
select id, text
from t2
minus
select id, text
from t1;
If missing records can be in both tables, you also need the same query in the other direction, inserting into t2 the rows from t1.
If anything is different about them
If you need a single action regardless of how many rows differ, then use something like this:
select count(*)
into a
from t1, t2
where t1.id = t2.id and t1.text <> t2.text;
if a > 0 then
...
otherwise:
for i in (
    select t1.id, t1.text as text1, t2.text as text2
    from t1, t2
    where t1.id = t2.id and t1.text <> t2.text) loop
    <do something>
end loop;
A MERGE statement is what you need.
Here is the syntax:
MERGE INTO TARGET_TABLE
USING SOURCE_TABLE
ON (CONDITION)
WHEN MATCHED THEN
UPDATE SET (DO YOUR UPDATES)
WHEN NOT MATCHED THEN
(INSERT YOUR NEW ROWS)
Google MERGE syntax for more about the statement.
Just use MINUS.
query_1
MINUS
query_2
In your case, if you really want to use PL/SQL, select the count into a local variable and write the logic: if count > 0, then do the other work.
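A minimal PL/SQL sketch of that idea, assuming the two tables are t1(id, text) and t2(id, text):
declare
    l_diff_count pls_integer;
begin
    -- count rows that differ in either direction
    select count(*)
    into l_diff_count
    from (
        (select id, text from t1
         minus
         select id, text from t2)
        union all
        (select id, text from t2
         minus
         select id, text from t1)
    );
    if l_diff_count > 0 then
        null; -- run your UPDATE and INSERT here
    end if;
end;
/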
