How to make duplicate values in an existing table column unique in Vertica?

I have a Vertica table "Product" that contains product_id, order_id and some more columns. order_id is a varchar column.
It has data like this:
product_id order_id
1 a:111
2 a:222
3 a:111
2 a:444
1 a:222
4 a:111
Now I want to update order_id in each row where it is duplicated, so that every value becomes unique, like this:
product_id order_id
1 a:112
2 a:222
3 a:113
2 a:444
1 a:223
4 a:114
How can I do this?

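I am not sure that updating in place is the right approach in Vertica.
However, you can load the de-duplicated rows into a new table with the statement below.
Note that a bumped value can still collide with an order_id that already exists (for example a:112), so check the result before using it.
INSERT INTO new_table
SELECT product_id
-- , order_id
-- order_id is a varchar, so split off the numeric suffix before adding to it;
-- the first row of each duplicate group keeps its value, later rows get +1, +2, ...
, SPLIT_PART(order_id, ':', 1) || ':' || (
SPLIT_PART(order_id, ':', 2)::INT
+ ROW_NUMBER() OVER (
PARTITION BY order_id ORDER BY product_id
) - 1
)::VARCHAR AS order_id
FROM Product;
Let us know if this works for you.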
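To try the numbering idea without a Vertica instance, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in (an assumption: SQLite 3.25 or later for window functions, and SQLite's substr/instr in place of Vertica string functions). The function name make_order_ids_unique is made up for illustration. It partitions by order_id, so the first row of each duplicate group keeps its value and later rows get the numeric suffix bumped by 1, 2, and so on. This yields unique values for the sample data, but not the exact values listed in the question.

```python
import sqlite3

def make_order_ids_unique(rows):
    """Return (product_id, new_order_id) pairs with duplicate order_ids bumped."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE product (product_id INTEGER, order_id TEXT)")
    conn.executemany("INSERT INTO product VALUES (?, ?)", rows)
    query = """
        SELECT product_id,
               -- prefix up to and including the colon, e.g. 'a:'
               substr(order_id, 1, instr(order_id, ':'))
               -- numeric suffix plus (position within the duplicate group - 1)
               || (CAST(substr(order_id, instr(order_id, ':') + 1) AS INTEGER)
                   + ROW_NUMBER() OVER (PARTITION BY order_id
                                        ORDER BY product_id) - 1)
               AS new_order_id
        FROM product
        ORDER BY product_id, new_order_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Sample data from the question.
sample = [(1, "a:111"), (2, "a:222"), (3, "a:111"),
          (2, "a:444"), (1, "a:222"), (4, "a:111")]

for product_id, order_id in make_order_ids_unique(sample):
    print(product_id, order_id)
```

A bumped value could still clash with an order_id that was already present in the table, so a real migration would need an extra check for that case.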

Related

Oracle, Select Distinct columns, with corresponding columns

I want to select DISTINCT results from the user_id column, but I need the corresponding columns as well.
The result set needs to return the two role_id values that have distinct user_id values and whose role_code is not 'Unassigned'.
The query I am using:
SELECT role_id, user_id, role_code, status_code FROM table where school_id=5 and status_code= 'DRAFT';
This is an example of my table:
ROLE_ID USER_ID SCHOOL_ID CAMPUS_ID ROLE_CODE STATUS_CODE
1 4 5 7 Unassigned DRAFT
2 4 5 7 TEST DRAFT
3 4 5 8 TEST DRAFT
4 5 5 9 Unassigned DRAFT
5 5 5 9 TEST DRAFT
6 5 5 10 TEST DRAFT
I have tried to add a GROUP BY on user_id, but I get an ORA-00979.
You can use ROW_NUMBER() to identify the rows you want. For example:
select *
from (
select t.*,
row_number() over(partition by user_id order by role_id) as rn
from t
where role_code <> 'Unassigned'
) x
where rn = 1
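The ROW_NUMBER() filter above can be checked against the sample table with a small sketch. It uses Python's sqlite3 module rather than Oracle (an assumption made only so the example runs anywhere; window functions need SQLite 3.25 or later), and the function name first_assigned_role_per_user is hypothetical. The school_id and status_code filters from the question are added to the inner query.

```python
import sqlite3

# Sample rows from the question:
# (role_id, user_id, school_id, campus_id, role_code, status_code)
ROWS = [
    (1, 4, 5, 7, "Unassigned", "DRAFT"),
    (2, 4, 5, 7, "TEST", "DRAFT"),
    (3, 4, 5, 8, "TEST", "DRAFT"),
    (4, 5, 5, 9, "Unassigned", "DRAFT"),
    (5, 5, 5, 9, "TEST", "DRAFT"),
    (6, 5, 5, 10, "TEST", "DRAFT"),
]

def first_assigned_role_per_user(rows, school_id=5, status_code="DRAFT"):
    """Return one row per user_id: the lowest role_id that is not 'Unassigned'."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (role_id INTEGER, user_id INTEGER, school_id INTEGER, "
        "campus_id INTEGER, role_code TEXT, status_code TEXT)"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)", rows)
    query = """
        SELECT role_id, user_id, role_code, status_code
        FROM (
            SELECT t.*,
                   ROW_NUMBER() OVER (PARTITION BY user_id
                                      ORDER BY role_id) AS rn
            FROM t
            WHERE school_id = ?
              AND status_code = ?
              AND role_code <> 'Unassigned'
        ) x
        WHERE rn = 1
        ORDER BY user_id
    """
    result = conn.execute(query, (school_id, status_code)).fetchall()
    conn.close()
    return result

print(first_assigned_role_per_user(ROWS))
```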
DISTINCT is across the entire set of columns and not for one specific column. Therefore, if you want to get the DISTINCT rows which are not Unassigned you can use:
SELECT DISTINCT
role_id,
user_id,
role_code,
status_code
FROM table
where school_id = 5
and status_code = 'DRAFT'
and role_code != 'Unassigned';
If you want to get a single row for each user_id then you can use GROUP BY and find the minimum role_id:
SELECT MIN(role_id) AS role_id,
user_id,
MIN(role_code ) KEEP (DENSE_RANK FIRST ORDER BY role_id) AS role_code,
MIN(status_code) KEEP (DENSE_RANK FIRST ORDER BY role_id) AS status_code
FROM table
where school_id = 5
and status_code = 'DRAFT'
and role_code != 'Unassigned'
GROUP BY
user_id;

In an Oracle database the same primary ID is present more than once with different batch_id values. How can I find the batch ID just before the current batch ID?

I am working on an Oracle database.
We load customer data into a source table, from which it eventually migrates to a target table.
Every time customer data is loaded into the source table, it gets a unique batch_id.
If we want to update some field in the customer table, we load the same customer into the source table again, this time with a different batch_id.
Now I want to know the customer's batch_id just before the latest batch_id.
The batch_id we use is usually the current date.
Use the ROW_NUMBER analytic function.
Your sample data:
select * from tab
order by 1,2
CUSTOMER_ID BATCH_ID
----------- -------------------
1 09.12.2019 00:00:00
1 10.12.2019 00:00:00
2 10.12.2019 00:00:00
ROW_NUMBER assigns a sequence number starting from 1 for each customer, ordered by BATCH_ID descending. You are interested in the one before the latest, i.e. the rows with the number 2.
with cust as (
select
customer_id, batch_id,
row_number() over (partition by customer_id order by batch_id desc) rn
from tab)
select CUSTOMER_ID, BATCH_ID
from cust
where rn = 2;
CUSTOMER_ID BATCH_ID
----------- -------------------
1 09.12.2019 00:00:00
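As a runnable check of the rn = 2 idea, here is a minimal sketch using Python's sqlite3 module in place of Oracle (an assumption for portability; SQLite 3.25 or later is needed for window functions). Batch dates are stored as ISO text so that they sort chronologically, and the function name previous_batch_per_customer is made up for this example.

```python
import sqlite3

def previous_batch_per_customer(rows):
    """Return (customer_id, batch_id) for the batch just before the latest one."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tab (customer_id INTEGER, batch_id TEXT)")
    conn.executemany("INSERT INTO tab VALUES (?, ?)", rows)
    query = """
        WITH cust AS (
            SELECT customer_id, batch_id,
                   ROW_NUMBER() OVER (PARTITION BY customer_id
                                      ORDER BY batch_id DESC) AS rn
            FROM tab
        )
        SELECT customer_id, batch_id
        FROM cust
        WHERE rn = 2          -- 1 = latest batch, 2 = the one before it
        ORDER BY customer_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Same sample data as above, with dates written as YYYY-MM-DD.
sample = [(1, "2019-12-09"), (1, "2019-12-10"), (2, "2019-12-10")]
print(previous_batch_per_customer(sample))
```

Customers with only one batch, such as customer 2 here, have no row numbered 2 and therefore do not appear in the result.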
It seems that you're basically looking for the second biggest value in the SOURCE table.
In this example code the SOURCE_TABLE represents the table containing same CUSTOMER_NO with different BATCH_NO:
create table source_table (customer_no integer, batch_no date);
insert into source_table values (1, SYSDATE-2);
insert into source_table values (1, SYSDATE-1);
insert into source_table values (1, SYSDATE);
SELECT batch_no
FROM (
SELECT batch_no, row_number() over (order by batch_no desc) as row_num
FROM source_table
) t
WHERE row_num = 2
Where row_num = 2 represents the second biggest value in the table.
The query returns SYSDATE-1.

ORDER BY BASED ON COLUMN

I have two tables, PRODUCTS and LOOKUP. Now I want to order the KEY column in the PRODUCTS table based on the KEY column value in the LOOKUP table.
CREATE TABLE PRODUCTS
(
ID INT,
[KEY] VARCHAR(50)
)
INSERT INTO PRODUCTS
VALUES (1, 'EGHS'), (2, 'PFE'), (3, 'EGHS'),
(4, 'PFE'), (5, 'ABC')
CREATE TABLE LOOKUP (F_KEY VARCHAR(50))
INSERT INTO LOOKUP VALUES('PFE,EGHS,ABC')
Now I want to order the records in PRODUCTS table based on KEY (PFE,EGHS,ABC) values in LOOKUP table.
Example output:
PRODUCTS
ID F_KEY
-----------
2 PFE
4 PFE
1 EGHS
3 EGHS
5 ABC
I use this query, but it is not working
SELECT *
FROM PRODUCTS
ORDER BY (SELECT F_KEY FROM LOOKUP)
You can split the string using XML. You first need to convert the string to XML and replace the comma with start and end XML tags.
Once done, you can assign an incrementing number using ROW_NUMBER() like following.
;WITH cte
AS (SELECT dt,
Row_number()
OVER(
ORDER BY (SELECT 1)) RN
FROM (SELECT Cast('<X>' + Replace(F.f_key, ',', '</X><X>')
+ '</X>' AS XML) AS xmlfilter
FROM [lookup] F)F1
CROSS apply (SELECT fdata.d.value('.', 'varchar(500)') AS DT
FROM f1.xmlfilter.nodes('X') AS fdata(d)) O)
SELECT P.*
FROM products P
LEFT JOIN cte C
ON C.dt = P.[key]
ORDER BY C.rn
Output:
ID F_KEY
-----------
2 PFE
4 PFE
1 EGHS
3 EGHS
5 ABC
You may do it like this:
SELECT ID, [KEY] FROM PRODUCTS
ORDER BY
CASE [KEY]
WHEN 'PFE' THEN 1
WHEN 'EGHS' THEN 2
WHEN 'ABC' THEN 3
END
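If the order should come from the LOOKUP row instead of being hard-coded, one option is to sort by the position of each key inside the comma-separated string. The sketch below shows this with Python's sqlite3 module and SQLite's instr function (an assumption for runnability; in SQL Server the corresponding function is CHARINDEX, with the arguments in the opposite order). The function name products_in_lookup_order is hypothetical. Commas are added around both strings so that a key only matches a whole list entry.

```python
import sqlite3

def products_in_lookup_order(products, lookup_value):
    """Return product rows ordered by where their key appears in lookup_value."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE products (id INTEGER, "KEY" TEXT)')
    conn.execute("CREATE TABLE lookup (f_key TEXT)")
    conn.executemany("INSERT INTO products VALUES (?, ?)", products)
    conn.execute("INSERT INTO lookup VALUES (?)", (lookup_value,))
    query = """
        SELECT p.id, p."KEY"
        FROM products p
        CROSS JOIN lookup l
        -- position of ',KEY,' inside ',PFE,EGHS,ABC,' decides the order
        ORDER BY instr(',' || l.f_key || ',', ',' || p."KEY" || ','), p.id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Sample data from the question.
products = [(1, "EGHS"), (2, "PFE"), (3, "EGHS"), (4, "PFE"), (5, "ABC")]
for row in products_in_lookup_order(products, "PFE,EGHS,ABC"):
    print(row)
```

Keys that do not occur in the lookup string get position 0 and therefore sort first, which may or may not be what is wanted.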

Fetch single record from duplicate rows from oracle table

I have a table user_audit_records_tbl which has multiple rows for a single user. Every time a user logs in, one entry is made in this table, so I want a select query that fetches the latest single record for each user. I have a query that uses an IN clause.
Table Name : user_audit_records_tbl
Record_id Number Primary Key,
user_id varchar Primary Key ,
user_ip varchar,
.
.
etc
The query I am currently using is:
select * from user_audit_records_tbl where record_id in (select
max(record_id) from user_audit_records_tbl
group by user_id);
But I was wondering whether anybody has a better solution, since this table has huge volumes.
You can use the first/last function
select max(Record_id) as Record_id,
user_id,
max(user_ip) keep (dense_rank last order by record_id) as user_ip,
...
from user_audit_records_tbl
group by user_id
Not sure if it will be more efficient.
EDIT: As the above query is less efficient, maybe you could try an EXISTS clause:
select *
from user_audit_records_tbl A
where exists ( select 1
from user_audit_records_tbl B
where A.user_id = B.user_id
group by B.user_id
having max(B.record_id) = A.record_id
)
But maybe you should look at the indexes rather than at the query.
select *
from ( select row_number() over ( partition by user_id order by record_id desc) row_nr,
a.*
from user_audit_records_tbl a
)
where row_nr = 1
;
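The ROW_NUMBER() variant can be tried on a few made-up rows with the sketch below. It runs on Python's sqlite3 module instead of Oracle (an assumption for portability; SQLite 3.25 or later is required), and the sample user names, IP addresses and the function name latest_record_per_user are invented for illustration.

```python
import sqlite3

def latest_record_per_user(rows):
    """Return the row with the highest record_id for each user_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_audit_records_tbl "
        "(record_id INTEGER, user_id TEXT, user_ip TEXT)"
    )
    conn.executemany("INSERT INTO user_audit_records_tbl VALUES (?, ?, ?)", rows)
    query = """
        SELECT record_id, user_id, user_ip
        FROM (
            SELECT ROW_NUMBER() OVER (PARTITION BY user_id
                                      ORDER BY record_id DESC) AS row_nr,
                   a.*
            FROM user_audit_records_tbl a
        ) ranked
        WHERE row_nr = 1      -- newest login per user
        ORDER BY user_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Hypothetical login records: (record_id, user_id, user_ip)
sample = [(1, "alice", "10.0.0.1"), (2, "bob", "10.0.0.2"), (3, "alice", "10.0.0.3")]
print(latest_record_per_user(sample))
```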

SELECT only rows that aren't repeated

So I have a table like this. This is a standard Order header - Order Detail table:
order_id order_line
----------- -----------
100 1
100 2
100 3
101 1
102 1
103 1
103 2
104 1
105 1
Now, how can I make a SELECT that will only pick the orders that only have one line?
In this case I don't want orders 100 and 103.
Thanks!
Tiago
Counting lines using "group by order_id" is a good solution; however, counting is not needed. A simpler MAX function works fine, assuming order_line numbering always starts at 1:
select order_id from orders
group by order_id
having max(order_line)=1;
If order_line has consecutive values, a further "optimization" is possible:
select order_id from orders
where order_line <= 2
group by order_id
having max(order_line)=1;
Group by the order_id and take only those having 1 record per group
select order_id
from orders
group by order_id
having count(*) = 1
If you need the complete record then do
select t1.*
from orders t1
join
(
select order_id
from orders
group by order_id
having count(*) = 1
) t2 on t1.order_id = t2.order_id
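The GROUP BY ... HAVING COUNT(*) = 1 approach can be verified against the sample data with a short sketch. It uses Python's sqlite3 module as the database (an assumption; any SQL engine should behave the same for this query), and the function name single_line_orders is made up for this example.

```python
import sqlite3

def single_line_orders(rows):
    """Return the order_ids that have exactly one order line."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER, order_line INTEGER)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    query = """
        SELECT order_id
        FROM orders
        GROUP BY order_id
        HAVING COUNT(*) = 1   -- keep groups made of a single line
        ORDER BY order_id
    """
    result = [order_id for (order_id,) in conn.execute(query)]
    conn.close()
    return result

# Sample data from the question: (order_id, order_line)
sample = [(100, 1), (100, 2), (100, 3), (101, 1), (102, 1),
          (103, 1), (103, 2), (104, 1), (105, 1)]
print(single_line_orders(sample))
```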
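You can try the following query too. Group by order_id only, since grouping by order_line as well would put every row in its own group:
select order_id, min(order_line) as order_line
from orders
group by order_id
having count(order_id) < 2;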
