Oracle to retrieve data - oracle

I have a table with id, name, update_date, etc. columns.
Select distinct id from table1 order by update_date desc;
With the above query I am getting duplicate values as well. I need to retrieve each distinct id together with its latest update date.

Generally speaking, this might do the job:
select id,
max(update_date) max_update_date
from table1
group by id;
This works because MAX returns the latest UPDATE_DATE for each group, and GROUP BY already returns distinct IDs (so you don't have to specify DISTINCT).
If it does not, provide a test case and explain what output you'd like to get; someone will assist.
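As a quick way to check the GROUP BY / MAX approach, here is a minimal sketch that runs the same query against a few made-up rows. It uses Python's built-in sqlite3 module as a stand-in for Oracle, and the sample data and the extra ORDER BY (to list the most recently updated ids first) are assumptions, not part of the original question.

```python
import sqlite3

# Hypothetical sample data; sqlite3 is only a stand-in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, name TEXT, update_date TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (1, "a", "2020-01-01"),
        (1, "b", "2020-03-01"),
        (2, "c", "2020-02-01"),
        (2, "d", "2020-01-15"),
        (3, "e", "2020-02-10"),
    ],
)

# One row per id with its latest update_date, most recently updated id first.
latest_per_id = conn.execute(
    """
    SELECT id, MAX(update_date) AS max_update_date
    FROM table1
    GROUP BY id
    ORDER BY max_update_date DESC
    """
).fetchall()

for row in latest_per_id:
    print(row)
```

Each id appears exactly once, even though the table holds several rows for ids 1 and 2.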

Related

ORACLE SQL: Can someone explain to me the difference between these two?

I'm a student and this is my first year of learning Oracle SQL. On the exam, I used this code:
SELECT department_id,MAX(salary)
FROM employees
GROUP BY department_id
HAVING INSTR(TO_CHAR(department_id),'5')!=1 AND MAX(salary)<1000;
and the professor said that I should be using something like
SELECT DEPARTMENT_ID, MAX(SALARY)
FROM EMPLOYEES
WHERE SUBSTR(TO_CHAR(DEPARTMENT_ID),1,1)<>'5'
GROUP BY DEPARTMENT_ID
HAVING MAX(SALARY) < 1000;
We got the same result table, so my question is: when can these two queries display different results? I'm aware that the data processing is different, but as he said, that was not the problem. The problem is not the use of the INSTR function, but that I did not use WHERE.
The WHERE clause works on the raw contents of the rows, so it filters the dataset before it is grouped and evaluated for the SELECT clause. The condition
WHERE SUBSTR(TO_CHAR(DEPARTMENT_ID),1,1)<>'5'
discards the rows with
SUBSTR(TO_CHAR(DEPARTMENT_ID),1,1)='5'
so these rows are never used for
SELECT DEPARTMENT_ID, MAX(SALARY) ... GROUP BY DEPARTMENT_ID
HAVING works on the grouped result, so the rows with SUBSTR(TO_CHAR(DEPARTMENT_ID),1,1)='5' are still read and aggregated, and their groups are only discarded afterwards.
In your case the condition refers only to DEPARTMENT_ID, which is the GROUP BY column and therefore has a single value per group, so both queries return the same result.
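To see that both forms agree when the filter only touches the grouping column, here is a small sketch with invented rows. It uses Python's sqlite3 module as a stand-in for Oracle, so TO_CHAR is replaced by CAST(... AS TEXT); the table contents are hypothetical.

```python
import sqlite3

# Hypothetical data; sqlite3 stands in for Oracle, CAST replaces TO_CHAR.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (department_id INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(10, 900), (10, 800), (15, 950), (20, 1500), (50, 700), (51, 600)],
)

# Student's version: everything is filtered after grouping.
having_only = conn.execute(
    """
    SELECT department_id, MAX(salary)
    FROM employees
    GROUP BY department_id
    HAVING INSTR(CAST(department_id AS TEXT), '5') != 1 AND MAX(salary) < 1000
    ORDER BY department_id
    """
).fetchall()

# Professor's version: rows are filtered before grouping, aggregates after.
where_then_having = conn.execute(
    """
    SELECT department_id, MAX(salary)
    FROM employees
    WHERE SUBSTR(CAST(department_id AS TEXT), 1, 1) <> '5'
    GROUP BY department_id
    HAVING MAX(salary) < 1000
    ORDER BY department_id
    """
).fetchall()

print(having_only)
print(where_then_having)
```

Departments 50 and 51 are dropped by both queries; the WHERE version simply drops their rows before any grouping takes place.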

Delete data based on the count & timestamp using PL/SQL

I'm new to PL/SQL programming and come from a DBA background. I have a requirement to delete data from both a main table and a reference table. We need to delete 30M of data from the tables, so the data should be reduced based on the "State_ID" column using the logic below.
The following conditions need to be considered:
1. As per the sample data (main table), sort the data by timestamp in descending order, keep the first 2 rows for each "state_id", and delete the rest of the data from both tables based on the "state_id" column.
2. select state_id,count(*) from maintable group by state_id order by timestamp desc Having count(*)>2;
So if state_id=1 has 5 rows, 3 rows have to be deleted, leaving the first 2 rows for state_id=1, and the same is repeated for the other state_id values.
The matching data should be deleted from the reference table as well.
Could someone please help me with this issue? Thanks.
You should be able to do each table delete as a single SQL command. Anything else would essentially force row-by-row processing, which is the last thing you want for that much data. Something like this:
delete from main_table m
where m.row_id not in (
  with keep_me as (
    select row_id,
           row_number() over (partition by state_id
                              order by time_stamp desc) id_row_number
    from main_table)
  -- filter outside the CTE: a window function result cannot be used in the WHERE of the query that computes it
  select row_id from keep_me where id_row_number < 3);
or
delete from main_table m
where m.row_id in (
  with delete_me as (
    select row_id,
           row_number() over (partition by state_id
                              order by time_stamp desc) id_row_number
    from main_table)
  -- same idea, but selecting the rows to remove instead of the rows to keep
  select row_id from delete_me where id_row_number > 2);
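The "keep the two newest rows per state_id" idea can be tried out on a toy table. The following is a sketch only: it uses Python's sqlite3 module as a stand-in for Oracle (window functions need SQLite 3.25 or later), a derived table instead of a WITH clause, and invented rows and column values.

```python
import sqlite3

# Hypothetical rows; sqlite3 (3.25+ for window functions) stands in for Oracle.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE main_table ("
    "row_id INTEGER PRIMARY KEY, state_id INTEGER, time_stamp TEXT)"
)
conn.executemany(
    "INSERT INTO main_table VALUES (?, ?, ?)",
    [
        (1, 1, "2021-01-01"), (2, 1, "2021-01-02"), (3, 1, "2021-01-03"),
        (4, 1, "2021-01-04"), (5, 1, "2021-01-05"),
        (6, 2, "2021-02-01"), (7, 2, "2021-02-02"), (8, 2, "2021-02-03"),
        (9, 3, "2021-03-01"),
    ],
)

# Number the rows of each state_id from newest to oldest in an inner query,
# then delete every row whose number is greater than 2.
conn.execute(
    """
    DELETE FROM main_table
    WHERE row_id IN (
        SELECT row_id
        FROM (
            SELECT row_id,
                   ROW_NUMBER() OVER (PARTITION BY state_id
                                      ORDER BY time_stamp DESC) AS id_row_number
            FROM main_table
        )
        WHERE id_row_number > 2
    )
    """
)

remaining = conn.execute(
    "SELECT state_id, row_id FROM main_table ORDER BY state_id, time_stamp DESC"
).fetchall()
print(remaining)
```

State 1 goes from five rows to two, state 2 from three to two, and state 3 keeps its single row.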

FILTER WHERE at count in ClickHouse

I'm trying to migrate one of my Postgres tables to ClickHouse. Here is what I came up with in ClickHouse:
CREATE TABLE loads(
country_id UInt16,
partner_id UInt32,
is_unique UInt8,
ip String,
created_at DateTime
) ENGINE=MergeTree PARTITION BY toYYYYMM(created_at) ORDER BY (created_at);
is_unique here is a Boolean stored as 0 or 1. I want to know the count for the aggregates country_id, partner_id and created_at, but I also want to know how many of these loads are unique loads. In Postgres it looks like:
SELECT
count(*) AS loads,
count(*) FILTER (WHERE is_unique) AS uniq,
country_id,
partner_id,
created_at::date AS ts
FROM loads
GROUP BY ts, country_id, partner_id
Is this possible in ClickHouse, or should I rethink how I aggregate the data? I didn't find any clues in the manual, except that count can take an expression instead of an asterisk, but count(is_unique = 1) doesn't work and just returns the same number as count(*).
I found an answer just minutes after posting:
SELECT count(*), countIf(is_unique = 1) /* .. */
Good luck.
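The reason count(is_unique = 1) matches count(*) is that count(expr) counts the rows where the expression is not NULL, and is_unique = 1 evaluates to 0 or 1, never NULL. The sketch below illustrates this with Python's sqlite3 module as a stand-in: SQLite has no countIf, so a SUM over a CASE expression plays the role of the conditional count, and the sample rows are invented.

```python
import sqlite3

# Hypothetical rows; sqlite3 is a stand-in, it has no countIf like ClickHouse.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE loads (country_id INTEGER, partner_id INTEGER, is_unique INTEGER)"
)
conn.executemany(
    "INSERT INTO loads VALUES (?, ?, ?)",
    [(1, 1, 1), (1, 1, 0), (1, 1, 1), (2, 1, 0), (2, 1, 0)],
)

result = conn.execute(
    """
    SELECT country_id,
           partner_id,
           COUNT(*)             AS loads,
           COUNT(is_unique = 1) AS count_of_expr,  -- counts non-NULL values
           SUM(CASE WHEN is_unique = 1 THEN 1 ELSE 0 END) AS uniq
    FROM loads
    GROUP BY country_id, partner_id
    ORDER BY country_id, partner_id
    """
).fetchall()

for row in result:
    print(row)
```

In every group count_of_expr equals loads, while uniq only counts the rows where is_unique is 1.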

Oracle performance Issue

I need help with query performance.
I have a table A joined to a view, and the query takes 7 seconds to return results. But when I run a select query on the view alone, I get the results in 1 second.
I have created indexes on table A, but the query has not improved.
SELECT
ITEM_ID, BARCODE, CONTENT_TYPE_CODE, DEPARTMENT, DESCRIPTION, ITEM_NUMBER, FROM_DATE,
TO_DATE, CONTACT_NAME, FILE_LOCATION, FILE_LOCATION_UPPER, SOURCE_LOCATION,
DESTRUCTION_DATE, SOURCE, LABEL_NAME, ARTIST_NAME, TITLE, SELECTION_NUM, REP_IDENTIFIER,
CHECKED_OUT
FROM View B,
table A
where B.item_id=A.itemid
and status='VALID'
AND session_id IN ('naveen13122016095800')
ORDER BY item_id,barcode;
CREATE TABLE A
(
ITEMID NUMBER,
USER_NAME VARCHAR2(25 BYTE),
CREATE_DATE DATE,
SESSION_ID VARCHAR2(240 BYTE),
STATUS VARCHAR2(20 BYTE)
);
CREATE UNIQUE INDEX A_IDX1 ON A(ITEMID);
CREATE INDEX A_IDX2 ON A(SESSION_ID);
CREATE INDEX A_IDX3 ON A(STATUS);
So querying the view joined to a table is slower than querying the view alone? This is not surprising, is it?
Anyway, separate single-column indexes on these fields are unlikely to help much. The DBMS will usually pick one index (if any) to access the table. You can try a composite index:
CREATE UNIQUE INDEX A_IDX4 ON A(status, session_id, itemid);
But the DBMS will still only use this index when it sees an advantage over simply reading the full table. That is, if the DBMS expects to read a large share of the records anyway, it won't access them indirectly via the index.
Finally, two remarks concerning your query:
Don't use those outdated comma-separated joins. They are less readable and more prone to errors than explicit ANSI joins (FROM View B JOIN table A ON B.item_id = A.itemid).
Use qualifiers for all columns when working with more than one table or view in your query (and A.status='VALID' ...).
UPDATE: I see now that you are not selecting any columns from the table, so why join it at all? It seems you are merely checking whether a record exists in the table, so use EXISTS or IN accordingly. (This may not make it faster, but it is at least a lot more readable.)
SELECT
ITEM_ID, BARCODE, CONTENT_TYPE_CODE, DEPARTMENT, DESCRIPTION, ITEM_NUMBER, FROM_DATE,
TO_DATE, CONTACT_NAME, FILE_LOCATION, FILE_LOCATION_UPPER, SOURCE_LOCATION,
DESTRUCTION_DATE, SOURCE, LABEL_NAME, ARTIST_NAME, TITLE, SELECTION_NUM, REP_IDENTIFIER,
CHECKED_OUT
FROM View
WHERE item_id IN
(
SELECT itemid
FROM A
WHERE status = 'VALID'
AND session_id IN ('naveen13122016095800')
)
ORDER BY item_id, barcode;
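The EXISTS form mentioned in the update can be compared with the IN form on a toy schema. This is a sketch under stated assumptions: Python's sqlite3 module stands in for Oracle, and the names items and item_view, the reduced column list and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical miniature schema; sqlite3 stands in for Oracle.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE items (item_id INTEGER, barcode TEXT);
    CREATE VIEW item_view AS SELECT item_id, barcode FROM items;
    CREATE TABLE a (itemid INTEGER, session_id TEXT, status TEXT);
    INSERT INTO items VALUES (1, 'B1'), (2, 'B2'), (3, 'B3');
    INSERT INTO a VALUES (1, 's1', 'VALID'), (2, 's1', 'INVALID'), (3, 's2', 'VALID');
    """
)

# Lookup with IN: the subquery returns the matching itemid values.
in_rows = conn.execute(
    """
    SELECT item_id, barcode
    FROM item_view
    WHERE item_id IN (
        SELECT itemid FROM a
        WHERE status = 'VALID' AND session_id = 's1'
    )
    ORDER BY item_id, barcode
    """
).fetchall()

# Same lookup with a correlated EXISTS.
exists_rows = conn.execute(
    """
    SELECT v.item_id, v.barcode
    FROM item_view v
    WHERE EXISTS (
        SELECT 1 FROM a
        WHERE a.itemid = v.item_id
          AND a.status = 'VALID'
          AND a.session_id = 's1'
    )
    ORDER BY v.item_id, v.barcode
    """
).fetchall()

print(in_rows, exists_rows)
```

Both forms return only the view rows that have a valid entry for the given session; which one the optimizer handles better would have to be measured on the real data.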

Column name is masked in oracle indexes

I have a table in an Oracle database which has a unique index composed of two columns (id and valid_from). The column valid_from is of type timestamp with time zone.
When I query SYS.USER_IND_COLUMNS to see which columns my table uses in the unique index, I cannot see the name of the valid_from column; instead I see something like SYS_NC00027$.
Is there any way to display the name valid_from rather than SYS_NC00027$?
Apparently Oracle creates a function based index for timestamp with time zone columns.
The definition of them can be found in the view ALL_IND_EXPRESSIONS
Something like this should get you started:
select ic.index_name,
ic.column_name,
ie.column_expression
from all_ind_columns ic
left join all_ind_expressions ie
on ie.index_owner = ic.index_owner
and ie.index_name = ic.index_name
and ie.column_position = ic.column_position
where ic.table_name = 'FOO';
Unfortunately column_expression is a (deprecated) LONG column and cannot easily be used in a coalesce() or nvl() function.
Use the query below to verify the column information:
select column_name,virtual_column,hidden_column,data_default from user_tab_cols where table_name='EMP';
