Let's assume I have a table tab1 in my Oracle DB 12.1, which has a column record_id (type NUMBER) and many other columns, among them a column named exchg_id.
This record_id is always empty when a batch of new rows gets inserted into the table. What I need to do is populate record_id with the values 1..N for all rows that satisfy the condition ...WHERE EXCHG_ID = 'something', where N is the number of such rows. Of course I know how to do this procedurally (in a for-loop), but I'd like to know if there's a faster way using a single UPDATE statement. I imagine something like this:
UPDATE tab1 SET record_id = {1..N} WHERE exchg_id = 'something';
Many thanks for your help!
UPDATE: the order of the rows is not important, I need no specific ordering. I just need unique record_id's 1..N for any given exchg_id.
You could use rownum to set record_id to the values 1 to N:
UPDATE tab1 SET record_id = rownum WHERE exchg_id = 'something';
If you need an offset, say 10, then use rownum + 10.
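As a minimal sketch of the offset variant, reusing the table and column names from the question, the update can be followed by a quick sanity check. The check query is an addition for illustration, not part of the original answer; after the update, total_rows and distinct_ids should be equal, and first_id and last_id should be 11 and N + 10.

```sql
-- Number the matching rows 11..N+10 (offset of 10, as in the answer above)
UPDATE tab1
SET record_id = rownum + 10
WHERE exchg_id = 'something';

-- Sanity check: every matching row should have a distinct record_id
SELECT COUNT(*)                  AS total_rows,
       COUNT(DISTINCT record_id) AS distinct_ids,
       MIN(record_id)            AS first_id,
       MAX(record_id)            AS last_id
FROM tab1
WHERE exchg_id = 'something';
```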
I am trying to list the top 3 records from a table based on an amount stored in the column FTE_TMUSD, which is of varchar datatype.
Below is the query I tried:
SELECT * FROM
(
SELECT * FROM FSE_TM_ENTRY
ORDER BY FTE_TMUSD desc
)
WHERE rownum <= 3
ORDER BY FTE_TMUSD DESC ;
The output I got:
972,9680,963 --> FTE_TMUSD values, which are not displayed in descending order
I am expecting output that displays the top 3 records by value.
That should work; inline view is ordered by FTE_TMUSD in descending order, and you're selecting values from it.
What looks suspicious are the values you posted as the result. It appears that FTE_TMUSD's datatype is VARCHAR2 (ah, yes - it is, you said so). It means that values are sorted as strings, not numbers - and it seems that you expect numbers. So, apply TO_NUMBER to that column. Note that it'll fail if the column contains anything but numbers (for example, if there's a value 972C).
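Applied to the query from the question, that suggestion might look like the sketch below. It assumes every FTE_TMUSD value is a valid number; otherwise TO_NUMBER raises an error. The outer ORDER BY is kept so that the three returned rows are also displayed in numeric order.

```sql
SELECT *
FROM (
  SELECT *
  FROM FSE_TM_ENTRY
  ORDER BY TO_NUMBER(FTE_TMUSD) DESC   -- sort numerically, not as strings
)
WHERE rownum <= 3
ORDER BY TO_NUMBER(FTE_TMUSD) DESC;
```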
Also, an alternative to your query might be use of analytic functions, such as row_number:
with temp as
(select f.*,
row_number() over (order by to_number(f.fte_tmusd) desc) rn
from fse_tm_entry f
)
select *
from temp
where rn <= 3;
I have a table with two columns (using an Oracle 11g database): Country and IndexNumber. The table contains 10 rows (10 different countries, each with its own unique index number).
For example:
Country IndexNUmber
India 1
Australia 2
. .
. .
. .
. .
US 10
Now I want to fetch a random row from the above table by generating a random number using dbms_random.value(1,10). To achieve that I am using the query below:
select * from tab_name where indexnumber = dbms_random.value(1,10);
I am not able to understand the output of this query: sometimes it fetches one row, sometimes zero rows, and sometimes more than one row.
Can someone please help me understand how Oracle evaluates this query?
Thanks
Ankit
Since dbms_random.value is a nondeterministic PL/SQL function, it will be called once for each row evaluated by the query.
The function might return 4 when evaluating the first row, then it might return 8 on the second row, etc.
To compare each row to a single random number, you can turn the function call into a scalar subquery, e.g.:
select * from tab_name where indexnumber = (select dbms_random.value(1,10) from dual);
Since the subquery is not correlated to the main query, Oracle will execute it only once (for the first row returned from the table) and remember the result for all subsequent rows. In particular, if there is a suitable index on indexnumber, the query will be able to use it more efficiently, since it knows it is probing for a single value.
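One thing to keep in mind with the comparison above: dbms_random.value(1,10) returns a fractional number that is at least 1 and less than 10, so comparing it directly with an integer indexnumber will rarely, if ever, match. A minimal sketch, assuming indexnumber holds the integers 1 to 10, truncates the random value first:

```sql
-- dbms_random.value(1, 11) is >= 1 and < 11, so TRUNC yields an integer from 1 to 10
SELECT *
FROM tab_name
WHERE indexnumber = (SELECT TRUNC(dbms_random.value(1, 11)) FROM dual);
```

The upper bound is 11 rather than 10 so that 10 can also be drawn.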
When you run your original query:
select * from tab_name where indexnumber = dbms_random.value(1,10);
it appears that the call to dbms_random happens once for each record as the where clause is evaluated. In other words, there is a chance that every record in your table might be returned, if the random number chosen for each record happens to match that record's index. If you want to retrieve a single random record, then follow this pattern:
select *
from
( select * from tab_name order by DBMS_RANDOM.VALUE )
where rownum < 2;
I have a table with >1M rows of data and 20+ columns.
Within my table (tableX) I have identified duplicate records (~80k) in one particular column (troubleColumn).
If possible, I would like to retain the original table name and remove the duplicate records from my problematic column; otherwise, I could create a new table (tableXfinal) with the same schema but without the duplicates.
I am not proficient in SQL or any other programming language so please excuse my ignorance.
delete from Accidents.CleanedFilledCombined
where Fixed_Accident_Index
in(select Fixed_Accident_Index from Accidents.CleanedFilledCombined
group by Fixed_Accident_Index
having count(Fixed_Accident_Index) >1);
You can remove duplicates by running a query that rewrites your table (you can use the same table as the destination, or you can create a new table, verify that it has what you want, and then copy it over the old table).
A query that should work is here:
SELECT *
FROM (
SELECT
*,
ROW_NUMBER()
OVER (PARTITION BY Fixed_Accident_Index)
row_number
FROM Accidents.CleanedFilledCombined
)
WHERE row_number = 1
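To write the result back over the same table in standard SQL, a sketch along the following lines could be used. The alias rn and the EXCEPT(rn) projection are choices made here so that the helper column does not end up in the rewritten table. Note that CREATE OR REPLACE TABLE does not carry over partitioning or clustering unless it is specified again, so back up the table first.

```sql
CREATE OR REPLACE TABLE Accidents.CleanedFilledCombined AS
SELECT * EXCEPT(rn)              -- drop the helper column from the output
FROM (
  SELECT
    *,
    ROW_NUMBER() OVER (PARTITION BY Fixed_Accident_Index) AS rn
  FROM Accidents.CleanedFilledCombined
)
WHERE rn = 1;                    -- keep one arbitrary row per key
```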
UPDATE 2019: To de-duplicate rows on a single partition with a MERGE, see:
https://stackoverflow.com/a/57900778/132438
An alternative to Jordan's answer - this one scales better when there are many duplicates:
#standardSQL
SELECT event.* FROM (
SELECT ARRAY_AGG(
t ORDER BY t.created_at DESC LIMIT 1
)[OFFSET(0)] event
FROM `githubarchive.month.201706` t
# GROUP BY the id you are de-duplicating by
GROUP BY actor.id
)
Or a shorter version (takes any row, instead of the newest one):
SELECT k.*
FROM (
SELECT ARRAY_AGG(x LIMIT 1)[OFFSET(0)] k
FROM `fh-bigquery.reddit_comments.2017_01` x
GROUP BY id
)
To de-duplicate rows on an existing table:
CREATE OR REPLACE TABLE `deleting.deduplicating_table`
AS
# SELECT id FROM UNNEST([1,1,1,2,2]) id
SELECT k.*
FROM (
SELECT ARRAY_AGG(row LIMIT 1)[OFFSET(0)] k
FROM `deleting.deduplicating_table` row
GROUP BY id
)
Not sure why nobody mentioned DISTINCT query.
Here is the way to clean duplicate rows:
CREATE OR REPLACE TABLE project.dataset.table
AS
SELECT DISTINCT * FROM project.dataset.table
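SELECT DISTINCT * only collapses rows that are identical in every column, so rows that share a key but differ elsewhere survive. A quick check, sketched here against the table and key column from the question (Fixed_Accident_Index), lists any keys that still appear more than once:

```sql
SELECT
  Fixed_Accident_Index,
  COUNT(*) AS copies
FROM Accidents.CleanedFilledCombined
GROUP BY Fixed_Accident_Index
HAVING COUNT(*) > 1
ORDER BY copies DESC;
```

An empty result means the key is unique.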
If your schema doesn't have any records, the variation of Jordan's answer below will work well enough, whether you write over the same table or to a new one.
SELECT <list of original fields>
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY Fixed_Accident_Index) AS pos,
FROM Accidents.CleanedFilledCombined
)
WHERE pos = 1
In the more generic case, with a complex schema containing records/nested fields, etc., the above approach can be a challenge.
I would propose trying the Tabledata: insertAll API with rows[].insertId set to the respective Fixed_Accident_Index for each row.
In this case duplicate rows will be eliminated by BigQuery.
Of course, this will involve some client-side coding, so it might not be relevant for this particular question.
I haven't tried this approach myself either, but I feel it might be interesting to try :o)
If you have a large partitioned table and only have duplicates in a certain partition range, you don't want to overscan or process the whole table. Use the MERGE SQL below with predicates on the partition range:
-- WARNING: back up the table before this operation
-- FOR large size timestamp partitioned table
-- -------------------------------------------
-- -- To de-duplicate rows of a given range of a partition table, using surrogate_key as unique id
-- -------------------------------------------
DECLARE dt_start DEFAULT TIMESTAMP("2019-09-17T00:00:00", "America/Los_Angeles") ;
DECLARE dt_end DEFAULT TIMESTAMP("2019-09-22T00:00:00", "America/Los_Angeles");
MERGE INTO `gcp_project`.`data_set`.`the_table` AS INTERNAL_DEST
USING (
SELECT k.*
FROM (
SELECT ARRAY_AGG(original_data LIMIT 1)[OFFSET(0)] k
FROM `gcp_project`.`data_set`.`the_table` AS original_data
WHERE stamp BETWEEN dt_start AND dt_end
GROUP BY surrogate_key
)
) AS INTERNAL_SOURCE
ON FALSE
WHEN NOT MATCHED BY SOURCE
AND INTERNAL_DEST.stamp BETWEEN dt_start AND dt_end -- remove all data in partition range
THEN DELETE
WHEN NOT MATCHED THEN INSERT ROW
credit: https://gist.github.com/hui-zheng/f7e972bcbe9cde0c6cb6318f7270b67a
Easier answer, without a subselect
SELECT
*,
ROW_NUMBER()
OVER (PARTITION BY Fixed_Accident_Index)
row_number
FROM Accidents.CleanedFilledCombined
WHERE TRUE
QUALIFY row_number = 1
The WHERE TRUE is necessary because QUALIFY needs a WHERE, GROUP BY or HAVING clause.
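A variant of the same idea, sketched here, puts the window function directly in the QUALIFY clause, so the result has no extra row_number column:

```sql
SELECT *
FROM Accidents.CleanedFilledCombined
WHERE TRUE
QUALIFY ROW_NUMBER() OVER (PARTITION BY Fixed_Accident_Index) = 1
```

As with the query above, no ORDER BY is given inside OVER, so which of the duplicate rows is kept is arbitrary.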
Felipe's answer is the best approach for most cases. Here is a more elegant way to accomplish the same:
CREATE OR REPLACE TABLE Accidents.CleanedFilledCombined
AS
SELECT
Fixed_Accident_Index,
ARRAY_AGG(x LIMIT 1)[SAFE_OFFSET(0)].* EXCEPT(Fixed_Accident_Index)
FROM Accidents.CleanedFilledCombined AS x
GROUP BY Fixed_Accident_Index;
To be safe, make sure you back up the original table before you run this ^^
I don't recommend using the ROW_NUMBER() OVER() approach if you can avoid it, since you may run into BigQuery memory limits and get unexpected errors.
Update the BigQuery schema with a new column named bq_uuid, making it NULLABLE and of type STRING
Create duplicate rows by running the same command 5 times, for example
insert into beginner-290513.917834811114.messages (id, type, flow, updated_at) Values(19999,"hello", "inbound", '2021-06-08T12:09:03.693646')
Check if duplicate entries exist
select * from beginner-290513.917834811114.messages where id = 19999
Use the GENERATE_UUID function to generate a UUID for each message
UPDATE beginner-290513.917834811114.messages
SET bq_uuid = GENERATE_UUID()
where id>0
Clean duplicate entries
DELETE FROM beginner-290513.917834811114.messages
WHERE bq_uuid IN
(SELECT bq_uuid
FROM
(SELECT bq_uuid,
ROW_NUMBER() OVER( PARTITION BY updated_at
ORDER BY bq_uuid ) AS row_num
FROM beginner-290513.917834811114.messages ) t
WHERE t.row_num > 1 );
Can anyone please help me solve this issue?
Table Name:RW_LN
LN_ID RE_LN_ID RE_PR_ID
LN001 RN001 RN002
LN002 RN002 RN003
LN003 RN003 RN001
LN004 RN001 RN002
My update query is:
update table RW_LN set RE_LN_ID=(
select LN_ID
from RW_LN as n1,RW_LN as n2
where n1.RE_LN_ID = n2.RE_PR_ID)
My expected result is:
LN_ID RE_LN_ID
LN001 LN003
LN002 LN004
LN003 LN002
LN004 LN003
The above query fails with the error SUB QUERY RETURNS MULTIPLE ROWS. Can anyone provide a solution for this? I am a beginner in Oracle 9i, so I am stuck on the logic.
You can try to solve this with a distinct:
update table RW_LN set RE_LN_ID=(
select distinct LN_ID
from RW_LN as n1,RW_LN as n2
where n1.RE_LN_ID = n2.RE_PR_ID)
If that still returns multiple rows, it means you are missing a join condition somewhere along the way, or potentially have a bad schema that needs to use primary keys.
If you want to take the "biggest" corresponding LN_ID, you could do
update RW_LN r1
set r1.RE_LN_ID = (select MAX(LN_ID)
FROM RW_LN r2
where r1.RE_LN_ID = r2.RE_PR_ID);
see SqlFiddle
But you should explain why you chose LN004 instead of LN001 as the new RE_LN_ID for LN_ID LN002 (because either one could be chosen).
Just guessing, but possibly this is what you want.
update
RW_LN n1
set
RE_LN_ID=(
select n2.LN_ID
from RW_LN n2
where n1.RE_LN_ID = n2.RE_PR_ID)
where exists (
select null
from RW_LN n2
where n1.RE_LN_ID = n2.RE_PR_ID and
n2.ln_id is not null)
At the moment there is no correlation between the rows you are updating and the value being returned in the subquery.
The query reads as follows:
For every row in RW_LN, change the value of RE_LN_ID to be:
  the value of LN_ID in a row in RW_LN for which:
    the RE_PR_ID equals the original table's value of RE_LN_ID
IF there exists at least one row in RW_LN for which:
  RE_PR_ID is the same as RE_LN_ID in the original table AND
  LN_ID is not null
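To see which rows would still make the subquery return more than one value, a diagnostic along these lines may help. It is only a sketch: the aliases n1 and n2 and the column name candidates are chosen here for illustration.

```sql
-- For each row to be updated, count how many rows could supply the new value
SELECT n1.LN_ID,
       n1.RE_LN_ID,
       COUNT(*) AS candidates
FROM RW_LN n1
JOIN RW_LN n2
  ON n1.RE_LN_ID = n2.RE_PR_ID
GROUP BY n1.LN_ID, n1.RE_LN_ID
HAVING COUNT(*) > 1;
```

With the sample data from the question this returns only LN002 (RE_LN_ID RN002, 2 candidates), because both LN001 and LN004 have RE_PR_ID = RN002.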
Hi guys, I have two tables (MIGADM.CORPMISCELLANEOUSINFO and CRMUSER.PREFERENCES), and each has the fields PREFERENCE_ID and ORGKEY. I want to update the PREFERENCE_ID in MIGADM.CORPMISCELLANEOUSINFO with the PREFERENCE_ID from CRMUSER.PREFERENCES for each corresponding ORGKEY. So I wrote this query:
update migadm.CORPMISCELLANEOUSINFO s set s.PREFERENCE_ID = (
select e.PREFERENCE_ID from crmuser.preferences e where s.ORGKEY = e.ORGKEY)
But I get:
ORA-01427: single-row subquery returns more than one row
What should I do?
It means the columns you have selected are not unique enough to identify one row in your source table. Your first step would be to identify those columns.
To see the set of rows that have this problem, run this query.
select e.orgkey,
count(*)
from crmuser.preferences e
group by e.orgkey
having count(*) > 1
E.g. for an orgkey of 2, let's say there are two rows in the preferences table.
orgkey PREFERENCE_ID
2 202
2 201
Oracle is not sure which of these should be used to update the preference_id column in CORPMISCELLANEOUSINFO.
Identify the rows where the subquery returns more than one row (you could use the REJECT ERROR clause to do it, for instance), or use the condition 'where rownum = 1'.
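If any one of the duplicate values is acceptable, a sketch such as the following makes the subquery return a single row per ORGKEY. Picking MAX(PREFERENCE_ID) is an arbitrary tie-break assumed here for illustration, and the WHERE EXISTS guard is an addition so that rows without a match keep their current value instead of being set to NULL.

```sql
UPDATE migadm.CORPMISCELLANEOUSINFO s
SET s.PREFERENCE_ID = (
  SELECT MAX(e.PREFERENCE_ID)          -- arbitrary choice among duplicates
  FROM crmuser.preferences e
  WHERE e.ORGKEY = s.ORGKEY
)
WHERE EXISTS (
  SELECT 1
  FROM crmuser.preferences e
  WHERE e.ORGKEY = s.ORGKEY            -- only touch rows that have a match
);
```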