A weird request maybe but. My boss wants me to create an admin version of a page we have that displays data from an oracle query in a table.
The admin page, instead of displaying the data (query returns 1 row), needs to return the table name and column name
Ex: Instead of:
Name Initial
==================
Bob A
I want:
Name Initial
============================
Users.FirstName Users.MiddleInitial
I realize I can do this in code but would rather just modify the query to return the data I want so I can leave the report generation code mostly alone.
I don't want to do it in a stored procedure.
So when I spit out the data in the report using something like:
blah blah = MyDataRow("FirstName")
I can leave that as is but instead of it displaying "BOB" it would display "Users.FirstName"
And I want to do the query using select * if possible instead of listing all the columns
So for each of the columns I am querying in the * , I want to get (instead of the column value) the tablename.ColumnName or tablename|columnName
hope you are following- I am confusing myself...
pseudo:
select tablename + '.' + Columnname as WhateverTheColumnNameIs
from Table1
left join Table2 on whatever...
Join Table_Names on blah blah
Whew- after writing all this I think I will just do it on the code side.
But if you are up for it maybe a fun challenge
Oracle does not provide an authentic way(there is no pseudocolumn) to get the column name of a table as a result of a query against that table. But you might consider these two approaches:
Extract column name from an xmltype, formed by passing cursor expression(your query) in the xmltable() function:
-- your table
with t1(first_name, middle_name) as(
select 1,2 from dual
), -- your query
t2 as(
select * -- col1 as "t1.col1"
--, col2 as "t1.col2"
--, col3 as "t1.col3"
from hr.t1
)
select *
from ( select q.object_value.getrootelement() as col_name
, rownum as rn
from xmltable('//*'
passing xmltype(cursor(select * from t2 where rownum = 1))
) q
where q.object_value.getrootelement() not in ('ROWSET', 'ROW')
)
pivot(
max(col_name) for rn in (1 as "name", 2 as "initial")
)
Result:
name initial
--------------- ---------------
FIRST_NAME MIDDLE_NAME
Note: In order for column names to be prefixed with table name, you need to list them
explicitly in the select list of a query and supply an alias, manually.
PL/SQL approach. Starting from Oracle 11g you could use dbms_sql() package and describe_columns() procedure specifically to get the name of columns in the cursor(your select).
This might be what you are looking for, try selecting from system views USER_TAB_COLS or ALL_TAB_COLS.
Related
I have a program that has been working for years. Today, we upgraded from SAS 9.4M3 to 9.4M7.
proc setinit
Current version: 9.04.01M7P080520
Since then, I am not able to get the same results as before the upgrade.
Please note that I am querying on Oracle databases directly.
Trying to replicate the issue with a minimal, reproducible SAS table example, I found that the issue disappear when querying on a SAS table instead of on Oracle databases.
Let's say I have the following dataset:
data have;
infile datalines delimiter="|";
input name :$8. id $1. value :$8. t1 :$10.;
datalines;
Joe|A|TLO
Joe|B|IKSK
Joe|C|Yes
;
Using the temporary table:
proc sql;
create table want as
select name,
min(case when id = "A" then value else "" end) as A length 8
from have
group by name;
quit;
Results:
name A
Joe TLO
However, when running the very same query on the oracle database directly I get a missing value instead:
proc sql;
create table want as
select name,
min(case when id = "A" then value else "" end) as A length 8
from have_oracle
group by name;
quit;
name A
Joe
As per documentation, the min() function is behaving properly when used on the SAS table
The MIN function returns a missing value (.) only if all arguments are missing.
I believe this happens when Oracle don't understand the function that SAS is passing it - the min functions in SAS and Oracle are very different and the equivalent in SAS would be LEAST().
So my guess is that the upgrade messed up how is translates the SAS min function to Oracle, but it remains a guess. Does anyone ran into this type of behavior?
EDIT: #Richard's comment
options sastrace=',,,d' sastraceloc=saslog nostsuffix;
proc sql;
create table want as
select t1.name,
min(case when id = 'A' then value else "" end) as A length 8
from oracle_db.names t1 inner join oracle_db.ids t2 on (t1.tid = t2.tid)
group by t1.name;
ORACLE_26: Prepared: on connection 0
SELECT * FROM NAMES
ORACLE_27: Prepared: on connection 1
SELECT UI.INDEX_NAME, UIC.COLUMN_NAME FROM USER_INDEXES UI,USER_IND_COLUMNS UIC WHERE UI.TABLE_NAME='NAMES' AND
UIC.TABLE_NAME='NAMES' AND UI.INDEX_NAME=UIC.INDEX_NAME
ORACLE_28: Executed: on connection 1
SELECT statement ORACLE_27
ORACLE_29: Prepared: on connection 0
SELECT * FROM IDS
ORACLE_30: Prepared: on connection 1
SELECT UI.INDEX_NAME, UIC.COLUMN_NAME FROM USER_INDEXES UI,USER_IND_COLUMNS UIC WHERE UI.TABLE_NAME='IDS' AND
UIC.TABLE_NAME='IDS' AND UI.INDEX_NAME=UIC.INDEX_NAME
ORACLE_31: Executed: on connection 1
SELECT statement ORACLE_30
ORACLE_32: Prepared: on connection 0
select t1."NAME", MIN(case when t2."ID" = 'A' then t1."VALUE" else ' ' end) as A from
NAMES t1 inner join IDS t2 on t1."TID" = t2."TID" group by t1."NAME"
ORACLE_33: Executed: on connection 0
SELECT statement ORACLE_32
ACCESS ENGINE: SQL statement was passed to the DBMS for fetching data.
NOTE: Table WORK.SELECTED_ATTR created, with 1 row and 2 columns.
! quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.34 seconds
cpu time 0.09 seconds
Use the SASTRACE= system option to log SQL statements sent to the DBMS.
options SASTRACE=',,,d';
will provide the most detailed logging.
From the prepared statement you can see why you are getting a blank from the Oracle query.
select
t1."NAME"
, MIN ( case
when t2."ID" = 'A' then t1."VALUE"
else ' '
end
) as A
from
NAMES t1 inner join IDS t2 on t1."TID" = t2."TID"
group by
t1."NAME"
The SQL MIN () aggregate function will exclude null values from consideration.
In SAS SQL, a blank value is also interpreted as null.
In SAS your SQL query returns the min non-null value TLO
In Oracle transformed query, the SAS blank '' is transformed to ' ' a single blank character, which is not-null, and thus ' ' < 'TLO' and you get the blank result.
The actual MIN you want to force in Oracle is min(case when id = "A" then value else null end) which #Tom has shown is possible by omitting the else clause.
The only way to see the actual difference is to run the query with trace in the prior SAS version, or if lucky, see the explanation in the (ignored by many) "What's New" documents.
Why are you using ' ' or '' as the ELSE value? Perhaps Oracle is treating a string with blanks in it differently than a null string.
Why not use null in the ELSE clause?
or just leave off the ELSE clause and let it default to null?
libname mylib oracle .... ;
proc sql;
create table want as
select name
, min(case when id = "A" then value else null end) as A length 8
from mylib.have_oracle
group by name
;
quit;
Also try running the Oracle code yourself, instead of using implicit pass thru.
proc sql;
connect to oracle ..... ;
create table want as
select * from connection to oracle
(
select name,
min(case when id = "A" then value else null end) as A length 8
from have_oracle
group by name
)
;
quit;
When I try to reproduce this in Oracle I get the result you are looking for so I suspect it has something to do with SAS (which I'm not familiar with).
with t as (
select 'Joe' name, 'A' id, 'TLO' value from dual union all
select 'Joe' name, 'B' id, 'IKSK' value from dual union all
select 'Joe' name, 'C' id, 'Yes' value from dual
)
select name
, min(case when id = 'A' then value else '' end) as a
from t
group by name;
NAME A
---- ----
Joe TLO
Unrelated, if you are only interested in id = 'A' then a better query would be:
select name
, min(value) as a
from t
where id = 'A'
group by name;
Am trying to list top 3 records from atable based on some amount stored in a column FTE_TMUSD which is of varchar datatype
below is the query i tried
SELECT *FROM
(
SELECT * FROM FSE_TM_ENTRY
ORDER BY FTE_TMUSD desc
)
WHERE rownum <= 3
ORDER BY FTE_TMUSD DESC ;
o/p i got
972,9680,963 -->FTE_TMUSD values which are not displayed in desc
I am expecting an o/p which will display the top 3 records of values
That should work; inline view is ordered by FTE_TMUSD in descending order, and you're selecting values from it.
What looks suspicious are values you specified as the result. It appears that FTE_TMUSD's datatype is VARCHAR2 (ah, yes - it is, you said so). It means that values are sorted as strings, not numbers - and it seems that you expect numbers. So, apply TO_NUMBER to that column. Note that it'll fail if column contains anything but numbers (for example, if there's a value 972C).
Also, an alternative to your query might be use of analytic functions, such as row_number:
with temp as
(select f.*,
row_number() over (order by to_number(f.fte_tmusd) desc) rn
from fse_tm_entry f
)
select *
from temp
where rn <= 3;
query 1: I need to get the dept code from one table.
query 2: use that dept code to query another table which also has got another set of dept code. kind of one is to many, one dept referring to many depts.
NOTE: they don't have the same column name in two tables.
and the final result should be union of 1st query and 2nd query.
for eg: query 1 result : ECE
query 2 result : EEE, Mech, Comp. Sc.
I need the result to be ECE, EEE, Mech, Comp. Sc.
declare default_dept_Code varchar2(10);
begin
select dept_code into default_dept_Code from (select dept_code from
course_per WHERE student_no ='526765771');
dbms_output.put_line(default_dept_Code);
SELECT dept_code FROM course_per WHERE student_no ='526765771'
union all
select add_dept_code from addition_dept where dept_Code = default_dept_Code;
I'm unable to execute this query, it has got error. What are the other best ways I can handle it, I need to put this in a VIEW. I tried to create temp table and insert the select result into it, did not work. I'm a new bee to Oracle. I don't want to use cursor, if that is the only option I can go for it.
From what I understand you can write your query like this:
SELECT dept_code as code
FROM course_per
WHERE student_no ='526765771'
UNION ALL
SELECT add_dept_code as code
FROM addition_dept
WHERE dept_Code = (
SELECT dept_code
FROM course_per
WHERE student_no ='526765771');
I have a table with >1M rows of data and 20+ columns.
Within my table (tableX) I have identified duplicate records (~80k) in one particular column (troubleColumn).
If possible I would like to retain the original table name and remove the duplicate records from my problematic column otherwise I could create a new table (tableXfinal) with the same schema but without the duplicates.
I am not proficient in SQL or any other programming language so please excuse my ignorance.
delete from Accidents.CleanedFilledCombined
where Fixed_Accident_Index
in(select Fixed_Accident_Index from Accidents.CleanedFilledCombined
group by Fixed_Accident_Index
having count(Fixed_Accident_Index) >1);
You can remove duplicates by running a query that rewrites your table (you can use the same table as the destination, or you can create a new table, verify that it has what you want, and then copy it over the old table).
A query that should work is here:
SELECT *
FROM (
SELECT
*,
ROW_NUMBER()
OVER (PARTITION BY Fixed_Accident_Index)
row_number
FROM Accidents.CleanedFilledCombined
)
WHERE row_number = 1
UPDATE 2019: To de-duplicate rows on a single partition with a MERGE, see:
https://stackoverflow.com/a/57900778/132438
An alternative to Jordan's answer - this one scales better when having too many duplicates:
#standardSQL
SELECT event.* FROM (
SELECT ARRAY_AGG(
t ORDER BY t.created_at DESC LIMIT 1
)[OFFSET(0)] event
FROM `githubarchive.month.201706` t
# GROUP BY the id you are de-duplicating by
GROUP BY actor.id
)
Or a shorter version (takes any row, instead of the newest one):
SELECT k.*
FROM (
SELECT ARRAY_AGG(x LIMIT 1)[OFFSET(0)] k
FROM `fh-bigquery.reddit_comments.2017_01` x
GROUP BY id
)
To de-duplicate rows on an existing table:
CREATE OR REPLACE TABLE `deleting.deduplicating_table`
AS
# SELECT id FROM UNNEST([1,1,1,2,2]) id
SELECT k.*
FROM (
SELECT ARRAY_AGG(row LIMIT 1)[OFFSET(0)] k
FROM `deleting.deduplicating_table` row
GROUP BY id
)
Not sure why nobody mentioned DISTINCT query.
Here is the way to clean duplicate rows:
CREATE OR REPLACE TABLE project.dataset.table
AS
SELECT DISTINCT * FROM project.dataset.table
If your schema doesn’t have any records - below variation of Jordan’s answer will work well enough with writing over same table or new one, etc.
SELECT <list of original fields>
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY Fixed_Accident_Index) AS pos,
FROM Accidents.CleanedFilledCombined
)
WHERE pos = 1
In more generic case - with complex schema with records/netsed fields, etc. - above approach can be a challenge.
I would propose to try using Tabledata: insertAll API with rows[].insertId set to respective Fixed_Accident_Index for each row.
In this case duplicate rows will be eliminated by BigQuery
Of course, this will involve some client side coding - so might be not relevant for this particular question.
I havent tried this approach by myself either but feel it might be interesting to try :o)
If you have a large-size partitioned table, and only have duplicates in a certain partition range. You don't want to overscan nor process the whole table. use the MERGE SQL below with predicates on partition range:
-- WARNING: back up the table before this operation
-- FOR large size timestamp partitioned table
-- -------------------------------------------
-- -- To de-duplicate rows of a given range of a partition table, using surrage_key as unique id
-- -------------------------------------------
DECLARE dt_start DEFAULT TIMESTAMP("2019-09-17T00:00:00", "America/Los_Angeles") ;
DECLARE dt_end DEFAULT TIMESTAMP("2019-09-22T00:00:00", "America/Los_Angeles");
MERGE INTO `gcp_project`.`data_set`.`the_table` AS INTERNAL_DEST
USING (
SELECT k.*
FROM (
SELECT ARRAY_AGG(original_data LIMIT 1)[OFFSET(0)] k
FROM `gcp_project`.`data_set`.`the_table` AS original_data
WHERE stamp BETWEEN dt_start AND dt_end
GROUP BY surrogate_key
)
) AS INTERNAL_SOURCE
ON FALSE
WHEN NOT MATCHED BY SOURCE
AND INTERNAL_DEST.stamp BETWEEN dt_start AND dt_end -- remove all data in partiion range
THEN DELETE
WHEN NOT MATCHED THEN INSERT ROW
credit: https://gist.github.com/hui-zheng/f7e972bcbe9cde0c6cb6318f7270b67a
Easier answer, without a subselect
SELECT
*,
ROW_NUMBER()
OVER (PARTITION BY Fixed_Accident_Index)
row_number
FROM Accidents.CleanedFilledCombined
WHERE TRUE
QUALIFY row_number = 1
The Where True is neccesary because qualify needs a where, group by or having clause
Felipe's answer is the best approach for most cases. Here is a more elegant way to accomplish the same:
CREATE OR REPLACE TABLE Accidents.CleanedFilledCombined
AS
SELECT
Fixed_Accident_Index,
ARRAY_AGG(x LIMIT 1)[SAFE_OFFSET(0)].* EXCEPT(Fixed_Accident_Index)
FROM Accidents.CleanedFilledCombined AS x
GROUP BY Fixed_Accident_Index;
To be safe, make sure you backup the original table before you run this ^^
I don't recommend to use ROW NUMBER() OVER() approach if possible since you may run into BigQuery memory limits and get unexpected errors.
Update BigQuery schema with new table column as bq_uuid making it NULLABLE and type STRING
Create duplicate rows by running same command 5 times for example
insert into beginner-290513.917834811114.messages (id, type, flow, updated_at) Values(19999,"hello", "inbound", '2021-06-08T12:09:03.693646')
Check if duplicate entries exist
select * from beginner-290513.917834811114.messages where id = 19999
Use generate uuid function to generate uuid corresponding to each message
UPDATE beginner-290513.917834811114.messages
SET bq_uuid = GENERATE_UUID()
where id>0
Clean duplicate entries
DELETE FROM beginner-290513.917834811114.messages
WHERE bq_uuid IN
(SELECT bq_uuid
FROM
(SELECT bq_uuid,
ROW_NUMBER() OVER( PARTITION BY updated_at
ORDER BY bq_uuid ) AS row_num
FROM beginner-290513.917834811114.messages ) t
WHERE t.row_num > 1 );
I am recieving information from a csv file from one department to compare with the same inforation in a different department to check for discrepencies (About 3/4 of a million rows of data with 44 columns in each row). After I have the data in a table, I have a program that will take the data and send reports based on a HQ. I feel like the way I am going about this is not the most efficient. I am using oracle for this comparison.
Here is what I have:
I have a vb.net program that parses the data and inserts it into an extract table
I run a procedure to do a full outer join on the two tables into a new table with the fields in one department prefixed with '_c'
I run another procedure to compare the old/new data and update 2 different tables with detail and summary information. Here is code from inside the procedure:
DECLARE
CURSOR Cur_Comp IS SELECT * FROM T.AEC_CIS_COMP;
BEGIN
FOR compRow in Cur_Comp LOOP
--If service pipe exists in CIS but not in FM and the service pipe has status of retired in CIS, ignore the variance
If(compRow.pipe_num = '' AND cis_status_c = 'R')
continue
END IF
--If there is not a summary record for this HQ in the table for this run, create one
INSERT INTO t.AEC_CIS_SUM (HQ, RUN_DATE)
SELECT compRow.HQ, to_date(sysdate, 'DD/MM/YYYY') from dual WHERE NOT EXISTS
(SELECT null FROM t.AEC_CIS_SUM WHERE HQ = compRow.HQ AND RUN_DATE = to_date(sysdate, 'DD/MM/YYYY'))
-- Check fields and update the tables accordingly
If (compRow.cis_loop <> compRow.cis_loop_c) Then
--Insert information into the details table
INSERT INTO T.AEC_CIS_DET( Fac_id, Pipe_Num, Hq, Address, AutoUpdatedFl,
DateTime, Changed_Field, CIS_Value, FM_Value)
VALUES(compRow.Fac_ID, compRow.Pipe_Num, compRow.Hq, compRow.Street_Num || ' ' || compRow.Street_Name,
'Y', sysdate, 'Cis_Loop', compRow.cis_loop, compRow.cis_loop_c);
-- Update information into the summary table
UPDATE AEC_CIS_SUM
SET cis_loop = cis_loop + 1
WHERE Hq = compRow.Hq
AND Run_Date = to_date(sysdate, 'DD/MM/YYYY')
End If;
END LOOP;
END;
Any suggestions of an easier way of doing this rather than an if statement for all 44 columns of the table? (This is run once a week if it matters)
Update: Just to clarify, there are 88 columns of data (44 of duplicates to compare with one suffixed with _c). One table lists each field in a row that is different so one row can mean 30+ records written in that table. The other table keeps tally of the number of discrepencies for each week.
First of all I believe that your task can be implemented (and should be actually) with staight SQL. No fancy cursors, no loops, just selects, inserts and updates. I would start with unpivotting your source data (it is not clear if you have primary key to join two sets, I guess you do):
Col0_PK Col1 Col2 Col3 Col4
----------------------------------------
Row1_val A B C D
Row2_val E F G H
Above is your source data. Using UNPIVOT clause we convert it to:
Col0_PK Col_Name Col_Value
------------------------------
Row1_val Col1 A
Row1_val Col2 B
Row1_val Col3 C
Row1_val Col4 D
Row2_val Col1 E
Row2_val Col2 F
Row2_val Col3 G
Row2_val Col4 H
I think you get the idea. Say we have table1 with one set of data and the same structured table2 with the second set of data. It is good idea to use index-organized tables.
Next step is comparing rows to each other and storing difference details. Something like:
insert into diff_details(some_service_info_columns_here)
select some_service_info_columns_here_along_with_data_difference
from table1 t1 inner join table2 t2
on t1.Col0_PK = t2.Col0_PK
and t1.Col_name = t2.Col_name
and nvl(t1.Col_value, 'Dummy1') <> nvl(t2.Col_value, 'Dummy2');
And on the last step we update difference summary table:
insert into diff_summary(summary_columns_here)
select diff_row_id, count(*) as diff_count
from diff_details
group by diff_row_id;
It's just rough draft to show my approach, I'm sure there is much more details should be taken into account. To summarize I suggest two things:
UNPIVOT data
Use SQL statements instead of cursors
You have several issues in your code:
If(compRow.pipe_num = '' AND cis_status_c = 'R')
continue
END IF
"cis_status_c" is not declared. Is it a variable or a column in AEC_CIS_COMP?
In case it is a column, just put the condition into the cursor, i.e. SELECT * FROM T.AEC_CIS_COMP WHERE not (compRow.pipe_num = '' AND cis_status_c = 'R')
to_date(sysdate, 'DD/MM/YYYY')
That's nonsense, you convert a date into a date, simply use TRUNC(SYSDATE)
Anyway, I think you can use three single statements instead of a cursor:
INSERT INTO t.AEC_CIS_SUM (HQ, RUN_DATE)
SELECT comp.HQ, trunc(sysdate)
from AEC_CIS_COMP comp
WHERE NOT EXISTS
(SELECT null FROM t.AEC_CIS_SUM WHERE HQ = comp.HQ AND RUN_DATE = trunc(sysdate));
INSERT INTO T.AEC_CIS_DET( Fac_id, Pipe_Num, Hq, Address, AutoUpdatedFl, DateTime, Changed_Field, CIS_Value, FM_Value)
select comp.Fac_ID, comp.Pipe_Num, comp.Hq, comp.Street_Num || ' ' || comp.Street_Name, 'Y', sysdate, 'Cis_Loop', comp.cis_loop, comp.cis_loop_c
from T.AEC_CIS_COMP comp
where comp.cis_loop <> comp.cis_loop_c;
UPDATE AEC_CIS_SUM
SET cis_loop = cis_loop + 1
WHERE Hq IN (Select Hq from T.AEC_CIS_COMP)
AND trunc(Run_Date) = trunc(sysdate);
They are not tested but they should give you a hint how to do it.