Smart Oracle tool to find missing field relationships - oracle

Does Oracle have a tool I can use to analyze a database and help determine possible missing field relationships? We have a legacy database with 150+ tables and many relationships are missing. We could go through it by hand but an automated tool might be useful. So find things like missing foreign keys and whatnot.

I've had to do this a few times now. I find it's a very human-intelligence kind of thing - helped by running a lot of queries across both the data dictionary (e.g. EvilTeach's query), querying sample data from the columns, examining how the data is created by the application, and understanding the business requirements and user processes.
For example, in many legacy applications I find constraints (including referential integrity constraints) that are checked and implemented in the front-end application, which means the data follows the constraint (almost 100% :) ) but it's not actually constrained at the database level. Lots of fun results.
I'd be surprised if a tool could do any of this automatically and yield useful results.
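One query that helps with the sample-data step is an orphan check: for a suspected parent/child pair, count the child values that have no matching parent. A count of zero (or close to it) suggests the application has been enforcing the relationship. The sketch below is only an illustration, run against an in-memory SQLite database through Python's sqlite3 module so that it is self-contained; the customers and orders tables and the count_orphans helper are hypothetical, and the NOT EXISTS query itself should carry over to Oracle largely unchanged.

```python
import sqlite3


def count_orphans(conn, child_table, child_col, parent_table, parent_col):
    """Count non-null child values that have no matching parent row."""
    # Identifiers are interpolated into the SQL text, so pass trusted names only.
    sql = (
        f"SELECT COUNT(*) FROM {child_table} c "
        f"WHERE c.{child_col} IS NOT NULL "
        f"AND NOT EXISTS (SELECT 1 FROM {parent_table} p "
        f"WHERE p.{parent_col} = c.{child_col})"
    )
    return conn.execute(sql).fetchone()[0]


# Hypothetical sample data: order 12 points at a customer that does not exist,
# order 13 has no customer at all (NULL is not counted as an orphan).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1), (2);
    INSERT INTO orders VALUES (10, 1), (11, 2), (12, 99), (13, NULL);
""")

orphans = count_orphans(conn, "orders", "customer_id", "customers", "customer_id")
print(orphans)  # 1
```

A non-zero count does not rule the relationship out; it shows how much cleanup would be needed before a real foreign key could be added.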

This might be a good start:
select column_name, table_name, data_type
from user_tab_cols
order by column_name, table_name

You could find a POSSIBLE missing foreign key if you assume that a POSSIBLE foreign key relationship can be identified by finding columns with the same name and data type in different tables, where one of them is a primary key and the other has no reference to that key.
You could use a query like this:
select c1.TABLE_NAME, c1.COLUMN_NAME, c2.TABLE_NAME, c2.COLUMN_NAME
from user_tab_columns c1,
user_tables at1,
user_tab_columns c2,
user_tables at2
where c1.COLUMN_NAME = c2.COLUMN_NAME
and c1.DATA_TYPE = c2.DATA_TYPE
and c1.TABLE_NAME = at1.TABLE_NAME
and c2.TABLE_NAME = at2.TABLE_NAME
and c1.TABLE_NAME != c2.TABLE_NAME
/*and c1.TABLE_NAME = 'TABLE' --check this for one table
and c1.COLUMN_NAME = 'TABLE_PK'*/
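-- c1 must be (part of) a primary key
and exists (select 1
from user_constraints pk,
user_cons_columns pkc
where pkc.CONSTRAINT_NAME = pk.CONSTRAINT_NAME
and pk.TABLE_NAME = c1.TABLE_NAME
and pkc.COLUMN_NAME = c1.COLUMN_NAME
and pk.CONSTRAINT_TYPE = 'P')
-- and c2 must not already have a foreign key referencing it
and not exists (select 1
from user_cons_columns ucc,
user_constraints uc,
user_constraints uc2,
user_cons_columns ucc2
where ucc.CONSTRAINT_NAME = uc.CONSTRAINT_NAME
and uc.TABLE_NAME = ucc.TABLE_NAME
and ucc.table_name = c1.TABLE_NAME
and ucc.column_name = c1.COLUMN_NAME
and uc.CONSTRAINT_TYPE = 'P'
and uc2.table_name = c2.TABLE_NAME
and ucc2.column_name = c2.COLUMN_NAME
and uc2.table_name = ucc2.table_name
and uc2.constraint_name = ucc2.constraint_name
and uc2.r_constraint_name = uc.constraint_name
and uc2.constraint_type = 'R')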
This one (only a sketch, not optimized in any way) scans all pairs of columns with the same name and data type, and reports those where one is a PK and the other doesn't reference it.
But, and here I agree with Jeffrey, it's a very human-intelligence kind of thing, and no tool will do this reliably. In any case you'll have to go through the results by hand.
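The matching rule behind that query can also be written out procedurally, which makes it easier to tweak (for example, to tolerate naming prefixes). This is a rough sketch in Python under the same assumption as above, that equal name and data type hint at a relationship; the function name and the sample metadata are made up, and in practice the three inputs would come from user_tab_columns, user_constraints and user_cons_columns.

```python
def candidate_foreign_keys(columns, pk_columns, existing_fks):
    """Suggest (child_table, column, parent_table) triples that may lack an FK.

    columns:      iterable of (table, column, data_type)
    pk_columns:   set of (table, column) pairs that are primary key columns
    existing_fks: set of (child_table, column, parent_table) already declared
    """
    # Group tables by (column name, data type), as the SQL join does.
    by_name_type = {}
    for table, column, data_type in columns:
        by_name_type.setdefault((column, data_type), []).append(table)

    candidates = []
    for (column, _data_type), tables in by_name_type.items():
        for parent in tables:
            if (parent, column) not in pk_columns:
                continue  # only primary key columns can act as the parent side
            for child in tables:
                if child != parent and (child, column, parent) not in existing_fks:
                    candidates.append((child, column, parent))
    return sorted(candidates)


# Hypothetical metadata for four tables.
columns = [
    ("DEPT", "DEPT_ID", "NUMBER"),
    ("EMP", "DEPT_ID", "NUMBER"),
    ("EMP", "EMP_ID", "NUMBER"),
    ("PROJECT", "DEPT_ID", "NUMBER"),
    ("AUDIT_LOG", "DEPT_ID", "VARCHAR2"),  # same name, different type: ignored
]
pk_columns = {("DEPT", "DEPT_ID"), ("EMP", "EMP_ID")}
existing_fks = {("EMP", "DEPT_ID", "DEPT")}

missing = candidate_foreign_keys(columns, pk_columns, existing_fks)
print(missing)  # [('PROJECT', 'DEPT_ID', 'DEPT')]
```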

Nested select in hiveQL

In one of my use cases, I have two tables, flow and conf. The flow table contains a list of all flight data; it has the columns creationdate, datafilename, aircraftid. The conf table contains configuration information; it has the columns configdate, aircraftid, configurationame. Multiple versions of a configuration can be created for one aircraft type. So, when we process a datafilename, we need to identify the aircraftid from the flow table and pick up the configuration from the conf table that was created just before the datafilename was created. So I tried this:
FROM (
SELECT
F_FILE_CREATION_DATE,
F_FILE_ARCHIVED_RELATIVE_PATH,
F_FILE_ARCHIVED_NAME,
K_AIRCRAFT
from T_FLOW f )x left join
(
select c.config_date, c.aircraft_id, c.configuration from t_conf c
) y on y.aircraft_id = x.K_AIRCRAFT
select
x.F_FILE_CREATION_DATE,
x.F_FILE_ARCHIVED_RELATIVE_PATH,
x.F_FILE_ARCHIVED_NAME,
x.K_AIRCRAFT,
y.config_date,
y.aircraft_id,
y.configuration;
This picks up all the configurations created for the aircraft, which is expected since there is no condition checking conf.config_date < flow.f_file_creation_date. I tried to include this condition like this:
FROM (
SELECT
F_FILE_CREATION_DATE,
F_FILE_ARCHIVED_RELATIVE_PATH,
F_FILE_ARCHIVED_NAME,
K_AIRCRAFT
from T_FLOW f )x join
(
select c.config_date, c.aircraft_id, c.FILEFILTER from t_conf c
) y on y.aircraft_id = x.K_AIRCRAFT where y.config_date < x.f_file_creation_date
select
x.F_FILE_CREATION_DATE,
x.F_FILE_ARCHIVED_RELATIVE_PATH,
x.F_FILE_ARCHIVED_NAME,
x.K_AIRCRAFT,
y.config_date,
y.aircraft_id,
y.filefilter;
This time it failed with the error:
required (...)+ loop did not match anything at input 'where' in statement
Can someone give me a hint or two about where I am going wrong and how to fix this?
select f.f_file_creation_date
,f.f_file_archived_relative_path
,f.f_file_archived_name
,f.k_aircraft
,c.config_date
,c.aircraft_id
,c.filefilter
from t_flow as f
join (select config_date
,aircraft_id
,filefilter
,lead (config_date,1,date '3000-01-01') over
(
partition by aircraft_id
order by config_date
) as next_config_date
from t_conf
) c
on c.aircraft_id =
f.k_aircraft
where f.f_file_creation_date >= c.config_date
and f.f_file_creation_date < c.next_config_date
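The lead() window turns each configuration row into a validity interval, so the join picks the latest configuration dated on or before the file. The same "latest version as of a date" lookup is sketched below in Python with a binary search, purely to illustrate the logic; config_as_of and the sample data are hypothetical. Unlike the inner join above, it returns None when no configuration exists yet instead of dropping the row.

```python
from bisect import bisect_right
from datetime import date


def config_as_of(configs, aircraft_id, file_date):
    """Return the latest configuration whose date is <= file_date, or None.

    configs maps aircraft_id -> list of (config_date, name), sorted by date.
    """
    versions = configs.get(aircraft_id, [])
    dates = [config_date for config_date, _name in versions]
    # Index of the first configuration dated strictly after file_date.
    i = bisect_right(dates, file_date)
    return versions[i - 1][1] if i > 0 else None


# Hypothetical configuration history for one aircraft.
configs = {
    "A320": [(date(2017, 1, 1), "v1"), (date(2017, 3, 1), "v2")],
}

print(config_as_of(configs, "A320", date(2017, 2, 15)))   # v1
print(config_as_of(configs, "A320", date(2017, 3, 1)))    # v2
print(config_as_of(configs, "A320", date(2016, 12, 31)))  # None
```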
Please read carefully
Posting a question
When you post a data-related question:
Supply a data sample: source data + required results.
It will be clearer than any explanation you give.
It will also supply a common background for further discussion and a way for you and others to verify the correctness of the given solutions.
Supply the size properties (records/volume) of the tables.
This is important for performance considerations and might impact the given solution.
SQL
Hive currently does not support any JOIN condition type other than equijoin (e.g. t1.X = t2.X and t1.Y = t2.Y). This is why you get an error.
If you are doing an inner join (and not outer join) then you can move the non-equijoin conditions to the WHERE clause.
Stick to ISO SQL standard. There is a conventional order for SQL clauses: SELECT-FROM-WHERE...
You gain nothing from esoteric syntax except for esoteric error messages.
There is no reason whatsoever to use sub-queries in order to narrow the column list.
Just to make it perfectly clear: there isn't any performance gain in doing that. More than that, had it worked as you assume (and it does not), the performance would have been worse, not better.
I can't reproduce your error; your query seems to be valid.
Which version of Hive do you use? I tested this query with Hive 2.1.1.
DROP TABLE IF EXISTS t_flow;
CREATE TABLE IF NOT EXISTS t_flow (
f_file_creation_date DATE
, f_file_archived_relative_path STRING
, f_file_archived_name STRING
, k_aircraft STRING
);
-- Conf table contains configuration information.
-- It has columns configdate, aircraftid, configurationame
DROP TABLE IF EXISTS t_conf;
CREATE TABLE IF NOT EXISTS t_conf (
config_date DATE
, aircraft_id STRING
, filefilter STRING
);
SELECT
x.f_file_creation_date,
x.f_file_archived_relative_path,
x.f_file_archived_name,
x.k_aircraft,
y.config_date,
y.aircraft_id,
y.filefilter
FROM
(SELECT
f_file_creation_date,
f_file_archived_relative_path,
f_file_archived_name,
k_aircraft
FROM t_flow f) x
JOIN
(SELECT
c.config_date,
c.aircraft_id,
c.filefilter
FROM t_conf c) y
ON y.aircraft_id = x.k_aircraft
WHERE y.config_date < x.f_file_creation_date;

Oracle dba_tab_cols query

Hi, is it possible to retrieve the primary key and unique key using a dba_tab_cols query?
Is there any query that allows me to retrieve all of the following fields?
Column Name
Data Type
Primary Key
Null/Not Null
Unique Key
Default Value
Extra
Both primary and unique keys can span more than one column, so they wouldn't belong in dba_tab_columns. You'd need to look at dba_constraints and dba_cons_columns to get that information.
This is a starting point, maybe:
select owner, table_name, column_name, data_type, primary_key,
nullable, unique_key, data_default
from (
select dtc.owner, dtc.table_name, dtc.column_id, dtc.column_name,
dtc.data_type, dtc.nullable, dtc.data_default,
case when dc.constraint_type = 'P' and dcc.column_name = dtc.column_name
then dc.constraint_name end as primary_key,
case when dc.constraint_type = 'U' and dcc.column_name = dtc.column_name
then dc.constraint_name end as unique_key,
-- prefer the row where this column actually belongs to the constraint
row_number() over (partition by dtc.owner, dtc.table_name, dtc.column_id
order by dcc.column_name nulls last) as rn
from dba_tab_columns dtc
left join dba_constraints dc
on dc.owner = dtc.owner
and dc.table_name = dtc.table_name
and dc.constraint_type in ('P', 'U')
left join dba_cons_columns dcc
on dcc.owner = dc.owner
and dcc.constraint_name = dc.constraint_name
and dcc.table_name = dc.table_name
and dcc.column_name = dtc.column_name
where dtc.owner = '<owner>'
and dtc.table_name = '<table_name>'
)
where rn = 1
order by owner, table_name, column_id;
I've done this with a subquery that generates a row_number value because you'd get duplicates for a table with more than one constraint; and because you want the default value, which is a long (column data_default), you can't use distinct or group by. It feels a bit inelegant, but I'm sure you can work on it to get what you need.
It's also possible to have a check constraint that duplicates the not null constraint, though it isn't advisable. And a unique index won't show up as a unique constraint, so you might want to look for those too, via dba_indexes and dba_ind_columns. An index used to back a unique constraint will appear in both, though.
You could look at dbms_metadata.get_ddl to get this information too, depending on what you intend to do with it. I'm not sure why this would be useful, other than to try to recreate the schema elsewhere, and there are better tools for doing that.
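Because a key can span several columns, it is often easier to report keys per constraint rather than per column. A small sketch of that grouping in Python, assuming rows shaped like a join of dba_constraints and dba_cons_columns (constraint name, type, column name, position); the key_columns helper and the sample rows are hypothetical.

```python
from collections import defaultdict


def key_columns(rows):
    """Group constraint columns by constraint, ordered by their position.

    rows: iterable of (constraint_name, constraint_type, column_name, position)
    Returns {(constraint_name, constraint_type): [column, ...]}.
    """
    grouped = defaultdict(list)
    for name, ctype, column, position in rows:
        grouped[(name, ctype)].append((position, column))
    # Sorting the (position, column) pairs restores the declared column order.
    return {key: [col for _pos, col in sorted(cols)] for key, cols in grouped.items()}


# Hypothetical rows for one table with a two-column primary key
# and a single-column unique key, deliberately out of order.
rows = [
    ("ORDER_ITEMS_PK", "P", "LINE_NO", 2),
    ("ORDER_ITEMS_UK", "U", "SKU", 1),
    ("ORDER_ITEMS_PK", "P", "ORDER_ID", 1),
]

keys = key_columns(rows)
print(keys[("ORDER_ITEMS_PK", "P")])  # ['ORDER_ID', 'LINE_NO']
print(keys[("ORDER_ITEMS_UK", "U")])  # ['SKU']
```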

Finding sequences and triggers associated with an Oracle table

I have used this query to fetch the list of sequences belonging to an Oracle database user:
SELECT * FROM all_sequences x,all_tables B
WHERE x.sequence_owner=B.owner AND B.TABLE_NAME='my_table';
But that database user has many more sequences as well, so the query returns all of the user's sequences. Can anybody help me find the particular sequence for my_table with a query, so that I can get the auto-increment id in my application?
I want a query that fetches the list of tables of my database user along with the sequences and triggers used by each table.
You can get the triggers associated with your tables from the user_triggers view. You can then look for any dependencies recorded for those triggers in user_dependencies, which may include objects other than sequences (packages etc.), so joining those dependencies to the user_sequences view will only show you the ones you are interested in.
Something like this, assuming you are looking at your own schema, and you're only interested in triggers that reference sequences (which aren't necessarily doing 'auto increment', but are likely to be):
select tabs.table_name,
trigs.trigger_name,
seqs.sequence_name
from user_tables tabs
join user_triggers trigs
on trigs.table_name = tabs.table_name
join user_dependencies deps
on deps.name = trigs.trigger_name
join user_sequences seqs
on seqs.sequence_name = deps.referenced_name;
If you're actually looking at a different schema then you'll need to use all_tables etc. and filter and join on the owner column for the user you're looking for. And if you want to include tables which don't have triggers, or triggers which don't refer to sequences, you can use outer joins.
Here is a version for looking at a different schema, though this assumes you have the privileges necessary to access the data dictionary information, i.e. that the tables etc. are visible to you, which they may not be:
select tabs.table_name,
trigs.trigger_name,
seqs.sequence_name
from all_tables tabs
join all_triggers trigs
on trigs.table_owner = tabs.owner
and trigs.table_name = tabs.table_name
join all_dependencies deps
on deps.owner = trigs.owner
and deps.name = trigs.trigger_name
join all_sequences seqs
on seqs.sequence_owner = deps.referenced_owner
and seqs.sequence_name = deps.referenced_name
where tabs.owner = '<owner>';
If that can't see them then you might need to look at the DBA views, again if you have sufficient privs:
select tabs.table_name,
trigs.trigger_name,
seqs.sequence_name
from dba_tables tabs
join dba_triggers trigs
on trigs.table_owner = tabs.owner
and trigs.table_name = tabs.table_name
join dba_dependencies deps
on deps.owner = trigs.owner
and deps.name = trigs.trigger_name
join dba_sequences seqs
on seqs.sequence_owner = deps.referenced_owner
and seqs.sequence_name = deps.referenced_name
where tabs.owner = '<owner>';
One way would be to run these queries to check whether any of the sequence pseudocolumns (NEXTVAL and CURRVAL) are used in your functions, procedures, packages, triggers or other PL/SQL or Java source.
select * from user_source where
UPPER(TEXT) LIKE '%NEXTVAL%';
select * from all_source where
UPPER(TEXT) LIKE '%NEXTVAL%';
Then go to the specific Procedure, Function or Trigger to check which column/table gets populated by a sequence.
The query could also be used with '%CURRVAL%'
This might not help if you are running inserts from JDBC or other external applications using a sequence.
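The LIKE filter only tells you which source lines mention NEXTVAL; pulling out the sequence name is still manual. A possible follow-up step, sketched in Python on the assumption that the TEXT lines have already been fetched: a regular expression that extracts the (optionally schema-qualified) name in front of .NEXTVAL or .CURRVAL. The pattern and the sample lines are illustrative only; quoted identifiers are not handled, and matches inside comments or string literals will be reported too.

```python
import re

# An identifier, optionally prefixed by a schema name, followed by
# .NEXTVAL or .CURRVAL (case-insensitive).
SEQ_REF = re.compile(
    r"([A-Za-z][\w$#]*(?:\.[A-Za-z][\w$#]*)?)\.(NEXTVAL|CURRVAL)\b",
    re.IGNORECASE,
)


def sequences_referenced(source_lines):
    """Return the sorted, upper-cased sequence names used in the given lines."""
    found = set()
    for line in source_lines:
        for name, _pseudocolumn in SEQ_REF.findall(line):
            found.add(name.upper())
    return sorted(found)


# Hypothetical lines as they might come back from user_source.TEXT.
sample_source = [
    "  select emp_seq.nextval into :new.id from dual;",
    "  :new.dept_id := hr.dept_seq.NEXTVAL;",
    "  -- nothing of interest on this line",
    "  l_last := emp_seq.currval;",
]

names = sequences_referenced(sample_source)
print(names)  # ['EMP_SEQ', 'HR.DEPT_SEQ']
```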
Oracle 12c introduced IDENTITY columns, which let you create a table with a column whose value is generated by default.
CREATE TABLE t1 (c1 NUMBER GENERATED BY DEFAULT ON NULL AS IDENTITY,
c2 VARCHAR2(10));
This will internally create a sequence that auto-generates the value for the table's column. So, if you would like to know which sequence generates the value for which table, you may query all_tab_columns:
SELECT data_default AS sequence_val
,table_name
,column_name
FROM all_tab_columns
WHERE OWNER = 'HR'
AND identity_column = 'YES';
SEQUENCE_VAL                 |TABLE_NAME |COLUMN_NAME
-----------------------------|-----------|-----------
"HR"."ISEQ$$_78160".nextval  |T1         |C1
I found a way to guess the sequence used by a particular table:
select * from SYS.ALL_SEQUENCES where SEQUENCE_OWNER='OWNER_NAME' and LAST_NUMBER between (select max(FIELD_NAME) from TABLE_NAME) and (select max(FIELD_NAME)+40 from TABLE_NAME);
This query guesses by searching for sequences whose LAST_NUMBER lies between the MAX value of the field that uses the sequence and that MAX value + 40 (in my case the cache value is 20, so I used 40).
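The same heuristic expressed as a small Python function, assuming the sequence names and LAST_NUMBER values have already been read from ALL_SEQUENCES; guess_sequences and the sample numbers are hypothetical. As the example shows, more than one sequence can fall inside the window, so the result is a shortlist rather than an answer.

```python
def guess_sequences(max_id, sequences, cache_size=20):
    """Return names of sequences whose LAST_NUMBER is just ahead of max_id.

    sequences: iterable of (sequence_name, last_number)
    The window is [max_id, max_id + 2 * cache_size], inclusive like BETWEEN.
    """
    upper = max_id + 2 * cache_size
    return [name for name, last_number in sequences if max_id <= last_number <= upper]


# Hypothetical values: the table's current MAX(id) is 1000.
sequences = [
    ("ORDERS_SEQ", 1021),
    ("CUSTOMERS_SEQ", 540),
    ("INVOICES_SEQ", 1035),
    ("ITEMS_SEQ", 1041),  # just outside the window of 1000..1040
]

shortlist = guess_sequences(1000, sequences)
print(shortlist)  # ['ORDERS_SEQ', 'INVOICES_SEQ']
```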
select SEQUENCE_NAME from sys.ALL_TAB_IDENTITY_COLS where owner = 'SCHEMA_NAME' and table_name = 'TABLE_NAME';

Find if a column in Oracle has a sequence

I am attempting to figure out if a column in Oracle is populated from a sequence. My impression of how Oracle handles sequencing is that the sequence and column are separate entities and one needs to either manually insert the next sequence value like:
insert into tbl1 values(someseq.nextval, 'test')
or put it into a table trigger. Meaning that it is non-trivial to tell if a column is populated from a sequence. Is that correct? Any ideas about how I might go about figuring out if a column is populated from a sequence?
You are correct; the sequence is separate from the table, and a single sequence can be used to populate any table, and the values in a column in some table may mostly come from a sequence (or set of sequences), except for the values manually generated.
In other words, there is no mandatory connection between a column and a sequence - and therefore no way to discover such a relationship from the schema.
Ultimately, the analysis will be of the source code of all applications that insert or update data in the table. Nothing else is guaranteed. You can reduce the scope of the search if there is a stored procedure that is the only way to make modifications to the table, or if there is a trigger that sets the value, or other such things. But the general solution is the 'non-solution' of 'analyze the source'.
If the sequence is used in a trigger, it is possible to find which tables it populates:
SQL> select t.table_name, d.referenced_name as sequence_name
2 from user_triggers t
3 join user_dependencies d
4 on d.name = t.trigger_name
5 where d.referenced_type = 'SEQUENCE'
6 and d.type = 'TRIGGER'
7 /
TABLE_NAME SEQUENCE_NAME
------------------------------ ------------------------------
EMP EMPNO_SEQ
SQL>
You can vary this query to find stored procedures, etc that make use of the sequence.
There are no direct metadata links between Oracle sequences and any use in the database. You could make an intelligent guess if a column's values are related to a sequence by querying the USER_SEQUENCES metadata and comparing the LAST_NUMBER column to the data for the column.
select t.table_name,
d.referenced_name as sequence_name,
d.REFERENCED_OWNER as "OWNER",
c.COLUMN_NAME
from user_trigger_cols t, user_dependencies d, user_tab_cols c
where d.name = t.trigger_name
and t.TABLE_NAME = c.TABLE_NAME
and t.COLUMN_NAME = c.COLUMN_NAME
and d.referenced_type = 'SEQUENCE'
and d.type = 'TRIGGER'
As Jonathan pointed out: there is no direct way to relate both objects. However, if you "keep a standard" for primary keys and sequences/triggers you could find out by finding the primary key and then associate the constraint to the table sequence.
I was in need of something similar since we are building a multi-db product and I tried to replicate some classes with properties found in a DataTable object from .Net which has AutoIncrement, IncrementSeed and IncrementStep which can only be found in the sequences.
So, as I said, if your tables use a PK and always have a sequence associated with an insert trigger on the table, then this may come in handy:
select tc.table_name,
case tc.nullable
when 'Y' then 1
else 0
end as is_nullable,
case ac.constraint_type
when 'P' then 1
else 0
end as is_identity,
ac.constraint_type,
seq.increment_by as auto_increment_step,
seq.min_value as auto_increment_seed,
com.comments as caption,
tc.column_name,
tc.data_type,
tc.data_default as default_value,
tc.data_length as max_length,
tc.column_id,
tc.data_precision as precision,
tc.data_scale as scale
from SYS.all_tab_columns tc
left outer join SYS.all_col_comments com
on (tc.column_name = com.column_name and tc.table_name = com.table_name)
LEFT OUTER JOIN SYS.ALL_CONS_COLUMNS CC
on (tc.table_name = cc.table_name and tc.column_name = cc.column_name and tc.owner = cc.owner)
LEFT OUTER JOIN SYS.ALL_CONSTRAINTS AC
ON (ac.constraint_name = cc.constraint_name and ac.owner = cc.owner)
LEFT outer join user_triggers trg
on (ac.table_name = trg.table_name and ac.owner = trg.table_owner)
LEFT outer join user_dependencies dep
on (trg.trigger_name = dep.name and dep.referenced_type='SEQUENCE' and dep.type='TRIGGER')
LEFT outer join user_sequences seq
on (seq.sequence_name = dep.referenced_name)
where tc.table_name = 'TABLE_NAME'
and tc.owner = 'SCHEMA_NAME'
AND AC.CONSTRAINT_TYPE = 'P'
union all
select tc.table_name,
case tc.nullable
when 'Y' then 1
else 0
end as is_nullable,
case ac.constraint_type
when 'P' then 1
else 0
end as is_identity,
ac.constraint_type,
seq.increment_by as auto_increment_step,
seq.min_value as auto_increment_seed,
com.comments as caption,
tc.column_name,
tc.data_type,
tc.data_default as default_value,
tc.data_length as max_length,
tc.column_id,
tc.data_precision as precision,
tc.data_scale as scale
from SYS.all_tab_columns tc
left outer join SYS.all_col_comments com
on (tc.column_name = com.column_name and tc.table_name = com.table_name)
LEFT OUTER JOIN SYS.ALL_CONS_COLUMNS CC
on (tc.table_name = cc.table_name and tc.column_name = cc.column_name and tc.owner = cc.owner)
LEFT OUTER JOIN SYS.ALL_CONSTRAINTS AC
ON (ac.constraint_name = cc.constraint_name and ac.owner = cc.owner)
LEFT outer join user_triggers trg
on (ac.table_name = trg.table_name and ac.owner = trg.table_owner)
LEFT outer join user_dependencies dep
on (trg.trigger_name = dep.name and dep.referenced_type='SEQUENCE' and dep.type='TRIGGER')
LEFT outer join user_sequences seq
on (seq.sequence_name = dep.referenced_name)
where tc.table_name = 'TABLE_NAME'
and tc.owner = 'SCHEMA_NAME'
AND AC.CONSTRAINT_TYPE is null;
That would give you the list of columns for a schema/table with:
Table name
If column is nullable
Constraint type (only for PK's)
Increment seed (from the sequence)
Increment step (from the sequence)
Column comments
Column name, of course :)
Data type
Default value, if any
Length of column
Index (column id)
Precision (for numbers)
Scale (for numbers)
I'm pretty sure that code can be optimized but it works for me, I use it to "load metadata" for tables and then represent that metadata as entities on my frontend.
Note that I'm filtering only primary keys and not retrieving compound key constraints since I don't care about those. If you do you'll have to modify the code to do so and make sure that you filter duplicates since you could get one column twice (one for the PK constraint, another for the compound key).

Displaying foreign key relationships in Oracle 9i

Is there a command in Oracle 9i that displays the foreign keys of a table and also the tables that those foreign keys reference?
I searched and did not find anything, but I found an equivalent command for MySQL, which is SHOW CREATE TABLE.
Is there an equivalent command for this in Oracle's SQL?
I appreciate your response, however I thought there was a really short way of doing this like MySql.
Here's another answer: The dbms_metadata package has a function that can return the DDL for a table definition.
SELECT dbms_metadata.get_ddl('TABLE', '<table>', '<schema>') FROM dual;
This package has apparently been available since Oracle 9.2
http://download-west.oracle.com/docs/cd/B10501_01/appdev.920/a96612/d_metada.htm#1656
You could start by listing all of the constraints for the table along with any referenced constraint on other tables:
SELECT
acc.table_name
,acc.column_name
,acc.constraint_name
,ac.r_constraint_name AS referenced_constraint
FROM all_cons_columns acc
INNER JOIN all_constraints ac ON (acc.owner = ac.owner AND acc.constraint_name = ac.constraint_name)
WHERE acc.table_name = UPPER('your_table_here');
If you have sensible naming conventions for your constraints, it should be possible to identify which are the foreign keys; an 'FK' prefix/suffix is typical.
This may do what you want, it uses Oracle system views. I don't have an Oracle instance handy to test it, however.
SELECT fk.owner, fk.constraint_name, fk.table_name, fc.column_name,
pk.owner, pk.constraint_name, pk.table_name, pc.column_name
FROM all_constraints fk
JOIN all_cons_columns fc ON (fk.owner = fc.owner AND fk.constraint_name = fc.constraint_name)
JOIN (all_constraints pk
JOIN all_cons_columns pc ON (pk.owner = pc.owner AND pk.constraint_name = pc.constraint_name))
ON (fk.r_owner = pk.owner AND fk.r_constraint_name = pk.constraint_name
AND fc.position = pc.position)
WHERE fk.constraint_type = 'R' AND pk.constraint_type IN ('P', 'U')
AND fk.owner = '<schema>' AND fk.table_name = '<table>';
If you need the DDL for the foreign keys in the future, then here is the answer in advance :)
select
DBMS_METADATA.GET_DEPENDENT_DDL('REF_CONSTRAINT' ,atb.table_name, atb.owner)
from
all_tables atb, all_constraints ac
where
atb.owner = ac.owner and
ac.constraint_type = 'R' and
ac.table_name = atb.table_name and
atb.owner = 'YOURSCHEMA';
