Characters spilled over multiple columns in Oracle 11g?

This is related to question: How to store unlimited characters in Oracle 11g?
If the maximum I need is 8000 characters, can I just add 3 more varchar2 columns so that I have 4 columns of 2000 characters each, giving 8000 in total? When the first column is full, values would spill over into the next column, and so on. Will this design have any bad side effects? Please suggest.

Why not just use a CLOB column instead? I read your other link, and I don't understand why your DBAs don't like these types of columns. CLOB is an important Oracle feature designed for exactly this purpose, and your company paid good money for it, so why not use Oracle to its fullest capabilities instead of coming up with hacks to do something the database is not really designed to do?
Maybe instead of spending time devising hacks to work around limitations created by your DBAs, you should spend some time educating your DBAs on why CLOBs are the right feature to solve your problem.
I would never settle for a bad design when the database has the feature I need to make a good design. If the DBAs are the problem, then in my opinion they need to change their viewpoint, or you should escalate to senior management.

I agree with dcp that you should be using CLOB. But if against all sense you are forced to "roll your own" unlimited text using just VARCHAR2 columns then I would not do it by adding more and more VARCHAR2 columns to the table like this:
create table mytable
( id integer primary key
, text varchar2(2000)
, more_text varchar2(2000)
, and_still_more_text varchar2(2000)
);
Instead I would move the text to a separate child table like this:
create table mytable
( id integer primary key
);
create table mytable_text
( id integer references mytable(id)
, seqno integer
, text varchar2(2000)
, primary key (id, seqno)
);
Then you can insert as much text as you like for each mytable row, using many rows in mytable_text.
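For example, here is a sketch of storing and reading back one long text with this design, assuming a mytable row with id 1 already exists and that the application splits the text into 2000-character chunks:
insert into mytable_text (id, seqno, text) values (1, 1, 'first 2000 characters of the text ...');
insert into mytable_text (id, seqno, text) values (1, 2, 'next 2000 characters of the text ...');

select text
from mytable_text
where id = 1
order by seqno;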

To add to DCP's and Tony's excellent answers:
You ask if the approach you are proposing will have any bad side effects. Here are a few things to consider:
1. Suppose you want to search your text data for a particular string. Your approach requires repeating the search against every column that holds text, which results in a convoluted and inefficient WHERE clause (see the sketch below).
2. Every time you want to expand your text field you have to add another column, which means modifying every place you coded to handle (1).
3. Everyone who has to maintain this structure after you will curse your name ;-)
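To illustrate point (1), this is the kind of WHERE clause the multi-column design forces on you (the search string is hypothetical, and note that a value straddling a column boundary would not be found at all):
select id
from mytable
where text like '%search string%'
or more_text like '%search string%'
or and_still_more_text like '%search string%';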

Related

Formatting a column using Expression Transformation in Informatica Powercenter

I have a column (SALARY) in a source table from a relational DB; for example, 15000 is a value in the SALARY column. I want to format it as $15,000.00 in the target table (also a relational DB) using an Expression Transformation.
Thank you,
Ajay
This can be done using the steps below:
Pull the required column from the source into an Expression transformation.
Create a derived column as the concatenation of '$' and the input column, using the expression CONCAT('$', Source_Column).
Load this new column to the target instead of the column from the source.
I hope this is just to learn the functionality. In real-life scenarios this is bad practice: there is no need to store currency symbols in tables, as this can be handled directly at the reporting level.
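For example, if the data lives in Oracle (as in the surrounding questions), the formatting can be applied at report time with a number format mask; the table and column names here are hypothetical:
select to_char(salary, 'FM$999,999,990.00') as formatted_salary
from employee_salaries;
This returns $15,000.00 for a stored value of 15000, leaving the stored column purely numeric.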
There may be a scenario where the salary column holds salaries in different currencies, say dollars, euros, pounds, etc., in the same table.
I still agree with Thomas Cherian that it is not good practice to store the symbol with the data itself.
If this is the scenario, you can do one of two things:
1.) Add another column to the table that stores the currency type, or another column that also stores the conversion rate.
2.) Convert all the values to one currency and store them without the symbol.

Best way to identify a handful of records expected to have a flag set to TRUE

I have a pretty wide table that I expect to receive 7 million records a month. A small portion of these records are expected to be flagged as "problem" records.
What is the best way to implement the table so that these records can be located efficiently?
I'm new to Oracle, but is a materialized view a valid option? Does Oracle have anything like indexed views, or is that essentially the same thing?
Most of the reporting is by month, so partitioning by month seems like an option, but a "problem" record may theoretically linger for several months. Otherwise, the reporting should mostly be for the current month. Would you expect querying across all month partitions to locate any problem record to cause significant performance issues compared to using a single table?
Your general thoughts on where to start would be appreciated. I realize I need to read up and I'll do that, but I wanted to get the community's thoughts first to make sure I read the right material.
One more thought: the primary key is a GUID varchar2(36). In order of magnitude, how much of a performance hit would you expect this to be relative to using a NUMBER data type PK? This worries me, but it is out of my control.
It depends what you mean by "flagged", but it sounds to me like you would benefit from a simple index, a function-based index, or an indexed virtual column.
In all cases you should be careful to ensure that all the index columns are NULL for rows that do not need to be flagged. This way your index will contain only the rows that are flagged (by default, Oracle does not include rows in B-tree indexes when all of the index column values are NULL).
Your primary key being a VARCHAR2 GUID should make no difference, at least with regard to the specific flagging of rows in this question; indexes point to rows via Oracle's internal ROWIDs.
Indexes support partitioning, so if your data is already partitioned, your index could be set to match.
Simple column index method
If you can dictate how the flagging works, or the column already exists, then I would simply add an index to it like so:
CREATE INDEX my_table_problems_idx ON my_table (problem_flag)
/
Function-based index method
If the data model is fixed / there is no flag column, then you can create a function-based index assuming that you have all the information you need in the target table. For example:
CREATE INDEX my_table_problems_fnidx ON my_table (
CASE
WHEN amount > 100 THEN 'Y'
ELSE NULL
END
)
/
Now if you use the same logic in your SELECT statement, you should find that it uses the index to efficiently match rows.
SELECT *
FROM my_table
WHERE CASE
WHEN amount > 100 THEN 'Y'
ELSE NULL
END IS NOT NULL
/
This is a bit clunky though, and it requires you to use the same logic in queries as the index definition. Not great. You could use a view to mask this, but you're still duplicating logic in at least two places.
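A sketch of such a masking view; the CASE expression must be written exactly as in the index definition for the function-based index to remain usable:
CREATE OR REPLACE VIEW my_table_flagged AS
SELECT t.*,
       CASE
         WHEN amount > 100 THEN 'Y'
         ELSE NULL
       END AS problem_flag
FROM my_table t
/
Queries filtering on problem_flag IS NOT NULL should then pick up the index via view merging, but as noted the logic still lives in two places.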
Indexed virtual column
In my opinion, this is the best way to do it if you are computing the value dynamically (available from 11g onwards):
ALTER TABLE my_table
ADD virtual_problem_flag VARCHAR2(1) AS (
CASE
WHEN amount > 100 THEN 'Y'
ELSE NULL
END
)
/
CREATE INDEX my_table_problems_idx ON my_table (virtual_problem_flag)
/
Now you can just query the virtual column as if it were a real column, i.e.
SELECT *
FROM my_table
WHERE virtual_problem_flag = 'Y'
/
This will use the index and puts the function-based logic into a single place.
Create a new table holding just the PKs of the problem rows.
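A minimal sketch of that alternative, using the GUID primary key from the question; the side table would be maintained by whatever process flags the problems:
CREATE TABLE my_table_problems (
  id VARCHAR2(36) NOT NULL REFERENCES my_table (id),
  CONSTRAINT my_table_problems_pk PRIMARY KEY (id)
)
/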

SQL Loader, Trigger saturation? [closed]

I have a situation I can't find an explanation for; here it is. (I'll use hypothetical info since the originals are really big.)
I have a table, let's say:
table_a
-------------
name
last name
dept
status
notes
This table has an insert trigger which does a lot of validation on the data and sets the status field of the new record according to the results. Some of the validations are:
- check for the name existing in a dictionary
- check for the last name existing in a dictionary
- check that fields (name,last name,dept) aren't already inserted in table_b
- ... and so on
The thing is, if I do an insert on the table via query, like
insert into table_a
(name,last_name,dept,status,notes)
values
('john','smith',1,0,'new');
it takes only 173 ms to do all the validation process, update the status field and insert the record in the table. (the validation process does all the searches via indexes)
But if I try this via SQL*Loader, reading a file with 5000 records, it takes around 40 minutes to validate and insert just 149 records (of course I killed it...).
So I tried loading the data with the trigger disabled (to check the speed), and it loaded all the records in less than 10 seconds.
So my question is, what can I do to improve this process? My only theory is that I could be saturating the database because it loads so fast and launches many instances of the trigger, but I really don't know.
My objective is to load around 60 files with info and validate them through the process in the trigger (willing to try other options though).
I would really appreciate any help you can provide!
COMPLEMENT:
Thanks for the answer; I'll now read all about this, and I hope you can help me with this part. Let me explain some of the functionality I need (I used a trigger because I couldn't think of anything else).
The table comes with these (important) fields:
pid name lastname birthdate dept cicle notes
The incoming data looks like this:
name lastname birthdate dept
Now, the trigger does this to the data:
Calls a function to calculate the pid (it is calculated from the name, lastname and birthdate with an algorithm).
Calls a function to check the names against the dictionary (my dictionary holds single names, meaning if a person is named john aaron smith jones the function splits "john aaron" in two and searches for john and aaron in the dictionary in separate queries; that's why I didn't use a foreign key, to avoid having lots of combinations: john aaron, john alan, john pierce, etc.). ---> I'm kinda stuck on how to implement this one with keys without changing the dictionary... maybe with a CHECK? The lastname foreign key would be a good idea.
Gets the cicle from another table according to the dept and the current date (because a person can appear twice in the table in the same dept but in a different cicle). ---> How could I get this cicle value in a more efficient way to do the correct search?
And finally, after all this validation is done, I need to know exactly which validations were not met (hence the notes field), so the trigger concatenates the messages of all the failed validations, like this:
lastname not in dictionary, cannot calculate pid (invalid date), name not in dictionary
I know that if a constraint check isn't met, all I could do is insert the record into another table with the constraint-failed error message, but that only leaves me with one validation, am I right? I need to validate all of them and send the report to another department so they can review the data and make all the necessary adjustments to it.
Anyway, this is my situation right now. I'll explore the possibilities, and I hope you can shed some light on the overall process. Thank you very much for your time.
You're halfway to the solution already:
"So I tried loading the data disabling the trigger (to check speed) ... it loads like all the records in less than 10 seconds."
This is not a surprise. Your current implementation executes a lot of single-row SELECT statements for each row you insert into table A. That will inevitably give you a poor performance profile. SQL is a set-based language and performs better with multi-row operations.
So, what you need to do is find a way to replace all the SELECT statements with more efficient alternatives. Then you'll be able to drop the triggers permanently. For instance, replace the look-ups on the dictionary with foreign keys between the table A columns and the reference table. Relational integrity constraints, being internal Oracle code, perform much better than any code we can write (and work in multi-user environments too).
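A sketch of what that could look like for the last name check; the dictionary table and column names are assumptions, since the question does not give them:
alter table table_a
  add constraint table_a_last_name_fk
  foreign key (last_name)
  references lastname_dictionary (last_name);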
The rule about not inserting into table A if a combination of columns already exists in table B is more problematic. Not because it's hard to do but because it sounds like poor relational design. If you don't want to load records into table A when they already exist in table B, why aren't you loading into table B directly? Or perhaps you have a sub-set of columns which should be extracted from table A and table B and formed into table C (which would have foreign key relationships with A and B)?
Anyway, leaving that to one side, you can do this with set-based SQL by replacing SQL*Loader with an external table. An external table allows us to present a CSV file to the database as if it were a regular table. This means we can use it in normal SQL statements. Find out more.
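A sketch of such an external table over the incoming file; the directory object, file name, column list and delimiter are assumptions:
create or replace directory load_dir as '/path/to/load/files';

create table external_table (
  name      varchar2(50),
  last_name varchar2(50),
  dept      number
)
organization external (
  type oracle_loader
  default directory load_dir
  access parameters (
    records delimited by newline
    fields terminated by ','
    missing field values are null
  )
  location ('table_a_data.csv')
)
reject limit unlimited;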
So, with foreign key constraints on the dictionary and an external table, you can replace the SQL*Loader code with this statement (subject to whatever other rules are subsumed into "...and so on"):
insert into table_a
select ext.*
from external_table ext
left outer join table_b b
on (ext.name = b.name and ext.last_name = b.last_name and ext.dept=b.dept)
where b.name is null
log errors into err_table_a ('load_fail') reject limit unlimited;
This employs the DML error logging syntax to capture constraint errors for all rows in a set-based fashion. Find out more. It won't raise exceptions for rows which already exist in table B. You could either use the multi-table INSERT ALL to route rows into an overflow table or use a MINUS set operation after the event to find rows in the external table which aren't in table A. Depends on your end goal and how you need to report things.
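The error log table referenced above is created once with DBMS_ERRLOG; a sketch, assuming the table names already used in this answer:
begin
  dbms_errlog.create_error_log(dml_table_name     => 'TABLE_A',
                               err_log_table_name => 'ERR_TABLE_A');
end;
/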
Perhaps a more complex answer than you were expecting. Oracle SQL is a very extensive SQL implementation, with a lot of functionality for improving the efficiency of bulk operations. It really pays to read the Concepts Guide and the SQL Reference to find out just how much we can do with Oracle.

oracle- index organized table

What is the use case of an IOT (Index Organized Table)?
Let's say I have a table like:
id
Name
surname
I know what an IOT is, but I'm a bit confused about its use case.
Your three columns don't make a good use case.
IOT are most useful when you often access many consecutive rows from a table. Then you define a primary key such that the required order is represented.
A good example could be time series data such as historical stock prices. In order to draw a chart of the stock price of a share, many rows are read with consecutive dates.
So the primary key would be stock ticker (or security ID) and the date. The additional columns could be the last price and the volume.
A regular table - even with an index on ticker and date - would be much slower because the actual rows would be distributed over the whole disk. This is because you cannot influence the order of the rows and because data is inserted day by day (and not ticker by ticker).
In an index-organized table, the data for the same ticker ends up on a few disk pages, and the required disk pages can be easily found.
Setup of the table:
CREATE TABLE MARKET_DATA
(
TICKER VARCHAR2(20 BYTE) NOT NULL ENABLE,
P_DATE DATE NOT NULL ENABLE,
LAST_PRICE NUMBER,
VOLUME NUMBER,
CONSTRAINT MARKET_DATA_PK PRIMARY KEY (TICKER, P_DATE) ENABLE
)
ORGANIZATION INDEX;
Typical query:
SELECT TICKER, P_DATE, LAST_PRICE, VOLUME
FROM MARKET_DATA
WHERE TICKER = 'MSFT'
AND P_DATE BETWEEN SYSDATE - 1825 AND SYSDATE
ORDER BY P_DATE;
Think of index organized tables as indexes. We all know the point of an index: to improve access speeds to particular rows of data. A common performance optimisation trick is to build compound indexes on sub-sets of columns which can be used to satisfy commonly-run queries. If an index can completely satisfy the columns in a query's projection, the optimizer knows it doesn't have to read from the table at all.
IOTs are just this approach taken to its logical conclusion: build the index and throw away the underlying table.
There are two criteria for deciding whether to implement a table as an IOT:
It should consist of a primary key (one or more columns) and at most one other column (okay, perhaps two other columns at a stretch, but that's a warning flag).
The only access route for the table is the primary key (or its leading columns).
That second point is the one which catches most people out, and is the main reason why the use cases for IOT are pretty rare. Oracle don't recommend building other indexes on an IOT, so that means any access which doesn't drive from the primary key will be a Full Table Scan. That might not matter if the table is small and we don't need to access it through some other path very often, but it's a killer for most application tables.
It is also likely that a candidate table will have a relatively small number of rows, and is likely to be fairly static. But this is not a hard'n'fast rule; certainly a huge, volatile table which matched the two criteria listed above could still be considered for implementation as an IOT.
So what makes a good candidate for index organization? Reference data. Most code lookup tables look something like this:
code        number       not null primary key
description varchar2(30) not null
Almost always we're only interested in getting the description for a given code. So building it as an IOT will save space and reduce the access time to get the description.
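A sketch of that lookup table built as an IOT, using the ORGANIZATION INDEX syntax from the first answer (the table name is made up):
create table lookup_code (
  code        number        not null,
  description varchar2(30)  not null,
  constraint lookup_code_pk primary key (code)
)
organization index;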

How to store unlimited characters in Oracle 11g?

We have a table in Oracle 11g with a varchar2 column. We use a proprietary programming language where this column is defined as a string. At most we can store 2000 characters (4000 bytes) in this column. Now the requirement is that the column needs to store more than 2000 characters (in fact, unlimited characters). The DBAs don't like BLOB or LONG datatypes for maintenance reasons.
The solution that I can think of is to remove this column from the original table, have a separate table for this column, and then store each character in a row in order to get unlimited characters. This table will be joined with the original table for queries.
Is there any better solution to this problem?
UPDATE: The proprietary programming language allows defining variables of type string and blob; there is no option for CLOB. I understand the responses given, but I cannot take on the DBAs. I understand that deviating from BLOB or LONG will be a developers' nightmare, but I still cannot help it.
UPDATE 2: If the maximum I need is 8000 characters, can I just add 3 more columns so that I have 4 columns of 2000 characters each, giving 8000 in total? When the first column is full, values would spill over into the next column, and so on. Will this design have any bad side effects? Please suggest.
If a BLOB is what you need, convince your DBA that it's what you need. Those data types are there for a reason, and any roll-your-own implementation will be worse than the built-in type.
Also, you might want to look at the CLOB type, as it will meet your needs quite well.
You could follow the way Oracle stores stored-procedure source in the data dictionary (for example, USER_SOURCE): define a table that holds the text line by line:
CREATE TABLE MY_TEXT (
IDENTIFIER INT,
LINE INT,
TEXT VARCHAR2(4000),
PRIMARY KEY (IDENTIFIER, LINE));
The IDENTIFIER column is the foreign key to the original table. LINE is a simple integer (not a sequence) to keep the text chunks in order. This allows keeping larger chunks of data.
Yes, this is not as efficient as a BLOB, CLOB, or LONG (I would avoid LONG fields if at all possible). Yes, this requires more maintenance, but if your DBAs are dead set against managing CLOB fields in the database, this is option two.
EDIT:
My_Table below is where you currently have the VARCHAR column you are looking to expand. I would keep it in the table for the short text fields.
CREATE TABLE MY_TABLE (
IDENTIFIER INT,
OTHER_FIELD VARCHAR2(10),
REQUIRED_TEXT VARCHAR2(4000),
PRIMARY KEY (IDENTIFIER));
Then write a query that pulls the data by joining the two tables, ordering by LINE in MY_TEXT. Your application will need to split the string into 2000-character chunks and insert them in line order.
I would do this in a PL/SQL procedure, both the insert and the select. PL/SQL VARCHAR2 strings can be up to 32K characters, which may or may not be large enough for your needs.
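A minimal sketch of the insert side of such a procedure, assuming the MY_TABLE/MY_TEXT design above and the 2000-character chunk size discussed in the question (the procedure name is made up):
CREATE OR REPLACE PROCEDURE insert_long_text (
  p_identifier IN INT,
  p_text       IN VARCHAR2   -- up to 32K characters inside PL/SQL
) AS
BEGIN
  IF p_text IS NULL THEN
    RETURN;
  END IF;
  -- one MY_TEXT row per 2000-character chunk, in line order
  FOR i IN 0 .. TRUNC((LENGTH(p_text) - 1) / 2000) LOOP
    INSERT INTO my_text (identifier, line, text)
    VALUES (p_identifier, i + 1, SUBSTR(p_text, i * 2000 + 1, 2000));
  END LOOP;
END insert_long_text;
/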
But like every other person answering this question, I would strongly suggest making a case to the DBA to make the column a CLOB. From the program perspective this will be a BLOB and therefore simple to manage.
You said no BLOB or LONG... but what about CLOB? 4GB character data.
BLOB is the best solution. Anything else will be less convenient and a bigger maintenance annoyance.
Is BFILE a viable alternative datatype for your DBAs?
I don't get it. A CLOB is the appropriate database datatype. If your weird programming language can deal with strings of 8000 (or whatever) characters, what stops it writing those to a CLOB?
More specifically, what error do you get (from Oracle or your programming language) when you try to insert an 8000-character string into a column defined as a CLOB?
