Wildcard searches using a list of values in Oracle

I am trying to create a search against a field on a table that can contain all sorts of string values, from single words to complete sentences.
What I am trying to do is create a query that searches this field and returns any rows that contain any of the words or combinations of words from a list passed in as my predicate parameter.
Is there a way of passing in this list as a wildcard search?

As I understand it, you have a list of keywords, and you need to return all rows for which a certain column contains one or more of them.
Two possible ways that I can think of:
Using REGEXP_LIKE (an unanchored pattern matches the keyword anywhere in the column):
select * from tbl where regexp_like(col, '(ABC|EFG|IJK)');
Maintaining the list of keywords in a separate table and joining against it:
CREATE TABLE matching_patterns(
pattern VARCHAR2(20)
);
INSERT INTO matching_patterns VALUES ('%ABC%');
INSERT INTO matching_patterns VALUES ('%EFG%');
INSERT INTO matching_patterns VALUES ('%IJK%');
SELECT t.* FROM your_table t JOIN matching_patterns p ON (t.your_column LIKE p.pattern);
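If the keywords arrive as a query parameter rather than living in a table, one option is to pass them as a collection and test each row against it. The following is a minimal sketch, assuming the built-in collection type SYS.ODCIVARCHAR2LIST and the placeholder names your_table and your_column used above. INSTR does a plain substring match, so the keywords need no wildcard characters, and EXISTS keeps a row from being returned twice when it matches several keywords.

```sql
-- Sketch only: your_table / your_column are placeholder names.
-- SYS.ODCIVARCHAR2LIST is a built-in VARRAY of VARCHAR2; TABLE() exposes
-- its elements as rows with a single column named COLUMN_VALUE.
SELECT t.*
FROM   your_table t
WHERE  EXISTS (
         SELECT 1
         FROM   TABLE(sys.odcivarchar2list('ABC', 'EFG', 'IJK')) k
         WHERE  INSTR(t.your_column, k.column_value) > 0
       );
```

In application code the literal list would normally be replaced by a bind variable of the collection type.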

Related

Oracle Composite Index Performance

My sample Oracle query structure is this:
SELECT <LIST_OF_COLUMNS> FROM <TABLE_NAME>
WHERE COLUMN_01 = <SOMETHING> AND COLUMN_02 = <SOMETHING> AND COLUMN_03 = <SOMETHING>
The table has over 1 million records. I have indexed COLUMN_01, COLUMN_02 and COLUMN_03 separately. The above query works fine and provides the expected results.
If I make COLUMN_01, COLUMN_02 and COLUMN_03 (all columns in WHERE clause) as composite index without changing existing indexes, will it improve performance? If so, is there an order for composite index columns?
If I use OR instead of AND like this query, will it improve performance?
SELECT <LIST_OF_COLUMNS> FROM <TABLE_NAME>
WHERE COLUMN_01 = <SOMETHING> OR COLUMN_02 = <SOMETHING> OR COLUMN_03 = <SOMETHING>
If you create a composite index on the three columns and you never query the table using only one of those columns in the WHERE clause, you won't need the single-column indexes. Otherwise, it depends on which columns participate in each query, and that case should be analyzed and tested carefully.
Column order in a composite index does matter. Columns should be ordered by uniqueness, with the least distinct column first. It helps trim down the number of rows matching the query predicate and thus speeds up the query.
It should also be noted that Oracle can use a composite index for queries that do not contain all of the index columns in their predicates.
For example:
create index idx1 on table_name (col1, col2, col3);
/*In this query Oracle can use index idx1 as a standard one-column index,
because col1 is the first column in the index*/
select * from table_name where col1 = 'some_value';
/*Here Oracle can still use the composite index,
but in this case it would typically be an INDEX SKIP SCAN (treating col1 as matching ANY value),
which performs worse than a range scan on an ordinary index*/
select * from table_name where col2 = 'some_value1' and col3 = 'some_value2';
OR or AND operators do not really matter here. What is more important is the number of rows which match the given predicate.
If I make COLUMN_01, COLUMN_02 and COLUMN_03 (all columns in WHERE clause) as composite index without changing existing indexes, will it improve performance?
Probably. One index that satisfies all the WHERE criteria serves as a complete access path and hence is more efficient than a single-column index access path. With only the single-column indexes, the optimizer typically picks one of them, so it reads via that index all the rows matching (say) the COLUMN_02 criterion and then filters those rows using the other columns' criteria.
The price you pay for this improvement in performance is the overhead of maintaining an additional index. So you should consider whether you need all three single column indexes (for other queries).
is there an order for composite index columns?
Yes. Put them in ascending order of distinct values. The leading index column should be the least discriminating column. Having a unique key as the leading column is probably a disaster, although there are edge cases, so be sure to benchmark.
If I use OR instead of AND like this query, will it improve performance?
You're going to be returning more rows, which in itself is more work. It is also hard to use indexes in such a situation, so most likely you're facing a Full Table Scan. But why not try it and see what happens?
If I make COLUMN_01, COLUMN_02 and COLUMN_03 (all columns in WHERE clause) as composite index without changing existing indexes, will it improve performance?
For this query: likely. For INSERT/UPDATE/DELETE: the performance will deteriorate.
So you'll need to measure and see whether improvement in some queries justifies the deterioration in others.
If so, is there an order for composite index columns?
Not for this query. If you prefix-compress the index, you may choose the order that compresses the best, otherwise it shouldn't matter much.
However, there may be other queries that use only some of the indexed columns, in which case you'd want to make sure the columns that are actually used are at the leading edge of the index.
If I use OR instead of AND like this query, will it improve performance?
No. Separate indexes (that you already have) are what is needed in this case.
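Since the answers above disagree on some points, comparing execution plans on your own data is the practical way to settle them. A minimal sketch, assuming the placeholder names table_name and col1..col3 from the earlier example, a hypothetical index name idx_all3, and made-up literal values:

```sql
-- Sketch only: table_name, col1..col3, idx_all3 and the literals are placeholders.
CREATE INDEX idx_all3 ON table_name (col1, col2, col3);

-- Plan for the AND query: check whether idx_all3 is chosen and at what cost.
EXPLAIN PLAN FOR
SELECT * FROM table_name
WHERE  col1 = 'a' AND col2 = 'b' AND col3 = 'c';

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Plan for the OR query: check whether the separate indexes are combined
-- or whether the optimizer falls back to a full table scan.
EXPLAIN PLAN FOR
SELECT * FROM table_name
WHERE  col1 = 'a' OR col2 = 'b' OR col3 = 'c';

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

EXPLAIN PLAN only shows the optimizer's estimate, so timing the real queries with representative values is still worthwhile.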

Create index for last two digits of number in Oracle

I have a massive table that I can't partition or sub-partition any further, nor am I allowed to alter it. I want to query its records in batches, and thought a good way would be to use the last two digits of the account numbers (no other field would split the records as evenly).
I guess I'd need to at least index that somehow (remember I can't alter table to add a virtual column either).
Is there any kind of index to be used in such situation?
I am using Oracle 11gR2
You can use a function-based index:
create index two_digits_idx on table_name (substr(account_number, -2));
This index will only be used by queries like this:
select ...
from table_name t ...
where substr(account_number, -2) = '25' -- or any other two digits
For the index to be used, the query must contain the same expression as the index definition.
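As a sketch of how the batches might then be driven, each run binds one two-character suffix and repeats the indexed expression exactly. The bind names :batch_suffix and :batch_no and the index name two_digits_mod_idx are hypothetical. The second half is an assumed alternative for the case where account_number is a NUMBER column, which the question does not state.

```sql
-- Sketch only: table_name and account_number are the placeholder names used above.
-- One batch: the predicate repeats the indexed expression exactly,
-- so the optimizer can consider two_digits_idx.
SELECT t.*
FROM   table_name t
WHERE  SUBSTR(t.account_number, -2) = :batch_suffix;  -- bind '00' .. '99'

-- Assumed alternative if account_number is a NUMBER:
-- index the arithmetic expression instead of the string expression.
CREATE INDEX two_digits_mod_idx ON table_name (MOD(account_number, 100));

SELECT t.*
FROM   table_name t
WHERE  MOD(t.account_number, 100) = :batch_no;        -- bind 0 .. 99
```

Each batch covers roughly one percent of the rows, so whether the optimizer prefers the index over a full scan is worth confirming with EXPLAIN PLAN.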

Cognos Report Studio Total of Column based on Distinct values of Column2

Cognos by default suppresses duplicate/identical records. Duplicate rows do not appear in the report, but summaries are performed on all rows, including the duplicates that were eliminated.
To perform summaries on only the distinct rows, you must add the distinct keyword when creating the summary definition. For example, the following summary:
Total(MyColumn)
Would become...
Total (distinct MyColumn)
But I would like the total of Column1 based on distinct values of Column2. How can I do this?
I assume your report is built on top of a relational model.
The short answer to your question is to use the FOR clause:
Using the AT and FOR Options with Relational Summary Functions
So you can do something like this:
Total(distinct MyColumn for Column2)
My question is: why would you think distinct on one column is different from distinct on another column?
Cognos "eliminates" duplicate rows only if two or more rows are completely identical.
If one of the columns differs, the rows are not duplicates.
You can use grouping instead, which groups together identical values in a single column.

select distinct from dataset

I have duplicate rows in a dataset. How can I select distinct rows from it?
From comments: My query is something like this:
select name, age
from student
When I receive its output in a dataset the output consists of rows having duplicate names. Using dataset itself I have to select distinct name from this because I need the same query with duplicate values for some other place.
select DISTINCT name, age from student
If you need both the distinct data as well as the full data (with duplicate values), then you'll either have to maintain two datasets or continue doing things as you are now.

Is an Index Organized Table appropriate here?

I recently was reading about Oracle Index Organized Tables (IOTs) but am not sure I quite understand WHEN to use them. So I have a small table:
create table categories
(
id VARCHAR2(36),
group VARCHAR2(100),
category VARCHAR2(100)
);
create unique index (group, category, id) COMPRESS 2;
The id column is a foreign key from another table entries and my common query is:
select e.id, e.time, e.title from entries e, categories c where e.id=c.id AND e.group=? AND c.category=? ORDER by e.time
The entries table is indexed properly.
Both of these tables have millions (16M currently) of rows and currently this query really stinks (note: I have it wrapped in a pagination query also so I only get back the first 20, but for simplicity I omitted that).
Since I am basically indexing the entire table, does it make sense to create this table as an IOT?
EDIT by popular demand:
create table entries
(
id VARCHAR2(36),
time TIMESTAMP,
group VARCHAR2(100),
title VARCHAR2(500),
....
)
create index (group, time) compress 1;
I don't think my real question depends on this, though. Basically, if you have a table with few columns (3 in this example) and you are planning to put a composite index on all three columns, is there any reason not to use an IOT?
IOTs are great for a number of purposes, including this case where you're gonna have an index on all (or most) of the columns anyway - but the benefit only materialises if you don't have the extra index - the idea is that the table itself is an index, so put the columns in the order that you want the index to be in. In your case, you're accessing category by id, so it makes sense for that to be the first column. So effectively you've got an index on (id, group, category). I don't know why you'd want an additional index on (group, category, id).
Your query:
SELECT e.id, e.time, e.title
FROM entries e, categories c
WHERE e.id=c.id AND e.group=? AND c.category=?
ORDER by e.time
You're joining the tables by ID, but you have no index on entries.id - so the query is probably doing a hash or sort merge join. I wouldn't mind seeing a plan for what your system is doing now to confirm.
If you're doing a pagination query (i.e. only interested in a small number of rows) you want to get the first rows back as quick as possible; for this to happen you'll probably want a nested loop on entries, e.g.:
NESTED LOOPS
  ACCESS TABLE BY ROWID - ENTRIES
    INDEX RANGE SCAN - (index on ENTRIES.group,time)
  ACCESS TABLE BY ROWID - CATEGORIES
    INDEX RANGE SCAN - (index on CATEGORIES.ID)
Since the join to CATEGORIES is on ID, you'll want an index on ID; if you make it an IOT, and make ID the leading column, that might be sufficient.
The performance of the plan I've shown above will be dependent on how many rows match the given "group" - i.e. how selective an average "group" is.
Have you looked at dba-oracle.com, asktom.com, or IOUG?
There are penalties to pay for IOTs - e.g., poorer insert performance
Can you prototype it and compare performance?
Also, perhaps you might want to consider a hash cluster.
IOTs are a trade-off: you gain access performance at the cost of insert/update performance. We typically use them for reference data that is batch loaded daily and not updated during the day. This is not to say it's the only way to use them, just how we use them.
Few things here:
You mention pagination - have you considered the first_rows hint?
Is that the order your index is in, with group as the first field? If so I'd consider moving ID to be the first column since that index will not be used.
Foreign keys should have an index on the column. Consider adding an index on the foreign key (id column).
Are you sure it's not the ORDER BY causing slowness?
What version of Oracle are you using?
I ASSUME there is a primary key on table entries for field id, correct?
Why does the WHERE condition not include "c.group = e.group"?
Try to:
Remove the ORDER BY clause
Change the index definition from "create unique index (group, category, id)" to "create unique index (id, group, category)"
Reorganise table categories as an IOT on (group, category, id)
Reorganise table categories as an IOT on (id, group, category)
In each of the above cases, use EXPLAIN PLAN to review the cost
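For reference, an IOT is declared by giving the table a primary key and adding ORGANIZATION INDEX. This is a minimal sketch of the (id, group, category) variant with several assumptions: the names categories_iot and categories_iot_pk are hypothetical, and grp stands in for the question's group column because GROUP is a reserved word in Oracle.

```sql
-- Sketch only: categories_iot, categories_iot_pk and grp are hypothetical names.
-- An index-organized table must have a primary key; the rows are stored
-- in the primary key index itself, ordered by (id, grp, category).
CREATE TABLE categories_iot (
  id        VARCHAR2(36),
  grp       VARCHAR2(100),
  category  VARCHAR2(100),
  CONSTRAINT categories_iot_pk PRIMARY KEY (id, grp, category)
)
ORGANIZATION INDEX;
```

Swapping the column order in the PRIMARY KEY clause gives the (group, category, id) variant, so both can be prototyped and compared with EXPLAIN PLAN.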
