Using dynamic lookup from parallel sessions with synchronized cache in Informatica - informatica-powercenter

Using Informatica 9.1.0
Scenario
Generate the dimension key and insert it into the fact table as part of the fact load.
I have to load the fact table with a dimension key along with other columns, and the dimension record is created from within the same mapping. Five different sessions use the same mapping and execute simultaneously to load the fact table. I'm using a dynamic lookup with 'Synchronize dynamic cache' enabled so that the five sessions generate unique dimension records based on certain conditions. The dimension ID is generated using Sequence-ID in the associated expression of the lookup. When a single session is run on its own, it works perfectly fine. But when the sessions are run in parallel, I start to get unique key violation errors, because random sessions try to insert a sequence value that has already been inserted.
To fix the issue I had to enable a persistent lookup cache and set a cache file name prefix. But I could not find this solution, or even this issue, in any of the forums or in the INFA communities, so I'm not sure whether this is the right way of doing it or whether it is a bug of some kind.
Please let me know if you have had a similar issue or have some different thoughts.
Thanks in advance

One other possible solution I can think of is to have the database generate the sequence value instead of using Informatica's sequence generator. A database sequence guarantees uniqueness across concurrent sessions, so it avoids the unique key violations.
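For example, a minimal Oracle sketch of that approach, with hypothetical dimension table and sequence names:

-- The database guarantees concurrent sessions never receive the same value.
CREATE SEQUENCE dim_record_seq START WITH 1 INCREMENT BY 1 CACHE 100;

INSERT INTO dim_record (dim_record_key, dim_natural_key)
VALUES (dim_record_seq.NEXTVAL, :natural_key);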

Related

Spring Boot application with Postgres: indexes not being used during first use

I have a Spring Boot application that is using a Postgres database. When the application is deployed I need to run a transactional operation that uploads a zip file that is used to populate the database. The application is checking for duplicate rows before inserting them (because users can upload duplicate data that should just be ignored).
The problem I am having is that the first time I upload the file, even though the indexes are created, they are not being used when checking for the existence of a row. My theory is that this happens because the query planner decides not to use the index, since it is looking at the original statistics, which show that the tables are empty. If I upload a small zip file first, the problem goes away because the tables now have data.
I have two questions. First, is my theory correct or is there some other reason for this behaviour? Also, if so, is there a way to force Postgres to update the query plan it uses at some predefined interval within the same transaction and can this be done using JPA? Any ideas are appreciated.
Just in case someone runs into this issue, I'll post the solution I found. It appears my theory was correct. The queries will not use the indexes until some statistics are collected. One way to force this is to call ANALYZE after a number of rows have been written to the database. You can do this using a native query like this:
entityManager.createNativeQuery("ANALYZE " + tbl).executeUpdate();
You can wrap this call in a try catch and ignore any exceptions that might occur if you change the database engine. I couldn't find a way of doing this in a database-independent way but this approach works fine and now the initial upload performs as expected.

Why no primary key

I have inherited a database with tables that lack primary keys. It's an OLTP database. One of the tables in question has ~300k records and has no primary key implemented, even though examining the rest of the schema tells me that one column is used as a primary key, i.e. it is replicated in another table with an identical name, etc. In other words, this is not an 'end of line' table.
This database also does not implement FKs.
My question is - is there ANY valid reason for a table (in Oracle for that matter) NOT to have a primary key?
I think a PK is mandatory in almost all cases. There are lots of reasons, but I'll cover a few of them.
It prevents the insertion of duplicate rows.
Rows will be referenced from other tables, so there must be a key for them to reference.
I have seen very few cases where tables are created without a PK (e.g. tables for logs).
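A small sketch of both points (table and column names here are hypothetical):

CREATE TABLE customer (
  customer_id   NUMBER PRIMARY KEY,        -- duplicate customer_id values are rejected
  customer_name VARCHAR2(200)
);

CREATE TABLE customer_order (
  order_id    NUMBER PRIMARY KEY,
  customer_id NUMBER REFERENCES customer   -- the foreign key needs a key to reference
);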
Not specific to Oracle, but I recall reading about one such use case where MySQL was highly customized for a dam (electricity generation) project, I think. The input data from sensors arrived at something like 100-1000 records per second. They used timestamps for each record, so they didn't need a primary key (like the logging case mentioned in another answer here).
So good reasons would be:
Overhead, in the case of high frequency transactions
No real necessity for a key in that case
"Uniqueness" maintained or inferred by application, not by db
In a normalized table where every record needs to be unique and every field is referenced by other tables, a PK adds index overhead, and the PK might never actually be used in any SQL query (imho I disagree with this, but it's possible). Such a table should still have a unique index encompassing all the fields, though.
Bad reasons are infinite :-)
The most frequent bad reason which is actually responsible for the lack of a primary key is when DBs are designed by application/code-developers with little or no DB experience, who want to (or think they should) handle all data constraints in the application.
Any valid reason? I'd say "No"--I'm a database guy--but there are places that insist on using the database as a dumb data store. They usually implement all integrity "constraints" in application code.
Putting integrity constraints into application code isn't usually done to improve performance. In fact, if you built one database that enforces all the known constraints, and you built another with functionally identical constraints only in application code, the first one would almost certainly run rings around the second one.
Instead, application-level constraints usually hope to increase flexibility. (And, in the process, some of the known constraints are usually dropped, which appears to improve performance.) If it becomes inconvenient to enforce certain constraints in order to bulk load some scruffy data, an application programmer can just side-step the application-level constraints for a little while, then clean up the data when it's more convenient.
I'm not a DB expert, but I remember a conversation with a friend who worked in the Oracle apps department, who told me this was done to handle emergencies. If there is a problem in some report being generated that you could fix by inserting a row, db-level constraints often stand in your way. They generally implemented things like unique primary keys in the application rather than the database. It was inefficient, but good enough for them, and much more manageable in a disaster recovery scenario.
You need a primary key to enforce uniqueness for a subset of its columns (useful if you need to refer to individual rows). It also speeds up certain queries because of the index associated to it.
If you do not need that index, or that uniqueness constraint, then you may not need a primary key (the index does not come free).
An example that comes to mind are logging tables, that just record some data (that is never updated or queried for individual records).
There is a small overhead when inserting into a table with an index, and you need an index if you have a primary key. The downside of skipping the index, of course, is that finding an individual row becomes very costly.
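As a rough illustration of that trade-off (the table and column names are hypothetical):

-- Append-only log table: no primary key, so inserts skip index maintenance,
-- but locating any individual row later means a full table scan.
CREATE TABLE app_event_log (
  logged_at  TIMESTAMP,
  event_text VARCHAR2(4000)
);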

Make a row in a table read only on Oracle?

I have a table with many rows.
For testing purposes my colleagues are also using the same table. The problem is that sometimes they delete the rows I am testing with, and sometimes I delete theirs.
So is there any way in Oracle to make some specific rows read only, so that others cannot delete or edit them?
Thanks.
There are a number of different ways to tackle this problem.
As Sun Tzu said, the best thing would be if you and your colleagues use data sets which do not collide.
For instance, perhaps you could each have your own database instance on local PCs; whether this will suit depends on a number of factors, not the least of which is your licensing arrangement with Oracle. Alternatively, you could have separate schemas in a shared database; depending on your application you may need to use synonyms or special connections.
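For example (the schema names are hypothetical), a private synonym can point your unqualified table references at whichever copy you want to work against:

-- In your own schema: make the name ORDERS resolve to the shared schema's table
CREATE SYNONYM orders FOR shared_app.orders;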
Another approach: everybody builds their own data sets, known as test fixtures. This is a good policy, because testing is only truly valid when it runs against a known state; if we make assumptions regarding the presence or absence of data how valid are our test results? The point is, the tests should clean up after themselves, removing any data created in fixtures and by the running of tests. With this tactic you need to agree ranges of IDs for each team member: they must only use records within their ranges for testing or development work.
I prefer these sorts of approach because they don't really change the way the application works (arguably except using different schemas and synonyms). More draconian methods are available.
If you have Enterprise Edition you can use Row Level Security to protect your records. This is an extension of the last point: you will need a mechanism for identifying your records, and some infrastructure to identify ownership within the session. But in addition to preventing other users from deleting your data, you can also prevent them from inserting, updating or even viewing records that are within your range of IDs. Find out more.
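A rough sketch of what that might look like (the table, the owner column and the policy names are all hypothetical, and it assumes each row stores the Oracle user that created it):

-- The policy function returns a predicate that Oracle appends to matching statements
CREATE OR REPLACE FUNCTION my_rows_only (p_schema IN VARCHAR2, p_object IN VARCHAR2)
  RETURN VARCHAR2
IS
BEGIN
  RETURN 'owner_user = SYS_CONTEXT(''USERENV'', ''SESSION_USER'')';
END;
/

BEGIN
  DBMS_RLS.ADD_POLICY(
    object_schema   => 'TEST_OWNER',
    object_name     => 'SHARED_TEST_TABLE',
    policy_name     => 'protect_my_rows',
    function_schema => 'TEST_OWNER',
    policy_function => 'MY_ROWS_ONLY',
    statement_types => 'UPDATE,DELETE');   -- add SELECT/INSERT to restrict those too
END;
/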
A lighter solution is to use a trigger, as A B Cade suggests. You will still need a way of identifying your records and who is connected (because presumably, from time to time, you will still want to delete your own records).
One last strategy: take your ball home. Get the table in the state you want it and make a data pump export. For extra vindictiveness you can truncate the table at this point. Then any time you want to use the table you run a data pump import. This will reset the table's state, wiping out any existing data. This is just an extreme version of test scripts creating their own data.
You can create a trigger that prevents deleting some specific rows.
CREATE OR REPLACE TRIGGER trg_dont_delete
BEFORE DELETE
ON <your_table_name>
FOR EACH ROW
BEGIN
  IF :OLD.ID IN (<IDs of rows you dont want to be deleted>) THEN
    raise_application_error(-20001, 'Do not delete my records!!!');
  END IF;
END;
Of course you can make it smarter - make the IF statement depend on the user, or get the record IDs from another table, and so on.
Oracle supports row-level locking. You can prevent others from deleting the row you are currently using. Check this link to learn more.
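For example (the table name and IDs are illustrative), within your transaction:

-- Locks the chosen rows until you COMMIT or ROLLBACK; other sessions that try
-- to UPDATE/DELETE them will wait (or fail immediately if they use NOWAIT).
SELECT *
  FROM shared_test_table
 WHERE id IN (101, 102)
   FOR UPDATE NOWAIT;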

managing/implementing auto-increment primary key in oracle without triggers

We have many tables in our database with auto-increment primary key IDs set up the way they are in MySQL, since we are in the process of migrating from MySQL to Oracle.
Now, in Oracle, I recently learned that implementing this requires creating a sequence and a trigger on the ID field for each such table. We have 30-40 tables in our schema, and we want to avoid using database triggers in our product, since management of the database is out of scope for our software appliance.
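For context, the per-table pattern being described looks roughly like this (table, column and sequence names are hypothetical):

CREATE SEQUENCE my_table_seq;

CREATE OR REPLACE TRIGGER my_table_bi_trg
BEFORE INSERT ON my_table
FOR EACH ROW
BEGIN
  -- populate the ID from the sequence when the application doesn't supply one
  IF :NEW.id IS NULL THEN
    SELECT my_table_seq.NEXTVAL INTO :NEW.id FROM dual;
  END IF;
END;
/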
What are my options for implementing the auto-increment ID feature in Oracle, apart from manually specifying the ID in the code and managing it there, which would mean changing a lot of existing insert statements?
... I wonder if there is a way to do this from the Grails code itself? (By the way, specifying the ID as 'increment' in the domain class mapping doesn't work - that only works for MySQL.)
Some info about our application environment: Grails/Groovy, Hibernate, Oracle and MySQL support.
This answer will have Grails/Hibernate handle the sequence generation by itself. It'll create a sequence per table for the primary key generation and won't cache any numbers, so you won't lose any identifiers if and when the cache times out. Grails/Hibernate calls the sequence directly, so it doesn't make use of any triggers either.
If you are using Grails, Hibernate will handle this for you automatically.
You can specify which sequence to use by putting the following in your domain object:
static mapping = {
    id generator: 'sequence', params: [sequence: 'MY_SEQ']
}
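The sequence itself still has to exist in the Oracle schema; something along these lines (the options shown are one reasonable choice, not the only one):

-- Hibernate calls MY_SEQ.NEXTVAL directly on insert, so no trigger is needed.
CREATE SEQUENCE MY_SEQ START WITH 1 INCREMENT BY 1 NOCACHE;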

Forcing Oracle to use Primary Key Index without using Hints

We have an application that generates some temporary tables and then processes the data. I don't really have control over the way the application creates these tables or the subsequent queries involved. What we have noticed is that Oracle uses a full table scan instead of the index on the primary key of the tables. If it used the primary key index, the process would run a whole lot faster.
Since I do not have control over the select queries generated by the application, I cannot use hints to force Oracle to use the primary key index. Is there any other setting I could change somewhere that would force Oracle to use the primary key index for the temporary tables?
The two most common reasons for a query not using indexes are:
It's quicker to do a full table scan.
Poor statistics.
If your queries are selecting all of the table, or doing joins without mentioning a primary key in the WHERE clause, etc., chances are it's quicker to do a full scan. Without the query and indexes, and preferably an explain plan as well, it's impossible to tell for certain.
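If you can capture one of the generated statements, an execution plan will show what Oracle is actually doing; for example (the query and table name here are only placeholders):

EXPLAIN PLAN FOR
  SELECT * FROM temp_work_table WHERE id = 42;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);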
I would, however, recommend that you ask your DBA to re-gather (or, I suspect, gather for the first time) statistics on the table, using dbms_stats.gather_table_stats with an estimate percentage of 25% or more.
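Something like the following (the owner and table names are placeholders):

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'APP_SCHEMA',       -- hypothetical owner
    tabname          => 'TEMP_WORK_TABLE',  -- hypothetical table
    estimate_percent => 25,
    cascade          => TRUE);              -- gather index statistics as well
END;
/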
If the tables are re-created each time the application is run, then try to gather statistics after creation and primary key generation. If they are truncated and re-filled each time, then ask your DBA to rebuild them and the PK and then gather statistics, as this could significantly reduce query runtime.
With no control over anything I don't see how you can improve the query time any other way.
You can use hints without changing SQL by leveraging SQL Profiles. Wrap your hint(s) into a SQL Profile that takes effect for that particular SQL ID.
I understand you don't have control over the SQL; I have many apps where I encounter the same restriction. After checking query structure and statistics as in Ben's post, and once you have proved that hinting to use the index will improve performance, why not try a manually created SQL Profile?
Christian Antognini has a great paper here about SQL Profiles and creating them manually. The paper mentions creating SQL Profiles manually is undocumented. I would agree undocumented, but that doesn't necessarily mean unsupported. I would say there is little documentation out there, but if you want proof that Oracle allows manual creation, check the API or look at the coe_xfr_sql_profile.sql file in the SQLT utility directory.
I also posted a cheatsheet on how to quickly manually create a SQL Profile here.
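As a rough, heavily simplified sketch of the manual approach from that paper (the SQL text and the hint below are purely illustrative; the real statement text must match what the application actually submits):

DECLARE
  l_sql_text CLOB := 'SELECT * FROM temp_work_table WHERE id = :1';
BEGIN
  DBMS_SQLTUNE.IMPORT_SQL_PROFILE(
    sql_text    => l_sql_text,
    profile     => sqlprof_attr('INDEX(@"SEL$1" "TEMP_WORK_TABLE"@"SEL$1")'),
    name        => 'force_pk_index_profile',
    force_match => TRUE);   -- apply to all literal variations of the statement
END;
/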
