I have a table which needs to be ingested from Oracle source to Greenland target using ETL tool talend. The table is huge , hence we want to load the data on daily basis incrementally. The table doesn't have any primary or unique key.
Table has date column, I am able to get both inserted/updated records from last update date but to insert that data, we need a primary key.
Any solution on how to load the data without using a primary key?
You need to define your key in talend in the schema of the component that insert into your target table, like this :
And you can use this key to update your table, in the advanced settings of the same component, activate the check box use fields optins and select your key :
This is tested and worked fine against Oracle table that does not have primary key, and it should work for you.
Related
I am using talend for ETL I don't have enough experience in this, I am having two tables for example- account and account_roles account table is having id, name, password etc fields and account_roles table is having account_id which is f.k to account table's pk. and one more field.
Both the fields in account_roles are having duplicates, I want to save account_roles in destination with update and insert logic using talend.
But I am getting error as I don't have any table that can be treated as primary key in the account_roles table, so talend can't update or insert it.
How I deal with this situation I tried tDBOutput advance option use_field_option but still it requires unique entries.
Is there any possible solution to this issue, I also want to know if I can make table Foreign key in the account_roles table will it work then? If yes then How to make F.k in talend OPen studio is my second question.
Attaching Snapshots of my tables and tMap below -
I want to know the way I can put my tables into database if I don't have any primary key! Kindly help me.
First question
I think you should place the primary key in the physical account_roles table. Talend will use the key indication of the dbOutput component, and the physical key of the table.
In order to delete duplicates rows, you can also use a tUniqRow before the dbOutput: The key you indicate in the UniqRow is not directly linked to the database; this is only the key on which tUniqRow will be based.
Second question
It's not possible to delegate the f.k. verification to Talend. But you can do this verification in your database by placing foreign keys in your table. If an id is not present in the reference table, the database returns an error that is returned by Talend.
I try to update column type from DateTime(TZ) to DateTime, but it is key column and couldn't be changed. Drop/create table doesn't have any result - looks like metadata stored in ZK.
Can I change table structure (I can drop/create table) without changing ZK records? Or it is required to remove meta from ZK?
You need to drop a table at all replicas. If you lost a replica and did notdrop their table you need to clean ZK manually.
Or you can just use another ZK path. Table name does matter.
I read that we cannot create a primary key on a column in a Hive table. But I saw the below DDL in some other place and executed it. It worked without any problem.
create table prim(id int, name char(30))
TBLPROPERTIES("PRIMARY KEY"="id");
After this I executed "describe formatted prim" and got to see that a key is created on the column ID
Table Parameters:
PRIMARY KEY id
I inserted two records with same ID number into the table.
insert into prim values(1,'ABCD');
insert into prim values(2,'EFGH');
Both the records were inserted into the table. What baffles me is that we cannot give the PRIMARY KEY in the create statement which I can understand, but when given in TBLPROPERTIES("PRIMARY KEY"="id") how different is it to the primary key in RDBMS.
PRIMARY KEY in TBLPROPERTIES is for metadata reference to preserve column significance. It does not apply any constrain on that column. This can be used as a reference from design perspective.
There is a requirement in our application to create the unique primary key which depend on the value of another unique column (ERROR_CODE). But our application is in a geo active active environment (have several active databases which are synchronized using another program).
Therefore even-though we have a unique constraint on this ERROR_CODE field, there is a possibility that each database has a row with a different PK for the same ERROR_CODE. During the database synchronization, this is a problem, because there are some child tables which has the PK stored in one DB and other rows contain the PK stored in other DB. Because of the unique constraint of ERROR_CODE, sync process cannot move both rows to each database (which is also not a good thing to do).
So there is a suggestion to use the hash of the ERROR_CODE field as the PK value.
I would like to know whether we can define a function based Primary key in oracle?
If PK field is "ID",
"ID" should be equal to ora_has(ERROR_CODE).
Is it possible to define the primary key like that in oracle?
In Oracle 10 you cannot do this, but in Oracle 11 you can. You have to create a virtual column, such columns can be used also as primary key:
ALTER TABLE MY_TABLE ADD (ID NUMBER GENERATED ALWAYS AS (ora_has(ERROR_CODE)) VIRTUAL);
ALTER TABLE MY_TABLE ADD CONSTRAINT t_test_pk PRIMARY KEY (ID) USING INDEX;
I have one table - TableA. This is source and target also. Table doesn't have any primary key. I am fetching data from TableA, then doing some calculation on some fields and updating them in same tableA. Now how can I update data when it doesn't have any primary key or composite key? Second question - If joining two columns make a record unique then how can I use it in informatica?Plz help
You can define the update statement in the target. There is that properties.
Still you have to make informatica to perform an update, not insert. To do that you need to use the update strategy.
I think you don't need in this solution to make any PK on that table, because you will use your own update statement, but please verify this.
To set the fields and make proper where condition for update you need to use :TU alias in the code. TU -> means the update strategy before the target.
Example:
update t_table set field1 = :TU.f1 where key_field = :TU.f5
If you don't want (or can't) create primary key in your table in database you can just define it in informatica source
If record unique as combination of two columns just mark both of them as primary key in informatica source