How to catch DB Error in Sub Job in Talend - oracle

I have the design below in Talend. I am catching the error when a component fails, but database errors such as "cannot insert as parent key not found" or "cannot insert into column col2: expected 15 but actual 16" are not shown when the insert job is run as a subjob.
If I run the job FACTDIM_COMBINE on its own I can see the error, but if it is run as a subjob I am not able to see it.
Please help me get the DB error when it is run as a subjob as well.

Please use the tLogCatcher component in your job. It will log all the errors, even those raised in subjobs. Also enable the "Die on error" option on every component wherever necessary.
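For illustration only (this is not actual Talend-generated code, just the pattern): with "Die on error" enabled, the component's exception propagates out of the subjob instead of being swallowed, so the parent job and a tLogCatcher can see the DB error. A minimal Java sketch with hypothetical names:

```java
import java.sql.SQLException;

public class DieOnErrorSketch {
    // Stand-in for a child job's insert step hitting a DB error.
    static void runInsertSubjob() throws SQLException {
        throw new SQLException(
            "ORA-02291: integrity constraint violated - parent key not found");
    }

    // Stand-in for the parent job: because the child rethrows
    // ("die on error"), the error reaches this level and can be logged.
    public static void main(String[] args) {
        try {
            runInsertSubjob();
        } catch (SQLException e) {
            System.out.println("Caught in parent job: " + e.getMessage());
        }
    }
}
```

Without "die on error", the equivalent of `runInsertSubjob` would swallow the exception internally, which matches the "no error visible from the subjob" symptom.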

Related

Hive/Hadoop intermittent failure: Unable to move source to destination

There have been some SO articles about the Hive/Hadoop "Unable to move source" error. Many of them point to a permission problem.
However, at my site I saw the same error, and I am quite sure it is not related to permissions, because the problem is intermittent: it worked one day but failed on another.
I thus looked more deeply into the error message. It was complaining about failing to move from a
.../.hive-staging_hive.../-ext-10000/part-00000-${long-hash}
source path to a destination path of
.../part-00000-${long-hash}
folder. Would this observation ring a bell with someone?
This error was triggered by a super simple test query: just insert a row into a test table (see below)
Error message
org.apache.hadoop.hive.ql.metadata.HiveException:
Unable to move source
hdfs://namenodeHA/apps/hive/warehouse/some_db.db/testTable1/.hive-staging_hive_2018-02-02_23-02-13_065_2316479064583526151-5/-ext-10000/part-00000-832944cf-7db4-403b-b02e-55b6e61b1af1-c000
to destination
hdfs://namenodeHA/apps/hive/warehouse/some_db.db/testTable1/part-00000-832944cf-7db4-403b-b02e-55b6e61b1af1-c000;
Query that triggered this error (but only intermittently)
insert into testTable1
values (2);
Thanks for all the help. I have found a solution. I am providing my own answer here.
The problem was with a CTAS (create table as ...) operation that preceded the failing insert command; it closed the file system inappropriately. The telltale sign was an IOException: Filesystem closed message shown together with the failing HiveException: Unable to move source ... to destination operation. (I found the log message in my Spark Thrift Server log, not my application log.)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:808)
at org.apache.hadoop.hdfs.DFSClient.getEZForPath(DFSClient.java:3288)
at org.apache.hadoop.hdfs.DistributedFileSystem.getEZForPath(DistributedFileSystem.java:2093)
at org.apache.hadoop.hdfs.client.HdfsAdmin.getEncryptionZoneForPath(HdfsAdmin.java:289)
at org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.isPathEncrypted(Hadoop23Shims.java:1221)
at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2607)
The solution was actually from another SO article: https://stackoverflow.com/a/47067350/1168041
But here I provide an excerpt in case that article is gone:
add the property to hdfs-site.xml
<property>
  <name>fs.hdfs.impl.disable.cache</name>
  <value>true</value>
</property>
Reason: Spark and HDFS use the same API (at the bottom they use the same cached instance).
When beeline closes a filesystem instance, it closes the Thrift Server's
filesystem instance too. The second time beeline tries to get an instance, it will
always report "Caused by: java.io.IOException: Filesystem closed"
Please check this issue here:
https://issues.apache.org/jira/browse/SPARK-21725
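The failure mode described in that excerpt can be sketched in plain Java, with no Hadoop dependency. Here CachedFileSystem is a made-up stand-in for Hadoop's per-URI FileSystem cache (the cache that fs.hdfs.impl.disable.cache bypasses): two clients get the same instance, so one client's close() breaks the other.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Hadoop's FileSystem cache.
class CachedFileSystem {
    private static final Map<String, CachedFileSystem> CACHE = new HashMap<>();
    private boolean open = true;

    // With caching enabled, every caller gets the SAME instance per URI.
    static synchronized CachedFileSystem get(String uri) {
        return CACHE.computeIfAbsent(uri, u -> new CachedFileSystem());
    }

    void close() { open = false; }

    void moveFile() throws IOException {
        if (!open) throw new IOException("Filesystem closed");
    }
}

public class SharedCacheDemo {
    public static void main(String[] args) {
        CachedFileSystem beeline = CachedFileSystem.get("hdfs://namenodeHA");
        CachedFileSystem thrift  = CachedFileSystem.get("hdfs://namenodeHA");
        beeline.close();          // one client closes "its" filesystem...
        try {
            thrift.moveFile();    // ...and the other client's moveFile now fails
        } catch (IOException e) {
            System.out.println("Caught: " + e.getMessage());
        }
    }
}
```

Disabling the cache corresponds to get() returning a fresh instance per caller, so closing one no longer affects the others.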
I was not using beeline but the problem with CTAS was the same.
My test sequence:
insert into testTable1
values (11)
create table anotherTable as select 1
insert into testTable1
values (12)
Before the fix, any insert would fail after the create table as …
After the fix, this problem was gone.

Can we insert into external table

I am debugging Big Data code in my company's production environment. Hive returns the following error:
Exception: org.apache.hadoop.hive.ql.lockmgr.LockException: No record of lock could be found, may have timed out
Killing DAG...
Execution has failed.
Exception in thread "main" java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask.
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:282)
at org.apache.hive.jdbc.HiveStatement.executeUpdate(HiveStatement.java:392)
at HiveExec.main(HiveExec.java:159)
After investigation, I found that this error could be caused by BoneCP being set in the connectionPoolingType property, but the cluster support team told me that they fixed this bug by upgrading BoneCP.
My question is: can we INSERT INTO an external table in Hive? I have doubts about the insertion script.
Yes, you can insert into an external table.
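For reference, a minimal HiveQL sketch (the table name and location are made-up examples, not from the question; INSERT ... VALUES needs Hive 0.14+):

```sql
-- External table over an HDFS directory (hypothetical path).
CREATE EXTERNAL TABLE ext_sales (id INT, amount DOUBLE)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/ext_sales';

-- INSERT INTO works the same as for a managed table; the new
-- files land under the table's LOCATION directory.
INSERT INTO TABLE ext_sales VALUES (1, 9.99);
```

The "external" flag only changes who owns the data on DROP TABLE; it does not make the table read-only.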

Hive - error while using dynamic partition query

I am trying to execute the query below:
INSERT OVERWRITE TABLE nasdaq_daily
PARTITION(stock_char_group)
select exchage, stock_symbol, date, stock_price_open,
stock_price_high, stock_price_low, stock_price_close,
stock_volue, stock_price_adj_close,
SUBSTRING(stock_symbol,1,1) as stock_char_group
FROM nasdaq_daily_stg;
I have already set hive.exec.dynamic.partition=true and hive.exec.dynamic.partition.mode=nonstrict.
Table nasdaq_daily_stg contains proper information in the form of a number of CSV files. When I execute this query, I get this error message:
Caused by: java.lang.SecurityException: sealing violation: package org.apache.derby.impl.jdbc.authentication is sealed.
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.MapRedTask
The MapReduce job didn't start at all, so there are no logs present in the JobTracker web UI for this error. I am using Derby to store metastore information.
Can someone help me fix this?
This may be the issue: you may have the Derby classes twice on your classpath. See this question for details:
"SecurityException: sealing violation" when starting Derby connection

DataStage Job "ABORTED" because of Deadlock issue

DataStage -- 8.1
Database -- Oracle 10g
OS -- Unix
I have a DataStage job (FCT) which is doing a lookup based on two keys columns to the DIM table. This job ABORTED with the following error message.
"main_program: (aptoci.C:483). Message: ORA-04020: deadlock detected while trying to lock object DIM_TABLE_NAME "
Partition type for lookup stage -- Auto
Config file Nodes -- 2
Note: most of the time this job runs without any issues, but sometimes it fails with the above error message.
I don't understand what can cause a deadlock here or how to resolve this issue.
Thanks in advance.

Oracle triggers error are not captured while using ADODB

I have an application which uses ADOdb to insert data into an Oracle table (the customer's database).
Data is successfully inserted if there are no errors.
If there is any error, like an invalid datatype, the error is raised, captured by my application, and dumped in a log file.
My customer has written their own triggers on this particular table. When a record is inserted, a few other checks are done before the data insertion.
All was fine until now.
But recently we found that, many times, data is not inserted into the Oracle table.
When we checked the log file, no error was found.
Then I logged the query which was executed.
When I copied the query to an Oracle SQL prompt and executed it, it gave a trigger error.
My issue is:
The customer is not ready to share the details of the trigger.
The error is not raised while inserting into the Oracle table, so we are not able to log it or take any action.
The same query, when executed directly in Oracle, shows the trigger errors.
Help needed with:
Why is the error not raised in ADODB?
Do I have to ask the customer to implement any error raising?
Anything you can suggest for resolving the issue.
I have 0% to 10% knowledge of Oracle.
"Copied the query to oracle Sql prompt and executed it gave error of trigger." Since the ADO session doesn't report an error, it may be that the error from the trigger is misleading. It may simply be a check on the lines of "Hey, you are not allowed to insert into this table except though the application".
"Error is not raised while inserting to oracle table so we are not able to log it or take any action."
If the error isn't raised at the time of the insert, it MAY be raised at the time of the commit. Deferred constraints and materialized views could cause this.
Hypothetically, I could reproduce your experience as follows:
1. Create a table tab_a with a deferrable constraint, initially deferred (e.g. val_a > 10).
2. The ADO session inserts a row violating the constraint, but it doesn't error because the constraint check is deferred.
3. The commit happens, the constraint-violation exception fires, and the transaction is rolled back instead of being committed.
So see if you are catering for the possibility of an error in the commit.
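Steps 1–3 above would look something like this in Oracle SQL (table and constraint names are hypothetical):

```sql
-- 1. Deferrable check constraint, initially deferred.
CREATE TABLE tab_a (
  val_a NUMBER,
  CONSTRAINT chk_val_a CHECK (val_a > 10)
    DEFERRABLE INITIALLY DEFERRED
);

-- 2. The insert itself succeeds; the check is postponed.
INSERT INTO tab_a (val_a) VALUES (5);

-- 3. The violation surfaces only here (ORA-02091 "transaction
--    rolled back", citing the check constraint), and the whole
--    transaction is rolled back.
COMMIT;
```

An application that only checks the insert for errors, not the commit, would see exactly the "no error logged, no row inserted" behavior described.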
It may also be something else later in the transaction that results in a rollback of the whole transaction (e.g. a deadlock). Session tracing would be good. Failing that, look into a SERVERERROR trigger on the user to log the error (e.g. in a file, so it won't be rolled back):
http://download-west.oracle.com/docs/cd/B19306_01/server.102/b14200/statements_7004.htm#i2153530
You can log your business logic in a log table, but you have to use a stored procedure to log the message.
The stored procedure should use PRAGMA AUTONOMOUS_TRANSACTION so that your log data is saved in the log table even if the main transaction is rolled back.
Your trigger should have error handling, and in the error handler you call the logging stored procedure (the one with the autonomous-transaction pragma).
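A sketch of such an autonomous logging procedure (the error_log table and all names here are assumptions, not from the question):

```sql
CREATE OR REPLACE PROCEDURE log_error (p_msg VARCHAR2) IS
  PRAGMA AUTONOMOUS_TRANSACTION;  -- runs in its own transaction
BEGIN
  INSERT INTO error_log (logged_at, message)
  VALUES (SYSTIMESTAMP, p_msg);
  COMMIT;  -- commits only the log row; the caller's transaction is untouched
END;
/
```

A trigger's exception handler can then call log_error(SQLERRM) before re-raising, so the log row survives even when the triggering transaction rolls back.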
I've never used ADOdb (and I assume that is what you are using, not ADO.NET?). But a quick look at its references leads to this question: are you actually checking the return state of your query?
$ok = $DB->Execute("update atable set aval = 0");
if (!$ok) mylogerr($DB->ErrorMsg());
