Migration task failed at 'load data local infile' while migrating data from RDS MySQL to Azure Database for MySQL using Attunity Replicate MSM

I am trying to migrate data from Amazon RDS MySQL to Azure Database for MySQL using 'Attunity Replicate for Microsoft Migrations' (Replicate MSM).
To do this, I set up the Replicate MSM tool locally on a Windows 10 machine, defined and tested the source and target database endpoints (RDS as source, Azure as target), installed the required MySQL and ODBC drivers, and enabled binary logging and the local-infile parameter on both databases. But when I now run the migration task, it only creates the schemas of the migrated tables on the target DB and then fails at the 'load data local infile' command.
Here's the stack trace:
00014468: 2019-06-20T11:17:41 [SOURCE_UNLOAD ]I: Unload finished for table 'TestDb'.'Employee' (Id = 1). 2000 rows sent. (streamcomponent.c:2892)
00014968: 2019-06-20T11:17:41 [TARGET_LOAD ]I: Loading table 'migrationtesting'.'Employee' with parallel threads (odbc_endpoint_imp.c:5256)
00014968: 2019-06-20T11:17:41 [TARGET_LOAD ]I: Use parallel load thread pool with '3' threads (csv_target.c:280)
00014968: 2019-06-20T11:17:42 [TARGET_LOAD ]I: Load finished for table 'TestDb'.'Employee' (Id = 1). 2000 rows received. 0 rows skipped. Volume transfered 904960 (streamcomponent.c:3116)
00014968: 2019-06-20T11:17:43 [TARGET_LOAD ]E: Failed to execute statement: 'load data local infile "C:\\Program Files\\Attunity\\ReplicateMSM\\data\\tasks\\Aws2Azure\\data_files\\1\\LOAD00000001.csv" into table `migrationtesting`.`Employee` CHARACTER SET UTF8 fields terminated by ',' enclosed by '"' lines terminated by '\n'( `id`,`name`,`gender`,`mobile`,`city` ) ;' [1022502] (ar_odbc_stmt.c:4349)
00014968: 2019-06-20T11:17:43 [TARGET_LOAD ]E: RetCode: SQL_ERROR SqlState: HY000 NativeError: 1148 Message: [MySQL][ODBC 5.3(w) Driver][mysqld-5.6.39.0]The used command is not allowed with this MySQL version [1022502] (ar_odbc_stmt.c:4355)
00007376: 2019-06-20T11:17:43 [TASK_MANAGER ]W: Table 'TestDb'.'Employee' (subtask 1 thread 1) is suspended (replicationtask.c:2050)
00014968: 2019-06-20T11:17:43 [TARGET_LOAD ]E: Failed to start load process for file '1' [1022502] (csv_target.c:1350)
00007376: 2019-06-20T11:17:43 [TASK_MANAGER ]I: All tables are loaded. Full load only task is stopped (replicationtask.c:2992)
00014968: 2019-06-20T11:17:43 [TARGET_LOAD ]E: Failed to load file '1' [1022502] (csv_target.c:1418)
00014968: 2019-06-20T11:17:43 [TARGET_LOAD ]E: Failed to load data from csv file. [1022502] (odbc_endpoint_imp.c:5331)
According to Azure docs:
LOAD DATA INFILE is supported, but the [LOCAL] parameter must be specified and directed to a UNC path (Azure storage mounted through SMB).
If this is the solution, kindly explain how to implement it.
Note: The MySQL server version on both RDS and Azure is 5.6.

The error log suggests that you're using v5.3 of the MySQL ODBC driver, in which the LOAD DATA LOCAL INFILE functionality is disabled by default. To enable it, you need to explicitly set the value of ENABLE_LOCAL_INFILE to 1.
In Attunity Replicate for Microsoft Migrations you have to enable this flag on your target database endpoint. You can enable it with the following steps:
Open the settings of the target endpoint.
Go to the Advanced tab > Internal Parameters.
Add the search key additionalConnectionProperties and press Enter (it is case-sensitive, so copy/paste it exactly).
A new key is created under Internal Parameters; set the value of this newly created key to: ENABLE_LOCAL_INFILE=1;
Save, then reload your task.
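As an optional extra check (not part of the original steps), you can also confirm on the Azure target that the server itself accepts LOAD DATA LOCAL INFILE; a minimal sketch:
-- Check whether the server allows LOAD DATA LOCAL INFILE
SHOW GLOBAL VARIABLES LIKE 'local_infile';
-- On Azure Database for MySQL this parameter is normally changed through the
-- portal/CLI server parameters; SET GLOBAL only works if your account has the
-- required privilege:
SET GLOBAL local_infile = 1;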
Credits: Official Attunity Community/Support team for Microsoft Migrations

Related

Reg: database is not starting up an error

Getting the below error while starting the database:
startup
ORA-01078: failure in processing system parameters
ORA-01565: error in identifying file '+DATA/mis/PARAMETERFILE/spfile.276.967375255'
ORA-17503: ksfdopn:10 Failed to open file +DATA/mis/PARAMETERFILE/spfile.276.967375255
ORA-04031: unable to allocate 56 bytes of shared memory ("shared pool","unknown object","KKSSP^24","kglseshtSegs")
Your database cannot find the SPFILE (the newer init.ora) within ASM that holds the actual system parameters, or it has no permission to access it.
Either your Grid Infrastructure stack or dbs/spfile.ora is pointing to the wrong file.
To find out what the Grid Infrastructure stack is using, run srvctl, which displays the parameter file name the database should be using:
srvctl config database -d <dbname>
...
Spfile: +DATA/<dbname>/PARAMETERFILE/spfile.269.1066152225
...
Then check (as the grid user), using asmcmd, whether the file is actually visible:
asmcmd
ASMCMD> ls +DATA/<dbname>/PARAMETERFILE/
spfile.269.1066152225
If the name is different, you have found the issue, and you have to point the registration at the correct file; a sketch of how to do that follows.
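For example, the repointing could be done with srvctl (the path here is just the one asmcmd reported above, so adjust it to your environment; older releases use -p instead of -spfile):
srvctl modify database -d <dbname> -spfile +DATA/<dbname>/PARAMETERFILE/spfile.269.1066152225
srvctl config database -d <dbname>
The second command re-checks which spfile is now registered.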
If the name is correct, it could be wrong permissions on the oracle executable(s) (check My Oracle Support):
RAC Database Can't Start: ORA-01565, ORA-17503: ksfdopn:10 Failed to open file +DATA/BPBL/spfileBPBL.ora (Doc ID 2316088.1)

Connect to Teradata Using Airflow JDBC Connection

I'm trying to execute a SqlSensor task in Airflow using a connection to a Teradata database. The connection is configured as follows:
In particular, I have provided two driver paths separated by ", ", but I am not sure if that's the proper way to do it:
/home/airflow/java_sample/tdgssconfig.jar
/home/airflow/java_sample/terajdbc4.jar
When the DAG executes, it triggers the following error message:
[2017-08-02 02:32:45,162] {models.py:1342} INFO - Executing <Task(SqlSensor): check_running_batch> on 2017-08-02 02:32:12
[2017-08-02 02:32:45,179] {base_hook.py:67} INFO - Using connection to: jdbc:teradata://myservername.mycompanyname.org/database=MYDBNAME,TMODE=ANSI,CHARSET=UTF8
[2017-08-02 02:32:45,313] {sensors.py:109} INFO - Poking: SELECT BATCH_KEY FROM MYDBNAME.AUDIT_BATCH WHERE BATCH_OWNER='ARO_TEST' AND AUDIT_STATUS_KEY=1;
[2017-08-02 02:32:45,316] {base_hook.py:67} INFO - Using connection to: jdbc:teradata://myservername.mycompanyname.org/database=MYDBNAME,TMODE=ANSI,CHARSET=UTF8
[2017-08-02 02:32:45,497] {models.py:1417} ERROR - java.lang.RuntimeException: Class com.teradata.jdbc.TeraDriver not found
What am I doing wrong?
The appropriate way to input multiple jars on the connections page is to separate the fully qualified paths with a comma, which is what you did above.
I can confirm this is the approach I took and it worked (Airflow 1.10.1 and 1.10.2).
See: https://github.com/apache/airflow/blob/master/airflow/hooks/jdbc_hook.py#L51
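For illustration, the two jars from your question would be supplied as one comma-separated value in the connection's driver path (the exact field label may vary by Airflow version):
/home/airflow/java_sample/tdgssconfig.jar,/home/airflow/java_sample/terajdbc4.jar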
Bonus: If you use Ad Hoc Query in Data Profiling to test it out, you'll notice that you get an error when you send a SELECT statement, because Airflow wraps it in a LIMIT clause, which Teradata doesn't support.
The solution provided by my team member was to merge the two jars into a single jar file. After doing that and pointing the driver path at the new jar file, it worked as expected.
Here is the link to the JAR file: https://github.com/alexisrolland/linux-setup/blob/master/teradataDriverJdbc.jar
Here is a code snippet example of using the connection in a SqlSensor task:
CheckRunningBatch = SqlSensor(
    task_id='check_running_batch',
    conn_id='ed_data_quality_edw_dev',
    sql="SELECT CASE WHEN MAX(BATCH_KEY) IS NOT NULL THEN 0 ELSE 1 END FROM DATABASE.AUDIT_BATCH WHERE STATUS_KEY=1;",
    poke_interval=300,
    dag=dag)

Hive issue using yarn

I am running Hive SQL on YARN, and it's throwing an error on a join condition. I am able to create external as well as internal tables, but creation fails when I use the command
create table <new_table> as select name from student;
When I run the same query through the Hive CLI it works fine, but with the Spring job it throws this error:
2016-03-28 04:26:50,692 [Thread-17] WARN
org.apache.hadoop.hive.shims.HadoopShimsSecure - Can't fetch tasklog:
TaskLogServlet is not supported in MR2 mode.
Task with the most failures(4):
-----
Task ID:
task_1458863269455_90083_m_000638
-----
Diagnostic Messages for this Task:
AttemptID:attempt_1458863269455_90083_m_000638_3 Timed out after 1 secs
2016-03-28 04:26:50,842 [main] INFO
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Killed application
application_1458863269455_90083
2016-03-28 04:26:50,849 [main] ERROR com.mapr.fs.MapRFileSystem - Failed to
delete path maprfs:/home/pro/amit/warehouse/scratdir/hive_2016-03-28_04-
24-32_038_8553676376881087939-1/_task_tmp.-mr-10003, error: No such file or
directory (2)
2016-03-28 04:26:50,852 [main] ERROR org.apache.hadoop.hive.ql.Driver -
FAILED: Execution Error, return code 2 from
From my findings, I think there is some issue with the scratch dir (scratdir).
Kindly suggest if anyone has faced the same issue.
This issue occurs if the recursive directory does not exist. Hive does not automatically create directories recursively.
Please check the existence of the directories from the root down to the child/table level.
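As a minimal illustration (the path is taken from the error in the question, so adjust it to your own scratch dir), the check and fix can be done from the Hive CLI:
-- verify the scratch dir exists
dfs -ls /home/pro/amit/warehouse/scratdir;
-- create it (and any missing parents) if it does not
dfs -mkdir -p /home/pro/amit/warehouse/scratdir;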
I faced a similar issue while running the below Hive query
select * from <db_name>.<internal_tbl_name> where <field_name_of_double_type> in (<list_of_double_values>) order by <list_of_order_fields> limit 10;
I performed an EXPLAIN on the above statement, and below was the result:
fs.FileUtil: Failed to delete file or dir [/hdfs/Hadoop_Misc_Logs/Edge01/local_scratch/<hive_username>/41289638-cd53-4d4b-88c9-3359e9ec99e2/hive_2017-05-08_04-26-36_658_6626096693992380903-1/.nfs0000000057b93e2d00001590]: it still exists.
2017-05-08 04:26:37,969 WARN [41289638-cd53-4d4b-88c9-3359e9ec99e2 main] fs.FileUtil: Failed to delete file or dir [/hdfs/Hadoop_Misc_Logs/Edge01/local_scratch/<hive_username>/41289638-cd53-4d4b-88c9-3359e9ec99e2/hive_2017-05-08_04-26-36_658_6626096693992380903-1/.nfs0000000057b93e2700001591]: it still exists.
Time taken: 0.886 seconds, Fetched: 24 row(s)
And I checked the logs with
yarn logs -applicationId application_1458863269455_90083
The error happened after a MapR upgrade performed by the admin team. It is probably due to an upgrade or installation issue with the Tez configuration (as suggested by line 873 in the log below), or possibly the Hive query syntax is not supported by the Tez optimization. I say so because another Hive query on an external table runs fine in my case. This needs deeper checking though.
Though not certain, the error line in the logs that looks most relevant is the following:
2017-05-08 00:01:47,873 [ERROR] [main] |web.WebUIService|: Tez UI History URL is not set
Solution:
It is probably happening due to open files or applications that are still using some resources. Please check https://unix.stackexchange.com/questions/11238/how-to-get-over-device-or-resource-busy
1. You can run explain <your_Hive_statement>.
2. In the resulting execution plan, you can find the file names/dirs that the Hive execution engine fails to delete, e.g.:
2017-05-08 04:26:37,969 WARN [41289638-cd53-4d4b-88c9-3359e9ec99e2 main] fs.FileUtil: Failed to delete file or dir [/hdfs/Hadoop_Misc_Logs/Edge01/local_scratch/<hive_username>/41289638-cd53-4d4b-88c9-3359e9ec99e2/hive_2017-05-08_04-26-36_658_6626096693992380903-1/.nfs0000000057b93e2d00001590]: it still exists.
3. Go to the path given in step 2, e.g. /hdfs/Hadoop_Misc_Logs/Edge01/local_scratch/<hive_username>/41289638-cd53-4d4b-88c9-3359e9ec99e2/hive_2017-05-08_04-26-36_658_6626096693992380903-1/
4. In the path from step 3, running ls -a or lsof +D /path will show the open process IDs blocking the files from deletion.
5. If you run ps -ef | grep <pid>, you get:
hive_username <pid> 19463 1 05:19 pts/8 00:00:35 /opt/mapr/tools/jdk1.7.0_51/jre/bin/java -Xmx256m -Dhiveserver2.auth=PAM -Dhiveserver2.authentication.pam.services=login -Dmapr_sec_enabled=true -Dhadoop.login=maprsasl -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/opt/mapr/hadoop/hadoop-2.7.0/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/mapr/hadoop/hadoop-2.7.0 -Dhadoop.id.str=hive_username -Dhadoop.root.logger=INFO,console -Djava.library.path=/opt/mapr/hadoop/hadoop-2.7.0/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx512m -Dlog4j.configurationFile=hive-log4j2.properties -Dlog4j.configurationFile=hive-log4j2.properties -Djava.util.logging.config.file=/opt/mapr/hive/hive-2.1/bin/../conf/parquet-logging.properties -Dhadoop.security.logger=INFO,NullAppender -Djava.security.auth.login.config=/opt/mapr/conf/mapr.login.conf -Dzookeeper.saslprovider=com.mapr.security.maprsasl.MaprSaslProvider -Djavax.net.ssl.trustStore=/opt/mapr/conf/ssl_truststore org.apache.hadoop.util.RunJar /opt/mapr/hive/hive-2.1//lib/hive-cli-2.1.1-mapr-1703.jar org.apache.hadoop.hive.cli.CliDriver
CONCLUSION:
The HiveCliDriver entry clearly shows that running "Hive on Spark" (or managed) tables through the Hive CLI is no longer supported from Hive 2.0 onwards and is going to be deprecated going forward. You have to use HiveContext in Spark to run such Hive queries, but you can still run queries on Hive external tables through the Hive CLI.
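For completeness, a minimal sketch of the HiveContext route mentioned above (Spark 1.x style API; the table name is just the one from the question):
from pyspark import SparkContext
from pyspark.sql import HiveContext

# Create a Spark context and a Hive-aware SQL context
sc = SparkContext(appName="hive-query-example")
hc = HiveContext(sc)

# Run the query against the Hive metastore instead of the Hive CLI
df = hc.sql("SELECT name FROM student LIMIT 10")
df.show()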

Oracle Data Integrator SQL to HDFS IKM returns error

I am using ODI (12.1.3.0.0). I created a topology for the Oracle DB, which is OK, and I created a topology for HDFS using the File technology, which is where I think the problem is.
For the HDFS DataServer, I left the JDBC driver empty and set the JDBC URL to hdfs://remotehostname:port
For the HDFS Physical Schema, I filled both Schema and Work Schema with /my/path
Then I created the Logical Schema and Model. After that I created a Datastore under the model with these definitions:
Name: TestName
Resource Name: TESTFILE.txt
File Format: Fixed
After all of this, I created a project and a mapping under the project.
Finally, when I run the mapping, I see these errors:
ODI-1217: Session Oracle2HDFSMapping_Physical_SESS (15) fails with return code ODI-1298.
ODI-1226: Step Physical_STEP fails after 1 attempt(s).
ODI-1240: Flow Physical_STEP fails while performing a Add execute to Sqoop script-IKM SQL to HDFS File (Sqoop)- operation. This flow loads target table null.
ODI-1298: Serial task "SERIAL-MAP_MAIN- (10)" failed because child task "SERIAL-EU-GGUSER_UNIT (20)" is in error.
ODI-1298: Serial task "SERIAL-EU-GGUSER_UNIT (20)" failed because child task "Add execute to Sqoop script-IKM SQL to HDFS File (Sqoop)- (40)" is in error.
Caused By: java.io.IOException: Cannot run program "chmod": CreateProcess error=2, The system cannot find the file specified
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
at java.lang.Runtime.exec(Runtime.java:617)
at java.lang.Runtime.exec(Runtime.java:450)
at java.lang.Runtime.exec(Runtime.java:347)
at oracle.odi.runtime.agent.execution.cmd.OSCommandExecutor.execute(OSCommandExecutor.java:54)
at oracle.odi.runtime.agent.execution.cmd.OSCommandExecutor.execute(OSCommandExecutor.java:29)
at oracle.odi.runtime.agent.execution.TaskExecutionHandler.handleTask(TaskExecutionHandler.java:52)
at oracle.odi.runtime.agent.execution.SessionTask.processTask(SessionTask.java:203)
at oracle.odi.runtime.agent.execution.SessionTask.doExecuteTask(SessionTask.java:114)
at oracle.odi.runtime.agent.execution.AbstractSessionTask.execute(AbstractSessionTask.java:886)
at oracle.odi.runtime.agent.execution.SessionExecutor$SerialTrain.runTasks(SessionExecutor.java:2198)
at oracle.odi.runtime.agent.execution.SessionExecutor.executeSession(SessionExecutor.java:591)
at oracle.odi.runtime.agent.processor.TaskExecutorAgentRequestProcessor$1.doAction(TaskExecutorAgentRequestProcessor.java:718)
at oracle.odi.runtime.agent.processor.TaskExecutorAgentRequestProcessor$1.doAction(TaskExecutorAgentRequestProcessor.java:611)
at oracle.odi.core.persistence.dwgobject.DwgObjectTemplate.execute(DwgObjectTemplate.java:203)
at oracle.odi.runtime.agent.processor.TaskExecutorAgentRequestProcessor.doProcessStartAgentTask(TaskExecutorAgentRequestProcessor.java:800)
at oracle.odi.runtime.agent.processor.impl.StartSessRequestProcessor.access$1400(StartSessRequestProcessor.java:74)
at oracle.odi.runtime.agent.processor.impl.StartSessRequestProcessor$StartSessTask.doExecute(StartSessRequestProcessor.java:702)
at oracle.odi.runtime.agent.processor.task.AgentTask.execute(AgentTask.java:180)
at oracle.odi.runtime.agent.support.DefaultAgentTaskExecutor$2.run(DefaultAgentTaskExecutor.java:108)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified
at java.lang.ProcessImpl.create(Native Method)
at java.lang.ProcessImpl.<init>(ProcessImpl.java:385)
at java.lang.ProcessImpl.start(ProcessImpl.java:136)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
... 20 more
I wonder what I did wrong?
For a file Datastore, you need to define the attributes (columns) by opening the Datastore and going to the Attributes tab. If the file already exists, you can reverse-engineer the attributes, then rename them and change the datatypes if needed.
The error message you received for the second task mentions that the file (generated in the first task) does not exist. So there might be a problem with the first task, probably due to the missing attributes in your Datastore.
Here is a detailed article about the SQL to HDFS File (Sqoop) KM, written by the ODI A-Team: http://www.ateam-oracle.com/importing-data-from-sql-databases-into-hadoop-with-sqoop-and-oracle-data-integrator-odi/

Worklight 6.2 migration tool error

I am working on migrating data from WL 5.0.6.2 to 6.2. While doing that I encountered problems running the data migration tool.
Background:
Approach:
We export the WRKLGHT table from the 5.0.6.2 DB to an intermediate DB for the data migration, so the running DB is not affected.
DBs:
1. schema: proj
It stores the 5.0.6.2 runtime data; the upgrade-worklight-506-60-oracle.sql and upgrade-worklight-60-61-oracle.sql upgrade scripts have been executed on it.
2. schema: proj6
It is a blank DB at first; "create-worklightadmin-oracle.sql" is executed to prepare the WLADMIN tables.
Existing WL projects:
1. wlProj
It is the app running on the 5.0.6.2 WL server.
2. wlProj6
It is the target project running on 6.2 WL, and data will be migrated to be used by this project.
Steps:
1. recreate proj6 schema
1.1 SQLPlus -> connect as SYSTEM -> "drop user proj6 cascade"
1.2. -> "create user proj6 identified by {password}" -> "grant all privileges to proj6"
2. execute "create-worklightadmin-oracle.sql" on proj6 schema
2.1 SQLDeveloper -> connect as user proj6
2.2 run #"{ WL server install dir}/databases/create-worklightadmin-oracle.sql";
2.3 commit
3. Open cmd and run the following command
java -cp ojdbc6.jar;worklight-ant-deployer.jar com.ibm.worklight.config.dbmigration62.MigrationTool -p /wlProj -sourceurl jdbc:oracle:thin:#//192.168.0.**:1521/xe -sourcedriver oracle.jdbc.driver.OracleDriver -sourceuser proj -sourcepassword *** -targeturl jdbc:oracle:thin:#//192.168.0.***:1521/xe -targetdriver oracle.jdbc.driver.OracleDriver -targetuser proj6 -targetpassword *** 2>out.txt
'migrating wlProj-{ios,android}' and 'migrating access data record for wlProj-.....' are shown in cmd.
And in out.txt, the following error is shown:
15 WorklightManagementPU-oracle INFO [main] openjpa.Runtime - Starting OpenJPA 1.2.2
15 WorklightManagementPU-oracle INFO [main] openjpa.jdbc.JDBC - Using dictionary class "org.apache.openjpa.jdbc.sql.OracleDictionary".
com.ibm.worklight.config.dbmigration62.exceptions.MigrationException: FWLSE3406E: The applications migration failed with error The field "description" of instance "ApplicationEntity[id=51, name=wlProj, displayName=, description=null, thumbnail=null, platformVersion=null]" contained a null value; the metadata for this field specifies that nulls are illegal..
at com.ibm.worklight.config.dbmigration62.MigrationTool.run(MigrationTool.java:195)
at com.ibm.worklight.config.dbmigration62.MigrationTool.main(MigrationTool.java:128)
Caused by: <openjpa-1.2.2-r422266:898935 fatal user error> org.apache.openjpa.persistence.InvalidStateException: The field "description" of instance "ApplicationEntity[id=51, name=wlProj, displayName=, description=null, thumbnail=null, platformVersion=null]" contained a null value; the metadata for this field specifies that nulls are illegal.
at org.apache.openjpa.kernel.SingleFieldManager.preFlush(SingleFieldManager.java:540)
at org.apache.openjpa.kernel.SingleFieldManager.preFlush(SingleFieldManager.java:478)
at org.apache.openjpa.kernel.StateManagerImpl.preFlush(StateManagerImpl.java:2829)
at org.apache.openjpa.kernel.PNewState.beforeFlush(PNewState.java:39)
at org.apache.openjpa.kernel.StateManagerImpl.beforeFlush(StateManagerImpl.java:960)
at org.apache.openjpa.kernel.BrokerImpl.flush(BrokerImpl.java:1967)
at org.apache.openjpa.kernel.BrokerImpl.flushSafe(BrokerImpl.java:1927)
at org.apache.openjpa.kernel.BrokerImpl.beforeCompletion(BrokerImpl.java:1845)
at org.apache.openjpa.kernel.LocalManagedRuntime.commit(LocalManagedRuntime.java:81)
at org.apache.openjpa.kernel.BrokerImpl.commit(BrokerImpl.java:1369)
at org.apache.openjpa.kernel.DelegatingBroker.commit(DelegatingBroker.java:877)
at org.apache.openjpa.persistence.EntityManagerImpl.commit(EntityManagerImpl.java:512)
at com.ibm.worklight.config.dbmigration62.ApplicationMigration.migrateApplication(ApplicationMigration.java:138)
at com.ibm.worklight.config.dbmigration62.ApplicationMigration.migrate(ApplicationMigration.java:63)
at com.ibm.worklight.config.dbmigration62.AbstractMigration.run(AbstractMigration.java:66)
at com.ibm.worklight.config.dbmigration62.MigrationTool.run(MigrationTool.java:183)
... 1 more
ERROR 20 : FWLSE3406E: The applications migration failed with error The field "description" of instance "ApplicationEntity[id=51, name=wlProj, displayName=, description=null, thumbnail=null, platformVersion=null]" contained a null value; the metadata for this field specifies that nulls are illegal..
This is recorded as IBM PMR #77138,999,738 and was identified as a defect that is currently being worked on.
There is no local fix other than to wait for a fix via the support ticket.
