sqoop import unable to locate sqoop-1.4.6.jar - hadoop

I'm using Sqoop to import data from a MySQL table to be used with Hadoop.
While importing, it shows an error.
Hadoop Version: 2.5.0
Sqoop Version: 1.4.6
Command used for import:
sqoop import --connect jdbc:mysql://localhost/<dbname> --username root --password pass#123 --table <tablename> -m 1
Error shown:
15/05/27 23:13:59 ERROR tool.ImportTool: Encountered IOException running import job: java.io.FileNotFoundException: File does not exist: hdfs://localhost:9000/usr/lib/sqoop/sqoop-1.4.6.jar
Any help?

Try this:
1. Create the directory in HDFS (use -p so parent directories are created if missing):
hdfs dfs -mkdir -p /usr/lib/sqoop
2. Copy sqoop jar into HDFS:
hdfs dfs -put /usr/lib/sqoop/sqoop-1.4.6.jar /usr/lib/sqoop/
3. Check whether the file exists in HDFS:
hdfs dfs -ls /usr/lib/sqoop
4. Import using sqoop:
sqoop import --connect jdbc:mysql://localhost/<dbname> --username root --password pass#123 --table <tablename> -m 1
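If the job then reports other missing jars under the same HDFS prefix, the same fix can be extended to the rest of the Sqoop libraries; a minimal sketch, assuming Sqoop is installed under /usr/lib/sqoop on the local filesystem:
hdfs dfs -mkdir -p /usr/lib/sqoop/lib
hdfs dfs -put /usr/lib/sqoop/lib/*.jar /usr/lib/sqoop/lib/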

Related

FileNotFound error in Sqoop Merge command

I am trying to execute a Sqoop merge command. For that, I executed Sqoop codegen to get the table's class and jar into HDFS.
Sqoop CodeGen Command:
sqoop codegen --connect jdbc:mysql://127.0.0.1/mydb --table mergetab --username root --password cloudera --outdir /user/cloudera/codegenclasses --fields-terminated-by '\t'
I have the following files in the outdir: /user/cloudera/codegenclasses
-rw-r--r-- 1 cloudera cloudera 9572 2017-04-20 16:26 codegenclasses/mergetab.class
-rw-r--r-- 1 cloudera cloudera 3902 2017-04-20 16:26 codegenclasses/mergetab.jar
-rw-r--r-- 1 cloudera cloudera 12330 2017-04-20 16:26 codegenclasses/mergetab.java
I am running the below sqoop merge command to update the rows in my hive table:
sqoop merge --merge-key id --new-data /user/cloudera/incrdata/incrementaldata --onto /user/cloudera/hivetables/fulltabledata --target-dir /user/cloudera/updateddatam --class-name /user/cloudera/codegenclasses/mergetab.class --jar-file /user/cloudera/codegenclasses/mergetab.jar
But I'm getting the error:
Encountered IOException running import job: java.io.FileNotFoundException: File /user/cloudera/codegenclasses/44059c9b2bd47b95f03866d8d93eff7f/mergetab.jar does not exist
I have all the files in the folder and I gave the proper directories, but I'm unable to identify the mistake I'm making here.
Could anyone help me fix this?
In the --outdir <dir> argument, the <dir> path refers to the local filesystem, and --outdir only stores the generated source code, i.e. tablename.java. Use --bindir <dir> instead, which is where the compiled .class and .jar files are written.
sqoop codegen --connect jdbc:mysql://127.0.0.1/mydb --table mergetab --username root --password cloudera --bindir /path/to/store/jarfile --fields-terminated-by '\t'
Then run the merge. By default, the --class-name is the table name.
sqoop merge --merge-key id --new-data /user/cloudera/incrdata/incrementaldata --onto /user/cloudera/hivetables/fulltabledata --target-dir /user/cloudera/updateddatam --class-name mergetab --jar-file /path/to/store/jarfile/mergetab.jar
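Before running the merge, it can be worth confirming that codegen actually wrote the jar to the local --bindir path; a quick check, assuming the path used above:
ls -l /path/to/store/jarfile
mergetab.jar (and mergetab.class) should appear in that listing.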

sqoop import error from teradata to hive

I am using the Sqoop command given below:
sqoop import \
--libjars /usr/hdp/2.4.0.0-169/sqoop/lib,/usr/hdp/2.4.0.0-169/hive/lib \
--connect jdbc:teradata://x/DATABASE=x \
--connection-manager org.apache.sqoop.teradata.TeradataConnManager \
--username ec \
--password dc \
--query "select * from hb where yr_nbr=2017" \
--hive-table schema.table \
--num-mappers 1 \
--hive-import \
--target-dir /user/hive/warehouse/GG
I'm getting this error:
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
17/04/06 11:15:41 INFO mapreduce.Job: map 100% reduce 0%
17/04/06 11:15:41 INFO mapreduce.Job: Task Id : attempt_1491466460468_0029_m_000000_1, Status : FAILED
Error: org.apache.hadoop.fs.FileAlreadyExistsException: /user/root/temp_111508/part-m-00000 for client 192.168.211.133 already exists
From the error, I can guess that the output file already exists in your target directory, maybe from a previous Sqoop import. Sqoop import has an option named --delete-target-dir which deletes the target output directory and re-creates it on the next import. Hope that helps.
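For illustration, a sketch of the same import with the flag appended (all other arguments unchanged from the question):
sqoop import --connect jdbc:teradata://x/DATABASE=x ... --target-dir /user/hive/warehouse/GG --delete-target-dir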

Sqoop job unable to work with Hadoop Credential API

I have stored my database passwords in Hadoop CredentialProvider.
Sqoop import from terminal is working fine, successfully fetching the password from CredentialProvider.
sqoop import \
-Dhadoop.security.credential.provider.path=jceks://hdfs/user/vijay/myPassword.jceks \
--table myTable -m 1 --target-dir /user/vijay/output --delete-target-dir --username vijay --password-alias db2-dev-password
But when I try to set it up as a Sqoop job, it is unable to recognize the -Dhadoop.security.credential.provider.path argument.
sqoop job --create my-sqoop-job -- import --table myTable -m 1 --target-dir /user/vijay/output --delete-target-dir --username vijay -Dhadoop.security.credential.provider.path=jceks://hdfs/user/vijay/myPassword.jceks --password-alias
Following is the error message:
14/04/05 13:57:53 ERROR tool.BaseSqoopTool: Error parsing arguments for import:
14/04/05 13:57:53 ERROR tool.BaseSqoopTool: Unrecognized argument: -Dhadoop.security.credential.provider.path=jceks://hdfs/user/vijay/myPassword.jceks
14/04/05 13:57:53 ERROR tool.BaseSqoopTool: Unrecognized argument: --password-alias
14/04/05 13:57:53 ERROR tool.BaseSqoopTool: Unrecognized argument: db2-dev-password
I couldn't find any special instructions in the Sqoop User Guide for configuring the Hadoop credential API with a Sqoop job.
How can I resolve this issue?
Repositioning the Sqoop parameters solves the problem.
sqoop job -Dhadoop.security.credential.provider.path=jceks://hdfs/user/vijay/myPassword.jceks --create my-sqoop-job -- import --table myTable -m 1 --target-dir /user/vijay/output --delete-target-dir --username vijay --password-alias myPasswordAlias
Generic Hadoop options such as -Dhadoop.security.credential.provider.path must be placed immediately after sqoop job, before --create and the tool-specific arguments.
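After creating it this way, the job can be executed with (assuming the job name used above):
sqoop job --exec my-sqoop-job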
Your Sqoop job command is not correct: --password-alias is incomplete, it is missing the alias value.
Execute the command below on your Hadoop server:
hadoop credential list -provider jceks://hdfs/user/vijay/myPassword.jceks
Add its output (the alias name) to the Sqoop job command below:
sqoop job --create my-sqoop-job -- import --table myTable -m 1 --target-dir /user/vijay/output --delete-target-dir --username vijay -Dhadoop.security.credential.provider.path=jceks://hdfs/user/vijay/myPassword.jceks --password-alias <<output of above command>>

Exception while importing data using Sqoop

I am importing data from MySQL to Hive using the command:
sqoop import --connect jdbc:mysql://localhost:3306/mydb --username root --table mytable --hive-import
and I am getting this error message:
ERROR tool.ImportTool: Encountered IOException running import job: java.io.FileNotFoundException: File does not exist: hdfs://localhost:54310/opt/sqoop/lib/jackson-databind-2.3.1.jar
I am using Hadoop 2.6.0 and JDK 1.7.0_79.
Please suggest what I have to do to overcome this problem, as I have been stuck at this point.
All the steps before importing the data completed successfully.
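This looks like the same class of problem as the first question above: the import job expects the jar at that HDFS path. A hedged sketch of the same fix, assuming jackson-databind-2.3.1.jar exists under /opt/sqoop/lib on the local filesystem:
hdfs dfs -mkdir -p /opt/sqoop/lib
hdfs dfs -put /opt/sqoop/lib/jackson-databind-2.3.1.jar /opt/sqoop/lib/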

Sqoop import not working in Hadoop 2.x

I installed Hadoop 2.0.3 and Sqoop 1.4.4 and run Hadoop in pseudo-distributed mode. When I try to import a table from an RDBMS to HDFS by issuing the command below,
master#hadoop:~/apps/sqoop-1.4.4$ bin/sqoop import --connect jdbc:mysql://localhost:3306/hadoop --username root --password root --table employees
I get the following error:
14/02/10 05:20:32 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.UnsupportedOperationException: Not implemented by the DistributedFileSystem FileSystem implementation
java.lang.UnsupportedOperationException: Not implemented by the DistributedFileSystem FileSystem implementation
Can you please provide a solution for this?
