sqoop importing mysql issue - hadoop

I am only four days into Hadoop and, to learn Sqoop, I am trying to import a table from my local MySQL database. My machine runs Ubuntu 13.04, my Sqoop version is 1.4.3-cdh4.7.0, and MySQL is 5.5.34.
This is the command I use at the prompt:
sqoop import --connect jdbc:mysql://192.168.52.60:3306/saloni --username user --table pv --password xxxx;
This is what I get:
14/06/03 16:11:36 WARN conf.Configuration: bad conf file: element not <property>
14/06/03 16:11:36 WARN conf.Configuration: bad conf file: element not <property>
14/06/03 16:11:36 INFO sqoop.Sqoop: Running Sqoop version: 1.4.3-cdh4.7.0
14/06/03 16:11:36 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/06/03 16:11:36 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/06/03 16:11:36 INFO tool.CodeGenTool: Beginning code generation
14/06/03 16:11:36 WARN conf.Configuration: bad conf file: element not <property>
14/06/03 16:11:36 WARN conf.Configuration: bad conf file: element not <property>
14/06/03 16:11:37 ERROR manager.SqlManager: Error executing statement: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:409)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1127)
at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:356)
at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2502)
at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2539)
at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2321)
at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:832)
at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:46)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:409)
at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:417)
at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:344)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:215)
at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:827)
at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:686)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:709)
at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:244)
at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:227)
at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:347)
at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1298)
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1110)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:96)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:396)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:506)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:222)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:231)
at org.apache.sqoop.Sqoop.main(Sqoop.java:240)
Caused by: java.net.ConnectException: Connessione rifiutata (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at java.net.Socket.connect(Socket.java:528)
at java.net.Socket.<init>(Socket.java:425)
at java.net.Socket.<init>(Socket.java:241)
at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:258)
at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:306)
... 32 more
14/06/03 16:11:37 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: No columns to generate for ClassWriter
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1116)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:96)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:396)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:506)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:222)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:231)
at org.apache.sqoop.Sqoop.main(Sqoop.java:240)
If I use this command instead:
sqoop import --connect jdbc:mysql://localhost:3306/saloni --username user --password 123456 --table pv
it starts, but then it gets stuck at:
INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
So I do not understand what my error is. Can you point me in the right direction?
Thanks.

Something is wrong with your JDBC parameters, as the connection is failing to establish. Here is something I use that works:
sqoop import --connect "jdbc:oracle:thin:<UserName>/<Password>@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=<HOST>)(PORT=1521))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=<service>)))"
In your case, since you are using a MySQL database, you will want to use the MySQL JDBC driver.

This should work:
sqoop import --connect jdbc:mysql://192.168.52.60:3306/saloni --username user --table pv --password xxxx --target-dir '/myimport' -m 1;
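Before rerunning, it is also worth confirming that MySQL accepts remote connections on that address; the "Connection refused" in the original trace usually means nothing is listening on 192.168.52.60:3306. A quick check, assuming the mysql client and nc are available on the Sqoop machine:
# Is anything listening on 192.168.52.60:3306 from this machine's point of view?
nc -zv 192.168.52.60 3306
# Can the Sqoop user actually log in remotely?
mysql -h 192.168.52.60 -P 3306 -u user -p saloni -e 'SELECT 1;'
# If the port is closed, MySQL is probably bound to 127.0.0.1 only. In my.cnf
# (e.g. /etc/mysql/my.cnf on Ubuntu; the path may differ), set:
#   bind-address = 0.0.0.0
# then restart MySQL and grant remote access to the user:
#   GRANT ALL PRIVILEGES ON saloni.* TO 'user'@'%' IDENTIFIED BY 'xxxx';
#   FLUSH PRIVILEGES;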

You should continue using localhost:3306; your actual IP address does not loop back to your local MySQL. Port 8032 is the YARN ResourceManager port, so I suspect YARN may not be configured properly. Check the properties below in your yarn-site.xml:
<property>
  <name>yarn.resourcemanager.address</name>
  <value>127.0.0.1:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>127.0.0.1:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>127.0.0.1:8031</value>
</property>
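To verify that the ResourceManager is actually up and listening on 8032, a few quick checks (a sketch; command availability may vary by install):
sudo jps | grep -i resourcemanager        # the ResourceManager JVM should be listed
netstat -tlnp 2>/dev/null | grep 8032     # port 8032 should be in LISTEN state
yarn node -list                           # should return the registered NodeManagers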

Related

Why can't I find the file when importing from sqoop?

I am trying to import a SQL Server table into Hive through Sqoop.
I used the command below:
sqoop import --username testuser --password test123 --num-mappers 1 --connect "jdbc:sqlserver://testdomain.com:2031;database=test_db" --target-dir /sqoop_batch/ --append --as-textfile --table test_table
However, the error below occurred:
...
2022-03-29 09:47:23,322 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1648452113129_0011
2022-03-29 09:47:23,324 INFO mapreduce.JobSubmitter: Executing with tokens: []
2022-03-29 09:47:23,376 INFO mapreduce.JobSubmitter: Cleaning up the staging area /user/root/.staging/job_1648452113129_0011
2022-03-29 09:47:23,387 ERROR tool.ImportTool: Import failed: java.io.FileNotFoundException: File does not exist: hdfs://mycluster:8020/user/root/.staging/job_1648452113129_0011
at org.apache.hadoop.fs.Hdfs.getFileStatus(Hdfs.java:145)
at org.apache.hadoop.fs.AbstractFileSystem.resolvePath(AbstractFileSystem.java:488)
at org.apache.hadoop.mapred.YARNRunner.setupLocalResources(YARNRunner.java:394)
at org.apache.hadoop.mapred.YARNRunner.createApplicationSubmissionContext(YARNRunner.java:573)
at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:325)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:251)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1588)
at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:200)
at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:173)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:270)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
at org.apache.sqoop.manager.SQLServerManager.importTable(SQLServerManager.java:163)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:520)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:628)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
NOTE: I also changed the permissions on the /user/root/.staging directory to 777, but that did not resolve it.
NOTE: I also modified 'yarn.app.mapreduce.am.staging-dir' in 'yarn-site.xml', but that did not resolve it either.
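A couple of checks that may help narrow down where the staging path is resolving from (standard Hadoop commands; a diagnostic sketch, not a confirmed fix):
hdfs getconf -confKey fs.defaultFS                      # expected to match the hdfs://mycluster:8020 address in the error
hdfs dfs -ls hdfs://mycluster:8020/user/root/.staging   # does the staging dir exist on that nameservice at all?
hdfs dfs -ls /user/root/.staging                        # same path resolved through the default FS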

Sqoop import from Intersystems Caché

I'm trying to set up a Sqoop import to pull a query from Intersystems Caché into a Hive table. I've managed to connect, but the job fails about two minutes after the mapping phase starts, with a connection timeout message. I'll provide both the Sqoop job and the messages below:
sqoop import \
  -Dhadoop.security.credential.provider.path=jceks://hdfs/user/bigdata/myCachePWD.password.jceks \
  --connect jdbc:Cache://server:1972/database \
  --username my_username \
  --password-alias myCachePWD.password.alias \
  --table sds.T00055_PCTE \
  --hive-database stg_splunk \
  --hive-table t00055_pcte \
  --hive-overwrite \
  --hive-import \
  --num-mappers 10 \
  --as-parquetfile \
  --compress \
  --compression-codec org.apache.hadoop.io.compress.SnappyCodec \
  --warehouse-dir /user/hive/warehouse/stage/ \
  --driver com.intersys.jdbc.CacheDriver \
  --split-by DT_ICLO
And here is the relevant error log:
19/12/30 16:24:14 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1577368744752_0394
19/12/30 16:24:15 INFO impl.YarnClientImpl: Submitted application application_1577368744752_0394
19/12/30 16:24:15 INFO mapreduce.Job: The url to track the job: http://[SERVER]:8088/proxy/application_1577368744752_0394/
19/12/30 16:24:15 INFO mapreduce.Job: Running job: job_1577368744752_0394
19/12/30 16:24:46 INFO mapreduce.Job: Job job_1577368744752_0394 running in uber mode : false
19/12/30 16:24:46 INFO mapreduce.Job: map 0% reduce 0%
19/12/30 16:27:25 INFO mapreduce.Job: Task Id : attempt_1577368744752_0394_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: java.lang.RuntimeException: java.sql.SQLException: [Cache JDBC] Communication link failure: Connection timed out (Connection timed out)
at org.apache.sqoop.mapreduce.db.DBInputFormat.setDbConf(DBInputFormat.java:170)
at org.apache.sqoop.mapreduce.db.DBInputFormat.setConf(DBInputFormat.java:161)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:755)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.RuntimeException: java.sql.SQLException: [Cache JDBC] Communication link failure: Connection timed out (Connection timed out)
at org.apache.sqoop.mapreduce.db.DBInputFormat.getConnection(DBInputFormat.java:223)
at org.apache.sqoop.mapreduce.db.DBInputFormat.setDbConf(DBInputFormat.java:168)
... 10 more
Caused by: java.sql.SQLException: [Cache JDBC] Communication link failure: Connection timed out (Connection timed out)
at com.intersys.jdbc.CacheConnection.connect(CacheConnection.java:1063)
at com.intersys.jdbc.CacheConnection.<init>(CacheConnection.java:370)
at com.intersys.jdbc.CacheDriver.connect(CacheDriver.java:211)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:247)
at org.apache.sqoop.mapreduce.db.DBConfiguration.getConnection(DBConfiguration.java:302)
at org.apache.sqoop.mapreduce.db.DBInputFormat.getConnection(DBInputFormat.java:216)
... 11 more
I'm not sure whether I have the correct driver; I copied the Caché driver from DBeaver into /var/lib/sqoop and used com.intersys.jdbc.CacheDriver in the Sqoop job. It does connect, though, so I'm not sure whether it's a driver version issue or some other server-side config...
Any insight would be greatly appreciated.
Well, it turned out to be a firewall issue; access was granted only to two machines on our cluster, so when Sqoop sent out the mappers from the other machines to read from Caché, the connection from those nodes was refused.
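If you hit the same symptom, a quick way to confirm which cluster nodes can actually reach Caché on its port (the worker hostnames are placeholders, "server" is the host from the --connect string, and this assumes ssh access and nc on each node):
for host in worker1 worker2 worker3; do
  echo "== $host =="
  ssh "$host" 'nc -zv -w 5 server 1972'   # "succeeded" means reachable; a timeout points to a firewall drop
done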

Sqoop Import Error: org.apache.hadoop.security.AccessControlException: Permission denied by sticky bit

I have a single-node Cloudera cluster (CDH 5.16) on a RHEL 7 remote server.
I have installed CDH using packages.
When I run a sqoop import job, I get the following error:
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
19/06/04 15:49:31 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.16.1
19/06/04 15:49:31 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
19/06/04 15:49:32 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
19/06/04 15:49:32 INFO tool.CodeGenTool: Beginning code generation
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
19/06/04 15:49:34 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
19/06/04 15:49:35 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
19/06/04 15:49:35 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-ak_bng/compile/d07f2f60a7ecbf9411c79687daa024c9/categories.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
19/06/04 15:49:37 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-ak_bng/compile/d07f2f60a7ecbf9411c79687daa024c9/categories.jar
19/06/04 15:49:38 ERROR tool.ImportTool: Import failed: org.apache.hadoop.security.AccessControlException: Permission denied by sticky bit: user=ak_bng, path="/user/hive/warehouse/sales.db/categories":hive:hive:drwxr-xr-t, parent="/user/hive/warehouse/sales.db":hive:hive:drwxr-xr-t
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkStickyBit(DefaultAuthorizationProvider.java:387)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:159)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:3885)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6861)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4290)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4245)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4229)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:856)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:313)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:626)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2278)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2274)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2272)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2106)
at org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:688)
at org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:684)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:684)
at org.apache.sqoop.tool.ImportTool.deleteTargetDir(ImportTool.java:546)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:509)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:621)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied by sticky bit: user=ak_bng, path="/user/hive/warehouse/sales.db/categories":hive:hive:drwxr-xr-t, parent="/user/hive/warehouse/sales.db":hive:hive:drwxr-xr-t
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkStickyBit(DefaultAuthorizationProvider.java:387)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:159)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:3885)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6861)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4290)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4245)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4229)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:856)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:313)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:626)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2278)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2274)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2272)
at org.apache.hadoop.ipc.Client.call(Client.java:1504)
at org.apache.hadoop.ipc.Client.call(Client.java:1441)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
at com.sun.proxy.$Proxy10.delete(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:552)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy11.delete(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2104)
... 13 more
Sqoop command is
sqoop import --connect jdbc:mysql://10.188.177.228:3306/sales --username vaishak --password root_123 --table categories --m 1 --delete-target-dir --target-dir /user/hive/warehouse/sales.db/categories
As per the documentation below, I tried changing fs.defaultFS in core-site.xml:
https://community.cloudera.com/t5/CDH-Manual-Installation/Permission-denied-user-root-access-WRITE-inode-quot-user/td-p/4943
It didn't work.
I also tried the following Stack Overflow link:
Permission exception for Sqoop
That didn't work out for me either.
I created a new folder for ak_bng and added it to the hive group as below:
sudo -u hdfs hadoop fs -mkdir /user/ak_bng
sudo -u hdfs hadoop fs -chown ak_bng:hive /user/ak_bng
I was still getting the same error.
In a few links I saw suggestions to add the user (in my case ak_bng) to the supergroup,
but I am not aware of how to do that.
A few suggested running the sqoop script as a different user; I am not aware of how to do that either.
I am very new to Unix and CDH, so I do not know how to achieve this.
I had a similar permission issue when I tried running sqoop scripts from the Hue editor.
The following is the error I got at that time:
Failed to create deployment directory: AccessControlException: Permission denied: user=hive, access=WRITE, inode="/user/hue/oozie/deployments":hue:hue:drwxr-xr-x (error 500)
Before CDH, I had set up Hadoop 3.1 and Sqoop separately (both outside CDH) and was able to import data into HDFS successfully.
But while using CDH I am getting these errors.
Can someone please shed some light on what the issue is here and how to resolve it?
Output of hadoop fs -ls /user
drwx------ - hdfs supergroup 0 2019-06-04 12:47 /user/hdfs
drwxr-xr-x - mapred hadoop 0 2019-05-27 20:06 /user/history
drwxr-xr-t - hive hive 0 2019-06-03 18:01 /user/hive
drwxr-xr-x - hue hue 0 2019-06-03 18:01 /user/hue
drwxr-xr-x - impala impala 0 2019-05-27 20:08 /user/impala
drwxr-xr-x - oozie oozie 0 2019-05-27 20:12 /user/oozie
drwxr-xr-x - spark spark 0 2019-05-27 20:06 /user/spark
Group details:
root:x:0:
bin:x:1:
daemon:x:2:
sys:x:3:
adm:x:4:
tty:x:5:
disk:x:6:
lp:x:7:
mem:x:8:
kmem:x:9:
wheel:x:10:ak_bng
cdrom:x:11:
mail:x:12:postfix
man:x:15:
dialout:x:18:
floppy:x:19:
games:x:20:
tape:x:33:
video:x:39:
ftp:x:50:
lock:x:54:
audio:x:63:
nobody:x:99:
users:x:100:
utmp:x:22:
utempter:x:35:
input:x:999:
systemd-journal:x:190:
systemd-network:x:192:
dbus:x:81:
polkitd:x:998:
ssh_keys:x:997:
sshd:x:74:
postdrop:x:90:
postfix:x:89:
printadmin:x:996:
dip:x:40:
cgred:x:995:
rpc:x:32:
libstoragemgmt:x:994:
unbound:x:993:
kvm:x:36:qemu
qemu:x:107:
chrony:x:992:
gluster:x:991:
rtkit:x:172:
radvd:x:75:
tss:x:59:
usbmuxd:x:113:
colord:x:990:
abrt:x:173:
geoclue:x:989:
saslauth:x:76:
libvirt:x:988:
pulse-access:x:987:
pulse-rt:x:986:
pulse:x:171:
gdm:x:42:
setroubleshoot:x:985:
gnome-initial-setup:x:984:
stapusr:x:156:
stapsys:x:157:
stapdev:x:158:
tcpdump:x:72:
avahi:x:70:
slocate:x:21:
ntp:x:38:
ak_bng:x:1000:
localadmin:x:1001:
am_bng:x:1002:
localuser:x:1003:
apache:x:48:
cassandra:x:983:
mysql:x:27:
cloudera-scm:x:982:
hadoop:x:1011:yarn,hdfs,mapred
postgres:x:26:
zookeeper:x:981:
yarn:x:980:
hdfs:x:979:
mapred:x:978:
kms:x:977:kms
httpfs:x:976:httpfs
hbase:x:975:
hive:x:974:impala
sentry:x:973:
solr:x:972:
sqoop:x:971:
flume:x:970:
spark:x:969:
oozie:x:968:
hue:x:967:
impala:x:966:
llama:x:965:
kudu:x:964:
I need to run sqoop scripts from the command line as user ak_bng.
Sqoop is running as the ak_bng user (vaishak is only the MySQL login), and because of --delete-target-dir it first tries to delete /user/hive/warehouse/sales.db/categories. ak_bng has no permission to delete inside that directory, which is owned by hive:hive and protected by the sticky bit, so Sqoop throws
Permission denied by sticky bit: user=ak_bng, path="/user/hive/warehouse/sales.db/categories":hive:hive:drwxr-xr-t, parent="/user/hive/warehouse/sales.db":hive:hive:drwxr-xr-t
Specify a target directory that ak_bng owns and rerun, for example /user/ak_bng/sales.db.
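A minimal sketch of that fix, reusing the original import command and assuming you can run HDFS commands as the hdfs superuser to create the directory:
sudo -u hdfs hadoop fs -mkdir -p /user/ak_bng/sales.db
sudo -u hdfs hadoop fs -chown -R ak_bng:ak_bng /user/ak_bng/sales.db

sqoop import \
  --connect jdbc:mysql://10.188.177.228:3306/sales \
  --username vaishak --password root_123 \
  --table categories -m 1 \
  --delete-target-dir \
  --target-dir /user/ak_bng/sales.db/categories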

Call From kv.local/172.20.12.168 to localhost:8020 failed on connection exception when using teragen

I am working with Hadoop's teragen to benchmark Hadoop MapReduce with terasort.
But when I run the following command,
hadoop jar /Users/**/Documents/hadoop-2.6.4/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar teragen -Dmapreduce.job.maps=100 1t random-data
I got the following exception:
17/06/01 15:09:21 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/06/01 15:09:22 INFO client.RMProxy: Connecting to ResourceManager at /127.0.0.1:8032
17/06/01 15:09:23 INFO terasort.TeraSort: Generating -727379968 using 100
17/06/01 15:09:23 INFO mapreduce.JobSubmitter: number of splits:100
17/06/01 15:09:23 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1496303775726_0003
17/06/01 15:09:23 INFO impl.YarnClientImpl: Submitted application application_1496303775726_0003
17/06/01 15:09:23 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1496303775726_0003/
17/06/01 15:09:23 INFO mapreduce.Job: Running job: job_1496303775726_0003
17/06/01 15:09:27 INFO mapreduce.Job: Job job_1496303775726_0003 running in uber mode : false
17/06/01 15:09:27 INFO mapreduce.Job: map 0% reduce 0%
17/06/01 15:09:27 INFO mapreduce.Job: Job job_1496303775726_0003 failed with state FAILED due to: Application application_1496303775726_0003 failed 2 times due to AM Container for appattempt_1496303775726_0003_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://localhost:8088/proxy/application_1496303775726_0003/Then, click on links to logs of each attempt.
Diagnostics: Call From KV.local/172.20.12.168 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
java.net.ConnectException: Call From KV.local/172.20.12.168 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
at org.apache.hadoop.ipc.Client.call(Client.java:1473)
at org.apache.hadoop.ipc.Client.call(Client.java:1400)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy34.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy35.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1977)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1118)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:251)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:608)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:706)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:369)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1522)
at org.apache.hadoop.ipc.Client.call(Client.java:1439)
... 31 more
As the error shows, it is not able to connect to localhost:8020, but when I check the NameNode web UI, it shows that the NameNode is active. Please see the screenshot below:
I found many posts related to this, but none helped me out. I also checked the hosts file, which contains the following lines:
127.0.0.1 localhost
172.20.12.168 localhost
Can anybody help me sort out this problem?
The following procedure helped me solve the issue:
Stop all the services.
Delete namenode and datanode directories as specified in hdfs-site.xml.
Create new namenode and datanode directories and modify hdfs-site.xml accordingly.
In core-site.xml, make the following changes or add the following properties:
<property>
<name>fs.defaultFS</name>
<value>hdfs://172.20.12.168/</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://172.20.12.168:8020</value>
</property>
Make the following change in the hadoop-2.6.4/etc/hadoop/hadoop-env.sh file:
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_91.jdk/Contents/Home
Restart DFS, YARN, and the MapReduce JobHistory server as follows:
start-dfs.sh
start-yarn.sh
mr-jobhistory-daemon.sh start historyserver
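After restarting, a few sanity checks to confirm the NameNode is actually reachable at the configured address (a sketch; the expected value is whatever you set in core-site.xml):
hdfs getconf -confKey fs.defaultFS   # should print the hdfs://172.20.12.168 address you configured
hdfs dfsadmin -report | head         # the NameNode should answer without a ConnectException
hdfs dfs -ls /                       # a basic read against the NameNode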

Using Sqoop to import data from Redshift To Hive

I'm getting the error: Could not load db driver class.
The connection string and error are below. Under that is a list of the jar files in the lib directory. What am I doing wrong?
sqoop import \
  --connect jdbc:redshift://< > \
  --username < > --password < > \
  --driver com.amazon.redshift.jdbc.Driver \
  --table import-all-tables
17/04/21 11:14:46 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.4.2.0-258
17/04/21 11:14:46 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/04/21 11:14:46 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
17/04/21 11:14:46 INFO manager.SqlManager: Using default fetchSize of 1000
17/04/21 11:14:46 INFO tool.CodeGenTool: Beginning code generation
17/04/21 11:14:46 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: Could not load db driver class: com.amazon.redshift.jdbc.Driver
java.lang.RuntimeException: Could not load db driver class: com.amazon.redshift.jdbc.Driver
at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:856)
at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:744)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:767)
at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:270)
at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:241)
at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:227)
at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:295)
at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1845)
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1645)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:107)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:478)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.Sqoop.run(Sqoop.java:148)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:184)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:226)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:235)
at org.apache.sqoop.Sqoop.main(Sqoop.java:244)
[t lib]$ ls
ant-contrib-1.0b3.jar hsqldb-1.8.0.10.jar kite-hadoop-compatibility-1.0.0.jar parquet-generator-1.4.1.jar
ant-eclipse-1.0-jvm1.2.jar jackson-annotations-2.3.0.jar mysql-connector-java.jar parquet-hadoop-1.4.1.jar
avro-1.7.5.jar jackson-core-2.3.1.jar opencsv-2.3.jar parquet-jackson-1.4.1.jar
avro-mapred-1.7.5-hadoop2.jar jackson-core-asl-1.9.13.jar paranamer-2.3.jar RedshiftJDBC42-1.2.1.1001 (2).jar
commons-codec-1.4.jar jackson-databind-2.3.1.jar parquet-avro-1.4.1.jar slf4j-api-1.6.1.jar
commons-compress-1.4.1.jar jackson-mapper-asl-1.9.13.jar parquet-column-1.4.1.jar snappy-java-1.0.5.jar
commons-io-1.4.jar kite-data-core-1.0.0.jar parquet-common-1.4.1.jar xz-1.0.jar
commons-jexl-2.1.1.jar kite-data-hive-1.0.0.jar parquet-encoding-1.4.1.jar
commons-logging-1.1.1.jar kite-data-mapreduce-1.0.0.jar parquet-format-2.0.0.jar
Your JDBC driver does not exist in sqoop/lib, so download a valid JDBC driver and copy it into sqoop/lib.
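As a hypothetical sketch of that advice (the lib path and the connection placeholders are assumptions, not taken from the question): put the Redshift JDBC jar in the lib directory Sqoop actually reads, give it a clean name, and keep --driver pointing at a class that jar provides. Note also that import-all-tables is a separate Sqoop tool, not a table name, so --table needs a real table.
cd /usr/hdp/current/sqoop-client/lib   # use your Sqoop install's lib directory
mv "RedshiftJDBC42-1.2.1.1001 (2).jar" RedshiftJDBC42-1.2.1.1001.jar   # rename away the " (2)" suffix, likely from a duplicate download

sqoop import \
  --connect "jdbc:redshift://<host>:5439/<database>" \
  --username <user> --password <password> \
  --driver com.amazon.redshift.jdbc.Driver \
  --connection-manager org.apache.sqoop.manager.GenericJdbcManager \
  --table <schema>.<table> \
  --target-dir /tmp/redshift_import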
