Why am I Receiving a Table Not Found Error for Saved Sqoop Jobs? - sqoop

I am using AWS EMR.
When I execute a saved sqoop job, I receive the following error:
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
19/03/12 22:36:21 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
19/03/12 22:36:21 DEBUG tool.JobTool: Enabled debug logging.
19/03/12 22:36:22 DEBUG hsqldb.HsqldbJobStorage: Checking for table: SQOOP_ROOT
19/03/12 22:36:22 DEBUG hsqldb.HsqldbJobStorage: Found table: SQOOP_ROOT
19/03/12 22:36:22 DEBUG hsqldb.HsqldbJobStorage: Looking up property sqoop.hsqldb.job.storage.version for version null
19/03/12 22:36:22 DEBUG hsqldb.HsqldbJobStorage: => 0
19/03/12 22:36:22 DEBUG hsqldb.HsqldbJobStorage: Looking up property sqoop.hsqldb.job.info.table for version 0
19/03/12 22:36:22 DEBUG hsqldb.HsqldbJobStorage: => SQOOP_SESSIONS
19/03/12 22:36:22 DEBUG hsqldb.HsqldbJobStorage: Checking for table: SQOOP_SESSIONS
19/03/12 22:36:22 DEBUG hsqldb.HsqldbJobStorage: Found table: SQOOP_SESSIONS
19/03/12 22:36:22 DEBUG hsqldb.HsqldbJobStorage: Restoring job: myJob0
19/03/12 22:36:22 DEBUG hsqldb.HsqldbJobStorage: Job: myJob0; Getting properties with class schema
19/03/12 22:36:22 DEBUG hsqldb.HsqldbJobStorage: Job: myJob0; Getting properties with class SqoopOptions
19/03/12 22:36:22 DEBUG hsqldb.HsqldbJobStorage: Job: myJob0; Getting properties with class config
19/03/12 22:36:22 DEBUG hsqldb.HsqldbJobStorage: System property set: 0
19/03/12 22:36:22 DEBUG hsqldb.HsqldbJobStorage: Stored property set: 0
19/03/12 22:36:22 DEBUG util.SqoopJsonUtil: Passed mapJsonStr ::null to parse
--table or --query is required for import. (Or use sqoop import-all-tables.)
Try --help for usage instructions.
19/03/12 22:36:22 DEBUG hsqldb.HsqldbJobStorage: Flushing current transaction
19/03/12 22:36:22 DEBUG hsqldb.HsqldbJobStorage: Closing connection
Note: when I run the same import command directly, without creating a Sqoop job, it works fine, so I have not omitted the table name or made a syntax error.
When I inspect the saved Sqoop job, I find that the db.table property is missing from the Sqoop metastore.
How can I resolve this issue?
Thanks in advance.
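One way to confirm the symptom described in the question is to check the saved job's stored properties for a non-empty db.table entry. The helper below is only a minimal sketch: has_saved_table is a hypothetical name, and it assumes that sqoop job --show prints each stored property on its own line as "name = value".

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `sqoop job --show <job>` output on stdin and
# succeeds only if a db.table property with a non-empty value is present.
# Assumption: properties are printed one per line as "name = value".
has_saved_table() {
  grep -Eq '^[[:space:]]*db\.table[[:space:]]*=[[:space:]]*[^[:space:]]+'
}

# Intended use on the cluster (not executed here):
#   sqoop job --show myJob0 | has_saved_table \
#     || echo "db.table is missing from the saved job"
```

If the property turns out to be missing, deleting the job with sqoop job --delete myJob0 and creating it again may help you compare what was stored against the options you passed.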

Related

I am getting an error while creating a sqoop job

Warning: /opt/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hadoop-3.3.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/apache-hive-3.1.2-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2022-11-06 08:26:53,928 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
2022-11-06 08:26:54,565 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
$ sqoop job --create parjob -- import --connect jdbc:mysql://ms.itversity.com/nyse_export --username nyse_user --password itversity --table finalj --m 1 --target-dir /user/cloudera/fidir --incremental append --check-column id --last-value 0
I get this output when I execute the sqoop job command.
No error is shown in this output. You can run sqoop job --list to check whether the job was created.
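The check suggested in the answer can be scripted roughly as follows. This is a sketch under assumptions: job_listed is a hypothetical helper name, and it assumes that sqoop job --list prints each saved job name on its own (possibly indented) line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the job name given as $1 appears as its
# own line in `sqoop job --list` output read from stdin.
# Assumption: each saved job is printed on a separate, possibly indented, line.
job_listed() {
  # Trim leading/trailing whitespace, then require an exact whole-line match.
  sed 's/^[[:space:]]*//; s/[[:space:]]*$//' | grep -Fxq -- "$1"
}

# Intended use (not executed here):
#   sqoop job --list | job_listed parjob && echo "parjob was created"
```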

Install Hive on Windows

I am trying to install Hive 3.1.2 on Windows 10 with Hadoop 3.2.2.
I can start the Hadoop server and start the Hive shell by running "hive".
The first problem is that it shows a lot of WARN messages:
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2021-09-09T21:01:22,001 INFO [main] org.apache.hadoop.hive.conf.HiveConf - Found configuration file file:/C:/my_programs/hive_3.1.2/conf/hive-site.xml
2021-09-09T21:01:22,303 WARN [main] org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.enable.impersonation does not exist
2021-09-09T21:01:23,557 WARN [main] org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.enable.impersonation does not exist
Hive Session ID = f879881f-c49b-449b-b8cf-81302c585358
Logging initialized using configuration in jar:file:/C:/my_programs/hive_3.1.2/lib/hive-common-3.1.2.jar!/hive-log4j2.properties Async: true
2021-09-09T21:01:25,309 INFO [main] org.apache.hadoop.hive.ql.session.SessionState - Created HDFS directory: /tmp/hive/admin/f879881f-c49b-449b-b8cf-81302c585358
2021-09-09T21:01:25,317 INFO [main] org.apache.hadoop.hive.ql.session.SessionState - Created local directory: C:/Users/admin/AppData/Local/Temp/admin/f879881f-c49b-449b-b8cf-81302c585358
2021-09-09T21:01:25,325 INFO [main] org.apache.hadoop.hive.ql.session.SessionState - Created HDFS directory: /tmp/hive/admin/f879881f-c49b-449b-b8cf-81302c585358/_tmp_space.db
2021-09-09T21:01:25,345 INFO [main] org.apache.hadoop.hive.conf.HiveConf - Using the default value passed in for log id: f879881f-c49b-449b-b8cf-81302c585358
2021-09-09T21:01:25,345 INFO [main] org.apache.hadoop.hive.ql.session.SessionState - Updating thread name to f879881f-c49b-449b-b8cf-81302c585358 main
2021-09-09T21:01:25,383 WARN [f879881f-c49b-449b-b8cf-81302c585358 main] org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.enable.impersonation does not exist
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
2021-09-09T21:01:53,237 INFO [f879881f-c49b-449b-b8cf-81302c585358 main] CliDriver - Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>
The Hive shell still starts, but when I run
hive> show databases;
it fails with an error like this:
hive> show databases;
2021-09-09T21:05:01,341 INFO [f879881f-c49b-449b-b8cf-81302c585358 main] org.apache.hadoop.hive.conf.HiveConf - Using the default value passed in for log id: f879881f-c49b-449b-b8cf-81302c585358
FAILED: HiveException java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
2021-09-09T21:05:43,092 INFO [f879881f-c49b-449b-b8cf-81302c585358 main] org.apache.hadoop.hive.conf.HiveConf - Using the default value passed in for log id: f879881f-c49b-449b-b8cf-81302c585358
2021-09-09T21:05:43,093 INFO [f879881f-c49b-449b-b8cf-81302c585358 main] org.apache.hadoop.hive.ql.session.SessionState - Resetting thread name to main
hive>
From the solutions I have read, I think the problem comes from the Hive metastore.
I followed a tutorial that connects a Derby metastore to Hive.
But when I try to run
schematool -dbType derby -initSchema
Windows does not recognize schematool as a command.
So I am confused about how to initialize the metastore database for Hive. Is there another way to do it?
Update 2021/9/20:
I have fixed all the variables in my paths, and now I am stuck on a new problem. The error is quite clear, but I have not found a solution in my research:
PS C:\my_programs\hive_3.1.2> .\bin\hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/my_programs/hive_3.1.2/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/my_programs/hadoop-3.2.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2021-09-20T20:22:34,274 INFO [main] org.apache.hadoop.hive.conf.HiveConf - Found configuration file file:/C:/my_programs/hive_3.1.2/conf/hive-site.xml
2021-09-20T20:22:34,694 WARN [main] org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.enable.impersonation does not exist
2021-09-20T20:22:37,728 WARN [main] org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.enable.impersonation does not exist
Hive Session ID = 89fc5e06-2a55-496c-aea0-ab5512839ac3
Logging initialized using configuration in jar:file:/C:/my_programs/hive_3.1.2/lib/hive-exec-3.1.2.jar!/hive-log4j2.properties Async: true
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.conf.Configuration.getTimeDuration(Ljava/lang/String;JLjava/util/concurrent/TimeUnit;Ljava/util/concurrent/TimeUnit;)J
at org.apache.hadoop.hdfs.client.impl.DfsClientConf.<init>(DfsClientConf.java:248)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:307)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:291)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:173)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3354)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3403)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3371)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:477)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:226)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:624)
at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:591)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:747)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
PS C:\my_programs\hive_3.1.2>
It looks like an error in the HDFS configuration, but I don't know what I need to change in the Hadoop configuration: core-site.xml, hdfs-site.xml, or something else. Is some configuration needed on the Hadoop side to connect with Hive?

Unable to read HiveServer2 configs from ZooKeeper

I use HDP 3.1, and I used Ambari to deploy the Hadoop cluster and Hive. After deployment, I can run hive in the shell successfully. I then deployed Apache Kylin 2.6, which can sync Hive tables. But when I build the cube, I get the following error:
java.io.IOException: OS command error exit with return code: 1, error message: SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://datacenter1:2181,datacenter2:2181,datacenter3:2181/default;password=hdfs;serviceDiscoveryMode=zooKeeper;user=hdfs;zooKeeperNamespace=hiveserver2
19/02/15 10:04:53 [main]: INFO jdbc.HiveConnection: Connected to datacenter3:10000
19/02/15 10:04:53 [main]: WARN jdbc.HiveConnection: Failed to connect to datacenter3:10000
19/02/15 10:04:53 [main]: ERROR jdbc.Utils: Unable to read HiveServer2 configs from ZooKeeper
Error: Could not open client transport for any of the Server URI's in ZooKeeper: Failed to open new session: java.lang.IllegalArgumentException: Cannot modify dfs.replication at runtime. It is not in list of params that are allowed to be modified at runtime (state=08S01,code=0)
Cannot run commands specified using -e. No current connection
The command is:
hive -e "USE default;
When I run the hive command in the shell, it succeeds. The connection string is the same as the one used when building the cube in Kylin. I'm confused about why it succeeds in the shell but fails when building the cube.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://datacenter1:2181,datacenter2:2181,datacenter3:2181/default;password=hdfs;serviceDiscoveryMode=zooKeeper;user=hdfs;zooKeeperNamespace=hiveserver2
19/02/15 12:10:19 [main]: INFO jdbc.HiveConnection: Connected to datacenter3:10000
Connected to: Apache Hive (version 3.1.0.3.1.0.0-78)
Driver: Hive JDBC (version 3.1.0.3.1.0.0-78)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 3.1.0.3.1.0.0-78 by Apache Hive
0: jdbc:hive2://datacenter1:2181,datacenter2:>
You can try to add these two properties to hive-site.xml.
<property>
  <name>hive.security.authorization.sqlstd.confwhitelist</name>
  <value>mapred.*|hive.*|mapreduce.*|spark.*</value>
</property>
<property>
  <name>hive.security.authorization.sqlstd.confwhitelist.append</name>
  <value>mapred.*|hive.*|mapreduce.*|spark.*</value>
</property>
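The whitelist value above is a regular expression over parameter names. The sketch below tests a name against it with grep, assuming Hive requires the whole parameter name to match the expression (an assumption about Hive's behaviour, not something stated in the answer). is_whitelisted is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if parameter name $2 fully matches the
# whitelist regular expression $1.
# Assumption: Hive applies the whitelist as a full match on the name.
is_whitelisted() {
  printf '%s\n' "$2" | grep -Exq -- "$1"
}

whitelist='mapred.*|hive.*|mapreduce.*|spark.*'
for name in hive.exec.parallel dfs.replication; do
  if is_whitelisted "$whitelist" "$name"; then
    echo "$name: allowed"
  else
    echo "$name: rejected"
  fi
done
# prints:
#   hive.exec.parallel: allowed
#   dfs.replication: rejected
```

Under that assumption, the pattern as written would not cover dfs.replication, so a dfs.* alternative would presumably be needed as well. This is an inference from the regular expression, not something the answer states.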
Finally, I found the root cause. The error log contains the message 'Cannot modify dfs.replication at runtime.' Kylin sets this property in $KYLIN_HOME/conf/kylin_hive_conf.xml, and when it runs the hive command it automatically appends the properties from that file. The final command looks like: hive --hiveconf dfs.replication=2 ..........
It looks like the dfs.replication property cannot be appended to the hive command. I removed this property from kylin_hive_conf.xml, and it works now.
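To see whether a Kylin installation is affected in the same way, one could check whether kylin_hive_conf.xml declares the property. The helper below is a minimal sketch; declares_property is a hypothetical name, and it only looks for a literal name element rather than parsing the XML.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the Hadoop-style XML config file $1
# declares a property named $2 (searches for a literal <name>...</name>).
declares_property() {
  grep -Fq -- "<name>$2</name>" "$1"
}

# Intended use (not executed here):
#   declares_property "$KYLIN_HOME/conf/kylin_hive_conf.xml" dfs.replication \
#     && echo "dfs.replication will be appended to the hive command"
```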

Sqoop list-tables issue

I'm using the Hortonworks Sandbox 2.2 VM and am having issues when running Sqoop against Oracle. I'm executing a command like:
sqoop list-tables --connect jdbc:oracle:thin:@mydbhost.com:1521/sid --username user --password password
It executes, but nothing happens:
Warning: /usr/hdp/2.2.4.2-2/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
15/05/29 15:55:58 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5.2.2.4.2-2
15/05/29 15:55:58 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
15/05/29 15:55:58 INFO oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop is disabled.
15/05/29 15:55:58 INFO manager.SqlManager: Using default fetchSize of 1000
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.4.2-2/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.4.2-2/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.4.2-2/hive/lib/hive-jdbc-0.14.0.2.2.4.2-2-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/05/29 15:55:59 INFO manager.OracleManager: Time zone has been set to GMT
If I specify --driver oracle.jdbc.OracleDriver as a parameter, then the list-tables command works fine, but the import fails with the error "ORA-00933: SQL command not properly ended". I read in several places that specifying the --driver argument is not the right way to go about things, but when I don't specify it I can't get anything to work.
What am I doing wrong here?
Try the steps below:
1. Place ojdbc6.jar in $SQOOP_HOME/lib.
2. Remove the --driver option.
3. Quote all the parameters.
In addition to @gopikrishna_BD's answer: Oracle stores table names in uppercase by default, so give the table name in uppercase when running sqoop import. You should also give the database name in uppercase.
This article will help you to learn more about Sqoop with Oracle.
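The advice in the two answers above (no --driver option, quoted parameters, uppercase names) can be combined as in the sketch below. oracle_import_cmd is a hypothetical helper that only prints the command rather than running it. The schema and table names are placeholders, and -P (prompt for the password) is used in place of --password, as the Sqoop warning in the log recommends.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints a sqoop import command with every parameter
# quoted, no --driver option, and the table name converted to uppercase.
# $1: JDBC URL, $2: username, $3: table (optionally SCHEMA.TABLE)
oracle_import_cmd() {
  local url="$1" user="$2" table="$3" upper_table
  # Oracle stores unquoted identifiers in uppercase, so normalise the name.
  upper_table=$(printf '%s' "$table" | tr '[:lower:]' '[:upper:]')
  printf "sqoop import --connect '%s' --username '%s' -P --table '%s'\n" \
    "$url" "$user" "$upper_table"
}

# Placeholder values for illustration only.
oracle_import_cmd 'jdbc:oracle:thin:@mydbhost.com:1521/sid' 'user' 'myschema.my_table'
```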

Hive error when running from hortonworks sandbox

I am following this document to test the sentiment analysis - can someone please help me out -- thanks!!
[root@sandbox ~]# hive -f hiveddl.sql
15/04/12 15:43:23 WARN conf.HiveConf: HiveConf of name hive.optimize.mapjoin.mapreduce does not exist
15/04/12 15:43:23 WARN conf.HiveConf: HiveConf of name hive.heapsize does not exist
15/04/12 15:43:23 WARN conf.HiveConf: HiveConf of name hive.server2.enable.impersonation does not exist
15/04/12 15:43:23 WARN conf.HiveConf: HiveConf of name hive.auto.convert.sortmerge.join.noconditionaltask does not exist
Logging initialized using configuration in file:/etc/hive/conf/hive-log4j.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.0.0-2041/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.0.0-2041/hive/lib/hive-jdbc-0.14.0.2.2.0.0-2041-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Added [json-serde-1.1.6-SNAPSHOT-jar-with-dependencies.jar] to class path
Added resources: [json-serde-1.1.6-SNAPSHOT-jar-with-dependencies.jar]
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org.apache.hadoop.hive.serde2.objectinspector.primitive.AbstractPrimitiveJavaObjectInspector.<init>(Lorg/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils$PrimitiveTypeEntry;)V
#
This issue has already been reported and answered on GitHub:
Github issue link
