Hive won't run in Hortonworks 2.2.4 - hadoop

I've just downloaded Hortonworks Sandbox 2.2.4, and when I follow the Hortonworks tutorial on Hive, I get this:
HCatClient error on create table: {
"statement": "use default; create table nyse_stocks(`exchange` string, `stock_symbol` string, `date` string, `stock_price_open` float, `stock_price_high` float, `stock_price_low` float, `stock_price_close` float, `stock_volume` bigint, `stock_price_adj_close` float) row format delimited fields terminated by '\\t';",
"error": "unable to create table: nyse_stocks",
"exec": {
"stdout": "",
"stderr": "
15/05/05 09:57:50 WARN conf.HiveConf: HiveConf of name hive.heapsize does not exist
15/05/05 09:57:50 WARN conf.HiveConf: HiveConf of name hive.server2.enable.impersonation does not exist
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.4.2-2/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.4.2-2/hive/lib/hive-jdbc-0.14.0.2.2.4.2-2-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Command was terminated due to timeout(60000ms). See templeton.exec.timeout property",
"exitcode": 143
}
} (error 500)
When I ssh into the Sandbox and simply type hive in the shell, I get the same output that appears inside stderr above:
15/05/05 09:57:50 WARN conf.HiveConf: HiveConf of name hive.heapsize does not exist
15/05/05 09:57:50 WARN conf.HiveConf: HiveConf of name hive.server2.enable.impersonation does not exist
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.4.2-2/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.4.2-2/hive/lib/hive-jdbc-0.14.0.2.2.4.2-2-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
What could be the solution for this?

There is a problem with the SerDe jar inside the sandbox, caused by an incompatibility.
I had the same problem, and after some searching I found a solution at
https://github.com/brandonswilson.
Here is what I did:
a) Removed the first line from hiveddl.sql.
b) Changed line 34 to:
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
That's it!
(After running hive -f hiveddl.sql, I got a problem at line 119:
-- FAILED: SemanticException Unrecognized file format in STORED AS clause: 'RCFILESE' --
It was: STORED AS RCFilese
I changed it to: STORED AS RCFile
You could make this change before running the script.)
Daniel Galizi.
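The manual edits described in this answer could also be scripted. The following is only a sketch under stated assumptions: the function name patch_hiveddl is hypothetical, and instead of relying on the line numbers 34 and 119 it rewrites every line that starts with ROW FORMAT SERDE and every occurrence of the RCFilese typo, which may be broader than what the answer intended.

```python
import re

# SerDe clause taken from the answer above.
HCATALOG_SERDE = "ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'"


def patch_hiveddl(text):
    """Return a patched copy of the hiveddl.sql contents (hypothetical helper)."""
    # (a) Drop the first line of the script.
    lines = text.splitlines()[1:]
    patched = []
    for line in lines:
        # (b) Assumption: every ROW FORMAT SERDE clause should use the
        # HCatalog JSON SerDe, not only the one on line 34.
        if line.strip().upper().startswith("ROW FORMAT SERDE"):
            line = HCATALOG_SERDE
        # Fix the 'RCFilese' typo that triggers the SemanticException.
        line = re.sub(r"(?i)\bSTORED AS RCFILESE\b", "STORED AS RCFile", line)
        patched.append(line)
    return "\n".join(patched) + "\n"
```

Reading hiveddl.sql, passing its contents through this function, and writing the result to a new file keeps the original script available for comparison.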

Related

Hive installation error : com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V

I am new to Hadoop. After installing Hive, when I enter the hive command at the command prompt, it gives me the following error. The installed Hadoop version is 3.1.3. Derby version 10.12.1.1 is also installed.
C:\apache-hive-2.1.0-bin\bin>hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/apache-hive-2.1.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/hadoop-3.1.3/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console.
Connecting to jdbc:hive2://
com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
Beeline version 2.1.0 by Apache Hive
com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
Connection is already closed.
C:\apache-hive-2.1.0-bin\bin>

Trying to understand output of pyspark dataframe to parquet format conversion

I am running the PySpark code below (PySpark version is 1.6.0):
from pyspark.sql import HiveContext
from pyspark import SparkContext, SparkConf

if __name__ == '__main__':
    conf = SparkConf().setAppName('Testing')
    sc = SparkContext(conf=conf)
    hivec = HiveContext(sc)
    df = hivec.sql("select * from product_replica where product_price>100")
    df.write.option("compression","snappy").mode("overwrite").save("/user/cloudera/practice1/problem8/product/output",format="parquet")
    sc.stop()
The output I am getting is below. The Parquet files are created in the HDFS directory, but I cannot read them with parquet-tools; I get an error saying the .parquet file doesn't exist.
[cloudera@quickstart sparkTransformationsPractice]$ spark-submit hive_context.py
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/hive/lib/hive-exec-1.1.0-cdh5.13.0.jar!/shaded/parquet/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hive/lib/hive-jdbc-1.1.0-cdh5.13.0-standalone.jar!/shaded/parquet/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/parquet/lib/parquet-format-2.1.0-cdh5.13.0.jar!/shaded/parquet/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/parquet/lib/parquet-hadoop-bundle-1.5.0-cdh5.13.0.jar!/shaded/parquet/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/parquet/lib/parquet-pig-bundle-1.5.0-cdh5.13.0.jar!/shaded/parquet/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [shaded.parquet.org.slf4j.helpers.NOPLoggerFactory]
19/12/11 11:21:14 WARN thread.QueuedThreadPool: 2 threads could not be stopped
Can you please explain what this output means? I think I might be doing something wrong.

Unable to read HiveServer2 configs from ZooKeeper

I use HDP 3.1, and I used Ambari to deploy the Hadoop cluster and Hive. After deployment, I can run hive in the shell successfully. I then deployed Apache Kylin 2.6, and it can sync Hive tables. But when I build the cube, I get the following error:
java.io.IOException: OS command error exit with return code: 1, error message: SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://datacenter1:2181,datacenter2:2181,datacenter3:2181/default;password=hdfs;serviceDiscoveryMode=zooKeeper;user=hdfs;zooKeeperNamespace=hiveserver2
19/02/15 10:04:53 [main]: INFO jdbc.HiveConnection: Connected to datacenter3:10000
19/02/15 10:04:53 [main]: WARN jdbc.HiveConnection: Failed to connect to datacenter3:10000
19/02/15 10:04:53 [main]: ERROR jdbc.Utils: Unable to read HiveServer2 configs from ZooKeeper
Error: Could not open client transport for any of the Server URI's in ZooKeeper: Failed to open new session: java.lang.IllegalArgumentException: Cannot modify dfs.replication at runtime. It is not in list of params that are allowed to be modified at runtime (state=08S01,code=0)
Cannot run commands specified using -e. No current connection
The command is:
hive -e "USE default;
When I run the hive command in the shell, it succeeds. The connection string is the same as the one used when building the cube in Kylin. I'm confused why it succeeds in the shell but fails when building the cube.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://datacenter1:2181,datacenter2:2181,datacenter3:2181/default;password=hdfs;serviceDiscoveryMode=zooKeeper;user=hdfs;zooKeeperNamespace=hiveserver2
19/02/15 12:10:19 [main]: INFO jdbc.HiveConnection: Connected to datacenter3:10000
Connected to: Apache Hive (version 3.1.0.3.1.0.0-78)
Driver: Hive JDBC (version 3.1.0.3.1.0.0-78)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 3.1.0.3.1.0.0-78 by Apache Hive
0: jdbc:hive2://datacenter1:2181,datacenter2:>
You can try to add these two properties to hive-site.xml.
<property>
<name>hive.security.authorization.sqlstd.confwhitelist</name>
<value>mapred.*|hive.*|mapreduce.*|spark.*</value>
</property>
<property>
<name>hive.security.authorization.sqlstd.confwhitelist.append</name>
<value>mapred.*|hive.*|mapreduce.*|spark.*</value>
</property>
Finally, I found the root cause. The error log contains the message 'Cannot modify dfs.replication at runtime.' Kylin sets this property in $KYLIN_HOME/conf/kylin_hive_conf.xml, and when it runs the hive command, it automatically appends the properties from that file. The final command looks like: hive --hiveconf dfs.replication=2 ..........
It looks like the dfs.replication property can't be appended to the hive command. I removed this property from kylin_hive_conf.xml, and it works now.
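For anyone who prefers to script the removal of the property from kylin_hive_conf.xml, here is a minimal sketch. It assumes the file uses the usual Hadoop-style layout (a root element with property children, each holding a name and a value), and the helper name remove_property is hypothetical. Note that ElementTree drops XML comments by default when parsing, so hand-editing may be preferable if the comments in the file matter.

```python
import xml.etree.ElementTree as ET


def remove_property(xml_text, name):
    """Return the config XML with every <property> named `name` removed."""
    root = ET.fromstring(xml_text)
    # Iterate over a copy of the list, since we remove children while looping.
    for prop in list(root.findall("property")):
        if prop.findtext("name", default="").strip() == name:
            root.remove(prop)
    return ET.tostring(root, encoding="unicode")
```

Calling remove_property(contents, "dfs.replication") on the file contents and writing the result back (after making a backup) should have the same effect as deleting the property by hand.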

sqoop execute import can't print mapreduce information

Today I installed sqoop-1.4.6 on a Hadoop cluster. Importing data into HDFS with sqoop works, but the sqoop log contains very little and does not include the MapReduce output. I think the cause is in Hadoop, but I don't know how to resolve it.
The sqoop log info shows:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.5.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hbase/hbase-1.1.5/lib/phoenix-4.7.0-HBase-1.1-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hbase/hbase-1.1.5/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Note: /tmp/sqoop-hadoop/compile/737c3aeaaf2b0b38c618b5a7bd2b3411/QueryResult.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.

Hive error when running from hortonworks sandbox

I am following this document to test the sentiment analysis - can someone please help me out -- thanks!!
[root@sandbox ~]# hive -f hiveddl.sql
15/04/12 15:43:23 WARN conf.HiveConf: HiveConf of name hive.optimize.mapjoin.mapreduce does not exist
15/04/12 15:43:23 WARN conf.HiveConf: HiveConf of name hive.heapsize does not exist
15/04/12 15:43:23 WARN conf.HiveConf: HiveConf of name hive.server2.enable.impersonation does not exist
15/04/12 15:43:23 WARN conf.HiveConf: HiveConf of name hive.auto.convert.sortmerge.join.noconditionaltask does not exist
Logging initialized using configuration in file:/etc/hive/conf/hive-log4j.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.0.0-2041/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.0.0-2041/hive/lib/hive-jdbc-0.14.0.2.2.0.0-2041-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Added [json-serde-1.1.6-SNAPSHOT-jar-with-dependencies.jar] to class path
Added resources: [json-serde-1.1.6-SNAPSHOT-jar-with-dependencies.jar]
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org.apache.hadoop.hive.serde2.objectinspector.primitive.AbstractPrimitiveJavaObjectInspector.<init>(Lorg/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils$PrimitiveTypeEntry;)V
#
This issue has already been reported and answered on GitHub:
Github issue link
