facing issue while starting hive server and hive web interface - hadoop

((1))
I'm getting the below error while starting thrift server:
hive --service hiveserver
Starting Hive Thrift Server
org.apache.thrift.transport.TTransportException: Could not create ServerSocket on address 0.0.0.0/0.0.0.0:10000.
when I ran netstat port 10000 was already in use..
$ netstat -nl | grep 10000
tcp6 0 0 :::10000 :::* LISTEN
How do I resolve this?
((2))
While starting hive web interface getting below error
hive --service hwi
$ hive --service hwi
13/01/01 22:05:36 INFO hwi.HWIServer: HWI is starting up
13/01/01 22:05:37 INFO mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
13/01/01 22:05:37 INFO mortbay.log: jetty-6.1.26
13/01/01 22:05:37 INFO mortbay.log: Extract /opt/hive/lib/hive-hwi-0.9.0.jar to /tmp/Jetty_127_0_0_1_3606_hive.hwi.0.9.0.jar__hwi__.6ogsv5/webapp
13/01/01 22:05:37 WARN mortbay.log: failed SocketConnector#127.0.0.1:3606: java.net.BindException: Address already in use
13/01/01 22:05:37 WARN mortbay.log: failed Jetty20SShims$Server#21e554: java.net.BindException: Address already in use
Exception in thread "main" java.net.BindException: Address already in use
at java.net.PlainSocketImpl.socketBind(Native Method)
Please help.
Thanks in advance!!

Your port address seems to be used by some other program, you may follow below mentioned steps :-
((1)) Start hive server using another port address
hive --service hiveserver -p 10001 &
((2))
a] create hive-site.xml file if not present in $HIVE_HOME/conf folder
b] put following lines in it
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hive.hwi.listen.host</name>
<value>localhost</value>
</property>
<property>
<name>hive.hwi.listen.port</name>
<value>9998</value>
</property>
<property>
<name>hive.hwi.war.file</name>
<value>lib/hive-hwi-0.10.0.war</value>
<description>This sets the path to the HWI war file, relative to ${HIVE_HOME}. </description>
</property>
</configuration>
c] start hive web interface
hive --service hwi
d] browse localhost:9998/hwi/

I faced same problem here is the solution I got
1)set port numer
export HIVE_PORT=10000
2) Check which services is listening
sudo lsof -i -P | grep -i "listen"
3)if there is process relevant to port 10000 kill it
kill -9 pid
4) Start hive server
$HIVE_HOME/bin --service hiveserver
If it not work go to step 2 and start server again

Stop Hive;
Add the following properties in hive-site.xml
1) hive.hwi.listen.host = host
2) hive.hwi.listen.port = 9999
3) hive.hwi.war.file = /lib/hive-common-0.12.0.2.0.6.1-102.jar {This sets the path to the HWI war file, relative to $HIVE_HOME}
Start Hive again
Start HWI on Hive server with the command
nohup hive --service hwi &
Now, you can access HWI as host:9999/hwi

normally this issue arises .
Either you are changing the hostname so that what ever the user u have created in the metastore it still refering to the old metastore hostname.
Case -1 either metastore is not up which throws the above error so run the bin/metatool -listFSRoot if it ran without error then u r able to connect hive safly.
but still issue is not resolved case -2
Case-2 what ever the table created in the hive still points to the old hiveuser which was pointing to the old host name so u cant featch the record from the hive table .
Solution :- revert the host name in all the file with old host name and then run the hadoop and hive stack one after other.
Apart from this if any one has other solution please share.This I resolved in my production box.

If this kind of issue arises run
$ bin/metatool -listFSRoot
If it runs with out error then try to run the metastore and then check the hive can fetch the record from a table or not.

Related

Spark I/O error constructing remote block

I want to create an home-made spark cluster with two computer in the same network. The setup is the following:
A) 192.168.1.9 spark master with hadoop hdfs installed
Hadoop has this core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://0.0.0.0:9000</value>
</property>
</configuration>
B) 192.168.1.6 with spark only (slave)
From B I want to access to a file in A's hadoop hdfs using the spark command:
...
# Load files
file_1 = "input_1.pkl"
file_2 = "input_2.pkl"
hdfs_base_path = "hdfs://192.168.1.9:9000/folderx/"
sc.addFile(hdfs_base_path + file_1)
sc.addFile(hdfs_base_path + file_2)
# Get files back
with open(SparkFiles.get(file_1), 'rb') as fw:
// use fw
However, if I want to test the program in B, when I execute the program in B using the command:
./spark-submit --master local program.py
The output is the following:
17/07/25 19:02:51 INFO SparkContext: Added file hdfs://192.168.1.9:9000/bigdata/input_1_new_grid.pkl at hdfs://192.168.1.9:9000/bigdata/input_1_new_grid.pkl with timestamp 1501002171301
17/07/25 19:02:51 INFO Utils: Fetching hdfs://192.168.1.9:9000/bigdata/input_1_new_grid.pkl to /tmp/spark-838c3774-36ec-4db1-ab01-a8a8c627b100/userFiles-b4973f80-be6e-4f2e-8ba1-cd64ddca369a/fetchFileTemp1979399086141127743.tmp
17/07/25 19:02:51 WARN BlockReaderFactory: I/O error constructing remote block reader.
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
And later:
17/07/25 19:02:51 WARN DFSClient: Failed to connect to /127.0.0.1:50010 for block, add to deadNodes and continue. java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
The program tries to access 127.0.0.1:50010, and it is wrong. Should I install hadoop also in B? If it is not necessary, what is the correct configuration? Thank you!
btw, just in case anyone comes to find some sort of solution, I fixed my issue by pointed quickstart.cloudera to real IP address instead of 127.0.0.1.
The default /etc/hosts is
127.0.0.1 quickstart.cloudera quickstart localhost localhost.domain
What you want is
127.0.0.1 localhost localhost.domain
xxxIP_Address_oF_YOUR_VM quickstart.cloudera quickstart
You may want to modify /usr/bin/cloudera-quickstart-ip as well, because every time you restart your VM, the hosts file may got reset again.

HBase fails to start in single node cluster mode on Mac OSX

I am trying to get a personal HBase development environment set up. I have hdfs and yarn running, but cannot get HBase to start.
I have started up hadoop 2.7.1, by running start-dfs.sh and start-yarn.sh. I have verified these are running by testing hdfs dfs -mkdir /test and running a sample MR job bundled in the examples, I have browsed HDFS at port 50070.
I have started zookeeper 3.4.6 on port 2181 and set its dataDir. My zoo.cfg has:
dataDir=/Users/.../tools/hd/zookeeper_data
clientPort=2181
I observe its zookeeper_server.PID file in the dataDir I chose, and when I run jps I see the below:
51074 NodeManager
50743 DataNode
50983 ResourceManager
50856 SecondaryNameNode
57848 QuorumPeerMain
58731 Jps
50653 NameNode
QuorumPeerMain above matches the PID in zookeeper_server.PID, as I would expect. Is this expectation correct? From what I have done so far, should it be expected that any more processes should be showing here?
I installed hbase-1.1.2. I configure hbase-site.xml. I set the hbase.rootDir to be hdfs://localhost:8200/hbase, my hdfs is running at localhost:8200. I set hbase.zookeeper.property.dataDir to my zookeeper's dataDir, with the expectation that it will use this property to find the PID of a running zookeeper. Is this expectation correct or have I misunderstood? The config in hbase-site.xml is:
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:8020/hbase</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>Users/.../tools/hd/zookeeper_data</value>
</property>
When I run start-hbase.sh my server fails to start. I see this log message:
2015-09-26 19:32:43,617 ERROR [main] master.HMasterCommandLine: Master exiting
To investigate I ran hbase master start and get more detail:
2015-09-26 19:41:26,403 INFO [Thread-1] server.NIOServerCnxn: Stat command output
2015-09-26 19:41:26,405 INFO [Thread-1] server.NIOServerCnxn: Closed socket connection for client /127.0.0.1:63334 (no session established for client)
2015-09-26 19:41:26,406 INFO [main] zookeeper.MiniZooKeeperCluster: Started MiniZooKeeperCluster and ran successful 'stat' on client port=2182
Could not start ZK at requested port of 2181. ZK was started at port: 2182. Aborting as clients (e.g. shell) will not be able to find this ZK quorum.
2015-09-26 19:41:26,406 ERROR [main] master.HMasterCommandLine: Master exiting
java.io.IOException: Could not start ZK at requested port of 2181. ZK was started at port: 2182. Aborting as clients (e.g. shell) will not be able to find this ZK quorum.
at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:214)
at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:139)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2304)
So I have a few questions:
Should I be trying to set up a zookeeper before running HBase?
Why when I have started a zookeeper and told HBase where its dataDir is, does HBase try to start its own zookeeper?
Anything obviously stupid/misguided in the above?
The script you are using to start hbase start-hbase.sh will try to start the following components, in order:
zookeeper
hbase master
hbase regionserver
hbase master-backup
So, you could either stop the zookeeper which is started by you (or) you could start the daemons individually yourself:
# start hbase master
bin/hbase-daemon.sh --config ${HBASE_CONF_DIR} start master
# start region server
bin/hbase-daemons.sh --config ${HBASE_CONF_DIR} --hosts ${HBASE_CONF_DIR}/regionservers start regionserver
HBase stand alone starts it's own zookeeper (if you run start-hbase.sh), but it if fails to start or keep running, the other need hbase daemons won't work.
Make sure you explicitly set the properties for your interface lo0 in the hbase-site.xml file:
<property>
<name>hbase.zookeeper.dns.interface</name>
<value>lo0</value>
</property>
<property>
<name>hbase.regionserver.dns.interface</name>
<value>lo0</value>
</property>
<property>
<name>hbase.master.dns.interface</name>
<value>lo0</value>
</property>
I found that when my wifi was on, if these entries were missing, zookeeper filed to start.

Hive Metastore tries to create a Derby connection instead of MySQL

I am using Hive 0.11 and Metastore in local mode. When I try to start the Metastore daemon, it exits after spitting the following error message:
2013-11-21 08:47:19.541 GMT Thread[main,5,main] java.io.FileNotFoundException: derby.log (Permission denied)
2013-11-21 08:47:19.646 GMT Thread[main,5,main] Cleanup action starting
ERROR XBM0H: Directory /metastore_db cannot be created.
This is my hive-site.xml. I am using MySQL as Metastore storage. What I don't understand is why is Hive trying to create metastore_db locally.
Thanks.
Set hive.metastore.local property as false. (Removed as of Hive 0.10: If hive.metastore.uris is empty local mode is assumed, remote otherwise)
Set hive.metastore.uris property with valid uri (Host and port for the Thrift metastore server)
For eg:
<property>
<name>hive.metastore.uris</name>
<value>thrift://hap-db:9083</value>
<description>IP address (or fully-qualified domain name) and port of the metastore host</description>
</property>
Hi faced similar issue on hive 0.14. I had installed hive as root user and was trying to run hive services as a sudo user i use for all hadoop jobs.
Once i changed the installation owner to sudo and restarted it worked . so this error is mostly related to file permissions issue.

Hbase master not able to start

I am trying to start hbase master but getting the below error:
Could not start ZK at requested port of 2181. ZK was started at port: 2182. Aborting as clients (e.g. shell) will not be able to find this ZK quorum.
13/07/14 06:33:23 ERROR master.HMasterCommandLine: Failed to start master
java.io.IOException: Could not start ZK at requested port of 2181. ZK was started at port: 2182. Aborting as clients (e.g. shell) will not be able to find this ZK quorum.
at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:134)
at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1684)
13/07/14 06:33:23 INFO server.NIOServerCnxn: Closed socket connection for client /127.0.0.1:46283 (no session established for client)
hbase-site.xml
<configuration>
<!-- Changing the default port for REST since it conflicts with yarn nodemanager -->
<property>
<name>hbase.rest.port</name>
<value>8070</value>
<description>The port for the HBase REST server.</description>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:8020/hbase</value>
</property>
</configuration>
It seems like something else is already using port 2181, or perhaps you had started another ZK instance earlier on this port. Either stop that processe or change its port. If that's is not possible then set hbase.zookeeper.property.clientPort to 2182 in hbase-site.xml.
Please note that HBase needs ZK's services, even in standalone mode, so you should make sure that it's running OK.
HTH
Surprisingly, it can be an issue with privileges to connect to the port 2181, but not to 2182. Instead of ./start-hbase.sh try:
sudo ./start-hbase.sh
In my case it helped.
When starting Hbase in standalone mode, a single JVM hosts the HBase Master, an HBase RegionServer, and a ZooKeeper quorum peer. So, you don't need to start a ZK instance separately.
In your case, hbase is not able to start the ZK because another instance is probably already running on port 2181. So, just close that ZooKeeper instance and restart hbase. Also, do ensure proper permissions for the hbase rootdir.

hbase cannot connect to hadoop

I am running an hdfs instance in pseudo-distributed mode, and tried to make another hbase instance connected to it on the same server. Logs in hadoop are fine, but I constantly got the connection failure in hbase' log
==================================================================================
2012-05-01 10:49:07,212 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181
2012-05-01 10:49:07,213 WARN org.apache.zookeeper.ClientCnxn: Session 0x13708dc552d0001 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
2012-05-01 10:49:08,882 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181
2012-05-01 10:49:08,882 WARN org.apache.zookeeper.ClientCnxn: Session 0x13708dc552d0001 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
==================================================================================
Configuration of core-site.xml#hadoop
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
Configuration of hbase-site.xml#hbase
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:9000/hbase</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
I also tried to replace localhost with the actual ip of the server, but got the same error.
First, you need to make sure your hbase master node is running, you can use jps to check.
If it is not running, you can run it by start-hbase.sh command or hbase master start.
And then check its status by other commands, like netstat -an | grep 9000
Second, if the previous method does not work, check your firewall configuration such as iptables and SELinux.
Use sudo iptables -L to check your iptables configuration. You can disable the iptables by sudo service iptables stop command under redhat based linux systems.
Use getenforce to check if SElinux is in enforcing mode.
Third, check the system configuration, for example, ssh etc.
You need to replace the core hadoop jar in the $HBASE_HOME/bin/lib/hadoop-{{version}}core.jar with the one in the $HADOOP_HOME/hadoop-{{version}}core.jar
I was running into the same problem when I tried to install hbase 0.92 from hbase 0.90-xxx which was working fine, i replaced the hbase-env.sh and hbase-site.xml from the old hbase to the new but forgot to copy the hadoop core jar.
I'm always suspicious when I see localhost in a config. Also when you use localhost, then it becomes very difficult (to impossible) to access any services from the psuedo distributed system from a host other than the one you are running on.
You did say you tried the IP address, but you might want to make sure its the IP address that the node really thinks its at.
Check the zookeeper logs from when it starts up and see what IP address it "thinks" its at. There should be a line like:
2012-01-31 09:32:46,083 - INFO [main:Environment#97] - Server environment:host.name=ip-10-8-127-58.ec2.internal
Then use the value of host.name as the value for all places in ALL Hadodop, HBase, Hive, Zookeeper, etc config files that need a hostname (assuming they are all on the same machine as you said you are in psuedo-distributed mode)
Also you did not show your hbase.zookeeper.quorum setting in hbase-site.xml. That is where hbase gets its knowledge of the zookeeper's address
<property>
<name>hbase.zookeeper.quorum</name>
<value>ip-10-8-127-58.ec2.internal</value>
</property>
I think Hbase cannot find the zookeeper quorum, you have to set the hbase.zookeeper.quorum property in hbase-site.xml. Also check if classpath is set properly or not, check this doc out http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/package-summary.html#classpath

Resources