Recently I have been working on Ambari. But after I installed successfully, everything is working well except the HBase. Only the HBase master is good, and other RegionServers all get the alert:
Connection failed: [Errno 111] Connection refused to server1.hadoop:16030. (the domain name differs from machines.)
Anyone have the same problem?
I have fixed this issue. I read the log file in /var/log/hbase/*.log of my region servers ,and find that it's clock is not sync with the master's. So I make all the servers to sync its clock to the master's using ntpd. Then I restart the ambari components and no alert showed up!!!
Think this may help those with the problem.
Related
I started the Cloudera manager successfully but while trying to see the list of files it throws error related to namenode. After that I checked the namenode is started or not , but it says
Unable to connect
Firefox can't establish a connection to the server at quickstart.cloudera:50070.
The site could be temporarily unavailable or too busy. Try again in a few moments.
If you are unable to load any pages, check your computer's network connection. If your computer or network is protected by a firewall
or proxy, make sure that Firefox is permitted to access the Web.
How can I resolve this?
We have 5 node hortonworks cluster with ambari monitors installed in all nodes and metrics collector installed in master node.
I am getting Connection failed: [Errno 111] Connection refused to 0.0.0.0:6188
PFA for error.
https://drive.google.com/file/d/0B85rPUe3-QPXbXJSRzJmdUwwQU0/view?usp=sharing
I followed the below document and tried removing the service and added it.
https://cwiki.apache.org/confluence/display/AMBARI/Moving+Metrics+Collector+to+a+new+host
First of all, I am not able to find the origin of the error. Please share your experience if you ever faced this problem.
This happens sometimes that port is already being use by another process when we try to move collector to new host with 'Curl' commands specified on apache wiki.
Istead of doing using this you can leverage the feature which Ambari provides from it's GUI to move components from one host to another host .
'Move Master Wizard'
Follow the steps stated at Move Master Wizard , Ambari will take care rest of things for you.
I have fixed this issue by killing the process running in that port and restart the service. You can also do a manual reboot of the machine to fix this issue.
After successful running the maven project in eclipse, I cannot connect to http://localhost:8082. It shows the below error:
connection refused
What should I do to resolve the problem?
This exception usually occurs when there is no service listening on the port you are trying to connect to. A couple of things could be happening:
You have not started your server.
Your server is not waiting to
accept connections.
You are trying to connect to the wrong port
number.
You should check your services. I think they are either stopped or not working. You can check the services:-
a) if using windows by clicking services in control panel
b) if using linux type in the command Line sudo mqsql service.
you can refer this link- https://www.cyberciti.biz/faq/how-to-find-out-if-mysql-is-running-on-linux/
2)Also check the database connection string might you have not used correct string.
This is our first steps using big data stuff like apache spark and hadoop.
We have a installed Cloudera CDH 5.3. From the cloudera manager we choose to install spark. Spark is up and running very well in one of the nodes in the cluster.
From my machine I made a little application that connects to read a text file stored on hadoop HDFS.
I am trying to run the application from Eclipse and it displays these messages
15/02/11 14:44:01 INFO client.AppClient$ClientActor: Connecting to master spark://10.62.82.21:7077...
15/02/11 14:44:02 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster#10.62.82.21:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster#10.62.82.21:7077
15/02/11 14:44:02 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster#10.62.82.21:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: no further information: /10.62.82.21:7077
The application is has one class the create a context using the following line
JavaSparkContext sc = new JavaSparkContext(new SparkConf().setAppName("Spark Count").setMaster("spark://10.62.82.21:7077"));
where this IP is the IP of the machine spark working on.
Then I try to read a file from HDFS using the following line
sc.textFile("hdfs://10.62.82.21/tmp/words.txt")
When I run the application I got the
Check your Spark master logs, you should see something like:
15/02/11 13:37:14 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkMaster#mymaster:7077]
15/02/11 13:37:14 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkMaster#mymaster:7077]
15/02/11 13:37:14 INFO Master: Starting Spark master at spark://mymaster:7077
Then when your connecting to the master, be sure to use exactly the same hostname as found in the logs above (do not use the IP address):
.setMaster("spark://mymaster:7077"));
Spark standalone is a bit picky with this hostname/IP stuff.
When you create your Spark master using the shell command "sbin/start-master.sh". go the the address http://localhost:8080 and check the "URL" row.
I notice no accepted answer, just for info I thought I'd mention a couple things.
First, in the spark-env.sh file in the conf directory, the SPARK_MASTER_IP and SPARK_LOCAL_IP settings can be hostnames. You don't want them to be, but they can be.
As noted in another answer, Spark can be a little picky about hostname vs. IP address, because of this resolved bug/feature: See bug here. The problem is, it's not clear if they "resolved" is simply by telling us to use IP instead of hostname?
Well I am having this same problem right now, and the first thing you do is check the basics.
Can you ping the box where the Spark master is running? Can you ping the worker from the master? More importantly, can you password-less ssh to the worker from the master box? Per 1.5.2 docs you need to be able to do that with a private key AND have the worker entered in the conf/slaves file. I copied the relevant paragraph at the end.
You can get a situation where the worker can contact the master but the master can't get back to the worker so it looks like no connection is being made. Check both directions.
Finally of all the combinations of settings, in a limited experiment just now I only found one that mattered: On the master, in spark-env.sh, set the SPARK_MASTER_IP to the IP address, not hostname. Then connect from the worker with spark://192.168.0.10:7077 and voila it connects! Seemingly none of the other config parameters are needed here.
Here's the paragraph from the docs about ssh and slaves file in conf:
To launch a Spark standalone cluster with the launch scripts, you
should create a file called conf/slaves in your Spark directory, which
must contain the hostnames of all the machines where you intend to
start Spark workers, one per line. If conf/slaves does not exist, the
launch scripts defaults to a single machine (localhost), which is
useful for testing. Note, the master machine accesses each of the
worker machines via ssh. By default, ssh is run in parallel and
requires password-less (using a private key) access to be setup. If
you do not have a password-less setup, you can set the environment
variable SPARK_SSH_FOREGROUND and serially provide a password for each
worker.
Once you have done that, using the IP address should work in your code. Let us know! This can be an annoying problem, and learning that most of the config params don't matter was nice.
I am new to Spark and I am trying to run it on EC2. I follow the tutorial on spark webpage by using spark-ec2 to launch a Spark cluster. Then, I try to use spark-submit to submit the application to the cluster. The command looks like this:
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://ec2-54-88-9-74.compute-1.amazonaws.com:7077 --executor-memory 2G --total-executor-cores 1 ./examples/target/scala-2.10/spark-examples_2.10-1.0.0.jar 100
However, I got the following error:
ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
Please let me know how to fix it. Thanks.
You're seeing this issue because the master node of your spark-standalone cluster cant open a TCP connection back to the drive (on your machine). The default mode of spark-submit is client which runs the driver on the machine that submitted it.
A new cluster mode was added to spark-deploy that submits the job to the master where it is then run on a client, removing the need for a direct connection. Unfortunately this mode is not supported in standalone mode.
You can vote for the JIRA issue here: https://issues.apache.org/jira/browse/SPARK-2260
Tunneling your connection via SSH is possible but latency would be a big issue since the driver would be running locally on your machine.
I'm curious if you still having this issue ... But in case anyone is asking here is a brief answer. As clarified by jhappoldt, the master node of your spark-standalone cluster cant open a TCP connection back to the drive (on your local machine). Two workarounds are possible, tested and succeeded.
(1) From EC2 Management Console, create a new security group and add rules to enable TCP back and forth from your PC (public IP). (what I did was adding TCP rules inbound and outbound) ... Then add this security group to your master instance. (right click --> Networking --> Change security groups). Note: add it and don't remove the already established security groups.
This solution work well, but in your specific scenario, deploying your application from local machine to EC2 cluster, you will face further problems (resource related) so the next option is the best one
(2) Having your .jar file (or .egg) copy it to the master node using scp. You can check this link http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AccessingInstancesLinux.html for information about how to do that; and deploy your application from the master node. Note: spark is already pre-insalled so you will do nothing but write the same exact command you write on your local machine from ~/spark/bin. This shall work perfect.
Are you executing the command on your local machine, or on the created EC2 node? If you're doing it locally, make sure port 7077 is open in the security settings, as its closed to the outside by default.