After I start memos-master on Ubuntu 14.04, I'm unable to get to http://:5050
therefore I want to verify if Mesos is listening on the default port 5050.
I'm following the instructions here.
vagrant#master2:~$ sudo start mesos-master
mesos-master start/running, process 5272
vagrant#master1:~$ mesos help
Usage: mesos <command> [OPTIONS]
Available commands:
help
start-agents.sh
daemon.sh
stop-masters.sh
start-masters.sh
start-slaves.sh
start-cluster.sh
master
stop-slaves.sh
agent
stop-cluster.sh
stop-agents.sh
log
execute
scp
tail
resolve
ps
init-wrapper
local
cat
I tried this to verify, but no result.
vagrant#master1:~$ sudo netstat -tnlp | grep 5050
I know Mesos is running but I get connection refused.
vagrant#master1:~$ curl http://192.168.2.1:5050
curl: (7) Failed to connect to 192.168.2.1 port 5050: Connection refused
I see you are using Vagrant so go the the browser in the host machine and type <master2_ip>:5050
<master2_ip> - replace it with the IP address of the master2 using ipaddr or ifconfig to find the ip.
If the mesos is up and running you will get the mesos dashboard or else port unreachable error.
Post your vagrantFile here.
This CDH cluster has been install for months, and be used to backup logs.
Today I try to run flink on yarn, and want to open yarn web ui to check flink taskmanagers' state, i find 8088 port connect refuse.
```
This site can’t be reached
47.74.***.*** refused to connect.
Search Google for *** *** 8088
ERR_CONNECTION_REFUSED
```
yarn port & address config as follows:
```
yarn.resourcemanager.address 8032
yarn.resourcemanager.scheduler.address 8030
yarn.resourcemanager.resource-tracker.address 8031
yarn.resourcemanager.admin.address 8033
yarn.resourcemanager.webapp.address 8088
yarn.resourcemanager.webapp.https.address 8090
```
Even curl 'http://ip:8088' on the resource manager host, also get "connection refuse"
```
[root#bigdata-cdh02 ~]# netstat -tunlp|grep 8088
tcp 0 0 172.21.0.20:8088 0.0.0.0:* LISTEN 20606/java
```
BTW, I check yarn logs, it seems that yarn has successfully allocated resources for flink.
I installed spark on a cluster of machines w/o public DNS (just created machines on a cloud).
Hadoop looks to be installed and worked correctly, but Sparks listens on 7077 and 6066 as 127.0.0.1 instead of public ip so worker nodes can't connect to it.
What is wrong?
My /etc/hosts on the master node looks like:
127.0.1.1 namenode namenode
127.0.0.1 localhost
XX.XX.XX.XX namenode-public
YY.YY.YY.YY hadoop-2
ZZ.ZZ.ZZ.ZZ hadoop-1
My $SPARK_HOME/conf/spark-env.sh looks like:
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export SPARK_PUBLIC_DNS=namenode-public
export SPARK_WORKER_CORES=6
export SPARK_LOCAL_IP=XX.XX.XX.XX
sudo netstat -pan|grep 7077
tcp 0 0 127.0.1.1:7077 0.0.0.0:* LISTEN 6670/java
You should specify SPARK_MASTER_HOST in spark-env.sh (it must be the address of your machine that is visible to the slave nodes). Moreover, you may need to add rules for ports 7077 and 6066 in iptables.
I'd really appreciate some help to get cloudera manager running on AWS EC2.
Its my first install, and I'm aiming to use the AWS Free Tier to spin up a few nodes and do some training on Hadoop cluster and the cloudera distribution. I'm using the RedHat RHEL 7.2 image on AWS EC2.
I am following the instructions here... Cloudera Manager installation
I have installed cloudera manager OK, and get to the screen where it invites you to use a browser to log-in to the cloudera manager server. But that's where the problem starts. It seems the app is not listening on port 7180, so there's no hope of connecting from another machine across the network. I can't even connect locally, on the server, yet the service appears to be running OK. But its not listening on port 7180.
Q1 - How can I confirm the config is set to use port 7180.?
Q2 - are there obvious steps that I'm missing here ?
Thanks in advance,
[Edit..]
I'm beginning to wonder if the Free EC2 host is running short on memory to run cloudera manager. I saw one comment that implied that....AWS Forum post . But the process doesn't crash or report any problems in its logfile. So it must be OK, right?
[Edit.... with more diagnostic info....]
Here's a list of the diagnostics I've checked:-
SELinux is not running [for install and testing purposes],
WAN firewalls,
EC2 firewall/Security group,
Local firewall on server,
Cloudera manager log,
Is the service up and running?
Can you connect locally?
Securtity group on the EC2 instance, it contains:-
SSH and Port 7180,
Firewall/iptables/firewalld on the RedHat instance, tried:-
adding ports to iptables, then
dissabling iptables, then
adding ports to firewalld, then
dissabling the firewalld service,
$ sudo iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
ACCEPT tcp -- anywhere anywhere tcp dpt:ssh
ACCEPT tcp -- anywhere anywhere state NEW tcp dpt:7180
ACCEPT tcp -- anywhere anywhere state NEW tcp dpt:7182
But I'm getting the feeling that the installation of cloudera manager is not happy, or not running correctly.
I've checked the cloudera manager log, and it ends with the following.
$ tail /var/log/cloudera-scm-server/cloudera-scm-server.log
2016-02-25 11:02:23,581 INFO main:com.cloudera.cmon.components.MetricSchemaUpdate: persisting 19264 new metrics
2016-02-25 11:02:28,920 INFO main:com.cloudera.cmon.components.MetricSchemaUpdate: persisting 0 updated metrics
2016-02-25 11:02:28,924 INFO main:com.cloudera.cmon.components.MetricSchemaManager: Cross entity aggregates processed.
And when I use tail -f, and restart the cloudera-scm-server service, the log scrolls a lot, and comes back the same state. If I search for ERROR, there are no lines with "ERR".
$ sudo service cloudera-scm-server start
Starting cloudera-scm-server (via systemctl): [ OK ]
$ sudo systemctl status cloudera-scm-server
● cloudera-scm-server.service - LSB: Cloudera SCM Server
Loaded: loaded (/etc/rc.d/init.d/cloudera-scm-server)
Active: active (exited) since Thu 2016-02-25 12:23:03 EST; 44s ago
Docs: man:systemd-sysv-generator(8)
Process: 747 ExecStart=/etc/rc.d/init.d/cloudera-scm-server start (code=exited, status=0/SUCCESS)
So, if I try to test the service, by connecting from the local machine I get the sort of behavious that makes me thing its just not listening, and maybe not started correctly.
Try poke it with a curl from the same shell as the cloudera-scm-server service was started
$ curl localhost:7180
curl: (7) Failed connect to localhost:7180; Connection refused
$ wget localhost:7180
--2016-02-25 08:00:16-- http://localhost:7180/
Resolving localhost (localhost)... ::1, 127.0.0.1
Connecting to localhost (localhost)|::1|:7180... failed: Connection refused.
Connecting to localhost (localhost)|127.0.0.1|:7180... failed: Connection refused.
Try check what ports are listening on that machine, no 7180 , what's up with that???
$ netstat -nltp
(No info could be read for "-p": geteuid()=1000 but you should be root.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:7432 0.0.0.0:* LISTEN -
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN -
tcp6 0 0 :::7432 :::* LISTEN -
tcp6 0 0 :::22 :::* LISTEN -
tcp6 0 0 ::1:25 :::* LISTEN -
Here's what to look for, and a possible solution - give it more memory...
Check the status of the cloudera-scm-server service using [depending on your flavour of linux]
$ sudo service cloudera-scm-server status
OR
$ sudo systemctl status cloudera-scm-server
Look for the status - Active: active (running)
But if you find - Active: active (exited)
you may have a problem during the startup of the cloudera-scm-server.
In which case, look at the log files for cloudera-scm-server
$sudo ls -l /var/log/cloudera-scm-server
$sudo cat /var/log/cloudera-scm-server/cloudera-scm-server.out
JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x000000078dc58000, 265809920, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 265809920 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /tmp/hs_err_pid831.log
[ec2-user#ip-172-31-31-166 ~]$ sudo tail -100 /var/log/cloudera-scm-server/cloudera-scm-server.out
JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x000000078dc58000, 265809920, 0) failed; error='Cannot allocate memory' (errno=12)
Use the command top to indicate how much memory is available to your system.
Possible solution - have a look at this discussion at Cloudera forum
In this case the java heap size was too small.
As we see that heap was exhausted, assuming this is not a memory leak
or something of the sort, Cloudera Manager may need more heap to
operate. This can be configured in:
/etc/default/cloudera-scm-server You could, for instance, change "-Xmx2G" to "-Xmx3G" or "-Xmx4G" If the problem still
happens, perhaps the heap dumps will yeild some clues.
I'd suggest you tail the logs. If you are using the free tier, cloudera manager will take a while to come up... possibly up to 5 minutes or more after you start the cloudera-scm-server.
The logs should show if there are any errors, possibly issues with memory allocation since the free tier servers have limited memory available. The little snippet of log entries looks fine and typical - it will go through a long list of processes before the UI comes up on 7180.
Also while that is going on, run top or even free -g to see how much resources are being used - particularly memory.
I was having the exact same issue, cannot hit the CM login using public DNS or IP on port 7180.
Following steps will help you :
iptables stopped (service iptables stop)
SELinux disabled (got to /etc/selinux/config and disbaled the selinux)
curl/wget localhost:7180 works (check the curl status)
ufw allow 7180
service httpd status should be running.
check va/log/cloudera-scm-server log : if any error found then troubleshoot the error
cloudera-scm-server status (should be running state)
netstat -nap | grep 7180 returns (if running other service then kill it)
telnet localhost 7180 (should be connected)
Cannot connect to Cloudera Manager, not listening on port 7180
1] Check the status:
sudo service cloudera-scm-server status
*cloudera-scm-server.service - LSB: Cloudera SCM Server Loaded: loaded (/etc/rc.d/init.d/cloudera-scm-server; bad; vendor preset: disabled) Active: active (exited) since UTC; 47min ago Docs: man:systemd-sysv-generator(8) rm /var/run/cloudera-scm-server.pid
NOTE : The Cloudera Manager service will not be running as it exited abnormally.
Running service cloudera-scm-server status will print following message "cloudera-scm-server dead but pid file exists".
Reason: Out of memory.
Solution : Examine the heap dump that the Cloudera Manager Server creates when it runs out of memory. The heap dump file is created in the /tmp directory, has file extension .hprof and file permission of 600. Its owner and group will be the owner and group of the Cloudera Manager server process, normally cloudera-scm:cloudera-scm.
Link : http://www.cloudera.com/documentation/manager/5-0-x/Cloudera-Manager-Diagnostics-Guide/cm5dg_troubleshooting_cluster_config.html
Check the status of `cloudera-scm-server` and follow the instructions ahead:
[root#quickstart ~]# `service cloudera-scm-server status`
By default, Cloudera's QuickStart VM manages CDH using Linux's configuration
and service management. To use Cloudera Manager instead, you must shut down
and disable the existing CDH services and then start Cloudera Manager. You can
do this by running the following command:
`sudo /home/cloudera/cloudera-manager`
[root#quickstart ~]# `sudo /home/cloudera/cloudera-manager `
`[QuickStart] Shutting down CDH services via init scripts...
JMX enabled by default
Using config: /etc/zookeeper/conf/zoo.cfg
[QuickStart] Disabling CDH services on boot...
[QuickStart] Starting Cloudera Manager services...
[QuickStart] Deploying client configuration...
[QuickStart] Starting CM Management services...
[QuickStart] Enabling CM services on boot...
[QuickStart] Starting CDH services...`
________________________________________________________________________________
Success! You can now log into Cloudera Manager from the QuickStart VM's browser:
http://quickstart.cloudera:7180
Username: cloudera
Password: cloudera
I have my YARN resource manager on a different node than my namenode, and I can see that something is running, which I take to be the resource manager. Ports 8031 and 8030 are bound, but not port 8032, to which my client tries to connect.
I am on CDH 5.3.1, and the following is part of the output of lsof -i
java 12478 yarn 230u IPv4 61325 0t0 TCP hadoop2.adastragrp.com:48797->hadoop2.adastragrp.com:8031 (ESTABLISHED)
java 13753 yarn 159u IPv4 61302 0t0 TCP hadoop2.adastragrp.com:8031 (LISTEN)
java 13753 yarn 170u IPv4 61308 0t0 TCP hadoop2.adastragrp.com:8030 (LISTEN)
java 13753 yarn 191u IPv4 61326 0t0 TCP hadoop2.adastragrp.com:8031->hadoop2.adastragrp.com:48797 (ESTABLISHED)
How do I diagnose what's wrong here? I suspect that the resource manager is running, but can't bind to port 8032, but I have no idea why that could be.
In the cloudera manager, the ResourceManager is shown as having good health, but at the same time I get this report:
ResourceManager summary: hadoop2.adastragrp.com (Availability:
Unknown, Health: Good). This health test is bad because the Service
Monitor did not find an active ResourceManager.
[Edit]
I can execute yarn application -list locally on the resource manager node, but when I do the same on a different node, it tries to connect to the resource manager correctly, but fails to do so. Both nodes are connected, can ping each other, and so on. I disabled the iptables service on the VM.
nmap output:
PORT STATE SERVICE REASON
8032/tcp filtered unknown host-prohibited
Wether the port was occupied by other process? For example, you stop your hadoop cluster abnormally, result in some process still running. If so, try to ps -e|grep java,and kill it.
Gotcha, on CentOS 6 stopping the iptables service didn't really disable the firewall. I had to disable it with system-config-firewall.