Job tracking URL in Google Compute engine not working - hadoop

I am using Google Compute Engine to run Mapreduce jobs on Hadoop (pretty much all default configs). While running the job I get a tracking URL of the form http://PROJECT_NAME:8088/proxy/application_X_Y/ but it fails to open. Did I forget to configure something?

To elaborate on the option Amal mentioned in the other answer of using the "external ip address" of your Google Compute Engine VM, you can obtain the external IP address by running gcloud compute instances describe --zone <your zone> <your master hostname> and looking for natIP.
To open port 8088, you'll have to set up a firewall rule opening that port, likely on your default Google Compute Engine network. You'll want to specify a your.ip.address.here/32 address in the --source-ranges to restrict incoming traffic to just your local machine dialing into your VM, otherwise the anyone in the IP source-ranges would be able to access your Hadoop pages.
If you had used bdutil to turn up your cluster, there's an alternative way which is much easier and more secure; simply run
bdutil <your flags used in deployment, like -e hadoop2, --prefix, etc.> socksproxy
to open SSH with dynamic port forwarding to use as a SOCKS5 proxy that your browser can point to. If you're running on Linux or Mac and have Chrome or Firefox installed, bdutil should also print out a copy/paste command for starting a fresh isolated browser pre-configured to use the socks proxy so that you can click through all the useful links.
If bdutil didn't print out a browser command or you didn't use bdutil, you can also run and configure your SSH socks proxy using these instructions. An SSH-based socks proxy is more secure than opening up firewall ports, and also allows the Hadoop page links to work (otherwise you have to keep manually replacing the hostnames with the external IP addresses).

One correction. You are using YARN. So there is no jobtracker. Jobtracker is present in hadoop 1.x. In YARN, the processing layer became a generic framework and the jobtracker got replaced with Resource manager and application master. The UI that you mentioned in the question was of Resource Manager.
For your problem, try the following tips.
Use the public ip address of the resource manager instance instead of PROJECT_NAME.
Check whether the 8088 port is opened for accessing it from outside.

Another (more secure) way to do this is to use gcloud compute to make an ssh tunnel to your deployment, and then launch Chrome though it.
$ gcloud compute ssh clustername --zone=us-central1-a --ssh-flag="-D 1080" --ssh-flag="-N" --ssh-flag="-n"
You will need to replace clustername with the name of your deployment, and change the --zone if necessary.
From there, you can launch Chrome through it and then reach the hadoop job tracking URL.
$ chrome --proxy-server="socks5://localhost:1080" \
--host-resolver-rules="MAP * 0.0.0.0 , \
EXCLUDE localhost" --user-data-dir=/tmp/clustername

Related

Secured NiFi won´t comunicate with NiFi Registry

I have standalone secured NiFi 1.12.1 in Docker running all fine. I am sucessfully using Site-To-Site remote processors, Site-To-Site forwarding of Nifi bulletins, calling NiFi API for self-monitoring and such things. I log in through certificate. So far all fine.
Problem crops up when I try to use NiFi Registry. I have access to two instances: secure and insecure.
No matter what exact format I specify (FQDN, just a name, with /nifi-registry or without), when I try to access (e.g. though importing a process group) the either NiFi Registry from NiFi, it fails with o.a.n.w.a.config.NiFiCoreExceptionMapper org.apache.nifi.web.NiFiCoreException: Unable to obtain listing of buckets: java.net.ConnectException: Connection refused (Connection refused). Returning Conflict response.. In logs is just this message with enormous stack-trace and nothing more.
I checked all certificates and they seem OK (certification path, certificate is for clientAuth as well as for serverAuth). I even use them to log into NiFi myself...
What surprises me the most is the fact, that it works for things like Site-To-Site protocols, API calls and such, but not for NiFi Registry.
Please don´t you know what might be a problem? Or any ideas what to check?
TL; DR:
Use IP addresses or edit /etc/hosts. Problem is in translation of hostname to IP address.
When I attempted to access the API of NiFi Registry directly from NiFi through InvokeHTTP, I noticed an important thing - nothing in different container responded to me (failed to connect to target):
#Safe NiFi - the one I am troubleshooting
https://<my FQDN>:8443/nifi-api/flow/registries
#Safe NiFi Registry (another container) - the one I am trying connect to
https://<my FQDN>:18443/nifi-registry-api/buckets
#Unsafe NiFi (another container) - just for test
http://<my FQDN>:28080/nifi-api/flow/registries
#Unsafe NiFi Registry (another container) - just for test
http://<my FQDN>:38080/nifi-registry-api/buckets
Then it dawned on me: To solve problem with Site-To-Site connections (discrepancy in name of container vs HTTPS certificate issued for hosting machine) I gave the container the same name as the hosting Docker machine. To verify I used IP addresses instead of FQDNs and it worked. Also checking /etc/hosts confirmed this - the FQDN pointed to IP address of container instead of Docker host.
Thus, one given FQDN was in container resolved as localhost and as Docker host everywhere else. And since on localhost on the NiFi Registry port(s) nothing listen ...
So as a solution either mangle the /etc/hosts to remove the offending line, or use IP addresses to force traversing through the Docker host.

Access hadoop nodes web UI from multiple links

i am using the following setup for hadoop's nodes web ui access :
dfs.namenode.http-address : 127.0.0.1:50070
By which i am able to access the nodes web ui link only form the local machine as :
http://127.0.0.1:50070
Is there any way by which i can make it accessible from outside as well ? say like :
http://<Machine-IP>:50070
Thanks in Advance !!
You can use hostname or ipaddress instead of localhost/127.0.0.1.
Make sure you can ping the hostname or ip from the remote machine. If you can ping it then you can able to access web ui.
To ping it
Open cmd/terminal
type the below command in remote machines
ping hostname/ip
From http://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-web-interfaces.html
The following table lists web interfaces that you can view on the core
and task nodes. These Hadoop interfaces are available on all clusters.
To access the following interfaces, replace slave-public-dns-name in
the URI with the public DNS name of the node. For more information
about retrieving the public DNS name of a core or task node instance,
see Connecting to Your Linux/Unix Instances Using SSH in the Amazon
EC2 User Guide for Linux Instances. In addition to retrieving the
public DNS name of the core or task node, you must also edit the
ElasticMapReduce-slave security group to allow SSH access over TCP
port 22. For more information about modifying security group rules,
see Adding Rules to a Security Group in the Amazon EC2 User Guide for
Linux Instances.
YARN ResourceManager
YARN NodeManager
Hadoop HDFS NameNode
Hadoop HDFS DataNode
Spark HistoryServer
Because there are several application-specific interfaces available on
the master node that are not available on the core and task nodes, the
instructions in this document are specific to the Amazon EMR master
node. Accessing the web interfaces on the core and task nodes can be
done in the same manner as you would access the web interfaces on the
master node.
There are several ways you can access the web interfaces on the master
node. The easiest and quickest method is to use SSH to connect to the
master node and use the text-based browser, Lynx, to view the web
sites in your SSH client. However, Lynx is a text-based browser with a
limited user interface that cannot display graphics. The following
example shows how to open the Hadoop ResourceManager interface using
Lynx (Lynx URLs are also provided when you log into the master node
using SSH).
Copy lynx http://ip-###-##-##-###.us-west-2.compute.internal:8088/
There are two remaining options for accessing web interfaces on the
master node that provide full browser functionality. Choose one of the
following:
Option 1 (recommended for more technical users): Use an SSH client to connect to the master node, configure SSH tunneling with local port
forwarding, and use an Internet browser to open web interfaces hosted
on the master node. This method allows you to configure web interface
access without using a SOCKS proxy.
to do this use the command
$ ssh -gnNT -L 9002:localhost:8088 user#example.com
where user#example.com is your username. Note the use of -g to open access to external ip addresses (beware this is a security risk)
you can check this is running using
nmap localhost
to close this ssh tunnel when done use
ps aux | grep 9002
to find the pid of your running ssh process and kill it.
Option 2 (recommended for new users): Use an SSH client to connect to the master node, configure SSH tunneling with dynamic port
forwarding, and configure your Internet browser to use an add-on such
as FoxyProxy or SwitchySharp to manage your SOCKS proxy settings. This
method allows you to automatically filter URLs based on text patterns
and to limit the proxy settings to domains that match the form of the
master node's DNS name. The browser add-on automatically handles
turning the proxy on and off when you switch between viewing websites
hosted on the master node, and those on the Internet. For more
information about how to configure FoxyProxy for Firefox and Google
Chrome, see Option 2, Part 2: Configure Proxy Settings to View
Websites Hosted on the Master Node.
This seems like insanity to me but I have been unable to find how to configure access in core-site.xml to override the web interface for the ResourceManager which by default it is available at localhost:8088/ and if Amazon think this is the way then I tend to go along with it

Mesosphere not allowing External Traffic

I spun up a Mesosphere cluster on Digital Ocean (development) and it's not allowing me to allow external (non vpn) connections to containers or apps. How can this be solved ?
To ensure that the world doesn't have access to your cluster normally, there have been iptables rules installed. By default, these allow full access inside the cluster and nothing externally.
If you're interested in running real applications, I'd recommend the following:
Put HAProxy on a single node.
Setup the haproxy-marathon-bridge script.
On the same box that you installed HAProxy on, setup iptables to allow access to the port that HAProxy is listening on.
By doing this, you'll have a single place to refer to when giving access to applications running on your Mesos cluster. No matter where the app or container is scheduled (with marathon), you'll always be able to reach it via. haproxy.

How to restart single node hadoop cluster on ec2

I have installed a single node haodoop cluster on using Hortonworks/Ambari on Amazon's ec2 host.
Since I don't want this cluster running 24/7, I stop the instance when done. When I reboot the instance later, I get a new IP address and then ambari no longer is able to start the Hadoop related services.
Is there a way other than completely redeploying to reconfigure the cluster so the services will start?
It looks like the IP address lives in various xml files under /etc, in the postgres database table ambari, and possibly other places I haven't found yet.
I tried updating the xml files and postgres database with updated versions of the ip address, internal and external dns names as I could find them, but to no avail. I have not been able to restart the services.
The reason I am doing this is to possibly save the deployment time and data configuration on hdfs and other project specific setup each time I restart the host.
Any suggestions?
Thanks!
Elastic IP can be used. Also, since you mentioned it being a single node cluster - you can use localhost or private IP.
If you use elastic IP, your UIs will always be on the same public IP. However, if you use private IP or localhost and do not associate your instance with an elastic IP you will have to look for public IP everytime you start the instance and then connect to the web UI using the IP.
Thanks for the help, both Harman and TJ are correct. I haven't used an elastic IP because I might have more than one of these running and a time, and for now at least, I don't mind looking up the public ip address.
Harman's suggestion of using "localhost" as the fqdn when setting up ambari in the first place is a really good idea in retrospect. Unless I go through the whole setup again, that's water under the bridge for me, but I recommend this to others who might read this post.
In my case, I figured this out on my own before coming back to the page. The specific step I took was insanely simple after all, thanks to Occam's Razor.
I added the following line in /etc/hosts:
<new internal IP> <old internal dns name>
and then did
ambari-server restart. from the command line. Then I am able to restart all services after logging into ambari.

Use spark-submit to submit a application to EC2 cluster

I am new to Spark and I am trying to run it on EC2. I follow the tutorial on spark webpage by using spark-ec2 to launch a Spark cluster. Then, I try to use spark-submit to submit the application to the cluster. The command looks like this:
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://ec2-54-88-9-74.compute-1.amazonaws.com:7077 --executor-memory 2G --total-executor-cores 1 ./examples/target/scala-2.10/spark-examples_2.10-1.0.0.jar 100
However, I got the following error:
ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
Please let me know how to fix it. Thanks.
You're seeing this issue because the master node of your spark-standalone cluster cant open a TCP connection back to the drive (on your machine). The default mode of spark-submit is client which runs the driver on the machine that submitted it.
A new cluster mode was added to spark-deploy that submits the job to the master where it is then run on a client, removing the need for a direct connection. Unfortunately this mode is not supported in standalone mode.
You can vote for the JIRA issue here: https://issues.apache.org/jira/browse/SPARK-2260
Tunneling your connection via SSH is possible but latency would be a big issue since the driver would be running locally on your machine.
I'm curious if you still having this issue ... But in case anyone is asking here is a brief answer. As clarified by jhappoldt, the master node of your spark-standalone cluster cant open a TCP connection back to the drive (on your local machine). Two workarounds are possible, tested and succeeded.
(1) From EC2 Management Console, create a new security group and add rules to enable TCP back and forth from your PC (public IP). (what I did was adding TCP rules inbound and outbound) ... Then add this security group to your master instance. (right click --> Networking --> Change security groups). Note: add it and don't remove the already established security groups.
This solution work well, but in your specific scenario, deploying your application from local machine to EC2 cluster, you will face further problems (resource related) so the next option is the best one
(2) Having your .jar file (or .egg) copy it to the master node using scp. You can check this link http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AccessingInstancesLinux.html for information about how to do that; and deploy your application from the master node. Note: spark is already pre-insalled so you will do nothing but write the same exact command you write on your local machine from ~/spark/bin. This shall work perfect.
Are you executing the command on your local machine, or on the created EC2 node? If you're doing it locally, make sure port 7077 is open in the security settings, as its closed to the outside by default.

Resources