How to attach a volume to a Docker container executor in Hadoop

I am setting up a Hadoop testbed that has two nodes / servers (A, B). Server B runs the Docker daemon and other Hadoop-related services such as the DataNode, Secondary NameNode, and NodeManager, while server A runs the ResourceManager and NameNode. When a container is spawned / launched on server B using DCE (Docker Container Executor), I want to attach a volume to it.
Can somebody suggest how I could do this in the DCE environment?

As I see it, there are two ways to add a volume to a Docker container:
1) Specify the volume in the Dockerfile
FROM ubuntu
RUN mkdir /myvol
RUN echo "hello world" > /myvol/greeting
VOLUME /myvol
2) Specify the volume at run time (note the -v flag and the host:container path syntax)
docker run -it -v /hdfs/foldername:/hdfs/foldername dockerrepositoryname:version /bin/bash
For more details, refer to https://docs.docker.com/engine/reference/builder/
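For instance, a minimal check of both approaches might look like this (the image tag and the host path below are only illustrative):
# build the image from the Dockerfile above and read the file baked into the volume
docker build -t myvolimage .
docker run --rm myvolimage cat /myvol/greeting
# bind-mount a host directory at run time and list it from inside the container
docker run --rm -v /hdfs/foldername:/hdfs/foldername ubuntu ls -la /hdfs/foldername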
Hope this helps!

Related

Write to HDFS running in Docker from another Docker container running Spark

I have a Docker image for Spark + Jupyter (https://github.com/zipfian/spark-install).
I have another Docker image for Hadoop (https://github.com/kiwenlau/hadoop-cluster-docker).
I am running two containers from the above two images on Ubuntu.
For the first container:
I am able to successfully launch Jupyter and run Python code:
import pyspark
sc = pyspark.SparkContext('local[*]')
rdd = sc.parallelize(range(1000))
rdd.takeSample(False,5)
For the second container:
In the host Ubuntu OS, I can successfully open localhost:8088 in a web browser and see all the Hadoop applications, and localhost:50070 to browse the HDFS file system.
Now I want to write to the HDFS file system (running in the 2nd container) from jupyter (running in the first container).
So I add the additional line:
rdd.saveAsTextFile("hdfs:///user/root/input/test")
I get the error:
HDFS URI, no host: hdfs:///user/root/input/test
Am I giving the HDFS path incorrectly?
My understanding is that I should be able to talk to a Docker container running HDFS from another container running Spark. Am I missing anything?
Thanks for your time.
I haven't tried docker compose yet.
The URI hdfs:///user/root/input/test is missing an authority (hostname) section and a port. To write to HDFS in another container, you need to fully specify the URI, make sure the two containers are on the same network, and make sure the HDFS container exposes the NameNode and DataNode ports.
For example, you might have set the host name for the HDFS container to hdfs.container. Then you can write to that HDFS instance using the URI hdfs://hdfs.container:8020/user/root/input/test (assuming the NameNode is listening on 8020). You will also need to make sure that the path you are writing to has the correct permissions.
So to do what you want:
Make sure your HDFS container has the NameNode and DataNode ports exposed. You can do this with an EXPOSE directive in the Dockerfile (the container you linked does not have these) or with the --expose argument when invoking docker run. The default ports are 8020 and 50010 (for the NameNode and DataNode respectively).
Start the containers on the same network. If you just do docker run with no --network they will start on the default network and you'll be fine. Start the HDFS container with a specific name using the --name argument.
Now modify your URI to include the proper authority (this will be the value of the --name argument you passed) and the port as described above, and it should work.
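Roughly, the whole setup might look like this (the image names and the hadoop-net network name are placeholders, not the actual images from the question):
# user-defined network so the containers can resolve each other by name
docker network create hadoop-net
# HDFS container: fixed name, NameNode and DataNode ports exposed
docker run -d --name hdfs.container --network hadoop-net \
  --expose 8020 --expose 50010 your-hadoop-image
# Spark + Jupyter container on the same network
docker run -d --name spark-jupyter --network hadoop-net your-spark-jupyter-image
# inside the Spark container, write with the full authority and port:
#   rdd.saveAsTextFile("hdfs://hdfs.container:8020/user/root/input/test")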

Which node is running Cloudera Manager out of N hadoop nodes?

I have a large Hadoop cluster (24 nodes) and CLI access to these nodes. The first few nodes I checked are not running Cloudera Manager (cloudera-scm-server).
How can I find out which node is running Cloudera Manager?
Any help is appreciated.
Cloudera Manager has two kinds of services: one Server and the Agents.
Since you have CLI access to all the nodes, run the command below on each node to find which one is the server (the server runs on only one machine):
sudo service cloudera-scm-server status
Another simple way to find the CDH server address:
ssh to any node and go to /etc/cloudera-scm-agent. There you will find a config.ini file, and in it the server_host address.
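If you would rather not log into each node by hand, a rough sketch (assuming a nodes.txt file with one hostname per line, and that ssh and sudo work non-interactively or with -t):
# read the server address straight from any agent's config
grep '^server_host' /etc/cloudera-scm-agent/config.ini
# or poll every node for a running cloudera-scm-server
while read -r host; do
  echo "== $host =="
  ssh -t "$host" 'sudo service cloudera-scm-server status'
done < nodes.txt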

Mesos slave node unable to restart

I've set up a Mesos cluster using the CloudFormation templates from Mesosphere. Things worked fine after the cluster launch.
I recently noticed that none of the slave nodes are listed in the Mesos dashboard. The EC2 console shows the slaves are running and passing health checks. I restarted the nodes in the cluster, but that didn't help.
I ssh'ed into one of the slaves and noticed the mesos-slave service is not running. I executed sudo systemctl status dcos-mesos-slave.service, but that couldn't start the service.
I looked in /var/log/mesos/, ran tail -f mesos-slave.xxx.invalid-user.log.ERROR.20151127-051324.31267, and saw the following:
F1127 05:13:24.242182 31270 slave.cpp:4079] CHECK_SOME(state::checkpoint(path, bootId.get())): Failed to create temporary file: No space left on device
But the output of df -h and free shows there is plenty of disk space left.
Which leads me to wonder, why is it complaining about no disk space?
OK, I figured it out.
When running Mesos for a long time or under frequent load, the /tmp folder runs out of space, because Mesos uses /tmp/mesos/ as its work_dir. The filesystem can only hold a certain number of file references (inodes), even when free blocks remain. In my case, the slaves were collecting a large number of file chunks from image pulls in /var/lib/docker/tmp.
To resolve this issue:
1) Remove files under /tmp
2) Set a different work_dir location
It is also good practice to run
docker rmi -f $(docker images | grep "<none>" | awk '{print $3}')
This frees up space by deleting unused (dangling) Docker images.
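To confirm it is inodes (rather than blocks) that are exhausted and to move the work_dir, a rough sketch (the work_dir path and ZooKeeper address are placeholders; adapt them to how your agents are started):
# df -h reports blocks; -i reports inode usage, which is what runs out here
df -i /tmp
# clear the leftover Mesos work files (only while the agent is stopped)
sudo rm -rf /tmp/mesos/*
# or point the agent at a roomier work_dir when starting it
sudo mesos-slave --master=zk://<zookeeper-host>:2181/mesos --work_dir=/var/lib/mesos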

Restart hive service on AWS EMR

I am very new to Hive as well as AWS EMR. As per my requirement, I need to create the Hive metastore outside the cluster (moving it from AWS EMR to AWS RDS).
I followed the instructions given in
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-dev-create-metastore-outside.html
I made changes in hive-site.xml and was able to point the Hive metastore at an Amazon RDS MySQL server. To bring the changes into effect, I am currently rebooting the complete cluster so that Hive starts storing the metastore in AWS RDS. That way it works.
But I want to avoid rebooting the cluster. Is there any way I can restart just the service?
Just for those who come here from Google:
To restart any EMR service
In order to restart a service in EMR, perform the following actions:
Find the name of the service by running the following command:
initctl list
For example, the YARN Resource Manager service is named “hadoop-yarn-resourcemanager”.
Stop the service by running the following command:
sudo stop hadoop-yarn-resourcemanager
Wait a few seconds, then start the service by running the following command:
sudo start hadoop-yarn-resourcemanager
Note: Stop/start is required; do not use the restart command.
Verify that the process is running by running the following command:
sudo status hadoop-yarn-resourcemanager
Check for the process using ps, and then check the log file for any errors in the log directory /var/log/.
Source: https://aws.amazon.com/premiumsupport/knowledge-center/restart-service-emr/
sudo stop hive-metastore
sudo start hive-metastore
On EMR 5.x I have found this to work:
hive --service metastore --stop
hive --service metastore --start
For me this approach worked:
1) Get the PID
2) Kill the process
3) The process restarts by itself
Commands for 1 & 2:
ps aux | grep MetaStore
sudo -u hive kill <pid from above>
If you are not familiar with ps, you can use the following command, which shows the header row (with PID) and only the Hive metastore line:
ps aux | egrep "MetaStore|PID" | grep -v grep
The Hive metastore restarts by itself. Validate again with ps; the PID will have changed.
ps aux | grep MetaStore
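Putting those steps together, something like this should work (column 2 of ps aux is the PID; the service supervisor is expected to respawn the metastore with the new configuration):
# grab the metastore PID and send it SIGTERM
PID=$(ps aux | grep MetaStore | grep -v grep | awk '{print $2}')
sudo -u hive kill "$PID"
# confirm a fresh process (new PID) has come back up
ps aux | grep MetaStore | grep -v grep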
You don't have to restart the entire cluster. While launching the cluster, you can specify a hive-site.xml file with the RDS details. If you are not following that option and are making the changes manually after launching the cluster, you still don't need to restart the entire cluster; just restart the hive-metastore service alone. The Hive metastore runs on the master node only.
You can launch the cluster in multiple ways:
1) AWS console
2) Using API (Java, Python etc)
3) Using AWS cli
You can keep the hive-site.xml in S3 and apply it as a bootstrap step while launching the cluster. The AWS API provides an option to specify a custom hive-site.xml from S3 rather than using the one created by default.
If you are using Hive from the master machine alone, you don't have to make the changes on all the machines.
An example of specifying the hive-site.xml while launching EMR using the AWS CLI is given below:
aws emr create-cluster --name "Test cluster" --ami-version 3.3 --applications Name=Hue Name=Hive Name=Pig \
--use-default-roles --ec2-attributes KeyName=myKey \
--instance-type m3.xlarge --instance-count 3 \
--bootstrap-actions Name="Install Hive Site Configuration",Path="s3://elasticmapreduce/libs/hive/hive-script",\
Args=["--base-path","s3://elasticmapreduce/libs/hive","--install-hive-site","--hive-site=s3://mybucket/hive-site.xml","--hive-versions","latest"]
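The bootstrap action above assumes the customized hive-site.xml is already in S3; uploading it would look something like this (the bucket name is the same placeholder used in the example):
aws s3 cp ./hive-site.xml s3://mybucket/hive-site.xml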

Unable to start Mesos slave on single node cluster

From what I know, I am able to set up Mesos master, slave, ZooKeeper, and Marathon on a single node.
But once I execute the command to start mesos-master, I don't have any way to continue executing other commands elsewhere while I also try to start mesos-slave. I have to stop the running process and run the next command, but then the problem is that mesos-master has already stopped running.
Don't execute the commands directly from your shell; you want to start all of those components (ZooKeeper, mesos-master, mesos-slave, and Marathon) as services.
/etc/init.d/zookeeper start
start mesos-master
start mesos-slave
start marathon
I forget whether ZooKeeper creates the init script as part of the install for you or not; you may have to find it in the Hadoop docs.
As for the other three, they all use upstart, and you can find their configuration files in /etc/init/.
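Assuming the packages register upstart jobs under these names, you can then confirm everything came up with:
sudo status mesos-master
sudo status mesos-slave
sudo status marathon
# if one of them keeps exiting, check its logs, e.g. under /var/log/mesos/
ls /var/log/mesos/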
