How to restart HDFS on Amazon EMR


I have changed some HDFS settings on an Amazon EMR cluster. I want to restart the namenode and the datanodes for the changes to take effect, but I cannot find any start or stop scripts on either the namenode (master) or the datanodes. How should I restart the cluster?

On EMR 4.x, run the following on the master host:
sudo /sbin/restart hadoop-hdfs-namenode
ssh -i <key.pem> <slave-hostname1> "sudo /sbin/restart hadoop-hdfs-datanode"
ssh -i <key.pem> <slave-hostname2> "sudo /sbin/restart hadoop-hdfs-datanode"
ssh -i <key.pem> <slave-hostname3> "sudo /sbin/restart hadoop-hdfs-datanode"

You have to restart the services on each node yourself, either by hand or with a simple shell script:
1) Get the list of hostnames or IP addresses of all the nodes.
2) SSH into each node using the key.
3) Restart the required service.
If you are comfortable with programming, you can write a general utility that takes the cluster ID, looks up the IP addresses of all the nodes in that EMR cluster, and restarts the service on each node.
Otherwise, collect the hostnames or IP addresses of all the nodes manually, then create a script like the one below and run it from the master node:
sudo service hadoop-hdfs-namenode restart
ssh -i <key.pem> <hostname1> "sudo service hadoop-hdfs-datanode restart"
ssh -i <key.pem> <hostname2> "sudo service hadoop-hdfs-datanode restart"
ssh -i <key.pem> <hostname3> "sudo service hadoop-hdfs-datanode restart"
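The per-host lines above can also be written as a loop over a host list. This is only a minimal sketch: it assumes a hypothetical text file with one datanode hostname per line, and SSH_BIN is an illustrative override (not an EMR feature) that lets you substitute another command for ssh while testing.

```shell
#!/bin/bash
# Sketch: restart the DataNode service on every host listed in a file.
# Assumptions: one hostname per line; blank lines and lines starting with
# '#' are skipped. SSH_BIN is a hypothetical override for the ssh command.
restart_datanodes() (
    key_file=$1
    hosts_file=$2
    ssh_bin=${SSH_BIN:-ssh}
    failed=0
    while IFS= read -r host; do
        case $host in ''|'#'*) continue ;; esac
        # stdin comes from /dev/null so ssh cannot consume the host list.
        if ! "$ssh_bin" -i "$key_file" "$host" \
            "sudo service hadoop-hdfs-datanode restart" </dev/null; then
            echo "restart failed on $host" >&2
            failed=1
        fi
    done < "$hosts_file"
    # Non-zero status if any host failed.
    exit "$failed"
)
```

It would be called from the master node as, for example, restart_datanodes key.pem hosts.txt, after restarting the namenode locally. The loop keeps going when one host fails, so a single unreachable node does not block the rest.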

On EMR 5.x this is what I used:
Copy the PEM file to your head node and set these values:
CLUSTER_ID="j-XXXXXXXXXXX"
IDENT="cluster.pem"
Run this:
nodes=$(aws emr list-instances \
--cluster-id "$CLUSTER_ID" \
--instance-group-types CORE \
--instance-states RUNNING \
--output text \
--query "Instances[*].PublicDnsName" )
for node in $nodes; do
ssh -i $IDENT \
-o StrictHostKeyChecking=no \
-o UserKnownHostsFile=/dev/null \
$node "sudo stop hadoop-hdfs-datanode; sudo start hadoop-hdfs-datanode"
done

Related

How to use bash commands alongside Docker restart policies?

In a ROS project, I have the following bash script that I use to run a docker container:
#!/bin/bash
source ~/catkin_ws/devel/setup.bash
rosnode kill some_ros_node
roslaunch supporting_ros_package launch_file.launch &
docker run -it \
--restart=always \
--privileged \
--net=host \
my_image:latest \
/bin/bash -c\
"
roslaunch my_package my_launch_file.launch
"
export containerId=$(docker ps -l -q)
However, what I'd like to happen is, for every time the container restarts (especially as the machine is booted up), the bash commands preceding the docker run command to also re-run on the host machine (not within the container).
How might I achieve this?

There are a few ways I can think of doing this:
Add this script to a system service.
Add this script to another container that is also set to restart always, and mount the Docker socket into that other container.
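For the system-service option, one possible shape is a small supervisor script that the service runs. The sketch below is an assumption on my part, not something from the question: it drops --restart=always and -it, runs the container in the foreground, and loops, so the host-side commands run again before every container start. The function names, MAX_RUNS, and RESTART_DELAY are hypothetical.

```shell
#!/bin/bash
# Sketch: re-run host-side setup before every container start.
# The container runs in the foreground and this loop restarts it,
# taking over the job of --restart=always.

host_setup() {
    # Host-side commands from the question would go here, for example:
    #   source ~/catkin_ws/devel/setup.bash
    #   rosnode kill some_ros_node
    #   roslaunch supporting_ros_package launch_file.launch &
    :
}

start_container() {
    # No -it and no --restart: the call blocks until the container exits.
    docker run --rm --privileged --net=host my_image:latest \
        /bin/bash -c "roslaunch my_package my_launch_file.launch"
}

supervise() {
    # MAX_RUNS is a hypothetical knob for testing; 0 means loop forever.
    max_runs=${MAX_RUNS:-0}
    runs=0
    while :; do
        host_setup
        start_container
        runs=$((runs + 1))
        if [ "$max_runs" -gt 0 ] && [ "$runs" -ge "$max_runs" ]; then
            break
        fi
        # Short pause so a crashing container does not spin the loop.
        sleep "${RESTART_DELAY:-2}"
    done
}
```

A system service started at boot would then call supervise, which covers both the boot case and later container exits.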

translate my containers starter file to docker-compose.yml

I am new to the big data domain, and this is my first time using Docker. I just found this amazing project: https://kiwenlau.com/2016/06/26/hadoop-cluster-docker-update-english/ which creates a Hadoop cluster composed of one master and two slaves using Docker.
After completing the installation, I ran the containers and they worked fine. There is a start-containers.sh file that lets me launch the cluster. I decided to install some tools such as Sqoop to import my local relational database into HBase, and that worked fine. After that, I stopped all Docker containers on my PC by typing
docker stop $(docker ps -a -q)
The next day, when I tried to relaunch the containers by running the same script ./start-container.sh, I got this error:
start hadoop-master container...
start hadoop-slave1 container...
start hadoop-slave2 container...
Error response from daemon: Container
e942e424a3b166452c9d2ea1925197d660014322416c869dc4a982fdae1fb0ad is
not running
Even when I launch this daemon, the containers of my cluster cannot connect to each other, and I cannot access the data stored in HBase.
First, can anyone tell me why this daemon doesn't work?
PS: in the start-container.sh file there is a line that removes the containers, if they exist, before creating them. I deleted this line because otherwise I have to redo everything from the beginning every time.
After searching, I found that it is preferable to use Docker Compose, which lets me launch all the containers together.
But I can't find out how to translate my start-container.sh file into a docker-compose.yml file. Is this the best way to launch all my containers at the same time? This is the content of the start-containers.sh file:
#!/bin/bash
sudo docker network create --driver=bridge hadoop
# the default node number is 3
N=${1:-3}
# start hadoop master container
#sudo docker rm -f hadoop-master &> /dev/null
echo "start hadoop-master container..."
sudo docker run -itd \
--net=hadoop \
-p 50070:50070 \
-p 8088:8088 \
-p 7077:7077 \
-p 16010:16010 \
--name hadoop-master \
--hostname hadoop-master \
spark-hadoop:latest &> /dev/null
# sudo docker run -itd \
# --net=hadoop \
# -p 5432:5432 \
# --name postgres \
# --hostname hadoop-master \
# -e POSTGRES_PASSWORD=0000
# --volume /media/mobelite/0e5603b2-b1ad-4662-9869-8d0873b65f80/postgresDB/postgresql/10/main:/var/lib/postgresql/data \
# sameersbn/postgresql:10-2 &> /dev/null
# start hadoop slave container
i=1
while [ $i -lt $N ]
do
# sudo docker rm -f hadoop-slave$i &> /dev/null
echo "start hadoop-slave$i container..."
port=$(( 8040 + $i ))
sudo docker run -itd \
-p $port:8042 \
--net=hadoop \
--name hadoop-slave$i \
--hostname hadoop-slave$i \
spark-hadoop:latest &> /dev/null
i=$(( $i + 1 ))
done
# get into hadoop master container
sudo docker exec -it hadoop-master bash

Problems with restarting containers
I am not sure if I understood the mentioned problems with restarting containers correctly. Thus in the following, I try to concentrate on potential issues I can see from the script and error messages:
When starting containers without --rm, they will remain in place after being stopped. If one then tries to run a container with the same port mappings or the same name (both are the case here!), that fails because the container already exists. Effectively, no container will be started in the process. To solve this problem, one should either re-create the containers every time (and store all important state outside of the containers) or detect an existing container and start it if it exists. With names, it can be as easy as doing:
if ! docker start hadoop-master; then
docker run -itd \
--net=hadoop \
-p 50070:50070 \
-p 8088:8088 \
-p 7077:7077 \
-p 16010:16010 \
--name hadoop-master \
--hostname hadoop-master \
spark-hadoop:latest &> /dev/null
fi
and similarly for the other entries. Note that I do not understand why one would
use the combination -itd (interactive, allocate a TTY, but go to the background) for
a service container like this. I'd recommend going with just -d here.
Other general scripting advice: Prefer bash -e (causes the script to stop on unhandled errors).
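To apply the start-or-create pattern above to the master and every slave without repeating it, it could be wrapped in a function. This is a rough sketch; start_or_run is a hypothetical helper name, and it uses -d alone rather than -itd, as suggested above.

```shell
#!/bin/bash
# Sketch: start an existing container by name, or create it if the start
# fails (for example because no container with that name exists yet).
# Usage: start_or_run NAME [docker run options...] IMAGE
start_or_run() {
    name=$1
    shift
    if docker start "$name" >/dev/null 2>&1; then
        echo "started existing container $name"
    else
        # The container name doubles as the hostname, as in the script.
        docker run -d --name "$name" --hostname "$name" "$@" >/dev/null &&
            echo "created container $name"
    fi
}
```

In the start script it might then be used as start_or_run hadoop-master --net=hadoop -p 50070:50070 -p 8088:8088 -p 7077:7077 -p 16010:16010 spark-hadoop:latest, and inside the loop as start_or_run hadoop-slave$i --net=hadoop -p $port:8042 spark-hadoop:latest.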
Docker-Compose vs. Startup Scripts
The question contains some doubt whether docker-compose should be the way to go or if a startup script should be preferred. From my point of view, the most important differences are these:
Scripts are good on flexibility: Whenever there are things that need to be detected from the environment which go beyond environment variables, scripts provide the needed flexibility to execute commands and to behave environment-dependently. One might argue that this goes partially against the spirit of the isolation of containers to be dependent on the environment like this, but a lot of Docker environments are used for testing purposes where this is not the primary concern.
docker-compose provides a few distinct advantages "out-of-the-box". There are commands up and down (and even radical ones like down -v --rmi all) which allow environments to be created and destroyed quickly. When scripting, one needs to implement all these things separately which will often result in less complete solutions. An often-overlooked advantage is also portability concerns: docker-compose exists for Windows as well. Another interesting feature (although not so "easy" as it sounds) is the ability to deploy docker-compose.yml files to Docker clusters. Finally docker-compose also provides some additional isolation (e.g. all containers become part of a network specifically created for this docker-compose instance by default)
From Startup Script to Docker-Compose
The start script at hand is already in a good shape to consider moving to a docker-compose.yml file instead. The basic idea is to define one service per docker run instruction and to transform the commandline arguments into their respective docker-compose.yml names. The Documentation covers the options quite thoroughly.
The idea could be as follows:
version: "3.2"
services:
  hadoop-master:
    image: spark-hadoop:latest
    ports:
      - 50070:50070
      - 8088:8088
      - 7077:7077
      - 16010:16010
  hadoop-slave1:
    image: spark-hadoop:latest
    ports:
      - 8041:8042
  hadoop-slave2:
    image: spark-hadoop:latest
    ports:
      - 8042:8042
  hadoop-slave3:
    image: spark-hadoop:latest
    ports:
      - 8043:8042
Btw. I could not test the docker-compose.yml file because the image spark-hadoop:latest does not seem to be available through docker pull:
# docker pull spark-hadoop:latest
Error response from daemon: pull access denied for spark-hadoop, repository does not exist or may require 'docker login'
But the file above might be enough to get an idea.
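Since the start script takes the node count as a parameter, the compose file could also be generated rather than written by hand. The following is only a sketch under that assumption: generate_compose is a hypothetical helper that mirrors the script's loop (slaves 1 to N-1, host port 8040+i mapped to 8042) and adds a hostname entry per service to match the --hostname flags. Like the file above, it is untested against the actual image.

```shell
#!/bin/bash
# Sketch: print a docker-compose.yml for one master and N-1 slaves,
# following the loop in start-containers.sh. Default N is 3.
generate_compose() {
    n=${1:-3}
    cat <<'EOF'
version: "3.2"
services:
  hadoop-master:
    image: spark-hadoop:latest
    hostname: hadoop-master
    ports:
      - "50070:50070"
      - "8088:8088"
      - "7077:7077"
      - "16010:16010"
EOF
    i=1
    while [ "$i" -lt "$n" ]; do
        # One service per slave; the host port is 8040 + i, as in the script.
        cat <<EOF
  hadoop-slave$i:
    image: spark-hadoop:latest
    hostname: hadoop-slave$i
    ports:
      - "$((8040 + i)):8042"
EOF
        i=$((i + 1))
    done
}
```

Running generate_compose 3 > docker-compose.yml followed by docker-compose up -d would then replace the start script, and docker-compose down would remove the containers again.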

Can I ssh into an ec2 instance and run commands in that instance using execute shell?

I am going to launch a few EC2 instances from Jenkins using the AWS CLI, and then I want to SSH into those instances and install some packages on them. Is there any way I could SSH into these instances and install the packages using Execute shell? I can't use the SSH plugin because I don't know the IP beforehand.
Any help would be appreciated.

I want to ssh into those instances and install some packages in them
If this is the only reason you want to SSH, I would not recommend installing packages over SSH after instance creation. It is better to put these installation commands in user data, or to create an AMI that already has these packages.
User Data and Shell Scripts
If you are familiar with shell scripting, this is the easiest and most
complete way to send instructions to an instance at launch. Adding
these tasks at boot time adds to the amount of time it takes to boot
the instance. You should allow a few minutes of extra time for the
tasks to complete before you test that the user script has finished
successfully.
In the example script below, the script creates and configures our web server.
#!/bin/bash
yum update -y
amazon-linux-extras install -y lamp-mariadb10.2-php7.2 php7.2
yum install -y httpd mariadb-server
systemctl start httpd
systemctl enable httpd
usermod -a -G apache ec2-user
chown -R ec2-user:apache /var/www
chmod 2775 /var/www
find /var/www -type d -exec chmod 2775 {} \;
find /var/www -type f -exec chmod 0664 {} \;
echo "<?php phpinfo(); ?>" > /var/www/html/phpinfo.php
AWS-EC2 user-data
Use this with AWS-cli
aws ec2 run-instances --image-id ami-a4c7edb2 --count 1 \
--instance-type t2.micro --key-name mynewkey \
--subnet-id subnet-5630306b --user-data file://ud.txt
aws-ec2-cli-userdata
So the above is the standard way to handle EC2 installation and configuration at instance creation time, and you will not need the instance IP.
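If a later build step still needs the address, it can be looked up from the instance ID that run-instances returns. A minimal sketch, assuming the same placeholder AMI, key, and subnet values as the command above; launch_and_get_ip is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: launch one instance with a user-data script, wait until it is
# running, and print its public IP address.
launch_and_get_ip() {
    user_data_file=$1
    # --query/--output text reduce the JSON response to the bare instance ID.
    instance_id=$(aws ec2 run-instances \
        --image-id ami-a4c7edb2 --count 1 \
        --instance-type t2.micro --key-name mynewkey \
        --subnet-id subnet-5630306b \
        --user-data "file://$user_data_file" \
        --query 'Instances[0].InstanceId' --output text) || return 1
    # Block until EC2 reports the instance as running.
    aws ec2 wait instance-running --instance-ids "$instance_id" || return 1
    aws ec2 describe-instances --instance-ids "$instance_id" \
        --query 'Reservations[0].Instances[0].PublicIpAddress' --output text
}
```

In an Execute shell step this could be used as ip=$(launch_and_get_ip ud.txt). Note that "running" does not mean the user-data script has finished, and the output is None if the subnet does not assign public addresses.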

AWS EC2 docker machine with amazonec2 driver - Host already exists

The following command should create a new docker machine on a shiny new Amazon EC2 instance:
docker-machine \
--storage-path /path/to/folder/docker_machines \
create \
--driver amazonec2 \
--amazonec2-access-key <my key> \
--amazonec2-secret-key <my secret> \
--amazonec2-vpc-id <my vpc> \
--amazonec2-region <my region> \
--amazonec2-zone <my AZ> \
--amazonec2-security-group <existing Sec Grp> \
--amazonec2-ami ami-da05a4a0 \
--amazonec2-ssh-keypath /path/to/private/key \
--engine-install-url=https://web.archive.org/web/20170623081500/https://get.docker.com \
awesome-new-docker-machine
I ran this command once, and encountered a legitimate problem (bad path to private key). Once I fixed that and ran the command again, I get this error:
Host already exists: "awesome-new-docker-machine"
However, I can't find this docker machine anywhere:
$ docker-machine ls
NAME ACTIVE DRIVER STATE URL SWARM DOCKER ERRORS
I even tried a docker-machine rm and docker-machine kill just for giggles. No difference.
I can't see a new EC2 instance on Amazon having been created from the first, erroneous run of the command.
How can I "clean up" whatever's existing (somewhere) so I can recreate the machine correctly?

So, it turns out that the first run of the command created some initial artifacts in a new folder awesome-new-docker-machine under /path/to/folder/docker_machines.
Deleting this folder and trying again worked perfectly.
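The cleanup described above could be scripted roughly as follows. This is a sketch, not a docker-machine feature: remove_stale_machine is a hypothetical helper that searches the storage path for a directory named after the machine, so it does not depend on the exact subfolder layout.

```shell
#!/bin/bash
# Sketch: delete leftover directories named after a machine anywhere under
# the docker-machine storage path, printing each path that was removed.
remove_stale_machine() {
    storage_path=$1
    machine_name=$2
    # Refuse to run with missing arguments.
    [ -n "$storage_path" ] && [ -n "$machine_name" ] || return 1
    # -depth visits a directory's contents before the directory itself,
    # so find never descends into something that was just deleted.
    find "$storage_path" -depth -type d -name "$machine_name" \
        -exec rm -rf {} \; -print
}
```

For the case above it would be called as remove_stale_machine /path/to/folder/docker_machines awesome-new-docker-machine. Because it deletes recursively, it is worth listing the storage path first to confirm that only the failed machine's artifacts match.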

how to change the DCOS attributes without restarting slave?

I am having trouble adding or changing attributes of the slave machines in a DC/OS environment.
After changing attributes in
vi /var/lib/dcos/mesos-slave-common
MESOS_ATTRIBUTES=TYPE:DB;DB_TYPE:MONGO;
file, the change is not picked up by the cluster immediately.
I have to run the following commands:
systemctl stop dcos-mesos-slave
rm -f /var/lib/mesos/slave/meta/slaves/latest
systemctl start dcos-mesos-slave
This essentially means I have to restart the service on the slave,
and the slave is down for at least 1 hour.
Is there any other way to achieve this?

As a variant, we use a hack: we create the /var/lib/dcos/mesos-slave-common file and "freeze" it by changing its access rights, like this:
echo "MESOS_ATTRIBUTES=TYPE:DB;DB_TYPE:MONGO;" | sudo tee /var/lib/dcos/mesos-slave-common
sudo chmod -w /var/lib/dcos/mesos-slave-common
# And after that you can execute node installation. Ugly, but that is working :)
sudo dcos_install.sh slave
