AWS EC2 docker machine with amazonec2 driver - Host already exists

The following command should create a new docker machine on a shiny new Amazon EC2 instance:
docker-machine \
--storage-path /path/to/folder/docker_machines \
create \
--driver amazonec2 \
--amazonec2-access-key <my key> \
--amazonec2-secret-key <my secret> \
--amazonec2-vpc-id <my vpc> \
--amazonec2-region <my region> \
--amazonec2-zone <my AZ> \
--amazonec2-security-group <existing Sec Grp> \
--amazonec2-ami ami-da05a4a0 \
--amazonec2-ssh-keypath /path/to/private/key \
--engine-install-url=https://web.archive.org/web/20170623081500/https://get.docker.com \
awesome-new-docker-machine
I ran this command once and encountered a legitimate problem (a bad path to the private key). Once I fixed that and ran the command again, I got this error:
Host already exists: "awesome-new-docker-machine"
However, I can't find this docker machine anywhere:
$ docker-machine ls
NAME ACTIVE DRIVER STATE URL SWARM DOCKER ERRORS
I even tried a docker-machine rm and docker-machine kill just for giggles. No difference.
I can't see a new EC2 instance on Amazon having been created from the first, erroneous run of the command.
How can I "clean up" whatever's existing (somewhere) so I can recreate the machine correctly?

So, it turns out that the first run of the command created some initial artifacts in a new folder awesome-new-docker-machine under /path/to/folder/docker_machines.
Deleting this folder and trying again worked perfectly.
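For reference, the cleanup boiled down to something like this (paths as in the question; on some docker-machine versions the leftover folder sits under a machines/ subdirectory of the storage path):
# remove the leftover machine folder from the failed run
rm -rf /path/to/folder/docker_machines/awesome-new-docker-machine
# (or, depending on the docker-machine version)
rm -rf /path/to/folder/docker_machines/machines/awesome-new-docker-machine
# then re-run the full docker-machine create command from above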

Related

Connecting to postgres in a Docker container via localhost fails

I'm running a Postgres server in a Docker container with the following docker run command (the environment variables are set properly):
docker run \
--name zero2prod \
-e POSTGRES_USER=${DB_USER} \
-e POSTGRES_PASSWORD=${DB_PASSWORD} \
-e POSTGRES_DB=${DB_NAME} \
-p "${DB_PORT}":5432 \
-d postgres \
postgres -N 1000
But the psql command fails to connect to the server. This is the command I typed:
PGPASSWORD="${DB_PASSWORD}" psql -h "localhost" -U "${DB_USER}" -p "${DB_PORT}" -d "postgres"
The error message is:
psql: error: could not connect to server: FATAL: password authentication failed for user "postgres"
Does anyone know why the command failed?
Note: I'm using a Windows 10 machine and the Docker environment was installed using Docker Desktop for Windows.
I found a solution.
The psql command now executes successfully; I changed the -p option of docker run as follows:
Before : -p 5432:5432
After : -p 5555:5432
I don't know why the psql command fails when the host port is the same as the container port, 5432.
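One hedged guess: on Windows, a locally installed PostgreSQL service often already listens on 5432, so psql -h localhost -p 5432 may have reached that local server instead of the container, which would explain the authentication failure. A quick way to check this guess (container name zero2prod as above):
# see what is already bound to port 5432 on the Windows host
netstat -ano | findstr :5432
# confirm which host port Docker actually published for the container
docker port zero2prod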

translate my containers starter file to docker-compose.yml

I am new to the big data domain, and this is my first time using Docker. I just found this amazing project: https://kiwenlau.com/2016/06/26/hadoop-cluster-docker-update-english/ which creates a Hadoop cluster composed of one master and two slaves using Docker.
After doing all the installation, I just ran the containers and they worked fine. There is a start-containers.sh file that lets me launch the cluster. I decided to install some tools like Sqoop to import my local relational database into HBase, and that worked fine. After that I stopped all the Docker containers on my PC by typing
docker stop $(docker ps -a -q)
The next day, when I tried to relaunch the containers by running the same script ./start-container.sh, I got this error:
start hadoop-master container...
start hadoop-slave1 container...
start hadoop-slave2 container...
Error response from daemon: Container e942e424a3b166452c9d2ea1925197d660014322416c869dc4a982fdae1fb0ad is not running
Even when I launch this container manually, the containers of my cluster cannot connect to each other, and I can't access the data stored in HBase.
First, can anyone tell me why this container doesn't start?
PS: in the start-container.sh file there is a line that removes containers if they already exist before creating them. I deleted this line, because if I don't, I have to redo everything from the beginning every time.
After searching, I found that it is preferable to use Docker Compose, which lets me launch all the containers together.
But I can't figure out how to translate my start-container.sh file into a docker-compose.yml file. Is this the best way to launch all my containers at the same time? This is the content of the start-containers.sh file:
#!/bin/bash
sudo docker network create --driver=bridge hadoop
# the default node number is 3
N=${1:-3}
# start hadoop master container
#sudo docker rm -f hadoop-master &> /dev/null
echo "start hadoop-master container..."
sudo docker run -itd \
--net=hadoop \
-p 50070:50070 \
-p 8088:8088 \
-p 7077:7077 \
-p 16010:16010 \
--name hadoop-master \
--hostname hadoop-master \
spark-hadoop:latest &> /dev/null
# sudo docker run -itd \
# --net=hadoop \
# -p 5432:5432 \
# --name postgres \
# --hostname hadoop-master \
# -e POSTGRES_PASSWORD=0000
# --volume /media/mobelite/0e5603b2-b1ad-4662-9869-8d0873b65f80/postgresDB/postgresql/10/main:/var/lib/postgresql/data \
# sameersbn/postgresql:10-2 &> /dev/null
# start hadoop slave container
i=1
while [ $i -lt $N ]
do
# sudo docker rm -f hadoop-slave$i &> /dev/null
echo "start hadoop-slave$i container..."
port=$(( 8040 + $i ))
sudo docker run -itd \
-p $port:8042 \
--net=hadoop \
--name hadoop-slave$i \
--hostname hadoop-slave$i \
spark-hadoop:latest &> /dev/null
i=$(( $i + 1 ))
done
# get into hadoop master container
sudo docker exec -it hadoop-master bash
Problems with restarting containers
I am not sure if I understood the mentioned problems with restarting containers correctly. Thus in the following, I try to concentrate on potential issues I can see from the script and error messages:
When containers are started without --rm, they remain in place after being stopped. If one then tries to run a container with the same port mappings or the same name (both the case here!), that fails because the container already exists, and effectively no new container is started. To solve this problem, one should either re-create the containers every time (and store all important state outside of the containers) or detect an existing container and start it if it exists. With names it can be as easy as doing:
if ! docker start hadoop-master; then
docker run -itd \
--net=hadoop \
-p 50070:50070 \
-p 8088:8088 \
-p 7077:7077 \
-p 16010:16010 \
--name hadoop-master \
--hostname hadoop-master \
spark-hadoop:latest &> /dev/null
fi
and similarly for the other entries. Note that I do not understand why one would use the combination -itd (interactive, allocate a TTY, but detach to the background) for a service container like this; I'd recommend going with just -d here.
Other general scripting advice: Prefer bash -e (causes the script to stop on unhandled errors).
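Applied to the slave loop from the original script, a sketch (untested; same names, ports and image as before) could look like this:
#!/bin/bash -e
# start existing slave containers, or create them if they do not exist yet
N=${1:-3}
i=1
while [ $i -lt $N ]; do
  if ! docker start hadoop-slave$i &> /dev/null; then
    port=$(( 8040 + i ))
    docker run -d \
      -p $port:8042 \
      --net=hadoop \
      --name hadoop-slave$i \
      --hostname hadoop-slave$i \
      spark-hadoop:latest
  fi
  i=$(( i + 1 ))
done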
Docker-Compose vs. Startup Scripts
The question contains some doubt whether docker-compose should be the way to go or if a startup script should be preferred. From my point of view, the most important differences are these:
Scripts are good for flexibility: whenever things need to be detected from the environment that go beyond environment variables, scripts provide the needed flexibility to execute commands and to behave in an environment-dependent way. One might argue that this goes partially against the spirit of container isolation, but a lot of Docker environments are used for testing purposes where this is not the primary concern.
docker-compose provides a few distinct advantages "out of the box". There are commands up and down (and even radical ones like down -v --rmi all) which allow environments to be created and destroyed quickly. When scripting, one needs to implement all these things separately, which often results in less complete solutions. Another often-overlooked advantage is portability: docker-compose exists for Windows as well. A further interesting feature (although not as "easy" as it sounds) is the ability to deploy docker-compose.yml files to Docker clusters. Finally, docker-compose also provides some additional isolation (e.g. all containers become part of a network specifically created for this docker-compose instance by default).
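For illustration, the typical lifecycle with a docker-compose.yml in the current directory looks roughly like this (standard docker-compose commands and flags):
docker-compose up -d                 # create the default network and start all services in the background
docker-compose ps                    # show the state of the services
docker-compose logs hadoop-master    # inspect the logs of a single service
docker-compose down                  # stop and remove the containers and the default network
docker-compose down -v --rmi all     # the "radical" variant: also remove volumes and images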
From Startup Script to Docker-Compose
The start script at hand is already in good shape to consider moving to a docker-compose.yml file instead. The basic idea is to define one service per docker run instruction and to transform the command-line arguments into their respective docker-compose.yml keys. The documentation covers the options quite thoroughly.
The idea could be as follows:
version: "3.2"
services:
  hadoop-master:
    image: spark-hadoop:latest
    ports:
      - 50070:50070
      - 8088:8088
      - 7077:7077
      - 16010:16010
  hadoop-slave1:
    image: spark-hadoop:latest
    ports:
      - 8041:8042
  hadoop-slave2:
    image: spark-hadoop:latest
    ports:
      - 8042:8042
  hadoop-slave3:
    image: spark-hadoop:latest
    ports:
      - 8043:8042
Btw. I could not test the docker-compose.yml file because the image spark-hadoop:latest does not seem to be available through docker pull:
# docker pull spark-hadoop:latest
Error response from daemon: pull access denied for spark-hadoop, repository does not exist or may require 'docker login'
But the file above might be enough to get an idea.
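If the image was built locally from the kiwenlau project (an assumption about where spark-hadoop:latest comes from), it only needs to exist under that tag before running docker-compose up, for example:
# check whether the image is already present locally
docker images spark-hadoop
# otherwise build and tag it from the project's Dockerfile (the path is an assumption)
docker build -t spark-hadoop:latest .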

Google Dataproc initialization script error File not Found

I'm using Google Dataproc to initialize a Jupyter cluster.
At first I used the "dataproc-initialization-actions" available on GitHub, and it worked like a charm.
This is the cluster create call available in the documentation:
gcloud dataproc clusters create my-dataproc-cluster \
--metadata "JUPYTER_PORT=8124" \
--initialization-actions \
gs://dataproc-initialization-actions/jupyter/jupyter.sh \
--bucket my-dataproc-bucket \
--num-workers 2 \
--properties spark:spark.executorEnv.PYTHONHASHSEED=0,spark:spark.yarn.am.memory=1024m \
--worker-machine-type=n1-standard-4 \
--master-machine-type=n1-standard-4
But I want to customize it, so I took the initialization file and saved it to my Google Storage bucket (under the same project where I'm trying to create the cluster). Then I changed the call to point to my script instead, like this:
gcloud dataproc clusters create my-dataproc-cluster \
--metadata "JUPYTER_PORT=8124" \
--initialization-actions \
gs://myjupyterbucketname/jupyter.sh \
--bucket my-dataproc-bucket \
--num-workers 2 \
--properties spark:spark.executorEnv.PYTHONHASHSEED=0,spark:spark.yarn.am.memory=1024m \
--worker-machine-type=n1-standard-4 \
--master-machine-type=n1-standard-4
But running this I got the following error:
Waiting on operation [projects/myprojectname/regions/global/operations/cf20466c-ccb1-4c0c-aae6-fac0b99c9a35].
Waiting for cluster creation operation...done.
ERROR: (gcloud.dataproc.clusters.create) Operation [projects/myprojectname/regions/global/operations/cf20466c-ccb1-4c0c-aae6-fac0b99c9a35] failed: Multiple Errors:
- Google Cloud Dataproc Agent reports failure. If logs are available, they can be found in 'gs://myjupyterbucketname/google-cloud-dataproc-metainfo/231e5160-75f3-487c-9cc3-06a5918b77f5/my-dataproc-cluster-m'.
- Google Cloud Dataproc Agent reports failure. If logs are available, they can be found in 'gs://myjupyterbucketname/google-cloud-dataproc-metainfo/231e5160-75f3-487c-9cc3-06a5918b77f5/my-dataproc-cluster-w-1'..
Well, the files were there, so I don't think it is an access permission problem. The file named "dataproc-initialization-script-0_output" has the following content:
/usr/bin/env: bash: No such file or directory
Any ideas?
Well, I found my answer here.
It turns out the script had Windows line endings instead of Unix line endings.
I did an online conversion using dos2unix and now it runs fine.
With help from #tix I could check that the file was reachable over an SSH connection to the cluster (a successful "gsutil cat gs://myjupyterbucketname/jupyter.sh"), and that the initialization file was correctly saved locally in the directory "/etc/google-dataproc/startup-scripts/dataproc-initialization-script-0".
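For reference, the same fix can be done from any shell with dos2unix and gsutil installed (bucket name as above; sed shown as a fallback):
# download the script, normalize the line endings, and upload it again
gsutil cp gs://myjupyterbucketname/jupyter.sh .
dos2unix jupyter.sh        # or: sed -i 's/\r$//' jupyter.sh
gsutil cp jupyter.sh gs://myjupyterbucketname/jupyter.sh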

Installing Transmission on Debian with Docker, permissions trouble

I am a beginner with Docker, and I'm trying to make my Transmission container work!
First, I am running Debian 8.1.
To make things work I created these two folders:
mkdir -p /opt/docker/transmission/config
mkdir -p /opt/docker/transmission/downloads
After this I set the appropriate permissions:
chown -R root:docker /opt/docker
chmod -R 775 /opt/docker
Finally, I tried to create my container like this:
docker run -d \
--net="host" \
--name="Transmission" \
-e USERNAME="root" \
-e PASSWORD="mdp" \
-v /opt/docker/transmission/config:/config \
-v /opt/docker/transmission/downloads:/downloads \
-v /etc/localtime:/etc/localtime:ro \
gfjardim/transmission
The command docker logs transmission gives:
Couldn't save temporary file "/config/resume/myfile.resume.tmp.hYhTPF": No such file or directory
I guessed that the resume folder in config had not been created, so I created it, but it didn't work.
The message in transmission GUI is:
unable to save resume file: permission denied
I cannot reproduce your errors using your commands. I suggest you delete your /opt/docker/transmission/ directory and then run the docker run command again.
Docker will take care of creating those directories.
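In other words, something along these lines (same image and mounts as in the question; this deletes anything already stored under /opt/docker/transmission):
# remove the manually created directories and let Docker recreate them
sudo rm -rf /opt/docker/transmission
docker run -d \
  --net="host" \
  --name="Transmission" \
  -e USERNAME="root" \
  -e PASSWORD="mdp" \
  -v /opt/docker/transmission/config:/config \
  -v /opt/docker/transmission/downloads:/downloads \
  -v /etc/localtime:/etc/localtime:ro \
  gfjardim/transmission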
I didn't find a solution for the gfjardim/transmission container, but switching containers worked right away.
I followed dperson/transmission on the Hub site, and it works perfectly.
Thanks for your help.

How to restart HDFS on Amazon EMR

I have made some changes to the HDFS settings on an Amazon EMR cluster. I want to restart the namenode and the datanodes for the changes to take effect, but I am not able to find any start and stop scripts to do so on either the namenode (master) or the datanodes. What is the right way to restart the cluster?
On EMR 4.x, run the following on the master host:
sudo /sbin/start hadoop-hdfs-namenode
ssh -i <key.pem> <slave-hostname1> "sudo /sbin/restart hadoop-hdfs-datanode"
ssh -i <key.pem> <slave-hostname2> "sudo /sbin/restart hadoop-hdfs-datanode"
ssh -i <key.pem> <slave-hostname3> "sudo /sbin/restart hadoop-hdfs-datanode"
You have to restart the services on each node yourself. This can be done either manually or using a simple shell script.
1) Get the list of hostnames or IP addresses of all the nodes,
2) SSH into each node using the key,
3) Restart the required service.
If you are good at programming, you can create a general utility that gets the list of IP addresses of all the nodes in an EMR cluster from the cluster ID and restarts the service on each node.
Otherwise, get the hostnames or IP addresses of all the nodes manually, create a script like the one below, and execute it from the master node:
sudo service hadoop-hdfs-namenode restart
ssh -i <key.pem> <hostname1> "sudo service hadoop-hdfs-datanode restart"
ssh -i <key.pem> <hostname2> "sudo service hadoop-hdfs-datanode restart"
ssh -i <key.pem> <hostname3> "sudo service hadoop-hdfs-datanode restart"
On EMR 5.x this is what I used:
Copy PEM file to your head node and set these values:
CLUSTER_ID="j-XXXXXXXXXXX"
IDENT="cluster.pem"
Run this:
nodes=$(aws emr list-instances \
--cluster-id $CLUSTER_ID \
--instance-group-types CORE \
--instance-states RUNNING \
--output text \
--query "Instances[*].PublicDnsName" )
for node in $nodes; do
ssh -i $IDENT \
-o StrictHostKeyChecking=no \
-o UserKnownHostsFile=/dev/null \
$node "sudo stop hadoop-hdfs-datanode; sudo start hadoop-hdfs-datanode"
done
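To also bounce the namenode, the same upstart-style pattern can be run on the master itself (an assumption based on the datanode commands above; newer, systemd-based EMR releases would use systemctl instead):
# run on the master node
sudo stop hadoop-hdfs-namenode
sudo start hadoop-hdfs-namenode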
