Getting "RESOURCE:MEMORY" error on a new cluster in AWS-ECS - amazon-ec2

I have set up a brand new cluster using ecs-cli on AWS with the following commands:
Configure cluster : ecs-cli configure --cluster cluster_name --region region_name --default-launch-type EC2 --config-name config_name
Use default profile : ecs-cli configure default --config-name config_name
Create Cluster : ecs-cli up --keypair key_name --capability-iam --size 1 --instance-type t2.micro --security-group sg_id --vpc vpc_id --subnets subnet_id --cluster-config config_name
The cluster was created successfully on ECS. But when I try to run my docker-compose file to start the jenkins and jenkins data volume containers (already pushed to ECR), I get a "RESOURCE:MEMORY" error even though the CPU and memory utilisation is 0%.
Deploy docker compose file to cluster : ecs-cli compose up --cluster-config config_id
Actual Result:
WARN[0000] Skipping unsupported YAML option for service... option name=networks service name=jenkins
WARN[0000] Skipping unsupported YAML option for service... option name=networks service name=jenkins_dv
INFO[0000] Using ECS task definition TaskDefinition="aws-infra:4"
INFO[0000] Couldn't run containers reason="RESOURCE:MEMORY"
jenkins:
  image: jenkins:latest
  cpu_shares: 50
  mem_limit: 524288000
  ports: ["8080:8080", "50000:50000"]
  volumes_from: ['jenkins_dv']
jenkins_dv:
  image: jenkins_dv:latest
  cpu_shares: 50
  mem_limit: 524288000
Even when I run the docker compose file after deleting cpu_shares and mem_limit (as they are not required for the EC2 launch type), I get the same error. Since the cluster is new and does not have any CPU or memory in use, the tasks should be created successfully. What am I doing wrong here?

I have found the solution to this issue. I had allocated a memory limit of 500MB (in bytes) to both containers. As per the AWS documentation a t2.micro has 1GB of memory, but if you open your instance (Cluster > EC2 Instances > container instance) and view the memory allocation, the memory actually registered with ECS is slightly less than 1GB. I updated my file to give a memory limit of 250MB (in bytes) to both containers and it worked.
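For reference, a minimal sketch of the adjusted compose file, assuming the same service names as above (262144000 bytes is 250MB, so both containers together fit within the memory the t2.micro instance registers with ECS):
jenkins:
  image: jenkins:latest
  cpu_shares: 50
  mem_limit: 262144000
  ports: ["8080:8080", "50000:50000"]
  volumes_from: ['jenkins_dv']
jenkins_dv:
  image: jenkins_dv:latest
  cpu_shares: 50
  mem_limit: 262144000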

Related

Unable to setup external etcd cluster in Kubernetes v1.15 using kubeadm

I'm trying to set up a Kubernetes cluster with multiple masters and an external etcd cluster. I followed the steps described on kubernetes.io. I was able to create the static manifest pod files on all 3 hosts in the /etc/kubernetes/manifests folder after executing Step 7.
After that, when I executed 'sudo kubeadm init', the initialization failed because of kubelet errors. I also verified the journalctl logs; the error points to a misconfiguration of the cgroup driver, which is similar to this SO link.
I tried what was suggested in the above SO link but was not able to resolve it.
Please help me in resolving this issue.
For the installation of docker, kubeadm, kubectl and kubelet, I followed the kubernetes.io site only.
Environment:
Cloud: AWS
EC2 instance OS: Ubuntu 18.04
Docker version: 18.09.7
Thanks
After searching a few links and doing a few trials, I was able to resolve this issue.
As given in the container runtime setup, the Docker cgroup driver is systemd, but the default cgroup driver of the kubelet is cgroupfs. Since the kubelet cannot detect the cgroup driver automatically (as stated in the kubernetes.io docs), we have to provide the cgroup driver explicitly when running the kubelet, like below:
cat << EOF > /etc/systemd/system/kubelet.service.d/20-etcd-service-manager.conf
[Service]
ExecStart=
ExecStart=/usr/bin/kubelet --cgroup-driver=systemd --address=127.0.0.1 --pod-manifest-path=/etc/kubernetes/manifests
Restart=always
EOF
systemctl daemon-reload
systemctl restart kubelet
Moreover, there is no need to run sudo kubeadm init: since we are providing --pod-manifest-path to the kubelet, it runs etcd as a static pod.
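As a quick sanity check (a hedged sketch, assuming Docker 18.09.7 is the container runtime as in the question), you can confirm the etcd static pod container came up with:
sudo docker ps --filter name=etcd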
For debugging, the kubelet logs can be checked using the command below:
journalctl -u kubelet -r
Hope it helps. Thanks.

Checking docker container's stats from inside the container

I am writing a healthcheck routine for a docker container. By design it should check how much CPU and memory it's using and return an "unhealthy" exit code (1) if they exceed limits.
Is there a way to check container CPU and memory usage from INSIDE the container by running a .sh script?
All metrics are available in the cgroup filesystem inside the container. Read more here: https://docs.docker.com/config/containers/runmetrics
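For example, a minimal healthcheck sketch reading the cgroup v1 memory controller from inside the container (this assumes cgroup v1 paths; on hosts using cgroup v2 the files live directly under /sys/fs/cgroup/ with different names, e.g. memory.current and memory.max):
#!/bin/sh
# Read current usage and limit (in bytes) from the cgroup v1 memory controller
usage=$(cat /sys/fs/cgroup/memory/memory.usage_in_bytes)
limit=$(cat /sys/fs/cgroup/memory/memory.limit_in_bytes)
# Fail the healthcheck if usage exceeds 90% of the limit
threshold=$((limit / 10 * 9))
if [ "$usage" -gt "$threshold" ]; then
  exit 1   # unhealthy
fi
exit 0     # healthy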
The docker client connects to the docker socket (/var/run/docker.sock), which is not available inside the container by default. A workaround is to mount the host's /var/run/docker.sock into the container with the following option when you start the container:
-v /var/run/docker.sock:/var/run/docker.sock
For example,
docker run -it -v /var/run/docker.sock:/var/run/docker.sock $MY_IMAGE_NAME

cannot configure HDFS address using gethue/hue docker image

I'm trying to use the Hue docker image from gethue/hue, but it seems to ignore the configuration I give it and always looks for HDFS on localhost instead of the docker container I point it to.
Here is some context:
I'm using the following docker compose to launch a HDFS cluster:
hdfs-namenode:
  image: bde2020/hadoop-namenode:1.1.0-hadoop2.7.1-java8
  hostname: namenode
  environment:
    - CLUSTER_NAME=davidov
  ports:
    - "8020:8020"
    - "50070:50070"
  volumes:
    - ./data/hdfs/namenode:/hadoop/dfs/name
  env_file:
    - ./hadoop.env
hdfs-datanode1:
  image: bde2020/hadoop-datanode:1.1.0-hadoop2.7.1-java8
  depends_on:
    - hdfs-namenode
  links:
    - hdfs-namenode:namenode
  volumes:
    - ./data/hdfs/datanode1:/hadoop/dfs/data
  env_file:
    - ./hadoop.env
This launches images from BigDataEurope, which are already properly configured, including:
- the activation of webhdfs (in /etc/hadoop/hdfs-site.xml):
- dfs.webhdfs.enabled set to true
- the hue proxy user (in /etc/hadoop/core-site.xml):
- hadoop.proxyuser.hue.hosts set to *
- hadoop.proxyuser.hue.groups set to *
Then, I launch Hue following their instructions:
First, I launch a bash prompt inside the docker container:
docker run -it -p 8888:8888 gethue/hue:latest bash
Then, I modify desktop/conf/pseudo-distributed.ini to point to the correct hadoop "node" (in my case a docker container with the address 172.30.0.2):
[hadoop]
# Configuration for HDFS NameNode
# ------------------------------------------------------------------------
[[hdfs_clusters]]
# HA support by using HttpFs
[[[default]]]
# Enter the filesystem uri
fs_defaultfs=hdfs://172.30.0.2:8020
# NameNode logical name.
## logical_name=
# Use WebHdfs/HttpFs as the communication mechanism.
# Domain should be the NameNode or HttpFs host.
# Default port is 14000 for HttpFs.
## webhdfs_url=http://172.30.0.2:50070/webhdfs/v1
# Change this if your HDFS cluster is Kerberos-secured
## security_enabled=false
# In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
# have to be verified against certificate authority
## ssl_cert_ca_verify=True
And then I launch hue using the following command (still inside the hue container):
./build/env/bin/hue runserver_plus 0.0.0.0:8888
I then point my browser to localhost:8888, create a new user ('hdfs' in my case), and launch the HDFS file browser module. I then get the following error message:
Cannot access: /user/hdfs/.
HTTPConnectionPool(host='localhost', port=50070): Max retries exceeded with url: /webhdfs/v1/user/hdfs?op=GETFILESTATUS&user.name=hue&doas=hdfs (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 99] Cannot assign requested address',))
The interesting bit is that it still tries to connect to localhost (which of course cannot work), even though I modified its config file to point to 172.30.0.2.
Googling the issue, I found another config file: desktop/conf.dist/hue.ini. I tried modifying this one and launching hue again, but same result.
Does any one know how I could correctly configure hue in my case?
Thanks in advance for your help.
Regards,
Laurent.
Your one-off docker run command is not on the same network as the docker-compose containers.
You would need something like this, replacing [projectname] with the folder you started docker-compose up in
docker run -ti -p 8888:8888 --network="[projectname]_default" gethue/hue bash
I would suggest using Docker Compose for the Hue container as well, and volume-mounting an INI file under desktop/conf/ in which you can simply specify
fs_defaultfs=hdfs://namenode:8020
(since you put hostname: namenode in the compose file)
You'll also need to uncomment the webhdfs_url line for your changes to take effect.
All INI files in the conf folder are merged by Hue.
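For illustration, a hedged sketch of how the Hue service could be added to the same docker-compose file; the mount path inside the gethue/hue image and the override file name are assumptions, so adjust them to wherever Hue's desktop/conf/ actually lives in your image:
hue:
  image: gethue/hue:latest
  ports:
    - "8888:8888"
  depends_on:
    - hdfs-namenode
  volumes:
    # assumed target path; mount your INI overrides into Hue's desktop/conf/
    - ./hue-overrides.ini:/usr/share/hue/desktop/conf/z-hue-overrides.ini
with ./hue-overrides.ini containing only the settings that differ from the defaults:
[hadoop]
[[hdfs_clusters]]
[[[default]]]
fs_defaultfs=hdfs://namenode:8020
webhdfs_url=http://namenode:50070/webhdfs/v1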

Running Elastic Search with Laradock

I'm trying to get ElasticSearch running with Laradock. ES looks to be supported out of the box with Laradock.
Here's my docker command (run from <project root>/laradock/):
docker-compose up -d nginx postgres redis beanstalkd elasticsearch
However if I run docker ps, the elasticsearch container isn't running.
Neither port 9200 nor 9300 is in use:
lsof -i :9200
Not sure why the elasticsearch container doesn't stay up; it seems to just shut itself down.
Output of docker ps -a after running docker-compose up ...
http://pastebin.com/raw/ymfvLPLT
Condensed version:
IMAGE STATUS PORTS
laradock_nginx Up 36 seconds 0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp
laradock_elasticsearch Exited (137) 34 seconds ago
laradock_beanstalkd Up 37 seconds 0.0.0.0:11300->11300/tcp
laradock_php-fpm Up 38 seconds 9000/tcp
laradock_workspace Up 39 seconds 0.0.0.0:2222->22/tcp
tianon/true Exited (0) 41 seconds ago
laradock_postgres Up 41 seconds 0.0.0.0:5432->5432/tcp
laradock_redis Up 40 seconds 0.0.0.0:6379->6379/tcp
Output of docker events after running docker-compose up ...
http://pastebin.com/cE9bjs6i
Try to check the logs first:
docker logs laradock_elasticsearch_1
(or another name of elasticsearch container)
In my case it was
ERROR: [1] bootstrap checks failed
[1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
I found the solution here
Namely, I ran this on my Ubuntu machine:
sudo sysctl -w vm.max_map_count=262144
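Note that sysctl -w only changes the value until the next reboot; a hedged way to make it persistent (assuming a standard Ubuntu setup; the drop-in file name is arbitrary) is:
echo "vm.max_map_count=262144" | sudo tee /etc/sysctl.d/99-elasticsearch.conf
sudo sysctl --system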
I don't think the problem is related to Laradock, since Elasticsearch is supposed to run on its own. I would first check the memory:
open Docker Dashboard -> Settings -> Resources -> Advanced: and increase the memory.
check your Machine memory, Elasticsearch won't run if there is not enough memory in your machine.
or:
open your docker-compose.yml file
increase mem_limit (e.g. mem_limit: 1g), then
docker-compose up -d --build elasticsearch
If it's still not working, remove all the images, update Laradock to the latest version, and set it up again.

Launching of Spark 1.4.0 EC2 doesn't work

After launching a t2.micro instance with Debian and importing my AWS keys, I tried to launch a Spark cluster in the Frankfurt region with this command:
spark-1.4.0-bin-hadoop2.6/ec2/spark-ec2 -k spark_frankfurt -i spark_frankfurt.pem -s 1 -t t2.micro --region=eu-central-1 --hadoop-major-version=2 launch mycluster
But it gives me the following output:
Setting up security groups...
Searching for existing cluster mycluster in region eu-central-1...
Could not resolve AMI at: https://raw.github.com/mesos/spark-ec2/branch-1.3/ami-list/eu-central-1/hvm
In fact, Frankfurt (eu-central-1) is not in the AMI list of the official spark-ec2 repository: https://github.com/mesos/spark-ec2/tree/branch-1.4/ami-list.
Thus it is expected that it doesn't work for the moment.
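As a workaround (a hedged sketch, assuming a region that does appear in that ami-list, such as eu-west-1, is acceptable for your use case), the same command can be pointed at a supported region; the key pair name here is hypothetical and must be one created in the target region:
spark-1.4.0-bin-hadoop2.6/ec2/spark-ec2 -k my_eu_west_key -i my_eu_west_key.pem -s 1 -t t2.micro --region=eu-west-1 --hadoop-major-version=2 launch mycluster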