Docker containers become unresponsive/hang on error - windows

I'm running Docker Desktop on Windows and have a problem with containers becoming unresponsive when they hit an error on startup. It doesn't happen every time, but it does most of the time. Consequently, I have to start my containers one at a time, and if I see an error I have to "Restart Docker Desktop" and begin again.
I'm using docker-compose. As a specific example, this morning I started elasticsearch, zookeeper, then kafka. Kafka threw an exception regarding the zookeeper state and shut down, but now the kafka container is unresponsive in Docker. I can't stop it (it's already stopped?), yet it shows as running. I can't open a CLI in it and I can't restart it. The only way forward is to restart Docker using the debug menu. (If I have the restart: always flag on, the containers do restart automatically, but since they're throwing errors they just loop, starting and dying, without my being able to stop, kill, or remove the offending container.)
Once I've restarted docker, I'll be able to view the log of the container and see the error that was thrown...
This happens with pretty much all of my containers, however it does appear that if I start the container whilst viewing the log window within Docker Desktop, it is perhaps 'more likely' that I'll be able to start the container again if it has an error.
I've tried several different containers and this seems to be a pretty common issue for us, it doesn't appear to relate to any specific settings that I'm passing into the containers, however an extract from our docker-compose file is below:
volumes:
  zData:
  kData:
  eData:
zookeeper:
  container_name: zookeeper
  image: bitnami/zookeeper:latest
  environment:
    ALLOW_ANONYMOUS_LOGIN: "yes" # Dev only
    ZOOKEEPER_ROOT_LOGGER: WARN, CONSOLE
    ZOOKEEPER_CONSOLE_THRESHOLD: WARN
  ports:
    - "2181:2181"
  volumes:
    - zData:/bitnami/zookeeper:rw
  logging:
    driver: "fluentd"
    options:
      fluentd-address: localhost:24224
      tag: zookeeper
      fluentd-async-connect: "true"
kafka:
  container_name: kafka
  image: bitnami/kafka:latest
  depends_on:
    - zookeeper
  environment:
    ALLOW_PLAINTEXT_LISTENER: "yes" # Debug only
    KAFKA_ADVERTISED_PORT: 9092
    KAFKA_ADVERTISED_HOST_NAME: kafka
    KAFKA_CREATE_TOPICS: xx1_event:1:1,xx2_event:1:1,xx3_event:1:1,xx4_event:1:1
    KAFKA_JMX_OPTS: -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=${DOCKER_HOSTNAME} -Dcom.sun.management.jmxremote.rmi.port=9096 -Djava.net.preferIPv4Stack=true
    JMX_PORT: 9096
    KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
  hostname: kakfa
  ports:
    - 9092:9092
    - 9096:9096
  volumes:
    - kData:/bitnami/kafka:rw
  logging:
    driver: "fluentd"
    options:
      fluentd-address: localhost:24224
      tag: zookeeper
      fluentd-async-connect: "true"
elasticsearch:
  image: bitnami/elasticsearch:latest
  container_name: elasticsearch
  cpu_shares: 2048
  environment:
    ELASTICSEARCH_HEAP_SIZE: "2048m"
    xpack.monitoring.enabled: "false"
  ports:
    - 9200:9200
    - 9300:9300
  volumes:
    - C:/config/elasticsearch.yml:/opt/bitnami/elasticsearch/config/my_elasticsearch.yml:rw
    - eData:/bitnami/elasticsearch/data:rw
I've wondered whether this could be a resourcing issue, but I'm running this on a reasonably spec'd laptop (i7, SSD, 16GB RAM) using WSL2 (it also happens when using Hyper-V), and RAM limits don't look like they're being approached. When there are no errors on startup, the system runs fine and uses far more resources.
Any ideas on what I could try? I'm surprised there aren't many more people struggling with this.

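There is currently an issue, https://github.com/moby/moby/issues/40063, where containers hang, freeze, or become unresponsive when logging is set to fluentd in asynchronous mode AND the fluentd container is not operational. That matches the compose file in the question, which sets fluentd-async-connect: "true" for both zookeeper and kafka.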
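One possible workaround while debugging startup errors is to take fluentd out of the picture for the affected service. This is a sketch, not something confirmed in the linked issue. The fragment below switches the kafka service from the question to Docker's built-in json-file driver. The max-size and max-file values are illustrative assumptions, and the remaining keys of the service would stay as in the question.

```yaml
# Workaround sketch: kafka logs locally, so a missing fluentd cannot block it.
# Merge this into the existing compose file; only the logging section changes.
services:
  kafka:
    container_name: kafka
    image: bitnami/kafka:latest
    logging:
      driver: "json-file"
      options:
        max-size: "10m"   # illustrative value, not from the question
        max-file: "3"     # illustrative value, not from the question
```

The other option that follows from the issue description is to keep the fluentd driver and make sure the fluentd container is running before the other containers start.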

Related

Facing error response from daemon-Windows

I am trying to run Apache Kafka on Windows using Docker, and my docker-compose.yml is as follows:
version: "3"
services:
  spark:
    image: jupyter/pyspark-notebook
    ports:
      - "9092:9092"
      - "4010-4109:4010-4109"
    volumes:
      - ./notebooks:/home/jovyan/work/notebooks/
  zookeeper:
    image: 'bitnami/zookeeper:latest'
    container_name: zookeeper
    ports:
      - '2181:2181'
    environment:
      - ALLOW_ANONYMOUS_LOGIN=yes
  kafka:
    image: 'bitnami/kafka:latest'
    container_name: kakfa
    ports:
      - '9092:9092'
    environment:
      - KAFKA_BROKER_ID=1
      - KAFKA_LISTENERS=PLAINTEXT://:9092
      - KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://127.0.0.1:9092
      - KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
      - ALLOW_PLAINTEXT_LISTENER=yes
    depends_on:
      - zookeeper
When I execute the command
docker-compose -f docker-compose.yml up
I get an error: Error response from daemon: driver failed programming external connectivity on endpoint kafka-spark-1 (452eae1760b7860e3924c0e630943f825a809272760c8aa8bbb2f58ab2865377): Bind for 0.0.0.0:9092 failed: port is already allocated
I have tried net stop winnat and net start winnat, unfortunately this solution didn't work.
Would appreciate any kind of help!
Spark isn't running Kafka, so both services are trying to bind host port 9092. Remove that port mapping from the spark service:
image: jupyter/pyspark-notebook
ports:
  - "9092:9092"
Also, change the Kafka variable to advertise the proper hostname, otherwise Spark will not be able to work with it:
KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092
You can then also remove the ports for the Kafka container, since with that listener you won't have access from the host unless you add an external listener.
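Put together, the changes described above might look like the sketch below. It reuses the service definitions and environment variable names from the question unchanged, so whether those variables still match the current bitnami/kafka image is an assumption carried over from the question.

```yaml
version: "3"
services:
  spark:
    image: jupyter/pyspark-notebook
    ports:
      - "4010-4109:4010-4109"   # 9092 mapping removed: Spark is not the broker
    volumes:
      - ./notebooks:/home/jovyan/work/notebooks/
  zookeeper:
    image: 'bitnami/zookeeper:latest'
    container_name: zookeeper
    ports:
      - '2181:2181'
    environment:
      - ALLOW_ANONYMOUS_LOGIN=yes
  kafka:
    image: 'bitnami/kafka:latest'
    container_name: kakfa
    # No published ports: the broker is only reachable from other containers.
    environment:
      - KAFKA_BROKER_ID=1
      - KAFKA_LISTENERS=PLAINTEXT://:9092
      # "kafka" is the service name, which resolves on the Compose network.
      - KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092
      - KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
      - ALLOW_PLAINTEXT_LISTENER=yes
    depends_on:
      - zookeeper
```

With this layout, clients inside the spark container would use kafka:9092 as the bootstrap server.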
You may also be interested in an example notebook I use to test PySpark with Kafka.

RabbitMQ not working when services are dockerized

I have 2 services: Kweet (which is a tweet) and User. When I run the 2 services manually and the rest of the services in Docker, everything works. The rest of the services are MongoDB, RabbitMQ, Spring Cloud Gateway, and Eureka Discovery. But the moment I run the 2 (micro)services in Docker (I just use Docker Compose), the RabbitMQ functionality stops working. The normal API calls work; it's specifically the RabbitMQ calls that fail.
RabbitMQ functionality:
EditUsername edits the username in user-service.
It then sends data via RabbitMQ (this is where I think it goes wrong) to kweet-service, where it edits the username of a kweet.
Docker-compose file:
version: '3.8'
services:
  eureka-service:
    build: ./eureka-discovery-service
    restart: always
    container_name: eureka-service
    ports:
      - 8087:8087
  api-gateway:
    build: ./api-gateway
    restart: always
    container_name: api-gateway
    depends_on:
      - eureka-service
    ports:
      - 8080:8080
  user:
    build: ./user
    restart: unless-stopped
    container_name: user-ms
    ports:
      - 8081:8081
    depends_on:
      - eureka-service
  kweet:
    build: ./kweet
    restart: unless-stopped
    container_name: kweet-ms
    depends_on:
      - eureka-service
    ports:
      - 8082:8082
  mongodb:
    image: mongo
    restart: always
    container_name: mongodb
    ports:
      - 27017:27017
  rabbitmq:
    image: rabbitmq:management
    restart: always
    container_name: rabbitmq
    hostname: rabbitmq
    ports:
      - 5672:5672
      - 15672:15672
When I try to make a call the console shows:
user-ms | 2022-04-27 08:52:04.823 INFO 1 --- [nio-8081-exec-4] o.s.a.r.c.CachingConnectionFactory : Attempting to connect to: [localhost:5672]
The postman status I get back is 503 Service Unavailable which isn't from any try-catch's I made. Anybody have any clue where the problem might be?
EDIT [ConnectionFactory]:
I tried to follow the documentation and added a CachingConnectionFactory, but it had the same result. Am I doing it wrong?
I added this to the RabbitMQ/Message config (HOST, USERNAME, and PASSWORD come from application.properties):
@Bean
public AmqpTemplate template() {
    CachingConnectionFactory connectionFactory = new CachingConnectionFactory(HOST);
    connectionFactory.setUsername(USERNAME);
    connectionFactory.setPassword(PASSWORD);
    final RabbitTemplate rabbitTemplate = new RabbitTemplate(connectionFactory);
    rabbitTemplate.setMessageConverter(converter());
    return rabbitTemplate;
}
EDIT [docker-compose]:
Found this source (https://www.linkedin.com/pulse/binding-your-docker-app-container-rabbitmq-phani-bushan/) that got rid of my 503 Service Unavailable error. The problem I found now is that whenever I start up the containers, it generates new queues and exchanges that aren't the ones I set up in my application.properties.
Now whenever I make a call, it shows this log:
user-ms | 2022-04-28 07:36:28.825 INFO 1 --- [nio-8081-exec-1] o.s.a.r.c.CachingConnectionFactory : Created new connection: rabbitConnectionFactory#2ca65ce4:0/SimpleConnection@7e7052f [delegate=amqp://guest@172.23.0.4:5672/, localPort= 43208]
Things tried:
Changing the host to [rabbitmq-container-name] in code via CachingConnectionFactory.
Changing the host to [rabbitmq-container-name] in Docker Compose with environment: - spring_rabbitmq_host=[rabbitmq-container-name]:
build: ./user
restart: unless-stopped
container_name: user-ms
depends_on:
  - eureka-service
  - rabbitmq
ports:
  - 8081:8081
environment:
  - spring_rabbitmq_host=[rabbitmq-container-name]
Instead of [rabbitmq-container-name] I've also tried host.docker.internal and localhost.
Once you dockerize your services, they are no longer listening on localhost. To connect services over the network you need to use the container_name instead of localhost.
Inside a container, localhost points to the container itself, where only one service is listening. Don't confuse this with developing on your laptop without containers, where everything is on localhost.
More about this here
By default Compose sets up a single network for your app. Each container for a service joins the default network and is both reachable by other containers on that network, and discoverable by them at a hostname identical to the container name.
You must configure, somewhere in your user-ms application (we do not know what kind of application it is), that the RabbitMQ service is listening at rabbitmq (the container_name), not localhost.
Try running the container with environment variables. For me that is enough to make it work.
rabbitmq:
  image: rabbitmq:management
  restart: always
  container_name: rabbitmq
  environment:
    RABBITMQ_DEFAULT_USER: guest
    RABBITMQ_DEFAULT_PASS: guest
    RABBITMQ_DEFAULT_VHOST: /
  ports:
    - 5672:5672
    - 15672:15672
I can open this container in a browser: http://localhost:15672/
I use PHP-FPM and Symfony, and I pass an environment value to that container to connect to RabbitMQ.
services:
  php-fpm:
    environment:
      ENQUEUE_DSN: amqp://guest:guest@rabbitmq:5672
For Java you need to find and define (or redefine) the application properties. The config may default to localhost, as in spring.rabbitmq.host=localhost, but you need to use the hostname on the Docker network, which is rabbitmq.
spring.rabbitmq.host=rabbitmq
spring.rabbitmq.port=5672
spring.rabbitmq.username=guest
spring.rabbitmq.password=guest
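If you prefer not to edit application.properties, the same value can be supplied from the compose file. The fragment below is a sketch for the user service from the question. It relies on Spring Boot's binding of the SPRING_RABBITMQ_HOST environment variable to spring.rabbitmq.host, and it assumes the application actually reads that property. A hand-built CachingConnectionFactory that takes its host from a differently named property would not pick it up.

```yaml
# Fragment to merge into the question's compose file (user service only).
services:
  user:
    build: ./user
    restart: unless-stopped
    container_name: user-ms
    depends_on:
      - eureka-service
      - rabbitmq
    ports:
      - 8081:8081
    environment:
      # Bound by Spring Boot to spring.rabbitmq.host; "rabbitmq" is the
      # name of the RabbitMQ service on the Compose network.
      - SPRING_RABBITMQ_HOST=rabbitmq
```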
I made a grave mistake. I did 2 things before it started working:
Changed the Dockerfile ADD command to COPY (I don't think that was the problem, though).
Deleted all my images and containers and rebuilt them. In other words, instead of docker-compose up I should have been typing docker-compose up --build. This was most likely the issue.
brb gonna cry in a corner

How to check rabbitMQ connection(health check) up or not?

I'm running 4 microservices using Docker. One service depends on the others, so before using a service I need to check whether the services it depends on are up.
I'm writing a bash script to bring up all the services.
For now, I use sleep to wait until RabbitMQ is up properly.
What is a better way to check whether RabbitMQ is up? I have to wait until it is.
This is what I currently use:
# wait for rabbitmq container be ready
sleep 14
This is the docker-compose container for rabbitMQ
rabbitmq:
  image: 'rabbitmq:3.8.9'
  container_name: rabbitmq_dev
  restart: always
  ports:
    - 5675:5672
  environment:
    - RABBITMQ_DEFAULT_USER=rabbit
    - RABBITMQ_DEFAULT_PASS=pass
  depends_on:
    - consul
  networks:
    - my_networks
I think HealthCheck can solve your problem.
Reference links: Docker Compose wait for container X before starting Y
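As a rough sketch of the health check idea, the RabbitMQ service from the question could declare a healthcheck, and a dependent service could wait for it. Several things here are assumptions rather than part of the question: the version line (chosen because that file format accepts depends_on conditions), the interval, timeout, and retries values, and the my_service entry, which is a hypothetical stand-in for one of the four microservices. The consul dependency and the custom network are left out to keep the fragment self-contained.

```yaml
version: '2.4'   # assumption: a file format that accepts depends_on conditions
services:
  rabbitmq:
    image: 'rabbitmq:3.8.9'
    container_name: rabbitmq_dev
    restart: always
    ports:
      - 5675:5672
    environment:
      - RABBITMQ_DEFAULT_USER=rabbit
      - RABBITMQ_DEFAULT_PASS=pass
    healthcheck:
      # Exits 0 once the RabbitMQ node answers a ping.
      test: ["CMD", "rabbitmq-diagnostics", "-q", "ping"]
      interval: 10s   # illustrative values
      timeout: 5s
      retries: 5
  my_service:                   # hypothetical dependent service
    image: my_service_image     # hypothetical image name
    depends_on:
      rabbitmq:
        condition: service_healthy   # start only after the check passes
```

If the services are started from a bash script instead, the fixed sleep could be replaced by polling docker inspect --format '{{.State.Health.Status}}' rabbitmq_dev until it prints healthy.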

hostname in docker-compose.yml fails to be recognized on mac (but works on linux)

I am using the docker-compose 'recipe' below to bring up a container that runs a component of the Storm stream processing framework. I am finding that on Macs, when I enter the container (once it is up and running, via docker exec -t -i <container-id> bash) and run ping storm-supervisor, I get the error 'unknown host'. However, when I run the same docker-compose script on Linux, the host is recognized and ping succeeds.
The failure to resolve the host leads to problems with the Storm component, but what that component is doing can be ignored for this question. I'm pretty sure that if I figured out how to get the Mac's docker-compose behavior to match Linux's, I would have no problem.
I think I am experiencing the issue mentioned in this post:
https://forums.docker.com/t/docker-compose-not-setting-hostname-when-network-mode-host/16728
version: '2'
services:
  supervisor:
    image: sunside/storm-supervisor
    container_name: storm-supervisor
    hostname: storm-supervisor
    network_mode: host
    ports:
      - "8000:8000"
    environment:
      - "LOCAL_HOSTNAME=localhost"
      - "NIMBUS_ADDRESS=localhost"
      - "NIMBUS_THRIFT_PORT=49627"
      - "DRPC_PORT=49772"
      - "DRPCI_PORT=49773"
      - "ZOOKEEPER_ADDRESS=localhost"
      - "ZOOKEEPER_PORT=2181"
thanks in advance for any leads or tips !
"network_mode: host" does not work well on Docker for Mac. I experienced the same issue when some of my containers were on a bridge network and the others on the host network.
However, you can move all your containers to a custom bridge network. That solved it for me.
You can edit your docker-compose.yml file to use a custom bridge network:
version: '2'
services:
  supervisor:
    image: sunside/storm-supervisor
    container_name: storm-supervisor
    hostname: storm-supervisor
    ports:
      - "8000:8000"
    environment:
      - "LOCAL_HOSTNAME=localhost"
      - "NIMBUS_ADDRESS=localhost"
      - "NIMBUS_THRIFT_PORT=49627"
      - "DRPC_PORT=49772"
      - "DRPCI_PORT=49773"
      - "ZOOKEEPER_ADDRESS=localhost"
      - "ZOOKEEPER_PORT=2181"
    networks:
      - storm
networks:
  storm:
    external: true
Also, execute the below command to create the custom network.
docker network create storm
You can verify it by
docker network ls
Hope it helped.

docker-compose: connection refused between containers, but service accessible from host

TL;DR: How do I have to change my below docker-compose.yml in order to allow one container to use a service of another over a custom (non-standard) port?
I have a pretty common setup: containers for a web app (Padrino [Ruby]), Postgres, Redis, and a queueing framework (Sidekiq). The web app comes with its custom Dockerfile; the remaining services either come from standard images (Postgres, Redis) or mount the data from the web app (Sidekiq). They are tied together via the following docker-compose.yml:
version: '2'
services:
  web:
    build: .
    command: 'bundle exec puma -C config/puma.rb'
    volumes:
      - .:/myapp
    ports:
      - "9000:3000"
    depends_on:
      - postgres
      - redis
  sidekiq:
    build: .
    command: 'bundle exec sidekiq -C config/sidekiq.yml -r ./config/boot.rb'
    volumes:
      - .:/myapp
    depends_on:
      - postgres
      - redis
  postgres:
    image: postgres:9.5
    environment:
      POSTGRES_USER: my-postgres-user
      POSTGRES_PASSWORD: my-postgres-pass
    ports:
      - '9001:5432'
    volumes:
      - 'postgres:/var/lib/postgresql/data'
  redis:
    image: redis
    ports:
      - '9002:6379'
    volumes:
      - 'redis:/var/lib/redis/data'
volumes:
  redis:
  postgres:
One key point to notice here is that I am exposing the containers' services on non-standard ports (9000-9002).
If I start the setup with docker-compose up, the Redis and Postgres containers come up fine, but the containers for the web app and Sidekiq fail since they can't connect to Redis at redis:9002. Remarkably enough, the same setup works if I use 6379 (the standard Redis port) instead of 9002.
docker ps also looks fine afaik:
CONTAINER ID   IMAGE          COMMAND                  CREATED                  STATUS              PORTS                    NAMES
9148566c2509   redis          "docker-entrypoint.sh"   Less than a second ago   Up About a minute   0.0.0.0:9002->6379/tcp   rubydockerpadrino_redis_1
e6d47321c939   postgres:9.5   "/docker-entrypoint.s"   Less than a second ago   Up About a minute   0.0.0.0:9001->5432/tcp   rubydockerpadrino_postgres_1
What's even more confusing: I can access the Redis container from the host via redis-cli -h localhost -p 9002 -n 0, but the web app and Sidekiq containers fail to establish a connection.
I am using this docker version on MacOS:
Docker version 1.12.3, build 6b644ec, experimental
Any ideas what I am doing wrong? I'd appreciate any hint how to get my setup running.
When you bind ports like this, '9002:6379', you're telling Docker to forward traffic from localhost:9002 on the host to port 6379 in the redis container. That's why this works from your host machine:
redis-cli -h localhost -p 9002 -n 0
However, when containers talk to each other, they are all connected to the same network by default (a bridge network that Compose creates for the project). Containers can communicate with each other freely on this network without needing any ports published. Within this network, your redis container listens on its usual port (6379), and the host isn't involved at all. That's why container-to-container communication works on redis:6379 and not on redis:9002.
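In compose terms, the distinction might look like the trimmed-down sketch below. The REDIS_URL variable is an assumption: the question does not show how the app reads its Redis address, so treat the name as a placeholder for whatever setting the web app and Sidekiq actually use.

```yaml
version: '2'
services:
  web:
    build: .
    command: 'bundle exec puma -C config/puma.rb'
    environment:
      # Assumed variable name. Container-to-container traffic uses the
      # service name and the container port, not the published host port.
      REDIS_URL: redis://redis:6379/0
    depends_on:
      - redis
  redis:
    image: redis
    ports:
      - '9002:6379'   # host port 9002 -> container port 6379 (host access only)
```

The 9002 mapping can stay in place for access from the host with redis-cli, since it does not affect traffic between containers.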

Resources