Exception during topic deletion when Kafka is hosted in Docker on Windows

I host Kafka in Docker on Windows, using the wurstmeister/kafka Docker image. Kafka data is stored in a local Windows folder for persistence; the folder is mapped into the Kafka container via a Docker volume. I can create topics, publish and consume messages. However, when I try to delete a topic I receive the following error:
Error while deleting test-0 in dir /var/lib/kafka. (kafka.server.LogDirFailureChannel)
java.io.IOException: Failed to rename log directory from /var/lib/kafka/test-0 to /var/lib/kafka/test-0.a81ff9700e4e4c3e8b20c6d949971b64-delete
at kafka.log.LogManager.asyncDelete(LogManager.scala:671)
at kafka.cluster.Partition.$anonfun$delete$1(Partition.scala:178)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:217)
at kafka.utils.CoreUtils$.inWriteLock(CoreUtils.scala:225)
at kafka.cluster.Partition.delete(Partition.scala:173)
at kafka.server.ReplicaManager.stopReplica(ReplicaManager.scala:341)
at kafka.server.ReplicaManager.$anonfun$stopReplicas$2(ReplicaManager.scala:373)
at scala.collection.Iterator.foreach(Iterator.scala:929)
at scala.collection.Iterator.foreach$(Iterator.scala:929)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1417)
at scala.collection.IterableLike.foreach(IterableLike.scala:71)
at scala.collection.IterableLike.foreach$(IterableLike.scala:70)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at kafka.server.ReplicaManager.stopReplicas(ReplicaManager.scala:371)
at kafka.server.KafkaApis.handleStopReplicaRequest(KafkaApis.scala:190)
at kafka.server.KafkaApis.handle(KafkaApis.scala:104)
at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:65)
at java.lang.Thread.run(Thread.java:748)
Could somebody help me deal with this issue?
UPD: Below are the contents of the docker-compose file that I use to run Kafka:
version: '3'
services:
  zookeeper:
    image: zookeeper
    ports:
      - "2181:2181"
    environment:
      ZOO_MY_ID: 1
    volumes:
      - ./zookeeper_data:/data
      - ./zookeeper_datalog:/datalog
  kafka:
    depends_on:
      - zookeeper
    image: wurstmeister/kafka
    ports:
      - "9092:9092"
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_HOST_NAME: localhost
      KAFKA_LOG_DIRS: /var/lib/kafka
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: 'false'
      KAFKA_BROKER_ID: 1
    volumes:
      - ./kafka_logs:/var/lib/kafka
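For reference, a topic delete against this setup would be issued with the stock Kafka CLI shipped in the image, roughly like the sketch below; the container name is an assumption (whatever docker ps reports), and older Kafka CLIs take --zookeeper while newer ones take --bootstrap-server:

docker exec -it <kafka-container> kafka-topics.sh --zookeeper zookeeper:2181 --delete --topic test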

This issue still exists on Windows for Kafka 1.1.0 (kafka_2.12-1.1.0) when I try to delete a topic.
The topic gets marked for deletion, and the Kafka server fails with java.nio.file.AccessDeniedException when trying to rename the log directory 'test-0'.
Deleting the whole test-0 log folder does not help.
Reinstalling the Kafka server does not help either: even after reinstalling, the info about the topic marked for deletion remains.
It took me a couple of hours to figure out that this info sits in Zookeeper, in one of its log files!
Solution
1. Stop the Zookeeper process.
2. Go to your Zookeeper data folder (the dataDir from your config, e.g. zookeeper-3.x.x\data\version-2\) and delete the latest log.xx files (see the sketch below).
3. Restart Zookeeper.
4. Restart the Kafka server.
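A command-prompt sketch of these steps, assuming a Zookeeper 3.x install unpacked under D:\ and the data directory mentioned above (adjust the path to whatever dataDir in your config actually points at):

:: stop Zookeeper (and Kafka) first, then remove the newest transaction log(s)
cd /d D:\zookeeper-3.x.x\data\version-2
:: list transaction logs newest-first, then delete the most recent one(s)
dir /O-D log.*
del log.<newest-suffix>
:: restart Zookeeper, then the Kafka server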

Delete version-2 from the Zookeeper data folder.
Delete everything in the Kafka logs folder.
Then restart the Zookeeper and Kafka servers:
zookeeper-server-start.bat D:\kafka_2.11-2.4.1\config\zookeeper.properties
kafka-server-start.bat D:\kafka_2.11-2.4.1\config\server.properties
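A sketch of the two delete steps on Windows, assuming the default dataDir=/tmp/zookeeper and log.dirs=/tmp/kafka-logs from the bundled zookeeper.properties and server.properties (on Windows these resolve relative to the current drive; adjust the paths to whatever your config files actually point at):

:: remove the Zookeeper transaction logs and the Kafka topic data
rmdir /S /Q D:\tmp\zookeeper\version-2
rmdir /S /Q D:\tmp\kafka-logs

After that, run the two start commands above.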

Topic deletion fails because of Java's File.rename function, which behaves differently in some cases on Windows (for example, if the file is in use). The Kafka developers have already replaced this call with Utils.atomicMoveWithFallback (see this issue for details), but it seems the fix was not included in Kafka 2.11-0.11.0. So you need to use a Kafka version that contains this fix. Hope this helps.

This solved the problem:
1. Stop both the Kafka and Zookeeper processes.
2. Delete all old log directories.
3. Change log.dir to point to a new directory in server.properties.
4. Change dataDir to point to a new directory in zookeeper.properties.
Then restart the Zookeeper and Kafka servers:
C:\kafka_2.12-2.4.0> zookeeper-server-start.bat .\config\zookeeper.properties
C:\kafka_2.12-2.4.0> kafka-server-start.bat .\config\server.properties

I solved this problem after resetting the credentials for my shared drive:
Docker config > Shared Drives > Reset credentials

1. Change log.dir to a new name in server.properties (in Kafka/config):
log.dir=C:/Programs/kafka/kafka_2.12-2.3.0/kafka-test-logs
2. Remove the old log folder from C:/Programs/kafka/kafka_2.12-2.3.0/
3. Remove all logs and snapshots from C:\Programs\zookeeper\apache-zookeeper-3.5.5-bin\data, or delete the data folder where your logs are stored.
Additionally, I had an error when starting the consumer (Leader Not Available Kafka in Console Producer), so I added
port = 9092
advertised.host.name = localhost
to server.properties.
Now I am able to publish and consume messages.
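To double-check publishing and consuming after these changes, the console clients bundled with Kafka can be used, for example (the topic name test is an assumption):

:: run from the bin\windows folder of the Kafka install
kafka-console-producer.bat --broker-list localhost:9092 --topic test
kafka-console-consumer.bat --bootstrap-server localhost:9092 --topic test --from-beginning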

Related

Connecting to a Mongo container from Spring container

I have a problem here that I really cannot understand. I have already seen a few topics here with the same problem, and those were successfully solved. I basically did the same thing and cannot understand what I'm doing wrong.
I have a Spring application container that tries to connect to a Mongo container using the following Docker Compose file:
version: '3'
services:
  app:
    build: .
    ports:
      - "8080:8080"
    links:
      - db
  db:
    image: mongo
    volumes:
      - ./database:/data
    ports:
      - "27017:27017"
In my application.properties:
spring.data.mongodb.uri=mongodb://db:27017/app
Finally, my Dockerfile:
FROM eclipse-temurin:11-jre-alpine
WORKDIR /home/java
RUN mkdir /home/java/bar
COPY ./build/libs/foo.jar /home/java/bar/foo.jar
CMD ["java","-jar", "/home/java/bar/foo.jar"]
When I run docker compose up --build I get:
2022-11-17 12:08:53.452 INFO 1 --- [null'}-db:27017] org.mongodb.driver.cluster : Exception in monitor thread while connecting to server db:27017
Caused by: java.net.UnknownHostException: db
Running docker compose ps I can see the Mongo container running fine, and I am able to connect to it through MongoDB Compass and with this same Spring application when it runs outside a container. The only difference when running outside a container is the host: spring.data.mongodb.uri=mongodb://localhost:27017/app instead of spring.data.mongodb.uri=mongodb://db:27017/app.
Also, I already tried changing the host to localhost inside the Spring container, and it didn't work.
You need to specify the MongoDB host, port and database as separate parameters, as mentioned here.
spring.data.mongodb.host=db
spring.data.mongodb.port=27017
spring.data.mongodb.authentication-database=admin
As per the official docker-compose documentation, the above docker-compose file should work, since both db and app are on the same network (you can check whether they ended up on different networks, just in case).
If the networking is not working, as a workaround, instead of using localhost inside the Spring container, use the server's IP, i.e. mongodb://<server_ip>:27017/app (and make sure no firewall is blocking it).
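To check the networking side, something like the following shows which network the project created and whether the db service name resolves from inside the app container (service names come from the compose file above; <project>_default is a placeholder for the network name Compose derives from the project folder, and busybox ping is assumed to be present in the alpine-based app image):

docker compose ps
docker network ls
docker network inspect <project>_default
docker compose exec app ping -c 1 db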

Failed to flush WorkerSourceTask{id=local-file-source-0}, timed out while waiting for producer to flush outstanding messages, 1 left

I am following this GitHub repo:
https://github.com/hannesstockner/kafka-connect-elasticsearch/
and I am trying to read data from a file source into Elasticsearch.
I get an error when I run the standalone.sh script:
Failed to flush WorkerSourceTask{id=local-file-source-0}, timed out while waiting for producer to flush outstanding messages, 1 left ({ProducerRecord(topic=recipes, partition=null, key=null, value=[B#6704e57f=ProducerRecord(topic=recipes, partition=null, key=null, value=[B#6704e57f})
And these are my config files:
connect-elasticsearch-sink.properties
name=local-elasticsearch-sink
connector.class=com.hannesstockner.connect.es.ElasticsearchSinkConnector
tasks.max=1
es.host=10.200.10.1
topics=recipes
index.prefix=kafka_
connect-file-source.properties
name=local-elasticsearch-sink
connector.class=com.hannesstockner.connect.es.ElasticsearchSinkConnector
tasks.max=1
es.host=10.200.10.1
topics=recipes
index.prefix=kafka_
connect-standalone.properties
bootstrap.servers=10.200.10.1:9092
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=false
value.converter.schemas.enable=false
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
offset.storage.file.filename=/tmp/connect.offsets
# Flush much faster than normal, which is useful for testing/debugging
#offset.flush.interval.ms=20000
offset.flush.timeout.ms=20000
and docker config:
kafka:
  image: flozano/kafka:0.9.0.0
  ports:
    - "2181:2181"
    - "9092:9092"
  environment:
    ADVERTISED_HOST: ${DOCKER_IP}
elasticsearch:
  image: elasticsearch:2.1
  ports:
    - "9200:9200"
    - "9300:9300"
I tried setting offset.flush.timeout.ms=20000 and producer.buffer.memory=10 in my standalone.properties file following this thread, but no luck:
Kafka Connect - Failed to flush, timed out while waiting for producer to flush outstanding messages
If you want to read files into Elasticsearch (or Kafka), it would be preferable to use Filebeat.
The FileSourceConnector is documented as being an example, not a production-level product. Meanwhile, there are other connectors such as the "Spooldir connector" or the "kafka-connect-fs" project.
Further, the actual Elasticsearch Kafka connector that is supported and actively developed is here.
Plus, you should use a different Kafka Docker image that is maintained and up to date (such as those from bitnami or confluentinc), which you can combine with a Docker Kafka Connect image such as mine, instead of reading local files.
Your Elasticsearch Docker image version is also 6+ years old.
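If it helps, pulling the maintained images mentioned above looks like this; the tags are left at latest here only as a sketch, so pin specific versions in a real setup:

docker pull bitnami/kafka:latest
docker pull confluentinc/cp-kafka-connect:latest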

Docker containers become unresponsive/hang on error

I'm running Docker Desktop on Windows and am having a problem with containers becoming unresponsive on startup errors. This doesn't happen every time, but it does by far most of the time. Consequently, I have to be very careful to start my containers one at a time, and if I see one error, I have to restart Docker Desktop and begin the whole startup sequence again.
I'm using docker-compose and, as a specific example, this morning I started elasticsearch, zookeeper, then kafka. Kafka threw an exception regarding the zookeeper state and shut down, but now the kafka container is unresponsive in Docker. I can't stop it (it's already stopped?), yet it shows as running. I can't CLI into it, and I can't restart it. The only way forward is to restart Docker using the debug menu. (If I have the restart: always flag on, the containers will restart automatically, but given they're throwing errors, they just spin in circles, starting and then dying, without my being able to stop/kill/remove the offending container.)
Once I've restarted Docker, I'm able to view the log of the container and see the error that was thrown.
This happens with pretty much all of my containers; however, it does appear that if I start a container while viewing its log window within Docker Desktop, it is perhaps more likely that I'll be able to start the container again if it hits an error.
I've tried several different containers and this seems to be a pretty common issue for us. It doesn't appear to relate to any specific settings I'm passing into the containers; an extract from our docker-compose file is below:
volumes:
  zData:
  kData:
  eData:
zookeeper:
  container_name: zookeeper
  image: bitnami/zookeeper:latest
  environment:
    ALLOW_ANONYMOUS_LOGIN: "yes" #Dev only
    ZOOKEEPER_ROOT_LOGGER: WARN, CONSOLE
    ZOOKEEPER_CONSOLE_THRESHOLD: WARN
  ports:
    - "2181:2181"
  volumes:
    - zData:/bitnami/zookeeper:rw
  logging:
    driver: "fluentd"
    options:
      fluentd-address: localhost:24224
      tag: zookeeper
      fluentd-async-connect: "true"
kafka:
  container_name: kafka
  image: bitnami/kafka:latest
  depends_on:
    - zookeeper
  environment:
    ALLOW_PLAINTEXT_LISTENER: "yes" # Debug only
    KAFKA_ADVERTISED_PORT: 9092
    KAFKA_ADVERTISED_HOST_NAME: kafka
    KAFKA_CREATE_TOPICS: xx1_event:1:1,xx2_event:1:1,xx3_event:1:1,xx4_event:1:1
    KAFKA_JMX_OPTS: -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=${DOCKER_HOSTNAME} -Dcom.sun.management.jmxremote.rmi.port=9096 -Djava.net.preferIPv4Stack=true
    JMX_PORT: 9096
    KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
  hostname: kakfa
  ports:
    - 9092:9092
    - 9096:9096
  volumes:
    - kData:/bitnami/kafka:rw
  logging:
    driver: "fluentd"
    options:
      fluentd-address: localhost:24224
      tag: zookeeper
      fluentd-async-connect: "true"
elasticsearch:
  image: bitnami/elasticsearch:latest
  container_name: elasticsearch
  cpu_shares: 2048
  environment:
    ELASTICSEARCH_HEAP_SIZE: "2048m"
    xpack.monitoring.enabled: "false"
  ports:
    - 9200:9200
    - 9300:9300
  volumes:
    - C:/config/elasticsearch.yml:/opt/bitnami/elasticsearch/config/my_elasticsearch.yml:rw
    - eData:/bitnami/elasticsearch/data:rw
I've wondered about the potential for this to be a resourcing issue, but I'm running this on a reasonably specced laptop (i7, SSD, 16 GB RAM) using WSL 2 (it also happens when using Hyper-V), and RAM limits don't look like they're being approached. And when there are no errors on startup, the system runs fine and uses far more resources.
Any ideas on what I could try? I'm surprised there aren't many more people struggling with this.
There is currently an issue https://github.com/moby/moby/issues/40063 where containers will hang/freeze/become unresponsive when logging is set to fluentd in asynchronous mode AND the fluentd container is not operational.
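A quick way to check whether this is what's happening is to confirm the stuck container really uses the fluentd driver and whether anything is listening on the fluentd address (container name and port are the ones from the compose file above):

docker inspect --format "{{.HostConfig.LogConfig.Type}}" kafka
netstat -an | findstr 24224

If the driver is fluentd and nothing is listening on 24224, either start the fluentd collector before these services or temporarily remove the logging: sections so the containers fall back to the default json-file driver.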

Hosted Service RabbitMQ connection failure using Docker Compose

Here is the log
MassTransit.RabbitMqTransport.Integration.ConnectionContextFactory.CreateConnection(ISupervisor supervisor)
[02:51:48 DBG] Connect: guest#localhost:5672/
[02:51:48 WRN] Connection Failed: rabbitmq://localhost/
RabbitMQ.Client.Exceptions.BrokerUnreachableException: None of the specified endpoints were reachable
The RabbitMQ control panel shows the exchanges and queues as created, and when I make a publish request I see the queue come through, but then I get a MassTransit timeout as it tries to respond.
Here is my Docker YAML setup. I assume MassTransit pulls its connection settings from appsettings.json.
version: '3.4'
services:
  hostedservice:
    environment:
      - ASPNETCORE_ENVIRONMENT=development
    ports:
      - "80"
  rabbitmq3:
    hostname: "rabbitmq"
    image: rabbitmq:3-management
    environment:
      - RABBITMQ_DEFAULT_USER=guest
      - RABBITMQ_DEFAULT_PASS=guest
      - RABBITMQ_DEFAULT_VHOST=/
    ports:
      # AMQP protocol port
      - '5672:5672'
      # HTTP management UI
      - '15672:15672'
I'd suggest using the MassTransit Docker template to get a working setup. Or you can look at the template's source code and see how, when running in a container, it uses rabbitmq as the host name to connect.
You can download the template using NuGet.
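If it helps, installing and using the template from the command line looks roughly like this; the package and template names below are my assumption about the current MassTransit template package, so check the MassTransit documentation for the exact names:

:: assumed package/template names - verify against the MassTransit docs
dotnet new --install MassTransit.Templates
dotnet new mtworker -n MyService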
Thanks Chris, that moved me to using the container template, which cleared up the connection issue!

MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled

I continuously get this error in the var/reports file.
I tried the solution from the link below, but it is still not fixed.
Can anyone please help me with this, as it has now become critical?
MISCONF Redis is configured to save RDB snapshots
I have written this same answer here; posting it here as well.
TL;DR: Your Redis is not secure. Use the redis.conf from this link to secure it.
Long answer:
This is possibly due to an unsecured redis-server instance. The default redis image in a docker container is unsecured.
I was able to connect to redis on my webserver using just redis-cli -h <my-server-ip>
To sort this out, I went through this DigitalOcean article and many others and was able to close the port.
You can pick a default redis.conf from here
Then update your docker-compose redis section to the following (update file paths accordingly):
redis:
  restart: unless-stopped
  image: redis:6.0-alpine
  command: redis-server /usr/local/etc/redis/redis.conf
  env_file:
    - app/.env
  volumes:
    - redis:/data
    - ./app/conf/redis.conf:/usr/local/etc/redis/redis.conf
  ports:
    - "6379:6379"
The path to redis.conf in command and volumes should match.
Rebuild redis, or all the services, as required.
Then try redis-cli -h <my-server-ip> to verify (it stopped working for me).
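A sketch of the rebuild-and-verify steps above (the service name comes from the compose fragment; <my-server-ip> stays a placeholder for your server's address):

docker compose up -d --force-recreate redis
redis-cli -h <my-server-ip> ping

With the hardened redis.conf in place, the external ping should no longer get a PONG back.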
