Kafka container timeout - amazon-ec2

I have deployed a Hyperledger Fabric Kafka-based ordering service on AWS using Ansible. Everything worked fine until yesterday. Today, when I launch the network, the Kafka containers are unable to communicate with ZooKeeper. Here are the Docker logs of the Kafka containers:
[2017-11-16 08:23:36,075] FATAL Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 6000
at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:1223)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:155)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:129)
at kafka.utils.ZkUtils$.createZkClientAndConnection(ZkUtils.scala:89)
at kafka.utils.ZkUtils$.apply(ZkUtils.scala:71)
at kafka.server.KafkaServer.initZk(KafkaServer.scala:278)
at kafka.server.KafkaServer.startup(KafkaServer.scala:168)
at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:37)
at kafka.Kafka$.main(Kafka.scala:67)
at kafka.Kafka.main(Kafka.scala)
[2017-11-16 08:23:36,077] INFO shutting down (kafka.server.KafkaServer)
[2017-11-16 08:23:36,080] INFO shut down completed (kafka.server.KafkaServer)
[2017-11-16 08:23:36,081] FATAL Fatal error during KafkaServerStartable startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 6000
at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:1223)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:155)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:129)
at kafka.utils.ZkUtils$.createZkClientAndConnection(ZkUtils.scala:89)
at kafka.utils.ZkUtils$.apply(ZkUtils.scala:71)
at kafka.server.KafkaServer.initZk(KafkaServer.scala:278)
at kafka.server.KafkaServer.startup(KafkaServer.scala:168)
at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:37)
at kafka.Kafka$.main(Kafka.scala:67)
at kafka.Kafka.main(Kafka.scala)
[2017-11-16 08:23:36,082] INFO shutting down (kafka.server.KafkaServer)
I haven't changed any code or anything else, which is why I am unable to figure out what causes the problem. Any tips on how to solve this issue?

Finally fixed the issue. It was caused by an iptables setting that blocked ICMP packets from being forwarded from the flannel interface to the Docker interface, so the Docker containers couldn't communicate with each other. After adding iptables rules to allow that traffic, everything works fine for me.
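For anyone hitting the same timeout, it may help to first confirm whether ZooKeeper is reachable at all from where the Kafka container runs. Below is a minimal sketch in Python. The helper name `can_connect` is hypothetical, and the 6-second default only mirrors the 6000 ms timeout shown in the log. It tests a plain TCP connect, not ICMP and not the ZooKeeper protocol, so it narrows the problem down to networking rather than proving the iptables cause.

```python
import socket


def can_connect(host, port, timeout_s=6.0):
    """Return True if a TCP connection to host:port opens within timeout_s."""
    try:
        # create_connection resolves the host name and applies the timeout
        # to the connect attempt.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `can_connect("zookeeper0", 2181)` from the Kafka host or container would be one way to use it; `zookeeper0` is a placeholder host name and 2181 is ZooKeeper's default client port, neither taken from the question.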

Related

ejabberdctl start succeeds,but status and stop failed to connect to node

I was following this guide to set up ejabberd on a cluster: http://chadillac.github.io/2012/11/17/easy-ejabberd-clustering-guide-mnesia-mysql/
I am using two AWS instances with these IPs:
Master -> 111.222.333.444
Slave -> 222.333.444.555
Since I do not have DNS configured, I am using IP addresses such as 111.222.333.444 instead of 'master.domain.com'.
I haven't been successful at setting up the cluster yet, but before that I have a problem on my master node.
I start the server with
/tmp/ej1809/sbin/ejabberdctl start
I get no output, but I see in the logs that the server started.
Then I check the status using
/tmp/ej1809/sbin/ejabberdctl status
But I get the error
Failed RPC connection to the node 'ejabberd@111.222.333.444': nodedown
Even when I try to stop the node using /tmp/ej1809/sbin/ejabberdctl stop, I get
Failed RPC connection to the node 'ejabberd@111.222.333.444': nodedown
I cannot understand the reason behind it.
Can anyone help me solve it please?

Stop and kill processes like epmd, erl, beam.
Then start ejabberd with "ejabberdctl live", which keeps the Erlang shell open so you can see the log messages in real time, including the Erlang node name:
...
13:21:22.662 [info] ejabberd 19.02.52 is started in the node ejabberd@localhost in 7.07s
13:21:22.667 [info] Start accepting TCP connections at 0.0.0.0:5444 for ejabberd_http
13:21:22.667 [info] Application ejabberd started on node ejabberd@localhost
You can check if "epmd" knows about that node:
$ epmd -names
epmd: up and running on port 4369 with data:
name ejabberd at port 33519
Then let's see if ejabberdctl can connect with that node:
$ ejabberdctl help | grep "node name:"
--node nodename ejabberd node name: ejabberd@localhost
And finally:
$ ejabberdctl status
The node ejabberd@localhost is started with status: started
ejabberd 19.02.52 is running in that node
I assume you haven't yet edited anything in ejabberdctl.cfg, specifically ERLANG_NODE. If you did, I recommend reinstalling ejabberd to make sure you have the default configuration, and then retrying those steps. Once ejabberd works perfectly, you can start modifying the configuration files (ejabberd.yml and ejabberdctl.cfg) to suit your real requirements (clustering, etc).
Later, if you have problems setting up clustering, you may find some ideas for debugging the problem at
https://ejabberd.im/interconnect-erl-nodes/index.html
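If you want to script the epmd check instead of reading the output by eye, here is a rough sketch in Python. The function name `registered_nodes` is made up for illustration, and the parsing assumes the output has the same shape as the `epmd -names` listing shown above.

```python
import re


def registered_nodes(epmd_output):
    """Map node short names to ports from the text printed by `epmd -names`."""
    nodes = {}
    for line in epmd_output.splitlines():
        # Lines of interest look like: "name ejabberd at port 33519"
        match = re.match(r"\s*name (\S+) at port (\d+)\s*$", line)
        if match:
            nodes[match.group(1)] = int(match.group(2))
    return nodes


sample = """epmd: up and running on port 4369 with data:
name ejabberd at port 33519
"""
print(registered_nodes(sample))  # {'ejabberd': 33519}
```

An empty result would suggest the node is not registered with epmd under the name ejabberdctl expects, which is consistent with a nodedown error.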

Cannot produce events to Confluent Kafka deployed on AWS EC2 from local machine

I'm trying to connect from an external client (my laptop) to a broker in a Kafka cluster that I have running on EC2 machines. When I try to connect from my local machine I get the following error:
$ ./kafka-console-producer --broker-list AWS.PRIV.ATE.IP:9092 --topic test
>hi
>[2018-09-20 13:28:53,952] ERROR Error when sending message to topic test with key: null, value: 2 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for test-0: 1519 ms has passed since batch creation plus linger time
I know the topic exists because it is listed when I run the following from my local machine:
$ ./kafka-topics --list --zookeeper AWS.PRIV.ATE.IP:2181
__confluent.support.metrics
__consumer_offsets
_schemas
connect-configs
connect-offsets
connect-status
test
The cluster configuration is from Confluent's AWS quickstart template: https://github.com/aws-quickstart/quickstart-confluent-kafka/blob/master/templates/confluent-kafka.template and I'm running the open source version.
The three broker ec2 instances are visible to my local machine, which I verified by stopping the Kafka broker, starting a simple HTTP server on port 9092, and successfully curling that server using the internal IP address of the ec2 instance.
If I ssh into one of the broker instances I can successfully produce and consume messages across the cluster. The only update I've made to the out-of-the-box configuration provided by the template was changing listeners=PLAINTEXT://ec2-AWS-PUB-LIC-IP.compute-1.amazonaws.com:9092 in server.properties on each machine and then restarting the Kafka server.
I can provide more configuration or debugging info if necessary. I believe the issue is related to IP address discoverability/visibility, but I'm not entirely sure how.

You need to set advertised.listeners too.
See https://rmoff.net/2018/08/02/kafka-listeners-explained/ for details.
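The reason this matters: the address passed to --broker-list is only used for the first connection. The broker then returns its advertised address in the metadata, and the client connects to that address for the actual produce requests. As a rough illustration, the Python sketch below reads a server.properties-style text and reports which address a broker would advertise, relying on Kafka's documented fallback to `listeners` when `advertised.listeners` is unset. The helper names and the sample values are assumptions, not the asker's real configuration; the right value is whatever address the client can actually resolve and reach.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def effective_advertised_listeners(props):
    """Return advertised.listeners, falling back to listeners when unset."""
    return props.get("advertised.listeners") or props.get("listeners")


# Hypothetical sample: bind on all interfaces, advertise a reachable name.
sample = """
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://ec2-AWS-PUB-LIC-IP.compute-1.amazonaws.com:9092
"""
print(effective_advertised_listeners(parse_properties(sample)))
```

If the printed address is not one your laptop can resolve and reach, the producer will time out even though the initial connection succeeds.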

Kafka inside Docker - how to read/write to a topic from command line?

I have Kafka running inside Docker with SSL enabled.
docker ps:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b8f6b1c573a1 nginx:1.13.9-alpine "nginx -g 'daemon of…" 9 hours ago Up 9 hours 80/tcp, 0.0.0.0:8081->443/tcp ng
761ce6ee2960 confluentinc/cp-schema-registry:4.0.0 "/etc/confluent/dock…" 9 hours ago Up 9 hours 0.0.0.0:8080->8080/tcp, 8081/tcp sr
16d7b81dfbc8 confluentinc/cp-kafka:4.0.0 "/etc/confluent/dock…" 9 hours ago Up 9 hours 0.0.0.0:9092-9093->9092-9093/tcp k1
9be579992536 confluentinc/cp-zookeeper:4.0.0 "/etc/confluent/dock…" 9 hours ago Up 9 hours 2888/tcp, 0.0.0.0:2181->2181/tcp, 3888/tcp zk
How do I write to a topic from the command line?
Tried (topic 'test' exists):
kafka-console-producer --broker-list kafka:9093 --topic test
# [2018-04-23 17:55:14,325] ERROR Error when sending message to topic test with key: null, value: 2 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
# org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms. (60s timeout)
kafka-console-producer --broker-list kafka:9092 --topic test
>aa
#[2018-04-23 18:00:59,443] WARN [Producer clientId=console-producer] Error while fetching metadata with correlation id 1 : {test=TOPIC_AUTHORIZATION_FAILED} (org.apache.kafka.clients.NetworkClient)
#[2018-04-23 18:00:59,444] ERROR Error when sending message to topic test with key: null, value: 2 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
# org.apache.kafka.common.errors.TopicAuthorizationException: Not authorized to access topics: [test]
kafka-console-producer --broker-list localhost:9092 --topic test dnk306#9801a7a5b33d
>aa
#[2018-04-23 21:52:47,056] WARN [Producer clientId=console-producer] Error while fetching metadata with correlation id 1 : {test=TOPIC_AUTHORIZATION_FAILED} (org.apache.kafka.clients.NetworkClient)
#[2018-04-23 21:52:47,056] ERROR Error when sending message to topic test with key: null, value: 2 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
#org.apache.kafka.common.errors.TopicAuthorizationException: Not authorized to access topics: [test]

From inside the Schema Registry or Kafka containers, and if you used the Docker Compose configuration, this is possible because Compose links the containers. However, your container hostname is likely k1, not kafka.
If you are inside the Kafka container, or using CLI commands outside the container, then it's simply localhost:9093 (because you forwarded the port).
If you are inside some container other than the Kafka one and you want to resolve the Kafka container by hostname, you must pass --link kafka to docker run. See the Docker documentation on linking containers.
Also important: you'll need to link the Kafka container to the ZooKeeper container, and Schema Registry to either ZooKeeper or Kafka, depending on how it's configured.
https://docs.confluent.io/current/installation/docker/docs/quickstart.html
Also SSL Kafka Docker compose examples here
TopicAuthorizationException: Not authorized to access topics: [test]
This indicates a successful connection to the Kafka container. Your next step is to ensure your Java environment has the necessary keys to access the broker over SSL.

Getting NettyTcpClient connection error with Spring WebSocket with amqp broker (rabbitmq) while running in docker

I am trying to use RabbitMQ as the StompBroker in the WebSocketConfig. I have added the reactor-core and reactor-net dependencies.
It works fine when starting up locally (localhost), but when I try to start the project as a Docker container (the RabbitMQ image is correctly built with the STOMP plugin), I get the error below.
r.io.net.impl.netty.tcp.NettyTcpClient : Failed to connect to /127.0.0.1:61613. Attempting reconnect in 5000ms
I tried setting setRelayHost("rabbitmq") and setRelayPort(61613) (rabbitmq is the service name in the Docker Compose file, so I am using it to resolve the host name). Still no luck.

Is a local Zookeeper cluster required to run Apache Storm in local cluster mode?

I have been trying to get a local copy of Storm working, following the guide in the storm-starter repo, and this tutorial.
When trying to run a topology with mvn compile exec:java -Dstorm.topology=org.apache.storm.starter.ExclamationTopology, the output eventually loops, repeating:
28534 [Thread-9-SendThread(localhost:2000)] INFO o.a.s.s.o.a.z.ClientCnxn - Opening socket connection to server localhost/127.0.0.1:2000. Will not attempt to authenticate using SASL (unknown error)
28534 [Thread-9-SendThread(localhost:2000)] WARN o.a.s.s.o.a.z.ClientCnxn - Session 0x152f7728a6a0011 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_45]
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:1.8.0_45]
at org.apache.storm.shade.org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361) ~[storm-core-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
at org.apache.storm.shade.org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) [storm-core-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
It seems it is trying to connect to a local Zookeeper cluster, but I have not seen the dependency or install requirement for Zookeeper in the Storm docs or in this other tutorial.
Do I need to install Zookeeper and is this just missing from the docs? Perhaps I'm mistaken and it is looking for something else at port 2000 on my localhost? If not, what is going wrong in my local setup?

If you run locally and use LocalCluster, you do not need to install Zookeeper.
If you run locally in pseudo-distributed mode (i.e., start up Nimbus and Supervisor locally) and use StormSubmitter, you do need to install Zookeeper locally.
