Kafka's Consumer can't start - hadoop

I typed the command below, but it says it is unable to connect to localhost:2181. I have already started ZooKeeper.
bin/kafka-console-consumer.sh --zookeeper --localhost:2181 --topic pain --from-beginning
Exception in thread "main" org.I0Itec.zkclient.exception.ZkException: Unable to connect to --localhost:2181
at org.I0Itec.zkclient.ZkConnection.connect(ZkConnection.java:66)
at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:876)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:98)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:84)
at kafka.consumer.ZookeeperConsumerConnector.connectZk(ZookeeperConsumerConnector.scala:171)
at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:126)
at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:143)
at kafka.consumer.Consumer$.create(ConsumerConnector.scala:94)
at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:145)
at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
Caused by: java.net.UnknownHostException: --localhost
at java.net.InetAddress.getAllByName0(InetAddress.java:1252)
at java.net.InetAddress.getAllByName(InetAddress.java:1164)
at java.net.InetAddress.getAllByName(InetAddress.java:1098)
at org.apache.zookeeper.client.StaticHostProvider.init(StaticHostProvider.java:61)
at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445)
at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:380)
at org.I0Itec.zkclient.ZkConnection.connect(ZkConnection.java:64)
... 9 more
[root@crxy2 kafka_2.10-0.8.2.0]# jps
5487 QuorumPeerMain
5862 Jps
5518 Kafka
[root@crxy2 kafka_2.10-0.8.2.0]# bin/kafka-topics.sh --zookeeper localhost:2181 --list
pain

It should be:
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic pain --from-beginning
Note --zookeeper localhost:2181 instead of --zookeeper --localhost:2181; the extra -- in front of localhost is being treated as part of the hostname, which is why the stack trace shows UnknownHostException: --localhost.

Try the following command:
bin/kafka-console-consumer.sh --bootstrap-server localhost:9093 --topic pain --from-beginning
Here I have assumed that one of your Kafka brokers is running at localhost:9093. If not, provide the address of any live Kafka broker for the --bootstrap-server parameter.
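If you are not sure which address a live broker is using, a quick check (a sketch, assuming the broker was started with the default config/server.properties in the Kafka installation directory) is:
grep -E '^(listeners|port|advertised)' config/server.properties
netstat -tulpn | grep 9093   # confirm something is actually listening on the port you pass to --bootstrap-server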

Related

What is this error on spark-submit with HDFS HA on YARN?

Here is my error log:
$ /spark-submit --master yarn --deploy-mode cluster pi.py
...
2021-12-23 01:31:04,330 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category WRITE is not supported in state standby. Visit https://s.apache.org/sbnn-error
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:88)
at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1954)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1442)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setPermission(FSNamesystem.java:1895)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setPermission(NameNodeRpcServer.java:860)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setPermission(ClientNamenodeProtocolServerSideTranslatorPB.java:526)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
, while invoking ClientNamenodeProtocolTranslatorPB.setPermission over master/172.17.0.2:8020. Trying to failover immediately.
...
Why do I get this error?
NOTE: The Spark master runs on 'master', so the spark-submit command is run on 'master'.
NOTE: The Spark workers run on 'worker1', 'worker2', and 'worker3'.
NOTE: The ResourceManager runs on 'master' and 'master2'.
ADDENDUM: When the above error log is printed, master2's DFSZKFailoverController has disappeared from the jps output.
ADDENDUM: When the above error log is printed, master's NameNode has disappeared from the jps output.
This happens when Spark is unable to access HDFS.
If HA is configured correctly, the HDFS client handles the StandbyException by failing over to the other NameNode in the HA pair and then retrying the operation.
As a test, replace the nameservice URI with the active NameNode's address and check whether you still get the same error; if the error goes away, client-side HA failover is not configured properly.
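As a quick check (a sketch; 'mycluster', 'nn1', and 'nn2' are placeholders for your own dfs.nameservices and dfs.ha.namenodes values), you can confirm which NameNode is active and that the client-side failover proxy provider is set:
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
hdfs getconf -confKey dfs.client.failover.proxy.provider.mycluster
# expected: org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider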

Unable to send message from Producer to Consumer

kafka-console-consumer.bat --bootstrap-server localhost:2181 --topic baeldung
kafka-console-producer.bat --broker-list localhost:9092 --topic baeldung
Message is not going from producer to consumer.
In the consumer command you've mistakenly used the ZooKeeper port 2181. It has to be localhost:9092.
Consumer script:
kafka-console-consumer.bat --bootstrap-server localhost:2181 --topic baeldung
Producer script:
kafka-console-producer.bat --broker-list localhost:9092 --topic baeldung
In the above commands, the broker addresses are different. The producer has the correct address localhost:9092, while the consumer script has the ZooKeeper address localhost:2181. Change it to localhost:9092 like this:
kafka-console-consumer.bat --bootstrap-server localhost:9092 --topic baeldung
The commands you need in order to run the producer and consumer:
Consumer:
kafka-console-consumer.bat --bootstrap-server localhost:9092 --topic baeldung
Producer:
kafka-console-producer.bat --broker-list localhost:9092 --topic baeldung
If you want to consume messages from the beginning, include --from-beginning in the consumer command; otherwise it consumes only the latest messages by default.
If you add more brokers to your cluster, in order to consume/produce from all brokers just add their addresses, like: localhost:9092,localhost:9093,localhost:9094
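For example, a multi-broker consumer command would look something like this (a sketch, assuming three local brokers on ports 9092-9094):
kafka-console-consumer.bat --bootstrap-server localhost:9092,localhost:9093,localhost:9094 --topic baeldung --from-beginning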

Kafka | Unable to publish data to broker - ClosedChannelException

I am trying to run a simple Kafka producer-consumer example on HDP but am facing the exception below.
[2016-03-03 18:26:38,683] WARN Fetching topic metadata with correlation id 0 for topics [Set(page_visits)] from broker [BrokerEndPoint(0,sandbox.hortonworks.com,9092)] failed (kafka.client.ClientUtils$)
java.nio.channels.ClosedChannelException
at kafka.network.BlockingChannel.send(BlockingChannel.scala:120)
at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:75)
at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:74)
at kafka.producer.SyncProducer.send(SyncProducer.scala:115)
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:59)
at kafka.producer.BrokerPartitionInfo.updateInfo(BrokerPartitionInfo.scala:82)
at kafka.producer.async.DefaultEventHandler$$anonfun$handle$1.apply$mcV$sp(DefaultEventHandler.scala:68)
at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:89)
at kafka.utils.Logging$class.swallowError(Logging.scala:106)
at kafka.utils.CoreUtils$.swallowError(CoreUtils.scala:51)
at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:68)
at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:105)
at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:88)
at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:68)
at scala.collection.immutable.Stream.foreach(Stream.scala:547)
at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:67)
at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:45)
[2016-03-03 18:26:38,688] ERROR fetching topic metadata for topics [Set(page_visits)] from broker [ArrayBuffer(BrokerEndPoint(0,sandbox.hortonworks.com,9092))] failed (kafka.utils.CoreUtils$)
kafka.common.KafkaException: fetching topic metadata for topics [Set(page_visits)] from broker [ArrayBuffer(BrokerEndPoint(0,sandbox.hortonworks.com,9092))] failed
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:73)
at kafka.producer.BrokerPartitionInfo.updateInfo(BrokerPartitionInfo.scala:82)
at kafka.producer.async.DefaultEventHandler$$anonfun$handle$1.apply$mcV$sp(DefaultEventHandler.scala:68)
at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:89)
at kafka.utils.Logging$class.swallowError(Logging.scala:106)
at kafka.utils.CoreUtils$.swallowError(CoreUtils.scala:51)
at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:68)
at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:105)
at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:88)
at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:68)
at scala.collection.immutable.Stream.foreach(Stream.scala:547)
at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:67)
at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:45)
Caused by: java.nio.channels.ClosedChannelException
at kafka.network.BlockingChannel.send(BlockingChannel.scala:120)
at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:75)
at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:74)
at kafka.producer.SyncProducer.send(SyncProducer.scala:115)
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:59)
... 12 more
[2016-03-03 18:26:38,693] WARN Fetching topic metadata with correlation id 1 for topics [Set(page_visits)] from broker [BrokerEndPoint(0,sandbox.hortonworks.com,9092)] failed (kafka.client.ClientUtils$)
java.nio.channels.ClosedChannelException
Here is the command that I am using for the producer.
./kafka-console-producer.sh --broker-list sandbox.hortonworks.com:9092 --topic page_visits
After a bit of googling, I found that I need to add the advertised.host.name property to the server.properties file.
Here is my server.properties file.
# Generated by Apache Ambari. Thu Mar 3 18:12:50 2016
advertised.host.name=sandbox.hortonworks.com
auto.create.topics.enable=true
auto.leader.rebalance.enable=true
broker.id=0
compression.type=producer
controlled.shutdown.enable=true
controlled.shutdown.max.retries=3
controlled.shutdown.retry.backoff.ms=5000
controller.message.queue.size=10
controller.socket.timeout.ms=30000
default.replication.factor=1
delete.topic.enable=false
fetch.purgatory.purge.interval.requests=10000
host.name=sandbox.hortonworks.com
kafka.ganglia.metrics.group=kafka
kafka.ganglia.metrics.host=localhost
kafka.ganglia.metrics.port=8671
kafka.ganglia.metrics.reporter.enabled=true
kafka.metrics.reporters=org.apache.hadoop.metrics2.sink.kafka.KafkaTimelineMetricsReporter
kafka.timeline.metrics.host=sandbox.hortonworks.com
kafka.timeline.metrics.maxRowCacheSize=10000
kafka.timeline.metrics.port=6188
kafka.timeline.metrics.reporter.enabled=true
kafka.timeline.metrics.reporter.sendInterval=5900
leader.imbalance.check.interval.seconds=300
leader.imbalance.per.broker.percentage=10
listeners=PLAINTEXT://sandbox.hortonworks.com:6667
log.cleanup.interval.mins=10
log.dirs=/kafka-logs
log.index.interval.bytes=4096
log.index.size.max.bytes=10485760
log.retention.bytes=-1
log.retention.hours=168
log.roll.hours=168
log.segment.bytes=1073741824
message.max.bytes=1000000
min.insync.replicas=1
num.io.threads=8
num.network.threads=3
num.partitions=1
num.recovery.threads.per.data.dir=1
num.replica.fetchers=1
offset.metadata.max.bytes=4096
offsets.commit.required.acks=-1
offsets.commit.timeout.ms=5000
offsets.load.buffer.size=5242880
offsets.retention.check.interval.ms=600000
offsets.retention.minutes=86400000
offsets.topic.compression.codec=0
offsets.topic.num.partitions=50
offsets.topic.replication.factor=3
offsets.topic.segment.bytes=104857600
producer.purgatory.purge.interval.requests=10000
queued.max.requests=500
replica.fetch.max.bytes=1048576
replica.fetch.min.bytes=1
replica.fetch.wait.max.ms=500
replica.high.watermark.checkpoint.interval.ms=5000
replica.lag.max.messages=4000
replica.lag.time.max.ms=10000
replica.socket.receive.buffer.bytes=65536
replica.socket.timeout.ms=30000
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
socket.send.buffer.bytes=102400
zookeeper.connect=sandbox.hortonworks.com:2181
zookeeper.connection.timeout.ms=15000
zookeeper.session.timeout.ms=30000
zookeeper.sync.time.ms=2000
After adding the property I am still getting the same exception.
Any suggestions?
I had a similar problem. First I checked the listeners property for the Kafka broker in Ambari.
It is also possible to check it with:
[root@sandbox bin]# cat /usr/hdp/current/kafka-broker/conf/server.properties | grep listeners
listeners=PLAINTEXT://sandbox.hortonworks.com:6667
As you can see, Ambari replaces localhost with the hostname, and the port stays the same: 6667.
Then I checked that the broker really listens on that port:
[root@sandbox bin]# netstat -tulpn | grep 6667
tcp 0 0 10.0.2.15:6667 0.0.0.0:* LISTEN 11137/java
The next step was to launch the producer:
./kafka-console-producer.sh --broker-list 10.0.2.15:6667 --topic test
Finally I launched the consumer:
./kafka-console-consumer.sh --zookeeper 10.0.2.15:2181 --topic test --from-beginning
After typing a few words and hitting Enter on the producer side, the consumer received the messages.
As per the log it seems the Kafka server (broker) is not running. The broker should be running first.
Producers and consumers are client programs that interact with the broker servers and with ZooKeeper as well.
Before running the producer or consumer, please check whether the broker and ZooKeeper are running successfully.
Run the server:
./kafka-server-start.sh ../config/server.properties
Check the logs for any errors; if there are none, start producing messages to the server.
Check the ZooKeeper service as well; a quick check is shown below.
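As a sanity check (a sketch; adjust the ZooKeeper address to your own), you can list the broker ids registered in ZooKeeper and inspect one of them to see which host and port the broker actually advertises:
./zookeeper-shell.sh sandbox.hortonworks.com:2181 ls /brokers/ids
./zookeeper-shell.sh sandbox.hortonworks.com:2181 get /brokers/ids/0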
I modified the file /usr/hdp/current/kafka-broker/config/server.properties with the following 2 lines:
advertised.host.name=sandbox.hortonworks.com
listeners=PLAINTEXT://sandbox.hortonworks.com:6667,PLAINTEXT://0.0.0.0:6667
Then I ran the following commands:
./kafka-console-producer.sh --broker-list sandbox.hortonworks.com:6667 --topic tst2
./kafka-console-consumer.sh --zookeeper localhost:2181 --topic tst2 --from-beginning
With this it is working fine.

Kafka: my producer works when the topic is Par: 0, Lead: 1, Rep: 1, Isr: 1 BUT NOT Par: 0, Lead: 2, Rep: 2, Isr: 2

I have a Kafka cluster with 3 Kafka nodes and 3 ZooKeeper nodes.
The producer is on an AWS machine and is trying to push data to my Kafka cluster running on my intranet servers.
When the topic (JOB_AWS_14) is created from console with
Partition: 0 Leader: 1 Replicas: 1 Isr: 1
it works fine.
But when a topic (JOB_AWS_8) is created with
Partition: 0 Leader: 2 Replicas: 2 Isr: 2
it is not working.
Which setting went wrong, and how do I correct it?
Please help me.
# bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic JOB_AWS_14
Topic:JOB_AWS_14 PartitionCount:1 ReplicationFactor:1 Configs:
Topic: JOB_AWS_14 Partition: 0 Leader: 1 Replicas: 1 Isr: 1
# bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic JOB_AWS_8
Topic:JOB_AWS_8 PartitionCount:1 ReplicationFactor:1 Configs:
Topic: JOB_AWS_8 Partition: 0 Leader: 2 Replicas: 2 Isr: 2
Your producer can reach Kafka node 1, which is the leader for topic "JOB_AWS_14", so you are able to produce messages to that topic. For topic "JOB_AWS_8" the leader is Kafka node 2, and your producer might not be able to reach node 2. Make sure your producer can reach node 2.
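One way to verify this (a sketch; 'kafka-node-2' and 9092 are placeholders for the actual host and port of broker id 2) is to check what broker 2 advertises in ZooKeeper and whether the producer machine can open a TCP connection to it:
bin/zookeeper-shell.sh localhost:2181 get /brokers/ids/2
nc -vz kafka-node-2 9092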

Spring XD on YARN: not able to stream kafka source to hdfs sink

I have a proper HDFS Resource Manager up and running.
I'm able to bring up Admin server and containers successfully in YARN.
But I'm not able to stream Kafka (source) to HDFS (sink).
I configured the custom modules provided for Kafka (source) and HDFS (sink).
But when I produce a kafka message for a topic, nothing is happening in the YARN cluster.
Setup details:
HDFS / YARN apache version 2.6.0
Spring XD on YARN --- spring-xd-1.2.0.RELEASE-yarn.zip
I just tried this with an Ambari-deployed XD cluster and then with XD on YARN; both worked fine. For YARN I used spring-xd-1.2.1.RELEASE-yarn.zip and just used the xd-shell deployed by Ambari. I took the settings from a cluster and used Ambari's PostgreSQL database, where I created an xdjob database with a springxd user.
My config/servers.yml looked something like this before bin/xd-yarn push:
xd:
  appmasterMemory: 512M
  adminServers: 1
  adminMemory: 512M
  adminJavaOpts: -XX:MaxPermSize=128m
  adminLocality: false
  containers: 3
  containerMemory: 512M
  containerJavaOpts: -XX:MaxPermSize=128m
  containerLocality: false
spring:
  yarn:
    applicationBaseDir: /xd/yarn/
---
xd:
  container:
    groups: yarn
---
spring:
  yarn:
    siteYarnAppClasspath: "/etc/hadoop/conf,/usr/hdp/current/hadoop-client/*,/usr/hdp/current/hadoop-client/lib/*,/usr/hdp/current/hadoop-hdfs-client/*,/usr/hdp/current/hadoop-hdfs-client/lib/*,/usr/hdp/current/hadoop-yarn-client/*,/usr/hdp/current/hadoop-yarn-client/lib/*"
    siteMapreduceAppClasspath: "/usr/hdp/current/hadoop-mapreduce-client/*,/usr/hdp/current/hadoop-mapreduce-client/lib/*"
    config:
      mapreduce.application.framework.path: '/hdp/apps/2.2.6.0-2800/mapreduce/mapreduce.tar.gz#mr-framework'
      mapreduce.application.classpath: '$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/2.2.6.0-2800/hadoop/lib/hadoop-lzo-0.6.0.2.2.4.2-2.jar:/etc/hadoop/conf/secure'
---
spring:
  hadoop:
    fsUri: hdfs://ambari-2.localdomain:8020
    resourceManagerHost: ambari-3.localdomain
    resourceManagerPort: 8050
    resourceManagerSchedulerAddress: ambari-3.localdomain:8030
    jobHistoryAddress: ambari-3.localdomain:10020
---
zk:
  namespace: xd
  client:
    connect: ambari-3.localdomain:2181
    sessionTimeout: 60000
    connectionTimeout: 30000
    initialRetryWait: 1000
    retryMaxAttempts: 3
---
xd:
  customModule:
    home: ${spring.hadoop.fsUri}/xd/yarn/custom-modules
---
xd:
  transport: kafka
  messagebus:
    kafka:
      brokers: ambari-2.localdomain:6667
      zkAddress: ambari-3.localdomain:2181
---
spring:
  datasource:
    url: jdbc:postgresql://ambari-1.localdomain/xdjob
    username: springxd
    password: springxd
    driverClassName: org.postgresql.Driver
    validationQuery: select 1
---
server:
  port: 0
---
spring:
  profiles: admin
management:
  port: 0
You can use e.g. the time source, but I used the http source. To find the correct address, use runtime containers and runtime modules to see where the http source is running. From the xd-yarn shell you can use admininfo, which queries ZooKeeper to see where the XD admin is running. This is needed so that you can connect xd-shell to the admin running on YARN.
xd:>admin config server http://ambari-2.localdomain:50254
Successfully targeted http://ambari-2.localdomain:50254
Then let's create the Kafka streams using the kafka sink and source.
xd:>stream create httpToKafkaStream --definition "http | kafka --topic=mytopic --brokerList=ambari-2.localdomain:6667" --deploy
xd:>stream create kafkaToHdfsStream --definition "kafka --zkconnect=ambari-3.localdomain:2181 --topic=mytopic --outputType=text/
xd:>http post --target http://ambari-5.localdomain:9000 --data "message1"
xd:>http post --target http://ambari-5.localdomain:9000 --data "message2"
xd:>http post --target http://ambari-5.localdomain:9000 --data "message3"
xd:>http post --target http://ambari-5.localdomain:9000 --data "message4"
xd:>http post --target http://ambari-5.localdomain:9000 --data "message5"
xd:>hadoop fs cat /xd/kafkaToHdfsStream/kafkaToHdfsStream-0.txt
message1
message2
message3
message4
message5
NOTE: you need to have mytopic created beforehand; currently the kafka source will fail to start if the topic doesn't exist.
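A sketch of creating the topic up front, assuming the ZooKeeper address and single-broker replication from the config above:
bin/kafka-topics.sh --create --zookeeper ambari-3.localdomain:2181 --replication-factor 1 --partitions 1 --topic mytopic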
