Spring XD on YARN: not able to stream kafka source to hdfs sink - spring-xd

Spring XD on YARN: not able to stream kafka source to hdfs sink
I have a proper HDFS Resource Manager up and running.
I'm able to bring up Admin server and containers successfully in YARN.
But I'm not able to stream Kafka(source) to HDFS(sink)
I configured the custom modules provided for Kafka(source) and hdfs(sink).
But when I produce a kafka message for a topic, nothing is happening in the YARN cluster.
Setup details:
HDFS / YARN apache version 2.6.0
Spring XD on YARN --- spring-xd-1.2.0.RELEASE-yarn.zip

I just tried this with ambari deployed XD cluster and then XD on YARN, both worked fine. For yarn I used spring-xd-1.2.1.RELEASE-yarn.zip and just used xd-shell deployed by ambari. I just took settings from a cluster and used ambari's postgred db where I created xdjob database with springxd user.
My config/servers.yml looked something like this before bin/xd-yarn push.
xd:
appmasterMemory: 512M
adminServers: 1
adminMemory: 512M
adminJavaOpts: -XX:MaxPermSize=128m
adminLocality: false
containers: 3
containerMemory: 512M
containerJavaOpts: -XX:MaxPermSize=128m
containerLocality: false
spring:
yarn:
applicationBaseDir: /xd/yarn/
---
xd:
container:
groups: yarn
---
spring:
yarn:
siteYarnAppClasspath: "/etc/hadoop/conf,/usr/hdp/current/hadoop-client/*,/usr/hdp/current/hadoop-client/lib/*,/usr/hdp/current/hadoop-hdfs-client/*,/usr/hdp/current/hadoop-hdfs-client/lib/*,/usr/hdp/current/hadoop-yarn-client/*,/usr/hdp/current/hadoop-yarn-client/lib/*"
siteMapreduceAppClasspath: "/usr/hdp/current/hadoop-mapreduce-client/*,/usr/hdp/current/hadoop-mapreduce-client/lib/*"
config:
mapreduce.application.framework.path: '/hdp/apps/2.2.6.0-2800/mapreduce/mapreduce.tar.gz#mr-framework'
mapreduce.application.classpath: '$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/2.2.6.0-2800/hadoop/lib/hadoop-lzo-0.6.0.2.2.4.2-2.jar:/etc/hadoop/conf/secure'
---
spring:
hadoop:
fsUri: hdfs://ambari-2.localdomain:8020
resourceManagerHost: ambari-3.localdomain
resourceManagerPort: 8050
resourceManagerSchedulerAddress: ambari-3.localdomain:8030
jobHistoryAddress: ambari-3.localdomain:10020
---
zk:
namespace: xd
client:
connect: ambari-3.localdomain:2181
sessionTimeout: 60000
connectionTimeout: 30000
initialRetryWait: 1000
retryMaxAttempts: 3
---
xd:
customModule:
home: ${spring.hadoop.fsUri}/xd/yarn/custom-modules
---
xd:
transport: kafka
messagebus:
kafka:
brokers: ambari-2.localdomain:6667
zkAddress: ambari-3.localdomain:2181
---
spring:
datasource:
url: jdbc:postgresql://ambari-1.localdomain/xdjob
username: springxd
password: springxd
driverClassName: org.postgresql.Driver
validationQuery: select 1
---
server:
port: 0
---
spring:
profiles: admin
management:
port: 0
You can i.e. use time source but I used http source. Finding correct address use runtime containers and runtime modules to see where http source is running. From xd-yarn shell you can use admininfo which queries zk to see where xd admin is running. This is needed so that you can connect xd-shell to admin running on yarn.
xd:>admin config server http://ambari-2.localdomain:50254
Successfully targeted http://ambari-2.localdomain:50254
Then let's just create kafka streams using sink/source.
xd:>stream create httpToKafkaStream --definition "http | kafka --topic=mytopic --brokerList=ambari-2.localdomain:6667" --deploy
xd:>stream create kafkaToHdfsStream --definition "kafka --zkconnect=ambari-3.localdomain:2181 --topic=mytopic --outputType=text/
xd:>http post --target http://ambari-5.localdomain:9000 --data "message1"
xd:>http post --target http://ambari-5.localdomain:9000 --data "message2"
xd:>http post --target http://ambari-5.localdomain:9000 --data "message3"
xd:>http post --target http://ambari-5.localdomain:9000 --data "message4"
xd:>http post --target http://ambari-5.localdomain:9000 --data "message5"
xd:>hadoop fs cat /xd/kafkaToHdfsStream/kafkaToHdfsStream-0.txt
message1
message2
message3
message4
message5
NOTE: you need to have mytopic created, currently kafka source will fail to start if it doesn't exist.

Related

Filebeat not starting in windows

Facing problem with staring up the Filebeat in windows 10, i have modified the filebeat prospector log path with elasticsearch log folder located in my local machine "E:" drive also i have validated the format of filebeat.yml after made the correction but still am getting below error on start up.
Filebeat version : 6.2.3
Windows version: 64 bit
Filebeat.yml (validated yml format)
filebeat.prospectors:
-
type: log
enabled: true
paths:
- 'E:\Research\ELK\elasticsearch-6.2.3\logs\*.log'
filebeat.config.modules:
path: '${path.config}/modules.d/*.yml'
reload.enabled: false
setup.template.settings:
index.number_of_shards: 3
setup.kibana:
host: 'localhost:5601'
output.elasticsearch:
hosts:
- 'localhost:9200'
username: elastic
password: elastic
Filebeat Startup Log:
E:\Research\ELK\filebeat-6.2.3-windows-x86_64>filebeat --setup -e
2018-03-24T22:58:39.660+0530 INFO instance/beat.go:468 Home path: [E:\Research\ELK\filebeat-6.2.3-windows-x86_64] Config path: [E:\Research\ELK\filebeat-6.2.3-windows-x86_64] Data path: [E:\Research\ELK\filebeat-6.2.3-windows-x86_64\data] Logs path: [E:\Research\ELK\filebeat-6.2.3-windows-x86_64\logs]
2018-03-24T22:58:39.661+0530 INFO instance/beat.go:475 Beat UUID: f818bcc0-25bb-4545-bcd4-3523366a4c0e
2018-03-24T22:58:39.662+0530 INFO instance/beat.go:213 Setup Beat: filebeat; Version: 6.2.3
2018-03-24T22:58:39.662+0530 INFO elasticsearch/client.go:145 Elasticsearch url: http://localhost:9200
2018-03-24T22:58:39.665+0530 INFO pipeline/module.go:76 Beat name: DESKTOP-J932HJH
2018-03-24T22:58:39.666+0530 INFO [monitoring] log/log.go:97 Starting metrics logging every 30s
2018-03-24T22:58:39.666+0530 INFO elasticsearch/client.go:145 Elasticsearch url: http://localhost:9200
2018-03-24T22:58:39.672+0530 INFO elasticsearch/client.go:690 Connected to Elasticsearch version 6.2.3
2018-03-24T22:58:39.672+0530 INFO kibana/client.go:69 Kibana url: http://localhost:5601
2018-03-24T22:59:08.882+0530 INFO instance/beat.go:583 Kibana dashboards successfully loaded.
2018-03-24T22:59:08.882+0530 INFO elasticsearch/client.go:145 Elasticsearch url: http://localhost:9200
2018-03-24T22:59:08.885+0530 INFO elasticsearch/client.go:690 Connected to Elasticsearch version 6.2.3
2018-03-24T22:59:08.888+0530 INFO instance/beat.go:301 filebeat start running.
2018-03-24T22:59:08.888+0530 INFO registrar/registrar.go:108 Loading registrar data from E:\Research\ELK\filebeat-6.2.3-windows-x86_64\data\registry
2018-03-24T22:59:08.888+0530 INFO registrar/registrar.go:119 States Loaded from registrar: 5
2018-03-24T22:59:08.888+0530 INFO crawler/crawler.go:48 Loading Prospectors: 1
2018-03-24T22:59:08.889+0530 INFO log/prospector.go:111 Configured paths: [E:\Research\ELK\elasticsearch-6.2.3\logs\*.log]
2018-03-24T22:59:08.890+0530 INFO log/harvester.go:216 Harvester started for file: E:\Research\ELK\elasticsearch-6.2.3\logs\elasticsearch.log
2018-03-24T22:59:08.892+0530 ERROR fileset/factory.go:69 Error creating prospector: No paths were defined for prospector accessing config
2018-03-24T22:59:08.892+0530 INFO crawler/crawler.go:109 Stopping Crawler
2018-03-24T22:59:08.893+0530 INFO crawler/crawler.go:119 Stopping 1 prospectors
2018-03-24T22:59:08.897+0530 INFO log/prospector.go:410 Scan aborted because prospector stopped.
2018-03-24T22:59:08.897+0530 INFO log/harvester.go:216 Harvester started for file: E:\Research\ELK\elasticsearch-6.2.3\logs\elasticsearch_deprecation.log
2018-03-24T22:59:08.897+0530 INFO prospector/prospector.go:121 Prospector ticker stopped
2018-03-24T22:59:08.898+0530 INFO prospector/prospector.go:138 Stopping Prospector: 18361622063543553778
2018-03-24T22:59:08.898+0530 INFO log/harvester.go:237 Reader was closed: E:\Research\ELK\elasticsearch-6.2.3\logs\elasticsearch.log. Closing.
2018-03-24T22:59:08.898+0530 INFO crawler/crawler.go:135 Crawler stopped
2018-03-24T22:59:08.899+0530 INFO registrar/registrar.go:210 Stopping Registrar
2018-03-24T22:59:08.908+0530 INFO registrar/registrar.go:165 Ending Registrar
2018-03-24T22:59:08.910+0530 INFO instance/beat.go:308 filebeat stopped.
2018-03-24T22:59:08.948+0530 INFO [monitoring] log/log.go:132 Total non-zero metrics
2018-03-24T22:59:08.948+0530 INFO [monitoring] log/log.go:133 Uptime: 29.3387858s
2018-03-24T22:59:08.949+0530 INFO [monitoring] log/log.go:110 Stopping metrics logging.
2018-03-24T22:59:08.950+0530 ERROR instance/beat.go:667 Exiting: No paths were defined for prospector accessing config
Exiting: No paths were defined for prospector accessing config
Check this path ${path.config}/modules.d/
or check by command line "filebeat.exe modules list", if some modules are active, which do not work with windows.
For instance the system.yml (module) does not run on plain windows, because there is no syslog. But the system module is active by default. So you have to disable it first.
If I have it enabled, I run in the exactly the same error message, and filebeat stops.
Rewrite the first part of the yml using this format:
filebeat.prospectors:
- type: log
enabled: true
paths:
- /var/log/*.log
#- c:\programdata\elasticsearch\logs\*
Remove also the empty new line and take attention to the indentation.
I understand that this topic is a bit old however looking at the amount of views that this has received at the time of posting this (June 2019), I think it would be safe to add more informations as this is fairly frustrating to get while very easy to fix.
Before I explain what I did, allow me to say I had this problem on a Linux system but the problem/solution should be the same on all plateforms.
After having updated the logback-spring.xml and restarted the service, it kept refusing spitting back the following error:
ERROR instance/beat.go:824 Exiting: Can only start an input when all related states are finished: {Id:163850-64780 Finished:false Fileinfo:0xc42016c1a0 Source:/some/path/here/error.log Offset:0 Timestamp:2019-06-13 09:15:35.481163602 -0400 EDT m=+0.107516982 TTL:-1ns Type:log Meta:map[] FileStateOS:163850-64780}
My solution was simply to edit the /etc/filebeat/filebeat.yml and comment as much stuff as I could (Going back to nearly a vanilla/basic configuration).
After having done so, restarting filebeat worked and this ended up being a duplicate path entry with another file somewhere in the system, possibly under the modules.

Error starting jHipster Microservices with consul

I'm trying to build sample microservice app using this tutorial.
Jhipster version is 4.0.6
So i've created gateway, service and started consul using this command:
docker-compose -f src/main/docker/consul.yml up
from my gateway directory.
But the error occur on Spring Boot startup, here is the log:
2017-02-22 11:52:25.679 ERROR 3168 --- [ restartedMain] o.s.c.c.c.ConsulPropertySourceLocator : Fail fast is set and there was an error reading configuration from consul.
2017-02-22 11:52:32.491 WARN 3168 --- [ restartedMain] o.s.boot.SpringApplication : Error handling failed (ApplicationEventMulticaster not initialized - call 'refresh' before multicasting events via the context: org.springframework.boot.context.embedded.AnnotationConfigEmbeddedWebApplicationContext#2a5b2096: startup date [Thu Jan 01 03:00:00 AST 1970]; parent: org.springframework.context.annotation.AnnotationConfigApplicationContext#5108df79)
com.ecwid.consul.transport.TransportException: java.net.ConnectException: Connection refused: connect
Could you please assist with this issue?
UPDATE:
I've found that app tries to make GET request to URL on staptup:
http://localhost:8500/v1/kv/config/armory,dev/?recurse&token=
But the only data stored in Consul K/V storage is:
KEY: config/application/data
VALUE:
configserver:
name: Docker Consul Service
status: Connected to Consul Server running in Docker
jhipster:
security:
authentication:
jwt:
secret: my-secret-token-to-change-in-production
You must copy your application yaml configurations into your consul instance as explained in the doc into central-server-config directory if consul is running in dev profile or in its git repo if consul is running in prod profile.
So assuming your app is named "armory" you should copy your src/main/resources/config/application.yml to armory.yml and for each profiles (e.g. application-dev.yml to armory-dev.yml)

Spring-XD Sqoop Configuration

We are planning to execute sqoop jobs through Spring-XD and we did the configuration that as suggested in the Spring-XD documentation.
Below is the configuration from the servers.xml file
spring:
hadoop:
fsUri: hdfs://nnservice
resourceManagerHost: server-m01.mydomain.com
resourceManagerPort: 8032
resourceManagerSchedulerAddress: ${spring.hadoop.resourceManagerHost}:8030
jobHistoryAddress: server-m02.mydomain.com:10020
security:
authMethod: kerberos
userPrincipal: springxd#mydomain.COM
userKeytab: /home/springxd/springxd.keytab
namenodePrincipal: nn/_HOST#mydomain.COM
rmManagerPrincipal: rm/_HOST#mydomain.COM
config:
mapreduce.framework.name: yarn
dfs.nameservices: nnservice
dfs.ha.namenodes.nnservice: nn1, nn2
dfs.namenode.rpc-address.nnservice.nn1: server-m01.mydomain.com:8020
dfs.namenode.rpc-address.nnservice.nn2: server-m02.mydomain.com:8020
dfs.client.failover.proxy.provider.nnservice: org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
dfs.ha.automatic-failover.enabled: True
mapreduce.application.framework.path: '/hdp/apps/2.4.2.0-258/mapreduce/mapreduce.tar.gz#mr-framework'
mapreduce.application.classpath: '$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/2.4.2.0-258/hadoop/lib/hadoop-lzo-0.6.0.2.4.2.0-258.jar:/etc/hadoop/conf/secure'
yarnApplicationClasspath: '$HADOOP_CONF_DIR,/usr/hdp/current/hadoop-client/*,/usr/hdp/current/hadoop-client/lib/*,/usr/hdp/current/hadoop-hdfs-client/*,/usr/hdp/current/hadoop-hdfs-client/lib/*,/usr/hdp/current/hadoop-yarn-client/*,/usr/hdp/current/hadoop-yarn-client/lib/*'
The sqoop jobs are getting terminated with the below error
2016-10-28T10:08:35-0400 1.3.1.RELEASE INFO main mapreduce.Job - Job job_1477527264038_0051 failed with state FAILED due to: Application application_1477527264038_0051 failed 2 times due to AM Container for appattempt_1477527264038_0051_000002 exited with exitCode: 1
Any idea if we are missing on any configuration??

Kafka | Unable to publish data to broker - ClosedChannelException

I am trying to run simple kafka producer consumer example on HDP but facing below exception.
[2016-03-03 18:26:38,683] WARN Fetching topic metadata with correlation id 0 for topics [Set(page_visits)] from broker [BrokerEndPoint(0,sandbox.hortonworks.com,9092)] failed (kafka.client.ClientUtils$)
java.nio.channels.ClosedChannelException
at kafka.network.BlockingChannel.send(BlockingChannel.scala:120)
at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:75)
at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:74)
at kafka.producer.SyncProducer.send(SyncProducer.scala:115)
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:59)
at kafka.producer.BrokerPartitionInfo.updateInfo(BrokerPartitionInfo.scala:82)
at kafka.producer.async.DefaultEventHandler$$anonfun$handle$1.apply$mcV$sp(DefaultEventHandler.scala:68)
at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:89)
at kafka.utils.Logging$class.swallowError(Logging.scala:106)
at kafka.utils.CoreUtils$.swallowError(CoreUtils.scala:51)
at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:68)
at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:105)
at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:88)
at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:68)
at scala.collection.immutable.Stream.foreach(Stream.scala:547)
at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:67)
at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:45)
[2016-03-03 18:26:38,688] ERROR fetching topic metadata for topics [Set(page_visits)] from broker [ArrayBuffer(BrokerEndPoint(0,sandbox.hortonworks.com,9092))] failed (kafka.utils.CoreUtils$)
kafka.common.KafkaException: fetching topic metadata for topics [Set(page_visits)] from broker [ArrayBuffer(BrokerEndPoint(0,sandbox.hortonworks.com,9092))] failed
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:73)
at kafka.producer.BrokerPartitionInfo.updateInfo(BrokerPartitionInfo.scala:82)
at kafka.producer.async.DefaultEventHandler$$anonfun$handle$1.apply$mcV$sp(DefaultEventHandler.scala:68)
at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:89)
at kafka.utils.Logging$class.swallowError(Logging.scala:106)
at kafka.utils.CoreUtils$.swallowError(CoreUtils.scala:51)
at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:68)
at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:105)
at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:88)
at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:68)
at scala.collection.immutable.Stream.foreach(Stream.scala:547)
at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:67)
at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:45)
Caused by: java.nio.channels.ClosedChannelException
at kafka.network.BlockingChannel.send(BlockingChannel.scala:120)
at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:75)
at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:74)
at kafka.producer.SyncProducer.send(SyncProducer.scala:115)
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:59)
... 12 more
[2016-03-03 18:26:38,693] WARN Fetching topic metadata with correlation id 1 for topics [Set(page_visits)] from broker [BrokerEndPoint(0,sandbox.hortonworks.com,9092)] failed (kafka.client.ClientUtils$)
java.nio.channels.ClosedChannelException
Here is command that I am using for producer.
./kafka-console-producer.sh --broker-list sandbox.hortonworks.com:9092 --topic page_visits
After doing bit of googling , I found that I need to add advertised.host.name property in server.properties file .
Here is my server.properties file.
# Generated by Apache Ambari. Thu Mar 3 18:12:50 2016
advertised.host.name=sandbox.hortonworks.com
auto.create.topics.enable=true
auto.leader.rebalance.enable=true
broker.id=0
compression.type=producer
controlled.shutdown.enable=true
controlled.shutdown.max.retries=3
controlled.shutdown.retry.backoff.ms=5000
controller.message.queue.size=10
controller.socket.timeout.ms=30000
default.replication.factor=1
delete.topic.enable=false
fetch.purgatory.purge.interval.requests=10000
host.name=sandbox.hortonworks.com
kafka.ganglia.metrics.group=kafka
kafka.ganglia.metrics.host=localhost
kafka.ganglia.metrics.port=8671
kafka.ganglia.metrics.reporter.enabled=true
kafka.metrics.reporters=org.apache.hadoop.metrics2.sink.kafka.KafkaTimelineMetricsReporter
kafka.timeline.metrics.host=sandbox.hortonworks.com
kafka.timeline.metrics.maxRowCacheSize=10000
kafka.timeline.metrics.port=6188
kafka.timeline.metrics.reporter.enabled=true
kafka.timeline.metrics.reporter.sendInterval=5900
leader.imbalance.check.interval.seconds=300
leader.imbalance.per.broker.percentage=10
listeners=PLAINTEXT://sandbox.hortonworks.com:6667
log.cleanup.interval.mins=10
log.dirs=/kafka-logs
log.index.interval.bytes=4096
log.index.size.max.bytes=10485760
log.retention.bytes=-1
log.retention.hours=168
log.roll.hours=168
log.segment.bytes=1073741824
message.max.bytes=1000000
min.insync.replicas=1
num.io.threads=8
num.network.threads=3
num.partitions=1
num.recovery.threads.per.data.dir=1
num.replica.fetchers=1
offset.metadata.max.bytes=4096
offsets.commit.required.acks=-1
offsets.commit.timeout.ms=5000
offsets.load.buffer.size=5242880
offsets.retention.check.interval.ms=600000
offsets.retention.minutes=86400000
offsets.topic.compression.codec=0
offsets.topic.num.partitions=50
offsets.topic.replication.factor=3
offsets.topic.segment.bytes=104857600
producer.purgatory.purge.interval.requests=10000
queued.max.requests=500
replica.fetch.max.bytes=1048576
replica.fetch.min.bytes=1
replica.fetch.wait.max.ms=500
replica.high.watermark.checkpoint.interval.ms=5000
replica.lag.max.messages=4000
replica.lag.time.max.ms=10000
replica.socket.receive.buffer.bytes=65536
replica.socket.timeout.ms=30000
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
socket.send.buffer.bytes=102400
zookeeper.connect=sandbox.hortonworks.com:2181
zookeeper.connection.timeout.ms=15000
zookeeper.session.timeout.ms=30000
zookeeper.sync.time.ms=2000
After adding property i am getting same exception.
Any suggestion.
I had similar problem. First I have checked listeners property for Kafka broker in the Ambari
Also possible to check with:
[root#sandbox bin]# cat /usr/hdp/current/kafka-broker/conf/server.properties | grep listeners
listeners=PLAINTEXT://sandbox.hortonworks.com:6667
Ambari replaces localhost with hostname as you can see and the port is same - 6667.
Then I checked that broker really listens on that port:
[root#sandbox bin]# netstat -tulpn | grep 6667
tcp 0 0 10.0.2.15:6667 0.0.0.0:* LISTEN 11137/java
Next step was to launch producer:
./kafka-console-producer.sh --broker-list 10.0.2.15:6667 --topic test
At last I have launched consumer:
./kafka-console-consumer.sh --zookeeper 10.0.2.15:2181 --topic test --from-beginning
After typing few words with hitting Enter on producer side, consumer received messages.
As per the log it seems the kafka server(broker) is not running. The broker server should run first.
Producers and consumers are client programs that will interact with the broker servers and zookeeper also.
Before running the producer or consumer please check whether broker and zookeeper are running successfully or not.
Run the server
./kafka-server-start.sh ../config/server.properties
check the logs for any errors, if no errors then start producing the messages to the server.
Check the zookeeper service also.
modified the file /usr/hdp/current/kafka-broker/config/server.properties with the following 2 lines
advertised.host.name=sandbox.hortonworks.com
listeners=PLAINTEXT://sandbox.hortonworks.com:6667,PLAINTEXT://0.0.0.0:6667
run the following execution commands
./kafka-console-producer.sh --broker-list sandbox.hortonworks.com:6667 --topic tst2
./kafka-console-consumer.sh --zookeeper localhost:2181 --topic tst2 --from-beginning
with this its working fine

Datastax Opscenter - Agent not connecting

I setup Cassandra, OpsCenter and the needed DataStax agent on my EC2 Amazon machine. At the moment it's only one machine.
Everything seems to be running fine, except the node list is empty and so are the keyspaces in the Opscenter. The cassandra, datastax and opscenter logs show no errors and I followed the installation / configuration carefully. Then tried all the suggested fixes.
My guess is the problem lies in the communication between the agent and opscenter.
After a while these requests fail:
etc/cassandra/cassandra.yaml: (simplified)
cluster_name: 'CassandraCluster'
seed_provider:
- class_name: org.apache.cassandra.locator.SimpleSeedProvider
parameters:
- seeds: "1.2.3.4"
listen_address: 1.2.3.4
rpc_address: 0.0.0.0
endpoint_snitch: Ec2Snitch
etc/opscenter/opscenterd.conf: (simplified)
[webserver]
port = 81
interface = 0.0.0.0
[authentication]
enabled = False
[stat_reporter]
[agents]
use_ssl = false
var/lib/datastax-agent/conf/address.yaml: (simplified)
stomp_interface: 1.2.3.4
local_interface: 1.2.3.4
use_ssl: 0
nodetool status output:
Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: eu-west_1_cassandra
===============================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 1.2.3.4 2.06 MB 256 100.0% 8a121c12-7cbf-4a2a-b111-4ad111c111d8 1a
Nothing really strange shows up in the log except for the repetitive occurence of the following line in the agent.log:
INFO [install-location-finder] 2015-03-11 15:26:04,690 New JMX connection (127.0.0.1:7199)
INFO [install-location-finder] 2015-03-11 15:27:04,698 New JMX connection (127.0.0.1:7199)
INFO [install-location-finder] 2015-03-11 15:28:04,709 New JMX connection (127.0.0.1:7199)
INFO [install-location-finder] 2015-03-11 15:29:04,716 New JMX connection (127.0.0.1:7199)
INFO [install-location-finder] 2015-03-11 15:30:04,724 New JMX connection (127.0.0.1:7199)
INFO [install-location-finder] 2015-03-11 15:31:04,731 New JMX connection (127.0.0.1:7199)
To supply all the info here are the logs:
opscenterd.log
agent.log
cassandra/system.log
In certain environments the persistent connection between the browser and opscenterd may fail. We're working on implementing a more robust connection that will work in all environments, but in the meantime you can use the following workaround:
http://www.datastax.com/documentation/opscenter/5.1/opsc/troubleshooting/opscTroubleshootingZeroNodes.html
Minimal configuration that I find working was setting this options below for address.yaml
stomp_interface: [opscenter-ip]
stomp_port: 61620
use_ssl: 0
cassandra_conf: /etc/cassandra/cassandra.yaml
jmx_host: [cassandra-node-ip]
jmx_port: 7199
Make sure you have sysstat installed also.

Resources