Kafka Connect to ElasticSearch - LOG4J2 or Connection refused error

I am trying to configure Kafka Connect with Elasticsearch.
But when I try to start it, I get the following errors:
ERROR StatusLogger Log4j2 could not find a logging implementation.
ERROR Failed to create client to verify connection. (io.confluent.connect.elasticsearch.Validator:120)
ElasticsearchException[java.util.concurrent.ExecutionException: java.net.ConnectException: Connection refused]; nested: ExecutionException[java.net.ConnectException: Connection refused]; nested: ConnectException[Connection refused]
I am trying to start it with the following command in the terminal:
connect-standalone.sh config/connect-standalone.properties config/elasticsearch.properties
This is connect-standalone.properties:
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true
offset.flush.interval.ms=10000
# EDIT BELOW IF NEEDED
bootstrap.servers=localhost:9092
offset.storage.file.filename=/tmp/connect.offsets
plugin.path=/home/stjepan/kafka_2.13-3.2.3/connectors
This is elasticsearch.properties:
name=elasticsearch-sink
connector.class=io.confluent.connect.elasticsearch.ElasticsearchSinkConnector
tasks.max=1
topics=wikimedia.recentchange
key.ignore=true
connection.url=localhost:9200
# connection.url=https://kafka-course-5842482143.eu-west-1.bonsaisearch.net
# connection.username=he6de7ka5o
# connection.password=yozz8ryqmg
type.name=kafka-connect
# necessary for this connector
schema.ignore=true
behavior.on.malformed.documents=IGNORE
# OVERRIDE
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=false
value.converter.schemas.enable=false
What am I doing wrong? The Wikimedia connector works fine...
I haven't tried anything else because I can't see what else I could change except the ports, and that didn't help...

You're probably referring to the course material of https://www.conduktor.io/apache-kafka-for-beginners (or the related course on Packt Pub).
I worked around this issue by using an OpenSearch Kafka Sink Connector:
https://github.com/aiven/opensearch-connector-for-apache-kafka
Under https://github.com/aiven/opensearch-connector-for-apache-kafka/blob/main/config/quickstart-elasticsearch.properties there is a connector config that you can merge into your existing config.
Hope that helps
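For reference, a rough sketch of what the merged sink config could look like for the setup above (the connector class name is taken from the Aiven repo's docs, so double-check it against the version you download; note the explicit http:// scheme in connection.url, which the original config is missing):

name=opensearch-sink
# class name per the Aiven connector docs - verify against your version
connector.class=io.aiven.kafka.connect.opensearch.OpensearchSinkConnector
tasks.max=1
topics=wikimedia.recentchange
key.ignore=true
schema.ignore=true
connection.url=http://localhost:9200
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=false
value.converter.schemas.enable=false

The connector jar still has to sit under the plugin.path configured in connect-standalone.properties, and running curl http://localhost:9200 is a quick way to confirm something is actually listening before starting Connect.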

Related

"UnresolvedAddressException" - Confluent kafka server start

I have downloaded Confluent Kafka on a Windows machine. ZooKeeper is running successfully, but while running the Kafka server I am getting the below exception:
INFO Opening socket connection to server localhost/<unresolved>:2181. Will not attempt to
authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
WARN Session 0x0 for server localhost/<unresolved>:2181, unexpected error, closing socket
connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
java.nio.channels.UnresolvedAddressException
at java.base/sun.nio.ch.Net.checkAddress(Net.java:149)
at java.base/sun.nio.ch.Net.checkAddress(Net.java:157)
at java.base/sun.nio.ch.SocketChannelImpl.checkRemote(SocketChannelImpl.java:815)
at java.base/sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:837)
at org.apache.zookeeper.ClientCnxnSocketNIO.registerAndConnect(ClientCnxnSocketNIO.java:277)
at org.apache.zookeeper.ClientCnxnSocketNIO.connect(ClientCnxnSocketNIO.java:287)
at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1021)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1064)
I tried the below options to solve the issue, but they didn't work:
1. Added listeners=PLAINTEXT://127.0.0.1:9092, advertised.listeners=PLAINTEXT://127.0.0.1:9092
in the server.properties file
2. Added the localhost IP address in the hosts file
Below are the properties in the server.properties file:
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=C:\KafkaLog
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=6000
confluent.support.metrics.enable=true
confluent.support.customer.id=anonymous
group.initial.rebalance.delay.ms=0
ZooKeeper properties are:
dataDir=C:\ZookeeperLog
clientPort=2181
maxClientCnxns=0
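A quick sanity check for this situation (a sketch for a standard Windows command prompt; it assumes the default ZooKeeper client port 2181) is to confirm that ZooKeeper is listening and that localhost resolves at all:

rem assumes the default ZooKeeper client port 2181
netstat -aon | findstr "2181"
ping -n 1 localhost
type C:\Windows\System32\drivers\etc\hosts | findstr localhost

The <unresolved> in the log suggests the name lookup itself is failing, which usually points at a missing 127.0.0.1 localhost entry in the hosts file.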

"java.nio.channels.UnresolvedAddressException" while starting kafka server

I started ZooKeeper, and after that I ran the "kafka-server-start.bat mypath\server.properties" command to start the Kafka server.
I am getting the following error in the Kafka server window:
INFO Opening socket connection to server localhost/<unresolved>:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
WARN Session 0x0 for server localhost/<unresolved>:2181, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
java.nio.channels.UnresolvedAddressException
at java.base/sun.nio.ch.Net.checkAddress(Net.java:149)
at java.base/sun.nio.ch.Net.checkAddress(Net.java:157)
at java.base/sun.nio.ch.SocketChannelImpl.checkRemote(SocketChannelImpl.java:815)
at java.base/sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:837)
at org.apache.zookeeper.ClientCnxnSocketNIO.registerAndConnect(ClientCnxnSocketNIO.java:277)
at org.apache.zookeeper.ClientCnxnSocketNIO.connect(ClientCnxnSocketNIO.java:287)
at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1021)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1064)
Below are the properties in server.properties:
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=C:\KafkaLog
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=6000
confluent.support.metrics.enable=true
confluent.support.customer.id=anonymous
group.initial.rebalance.delay.ms=0
ZooKeeper properties are:
dataDir=C:\ZookeeperLog
clientPort=2181
maxClientCnxns=0
We tried checking the following things in order to resolve the issue:
Updated the dataDir & log.dirs properties in the config files (to match the Windows platform)
Verified the ZooKeeper startup using the netstat -aon | findstr '2181' command
Updated the zookeeper.connect URL to 127.0.0.1:2181
Added the below missing loopback entry in the hosts file and restarted the system:
127.0.0.1 localhost

ElasticsearchSinkConnector can't connect to Elastic

Using the kafka-lenses-dev image, I ran into problems connecting Kafka to the local Elasticsearch 7.1.1 instance. As you can see, the connection is refused, although the instance is up and running and it's possible to curl docs in. Is the version of Elastic problematic for the connector or is the config wrong?
Config
connector.class=io.confluent.connect.elasticsearch.ElasticsearchSinkConnector
type.name=kafka-connect
key.converter.schemas.enable=false
topics=cc_data
name=elastic-sink
value.converter.schemas.enable=false
connection.url=http://localhost:9200
key.ignore=true
Error
org.apache.kafka.connect.errors.ConnectException: Couldn't start ElasticsearchSinkTask due to connection error:
at io.confluent.connect.elasticsearch.jest.JestElasticsearchClient.<init>(JestElasticsearchClient.java:147)
at io.confluent.connect.elasticsearch.jest.JestElasticsearchClient.<init>(JestElasticsearchClient.java:114)
at io.confluent.connect.elasticsearch.ElasticsearchSinkTask.start(ElasticsearchSinkTask.java:120)
...
Caused by: io.searchbox.client.config.exception.CouldNotConnectException: Could not connect to http://localhost:9200
at io.searchbox.client.http.JestHttpClient.execute(JestHttpClient.java:60)
at io.confluent.connect.elasticsearch.jest.JestElasticsearchClient.getServerVersion(JestElasticsearchClient.java:168)
at io.confluent.connect.elasticsearch.jest.JestElasticsearchClient.<init>(JestElasticsearchClient.java:145)
Caused by: org.apache.http.conn.HttpHostConnectException: Connect to localhost:9200 [localhost/127.0.0.1] failed: Connection refused (Connection refused)
...
Caused by: java.net.ConnectException: Connection refused (Connection refused)
...
You've specified Elasticsearch as being at localhost:9200. Unless Elasticsearch is running on the same container as Kafka Connect then this hostname is wrong. You need to specify the hostname of Elasticsearch in a form that is reachable from where Kafka Connect is running.
As an example, you can see how to configure this based on a Docker Compose setup here.
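A minimal sketch of the idea, assuming Kafka Connect and Elasticsearch are services on the same Docker Compose network and the Elasticsearch service is named elasticsearch (the service name is illustrative):

# reachable from the Connect container via the Compose service name, not localhost
connection.url=http://elasticsearch:9200

From inside the Kafka Connect container, curl http://elasticsearch:9200 is a quick way to confirm the hostname is reachable before starting the connector.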

Setup multiple kafka connect sinks

I am working on streaming data from PostgreSQL to HDFS. I have set up the Confluent environment on an HDP 2.6 sandbox. My JDBC source configs for PostgreSQL are:
name=jdbc_1
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
connection.url=jdbc:postgresql://host:port/db?currentSchema=schema&user=user&password=password
mode=timestamp
timestamp.column.name=col1
validate.non.null=false
topic.prefix=psql-
All other connection properties are also fine, and I am running it with
./bin/connect-standalone ./etc/kafka/connect-standalone.properties ./etc/kafka-connect-jdbc/source.properties
It's working fine and creating topics based on the tables in the database, such as
psql-table1
psql-table2
Now I want to run HDFS sinks on all the topics to create a separate directory for every table in the PostgreSQL database.
But when I run the HDFS sink with the command
./bin/connect-standalone ./etc/kafka/connect-standalone.properties ./etc/kafka-connect-hdfs/hdfs-postGres.properties
while the source is running, I get the error:
ERROR Stopping after connector error (org.apache.kafka.connect.cli.ConnectStandalone:113)
org.apache.kafka.connect.errors.ConnectException: Unable to start REST server
at org.apache.kafka.connect.runtime.rest.RestServer.start(RestServer.java:214)
at org.apache.kafka.connect.runtime.Connect.start(Connect.java:53)
at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:95)
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.eclipse.jetty.server.ServerConnector.openAcceptChannel(ServerConnector.java:331)
at org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:299)
at org.eclipse.jetty.server.AbstractNetworkConnector.doStart(AbstractNetworkConnector.java:80)
at org.eclipse.jetty.server.ServerConnector.doStart(ServerConnector.java:235)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
at org.eclipse.jetty.server.Server.doStart(Server.java:398)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
at org.apache.kafka.connect.runtime.rest.RestServer.start(RestServer.java:212)
... 2 more
If I stop the source connector and start the sink, it works fine.
Can anyone help me with how to set up multiple sink connectors?
Kafka Connect starts a REST server on port 8083.
If you run more than one standalone connector worker on a single machine, you need to change that port with the rest.port property.
Alternatively, you can run connect-distributed and POST your source and sink configurations individually as JSON payloads to a single Connect server; then you wouldn't have this Address already in use issue.
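For example (a sketch; the port number is arbitrary), the sink worker's copy of connect-standalone.properties could set:

# any free port other than the source worker's 8083
rest.port=8084

Or, in distributed mode, each connector is submitted to the worker's REST API (assuming the default port 8083; hdfs-sink.json is an illustrative file holding the connector's name and config as JSON):

curl -X POST -H "Content-Type: application/json" --data @hdfs-sink.json http://localhost:8083/connectors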

Netezza Jar not working with source jdbc connector in confluent

I'm facing an issue while connecting to a Netezza database from the Confluent source JDBC connector.
Source JDBC connector properties:
name=source-netezza
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
connection.url=jdbc:netezza://10.245.*.*:5480/DATABASE=mydb,user=admin,password=password
table.whitelist=netezza_str
mode=bulk
topic.prefix=netezza_str_
After executing the command, I'm getting the below error:
Invalid value java.sql.SQLException: No suitable driver found for jdbc:netezza://10.245.*.*:5480/mydb,user=admin,password=password for configuration Couldn't open connection to jdbc:netezza://10.245.*.*:5480/mydb,user=admin,password=password
I've tried with the netezza-3.2.2 jar and the nzjdbc jar.
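"No suitable driver found" usually means the JVM cannot see the Netezza JDBC driver on the connector's classpath. A common fix (a sketch; the path assumes a default Confluent package layout and may differ on your install) is to place the driver jar alongside the kafka-connect-jdbc jars and restart the worker:

# copy the driver next to the JDBC connector's jars (path is illustrative)
cp nzjdbc.jar /usr/share/java/kafka-connect-jdbc/

After restarting Kafka Connect, the driver should be picked up when the connector opens the connection.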
