Unable to connect hive with kafka - hadoop

I have a project that consists of training a model and then storing the results of the best model in Hive through a Kafka topic.
I tried various configurations and solutions found online, but in vain.
This is the docker-compose file used:
version: "3"
services:
  namenode:
    image: bde2020/hadoop-namenode:1.1.0-hadoop2.8-java8
    container_name: namenode
    volumes:
      - namenode:/hadoop/dfs/name
      - ./infra/zeppelin/examples:/opt/sansa-examples
    environment:
      - CLUSTER_NAME=test
    env_file:
      - ./infra/hadoop/hadoop-hive.env
    ports:
      - "50070:50070"
      - "8020:8020"
      - "8081:8081"
  datanode:
    image: bde2020/hadoop-datanode:1.1.0-hadoop2.8-java8
    container_name: datanode
    volumes:
      - datanode:/hadoop/dfs/data
    env_file:
      - ./infra/hadoop/hadoop-hive.env
    links:
      - namenode
  spark-master:
    image: bde2020/spark-master:2.1.0-hadoop2.8-hive-java8
    container_name: spark-master
    ports:
      - "8090:800"
      - "7077:7077"
    environment:
      - CORE_CONF_fs_defaultFS=hdfs://namenode:8020
      - SPARK_PUBLIC_DNS=localhost
    depends_on:
      - namenode
      - datanode
    links:
      - namenode
      - datanode
  spark-worker:
    image: bde2020/spark-worker:2.1.0-hadoop2.8-hive-java8
    container_name: spark-worker
    ports:
      - "8083:8083"
    environment:
      - "SPARK_MASTER=spark://spark-master:7077"
      - CORE_CONF_fs_defaultFS=hdfs://namenode:8020
      - SPARK_PUBLIC_DNS=localhost
    links:
      - spark-master
  hue:
    image: bde2020/hdfs-filebrowser:3.11
    container_name: hue
    ports:
      - 8088:8088
    environment:
      - NAMENODE_HOST=namenode
      - SPARK_MASTER=spark://spark-master:7077
    links:
      - spark-master
  zeppelin:
    image: bde2020/zeppelin:0.0.1-zeppelin-0.7.1-hadoop-2.8.0-spark-2.1.0
    container_name: zeppelin
    ports:
      - 8080:8080
    volumes:
      - ./data:/data
      - ./data:/opt/zeppelin/data
      # - ./infra/zeppelin/conf:/opt/zeppelin/conf
      - ./infra/zeppelin/logs:/opt/zeppelin/logs
      - ./infra/zeppelin/notebooks:/opt/zeppelin/notebook
      - ./infra/zeppelin/examples:/opt/sansa-examples
    environment:
      CORE_CONF_fs_defaultFS: "hdfs://namenode:8020"
      SPARK_MASTER: "spark://spark-master:7077"
      MASTER: "spark://spark-master:7077"
      SPARK_SUBMIT_OPTIONS: "--jars /opt/sansa-examples/jars/sansa-examples-spark.jar --conf spark.serializer=org.apache.spark.serializer.KryoSerializer"
    links:
      - spark-master
  hive-server:
    image: bde2020/hive
    container_name: hive-server
    env_file:
      - ./infra/hadoop/hadoop-hive.env
    environment:
      - "HIVE_CORE_CONF_javax_jdo_option_ConnectionURL=jdbc:postgresql://hive-metastore/metastore"
    links:
      - namenode
      - hive-metastore
    ports:
      - 10000:10000
  hive-metastore-postgresql:
    image: bde2020/hive-metastore-postgresql
    container_name: hive-metastore-postgresql
  hive-metastore:
    image: bde2020/hive
    container_name: hive-metastore
    env_file:
      - ./infra/hadoop/hadoop-hive.env
    links:
      - namenode
      - hive-metastore-postgresql
    command: /opt/hive/bin/hive --service metastore
    ports:
      - 9083:9083
  zookeeper:
    image: confluentinc/cp-zookeeper
    container_name: zookeeper
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
    volumes:
      - zookeeper:/var/lib/zookeeper
  kafka:
    image: wurstmeister/kafka
    container_name: kafka
    ports:
      - "9092:9092"
    environment:
      KAFKA_ADVERTISED_HOST_NAME: localhost
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    links:
      - zookeeper
    depends_on:
      - zookeeper
  # kafka:
  #   image: confluentinc/cp-kafka
  #   container_name: kafka
  #   ports:
  #     - 9092:9092
  #   environment:
  #     KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
  #     KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
  #     KAFKA_NUM_PARTITIONS: 1
  #     KAFKA_DEFAULT_REPLICATION_FACTOR: 1
  #     KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
  #     KAFKA_DELETE_TOPIC_ENABLE: "true"
  #   volumes:
  #     - kafka:/var/lib/kafka
  #   links:
  #     - zookeeper
  #   depends_on:
  #     - zookeeper
  nifi:
    image: xemuliam/nifi
    container_name: nifi
    ports:
      - 5080:5080
      - 5443:8443
      - 5081:5081
      ## for scaling we have to do this
      # - 8080
    links:
      - zookeeper
      - kafka
    depends_on:
      - zookeeper
      - kafka
    volumes:
      - ./infra/nifi/conf:/opt/nifi/conf
      - ./infra/nifi/logs:/opt/nifi/logs
      - ./data:/opt/datafiles
      - nifi:/opt/nifi/flowfile_repository
      - nifi:/opt/nifi/database_repository
      - nifi:/opt/nifi/content_repository
      - nifi:/opt/nifi/provenance_repository
    environment:
      ZK_NODES_LIST: zookeeper
      IS_CLUSTER_NODE: 1
      ELECTION_TIME: 1 min
volumes:
  namenode:
  datanode:
  zookeeper:
  kafka:
  nifi:
And this is the Hadoop environment file (hadoop-hive.env):
HIVE_SITE_CONF_javax_jdo_option_ConnectionURL=jdbc:postgresql://hive-metastore-postgresql/metastore
HIVE_SITE_CONF_javax_jdo_option_ConnectionDriverName=org.postgresql.Driver
HIVE_SITE_CONF_javax_jdo_option_ConnectionUserName=hive
HIVE_SITE_CONF_javax_jdo_option_ConnectionPassword=hive
HIVE_SITE_CONF_datanucleus_autoCreateSchema=false
HIVE_SITE_CONF_hive_metastore_uris=thrift://hive-metastore:9083
HIVE_SITE_CONF_hive_fetch_task_conversion=none
CORE_CONF_fs_defaultFS=hdfs://namenode:8020
CORE_CONF_hadoop_http_staticuser_user=root
CORE_CONF_hadoop_proxyuser_hue_hosts=*
CORE_CONF_hadoop_proxyuser_hue_groups=*
HDFS_CONF_dfs_webhdfs_enabled=true
HDFS_CONF_dfs_permissions_enabled=false
YARN_CONF_yarn_log___aggregation___enable=true
YARN_CONF_yarn_resourcemanager_recovery_enabled=true
YARN_CONF_yarn_resourcemanager_store_class=org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore
YARN_CONF_yarn_resourcemanager_fs_state___store_uri=/rmstate
YARN_CONF_yarn_nodemanager_remote___app___log___dir=/app-logs
YARN_CONF_yarn_log_server_url=http://historyserver:8188/applicationhistory/logs/
YARN_CONF_yarn_timeline___service_enabled=true
YARN_CONF_yarn_timeline___service_generic___application___history_enabled=true
YARN_CONF_yarn_resourcemanager_system___metrics___publisher_enabled=true
YARN_CONF_yarn_resourcemanager_hostname=resourcemanager
YARN_CONF_yarn_timeline___service_hostname=historyserver
YARN_CONF_yarn_resourcemanager_address=resourcemanager:8032
YARN_CONF_yarn_resourcemanager_scheduler_address=resourcemanager:8030
YARN_CONF_yarn_resourcemanager_resource__tracker_address=resourcemanager:8031
My version of Hadoop is 2.7.4.
My version of Hive is 2.3.2.
This is the configuration of the Hive interpreter within Zeppelin:
[screenshot not included]
And this is the error:
[screenshot not included]

You need to add the Kafka handler JAR to the HiveServer2 container's classpath, because HiveServer2 is what actually executes the query, not Zeppelin.
The simplest way to do that with this setup is to mount a volume in your Compose file at the path Hive reads its libraries from.
Otherwise, just use Spark Structured Streaming or your NiFi container to write to Kafka. With the handler, only metadata is stored "in Hive"; the real data stays in Kafka. Plus, I'm not so sure Cloudera maintains the Hive Kafka Handler, and it doesn't appear to be published to Maven Central either.
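Before restarting HiveServer2, it can help to confirm that the directory you intend to mount really contains the handler JAR. The following is only a minimal sketch: the default directory /opt/hive/lib is inferred from the /opt/hive/bin/hive path in the Compose file, and the kafka-handler file-name prefix is an assumption about how the JAR is named, so adjust both to your image.

```python
from pathlib import Path


def find_handler_jars(lib_dir="/opt/hive/lib", prefix="kafka-handler"):
    """Return the sorted names of JARs in lib_dir whose file name starts with prefix.

    Both defaults are assumptions, not values confirmed by the image.
    """
    lib = Path(lib_dir)
    if not lib.is_dir():
        # A missing directory simply means nothing was found.
        return []
    return sorted(p.name for p in lib.glob(f"{prefix}*.jar"))


if __name__ == "__main__":
    jars = find_handler_jars()
    print(jars if jars else "no Kafka handler JAR found on the assumed path")
```

Run it against the host directory you plan to mount, passing that directory as lib_dir.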

Related

Web container can't connect to the Elasticsearch container in Azure App Service (using docker-compose)

Error at indexing step:
elastic_transport.ConnectionError: Connection error caused by: ConnectionError(Connection error caused by: ProtocolError(('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))))
Docker-compose on Azure App service:
version: '3'
services:
  elasticsearch:
    image: "elasticsearch:8.2.2"
    container_name: elasticsearch
    environment:
      - discovery.type=single-node
      - ES_JAVA_OPTS=-Xms1g -Xmx1g
      - xpack.security.enabled=false
      - bootstrap.memory_lock=true
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - es_config:/usr/share/elasticsearch/config
    ports:
      - "9200:9200"
    networks:
      - esnet
  web:
    image: web:latest
    ports:
      - "80:80"
    links:
      - elasticsearch
    depends_on:
      - elasticsearch
    environment:
      ELASTIC_HOST: ${ELASTIC_HOST}
    networks:
      - esnet
volumes:
  es_config:
    driver: azure_file
    driver_opts:
      share_name: elasticsearch
      storage_account_name: storageaccount
networks:
  esnet:
    driver: bridge
ELASTIC_HOST=http://elasticsearch:9200
client = Elasticsearch(ELASTIC_HOST)
How to connect to ES?
TL;DR
You need to put both services in the same network for them to be able to access one another.
Solution
ELASTIC_HOST=https://elasticsearch:9200 <- httpS
client = Elasticsearch(ELASTIC_HOST)
version: '3'
services:
  elasticsearch:
    image: "elasticsearch:8.2.2"
    container_name: elasticsearch
    environment:
      - discovery.type=single-node
      - ES_JAVA_OPTS=-Xms1g -Xmx1g
      - xpack.security.enabled=false
      - bootstrap.memory_lock=true
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - es_config:/usr/share/elasticsearch/config
    ports:
      - "9200:9200"
    networks:
      - esnet
  web:
    image: web:latest
    ports:
      - "80:80"
    links:
      - elasticsearch
    depends_on:
      - elasticsearch
    environment:
      ELASTIC_HOST: ${ELASTIC_HOST}
    networks: # <- This part needs to be added.
      - esnet
volumes:
  es_config:
    driver: azure_file
    driver_opts:
      share_name: elasticsearch
      storage_account_name: storageaccount
networks:
  esnet:
    driver: bridge
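Once both services share a network, the web container can resolve the service name elasticsearch. Keep in mind that depends_on only orders container startup; it does not wait until Elasticsearch accepts requests. The sketch below is one possible way to wait for the endpoint before indexing. It uses only the standard library, and the function names http_ok and wait_until_reachable are hypothetical helpers, not part of the Elasticsearch client.

```python
import http.client
import os
import time
import urllib.error
import urllib.request


def http_ok(url, timeout=3.0):
    """Return True if a GET to url receives any HTTP response at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx status; it is reachable.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False


def wait_until_reachable(url, attempts=10, delay=2.0, probe=http_ok, sleep=time.sleep):
    """Call probe(url) up to `attempts` times, sleeping `delay` seconds between tries.

    Returns the 1-based attempt number that succeeded, or None if none did.
    """
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None


if __name__ == "__main__":
    host = os.environ["ELASTIC_HOST"]  # same variable the web service receives
    tries = wait_until_reachable(host)
    print(f"reachable after {tries} attempt(s)" if tries else "not reachable")
```

Calling wait_until_reachable before creating the Elasticsearch client separates "the name does not resolve or the port is closed" from errors raised later during indexing.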

When I run in docker why localhost giving connection refused?

This is my docker-compose.yml
version: '3.7'
services:
  zookeeper-1:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    ports:
      - 22181:2181
  kafka-1:
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper-1
    ports:
      - 29092:29092
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper-1:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka-1:9092,PLAINTEXT_HOST://localhost:29092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
  cassandra:
    image: cassandra
    container_name: cassandra
    ports:
      - 9042:9042
  producer:
    image: spring/producer
    links:
      - cassandra
    depends_on:
      - cassandra
      - kafka-1
    restart: always
  consumer:
    image: spring/consumer
    links:
      - cassandra
    depends_on:
      - cassandra
      - kafka-1
    restart: always
When I run only the first three services in Docker (Kafka, ZooKeeper and Cassandra), I can reach them from the producer and consumer started with the IntelliJ runner. But when I add the producer and consumer to the docker-compose file as services and run docker-compose up, both services fail with a localhost:9042 "Connection refused" error.
Why can't I reach Cassandra from the producer and consumer when I run them with Docker? What is the difference?
This is my producer application.yml:
spring:
  kafka:
    producer:
      bootstrap-servers: localhost:29092
      key-serializer: org.apache.kafka.common.serialization.IntegerSerializer
      value-serializer: org.apache.kafka.common.serialization.StringSerializer
      properties:
        acks: all
        retries: 10
    admin:
      properties:
        bootstrap.servers: localhost:29092
    template:
      default-topic: users-events
  data:
    cassandra:
      port: 9042
      keyspace-name: mykeyspace
      username: cassandra
      schema-action: create_if_not_exists

Kafka-Elasticsearch Sink Connector not working

I am trying to send data from Kafka to Elasticsearch. I checked that my Kafka broker is working because I can see that the messages I produce to a topic are read by a Kafka consumer. However, when I try to connect Kafka to Elasticsearch, I get the following error.
Command:
connect-standalone etc/schema-registry/connect-avro-standalone.properties \
etc/kafka-connect-elasticsearch/quickstart-elasticsearch.properties
Error:
ERROR Stopping due to error (org.apache.kafka.connect.cli.ConnectStandalone)
org.apache.kafka.connect.errors.ConnectException: Failed to connect to and describe Kafka cluster. Check worker's broker connection and security properties.
at org.apache.kafka.connect.util.ConnectUtils.lookupKafkaClusterId(ConnectUtils.java:64)
at org.apache.kafka.connect.util.ConnectUtils.lookupKafkaClusterId(ConnectUtils.java:45)
at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:83)
Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment.
at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
at org.apache.kafka.connect.util.ConnectUtils.lookupKafkaClusterId(ConnectUtils.java:58)
... 2 more
Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment.
My Docker Compose File:
version: '3'
services:
  zookeeper:
    container_name: zookeeper
    image: zookeeper
    ports:
      - 2181:2181
      - 2888:2888
      - 3888:3888
  kafka:
    container_name: kafka
    image: bitnami/kafka:1.0.0-r5
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_BROKER_ID: "42"
      KAFKA_ADVERTISED_HOST_NAME: "kafka"
      ALLOW_PLAINTEXT_LISTENER: "yes"
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
  elasticsearch:
    container_name: elasticsearch
    image: docker.elastic.co/elasticsearch/elasticsearch:7.8.0
    environment:
      - node.name=elasticsearch
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=elasticsearch
      - bootstrap.memory_lock=false
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - discovery.type=single-node
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - data99:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
  kibana:
    container_name: kibana
    image: docker.elastic.co/kibana/kibana:7.8.0
    # environment:
    #   - SERVER_NAME=Local kibana
    #   - SERVER_HOST=0.0.0.0
    #   - ELASTICSEARCH_URL=elasticsearch:9400
    ports:
      - "5601:5601"
    depends_on:
      - elasticsearch
  kafka-connect:
    container_name: kafka-connect
    image: confluentinc/cp-kafka-connect:5.3.1
    ports:
      - 8083:8083
    depends_on:
      - zookeeper
      - kafka
    volumes:
      - $PWD/connect-plugins:/connect-plugins
    environment:
      CONNECT_BOOTSTRAP_SERVERS: kafka:9092
      CONNECT_REST_ADVERTISED_HOST_NAME: "localhost"
      CONNECT_REST_PORT: 8083
      CONNECT_GROUP_ID: kafka-connect
      CONNECT_CONFIG_STORAGE_TOPIC: docker-kafka-connect-configs
      CONNECT_OFFSET_STORAGE_TOPIC: docker-kafka-connect-offsets
      CONNECT_STATUS_STORAGE_TOPIC: docker-kafka-connect-status
      CONNECT_KEY_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      CONNECT_VALUE_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      CONNECT_INTERNAL_KEY_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      CONNECT_INTERNAL_VALUE_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
      CONNECT_KEY_CONVERTER-SCHEMAS_ENABLE: "false"
      CONNECT_VALUE_CONVERTER-SCHEMAS_ENABLE: "false"
      CONNECT_REST_ADVERTISED_HOST_NAME: "kafka-connect"
      CONNECT_LOG4J_ROOT_LOGLEVEL: "ERROR"
      CONNECT_LOG4J_LOGGERS: "org.apache.kafka.connect.runtime.rest=WARN,org.reflections=ERROR"
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: "1"
      CONNECT_TOPICS: "test-elasticsearch-sink"
      CONNECT_TYPE_NAME: "type.name=kafka-connect"
      CONNECT_PLUGIN_PATH: '/usr/share/java' #'/usr/share/java'
      # Interceptor config
      CONNECT_PRODUCER_INTERCEPTOR_CLASSES: "io.confluent.monitoring.clients.interceptor.MonitoringProducerInterceptor"
      CONNECT_CONSUMER_INTERCEPTOR_CLASSES: "io.confluent.monitoring.clients.interceptor.MonitoringConsumerInterceptor"
      CLASSPATH: /usr/share/java/monitoring-interceptors/monitoring-interceptors-5.3.1.jar
      CONNECT_KAFKA_HEAP_OPTS: "-Xms256m -Xmx512m"
volumes:
  data99:
    driver: local
I checked some other questions and answers but couldn't come up with a solution to this problem.
Thanks in advance!
The Connect container already starts a Connect Distributed server. You should configure the Elasticsearch connector by sending its JSON configuration over HTTP to the Connect REST API, rather than exec-ing into the container shell and running connect-standalone, whose default properties file points at a broker running inside the container itself.
Similarly, the Elasticsearch quickstart file expects, by default, Elasticsearch to be running within the Connect container.
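As a rough illustration of the HTTP approach, the sketch below builds a POST /connectors request for the Connect REST API using only the Python standard library. The connector name elasticsearch-sink and the individual settings are assumptions chosen to mirror the Compose file (topic test-elasticsearch-sink, Elasticsearch at elasticsearch:9200, REST port 8083 published on localhost); it also assumes the Elasticsearch sink plugin is available on the worker's plugin path.

```python
import json
import urllib.request


def build_connector_request(connect_url, name, config):
    """Build (but do not send) a POST /connectors request for the Connect REST API."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{connect_url.rstrip('/')}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed connector settings; the topic and hosts mirror the Compose file above.
ELASTIC_SINK_CONFIG = {
    "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
    "tasks.max": "1",
    "topics": "test-elasticsearch-sink",
    "connection.url": "http://elasticsearch:9200",
    "type.name": "kafka-connect",
    "key.ignore": "true",
    "schema.ignore": "true",
}

if __name__ == "__main__":
    request = build_connector_request(
        "http://localhost:8083", "elasticsearch-sink", ELASTIC_SINK_CONFIG
    )
    # Sends the request to the running Connect worker and prints its reply.
    with urllib.request.urlopen(request, timeout=10) as response:
        print(response.status, response.read().decode("utf-8"))
```

Note that connection.url uses the Compose service name, because the connector runs inside the Connect container, where localhost would refer to that container itself.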

Access Kafka in Remote Host by IP Address running with Docker-Compose and Spring Boot

I have this docker-compose.yml in which I run ZooKeeper, Kafka, Kafka Connect, and Kafdrop. When I run it locally, I can connect from my Spring Boot application and consume messages from a topic.
What I need is to run the same configuration on a Linux machine and be able to connect from the Spring Boot application the same way.
When I run it remotely on the Linux machine, everything seems to be running OK, but when I try to connect from the Spring Boot application I receive errors showing that something is wrong with the connection.
I will try to explain step by step and see if someone can shed some light on this:
docker-compose.yml:
version: '3'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    networks:
      - broker-kafka
    ports:
      - 2181:2181
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
  kafka:
    image: confluentinc/cp-kafka:latest
    networks:
      - broker-kafka
    restart: unless-stopped
    depends_on:
      - zookeeper
    ports:
      - 9092:9092
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENERS:
        INTERNAL://kafka:29092,
        EXTERNAL://localhost:9092
      KAFKA_ADVERTISED_LISTENERS:
        INTERNAL://kafka:29092,
        EXTERNAL://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP:
        INTERNAL:PLAINTEXT,
        EXTERNAL:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_LOG_RETENTION_HOURS: 12
  connect:
    image: cdc:latest
    networks:
      - broker-kafka
    depends_on:
      - zookeeper
      - kafka
    ports:
      - 8083:8083
    environment:
      CONNECT_BOOTSTRAP_SERVERS: kafka:29092
      CONNECT_REST_PORT: 8083
      CONNECT_GROUP_ID: connect-1
      CONNECT_CONFIG_STORAGE_TOPIC: connect-1-config
      CONNECT_OFFSET_STORAGE_TOPIC: connect-1-offsets
      CONNECT_STATUS_STORAGE_TOPIC: connect-1-status
      CONNECT_KEY_CONVERTER: org.apache.kafka.connect.json.JsonConverter
      CONNECT_VALUE_CONVERTER: org.apache.kafka.connect.json.JsonConverter
      CONNECT_OFFSET.STORAGE.REPLICATION.FACTOR: 1
      CONNECT_CONFIG.STORAGE.REPLICATION.FACTOR: 1
      CONNECT_OFFSET.STORAGE.PARTITIONS: 1
      CONNECT_STATUS.STORAGE.REPLICATION.FACTOR: 1
      CONNECT_STATUS.STORAGE.PARTITIONS: 1
      CONNECT_REST_ADVERTISED_HOST_NAME: localhost
  kafdrop:
    image: obsidiandynamics/kafdrop:latest
    networks:
      - broker-kafka
    depends_on:
      - kafka
    ports:
      - 19000:9000
    environment:
      KAFKA_BROKERCONNECT: kafka:29092
networks:
  broker-kafka:
    driver: bridge
What I need is to expose this machine's IP to my network so that Kafka can be accessed by my Spring Boot application.
Let's suppose this Linux machine has the IP 10.12.54.99.
How can I make Kafka accessible at 10.12.54.99:9092?
Here is my application.properties:
spring.kafka.bootstrap-servers=10.12.54.99:9092
spring.kafka.producer.key-serializer=org.apache.kafka.common.serialization.StringSerializer
spring.kafka.producer.value-serializer=org.apache.kafka.common.serialization.StringSerializer
spring.kafka.consumer.enable-auto-commit=false
spring.kafka.consumer.auto-commit-interval=100
spring.kafka.consumer.max-poll-records=10
spring.kafka.consumer.key-deserializer=org.springframework.kafka.support.serializer.ErrorHandlingDeserializer
spring.kafka.consumer.value-deserializer=org.springframework.kafka.support.serializer.ErrorHandlingDeserializer
spring.kafka.consumer.group-id=connect-sql-server
spring.kafka.consumer.auto-offset-reset=earliest
spring.kafka.listener.ack-mode=manual-immediate
spring.kafka.listener.poll-timeout=3000
spring.kafka.listener.concurrency=3
spring.kafka.properties.spring.deserializer.key.delegate.class=org.apache.kafka.common.serialization.StringDeserializer
spring.kafka.properties.spring.deserializer.value.delegate.class=org.apache.kafka.common.serialization.StringDeserializer
This is a consumer-only application (no producers are used here).
When I run the application:
2020-12-07 10:59:40.361 WARN 58716 --- [ntainer#0-0-C-1] org.apache.kafka.clients.NetworkClient : [Consumer clientId=consumer-connect-sql-server-1, groupId=connect-sql-server] Connection to node -1 (/10.12.54.99:9092) could not be established. Broker may not be available.
2020-12-07 10:59:40.362 WARN 58716 --- [ntainer#0-0-C-1] org.apache.kafka.clients.NetworkClient : [Consumer clientId=consumer-connect-sql-server-1, groupId=connect-sql-server] Bootstrap broker 10.12.54.99:9092 (id: -1 rack: null) disconnected
All the ports are open in the Linux machine's firewall.
Any enlightenment would be very much appreciated.
You need to bind and advertise your server's public IP so that the brokers can be reached remotely. If you don't want to hardcode the IP, you can use an env file.
Do the following:
Create a config.env file.
Add this line to config.env, using your host IP:
DOCKER_HOST_IP=111.111.11.111
Update your docker-compose.yml:
version: '3'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    networks:
      - broker-kafka
    ports:
      - ${DOCKER_HOST_IP:-127.0.0.1}:2181:2181
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
  kafka:
    image: confluentinc/cp-kafka:latest
    networks:
      - broker-kafka
    restart: unless-stopped
    depends_on:
      - zookeeper
    ports:
      - ${DOCKER_HOST_IP:-127.0.0.1}:9092:9092
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENERS:
        INTERNAL://kafka:29092,
        EXTERNAL://localhost:9092
      KAFKA_ADVERTISED_LISTENERS:
        INTERNAL://kafka:29092,
        EXTERNAL://${DOCKER_HOST_IP:-127.0.0.1}:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP:
        INTERNAL:PLAINTEXT,
        EXTERNAL:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_LOG_RETENTION_HOURS: 12
  connect:
    image: cdc:latest
    networks:
      - broker-kafka
    depends_on:
      - zookeeper
      - kafka
    ports:
      - 8083:8083
    environment:
      CONNECT_BOOTSTRAP_SERVERS: kafka:29092
      CONNECT_REST_PORT: 8083
      CONNECT_GROUP_ID: connect-1
      CONNECT_CONFIG_STORAGE_TOPIC: connect-1-config
      CONNECT_OFFSET_STORAGE_TOPIC: connect-1-offsets
      CONNECT_STATUS_STORAGE_TOPIC: connect-1-status
      CONNECT_KEY_CONVERTER: org.apache.kafka.connect.json.JsonConverter
      CONNECT_VALUE_CONVERTER: org.apache.kafka.connect.json.JsonConverter
      CONNECT_OFFSET.STORAGE.REPLICATION.FACTOR: 1
      CONNECT_CONFIG.STORAGE.REPLICATION.FACTOR: 1
      CONNECT_OFFSET.STORAGE.PARTITIONS: 1
      CONNECT_STATUS.STORAGE.REPLICATION.FACTOR: 1
      CONNECT_STATUS.STORAGE.PARTITIONS: 1
      CONNECT_REST_ADVERTISED_HOST_NAME: localhost
  kafdrop:
    image: obsidiandynamics/kafdrop:latest
    networks:
      - broker-kafka
    depends_on:
      - kafka
    ports:
      - 19000:9000
    environment:
      KAFKA_BROKERCONNECT: kafka:29092
networks:
  broker-kafka:
    driver: bridge
It will bind to 127.0.0.1 if DOCKER_HOST_IP is not set.
Run the following command:
sudo docker-compose -f path-to-docker-compose.yml --env-file path-to-config.env up -d --force-recreate
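To see which address clients will be told to connect to, it can help to reproduce the ${DOCKER_HOST_IP:-127.0.0.1} substitution by hand. The sketch below is a simplified model of that one rule (use the default when the variable is unset or empty), not Compose's actual implementation; the function name advertised_listeners is hypothetical. An advertised listener is a single host:port pair per listener name.

```python
import os


def advertised_listeners(env=None, default_ip="127.0.0.1", port=9092):
    """Mimic ${DOCKER_HOST_IP:-127.0.0.1}: fall back to the default when the
    variable is unset or empty, then build the advertised listener list."""
    env = os.environ if env is None else env
    host_ip = env.get("DOCKER_HOST_IP") or default_ip
    return f"INTERNAL://kafka:29092,EXTERNAL://{host_ip}:{port}"


if __name__ == "__main__":
    # Prints the value the broker would advertise for the current environment.
    print(advertised_listeners())
```

With DOCKER_HOST_IP=10.12.54.99 this yields EXTERNAL://10.12.54.99:9092, which matches the spring.kafka.bootstrap-servers value in the question; without the variable, remote clients would be told to connect to 127.0.0.1.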

ElasticSearch cluster in docker always shut down one master in 3 node configuration

I run an Elasticsearch cluster in three Docker containers, with the docker-compose.yml files below.
When I run any two of them, the cluster forms and its status is GREEN, but when I start the third one, one of the nodes in the cluster is forced to shut down (its Docker container exits) and no error message is logged by the node that shut down.
node 1 docker-compose.yml:
version: '2'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:5.4.1
    container_name: elasticsearch1
    environment:
      - cluster.name=MoquiElasticSearch
      - bootstrap.memory_lock=true
      - discovery.zen.minimum_master_nodes=2
      - xpack.security.enabled=false
      - transport.publish_host=192.168.2.101
      - http.publish_host=192.168.2.101
      - discovery.zen.ping.unicast.hosts=192.168.2.101:9301,192.168.2.101:9302
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - TZ=Asia/Shanghai
    ulimits:
      memlock:
        soft: -1
        hard: -1
    mem_limit: 1g
    ports:
      - 9200:9200
      - 9300:9300
node 2 docker-compose.yml
version: '2'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:5.4.1
    container_name: elasticsearch2
    environment:
      - cluster.name=MoquiElasticSearch
      - bootstrap.memory_lock=true
      - discovery.zen.minimum_master_nodes=2
      - xpack.security.enabled=false
      - transport.publish_host=192.168.2.101
      - transport.publish_port=9301
      - transport.tcp.port=9301
      - http.publish_host=192.168.2.101
      - http.publish_port=9201
      - http.port=9201
      - discovery.zen.ping.unicast.hosts=192.168.2.101:9300,192.168.2.101:9302
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - TZ=Asia/Shanghai
    ulimits:
      memlock:
        soft: -1
        hard: -1
    mem_limit: 1g
    ports:
      - 9201:9201
      - 9301:9301
node 3 docker-compose.yml
version: '2'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:5.4.1
    container_name: elasticsearch3
    environment:
      - cluster.name=MoquiElasticSearch
      - bootstrap.memory_lock=true
      - discovery.zen.minimum_master_nodes=2
      - xpack.security.enabled=false
      - transport.publish_host=192.168.2.101
      - transport.publish_port=9302
      - transport.tcp.port=9302
      - http.publish_host=192.168.2.101
      - http.publish_port=9202
      - http.port=9202
      - discovery.zen.ping.unicast.hosts=192.168.2.101:9300,192.168.2.101:9301
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - TZ=Asia/Shanghai
    ulimits:
      memlock:
        soft: -1
        hard: -1
    mem_limit: 1g
    ports:
      - 9202:9202
      - 9302:9302
This is caused by Docker on Mac having a memory limit (it was 2.5 GB), which is not enough for 3 nodes, so one of them is forced to shut down.
After increasing the memory dedicated to the Docker engine, all 3 nodes are up and running and the Elasticsearch cluster is GREEN.
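The arithmetic behind this can be sketched quickly. The check below is only a rough upper-bound estimate: it assumes each container may grow up to its mem_limit of 1g (as set in the three Compose files) and compares the total against the 2.5 GB that Docker on Mac had available. The function names are hypothetical.

```python
def total_container_memory_gib(node_count, mem_limit_gib):
    """Upper bound on the memory the Elasticsearch containers may claim together."""
    return node_count * mem_limit_gib


def fits(node_count, mem_limit_gib, docker_memory_gib):
    """True if every node could reach its mem_limit within Docker's allocation."""
    return total_container_memory_gib(node_count, mem_limit_gib) <= docker_memory_gib


if __name__ == "__main__":
    print(fits(2, 1.0, 2.5))  # two nodes: 2.0 <= 2.5, prints True
    print(fits(3, 1.0, 2.5))  # three nodes: 3.0 > 2.5, prints False
```

This matches the observed behaviour: any two nodes run fine, while the third pushes the total past what the Docker engine was allowed to use.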
