Alternative to Kafka Connect due to ARM incompatibility - macOS

I've been using Kafka for a while (using the wurstmeister images, though I have also set up the broker and Zookeeper with the Confluent images and that works) and am now trying to set up Kafka Connect so I can load messages from a Kafka topic directly into S3. However, I've been running into several issues: QEMU errors, java.lang.ExceptionInInitializerError at org.eclipse.jetty.http.MimeTypes, etc., which I've read have something to do with the lack of ARM support (https://github.com/confluentinc/common-docker/issues/117 and https://github.com/docker/buildx/issues/542). I have tried running the Docker Compose setup with platform: linux/amd64, but it still doesn't work.
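For context, the relevant part of my compose file looks roughly like this (the image tag and service names are placeholders):

connect:
  image: confluentinc/cp-kafka-connect:6.2.0   # placeholder tag
  platform: linux/amd64                        # forces x86 emulation on Apple Silicon
  depends_on:
    - kafka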
I was wondering if anyone has any workarounds to make Kafka Connect work, or knows of any alternatives.
Thanks!

You don't need Docker to run Kafka Connect.
Install Java on your Mac
Download Kafka
Run Zookeeper and Kafka (this might have issues on an M1 Mac)
Run bin/connect-distributed.sh config/connect-distributed.properties
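Concretely, that sequence looks something like this (the Kafka version and download URL are just examples):

# download and unpack Kafka
curl -O https://archive.apache.org/dist/kafka/2.8.0/kafka_2.13-2.8.0.tgz
tar -xzf kafka_2.13-2.8.0.tgz && cd kafka_2.13-2.8.0
# start Zookeeper, then the broker (separate terminals, or background them in order)
bin/zookeeper-server-start.sh config/zookeeper.properties &
sleep 5   # give Zookeeper a moment to come up
bin/kafka-server-start.sh config/server.properties &
# start Kafka Connect in distributed mode
bin/connect-distributed.sh config/connect-distributed.properties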
If you really need Docker, you can rebuild images from other sources, such as mine, which builds from the adoptopenjdk:11-jre base image, which supports ARM.
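For illustration, a minimal sketch of such a Dockerfile (the Kafka version and layout are assumptions, not the exact image I publish):

FROM adoptopenjdk:11-jre
# fetch and unpack Kafka; the version is an example
ADD https://archive.apache.org/dist/kafka/2.8.0/kafka_2.13-2.8.0.tgz /opt/
RUN cd /opt && tar -xzf kafka_2.13-2.8.0.tgz && ln -s /opt/kafka_2.13-2.8.0 /opt/kafka
WORKDIR /opt/kafka
CMD ["bin/connect-distributed.sh", "config/connect-distributed.properties"]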

Related

Why is RocksDB library throwing loadlibrary error in Kafka Streams

We are running Kafka Streams in an AWS Fargate container and get the error below while starting the application. How can I avoid this?
Exception in thread java.lang.UnsatisfiedLinkError: /tmp/librocksdbjni3589189542893555938.so: Error loading shared library ld-linux-x86-64.so.2:
No such file or directory (needed by /tmp/librocksdbjni3589189542893555938.so)
I don't know what image the AWS Fargate container uses; however, prebuilt RocksDB native libraries are not available for all operating systems.
As you can infer from the error, the RocksDB build you use requires ld-linux-x86-64.so.2, which seems not to be available in the container image. Not sure if you can tweak the image accordingly.
You can also try to compile RocksDB from scratch, targeting the build at the container image. As an alternative, you could run with in-memory stores instead of RocksDB, or implement a custom state store.
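If the image turns out to be Alpine-based (musl instead of glibc), which commonly produces exactly this error, one tweak that can work is installing the glibc compatibility package; a sketch, assuming an Alpine base image:

FROM openjdk:8-jre-alpine
# libc6-compat provides ld-linux-x86-64.so.2, which the bundled RocksDB JNI library loads
RUN apk add --no-cache libc6-compat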

Can I run a Mesos/Marathon application on a specific host?

I want to use Marathon for cluster monitoring and management. Is the scenario below possible?
My scenario:
Five Cassandra nodes were already deployed and are running.
The Cassandra hosts are physical machines.
I want to run a script that verifies the health of Cassandra on each host, e.g. the cassandra process, disk usage, number of files, ...
If a problem is found on a host, a correcting script is then run on that host; scripts are launched manually.
Each script can be run as a Marathon application, but I could not find a way to run an application on a (specific) failing host.
There are no restrictions on adding machines or installing Mesos components.
And if you know of a more suitable tool, please recommend it!
If you are not running Cassandra on Mesos, I think Marathon is not the best choice. From your description, it looks like you need a monitoring tool (e.g., Nagios) rather than service orchestration.
That said, please extend your question with more information; it's not entirely clear what you are asking.

Windows Zookeeper Cluster

I would like to understand if it is possible to run a Zookeeper cluster on Windows and if so, how to set it up. If it is not possible, given that it is a Java application, I would like to understand why.
From my research, it seems Windows is supported for a standalone setup and to aid development, but not for production (see the ZooKeeper system requirements).
As of ZooKeeper 3.4.9, Windows is supported for production. From the ZooKeeper Administrator's Guide 3.4.9:
"Windows 7 and Windows Server 2008 are supported as development and production platforms for both server and client."
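Ensemble configuration on Windows looks the same as on Linux; a minimal sketch of conf\zoo.cfg for a three-node cluster (hostnames and paths are placeholders), with each node's dataDir containing a myid file holding that node's id, each server started via bin\zkServer.cmd:

tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181
# use forward slashes in Windows paths
dataDir=C:/zookeeper/data
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888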

Can I run DCE (Docker Container Executor) on Yarn with Kerberos?

The Hadoop documentation states that DCE does not support a cluster in secure mode (Kerberos): https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/DockerContainerExecutor.html
Are people working on this? Is there a way around this limitation?
OK: there's no current work on DCE (YARN-2466). Efforts have shifted towards supporting Docker containers in the LinuxContainerExecutor (YARN-3611), which will support Kerberos. There is no documentation yet (YARN-5258), and many of these features are expected to land in the Apache 2.8 release.
Source and more info:
https://community.hortonworks.com/questions/39064/can-i-run-dce-docker-container-executor-on-yarn-wi.html
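For reference, once you are on a release that ships the LinuxContainerExecutor Docker runtime, enabling it is a yarn-site.xml change along these lines (a sketch; values are illustrative, and the secure-mode specifics are in the Hadoop docs):

<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <!-- let containers opt into the Docker runtime -->
  <name>yarn.nodemanager.runtime.linux.allowed-runtimes</name>
  <value>default,docker</value>
</property>

Applications then opt in per container, e.g. by setting the environment variable YARN_CONTAINER_RUNTIME_TYPE=docker.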

Issues with an Elasticsearch server running in Docker

I am new to Docker. I have developed a webapp that needs an Elasticsearch server up and running; the server itself listens for user requests. In our workflow, we would like to bring up Elasticsearch, then have Logstash publish data to it (reading the data we have via the Logstash conf files), and finally launch our webapp. I was advised to use Docker Compose, as it helps to compose multiple services.
So, we have three sub-directories, one each for ES, Logstash, and the webapp.
As a first step, my Elasticsearch Dockerfile has the lines below:
FROM elasticsearch:latest
RUN plugin -i elasticsearch/marvel/latest
Similarly, I have Dockerfile in other sub-directories as well.
I use the command docker-compose up to start the process. First ES is built, and then Logstash is built. While Logstash is being built, it tries to publish data to ES, but finds that ES is not up and running; I see a connection refused exception. Can someone tell me why this error occurs? Are the contents of the Dockerfile OK?
I think I am not using docker / docker-compose the way I should. Maybe a couple of pointers to learning materials would be helpful; I find plenty, but could not relate them to my use case.
There are two phases when running docker-compose up: the "build" phase, which runs docker build, and the "run" phase, which runs the equivalent of docker run. During the build phase there is no support for linking to other containers, so you won't be able to connect to them.
An easy solution is to run that part of the setup during the "run" phase instead; the downside is that it isn't cached and has to be performed each time.
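For example, a small wrapper used as the logstash container's command defers the publish until ES answers (the hostname elasticsearch assumes a Compose service of that name):

#!/bin/sh
# block until Elasticsearch responds on its HTTP port, then start logstash
until curl -s http://elasticsearch:9200/ >/dev/null; do
  sleep 1
done
exec logstash -f /etc/logstash/conf.d/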
To include it as part of the build phase, you can export the JSON that needs to be sent to ES, include it in the ES build context, and then add a build step that does something like this:
# start ES in the background, recording its pid (paths and index name are examples)
/usr/share/elasticsearch/bin/elasticsearch -d -p /tmp/es.pid
# wait for ES to be available
until curl -s http://localhost:9200/ >/dev/null; do sleep 1; done
# run curl to PUT/POST the data to ES
curl -XPOST 'http://localhost:9200/myindex/_bulk' --data-binary @/data.json
# stop ES
kill "$(cat /tmp/es.pid)"
