This may be a naive question, but can anyone give me the sbt dependency for KSQL?
I checked Maven Central but couldn't find one.
Is the dependency hosted somewhere other than Maven Central? If so, which resolver do I have to add to my build.sbt file?
I'm trying to write a Scala app that uses KSQL to query some Kafka topics and build a dashboard with some metrics.
None of the Confluent dependencies are in Maven Central. See
https://docs.confluent.io/current/installation/clients.html#maven-repository-for-jars
I think this is the KSQL artifact you are looking for:
<dependency>
    <groupId>io.confluent.ksql</groupId>
    <artifactId>ksql-engine</artifactId>
</dependency>
Example Java code - https://github.com/confluentinc/ksql/tree/master/ksqldb-examples/src/main/java/io/confluent/ksql/embedded
You don't need to embed KSQL in your code, though. It's meant to run independently on the KSQL Server, to which you can submit statements from code or through the KSQL CLI. In your application, you'd use a regular consumer or the Kafka Streams API directly.
I would suggest trying the new Scala Kafka Streams wrapper, too.
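To illustrate what "submitting from code" can look like, here is a minimal sketch that posts a statement to the KSQL Server's REST endpoint. It is written in Python for brevity; the same request can be made from any HTTP client, including one in Scala. The server address (localhost on the default port 8088) and the helper names build_ksql_request and run_statement are assumptions for illustration, not part of any Confluent client library.

```python
import json
import urllib.request

# Assumption: a KSQL Server is reachable here (8088 is the default listener port).
KSQL_URL = "http://localhost:8088"


def build_ksql_request(statement, base_url=KSQL_URL):
    """Build (but do not send) a POST request for the server's /ksql endpoint."""
    payload = {"ksql": statement, "streamsProperties": {}}
    return urllib.request.Request(
        url=base_url + "/ksql",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/vnd.ksql.v1+json; charset=utf-8"},
        method="POST",
    )


def run_statement(statement, base_url=KSQL_URL):
    """Send the statement to the server and return the decoded JSON response."""
    with urllib.request.urlopen(build_ksql_request(statement, base_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a server running, run_statement("LIST STREAMS;") should return the server's JSON description of the known streams. Statements that define streams and tables go to /ksql; continuous SELECT queries use the separate /query endpoint.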
I have been trying for a while now, and I'm sure the solution is simple enough; I'm just struggling to find it. I'm pretty new, so go easy on me!
It's a requirement to do this using a premade init script, which is then selected in the UI when configuring the cluster.
I am trying to install com.microsoft.azure:azure-eventhubs-spark_2.12:2.3.18 on a cluster on Azure Databricks. The documentation's example, which installs a PostgreSQL driver, produces an init script using the following command:
dbutils.fs.put("/databricks/scripts/postgresql-install.sh","""#!/bin/bash
wget --quiet -O /mnt/driver-daemon/jars/postgresql-42.2.2.jar https://repo1.maven.org/maven2/org/postgresql/postgresql/42.2.2/postgresql-42.2.2.jar""", True)
My question is, what is the /mnt/driver-daemon/jars/postgresql-42.2.2.jar section of this code? And what would I have to do to make this work for my situation?
Many thanks in advance.
/mnt/driver-daemon/jars/postgresql-42.2.2.jar is the output path where the downloaded jar file is written. That path makes little sense, though, because a jar placed there is not on the classpath and won't be found by Spark. Jars need to be put into the /databricks/jars/ directory, where Spark picks them up automatically.
However, downloading jars this way only works for jars without dependencies. Libraries like the EventHubs connector are not in that category: they won't work unless their dependencies are downloaded as well. It's better to use the Cluster UI or the Libraries API (or the Jobs API for jobs) instead, because those methods fetch all dependencies too.
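For a jar that really has no dependencies, such as the PostgreSQL driver from the question, the corrected target directory could look like the sketch below. The helper names jar_url and init_script are made up for illustration; the URL is built from the standard Maven repository layout.

```python
MAVEN_CENTRAL = "https://repo1.maven.org/maven2"


def jar_url(group_id, artifact_id, version):
    """Maven Central URL of a jar, following the standard repository layout."""
    group_path = group_id.replace(".", "/")
    return f"{MAVEN_CENTRAL}/{group_path}/{artifact_id}/{version}/{artifact_id}-{version}.jar"


def init_script(group_id, artifact_id, version, target_dir="/databricks/jars"):
    """Text of an init script that downloads a single jar into target_dir."""
    url = jar_url(group_id, artifact_id, version)
    target = f"{target_dir}/{artifact_id}-{version}.jar"
    # The shebang must be the very first line of the script.
    return f"#!/bin/bash\nwget --quiet -O {target} {url}\n"


script = init_script("org.postgresql", "postgresql", "42.2.2")

# In a Databricks notebook, where `dbutils` is predefined, the script could be stored with:
# dbutils.fs.put("/databricks/scripts/postgresql-install.sh", script, True)
```

This only changes where the jar lands. It does not resolve transitive dependencies, so it is not a fix for the EventHubs connector.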
P.S. Really, instead of using the EventHubs connector, it's better to use the Kafka protocol, which EventHubs supports as well. There are several reasons for that:
It's better from a performance standpoint
It's better from a stability standpoint
The Kafka connector is included in DBR, so you don't need to install anything extra
You can read how to use Spark + EventHubs + Kafka connector in the EventHubs documentation.
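As a rough sketch of what that looks like, the Kafka source options for an EventHubs namespace can be assembled as below. The function name and the placeholder values are hypothetical. The kafkashaded. prefix on the login module is what Databricks' bundled Kafka client expects; on plain Apache Spark the class is org.apache.kafka.common.security.plain.PlainLoginModule.

```python
def eventhubs_kafka_options(namespace, connection_string, topic):
    """Spark Kafka-source options for the Kafka endpoint of an EventHubs namespace."""
    jaas = (
        "kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule required "
        # The username is the literal string $ConnectionString, not a variable.
        f'username="$ConnectionString" password="{connection_string}";'
    )
    return {
        "kafka.bootstrap.servers": f"{namespace}.servicebus.windows.net:9093",
        "kafka.security.protocol": "SASL_SSL",
        "kafka.sasl.mechanism": "PLAIN",
        "kafka.sasl.jaas.config": jaas,
        "subscribe": topic,  # the event hub name plays the role of the Kafka topic
    }


# Placeholder values for illustration only.
opts = eventhubs_kafka_options("my-namespace", "Endpoint=sb://...", "my-event-hub")

# In a Databricks notebook, where `spark` is predefined:
# df = spark.readStream.format("kafka").options(**opts).load()
```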
Problem Description
I have developed a custom MuleSoft connector using Anypoint Studio, following all the guidelines on how to do it. However, I am struggling to write MUnit functional tests for that connector or to include some example flows. The issue is that the connector project cannot "import itself": the components I developed for people who import my connector (via Maven, for example) are not available to me in the Mule Palette for flows in my own src/main/mule location.
Question
Is there a way to import components from my connector inside the connector project itself, so that I can use them in an example flow? If not, is the right approach to create a new, separate project that imports my connector and keep all my tests there?
Test cases for Mule 4 connectors can be written as described in the documentation, using JUnit and Java test cases: https://docs.mulesoft.com/mule-sdk/1.1/testing-writing-your-first-test-case
Maven knows how to handle the dependencies for tests so that should not be a problem.
If you want to also integrate MUnit tests you can take a peek at how other connectors do it. You can inspect the open source connectors.
Examples:
File Connector: https://github.com/mulesoft/mule-file-connector/
HTTP Connector: https://github.com/mulesoft/mule-http-connector/
I'm attempting to solve a problem using kstreams. I'm currently hitting this error when doing an aggregation.
Exception in thread "main" java.lang.NoClassDefFoundError: org/rocksdb/RocksDBException
at org.apache.kafka.streams.state.internals.RocksDbWindowBytesStoreSupplier.get(RocksDbWindowBytesStoreSupplier.java:50)
at org.apache.kafka.streams.state.internals.RocksDbWindowBytesStoreSupplier.get(RocksDbWindowBytesStoreSupplier.java:24)
at org.apache.kafka.streams.state.internals.WindowStoreBuilder.build(WindowStoreBuilder.java:40)
at org.apache.kafka.streams.state.internals.WindowStoreBuilder.build(WindowStoreBuilder.java:26)
at org.apache.kafka.streams.processor.internals.InternalTopologyBuilder$StateStoreFactory.build(InternalTopologyBuilder.java:141)
at org.apache.kafka.streams.processor.internals.InternalTopologyBuilder.buildProcessorNode(InternalTopologyBuilder.java:966)
at org.apache.kafka.streams.processor.internals.InternalTopologyBuilder.build(InternalTopologyBuilder.java:869)
at org.apache.kafka.streams.processor.internals.InternalTopologyBuilder.build(InternalTopologyBuilder.java:822)
at org.apache.kafka.streams.processor.internals.InternalTopologyBuilder.build(InternalTopologyBuilder.java:805)
at org.apache.kafka.streams.KafkaStreams.<init>(KafkaStreams.java:667)
at org.apache.kafka.streams.KafkaStreams.<init>(KafkaStreams.java:624)
at org.apache.kafka.streams.KafkaStreams.<init>(KafkaStreams.java:534)
My code is effectively this:
KStream<String, InputData> input = builder.stream(topicname);
KTable<Windowed<String>, CustomAgg> grouped =
    input.groupByKey()
        .windowedBy(TimeWindows.of(Duration.ofMillis(60000)))
        .aggregate(
            CustomAgg::new,
            (k, v, agg) -> agg.add(v),
            Materialized.<String, CustomAgg, WindowStore<Bytes, byte[]>>as("aggs")
                .withValueSerde(new CustomAggSerde()));
grouped.toStream().print(Printed.toSysOut());
kafka-streams version 2.1.0
I can't seem to find any resources online on how to set up RocksDB for Kafka Streams, so any advice would be much appreciated. I have it installed with brew, but I'm not sure whether I need to point to it, whether any setup is required, or whether it needs to be in my pom.xml file. I'm currently working on macOS for development.
Thanks!
You do not need to install RocksDB for Kafka Streams. RocksDB is a dependency of Kafka Streams. If you have Kafka Streams as a dependency in your build automation tool (e.g. Maven or Gradle), the RocksDB JAR should be downloaded automatically during a build and put on your class path.
Without a build automation tool you probably need to put the RocksDB JAR on the class path manually. The correct version of RocksDB for Kafka Streams 2.1.0 should be 5.14.2.
The error you get seems to be a class path issue, so maybe it is related to the above.
Try adding the dependency below to your pom.xml:
<dependency>
    <groupId>org.rocksdb</groupId>
    <artifactId>rocksdbjni</artifactId>
    <version>4.9.0</version>
</dependency>
This link might be helpful:
https://technology.amis.nl/software-development/java/getting-started-with-kafka-streams-building-a-streaming-analytics-java-application-against-a-kafka-topic/
I recently discovered that Spring has an alpha version of a Spring Cloud Stream binder that leverages JMS (ActiveMQ virtual destinations under the hood). This is absolutely fascinating and I want to test it out. I am having difficulty either finding a snapshot of the dependencies that I can use, or pulling and building the correct GitHub projects so that I have the dependencies in my local repository. I would appreciate any assistance with this.
http://activemq.apache.org/amqp.html
https://github.com/spring-cloud/spring-cloud-stream-binder-jms
https://github.com/spring-cloud/spring-cloud-stream-binder-jms/tree/master/spring-cloud-stream-binder-jms-activemq
We are in the process of restructuring the repositories for the JMS binder, and we don't yet have the CI processes that build the necessary artifacts (they should be in place by early next week).
For now, you can try building and installing https://github.com/spring-cloud/spring-cloud-stream-binder-jms (which also contains the ActiveMQ support). We'll decide later if we need a separate repository for ActiveMQ.
Spring XD on YARN: ver 1.2.1 direct binding support for Kafka source.
1. I know this is not supported yet (as of ver 1.3.0). Is there a definite date or version? That would help our project schedule.
2. Direct binding support for the Kafka source is critical for our project. We are close to abandoning Spring XD on YARN entirely just because of this.
Trying to do
stream create --name directkafkatohdfs --definition "kafka | hdfs"
stream deploy directkafkatohdfs --properties "module.*.count=0"
This hits the exception "must be a positive number. 0-count kafka sources are not currently supported".
I just want to eliminate the message bus/transport (Redis/Kafka/RabbitMQ) and bind the source (kafka) directly to the sink (hdfs) in the same YARN container.
Thanks
Satish Srinivasan
satsrinister#gmail.com
Thanks for the interest in Spring XD :).
For Spring XD 1.x, we suggest using composition instead of direct binding with the Kafka bus - or, in your case, the Kafka source. However, apart from that, in Spring XD 1.x it is not possible to create an entire stream without at least one hop over the bus (regardless of the type of bus or modules being used).
We are addressing direct binding (including support for entire directly bound streams) as part of Spring Cloud Data Flow (http://cloud.spring.io/spring-cloud-dataflow/) - which is the next evolution of Spring XD. We are intending to support it as a specific configuration option, rather than as a side-effect of zero-count modules. From an end-user perspective, SCDF supports the same DSL as Spring XD (with minor variations) and has the same administration UI, and definitely supports YARN, so it should be a fairly seamless transition. I would suggest starting to take a look at that. The upcoming 1.0.0.M2 release of Spring Cloud Data Flow will not support direct binding via DSL yet, but the intent is to support it in the final release which is currently planned for Q1 2016.