Unable to write streamed data to sink file using Spring Cloud Data Flow

I am trying to create a data flow pipeline using the Spring Cloud Data Flow shell (not the UI), with twitterstream as the source and file as the sink. Here is what I did to configure the file sink:
dataflow:>stream create demo --definition "twitterstream --credentials | file --dir=/opt/datastream --mode=APPEND --filename=tweets.txt"
I can consume data from the Kafka topic, but nothing is written to the sink location above; the file is not even created. There is no error in the logs while deploying the stream. Eventually I will switch from the local file system to HDFS. Is there anything missing?
PS: I tried the default file sink (with no custom properties), which is supposed to create a default file inside /tmp/xd/output; that didn't happen either.

On the latest 1.0.0.RELEASE (GA), the following stream definition works.
dataflow:>stream create demo --definition "twitterstream | file --directory=/someFolder --mode=APPEND --name=demo.txt"
A couple of things to point out:
1) The twitterstream source does not support --credentials as an OOTB property. See here.
2) The file sink does not support --filename as an OOTB property; you'd have to use --name instead. See here.
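For completeness, a minimal sketch of creating and deploying the corrected stream in one step from the shell (the --deploy flag deploys immediately; the stream name, folder, and file name are just the examples from above):
dataflow:>stream create demo --definition "twitterstream | file --directory=/someFolder --mode=APPEND --name=demo.txt" --deploy
Once deployed, tweets should start appending to /someFolder/demo.txt.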

Related

Configuring Debezium MySQL connector via env vars

The only way to configure a Debezium connector (MySQL in my case) is to send a config to a running Kafka Connect instance via HTTP.
My question is: is it possible to supply this configuration when starting the Connect instance, via a properties file or (ideally) via env vars?
If you run a Connect worker in standalone mode, you can supply connector configuration on the command line (see details here; a sketch of a connector properties file follows below):
bin/connect-standalone worker.properties connector1.properties [connector2.properties connector3.properties ...]
For distributed mode, you can only use the REST API. But you can do some automation using tools like Ansible.
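As an illustration of the standalone approach, here is a minimal connector1.properties sketch for the Debezium MySQL connector (all host names, credentials, and topic names below are hypothetical placeholders; the property names follow the Debezium 1.x MySQL connector):
# hypothetical values - adjust to your environment
name=inventory-connector
connector.class=io.debezium.connector.mysql.MySqlConnector
database.hostname=localhost
database.port=3306
database.user=debezium
database.password=dbz
database.server.id=184054
database.server.name=dbserver1
database.include.list=inventory
database.history.kafka.bootstrap.servers=localhost:9092
database.history.kafka.topic=schema-changes.inventory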

Error deploying bar file using IBM Integration Toolkit

I am working on a project with IBM Integration Toolkit 10.0.
I have some existing bar files which I am trying to deploy to a remote IIB 10.0 node on a Linux machine using the IBM Integration Bus Toolkit.
The jar file myaction.node.jar contains the Java class FEBNode in the package com.abc.xyz (fully qualified name com.abc.xyz.FEBNode).
The jar file resides in the /opt/abc/newplugin directory, and this path is configured as the "User lil path".
I am getting an error like:
BIP2241E: A Loadable Implementation Library (.lil, .jar, or .par) is not found for message flow node
type 'ComAbcXyzFEBNode' in message flow 'Actions.JavaActions'.
The integration node received an instruction to create a message flow node of type 'ComAbcXyzFEBNode', in message flow 'Actions.JavaActions'.
The integration node cannot create nodes of this type because an implementation library for this node type does not exist in the LIL path.
The same error is also seen when deploying from the command line:
./mqsideploy -i localhost -p 4414 -e default -a /tmp/barfiles/MyWrapper.bar
Please give me some pointers on how to resolve this error.
I would recommend reading about shared-classes; that should fix your issue easily.
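As a rough sketch of that approach, assuming the default IIB workpath of /var/mqsi on Linux (the node name IIBNODE is a hypothetical placeholder; restart the node so it picks up the jar):
mkdir -p /var/mqsi/shared-classes
cp /opt/abc/newplugin/myaction.node.jar /var/mqsi/shared-classes/
mqsistop IIBNODE
mqsistart IIBNODE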

How to change the "Kafka Connect" component port?

On port 8083 I am running InfluxDB, whose GUI I can reach at http://localhost:8083.
Now on to Kafka. I am following the setup at https://kafka.apache.org/quickstart
I start ZooKeeper, which is in the folder /opt/zookeeper-3.4.10, with the command: bin/zkServer.sh start
With ZooKeeper started, I then start Kafka under the /opt/kafka_2.11-1.1.0 folder:
bin/kafka-server-start.sh config/server.properties
Next I create a topic named "test" with a single partition and only one replica:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
The topic is created and can be verified with:
bin/kafka-topics.sh --list --zookeeper localhost:2181
Up to this point everything works fine.
Now I need to use the Kafka Connect component to import/export data.
So I create some seed data: echo -e "foo\nbar" > test.txt
Now I start Kafka Connect with the file source and sink connector configurations:
bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties
After running the above command I get: Address already in use
Kafka Connect has stopped
I even changed rest.port=8084 in /opt/kafka_2.11-1.1.0/config/connect-distributed.properties so it would not conflict with InfluxDB, which is already on 8083. Still I get the same "Address already in use" error and Kafka Connect stops.
Since you're using Kafka Connect in Standalone mode, you need to change the REST port in config/connect-standalone.properties:
rest.port=18083
To understand more about Standalone vs Distributed you can read the doc here.
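To verify the worker came up on the new port after a restart, you can hit the Connect REST API (assuming the rest.port=18083 value above):
curl http://localhost:18083/connectors
An empty JSON array [] means the worker is running with no connectors deployed yet.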
Kafka Connect standalone mode uses port 8083 for the REST API by default. Because of this, if another process is already using that port, the worker will throw a BindException.
To change the port, open the config/connect-standalone.properties file in the Kafka root directory.
Add the following key-value property to change the port used for the REST API. (Kafka could have included this in the properties file by default; many developers go nuts trying to find the port used in standalone mode.) Use whatever free port you like.
rest.port=11133
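After saving the file, restart the standalone worker with the same command as in the question, and the REST API should bind to the new port:
bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties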
Kafka 3.0.0
Since Kafka Connect is intended to be run as a service, it also provides a REST API for managing connectors. The REST API server can be configured using the listeners configuration option. This field should contain a list of listeners in the following format: protocol://host:port,protocol2://host2:port2. Currently supported protocols are http and https.
For example: listeners=http://localhost:8080,https://localhost:8443
By default, if no listeners are specified, the REST server runs on port 8083 using the HTTP protocol.
More details: https://kafka.apache.org/documentation/#connect_rest
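In practice that means adding a line like the following to config/connect-standalone.properties (or connect-distributed.properties); the port 18083 here is an arbitrary free port:
listeners=http://localhost:18083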
Change the port definition in config/server.properties:
# The port the socket server listens on
port=9092
(Note that server.properties configures the Kafka broker's listener port, not the Connect REST port.)

Spring cloud data flow shell : Stuck on "The stream is being deployed"

I successfully registered three apps named appSink, appSource and appProcessor as follows:
dataflow:>app register --name appSource --type source --uri maven://com.example:source:jar:0.0.1-SNAPSHOT --force
Successfully registered application 'source:appSource'
dataflow:>app register --name appProcessor --type processor --uri maven://com.example:processor:jar:0.0.1-SNAPSHOT --force
Successfully registered application 'processor:appProcessor'
dataflow:>app register --name appSink --type sink --uri maven://com.example:sink:jar:0.0.1-SNAPSHOT --force
Successfully registered application 'sink:appSink'
dataflow:>app list
╔═════════╤════════════╤═══════╤════╗
║ source  │ processor  │ sink  │task║
╠═════════╪════════════╪═══════╪════╣
║appSource│appProcessor│appSink│    ║
╚═════════╧════════════╧═══════╧════╝
I then created and deployed a stream as follows:
dataflow:>stream create --name myStream --definition 'appSource | appProcessor | appSink'
Created new stream 'myStream'
dataflow:>stream deploy --name myStream
I get the message
Deployment request has been sent for stream 'myStream'
In the streams list I see
║myStream1 │source-app | processor-app | sink-app│The stream is being deployed. ║
It seems the deployment never finishes. The Data Flow server logs are just stuck on this:
o.s.c.d.spi.local.LocalAppDeployer : Deploying app with deploymentId myStream1.source-app instance 0.
Why is my stream not deploying successfully?
Do you see any Java processes running on your local machine (corresponding to the applications being deployed)?
You can try remote debugging your application deployment using the doc: https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#_remote_debugging
You can also try inheriting the apps' logging using:
https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#_log_redirect
I am seeing this same problem. I inherited the logging as you suggested. The UI never moves off of Deploying status. There are no errors in the logs and my stream is working when I test it.
Add the Spring Boot Actuator dependency to your project; Data Flow calls /health and /info to determine whether the app is deployed.
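For example, a minimal sketch of the Maven dependency (assuming your apps are Spring Boot projects with the Boot parent or BOM managing the version):
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>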

Missing modules in XD

I am trying to resolve a very basic issue with the Quick Start guide of Spring XD, but have already spent more than an hour on it.
I am following guide at http://projects.spring.io/spring-xd/#quick-start
But when I try to create a stream using the following:
stream create --definition "time | log" --name ticktock --deploy
it does not find the standard module "log":
Command failed org.springframework.xd.rest.client.impl.SpringXDException: Could not find module with name 'log' and type 'sink'
I tried changing the XD_HOME value to:
/Users/sudhir/MyPrograms/spring-xd-1.2.0.RELEASE
/Users/sudhir/MyPrograms/spring-xd-1.2.0.RELEASE/spring-xd/xd
/Users/sudhir/MyPrograms/spring-xd-1.2.0.RELEASE/spring-xd
I also tried running xd-singlenode and xd-shell from XD_HOME using the complete path.
Well, in any case you should just be able to cd into
$INSTALLDIRECTORY/spring-xd-1.2.0.RELEASE/xd/bin and then run ./xd-singlenode.
cd $INSTALLDIRECTORY/spring-xd-1.2.0.RELEASE/xd/bin
./xd-singlenode
Once your singlenode container is up and running, start xd-shell from $INSTALLDIRECTORY/spring-xd-1.2.0.RELEASE/shell/bin:
./xd-shell
And you should at least be able to get ticktock up and running. The XD_HOME issue is probably just your own environment setup.
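For completeness, the ticktock stream from the question can then be created from inside the shell (this is the same definition as above; time and log are standard out-of-the-box modules):
xd:>stream create --definition "time | log" --name ticktock --deploy
You should then see timestamps being logged by the singlenode container.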
