Create Kafka connector during startup without REST API - apache-kafka-connect

I have to leverage a Kafka connector as a source, but creating the connector via the REST API in production is something I want to avoid. Is it possible to create the connector during startup without using the REST API?
I understand the REST API provides the flexibility to dynamically create/configure connectors and size the tasks, but I really want to know if the same can be done during startup, e.g. by providing a configuration property.
Currently I start the connector in distributed mode by supplying a properties file, and I want to specify the database, filters, transformers, and other details there itself.
Let me know if there is a way to achieve this.
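For reference, Kafka Connect's standalone mode does exactly this: the worker reads one or more connector properties files at startup and creates the connectors without any REST call (distributed mode has no equivalent startup option; connectors there are created through the REST API). A minimal sketch, with illustrative file names and a hypothetical JDBC source connector:

bin/connect-standalone.sh config/worker.properties config/my-source.properties

# config/my-source.properties -- connector config read at startup
name=my-jdbc-source
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
connection.url=jdbc:postgresql://localhost:5432/mydb
mode=incrementing
incrementing.column.name=id
topic.prefix=db-

The trade-off is that standalone mode keeps offsets in a local file and does not scale across multiple workers, so it fits single-node deployments.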

Related

Logstash vs Spring Cloud Data Flow: which one is suitable for data preprocessing?

I'm using Spring Boot along with Elasticsearch to build a search system for my website.
I have some data that I need to push into Elasticsearch, and this data (a product, for example) must be processed first: passed to another microservice that filters the JSON, adds some fields for better search results, does some calculations, and returns the object I want to store. Is it possible to do so with Logstash, or do I need to use Spring Cloud Data Flow?
What I want to do:
save a product (product service)
log the saved product or stream it
process it before storage (another service)
save the document (Elasticsearch server)
Thanks in advance.
Obviously it depends on various factors, but I can try to provide some insights on Spring Cloud Data Flow from a technical standpoint.
If you want to construct a streaming pipeline where your filtering apps are connected via a messaging system that does this flow of data processing, you can check out Spring Cloud Data Flow.
Spring Cloud Data Flow (and the underlying frameworks it builds on, such as Spring Cloud Stream and Spring Cloud Task) provides operational benefits for managing your streaming pipelines, but it may not make sense if you don't need a data pipeline with a messaging system. In those cases, you would just stick to a simple Spring Boot app that implements the whole filtering model. As soon as you start distributing these applications, loosely coupled via a messaging system, Spring Cloud Data Flow becomes handy.
Please check out the SCDF guide to understand its features and recipes, and choose what fits your case.
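To make the "process it before storage" step concrete, here is a minimal sketch of that service as a Spring Cloud Stream processor, assuming the functional model of Spring Cloud Stream 3.x and a Kafka or Rabbit binder on the classpath; the Product class and the enrichment logic are hypothetical placeholders:

import java.util.function.Function;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;

@SpringBootApplication
public class EnrichProcessorApplication {

    public static void main(String[] args) {
        SpringApplication.run(EnrichProcessorApplication.class, args);
    }

    // With spring-cloud-stream on the classpath, the binder binds this function
    // to input/output destinations (e.g. Kafka topics) configured in properties.
    @Bean
    public Function<Product, Product> enrich() {
        return product -> {
            // hypothetical enrichment: derive a field for better search results
            product.searchKeywords = product.name.toLowerCase();
            return product;
        };
    }

    // Hypothetical payload; in practice this is your product service's JSON model.
    public static class Product {
        public String name;
        public String searchKeywords;
    }
}

In an SCDF stream definition this app would then sit between a source and a sink, e.g. product-source | enrich | product-sink (names illustrative).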

Save Spring Boot output to Elasticsearch engine

I have a REST API through which I am sending and receiving messages to and from the Kafka server using Spring Boot. Now I want to save those messages to Elasticsearch. Can anyone help me with how to do it?
This is really a systematic job; in a way, it is like setting up a database storage architecture.
To keep it short and simple:
First, you need to decide which ES version you want to use, because there are some breaking changes between ES 2.x and 7.x, and those differences may affect the way you design the schema of your storage.
Assuming you use the latest 7.x ES, you will need to create the index(es) where you want the data fetched from Kafka to be stored. Check out https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html
Once you have the indexes created, you need to pick up some basic knowledge of the ES high-level REST client and low-level REST client. The low-level REST client gives you a basic connection to the ES cluster via HTTP, while the high-level REST client APIs give you sufficient ways to do operations such as document CRUD, search, and aggregations on your data. You can easily find the dependencies via Maven and use them in your Spring Boot application. Check out https://www.elastic.co/guide/en/elasticsearch/client/java-rest/master/java-rest-high.html
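A minimal sketch of indexing one message with the 7.x high-level REST client (the index name "messages", the host, and the field names are all illustrative):

import java.util.Map;

import org.apache.http.HttpHost;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;

public class MessageIndexer {
    public static void main(String[] args) throws Exception {
        // Connect to a local single-node ES 7.x cluster over HTTP.
        try (RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")))) {
            // Index one document into the "messages" index.
            IndexRequest request = new IndexRequest("messages")
                    .source(Map.of(
                            "body", "hello from kafka",
                            "receivedAt", System.currentTimeMillis()));
            IndexResponse response = client.index(request, RequestOptions.DEFAULT);
            System.out.println("indexed document id: " + response.getId());
        }
    }
}

In a Spring Boot setup you would typically register the client as a bean and call it from your Kafka consumer rather than from a main method.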

Is it possible to create a Kafka Connector without REST request?

I have started my worker (distributed) through Java code and I want my connector to be started along with it. I don't want to use a REST call (not from a browser and not from code) to create my connector.
I just want a simple Kafka API to invoke that creates my connector.
Any help will be appreciated.
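For what it's worth, the REST layer ultimately delegates to the worker's internal Herder API, and the ConnectStandalone entry point uses the same call path to register connectors at startup. This is internal runtime API, not a supported public one, and its signatures move between Kafka versions; a hedged sketch, assuming you already hold a Herder instance from your own worker bootstrap code and using the built-in FileStreamSource purely as an illustration:

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

import org.apache.kafka.connect.runtime.Herder;
import org.apache.kafka.connect.runtime.rest.entities.ConnectorInfo;
import org.apache.kafka.connect.util.FutureCallback;

public class ConnectorRegistrar {
    // "herder" comes from the code that started your worker (internal API).
    public static void register(Herder herder) throws Exception {
        Map<String, String> props = new HashMap<>();
        props.put("name", "my-file-source");             // illustrative connector name
        props.put("connector.class", "FileStreamSource");
        props.put("tasks.max", "1");
        props.put("file", "/tmp/input.txt");
        props.put("topic", "connect-test");

        FutureCallback<Herder.Created<ConnectorInfo>> cb = new FutureCallback<>();
        herder.putConnectorConfig("my-file-source", props, false, cb);
        cb.get(60, TimeUnit.SECONDS);                    // wait for registration to complete
    }
}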

Spring Cloud Connector Plan Information

I am using Spring Cloud Connectors to bind to databases. Is there any way to get the plan of the bound service? When I extend an AbstractCloudConfig and do
cloud().getSingletonServiceInfosByType(PostgresqlServiceInfo.class)...
I will have information on the URL and how to connect to Postgres. PostgresqlServiceInfo and the other classes do not carry along the plan data. How can I extend the service info in order to read this information from VCAP_SERVICES?
Thanks
By design, the ServiceInfo classes in Spring Cloud Connectors carry just enough information to create the connection beans necessary for an app to consume the service resources. Connectors was designed to be platform-neutral, and fields like plan, label, and tags that are available on Cloud Foundry are not captured because they might not be available on other platforms (e.g. Heroku).
To add the plan information to a ServiceInfo, you'd need to write your own ServiceInfo class that includes a field for the value, then write a CloudFoundryServiceInfoCreator to populate the value from the VCAP_SERVICES data that the framework provides as a Map. See the project documentation for more information on creating such an extension.
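A rough outline of such an extension, with hypothetical class names; the protected helper methods shown here follow the Connectors source but may differ in your version, and the creator still has to be registered through a META-INF/services entry as the extension docs describe:

import java.util.Map;

import org.springframework.cloud.cloudfoundry.CloudFoundryServiceInfoCreator;
import org.springframework.cloud.cloudfoundry.Tags;
import org.springframework.cloud.service.common.PostgresqlServiceInfo;

// Hypothetical ServiceInfo subclass that also carries the plan.
class PostgresqlWithPlanServiceInfo extends PostgresqlServiceInfo {
    private final String plan;

    public PostgresqlWithPlanServiceInfo(String id, String url, String plan) {
        super(id, url);
        this.plan = plan;
    }

    public String getPlan() {
        return plan;
    }
}

public class PostgresqlWithPlanServiceInfoCreator
        extends CloudFoundryServiceInfoCreator<PostgresqlWithPlanServiceInfo> {

    public PostgresqlWithPlanServiceInfoCreator() {
        super(new Tags("postgresql"), "postgres");
    }

    @Override
    public PostgresqlWithPlanServiceInfo createServiceInfo(Map<String, Object> serviceData) {
        String id = getId(serviceData);
        String uri = getUriFromCredentials(getCredentials(serviceData));
        String plan = (String) serviceData.get("plan"); // field from the VCAP_SERVICES entry
        return new PostgresqlWithPlanServiceInfo(id, uri, plan);
    }
}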
Another (likely easier) option is to use the newer java-cfenv project instead of Spring Cloud Connectors. java-cfenv supports Cloud Foundry only, and gives access to the full set of information in VCAP_SERVICES. See the project documentation for an example of how you can use this library.
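With java-cfenv the plan comes for free on the parsed service record; a minimal sketch, assuming a single bound service whose label is postgres:

import io.pivotal.cfenv.core.CfEnv;
import io.pivotal.cfenv.core.CfService;

public class PlanLookup {
    public static void main(String[] args) {
        // Parses VCAP_SERVICES from the environment on Cloud Foundry.
        CfEnv cfEnv = new CfEnv();
        CfService service = cfEnv.findServiceByLabel("postgres"); // illustrative label
        System.out.println("plan: " + service.getPlan());
        System.out.println("uri:  " + service.getCredentials().getUri());
    }
}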

How to monitor streaming apps inside SCDF?

I am a novice with Spring Cloud Data Flow and Spring Cloud Stream applications.
Currently my project diagram looks like the following:
I route a POST request from an outside client through the Zuul API gateway to a microservice called Composite. Composite creates a stream using a REST POST and deploys it onto the Spring Cloud Data Flow server. As far as I know, the microservices mongodb and file run as co-existing JVM processes. If my client has to know the status of the stream and the status of the processed data, how should the Composite microservice interact with the Spring Cloud Data Flow server? Currently, when I make the POST call to deploy the stream, I don't even get the status from the SCDF server. Does SCDF expose any hooks to look at the individual apps? Also, how can I change the flow at runtime to create a dynamic mesh?
Currently I am using the local Spring Cloud Data Flow server for development.
Runtime platform is local
The local runtime is recommended only for development purposes. If you're preparing for production, please make sure to choose a platform variant (e.g., cf, k8s, yarn, ...) that comes with the non-functional requirements to support reliable and durable execution of all the applications running in the streaming pipeline.
As far as I know the microservices mongodb and file run as co-existing JVM processes.
If your stream definition is file | mongodb, you'd have 2 different JVMs even when using the local runtime. They're independent Boot applications.
How should the Composite microservice interact with the Spring Cloud Data Flow server?
It is not clear what you mean by "composite" here. All the microservice applications in SCDF communicate via messaging middleware such as Kafka or Rabbit. SCDF provides the orchestration capability to run such applications on various runtime platforms.
Currently, when I make the POST call to deploy the stream, I don't even get the status from the SCDF server
You can use SCDF's REST APIs to query the current status of the apps, and this is platform agnostic. You can view the list of supported APIs by hitting the root URL of the server (there's a gap in the docs here; we will fix it). APIs such as the stream definitions endpoint could be useful for status checks.
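A minimal sketch of such a status check from your Composite service, assuming a local SCDF server on port 9393 and a deployed stream named mystream (both illustrative):

import org.springframework.web.client.RestTemplate;

public class StreamStatusCheck {
    public static void main(String[] args) {
        RestTemplate rest = new RestTemplate();
        // GET /streams/definitions/{name} returns the stream definition along
        // with its deployment status ("deployed", "undeployed", "failed", ...).
        String json = rest.getForObject(
                "http://localhost:9393/streams/definitions/{name}",
                String.class, "mystream");
        System.out.println(json);
    }
}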
Does SCDF expose any hooks to look at the individual apps?
Once the apps are deployed in a runtime platform, you can take advantage of Boot's actuator endpoints to explore more details such as trace, metrics, health, and env, among others, at each application level. See Boot's actuator endpoints for more details. For instance, if your mongodb app is running locally on port 23000, then you can check granular metrics for this application at: http://localhost:23000/metrics
[As an FYI: future SCDF releases would include integrating Spring Boot + Spring Cloud Sleuth metrics and visual representation of the same.]
Also, how can I change the flow at runtime to create a dynamic mesh?
If you're referring to editing a running streaming pipeline with additions/deletions, we are currently exploring a design approach to support this functionality.
