How to reload external configuration data in NiFi processor - apache-nifi

I have a custom NiFi processor that uses external data for some user-controlled configuration. I want to know how to signal the processor to reload the data when it is changed.
I was thinking that a FlowFile could be sent to signal the processor, but I am concerned that in a clustered environment only one node's processor would get the notification and all the others would still be running on the old configuration.

The most common ways to watch a file for changes are the JDK WatchService or Apache Commons IO Monitor...
https://www.baeldung.com/java-watchservice-vs-apache-commons-io-monitor-library
https://www.baeldung.com/java-nio2-watchservice
Your processor could use one of these and reload the data when the file changes. Just make sure to synchronize access to the relevant fields in your processor between the code that reloads them and the code that reads them during execution.
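For example, a rough sketch along these lines could back a custom processor (class and field names here are hypothetical, and it assumes the configuration is a local properties file that exists on every node, so each node in a cluster reloads its own copy independently and the clustering concern above goes away):

```java
// A minimal sketch, not a drop-in processor: it assumes the user-controlled
// configuration lives in a local properties file present on every node.
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.ClosedWatchServiceException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.Properties;
import java.util.concurrent.atomic.AtomicReference;

public class ReloadableConfig implements AutoCloseable {

    private final Path configFile;
    private final WatchService watchService;
    // Readers in onTrigger() always see a complete snapshot via the atomic reference.
    private final AtomicReference<Properties> current = new AtomicReference<>(new Properties());

    public ReloadableConfig(Path configFile) throws IOException {
        this.configFile = configFile;
        this.watchService = configFile.getFileSystem().newWatchService();
        // WatchService watches directories, so register the parent and filter events.
        configFile.getParent().register(watchService, StandardWatchEventKinds.ENTRY_MODIFY);
        load();
        final Thread watcher = new Thread(this::watchLoop, "config-watcher");
        watcher.setDaemon(true);
        watcher.start();
    }

    private void watchLoop() {
        try {
            while (true) {
                final WatchKey key = watchService.take();
                for (WatchEvent<?> event : key.pollEvents()) {
                    if (configFile.getFileName().equals(event.context())) {
                        load(); // our file changed, reload it
                    }
                }
                if (!key.reset()) {
                    break;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (ClosedWatchServiceException e) {
            // close() was called, e.g. when the processor is stopped
        }
    }

    private void load() {
        final Properties props = new Properties();
        try (InputStream in = Files.newInputStream(configFile)) {
            props.load(in);
            current.set(props); // atomic swap
        } catch (IOException e) {
            // keep the previous configuration if the reload fails
        }
    }

    // Called from onTrigger(); no explicit locking needed.
    public Properties get() {
        return current.get();
    }

    @Override
    public void close() throws IOException {
        watchService.close();
    }
}
```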

Related

How to send updated data from a Java program to NiFi?

I have microservices running, and when a web user updates data in the DB through a microservice endpoint, I want to send the updated data to NiFi as well. This data contains an updated list of names, deleted names, edited names, etc. How do I do this, and which processor should I use on the NiFi side?
I am new to NiFi and have not yet tried anything myself. I am reading documentation I find online to guide me.
No source code is written yet. I want to get started, and I will share it here once I write it.
The expected result is that NiFi receives the updated list of names and refers to that updated list when generating the required alerts/triggers, etc.
You can do this in lots of ways: MQ, Kafka, or HTTP (using ListenHTTP). Just deploy whichever is relevant to you and configure it; you can even listen to a directory (using ListFile & FetchFile).
You can connect NiFi to pretty much anything, so just choose how you want to connect your microservices to NiFi.
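For example, if you go the ListenHTTP route, a microservice could push the updated names as JSON with a plain HTTP POST. A minimal sketch (the host, port 8011, and base path "names" are placeholders; they must match the Listening Port and Base Path you configure on the ListenHTTP processor):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PushNamesToNifi {
    public static void main(String[] args) throws Exception {
        // Hypothetical host, port, and base path: these must match the
        // "Listening Port" and "Base Path" properties of your ListenHTTP processor.
        String url = "http://nifi-host:8011/names";
        String body = "{\"updated\":[\"alice\",\"bob\"],\"deleted\":[\"carol\"]}";

        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // A 2xx status means ListenHTTP accepted the payload and created a FlowFile.
        System.out.println("NiFi responded: " + response.statusCode());
    }
}
```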

Store external data into NiFi Registry

Is it possible to store external data (not a NiFi flow) in NiFi Registry using its REST API?
https://nifi.apache.org/docs/nifi-registry-docs/index.html
As far as I know, NiFi Registry is designed for versioning NiFi flows, but I want to know whether it is capable of storing other data and retrieving it based on versions.
As of today, it is not possible to store data/objects in NiFi Registry other than a NiFi flow and its configuration (component properties, default variable values, controller services, etc.).
There have been discussions about extending NiFi Registry's storage capabilities to include other items. Often discussed are NiFi extensions, such as NAR bundles, which are the archive format for components such as custom processors. This would allow custom components to be versioned in the same place as a flow and downloaded at runtime based on a flow definition rather than pre-installed on NiFi/MiNiFi instances.
Today, though, only flows are supported. Other data or components have to be stored/versioned somewhere else.
If you have data you want to associate with a specific flow version snapshot, here is a suggestion: store that data externally in another service and use the flow version snapshot's comment field to hold a URI/link to where the associated data resides. If you use a machine-parsable format such as JSON for this URI metadata, an automated process can retrieve the data from the external system by reading the comment field whenever it performs an operation involving that flow snapshot version.
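For example, a snapshot comment could carry a small JSON payload like the one below, which an automated process parses to find the external data. The "dataUri" key and the URL are illustrative only, not a NiFi Registry convention; this sketch uses Jackson:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class SnapshotCommentExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical comment stored on a flow version snapshot.
        String snapshotComment =
                "{\"description\":\"prod release\",\"dataUri\":\"https://artifacts.example.com/flows/42/config.zip\"}";

        JsonNode node = new ObjectMapper().readTree(snapshotComment);
        String dataUri = node.get("dataUri").asText();

        // An automated process could now fetch the associated data from this URI.
        System.out.println("Associated data lives at: " + dataUri);
    }
}
```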

Using Apache NiFi in a Docker container, for a beginner

Very basically, I want to be able to spin up a container that runs NiFi with a template I already have. I'm very new to containers and fairly new to NiFi. I think I know how to spin up a NiFi container, but not how to make it automatically run my template every time.
You can use the apache/nifi Docker image as a starting point and use a Docker RUN/COPY command to inject your desired flow. There are three ways to load an existing flow into a NiFi instance:
Option 1: Export the flow as a template (an XML file containing the exported flow segment) and import it as a template into your running NiFi instance. This requires the "destination" NiFi instance to be running and uses the NiFi API.
Option 2: Create the flow you want, manually extract the entire flow from the "source" NiFi instance by copying $NIFI_HOME/conf/flow.xml.gz, and overwrite the flow.xml.gz file in the "destination" NiFi's conf directory. This does not require the destination NiFi instance to be running, but it must happen before the destination NiFi starts.
Option 3: Use the NiFi Registry to version control the original flow segment from the source NiFi and make it available to the destination NiFi. This seems like overkill for your scenario.
I would recommend Option 2, since you already have the flow set up the way you want it. Simply use COPY /src/flow.xml.gz /destination/flow.xml.gz in your Dockerfile.
If you literally want it to "run my template every time", you probably want to ensure that the processors are all running (showing the "Play" icon) when you copy/save off the flow.xml.gz file, and that nifi.flowcontroller.autoResumeState=true is set in your nifi.properties.

NiFi PutSFTP 1.2.0 - Stop Option Not Always Displayed

In NiFi 1.2.0, using a two-node cluster, I have a simple flow with two processors:
GenerateFlowFile 1.2.0 - Generates data files
PutSFTP 1.2.0 - Puts the files via SFTP
Often after I've started both processors and let them run for a short while, I can stop the GenerateFlowFile processor, but I'm not able to stop (or start, for that matter) the PutSFTP processor. The Start and Stop items don't display in the context menu, and I can only view and not edit the processor's configuration. The PutSFTP processor's status icon indicates that it is stopped.
I'm not convinced that the behavior I'm seeing is specific to PutSFTP processors.
Why might this processor be "unstoppable"?
This isn't a direct answer to the question, but I just noticed that when I refresh my browser, the PutSFTP processor is startable again. The problem seems to lie with the web UI failing to update the processor's context menu for some reason.
I'm using Chrome 62.0.3202.94 (64-bit).

How does one setup a Distributed Map Cache for NiFi?

I'm brand new to NiFi and simply playing around with processors.
I'm trying to incorporate Wait and Notify processors in my testing, but I have to setup a Distributed Map Cache (server and client?).
The NiFi documentation assumes a level of understanding that I do not have.
I've installed memcached on my computer (macOS) and verified that it's running on Port 11211 (default). I've created a DistributedMapCacheClientService and DistributedMapCacheServer under NiFi's CONTROLLER SERVICES, but I'm getting java.net.SocketTimeoutException & other errors.
Is there a good tutorial on this entire topic? Can someone suggest how to move forward?
The DistributedMapCacheClientService and DistributedMapCacheServer do not require additional software; the server is built into NiFi.
To create these services, right-click on the canvas, select Configure and then select the Controller Services tab. You can then add new services by clicking the + button on the right and searching by name.
Create a DistributedMapCacheServer with the default parameters (port 4557) and enable it; this starts the built-in cache server.
Create a DistributedMapCacheClientService with the hostname localhost and the other default parameters, and enable it.
Create a simple flow: add a GenerateFlowFile processor, set its run schedule, and give it a non-zero file size in its properties.
Connect it to a PutDistributedMapCache processor, set the Cache Entry Identifier to Key01, and choose your DistributedMapCacheClientService.
Try to run it; if port 4557 is not used by other software, the cache put should work.
@Darshan
Yes, it will work, because the documentation of DistributedMapCacheClientService says that it:
Provides the ability to communicate with a DistributedMapCacheServer. This can be used in order to share a Map between nodes in a NiFi cluster
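For completeness, if you later want to read or write that shared map from a custom processor rather than from PutDistributedMapCache/FetchDistributedMapCache, a rough sketch of using the client service API might look like the following. It assumes the nifi-distributed-cache-client-service-api dependency is on the classpath and that the client was obtained from a processor property; the helper class itself is hypothetical:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import org.apache.nifi.distributed.cache.client.Deserializer;
import org.apache.nifi.distributed.cache.client.DistributedMapCacheClient;
import org.apache.nifi.distributed.cache.client.Serializer;

// Hypothetical helper: assumes the DistributedMapCacheClient was obtained from a
// processor property, e.g.
// context.getProperty(CACHE_CLIENT).asControllerService(DistributedMapCacheClient.class)
public class SharedMapHelper {

    private static final Serializer<String> STRING_SERIALIZER =
            (value, out) -> out.write(value.getBytes(StandardCharsets.UTF_8));
    private static final Deserializer<String> STRING_DESERIALIZER =
            bytes -> (bytes == null || bytes.length == 0)
                    ? null : new String(bytes, StandardCharsets.UTF_8);

    private final DistributedMapCacheClient cache;

    public SharedMapHelper(DistributedMapCacheClient cache) {
        this.cache = cache;
    }

    public void store(String key, String value) throws IOException {
        // Visible to every node that points at the same DistributedMapCacheServer.
        cache.put(key, value, STRING_SERIALIZER, STRING_SERIALIZER);
    }

    public String fetch(String key) throws IOException {
        return cache.get(key, STRING_SERIALIZER, STRING_DESERIALIZER);
    }
}
```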
