NiFi: Production usage without web UI - apache-nifi

Here are some commonly suggested approaches for using NiFi without the web UI, along with their respective limitations. Is there a better way to use NiFi in production without the web UI while still being able to make changes to the data flow design dynamically?
REST API approach: The REST API can only be used if you already know the IDs of the components; it does not work with component names.
MiNiFi approach: MiNiFi is more focused on collecting data at the source. In addition, the MiNiFi configuration is also tied to prior knowledge of component IDs rather than names.

A typical NiFi dataflow goes through the following environment lifecycle.
You build your flow in a development NiFi setup. You run it, test it, debug it, fix it.
Once you are sure that the flow runs as expected, promote it to the QA setup and perform similar actions.
Finally, when your flow passes QA, promote it to the production setup. Set stringent policies so that no one except the support team or the admin has access to make changes to the flow(s).
In other words, you don't have to rely on the REST API (even the UI changes are made through internal REST API calls) or disable the web UI if you follow a proper dev-qa-prod promotion.
On a side note, you can leverage NiFi Registry to do the dev-qa-prod lifecycle.
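On the ID-versus-name limitation raised in the question: a script can usually resolve names to IDs itself by reading the flow first. The following is a minimal sketch, assuming the JSON shape returned by GET /nifi-api/flow/process-groups/{id}. The sample payload is trimmed and hypothetical, and the helper names `index_processors_by_name` and `resolve_unique` are made up for illustration. No request is sent here; fetching the JSON is left to whatever HTTP client you use.

```python
from typing import Dict, List

# Trimmed, hypothetical sample shaped like the response of
# GET /nifi-api/flow/process-groups/{id}; real responses carry many more fields.
SAMPLE_FLOW = {
    "processGroupFlow": {
        "id": "root-group-id",
        "flow": {
            "processors": [
                {"id": "1111-aaaa", "component": {"name": "GetFile"}},
                {"id": "2222-bbbb", "component": {"name": "PutHDFS"}},
                {"id": "3333-cccc", "component": {"name": "GetFile"}},
            ]
        },
    }
}


def index_processors_by_name(flow_response: dict) -> Dict[str, List[str]]:
    """Map each processor name to the IDs that carry it.

    Names are not unique in NiFi, so every name maps to a list of IDs.
    """
    index: Dict[str, List[str]] = {}
    flow = flow_response["processGroupFlow"]["flow"]
    for entity in flow.get("processors", []):
        name = entity["component"]["name"]
        index.setdefault(name, []).append(entity["id"])
    return index


def resolve_unique(index: Dict[str, List[str]], name: str) -> str:
    """Return the single ID for a name, or fail loudly if it is ambiguous."""
    ids = index.get(name, [])
    if len(ids) != 1:
        raise LookupError(
            f"expected exactly one processor named {name!r}, found {len(ids)}"
        )
    return ids[0]
```

Because names can repeat, a naming convention that keeps them unique within a process group makes this kind of lookup dependable.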

Related

Policy changes to specific processor

Good afternoon. I'm able to change the global policies for NiFi through REST API, however, I'm trying to edit the access policies for an ARBITRARY processor. I have no idea how to do so. Everything in the NiFi REST API website calls everything else a component (or maybe I'm misinterpreting...)
Anyway, I appreciate all the help/guidance!
The NiFi UI uses the API behind the scenes to perform every action. You can set policies on process groups, remote process groups, processors, funnels, input & output ports, queues, controller services, and reporting tasks. Collectively, these resources are called "components".
If a policy is not set on a specific component, it inherits the policies set on the parent object (i.e. the process group containing it). You can override these policies directly at a granular level.
To set the policy for a specific component, use the POST /policies API. The easiest way to observe the explicit API invocation necessary is to use your browser's developer tools to record the calls made by the UI client while you manually perform the action and then use those API calls.
There are also other tools which make this process easier, such as the official NiFi CLI Toolkit and the (unofficial but very good) NiPyAPI.
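As a rough illustration of the request body for POST /policies, here is a minimal sketch. The function name `build_policy_payload` is hypothetical, and the field names (`action`, `resource`, `users`, `userGroups`) are an assumption based on the access policy entity the UI sends, so compare the result with what your browser's developer tools record before relying on it. The sketch only builds the body and does not send anything.

```python
from typing import Iterable


def build_policy_payload(
    component_kind: str,
    component_id: str,
    action: str,
    user_ids: Iterable[str],
) -> dict:
    """Build a JSON-serialisable body for POST /nifi-api/policies.

    component_kind is the plural resource segment, e.g. "processors" or
    "process-groups" (assumed resource naming; verify against your instance).
    """
    if action not in ("read", "write"):
        raise ValueError("action must be 'read' or 'write'")
    return {
        # A brand-new policy starts at revision version 0.
        "revision": {"version": 0},
        "component": {
            "action": action,
            "resource": f"/{component_kind}/{component_id}",
            # Each user is referenced by tenant ID; the UI may send more fields.
            "users": [{"id": uid} for uid in user_ids],
            "userGroups": [],
        },
    }
```

For example, `build_policy_payload("processors", "<processor-uuid>", "write", ["<user-uuid>"])` produces a body whose `resource` is `/processors/<processor-uuid>`, which overrides whatever the processor would otherwise inherit from its parent process group.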

Why rate limiting logic should be placed in application code rather than the web server

I am exploring how to add rate limiting to REST APIs developed with Spring Boot.
After going through many articles, I came to understand that the best place for rate limiting is in application code rather than on the web server.
My question is: how do you decide which functionality should go where? Since rate limiting only monitors incoming calls and has nothing to do with business logic, the ideal place would seem to be the web server.
Technically the web server could do the job, but in practice a web server does not necessarily have all the information needed, it is not specialized for API consumption, and it may make this feature much harder to test.
Some practical reasons why the web server side could be a bad choice:
Developers do not necessarily have the HTTP web server configuration available locally.
You will want to write unit and integration tests to check that the rate limits are applied as specified. Creating a configuration for automated testing is much simpler within the scope of your Java application than with a configuration file defined on a web server.
Web servers reason in terms of HTTP requests and responses, not in terms of services.
Rate limits may be applied by IP address, but not only that: the username, the user's roles, and the type of service may also influence the limits. It is not clear that you could get all of these easily from an HTTP server.
For example, roles are stored on the server side or in a database.
A better option is to implement these mechanisms in dedicated, specialized classes or configuration files, which makes them easier to read, maintain, and test.
As you mention Spring Boot in your tags, that and that should interest you.
I recommend spring-cloud-gateway's rate limiter.
You could also separate this functionality from your business logic by using Filters.
https://www.baeldung.com/spring-boot-add-filter
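The testability and per-user arguments made above can be illustrated with a small sketch. It is written in Python for brevity rather than as Spring code, and it uses a token bucket, which is one common rate limiting technique and not something the answers prescribe. The class name `TokenBucketLimiter`, the role names, and the limits are all hypothetical. The clock is injected so that a unit test can control time instead of sleeping.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """Per-user token bucket whose capacity and refill rate depend on the role."""

    def __init__(
        self,
        limits_by_role: Dict[str, Tuple[float, float]],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # role -> (bucket capacity, tokens refilled per second)
        self._limits = limits_by_role
        self._clock = clock
        # user -> (tokens currently available, timestamp of last update)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, user: str, role: str) -> bool:
        """Consume one token for the user if available; return whether allowed."""
        capacity, rate = self._limits[role]
        now = self._clock()
        # A user seen for the first time starts with a full bucket.
        tokens, last = self._buckets.get(user, (capacity, now))
        # Refill for the elapsed time, never beyond the capacity.
        tokens = min(capacity, tokens + (now - last) * rate)
        if tokens >= 1.0:
            self._buckets[user] = (tokens - 1.0, now)
            return True
        self._buckets[user] = (tokens, now)
        return False
```

In a Spring Boot application the same decision would typically sit in a filter or interceptor, where the authenticated username and roles are already available, which is exactly the information a front-end web server tends to lack.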

NiFi: Modify flow without disruption or downtime using Java API

Is there a way to modify NiFi flow dynamically using Java API? The use case is to add a processor to an active data flow (data is flowing through it). The new processor should be added at the beginning of the flow without application disruption or downtime.
In case Java API is not available, please feel free to suggest alternatives. I have already looked at change-nifi-flow-using-rest-api-part-1. Thanks.
Any action you can perform from the UI can also be performed from the REST API; the UI is just making calls to the REST API behind the scenes.
I would suggest opening Chrome's Dev Tools and performing the action you are interested in and then seeing what requests were made to perform the action. You can then script these operations however you need.
In addition, if you are trying to deploy flows then you should be taking advantage of NiFi Registry which allows you to place a flow under version control. You can then make changes from your local instance or dev instance, and upgrade the flow in production in-place without stopping your whole NiFi instance.
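For the specific case of adding a processor in front of an existing one, the recorded calls usually come down to creating the processor and then creating a connection. The sketch below only builds the two request bodies. The helper names are hypothetical, and the field layout is an assumption based on the processor and connection entities of the REST API, so check it against what Dev Tools shows for your NiFi version.

```python
from typing import Iterable


def new_processor_payload(processor_type: str, name: str, x: float, y: float) -> dict:
    """Body for POST /nifi-api/process-groups/{groupId}/processors."""
    return {
        # Newly created components start at revision version 0.
        "revision": {"version": 0},
        "component": {
            "type": processor_type,  # fully qualified processor class name
            "name": name,
            "position": {"x": x, "y": y},  # canvas coordinates
        },
    }


def new_connection_payload(
    group_id: str,
    source_id: str,
    destination_id: str,
    relationships: Iterable[str],
) -> dict:
    """Body for POST /nifi-api/process-groups/{groupId}/connections."""

    def endpoint(component_id: str) -> dict:
        # Both ends are assumed to be processors in the same process group.
        return {"id": component_id, "groupId": group_id, "type": "PROCESSOR"}

    return {
        "revision": {"version": 0},
        "component": {
            "source": endpoint(source_id),
            "destination": endpoint(destination_id),
            "selectedRelationships": list(relationships),
        },
    }
```

The ID of the new processor is only known from the response to the first call, so the connection body has to be built afterwards. Upstream components that fed the old first processor would also need their connections re-pointed, which the UI does with an update to the existing connection.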

Apache Nifi - About .Nar File Changes and Nifi Restarting

I want suggestions for my application:
I have multitenancy in NiFi. Each process group has different tenants/users.
For any change made for one tenant/user, such as a change to a custom processor (which produces a new .nar file), we need to copy that .nar file into the lib folder and restart NiFi. As a result the whole NiFi server is restarted, and with it every tenant/user and process group.
So, please suggest how we can restart only one tenant/user or process group, or how a new .nar file can take effect without restarting NiFi.
NiFi does not currently have the kind of warm restart option that you describe, however a lot of the base functionality needed to support it is in the code base and the concept is on the community roadmap.
Some options that might help you today:
Consider segregating the tenants with a high rate of code change into separate development environments. You could possibly leverage the Docker builds to provide flexibility and easy automation. You could then promote the end-of-day versions of your Nars into the 'Production' cluster each night, hopefully without disturbing users.
Consider utilising the NiFi Site-to-Site capability to have linked NiFi environments instead of a single shared one. Processors that change regularly could be called out to and updated on their own schedule.
Consider why you are changing processor code so regularly. There may be a better approach than hard coding logic and parameters into the processors: the variable registry, various controller services, flow registry, etc. all provide a very rich feature set.

Is there a way to capture Nifi API calls that the UI makes?

Since the NiFi GUI is really making API calls under the hood, is there any way to capture those requests or logs? I've been using Chrome Dev Tools. Just wondering if there is a way to capture this within NiFi for governance purposes.
Chrome Dev tools is the best bet to get the actual API calls.
For auditing purposes there is something a little bit different... from the menu in the top-right there is "Flow Configuration History" which shows every change that has been made to the flow, and who made it (when in a secure instance).
The flow configuration history is also available through the ReportingTask API if you wanted to implement a custom reporting task to push these events somewhere.
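If a custom reporting task is more than you need, the same history can, as far as I know, also be read over REST with GET /nifi-api/flow/history (with `offset` and `count` query parameters). The sketch below summarises such a response per user. The sample payload is trimmed and hypothetical, the helper name `changes_per_user` is made up, and the field names (`history`, `actions`, `action`, `userIdentity`) are an assumption to verify against your instance.

```python
from collections import Counter
from typing import Dict

# Trimmed, hypothetical sample shaped like the response of
# GET /nifi-api/flow/history?offset=0&count=100
SAMPLE_HISTORY = {
    "history": {
        "total": 3,
        "actions": [
            {"id": 1, "action": {"userIdentity": "alice", "operation": "Add",
                                 "sourceName": "LogAttribute"}},
            {"id": 2, "action": {"userIdentity": "bob", "operation": "Configure",
                                 "sourceName": "GetFile"}},
            {"id": 3, "action": {"userIdentity": "alice", "operation": "Start",
                                 "sourceName": "LogAttribute"}},
        ],
    }
}


def changes_per_user(history_response: dict) -> Dict[str, int]:
    """Count recorded flow changes by the identity that made them."""
    counts: Counter = Counter()
    for entity in history_response["history"].get("actions", []):
        action = entity.get("action")
        if action is None:
            # Skip entries that carry no details for the caller.
            continue
        counts[action["userIdentity"]] += 1
    return dict(counts)
```

User identities are only meaningful on a secured instance, as noted above.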
