Unable to modify flow in Apache NiFi 1.14.0 in HTTP mode

I understand that the official documentation recommends using NiFi with HTTPS, but it nonetheless mentions running NiFi over HTTP, for example through the nifi.web.http.port property.
Also, I'd like to incrementally incorporate and evolve the NiFi instance into our current data infrastructure, starting with non-critical data pipelines. So the TLS layer is not necessary right now and could add friction during the deployment phase. I therefore decided to go the HTTP route.
After changing some settings, I am able to access NiFi's GUI at http://localhost:8080/nifi, but I found that I cannot make any change to the flow. Write operations, i.e. POST / PUT / DELETE requests, are rejected with HTTP 403.
NiFi doc says:
And by monitoring the API traffic between the GUI and NiFi instance, I can confirm that the PermissionsEntity has both canRead:true and canWrite:true.
I used a containerized NiFi instance.
Has anyone else encountered a similar problem?

The root canvas may have been set up for the default single user that NiFi 1.14 generates when it starts up without a security configuration.
The first thing to try is right-clicking on the canvas and granting yourself access, if you can.
The second option: try moving or removing flow.xml.gz, users.xml and authorizations.xml and then restarting NiFi. New files will be generated that may work better with anonymous access.
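A minimal sketch of the second option, assuming NiFi is stopped first and that all three files sit in a single conf directory (the directory you pass in is your own; nothing in the question states where it is). It moves the files into a backup folder instead of deleting them, so they can be put back if the regenerated files do not help. The function name and the backup folder naming are hypothetical.

```python
import shutil
import time
from pathlib import Path

# File names come from the answer above; where they live is an assumption.
STATE_FILES = ("flow.xml.gz", "users.xml", "authorizations.xml")


def back_up_state_files(conf_dir, backup_dir=None):
    """Move NiFi's flow and authorization files out of conf_dir.

    Returns the list of new locations for the files that were found and moved.
    Files that do not exist are skipped silently.
    """
    conf = Path(conf_dir)
    if backup_dir is None:
        # Hypothetical naming scheme: a timestamped folder next to the files.
        backup_dir = conf / f"backup-{time.strftime('%Y%m%d-%H%M%S')}"
    target = Path(backup_dir)
    target.mkdir(parents=True, exist_ok=True)

    moved = []
    for name in STATE_FILES:
        source = conf / name
        if source.exists():
            shutil.move(str(source), str(target / name))
            moved.append(target / name)
    return moved
```

For a containerized instance this might be called as back_up_state_files("/opt/nifi/nifi-current/conf"), where that path is an assumption about the image layout and should be checked against your own container before running anything.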
Either way, setting up security now will probably mean less friction down the road, not more. I strongly advise you to bite the bullet and get it set up securely.

Related

NiFi communication with NiFi Registry in restricted environments (http_proxy)

I set up NiFi and NiFi Registry on different servers, and they communicate fine over HTTPS with certificate-based authentication and authorization.
Now I face a problem with exactly the same setup for another NiFi instance that needs to communicate with the same NiFi Registry. The problem is that the new NiFi is in a restricted area, behind an http_proxy. I searched for many days for a solution and did not find anything about it in the documentation.
In NiFi, under Controller Settings / Registry Clients, is there any way to tell NiFi that the communication will go through the http_proxy rather than directly?
Nothing in the documentation talks about that. Maybe people handle it another way? Or is it simply not possible?
NiFi and NiFi Registry are both version 1.15.3.
I think I would probably need a clearer understanding of where the proxy is, but this page describes proxy configuration in front of NiFi and what fields that proxy would need to set to sit in front of NiFi: https://docs.cloudera.com/HDPDocuments/HDF3/HDF-3.5.2/nifi-configuration-best-practices/content/proxy_configuration.html
and for NiFi Registry:
https://nifi.apache.org/docs/nifi-registry-docs/html/administration-guide.html#proxy_configuration

Is it possible for NiFi to connect to itself with a Remote Process Group?

I'm working on a project that heavily uses Apache NiFi v1.10.0. I'm getting tired of clicking through hundreds of process groups to apply small fixes that are essentially the same.
I've recently discovered Remote Process Groups and I was wondering if there is a way to connect a NiFi instance to itself and implement DRY that way. I was thinking of implementing the repeating components inside the root component and accessing them remotely from inside other process groups. Is this possible?
Right now I'm getting only SSLHandshakeException / PKIX path building failed.
If there are other ways to implement DRY - please tell me.
@Alex, I feel your pain. In a previous role, they had a process group with hundreds of flows and would copy and paste the entire main group, turning it into thousands of flows, all of them copies with small modifications in random places.
Although I am an advocate of programming this way to get a POC operational, I am a huge advocate of evaluating how to make the flows dynamic from the highest level. The process I use is to go through DFDLC and flow versioning until, for example, I have one process group that can replace two, by cleaning up the design differences between them. We consider this part of optimizing the flow, since it also reduces the total number of active processors.
I highly recommend you do NOT use remote process groups within the same cluster. I also recommend you make common flows on the main canvas, and connect them with input/output ports when you need to move from a deeper process group back up to the main canvas. You will end up with a flow like this:
You can definitely do site-to-site to self; however, it will be less performant because you are taking local flow files and transferring them over a network connection to all the nodes in the cluster, even though some of them will go back to the same node they are on.
You could use NiFi Registry and create versioned flows that contain reusable functionality. Then you make the change once, commit it back to NiFi Registry, and update the other instances of that versioned flow.
Bryan and Steven both offered good solutions. I will address the PKIX path building error you are encountering -- this indicates that NiFi is attempting to make an HTTPS connection to another service (likely in this case itself), and doesn't know how to verify the presented public certificate. The solution is to reference an SSLContextService configured with a truststore that contains the certificate. The Apache NiFi walkthroughs provide step-by-step instructions for performing these tasks.

NiFi: Modify flow without disruption or downtime using Java API

Is there a way to modify NiFi flow dynamically using Java API? The use case is to add a processor to an active data flow (data is flowing through it). The new processor should be added at the beginning of the flow without application disruption or downtime.
If a Java API is not available, please feel free to suggest alternatives. I have already looked at change-nifi-flow-using-rest-api-part-1. Thanks.
Any action you can perform from the UI can also be performed through the REST API; the UI is just making calls to the REST API behind the scenes.
I would suggest opening Chrome's Dev Tools and performing the action you are interested in and then seeing what requests were made to perform the action. You can then script these operations however you need.
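As an illustration of scripting one such operation, here is a rough sketch that builds (without sending) the request for adding a processor to a process group. The endpoint path and body shape reflect my understanding of what the UI sends when a processor is dropped on the canvas; treat them as assumptions and confirm them against the requests you capture in Dev Tools. The function name, base URL, group ID and processor type in the usage note are placeholders.

```python
import json
import urllib.request


def build_create_processor_request(base_url, group_id, processor_type, name,
                                   x=0.0, y=0.0):
    """Build, but do not send, a POST that adds a processor to a process group."""
    payload = {
        # Assumption: a component that does not exist yet starts at revision 0.
        "revision": {"version": 0},
        "component": {
            "type": processor_type,          # fully qualified processor class
            "name": name,
            "position": {"x": x, "y": y},    # canvas coordinates
        },
    }
    url = f"{base_url.rstrip('/')}/nifi-api/process-groups/{group_id}/processors"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would actually send it, for example with base_url "http://localhost:8080", a group ID copied from the UI, and a type such as "org.apache.nifi.processors.standard.LogAttribute". A secured instance would also need credentials on the request, which this sketch leaves out. Connecting the new processor into the running flow is a separate call that can be captured the same way.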
In addition, if you are trying to deploy flows, you should be taking advantage of NiFi Registry, which allows you to place a flow under version control. You can then make changes from your local or dev instance and upgrade the flow in production in place, without stopping your whole NiFi instance.

NiFi: Production usage without web UI

Here are some commonly suggested approaches for using NiFi without the web UI, along with their respective limitations. Is there a better way to use NiFi in production without the web UI while still being able to make changes to the data flow design dynamically?
REST API approach: The REST API can be used only with prior knowledge of the IDs of the components; it does not work with the names of the components.
MiNiFi approach: MiNiFi is more focused on collecting data at the source. Additionally, the MiNiFi configuration is also tied to prior knowledge of component IDs rather than names.
A typical NiFi dataflow goes through the following environment lifecycle.
You build your flow in a development NiFi setup. You run it, test it, debug it, fix it.
Once you are sure that the flow runs as expected, promote it to the QA setup and perform similar actions.
Finally, when your flow passes QA, promote it to the production setup. Have stringent policies in place so that no one except the support team or the admin has access to make changes to the flow(s).
In other words, you don't have to rely on the REST API (even the UI changes are done through internal REST API calls) or disable the web UI if you follow the proper dev-qa-prod promotion.
On a side note, you can leverage NiFi Registry to do the dev-qa-prod lifecycle.
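On the ID-versus-name limitation raised in the question: if scripting is still needed, one workaround is to fetch a process group's flow and resolve names to IDs on the client side. The sketch below works on an already parsed JSON response from GET /nifi-api/flow/process-groups/{id}. The response shape it assumes (a processGroupFlow object whose flow holds lists such as processors and processGroups, each entry carrying an id and a component with a name) is my understanding of that endpoint and should be verified against your NiFi version. The function name is hypothetical.

```python
def find_component_ids(flow_entity, name):
    """Return (kind, id) pairs for components in one process group named `name`.

    Names are not guaranteed to be unique in NiFi, so a list is returned
    rather than a single ID. Only the given group is searched, not its children.
    """
    flow = flow_entity.get("processGroupFlow", {}).get("flow", {})
    matches = []
    for kind in ("processors", "processGroups", "inputPorts", "outputPorts"):
        for entity in flow.get(kind, []):
            # Assumption: the component details may be absent (for example
            # when the caller lacks read permission), so guard against that.
            component = entity.get("component") or {}
            if component.get("name") == name:
                matches.append((kind, entity.get("id")))
    return matches
```

Because several components can share a name, the caller still has to decide what to do when more than one match comes back; a naming convention enforced during the dev-qa-prod promotion would make that lookup unambiguous.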

Apache NiFi - About .nar File Changes and NiFi Restarting

I would like suggestions for my application:
I have multitenancy in NiFi. Each process group belongs to a different tenant/user.
For any change made by one tenant/user, such as a change to their custom processor (which produces a new .nar file), we need to copy that .nar file into the lib folder and restart NiFi. Because of this, the whole NiFi server is restarted, and with it every tenant/user and process group.
So please suggest how we can restart only one tenant/user or process group, or have the .nar file picked up without restarting NiFi.
NiFi does not currently have the kind of warm restart option that you describe; however, a lot of the base functionality needed to support it is in the code base and the concept is on the community roadmap.
Some options that might help you today:
Consider segregating the tenants with a high rate of code change into separate development environments. You could possibly leverage the Docker builds to provide flexibility and easy automation. You could then promote the end-of-day versions of your NARs into the 'Production' cluster each night, hopefully without disturbing users.
Consider utilising the NiFi Site-to-Site capability to have linked NiFi environments instead of a single shared one. Processors that change regularly could be called out to and updated on their own schedule.
Consider why you are changing processor code so regularly. There may be a better approach than hard-coding logic and parameters into the processors: the variable registry, various controller services, flow registry, etc. all provide a very rich feature set.
