NiFi - Update Remote Process Group through REST API

We are using templates to package up some data transfer jobs between two NiFi clusters, one acting as a sender, the other as the receiver. One of our jobs contains a remote process group, and all worked fine at the point the template was created.
However, when we deploy the template through our environments (dev, test, pre, prod), it is tedious and annoying to have to manually delete and recreate the remote process group in the user interface. I'd like to automate this to simplify deploying templates and reduce the manual intervention.
Is it possible to update a remote process group and its port configuration through the REST API?
Do I just use the REST API to create a new RPG with the correct configuration?
Does anyone have any experience with this?

There is a JIRA to address this issue [1], which will be worked on in conjunction with some of the ongoing Flow Registry (SDLC for flows) efforts. Until then, the best option is your second suggestion: use the REST API to create a new RPG with the correct configuration.
[1] https://issues.apache.org/jira/browse/NIFI-4526
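Until that JIRA lands, the delete-and-recreate step can at least be scripted. Below is a minimal sketch (Python with requests) of the create call; the host names are placeholders, and the RemoteProcessGroupEntity fields shown here can vary slightly between NiFi versions, so check them against your cluster's /nifi-api docs.

```python
import requests

NIFI_API = "https://nifi-host:8443/nifi-api"  # placeholder: the target cluster's API
PARENT_PG_ID = "root"  # "root" aliases the root process group's id

# Entity for the new RPG; field names follow the RemoteProcessGroupEntity
# schema, which can differ between NiFi versions.
body = {
    "revision": {"version": 0},  # newly created components start at revision 0
    "component": {
        "targetUris": "https://receiver-host:8443/nifi",  # placeholder receiver URL
        "transportProtocol": "HTTP",
        "communicationsTimeout": "30 sec",
        "yieldDuration": "10 sec",
    },
}
resp = requests.post(
    f"{NIFI_API}/process-groups/{PARENT_PG_ID}/remote-process-groups", json=body)
resp.raise_for_status()
print("Created RPG", resp.json()["id"])
```

Deleting the old RPG first works the same way: GET it to learn its id and current revision, then issue a DELETE with that revision as a query parameter.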

Related

How to update flows from dev to prod with state

I have a NiFi flow that keeps some state with the ListS3 processor.
I have a dev instance and a prod instance.
I want some option for deploying from dev to prod where the state is kept and where I don't manually have to go in and change all the processors and process groups.
It seems like this can't be done with templates, based on the following Stack Overflow question:
how does NIFI listfile maintains its timestamp?
edit:
Just so there is no misunderstanding I want to keep prod state when deploying.
It sounds like you aren't using NiFi registry, so you're downloading a flow template and then importing it. This can't preserve state, as it's not the same flow.
You should be using NiFi Registry to version control your flows, which supports this Dev -> Prod workflow.
Build your flow in Dev NiFi, version to Registry.
In prod, add a new Process Group and select the Import option when it asks you for a name. You'll be able to pick your versioned flow.
Run your flow so that it stores some state. View the processor's state to verify.
Now update the flow in Dev, and commit the local change to Registry.
Then, update the flow in Prod to the latest version from Registry. It will preserve state on the stateful processor.
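If you want to sanity-check the Registry side from a script between those steps, the Registry's REST API can list buckets, flows, and versions so you can confirm the Dev commit is visible before updating Prod. A minimal sketch, assuming an unsecured Registry on its default port (the host name is a placeholder):

```python
import requests

REGISTRY = "http://registry-host:18080/nifi-registry-api"  # placeholder Registry URL

# Walk buckets -> flows -> versions and print the latest version of each flow,
# to confirm the commit from Dev is visible before updating the Prod group.
for bucket in requests.get(f"{REGISTRY}/buckets").json():
    flows = requests.get(f"{REGISTRY}/buckets/{bucket['identifier']}/flows").json()
    for flow in flows:
        versions = requests.get(
            f"{REGISTRY}/buckets/{bucket['identifier']}"
            f"/flows/{flow['identifier']}/versions").json()
        if versions:
            latest = max(v["version"] for v in versions)
            print(f"{bucket['name']} / {flow['name']}: latest version {latest}")
```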
For detailed steps on installing & using Registry, see these links:
https://nifi.apache.org/docs/nifi-registry-docs/html/getting-started.html
https://pierrevillard.com/2018/04/09/automate-workflow-deployment-in-apache-nifi-with-the-nifi-registry/
https://alasdairb.com/2021/03/22/nifi-in-production-nifi-registry/
https://docs.cloudera.com/HDPDocuments/HDF3/HDF-3.2.0/versioning-a-dataflow/content/connecting-to-a-nifi-registry.html
https://docs.cloudera.com/HDPDocuments/HDF3/HDF-3.4.0/getting-started-with-nifi-registry/content/import-a-versioned-flow.html
https://docs.cloudera.com/HDPDocuments/HDF3/HDF-3.4.0/getting-started-with-nifi-registry/content/save-changes-to-a-versioned-flow.html
https://docs.cloudera.com/HDPDocuments/HDF3/HDF-3.4.0/getting-started-with-nifi-registry/content/start-version-control-on-a-process-group.html

Node (maven) to deploy the application to several environments

On Jelastic, I created a node for building an application with Maven. There are several identical environments (NGINX + Spring Boot); they differ only in the database they bind to and the configured SSL.
The task is to deploy the built application (*.jar) to these several environments at the same time. How can this be implemented?
When editing a project, it is possible to specify only one environment; multi-selection is not provided.
We suggest creating a few environments from one repository branch, and running updates via the API (https://docs.jelastic.com/api/#!/api/environment.Vcs-method-Update) after pushing the code to VCS.
It's possible to use CloudScripting technology to attach custom logic to the onAfterBuildProject event and deploy the project to additional environments after the build is complete. Please check this JPS as an example of the code syntax. Most likely you will need to use the DeployProject API method.
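For the first suggestion, the per-environment update can be scripted as a simple loop. A rough sketch: the endpoint path and parameter names are inferred from the linked environment.Vcs.Update docs and should be verified against your platform version; the API URL, session token, and environment names are placeholders.

```python
import requests

# Placeholders: your hoster's API endpoint, a session token obtained
# beforehand, and the environments built from one repository branch.
PLATFORM_API = "https://app.my-hoster.com/1.0"
SESSION = "<session-token>"
ENVIRONMENTS = ["myapp-dev", "myapp-test", "myapp-prod"]

for env in ENVIRONMENTS:
    # environment.Vcs.Update pulls the latest code for an environment's project.
    resp = requests.get(f"{PLATFORM_API}/environment/vcs/rest/update",
                        params={"session": SESSION, "envName": env})
    resp.raise_for_status()
    print(env, resp.json().get("result"))  # result 0 indicates success
```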

Multitenancy in Apache NiFi

I am working on a cloud-based application using Apache NiFi, and for this we need to support multitenancy. But the current NiFi implementation only supports role-based access for users, for a single flow.
I understand that the flow state is saved as a single compressed XML file for a NiFi instance, so whoever logs into that instance sees the same flow. Our requirement is to create unique flows for each user login. I tried to replicate the state-saving gzipped XML file for each user, but couldn't succeed, as the FlowService/FlowController that loads the XML file is instantiated at application startup and is a singleton. Please correct me if I am wrong with this approach. Or is there any other solution for adding multitenant support to NiFi? I also wonder about the reason behind NiFi being a single-user application.
Multi-tenant support will be introduced in Apache NiFi 1.0.0. There is a BETA release available [1]. This will support assigning permissions on a per-component basis. However, the different tenants still share a canvas. There have been discussions of introducing a workspace concept that could provide visually separate dataflows.
[1] https://nifi.apache.org/download.html

Tagging EC2 machines in Pipeline's EMR Cluster (ImportCluster in the S3->DynamoDB example)

I'm trying to run the S3->DynamoDB example and having some problems running the EMR cluster that is created for the MyImportJob activity.
We configured our IAM accounts such that every user can create EC2 machines with a specific 'team_id' tag (of his team). That helps us control the resources, prevent mistakes and monitor usage.
When Pipeline tries to launch the EMR cluster, it (probably) does so without the tags and therefore fails with "Terminated with errors: User account is not authorized to call EC2". I tried to find a setting on the EMRCluster resource but couldn't find anything that would let me set the tags. I'm pretty sure it fails because of the tags policy.
Any idea how I can overcome this?
Does it help if I create a CloudFormation template for that? Do I have more control there? (I'm going to create the pipeline as part of the application template anyway; I just wanted to try the product out first.)
Thanks!
I could not find a solution for how to add tags to EMR (and how to make them visible to all users), so I created a Python script to run as a bootstrap action. If it's still relevant, you can find it here.
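For reference, a bootstrap action along those lines can be quite short. A minimal sketch (not the author's original script), assuming boto3 is available on the nodes and the cluster's instance profile is allowed to call ec2:CreateTags; the tag value is a placeholder. Bootstrap actions run on every node, so each instance tags itself:

```python
#!/usr/bin/env python3
# Bootstrap action: each EMR node tags its own EC2 instance with team_id.
import urllib.request

import boto3  # assumed available on the cluster nodes

METADATA = "http://169.254.169.254/latest/meta-data"

def metadata(path):
    # An EC2 instance can discover its own identity from instance metadata.
    return urllib.request.urlopen(f"{METADATA}/{path}", timeout=2).read().decode()

instance_id = metadata("instance-id")
region = metadata("placement/availability-zone")[:-1]  # strip the zone letter

ec2 = boto3.client("ec2", region_name=region)
ec2.create_tags(Resources=[instance_id],
                Tags=[{"Key": "team_id", "Value": "my-team"}])  # placeholder value
```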

Making gitolite trigger teamcity builds

Rather than having TeamCity log onto the gitolite server several tens of thousands of times each day, and also sitting around waiting for the poll to happen (or starting it manually), it would be nice if it were possible to set up gitolite hooks that inform TeamCity that the repository has changed.
Is such a configuration possible with TeamCity and gitolite?
I know Jenkins has a GitHub plugin that works nicely - I use that setup for some Minecraft CI I am running privately.
One way would be for gitolite (through a VREF hook) to call TeamCity through its REST API, in order to launch a build via a web request.
You just need to make a web request to the following URL:
http://YOURSERVER/httpAuth/action.html?add2Queue=btId
where btId is the build type ID, the unique identifier for each build configuration.
To get it, you can look in the browser address bar when clicking on a build configuration, or use the TeamCity REST API for details.
The OP Morten Nilsen didn't need a VREF:
add a file "post-receive" to .gitolite/hooks/common and
run gitolite setup --hooks-only
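For the post-receive variant, any executable works as a git hook, Python included. A minimal sketch that queues the build via the URL above; the server URL, credentials, and build type ID are placeholders:

```python
#!/usr/bin/env python3
# .gitolite/hooks/common/post-receive - queue a TeamCity build on every push.
import urllib.request

TEAMCITY = "http://teamcity.example.com"  # placeholder server URL
BT_ID = "bt123"                           # placeholder build configuration id
USER, PASSWORD = "ci-trigger", "secret"   # account with "run build" permission

# httpAuth endpoints use HTTP Basic authentication.
mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
mgr.add_password(None, TEAMCITY, USER, PASSWORD)
opener = urllib.request.build_opener(urllib.request.HTTPBasicAuthHandler(mgr))
opener.open(f"{TEAMCITY}/httpAuth/action.html?add2Queue={BT_ID}")
```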
