Pass data (e.g. log message) from processor to Slack or LogMessage processor - apache-nifi

Suppose I want to have a processor that sends a Slack message, and I want to reuse it from many other processors. E.g. one might need to send "file received" while another might send "failed to unzip file", etc. I'd rather have a single PutSlack processor and set the Webhook Text property to #{logPrefix} -- ${message}. That way all of the other processors can use this single processor to post a message in the Slack channel.

Yes, a single PutSlack is enough. You can configure the PutSlack processor's properties to evaluate their values dynamically from the incoming FlowFile's attributes, since all of its properties support Expression Language.

Related

How to get NiFi processor error into Flowfile attribute?

I have a PutGCSObject processor for which I want to capture the error into a flow file attribute.
When the processor hits an error, it routes the flow file to failure with all the pre-existing attributes as-is.
I want the error message to be part of the same flow file as an attribute. How can I achieve that?
There is actually a way to get it.
Here is how I do it:
1: I route all ERROR connections to a main "monitoring process group"
2: Here is my "monitoring process group"
In UpdateAttribute I capture filename as initial_filename
Then in my next step I query the bulletins
I then parse the output as individual attributes.
After I have the parsed bulletin output, I use a RouteOnAttribute processor to drop all bulletins I don't need (some of them I have already used and notified on).
Once only my actual ERROR bulletin is left, I use ExecuteStreamCommand to run a Python script that uses the nipyapi module to get more info about the error, such as where it is in my flow, its hierarchy, a description of the processor that failed, and some processor stats. I also keep a metadata catalog for each processor and process group with their custodians and business use case.
This data is then posted to sumologic for logging and also I trigger a series of notifications (Slack + PagerDuty hook to create an incident lifecycle).
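The bulletin filtering step could be sketched in Python roughly as follows. This is only an illustration of the idea, not the script used above: the function name and the seen_ids set are hypothetical, and the JSON shape (bulletinBoard, then bulletins, then a nested bulletin object with level, id, sourceName and message) is an assumption about what the bulletin board REST endpoint returns, so check it against your NiFi version.

```python
from typing import Any


def error_bulletins(board: dict[str, Any], seen_ids: set[int]) -> list[dict[str, Any]]:
    """Return ERROR-level bulletins that have not been handled yet.

    `board` is assumed to look like the body of a bulletin board response:
    {"bulletinBoard": {"bulletins": [{"bulletin": {...}}, ...]}}
    """
    entries = board.get("bulletinBoard", {}).get("bulletins", [])
    fresh = []
    for entry in entries:
        bulletin = entry.get("bulletin") or {}
        if bulletin.get("level") != "ERROR":
            continue  # drop INFO/WARNING bulletins
        if bulletin.get("id") in seen_ids:
            continue  # already notified on this one
        fresh.append(bulletin)
    return fresh


# Hypothetical sample data, for illustration only.
sample_board = {
    "bulletinBoard": {
        "bulletins": [
            {"bulletin": {"id": 1, "level": "WARNING", "sourceName": "FetchFile", "message": "slow"}},
            {"bulletin": {"id": 2, "level": "ERROR", "sourceName": "PutGCSObject", "message": "upload failed"}},
            {"bulletin": {"id": 3, "level": "ERROR", "sourceName": "PutGCSObject", "message": "upload failed again"}},
        ]
    }
}

# Bulletin 2 was already reported, so only bulletin 3 remains.
remaining = error_bulletins(sample_board, seen_ids={2})
```

Keeping the set of already-reported ids outside the function makes the same filter usable on every polling cycle.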
I hope this helps
There's no universal way to append error messages as flowfile attributes. Also, we tend to strongly avoid anything like that because of the potential to bubble up error messages with sensitive data to users who might not be authorized to see those details.

NiFi How to get the current processor Name and Processor group name through the custom processor using (Java)

I'm creating a custom NiFi processor in Java.
One of the requirements is to get the previous processor name and process group name (like a breadcrumb) from Java code.
The previous processor name and process group name are not immediately available to processors (nor are they meant to be). Can you explain more about your use case? You can perhaps use a SiteToSiteProvenanceReportingTask to send provenance information back to your own NiFi instance (to an Input Port, for example) and find the events that correspond to FlowFiles entering your custom processor. Those events should carry the source (previous) processor and the destination (your custom) processor.
If instead you code your custom processor using InvokeScriptedProcessor with Groovy for example, then you can "bend the rules" and get at the previous processor name and such, as Groovy allows access to private members and you can assume the implementation of the ProcessContext in onTrigger is an instance of StandardProcessContext, so you can get at its members which include upstream connections and thus the previous processor. For a particular FlowFile though, I'm not sure you can use this approach to know which upstream processor it came from.
Alternatively, you could add an UpdateAttribute after each "previous processor" to set attribute(s) with the information about that processor, but that has to be hardcoded and applied to every corresponding part of the flow.
I faced this some time back. I used the InvokeHTTP processor to call the nifi-api/process-groups/${process_group_id} web service.
This is how I implemented it:
Identify the process group where the error handling should be done. [Action Group]
Create a new process group [Error Handling Group] next to the Action Group and add relationship to transfer files to Error Handling Group.
Use the InvokeHTTP processor and set HTTP Method to GET
Set Remote URL to http://{nifi-instance}:{port}/nifi-api/process-groups/${action_group_process_group_id}
You will get a JSON response, which you will have to customize according to your needs
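Outside of NiFi, the same GET request and the trimming of its JSON response could be sketched in Python as below. The helper names are hypothetical, and the response fields used here (component.name and component.parentGroupId) are an assumption about the process group entity returned by the REST API. The fetch function also assumes an unsecured instance; a secured one would need authentication that is not shown.

```python
import json
from typing import Any
from urllib.request import urlopen


def process_group_url(base_url: str, group_id: str) -> str:
    """Build the REST URL for a process group, e.g. http://host:port + id."""
    return f"{base_url.rstrip('/')}/nifi-api/process-groups/{group_id}"


def summarize_process_group(entity: dict[str, Any]) -> dict[str, Any]:
    """Keep only the fields needed for a breadcrumb-style message."""
    component = entity.get("component", {})
    return {
        "id": entity.get("id"),
        "name": component.get("name"),
        "parent_group_id": component.get("parentGroupId"),
    }


def fetch_process_group(base_url: str, group_id: str) -> dict[str, Any]:
    """GET the process group and return the trimmed summary (unsecured NiFi assumed)."""
    with urlopen(process_group_url(base_url, group_id)) as response:
        return summarize_process_group(json.load(response))


# Hypothetical response body, for illustration only.
sample_entity = {
    "id": "abc-123",
    "component": {"id": "abc-123", "name": "Action Group", "parentGroupId": "root-1"},
}
summary = summarize_process_group(sample_entity)
```

Following parent_group_id upward with repeated calls would give the full hierarchy, at the cost of one request per level.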
Please let me know if you need the XML file that I am using; I can share it. It works fine for me.

Number of flow files in HandleHttpRequest processor

My HandleHttpRequest receives multiple files in a request. I need to process all of these files and only then send the response. I looked at its source to extend it, but there is no easy way, as most of the methods are private.
I would like to request a new attribute (something like flowfiles.count) to be added to the flow files so that a wait/sync mechanism can be implemented.
Alternatively, a method could be defined in HttpContextMap to get the number of flowfiles, which could be provided at registration time.
Is there any solution that I can use for now?
Thanks in advance
Starting from NiFi 1.8.0, this feature exists.
From the additional details of the HandleHttpRequest 1.8.0 processor:
To handle requests with Content-Type: multipart/form-data containing multiple parts, additional attention needs to be paid. Each part generates a FlowFile of its own. To each these FlowFiles, some special attributes are written:
http.context.identifier
http.multipart.fragments.sequence.number
http.multipart.fragments.total.number
These attributes could be used to implement a gating mechanism for HandleHttpResponse processor to wait for the processing of FlowFiles with sequence number http.multipart.fragments.sequence.number until up to http.multipart.fragments.total.number of flow files are processed, belonging to the same http.context.identifier, which is unique to the request.
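The gating logic described in that passage can be illustrated with a small in-memory Python sketch. It is not a NiFi configuration and not how the processor itself works internally: the MultipartGate class is hypothetical, and only the three attribute names come from the documentation quoted above.

```python
from collections import defaultdict


class MultipartGate:
    """Tracks processed fragments per HTTP request and reports completion."""

    def __init__(self) -> None:
        # context identifier -> set of sequence numbers already processed
        self._seen: dict[str, set[int]] = defaultdict(set)

    def record(self, attributes: dict[str, str]) -> bool:
        """Record one processed FlowFile; return True once all parts are done."""
        context_id = attributes["http.context.identifier"]
        sequence = int(attributes["http.multipart.fragments.sequence.number"])
        total = int(attributes["http.multipart.fragments.total.number"])

        self._seen[context_id].add(sequence)
        if len(self._seen[context_id]) >= total:
            del self._seen[context_id]  # request complete, free the state
            return True
        return False


def fragment(context_id: str, sequence: int, total: int) -> dict[str, str]:
    """Build the attribute map of one hypothetical multipart FlowFile."""
    return {
        "http.context.identifier": context_id,
        "http.multipart.fragments.sequence.number": str(sequence),
        "http.multipart.fragments.total.number": str(total),
    }


gate = MultipartGate()
results = [gate.record(fragment("req-1", n, 3)) for n in (2, 1, 3)]
# results is [False, False, True]: the response may be sent after the third part.
```

Using a set of sequence numbers rather than a plain counter means a fragment that is processed twice does not release the response early.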

Access to queue attributes?

I have a number of GenerateTableFetch processors that send Flowfiles to a downstream UpdateAttribute processor. From UpdateAttribute, the Flowfile is passed to an ExecuteSQL processor.
Is there any way to add an attribute to a flow file coming off a queue with the position of that Flowfile in the queue? For example, after I reset/clear the state for a GenerateTableFetch, I would like to know whether this is the first batch of Flowfiles coming from GenerateTableFetch. I can see the position of the FlowFile in the queue, but it would be nice if there were a way to add that as an attribute that is passed downstream. Is this possible?
This is not an available feature in Apache NiFi. The position of a flowfile in a queue is dynamic, and will change as flowfiles are removed from the queue, either by downstream processing or by flowfile expiration.
If you are simply trying to determine if the queue was empty before a specific flowfile was added, your best solution at this time is probably to use an ExecuteScript processor to get the desired connection via the REST API, then use FlowFileQueue#isActiveQueueEmpty() to determine if the specified queue is currently empty, and add a boolean attribute to the flowfile indicating it is the "first of a batch" or whatever logic you want to apply.
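As a rough illustration, a REST-only variation of this check could look like the Python sketch below. It swaps the FlowFileQueue#isActiveQueueEmpty() call for the queued count reported in the connection's status, which is a different technique from the one named above. The function names are hypothetical, and the status.aggregateSnapshot.flowFilesQueued path is an assumption about the connection entity returned by the REST API. The count is only a snapshot, so it can change between the request and the moment the attribute is written.

```python
from typing import Any


def connection_url(base_url: str, connection_id: str) -> str:
    """Build the REST URL for a single connection (queue)."""
    return f"{base_url.rstrip('/')}/nifi-api/connections/{connection_id}"


def queued_flowfile_count(connection_entity: dict[str, Any]) -> int:
    """Read the number of queued flowfiles from a connection entity."""
    return int(connection_entity["status"]["aggregateSnapshot"]["flowFilesQueued"])


def first_of_batch_attribute(connection_entity: dict[str, Any]) -> dict[str, str]:
    """Return the attribute to add: 'true' if the queue was empty at snapshot time."""
    is_empty = queued_flowfile_count(connection_entity) == 0
    return {"first.of.batch": "true" if is_empty else "false"}  # attribute name is made up


# Hypothetical response bodies, for illustration only.
empty_queue = {"status": {"aggregateSnapshot": {"flowFilesQueued": 0}}}
busy_queue = {"status": {"aggregateSnapshot": {"flowFilesQueued": 12}}}
```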
"Batches" aren't really a NiFi concept. Is there a specific action you want to take with the "first" flowfile? Perhaps there is other logic (i.e. the ExecuteSQL processor hasn't operated on a flowfile in x seconds, etc.) that could trigger your desired behavior.

To identify the xsd of xml message which is received from MQ

In IBM MQ, I have a requirement where I can get many types of XML from the queue. The XML messages will conform to already specified XSDs (there are, say, 5 XSDs, which means I can get 5 different kinds of XML). When I get a message from the queue, I would like to know the type of XML (whether it is xsd1, xsd2, and so on).
The reason I want to know is that I am using a JAXB interface with a SAX implementation, for which I need to give the Java object corresponding to the XML as a parameter. So I have to know which XSD the input conforms to and assign the parameter correspondingly.
One option is to set a property in the message header, but the party who is dropping the message into MQ is not ready to do that.
What other options do I have? Can I get the file name (of the XML) from MQ and find the XSD based on the name of the file? Or do I have to do SAX parsing, identify the root tag, and derive the XSD type from it? Does anybody have a better option in mind?
Think of MQ like the Post Office. When you get a letter, the post office doesn't mess with anything on the inside (the payload) and if it changes the outside, it only changes routing information. If you want to sort incoming mail to different recipients, whoever is sending it has to put the data against which the sort criteria operate on the outside of the envelope. If that doesn't work, you must open the envelope and look for the recipient name, department, or whatever on the papers inside.
Your MQ message is that envelope. The sort criteria can be different queue names, a property of the message, a property of the message header, or something in the payload. But unless the sender explicitly sets the destination queue name based on the selection criteria, or sets the message or header property, your only option is to inspect the payload and figure it out.
If you have to inspect the payload, this is a perfect scenario for IBM Integration Broker. But you can also write an application to perform this function. Very often this is performed by a Dispatch app which gets the message, figures out where it goes, then puts it onto another queue and COMMITs the GET and PUT operations. But if the dispatch app must parse the XML to determine the correct queue, the message has to be parsed twice - once by the dispatcher, once by the receiving app.
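Inspecting the payload for its root element could be sketched as follows. The sketch is in Python for brevity, while the question uses Java, where a SAX or StAX reader would play the same role. The root element names and the ROOT_TO_SCHEMA mapping are made up for illustration; the real mapping would come from the five XSDs.

```python
import io
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical mapping from root element name to schema identifier.
ROOT_TO_SCHEMA = {
    "order": "xsd1",
    "invoice": "xsd2",
}


def root_local_name(payload: bytes) -> str:
    """Return the root element's name without its namespace."""
    for _event, element in ET.iterparse(io.BytesIO(payload), events=("start",)):
        # The first "start" event is always the root element.
        # ElementTree writes namespaced tags as "{namespace}local".
        return element.tag.rsplit("}", 1)[-1]
    raise ValueError("payload contains no XML element")


def schema_for(payload: bytes) -> Optional[str]:
    """Pick the schema for a message body, or None if the root is unknown."""
    return ROOT_TO_SCHEMA.get(root_local_name(payload))


message = b'<?xml version="1.0"?><ns:order xmlns:ns="urn:example"><id>1</id></ns:order>'
chosen = schema_for(message)
```

The function stops consuming parser events as soon as the root start tag is seen, so the dispatcher does not need to build the whole document tree just to route the message.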
I think you can do the following:
Does the incoming message have the file name at the beginning of the message body? In that case, after receiving the message, your application can read the first few bytes to get the file name. Based on the file name, the application can choose the appropriate XSD and pass on the entire message body.
