Transform data with NiFi - apache-nifi

What's the best practice in NiFi for extracting an attribute from a flowfile and transforming it into a text format? Example:
{ "data" : "ex" } ===> My data is ex
How can I do this in NiFi without using an ExecuteScript processor?

You could use ExtractText to extract the values into attributes. If you add a property to ExtractText such as foo = \{\s*"(.+)"\s*:\s*"(.+)"\s*\} (braces escaped because ExtractText uses Java regular expressions, and \s* to tolerate the spaces in your example), your flowfile gets one attribute for each capture group in the regex:
foo.1 = data
foo.2 = ex
Then you can use ReplaceText with a Replacement Value of:
My ${foo.1} is ${foo.2}
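As a rough illustration of what the two processors do together, here is a small Python sketch that mimics the extract-then-replace steps outside NiFi. It is not NiFi code: the name render_sentence is made up for this example, Python's re module stands in for the Java regex engine that ExtractText uses, and the pattern escapes the braces and allows optional whitespace as an assumption so that the sample input matches.

```python
import re

# Hypothetical stand-in for the ExtractText property "foo".
# Escaped braces and optional whitespace are assumptions of this sketch.
FOO_PATTERN = re.compile(r'\{\s*"(.+)"\s*:\s*"(.+)"\s*\}')


def render_sentence(content):
    """Mimic ExtractText (capture groups -> attributes) followed by ReplaceText."""
    match = FOO_PATTERN.search(content)
    if match is None:
        return None  # nothing was extracted, so this sketch produces no output
    # ExtractText names the attributes after the property plus the group index.
    attributes = {"foo.1": match.group(1), "foo.2": match.group(2)}
    # Equivalent of the Replacement Value: My ${foo.1} is ${foo.2}
    return "My {} is {}".format(attributes["foo.1"], attributes["foo.2"])


print(render_sentence('{ "data" : "ex" }'))
```

Trying the pattern on a few sample payloads like this before configuring the processor can save a round trip through the NiFi UI, but the final check should still be done in ExtractText itself.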

Related

How to combine the attributes from Different ExecuteScript Processor and add them in UpdateAttribute in NiFi

I have a use case where I run different scripts (ExecuteScript processors) based on different config sets (e.g. AB - ExecuteScriptAB, AC - ExecuteScriptAC, AL - ExecuteScriptAL, AM - ExecuteScriptAM).
Inside each ExecuteScript processor, I use session.putAttribute() in a Python script to record the exit status code of that script.
For AB : flowFile = session.putAttribute(flowFile,"returnCodeAB",str(exec_code));
For AC : flowFile = session.putAttribute(flowFile,"returnCodeAC",str(exec_code));
For AL : flowFile = session.putAttribute(flowFile,"returnCodeAL",str(exec_code));
For AM : flowFile = session.putAttribute(flowFile,"returnCodeAM",str(exec_code));
Now I want to add up the values of these 4 attributes, i.e. returnCodeAB + returnCodeAC + returnCodeAL + returnCodeAM, and return a final status code value. Because these scripts execute separately, as different processors, I am unable to merge them and add the values.
I tried using an UpdateAttribute processor after these scripts and creating an attribute in the Advanced tab, but it did not help. Can someone help me with an efficient solution?

How can I update a flowfile attribute's data type (String to byte) in NiFi

I am using NiFi version 1.8.0.3.3.0.0-165 and cannot figure out how to convert the data type of an attribute value (String to byte).
Is it possible to convert the data type of a NiFi flowfile attribute?
For attributes, you can consult the Apache NiFi Expression Language Guide.
If you don't find a solution there, you can use a Groovy script to load your attribute and do whatever you want with it:
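def flowFile = session.get()
if (!flowFile) return
// Read the attribute value (attribute values are always strings)
def val = flowFile.getAttribute('yourattribute')
// Modify val here; the result must be a String again before it is stored
def yourattributeout = val
flowFile = session.putAttribute(flowFile, 'yourattributeout', yourattributeout)
session.transfer(flowFile, REL_SUCCESS)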

How to store json object in a variable using apache nifi?

The following flowfile is the response of an "InvokeHttp":
[
{"data1":"[{....},{...},{....}]","info":"data-from_site"},
{"data2":"[{....},{...},{....}]","info":"data-from_site"},
{"data3":"[{....},{...},{....}]","info":"data-from_site"}
]
I ran "SplitJson" and got each JSON record as a single flowfile:
flowfile 1:
{"data1":"[{....},{...},{....}]","info":"data-from_site"}
flowfile 2:
{"data2":"[{....},{...},{....}]","info":"data-from_site"}
flowfile 3:
{"data3":"[{....},{...},{....}]","info":"data-from_site"}
I want to store the JSON record of each flowfile in a variable, like this:
variable1 = "{"data1":"[{....},{...},{....}]","info":"data-from_site"}"
variable2 = "{"data2":"[{....},{...},{....}]","info":"data-from_site"}"
variable3 = "{"data3":"[{....},{...},{....}]","info":"data-from_site"}"
Can someone show me how to store the JSON record in a variable?
If I understand correctly what you want to do (by "variable", do you mean what is called "attribute" in NiFi?), you can use the EvaluateJsonPath processor configured with:
flowfile-attribute as Destination
json as Return type
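To make the intended result concrete, the following Python sketch imitates the split-then-store idea outside NiFi: each record of the array ends up as a JSON string under an attribute name. Everything in it is illustrative. The function split_and_store, the attribute name variable, and the sample payload are invented for the example (the real payload in the question is elided). In NiFi itself the equivalent would presumably be a dynamic property on EvaluateJsonPath whose JsonPath is $, which selects the whole document.

```python
import json


def split_and_store(response_text, attribute_name="variable"):
    """Split a JSON array into records and keep each record as a string attribute."""
    flowfiles = []
    for record in json.loads(response_text):
        # Each "flowfile" here is just a dict of attributes; values are strings.
        as_text = json.dumps(record, separators=(",", ":"))
        flowfiles.append({attribute_name: as_text})
    return flowfiles


# Made-up sample response, shaped like the one in the question.
response = (
    '[{"data1":"[1,2]","info":"data-from_site"},'
    '{"data2":"[3]","info":"data-from_site"}]'
)

for number, flowfile in enumerate(split_and_store(response), start=1):
    print(number, flowfile["variable"])
```

Note that the stored value is plain text, so anything downstream that needs the structure has to parse it again.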

Get id from previous processor NiFi

Processors I'm referring to
Is it possible for the "InvokeHTTP" processor to take the "id" from the previous processor (in this case SELECT_FROM_SNOWFLAKE)?
Where I want to change
I would like the "Remote URL" to be something like:
http://${hostname()}:8080/nifi-api/processors/${previousProcessorId()}
No, you can't. But you can get the name, id, or other properties of the current process group by placing an ExecuteScript or ExecuteGroovyScript processor somewhere in this flow and running a script like this:
def flowFile = session.get()
if(!flowFile) return
processGroupId = context.procNode?.processGroupIdentifier ?: 'unknown'
processGroupName = context.procNode?.getProcessGroup().getName() ?: 'unknown'
flowFile = session.putAttribute(flowFile, 'processGroupId', processGroupId)
flowFile = session.putAttribute(flowFile, 'processGroupName', processGroupName)
session.transfer(flowFile, REL_SUCCESS)
After that, you can look up the id of the Snowflake processor within this process group, for example through the REST API.
The Remote URL property of the InvokeHTTP processor supports the NiFi Expression Language.
So, if the previous processor sets an attribute named hostname, you can use it as http://${hostname}:8080/...
However, SelectSQL returns its result in Avro format.
Before InvokeHTTP you will probably need to convert the Avro to JSON and then use EvaluateJsonPath to extract the required values into attributes.

How to take Entire flowfile content in nifi processor

I am using NiFi to develop data drifting. My flow uses the SelectHiveQL processor, and its output (a flowfile) needs to be passed to the next processor.
Which processor is suitable for taking the flowfile content and storing it in a user-defined variable? I need to use the same variable in ExecuteScript to manipulate the data.
The ExecuteScript processor has direct access to the content of the incoming flowfile via the standard API. Here is an example:
import org.apache.nifi.processor.io.StreamCallback
def flowFile = session.get()
if (flowFile == null) {
    return
}
// This uses a closure acting as a StreamCallback to write the new content to the flowfile
flowFile = session.write(flowFile, { inputStream, outputStream ->
    String line
    // Create a buffered reader over the existing flowfile input
    final BufferedReader inReader = new BufferedReader(new InputStreamReader(inputStream, 'UTF-8'))
    // For each line, write the reversed line to the output.
    // Compare against null so that an empty line does not end the loop early.
    while ((line = inReader.readLine()) != null) {
        outputStream.write("${line.reverse()}\n".getBytes('UTF-8'))
    }
} as StreamCallback)
flowFile = session.putAttribute(flowFile, "reversed_lines", "true")
session.transfer(flowFile, /*ExecuteScript.*/ REL_SUCCESS)
It is risky to move flowfile content into an attribute because NiFi manages attributes and content differently: attributes are held in memory, while content is stored in the content repository, so a large attribute can put pressure on the heap. There is a more detailed explanation of the differences in the Apache NiFi In Depth guide.
You could use ExtractText to copy the content of your flowfile into an attribute.
In the ExtractText processor, create a property (the name you give it becomes a new attribute on your flowfile) whose value is the regular expression (\A.+\Z). In my experience, this regex is enough to capture the entire content of the flowfile, though mileage may vary depending on the type of content in your flowfile.
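One case where mileage does vary is multi-line content, because "." does not match line breaks by default. The Python sketch below shows the effect. It is only an approximation: capture_all is a made-up helper, and Python's re module stands in for the Java regex engine used by ExtractText (the two differ in some details, such as how \Z treats a trailing newline). ExtractText exposes related settings, such as Enable DOTALL Mode and Maximum Capture Group Length, that are worth checking in your NiFi version.

```python
import re


def capture_all(content, dotall=False):
    """Apply the whole-content pattern and return the capture, or None if no match."""
    flags = re.DOTALL if dotall else 0
    match = re.search(r"(\A.+\Z)", content, flags)
    return match.group(1) if match else None


print(capture_all("one line"))                    # single-line content is captured
print(capture_all("line 1\nline 2"))              # no match: "." stops at the newline
print(capture_all("line 1\nline 2", dotall=True))  # DOTALL lets "." cross newlines
```

If your content spans several lines, test the pattern against a representative sample before relying on the resulting attribute.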
