Write attributes to a file in Apache NiFi

Hi,
I am using the GetSNMP processor to connect to a radio. Per the NiFi documentation, this information is written to flow file attributes, not to the flow file content. So I used the AttributesToJSON processor, followed by a PutFile processor to write these attributes to a file. Files are generated, but no attributes are written to them; each file contains only "{}". Using the LogAttribute processor I can see all the attributes in the log file, but I want them in a separate file.
Please guide.
Thanks,

SGaur,
If the incoming flow file content is empty before the PutFile processor, it will write an empty file to the local directory.
So you have to write the attributes into the flow file content, for example using ReplaceText.
Suppose the flow file arriving at PutFile has these attributes:
${filename} --> input.1,
${input.content.1} --> content.1,
${input.content.2} --> content.2
To write those attributes into the flow file content, set the Replacement Value in ReplaceText to
${filename},${input.content.1},${input.content.2}
The content will then be replaced with
input.1,content.1,content.2
and PutFile will write it to a local file.
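As a minimal sketch, the ReplaceText configuration would look roughly like this (the attribute names are just the example ones above):
ReplaceText
Replacement Strategy: Always Replace
Evaluation Mode: Entire text
Replacement Value: ${filename},${input.content.1},${input.content.2}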
Hope this is helpful.

Related

How can we use the UpdateAttribute processor to assign a variable from FlowFile content in Apache NiFi

I need some help with the UpdateAttribute processor:
I have a CSV file which contains hostnames. I need to separate each hostname in the FlowFile and pass it as a variable to a REST API.
The REST API part works fine when I pass the data manually; however, I don't see how to pass a variable value as the hostname.
Sharing a sample file:
SRHAPP001,SRHWEBAPP002,SRHDB006,SRHUATAPP4,ARHUATDB98
I don't quite understand your goal, but I assume you are trying to pass the hostname to your REST API module using FlowFile attributes.
You can achieve this with the ExtractText processor: use a regular expression to separate the hostnames from the CSV content.
For more information, see
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.12.1/org.apache.nifi.processors.standard.ExtractText/
How can I extract a substring from a flowfile data in Nifi?
If needed, you can first split the incoming FlowFile into one FlowFile per hostname using the SplitContent processor.
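A rough configuration, assuming the whole CSV is a single comma-separated line (the hostname property name is just an example):
SplitContent (one FlowFile per hostname)
Byte Sequence Format: Text
Byte Sequence: ,
ExtractText (copy the remaining content into an attribute)
dynamic property hostname = (.+)
The first capture group lands in the hostname attribute, which you can then reference as ${hostname} in your REST API processor (e.g. InvokeHTTP).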

How to read from a CSV file

The problem:
I have a CSV file. I want to read from it and use one of its values based on the content of my flow file. The flow file will be XML: I want to read a key from it into an attribute using EvaluateXPath, then use that key to read the corresponding value from the CSV file and put that into another flow file attribute.
I tried following this:
https://community.hortonworks.com/questions/174144/lookuprecord-and-simplecsvfilelookupservice-in-nif.html
but found that requiring several controller services, including a CSV writer, is a bit more than I would think necessary to solve this.
Since you're working with attributes (and only one lookup value), you can skip the record-based stuff and just use LookupAttribute with a SimpleCsvFileLookupService.
The record-based components are for doing multiple lookups per record and/or lookups for each record in a FlowFile. In your case it looks like you have one "record" and you really just want to look up an attribute from another attribute for the entire FlowFile, so the above solution should be more straightforward and easier to configure.
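A sketch of that setup (the file path and column names are hypothetical, and my.key is assumed to be the attribute set by EvaluateXPath):
SimpleCsvFileLookupService (controller service)
CSV File: /path/to/lookup.csv
Lookup Key Column: key
Lookup Value Column: value
LookupAttribute
Lookup Service: the service above
dynamic property my.value = ${my.key}
Each dynamic property on LookupAttribute names the attribute to populate, and its value is the key sent to the lookup service.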

Pass a directory as an argument to ExecuteStreamCommand

I have a Java program that is designed to process a directory full of data, passed as an argument to the JAR.
input_dir/
file1
file2
How can I tell NiFi to pass a directory to an ExecuteStreamCommand as an argument, instead of an individual FlowFile?
Is there a way to model a directory as a FlowFile?
I tried to use GetFile on the parent directory of input_dir just before ExecuteStreamCommand, in order to get input_dir itself passed to the stream command.
It didn't work: GetFile just crawls all the directories looking for actual files when "Recurse Subdirectories" is set to true.
When it is set to false, GetFile doesn't pick up any files at all.
To summarize, I would like to find a way to pass a directory containing data to an ExecuteStreamCommand, not just a single FlowFile.
Hope this makes sense; thank you for your suggestions.
A flow file does not have to be a file from disk, it can be anything. If I am understanding you correctly, you just need a flow file to trigger your ExecuteStreamCommand. You should be able to do this with GenerateFlowFile (set the scheduling strategy appropriately). You can put the directory directly into ExecuteStreamCommand, or if you want it to be more dynamic you can add it as a flow file attribute in GenerateFlowFile, then reference it in ExecuteStreamCommand like ${my.dir} (assuming you called it my.dir in GenerateFlowFile).
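A minimal configuration along these lines (the JAR path, directory, and my.dir attribute name are placeholders):
GenerateFlowFile (scheduled as often as you want the command to run)
dynamic property my.dir = /data/input_dir
ExecuteStreamCommand
Command Path: java
Command Arguments: -jar;/path/to/processor.jar;${my.dir}
Command Arguments are delimited with ; by default, so this runs java -jar /path/to/processor.jar /data/input_dir.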

NiFi PutFile set filename doesn't work

I get JSON data through a Kafka broker.
The data is in the following format, with the image data encoded in Base64.
e.g){"filename":"test.jpg","filedata":"/9j/4AAQSkZJRgABAQEAYABgA....."}
I want to save the image data received through Kafka as a file.
However, it is not working properly.
Below is the order of my flow, describing only the key settings.
ConsumeKafka_2_0 processor
EvaluateJsonPath Processor
Destination flowfile-content
rawbytes $.filedata
EvaluateJsonPath Processor (error: did not have valid JSON Content)
Destination flowfile-attribute
filename $.filename
Base64EncodeContent processor
PutFile processor
When the flow is executed, the image file is saved normally, but the file name cannot be set. What should I do?
Do you have any site or examples to refer to?
The reference I used is https://community.hortonworks.com/articles/218015/ingesting-binary-files-like-pdf-jpg-png-to-hbase-w.html
According to the PutFile documentation:
Reads Attributes
filename: The filename to use when writing the FlowFile to disk.
You just need to use an UpdateAttribute processor to set a value for the filename attribute.
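For example, a minimal sketch assuming you want a generated name (if you instead extract $.filename into the filename attribute with EvaluateJsonPath, PutFile will pick it up without this step):
UpdateAttribute
dynamic property filename = image_${now():format('yyyyMMddHHmmss')}.jpg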
From the question I understood that there is a Kafka topic containing a filename and Base64-encoded file content in JSON format; you want to consume the topic, decode the file content from Base64 to reconstruct the image, and store the image under that filename using PutFile.
I came up with a flow that achieves this requirement and is fairly self-explanatory.
ConsumeKafkaRecord_2_0 (Consumes {"filename":"test.jpg","filedata":"/9j/4AAQSkZ.."})
EvaluateJsonPath
Destination: flowfile-attribute
rawtypes: $.filedata
filename: $.filename
ReplaceText (Changing flow file content to encoded image content for next processor)
Base64EncodeContent (rawtypes is decoded to image by this processor)
UpdateAttribute (File name to store the image is updated here)
PutFile
I was unable to upload the flow template here; the original answer included screenshots of the key processors: EvaluateJsonPath, ReplaceText (note the replacement value), and UpdateAttribute.
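A plausible ReplaceText configuration for that step (the rawtypes attribute name comes from the EvaluateJsonPath step above):
ReplaceText
Replacement Strategy: Always Replace
Evaluation Mode: Entire text
Replacement Value: ${rawtypes}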
In step 2 you have replaced the flow file content with the value of $.filedata, which is no longer JSON, so you can't use EvaluateJsonPath in step 3: there is no JSON left.
If you reverse steps 2 and 3, you can extract the filename to an attribute while the original JSON is still in the flow file content, and then extract the filedata to the content.
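Put together, the corrected order would look roughly like this (note that Base64EncodeContent must have Mode set to Decode to turn the Base64 text back into image bytes):
ConsumeKafka_2_0
EvaluateJsonPath (Destination: flowfile-attribute)
filename: $.filename
EvaluateJsonPath (Destination: flowfile-content)
rawbytes: $.filedata
Base64EncodeContent (Mode: Decode)
PutFile (writes using the filename attribute set above)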

Apache NiFi to split data based on condition

Our requirement is to split the flow data based on a condition.
We thought to use the ExecuteStreamCommand processor for this (internally it uses a Java class), but it produces only a single output flow file. We would like two flow files: one for data matching the criteria and another for the unmatched data.
I looked at the RouteText processor, but it has no way to plug in a Java class.
Let me know if anyone has any suggestions.
I think you could use GetMongo to read those definition values and store them in a map accessed by DistributedMapCacheClientService, then use RouteOnContent to route the incoming flowfiles based on the absence/presence of the retrieved values.
If that doesn't work, you could instead route the query result from GetMongo to PutFile and then use ScanContent, which reads from a dictionary file on the file system and routes flowfiles based on the absence/presence of those keywords in the content.
Finally, if all else fails, you can use ExecuteScript to combine those steps into a single processor and route to matched/unmatched relationships. It processes Groovy code easily, so you can directly invoke your existing Java class if necessary.
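As a rough illustration of that last option, an ExecuteScript (Groovy) body could look like the sketch below. The KEYWORD check is a placeholder for whatever your Java class evaluates, and since ExecuteScript only exposes success and failure relationships, the sketch tags the flow file with a match attribute that a downstream RouteOnAttribute can split on:
import org.apache.nifi.processor.io.InputStreamCallback

def flowFile = session.get()
if (!flowFile) return

// read the flow file content into a string
def text = ''
session.read(flowFile, { inputStream ->
    text = inputStream.getText('UTF-8')
} as InputStreamCallback)

// stand-in for the condition your Java class evaluates
def matched = text.contains('KEYWORD')

// tag the flow file so RouteOnAttribute can separate matched from unmatched
flowFile = session.putAttribute(flowFile, 'match', matched ? 'true' : 'false')
session.transfer(flowFile, REL_SUCCESS)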
