In my NiFi 1.3.0 dataflow, the FetchElasticsearchHttp processor changes the filename attribute to the document's corresponding ID in the database. I was wondering if there is a way of changing it back using some of NiFi's built-in processors.
I have thought about simply writing my own script to correct this, but there seems to be no way of knowing which file it was, so I can't just grab its name.
If I understood you correctly, you can use UpdateAttribute to copy the filename attribute to another attribute. There's no way to stop the processor from overwriting filename, but you can certainly stash it away yourself. The trick is to copy/rename it before invoking the fetch processor.
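For example, a minimal UpdateAttribute configuration might look like the following (the property name original.filename is just an illustration, use whatever doesn't collide with your other attributes):

    UpdateAttribute (before FetchElasticsearchHttp)
        original.filename = ${filename}

    UpdateAttribute (after the fetch, to restore it)
        filename = ${original.filename}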
So I'm using EvaluateJsonPath to extract three JSON values into attributes, but I ran into an issue where one flowfile turns into three, with the same filename but different UUIDs, after passing through EvaluateJsonPath. Is this how it's supposed to work, or am I missing something?
I have 'flowfile-attribute' set as the Destination property value.
It must have been some kind of bug; I added another identical processor and it works with no issues.
The problem:
I have a CSV file. I want to read from it, and use one of the values in it based on the content of my flow file. My flow file will be XML. I want to read the key using EvaluateXPath into an attribute, then use that key to read the corresponding value from the CSV file and put that into a flow file attribute.
I tried following this:
https://community.hortonworks.com/questions/174144/lookuprecord-and-simplecsvfilelookupservice-in-nif.html
but found that requiring several controller services, including a CSV writer, was a bit more than I would think is needed to solve this.
Since you're working with attributes (and only one lookup value), you can skip the record-based stuff and just use LookupAttribute with a SimpleCsvFileLookupService.
The record-based components are for doing multiple lookups per record and/or lookups for each record in a FlowFile. In your case it looks like you have one "record" and you really just want to look up an attribute from another attribute for the entire FlowFile, so the above solution should be more straightforward and easier to configure.
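As a rough sketch (the file path, column names, and attribute names here are assumptions, not requirements), the configuration might look like:

    SimpleCsvFileLookupService
        CSV File:            /path/to/lookup.csv
        Lookup Key Column:   key
        Lookup Value Column: value

    LookupAttribute
        Lookup Service:      the SimpleCsvFileLookupService above
        my.value (dynamic):  ${xpath.key}

where xpath.key is the attribute populated earlier by EvaluateXPath and my.value is the attribute that receives the looked-up value.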
I am getting some numerical data from a URL via an API, and I am looking for a way to perform some mathematical operations on it in Apache NiFi before putting the data into a file directory. Thanks in advance.
By the way, I am using the InvokeHTTP processor to get the data and the PutFile processor to write it out. I searched some related websites but could not find a working approach.
Try using the QueryRecord processor, and define Record Reader/Writer controller services to read/write the flowfile.
Add a new property to the QueryRecord processor whose value is an Apache Calcite SQL query containing your mathematical operations on the flowfile; an example follows below.
The results of the SQL query will be added to the outgoing flowfile in your desired format.
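For instance, assuming the incoming records have numeric fields named price and quantity (illustrative names), a dynamic property on QueryRecord could hold something like:

    SELECT price, quantity, price * quantity AS total FROM FLOWFILE

FLOWFILE is the table name QueryRecord exposes for the incoming content, and the dynamic property's name becomes the outgoing relationship.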
Ultimately the answer depends on whether the data you're working with is in the content of the FlowFile or in the attributes. If the data is small enough and it's only a couple operations, the suggested approach would be to work with the data as attributes and use NiFi's expression language to do the transformations.
There is a section on mathematical operations[1] in the Apache documentation[2]. The operations range from simple ones like plus/minus to exposing the java.lang.Math static methods.
[1] https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#numbers
[2] https://nifi.apache.org/docs.html
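For example, assuming an attribute named amount holds a number (the attribute name is illustrative), an UpdateAttribute property could compute:

    ${amount:toNumber():multiply(2):plus(1)}

and the math() function exposes java.lang.Math methods, e.g.:

    ${literal(64):toNumber():math("sqrt")}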
You can try ExecuteStreamCommand if you want to take in the whole file and then run operations on it. Alternatively, you can work with the attributes on the flowfile, depending on how large your operation is.
For example, if you have some initial values, you can include them in the name of your file and then extract them, run the operations within the flowfile's attributes, and then append the result to the original file.
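For example, if a file were named data_5_10.csv (purely illustrative), expression language could pull the values back out with getDelimitedField, which is 1-based:

    ${filename:getDelimitedField(2, '_')}

would yield 5, which can then be fed into the math functions mentioned above.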
Our requirement is to split the flow data based on a condition.
We thought of using the ExecuteStreamCommand processor for that (internally it will use a Java class), but it produces only a single flow data file. We would like two flow data files: one for the matched criteria and another for the unmatched.
I looked at the RouteText processor, but it has no feature for using a Java class as part of it.
Let me know if anyone has any suggestion.
I think you could use GetMongo to read those definition values and store them in a map accessed by DistributedMapCacheClientService, then use RouteOnContent to route the incoming flowfiles based on the absence/presence of the retrieved values.
If that doesn't work, you could instead route the query result from GetMongo to PutFile and then use ScanContent, which reads from a dictionary file on the file system and routes flowfiles based on the absence/presence of those keywords in the content.
Finally, if all else fails, you can use ExecuteScript to combine those steps into a single processor and route to matched/unmatched relationships. It processes Groovy code easily, so you can directly invoke your existing Java class if necessary.
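A minimal ExecuteScript (Groovy) sketch of that idea, assuming a hard-coded keyword stands in for the values you would fetch from Mongo; note that ExecuteScript only offers success/failure relationships, so those play the role of matched/unmatched here:

    import org.apache.nifi.processor.io.InputStreamCallback
    import java.nio.charset.StandardCharsets

    def flowFile = session.get()
    if (flowFile == null) return

    def keyword = 'ERROR'   // assumption: the value you'd actually look up from Mongo
    def matched = false

    // Read the flow file content and check for the keyword
    session.read(flowFile, { inputStream ->
        matched = inputStream.getText(StandardCharsets.UTF_8.name()).contains(keyword)
    } as InputStreamCallback)

    // success = matched, failure = unmatched
    session.transfer(flowFile, matched ? REL_SUCCESS : REL_FAILURE)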
I have a common process group that infers an Avro schema based on the file I supply. I want to set the Avro Record Name to a name corresponding to the filename I am supplying, so I used ${filename}. But InferAvroSchema raised an error saying the record name is empty. Note that before this, I had already set the 'filename' attribute on the flowfile, and it has a value; I tested it with ReplaceText to confirm ${filename} resolves.
Unfortunately this looks like a bug in InferAvroSchema. Many of the properties support expression language, but then the processor doesn't evaluate them against the incoming flow file. So it ends up only being able to use a value typed directly into the property (non-EL), or a value from system or environment properties which doesn't really make sense for a lot of these properties.
I created this JIRA for the issue:
https://issues.apache.org/jira/browse/NIFI-2465
The fix is that all of the calls to evaluateAttributeExpressions() should be passing in a flow file like:
context.getProperty(CSV_HEADER_DEFINITION).evaluateAttributeExpressions(inputFlowFile).getValue()