Nifi - How to insert XML whole content into JSON attribute - apache-nifi

I am trying to insert the whole content of a row of an XML file into a JSON attribute (I am a newbie).
I am doing it this way (tell me if there is an easier way; it's good to know):
I have configured ExtractText this way:
And to finish, I configure ReplaceText, giving it a JSON format:
But the result appears to be wrong (it doesn't work like a normal JSON file, for example if I try to do an HTTP POST):
How can I fix this problem?
cheers

If you are concerned about newlines and JSON keys/values, use NiFi Expression Language functions on the extracted attribute (data).
ReplaceText Configs:
Replacement value:
{"name" : "user1","time" : "${now()}","data" : "${data:replaceAll('\s',''):escapeJson()}"}
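Use the replaceAll and escapeJson functions: replaceAll('\s','') removes all whitespace (spaces and newlines), and escapeJson escapes the result so it can sit inside a JSON string.
Replacement Strategy: Always Replace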
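As a rough illustration outside NiFi, the following Python sketch mimics what the expression above does: strip whitespace from the extracted content, then let a JSON serializer escape it. The function name `build_payload` and the sample row are hypothetical; `json.dumps` stands in for `escapeJson()` and is not NiFi code.

```python
import json
import re


def build_payload(xml_row: str, name: str, time: str) -> str:
    """Build the JSON message with the XML row embedded as a string value."""
    # Same idea as replaceAll('\s',''): drop every whitespace character.
    compact = re.sub(r"\s", "", xml_row)
    # json.dumps escapes quotes and backslashes, the job escapeJson() does in NiFi.
    return json.dumps({"name": name, "time": time, "data": compact})


if __name__ == "__main__":
    row = '<row id="1">\n  <value>a "quoted" word</value>\n</row>'
    print(build_payload(row, "user1", "2020-01-01T00:00:00Z"))
```

Note that removing all whitespace also removes the space between an element name and its attributes (`<row id="1">` becomes `<rowid="1">`), so if the XML must stay parseable it may be safer to replace only newlines.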
(or)
Another way to prepare the JSON message is to use the AttributesToJSON processor.
With this processor, the attributes/values need to be prepared beforehand with an UpdateAttribute processor.
Flow:
1. SplitXml
2. ExtractText // add a data property to extract the content into a flowfile attribute
3. UpdateAttribute // add name property -> user1
   add time property -> ${now()}
   add data property -> ${data:replaceAll('\s',''):escapeJson()}
4. AttributesToJSON // Attributes List -> name,time,data
   Destination -> flowfile content
   Include Core Attributes -> false
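The sketch below models this second flow in Python, assuming AttributesToJSON behaves like an ordinary JSON serializer that escapes attribute values itself. Under that assumption, a value that was already escaped would be escaped twice, so it may be worth checking the output and dropping `escapeJson()` from the UpdateAttribute step if backslashes show up in the result. The helper `attributes_to_json` and the attribute values are hypothetical.

```python
import json


def attributes_to_json(attributes: dict, attributes_list: list) -> str:
    """Serialize only the listed attributes, like the Attributes List property."""
    selected = {key: attributes[key] for key in attributes_list if key in attributes}
    return json.dumps(selected)


raw = '<row id="1"/>'
# json.dumps(...)[1:-1] yields the escaped text without the surrounding quotes,
# a stand-in for what escapeJson() would produce.
pre_escaped = json.dumps(raw)[1:-1]

once = json.loads(attributes_to_json({"data": raw}, ["data"]))
twice = json.loads(attributes_to_json({"data": pre_escaped}, ["data"]))

print(once["data"])   # the original XML
print(twice["data"])  # still contains backslashes
```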

Related

issue generating json file from AttributesToJSON in Nifi?

I have a scenario where a list of files comes from the previous processor, and for each file I have to create a JSON file from the attributes of the flowfile. The AttributesToJSON processor configuration has an option to extract flowfile attributes and create a JSON file/object; if we set Include Core Attributes to true, it reads some of the file properties and forms the JSON file.
The output for the above case in my scenario is …
{"fragment.size":"125"
file.group:"root",
file.lastModifiedTime:"2020-12-22T15:09:13+0000",
fragment.identifier:"ee5770ea-8406-400a-a2fd-2362bd706fe0",
fragment.index:"1",
file.creationTime:"2020-12-22T15:09:13+0000",
file.lastAccessTime:"2020-12-22T17:34:22+0000",
segment.original.filename:"Sample-Spreadsheet-10000-rows.csv",
file.owner:"root",
fragment.count:"2",
file.permissions:"rw-r--r--",
text.line.count:"1"}
}
But the file has other properties, such as absolute.path, filename and uuid, that are missing from the above JSON file.
My requirement is to get absolute.path, filename and uuid, concatenate absolute.path+/+filename, assign the result to a custom attribute (say filepath:absolute.path+/+filename), and also add uuid to the JSON object.
So my JSON file should look like
{ uuid:"file uuid value", filepath:"absolute.path+/+filename" }
Any input on how to get the above form of JSON file?
If you look at the docs for AttributesToJSON, you can see that you can specify attributes in the Attributes List property. So you could try listing the properties you want there.
Alternatively. Sounds like you have 1 FlowFile for each File? You could use UpdateRecord to insert fields. You can use the Literal Value for the Replacement Value Strategy which will let you use Expression Language to insert values - for example, you could add a Property called filename with value ${filename} to insert the value of the filename attribute to a field in the JSON called filename.
To concat the two fields you could do ${allAttributes("absolute.path", "filename"):join('/')} or use append().
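For reference, the target JSON from the question can be sketched in plain Python as below. It is not NiFi code; the attribute values are made up, and the trailing-slash handling is an assumption, since `absolute.path` may or may not already end with `/` depending on the processor that set it.

```python
import json


def build_file_record(attributes: dict) -> str:
    """Return JSON holding the uuid and a filepath joined from two attributes."""
    # Strip any trailing slash so the join never produces "//".
    directory = attributes["absolute.path"].rstrip("/")
    filepath = directory + "/" + attributes["filename"]
    return json.dumps({"uuid": attributes["uuid"], "filepath": filepath})


if __name__ == "__main__":
    sample = {
        "absolute.path": "/data/in/",  # hypothetical directory
        "filename": "Sample-Spreadsheet-10000-rows.csv",
        "uuid": "00000000-0000-0000-0000-000000000000",  # placeholder value
    }
    print(build_file_record(sample))
```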

How to add custom attributes to AttributesToJSON?

I have a scenario where a list of files comes from the previous processor, and for each file I have to create a JSON file from the attributes of the flowfile. The AttributesToJSON processor configuration has an option to extract flowfile attributes and create a JSON file/object; if we set Include Core Attributes to true, it reads some of the file properties and forms the JSON file.
The output for the above case in my scenario is …
{"fragment.size":"125"
file.group:"root",
file.lastModifiedTime:"2020-12-22T15:09:13+0000",
fragment.identifier:"ee5770ea-8406-400a-a2fd-2362bd706fe0",
fragment.index:"1",
file.creationTime:"2020-12-22T15:09:13+0000",
file.lastAccessTime:"2020-12-22T17:34:22+0000",
segment.original.filename:"Sample-Spreadsheet-10000-rows.csv",
file.owner:"root",
fragment.count:"2",
file.permissions:"rw-r--r--",
text.line.count:"1"}
}
But the file has other properties, such as absolute.path, filename and uuid, that are missing from the above JSON file.
My requirement is to get absolute.path, filename and uuid, concatenate absolute.path+/+filename, assign the result to a custom attribute (say filepath:absolute.path+/+filename), and also add uuid to the JSON object.
So my JSON file should look like
{
uuid:"file uuid value",
filepath:"absolute.path+/+filename"
}
Any input on how to get the above JSON file?
Use an UpdateAttribute processor to delete the unnecessary attributes before passing the flowfile to AttributesToJSON, or specify the exact attributes you need in the AttributesToJSON processor.

How to set an Attribute to Array for AttributesToJSON Processor?

NiFi Version 1.8.0
I'm trying to build out my JSON, and one of my fields needs to be an array. I thought I could simply use the UpdateAttribute processor to set my attribute to '["arrayItem1", "arrayItem2"]' and then use AttributesToJSON to convert the attribute to JSON, and it would convert to an array. Unfortunately, it simply turns into a string.
In the simplest way, how can I set an attribute to be an array so that the field in my final JSON (when using AttributesToJSON) is that specific array?
EDIT 1
I will have a few SyslogListeners, and I want to set an attribute so I know what data came from where. I want to be able to tag this data, so I thought of adding an UpdateAttribute to set my attribute. I would like this to be an array. So the tag for:
SyslogListener1 will be ["tag1", "tag2"]
SyslogListener2 will be ["tag3", "tag4"]
SyslogListener3 will be ["tag1", "tag3"]
I thought of just having my flow look like this: SyslogListener -> UpdateAttribute -> then all the data is now in the main flow -> AttributesToJSON. However, when I look at my JSON, my field is a string, not an array. How can I make this field an array? What I used to do was use ReplaceText; the only problem is that I didn't want to create a ReplaceText for every single instance. Is there a single processor that could handle this?
Does your incoming flow file have any existing content? If not, you can use ReplaceText to set the content to ["arrayItem1", "arrayItem2"] or whatever you wish the JSON to look like.
If the incoming flow file has existing JSON content, you can add the field explicitly (without attributes) using JoltTransformJSON or UpdateRecord.
Not my ideal solution, but I simply added a ReplaceText for each instance I would need. In my case, it was 7 different tag combinations, so my NiFi flow looks a little ugly. I was hoping for a single-processor solution where I could tell it my JSON field and make it an array. So my pipeline is:
SyslogListener -> UpdateAttribute (creates our tags attribute with the string tag1, tag2 and the other tag combinations, because I have 7 SyslogListeners in total, each with its own UpdateAttribute) -> data is now in the main pipeline, and some other processing happens here -> AttributesToJSON (builds our JSON from some attributes, including our tags attribute) -> my 7 ReplaceTexts (each checks whether our tags field has "tag1, tag2" and then replaces it with ["tag1", "tag2"]; I do this for all 7 cases) -> PutElasticSearchHttp
So I am ingesting rsyslog messages, doing a bit of enriching, turning my data into JSON, then saving it to ES.
If anyone knows a single-processor solution to this, so that I don't need 7 unique ReplaceTexts (and more if I need new tags), please share it.
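The transformation that the seven ReplaceText processors perform can be expressed generically: split the comma-separated tags attribute and emit a real JSON array. The Python sketch below only illustrates that logic, assuming the attribute holds a string such as `tag1, tag2`; the function name and the `message` field are hypothetical, and it is not tied to any specific NiFi processor.

```python
import json


def attributes_to_json_with_tags(attributes: dict, array_field: str = "tags") -> str:
    """Serialize attributes, turning one comma-separated attribute into an array."""
    record = dict(attributes)  # copy so the caller's dict is left untouched
    raw = record.get(array_field, "")
    # Split on commas and trim whitespace; empty pieces are dropped.
    record[array_field] = [tag.strip() for tag in raw.split(",") if tag.strip()]
    return json.dumps(record)


if __name__ == "__main__":
    print(attributes_to_json_with_tags({"tags": "tag1, tag2", "message": "hello"}))
```

Because the split is driven by the attribute value itself, adding a new tag combination would not require a new rule.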

Apache Nifi: How to convert string (text/plain) to JSON type using Nifi processor?

Please guide me to the right NiFi processor for converting a string to JSON.
Input is a string of content type text/plain
{ productName : "tv", locationName: " chennai"}
Output of EvaluateJsonPath is still the same as I am unable to evaluate json property based on json path due to wrong content type sent as input.
{
productName : "tv",
locationName: " chennai"
}
Note: I tried the SplitText and AttributesToJSON processors but was not able to achieve the desired conversion.
This is because the input data is not valid JSON. I recreated this flow locally and the error from EvaluateJsonPath is
2017-08-22 10:20:21,079 ERROR [Timer-Driven Process Thread-5] o.a.n.p.standard.EvaluateJsonPath EvaluateJsonPath[id=0aec27af-015e-1000-fac5-4e0f455a10fe] FlowFile StandardFlowFileRecord[uuid=b903eeb0-8985-4517-910f-5e3bbbccb8dc,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1503421928125-1, container=default, section=1], offset=376, length=47],offset=0,name=91708717370829,size=47] did not have valid JSON content.
Which condenses to [flowfile] did not have valid JSON content. The processor uses a strict validator and the input you're providing is not valid JSON. You'll need to use text manipulation or regexes to update this content to the following:
{"productName":"tv", "locationName":"chennai"}
Once you have accomplished this (via ReplaceText, etc.), the EvaluateJsonPath processor will work properly.
Also, to be clear, EvaluateJsonPath is designed to execute JSONPath expressions to extract JSON values to flowfile attributes. It is not designed to manipulate arbitrary text into JSON format.
Update
There is no universal process to convert arbitrary data to JSON. Given the specific input you provided, the following values for ReplaceText will convert this to valid JSON:
Search Value: (?<!\")(\w+)(?=[\s:])
Replacement Value: "$1"
Replacement Strategy: Regex Replace
Evaluation Mode: Entire text
If you get incoming data that is invalid in some other way, you'll have to modify this process. You may be interested in something like JSONLint to validate and format your incoming data.
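To see what the search pattern above does to the sample input, here is a small Python sketch that applies the same regex with `re.sub` and then parses the result. Python's `re` syntax is close to, but not identical with, the Java regex engine NiFi uses, and the helper name `quote_keys` is hypothetical.

```python
import json
import re

# Same pattern as the Search Value: a run of word characters that is not
# preceded by a double quote and is followed by whitespace or a colon.
KEY_PATTERN = re.compile(r'(?<!")(\w+)(?=[\s:])')


def quote_keys(text: str) -> str:
    """Wrap each bare key in double quotes, like the "$1" replacement value."""
    return KEY_PATTERN.sub(r'"\1"', text)


if __name__ == "__main__":
    fixed = quote_keys('{ productName : "tv", locationName: " chennai"}')
    print(fixed)
    print(json.loads(fixed))
```

The leading space inside " chennai" is preserved, since the pattern only touches keys. A value containing several words separated by spaces would also be partly matched by this pattern, so it is tailored to this input rather than general.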

Using flowfile content

New to NiFi!
I've split a flowfile into a single line of text using the SplitJson processor.
The NiFi flowfile contents are as follows:
abcdefg
I'd like to be able to take the text in the flowfile and either add it to a url to make a subsequent call using InvokeHTTP or add the contents of the flowfile as an attribute so I can make the subsequent call using InvokeHTTP like so
http://localhost/${my.newly.added.attribute}
How do I do this?
Any help would be appreciated!
Thanks in advance!
ExtractText will allow you to find sections of content and place them in an attribute on the FlowFile. For your example, you could capture the entirety of the content and assign it to an attribute my.newly.added.attribute. InvokeHTTP would then access it using Expression Language, as in your example.
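As a sketch of the idea, the Python below captures the whole content with a single capture group and appends it to the URL. The pattern `(?s)(^.*$)` is an assumption about what one might enter as the ExtractText property value (it is not given in the answer), and the percent-encoding step is an extra precaution that the answer does not mention.

```python
import re
from urllib.parse import quote

# Assumed pattern: (?s) lets "." match newlines, so the single capture group
# spans the entire content.
WHOLE_CONTENT = re.compile(r"(?s)(^.*$)")


def content_to_url(content: str, base_url: str = "http://localhost/") -> str:
    """Capture the full content as an 'attribute' and append it to the URL."""
    attribute_value = WHOLE_CONTENT.match(content).group(1)
    # Percent-encode so spaces or slashes in the content cannot break the URL.
    return base_url + quote(attribute_value, safe="")


if __name__ == "__main__":
    print(content_to_url("abcdefg"))
```

Inside NiFi, the comparable precaution would be to apply the `urlEncode()` Expression Language function to the attribute if the content may contain characters that are not URL-safe.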
