I have the following business need. Can anybody please suggest the NiFi workflow I should create? Thanks.
1) Through Kafka I get metadata as a JSON object. This JSON object carries an image or video in binary format, and this binary payload is pretty huge.
2) I need to extract the binary data and send it to an HTTP REST endpoint (POST).
In my mind I have the following workflow:
ConsumeKafka ==> EvaluateJsonPath ==> UpdateAttribute ==> InvokeHTTP
Explanation:
1) ConsumeKafka will receive the metadata as a JSON object.
2) EvaluateJsonPath will extract the content JSON attribute, which holds the image or video data stored as Base64.
3) UpdateAttribute will update the flowfile to insert the POST payload.
4) InvokeHTTP will invoke the HTTP POST REST call.
I am not sure whether such huge data will be handled by InvokeHTTP.
Your flow should be like this:
ConsumeKafka
EvaluateJsonPath (destination=flowfile-content) puts the evaluated Base64 string into the flowfile content
Base64EncodeContent (mode=decode) decodes the Base64 content into raw binary
InvokeHTTP sends everything in the content as the request body
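If it helps to see the logic outside NiFi, here is a minimal Python sketch of what this flow does end to end; the topic name, broker address, endpoint URL, and the filedata field name are all assumptions, so adjust them to your schema. Large content is fine in principle, since NiFi streams flowfile content rather than holding it all in memory.

import base64
import json

import requests  # assumed: pip install requests
from kafka import KafkaConsumer  # assumed: pip install kafka-python

# ConsumeKafka: read the metadata JSON from the topic (names are made up)
consumer = KafkaConsumer("image-metadata", bootstrap_servers="localhost:9092")
for message in consumer:
    metadata = json.loads(message.value)
    # EvaluateJsonPath (destination=flowfile-content) + Base64EncodeContent (decode)
    binary = base64.b64decode(metadata["filedata"])
    # InvokeHTTP: the decoded bytes are sent as the POST body
    requests.post("http://example.com/upload", data=binary,
                  headers={"Content-Type": "application/octet-stream"})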
Related
I have to send two fields, Id and Ownerid, from a flat file to an HTTP transformation which hits a web service (REST API). They need the data to be sent in the JSON format below:
[
{
"Id":"000xxxvvbnh",
"Ownerid":"xxxvvv1b5dmk"
}
]
How do I pass these two fields in JSON format as one request to the web service?
I also need to create multiple sessions doing the same operation in parallel, hitting the web service. Should the target be the web service itself, or do we need to create a target flat file to capture the success or failure response?
Use an HTTP transformation to send the JSON data to the API.
First of all, read the data using a Source Qualifier (SQ). Using an Expression transformation, remove the [ and ] brackets.
Then create an HTTP transformation with one input port, inp_id_ownerid (string), and the default content type, and attach the Expression transformation's output to this input.
The HTTP transformation will create a default output port, HTTPOUT, which you can use to capture the API's response data.
Specify the API URL correctly; test the URL with Swagger first if needed.
The mapping should look like this:
SQ --> EXP --> HTTP --> EXP --> TGT
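For reference, the request each row ends up producing is roughly equivalent to this Python sketch; the API URL is a made-up placeholder, and the Id/Ownerid values are the fields read from the flat file:

import json

import requests  # assumed: pip install requests

def send_row(id_value, owner_id):
    # Build the [{"Id": ..., "Ownerid": ...}] body the API expects
    body = json.dumps([{"Id": id_value, "Ownerid": owner_id}])
    resp = requests.post("https://api.example.com/records", data=body,
                         headers={"Content-Type": "application/json"})
    return resp.text  # the equivalent of HTTPOUT, written to the target

print(send_row("000xxxvvbnh", "xxxvvv1b5dmk"))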
I want to report some information to an API using attributes of each flowfile. Sometimes the API returns some important information in JSON. My goal is to update the attributes of the original flowfile with the new data that the API returns.
My sketch of a strategy to update the flowfile: AttributesToJSON (but the entire content of the flowfile is replaced by the JSON, the first problem) -> InvokeHTTP to send the information to the API -> the API returns a JSON with some data -> extract some data from the JSON with some processor and update the attributes of the flowfile.
First problem: I can split the flow into the original flowfile and another copy (to modify with AttributesToJSON). But how can I merge them later? Which processor do I need to combine the original flowfile with the "new" attributes that I build from the API's response?
Perhaps I can save the original file in a directory with PutFile, process the info along another path, and at some point use FetchFile (with attributes recording where the file is saved) to bring the data and attributes back together.
Extra: can I send an InvokeHTTP POST request with only the attributes (one of them written as JSON)?
You may want to take a look at the lookup processors -- LookupAttribute and LookupRecord. These processors allow you to enrich the existing flowfile with additional information.
It looks like right now, the RestLookupService is available for record enrichment but not attribute enrichment. You may want to file a Jira requesting this, and in the meantime you can use SimpleScriptedLookupService to make an HTTP invocation from that processor.
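This is not the NiFi scripting interface itself, just a minimal Python sketch of the lookup logic such a scripted service would perform, assuming a hypothetical enrichment endpoint:

import requests  # assumed: pip install requests

def lookup(coordinates):
    # Call the REST API with the flowfile's key attribute(s) and return
    # the fields to merge back onto the flowfile as new attributes.
    resp = requests.get("https://api.example.com/enrich",
                        params=coordinates, timeout=10)
    resp.raise_for_status()
    return resp.json()  # e.g. {"status": "approved", "owner": "team-a"}

print(lookup({"id": "12345"}))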
I get JSON data through a Kafka broker.
The data are in the following format, and the image data is encoded in Base64:
e.g. {"filename":"test.jpg","filedata":"/9j/4AAQSkZJRgABAQEAYABgA....."}
I want to save the image data that I received through Kafka as a file.
However, it is not working properly.
Below is the order in which I built the flow; only the key settings are described.
ConsumeKafka_2_0 processor
EvaluateJsonPath processor
Destination: flowfile-content
rawbytes: $.filedata
EvaluateJsonPath processor (error: did not have valid JSON content)
Destination: flowfile-attribute
filename: $.filename
Base64EncodeContent processor
PutFile processor
When the flow is executed, the image file is saved normally, but the file name cannot be set. What should I do?
Do you have any sites or examples to refer to?
The site of reference is https://community.hortonworks.com/articles/218015/ingesting-binary-files-like-pdf-jpg-png-to-hbase-w.html
According to the PutFile documentation:
Reads Attributes filename: The filename to use when writing the FlowFile to disk.
You just need to use the UpdateAttribute processor to set a value for the filename attribute.
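For example (a minimal sketch; the naming pattern here is just an illustration), add a property named filename to UpdateAttribute with a value such as image_${now():format('yyyyMMddHHmmssSSS')}.jpg, and PutFile will write each flowfile under that generated name.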
From the question I understood that there is a Kafka topic which has the filename and Base64-encoded file content in JSON format; you want to consume the topic, decode the file content from Base64 to reconstruct the image, and store the image under that filename using PutFile.
I came up with a flow that achieves this requirement and is self-explanatory.
ConsumeKafkaRecord_2_0 (consumes {"filename":"test.jpg","filedata":"/9j/4AAQSkZ.."})
EvaluateJsonPath
Destination: flowfile-attribute
rawtypes: $.filedata
filename: $.filename
ReplaceText (changes the flow file content to the encoded image content for the next processor)
Base64EncodeContent (decodes rawtypes into the image)
UpdateAttribute (the file name under which to store the image is set here)
PutFile
I was unable to upload the flow template here, so I posted screenshots of the key processors instead: EvaluateJsonPath, ReplaceText (note the replacement value), and UpdateAttribute.
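The core of what this flow does can be sketched in a few lines of Python; the sample record is the truncated one from the question, so the decoded bytes are only a fragment of the real image:

import base64
import json

# The truncated sample record from the question
record = json.loads('{"filename":"test.jpg","filedata":"/9j/4AAQSkZJ"}')

# Base64EncodeContent (decode) + PutFile, named via the filename attribute
with open(record["filename"], "wb") as f:
    f.write(base64.b64decode(record["filedata"]))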
In step #2 you have replaced the flow file content with the value of $.filedata, which is no longer JSON, so you can't use EvaluateJsonPath in step 3 since there is no more JSON.
If you reverse steps 2 and 3, you can extract the filename to an attribute while you still have the original JSON in the flow file content, and then extract the filedata to the content.
Before posting this question about Apache NiFi InvokeHTTP I went through all the other questions and their answers, but I am still unsure of the best flow I should have. My situation is as below:
1) From Apache Kafka, I get raw metadata.
2) Using EvaluateJsonPath I get the attribute I want.
3) Using RouteOnAttribute I created 3 routes based on the attribute value I got from step 2 above.
4) Now, based on the attribute value, I want to decide whether I should go for GET, POST, or DELETE.
5) My question is: where/how do I set the POST message body? The GET message? The DELETE message body?
6) I am able to set the URL in the configuration provided by InvokeHTTP, but for the message body I don't know which property to use; or is it set in the flow file content using ReplaceText?
I read somewhere that before you divert your RESTful POST HTTP request to InvokeHTTP, you must have another processor before it which changes the content of the flow file.
Ref: Configuring HTTP POST request from Nifi
Please help. Thanks.
Regards,
Yeshwant
Adding on to what Bryan explained, POST will use the FlowFile content as the message body, so if you have some other data which you want to wipe or transform and then send as the message body, you can leverage the following processors:
ExtractText to read data from the existing FlowFile content
ReplaceText to erase the existing content of the FlowFile and replace it with different content
To set the headers for the REST calls, InvokeHTTP has the Attributes to Send property, which takes a regex that is matched against the incoming FlowFile's attributes; whichever attributes match are sent as HTTP headers.
To add a new attribute to your existing FlowFile, you can use UpdateAttribute.
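As a rough Python sketch of that header behavior (the regex, attribute names, and URL are made up):

import re

import requests  # assumed: pip install requests

attributes = {"x-request-id": "123", "filename": "test.jpg"}
# Attributes to Send = x-request-.* : only the matching attributes become headers
headers = {k: v for k, v in attributes.items() if re.fullmatch(r"x-request-.*", k)}
requests.post("https://api.example.com/upload", data=b"flowfile content", headers=headers)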
For a POST, the body will be whatever is in the flow file content.
A GET or DELETE typically wouldn't have a body, since the information would usually be provided in the URL or query params.
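Sketched in Python with a hypothetical endpoint, the three request shapes look like this:

import requests  # assumed: pip install requests

requests.post("https://api.example.com/items", json={"name": "foo"})  # POST: the body carries the data
requests.get("https://api.example.com/items/42")     # GET: the id rides in the URL
requests.delete("https://api.example.com/items/42")  # DELETE: usually no body either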
I need to perform an HTTP Post from NiFi, but I don't want/need the request to carry all of the FlowFile's content.
Is there some way to pass attributes of a FlowFile but not the full content?
If the request body of your HTTP POST is JSON, you can use the AttributesToJSON processor, which allows you to pick which attributes you want to include in the resulting JSON. You can then configure the processor so the resulting JSON overwrites the existing flowfile content.
Keep in mind that the resulting JSON will be flat, so you may need to transform it to the expected format. For that, you can use the JoltTransformJSON processor.
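As an illustration of that reshaping step, here is the idea in plain Python; the attribute names and target shape are made up, and JoltTransformJSON would do the same thing declaratively with a shift spec:

import json

flat = {"user_id": "42", "user_name": "ada"}  # what AttributesToJSON produces
nested = {"user": {"id": flat["user_id"], "name": flat["user_name"]}}  # what the API expects
print(json.dumps(nested))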
An example screenshot of what your dataflow might look like was attached. I hope this helps!