Nifi InvokeHTTP processor not triggering response relationship - apache-nifi

Consider the following flow which authenticates via HTTP to a service. I'm seeing an HTTP status code of 201 (created) come back, which should trigger the response relationship/flow. However as you can see in the log below, only the original relationship is triggered.
The Flow
Green lines indicate "response" flow. Magenta indicates "original" flow.
POST /token properties
Log
You can see here that the original relationship is triggered, but the response is not -- even though the status code, 201, is in the "success" range.
2023-01-29 15:22:08,341 INFO [Timer-Driven Process Thread-7] o.a.n.processors.standard.LogAttribute LogAttribute[id=fe0ace38-0185-1000-376d-8737d0e020f8] logging for flow file StandardFlowFileRecord[uuid=6b9f010a-f287-449c-8bef-94840c5cfa2f,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1674862641879-1, container=default, section=1], offset=13494, length=107],offset=0,name=6b9f010a-f287-449c-8bef-94840c5cfa2f,size=107]
---------------------ORIGINAL---------------------
FlowFile Properties
Key: 'entryDate'
Value: 'Sun Jan 29 15:22:07 UTC 2023'
Key: 'lineageStartDate'
Value: 'Sun Jan 29 15:22:07 UTC 2023'
Key: 'fileSize'
Value: '107'
FlowFile Attribute Map Content
Key: 'filename'
Value: '6b9f010a-f287-449c-8bef-94840c5cfa2f'
Key: 'invokehttp.request.duration'
Value: '738'
Key: 'invokehttp.request.url'
Value: '...'
Key: 'invokehttp.response.url'
Value: '...'
Key: 'invokehttp.status.code'
Value: '201'
Key: 'invokehttp.status.message'
Value: ''
Key: 'invokehttp.tx.id'
Value: 'efca13ac-16a1-4a27-a8e1-d04110d48523'
Key: 'mime.type'
Value: 'application/json'
Key: 'path'
Value: './'
Key: 'responseBody'
Value: '...'
Key: 'uuid'
Value: '6b9f010a-f287-449c-8bef-94840c5cfa2f'
---------------------ORIGINAL---------------------
The only thing I though of which might be causing an issue is that I'm writing the response body to an attribute. I tried to test by setting this attribute name to empty string but that just gives me an error in the log. I assumed that without the attribute name set, the response body would be the FlowFile sent to the response relationship, but that doesn't seem to be working.
Update: I created a second InvokeHTTP processor and replaced the relationships / disabled the old one. The flow worked correctly until I set the Response Body Attribute Name, and then the response relationship stopped triggering. I need to set this attribute though, so I can extract the error message from the response in the case of failure. I think I'll have to enable the Response Generation Required option, and check the status code in the response relationship flow. This is not ideal, though.

When you use Response Body Attribute Name, only original route is triggered. It's InvokeHTTP's behaviour, you can check documentation.
FlowFile attribute name used to write an HTTP response body for FlowFiles transferred to the Original relationship.
You can use this way for your problem,
InvokeHTTP (original route)-> RouteOnAttribute - (Success - ${invokehttp.status.code.ge(200):and(${invokehttp.status.code.le(299)})})
When you set Response Body Attribute Name attribute, it means that you don't want new flowfile, you want just add a new attribute to existing flowfile.

Related

How to fetch multiline with ExtractGrok processor in ApacheNifi?

I am going to convert a log file events (which is recorded by LogAttribute processor) to JSON.
I am using ExtractGrok with this configuration:
STACK pattern in pattern file is (?m).*
Each log has this format:
2019-11-21 15:26:06,912 INFO [Timer-Driven Process Thread-4] org.apache.nifi.processors.standard.LogAttribute LogAttribute[id=143515f8-1f1d-1032-e7d2-8c07f50d1c5a] logging for flow file StandardFlowFileRecord[uuid=02eb9f21-4587-458b-8cee-ad052cb8e634,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1574339166853-1, container=default, section=1], offset=0, length=0],offset=0,name=0df20cc1-3f93-49df-81b1-dac18318ccd9,size=0]
------------- request was received----------
Standard FlowFile Attributes
Key: 'entryDate'
Value: 'Thu Nov 21 15:26:06 AST 2019'
Key: 'lineageStartDate'
Value: 'Thu Nov 21 15:26:06 AST 2019'
Key: 'fileSize'
Value: '0'
FlowFile Attribute Map Content
Key: 'filename'
Value: '0df20cc1-3f93-49df-81b1-dac18318ccd9'
Key: 'http.context.identifier'
Value: '9552bd22-ec3b-4ada-93a9-a5ce9b27de25'
Key: 'path'
Value: './'
Key: 'uuid'
Value: '02eb9f21-4587-458b-8cee-ad052cb8e634'
-------------- request was received----------
I expect rest of the message after first line saved in log, but I get only first line:
-------------- request was received----------
I check the expression in Grok Debugger and it works. but it doesn't work with NiFi.
How to config ExtractGrok to get all lines in log value?
I found the solution, I replace (?m).* with this one (?s).* and it works.

How can I timestamp messages in nifi?

Disclaimer: I know absolutely nothing about nifi.
I need to receive messages from the ListenHTTP processor, and then convert each message into a timestamped json message.
So, say I receive the message hello world at 5 am. It should transform it into {"timestamp": "5 am", "message":"hello world"}.
How do I do that?
Each flowfile has attributes, which are pieces of metadata stored in key/value pairs in memory (available for rapid read/write). When any operation occurs, pieces of metadata get written by the NiFi framework, both to the provenance events related to the flowfile, and sometimes to the flowfile itself. For example, if ListenHTTP is the first processor in the flow, any flowfile that enters the flow will have an attribute entryDate with the value of the time it originated in the format Thu Jan 24 15:53:52 PST 2019. You can read and write these attributes with a variety of processors (i.e. UpdateAttribute, RouteOnAttribute, etc.).
For your use case, you could a ReplaceText processor immediately following the ListenHTTP processor with a search value of (?s)(^.*$) (the entire flowfile content, or "what you received via the HTTP call") and a replacement value of {"timestamp_now":"${now():format('YYYY-MM-dd HH:mm:ss.SSS Z')}", "timestamp_ed": "${entryDate:format('YYYY-MM-dd HH:mm:ss.SSS Z')}", "message":"$1"}.
The example above provides two options:
The entryDate is when the flowfile came into existence via the ListenHTTP processor
The now() function gets the current timestamp in milliseconds since the epoch
Those two values can differ slightly based on performance/queuing/etc. In my simple example, they were 2 milliseconds apart. You can format them using the format() method and the normal Java time format syntax, so you could get "5 am" for example by using h a (full example: now():format('h a'):toLower()).
Example
ListenHTTP running on port 9999 with path contentListener
ReplaceText as above
LogAttribute with log payload true
Curl command: curl -d "helloworld" -X POST http://localhost:9999/contentListener
Example output:
2019-01-24 16:04:44,529 INFO [Timer-Driven Process Thread-6] o.a.n.processors.standard.LogAttribute LogAttribute[id=8246b0a0-0168-1000-7254-2c2e43d136a7] logging for flow file StandardFlowFileRecord[uuid=5e1c6d12-298d-4d9c-9fcb-108c208580fa,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1548374015429-1, container=default, section=1], offset=3424, length=122],offset=0,name=5e1c6d12-298d-4d9c-9fcb-108c208580fa,size=122]
--------------------------------------------------
Standard FlowFile Attributes
Key: 'entryDate'
Value: 'Thu Jan 24 16:04:44 PST 2019'
Key: 'lineageStartDate'
Value: 'Thu Jan 24 16:04:44 PST 2019'
Key: 'fileSize'
Value: '122'
FlowFile Attribute Map Content
Key: 'filename'
Value: '5e1c6d12-298d-4d9c-9fcb-108c208580fa'
Key: 'path'
Value: './'
Key: 'restlistener.remote.source.host'
Value: '127.0.0.1'
Key: 'restlistener.remote.user.dn'
Value: 'none'
Key: 'restlistener.request.uri'
Value: '/contentListener'
Key: 'uuid'
Value: '5e1c6d12-298d-4d9c-9fcb-108c208580fa'
--------------------------------------------------
{"timestamp_now":"2019-01-24 16:04:44.518 -0800", "timestamp_ed": "2019-01-24 16:04:44.516 -0800", "message":"helloworld"}
So, I added an ExecuteScript processor with this code:
import org.apache.commons.io.IOUtils
import java.nio.charset.StandardCharsets
import java.time.LocalDateTime
flowFile = session.get()
if(!flowFile)return
def text = ''
// Cast a closure with an inputStream parameter to InputStreamCallback
session.read(flowFile, {inputStream ->
text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
// Do something with text here
} as InputStreamCallback)
def outputMessage = '{\"timestamp\":\"' + LocalDateTime.now().toString() + '\", \"message:\":\"' + text + '\"}'
flowFile = session.write(flowFile, {inputStream, outputStream ->
text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
outputStream.write(outputMessage.getBytes(StandardCharsets.UTF_8))
} as StreamCallback)
session.transfer(flowFile, REL_SUCCESS)
and it worked.

How to get hold of the content?

I have spent several hours now trying to figure out the expression language to get hold of the flowfile content.
Have a simple test flow to try and learn Nifi where I have:
GetMongo -> LogAttributes -> Put Slack
-----------------------LOG1-----------------------
Standard FlowFile Attributes
Key: 'entryDate'
Value: 'Wed Sep 28 23:58:36 GMT 2016'
Key: 'lineageStartDate'
Value: 'Wed Sep 28 23:58:36 GMT 2016'
Key: 'fileSize'
Value: '70'
FlowFile Attribute Map Content
Key: 'filename'
Value: '43546945658800'
Key: 'path'
Value: './'
Key: 'uuid'
Value: 'd1e10623-0e90-44af-a620-6bed9776ed62'
-----------------------LOG1-----------------------
{ "_id" : { "$oid" : "57ec27ec35a0759d54fb465d" }, "keyA" : "valueA" }
In the putSlack expression for test I have tried:
${flowfile.content}
${message}
${payload}
${msg}
${flowfile-content}
${content}
There is no expression language that accesses the content of the flow file. The attributes and the content are purposely stored very differently in order to facilitate moving around a Flow File that could represent a large payload. Expression language operates on attributes only.
The ExtractText processor can be used to extract the whole content of the Flow File into an attribute, just keep in mind that should only be done when you know the content will have no problem fitting in memory.

API Blueprint and Dredd - Required field missing from response, but tests still pass

I am using a combination of API Blueprint and Dredd to test an API my application is dependent on. I am using attributes in API blueprint to define the structure of the response's body.
Apparently I'm missing something though because the tests always pass even though I've purposefully defined a fake "required" parameter that I know is missing from the API's response. It seems that Dredd is only testing whether the type of the response body (array) rather than the type and the parameters within it.
My API Blueprint file:
FORMAT: 1A
HOST: http://somehost.net
# API Title
## Endpoints [GET /endpoint/{date}]
+ Parameters
+ date: `2016-09-01` (string, required) - Date
+ Response 200 (application/json; charset=utf-8)
+ Attributes (array[Data])
## Data Structures
### Data
- realParameter: 2432432 (number)
- realParameter2: `some string` (string, required)
- realParameter3: `Something else` (string, required)
- realParameter4: 1 (number, required)
- fakeParam: 1 (number, required)
The response body:
[
{
"realParameter": 31,
"realParameter2": "some value",
"realParameter3": "another value",
"realParameter4": 8908
},
{
"realParameter": 54,
"realParameter2": "something here",
"realParameter3": "and here too",
"realParameter4": 6589
}
]
And my Dredd config file:
reporter: apiary
custom:
apiaryApiKey: somekey
apiaryApiName: somename
dry-run: null
hookfiles: null
language: nodejs
sandbox: false
server: null
server-wait: 3
init: false
names: false
only: []
output: []
header: []
sorted: false
user: null
inline-errors: false
details: false
method: []
color: true
level: info
timestamp: false
silent: false
path: []
blueprint: myApiBlueprintFile.apib
endpoint: 'http://ahost.com'
Does anyone have any idea why Dredd ignores the fact that "fakeParameter" doesn't actually show up in the response body and still allows the test to pass?
You've run into a limitation of MSON, the language API Blueprint uses for describing attributes. In many cases, MSON describes what MAY be present in the data structure rather than what MUST exactly be present.
The most prominent case are arrays, where basically any content of the array is optional and thus the underlying generated JSON Schema doesn't put any constraints on array contents. Dredd just respects that, so indirectly it becomes a Dredd issue too, however there's not much Dredd can do about it.
There's an issue for the problem: apiaryio/mson#66 You can follow and comment under the issue to get updated about this. Dredd is usually very prompt in getting the latest API Blueprint parser, so once it's implemented in the language itself, it won't take long to appear in Dredd.
Obvious (but tedious) workaround is to specify your own JSON Schema with stricter rules using the + Schema section alongside the + Attributes section.

How to get HTTP StatusCodes in ember-data

When I invoke
App.store.createRecord(App.User, { name: this.get("name") });
App.store.commit();
how do I know if its successful and how to wait for the asyn message?
Very limited error handling was recently added to DS.RESTAdapter in ember-data master.
When creating or updating records (with bulk commit disabled) and a status code between 400 and 599 is returned, the following will happen:
A 422 Unprocessable Entity will transition the record to the "invalid" state and will add any errors returned from the server to the record's errors property.
The adapter assumes the server will respond with JSON in the following format:
{
errors: {
name: ["can't be blank"],
password: ["must be at least 8 characters", "must contain a number"]
{
}
(The error messages themselves could be arrays of strings or just strings. ember-data doesn't currently care which.)
To detect this state:
record.get('isValid') === false
All other status codes will transition the record to the "error" state.
To detect this state, use:
record.get('isError') === true
More cases may eventually be handled by ember-data out of the box, but for the moment if you need something specific, you'll have to extend DS.RESTAdapter, customizing its didError function to add it in yourself.

Resources