Removing Special Characters from JSON in NIFI - apache-nifi

I have JSON input for a NiFi flow that contains some special characters. Could someone help me with how to remove the special characters from the following payload? We need only the value itself, without the array brackets and the extra double quotes.
Input JSON:
{
  "TOT_NET_AMT": "["55.00"]",
  "H_OBJECT": "File",
  "H_GROSS_AMNT": "["55.00"]",
  "TOT_TAX_AMT": "[9.55]"
}
Expected result:
{
  "TOT_NET_AMT": "55.00",
  "H_OBJECT": "File",
  "H_GROSS_AMNT": "55.00",
  "TOT_TAX_AMT": "9.55"
}
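For illustration, here is a minimal plain-Python sketch of the intended clean-up. One possible NiFi counterpart (an assumption, not confirmed by the question) would be a ReplaceText processor performing the same regex replacement:

import json
import re

# The posted payload is not valid JSON (values like "["55.00"]" break the quoting),
# so treat it as text and strip the stray brackets/quotes around the values.
raw = '''{
"TOT_NET_AMT": "["55.00"]",
"H_OBJECT": "File",
"H_GROSS_AMNT": "["55.00"]",
"TOT_TAX_AMT": "[9.55]"
}'''

# Turn "["55.00"]" and "[9.55]" into "55.00" and "9.55".
cleaned = re.sub(r'"\["?([^"\]]+)"?\]"', r'"\1"', raw)
print(json.dumps(json.loads(cleaned), indent=2))  # now valid JSON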

Related

How to extract value from serialized json response in Jmeter

I am getting a response in the form of serialized JSON for an API request, as below:
{"Data":"{\"orderId\":null,\"Tokens\":{\"Key\":\"abcdefgh123456\",\"Txnid\":\"test_5950\"}","success":true,"Test":"success"}
I want to extract the Key value in JMeter and use it in the next request. Can someone help me with extracting the value?
Your JSON seems incorrect. The valid JSON should be like:
{
  "Data": {
    "orderId": null,
    "Tokens": {
      "Key": "abcdefgh123456",
      "Txnid": "test_5950"
    },
    "success": true,
    "Test": "success"
  }
}
Add a JSON Extractor to the request from which you want to extract the Key value:
assign a variable name, e.g. key
set the JSON Path Expression to: $.Data.Tokens.Key
use the extracted value as ${key} in the next request.
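For reference, a minimal plain-Python sketch (not JMeter) of what that path resolves to, using the corrected JSON above:

import json

response = json.loads("""
{
  "Data": {
    "orderId": null,
    "Tokens": {"Key": "abcdefgh123456", "Txnid": "test_5950"},
    "success": true,
    "Test": "success"
  }
}
""")

# Plain-Python equivalent of the JSON Path Expression $.Data.Tokens.Key
key = response["Data"]["Tokens"]["Key"]
print(key)  # abcdefgh123456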
If your JSON really looks exactly like what you posted, the most suitable Post-Processor would be the Regular Expression Extractor.
The relevant regular expression would be something like:
"Key\\?"\s*:?\s*\\?"(\w+)\\?"
where:
\\? - an optional backslash (to cover the escaped quotes)
\s* - arbitrary number of whitespaces (just in case)
\w - matches a "word" character (alphanumeric plus underscore)
+ - repetition
() - grouping
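As a sanity check, here is a minimal Python sketch (using Python's re module rather than JMeter, purely for illustration) applying the same pattern to the response as posted:

import re

# Raw response body as posted: the "Data" value is an escaped JSON string.
body = ('{"Data":"{\\"orderId\\":null,\\"Tokens\\":{\\"Key\\":\\"abcdefgh123456\\",'
        '\\"Txnid\\":\\"test_5950\\"}","success":true,"Test":"success"}')

# Same expression as in the Regular Expression Extractor above.
match = re.search(r'"Key\\?"\s*:?\s*\\?"(\w+)\\?"', body)
print(match.group(1))  # abcdefgh123456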
More information:
Using RegEx (Regular Expression Extractor) with JMeter
Perl 5 Regex Cheat sheet
JMeter: Regular Expressions

How to remove first few lines of CSV in Logstash

This is the input I am using for Logstash:
ItemId,AssetId,ItemName,Comment
11111,07,ABCDa,XYZa
11112,07,ABCDb,XYZb
11113,07,ABCDc,XYZc
11114,07,ABCDd,XYZd
11115,07,ABCDe,XYZe
11116,07,ABCDf,XYZf
11117,07,ABCDg,XYZg
Date,Time,Mill Sec,rows,columns
19-05-2020,13:03:46,534,2,2
19-05-2020,13:03:46,539,2,2
19-05-2020,13:03:46,544,2,2
19-05-2020,13:03:46,549,2,2
19-05-2020,13:03:46,554,2,2
I need to remove the first 8 lines from the CSV, make the next line the column header, and parse the rest of the lines as usual. Is there a way to do that in Logstash?
You could do this by using the file input, reading the file line by line, and using grok to make sure each line has the right number of comma-separated fields while ignoring the header lines.
Your input will look like this:
input {
  file {
    path => "/path/to/my.csv"
    start_position => "beginning"
  }
}
This will read each line into an event with the data in the field named message and then send it to your filters.
In your filter you'll use grok with a pattern like this:
filter {
  grok {
    match => { "message" => [
        "^%{DATE:Date},%{TIME:Time},%{NUMBER:Mill_Sec},%{NUMBER:rows},%{NUMBER:columns}$"
      ]
    }
  }
}
This will present each line as an event looking like this:
{
  "columns": "2",
  "Time": "13:03:46",
  "Mill_Sec": "554",
  "rows": "2",
  "Date": "19-05-2020"
}
You can use mutate to remove unwanted fields (like message) before the event goes to your output. If a line does not match the defined pattern, the event gets a _grokparsefailure value in its tags field, which you can use to decide whether to send it to your output. Since the pattern requires numbers, the header line will also fail to match, leaving you with only 'real' events.
This can be done by having your output defined like this:
output {
  if "_grokparsefailure" not in [tags] {
    elasticsearch {
      ...
    }
  }
}
You should do this before the file gets to Logstash. There are ways to do it within Logstash, for example by using a multiline codec and then doing exotic grok matches to remove the first N lines (or removing lines until a particular regex), then doing a split followed by a plain ol' csv filter. You need to be even more careful than usual with header rows. It's a big mess.
Much better to put something in front of Logstash to handle this issue.
If the files are local to your Logstash instance, you could use the Exec input plugin to deal with the irregularities.
input {
  exec {
    command => "/path/to/command_or_script" # sh or py or js etc
    interval => 60
  }
}
On Linux, this command will print the file from the 9th line on, dropping the first 8 lines:
command => "tail -n +9 /path/to/file"
This one (again for Linux) will drop everything before the line that starts with Date, and print that line and everything after it:
command => "sed -n -e '/^Date/,$p' /path/to/file"
You can avoid reading the same file over and over again by deleting or archiving it in a script (rather than the one-liners used in these examples); a short sketch of such a script is shown below, after the notes on the csv filter.
After trimming the unwanted leading lines, you should be able to use the csv filter in a normal way.
Note that if you want to use autodetect_column_names, pipeline workers must be set to 1.
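As a concrete illustration of cleaning the file before Logstash reads it, here is a minimal Python sketch; the paths and the "Date," marker are assumptions based on the sample data, not part of the original answer:

# Drop everything before the second header line and write a clean CSV copy.
SRC = "/path/to/original.csv"
DST = "/path/to/clean.csv"

with open(SRC) as src, open(DST, "w") as dst:
    keep = False
    for line in src:
        # Start keeping lines at the second header ("Date,Time,Mill Sec,rows,columns").
        if line.startswith("Date,"):
            keep = True
        if keep:
            dst.write(line)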
Your content is not in CSV format. Your task is to convert it to true CSV format.

EvaluateJsonPath in NiFi returns empty string

The output of ExecuteSqlRecord is fed to EvaluateJsonPath, which returns an empty string.
Output of ExecuteSqlRecord:
[
  {
    "X_LAST_DAY": "1618459200000",
    "X_FIRST_DAY_3MON_PREV": "1610427600000",
    "X_FIRST_DAY_1MON_PREV": "1615525200000",
    "X_LAST_DAY_1MON_PREV": "1617163200000"
  }
]
The attribute values come back as 'Empty String Set'. Why are they empty, and what am I doing wrong?
In EvaluateJsonPath I have also tried different options, such as setting the following:
Return Type - auto-detect
Null Value Representation - empty string
Your data is an array, which should be split. Use SplitJson before EvaluateJsonPath.
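To see why the paths come back empty, here is a minimal plain-Python sketch (not NiFi) of the same situation:

import json

# Output of ExecuteSqlRecord: the record is wrapped in a JSON array.
payload = '[{"X_LAST_DAY": "1618459200000", "X_LAST_DAY_1MON_PREV": "1617163200000"}]'

data = json.loads(payload)
print(type(data))  # <class 'list'>: the root is an array, not an object,
                   # so an object-level path such as $.X_LAST_DAY finds nothing.

# After SplitJson, each FlowFile holds a single element of the array,
# and the same object-level path resolves:
record = data[0]
print(record["X_LAST_DAY"])  # 1618459200000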

Related to the previous question, how can we format a string-escaped value?

Related to the previous question
How to make _source field dynamic ?
I was able to make the search template _source field dynamic from the front-end, but due to the invalid JSON format I had to turn it into a string, which is very hard to read. Is there any way to make it readable? I tried adding \ after each new line (as suggested for Ruby) but could not get it working.
"source": "{\"query\":{\"bool\":{\"must\":{\"match\":{\"line\":\"{{text}}\"}},\"filter\":{{{#line_no}}\"range\":{\"line_no\":{{{#start}}\"gte\":\"{{start}}\"{{#end}},{{/end}}{{/start}}{{#end}}\"lte\":\"{{end}}\"{{/end}}}}{{/line_no}}}}}}"
This is the query string, which is saved in a YML file.
I tried a Ruby multiline string, but it still gives a parsing error.
I have created a template.yml file and stored the template as given below:
template: |
{
"script": {
"lang": "mustache",
"source": '{'\
'"_source": {{#toJson}}fields{{/toJson}}'\
'}'\
}
}
I also tried replacing them with double quotes, and backticks are still not helping.
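One readability option worth trying is a YAML literal block scalar, which still loads as a single string; whether your tooling accepts the template in that form is an assumption. A minimal Python/PyYAML sketch of the idea:

import yaml  # PyYAML, used here only to show how the block scalar is parsed

# Hypothetical template.yml: a literal block scalar (|) keeps the mustache/JSON
# source readable while still loading as one plain string.
doc = """
template:
  source: |
    {
      "query": {
        "bool": {
          "must": { "match": { "line": "{{text}}" } }
        }
      }
    }
"""

loaded = yaml.safe_load(doc)
print(type(loaded["template"]["source"]))  # <class 'str'>
print(loaded["template"]["source"])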

Apache NiFi EvaluateJsonPath starts with $ (dollar sign)

How would I manage to set the extraction path for a node that is named $?
I have this JSON, and I have tried to escape it like $$, but I get nothing.
{
  "Name": "Bla",
  "$": "A"
}
Any ideas?
According to the JSONPath expressions syntax, it is possible to access keys with dot-notation or bracket-notation.
Bracket-notation should allow you to access keys containing non-word characters like ., $, etc.
Assume you have this JSON:
{
  "the.name": "boo",
  "$": "foo"
}
In this case, to access the key "the.name" you have to use bracket-notation:
$['the.name']
The same idea applies to the "$" key:
$['$']
