Apache NiFi EvaluateJsonPath: key starts with $ (dollar sign) - apache-nifi

How would I manage to set the extraction path of a node that is named $?
I have this JSON, and I have tried to escape it like $$, but I get nothing.
{
  "Name" : "Bla",
  "$" : "A"
}
Any ideas?

According to the JSONPath expression syntax, it's possible to access keys with dot notation or bracket notation. Bracket notation allows you to access keys containing non-word characters like ., $, etc.
Assume you have this JSON:
{
  "the.name": "boo",
  "$": "foo"
}
In this case, to access the key "the.name" you have to use bracket notation:
$['the.name']
The same idea applies to the "$" key:
$['$']
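In EvaluateJsonPath itself, the expression goes into a dynamic property whose name becomes the attribute that receives the extracted value. A minimal sketch of the processor configuration, assuming extraction to an attribute (the property name dollar is an arbitrary choice, not something from the question):
Destination: flowfile-attribute
dollar: $['$']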

Related

Removing Special Characters from JSON in NiFi

I have JSON input for a NiFi flow that contains some special characters. Could someone help me remove the special characters from the following payload? We need only the values, without the array brackets and extra double quotes.
Input JSON:
{
  "TOT_NET_AMT": "["55.00"]",
  "H_OBJECT": "File",
  "H_GROSS_AMNT": "["55.00"]",
  "TOT_TAX_AMT": "[9.55]"
}
Expected result:
{
  "TOT_NET_AMT": "55.00",
  "H_OBJECT": "File",
  "H_GROSS_AMNT": "55.00",
  "TOT_TAX_AMT": "9.55"
}
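The rewrite itself is a small pair of regex substitutions: strip the [ and ] brackets together with the stray quotes around them. A minimal Ruby sketch of that idea (the patterns are an assumption derived from the sample payload; within NiFi, the same substitutions could be applied with a ReplaceText processor, treating the payload as plain text, since the input above is not even valid JSON):
# Strip `["` / `[` openers and `]"` / `]` closers, leaving one pair of quotes.
input = '{"TOT_NET_AMT": "["55.00"]", "H_OBJECT": "File", "H_GROSS_AMNT": "["55.00"]", "TOT_TAX_AMT": "[9.55]"}'
cleaned = input.gsub(/"?\[\s*"?/, '"').gsub(/"?\s*\]"?/, '"')
puts cleaned
# => {"TOT_NET_AMT": "55.00", "H_OBJECT": "File", "H_GROSS_AMNT": "55.00", "TOT_TAX_AMT": "9.55"}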

Ruby string interpolation with substitution

I have a given method that adds keys to URLs with:
url % {:key => key}
But for one URL I need the key to be escaped with CGI.escape. I cannot change the method, I can only change the URL, but substitution does not work:
"https://www.example.com?search=#{CGI.escape(%{key})}"
Is there a way to achieve this only by changing the URL string? I cannot use additional variables or change the method, so I cannot do the escaping in the method and send the escaped key to the URL string.
It isn't clear how your given method is supposed to work. Can you give an example where the method works, and one where it doesn't? Ignoring the method part of your question and focusing on the URL bit:
>> key = "Baby Yoda"
=> "Baby Yoda"
>> %{key}
=> "key"
is the expected result, regardless of whether you have a variable named key set to any value. See: https://en.wikibooks.org/wiki/Ruby_Programming/Syntax/Literals#The_.25_Notation
Unless you have a method defined which overloads '%' to do something else special for URLs, but that isn't clear in your question.
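For what it's worth, Ruby's built-in String#% already performs named substitution when the format string contains a literal %{key} sequence, which is presumably how the asker's method works:
>> "https://www.example.com?search=%{key}" % { :key => "Baby Yoda" }
=> "https://www.example.com?search=Baby Yoda"
Note the difference: here %{key} is literal text inside a double-quoted string, not the %-literal from the snippet above.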
If you just want to CGI escape the value of 'key' within your URL string, don't use the percent notation:
>> key = 'Baby Yoda'
=> "Baby Yoda"
>> "https://www.example.com?search=#{CGI.escape(key)}"
=> "https://www.example.com?search=Baby+Yoda"
It just seems not to be possible. I worked around it by defining a ${...} syntax:
"https://www.example.com?search=${CGI.escape(%{key})}"
Then I first do the substitution of %{key} and then use eval to do the CGI.escape (or any method, for that matter) with
gsub(/\${(.+?)}/) { |e| eval($1) }
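Putting the pieces together, a runnable sketch of that workaround (the single quotes added around %{key} are an assumption; without them, eval would see Baby Yoda as bare code rather than a string literal):
require 'cgi'

url = "https://www.example.com?search=${CGI.escape('%{key}')}"

# Pass 1: the given method's substitution fills in %{key}.
filled = url % { :key => "Baby Yoda" }
# => "https://www.example.com?search=${CGI.escape('Baby Yoda')}"

# Pass 2: evaluate each ${...} chunk.
result = filled.gsub(/\${(.+?)}/) { eval($1) }
# => "https://www.example.com?search=Baby+Yoda"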

Elasticsearch - text type regexp

Does Elasticsearch support regex search on a text-type string?
I created a document like the one below.
{
  "T": "a$b$c$d"
}
and I tried to search for this document with the query below.
{
  "query": {
    "query_string": {
      "query": "T:/a.*/"
    }
  }
}
That works for me, but when I try the query with the '$' symbol, it is unable to find the document.
{
  "query": {
    "query_string": {
      "query": "T:/a$.*/"
    }
  }
}
What should I do to find the document? This field has to be of type text (not keyword), since the data can be longer than the keyword maximum length.
You should be aware of a few things here:
If your field is analyzed (and tokenized in the process) you will only find matches in fields containing a token (not the whole "text") that matches your RegExp. If you want the whole content of the field to match, you must use a keyword field or at least a Keyword Analyzer that doesn't tokenize your text.
The $ symbol has a special meaning in Regular Expressions (it marks the end of a string), so you'll have to escape it: a\$.*
Your RegExp must match a whole token to get a hit. That's why there's no point in using $ as a (non-escaped) RegExp symbol: your RegExp must match a whole token from beginning to end anyway. So (to stick to your example) to match fields where a is followed by c, you'd need .*?a[^c]*c.*, or if you need the $s in there, escape them: .*?a\$[^c]*c\$.*
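Note that the escape also has to survive JSON string parsing, so in an actual query body the backslash itself is doubled. A sketch against the example document (whether it matches still depends on how the field was analyzed, per the first point above):
{
  "query": {
    "query_string": {
      "query": "T:/a\\$.*/"
    }
  }
}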

Elasticsearch escape hyphenated field in groovy script

I am attempting to add a field to a document, doing something similar to https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update.html#_scripted_updates. However, I appear to be running into issues due to the field being hyphen-separated (the hyphen appears to be treated as a minus sign) as opposed to underscore-separated.
Example body below:
{"script":"ctx._source.path.to.hyphen-separated-field = \"new data\""}
I attempted to escape the hyphens with a backslash, but with no luck.
You can access the field using square brackets, i.e. simply do it like this:
{"script": "ctx._source.path.to['hyphen-separated-field'] = \"new data\""}
This one worked for me on 2.x (and maybe other versions as well):
"script": {
"inline": "ctx._source.path.to[field] = val",
"params": {
"val": "This is the new value",
"field": "hyphen-separated-field"
}
}
Or this will also work:
{"script": "ctx._source.path.to.'hyphen-separated-field' = 'new data'"}

Logstash filter regular expression difference in JRuby and Ruby

The Logstash filter regular expression to parse our syslog stream is getting more and more complicated, which led me to write tests. I simply copied the structure of a Grok test in the main Logstash repository, modified it a bit, and ran it with bin/logstash rspec as explained here. After a few hours of fighting with the regular expression syntax, I found out that there is a difference in how modifier characters have to be escaped. Here is a simple test for a filter involving square brackets in the log message, which you have to escape in the filter regular expression:
require "test_utils"
require "logstash/filters/grok"
describe LogStash::Filters::Grok do
extend LogStash::RSpec
describe "Grok pattern difference" do
config <<-CONFIG
filter {
grok {
match => [ "message", '%{PROG:theprocess}(?<forgetthis>(: )?(\\[[\\d:|\\s\\w/]*\\])?:?)%{GREEDYDATA:message}' ]
add_field => { "process" => "%{theprocess}" "forget_this" => "%{forgetthis}" }
}
}
CONFIG
sample "uwsgi: [pid: 12345|app: 0|req: 21/93281] BLAHBLAH" do
insist { subject["tags"] }.nil?
insist { subject["process"] } == "uwsgi"
insist { subject["forget_this"] } == ": [pid: 12345|app: 0|req: 21/93281]"
insist { subject["message"] } == "BLAHBLAH"
end
end
end
Save this as e.g. grok_demo.rb and test it with bin/logstash rspec grok_demo.rb, and it will work. If you remove the double escapes in the regexp, though, it won't.
I wanted to try the same thing in straight Ruby, using the same regular expression library that Logstash uses, and followed the directions given here. The following test worked as expected, without the need for double escapes:
require 'rubygems'
require 'grok-pure'
grok = Grok.new
grok.add_patterns_from_file("/Users/ulas/temp/grok_patterns.txt")
pattern = '%{PROG:theprocess}(?<forgetthis>(: )?(\[[\d:|\s\w/]*\])?:?)%{GREEDYDATA:message}'
grok.compile(pattern)
text1 = 'uwsgi: [pid: 12345|app: 0|req: 21/93281] BLAHBLAH'
puts grok.match(text1).captures()
I'm not a Ruby programmer, and am a bit lost as to what causes this difference. Is it possible that the heredoc config specification necessitates double escapes? Or does it have to do with the way the regular expression gets passed to the regexp library within Logstash?
I've never written tests for Logstash before, but my guess is that the double escape is due to the fact that you have strings embedded in strings.
The section:
<<-CONFIG
# stuff here
CONFIG
is a heredoc in Ruby (which is a fancy way to generate a string). So the filter, grok, match, add_field, and all the brackets/braces are actually part of the string. Inside this string you are escaping the escape sequence, so the resulting string has a single literal escape sequence. I'm guessing that this string gets eval'd somewhere so that all the filter etc. stuff gets implemented as needed, and that's where the single escape sequence is used.
When using "straight Ruby" you aren't doing this double interpretation. You're just passing a string directly into the method that compiles it.
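A quick way to see that first interpretation step in isolation (a minimal sketch; nothing Logstash-specific here):
# In a Ruby heredoc, "\\[" collapses to a single "\[" before anything else
# sees the string. The same thing happens to the config above, and that
# single-escaped pattern is what eventually reaches the regexp engine.
config_text = <<-CONFIG
  match => '\\[\\d+\\]'
CONFIG
puts config_text
# prints: match => '\[\d+\]'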
