Logstash Add Extra Space on a particular field based on the config - ruby

I have a requirement where I have extracted fields using grok patterns and, based on the length of each attribute, I have to add spaces at the end of the string.
For example:
Attribute:
PartA: test
PartB: abc
Output Should be like:
test######abc##
The requirement says PartA should be of length 10 and PartB should be of length 5, so I have added extra spaces (shown as #) based on the current length of each attribute.
I am currently trying to write a new line in the format required above. Here is my attempt (logstash.conf):
output {
  file {
    path => "/Users/Smit/Downloads/chrome/observability/logstash_dump.txt"
    codec => line { format => "%{PartA}#####%{PartB}###" }
  }
}
Is there a way to append the spaces dynamically (# is used here for clarity) instead of hardcoding them?
Here is the full Logstash file: https://dpaste.org/dsuH
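One possible approach is Ruby's String#ljust, which pads a string on the right up to a given width. The sketch below is plain Ruby rather than a tested Logstash configuration; the WIDTHS table and the pad_fields helper are hypothetical names, and the widths 10 and 5 are taken from the question. Note that "test" padded to 10 characters takes six pad characters.

```ruby
# Hypothetical field widths, taken from the question: PartA -> 10, PartB -> 5.
WIDTHS = { 'PartA' => 10, 'PartB' => 5 }.freeze

# Pads each field on the right to its configured width and joins the results.
# ljust never truncates, so a value longer than its width is kept as-is.
def pad_fields(fields, widths = WIDTHS, pad = ' ')
  widths.map { |name, width| fields[name].to_s.ljust(width, pad) }.join
end

pad_fields({ 'PartA' => 'test', 'PartB' => 'abc' }, WIDTHS, '#')
# => "test######abc##"
```

Inside a Logstash ruby filter the same ljust calls could presumably be applied to values read with event.get and stored with event.set, so that the output codec only needs to reference the padded field.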

Related

Remove fields if their key is longer than x chars in Logstash?

I have a Logstash pipeline that, for the most part, parses data well enough into JSON to send to Elasticsearch. However, sometimes it does not parse fields very well and finds keys and values that are very weird. Most of the time these keys are very long, such as: aaaakgaaaaiaaaaaaaaaabiofy1mb....
I was hoping to be able to remove these fields based on how long the key is in the kv output, say any field whose key is over 30 chars is removed. Although I may improve the Logstash parsing in the future so that these problems won't persist, for now I'd like to have something like this as a last-ditch sanity check.
Try this filter with two steps:
1. Prefix the kv fields so that they can be identified (optional here).
2. Loop over the fields and apply the condition (here the limit is 30 characters, prefix included).
filter {
  kv {
    # Step 1
    #prefix => "kv_"
  }
  ruby {
    code => "
      hash = event.to_hash
      hash.each do |key, value|
        # Step 2
        #if(key.to_s.start_with?('kv_'))
        if key.size > 30
          # remove only the offending field; event.cancel would drop the whole event
          event.remove(key)
        end
        #end
      end
    "
  }
}
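The length check itself can be tried outside Logstash on an ordinary hash. This is a minimal sketch, assuming the fields are available as a Ruby Hash; drop_long_keys and MAX_KEY_LENGTH are hypothetical names, and the limit of 30 comes from the question.

```ruby
MAX_KEY_LENGTH = 30 # limit taken from the question

# Returns a copy of the hash without the entries whose key exceeds the limit.
def drop_long_keys(fields, limit = MAX_KEY_LENGTH)
  fields.reject { |key, _value| key.to_s.size > limit }
end

fields = { 'host' => 'web-1', 'a' * 40 => 'garbage' }
drop_long_keys(fields)
# => {"host"=>"web-1"}
```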

NIFI: Unable to extract two values from a list during each iteration over a loop

I would like to retrieve a large SQL dump between date ranges. To do this, I constructed a loop over a date list that is intended to extract adjacent fields. Unfortunately, in my case, it doesn't work as planned.
Following is my flow:
Replace Text: Takes flowfile content date list as all_first_dates
Initialize Count:
While Loop:
Get first and adjacent dates:
However, when I inspect the queue, the first and second values are not what I expected.
I wanted 2016-01-01 and 2016-01-02 as first and second, respectively, on my first iteration, and so on.
Check the description of the getDelimitedField function and its parameters:
Description: Parses the Subject as a delimited line of text and returns just a single field from that delimited text.
Arguments:
index: The index of the field to return. A value of 1 will return the first field, a value of 2 will return the second field, and so on.
delimiter: Optional argument that provides the character to use as a field separator. If not specified, a comma will be used. This value must be exactly 1 character.
...
You are not passing the second parameter, so a comma is used to split the subject, and you get the whole subject back as a single element in the result.
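The effect of the missing delimiter can be imitated in a few lines of Ruby. This is only a rough model of the documented behaviour quoted above, not NiFi code: get_delimited_field is a hypothetical helper, the real function has further options that are ignored here, and the newline-separated date list is an assumption, since the actual flowfile content is not shown.

```ruby
# Rough Ruby imitation of the documented behaviour (1-based index, comma by default).
def get_delimited_field(subject, index, delimiter = ',')
  raise ArgumentError, 'delimiter must be exactly 1 character' unless delimiter.length == 1

  subject.split(delimiter, -1)[index - 1]
end

dates = "2016-01-01\n2016-01-02\n2016-01-03" # assumed newline-separated list

get_delimited_field(dates, 1)       # no comma in the subject, so the whole list comes back
get_delimited_field(dates, 1, "\n") # => "2016-01-01"
get_delimited_field(dates, 2, "\n") # => "2016-01-02"
```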

How do I regex-match an unknown number of repeating elements?

I'm trying to write a Ruby script that replaces all rem values in a CSS file with their px equivalents. This would be an example CSS file:
body{font-size:1.6rem;margin:4rem 7rem;}
The MatchData I'd like to get would be:
#    Match 1          Match 2
# 1. font-size     1. margin
# 2. 1.6           2. 4
#                  3. 7
However I'm entirely clueless as to how to get multiple and different MatchData results. The RegEx that got me closest is this (you can also take a look at it at Rubular):
/([^}{;]+):\s*([0-9.]+?)rem(?=\s*;|\s*})/i
This will match single instances of value declarations (so it will properly return the desired Match 1 result), but entirely disregards multiples.
I also tried something along the lines of ([0-9.]+?rem\s*)+, but that didn't return the desired result either, and doesn't feel like I'm on the right track, as it won't return multiple result data sets.
EDIT After the suggestions in the answers, I ended up solving the problem like this:
# search for any declarations that contain rem unit values and modify blockwise
@output.gsub!(/([^ }{;]+):\s*([^}{;]*[0-9.]rem+[^;]*)(?=\s*;|\s*})/i) do |match|
  # search for any single rem value
  string = match.gsub(/([0-9.]+)rem/i) do |value|
    # convert the rem value to px by multiplying by 10 (this is not universal!)
    value = sprintf('%g', Regexp.last_match[1].to_f * 10).to_s + 'px'
  end
  string += ';' + match # append the original match result to the replacement
  match = string # overwrite the matched result
end
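For experimenting with the conversion on its own, here is a smaller, self-contained sketch. It assumes 1rem equals 10px, as in the code above, and it simply replaces each rem value instead of keeping the original declaration as well; rem_to_px and base_px are hypothetical names.

```ruby
# Replaces every "<number>rem" with its px equivalent.
# base_px is the assumed root font size (10 here, which is not universal).
def rem_to_px(css, base_px = 10)
  css.gsub(/([0-9.]+)rem/i) { format('%gpx', Regexp.last_match(1).to_f * base_px) }
end

rem_to_px('body{font-size:1.6rem;margin:4rem 7rem;}')
# => "body{font-size:16px;margin:40px 70px;}"
```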
You can't capture a dynamic number of match groups (at least not in Ruby).
Instead, you could do either of the following:
1. Capture the whole value and split on space.
2. Use multilevel matching: first capture the whole key/value pair, then match within the value. You can pass a block to the match method in Ruby.
This regex will do the job for your example:
([^}{;]+):(?:([0-9\.]+?)rem\s?)?(?:([0-9\.]+?)rem\s?)
But with this you can't match something like: margin:4rem 7rem 9rem
This is what I've been able to do: DEMO
Regex: (?<={|;)([^:}]+)(?::)([^A-Za-z]+)
And this is what my result looks like:
#    Match 1          Match 2
# 1. font-size     1. margin
# 2. 1.6           2. 4
As @koffeinfrei says, dynamic capture isn't possible in Ruby. It would be smarter to capture the whole value string and split it on whitespace.
str = 'body{font-size:1.6rem;margin:4rem 7rem;}'
str.scan(/(?<=[{; ]).+?(?=[;}])/)
   .map { |e| e.match(/(?<prop>.+):(?<value>.+)/) }
#⇒ [
# [0] #<MatchData "font-size:1.6rem" prop:"font-size" value:"1.6rem">,
# [1] #<MatchData "margin:4rem 7rem" prop:"margin" value:"4rem 7rem">
# ]
The latter match can easily be adapted to return whatever you want: value.split(/\s+/) will return all the values, \d+ instead of .+ will match digits only, and so on.
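Putting the two levels together, a sketch along these lines returns each property with all of its rem numbers, however many there are. The method name rem_values and the exact regexes are illustrative choices, not taken from the answers.

```ruby
# First level: split the CSS into property/value pairs.
# Second level: scan each value for every "<number>rem" it contains.
def rem_values(css)
  css.scan(/([^{};\s][^{};:]*):([^{};]*)/).map do |prop, value|
    numbers = value.scan(/([0-9.]+)rem/i).flatten
    [prop.strip, numbers] unless numbers.empty?
  end.compact
end

rem_values('body{font-size:1.6rem;margin:4rem 7rem;}')
# => [["font-size", ["1.6"]], ["margin", ["4", "7"]]]
```

Because the inner scan returns an array, a declaration such as margin:4rem 7rem 9rem needs no change to the pattern.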

Check if a string contains multiple items, separated by comma or space?

I want to allow users to input multiple strings, separated by comma or space, and then check if a referring URL contains any of those strings.
For instance, someone may only want a widget to show up on their /contact, /support, /about pages.
So then I'd want to do something like this to check if the URL contains any of those strings...
ref = "http://example.com/contact"
ref.include?('/contact, /support, /about')
Since what we're checking against would be input by the user, ideally the strings could be comma or space-separated.
a = "/contact, /support, /about".split(/[,\s]+/)
# => ["/contact", "/support", "/about"]
a.any?{|s| "http://example.com/contact".include?(s)}
# => true
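Wrapped in a method, the same idea might look like the sketch below. matches_any? is a hypothetical name; the reject(&:empty?) step is an addition that guards against input with leading separators, which would otherwise produce an empty string that every URL "includes".

```ruby
# Returns true if the URL contains any of the comma- or space-separated fragments.
def matches_any?(url, user_input)
  patterns = user_input.split(/[,\s]+/).reject(&:empty?)
  patterns.any? { |pattern| url.include?(pattern) }
end

matches_any?('http://example.com/contact', '/contact, /support, /about') # => true
matches_any?('http://example.com/pricing', ' /contact /support /about')  # => false
```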

How to get text between two strings in ruby?

I have a text file that contains this text:
What's New in this Version
==========================
-This is the text I want to get
-It can have 1 or many lines
-These equal signs are repeated throughout the file to separate sections
Primary Category
================
I just want to get everything between ========================== and Primary Category and store that block of text in a variable. I thought the following match method would work but it gives me, NoMethodError: undefined method `match'
f = File.open(metadataPath, "r")
line = f.readlines
whatsNew = f.match(/==========================(.*)Primary Category/m).strip
Any ideas? Thanks in advance.
f is a File object, not the file's text; you want to match on the text in the file, which you read into line. Rather than reading the text into an array (which is hard to run a regex on), I prefer to read it into one string:
contents = File.open(metadataPath) { |f| f.read }
contents.match(/==========================(.*)Primary Category/m)[1].strip
The last line produces your desired output:
"-This is the text I want to get\n-It can have 1 or many lines\n-These equal signs are repeated throughout the file to separate sections"
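f = File.open(metadataPath, "r")
text = f.read # read the whole file as one string; readlines would return an array
text =~ /==========================(.*)Primary Category/m
whatsNew = $1
You may want to refine the .* though, since it is greedy and will run up to the last "Primary Category" in the file.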
Your problem is that readlines gives you an array of strings (one for each line), but the regular expression you're using needs a single string. You could read the file as one string:
contents = File.read(metadataPath)
puts contents[/^=+(.*?)Primary Category/m]
# => ==========================
# => -This is the text I want to get
# => -It can have 1 or many lines
# => -These equal signs are repeated throughout the file to separate sections
# =>
# => Primary Category
or you could join the lines into a single string before applying the regular expression:
lines = File.readlines(metadataPath)
puts lines.join[/^=+(.*?)Primary Category/m]
# => ==========================
# => -This is the text I want to get
# => -It can have 1 or many lines
# => -These equal signs are repeated throughout the file to separate sections
# =>
# => Primary Category
The approach I'd take is read in the lines, find out which line numbers are a series of equal signs (using Array#find_index), and group the lines into chunks from the line after the equal signs to the line before (or two lines before) the next lot of equal signs (probably using Enumerable#each_cons(2) and map). That way I don't have to modify much if the section headings change.
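A rough sketch of that line-based approach follows. Since Array#find_index only returns the first match, it collects all underline positions with each_index.select instead; the sections helper and the shortened sample text are illustrative assumptions.

```ruby
# Builds a hash of section title => body lines, where a section title is
# the line directly above a row made only of "=" characters.
def sections(lines)
  rules = lines.each_index.select { |i| i > 0 && lines[i] =~ /\A=+\s*\z/ }
  # The sentinel makes the last section run to the end of the file.
  (rules + [lines.size + 1]).each_cons(2).map do |rule, next_rule|
    title = lines[rule - 1].strip
    body  = lines[(rule + 1)...(next_rule - 1)].map(&:chomp)
    [title, body]
  end.to_h
end

text = <<~TEXT
  What's New in this Version
  ==========================
  -This is the text I want to get
  -It can have 1 or many lines
  Primary Category
  ================
TEXT

whats_new = sections(text.lines)["What's New in this Version"].join("\n")
# => "-This is the text I want to get\n-It can have 1 or many lines"
```

Looking sections up by title means the code does not depend on which heading happens to follow the one you want.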
