Logstash error when using if statement in pipeline - elasticsearch

I am trying to determine if a field exists in a log file and if so, use the value of that field as part of the index name. If the field does not exist, use a different index name.
input {
  beats {
    port => 5000
  }
}
filter {
}
output {
  elasticsearch {
    hosts => ["https://elasticserver.io:9243"]
    user => "user"
    password => "pass"
    retry_on_conflict => "2"
    if [index_append] {
      index => "%{[@metadata][beat]}%{index_append}"
    }
    else {
      index => "%{[@metadata][beat]}"
    }
    "action" => "create"
  }
}
If I remove the if statements from the output section and just use either one of the index options (index => "%{[@metadata][beat]}%{index_append}" or index => "%{[@metadata][beat]}"), the pipeline loads fine, but then it doesn't account for whether the field 'index_append' exists or not.
I have tried many combinations, but the logstash logs seem to indicate some sort of syntax issue.
[2021-06-09T17:17:38,658][ERROR][logstash.agent ] Failed to execute action {:id=>:"LogstashPipeline", :action_type=>LogStash::ConvergeResult::FailedAction, :message=>"Expected one of [ \\t\\r\\n], \"#\", \"=>\" at line 14, column 8 (byte 259) after output {\n elasticsearch {\n hosts => [\"https://elasticserver.io:9243\"]\n user => \"user\"\n password => \"pass\"\n retry_on_conflict => \"2\"\n if ", :backtrace=>["/opt/logstash/logstash-core/lib/logstash/compiler.rb:32:in `compile_imperative'", "org/logstash/execution/AbstractPipelineExt.java:184:in `initialize'", "org/logstash/execution/JavaBasePipelineExt.java:69:in `initialize'", "/opt/logstash/logstash-core/lib/logstash/pipeline_action/reload.rb:53:in `execute'", "/opt/logstash/logstash-core/lib/logstash/agent.rb:389:in `block in converge_state'"]}
I tried moving the if statements to the filter section, but receive the same error in logstash logs. I have used similar if statements in other pipelines and have not had these types of issues. I copied the code to VS Code and verified that there were no extra spaces or characters. I'm at a loss.
This pipeline is running on Logstash 7.10.2

Move the conditional to the filter section. Use a field under [@metadata] to store the index name. By default [@metadata] does not get written by the output, so it is useful for storing temporary variables.
if [index_append] {
  mutate { add_field => { "[@metadata][index]" => "%{[@metadata][beat]}%{index_append}" } }
} else {
  mutate { add_field => { "[@metadata][index]" => "%{[@metadata][beat]}" } }
}
Then reference it in the output using
index => "%{[@metadata][index]}"
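For completeness, the output section would then look roughly like this; this is only a sketch that reuses the hosts, credentials and options from the question's configuration:
output {
  elasticsearch {
    hosts => ["https://elasticserver.io:9243"]
    user => "user"
    password => "pass"
    retry_on_conflict => "2"
    # the index name was computed in the filter section above
    index => "%{[@metadata][index]}"
    action => "create"
  }
}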

Related

Using conditionals in Logstash pipeline configuration

I am trying to use Logstash conditionals in the context of pipeline output configuration.
Based on the presence of the device field in the payload, I'd like to forward the event to the appropriate index in Elasticsearch:
output {
  elasticsearch {
    hosts => ["10.1.1.5:9200"]
    if [device] ~= \.* {
      index => "%{[device][0]}-%{+YYYY.ww}"
    } else {
      index => "%{[beat][name]}-%{+YYYY.ww}"
    }
  }
}
The above code fails with the following message in the log, indicating a syntax error:
...
"Expected one of #, => at line 14, column 12 (byte 326) after output {\n elasticsearch {\n hosts => [\"10.1.1.5:9200\"]\n if "
...
Can someone please advise?
You should use the conditional around the elasticsearch output, not inside it.
output {
  if [device] =~ /.*/ {
    elasticsearch {
      hosts => ["10.1.1.5:9200"]
      index => "%{[device][0]}-%{+YYYY.ww}"
    }
  } else {
    elasticsearch {
      hosts => ["10.1.1.5:9200"]
      index => "%{[beat][name]}-%{+YYYY.ww}"
    }
  }
}

Logstash Update a document in elasticsearch

Trying to update a specific field in elasticsearch through logstash. Is it possible to update only a set of fields through logstash?
Please find the code below,
input {
  file {
    path => "/**/**/logstash/bin/*.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    type => "multi"
  }
}
filter {
  csv {
    separator => "|"
    columns => ["GEOREFID","COUNTRYNAME", "G_COUNTRY", "G_UPDATE", "G_DELETE", "D_COUNTRY", "D_UPDATE", "D_DELETE"]
  }
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash-data-monitor"
    query => "GEOREFID:%{GEOREFID}"
    fields => [["JSON_COUNTRY","G_COUNTRY"],
               ["XML_COUNTRY","D_COUNTRY"]]
  }
  if [G_COUNTRY] {
    mutate {
      update => { "D_COUNTRY" => "%{D_COUNTRY}" }
    }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash-data-monitor"
    document_id => "%{GEOREFID}"
  }
}
We are using the above configuration, but fields that arrive with null values remove the existing values instead of the update skipping them.
Data comes from two different sources. One is an XML file and the other is a JSON file.
XML log format : GEO-1|CD|23|John|892|Canada|31-01-2017|QC|-|-|-|-|-
JSON log format : GEO-1|AS|33|-|-|-|-|-|Mike|123|US|31-01-2017|QC
When the first log is added, a new document is created in the index. When the second log file is read, the existing document should be updated. The update should happen only on the first 5 fields if the log file is XML and only on the last 5 fields if the log file is JSON. Please suggest how to do this in Logstash.
I tried with the above code. Can anyone help with how to fix this?
For the Elasticsearch output to do any action other than index you need to tell it to do something else.
elasticsearch {
  hosts => ["localhost:9200"]
  index => "logstash-data-monitor"
  action => "update"
  document_id => "%{GEOREFID}"
}
This should probably be wrapped in a conditional to ensure you're only updating records that need updating. There is another option, though: doc_as_upsert.
elasticsearch {
  hosts => ["localhost:9200"]
  index => "logstash-data-monitor"
  action => "update"
  doc_as_upsert => true
  document_id => "%{GEOREFID}"
}
This tells the plugin to insert if it is new, and update if it is not.
However, you're attempting to use two inputs to define a document. This makes things complicated. Also, you're not providing both inputs, so I'll improvise. To provide different output behavior, you will need to define two outputs.
input {
  file {
    path => "/var/log/xmlhome.log"
    [other details]
  }
  file {
    path => "/var/log/jsonhome.log"
    [other details]
  }
}
filter { [some stuff] }
output {
  if [path] == '/var/log/xmlhome.log' {
    elasticsearch {
      [XML file case]
    }
  } else if [path] == '/var/log/jsonhome.log' {
    elasticsearch {
      [JSON file case]
      action => "update"
    }
  }
}
Setting it up like this will allow you to change the ElasticSearch behavior based on where the event originated.
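Filled in for this use case, that skeleton might look roughly as follows. This is only a sketch: the hosts, index and document_id come from the question's configuration, and using doc_as_upsert in both branches is an assumption, since either file may be the first one seen for a given GEOREFID:
output {
  if [path] == "/var/log/xmlhome.log" {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "logstash-data-monitor"
      document_id => "%{GEOREFID}"
      # XML file case: create the document if it is new, otherwise merge in these fields
      action => "update"
      doc_as_upsert => true
    }
  } else if [path] == "/var/log/jsonhome.log" {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "logstash-data-monitor"
      document_id => "%{GEOREFID}"
      # JSON file case: likewise merge only the fields carried by this event
      action => "update"
      doc_as_upsert => true
    }
  }
}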

Logstash: error for querying elasticsearch

Hello everyone,
Through logstash, I want to query elasticsearch in order to get fields from previous events and do some computation with fields of my current event and add new fields. Here is what I did:
input file:
{"device":"device1","count":5}
{"device":"device2","count":11}
{"device":"device1","count":8}
{"device":"device3","count":100}
{"device":"device3","count":95}
{"device":"device3","count":155}
{"device":"device2","count":15}
{"device":"device1","count":55}
My expected output:
{"device":"device1","count":5,"previousCount=0","delta":0}
{"device":"device2","count":11,"previousCount=0","delta":0}
{"device":"device1","count":8,"previousCount=5","delta":3}
{"device":"device3","count":100,"previousCount=0","delta":0}
{"device":"device3","count":95,"previousCount=100","delta":-5}
{"device":"device3","count":155,"previousCount=95","delta":60}
{"device":"device2","count":15,"previousCount=11","delta":4}
{"device":"device1","count":55,"previousCount=8","delta":47}
Logstash filter part:
filter {
  elasticsearch {
    hosts => ["localhost:9200/device"]
    query => 'device:"%{[device]}"'
    sort => "@timestamp:desc"
    fields => ['count','previousCount']
  }
  if [previousCount] {
    ruby {
      code => "event['delta'] = event['count'] - event['previousCount']"
    }
  } else {
    mutate {
      add_field => { "previousCount" => "0" }
      add_field => { "delta" => "0" }
    }
  }
}
My problem:
For every line of my input file I get the following error: Failed to query elasticsearch for previous event ..
It seems that each fully processed line is not yet written to elasticsearch before logstash starts to process the next line.
I don't know if my conclusion is correct and, if so, why it happens.
So, does anyone know how I could solve this problem?
Thank you for your attention and your help.
S

Data type conversion using logstash grok

Basic is a float field. The mentioned index is not present in elasticsearch. When running the config file with logstash -f, I get no exception. Yet the data that ends up in elasticsearch shows the mapping of Basic as string. How do I rectify this? And how do I do this for multiple fields?
input {
  file {
    path => "/home/sagnik/work/logstash-1.4.2/bin/promosms_dec15.csv"
    type => "promosms_dec15"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  grok {
    match => [
      "Basic", " %{NUMBER:Basic:float}"
    ]
  }
  csv {
    columns => ["Generation_Date","Basic"]
    separator => ","
  }
  ruby {
    code => "event['Generation_Date'] = Date.parse(event['Generation_Date']);"
  }
}
output {
  elasticsearch {
    action => "index"
    host => "localhost"
    index => "promosms-%{+dd.MM.YYYY}"
    workers => 1
  }
}
You have two problems. First, your grok filter is listed prior to the csv filter and because filters are applied in order there won't be a "Basic" field to convert when the grok filter is applied.
Secondly, unless you explicitly allow it, grok won't overwrite existing fields. In other words,
grok {
  match => [
    "Basic", " %{NUMBER:Basic:float}"
  ]
}
will always be a no-op. Either specify overwrite => ["Basic"] or, preferably, use mutate's type conversion feature:
mutate {
  convert => ["Basic", "float"]
}
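Putting both fixes together, the filter section would look something like this; a sketch only, with the ruby date handling carried over unchanged from the question:
filter {
  csv {
    columns => ["Generation_Date","Basic"]
    separator => ","
  }
  # convert after the csv filter has created the Basic field
  mutate {
    convert => ["Basic", "float"]
  }
  ruby {
    code => "event['Generation_Date'] = Date.parse(event['Generation_Date']);"
  }
}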

How to extract variables from log file path, test log file name for pattern in Logstash?

I have AWS ElasticBeanstalk instance logs in an S3 bucket.
The path to the logs is:
resources/environments/logs/publish/e-3ykfgdfgmp8/i-cf216955/_var_log_nginx_rotated_access.log1417633261.gz
which translates to:
resources/environments/logs/publish/e-[random environment id]/i-[random instance id]/
The path contains multiple logs:
_var_log_eb-docker_containers_eb-current-app_rotated_application.log1417586461.gz
_var_log_eb-docker_containers_eb-current-app_rotated_application.log1417597261.gz
_var_log_rotated_docker1417579261.gz
_var_log_rotated_docker1417582862.gz
_var_log_rotated_docker-events.log1417579261.gz
_var_log_nginx_rotated_access.log1417633261.gz
Notice that there is a random number (timestamp?) inserted by AWS into the filename before ".gz".
The problem is that I need to set variables depending on the log file name.
Here's my configuration:
input {
  s3 {
    debug => "true"
    bucket => "elasticbeanstalk-us-east-1-something"
    region => "us-east-1"
    region_endpoint => "us-east-1"
    credentials => ["..."]
    prefix => "resources/environments/logs/publish/"
    sincedb_path => "/tmp/s3.sincedb"
    backup_to_dir => "/tmp/logstashed/"
    tags => ["s3","elastic_beanstalk"]
    type => "elastic_beanstalk"
  }
}
filter {
  if [type] == "elastic_beanstalk" {
    grok {
      match => [ "@source_path", "resources/environments/logs/publish/%{environment}/%{instance}/%{file}<unnecessary_number>.gz" ]
    }
  }
}
In this case I want to extract environment, instance and file name from the path. In the file name I need to ignore that random number.
Am I doing this the right way? What will be full, correct solution for this?
Another question: how can I specify fields for a custom log format for a particular log file from the list above?
This could be something like: (meta-code)
filter {
  if [type] == "elastic_beanstalk" {
    if [file_name] BEGINS WITH "application_custom_log" {
      grok {
        match => [ "message", "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" ]
      }
    }
    if [file_name] BEGINS WITH "some_other_custom_log" {
      ....
    }
  }
}
How do I test for file name pattern?
For your first question, and assuming that @source_path contains the full path, try:
match => [ "@source_path", "logs/publish/%{NOTSPACE:env}/%{NOTSPACE:instance}/%{NOTSPACE:file}%{NUMBER}%{NOTSPACE:suffix}" ]
This will create 4 Logstash fields for you:
env
instance
file
suffix
More information is available on the grok man page and you should test with the grok debugger.
To test fields in logstash, you use conditionals, e.g.
if [field] == "value"
if [field] =~ /regexp/
etc.
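For example, the "BEGINS WITH" tests from the question could be written with the regexp form, using the file field extracted by the grok pattern above (the prefix shown here is just one of the file names listed in the question):
if [file] =~ /^_var_log_eb-docker_containers_eb-current-app/ {
  grok {
    match => [ "message", "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" ]
  }
}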
Note that it's not always necessary to do this with grok. You can have multiple 'match' arguments, and it will (by default) stop after hitting the first one that matches. If your patterns are exclusive, this should work for you.
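As a sketch of that approach, a single grok filter can carry one pattern per log format; by default break_on_match is true, so grok stops at the first pattern that matches (the second pattern here is only a placeholder):
grok {
  match => {
    "message" => [
      "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}",
      "%{TIMESTAMP_ISO8601:timestamp} %{GREEDYDATA:rest}"
    ]
  }
}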
