Logstash: error when querying Elasticsearch

Hello everyone,
Through Logstash, I want to query Elasticsearch in order to get fields from previous events, do some computation with fields of my current event, and add new fields. Here is what I did:
input file:
{"device":"device1","count":5}
{"device":"device2","count":11}
{"device":"device1","count":8}
{"device":"device3","count":100}
{"device":"device3","count":95}
{"device":"device3","count":155}
{"device":"device2","count":15}
{"device":"device1","count":55}
My expected output:
{"device":"device1","count":5,"previousCount":0,"delta":0}
{"device":"device2","count":11,"previousCount":0,"delta":0}
{"device":"device1","count":8,"previousCount":5,"delta":3}
{"device":"device3","count":100,"previousCount":0,"delta":0}
{"device":"device3","count":95,"previousCount":100,"delta":-5}
{"device":"device3","count":155,"previousCount":95,"delta":60}
{"device":"device2","count":15,"previousCount":11,"delta":4}
{"device":"device1","count":55,"previousCount":8,"delta":47}
Logstash filter part:
filter {
  elasticsearch {
    hosts => ["localhost:9200/device"]
    query => 'device:"%{[device]}"'
    sort => "@timestamp:desc"
    fields => ['count', 'previousCount']
  }
  if [previousCount] {
    ruby {
      code => "event['delta'] = event['count'].to_i - event['previousCount'].to_i"
    }
  } else {
    mutate {
      add_field => { "previousCount" => "0" }
      add_field => { "delta" => "0" }
    }
  }
}
My problem:
For every line of my input file I get the following error: `Failed to query elasticsearch for previous event ..`
It looks as though each fully processed line is not yet searchable in Elasticsearch by the time Logstash starts processing the next line.
I don't know whether my conclusion is correct and, if so, why this happens.
So, does anyone know how I could solve this problem?
Thank you for your attention and your help.
S
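For what it's worth, the conclusion above is essentially right: a freshly indexed document only becomes searchable after a refresh, so querying Elasticsearch for the immediately preceding event is racy. One way to sidestep the round-trip entirely is to keep the last count per device in memory with a ruby filter. This is only a sketch, using the legacy `event['field']` API shown in the question, and it assumes a single pipeline worker (`-w 1`) so events stay in order:

```
filter {
  ruby {
    # Remember the last seen count per device across events.
    init => "@last = {}"
    code => "
      prev = @last[event['device']] || 0
      event['previousCount'] = prev
      event['delta'] = prev == 0 ? 0 : event['count'] - prev
      @last[event['device']] = event['count']
    "
  }
}
```

Note that the in-memory state is lost on restart, so this trades durability for simplicity.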

Related

Parse integer and show in timeline in Kibana

Here is the Kibana UI, and I want to parse some integers from the message. The number at the end of the message is the processing time for one method, and I want to visualize the average processing time per hour in Kibana. Is that possible?
I tried some conf in logstash:
filter {
  json {
    source => "message"
  }
  grok {
    match => {
      "message" => "^Finish validate %{NUMBER:cto_validate_time}$"
    }
  }
  grok {
    match => {
      "message" => "^Finish customize %{NUMBER:cto_customize_time}$"
    }
  }
}
It works, but when I create the timechart I cannot get the new field.
If you don't mind the performance impact, you can create a scripted field named process_time in your index pattern with the following painless code. All it does is take the last numerical value from your message field:
def m = /.*\s(\d+)$/.matcher(doc['message.keyword'].value);
if (m.matches()) {
  return m.group(1)
} else {
  return 0
}
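For clarity, that scripted field does the equivalent of the following (a Python sketch of the same regex; the function name is illustrative):

```python
import re

def process_time(message):
    # Mirror the painless regex /.*\s(\d+)$/: grab the last
    # whitespace-delimited integer at the end of the message.
    m = re.match(r".*\s(\d+)$", message)
    return int(m.group(1)) if m else 0

print(process_time("Finish validate 123"))   # 123
print(process_time("no trailing number"))    # 0
```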
Then you can build a chart showing the average process time per hour: go to the Visualize tab and create a new vertical bar chart, with an Average aggregation on the process_time field on the Y-axis and a Date histogram aggregation on your timestamp field on the X-axis.
Note: You also need to add the following line in your elasticsearch.yml file and restart ES:
script.painless.regex.enabled: true
UPDATE
If you want to do it via Logstash, you can add the following grok filter:
filter {
  grok {
    match => {
      "message" => "^Finish customize in controller %{NUMBER:cto_customize_time}$"
    }
  }
  mutate {
    convert => { "cto_customize_time" => "integer" }
  }
}

Can't access Elasticsearch index name metadata in Logstash filter

I want to add the Elasticsearch index name as a field in the event when processing it in Logstash. This is supposed to be pretty straightforward, but the index name does not get printed out. Here is the complete Logstash config:
input {
  elasticsearch {
    hosts => "elasticsearch.example.com"
    index => "*-logs"
  }
}
filter {
  mutate {
    add_field => {
      "log_source" => "%{[@metadata][_index]}"
    }
  }
}
output {
  elasticsearch {
    index => "logstash-%{+YYYY.MM}"
  }
}
This results in log_source being set to the literal %{[@metadata][_index]} rather than the actual name of the index. I have tried this with _id and without the underscores, but it always just outputs the reference, not the value.
Using just %{[@metadata]} crashes Logstash with an error about accessing the list incorrectly, so [@metadata] is being set, but the index (or any other value) seems to be missing.
Does anyone have a another way of assigning the index name to the event?
I am using 5.0.1 of both Logstash and Elasticsearch.
You're almost there, you're simply missing the docinfo setting, which is false by default:
input {
  elasticsearch {
    hosts => "elasticsearch.example.com"
    index => "*-logs"
    docinfo => true
  }
}

Drop filter not working in Logstash

I have multiple log messages in a file which I am processing with Logstash filter plugins. The filtered logs are then sent to Elasticsearch.
There is a field called addID in each log message. I want to drop all log messages that contain a particular addID. These addIDs are listed in an ID.yml file.
Scenario: if the addID of a log message matches any of the addIDs in the ID.yml file, that log message should be dropped.
Could anyone help me in achieving this?
Below is my config file.
input {
  file {
    path => "/Users/jshaw/logs/access_logs.logs"
    ignore_older => 0
  }
}
filter {
  grok {
    patterns_dir => ["/Users/jshaw/patterns"]
    match => ["message", "%{TIMESTAMP:Timestamp}+%{IP:ClientIP}+%{URI:Uri}"]
  }
  kv {
    field_split => "&?"
    include_keys => [ "addID" ]
    allow_duplicate_values => "false"
  }
  if [addID] in "/Users/jshaw/addID.yml" {
    drop {}
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
You are using the in operator incorrectly. It checks whether a value is in an array, not in a file; files are usually a bit more complicated to work with.
One solution would be to use the ruby filter to read the file.
Alternatively, put the addID values directly in your configuration file, like this:
if [addID] == "addID" {
  drop {}
}
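For the ruby-filter route, here is a sketch that loads the YAML list once at startup rather than on every event. It assumes ID.yml contains a flat YAML list of IDs and uses the Logstash 5.x event API (`event.get`):

```
filter {
  ruby {
    # Load the blocklist once, when the pipeline starts.
    init => "require 'yaml'; @ids = YAML.load_file('/Users/jshaw/addID.yml')"
    # Cancelling an event has the same effect as drop {}.
    code => "event.cancel if @ids.include?(event.get('addID'))"
  }
}
```

Logstash would need a restart to pick up changes to ID.yml, since the file is read only in `init`.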

Can we extract numeric values from string through script in kibana?

I need to extract numeric values from a string and store them in a new field. Can we do this with a scripted field?
Ex: 1 hello 3 test
I need to extract 1 and 3.
You can do this through Logstash if you are using Elasticsearch.
Run a Logstash process with a config like:
input {
  elasticsearch {
    hosts => "your_host"
    index => "your_index"
    query => '{ "query": { "match_all": {} } }'
    docinfo => true  # expose _index and _id so each document can be overwritten in place
  }
}
filter {
  grok {
    match => { "your_string_field" => "%{NUMBER:num1} %{GREEDYDATA:middle_stuff} %{NUMBER:num2} %{GREEDYDATA:other_stuff}" }
  }
  mutate {
    remove_field => ["middle_stuff", "other_stuff"]
  }
}
output {
  elasticsearch {
    hosts => "your_host"
    index => "%{[@metadata][_index]}"
    document_id => "%{[@metadata][_id]}"
  }
}
This would essentially overwrite each document in your index, adding two more fields, num1 and num2, that correspond to the numbers you are looking for. It is a quick and dirty approach that uses more memory, but it lets you do all of the splitting up front instead of at visualization time.
I am sure there is a way to do this with scripting; look into Groovy regex matching, where you return a specific group.
Also, no guarantee my config representation is correct, as I don't have time to test it at the moment.
Have a good day!
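The grok pattern in the config above amounts to the following regex extraction (a Python sketch; it only handles plain integers and decimals, whereas grok's NUMBER pattern is slightly more permissive):

```python
import re

# num1, middle text, num2, trailing text -- mirroring
# "%{NUMBER:num1} %{GREEDYDATA:middle_stuff} %{NUMBER:num2} %{GREEDYDATA:other_stuff}"
pattern = re.compile(r"(\d+(?:\.\d+)?)\s+(.*?)\s+(\d+(?:\.\d+)?)\s+(.*)")

m = pattern.match("1 hello 3 test")
if m:
    num1, num2 = m.group(1), m.group(3)
    print(num1, num2)  # 1 3
```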

How to find documents which are causing mapping conflict in Elasticsearch

I have the following Logstash filter:
...
if [type] == "binarysize" {
  if ([file] =~ /svn/) {
    drop {}
  }
  grok {
    match => {
      "file" => "\A/home/data/binaries_size_stats/%{WORD:branch}/%{WORD:binary}/%{WORD:architecture}/%{WORD}"
    }
  }
  grok {
    match => {
      "message" => "\A%{INT:commit_number},%{INT:binary_size},%{TIMESTAMP_ISO8601:date_and_time_of_commit}"
    }
  }
  date {
    match => ["date_and_time_of_commit", "ISO8601"]
    #timezone => "CET"
  }
}
...
I started it, pushed some data into Elasticsearch, made some plots in Kibana, and left everything working nicely over the weekend.
When I returned, my plots were no longer updating with new data, and I kept getting the message "14 Courier Fetch: 15 of 2465 shards failed." no matter how many times I reloaded the page in the browser.
After reloading the field list, I found that I have one conflict in the "binary_size" field. I was plotting data based on this field, so my guess is that something weird happened over the weekend with the new documents pushed to Elasticsearch by Logstash.
My question is: how can I find the documents with conflicting fields? Or, alternatively, what should I do to be able to plot fresh data again?