How to select a specific element of an XML input log in Logstash - xpath

I'm setting up Logstash to receive XML logs from Filebeat. The problem I'm facing is that I don't want to output the whole log file; I'm only interested in specific fields. To do so I'm using the xml filter plugin and the prune filter plugin.
For example, I'm working with IDMEF-Message alerts, and I'm interested in the Classification field.
The configuration I have is:
input {
  beats {
    port => "5044"
  }
}

# I'm just interested in the log alert.
filter {
  prune {
    whitelist_names => [ "^message$" ]
  }
}

# Get the classification text from the alert
filter {
  xml {
    source => "message"
    store_xml => false
    target => "clasifications"
    xpath => [ "/IDMEF-Message/Alert/Classification/text()", "clasificacion" ]
    remove_field => "message"
  }
}

# Add a new field class with the clasifications value
filter {
  mutate {
    add_field => { "class" => "%{clasifications}" }
  }
}

# Remove everything else and keep just the class field
filter {
  prune {
    whitelist_names => [ "clas" ]
  }
}

output {
  file {
    path => "~/xml_logstash.txt"
  }
}
And the output I'm receiving is just {"class":"%{clasifications}"}. I also tried changing mutate{add_field=>{"class"=>"%{clasifications}"}} to mutate{add_field=>{"class"=>"%{clasificacion}"}}, but the result is the same.
My question is how to access the "clasificacion" field where I stored the result of the xml filter.
An example of the logs I'm working on is:
<IDMEF-Message>
  <Alert messageid="...">
    <Analyzer ...
    </Analyzer>
    <CreateTime ... </CreateTime>
    <DetectTime ... </DetectTime>
    <AnalyzerTime ... </AnalyzerTime>
    <Source>
    ...
    </Source>
    <Target>
    ...
    </Target>
    <Classification text="Text_Class" />
  </Alert>
</IDMEF-Message>
Thank you
Rubi

I solved it.
The problem was the way I accessed the text attribute of the Classification element. You have to use @text when the value is stored in an attribute, and text() when it is the text content of the element.
filter {
  xml {
    source => "message"
    store_xml => false
    target => "clasifications"
    xpath => [ "/IDMEF-Message/Alert/Classification/@text", "clasificacion" ]
  }
}
filter {
  mutate {
    add_field => { "clasificacion" => "%{clasificacion}" }
  }
}
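For reference, here is a minimal sketch contrasting the two forms (the destination field names and the CreateTime example are illustrative, not taken from the original config). Note that the xml filter stores each XPath result as an array of strings:

filter {
  xml {
    source    => "message"
    store_xml => false
    # Pairs of [ xpath expression, destination field ]:
    #   @text  -> the value of the "text" attribute, e.g. <Classification text="Text_Class" />
    #   text() -> the text content of the element itself
    xpath => [
      "/IDMEF-Message/Alert/Classification/@text", "class_attr",
      "/IDMEF-Message/Alert/CreateTime/text()", "create_time"
    ]
  }
}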

Related

logstash _grokparsefailure for a really simple tag

I don't understand why I have a grokparse failure for this simple config:
input {
  file {
    path => "/var/log/*.log"
    codec => json {
    }
  }
}

filter {
  grok {
    add_tag => ["test"]
  }
}

output {
  elasticsearch {
    /.../
  }
}
The logs are correctly sent to Elasticsearch and the JSON is correctly parsed, but the added tag doesn't work; instead I get a "_grokparsefailure" tag. What I want is to pass a static value as a tag.
I am surely missing something dumb, but I can't find what.
Your grok filter does nothing: there is no pattern to match, and the tag would only be applied after a successful match.
To add a tag in your case you can use the tags option in your input or the mutate filter.
To use the tags option, just change your input to this one:
input {
  file {
    path => "/var/log/*.log"
    codec => json
    tags => ["test"]
  }
}
To use the mutate filter, put the below config inside your filter block.
mutate {
  add_tag => ["test"]
}
Both configurations will add a test tag to all your messages.
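If you do want grok itself to add the tag, give it a pattern that actually matches. A minimal sketch (the pattern below is illustrative and matches anything):

filter {
  grok {
    # add_tag is only applied when the match succeeds
    match   => { "message" => "%{GREEDYDATA:raw}" }
    add_tag => ["test"]
  }
}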

Kibana: extract fields from @message containing JSON

I would like to extract, in Kibana, fields from the @message field, which contains JSON.
ex:
Audit{
uuid='xxx-xx-d3sd-fds3-f43',
action='/v1.0/execute/super/method',
resultCode='SUCCESS',
browser='null',
ipAddress='192.168.2.44',
application='application1',
timeTaken='167'
}
Having "action" and "application" fields I hope to be able to find top 5 requests that hits the application.
I started with something similar to this:
filter {
  if [message] =~ /Audit/ {
    grok {
      match => {
        "message" => "%{WORD:uuid}, %{WORD:action}, %{WORD:resultCode}, %{WORD:browser}, %{WORD:ipAddress}, %{WORD:application}, %{NUMBER:timeTaken}"
      }
      add_field => ["action", "%{action}"]
      add_field => ["application", "%{application}"]
    }
  }
}
But it seems to be too far from reality.
If the content of "Audit" is really in JSON format, you can use the json filter plugin:
json {
  source => "Audit"
}
It will do the parsing for you and create the fields. You don't need grok / add_field.
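As a minimal sketch of how that could be wired up (the payload field name and the grok pattern are illustrative, and it assumes the braces part of the message is valid JSON; the single-quoted sample shown above would need to be cleaned up first, otherwise the json filter adds a _jsonparsefailure tag):

filter {
  # capture everything after the literal "Audit" prefix into its own field
  grok {
    match => { "message" => "Audit%{GREEDYDATA:payload}" }
  }
  # parse that field as JSON; on success its keys become top-level event fields
  json {
    source => "payload"
  }
}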

How to filter {"foo":"bar", "bar": "foo"} with grok to get only the foo field?

I copied
{"name":"myapp","hostname":"banana.local","pid":40161,"level":30,"msg":"hi","time":"2013-01-04T18:46:23.851Z","v":0}
from https://github.com/trentm/node-bunyan and saved it as my logs.json. I am trying to import only two fields (name and msg) into Elasticsearch via Logstash. The problem is that I depend on a sort of filter that I am not able to accomplish. Well, I have successfully imported such a line as a single message, but certainly that is not enough in my real case.
That said, how can I import only name and msg into Elasticsearch? I tested several alternatives using http://grokdebug.herokuapp.com/ to reach a useful filter, with no success at all.
For instance, %{GREEDYDATA:message} will bring in the entire line as a single message, but how do I split it and ignore everything other than the name and msg fields?
In the end, I am planning to use it here:
input {
  file {
    type => "my_type"
    path => [ "/home/logs/logs.log" ]
    codec => "json"
  }
}

filter {
  grok {
    match => { "message" => "data=%{GREEDYDATA:request}" }
  }
  #### some extra lines here probably
}

output {
  elasticsearch {
    codec => json
    hosts => "http://127.0.0.1:9200"
    index => "indextest"
  }
  stdout { codec => rubydebug }
}
I have just gone through the list of available Logstash filters. The prune filter should match your need.
Assuming you have installed the prune filter, your config file should look like this:
input {
  file {
    type => "my_type"
    path => [ "/home/logs/logs.log" ]
    codec => "json"
  }
}

filter {
  prune {
    whitelist_names => [
      "@timestamp",
      "type",
      "name",
      "msg"
    ]
  }
}

output {
  elasticsearch {
    codec => json
    hosts => "http://127.0.0.1:9200"
    index => "indextest"
  }
  stdout { codec => rubydebug }
}
Please note that you will want to keep type so that Elasticsearch indexes the documents into the correct type. @timestamp is required if you want to view the data in Kibana.

Drop filter not working in Logstash

I have multiple log messages in a file which I am processing using logstash filter plugins. Then, the filtered logs are getting sent to elasticsearch.
There is one field called addID in a log message. I want to drop all the log messages which have a particular addID present. These particular addIDs are present in an ID.yml file.
Scenario: If the addID of a log message matches with any of the addIDs present in the ID.yml file, that log message should be dropped.
Could anyone help me in achieving this?
Below is my config file.
input {
  file {
    path => "/Users/jshaw/logs/access_logs.logs"
    ignore_older => 0
  }
}

filter {
  grok {
    patterns_dir => ["/Users/jshaw/patterns"]
    match => ["message", "%{TIMESTAMP:Timestamp}+{IP:ClientIP}+{URI:Uri}"]
  }
  kv {
    field_split => "&?"
    include_keys => [ "addID" ]
    allow_duplicate_values => "false"
  }
  if [addID] in "/Users/jshaw/addID.yml" {
    drop {}
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
You are using the in operator wrong. It is used to check whether a value is in an array, not in a file; files are usually a bit more complicated to use.
A solution would be to use the ruby filter to read the file.
Or put the addID values directly in your configuration file, like this:
if [addID] == "addID" {
  drop {}
}
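For the ruby filter approach mentioned above, a minimal sketch (it assumes ID.yml is a plain YAML list of IDs, loads it once at pipeline startup rather than on every event, and reuses the path from the question):

filter {
  ruby {
    # load the blocklist once when the pipeline starts
    init => '
      require "yaml"
      @blocked_ids = YAML.load_file("/Users/jshaw/addID.yml")
    '
    # cancel (drop) any event whose addID is on the list
    code => '
      event.cancel if @blocked_ids.include?(event.get("addID"))
    '
  }
}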

Logstash float field does not get indexed

The following is my Logstash configuration. When I feed a log into Logstash, it works as expected: all the fields are accepted by Elasticsearch and the value and type of every field is correct. However, when I view the log in Kibana, it says the cost field is not indexed, so it can't be visualized, while all the string fields are indexed. I want to visualize my float field. Does anyone know what the problem is?
input {
  syslog {
    facility_labels => ["local0"]
    port => 515
  }
  stdin {}
}

filter {
  grok {
    overwrite => ["host", "message"]
    match => { "message" => " %{BASE10NUM:cost} %{GREEDYDATA:message}" }
  }
  mutate {
    convert => { "cost" => "float" }
  }
}

output {
  stdout {
    codec => rubydebug
  }
  elasticsearch { }
}
Kibana doesn't automatically reload new fields from Elasticsearch; you need to reload them manually.
Go to the Settings tab, select your index, and reload the field list.
