Can filebeat dissect a log line with spaces? - elasticsearch

So I have a log line formatted as such:
2020-04-15 12:16:44,936 WARN c.e.d.c.p.p.BasePooledObjectFactory [main] Caution - XML schema validation has been disabled! Validation is only available when using XML.
I am using Filebeat to send this directly to Elasticsearch. That works, but log.level is not set and the whole line ends up in message.
Reading up on dissection, I intended to use:
- add_host_metadata: ~
- dissect:
    tokenizer: "%{} %{} %{log.level} %{} [%{}] %{message}"
    field: "message"
    target_prefix: ""
which I expected to split into:
log.level: WARN
message: Caution - XML schema validation has been disabled! Validation is only available when using XML.
instead I get the same output as without the dissect:
message: 2020-04-15 12:16:44,936 WARN c.e.d.c.p.p.BasePooledObjectFactory [main] Caution - XML schema validation has been disabled! Validation is only available when using XML.
I'm just getting to grips with Filebeat. The documentation made it look simple enough, but my dissect is currently not doing anything. Host metadata is being added, so I believe the processors are being called.
How can I get the log level out of the log line? (preferably without changing the format of the log itself)

You need to pick a field name other than message in the dissect tokenizer, since message is the field that already contains the original log line:
- add_host_metadata: ~
- dissect:
    tokenizer: "%{} %{} %{log.level} %{} [%{}] %{msg}"
    field: "message"
    target_prefix: ""
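To see which fields the tokenizer above should produce, here is a minimal Python sketch that emulates dissect-style tokenization with a regular expression. It is only an approximation for illustration, not Filebeat's implementation, and the helper name dissect is hypothetical.

```python
import re

def dissect(tokenizer, line):
    """Emulate a dissect-style tokenizer: literal text between %{...} keys
    acts as a delimiter, and each key captures the text between delimiters."""
    # re.split with a capture group alternates: literal, key, literal, key, ...
    parts = re.split(r"%\{([^}]*)\}", tokenizer)
    keys, regex = [], "^"
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)   # literal delimiter
        else:
            keys.append(part)          # key name; "" means "skip this value"
            regex += "(.*?)"
    match = re.match(regex + "$", line)
    if match is None:
        return None
    # Drop the values captured by empty %{} keys
    return {k: v for k, v in zip(keys, match.groups()) if k}

line = ("2020-04-15 12:16:44,936 WARN c.e.d.c.p.p.BasePooledObjectFactory "
        "[main] Caution - XML schema validation has been disabled! "
        "Validation is only available when using XML.")
fields = dissect("%{} %{} %{log.level} %{} [%{}] %{msg}", line)
print(fields["log.level"])
print(fields["msg"])
```

The sketch returns only log.level and msg, because the values matched by the empty %{} keys are discarded.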


Transform String into JSON so that it's searchable in Kibana/Elasticsearch

I have Elasticsearch, Filebeat and Kibana running on a Windows machine. The Filebeat log input has a proper log file and is watching the path. When I look at the data in Kibana, it looks fine.
My issue is that the message field is a String.
Example of one log line:
12:58:09.9608 Trace {"message":"No more Excel rows found","level":"Trace","logType":"User","timeStamp":"2020-08-14T12:58:09.9608349+02:00","fingerprint":"226fdd2-e56a-4af4-a7ff-724a1a0fea24","windowsIdentity":"mine","machineName":"NAME-PC","processName":"name","processVersion":"","jobId":"957ef018-0a14-49d2-8c95-2754479bb8dd","robotName":"NAME-PC","machineId":6,"organizationUnitId":1,"fileName":"GetTransactionData"}
What I would like is to have that string converted to JSON, so that I can search in Kibana on, for example, the level field.
I already had a look at Filebeat. There I tried to enable Logstash, but then the data no longer arrives in Elasticsearch, and no log file is generated in the Logstash folder.
Then I downloaded Logstash following the install guide, but unfortunately I got this message:
Logstash logs to C:/Users/mine/Desktop/logstash-7.8.1/logs which
is now configured via ERROR: Pipelines YAML file is
empty. Location:
C:/Users/mine/Desktop/logstash-7.8.1/config/pipelines.yml usage:
bin/logstash -f CONFIG_PATH [-t] [-r] [] [-w COUNT] [-l LOG]
bin/logstash --modules MODULE_NAME [-M
[-w COUNT] [-l LOG] bin/logstash -e CONFIG_STR [-t] [--log.level
fatal|error|warn|info|debug|trace] [-w COUNT] [-l LOG] bin/logstash
-i SHELL [--log.level fatal|error|warn|info|debug|trace] bin/logstash -V [--log.level fatal|error|warn|info|debug|trace]
bin/logstash --help
[2020-08-14T15:07:51,696][ERROR][org.logstash.Logstash ]
java.lang.IllegalStateException: Logstash stopped processing because
of an error: (SystemExit) exit
I tried to use Filebeat only. Here I set:
- add_host_metadata: ~
- add_cloud_metadata: ~
- add_docker_metadata: ~
- add_kubernetes_metadata: ~
- dissect:
    tokenizer: '"%{event_time} %{loglevel} %{json_message}"'
    field: "message"
    target_prefix: "dissect"
- decode_json_fields:
    fields: ["json_message"]
but that gave me:
The tip about removing the extra quotes ("") from the tokenizer helped. Then I got:
I simply refreshed the index and the message was gone. Nice.
The question now is how to filter for something in the new field.
The message says your pipeline config is empty; it seems you have not configured any pipeline yet. Logstash can do the trick (JSON filter plugin), but Filebeat is sufficient here. If you don't want to introduce another service, this is the better option.
Filebeat has the decode_json_fields processor, which decodes specific fields in your event that contain JSON strings. See its documentation for details.
For the future case where your whole event is JSON, Filebeat can parse it on input by configuring json.message_key and the related json.* options.
EDIT - Added a Filebeat snippet as a processors example that dissects the log line into three fields (event_time, loglevel, json_message). Afterwards, the extracted field json_message, whose value is a JSON object encoded as a string, is decoded into a JSON structure:
- type: log
  paths:
    - path to your logfile
  processors:
    - dissect:
        tokenizer: '%{event_time} %{loglevel} %{json_message}'
        field: "message"
        target_prefix: "dissect"
    - decode_json_fields:
        fields: ["dissect.json_message"]
        target: ""
    - drop_fields:
        fields: ["dissect.json_message"]
If you want to practice with the Filebeat processors, try setting the correct event timestamp, taken from the decoded JSON and written into @timestamp using the timestamp processor.
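The dissect-then-decode sequence can be sketched in plain Python to check what the resulting event should roughly look like. This is an illustration under assumptions, not Filebeat code: parse_line is a hypothetical helper, the sample line is a shortened version of the one in the question, and how Filebeat resolves key conflicts (for example the decoded message key) depends on processor settings such as overwrite_keys.

```python
import json

def parse_line(line):
    """Split 'time level {json}' into three parts, then decode the JSON part."""
    # maxsplit=2 keeps spaces inside the JSON payload intact
    event_time, loglevel, json_message = line.split(" ", 2)
    event = {"dissect": {"event_time": event_time, "loglevel": loglevel}}
    # Merge the decoded keys into the root of the event (like target: "")
    event.update(json.loads(json_message))
    return event

line = ('12:58:09.9608 Trace {"message":"No more Excel rows found",'
        '"level":"Trace","logType":"User","machineId":6}')
event = parse_line(line)
print(event["level"])
print(event["dissect"]["event_time"])
```

Once the keys sit at the root of the event, a field such as level can be searched like any other field.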

ELK parse json field as separate fields

I have json like this:
{"date":"2018-12-14 00:00:44,292","service":"aaa","severity":"DEBUG","trace":"abb161a98c23fc04","span":"cd782a330dd3271b","parent":"abb161a98c23fc04","pid":"12691","thread":"http-nio-9080-exec-12","message":"{\"type\":\"Request\",\"lang\":\"pl\",\"method\":\"POST\",\"sessionId\":5200,\"ipAddress\":\"\",\"username\":\"\",\"contentType\":\"null\",\"url\":\"/aaa/getTime\",\"queryString\":\"null\",\"payload\":\",}"}
The issue is that above we have:
The application saves its log file that way,
and Filebeat and Logstash do not parse it the way I want.
I see only one field in Kibana, named message, but I want separate fields such as type, lang, method, etc.
I think the issue is caused by the \ sign before each " character.
How can I change the behavior of Filebeat/Logstash to make this happen?
The application is too large for me to add net.logstash.logback.encoder.LogstashEncoder everywhere in the project's Java files.
I have many logback-json.xml files.
These files have:
<encoder class="net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder">
"severity": "%level",
"service": "${springAppName}",
"trace": "%X{X-B3-TraceId:-}",
"span": "%X{X-B3-SpanId:-}",
"parent": "%X{X-B3-ParentSpanId:-}",
"exportable": "%X{X-Span-Export:-}",
"pid": "${PID:-}",
"thread": "%thread",
"class": "%logger{26}",
"message": "%message",
"ex": "%ex"
I tried adding something like "jsonMessage": "#asJson{%message}"
mentioned here:
but when the message is like the one shown above, it fails to parse and I get "jsonMessage":null
In a simpler case I get:
for example, and not null.
My filebeat config:
I wrote the following config; if I start Logstash with this file, I can see the correct JSON in Kibana.
input {
  file {
    path => "C:/Temp/logFile.log"
    start_position => "beginning"
  }
}
filter {
  json {
    source => "message"
    target => "parsedJson"
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "demo"
    document_type => "demo"
  }
  stdout { }
}
Use this configuration in your Logstash filter:
filter {
  json { source => "message" target => "message1" }
  mutate { remove_field => [ "message" ] }
}
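What the json filter does with the message field amounts to a second decoding pass: the outer line is JSON, and its message value is itself a JSON document stored as an escaped string. A minimal Python sketch of that idea follows. The helper expand_message and the shortened sample event are assumptions for illustration, and the parsedJson target name is borrowed from the config above.

```python
import json

def expand_message(raw_line, target="parsedJson"):
    """Decode the outer JSON line, then try to decode its 'message' string."""
    event = json.loads(raw_line)
    try:
        inner = json.loads(event.get("message"))
    except (json.JSONDecodeError, TypeError):
        # message is missing or not valid JSON: leave the event unchanged
        return event
    event[target] = inner
    return event

# Build the sample with json.dumps so the inner quotes are escaped correctly
inner = {"type": "Request", "lang": "pl", "method": "POST"}
raw = json.dumps({"severity": "DEBUG", "message": json.dumps(inner)})
event = expand_message(raw)
print(event["parsedJson"]["method"])

# An inner string with an unterminated value cannot be decoded
broken = expand_message(json.dumps({"message": '{"payload":",}'}))
print("parsedJson" in broken)
```

The escaped quotes (\") are not the obstacle in themselves, since they are how a JSON string embeds another JSON document. Decoding fails only when the inner string is not valid JSON once unescaped.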

Multi-line pattern in FileBeat

I am using Filebeat for log aggregation, which takes the logs to Kibana. Below is my error message that needs to be directed to Kibana:
2017-04-17 15:45:47,154 [JCO.ServerThread-8] ERROR com.webservice.AxisWebServiceClient - Client error
2017-04-17 15:45:47,154 [JCO.ServerThread-8] ERROR com.webservice.AxisWebServiceClient - The XML request is invalid. Fix the request and resend.
2017-04-04 12:47:09,362 [JCO.ServerThread-3] INFO - End RFC_CALCULATE_TAXES_DOC
2017-04-04 12:47:09,362 [JCO.ServerThread-3] DEBUG com.Time - RFC_CALCULATE_TAXES_DOC,DEC[2],Total Time,39
I want only the 2017-04-17 15:45:47,154 [JCO.ServerThread-8] ERROR lines, and the lines below the error, to be sent to Kibana, but I get the INFO part as well.
Below is my filebeat.yml file:
- paths:
    - /apps/global/vertex/SIC_HOME_XEC/logs/sic.log
  input_type: log
  exclude_lines: ['^INFO']
  #include_lines: 'ERROR'
  multiline:
    pattern: '^\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3}\s\[[A-Za-z0-9.-]*\]\s[E]RROR'
    negate: true
    match: after
Requesting help from veterans to select only the ERROR message pattern using a regex.
To extract the error messages as a group, modify your regex as follows:
^\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3}\s\[[A-Za-z0-9.-]*\]\sERROR (\w.+)
The group (\w.+) captures the error message: a word character followed by every remaining character on the line.
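The regex can be tried against the sample lines from the question with a short Python sketch. Python's re module is used here purely for testing the pattern and is an assumption; Filebeat uses its own regex engine, so behavior may differ for more exotic syntax.

```python
import re

ERROR_LINE = re.compile(
    r"^\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3}\s"
    r"\[[A-Za-z0-9.-]*\]\sERROR (\w.+)"
)

log_lines = [
    "2017-04-17 15:45:47,154 [JCO.ServerThread-8] ERROR com.webservice.AxisWebServiceClient - Client error",
    "2017-04-17 15:45:47,154 [JCO.ServerThread-8] ERROR com.webservice.AxisWebServiceClient - The XML request is invalid. Fix the request and resend.",
    "2017-04-04 12:47:09,362 [JCO.ServerThread-3] INFO - End RFC_CALCULATE_TAXES_DOC",
    "2017-04-04 12:47:09,362 [JCO.ServerThread-3] DEBUG com.Time - RFC_CALCULATE_TAXES_DOC,DEC[2],Total Time,39",
]

# Keep only the captured message of lines that match the ERROR pattern
errors = []
for line in log_lines:
    match = ERROR_LINE.match(line)
    if match:
        errors.append(match.group(1))

for message in errors:
    print(message)
```

Only the two ERROR lines match; the INFO and DEBUG lines start with a timestamp too, but fail at the literal ERROR.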

sematext logagent debugging patterns

I have installed Sematext Logagent.
I configured it to output to Elasticsearch and all is good, except for one thing I spent all day trying to do.
There is no information at all on how to debug parsers. I start Logagent with "logagent --config logagent.yml -v -j"; the yml file is below:
printStats: 30
# don't write parsed logs to stdout
suppress: false
# Enable/disable GeoIP lookups
# Startup of logagent might be slower, when downloading the GeoIP database
geoipEnabled: false
# Directory to store Logagent status and temporary files
diskBufferDir: ./tmp
- '/var/log/messages'
- '/var/log/test'
sourceName: !!js/regexp /test/
- type: mysyslog
regex: !!js/regexp /([a-z]){2}(.*)/
fields: [message,severity]
dateFormat: MMM DD HH:mm:ss
module: elasticsearch
url: http://host:9200
index: mysyslog
stdout: yaml # use 'pretty' for pretty json and 'ldjson' for line delimited json (default)
I would expect (based on the scarce documentation) that this would split each line of the test file in two. For example, for 'ggff', 'gg' would be message and 'ff' would be severity, but all I can see in Kibana is that 'ggff' is the message and severity is defaulted (?) to info. I don't know where the problem is. Is my pattern skipped, or does the match in my pattern fail? Any help would be very much appreciated.
Setting 'debug: true' in patterns.yml prints detailed info about matched patterns.
Watch Logagent issue #69 for additional improvements.
The docs have moved. I recommend testing regular expressions first (please use JavaScript regex syntax).
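As a quick check of the pattern from the question, the sketch below runs it against 'ggff'. It uses Python's re module rather than JavaScript, on the assumption that the two engines treat this simple pattern the same way: a repeated capture group keeps only its last repetition.

```python
import re

# Pattern from the question: the group ([a-z]) is repeated twice,
# so it retains only the last character it matched
original = re.match(r"([a-z]){2}(.*)", "ggff").groups()
print(original)

# Moving the quantifier inside the group captures both characters
adjusted = re.match(r"([a-z]{2})(.*)", "ggff").groups()
print(adjusted)
```

With the original pattern the first group holds 'g' rather than 'gg', so even a successful match would not yield the expected message value.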
Examples for Syslog messages in /var/log are in the default pattern library.

How do I prevent Elasticsearch's _analyze from interpreting yml

I'm trying to use the _analyze api with text that looks like this:
--- some -- text ---
This request works as expected:
curl localhost:9200/my_index/_analyze -d '--'
However, this one fails:
curl localhost:9200/medical_documents/_analyze -d '---'
- type: "illegal_argument_exception"
reason: "Malforrmed content, must start with an object"
type: "illegal_argument_exception"
reason: "Malforrmed content, must start with an object"
status: 400
Considering the formatting of the response, I assume that Elasticsearch tried to parse the request as YAML and failed.
If that is the case, how can I disable YAML parsing, or _analyze a text that starts with --- ?
The problem is not the YAML parser. The problem is that you are trying to create a type.
The following is incorrect (it will give you the "Malforrmed content, must start with an object" error):
curl localhost:9200/my_index/medical_documents/_analyze -d '---'
This will give you no error, but it is still incorrect, because it tells Elasticsearch to create a new type:
curl localhost:9200/my_index/medical_documents/_analyze -d '{"analyzer" : "standard","text" : "this is a test"}'
Analyzers are created at the index level. Verify with:
curl -XGET 'localhost:9200/my_index/_settings'
So the proper way is:
curl -XGET 'localhost:9200/my_index/_analyze' -d '{"analyzer" : "your_analyzer_name","text" : "----"}'
The analyzer needs to be created beforehand.
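Wrapping the text in a JSON object, as in the last curl command, means the request body always starts with an object regardless of what the text itself looks like. A minimal Python sketch for building such a body follows. The helper analyze_body is hypothetical, and "standard" stands in for your own analyzer name.

```python
import json

def analyze_body(text, analyzer="standard"):
    """Build a JSON request body for _analyze; json.dumps handles escaping."""
    return json.dumps({"analyzer": analyzer, "text": text})

body = analyze_body("--- some -- text ---")
print(body)
```

The resulting string can be passed to curl with -d, so the leading --- is sent inside the text value and never appears at the start of the body.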
