logstash add_field conversion issue - elasticsearch

I am using Logstash version 5.0.2. I am parsing a file whose lines include a filename as one of the fields extracted by the grok filter, but for visualization I need a file number to identify each file. So I added a new field through the mutate filter's add_field option, checking the filename in [message]:
if 'filename_1' in [message] {
  mutate { add_field => { "file_no" => "13" } }
  mutate { convert => [ "file_no", "float" ] }
}
If I check the parsing through stdin/stdout (with the rubydebug codec), the output shows that the file_no field is converted properly. But if I send the Logstash output to Elasticsearch, Kibana shows a conflict in the data type of that field.
There I can see file_no.keyword (as string) and file_no (as conflict), with this error:
Mapping conflict! A field is defined as several types (string, integer,
etc) across the indices that match this pattern. You may still be able to use
these conflict fields in parts of Kibana, but they will be unavailable for
functions that require Kibana to know their type. Correcting this issue will
require reindexing your data
I have converted the added field, so I am not sure why it is still being sent to Elasticsearch as a string.
Any help would be great.
When I tried converting the field in Kibana, there is no option for number. The source logfile being monitored doesn't contain this number, so I can't parse it directly as an integer with %{PATTERN_FOR_NUMBER:number_variable:int}, otherwise this would have been easier.
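For reference, that in-pattern cast would look something like the sketch below if the number were actually present in the log line (the pattern and field name here are made up for illustration, not taken from the real log):
grok {
  # hypothetical: cast the captured digits to an integer while grokking
  match => { "message" => "filename_%{NUMBER:file_no:int}" }
}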

Related

Add extra value to field before sending to elasticsearch

I'm using Logstash, Filebeat and grok to send data from logs to my Elasticsearch instance. This is the grok configuration in the pipeline:
filter {
  grok {
    match => {
      "message" => "%{SYSLOGTIMESTAMP:messageDate} %{GREEDYDATA:messagge}"
    }
  }
}
This works fine; the issue is that messageDate is in the format Jan 15 11:18:25 and it doesn't have a year entry.
Now, I actually know the year these files were created in, and I was wondering if it is possible to add that value to the field during processing, that is, somehow turn Jan 15 11:18:25 into 2016 Jan 15 11:18:25 before sending to Elasticsearch (obviously without editing the files, which I could do easily enough, but that would be a temporary fix rather than a definitive solution).
I have tried googling whether it is possible, but no luck so far.
Valepu,
One way to modify the data in a field is the ruby filter:
filter {
  ruby {
    code => "#your code here#"
  }
}
For more information, such as how to get and set field values, here is the link:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-ruby.html
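As a hedged sketch of what that could look like for this case (assuming a Logstash version whose ruby filter uses the event.get/event.set API; older releases access fields with event['field'] instead):
filter {
  ruby {
    # prepend the known year to the partial syslog date (illustrative only)
    code => "event.set('messageDate', '2016 ' + event.get('messageDate'))"
  }
}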
If you have a separate field for the date as a string, you can use the Logstash date plugin:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-date.html
If you don't have it as a separate field (as in this case), use this site to construct your own grok pattern:
http://grokconstructor.appspot.com/do/match
I made this to preprocess the values:
%{YEAR:yearVal} %{MONTH:monthVal} %{NUMBER:dayVal} %{TIME:timeVal} %{GREEDYDATA:message}
Not the most elegant, I guess, but you get the values in separate fields. Using this, you can build your own date field and parse it with the date filter so you get a comparable value, or you can use these fields by themselves. I'm sure there is a better solution, for example you could write your own grok pattern and use that, but I'm going to leave some exploration for you too. :)
By reading the grok documentation thoroughly, I found what Google couldn't find for me, and which I apparently missed the first time I read that page:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html#plugins-filters-grok-add_field
Using the add_field and remove_field options I managed to add the year to my date, then I used the date plugin to parse it into the timestamp. My filter configuration now looks like this:
filter {
  grok {
    match => {
      "message" => "%{SYSLOGTIMESTAMP:tMessageDate} %{GREEDYDATA:messagge}"
    }
    add_field => { "messageDate" => "2016 %{tMessageDate}" }
    remove_field => ["tMessageDate"]
  }
  date {
    match => [ "messageDate", "YYYY MMM dd HH:mm:ss" ]
  }
}
And it worked fine

Elastic Stack: I need to set the Time Filter field name to another field

I need to read messages (whose content is logs) from RabbitMQ with Logstash and then send them to Elasticsearch to build visual monitoring in Kibana. So I wrote the input for reading from RabbitMQ in Logstash like this:
input {
  rabbitmq {
    queue => "testLogstash"
    host => "localhost"
  }
}
and I wrote the output configuration for storing in Elasticsearch like this:
output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "d13-%{+YYYY.MM.dd}"
  }
}
Both of them are placed in myConf.conf
In the content of each message there is a JSON document that contains fields like this:
{
  "mDate": "MMMM dd YYYY, HH:mm:ss.SSS",
  "name": "test name"
}
But there are two problems. First, when creating the new index pattern, there is no date field offered for the Time Filter field name. Second, unless I use the default @timestamp, this field is not available when building graphs. I think the reason for this is the data type of the field: it should be of type date, but it is being treated as a string.
I tried to convert the value of the field to date with mutate in the Logstash config like this:
filter {
  mutate {
    convert => { "mdate" => "date" }
  }
}
Now, two questions arise:
1. Is this the problem? If yes, what is the right way to fix it?
2. My main need is to use the time when messages entered the queue, not when Logstash picked them up. What is the best solution?
If you don't specify a value for @timestamp, you should get the current system time when Elasticsearch indexes the document. With that, you should be able to see items in Kibana.
If I understand you correctly, you'd rather use your mDate field for @timestamp. For this, use the date{} filter in Logstash.
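A minimal sketch, assuming mDate really arrives in the MMMM dd YYYY, HH:mm:ss.SSS layout shown above:
filter {
  date {
    # parse mDate and write the result into @timestamp
    match => [ "mDate", "MMMM dd YYYY, HH:mm:ss.SSS" ]
    target => "@timestamp"
  }
}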

Why does a parse failure occur in Logstash when the field type is the same as before, with no change?

The logstash log file says:
"tags"=>["_grokparsefailure"]}, "status_code"]}>, #data={"#version"=>"1", "#timestamp"=>"2016-09-24T08:00:54.894Z", "path"=>"/var/log/nginx/access.log", "host"=>"sample-com", "remote_addr"=>"127.0.0.1", "remote_user"=>"-", "date"=>"05/Sep/2016:10:03:01 +0000", "method"=>"GET", "uri_path"=>"/accounts", "version"=>"HTTP/1.1", "status_code"=>"200", "body_byte_sent"=>419, "referer"=>"-", "user_agent"=>"python-requests/2.4.3 CPython/2.7.9 Linux/3.16.0-4-amd64", "request_time"=>6.161, "auth_type"=>"Bearer", "client_id"=>"beta",
"web_client_ip"=>"172.*.131.177", "response_json"=>{"_links"=>{"applications"=>{"href"=>"/applications"}, "menus"=>{"href"=>"/menus"}, "messages"=>{"href"=>"/messages"}, "numbers"=>{"href"=>"/numbers"}, "self"=>{"href"=>"/accounts"}}, "account_status"=>"active", "creation_date"=>"2016-06-07 09:25:18", "credit"=>{"balance"=>#<BigDecimal:367dbf49,'0.19819267E4',8(12)>, "currency"=>"usd"}, "email"=>"*#gmail.com",
"id"=>"677756yt7557", "lastname"=>"Qurbani", "name"=>"M", "notifications"=>{"black_list"=>{"uids"=>[]}, "settings"=>{"email"=>{"low_credit"=>true, "new_feature"=>true, "receive_f"=>true, "send_f"=>true, "voice"=>true}, "language"=>"en", "push_notif"=>{"low_credit"=>true, "new_feature"=>true, "receive_f"=>true, "send_f"=>true, "voice"=>true}, "sms"=>{"low_credit"=>true, "new_feature"=>true, "receive_f"=>true, "send_f"=>true, "voice"=>true}}}, "phone"=>"+9****", "status"=>"inactive", "verification_status"=>{"email"=>"unverified", "phone"=>"verified"}}, "request_json"=>{}, "tags"=>["_grokparsefailure"]}, #metadata_accessors=#<LogStash::Util::Accessors:0x6ec6acbe #store={"path"=>"/var/log/nginx/access.log"}, #lut={"[path]"=>[{"path"=>"/var/log/nginx/access.log"}, "path"]}>,
#cancelled=false>], :response=>{"create"=>{"_index"=>"logstash-api-2016.09.24", "_type"=>"logs", "_id"=>"AVdbNisZCijYhuqEamFy", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception",
"reason"=>"failed to parse [response_json.credit]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"unknown property [balance]"}}}}, :level=>:warn}
Here I have a log line whose credit section looks like this:
"credit": {"balance": 0.0, "currency": "usd"}
I have removed all the indices from Elasticsearch, and I didn't find any .sincedb* files in home or elsewhere to remove the Logstash state.
Why does this error happen when I don't actually have a change in the balance value? What is the reason for it?
After restarting Logstash, it does not aggregate data from the log files!
I removed all since_dbs_* files from /var/lib/logstash/ and told the Logstash configuration to start tailing from the beginning position.
Now the below error is raised:
object mapping for [response_json.credit] tried to parse field [credit] as object, but found a concrete value
It seems that sometimes credit is sent as a scalar value and sometimes as an object with two fields!
EDIT 1:
Two different credit fields with different data have been posted to one credit field in Elasticsearch. So I tried to rename these fields and remove credit from both configs in Logstash, so for now I have:
add_field => {"first_credit" => "%{[response_json.credit]}"}
remove_field => ["response_json.credit"]
The new field gets added, but its value is literally %{[response_json.credit]}, and the field is not removed, so the error happens again. I want to take the value of credit, put it inside first_credit and remove credit itself. I even tried the below:
add_field => {"first_credit" => "%{[response_json][credit]}"}
remove_field => ["response_json.credit"]
What am I doing wrong?
EDIT 2:
I have noticed that a single file, access.log, has credit fields with two different kinds of value.
One credit is numeric: 2.99
The other credit is a JSON object: {"currency": "usd", "balance": 2.99}
I used the below Logstash configuration to try to solve the problem and save them all as strings in ES:
if ([response_json][credit]) {
  mutate {
    add_field => { "new_credit" => "%{[response_json][credit]}" }
    remove_field => [ "[response_json][credit]" ]
  }
}
It gives the below error:
"new_credit"=>"{\"balance\":3.102,\"currency\":\"usd\"}", "tags"=>["_grokparsefailure"]},
@metadata_accessors=#<LogStash::Util::Accessors:0x46761362 @store={"path"=>"/var/log/nginx/access.log.1"},
@lut={"[path]"=>[{"path"=>"/var/log/nginx/access.log.1"}, "path"]}>,
@cancelled=false>], :response=>{"create"=>{"_index"=>"logstash-api-2016.09.27", "_type"=>"logs", "_id"=>"AVdqrION3CJVjhZgZcnl", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception",
"reason"=>"failed to parse [new_credit]", "caused_by"=>{"type"=>"number_format_exception", "reason"=>"For input string: \"{\"balance\":3.102,\"currency\":\"usd\"}\""}}}}, :level=>:warn
From looking at your log ("credit"=>{"balance"=>#<BigDecimal:367dbf49,'0.19819267E4',8(12)>), I think this issue may be related.
If you check the Elasticsearch mapping of your index at {elasticsearch:ip}:9200/logstash-api-2016.09.24/_mapping, I bet that the balance field has an Integer mapping. If there was initially an integer mapping, any value that is not an integer (for example, an object) will fail.
You can resolve this by creating an index template that specifies balance as a float. If you choose to do this, ensure that you delete the old index or create a new one, as existing mappings cannot be modified.
You could also ensure that balance is always the same data type in the source of the logs.
Or you could add a mutate filter and convert the balance field to your desired data type.
Check out your mapping and let me know if my theory is right. :)
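As a rough sketch of the mutate option above (assuming the value arrives as the nested object with a numeric balance, using the field names from your log):
filter {
  mutate {
    # coerce the nested balance to a float so the mapping stays numeric and consistent
    convert => { "[response_json][credit][balance]" => "float" }
  }
}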
EDIT:
The code block you just sent me will have exactly the same problem as before - object credit and int credit will be stored in the same field. The following will store credit[balance] (an int) and int credit in the same field called new_credit, which should be mapped to an Integer.
if ([response_json][credit][balance]) {
  mutate {
    add_field => { "new_credit" => "%{[response_json][credit][balance]}" }
    remove_field => [ "[response_json][credit]" ]
  }
}
else {
  mutate {
    add_field => { "new_credit" => "%{[response_json][credit]}" }
    remove_field => [ "[response_json][credit]" ]
  }
}

Elasticsearch converting a string to number

I am new to Elasticsearch and am just starting out with the ELK stack. I am collecting key-value type logs with Logstash and passing them to an index in Elasticsearch. I am using the kv filter plugin in Logstash, and because of this, all the fields are of string type by default.
When I try to perform an aggregation like avg or sum on a numeric field in Elasticsearch, I get an exception: ClassCastException[org.elasticsearch.index.fielddata.plain.PagedBytesIndexFieldData cannot be cast to org.elasticsearch.index.fielddata.IndexNumericFieldData]
When I check the mappings in the index, all the fields except the timestamp ones are marked as string.
Please tell me how to overcome this issue as I have many numeric fields in my log events for aggregation.
Thanks,
Keerthana
You could set explicit mappings for those fields (see e.g. Change default mapping of string to "not analyzed" in Elasticsearch for some guidance), but it's easier to just convert those fields to integers in Logstash using the mutate filter:
mutate {
  convert => ["name-of-field", "integer"]
}
Then Elasticsearch will do a better job at guessing the best data type for your field(s).
(See also Data type conversion using logstash grok.)
In the latest Logstash the syntax is as follows:
filter {
  mutate {
    convert => { "fieldname" => "integer" }
  }
}
You can visit this link for more detail: https://www.elastic.co/guide/en/logstash/current/plugins-filters-mutate.html#plugins-filters-mutate-convert
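Putting the kv parsing and the conversion together might look like the sketch below ("response_time" is a hypothetical field name; substitute your real numeric fields):
filter {
  kv { }  # split key=value pairs; every captured value starts out as a string
  mutate {
    # convert the hypothetical numeric field so Elasticsearch maps it as a number
    convert => { "response_time" => "integer" }
  }
}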

Logstash grok filter integer

I need to index numerical data in my Elasticsearch DB, and I'm using the grok filter to parse the log line (which is all comma-separated integers).
Trying to use the format %{NUMBER:userID_2:int} did not work: no data was indexed and no exception appeared.
When I changed the type to "float", i.e. %{NUMBER:userID_2:float}, it worked just fine.
Any idea why I'm not able to index integers?
(Using Elasticsearch 1.4.4 and Logstash 1.4.1)
Thanks!
In "filter" section you set up match expression:
match => "%{NUMBER:user_id}"
and then you convert it:
mutate {
convert => {
"user_id" => "integer"
....
}
}
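Put together, a minimal filter for this case might look like the sketch below (it only captures the first value of the comma-separated line, so extend the pattern for the rest; note that older Logstash releases use the array form of convert instead of the hash form):
filter {
  grok {
    # capture the first integer from the comma-separated line
    match => { "message" => "%{NUMBER:user_id}" }
  }
  mutate {
    # cast the captured string to an integer before indexing
    convert => { "user_id" => "integer" }
  }
}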
