Logstash timestamp appearing as text in Elasticsearch

I am using Elasticsearch 7.3.1 and Logstash 7.3.1. I am trying to use one of my fields as the Elasticsearch timestamp via the date filter. The data is being inserted properly, but the type of @timestamp comes out as text. How do I fix this?
My input timestamp looks like 1567408605794750813. My config is:
input {
  elasticsearch {
    hosts => "x.x.x.x"
    index => "raw"
    docinfo => true
  }
}
filter {
  mutate {
    convert => {
      "timestamp" => "integer"
    }
  }
  date {
    match => ["timestamp", "UNIX_MS", "ISO8601"]
    target => "@timestamp"
  }
}
output {
  elasticsearch {
    index => "logs-%{app_name}"
    document_id => "%{[@metadata][_id]}"
  }
}
After querying the mapping API, I get:
"#timestamp" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}

Try the date filter below:
date {
  match => [ "[timestamp]", "UNIX" ]
  target => "[@timestamp]"
}
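Note that a value like 1567408605794750813 looks like an epoch in nanoseconds, while UNIX expects seconds and UNIX_MS expects milliseconds, so the date filter may still fail to parse it. A minimal sketch, assuming the field really is nanoseconds since the epoch (sub-millisecond precision is lost), is to scale it down with a ruby filter before the date filter:
ruby {
  # assumption: "timestamp" holds nanoseconds since the epoch; scale to seconds
  code => "event.set('timestamp', event.get('timestamp').to_f / 1_000_000_000.0)"
}
date {
  match => [ "timestamp", "UNIX" ]
  target => "@timestamp"
}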

Related

Elasticsearch - Reindex documents with stored / excluded fields

I have an index mapping with the following configuration:
"mappings" : {
"_source" : {
"excludes" : [
"special_field"
]
},
"properties" : {
"special_field" : {
"type" : "text",
"store" : true
},
}
}
So, when a new document is indexed using this mapping, I get the following result:
{
  "_index": "********-2021",
  "_id": "************",
  "_source": {
    ...
  },
  "fields": {
    "special_field": [
      "my special text"
    ]
  }
}
If a _search query is performed, special_field is not returned inside _source, as it is excluded.
With the following _search query, special_field data is returned perfectly:
GET ********-2021/_search
{
  "stored_fields": [ "special_field" ],
  "_source": true
}
Right now I'm trying to reindex all documents inside that index, but I'm losing the info stored in special_field, and only the _source field is getting reindexed.
Is there a way to put that special_field back inside the _source field?
Is there a way to reindex those documents without losing special_field data?
How could these documents be migrated to another cluster without losing special_field data?
Thank you all.
Thanks Hamid Bayat, I finally got it working using a small Logstash pipeline.
I will share it:
input {
  elasticsearch {
    hosts => "my-first-cluster:9200"
    index => "my-index-pattern-*"
    user => "****"
    password => "****"
    query => '{ "stored_fields": [ "special_field" ], "_source": true }'
    size => 500
    scroll => "5m"
    docinfo => true
    docinfo_fields => ["_index", "_type", "_id", "fields"]
  }
}
filter {
  if [@metadata][fields][special_field] {
    mutate {
      add_field => { "special_field" => "%{[@metadata][fields][special_field]}" }
    }
  }
}
output {
  elasticsearch {
    hosts => ["http://my-second-cluster:9200"]
    password => "****"
    user => "****"
    index => "%{[@metadata][_index]}"
    document_id => "%{[@metadata][_id]}"
    template => "/usr/share/logstash/config/index_template.json"
    template_name => "template-name"
    template_overwrite => true
  }
}
I had to add "fields" to docinfo_fields => ["_index", "_type", "_id", "fields"] in the elasticsearch input plugin, and all my stored_fields ended up in the [@metadata][fields] event field.
As the @metadata field is not indexed, I had to add a new field at the root level with the [@metadata][fields][special_field] value.
It's working like a charm.

Converting fields from String to Date in Logstash

I'm trying to index emails into Elasticsearch with Logstash.
My conf file is like this:
sudo bin/logstash -e 'input {
  imap {
    host => "imap.googlemail.com"
    password => "********"
    user => "********@gmail.com"
    port => 993
    secure => "true"
    check_interval => 10
    folder => "Inbox"
    verify_cert => "false"
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch {
    index => "emails"
    document_type => "email"
    hosts => "localhost:9200"
  }
}'
The problem is that two of the output fields are parsed as string fields, but they are supposed to be date fields.
The format of the fields is as below:
"x-dbworld-deadline" => "31-Jul-2019"
"x-dbworld-start-date" => "18-Nov-2019"
How can I convert these two fields into date fields?
Thanks!
How about creating the index mapping on Elasticsearch?
It may look like this:
PUT date-test-191211
{
  "mappings": {
    "_doc": {
      "properties": {
        "x-dbworld-deadline": {
          "type": "date",
          "format": "dd-MMM-yyyy"
        },
        "x-dbworld-start-date": {
          "type": "date",
          "format": "dd-MMM-yyyy"
        }
      }
    }
  }
}
Then those fields are recognized as the date type.
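If you would rather convert on the Logstash side instead of relying on the index mapping, a sketch (assuming the fields always use the dd-MMM-yyyy layout shown above) is a pair of date filters that parse each field and write the result back into it, so Elasticsearch receives an ISO 8601 date:
date {
  # parse "31-Jul-2019" style values back into the same field
  match => [ "x-dbworld-deadline", "dd-MMM-yyyy" ]
  target => "x-dbworld-deadline"
}
date {
  match => [ "x-dbworld-start-date", "dd-MMM-yyyy" ]
  target => "x-dbworld-start-date"
}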

How to transfer data to Elastic via Logstash and use an analyzer?

I have the Logstash config file below. Elastic is reading my data as "a b", whereas I want it to read it as "ab". I found I need to use not_analyzed for my Sscat field, and max_shingle_size / min_shingle_size for Products, to get the best result.
Should I use not_analyzed for the Products field as well? Will that give a better result?
How should I fill in my_id_analyzer to actually use the analyzer on different fields?
How should I connect the template with the Logstash config file?
input {
  file {
    path => "path"
    start_position => "beginning"
  }
}
filter {
  csv {
    separator => ","
    columns => ["Index", "Category", "Scat", "Sscat", "Products", "Measure", "Price", "Description", "Gst"]
  }
  mutate { convert => ["Index", "float"] }
  mutate { convert => ["Price", "float"] }
  mutate { convert => ["Gst", "float"] }
}
output {
  elasticsearch {
    hosts => "host"
    user => "elastic"
    password => "pass"
    index => "masterdb"
  }
}
I also have a template that can do it for all the future files that I upload:
curl -u user:pass -XPUT "host/_template/logstash-id" -d '{
  "template": "logstash-*",
  "settings": {
    "analysis": {
      "analyzer": {
        "my_id_analyzer": {
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "id": { "type": "string", "analyzer": "my_id_analyzer" }
    }
  }
}'
You can use "ignore_above" to restrict values to a maximum length, along with "not_analyzed", when creating the mapping so that the text doesn't get analyzed.
Declaring the type as keyword instead of text would be another alternative for you (see the sketch below).
Regarding connecting the template with Logstash: why do you need this? Once the template is created on Elasticsearch, you can create your index, which will follow the template definition, and then start indexing.
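For example, a legacy index template along these lines would map Sscat as a keyword (a sketch only: the template name and the choice of field are assumptions based on the config above, and the syntax targets the Elasticsearch 7.x _template API; older versions use "template" instead of "index_patterns" and require a mapping type name):
PUT _template/masterdb-template
{
  "index_patterns": ["masterdb"],
  "mappings": {
    "properties": {
      "Sscat": { "type": "keyword", "ignore_above": 256 }
    }
  }
}
Once this template exists, any newly created masterdb index picks it up automatically, which is why no extra wiring is needed in the Logstash config.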

logstash and elasticsearch geo_point

I am using logstash to input geospatial data from a csv into elasticsearch as geo_points.
The CSV looks like the following:
$ head -5 geo_data.csv
"lon","lat","lon2","lat2","d","i","approx_bearing"
-1.7841,50.7408,-1.7841,50.7408,0.982654,1,256.307
-1.7841,50.7408,-1.78411,50.7408,0.982654,1,256.307
-1.78411,50.7408,-1.78412,50.7408,0.982654,1,256.307
-1.78412,50.7408,-1.78413,50.7408,0.982654,1,256.307
I have created a mapping template that looks like the following:
$ cat map_template.json
{
"template": "base_map_template",
"order": 1,
"settings": {
"number_of_shards": 1
},
{
"mappings": {
"base_map": {
"properties": {
"lon2": { "type" : "float" },
"lat2": { "type" : "float" },
"d": { "type" : "float" },
"appox_bearing": { "type" : "float" },
"location": { "type" : "geo_point" }
}
}
}
}
}
My config file for logstash has been set up as follows:
$ cat map.conf
input {
  stdin {}
}
filter {
  csv {
    columns => [
      "lon","lat","lon2","lat2","d","i","approx_bearing"
    ]
  }
  if [lon] == "lon" {
    drop { }
  } else {
    mutate {
      remove_field => [ "message", "host", "@timestamp", "@version" ]
    }
    mutate {
      convert => { "lon" => "float" }
      convert => { "lat" => "float" }
      convert => { "lon2" => "float" }
      convert => { "lat2" => "float" }
      convert => { "d" => "float" }
      convert => { "i" => "integer" }
      convert => { "approx_bearing" => "float" }
    }
    mutate {
      rename => {
        "lon" => "[location][lon]"
        "lat" => "[location][lat]"
      }
    }
  }
}
output {
  # stdout { codec => rubydebug }
  stdout { codec => dots }
  elasticsearch {
    index => "base_map"
    template => "map_template.json"
    document_type => "node_points"
    document_id => "%{i}"
  }
}
I then try and use logstash to input the csv data into elasticsearch as geo_points using the following command:
$ cat geo_data.csv | logstash-2.1.3/bin/logstash -f map.conf
I get the following error:
Settings: Default filter workers: 16
Unexpected character ('{' (code 123)): was expecting double-quote to start field name
at [Source: [B#278e55d1; line: 7, column: 3]{:class=>"LogStash::Json::ParserError", :level=>:error}
Logstash startup completed
....Logstash shutdown completed
What am I missing?
wayward "{" on line 7 of your template file
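For reference, dropping that stray brace (and the now-unmatched closing brace at the end) should leave the template looking like this. Note the template also spells "appox_bearing" while the CSV column is "approx_bearing", so that entry will not match the actual field:
{
  "template": "base_map_template",
  "order": 1,
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "base_map": {
      "properties": {
        "lon2": { "type": "float" },
        "lat2": { "type": "float" },
        "d": { "type": "float" },
        "appox_bearing": { "type": "float" },
        "location": { "type": "geo_point" }
      }
    }
  }
}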

Logstash/Elasticsearch CSV Field Types, Date Formats and Multifields (.raw)

I've been playing around with getting a tab delimited file into Elasticsearch using the CSV filter in Logstash. Getting the data in was actually incredibly easy, but I'm having trouble getting the field types to come in right when I look at the data in Kibana. Dates and integers continue to come in as strings, so I can't plot by date or do any analysis functions on integers (sum, mean, etc).
I'm also having trouble getting the .raw version of the fields to populate. For example, in device I have data like "HTC One", but if I do a pie chart in Kibana, it'll show up as two separate groupings, "HTC" and "One". When I try to chart device.raw instead, it comes up as a missing field. From what I've read, it seems like Logstash should automatically create a raw version of each string field, but that doesn't seem to be happening.
I've been sifting through the documentation, google and stack, but haven't found a solution. Any ideas appreciated! Thanks.
Config file:
#logstash.conf
input {
  file {
    path => "file.txt"
    type => "event"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    columns => ["userid","date","distance","device"]
    separator => "	"
  }
}
output {
  elasticsearch {
    action => "index"
    host => "localhost"
    port => "9200"
    protocol => "http"
    index => "userid"
    workers => 2
    template => "template.json"
  }
  #stdout {
  #  codec => rubydebug
  #}
}
Here's the template file:
#template.json:
{
  "template": "event",
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "index": { "query": { "default_field": "userid" } }
  },
  "mappings": {
    "_default_": {
      "_all": { "enabled": false },
      "_source": { "compress": true },
      "dynamic_templates": [
        {
          "string_template": {
            "match": "*",
            "mapping": { "type": "string", "index": "not_analyzed" },
            "match_mapping_type": "string"
          }
        }
      ],
      "properties": {
        "date": { "type": "date", "format": "yyyy-MM-dd HH:mm:ss" },
        "device": { "type": "string", "fields": { "raw": { "type": "string", "index": "not_analyzed" } } },
        "distance": { "type": "integer" }
      }
    }
  }
}
Figured it out: the "template" value has to match the index name. So the "template": "event" line should have been "template": "userid".
I found another (easier) way to specify the type of the fields. You can use Logstash's mutate filter to change the type of a field. Simply add the following filter after the csv filter in your Logstash config:
mutate {
  convert => [ "fieldname", "integer" ]
}
For details check out the logstash docs - mutate convert
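mutate's convert cannot produce a date type, though. For the date column, one option (a sketch, assuming the values really use the yyyy-MM-dd HH:mm:ss layout declared in the template above) is Logstash's date filter:
date {
  # parse the CSV "date" column and write the parsed timestamp back into it
  match => [ "date", "yyyy-MM-dd HH:mm:ss" ]
  target => "date"
}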
