Elasticsearch 5.x D3 integration example fails - elasticsearch

I'm trying an example from https://www.elastic.co/blog/data-visualization-elasticsearch-aggregations
When I try to create indices and upload data, I get the following error:
rolf@PE:~/nfl/scripts/Elasticsearch-datasets-master/mappings$ curl -XPUT localhost:9200/nfl?pretty
{
"acknowledged" : true,
"shards_acknowledged" : true,
"index" : "nfl"
}
rolf@PE:~/nfl/scripts/Elasticsearch-datasets-master/mappings$ curl -XPUT localhost:9200/nfl/2013/_mapping?pretty -d @nfl_mapping.json
{
"error" : {
"root_cause" : [
{
"type" : "mapper_parsing_exception",
"reason" : "_index is not configurable"
}
],
"type" : "mapper_parsing_exception",
"reason" : "_index is not configurable"
},
"status" : 400
}
The start of the mapping file is as follows:
{
"2013" : {
"_index" : {
"enabled" : true
},
"_id" : {
"index" : "not_analyzed",
"store" : "yes"
},
"properties" : {
"gameid" : {
"type" : "string",
"index" : "not_analyzed",
"store" : "yes"
}, ...
I'd appreciate some hints. Thanks.

You're probably using a recent version of ES while the nfl_mapping.json mapping is for an older version. In recent versions it is no longer possible to specify _index and _id in your mapping. Change it to this and it will work:
{
"2013" : {
"properties" : {
"gameid" : {
"type" : "keyword"
}, ...
Also, change all occurrences of string to text, and of string + not_analyzed to keyword.
After that you should be good to go.

Also note that "index" : "not_analyzed" is no longer supported.
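One more note: if the cluster is ES 6.0 or later, the mapping upload will also be rejected unless the request carries an explicit Content-Type header (on 5.x the header is optional but harmless). A minimal sketch of the upload command with the header, using the same index, type, and file as above:
curl -XPUT 'localhost:9200/nfl/2013/_mapping?pretty' -H 'Content-Type: application/json' -d @nfl_mapping.json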

Related

Using Ingest Attachment Plugin within elastic search index template

I am trying to update my current Elasticsearch schema, which is on 1.3.2, to the latest version. For one of the indexes, the current schema looks something like the below:
curl -XPOST localhost:9200/_template/<INDEXNAME> -d '{
"template" : "*-<INDEXNAME_TYPE>",
"index.mapping.attachment.indexed_chars": -1,
"mappings" : {
"post" : {
"properties" : {
"sub" : { "type" : "string" },
"sender" : { "type" : "string" },
"dt" : { "type" : "date", "format" : "EEE, d MMM yyyy HH:mm:ss Z" },
"body" : { "type" : "string"},
"attachments" : {
"type" : "attachment",
"path" : "full",
"fields" : {
"attachments" : {
"type" : "string",
"term_vector" : "with_positions_offsets",
"store" : true
},
"name" : {"store" : "yes"},
"title" : {"store" : "yes"},
"date" : {"store" : "yes"},
"content_type" : {"store" : "yes"},
"content_length" : {"store" : "yes"}
}
}
}
}
}
}'
My old version of Elasticsearch has the "mapper-attachment" plugin installed. I am aware that the "mapper-attachment" plugin has been replaced by the Ingest Attachment Processor, and following the examples from the plugin's website, I understand how to create a pipeline:
PUT _ingest/pipeline/attachment
{
"description" : "Extract attachment information from arrays",
"processors" : [
{
"foreach": {
"field": "attachments",
"processor": {
"attachment": {
"target_field": "_ingest._value.attachment",
"field": "_ingest._value.data",
"indexed_chars" : -1
}
}
}
}
]
}
PUT my-index-000001/_doc/my_id?pipeline=attachment
{
"sub" : "This is a test post",
"sender" : "jane.doe#gmail.com",
"dt" : "Sat, 15 Jan 2022 08:50:00 AEST"
"body" : "Test Body",
"fromaddr": "jane.doe#gmail.com",
"toaddr": "larne.jones#gmail.com",
"attachments" : [
{
"filename" : "ipsum.txt",
"data" : "dGhpcyBpcwpqdXN0IHNvbWUgdGV4dAo="
},
{
"filename" : "test.txt",
"data" : "VGhpcyBpcyBhIHRlc3QK"
}
]
}
How do I make use of this new attachment processor to create the index template I had before?
Note: With my index and schema, for each "post" there will be one or many attachments.
The answer is that, unlike in the previous version, I cannot use the attachment data type. So, following the example from the elastic.co website and from my own question, the answer is in my question itself.
1st: Create the pipeline as in the question.
2nd: Create the schema [see below].
3rd: Insert the data as shown in the question. When inserting the data into the index, use pipeline=attachment as the name of the pipeline, and the plugin will parse the given attachment into the schema below (a usage sketch follows the template).
curl -XPOST localhost:9200/_template/<INDEXNAME> -d '{
"template" : "*-<INDEXNAME_TYPE>",
"index.mapping.attachment.indexed_chars": -1,
"mappings" : {
"post" : {
"properties" : {
"sub" : { "type" : "string" },
"sender" : { "type" : "string" },
"dt" : { "type" : "date", "format" : "EEE, d MMM yyyy HH:mm:ss Z" },
"body" : { "type" : "string"},
"attachments" : {
"properties" : {
"attachment" : {
"properties" : {
"content" : {
"type" : "text",
"store": true,
"term_vector": "with_positions_offsets"
},
"content_length" : { "type" : "long" },
"content_type" : { "type" : "keyword" },
"language" : { "type" : "keyword"},
"date" : { "type" : "date", "format" : "EEE, d MMM yyyy HH:mm:ss Z" }
}
},
"content" : { "type": "keyword" },
"name" : { "type" : "keyword" }
}
}
}
}
}
}'
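For completeness, a rough sketch of step 3 against this template. The index name is invented only so it matches the *-<INDEXNAME_TYPE> pattern of the template, and the attachment payload reuses the base64 data from the question:
curl -XPOST 'localhost:9200/2022-<INDEXNAME_TYPE>/post/1?pipeline=attachment' -H 'Content-Type: application/json' -d '{
"sub" : "This is a test post",
"sender" : "jane.doe@gmail.com",
"body" : "Test Body",
"attachments" : [
{ "filename" : "ipsum.txt", "data" : "dGhpcyBpcwpqdXN0IHNvbWUgdGV4dAo=" }
]
}'
The pipeline runs before indexing, so each entry in attachments ends up with an attachment sub-object that the template above maps.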

illegal_argument_exception while creating elasticsearch mapping

I am trying to stream logs from logstash to elasticsearch (5.5.0). I am using filebeat to send logs to logstash.
I have not defined any index; it is defined automatically (say "test1") when data is pushed for the first time.
Now, I want to create another index ("test2") so that I can manage field data types. For that, I got the mappings for test1, updated the index name, and did a PUT call for test2 with this data. However, it fails with the following result:
ubuntu@elasticsearch:~$ curl -XPUT 'localhost:9200/test2?pretty' -H 'Content-Type: application/json' -d '@/tmp/mappings_test.json'
{
"error" : {
"root_cause" : [
{
"type" : "illegal_argument_exception",
"reason" : "unknown setting [index.test2.mappings.log.properties.#timestamp.type] please check that any required plugins are installed, or check the breaking changes documentation for removed settings"
}
],
"type" : "illegal_argument_exception",
"reason" : "unknown setting [index.test2.mappings.log.properties.#timestamp.type] please check that any required plugins are installed, or check the breaking changes documentation for removed settings"
},
"status" : 400
}
Following is an excerpt of the JSON I am using:
{
"test2" : {
"mappings" : {
"log" : {
"properties" : {
"#timestamp" : {
"type" : "date"
},
"#version" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"accept_date" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
....
I modified the index name only; the rest of the content is the same as the mapping of the test1 index.
Any help on how to create this new index with the updated types would be appreciated.
You need to remove test2 on the second line and have only mappings:
PUT test2
{
"mappings" : { <---- this needs to be at the top level
"log" : {
"properties" : {
"#timestamp" : {
"type" : "date"
},
"#version" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"accept_date" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
....
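If you prefer the file-based curl call from the question, the same fix applies to the file: edit /tmp/mappings_test.json so that mappings is the top-level key, then re-run the request, for example:
curl -XPUT 'localhost:9200/test2?pretty' -H 'Content-Type: application/json' -d '@/tmp/mappings_test.json'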

How can I check which field caused a parse error from the elastic log

My error log in elasticsearch looks like this:
[2015-09-04 10:59:49,531][DEBUG][action.bulk ] [baichebao-node-2] [questions][0] failed to execute bulk item (index) index {[questions][baichebao][AU-WS7qZwHwGnxdqIztg], source[_na_]}
org.elasticsearch.index.mapper.MapperParsingException: failed to parse
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:565)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:493)
at org.elasticsearch.index.shard.IndexShard.prepareCreate(IndexShard.java:466)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:418)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:148)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase.performOnPrimary(TransportShardReplicationOperationAction.java:574)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase$1.doRun(TransportShardReplicationOperationAction.java:440)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.ElasticsearchParseException: Failed to derive xcontent
at org.elasticsearch.common.xcontent.XContentFactory.xContent(XContentFactory.java:195)
at org.elasticsearch.common.xcontent.XContentHelper.createParser(XContentHelper.java:75)
at org.elasticsearch.common.xcontent.XContentHelper.createParser(XContentHelper.java:53)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:507)
... 10 more
and my mapping looks like this:
{
"mappings" : {
"baichebao" : {
"dynamic" : false,
"_all" : { "enable" : false },
"_id" : {
"store" : true,
"path" : "id"
},
"properties" : {
"id" : {
"type" : "long"
},
"content" : {
"type" : "string",
"analyzer" : "ik_syno_smart"
},
"uid" : {
"type" : "integer"
},
"all_answer_count" : {
"type" : "integer"
},
"answer_users" : {
"type" : "integer"
},
"best_answer" : {
"type" : "long"
},
"status" : {
"type" : "short"
},
"created_at" : {
"type" : "long"
},
"distrust" : {
"type" : "short"
},
"is_expert" : {
"type" : "boolean"
},
"series_id" : {
"type" : "integer"
},
"is_closed" : {
"type" : "boolean"
},
"closed_at" : {
"type" : "long"
},
"tags" : {
"type" : "string"
},
"channel_type" : {
"type" : "integer"
},
"channel_sub_type" : {
"type" : "integer"
}
}
}
}
}
But I cannot find out which field caused the parse error.
How can I resolve this problem?
This error typically indicates that the document that was sent to elasticsearch cannot be identified as a JSON or SMILE document by checking the first 20 bytes. For example, you would get this error if you omit the leading "{" in a JSON document:
curl -XPUT localhost:9200/test/doc/1 -d 'I am not a json document'
or prepend valid JSON with 20+ whitespace characters:
curl -XPUT localhost:9200/test/doc/1 -d ' {"foo": "bar"}'
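Since the failure in your log comes from a bulk request, it is also worth checking that every source line in the _bulk body is a complete JSON object on its own line and that the body ends with a newline. A minimal sketch of a well-formed bulk request against the index and type from the log, using field names from your mapping:
curl -XPOST 'localhost:9200/questions/baichebao/_bulk' --data-binary '{ "index" : {} }
{ "id" : 1, "content" : "some question text", "uid" : 42 }
'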

How to use _timestamp in logstash elasticsearch

I am trying to figure out how to use the _timestamp with logstash.
I have tried to add to the mapping:
"_timestamp" : {
"enabled" : true,
"path" : "#timestamp"
},
But that does not have the expected effect. I did this in the elasticsearch-template.json file (I tried with and without "store" : true):
{
"template" : "logstash-*",
"settings" : {
"index.refresh_interval" : "5s"
},
"mappings" : {
"_default_" : {
"_timestamp" : {
"enabled" : true,
"store" : true,
"path" : "#timestamp"
},
"_all" : {"enabled" : true},
"dynamic_templates" : [ {
.....
And I referenced the modified template file in the output configuration:
output {
elasticsearch_http {
template => '/tmp/elasticsearch-template.json'
host => '127.0.0.1'
port=>9200
}
}
In order to make sure the database is clean I repeatedly do:
curl -XDELETE http://localhost:9200/logstash*
curl -XDELETE http://localhost:9200/_template/logstash
rm ~/.sincedb_*
and then I try to import my logfile. But for some reason, the _timestamp is not set.
The mapping seems to be OK:
{
"logstash-2014.03.24" : {
"_default_" : {
"dynamic_templates" : [ {
"string_fields" : {
"mapping" : {
"index" : "analyzed",
"omit_norms" : true,
"type" : "string",
"fields" : {
"raw" : {
"index" : "not_analyzed",
"ignore_above" : 256,
"type" : "string"
}
}
},
"match" : "*",
"match_mapping_type" : "string"
}
} ],
"_timestamp" : {
"enabled" : true,
"store" : true,
"path" : "#timestamp"
},
"properties" : {
"#version" : {
"type" : "string",
"index" : "not_analyzed",
"omit_norms" : true,
"index_options" : "docs"
},
"geoip" : {
"dynamic" : "true",
"properties" : {
"location" : {
"type" : "geo_point"
}
}
}
}
},
"logs" : {
"dynamic_templates" : [ {
"string_fields" : {
"mapping" : {
"index" : "analyzed",
"omit_norms" : true,
"type" : "string",
"fields" : {
"raw" : {
"index" : "not_analyzed",
"ignore_above" : 256,
"type" : "string"
}
}
},
"match" : "*",
"match_mapping_type" : "string"
}
} ],
"_timestamp" : {
"enabled" : true,
"store" : true,
"path" : "#timestamp"
},
"properties" : {
"#timestamp" : {
"type" : "date",
"format" : "dateOptionalTime"
},
The documents in the database look like this:
{
"_id": "Cps2Lq1nTIuj_VysOwwcWw",
"_index": "logstash-2014.03.25",
"_score": 1.0,
"_source": {
"#timestamp": "2014-03-25T00:47:09.703Z",
"#version": "1",
"created": "2014-03-25 01:47:09,703",
"host": "macbookpro.fritz.box",
"message": "2014-03-25 01:47:09,703 - Starting new HTTP connection (1): localhost",
"path": "/Users/scharf/git/ckann/annotator-store/logs/requests.log",
"text": "Starting new HTTP connection (1): localhost"
},
"_type": "logs"
},
Why is the _timestamp not set?
In short, it does work.
I tested your exact scenario and here's what I found:
When _source is enabled and _timestamp is filled from a path inside the _source,
you will never see _timestamp as part of the document. However, if you add the ?fields query string part, for example:
http://<localhost>:9200/es_test_logs/ESTest1/ilq4PU3tR9SeoLo794wZlg?fields=_timestamp
you will get the correct _timestamp value.
If, instead of using path, you pass _timestamp externally (in the _source document), you will see _timestamp under the _source property in the document as normal.
If you disable the _source field, you will not see ANY property at all in the document, even those you set as "store" : true. You will only see them when specifying ?fields, or when building a query that returns those fields.
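For example, a quick sketch of fetching the stored _timestamp for the document shown in the question (index, type and id are taken from the search output above):
curl -XGET 'localhost:9200/logstash-2014.03.25/logs/Cps2Lq1nTIuj_VysOwwcWw?fields=_timestamp&pretty'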

Use nested fields in kibana panels

I tried to display a Kibana dashboard and it works well. Unfortunately, when I want to add a pie chart (or another representation) containing the countries of the companies' locations, I get an empty panel.
I'm able to use the kibana queries to filter on a specific country but I'm not able to display a panel with nested documents.
My mapping (I have to use nested fields because a company can have several locations):
{
"settings" : {
"number_of_shards" : 1
},
"mappings": {
"company" : {
"properties" : {
"name" : { "type" : "string", "store" : "yes" },
"website" : { "type" : "string", "store" : "yes" },
"employees" : { "type" : "string", "store" : "yes" },
"type": { "type" : "string", "store" : "yes" },
"locations" : {
"type" : "nested",
"properties" : {
"city" : { "type" : "string", "store" : "yes" },
"country" : { "type" : "string", "store" : "yes" },
"coordinates" : { "type" : "geo_point", "store" : "yes" }
}
}
}
}
}
}
Do you know how I could display a panel with nested objects? Is it implemented?
Thanks,
Kevin
You are missing one parameter ("include_in_parent": true) in your mapping.
The correct mapping should be:
{
"settings" : {
"number_of_shards" : 1
},
"mappings": {
"company" : {
"properties" : {
"name" : { "type" : "string", "store" : "yes" },
"website" : { "type" : "string", "store" : "yes" },
"employees" : { "type" : "string", "store" : "yes" },
"type": { "type" : "string", "store" : "yes" },
"locations" : {
"type" : "nested",
"include_in_parent": true,
"properties" : {
"city" : { "type" : "string", "store" : "yes" },
"country" : { "type" : "string", "store" : "yes" },
"coordinates" : { "type" : "geo_point", "store" : "yes" }
}
}
}
}
}
}
It's clearly a Kibana bug: the facet query generated by Kibana is missing the "nested" clause needed to query nested fields. Setting include_in_parent works around it because the nested fields are then also indexed flat on the parent document.
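To illustrate, here is a sketch of the kind of flat terms facet Kibana issues, which returns buckets once the workaround is in place (the index name companies is just an assumption for this example):
curl -XPOST 'localhost:9200/companies/_search?pretty' -d '{
"size" : 0,
"facets" : {
"countries" : {
"terms" : { "field" : "locations.country" }
}
}
}'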
