How can I check which field caused a parse error from the Elasticsearch log - elasticsearch

My error log in Elasticsearch looks like this:
[2015-09-04 10:59:49,531][DEBUG][action.bulk ] [baichebao-node-2] [questions][0] failed to execute bulk item (index) index {[questions][baichebao][AU-WS7qZwHwGnxdqIztg], source[_na_]}
org.elasticsearch.index.mapper.MapperParsingException: failed to parse
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:565)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:493)
at org.elasticsearch.index.shard.IndexShard.prepareCreate(IndexShard.java:466)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:418)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:148)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase.performOnPrimary(TransportShardReplicationOperationAction.java:574)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase$1.doRun(TransportShardReplicationOperationAction.java:440)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.ElasticsearchParseException: Failed to derive xcontent
at org.elasticsearch.common.xcontent.XContentFactory.xContent(XContentFactory.java:195)
at org.elasticsearch.common.xcontent.XContentHelper.createParser(XContentHelper.java:75)
at org.elasticsearch.common.xcontent.XContentHelper.createParser(XContentHelper.java:53)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:507)
... 10 more
And my mapping looks like this:
{
  "mappings" : {
    "baichebao" : {
      "dynamic" : false,
      "_all" : { "enable" : false },
      "_id" : {
        "store" : true,
        "path" : "id"
      },
      "properties" : {
        "id" : {
          "type" : "long"
        },
        "content" : {
          "type" : "string",
          "analyzer" : "ik_syno_smart"
        },
        "uid" : {
          "type" : "integer"
        },
        "all_answer_count" : {
          "type" : "integer"
        },
        "answer_users" : {
          "type" : "integer"
        },
        "best_answer" : {
          "type" : "long"
        },
        "status" : {
          "type" : "short"
        },
        "created_at" : {
          "type" : "long"
        },
        "distrust" : {
          "type" : "short"
        },
        "is_expert" : {
          "type" : "boolean"
        },
        "series_id" : {
          "type" : "integer"
        },
        "is_closed" : {
          "type" : "boolean"
        },
        "closed_at" : {
          "type" : "long"
        },
        "tags" : {
          "type" : "string"
        },
        "channel_type" : {
          "type" : "integer"
        },
        "channel_sub_type" : {
          "type" : "integer"
        }
      }
    }
  }
}
But I cannot figure out which field caused the parse error.
How can I resolve this problem?

This error typically indicates that the document sent to Elasticsearch could not be identified as a JSON or SMILE document by inspecting its first 20 bytes. In other words, the failure happens before any field is parsed, which is why no field name appears in the error. For example, you would get this error if you omit the leading "{" in a JSON document:
curl -XPUT localhost:9200/test/doc/1 -d 'I am not a json document'
or prepend valid JSON with 20+ whitespace characters:
curl -XPUT localhost:9200/test/doc/1 -d ' {"foo": "bar"}'
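So the thing to inspect is the raw request body rather than the mapping. A small sketch for checking a captured payload locally (payload.json is a hypothetical file holding the body you sent):
# show exactly what Elasticsearch sees in the first 20 bytes
head -c 20 payload.json | od -c
# jq exits non-zero and reports the position of the first syntax error if the body is not valid JSON
jq -e . payload.json > /dev/null && echo "valid JSON" || echo "invalid JSON"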

Related

Using Ingest Attachment Plugin within elastic search index template

I am trying to update my current Elasticsearch schema, which is on 1.3.2, to the latest version. For one of the indexes, the current schema looks something like the below:
curl -XPOST localhost:9200/_template/<INDEXNAME> -d '{
  "template" : "*-<INDEXNAME_TYPE>",
  "index.mapping.attachment.indexed_chars": -1,
  "mappings" : {
    "post" : {
      "properties" : {
        "sub" : { "type" : "string" },
        "sender" : { "type" : "string" },
        "dt" : { "type" : "date", "format" : "EEE, d MMM yyyy HH:mm:ss Z" },
        "body" : { "type" : "string" },
        "attachments" : {
          "type" : "attachment",
          "path" : "full",
          "fields" : {
            "attachments" : {
              "type" : "string",
              "term_vector" : "with_positions_offsets",
              "store" : true
            },
            "name" : { "store" : "yes" },
            "title" : { "store" : "yes" },
            "date" : { "store" : "yes" },
            "content_type" : { "store" : "yes" },
            "content_length" : { "store" : "yes" }
          }
        }
      }
    }
  }
}'
My old version of Elasticsearch has the "mapper-attachment" plugin installed. I am aware that the "mapper-attachment" plugin has been replaced by the "Ingest Attachment Processor", and following the examples from the plugin's website I understand that I have to create a pipeline:
PUT _ingest/pipeline/attachment
{
  "description" : "Extract attachment information from arrays",
  "processors" : [
    {
      "foreach": {
        "field": "attachments",
        "processor": {
          "attachment": {
            "target_field": "_ingest._value.attachment",
            "field": "_ingest._value.data",
            "indexed_chars" : -1
          }
        }
      }
    }
  ]
}
PUT my-index-000001/_doc/my_id?pipeline=attachment
{
  "sub" : "This is a test post",
  "sender" : "jane.doe@gmail.com",
  "dt" : "Sat, 15 Jan 2022 08:50:00 AEST",
  "body" : "Test Body",
  "fromaddr": "jane.doe@gmail.com",
  "toaddr": "larne.jones@gmail.com",
  "attachments" : [
    {
      "filename" : "ipsum.txt",
      "data" : "dGhpcyBpcwpqdXN0IHNvbWUgdGV4dAo="
    },
    {
      "filename" : "test.txt",
      "data" : "VGhpcyBpcyBhIHRlc3QK"
    }
  ]
}
How do I make use of this new attachment processor to create the index template I had before?
Note: with my index and schema, each "post" will have one or many attachments.
The answer is that, unlike in the previous version, I cannot use the attachment data type. Following the example from the elastic.co website and from my own question, the answer was in the question itself:
1st: Create the pipeline as in the question.
2nd: Create the schema [see below].
3rd: Insert the data as shown in the question. When inserting the data into the index, pass pipeline=attachment as the name of the pipeline, and the processor will parse the given attachments into the schema below.
curl -XPOST localhost:9200/_template/<INDEXNAME> -d '{
  "template" : "*-<INDEXNAME_TYPE>",
  "index.mapping.attachment.indexed_chars": -1,
  "mappings" : {
    "post" : {
      "properties" : {
        "sub" : { "type" : "string" },
        "sender" : { "type" : "string" },
        "dt" : { "type" : "date", "format" : "EEE, d MMM yyyy HH:mm:ss Z" },
        "body" : { "type" : "string" },
        "attachments" : {
          "properties" : {
            "attachment" : {
              "properties" : {
                "content" : {
                  "type" : "text",
                  "store": true,
                  "term_vector": "with_positions_offsets"
                },
                "content_length" : { "type" : "long" },
                "content_type" : { "type" : "keyword" },
                "language" : { "type" : "keyword" },
                "date" : { "type" : "date", "format" : "EEE, d MMM yyyy HH:mm:ss Z" }
              }
            },
            "content" : { "type": "keyword" },
            "name" : { "type" : "keyword" }
          }
        }
      }
    }
  }
}'
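Before indexing real documents, it can be worth confirming that the pipeline produces the fields this schema expects. A small sketch using the ingest simulate API, reusing one of the attachments from the question:
POST _ingest/pipeline/attachment/_simulate
{
  "docs" : [
    {
      "_source" : {
        "attachments" : [
          { "filename" : "ipsum.txt", "data" : "dGhpcyBpcwpqdXN0IHNvbWUgdGV4dAo=" }
        ]
      }
    }
  ]
}
The response should show an attachment object (content, content_type, content_length, ...) added to each array element, which is what the attachments.attachment properties above are mapping.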

aggregation fails on nested aggregation field

I have this mapping for the fuas type:
curl -XGET 'http://localhost:9201/living_team/_mapping/fuas?pretty'
{
  "living_v1" : {
    "mappings" : {
      "fuas" : {
        "properties" : {
          "backlogStatus" : {
            "type" : "long"
          },
          "comment" : {
            "type" : "string"
          },
          "dueTimestamp" : {
            "type" : "date",
            "format" : "strict_date_optional_time||epoch_millis"
          },
          "matter" : {
            "type" : "string"
          },
          "metainfos" : {
            "properties" : {
              "category 1" : {
                "type" : "string"
              },
              "key" : {
                "type" : "string"
              },
              "null" : {
                "type" : "string"
              },
              "processos" : {
                "type" : "string"
              }
            }
          },
          "resources" : {
            "properties" : {
              "noteId" : {
                "type" : "string"
              },
              "resourceId" : {
                "type" : "string"
              }
            }
          },
          "status" : {
            "type" : "long"
          },
          "timestamp" : {
            "type" : "date",
            "format" : "strict_date_optional_time||epoch_millis"
          },
          "user" : {
            "type" : "string",
            "index" : "not_analyzed"
          }
        }
      }
    }
  }
}
I'm trying to perform this aggregation:
curl -XGET 'http://ESNode01:9201/living_team/fuas/_search?pretty' -d '
{
  "aggs" : {
    "demo" : {
      "nested" : {
        "path" : "metainfos"
      },
      "aggs" : {
        "key" : { "terms" : { "field" : "metainfos.key" } }
      }
    }
  }
}
'
ES responds with:
"error" : {
"root_cause" : [ {
"type" : "aggregation_execution_exception",
"reason" : "[nested] nested path [metainfos] is not nested"
} ],
"type" : "search_phase_execution_exception",
"reason" : "all shards failed",
"phase" : "query_fetch",
"grouped" : true,
"failed_shards" : [ {
"shard" : 3,
"index" : "living_v1",
"node" : "HfaFBiZ0QceW1dpqAnv-SA",
"reason" : {
"type" : "aggregation_execution_exception",
"reason" : "[nested] nested path [metainfos] is not nested"
}
} ]
},
"status" : 500
}
Any ideas?
You're missing "type":"nested" from your metainfos mapping.
Should have been:
"metainfos" : {
"type":"nested",
"properties" : {
"category 1" : {
"type" : "string"
},
"key" : {
"type" : "string"
},
"null" : {
"type" : "string"
},
"processos" : {
"type" : "string"
}
}
}
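Note that the type of an existing field cannot be changed in place, so the corrected mapping has to go into a new index (or the existing one has to be re-created) and the data re-indexed. A sketch, using living_v2 as a hypothetical new index name and showing only the metainfos part:
curl -XPUT 'http://localhost:9201/living_v2' -d '
{
  "mappings" : {
    "fuas" : {
      "properties" : {
        "metainfos" : {
          "type" : "nested",
          "properties" : {
            "key" : { "type" : "string" }
          }
        }
      }
    }
  }
}'
Once the data is re-indexed into living_v2, the nested aggregation from the question should run without the "not nested" error.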

how to change type of a value in elasticsearch

I am trying to build a geo map of a value in Elasticsearch, but the type of client_location is set as a string and I would like to change it to geo_point. When I run the following, I get:
# curl -XGET "http://core.z0z0.tk:9200/_all/_mappings/http?pretty"
{
  "packetbeat-2015.12.04" : {
    "mappings" : {
      "http" : {
        "properties" : {
          "@timestamp" : {
            "type" : "date",
            "format" : "strict_date_optional_time||epoch_millis"
          },
          "beat" : {
            "properties" : {
              "hostname" : {
                "type" : "string"
              },
              "name" : {
                "type" : "string"
              }
            }
          },
          "bytes_in" : {
            "type" : "long"
          },
          "bytes_out" : {
            "type" : "long"
          },
          "client_ip" : {
            "type" : "string"
          },
          "client_location" : {
            "type" : "string"
          },
          "client_port" : {
            "type" : "long"
          },
          "client_proc" : {
            "type" : "string"
          },
          "client_server" : {
            "type" : "string"
          },
          "count" : {
            "type" : "long"
          },
          "direction" : {
            "type" : "string"
          },
          "http" : {
            "properties" : {
              "code" : {
                "type" : "long"
              },
              "content_length" : {
                "type" : "long"
              },
              "phrase" : {
                "type" : "string"
              }
            }
          },
          "ip" : {
            "type" : "string"
          },
          "method" : {
            "type" : "string"
          },
          "notes" : {
            "type" : "string"
          },
          "params" : {
            "type" : "string"
          },
          "path" : {
            "type" : "string"
          },
          "port" : {
            "type" : "long"
          },
          "proc" : {
            "type" : "string"
          },
          "query" : {
            "type" : "string"
          },
          "responsetime" : {
            "type" : "long"
          },
          "server" : {
            "type" : "string"
          },
          "status" : {
            "type" : "string"
          },
          "type" : {
            "type" : "string"
          }
        }
      }
    }
  }
}
When I run the following command to change the type from string to geo_point, I get the following error:
# curl -XPUT "http://localhost:9200/_all/_mappings/http" -d '
> {
> "http" : {
> "properties" : {
> "client_location" : {
> "type" : "geo_point"
> }
> }
> }
> }
> '
{"error":{"root_cause":[{"type":"merge_mapping_exception","reason":"Merge failed with failures {[mapper [client_location] of different type, current_type [string], merged_type[geo_point]]}"}],"type":"merge_mapping_exception","reason":"Merge failed with failures {[mapper [client_location] of different type, current_type [string], merged_type [geo_point]]}"},"status":400}
Any suggestion on how I should correctly change the type?
Thanks in advance.
Unfortunately, once you've created a field you cannot change its type anymore. The best thing to do is to delete the index and recreate it properly with the correct mapping.
Another, temporary, solution if you don't want to delete your index immediately is to create a sub-field of your existing field:
# curl -XPUT "http://localhost:9200/_all/_mappings/http" -d '{
  "http": {
    "properties": {
      "client_location": {
        "type": "string",
        "fields": {
          "geo": {
            "type": "geo_point"
          }
        }
      }
    }
  }
}'
And then you can access it in your queries using client_location.geo.
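For example, on Elasticsearch 2.x a geo_distance query against the sub-field might look like this (a sketch; the coordinates and distance are made up):
curl -XGET "http://localhost:9200/packetbeat-*/http/_search?pretty" -d '{
  "query": {
    "bool": {
      "filter": {
        "geo_distance": {
          "distance": "100km",
          "client_location.geo": {
            "lat": 40.73,
            "lon": -73.99
          }
        }
      }
    }
  }
}'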
Also note that you have to re-index your data in order to populate that new sub-field... which means you might just as well delete your index and re-create it properly.
UPDATE
After installing Packetbeat you need to make sure to install the packetbeat template yourself as described here (i.e. it is not done automatically):
https://www.elastic.co/guide/en/beats/packetbeat/current/packetbeat-getting-started.html#packetbeat-template
curl -XPUT 'http://localhost:9200/_template/packetbeat' -d@/etc/packetbeat/packetbeat.template.json
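Keep in mind that an index template only applies to indices created after it is installed, so existing packetbeat indices will keep the old string mapping. Assuming the old data can be dropped, a final sketch:
# delete the old packetbeat indices so newly created ones pick up the template's geo_point mapping
curl -XDELETE 'http://localhost:9200/packetbeat-*'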

Elasticsearch 1.7.3: doc_values treated as fielddata

I'm new to Elasticsearch; I started working with Elasticsearch 1.7.3 as part of a Logstash-Elasticsearch-Kibana deployment.
I've defined a mapping template for my log messages; this is the interesting part:
{
  "template" : "logstash-*",
  "settings" : { "index.refresh_interval" : "5s" },
  "mappings" : {
    "_default_" : {
      "_all" : { "enabled" : true, "omit_norms" : true },
      "dynamic_templates" : [ {
        "date_fields" : {
          "match" : "*",
          "match_mapping_type" : "date",
          "mapping" : { "type" : "date", "doc_values" : true }
        }
      } ],
      "properties" : {
        "@version" : { "type" : "string", "index" : "not_analyzed" },
        "@timestamp" : { "type" : "date", "format" : "dateOptionalTime" },
        "message" : { "type" : "string" }
      }
    },
    "my_log" : {
      "_all" : { "enabled" : true, "omit_norms" : true },
      "dynamic_templates" : [ {
        "date_fields" : {
          "match" : "*",
          "match_mapping_type" : "date",
          "mapping" : { "type" : "date", "doc_values" : true }
        }
      } ],
      "properties" : {
        "@timestamp" : { "type" : "date", "format" : "dateOptionalTime" },
        "file" : { "type" : "string" },
        "message" : { "type" : "string" },
        "geolocation" : { "type" : "string" }
      }
    }
  }
}
Although the @timestamp field is defined with doc_values: true, I get a memory error because it is loaded as fielddata:
[FIELDDATA] Data too large, data for [@timestamp] would be larger than limit of [633785548/604.4 mb]
NOTE:
I know I can raise the memory limit or add more nodes to the cluster, but in my view this is a design problem: this field should not be loaded into memory as fielddata at all.
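One way to narrow this down is to verify whether the template actually took effect on the live indices, i.e. whether @timestamp really has doc_values enabled and which node is holding fielddata for it. A sketch (the logstash-* pattern comes from the template above):
# inspect the concrete mapping of @timestamp on the live indices
curl -XGET 'http://localhost:9200/logstash-*/_mapping/field/@timestamp?pretty'
# show how much fielddata each node currently holds for @timestamp
curl -XGET 'http://localhost:9200/_cat/fielddata?v&fields=@timestamp'
If the live mapping does not show doc_values, the template was probably installed after those indices were created; templates only apply to newly created indices, so the older ones would still load @timestamp into fielddata.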

How can I map a custom date format in Elasticsearch and Kibana4

I have nginx logs with this date format: [02/Mar/2015:13:02:51 +0000]
What should I use in Elasticsearch, and what should I put in the date format field of Kibana4?
curl -XGET 'http://localhost:9200/_mapping?pretty'
{
  "nginx" : {
    "mappings" : {
      "t07_nginx" : {
        "properties" : {
          "@timestamp" : {
            "type" : "date",
            "format" : "dateOptionalTime"
          },
          "body_bytes_sent" : {
            "type" : "string"
          },
          "geoip_country_code" : {
            "type" : "string"
          },
          "host" : {
            "type" : "string"
          },
          "http_host" : {
            "type" : "string"
          },
          "http_referer" : {
            "type" : "string"
          },
          "http_user_agent" : {
            "type" : "string",
            "index" : "not_analyzed"
          },
          "http_x_forwarded_for" : {
            "type" : "string"
          },
          "message" : {
            "type" : "string"
          },
          "msec request_time" : {
            "type" : "string"
          },
          "remote_addr" : {
            "type" : "string"
          },
          "request_http_protocol" : {
            "type" : "string"
          },
          "request_time" : {
            "type" : "string"
          },
          "request_type" : {
            "type" : "string"
          },
          "request_url" : {
            "type" : "string"
          },
          "status" : {
            "type" : "string"
          },
          "upstream_addr" : {
            "type" : "string"
          },
          "upstream_response_time" : {
            "type" : "string"
          }
        }
      }
    }
  }
}
With the above I can't see any data (events) in Kibana.
Thanks
What does the input plugin for nginx/output plugin for elasticsearch in your fluentd config file look like?
Also, make sure you have your time range set up correctly in Kibana; I believe it defaults to the last 15 minutes.
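For reference, the nginx timestamp [02/Mar/2015:13:02:51 +0000] matches the Joda pattern dd/MMM/yyyy:HH:mm:ss Z, so a date field could be mapped for it directly. A sketch with a hypothetical field name nginx_timestamp (more commonly the shipper parses this value into @timestamp, which Kibana picks up automatically):
curl -XPUT 'http://localhost:9200/nginx/_mapping/t07_nginx' -d '{
  "t07_nginx" : {
    "properties" : {
      "nginx_timestamp" : {
        "type" : "date",
        "format" : "dd/MMM/yyyy:HH:mm:ss Z"
      }
    }
  }
}'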
