Getting unknown setting [index._id] error while adding data to Elasticsearch

I have created a mapping eventlog in Elasticsearch 5.1.1. I added it successfully; however, while adding data under it, I get an illegal_argument_exception with reason unknown setting [index._id]. The result from getting the indices is:
yellow open eventlog sX9BYIcOQLSKoJQcbn1uxg 5 1 0 0 795b 795b
My mapping is:
{
  "mappings" : {
    "_default_" : {
      "properties" : {
        "datetime" : { "type": "date" },
        "ip" : { "type": "ip" },
        "country" : { "type" : "keyword" },
        "state" : { "type" : "keyword" },
        "city" : { "type" : "keyword" }
      }
    }
  }
}
and I am adding the data using
curl -u elastic:changeme -XPUT 'http://localhost:8200/eventlog' -d '{"index":{"_id":1}}
{"datetime":"2016-03-31T12:10:11Z","ip":"100.40.135.29","country":"US","state":"NY","city":"Highland"}';
If I don't include the {"index":{"_id":1}} line, I get Illegal_argument_exception with reason unknown setting [index.apiKey].

The problem arose from sending the data on the command line as a string. Keeping the data in a JSON file and sending it as binary solved it. The correct command is:
curl -u elastic:changeme -XPUT 'http://localhost:8200/eventlog/_bulk?pretty' --data-binary @eventlogs.json
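For reference, the _bulk endpoint expects newline-delimited action/document pairs, so eventlogs.json would look roughly like this (using the document from the question); note that the file must end with a newline:
{"index":{"_id":1}}
{"datetime":"2016-03-31T12:10:11Z","ip":"100.40.135.29","country":"US","state":"NY","city":"Highland"}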

Related

Elasticsearch: strict_dynamic_mapping_exception

Hi,
I am trying to modify the date format in an Elasticsearch index (operate-operation-0.26.0_), but I get the following error.
{
  "took" : 148,
  "errors" : true,
  "items" : [
    {
      "index" : {
        "_index" : "operate-operation-0.26.0_",
        "_type" : "_doc",
        "_id" : "WBGhSXcB_hD8-yfn-Rh5",
        "status" : 400,
        "error" : {
          "type" : "strict_dynamic_mapping_exception",
          "reason" : "mapping set to strict, dynamic introduction of [dynamic] within [_doc] is not allowed"
        }
      }
    }
  ]
}
The JSON file I am using is bulk6.json:
{"index":{}}
{"dynamic":"strict","properties":{"date":{"type":"date","format":"yyyy-MM-dd'T'HH:mm:ss.SSSZZ"}}}
The command I am running is:
curl -H "Content-Type: application/x-ndjson" -XPOST 'localhost:9200/operate-operation-0.26.0_/_bulk?pretty&refresh' --data-binary @"bulk6.json"
The _bulk API endpoint is not meant for changing mappings. You need to use the _mapping API endpoint like this:
The JSON file mapping.json should contain:
{
  "dynamic": "strict",
  "properties": {
    "date": {
      "type": "date",
      "format": "yyyy-MM-dd'T'HH:mm:ss.SSSZZ"
    }
  }
}
And then the call can be made like this:
curl -H "Content-Type: application/json" -XPUT 'localhost:9200/operate-operation-0.26.0_/_mapping?pretty&refresh' --data-binary #"mapping.json"
However, this is still not going to work as you're not allowed to change the date format after the index has been created. You're going to get the following error:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Mapper for [date] conflicts with existing mapper:\n\tCannot update parameter [format] from [strict_date_optional_time||epoch_millis] to [yyyy-MM-dd'T'HH:mm:ss.SSSZZ]"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "Mapper for [date] conflicts with existing mapper:\n\tCannot update parameter [format] from [strict_date_optional_time||epoch_millis] to [yyyy-MM-dd'T'HH:mm:ss.SSSZZ]"
  },
  "status" : 400
}
You need to create a new index with the desired correct mapping and reindex your data.
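For example (a sketch: the target index name operate-operation-0.26.0_v2 and the file new-index.json are placeholders), you would create the new index with the full corrected mapping wrapped in a top-level "mappings" object, then copy the data across with the _reindex API:
curl -H "Content-Type: application/json" -XPUT 'localhost:9200/operate-operation-0.26.0_v2?pretty' --data-binary @"new-index.json"
curl -H "Content-Type: application/json" -XPOST 'localhost:9200/_reindex?pretty' -d '{
  "source": { "index": "operate-operation-0.26.0_" },
  "dest": { "index": "operate-operation-0.26.0_v2" }
}'
Once the reindex completes, point your application (or an alias) at the new index.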

"_doc" mapping type name not accepted in ElasticSearch 5.6

I am looking at examples of single-type indices on Elasticsearch 5.6 to prepare for the removal of mapping types. Specifically, I am running the first example from the Elasticsearch page about the removal of types, on a fresh cluster running locally in Docker using the docker.elastic.co/elasticsearch/elasticsearch:5.6.5 image.
Running the first example from the section I linked to:
PUT localhost:9200/users
{
  "settings": {
    "index.mapping.single_type": true
  },
  "mappings": {
    "_doc": {
      "properties": {
        "name": {
          "type": "text"
        },
        "user_name": {
          "type": "keyword"
        },
        "email": {
          "type": "keyword"
        }
      }
    }
  }
}
I get the following error:
{
  "error": {
    "root_cause": [
      {
        "type": "invalid_type_name_exception",
        "reason": "mapping type name [_doc] can't start with '_'"
      }
    ],
    "type": "invalid_type_name_exception",
    "reason": "mapping type name [_doc] can't start with '_'"
  },
  "status": 400
}
I understand that fields with a leading underscore in the name are generally considered reserved for ES internals, but I assumed that _doc would be treated as a special case starting with version 5.6, since the linked guide mentions:
Indices created in 6.x only allow a single-type per index. Any name can be used for the type, but there can be only one. The preferred type name is _doc so that index APIs have the same path as they will have in 7.0
Am I missing something, such as a cluster setting?
The document I linked to is the master version. In the 6.1 and 5.6 versions of that same document, there is no mention of _doc being the preferred name, which likely means that the ability to use _doc as a mapping type name will come in future 6.x versions.
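In the meantime, on 5.6 the same example should work with any type name that does not start with an underscore; a sketch using the type name doc (the name is just an example, pick whatever suits you):
PUT localhost:9200/users
{
  "settings": {
    "index.mapping.single_type": true
  },
  "mappings": {
    "doc": {
      "properties": {
        "name": {
          "type": "text"
        },
        "user_name": {
          "type": "keyword"
        },
        "email": {
          "type": "keyword"
        }
      }
    }
  }
}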
I got the same issue while trying the examples in the README file from the master branch https://github.com/elastic/elasticsearch/tree/master.
$ curl -XPUT 'elastic:@localhost:9200/twitter/_doc/1?pretty' -H 'Content-Type: application/json' -d '
{
  "user": "kimchy",
  "post_date": "2009-11-15T13:12:00",
  "message": "Trying out Elasticsearch, so far so good?"
}'
{
  "error" : {
    "root_cause" : [
      {
        "type" : "invalid_type_name_exception",
        "reason" : "Document mapping type name can't start with '_', found: [_doc]"
      }
    ],
    "type" : "invalid_type_name_exception",
    "reason" : "Document mapping type name can't start with '_', found: [_doc]"
  },
  "status" : 400
}
Just check out the branch for version 5.6 https://github.com/elastic/elasticsearch/tree/5.6 and everything looks fine:
$ curl -XPUT 'http://localhost:9200/twitter/user/kimchy?pretty' -H 'Content-Type: application/json' -d '{ "name" : "Shay Banon" }'
{
  "_index" : "twitter",
  "_type" : "user",
  "_id" : "kimchy",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "created" : true
}

Bulk upload log messages to local Elasticsearch

We have a few external applications in the cloud (IBM Bluemix) which log their application syslogs in the Bluemix logmet service, which internally uses the ELK stack.
On a periodic basis, we would like to download the logs from the cloud and upload them into a local Elasticsearch/Kibana instance. This is because storing logs in cloud services incurs a cost, plus an additional cost if we want to search them with Kibana. The local Elasticsearch instance can delete/flush old logs which we don't need.
The downloaded logs look like this:
{"instance_id_str":"0","source_id_str":"APP/PROC/WEB","app_name_str":"ABC","message":"Hello","type":"syslog","event_uuid":"474b78aa-6012-44f3-8692-09bd667c5822","origin_str":"rep","ALCH_TENANT_ID":"3213cd20-63cc-4592-b3ee-6a204769ce16","logmet_cluster":"topic3-elasticsearch_3","org_name_str":"123","@timestamp":"2017-09-29T02:30:15.598Z","message_type_str":"OUT","@version":"1","space_name_str":"prod","application_id_str":"3104b522-aba8-48e0-aef6-6291fc6f9250","ALCH_ACCOUNT_ID_str":"","org_id_str":"d728d5da-5346-4614-b092-e17be0f9b820","timestamp":"2017-09-29T02:30:15.598Z"}
{"instance_id_str":"0","source_id_str":"APP/PROC/WEB","app_name_str":"ABC","message":"EFG","type":"syslog","event_uuid":"d902dddb-afb7-4f55-b472-211f1d370837","origin_str":"rep","ALCH_TENANT_ID":"3213cd20-63cc-4592-b3ee-6a204769ce16","logmet_cluster":"topic3-elasticsearch_3","org_name_str":"123","@timestamp":"2017-09-29T02:30:28.636Z","message_type_str":"OUT","@version":"1","space_name_str":"prod","application_id_str":"dcd9f975-3be3-4451-a9db-6bed1d906ae8","ALCH_ACCOUNT_ID_str":"","org_id_str":"d728d5da-5346-4614-b092-e17be0f9b820","timestamp":"2017-09-29T02:30:28.636Z"}
I have created an index in our local Elasticsearch as:
curl -XPUT 'localhost:9200/commslog?pretty' -H 'Content-Type: application/json' -d'
{
  "settings" : {
    "number_of_shards" : 1
  },
  "mappings" : {
    "logs" : {
      "properties" : {
        "instance_id_str" : { "type" : "text" },
        "source_id_str" : { "type" : "text" },
        "app_name_str" : { "type" : "text" },
        "message" : { "type" : "text" },
        "type" : { "type" : "text" },
        "event_uuid" : { "type" : "text" },
        "ALCH_TENANT_ID" : { "type" : "text" },
        "logmet_cluster" : { "type" : "text" },
        "org_name_str" : { "type" : "text" },
        "@timestamp" : { "type" : "date" },
        "message_type_str" : { "type" : "text" },
        "@version" : { "type" : "text" },
        "space_name_str" : { "type" : "text" },
        "application_id_str" : { "type" : "text" },
        "ALCH_ACCOUNT_ID_str" : { "type" : "text" },
        "org_id_str" : { "type" : "text" },
        "timestamp" : { "type" : "date" }
      }
    }
  }
}'
Now, to bulk upload the file, I used the command:
curl -XPOST -H 'Content-Type: application/x-ndjson' http://localhost:9200/commslog/logs/_bulk --data-binary '@commslogs.json'
The above command throws an error
Malformed action/metadata line [1], expected START_OBJECT or END_OBJECT but found [VALUE_STRING]
The solution is to follow the rules for bulk upload as per
https://discuss.elastic.co/t/bulk-insert-file-having-many-json-entries-into-elasticsearch/46470/2
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html
So I manually changed a few of the log statements by adding an action line before every document:
{ "index" : { "_index" : "commslog", "_type" : "logs" } }
This works!
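For illustration, the transformed file then looks like this (document lines abbreviated here):
{ "index" : { "_index" : "commslog", "_type" : "logs" } }
{"instance_id_str":"0","source_id_str":"APP/PROC/WEB","app_name_str":"ABC","message":"Hello", ...}
{ "index" : { "_index" : "commslog", "_type" : "logs" } }
{"instance_id_str":"0","source_id_str":"APP/PROC/WEB","app_name_str":"ABC","message":"EFG", ...}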
Another option was to call the curl command, providing the _index and _type in the path:
curl -XPOST -H 'Content-Type: application/x-ndjson' http://localhost:9200/commslog/logs/_bulk --data-binary '@commslogs.json'
but without the action lines this too throws the same error.
The problem is that we cannot do this for the thousands of log records we get. Is there a way to upload the files as downloaded from Bluemix, without adding the action lines?
NOTE: We are not using Logstash at the moment, but:
- Is it possible to use Logstash and just use grok to transform the logs and add the necessary entries?
- How can we bulk upload documents via Logstash?
- Is Logstash the ideal solution, or can we just write a program to transform the logs and do that?
Thanks
As @Alain Collins said, you should be able to use Filebeat directly.
For Logstash:
- It should be possible to use Logstash, but rather than using grok you should use the json codec/filter; it would be much easier.
- You can use the file input with Logstash to process many files and wait for it to finish (to know when it's finished, use a file/stdout output, possibly with the dot codec, and wait for it to stop writing).
- Instead of just transforming the files with Logstash, you should upload directly to Elasticsearch (with the elasticsearch output).
As for your problem, I think it will be much easier to just use a small program to add the missing action line, or to use Filebeat, unless you are experienced enough with Logstash to write a Logstash config quicker than a program that adds one line everywhere in the document.
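For example, a minimal sketch of such a program as a shell one-liner (the output file name commslogs_bulk.json is just an example):
# prepend the bulk action line before every log line
awk '{ print "{ \"index\" : { \"_index\" : \"commslog\", \"_type\" : \"logs\" } }"; print }' commslogs.json > commslogs_bulk.json
# then upload as before
curl -XPOST -H 'Content-Type: application/x-ndjson' http://localhost:9200/commslog/logs/_bulk --data-binary '@commslogs_bulk.json'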

Elasticsearch JDBC importer not importing entry correctly

Having the following mapping:
curl -XPUT 'localhost:9200/borrador' -d '{
  "mappings": {
    "item": {
      "dynamic": "strict",
      "properties" : {
        "body" : { "type": "string" },
        "source_id" : { "type": "integer" }
      }
    }
  }
}'
I'm trying to import my DB to Elasticsearch using the Elasticsearch-JDBC importer.
This is the script I'm using:
#!/bin/sh
bin=/usr/share/elasticsearch/elasticsearch-jdbc-2.1.1.2/bin
lib=/usr/share/elasticsearch/elasticsearch-jdbc-2.1.1.2/lib
echo "Indexando base de datos..."
echo '{
  "type" : "jdbc",
  "jdbc" : {
    "url" : "jdbc:mydbip/mydbname",
    "user" : "username",
    "password" : "pw",
    "sql" : "select source_id, body, id as _id from table_name",
    "index" : "borrador",
    "type" : "item"
  }
}' | java \
-cp "${lib}/*" \
-Dlog4j.configurationFile=${bin}/log4j2.xml \
org.xbib.tools.Runner \
org.xbib.tools.JDBCImporter
Most of the rows of the table are indexed correctly, but one particular row from that DB gives me an error and is not indexed correctly.
This is the error that shows up:
[ERROR][org.xbib.elasticsearch.helper.client.BulkTransportClient][elasticsearch[importer][listener][T#1]]
bulk [957] failed with 1 failed items, failure message = failure in
bulk execution:
[3499]: index [borrador], type [item], id [14327140], message [MapperParsingException[failed to parse [body]]; nested:
IllegalArgumentException[unknown property [records]];]
As you can see, this specific row has a JSON-formatted string ({"format":"MS Excel","price":"750","records":"577","recordType":"records"}<!-- com -->) instead of the plain string that the other, correctly indexed entries have.
What is happening? I would like to store that as a normal string. Is it a problem with the mapping, i.e. is it being read as JSON or something? Even if I remove the "dynamic": "strict", or the entire mapping, it still gives me the error. Thanks in advance.
By default the JDBC importer tries to detect JSON strings in your data and will parse them. You need to modify the configuration of your importer with the detect_json setting and set it to false:
{
  "type" : "jdbc",
  "jdbc" : {
    "url" : "jdbc:mydbip/mydbname",
    "user" : "username",
    "password" : "pw",
    "sql" : "select source_id, body, id as _id from table_name",
    "index" : "borrador",
    "type" : "item",
    "detect_json": false   <--- add this
  }
}

set default analyzer of index

First I wanted to set the default analyzer of ES, and failed. Then, following other questions and websites, I'm trying to set the default analyzer of one index, but there are some problems there too.
I have configured the ik analyzer, and I can set the analyzer for specific fields. Here is my command:
curl -XPUT localhost:9200/test
curl -XPUT localhost:9200/test/test/_mapping -d'{
  "test": {
    "properties": {
      "name": {
        "type": "string",
        "analyzer": "ik"
      }
    }
  }
}'
and get the message:
{"acknowledged":true}
Also, it works just as I wish.
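If you want to double-check which analyzer is applied, the _analyze API can help; for example (query-string form, which older versions accept, so adjust to your ES version):
curl -XGET 'localhost:9200/test/_analyze?analyzer=ik&text=some+text+to+analyze&pretty'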
But if I try to set the default analyzer of the index:
curl -XPOST localhost:9200/test1?pretty -d '{
  "index": {
    "analysis" : {
      "analyzer" : {
        "default" : {
          "type" : "ik"
        }
      }
    }
  }
}'
I get this error message:
{
  "error" : {
    "root_cause" : [ {
      "type" : "index_creation_exception",
      "reason" : "failed to create index"
    } ],
    "type" : "illegal_argument_exception",
    "reason" : "no default analyzer configured"
  },
  "status" : 400
}
So strange, isn't it?
Looking forward to your opinions about this problem. Thanks! :)
You're almost there; you're simply missing /_settings in your path. Do it like this instead. Also note that you need to close the index first and then reopen it after updating the analyzers.
// close index
curl -XPOST 'localhost:9200/test1/_close'

                            add this to the path
                                    |
                                    v
curl -XPUT localhost:9200/test1/_settings?pretty -d '{
  "index": {
    "analysis" : {
      "analyzer" : {
        "default" : {
          "type" : "ik"
        }
      }
    }
  }
}'

// re-open index
curl -XPOST 'localhost:9200/test1/_open'
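Alternatively (a sketch; the index name test2 is just an example), you can set the default analyzer in the settings when you create the index, which avoids the close/reopen step:
curl -XPUT 'localhost:9200/test2?pretty' -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "default": {
          "type": "ik"
        }
      }
    }
  }
}'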
