Trying to define an index within Elasticsearch for py-image-dedup

I'm trying to get py-image-dedup working (https://github.com/markusressel/py-image-dedup), which requires an index built within Elasticsearch. So far so good: the Python code for py-image-dedup works, and Elasticsearch (installed with brew install elasticsearch) is running happily at 127.0.0.1:9200.
So now I try to build the index. The instructions say:
curl -X PUT "127.0.0.1:9200/images?pretty" -H "Content-Type: application/json" -d "
{
\"mappings\": {
\"image\": {
\"properties\": {
\"path\": {
\"type\": \"keyword\",
\"ignore_above\": 256
}
}
}
}
}
This is clearly missing a closing " at the end, and no variant of it that I have tried works.
I tried:
curl -X PUT "127.0.0.1:9200/images?pretty" -H "Content-Type: application/json" -d "{\"mappings\":{\"image\":{\"properties\":{\"path\":{\"type\":\"keyword\",\"ignore_above\":256}}}}} "
which looks sensible, but I get:
{
"error" : {
"root_cause" : [
{
"type" : "mapper_parsing_exception",
"reason" : "Root mapping definition has unsupported parameters: [image : {properties={path={ignore_above=256, type=keyword}}}]"
}
],
"type" : "mapper_parsing_exception",
"reason" : "Failed to parse mapping [_doc]: Root mapping definition has unsupported parameters: [image : {properties={path={ignore_above=256, type=keyword}}}]",
"caused_by" : {
"type" : "mapper_parsing_exception",
"reason" : "Root mapping definition has unsupported parameters: [image : {properties={path={ignore_above=256, type=keyword}}}]"
}
},
"status" : 400
}
I cannot for the life of me see why the index is not being created. Grateful for any help.

You are trying to use mapping types, which have been deprecated: https://www.elastic.co/guide/en/elasticsearch/reference/current/removal-of-types.html
Drop the image type from your mapping definition so that properties sits directly under mappings (and close the -d argument with a final "):
curl -X PUT "127.0.0.1:9200/images?pretty" -H "Content-Type: application/json" -d "
{
\"mappings\": {
\"properties\": {
\"path\": {
\"type\": \"keyword\",
\"ignore_above\": 256
}
}
}
}"
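If the same index body has to work with older instructions that still wrap the mapping in a type name, the unwrapping can be done in code rather than by hand. The following is a minimal sketch, not part of py-image-dedup; the helper name drop_mapping_type is hypothetical, and it only handles the simple case of exactly one type wrapping a properties object.

```python
import copy
import json


def drop_mapping_type(body):
    """Return a copy of an index-creation body with a single legacy
    mapping type (e.g. "image") unwrapped. Hypothetical helper."""
    result = copy.deepcopy(body)
    mappings = result.get("mappings", {})
    # Already typeless, or not the simple one-type case: leave untouched.
    if "properties" in mappings or len(mappings) != 1:
        return result
    (type_name, type_body), = mappings.items()
    if isinstance(type_body, dict) and "properties" in type_body:
        result["mappings"] = type_body
    return result


legacy = {
    "mappings": {
        "image": {
            "properties": {
                "path": {"type": "keyword", "ignore_above": 256}
            }
        }
    }
}

typeless = drop_mapping_type(legacy)
# The printed JSON can be used as the body of the PUT request.
print(json.dumps(typeless, indent=2))
```

The input dictionary is left unchanged, so the legacy and typeless forms can be compared side by side.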

Related

A mapper_parsing_exception occurred when using the bulk API of Elasticsearch

Elasticsearch version: 8.3.3
Indexing was performed using the following Elasticsearch API.
curl -X POST "localhost:9200/bulk_meta/_doc/_bulk?pretty" -H 'Content-Type: application/json' -d'
{"index": { "_id": "1"}}
{"mydoc": "index action, id 1 "}
{"index": {}}
{"mydoc": "index action, id 2"}
'
In this case, the following error occurred.
{
"error" : {
"root_cause" : [
{
"type" : "mapper_parsing_exception",
"reason" : "failed to parse"
}
],
"type" : "mapper_parsing_exception",
"reason" : "failed to parse",
"caused_by" : {
"type" : "illegal_argument_exception",
"reason" : "Malformed content, found extra data after parsing: START_OBJECT"
}
},
"status" : 400
}
I've seen posts asking to add \n, but that didn't help.
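You need to remove _doc from the request path. Without the typed endpoint, /bulk_meta/_doc/_bulk is most likely handled as indexing a single document, which would explain why the parser complains about extra data after the first JSON object.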
curl -X POST "localhost:9200/bulk_meta/_bulk?pretty" -H 'Content-Type: application/json' -d'
{"index":{"_id":"1"}}
{"mydoc":"index action, id 1 "}
{"index":{}}
{"mydoc":"index action, id 2"}
'
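When the bulk body is generated by a program rather than typed into a shell, the newline-delimited format is easy to get right. The sketch below assumes only the action/source line pairs shown above; build_bulk_body is a hypothetical helper name, and it covers the index action only.

```python
import json


def build_bulk_body(docs):
    """Build an NDJSON body for the _bulk endpoint.

    docs is an iterable of (doc_id, source) pairs; a doc_id of None
    lets Elasticsearch generate the id. Hypothetical helper.
    """
    lines = []
    for doc_id, source in docs:
        action = {"index": {} if doc_id is None else {"_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # Every line, including the last one, is terminated by a newline.
    return "".join(line + "\n" for line in lines)


body = build_bulk_body([
    ("1", {"mydoc": "index action, id 1 "}),
    (None, {"mydoc": "index action, id 2"}),
])
print(body, end="")
```

The resulting string would be posted to localhost:9200/bulk_meta/_bulk with a JSON or NDJSON content type.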

Elasticsearch: strict_dynamic_mapping_exception

Hi,
I am trying to modify the date format in an elasticsearch index (operate-operation-0.26.0_). But I get the following error.
{
"took" : 148,
"errors" : true,
"items" : [
{
"index" : {
"_index" : "operate-operation-0.26.0_",
"_type" : "_doc",
"_id" : "WBGhSXcB_hD8-yfn-Rh5",
"status" : 400,
"error" : {
"type" : "strict_dynamic_mapping_exception",
"reason" : "mapping set to strict, dynamic introduction of [dynamic] within [_doc] is not allowed"
}
}
}
]
}
The json file I am using is bulk6.json:
{"index":{}}
{"dynamic":"strict","properties":{"date":{"type":"date","format":"yyyy-MM-dd'T'HH:mm:ss.SSSZZ"}}}
The command I am running is
curl -H "Content-Type: application/x-ndjson" -XPOST 'localhost:9200/operate-operation-0.26.0_/_bulk?pretty&refresh' --data-binary @"bulk6.json"
The _bulk API endpoint is not meant for changing mappings. It indexes documents, so your mapping JSON was treated as a document, and its dynamic field was rejected by the strict mapping. Use the _mapping API endpoint instead.
The JSON file mapping.json should contain:
{
"dynamic": "strict",
"properties": {
"date": {
"type": "date",
"format": "yyyy-MM-dd'T'HH:mm:ss.SSSZZ"
}
}
}
And then the call can be made like this:
curl -H "Content-Type: application/json" -XPUT 'localhost:9200/operate-operation-0.26.0_/_mapping?pretty&refresh' --data-binary @"mapping.json"
However, this is still not going to work as you're not allowed to change the date format after the index has been created. You're going to get the following error:
{
"error" : {
"root_cause" : [
{
"type" : "illegal_argument_exception",
"reason" : "Mapper for [date] conflicts with existing mapper:\n\tCannot update parameter [format] from [strict_date_optional_time||epoch_millis] to [yyyy-MM-dd'T'HH:mm:ss.SSSZZ]"
}
],
"type" : "illegal_argument_exception",
"reason" : "Mapper for [date] conflicts with existing mapper:\n\tCannot update parameter [format] from [strict_date_optional_time||epoch_millis] to [yyyy-MM-dd'T'HH:mm:ss.SSSZZ]"
},
"status" : 400
}
You need to create a new index with the desired mapping and reindex your data into it.
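The create-then-reindex sequence can be sketched as two requests: a PUT that creates the new index with the mapping, followed by a POST to the _reindex endpoint. The code below only builds the request descriptions and does not send anything; plan_reindex and the target index name operate-operation-new are hypothetical and chosen for illustration.

```python
import json


def plan_reindex(old_index, new_index, mapping):
    """Return (method, path, body) tuples for creating new_index with
    the given mapping and copying documents over from old_index."""
    return [
        ("PUT", f"/{new_index}", {"mappings": mapping}),
        ("POST", "/_reindex", {
            "source": {"index": old_index},
            "dest": {"index": new_index},
        }),
    ]


mapping = {
    "dynamic": "strict",
    "properties": {
        "date": {"type": "date", "format": "yyyy-MM-dd'T'HH:mm:ss.SSSZZ"}
    },
}

# "operate-operation-new" is a made-up name for the replacement index.
steps = plan_reindex("operate-operation-0.26.0_", "operate-operation-new", mapping)
for method, path, body in steps:
    print(method, path)
    print(json.dumps(body, indent=2))
```

Each printed body can be passed to curl with -XPUT or -XPOST in the same way as the commands above.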

Elasticsearch error with null value for dense vector datatype

I created an index with a dense_vector:
curl -X PUT "localhost:9200/my_index?pretty" -H 'Content-Type: application/json' -d'
{
"mappings": {
"properties": {
"my_vector": {
"type": "dense_vector",
"dims": 3
}
}
}
}
'
When I index a document with a vector it works well:
curl -X PUT "localhost:9200/my_index/_doc/1?pretty" -H 'Content-Type: application/json' -d'
{
"my_vector" : [0.5, 10, 6]
}
'
But when I index a document with a null value for the vector, it returns an error:
curl -X PUT "localhost:9200/my_index/_doc/2?pretty" -H 'Content-Type: application/json' -d'
{
"my_vector" : null
}
'
The error is:
{
"error" : {
"root_cause" : [
{
"type" : "parsing_exception",
"reason" : "Failed to parse object: expecting token of type [VALUE_NUMBER] but found [END_OBJECT]",
"line" : 5,
"col" : 1
}
],
"type" : "mapper_parsing_exception",
"reason" : "failed to parse",
"caused_by" : {
"type" : "parsing_exception",
"reason" : "Failed to parse object: expecting token of type [VALUE_NUMBER] but found [END_OBJECT]",
"line" : 5,
"col" : 1
}
},
"status" : 400
}
How can I handle null value for vector type in ES?
Instead of setting the field to null, you can remove it from that particular document, which is equivalent to setting it to null, using the following request:
curl --location --request POST 'http://{ip}:9200/my_index/_update/{docId}' \
--header 'Content-Type: application/json' \
--data-raw '{
"script" : "ctx._source.remove(\"my_vector\")"
}'
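For documents that have not been indexed yet, the same idea (leave the field out rather than send null) can be applied on the client before the request is made. This is a minimal sketch; without_nulls is a hypothetical helper and it only looks at top-level fields.

```python
import json


def without_nulls(doc):
    """Return a copy of doc with top-level None-valued fields removed."""
    return {key: value for key, value in doc.items() if value is not None}


doc_with_vector = {"my_vector": [0.5, 10, 6]}
doc_without_vector = {"my_vector": None, "title": "no embedding yet"}

# Bodies that would be sent to /my_index/_doc/<id>
print(json.dumps(without_nulls(doc_with_vector)))
print(json.dumps(without_nulls(doc_without_vector)))
```

The title field is made up for the example; only my_vector comes from the mapping above.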

Using Curl to put data into ES and got Unexpected character ('n' (code 110))

I'm using Curl to put data into ES. I have already created a customer index.
The following command is from the ES documentation.
curl -X PUT "localhost:9200/customer/_doc/1?pretty" -H 'Content-Type: application/json' -d'
{
"name": "John Doe"
}
'
When I do this, I get an error.
{
"error" : {
"root_cause" : [
{
"type" : "mapper_parsing_exception",
"reason" : "failed to parse"
}
],
"type" : "mapper_parsing_exception",
"reason" : "failed to parse",
"caused_by" : {
"type" : "json_parse_exception",
"reason" : "Unexpected character ('n' (code 110)): was expecting double-quote to start field name\n at [Source: org.elasticsearch.common.bytes.BytesReference$MarkSupportingStreamInputWrapper@1ec5236e; line: 3, column: 4]"
}
},
"status" : 400
}
I think the following part is the main cause of my error:
"reason" : "Unexpected character ('n' (code 110)): was expecting double-quote to start field name"
I have a feeling that I need to use \ (backslash) to escape. However, my attempt with \' is not working. Any advice?
I made it work as shown below.
curl -X PUT "localhost:9200/customer/_doc/1?pretty" -H 'Content-Type: application/json' -d '
{
\"name\": \"John Doe\" <==== I used "backslash" in front of all the "
}
'
Answer without my comment:
curl -X PUT "localhost:9200/customer/_doc/1?pretty" -H 'Content-Type: application/json' -d '
{
\"name\": \"John Doe\"
}
'
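Quoting rules differ between shells, so whether the backslashes are needed depends on where the command is run. One way to sidestep shell quoting altogether is to build the JSON body in code. The sketch below uses only the Python standard library; build_put_request and send are hypothetical helper names, and nothing is sent unless send is called against a running server.

```python
import json
import urllib.request


def build_put_request(url, doc):
    """Build a PUT request whose body is doc serialized as JSON."""
    data = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def send(request):
    """Send the request and return the decoded response body.
    Requires a reachable Elasticsearch server."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


req = build_put_request(
    "http://localhost:9200/customer/_doc/1?pretty",
    {"name": "John Doe"},
)
# json.dumps produces the double quotes itself; no escaping is needed.
print(req.data.decode("utf-8"))
# print(send(req))  # uncomment with Elasticsearch running locally
```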

Error while sending data into Elasticsearch

While loading a dataset into Elasticsearch with this curl command:
curl -H "Content-Type: application/x-ndjson" -XPOST "localhost:9200/shakespeare/doc/_bulk?pretty" --data-binary @$shakespeare_6.0
I get the following warning:
Warning: Couldn't read data from file "$shakespeare_6.0", this makes an empty
Warning: POST.
{
"error" : {
"root_cause" : [
{
"type" : "parse_exception",
"reason" : "request body is required"
}
],
"type" : "parse_exception",
"reason" : "request body is required"
},
"status" : 400
}
My data is:
{"index":{"_index":"shakespeare","_id":0}}
{"type":"act","line_id":1,"play_name":"Henry IV", "speech_number":"","line_number":"","speaker":"","text_entry":"ACT I"}
What is the root cause of this warning? I am using 64-bit Windows 10.
Also, please let me know what other ways there are to send data into Elasticsearch. I am a beginner.
You provided the wrong file name. The name of that file is shakespeare_6.0.json, not $shakespeare_6.0. This is the correct command:
curl -H "Content-Type: application/x-ndjson" -XPOST "localhost:9200/shakespeare/doc/_bulk?pretty" --data-binary @shakespeare_6.0.json
This assumes that the file is in the current directory.
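A script that uploads the file can catch this kind of mistake before any request is made, instead of silently sending an empty POST. The following is a minimal sketch using the standard library; read_bulk_file is a hypothetical helper, and appending a trailing newline is a precaution for the newline-delimited bulk format.

```python
from pathlib import Path


def read_bulk_file(path):
    """Read an NDJSON bulk file, failing loudly if it does not exist."""
    bulk_path = Path(path)
    if not bulk_path.is_file():
        raise FileNotFoundError(f"bulk file not found: {bulk_path.resolve()}")
    data = bulk_path.read_bytes()
    # Make sure the final line is newline-terminated.
    if not data.endswith(b"\n"):
        data += b"\n"
    return data


try:
    # The mistyped name from the question, used here to show the failure.
    read_bulk_file("$shakespeare_6.0")
except FileNotFoundError as error:
    print(error)
```

The returned bytes would be the request body for the _bulk endpoint.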
