Update and search in multi-field properties in Elasticsearch

I'm trying to use multi-field properties for multi-language support. I created the following mapping for this:
{
"mappings": {
"product": {
"properties": {
"prod-id": {
"type": "string"
},
"prod-name": {
"type": "string",
"fields": {
"en": {
"type": "string",
"analyzer": "english"
},
"fr": {
"type": "string",
"analyzer": "french"
}
}
}
}
}
}
}
I created a test record:
{
"prod-id": "1234567",
"prod-name": [
"Test product",
"Produit d'essai"
]
}
and tried to query it in one specific language:
{
"query": {
"bool": {
"must": [
{"match": {
"prod-name.en": "Produit"
}}
]
}
}
}
As a result I got my document back. But I expected an empty result, because I searched for a French word in the English field. It seems Elasticsearch ignores which field I specify in the query: there is no difference in the search results whether I use "prod-name.en", "prod-name.fr" or just "prod-name". Is this behaviour expected? Do I need to do something special to search in just one language?
Another problem is updating a multi-field property: I can't update just one sub-field.
{
"doc" : {
"prod-name.en": "Test"
}
}
I got the following error:
{
"error": {
"root_cause": [
{
"type": "mapper_parsing_exception",
"reason": "Field name [prod-name.en] cannot contain '.'"
}
],
"type": "mapper_parsing_exception",
"reason": "Field name [prod-name.en] cannot contain '.'"
},
"status": 400
}
Is there any way to update just one field in a multi-field property?

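In your mapping, prod-name.en and prod-name.fr are multi-fields: every value you put in prod-name is indexed into both sub-fields, once with the english analyzer and once with the french analyzer. ES will not choose for you which value goes into which field, which is why "Produit" also matches on prod-name.en.
Instead, you need to turn prod-name into an object, like this: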
{
"mappings": {
"product": {
"properties": {
"prod-id": {
"type": "string"
},
"prod-name": {
"type": "object",
"properties": {
"en": {
"type": "string",
"analyzer": "english"
},
"fr": {
"type": "string",
"analyzer": "french"
}
}
}
}
}
}
}
and shape the input document like this; then you'll get the results you expect:
{
"prod-id": "1234567",
"prod-name": {
"en": "Test product",
"fr": "Produit d'essai"
}
}
As for the update, your partial document should look like this instead:
{
"doc" : {
"prod-name": {
"en": "Test"
}
}
}
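The two request bodies above follow the same rule: in a query the sub-field is addressed with a dotted path, while in a document or partial update it is a nested object. A minimal Python sketch of that rule, assuming the object mapping shown above (the helper names are hypothetical, and the bodies are only built here, not sent to a cluster):

```python
import json


def language_match_query(field, lang, text):
    """Build a match query against one language sub-field, e.g. prod-name.en."""
    # Queries address the sub-field with a dotted path.
    return {"query": {"bool": {"must": [{"match": {f"{field}.{lang}": text}}]}}}


def language_partial_update(field, lang, value):
    """Build an Update API body that changes only one language."""
    # Documents use a nested object instead of a dotted key.
    return {"doc": {field: {lang: value}}}


if __name__ == "__main__":
    print(json.dumps(language_match_query("prod-name", "en", "Produit"), indent=2))
    print(json.dumps(language_partial_update("prod-name", "en", "Test"), indent=2))
```

With the object mapping, the first body searches only the English value, so a French word such as "Produit" should no longer match there.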

Related

How to create a custom reusable type in ElasticSearch?

My json for ElasticSearch schema looks like this :-
{
"mappings": {
"properties": {
"DESCRIPTION_FR": {
"type": "text",
"analyzer": "french"
},
"FEEDBACK_FR": {
"type": "text",
"analyzer": "french"
},
"SOURCE_FR": {
"type": "text",
"analyzer": "french"
}
}
}
}
There are 100 properties like this. Replicating a change across all of them with this approach is redundant and error-prone.
Is there a way in Elasticsearch 7.2 to define a custom data type and reuse it in the property mappings, like this?
{
"settings": {
//definition of custom type "text_fr"
},
"mappings": {
"properties": {
"DESCRIPTION_FR": {
"type": "text_fr"
},
"FEEDBACK_FR": {
"type": "text_fr"
},
"SOURCE_FR": {
"type": "text_fr"
}
}
}
}
Yes! What you're after are dynamic mapping templates, more specifically their match parameter.
Define the target field names with a leading wildcard:
PUT my_index
{
"mappings": {
"dynamic_templates": [
{
"is_french_text": {
"match_mapping_type": "*",
"match": "*_FR",
"mapping": {
"type": "text",
"analyzer": "french"
}
}
}
]
}
}
Insert a doc:
POST my_index/_doc
{
"DESCRIPTION_FR": "je",
"FEEDBACK_FR": "oui",
"SOURCE_FR": "je ne sais quoi"
}
Verify the dynamically generated mapping:
GET my_index/_mapping
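To preview which field names a pattern such as "*_FR" would pick up before creating the index, the template can be checked offline. A rough Python sketch, assuming that Python's fnmatch is a close enough stand-in for the simple wildcard matching used by match (it is only an approximation, and the function names are hypothetical):

```python
from fnmatch import fnmatchcase


def french_template(pattern="*_FR"):
    """Return the dynamic template from the answer for a given name pattern."""
    return {
        "is_french_text": {
            "match_mapping_type": "*",
            "match": pattern,
            "mapping": {"type": "text", "analyzer": "french"},
        }
    }


def fields_covered(template, field_names):
    """List the field names whose name matches the template's pattern."""
    # Approximation: fnmatch stands in for Elasticsearch's wildcard matching.
    pattern = next(iter(template.values()))["match"]
    return [name for name in field_names if fnmatchcase(name, pattern)]


if __name__ == "__main__":
    names = ["DESCRIPTION_FR", "FEEDBACK_FR", "TITLE_EN", "FR_SOURCE"]
    print(fields_covered(french_template(), names))
```

Keep in mind that dynamic templates only apply to fields that are added dynamically; fields declared explicitly under properties keep their explicit mapping.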

Update a string parameter in Elasticsearch _mapping

I have such a _mapping in Elasticsearch 6.8:
{
"grch38_test__wes__grch38__variants__20210222" : {
"mappings" : {
"variant" : {
"_meta" : {
"gencodeVersion" : "25",
"hail_version" : "0.2.20",
"genomeVersion" : "38",
"sampleType" : "WES",
"sourceFilePath" : "s3://my_folder/my_vcf.vcf"
},
...
My goal is to issue a query in Kibana to modify variant._meta.sourceFilePath. Following thread:
Elastic search mapping for nested json objects
I was able to come up with the query:
PUT /grch38_test__wes__grch38__variants__20210222/_mapping/variant
{
"properties": {
"variant": {
"type": "nested",
"properties": {
"_meta": {
"type": "nested",
"properties": {
"type": "text",
"sourceFilePath": "s3://my_folder/my_vcf.vcf"
}
}
}
}
}
}
But it's giving me an error:
elasticsearch mapping Expected map for property [fields] on field [name] but got a class java.lang.String
Full error message:
{
"error": {
"root_cause": [
{
"type": "mapper_parsing_exception",
"reason": "Expected map for property [fields] on field [type] but got a class java.lang.String"
}
],
"type": "mapper_parsing_exception",
"reason": "Expected map for property [fields] on field [type] but got a class java.lang.String"
},
"status": 400
}
I have also tried:
PUT /grch38_test__wes__grch38__variants__20210222/_mapping/variant
{
"properties": {
"variant": {
"type": "nested",
"properties": {
"_meta": {
"type": "nested",
"properties": {
"sourceFilePath": {
"type": "text",
"value":"s3://my_folder/my_vcf.vcf"
}
}
}
}
}
}
}
But it's telling me that value is unsupported:
{
"error": {
"root_cause": [
{
"type": "mapper_parsing_exception",
"reason": "Mapping definition for [sourceFilePath] has unsupported parameters: [value : s3://seqr-dp-data--prod/vcf/dev/grch38_test_contracted.vcf]"
}
],
"type": "mapper_parsing_exception",
"reason": "Mapping definition for [sourceFilePath] has unsupported parameters: [value : s3://seqr-dp-data--prod/vcf/dev/grch38_test_contracted.vcf]"
},
"status": 400
}
What am I doing wrong? How do I modify the field?
_meta is a reserved field for storing application-specific metadata. It's not meant to be searchable and can be only retrieved through the GET Mapping API.
This means that if your _meta content was intended to be consistent with what the _meta field is designed for, you cannot apply any mappings to it. It's a "final" hashmap of concrete values and would need to be defined at the top level of your update-mapping payload:
PUT /grch38_test__wes__grch38__variants__20210222/_mapping/variant
{
"_meta": {
"variant": { <-- shared index-level metadata
"gencodeVersion": "25",
"hail_version": "0.2.20",
"genomeVersion": "38",
"sampleType": "WES",
"sourceFilePath": "s3://my_folder/my_vcf.vcf"
}
},
"properties": {
"some_text_field": { <-- actual document properties
"type": "text"
}
}
}
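Since _meta holds plain values rather than mapped fields, changing one entry can be treated as sending the metadata again with that entry replaced. A minimal Python sketch, assuming the current _meta has already been read through the GET Mapping API, that it is the flat object shown in the question, and that the whole object is resent (the helper name and the new path are hypothetical):

```python
import copy
import json


def meta_update_body(current_meta, key, value):
    """Return an update-mapping body whose _meta has one entry replaced."""
    new_meta = copy.deepcopy(current_meta)  # leave the caller's dict untouched
    new_meta[key] = value
    return {"_meta": new_meta}


if __name__ == "__main__":
    current = {
        "gencodeVersion": "25",
        "hail_version": "0.2.20",
        "genomeVersion": "38",
        "sampleType": "WES",
        "sourceFilePath": "s3://my_folder/my_vcf.vcf",
    }
    # Hypothetical new location, used only for illustration.
    body = meta_update_body(current, "sourceFilePath", "s3://my_folder/my_new_vcf.vcf")
    print(json.dumps(body, indent=2))
```

The printed body would then be sent with the same PUT .../_mapping/variant request as above.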
If, on the other hand, your _meta field is an unfortunate naming coincidence, you can declare the mappings for it like so:
PUT /grch38_test__wes__grch38__variants__20210222/_mapping/variant
{
"properties": {
"_meta": {
"properties": {
"variant": {
"properties": {
"gencodeVersion": {
"type": "text"
},
"genomeVersion": {
"type": "text"
},
"hail_version": {
"type": "text"
},
"sampleType": {
"type": "text"
},
"sourceFilePath": {
"type": "text"
}
}
}
}
}
}
}
and ingest documents of the form:
POST grch38_test__wes__grch38__variants__20210222/variant
{
"_meta": {
"variant": {
"gencodeVersion": "25",
"hail_version": "0.2.20",
"genomeVersion": "38",
"sampleType": "WES",
"sourceFilePath": "s3://my_folder/my_vcf.vcf"
}
}
}
But again, the _meta content would be document-specific, not index-wide!
And BTW, the nested mapping only makes sense if you're dealing with arrays of objects, not objects of objects.
But if you insist on wanting it, here's how you'd do it:
PUT /grch38_test__wes__grch38__variants__20210222/_mapping/variant?include_type_name
{
"properties": {
"_meta": {
"type": "nested", <---
"properties": {
"variant": {
"type": "nested", <---
"properties": {
"gencodeVersion": {
"type": "text"
},
"genomeVersion": {
"type": "text"
},
"hail_version": {
"type": "text"
},
"sampleType": {
"type": "text"
},
"sourceFilePath": {
"type": "text"
}
}
}
}
}
}
}

Return only top level fields in elasticsearch query?

I have a document that has nested fields. Example:
"mappings": {
"blogpost": {
"properties": {
"title": { "type": "text" },
"body": { "type": "text" },
"comments": {
"type": "nested",
"properties": {
"name": { "type": "text" },
"comment": { "type": "text" },
"age": { "type": "short" },
"stars": { "type": "short" },
"date": { "type": "date" }
}
}
}
}
}
}
Can the query be modified so that the response only contains non-nested fields?
In this example, the response would only contain body and title.
Using _source you can exclude/include fields
GET /blogpost/_search
{
"_source":{
"excludes":["comments"]
}
}
But you have to put the field names in excludes explicitly. I'm looking for a way to exclude all nested fields without knowing their names.
You can achieve that, but only in a static way, which means you list the field name(s) under the excludes keyword, like:
GET your_index/_search
{
"_source": {
"excludes": "comments"
},
"query": {
"match_all" : {}
}
}
excludes can take an array of strings; not just one string.
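If the names are not known in advance, one workaround is to derive them from the index mapping on the client side and then fill in excludes. A minimal Python sketch, assuming the properties dict has already been fetched with the GET Mapping API (the function names are hypothetical and nothing is sent to a cluster here):

```python
import json

# The "properties" section of the blogpost mapping from the question.
BLOGPOST_PROPERTIES = {
    "title": {"type": "text"},
    "body": {"type": "text"},
    "comments": {
        "type": "nested",
        "properties": {
            "name": {"type": "text"},
            "comment": {"type": "text"},
            "age": {"type": "short"},
            "stars": {"type": "short"},
            "date": {"type": "date"},
        },
    },
}


def nested_field_names(properties, prefix=""):
    """Collect the paths of all fields mapped with type nested."""
    names = []
    for name, spec in properties.items():
        path = f"{prefix}{name}"
        if spec.get("type") == "nested":
            names.append(path)  # excluding the parent also drops its children
        elif "properties" in spec:
            names.extend(nested_field_names(spec["properties"], prefix=f"{path}."))
    return names


def search_without_nested(properties):
    """Build a match_all search body that excludes every nested field."""
    return {
        "_source": {"excludes": nested_field_names(properties)},
        "query": {"match_all": {}},
    }


if __name__ == "__main__":
    print(json.dumps(search_without_nested(BLOGPOST_PROPERTIES), indent=2))
```

The excludes list is still static from Elasticsearch's point of view; the sketch only moves the bookkeeping into client code.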

Simple elasticsearch input - Rejecting mapping update final mapping would have more than 1 type: [_doc, doc]

I'm trying to send data to elasticsearch but running into an issue where my number field only comes up as a string. These are the steps I took.
Step 1. Add index & map
PUT http://123.com:5101/core_060619/
{
"mappings": {
"properties": {
"date": {
"type": "date",
"format": "HH:mm yyyy-MM-dd"
},
"data": {
"type": "integer"
}
}
}
}
Result:
{
"acknowledged": true,
"shards_acknowledged": true,
"index": "core_060619"
}
Step 2. Add data
PUT http://123.com:5101/core_060619/doc/1
{
"test" : [ {
"data" : "119050300",
"date" : "00:00 2019-06-03"
} ]
}
Result:
{
"error": {
"root_cause": [
{
"type": "illegal_argument_exception",
"reason": "Rejecting mapping update to [zyxnewcoreyxbl_060619] as the final mapping would have more than 1 type: [_doc, doc]"
}
],
"type": "illegal_argument_exception",
"reason": "Rejecting mapping update to [zyxnewcoreyxbl_060619] as the final mapping would have more than 1 type: [_doc, doc]"
},
"status": 400
}
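An index cannot contain more than one document type in Elasticsearch 6.0.0+. Your mapping was created without a type name, so it got the default type _doc, and indexing into /doc/1 then tries to add a second type, doc. If you declare the document type as doc in the mapping, you can add documents with PUT http://123.com:5101/core_060619/doc/1, PUT http://123.com:5101/core_060619/doc/2, etc.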
Elasticsearch 6.+
PUT core_060619/
{
"mappings": {
"doc": { //type of documents in index is 'doc'
"properties": {
"date": {
"type": "date",
"format": "HH:mm yyyy-MM-dd"
},
"data": {
"type": "integer"
}
}
}
}
}
Since the mapping now declares doc as the document type, we can add new documents by simply appending /doc/<id>:
PUT core_060619/doc/1
{
"test" : [ {
"data" : "119050300",
"date" : "00:00 2019-06-03"
} ]
}
PUT core_060619/doc/2
{
"test" : [ {
"data" : "111120300",
"date" : "10:15 2019-06-02"
} ]
}
Elasticsearch 7.+
Types are removed from the APIs, so the mapping is declared without a type name, but you can emulate types with a custom field such as type:
PUT twitter
{
"mappings": {
"properties": {
"type": { "type": "keyword" },
"name": { "type": "text" },
"user_name": { "type": "keyword" },
"email": { "type": "keyword" },
"content": { "type": "text" },
"tweeted_at": { "type": "date" }
}
}
}
PUT twitter/_doc/user-kimchy
{
"type": "user",
"name": "Shay Banon",
"user_name": "kimchy",
"email": "shay@kimchy.com"
}
PUT twitter/_doc/tweet-1
{
"type": "tweet",
"user_name": "kimchy",
"tweeted_at": "2017-10-24T09:00:00Z",
"content": "Types are going away"
}
GET twitter/_search
{
"query": {
"bool": {
"must": {
"match": {
"user_name": "kimchy"
}
},
"filter": {
"match": {
"type": "tweet"
}
}
}
}
}
Removal of mapping types

In Elasticsearch, can I set language-specific multi-fields?

Is it possible to use multi-fields to set and query multilingual fields?
Consider this mapping:
PUT multi_test
{
"mappings": {
"data": {
"_field_names": {
"enabled": false
},
"properties": {
"book_title": {
"type": "text",
"fields": {
"english": {
"type": "text",
"analyzer": "english"
},
"german": {
"type": "text",
"analyzer": "german"
},
"italian": {
"type": "text",
"analyzer": "italian"
}
}
}
}
}
}
}
I tried the following, but it doesn't work:
PUT multi_test/data/1
{
"book_title.english": "It's good",
"book_title.german": "Das gut"
}
The error seems to indicate I'm trying to add new fields:
{
"error": {
"root_cause": [
{
"type": "mapper_parsing_exception",
"reason": "Could not dynamically add mapping for field [book_title.english]. Existing mapping for [book_title] must be of type object but found [text]."
}
],
"type": "mapper_parsing_exception",
"reason": "Could not dynamically add mapping for field [book_title.english]. Existing mapping for [book_title] must be of type object but found [text]."
},
"status": 400
}
What am I doing wrong here?
If my approach is unworkable, what is a better way to do this?
The problem is that you are using fields (multi-fields) for the field book_title.
The fields keyword is used when you want to index the same field and the same data in multiple ways, i.e. with different analyzers or other settings; the value is identical in all sub-fields declared under fields. Multi-fields are described here: https://www.elastic.co/guide/en/elasticsearch/reference/2.4/multi-fields.html
In your use case the mapping should look like this:
PUT multi_test
{
"mappings": {
"data": {
"_field_names": {
"enabled": false
},
"properties": {
"book_title": {
"properties": {
"english": {
"type": "text",
"analyzer": "english"
},
"german": {
"type": "text",
"analyzer": "german"
},
"italian": {
"type": "text",
"analyzer": "italian"
}
}
}
}
}
}
}
This defines book_title as an object type, so you can store different data in each field under book_title.
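With the object mapping, the document carries one value per language as a nested object rather than dotted keys. A small Python sketch of building such a document, assuming you want to reject languages that are missing from the mapping instead of letting them be mapped dynamically (the helper name and the check are illustrative, not part of Elasticsearch):

```python
import json

# The book_title part of the object mapping shown above.
MAPPING_PROPERTIES = {
    "book_title": {
        "properties": {
            "english": {"type": "text", "analyzer": "english"},
            "german": {"type": "text", "analyzer": "german"},
            "italian": {"type": "text", "analyzer": "italian"},
        }
    }
}


def language_document(mapping_properties, field, translations):
    """Build a document body with one value per mapped language."""
    allowed = set(mapping_properties[field]["properties"])
    unknown = set(translations) - allowed
    if unknown:
        raise ValueError(f"unmapped language(s) for {field}: {sorted(unknown)}")
    return {field: dict(translations)}


if __name__ == "__main__":
    doc = language_document(
        MAPPING_PROPERTIES,
        "book_title",
        {"english": "It's good", "german": "Das gut"},
    )
    print(json.dumps(doc, indent=2))
```

The printed body is what would be sent with PUT multi_test/data/1 in place of the dotted-key document from the question.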
