I want to find out with a rest call to which template an index is bound to. So basically pass the index name and get which template it belongs to.
Basically, I know I can list all the templates and see by patterns what indices will bind to the template, but we have so many templates and so many orderings on them that it's hard to tell.
You can use the _meta mapping field for this in order to attach any custom information to your indexes.
So let's say you have an index template like this
PUT _index_template/template_1
{
"index_patterns": ["index*"],
"template": {
"settings": {
"number_of_shards": 1
},
"mappings": {
"_meta": { <---- add this
"template": "template_1" <---- add this
}, <---- add this
"_source": {
"enabled": true
},
"properties": {
"host_name": {
"type": "keyword"
},
"created_at": {
"type": "date",
"format": "EEE MMM dd HH:mm:ss Z yyyy"
}
}
},
"aliases": {
}
},
"_meta": {
"description": "my custom template"
}
}
Once you create and index that matches that template's pattern, the _meta field will also make it into the new index you're creating.
PUT index1
Then if you get that new index settings, you'll see from which template it was created:
GET index1?filter_path=**._meta.template
=>
{
"index1" : {
"mappings" : {
"_meta" : {
"template" : "template_1" <---- you get this
},
Related
Is is any way in elastic to store index template per alias.
I mean create Index with multiple aliases (alias1 ,alias2 ..) and attach different template to each of them. Then perform Index/Search docs on specific alias.
The reason I'm doing so due to multiple different data-structure (up to 50 types) of documents.
What I did so far is :
1. PUT /dynamic_index
2. POST /_aliases
{ "actions" : [
{ "add" : { "index" : "dynamic_index", "alias" : "alias_type1" } },
{ "add" : { "index" : "dynamic_index", "alias" : "alias_type2" } },
{ "add" : { "index" : "dynamic_index", "alias" : "alias_type3" } }
]}
3.
PUT_template/template1 {
"index_patterns": [
"dynamic_index"
],
"mappings": {
"dynamic_templates": [
{
"strings_as_keywords": {
"match_mapping_type": "string",
"mapping": {
"type": "text",
"analyzer": "standard",
"copy_to": "_all",
"fields": {
"keyword": {
"type": "keyword",
"normalizer": "lowercase_normalizer"
}
}
}
}
}
],
"properties": {
"source": {
"type": "keyword"
}
}
},
"aliases": {
"alias_type1": {
}
}
}
4. same way to alias_type2 , alias_type3 but different fields ...
Indexing/Search : Trying create and search docs per alias like in example:
POST alias_type1/_doc
{
"source": "foo"
, .....
}
POST alias_type2/_doc
{
"source": "foo123"
, .....
}
GET alias_type1/_search
{
"query": {
"match_all": {}
}
}
GET alias_type2/_search
{
"query": {
"match_all": {}
}
}
What I see actually that even if I index documents per alias,
when searching I don't see result per alias ,all results are same on alias_type1,2 and even on index.
Any way I can achieve separation logic on each alias in terms of searches/index docs per type (alias) ?
Any ideas ?
You can’t have separate mapping for aliases pointing to the same index! Aliases are like virtual link pointing to a index so if your aliases pointing to same index you will get the same result back.
If you want to have different mapping based on your data structure you will need to creat multiple indices.
Update
You also can use custom routing based on a field for more information you can check Elastic official documentation here.
It is possible to include a "date and time" field in a document that receives elasticsearch without it being previously defined.
The date and time corresponds to the one received by the json to elasticsearch
This is the mapping:
{
"mappings": {
"properties": {
"entries":{"type": "nested"
}
}
}
}
Is it possible that it can be defined in the mapping field so that elasticsearch includes the current date automatically?
What you can do is to define an ingest pipeline to automatically add a date field when your document are indexed.
First, create a pipeline, like this (_ingest.timestamp is a built-in field that you can access):
PUT _ingest/pipeline/add-current-time
{
"description" : "automatically add the current time to the documents",
"processors" : [
{
"set" : {
"field": "#timestamp",
"value": "_ingest.timestamp"
}
}
]
}
Then when you index a new document, you need to reference the pipeline, like this:
PUT test-index/_doc/1?pipeline=add-current-time
{
"my_field": "test"
}
After indexing, the document would look like this:
GET test-index/_doc/1
=>
{
"#timestamp": "2020-08-12T15:48:00.000Z",
"my_field": "test"
}
UPDATE:
Since you're using index templates, it's even easier because you can define a default pipeline to be run for each indexed documents.
In your index templates, you need to add this to the index settings:
{
"order": 1,
"index_patterns": [
"attom"
],
"aliases": {},
"settings": {
"index": {
"number_of_shards": "5",
"number_of_replicas": "1",
"default_pipeline": "add-current-time" <--- add this
}
},
...
Then you can keep indexing documents without referencing the pipeline, it will be automatic.
"value": "{{{_ingest.timestamp}}}"
Source
I have an index with the mappings
{
"mappings": {
"properties": {
"title": {
"type": "text"
},
"location": {
"type": "keyword"
}
}
}
}
In location field at the moment we are storing the city name.
And we need to change the mapping structure to store also the country and state, so the mapping will be
{
"mappings": {
"properties": {
"title": {
"type": "text"
},
"location": {
"properties": {
"country": {
"type": "keyword"
},
"state": {
"type": "keyword"
},
"city": {
"type": "keyword"
}
}
}
}
}
}
What is the recommended flow for such migration?
Elasticsearch does not allow changing the definition of mapping for existing fields, just the addition of new field definitions as you can check here.
So one of the possibilities is:
create a new field definition, with a different name obviously, to store the new data type.
Stop to use the location field
The another but costly possibility is:
create a new index with the right mapping
do the reindex of the data from the old index to the new index
To reindex the data from the old index with the right format to the new index you can use a painless script:
POST /_reindex
{
"source": {
"index": "old_index_name"
},
"dest": {
"index": "new_index_name"
},
"script": {
"lang": "painless",
"params" : {
"location":{
"country" : null,
"state": null,
"city": null
}
},
"source": """
params.location.city = ctx._source.location
ctx._source.location = params.location
"""
}
}
After you can update country and state fields for the old data.
If you need the same index name, use the new index you created with the correct mapping just as a backup, then you need to delete the index with the old mapping and recreate it again with the same name using the correct mapping and bring the data that are in the other reserve index.
For more about change the mapping read CHANGE ELASTIC SEARCH MAPPING.
Follow these steps:
Create a new index
Reindex the existing index to populate the new index
Aliases can help cutover from over index to another
The kind of document we want to index and query contains variable keys but are grouped into a common root key as follows:
{
"articles": {
"0000000000000000000000000000000000000001": {
"crawled_at": "2016-05-18T19:26:47Z",
"language": "en",
"tags": [
"a",
"b",
"d"
]
},
"0000000000000000000000000000000000000002": {
"crawled_at": "2016-05-18T19:26:47Z",
"language": "en",
"tags": [
"b",
"c",
"d"
]
}
},
"articles_count": 2
}
We want to able to ask: what documents contains articles with tags "b" and "d", with language "en".
The reason why we don't use list for articles, is that elasticsearch can efficiently and automatically merge documents with partial updates. The challenge however is to index the objects inside under the variable keys. One possible way we tried is to use dynamic_templates as follows:
{
"sources": {
"dynamic": "strict",
"dynamic_templates": [
{
"article_template": {
"mapping": {
"fields": {
"crawled_at": {
"format": "dateOptionalTime",
"type": "date"
},
"language": {
"index": "not_analyzed",
"type": "string"
},
"tags": {
"index": "not_analyzed",
"type": "string"
}
}
},
"path_match": "articles.*"
}
}
],
"properties": {
"articles": {
"dynamic": false,
"type": "object"
},
"articles_count": {
"type": "integer"
}
}
}
}
However this dynamic template fails because when documents are inserted, the following can be found in the logs:
[2016-05-30 17:44:45,424][WARN ][index.codec] [node]
[main] no index mapper found for field:
[articles.0000000000000000000000000000000000000001.language] returning
default postings format
Same for the two other fields as well. When I try to query for the existence of a certain article, or even articles it doesn't return any document (no error but empty hits):
curl -LsS -XGET 'localhost:9200/main/sources/_search' -d '{"query":{"exists":{"field":"articles"}}}'
When I query for the existence of articles_count, it returns everything. Is there a minor error in what we are trying to achieve, for example in the schema: the definition of articles as a property and in the dynamic template? What about the types and dynamic false? The path seems correct. Maybe this is not possible to define templates for objects in variable-keys, but it should be according to the documentation.
Otherwise, what alternatives are possible without changing the document if possible?
Notes: we have other types in the same index main that also have these fields like language, I ignore if it could influence. The version of ES we are using is 1.7.5 (we cannot upgrade to 2.X for now).
Given:
Documents of two different types, let's say 'product' and 'category', are indexed to the same Elasticsearch index.
Both document types have a field 'tags'.
Problem:
I want to build a query that returns results of both types, but the documents of type 'product' are allowed to have tags 'X' and 'Y', and the documents of type 'category' are only allowed to have tag 'Z'. How can I achieve this? It appears I can't use product.tags and category.tags since then ES will look for documents' product/category field, which is not what I intend.
Note:
While for the example above there might be some kind of workaround, I'm looking for a general way to target or specify fields of a specific document type when writing queries. I basically want to 'namespace' the field names used in my query so only documents of the type I want to work with are considered.
I think field aliasing would be the best answer for you, but it's not possible.
Instead you can use "copy_to" but I it probably affects index size:
DELETE /test
PUT /test
{
"mappings": {
"product" : {
"properties": {
"tags": { "type": "string", "copy_to": "ptags" },
"ptags": { "type": "string" }
}
},
"category" : {
"properties": {
"tags": { "type": "string", "copy_to": "ctags" },
"ctags": { "type": "string" }
}
}
}
}
PUT /test/product/1
{ "tags":"X" }
PUT /test/product/2
{ "tags":"Y" }
PUT /test/category/1
{ "tags":"Z" }
And you can query one of fields or many of them:
GET /test/product,category/_search
{
"query": {
"term": {
"ptags": {
"value": "x"
}
}
}
}
GET /test/product,category/_search
{
"query": {
"multi_match": {
"query": "x",
"fields": [ "ctags", "ptags" ]
}
}
}