Elasticsearch - How to create an index template/mapping per alias and perform searches against each alias separately

Is there any way in Elasticsearch to store an index template per alias?
I mean: create an index with multiple aliases (alias1, alias2, ...), attach a different template to each of them, and then index/search docs against a specific alias.
The reason I'm doing this is that I have documents with many different data structures (up to 50 types).
What I have done so far:
1. PUT /dynamic_index
2. POST /_aliases
{
  "actions": [
    { "add": { "index": "dynamic_index", "alias": "alias_type1" } },
    { "add": { "index": "dynamic_index", "alias": "alias_type2" } },
    { "add": { "index": "dynamic_index", "alias": "alias_type3" } }
  ]
}
3. PUT _template/template1
{
  "index_patterns": [
    "dynamic_index"
  ],
  "mappings": {
    "dynamic_templates": [
      {
        "strings_as_keywords": {
          "match_mapping_type": "string",
          "mapping": {
            "type": "text",
            "analyzer": "standard",
            "copy_to": "_all",
            "fields": {
              "keyword": {
                "type": "keyword",
                "normalizer": "lowercase_normalizer"
              }
            }
          }
        }
      }
    ],
    "properties": {
      "source": {
        "type": "keyword"
      }
    }
  },
  "aliases": {
    "alias_type1": {}
  }
}
4. The same for alias_type2 and alias_type3, but with different fields...
Indexing/searching: I try to create and search docs per alias, as in this example:
POST alias_type1/_doc
{
  "source": "foo",
  ...
}
POST alias_type2/_doc
{
  "source": "foo123",
  ...
}
GET alias_type1/_search
{
  "query": {
    "match_all": {}
  }
}
GET alias_type2/_search
{
  "query": {
    "match_all": {}
  }
}
What I actually see is that even though I index documents through a specific alias, searches are not separated per alias: alias_type1, alias_type2, and even the index itself all return the same results.
Is there any way to achieve this separation per alias, so that indexing and searching work per type (alias)?
Any ideas?

You can't have separate mappings for aliases pointing to the same index! An alias is just a virtual link to an index, so if your aliases point to the same index you will get the same results back.
If you want different mappings based on your data structure, you will need to create multiple indices.
Update
You can also use custom routing based on a field; for more information, check the official Elasticsearch documentation here.
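If the documents have to stay in one physical index, a partial workaround (a sketch, not part of the answer above) is a filtered alias: each alias carries a filter on a discriminator field, so searches through that alias only see matching documents, although the mapping is still shared. This assumes a hypothetical doc_type field exists in the mapping and is set on every document:
POST /_aliases
{
  "actions": [
    {
      "add": {
        "index": "dynamic_index",
        "alias": "alias_type1",
        "filter": { "term": { "doc_type": "type1" } },
        "routing": "type1"
      }
    }
  ]
}
Note that the filter only applies to searches; when indexing through the alias you still have to set doc_type yourself. The optional routing keeps each type's documents together on one shard.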

Related

Elasticsearch Text with Path Hierarchy vs KeyWord using Prefix query performance

I'm trying to find the best way to filter results based on folder hierarchies. We will use this to simulate a situation where we want to get all assets/documents in a provided folder and all of its subfolders (recursive search).
So for example for such a structure
/someFolder/someSubfolder/1
/someFolder/someSubfolder/1/subFolder
/someFolder/someSubfolder/2
/someFolder/someSubfolder/2/subFolder
If we search for /someFolder/someSubfolder/1
We want to get as results
/someFolder/someSubfolder/1
/someFolder/someSubfolder/1/subFolder
Now, I've found two ways to do this, and I'm not sure which one would be better from a performance perspective:
1. Use a text property with the path_hierarchy tokenizer
2. Use a keyword property and a prefix query to get results
Both of the above seem to work as I want them to (unless I missed something). On the one hand, I've read that filtering should be done on keywords. On the other hand, the path_hierarchy tokenizer seems to be created exactly for these scenarios, but it can only be used with text fields.
Below is the sample code I prepared. Create the index and push some test data into it:
PUT test-index-2
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "my_tokenizer"
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "path_hierarchy"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "folderPath": {
        "type": "text",
        "analyzer": "my_analyzer",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}
POST test-index-2/_doc/
{
  "folderPath": "8bf5ad7949a1_104d753b-0fdf-4b07-9213-534dec89112a/Folder with Spaces"
}
POST test-index-2/_doc/
{
  "folderPath": "8bf5ad7949a1_104d753b-0fdf-4b07-9213-534dec89112a/Folder with Spaces/SomeTestValue/11"
}
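As an aside (not part of the original question), the _analyze API is a handy way to check what the path_hierarchy tokenizer actually emits before comparing the two queries:
GET test-index-2/_analyze
{
  "analyzer": "my_analyzer",
  "text": "8bf5ad7949a1_104d753b-0fdf-4b07-9213-534dec89112a/Folder with Spaces/SomeTestValue/11"
}
This should return one token per path prefix, which is why the term filter below also matches documents sitting in subfolders.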
Now, both of the queries below return two results when matching on a partial path hierarchy.
1.
GET test-index-2/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "folderPath": "8bf5ad7949a1_104d753b-0fdf-4b07-9213-534dec89112a/Folder with Spaces" } }
      ]
    }
  }
}
2.
GET test-index-2/_search
{
  "query": {
    "prefix": { "folderPath.keyword": "8bf5ad7949a1_104d753b-0fdf-4b07-9213-534dec89112a/Folder with Spaces" }
  }
}
So the question is: which solution is better if we want to get a subset of results?

Elasticsearch Mapping - Rename existing field

Is there any way I can rename an element in an existing Elasticsearch mapping without having to add a new element?
If so, what's the best way to do it in order to avoid breaking the existing mapping?
e.g. from fieldCamelcase to fieldCamelCase
{
  "myType": {
    "properties": {
      "timestamp": {
        "type": "date",
        "format": "date_optional_time"
      },
      "fieldCamelcase": {
        "type": "string",
        "index": "not_analyzed"
      },
      "field_test": {
        "type": "double"
      }
    }
  }
}
You could do this by creating an ingest pipeline that contains a rename processor, in combination with the Reindex API.
PUT _ingest/pipeline/my_rename_pipeline
{
  "description": "describe pipeline",
  "processors": [
    {
      "rename": {
        "field": "fieldCamelcase",
        "target_field": "fieldCamelCase"
      }
    }
  ]
}
POST _reindex
{
  "source": {
    "index": "source"
  },
  "dest": {
    "index": "dest",
    "pipeline": "my_rename_pipeline"
  }
}
Note that you need to be running Elasticsearch 5.x in order to use ingest pipelines. If you're running < 5.x, you'll have to go with what @Val mentioned in his comment :)
Updating the field name in ES (version > 5, where missing has been removed) using the _update_by_query API. Example:
POST http://localhost:9200/INDEX_NAME/_update_by_query
{
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "NEW_FIELD_NAME"
        }
      }
    }
  },
  "script": {
    "inline": "ctx._source.NEW_FIELD_NAME = ctx._source.OLD_FIELD_NAME; ctx._source.remove(\"OLD_FIELD_NAME\");"
  }
}
First of all, you must understand how Elasticsearch and Lucene store data: in immutable segments (you can easily read about this on the Internet).
So any solution will either remove/create documents and change the mapping, or create a new index and therefore a new mapping as well.
The easiest way is to use the update by query API: https://www.elastic.co/guide/en/elasticsearch/reference/2.4/docs-update-by-query.html
POST /XXXX/_update_by_query
{
  "query": {
    "missing": {
      "field": "fieldCamelCase"
    }
  },
  "script": {
    "inline": "ctx._source.fieldCamelCase = ctx._source.fieldCamelcase; ctx._source.remove(\"fieldCamelcase\");"
  }
}
Starting with ES 6.4 you can use field aliases, which give you the functionality you're looking for with close to zero work or resources.
Do note that field aliases can only be used for searching, not for indexing new documents.
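For illustration (a minimal sketch using the question's field names; the 7.x typeless syntax is shown, on 6.x the type name goes in the mapping path), a field alias looks like this:
PUT my_index/_mapping
{
  "properties": {
    "fieldCamelCase": {
      "type": "alias",
      "path": "fieldCamelcase"
    }
  }
}
Searches, aggregations, and sorts can then reference fieldCamelCase, while documents are still indexed with the original fieldCamelcase field.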

How to specify or target a field from a specific document type in queries or filters in Elasticsearch?

Given:
Documents of two different types, let's say 'product' and 'category', are indexed to the same Elasticsearch index.
Both document types have a field 'tags'.
Problem:
I want to build a query that returns results of both types, but the documents of type 'product' are allowed to have tags 'X' and 'Y', and the documents of type 'category' are only allowed to have tag 'Z'. How can I achieve this? It appears I can't use product.tags and category.tags, since then ES will look for a 'product'/'category' field on the documents, which is not what I intend.
Note:
While for the example above there might be some kind of workaround, I'm looking for a general way to target or specify fields of a specific document type when writing queries. I basically want to 'namespace' the field names used in my query so only documents of the type I want to work with are considered.
I think field aliasing would be the best answer for you, but it's not possible.
Instead you can use copy_to, though it probably affects the index size:
DELETE /test
PUT /test
{
  "mappings": {
    "product": {
      "properties": {
        "tags": { "type": "string", "copy_to": "ptags" },
        "ptags": { "type": "string" }
      }
    },
    "category": {
      "properties": {
        "tags": { "type": "string", "copy_to": "ctags" },
        "ctags": { "type": "string" }
      }
    }
  }
}
PUT /test/product/1
{ "tags": "X" }
PUT /test/product/2
{ "tags": "Y" }
PUT /test/category/1
{ "tags": "Z" }
And you can query one of the fields, or several of them:
GET /test/product,category/_search
{
  "query": {
    "term": {
      "ptags": {
        "value": "x"
      }
    }
  }
}
GET /test/product,category/_search
{
  "query": {
    "multi_match": {
      "query": "x",
      "fields": [ "ctags", "ptags" ]
    }
  }
}

Setting up a Kibana terms panel for an Elasticsearch field that is a list of strings

I have a Kibana dashboard that contains a terms panel showing the number of instances of a particular field (let's call it field1). Field1 is effectively a list of strings, and each string usually contains multiple words. Since the field is analyzed, Elasticsearch breaks the terms up into separate columns. I need to keep the text together, so I need a not_analyzed version. Here's my attempt to do that with a template, located at ~\config\templates\doc_template.json on a Windows box (Elasticsearch is running as a Windows service), which does not seem to be working.
{
  "doc_template": {
    "template": "*",
    "mappings": {
      "Type-*": {
        "properties": {
          "Field1": {
            "type": "multi_field",
            "fields": {
              "Field1": { "index": "analyzed" },
              "RawField1": { "index": "not_analyzed" }
            }
          }
        }
      }
    }
  }
}
In the terms panel, I expect the necessary field to be either RawField1 or Field1.RawField1, but I've tried other variations, including and excluding .raw, with no luck.
New indices are created daily. Field1 exists in 4 separate types, each of which begins with "Type-". I suspect my attempt at using a wildcard there is problematic, but I'm not sure. All data is being sent to Elasticsearch via NEST in a C# .NET application. Here's the mapping for Field1 as it currently exists for one of the types:
{
  "index-2014.12.08": {
    "mappings": {
      "Type-1": {
        "properties": {
          "Field1": {
            "type": "string"
          },
          "Field2": {
            "type": "string"
          },
          "Field3": {
            "type": "string"
          }
        }
      }
    }
  }
}
Obviously, the mapping doesn't look like how I expect. What's the best way to remedy this issue?
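One likely culprit (an educated guess, not a confirmed answer): multi_field was replaced by multi-fields in ES 1.0, wildcards are not valid as type names inside a mapping, and templates only apply to indices created after the template is in place. A sketch of the template using a _default_ mapping instead, so that it covers every type:
{
  "doc_template": {
    "template": "*",
    "mappings": {
      "_default_": {
        "properties": {
          "Field1": {
            "type": "string",
            "fields": {
              "raw": { "type": "string", "index": "not_analyzed" }
            }
          }
        }
      }
    }
  }
}
With a mapping like this, the terms panel would use Field1.raw, and the change would only show up starting with the next daily index.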

How to index both a string and its reverse?

I'm looking for a way to analyze the string "abc123" as ["abc123", "321cba"]. I've looked at the reverse token filter, but that only gets me ["321cba"]. Documentation on this filter is pretty sparse, only stating that
"A token filter of type reverse ... simply reverses each token."
(see http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-reverse-tokenfilter.html).
I've also tinkered with the keyword_repeat filter, which gets me two instances. I don't know if that's useful, but for now all it does is reverse both instances.
How can I use the reverse token filter but keep the original token as well?
My analyzer:
{
  "settings": {
    "analysis": {
      "analyzer": {
        "phone": {
          "type": "custom",
          "char_filter": ["strip_non_numeric"],
          "tokenizer": "keyword",
          "filter": ["standard", "keyword_repeat", "reverse"]
        }
      },
      "char_filter": {
        "strip_non_numeric": {
          "type": "pattern_replace",
          "pattern": "[^0-9]",
          "replacement": ""
        }
      }
    }
  }
}
Make and put an analyzer that reverses a string (say reverse_analyzer):
PUT index_name
{
  "settings": {
    "analysis": {
      "analyzer": {
        "reverse_analyzer": {
          "type": "custom",
          "char_filter": [
            "strip_non_numeric"
          ],
          "tokenizer": "keyword",
          "filter": [
            "standard",
            "keyword_repeat",
            "reverse"
          ]
        }
      },
      "char_filter": {
        "strip_non_numeric": {
          "type": "pattern_replace",
          "pattern": "[^0-9]",
          "replacement": ""
        }
      }
    }
  }
}
Then, for a field (say phone_no), use a mapping like this (create a type and add a mapping for the phone field):
PUT index_name/type_name/_mapping
{
  "type_name": {
    "properties": {
      "phone_no": {
        "type": "string",
        "fields": {
          "reverse": {
            "type": "string",
            "analyzer": "reverse_analyzer"
          }
        }
      }
    }
  }
}
So phone_no is a multi-field that stores both the string and its reverse. If you index
phone_no: 911220
then Elasticsearch will hold both
phone_no: 911220 and phone_no.reverse: 022119, so you can search and filter on either the reversed or the non-reversed field.
Hope this helps.
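For example (a hypothetical query against the mapping above), a match query on the reverse sub-field still matches the original number, because the same reverse_analyzer is applied to the query text at search time:
GET index_name/type_name/_search
{
  "query": {
    "match": {
      "phone_no.reverse": "911220"
    }
  }
}
Here "911220" is analyzed into "022119" before the lookup, which matches the indexed reverse term. The practical payoff is suffix matching: querying the last digits of a number as a prefix against the reversed field finds numbers ending in those digits.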
I don't believe you can do this directly, as I am unaware of any way to get the reverse token filter to also output the original.
However, you could use the fields parameter to index both the original and the reversed at the same time with no additional coding. You would then search both fields.
So let's say your field was called phone_number:
"phone_number": {
"type": "string",
"fields": {
"reverse": { "type": "string", "index": "phone" }
}
}
In this case we're indexing using the default analyzer (assume standard) plus also indexing into reverse with your customer analyzer phone which reverses. You then issue your queries against both fields.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/_multi_fields.html
I'm not sure it's possible to do this using the built-in set of token filters. I would recommend you create your own plugin. There is the ICU Analysis plugin, supported by the Elasticsearch team, that you can use as an example.
I wound up using the following two char_filters in my analyzer. It's an ugly abuse of regex, but it seems to work. It is limited to the first 20 numeric characters, but in my use case that is acceptable.
First it captures all the numeric characters in groups, then explicitly rebuilds the string followed by its own (numeric-only!) reverse. The space in the middle of the replacement pattern then causes the tokenizer to split the result into two tokens - the original and the reverse.
"char_filter": {
  "strip_non_numeric": {
    "type": "pattern_replace",
    "pattern": "[^0-9]",
    "replacement": ""
  },
  "dupe_and_reverse": {
    "type": "pattern_replace",
    "pattern": "([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)",
    "replacement": "$1$2$3$4$5$6$7$8$9$10$11$12$13$14$15$16$17$18$19$20 $20$19$18$17$16$15$14$13$12$11$10$9$8$7$6$5$4$3$2$1"
  }
}
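To verify the trick (a sketch assuming the two char filters feed a whitespace tokenizer, since the final analyzer isn't shown above), the _analyze API accepts inline filter definitions:
GET _analyze
{
  "tokenizer": "whitespace",
  "char_filter": [
    { "type": "pattern_replace", "pattern": "[^0-9]", "replacement": "" },
    {
      "type": "pattern_replace",
      "pattern": "([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)([0-9]?)",
      "replacement": "$1$2$3$4$5$6$7$8$9$10$11$12$13$14$15$16$17$18$19$20 $20$19$18$17$16$15$14$13$12$11$10$9$8$7$6$5$4$3$2$1"
    }
  ],
  "text": "(91) 12-20"
}
This should come back with the two tokens 911220 and 022119.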
