Remove custom analyzer / filter on Elasticsearch for all new indexes

We tried adding a custom analyzer / lowercase filter to all new indexes in Elasticsearch. It looks something like this:
"analysis": {
"normalizer": {
"lowercase_normalizer": {
"filter": [
"lowercase"
],
"type": "custom",
"char_filter": []
}
}
},
This is automatically applied to all new indexes. How do I remove it? I realize I cannot remove it from existing indexes, but how do I stop it from being automatically added to new ones?
It appears that these settings live somewhere in my master "template". I can see the template using "GET /_template", and it contains all of the unwanted lowercase normalizers... but how do I remove them?
Thanks!

Here is how you can delete an index template:
DELETE /_template/template_1
Also, in the future, if you want to add a new custom analyzer to any template, first create a test index with your custom analyzer, then check whether the analyzer gives you the desired results, as follows:
GET <index_name>/_analyze
{
  "analyzer": "analyzer_name",
  "text": "this is a test"
}
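For completeness, here is a minimal sketch of creating such a test index first (the index name, analyzer name, and filter chain are hypothetical placeholders):
PUT test_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "analyzer_name": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  }
}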

Related

How to update data type of a field in elasticsearch

I am publishing data to Elasticsearch using Fluentd. It has a field Data.CPU which is currently mapped as a string. The index name is health_gateway.
I have made some changes in the Python code that generates the data, so this field Data.CPU is now an integer. But Elasticsearch is still showing it as a string. How can I update its data type?
I tried running the commands below in Kibana Dev Tools:
PUT health_gateway/doc/_mapping
{
  "doc": {
    "properties": {
      "Data.CPU": { "type": "integer" }
    }
  }
}
But it gave me the error below:
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Types cannot be provided in put mapping requests, unless the include_type_name parameter is set to true."
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Types cannot be provided in put mapping requests, unless the include_type_name parameter is set to true."
  },
  "status": 400
}
There is also this document which says the data type can be converted using mutate, but I am not able to understand it properly.
I do not want to delete and recreate the index, because I have created a visualization based on it, and deleting the index would delete the visualization too. Can anyone please help with this?
The short answer is that you can't change the mapping of a field that already exists in a given index, as explained in the official docs.
The specific error you got is because you included /doc/ in your request path (you probably wanted /<index>/_mapping), but fixing this alone won't be sufficient.
Finally, I'm not sure you really have a literal dot in the field name there; field names containing dots are normally interpreted as nested object paths (a Data object containing a CPU field).
Nevertheless, there are several ways forward in your situation... here are a couple of them:
Use a scripted field
You can add a scripted field to the Kibana index-pattern. It's quick to implement, but has major performance implications. You can read more about them on the Elastic blog here (especially under the heading "Match a number and return that match").
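As a rough illustration of the kind of script involved (a hypothetical sketch, assuming Data.CPU was dynamically mapped with a .keyword sub-field), the same logic can be tried out as a script field in a search request:
GET health_gateway/_search
{
  "script_fields": {
    "cpu_int": {
      "script": {
        "lang": "painless",
        "source": "Integer.parseInt(doc['Data.CPU.keyword'].value)"
      }
    }
  }
}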
Add a new multi-field
You could add a new multi-field. The example below assumes that CPU is a nested field under Data, rather than a field literally named Data.CPU with a dot in it:
PUT health_gateway/_mapping
{
  "properties": {
    "Data": {
      "properties": {
        "CPU": {
          "type": "keyword",
          "fields": {
            "int": {
              "type": "short"
            }
          }
        }
      }
    }
  }
}
Reindex your data within ES
Use the Reindex API. Be sure to set the correct mapping on the target index.
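For instance, a minimal sketch (health_gateway_v2 is a hypothetical target index that you would create beforehand with the corrected mapping):
POST _reindex
{
  "source": {
    "index": "health_gateway"
  },
  "dest": {
    "index": "health_gateway_v2"
  }
}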
Delete and reindex everything from source
If you are able to regenerate the data from source in a timely manner, without disrupting users, you can simply delete the index and reingest all your data with an updated mapping.
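The deletion itself is a one-liner (destructive, so make sure the data really can be regenerated first):
DELETE health_gateway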
You can update the mapping by indexing the same field in multiple ways, i.e. by using multi-fields.
Using the mapping below, Data.CPU.raw will be of integer type:
{
  "mappings": {
    "properties": {
      "Data": {
        "properties": {
          "CPU": {
            "type": "keyword",
            "fields": {
              "raw": {
                "type": "integer"
              }
            }
          }
        }
      }
    }
  }
}
Or you can create a new index with the correct mapping and reindex the data into it using the Reindex API.

Elasticsearch Dynamic Analyzer & synonyms

Hi, I have a use case where I want my application to decide dynamically on xyz_tokenizer, xyz_filter, xyz_synonyms, etc.,
something similar to this:
GET test/_search
{
  "query": {
    "match": {
      "content": {
        "query": "search_text",
        "analyzer": {
          "filter": "xyz_filter",
          "tokenizer": "xyz_tokenizer"
        }
      }
    }
  }
}
However, it throws an error. As per the Elasticsearch docs, I found out that we can only specify analyzers that are defined in the index settings. Similarly, how can I specify filters and tokenizers dynamically?
You can't; these analyzers need to be registered in your index. What you can do is use a search-time analyzer, chosen dynamically according to your requirements.
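For example, a valid form of the query above passes the analyzer by name rather than as an object (my_search_analyzer is a hypothetical analyzer already registered in the index settings):
GET test/_search
{
  "query": {
    "match": {
      "content": {
        "query": "search_text",
        "analyzer": "my_search_analyzer"
      }
    }
  }
}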
But at index time you can't add them dynamically; they need to be present in your index settings. You can change the index settings to add a new analyzer and then add new fields that use it (incremental changes), but changing the existing analyzer of a field is a breaking change, and you would need to reindex all the data.
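A sketch of such an incremental change, assuming a hypothetical index my_index (note that an index must be closed before its analysis settings can be updated):
POST my_index/_close
PUT my_index/_settings
{
  "analysis": {
    "analyzer": {
      "new_custom_analyzer": {
        "type": "custom",
        "tokenizer": "standard",
        "filter": [
          "lowercase"
        ]
      }
    }
  }
}
POST my_index/_open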

Elasticsearch Suggest+Synonyms+fuzziness

I am looking for a way to implement auto-suggest with synonyms & fuzziness.
For example, when the user tries to search for "replce ar",
my synonym list has ar => audio record,
so the result should include the items matching:
changing audio record
replacing audio record
etc.
Here we need fuzziness because there is a typo in "replace" (in the user's search text),
synonyms to match ar => audio record, and
auto-suggest with a regex pattern.
Is it possible to implement all three features in a single field?
Edit:
A regex+fuzzy combination just throws an error.
I hadn't explained my need for a regex pattern well:
I needed a regex for doing a partial-word lookup ('encyclopedic' contains 'cyclo').
Now, after investigating my options for this purpose, which directed me to the NGram tokenizer, and looking into the other suggesters, I found that the phrase suggester may really be what I'm looking for, so I'll try it and report back.
Yes, you can use synonyms as well as fuzziness for suggestions. The synonyms are handled by defining a synonym filter and adding it to your analyzer. Then, when you create the field mapping for the field(s) you want to use for suggestions, you assign that analyzer to that field.
As for fuzziness, that happens at query time. Most text-based queries support a fuzziness option which allows you to specify how many corrections you want to allow. The default auto value adjusts the number of corrections, depending on how long the term is, so that's usually best.
Notional analysis setup (synonym_graph reference)
{
  "analysis": {
    "filter": {
      "synonyms": {
        "type": "synonym_graph",
        "expand": "false",
        "synonyms": [
          "ar => audio record"
        ]
      }
    },
    "analyzer": {
      "synonyms": {
        "type": "custom",
        "tokenizer": "standard",
        "filter": [
          "lowercase",
          "synonyms"
        ]
      }
    }
  }
}
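To sanity-check the synonym expansion, you could run the analyzer through the _analyze API (assuming the settings above were applied to a hypothetical index named test):
GET test/_analyze
{
  "analyzer": "synonyms",
  "text": "replace ar"
}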
Notional Field Mapping (Analyzer + Mapping reference)
(Note that the analyzer matches the name of the analyzer defined above)
{
  "properties": {
    "suggestion": {
      "type": "text",
      "analyzer": "synonyms"
    }
  }
}
Notional Query
{
  "query": {
    "match": {
      "suggestion": {
        "query": "replce ar",
        "fuzziness": "auto",
        "operator": "and"
      }
    }
  }
}
Keep in mind that there are several different options for suggestions, so depending on which option you use, you may need to adjust the way the field is mapped, or even add another token filter to the analyzer. But an analyzer is just a tokenizer plus a series of token filters, so you can usually combine whatever token filters you need to achieve your goal. Just make sure you understand what each filter is doing, so you get the filters in the correct order.
If you get stuck in part of this process, just submit another question with the specific issue you're running into. Good luck!

ElasticSearch - what is the difference between an index template and an index pattern

I have read an explanation to my question here:
https://discuss.elastic.co/t/whats-the-differece-between-index-pattern-and-index-template/54948
However, I still don't understand the difference. When defining an index PATTERN, does it not affect index creation at all? Also, what happens if I create an index that doesn't have a corresponding index pattern? How can I see the mapping used for an index pattern, so I know how to use the mapping API to update it?
And on a side note, the docs say you manage index patterns by clicking the "Settings" and then "Indices" tabs. I'm looking at Kibana and I don't see any Settings tab. I can view the index patterns through the Management tab, but I don't see any Settings tab there.
An index template is an ES feature for triggering the creation of new indexes whenever a name pattern is matched. For instance, let's say we create the following index template:
PUT _template/template_1
{
  "index_patterns": ["foo*"],
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    ...
  }
}
As you can see, as soon as we want to index a document into an index named (e.g.) foo-44 and that index doesn't exist, the template (settings + mappings) will be used by ES to create the foo-44 index automatically.
You can update an index template at any time by simply PUTting a new settings/mappings definition like above.
An index pattern (not to be confused with the index_patterns property you saw above; those are two totally different things) is a Kibana feature for telling Kibana what makes up an index (all the fields, their types, etc.). Nothing can happen in Kibana without index patterns, which you can create in Management > Index Patterns.
Creating an index in ES will not create any index pattern in Kibana. Similarly, creating an index pattern in Kibana will not create any index in ES.
The reason why Kibana needs an index pattern is that it needs to store a different kind of information than what is available in an index mapping. For instance, let's say you create an index with the following mapping:
PUT my_index
{
  "mappings": {
    "doc": {
      "properties": {
        "timestamp": {
          "type": "date"
        },
        "name": {
          "type": "text"
        }
      }
    }
  }
}
Then the corresponding index pattern that you will create in Kibana will have the following content:
GET .kibana/doc/index-pattern:16a98050-a53f-11e8-82ab-af0d48c6ddd8
{
  "type": "index-pattern",
  "updated_at": "2018-08-21T12:38:22.509Z",
  "index-pattern": {
    "title": "my_index*",
    "timeFieldName": "timestamp",
    "fields": """[{"name":"_id","type":"string","count":0,"scripted":false,"searchable":true,"aggregatable":true,"readFromDocValues":false},{"name":"_index","type":"string","count":0,"scripted":false,"searchable":true,"aggregatable":true,"readFromDocValues":false},{"name":"_score","type":"number","count":0,"scripted":false,"searchable":false,"aggregatable":false,"readFromDocValues":false},{"name":"_source","type":"_source","count":0,"scripted":false,"searchable":false,"aggregatable":false,"readFromDocValues":false},{"name":"_type","type":"string","count":0,"scripted":false,"searchable":true,"aggregatable":true,"readFromDocValues":false},{"name":"name","type":"string","count":0,"scripted":false,"searchable":true,"aggregatable":false,"readFromDocValues":false},{"name":"timestamp","type":"date","count":0,"scripted":false,"searchable":true,"aggregatable":true,"readFromDocValues":true}]"""
  }
}
As you can see, Kibana stores the name of the index pattern (which can span several indexes) as well as the timestamp field. It also stores various properties for each field you have defined; for instance, for the name field, the index pattern contains the following information that Kibana needs to know:
{
  "name": "name",
  "type": "string",
  "count": 0,
  "scripted": false,
  "searchable": true,
  "aggregatable": false,
  "readFromDocValues": false
}

Elasticsearch update mappings

I have mappings created wrongly for an object in Elasticsearch. Is there a way to update the mappings? The mapping has been created with the wrong type for the object (string instead of double).
In general, the mapping for existing fields cannot be updated. There are some exceptions to this rule. For instance:
new properties can be added to Object datatype fields.
new multi-fields can be added to existing fields.
doc_values can be disabled, but not enabled.
the ignore_above parameter can be updated.
Source : https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html
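For instance, a sketch of one of those allowed updates, bumping ignore_above on a hypothetical keyword field:
PUT my_index/_mapping
{
  "properties": {
    "tag": {
      "type": "keyword",
      "ignore_above": 512
    }
  }
}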
That's entirely possible by PUTting the new mapping over the existing one; here are some examples.
Please note that you will probably need to reindex all your data after you have done this, because I don't think ES can convert string indexes to double indexes. (What will happen instead is that you won't find any documents when you search in that field.)
The PUT mapping API allows you to add/modify datatypes in an existing index.
PUT /assets/asset/_mapping
{
  "properties": {
    "common_attributes.asset_id": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "doc_values": true,
          "normalizer": "lowercase_normalizer"
        }
      }
    }
  }
}
After updating the mapping, update the existing documents using the bulk update API:
POST /_bulk
{"update":{"_id":"59519","_type":"asset","_index":"assets"}}
{"doc":{"facility_id":491},"detect_noop":false}
Note - 'detect_noop': false forces the update to be applied even when it doesn't change the document; by default, no-op updates are detected and skipped.
