Elasticsearch ignore_above setting use

Can anyone please help with a doubt about the explanation of ignore_above in the Elasticsearch documentation?
It is mentioned that:
Strings longer than the ignore_above setting will not be indexed or stored. For arrays of strings, ignore_above will be applied for each array element separately and string elements longer than ignore_above will not be indexed or stored.
Does this mean that if I add data longer than that length, Elasticsearch will not allow me to post the data?
https://www.elastic.co/guide/en/elasticsearch/reference/current/ignore-above.html#ignore-above
Here is what I have tried.
The mapping for my index testdata (the index I created) is as follows.
I added it using the PUT mapping API:
{
  "testdata": {
    "mappings": {
      "testdata": {
        "properties": {
          "email": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}
I added data of around 150 KB (a length of about 145149 characters).
Elasticsearch allowed me to add the data to the email field, and I am also able to search the data using the POST search endpoint. Should it allow that, or am I getting this concept wrong?

Your setting ignore_above: 256 means that if the string is longer than 256 characters, the document is still indexed but that field is not indexed. If the string is 256 characters or shorter, the document is indexed along with the field. For example, the string "stackoverflow" is 13 characters long. Hope this clarifies.
As per your mapping, the ignore_above setting is applied only to the email.keyword sub-field. The email field itself is of type text and has no such limit, which is why your long value was accepted and is still searchable through email.
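The behaviour described above can be sketched with a small toy model in Python. This is only an illustration under stated assumptions, not Elasticsearch code: the function name index_email and the returned dictionary layout are hypothetical, and the length check simply counts characters.

```python
def index_email(value, ignore_above=256):
    """Toy model of which values reach each field for one document.

    Hypothetical helper, not an Elasticsearch API. It mimics a text field
    "email" with a keyword sub-field "email.keyword" that has ignore_above set.
    """
    # ignore_above is applied to each array element separately.
    values = value if isinstance(value, list) else [value]
    return {
        # The original document is always kept.
        "_source": {"email": value},
        # The text field has no ignore_above, so every value is indexed.
        "email": list(values),
        # The keyword sub-field skips values longer than ignore_above.
        "email.keyword": [v for v in values if len(v) <= ignore_above],
    }


short = index_email("stackoverflow")
long_doc = index_email("a" * 145149)
mixed = index_email(["ok@example.com", "x" * 300])

print(short["email.keyword"])
print(len(long_doc["email"]), len(long_doc["email.keyword"]))
print(mixed["email.keyword"])
```

In this model the long value is still accepted and still present under email, but it is missing from email.keyword, so exact-match queries and terms aggregations on email.keyword would not see it.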

I solved my problem today. I could not get terms aggregation values through xx.keyword, and my value length is 315.
If a string is longer than the ignore_above value, xx.keyword will not work for search. The value is still saved, though, and you can search for the document by other fields.

Related

Elasticsearch - Making a field aggregatable but not searchable

My elasticsearch data has a large number of fields that I don't need to search by. But I would like to get aggregations like percentiles, median, count, avg. etc. on these fields.
Is there a way to disable searchability of a field but let it still be aggregatable?
Most fields are indexed by default, which makes them searchable. To make a field non-searchable, all you have to do is set its index parameter to false and keep doc_values set to true.
As per elastic documentation:
All fields which support doc values have them enabled by default.
So you need not explicitly set "doc_values": true for such fields.
For example:
{
  "mappings": {
    "_doc": {
      "properties": {
        "only_agg": {
          "type": "keyword",
          "index": false
        }
      }
    }
  }
}
If you try to search on the field only_agg in the above example, Elasticsearch will throw an exception with the following reason:
Cannot search on field [only_agg] since it is not indexed.
Yes, take a look at doc_values:
https://www.elastic.co/guide/en/elasticsearch/reference/current/doc-values.html
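The split between searching and aggregating can be pictured with a toy model. The sketch below is a simplification written for illustration only: the class ToyField and its attributes are hypothetical names, with a dictionary standing in for the inverted index (used for search) and another standing in for doc values (used for aggregations).

```python
from statistics import median


class ToyField:
    """Hypothetical model of a field with separate search and aggregation stores."""

    def __init__(self, index=True, doc_values=True):
        self.index = index
        self.doc_values = doc_values
        self.inverted = {}  # value -> set of doc ids, stands in for the inverted index
        self.column = {}    # doc id -> value, stands in for doc values

    def add(self, doc_id, value):
        if self.index:
            self.inverted.setdefault(value, set()).add(doc_id)
        if self.doc_values:
            self.column[doc_id] = value

    def search(self, value):
        if not self.index:
            raise ValueError("Cannot search on field since it is not indexed.")
        return self.inverted.get(value, set())

    def values(self):
        # Aggregations read from the column store only.
        return list(self.column.values())


only_agg = ToyField(index=False)  # doc_values stays enabled by default
for doc_id, value in enumerate([10, 20, 30]):
    only_agg.add(doc_id, value)

print(median(only_agg.values()))
try:
    only_agg.search(20)
except ValueError as err:
    print(err)
```

With index set to false nothing is written to the search structure, yet the values remain available for aggregations such as median or average.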

Elasticsearch 6.2: terms query requires lowercase input when searching on keyword

I've created an example index, with the following mapping:
{
  "_doc": {
    "_source": {
      "enabled": false
    },
    "properties": {
      "status": { "type": "keyword" }
    }
  }
}
And indexed a document:
{"status": "CMP"}
When searching the documents with this status with a terms query, I find no results:
{
  "query": {
    "terms": { "status": ["CMP"] }
  }
}
However, if I make the same query by putting the input in lowercase, I will find my document:
{
  "query": {
    "terms": { "status": ["cmp"] }
  }
}
Why is it? Since I'm searching on a keyword field, the indexed content should not be analyzed and should match an uppercase value...
In Elasticsearch 6.x, you can continue to use the keyword datatype and lowercase your text with a normalizer (see the normalizer documentation). In either case, you need to change your index mapping and reindex your documents.
The index and mapping creation and the search were part of a test suite. It turned out that the setup part of the test suite was not executed, so the mapping was not applied to the index.
The index was therefore using the default types instead of the mapping types, resulting in analyzed string fields being used instead of keyword fields.
After changing the setup method of the automated tests, the mappings are correctly applied to the index, and the uppercase value "CMP" for status now matches documents.
The symptoms you're seeing shouldn't occur unless something else is wrong.
A keyword field is not analysed, so your index should contain only CMP. A terms query is not analysed either, so the index is searched only for CMP. Hence there should be a match.
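The symptom in the question is what you would expect when the field is analyzed rather than a keyword. A rough Python sketch of the difference follows. It is a toy model only: analyze_text is a hypothetical stand-in that lowercases and splits on non-alphanumeric characters, which approximates but does not reproduce the default analyzer.

```python
import re


def analyze_text(value):
    """Hypothetical approximation of default text analysis: lowercase and split."""
    return [t for t in re.split(r"[^0-9A-Za-z]+", value.lower()) if t]


def indexed_terms(value, field_type):
    # keyword fields store the value as a single, unmodified term.
    return analyze_text(value) if field_type == "text" else [value]


def terms_query_matches(terms_in_index, query_terms):
    # A terms query is not analysed: it compares the given terms as they are.
    return any(term in terms_in_index for term in query_terms)


as_keyword = indexed_terms("CMP", "keyword")
as_text = indexed_terms("CMP", "text")

print(terms_query_matches(as_keyword, ["CMP"]), terms_query_matches(as_keyword, ["cmp"]))
print(terms_query_matches(as_text, ["CMP"]), terms_query_matches(as_text, ["cmp"]))
```

In this model the keyword field matches only the uppercase term, while the analyzed field matches only the lowercase one, which mirrors the behaviour reported when the mapping had not been applied.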

Scripted Field Kibana Not Working

I am trying to get scripted fields in Kibana to work.
I have two fields in my documents, customer and site
I'd like to create a new scripted field called friendly_name which is customer+" "+site
I've tried
return doc["customer"].value + " "+doc["site"].value
and it doesn't yield any results.
I've even tried just return 1 to see if I can get anything to return.
How can I get this to work?
Scripted fields work with doc_values only and I am guessing that, since this doesn't work for you, your customer and site field are text fields.
From https://www.elastic.co/blog/using-painless-kibana-scripted-fields:
Both Painless and Lucene expressions operate on fields stored in doc_values. So for string data, you will need to have the string to be stored in data type keyword.
So, you either define your two fields as keyword, or you add a keyword sub-field to them and use customer.keyword and site.keyword in your script. The changed mapping should be:
"customer": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
}

Prevent Elasticsearch from splitting on specific character while indexing

I have a field with values such as 170726-001, 170726-002, 170726-003 and it appears that the values in the three fields get split into 170726 and 00N. This affects the relevance of my search results when searching for 170726-001 as a keyword using Query String Query.
How do I prevent Elasticsearch from splitting the value on the - character when indexing?
With the help of #filip-cordas and other comments, I updated my index to reflect the following. It uses the keyword type instead of the default text type. Doing it on the index like this means I do not have to specify my_field.keyword in the search.
PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "my_field": {
          "type": "keyword",
          "index": true
        }
      }
    }
  }
}
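Why the keyword type helps relevance here can be shown with a small Python sketch. It is a toy model under stated assumptions: tokenize is a hypothetical stand-in that splits on non-alphanumeric characters, and search_text treats a document as a hit when it shares any token with the query (OR matching).

```python
import re

DOCS = ["170726-001", "170726-002", "170726-003"]


def tokenize(value):
    """Hypothetical tokenizer: split on anything that is not a letter or digit."""
    return [t for t in re.split(r"[^0-9A-Za-z]+", value) if t]


def search_text(query, docs):
    # Analyzed field: a document matches if it shares at least one token.
    query_tokens = set(tokenize(query))
    return [d for d in docs if query_tokens & set(tokenize(d))]


def search_keyword(query, docs):
    # Keyword field: the whole value is a single term, so only exact matches hit.
    return [d for d in docs if d == query]


print(tokenize("170726-001"))
print(search_text("170726-001", DOCS))
print(search_keyword("170726-001", DOCS))
```

In the analyzed case all three documents match because they share the 170726 token, while the keyword case returns only the exact value.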

Unanalyzed fields on Kibana

I need help correcting a Kibana field. When I try to visualize the fields, it shows me the following warning:
Careful! The field selected contains analyzed strings. Analyzed
strings are highly unique and can use a lot of memory to visualize.
Values such as foo-bar will be broken into foo and bar. See Mapping
Core Types for more information on setting this field as
not_analyzed
Elasticsearch's default dynamic mapping analyzes any string field (it breaks the field into tokens; for instance, aaa-bbb-ccc will be broken down into aaa, bbb and ccc).
If you do not want this behavior, you must change the mapping settings
before any document is pushed into the index.
You have two options to do that:
Change the mapping for a particular index using the mapping API, in a static way or a dynamic way (dynamic means that the mapping will also be applied to fields that do not yet exist in the index).
Change the behavior of any index whose name matches a pattern, using the template API.
This example shows a template that changes the mapping for any index whose name matches the template pattern (here, names starting with myindciesprefix). It applies not_analyzed to every string field in every type and makes sure the timestamp field is a date (useful when the timestamp is represented as a number of seconds since 1970):
{
  "template": "myindciesprefix*",
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "strings": {
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        },
        {
          "timestamp_field": {
            "match": "timestamp",
            "mapping": {
              "type": "date"
            }
          }
        }
      ]
    }
  }
}
You do not really have a problem; this is only an informational message. But if you do not want analyzed fields, you must declare the field as not analyzed when you build your index in Elasticsearch.
