Elasticsearch Reindex API not analyzing the new field

I have an existing index named "Docs" that already contains documents.
I am creating a new index named "Docs1" that is identical to "Docs" except for one extra sub-field, with its own analyzer, on one property. I want to use that sub-field for autocomplete.
The property in the "Docs" index:
"name": {
  "type": "text",
  "analyzer": "text_standard_analyzer",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
}
The property in the "Docs1" index will be:
{
  "name": {
    "type": "text",
    "analyzer": "text_standard_analyzer",
    "fields": {
      "keyword": {
        "type": "keyword",
        "ignore_above": 256
      },
      "pmatch": {
        "type": "text",
        "analyzer": "text_partialmatching_analyzer"
      }
    }
  }
}
I am using the Reindex API to copy the documents from "Docs" to "Docs1":
POST _reindex
{
  "source": {
    "index": "Docs"
  },
  "dest": {
    "index": "Docs1"
  }
}
When I reindex, I expect the older documents to contain the new field, populated with the analyzed data.
However, I am noticing that the new field in my destination index "Docs1" is not analyzed for the existing data, although it is analyzed for any new documents I add.
Please suggest what I should change.

Reindexing worked after adding "type" to the destination:
POST _reindex
{
  "source": {
    "index": "sourceindex"
  },
  "dest": {
    "index": "destindex",
    "type": "desttype"
  }
}
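For anyone scripting this, the request body above can be assembled programmatically. This is only a minimal sketch in Python using the standard library; the helper name build_reindex_body is hypothetical, and the optional dest_type argument simply mirrors the "type" key used in this answer. Mapping types were removed in later Elasticsearch versions, so assume this key only applies to older clusters.

```python
import json


def build_reindex_body(source_index, dest_index, dest_type=None):
    """Return the JSON body for POST _reindex as a plain dict.

    dest_type is optional and only meaningful on older clusters that
    still support mapping types (an assumption, see the note above).
    """
    dest = {"index": dest_index}
    if dest_type is not None:
        dest["type"] = dest_type
    return {"source": {"index": source_index}, "dest": dest}


# Same index and type names as in the answer above.
body = build_reindex_body("sourceindex", "destindex", dest_type="desttype")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of POST _reindex with whatever HTTP client you already use.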

Related

Sorting in Elasticsearch using the new Java API

I am using the latest Java API to communicate with the Elasticsearch server.
I need to search data in a sorted order.
SortOptions sort = new SortOptions.Builder()
        .field(f -> f.field("customer.keyword").order(SortOrder.Asc))
        .build();
List<SortOptions> list = new ArrayList<SortOptions>();
list.add(sort);
SearchResponse<Order> response = elasticsearchClient.search(b -> b
        .index("order")
        .size(100)
        .sort(list)
        .query(q -> q.bool(bq -> bq
                .filter(fb -> fb.range(r -> r
                        .field("orderTime")
                        .gte(JsonData.of(timeStamp("01-01-2022-01-01-01")))
                        .lte(JsonData.of(timeStamp("01-01-2022-01-01-10")))))
                // .must(query)
        )), Order.class);
I wrote the code above to get the search results sorted by customer.
I get the error below when I run the program.
Exception in thread "main" co.elastic.clients.elasticsearch._types.ElasticsearchException: [es/search] failed: [search_phase_execution_exception] all shards failed
at co.elastic.clients.transport.rest_client.RestClientTransport.getHighLevelResponse(RestClientTransport.java:281)
at co.elastic.clients.transport.rest_client.RestClientTransport.performRequest(RestClientTransport.java:147)
at co.elastic.clients.elasticsearch.ElasticsearchClient.search(ElasticsearchClient.java:1487)
at co.elastic.clients.elasticsearch.ElasticsearchClient.search(ElasticsearchClient.java:1504)
at model.OrderDAO.fetchRecordsQuery(OrderDAO.java:128)
The code runs fine if I remove the .sort() call.
My index is configured as follows.
{
  "order": {
    "aliases": {},
    "mappings": {
      "properties": {
        "customer": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "orderId": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "orderTime": {
          "type": "long"
        },
        "orderType": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    },
    "settings": {
      "index": {
        "routing": {
          "allocation": {
            "include": {
              "_tier_preference": "data_content"
            }
          }
        },
        "number_of_shards": "1",
        "provided_name": "order",
        "creation_date": "1652783550822",
        "number_of_replicas": "1",
        "uuid": "mrAj8ZT-SKqC43-UZAB-Jw",
        "version": {
          "created": "8010299"
        }
      }
    }
  }
}
Please let me know what is wrong here. If possible, please also send me the correct syntax for using sort() in the new Java API.
Thanks a lot.
As you confirmed in the comments, customer is a text field, and that is why you are getting the error above: sorting cannot be applied to a text field.
For sorting to work on the customer field, your index should be configured like this:
{
  "mappings": {
    "properties": {
      "customer": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}
Once the index mapping looks like the above, you can use customer.keyword as the field name for sorting and customer as the field name for free-text search.
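To make the sort/search split concrete, here is a rough sketch of the equivalent REST search body, built in Python with the standard library only. It is an illustration, not the Java client syntax: the helper name build_sorted_search is hypothetical, and the two epoch-millisecond values are placeholders, since the question's timeStamp() helper is not shown.

```python
import json


def build_sorted_search(sort_field, time_field, start, end, size=100):
    """Build a _search body: range filter on time_field, ascending sort on sort_field."""
    return {
        "size": size,
        # Sort on the keyword sub-field, not on the analyzed text field.
        "sort": [{sort_field: {"order": "asc"}}],
        "query": {
            "bool": {
                "filter": [
                    {"range": {time_field: {"gte": start, "lte": end}}}
                ]
            }
        },
    }


# Placeholder timestamps (assumed values, not derived from the question).
body = build_sorted_search("customer.keyword", "orderTime", 1640995200000, 1640995210000)
print(json.dumps(body, indent=2))
```

Sending this body to the order index's _search endpoint is also a quick way to see the per-shard failure reason that the Java exception message hides.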

How to set elasticsearch index mapping as not_analysed for all the fields

I want my Elasticsearch index to match the exact value for all fields. How do I map my index as "not_analysed" for all fields?
I'd suggest making use of multi-fields in your mapping (which is the default behavior if you rely on dynamic mapping instead of creating the mapping yourself).
That way you can switch between traditional full-text search and exact-match search as required.
Note that for exact matches you need the keyword datatype plus a Term Query. Examples are provided in the links I've specified.
Hope it helps!
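As a small illustration of the "keyword datatype + Term Query" combination, the sketch below builds a term query body in Python. The helper name build_term_query is hypothetical, and the field color.keyword and value "Red" are assumed example values, not taken from the question.

```python
import json


def build_term_query(field, value):
    """Build a _search body that matches documents whose field equals value exactly."""
    return {"query": {"term": {field: {"value": value}}}}


# A term query is not analyzed, so it should target the keyword sub-field.
body = build_term_query("color.keyword", "Red")
print(json.dumps(body, indent=2))
```

Because the term is compared as-is, "Red" and "red" are different values on a keyword field, whereas a match query on the text field would treat them the same under the standard analyzer.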
You can use a dynamic_templates mapping for this. By default, Elasticsearch maps string fields as text with index: true, like below:
{
  "products2": {
    "mappings": {
      "product": {
        "properties": {
          "color": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "type": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}
As you can see, it also creates a keyword sub-field as a multi-field. This keyword field is indexed but, unlike text, not analyzed. If you want to drop this default behaviour, you can use the configuration below when creating the index:
PUT products
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "product": {
      "dynamic_templates": [
        {
          "strings": {
            "match_mapping_type": "string",
            "mapping": {
              "type": "keyword",
              "index": false
            }
          }
        }
      ]
    }
  }
}
After doing this, the index will look like below:
{
  "products": {
    "mappings": {
      "product": {
        "dynamic_templates": [
          {
            "strings": {
              "match_mapping_type": "string",
              "mapping": {
                "type": "keyword",
                "index": false
              }
            }
          }
        ],
        "properties": {
          "color": {
            "type": "keyword",
            "index": false
          },
          "type": {
            "type": "keyword",
            "index": false
          }
        }
      }
    }
  }
}
Note: I don't know your use case, but you can use the multi-field feature as mentioned by @Kamal. Otherwise, with "index": false, you cannot search on these fields. You can also use dynamic_templates to make only some fields analyzed.
Please check the documentation for more information:
https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-templates.html
I also explained this behaviour in this article. Sorry, it is in Turkish, but you can check the example code samples with Google Translate if you want.
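To illustrate the last point of the note (only some fields analyzed), here is a hedged sketch that generates a two-entry dynamic_templates list in Python. The helper name, the template names, and the "_text" suffix convention are all assumptions made for the example. Unlike the configuration above, the keyword template here leaves "index" at its default, so the fields stay searchable by exact value.

```python
import json


def build_dynamic_templates(text_suffix="_text"):
    """Return a dynamic_templates list.

    Strings whose field name ends with text_suffix are mapped as analyzed
    text; every other string is mapped as a non-analyzed keyword.
    """
    return [
        {
            "analyzed_strings": {
                "match_mapping_type": "string",
                "match": "*" + text_suffix,
                "mapping": {"type": "text"},
            }
        },
        {
            "strings": {
                "match_mapping_type": "string",
                "mapping": {"type": "keyword"},
            }
        },
    ]


templates = build_dynamic_templates()
print(json.dumps({"dynamic_templates": templates}, indent=2))
```

Order matters here: templates are checked in sequence and the first match wins, so the more specific suffix rule has to come before the catch-all.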

Kibana keyword occurrence across documents

I have been unable to show word occurrences in Kibana, across the documents in the index, for a full_text field mapped as "type": "keyword".
My first attempt involved using an analyzer. However, I have been unable to change the documents in any way: the index mapping reflects the analyzer, but no field reflects the analysis.
This is the simplified mapping:
{
  "mappings": {
    "doc": {
      "properties": {
        "text": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword"
            },
            "analyzed": {
              "type": "text",
              "analyzer": "rebuilt"
            }
          }
        }
      }
    }
  },
  "settings": {
    "analysis": {
      "analyzer": {
        "rebuilt": {
          "tokenizer": "standard"
        }
      }
    },
    "index.mapping.ignore_malformed": true,
    "index.mapping.total_fields.limit": 2000
  }
}
But I am still unable to see the array of words that I expect to be saved under the text.analyzed field. Indeed, that field does not exist, and I'm wondering why.
It seems that setting fielddata=true, in spite of being heavily discouraged, solved my problem (at least for now): it allows me to visualize in Kibana the occurrence (or absolute frequency) of each word in the text field across documents.
The final version of the proposed simplified mapping therefore became:
{
  "mappings": {
    "doc": {
      "properties": {
        "text": {
          "type": "text",
          "analyzer": "rebuilt",
          "fielddata": true,
          "fields": {
            "keyword": {
              "type": "keyword"
            }
          }
        }
      }
    }
  },
  "settings": {
    "analysis": {
      "analyzer": {
        "rebuilt": {
          "tokenizer": "standard"
        }
      }
    },
    "index.mapping.ignore_malformed": true,
    "index.mapping.total_fields.limit": 2000
  }
}
This gets rid of the useless analyzed field.
I still have to check Kibana's performance. If someone has a performance-safe solution to this problem, please do not hesitate to share it.
Thanks.
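For checking the result outside Kibana, a terms aggregation on the text field is roughly what such a visualization asks Elasticsearch for. The sketch below builds that request body in Python; the helper name build_word_count_agg and the aggregation name word_counts are hypothetical, and it assumes fielddata is enabled on the field as in the mapping above.

```python
import json


def build_word_count_agg(field, size=20):
    """Build a _search body with a terms aggregation over the tokens of field."""
    return {
        # size 0: return only the aggregation buckets, no hits.
        "size": 0,
        "aggs": {
            "word_counts": {
                "terms": {"field": field, "size": size}
            }
        },
    }


body = build_word_count_agg("text")
print(json.dumps(body, indent=2))
```

Each bucket's doc_count is the number of documents containing that token, not the total number of times the token appears.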

Elasticsearch Field Preference for result sequence

I have created the index in elasticsearch with the following mapping:
{
  "test": {
    "mappings": {
      "documents": {
        "properties": {
          "fields": {
            "type": "nested",
            "properties": {
              "uid": {
                "type": "keyword"
              },
              "value": {
                "type": "text",
                "copy_to": [
                  "fulltext"
                ]
              }
            }
          },
          "fulltext": {
            "type": "text"
          },
          "tags": {
            "type": "text"
          },
          "title": {
            "type": "text",
            "fields": {
              "raw": {
                "type": "keyword"
              }
            }
          },
          "url": {
            "type": "text",
            "fields": {
              "raw": {
                "type": "keyword"
              }
            }
          }
        }
      }
    }
  }
}
While searching, I want to set a field preference: for example, if the search text is found in title or url, that document should come before the other documents.
Can we set a field preference for the order of search results (in my case, the preference is title, url, tags, fields)?
Could you please help me with this?
This is called "boosting". Prior to Elasticsearch 5.0.0, boosting could be applied at index time (added as part of the field mapping) or at query time. Index-time boosting is now deprecated, and from 5.0 onwards boosts defined in the mapping are applied at query time.
The current recommendation is to use query-time boosting.
Please read this document for details on how to use boosting:
https://www.elastic.co/guide/en/elasticsearch/guide/current/_boosting_query_clauses.html
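As one possible shape for query-time boosting with the mapping from the question, the sketch below builds a multi_match query in Python where each field carries a "^boost" suffix. The helper name build_boosted_query and the boost values (4, 3, 2, 1) are assumptions chosen only to express the title, url, tags, fields preference; the nested fields.value text is reached here through the fulltext field it is copied to.

```python
import json


def build_boosted_query(text, boosts):
    """Build a multi_match _search body with per-field query-time boosts.

    boosts maps field name to boost factor; insertion order is preserved.
    """
    fields = [f"{name}^{boost}" for name, boost in boosts.items()]
    return {"query": {"multi_match": {"query": text, "fields": fields}}}


# Assumed boost values; tune them against real data.
body = build_boosted_query(
    "elasticsearch",
    {"title": 4, "url": 3, "tags": 2, "fulltext": 1},
)
print(json.dumps(body, indent=2))
```

Boosts influence the relevance score rather than imposing a strict ordering, so a strong match in tags can still outrank a weak match in title.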

Kibana doesn't show results on tile map

I have approximately 3300 documents with a populated geo_point field.
When I try to visualize my documents on the tile map, Kibana says "no results found".
I've already tried providing the coordinates as:
- geohash in string
- [lon, lat] array
- object with "lat" and "lon" properties
- string "lat,lon"
All these ways of setting a geo_point are allowed according to the ES docs.
Kibana detects this field as geo_point (there is a globe icon next to the field name), but nothing shows up on the tile map.
What am I doing wrong?
I'm using Kibana 4.2 and Elasticsearch 2.0.0.
I've solved it.
It was happening because my geo_point field was inside a field with the "type": "nested" parameter.
I changed this outer field to "dynamic": "true", and now I can visualize my locations!
I was able to have a nested geo_point by removing the "type": "nested" from the mapping. No "dynamic": "true" needed. My mapping looks like this:
{
  "mappings": {
    "_default_": {
      "_all": {
        "enabled": true
      },
      "_ttl": {
        "enabled": true,
        "default": "12m"
      },
      "dynamic_templates": [{
        "string_fields": {
          "match": "*",
          "match_mapping_type": "string",
          "mapping": {
            "type": "string",
            "index": "analyzed",
            "omit_norms": true,
            "fields": {
              "raw": {
                "type": "string",
                "index": "not_analyzed",
                "ignore_above": 256
              }
            }
          }
        }
      }],
      "properties": {
        "@version": {
          "type": "string",
          "index": "not_analyzed"
        },
        "user_data": {
          "properties": {
            "user_geolocation": {
              "properties": {
                "location": {
                  "type": "geo_point"
                }
              }
            }
          }
        }
      }
    }
  }
}
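To test a mapping like this without Kibana, one option is to run a geohash_grid aggregation by hand, which, as far as I know, is the kind of aggregation the tile map relies on. The sketch below builds such a request body in Python; the helper name build_geohash_grid, the aggregation name grid, and the precision value are assumptions, and the field path follows the mapping above.

```python
import json


def build_geohash_grid(field, precision=5):
    """Build a _search body that buckets a geo_point field into geohash cells."""
    return {
        # size 0: return only the aggregation buckets, no hits.
        "size": 0,
        "aggs": {
            "grid": {
                "geohash_grid": {"field": field, "precision": precision}
            }
        },
    }


body = build_geohash_grid("user_data.user_geolocation.location")
print(json.dumps(body, indent=2))
```

If this returns buckets, the field is reachable from the root documents; an empty bucket list while the data is present would be consistent with the nested-mapping problem described above.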
