Keyword field created automatically without any mapping in Entity class - spring-boot

My Elasticsearch version is 7.6.2 and my spring-boot-starter-data-elasticsearch version is 2.2.0.
Because of some dependencies, I am not upgrading ES to the latest version.
The problem I am facing is that the ES index is sometimes created with .keyword sub-fields and sometimes with just plain text fields.
Below is my entity class. I cannot work out why this is happening. I read that every text field also gets a keyword field, so why is it not always created?
My Entity class
@Setter
@Getter
@Document(indexName="myindex", createIndex=true, shards = 4)
public class MyIndex {
@Field(type = FieldType.Keyword)
private String place;
@Field(type = FieldType.Text)
private String name;
@Id
private String dynamicId = UUID.randomUUID().toString();
public MyIndex()
{}
}
Mapping in ES:
{
"mappings": {
"myindex": {
"properties": {
"place": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"name": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"dynamicId": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
}
}
}
}
}
Sometimes it is created as below for the same entity class
{
"mappings": {
"myindex": {
"properties": {
"place": {
"type": "keyword"
},
"name": {
"type": "text"
},
"dynamicId": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
}
}
}
}
}

With the entity definition shown, when Spring Data Elasticsearch creates the index and writes the mapping, you will get the mapping shown in your second example, with these values for the properties:
{
"properties": {
"place": {
"type": "keyword"
},
"name": {
"type": "text"
}
}
}
If you want to have a nested keyword property in Spring Data Elasticsearch, you have to define it on the entity with the corresponding annotation.
Please note: the @Id property is not mapped explicitly but will be dynamically mapped when the first document is indexed.
The mapping in the first case and the part in the second where a String is mapped as
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
is the default value that Elasticsearch uses when a document is indexed with a text field that was not mapped before - see the docs about dynamic mapping.
So your second example shows the mapping of an index that was created by Spring Data Elasticsearch and where some documents have been indexed.
The first one would be created by Elasticsearch if some other application creates the index and writes data into the index. It could also be that the index was created outside your application, and on application startup no mapping would then be written, because the index already exists. So you should review the way your indices are created.
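One way to review an existing index is to check which fields carry the shape that dynamic mapping gives to strings. The following is a minimal sketch in Python, assuming the "properties" object has already been fetched from the _mapping endpoint; the function name dynamically_mapped_strings is hypothetical. A field that was explicitly mapped with exactly this shape would also match, so treat the result as a hint only.

```python
from typing import Any, Dict, List

# Shape that Elasticsearch's default dynamic mapping gives to a new string field.
DYNAMIC_STRING_DEFAULT = {
    "type": "text",
    "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
}


def dynamically_mapped_strings(properties: Dict[str, Any]) -> List[str]:
    """Return the sorted names of fields whose mapping equals the dynamic string default."""
    return sorted(
        name for name, spec in properties.items() if spec == DYNAMIC_STRING_DEFAULT
    )


# "properties" blocks taken from the two mappings shown in the question.
first_mapping = {
    "place": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "name": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "dynamicId": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
}
second_mapping = {
    "place": {"type": "keyword"},
    "name": {"type": "text"},
    "dynamicId": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
}

print(dynamically_mapped_strings(first_mapping))
print(dynamically_mapped_strings(second_mapping))
```

If every annotated field shows up in the result, as in the first mapping, the mapping from the entity class was most likely never written to that index.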

Related

How to detect whether elasticsearch has enabled dynamic field

I don't know whether dynamic fields are enabled or disabled on my index. When I use the get index mapping command, it only returns this information:
GET /my_index1/_mapping
{
"my_index1": {
"mappings": {
"properties": {
"goodsName": {
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
},
"type": "text"
},
"auditTime": {
"type": "long"
},
"createUserId": {
"type": "long"
}
}
}
}
}
If you don't explicitly set dynamic to false or strict, it defaults to true. If you do set it explicitly, you will see it in your mappings:
{
"mappings": {
"dynamic": false,
"properties": {
"name": {
"type": "text"
}
}
}
}
And when you index the following document:
{"name":"products", "clickCount":1, "bookingCount":2, "isPromoted":1}
Only the field name will be indexed; the other fields are ignored for indexing and searching, although they remain in _source. If you call the _mapping endpoint again, it will return exactly the mappings above.
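The rule above (an absent setting means true) can be sketched as a small check over the "mappings" object of a _mapping response. This is a minimal Python sketch; the function name effective_dynamic is hypothetical, and the normalisation to a lowercase string is an assumption made so that both a boolean and a string value are handled the same way.

```python
from typing import Any, Dict


def effective_dynamic(mappings: Dict[str, Any]) -> str:
    """Return the effective top-level dynamic setting as a lowercase string.

    A missing "dynamic" key means the default applies, which is "true".
    """
    value = mappings.get("dynamic", True)
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value).lower()


# Response from the question: no "dynamic" key, so the default applies.
response = {
    "my_index1": {
        "mappings": {
            "properties": {
                "auditTime": {"type": "long"},
                "createUserId": {"type": "long"},
            }
        }
    }
}
# Mapping from the answer, with the setting made explicit.
explicit = {"dynamic": False, "properties": {"name": {"type": "text"}}}

print(effective_dynamic(response["my_index1"]["mappings"]))
print(effective_dynamic(explicit))
```

Note that this only inspects the top level; the dynamic setting can also be overridden on individual object fields, which this sketch does not walk.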

Failed to find geo_point field location Springframework Data Elasticsearch

The query fails with "failed to find geo_point field [location]". The location field should be mapped as geo_point, but the mapping shows lat and lon sub-properties instead.
Model Class:
import org.springframework.data.elasticsearch.annotations.GeoPointField;
import org.springframework.data.elasticsearch.core.geo.GeoPoint;
class EsMapping extends TestIndex{
@GeoPointField
private GeoPoint location;
@Field(type = FieldType.Double)
private double latitude;
@Field(type = FieldType.Double)
private double longitude;
.
.
}
TestIndex.java
class TestIndex{
@Field(type = FieldType.Text)
private String name;
.
.
}
Test Controller
@Autowired
private ElasticsearchOperations elasticsearchOperations;
IndexOperations esMappingIndex =
elasticsearchOperations.indexOps(EsMapping.class);
esMappingIndex.delete();
esMappingIndex.create();
esMappingIndex.putMapping(esMappingIndex.createMapping());
esMappingIndex.refresh();
Mapping: (not expected)
http://localhost:9200/testindex/_mapping
"location": {
"properties": {
"lat": {
"type": "float"
},
"lon": {
"type": "float"
}
}
},
"latitude": {
"type": "float"
},
"longitude": {
"type": "float"
},
Error:
{"error":{"root_cause":[{"type":"query_shard_exception","reason":"failed to find geo_point field [location] ","index_uuid":"QLCjshecRNqDMXtkrtbF9g","index":"testindex"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"testindex","node":"jCKzz88BQXeL3wzyX6c8lQ","reason":{"type":"query_shard_exception","reason":"failed to find geo_point field [location]","index_uuid":"QLCjshecRNqDMXtkrtbF9g","index":"testindex"}}]},"status":400}
Expected Mapping
"location": {
"type": "geo_point"
},
"latitude": {
"type": "double"
},
"longitude": {
"type": "double"
},
When I define the entity properties as above, with the index name geopoint-test, and run the mapping-creation code (which is also executed on application startup), I get the following mapping:
{
"geopoint-test": {
"mappings": {
"properties": {
"location": {
"type": "geo_point"
}
}
}
}
}
There are two differences to the mapping you show:
the location property is mapped correctly
the latitude and longitude properties are not written to the mapping, as they are not annotated with @Field and so are left to automatic mapping.
The mapping you show is the one that will be automatically created.
Which version of Spring Data Elasticsearch are you using?
Edit 12.12.2020:
Using Spring Data Elasticsearch 4.0.2 and adding the two @Field annotations to the latitude and longitude properties, both the initial setup of an index and the code that explicitly writes the mapping produce this mapping:
{
"properties": {
"location": {
"type": "geo_point"
},
"latitude": {
"type": "double"
},
"longitude": {
"type": "double"
}
}
}
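To see at a glance whether an index has the expected mapping or the automatically created one, the types can be compared field by field. The following is a minimal Python sketch, assuming the "properties" object has been read from the _mapping endpoint; the helper names mapped_type and type_mismatches are hypothetical. A field that has sub-properties but no "type" is reported as "object", which is what the lat/lon mapping in the question amounts to.

```python
from typing import Any, Dict, Optional, Tuple


def mapped_type(spec: Optional[Dict[str, Any]]) -> str:
    """Return the mapped type of a field; fields without "type" are plain objects."""
    if spec is None:
        return "missing"
    return spec.get("type", "object")


def type_mismatches(
    expected: Dict[str, str], properties: Dict[str, Any]
) -> Dict[str, Tuple[str, str]]:
    """Map each field whose type differs to a (expected, actual) pair."""
    result = {}
    for field, want in expected.items():
        got = mapped_type(properties.get(field))
        if got != want:
            result[field] = (want, got)
    return result


expected = {"location": "geo_point", "latitude": "double", "longitude": "double"}

# Mapping reported in the question (created automatically).
actual = {
    "location": {"properties": {"lat": {"type": "float"}, "lon": {"type": "float"}}},
    "latitude": {"type": "float"},
    "longitude": {"type": "float"},
}

for field, (want, got) in type_mismatches(expected, actual).items():
    print(f"{field}: expected {want}, found {got}")
```

An empty result means the mapping written by the application is the one in the index; any mismatch suggests that documents were indexed before the mapping was written.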

Update NonNested Property to Nested Type in ElasticSearch Document

We are creating a dynamic object in an Elasticsearch index. At creation time we have no mapping for this object, so it is created as a non-nested (plain object) type, as shown below.
"categoriesScore": {
"properties": {
"score": {
"type": "float"
},
"categoryName": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"categoryId": {
"type": "long"
}
}
},
So we need to change the property type to nested for some of these non-nested objects.
We have tried the code below:
await _client.MapAsync<DocumentEntity> (c => c.Index(_index).Type(_type)).ConfigureAwait (false);
We need a NEST query that updates the non-nested type to the nested type.
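For context, Elasticsearch does not allow an existing object field to be changed to nested in place, so the usual route is a new index with the nested mapping followed by a reindex. The following Python sketch only builds the two request bodies involved and is not the NEST call itself; the function names and the index names my-index and my-index-v2 are hypothetical.

```python
import copy
from typing import Any, Dict


def as_nested(properties: Dict[str, Any], field: str) -> Dict[str, Any]:
    """Return a copy of the properties with one object field marked as nested."""
    updated = copy.deepcopy(properties)
    spec = dict(updated[field])
    spec["type"] = "nested"  # sub-properties are kept unchanged
    updated[field] = spec
    return updated


def reindex_body(source: str, dest: str) -> Dict[str, Any]:
    """Build the body for POST _reindex from the old index to the new one."""
    return {"source": {"index": source}, "dest": {"index": dest}}


# Dynamically created mapping from the question (shortened to two sub-fields).
old_properties = {
    "categoriesScore": {
        "properties": {
            "score": {"type": "float"},
            "categoryId": {"type": "long"},
        }
    }
}

new_properties = as_nested(old_properties, "categoriesScore")
print(new_properties["categoriesScore"]["type"])
print(reindex_body("my-index", "my-index-v2"))
```

The new index would be created with new_properties as its mapping before the reindex request is sent, so that the copied documents are indexed as nested from the start.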

How to declare mapping for nested fields in Elasticsearch to allow for storing different types?

In essence, I want my mapping to be as schemaless as possible, but allow for nested types and being able to store data that may have different types:
When I try to add a document where some fields have different types of values, I get an error like this:
"type": "illegal_argument_exception",
"reason": "mapper [data.customData.value] of different type, current_type [long], merged_type [text]"
This can easily be solved by mapping the field value to text (or by creating the mapping dynamically, inserting a document with only text first). However, I would like to avoid having a schema. Could all of the fields nested in customData be set to text? How do I do that?
I had the problem earlier, but it started working after I accidentally got a dynamic mapping that worked (since everything was regarded as text). I was later made aware of this problem because I needed to change the mapping to allow for nested types.
Documents with this kind of data are troublesome to store successfully:
"customData": [
{
"value": "some_text",
"key": "some_text"
},
{
"value": 0,
"key": "some_text"
}
]
A part of the mapping that works:
{
"my_index": {
"aliases": {},
"mappings": {
"_doc": {
"properties": {
"data": {
"properties": {
"customData": {
"properties": {
"key": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"value": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
},
"some_list": {
"type": "nested",
"properties": {
"some_field": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
}
}
In essence, I want the mapping to be as schemaless as possible, but allow for nested types and being able to store data that may have different types:
{
"mappings": {
"_doc": {
"properties": {
"data": {
"type": "object"
},
"some_list": {
"type": "nested"
}
}
}
}
}
So what would be the best approach to go about this problem?
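One possible approach is a dynamic template, which lets the mapping stay mostly schemaless while forcing every new field below a given path to text. The following Python sketch only builds the mapping body; the template name custom_data_as_text and the helper text_template are hypothetical, and on an index that still uses mapping types the body would sit under the _doc key as in the mappings above. Because no match_mapping_type is given, the template should apply to any detected type, which also means that an object placed directly under customData would not fit this template.

```python
import json
from typing import Any, Dict


def text_template(path_pattern: str, name: str = "custom_data_as_text") -> Dict[str, Any]:
    """Build a dynamic template that maps every new field matching the path to text."""
    return {name: {"path_match": path_pattern, "mapping": {"type": "text"}}}


mapping = {
    # Applied only to fields that are not mapped yet when a document arrives.
    "dynamic_templates": [text_template("data.customData.*")],
    "properties": {
        "data": {"type": "object"},
        "some_list": {"type": "nested"},
    },
}

print(json.dumps(mapping, indent=2))
```

With such a template in place before the first document is indexed, both "value": "some_text" and "value": 0 would be stored under a text mapping, so the type conflict from the error message should not occur.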

ElasticSearch Reindex API not analyzing the new field

I have an existing index named "Docs" which has documents in it.
I am creating a new index named "Docs1" that is exactly the same as "Docs", with only one extra sub-field, with its own analyzer, on one property, which I want to use for autocomplete.
Property in "Docs" index
"name": {
"type": "text",
"analyzer": "text_standard_analyzer",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
Property in the "Docs1" index going to be
{
"name": {
"type": "text",
"analyzer": "text_standard_analyzer",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
},
"pmatch": {
"type": "text",
"analyzer": "text_partialmatching_analyzer"
}
}
}
}
I am using Reindex API to copy records from "Docs" to "Docs1"
POST _reindex
{
"source": {
"index": "Docs"
},
"dest": {
"index": "Docs1"
}
}
When I reindex, I expect the older documents to contain the new field, populated with the analyzed content.
However, I am noticing that the new field in my destination index "Docs1" is not analyzed for existing data, although it is analyzed for any new documents I add.
Please suggest a fix.
Reindexing with "type" added to the destination worked:
POST _reindex
{
"source":
{ "index": "sourceindex" },
"dest":
{ "index": "destindex",
"type":"desttype"
}
}
