Adding null_value through a spring-data-elasticsearch annotation

I want to create a user index like the one below using spring-data-elasticsearch 2.1.0 annotations.
I cannot find any annotation that adds "null_value": "NULL". This is required because our sort order is failing.
"user": {
"properties": {
"firstName": {
"type": "string"
},
"lastName": {
"type": "string"
},
"displayName": {
"type": "string",
"analyzer": "word_analyzer",
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed",
"null_value": "NULL"
}
}
}
}
}
Domain class
private String firstName;
private String lastName;
@MultiField(
mainField = @Field(type = FieldType.String, analyzer = "word_analyzer"),
otherFields = {
@InnerField(suffix = "raw", type = FieldType.String, index = FieldIndex.not_analyzed)
}
)
private String displayName;
How can I add "null_value": "NULL" to the InnerField through a spring-data-elasticsearch annotation? I do not want to create the index mapping externally.

Currently this is only possible through the @Mapping annotation. Create a JSON file with the mapping definition:
{
"type": "string",
"index": "analyzed",
"analyzer": "word_analyzer",
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed",
"null_value": "NULL"
}
}
}
Save it in your resources folder. In this example it is saved as resources/elastic/document_display_name_mapping.json.
Then annotate the field with the @Mapping annotation:
@Mapping(mappingPath = "elastic/document_display_name_mapping.json")
private String displayName;

Referring to https://jira.spring.io/browse/DATAES-312
This is an open issue (fixed but not merged).
This can be handled by adding "missing": "_last" or "_first" and "unmapped_type" to the sort options.
Ref:- https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-sort.html#_ignoring_unmapped_fields
These options ("missing", "unmapped_type", "mode") are not available.
Setting "null_value": "NULL" will not give the correct sort order for a string field.
"null_value": "0" can satisfy the sort order for an integer field.
Is it possible to do something in the settings themselves to achieve one use case?
Use case: if the sort direction is ASC then "missing": "_first", and if the sort direction is DESC then "missing": "_last", applied to the raw field.
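A minimal sketch of that use case, assuming the choice is made by the application at query time rather than in the index settings. The class and method names are hypothetical, and how the resulting map is serialized and handed to a client is left out; only the "order", "missing" and "unmapped_type" sort options come from the Elasticsearch sort documentation referenced above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds one entry of an Elasticsearch "sort" array as a plain map.
final class MissingAwareSort {

    // Returns {field: {order, missing, unmapped_type}} for the given direction.
    static Map<String, Object> sortEntry(String field, boolean ascending, String unmappedType) {
        Map<String, Object> options = new LinkedHashMap<>();
        options.put("order", ascending ? "asc" : "desc");
        // ASC -> documents without a value come first, DESC -> they come last.
        options.put("missing", ascending ? "_first" : "_last");
        // Type to assume when an index has no mapping for the field.
        options.put("unmapped_type", unmappedType);

        Map<String, Object> entry = new LinkedHashMap<>();
        entry.put(field, options);
        return entry;
    }

    public static void main(String[] args) {
        System.out.println(sortEntry("displayName.raw", true, "string"));
        System.out.println(sortEntry("displayName.raw", false, "string"));
    }
}
```

Whether a given client or Spring Data Elasticsearch version lets you pass these options through is a separate question; the sketch only shows the shape of the sort clause.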

Related

Keyword field created automatically without any mapping in Entity class

My Elasticsearch version is 7.6.2 and my spring-boot-starter-data-elasticsearch version is 2.2.0.
Due to some dependencies I am not upgrading Elasticsearch to the latest version.
The problem I am facing is that the index is sometimes created with .keyword fields and sometimes with plain text fields only.
Below is my entity class. I cannot find out why this is happening. I read that every text field also gets a keyword field, but why is it not always created?
My Entity class
@Setter
@Getter
@Document(indexName="myindex", createIndex=true, shards = 4)
public class MyIndex {
@Field(type = FieldType.Keyword)
private String place;
@Field(type = FieldType.Text)
private String name;
@Id
private String dynamicId = UUID.randomUUID().toString();
public MyIndex()
{}
}
Mapping in ES:
{
"mappings": {
"myindex": {
"properties": {
"place": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"name": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"dynamicId": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
}
}
}
}
}
Sometimes it is created as below for the same entity class
{
"mappings": {
"myindex": {
"properties": {
"place": {
"type": "keyword"
},
"name": {
"type": "text"
},
"dynamicId": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
}
}
}
}
}
With the entity definition shown, when Spring Data Elasticsearch creates the index and writes the mapping, you will get the mapping shown in your second example, with these values for the properties:
{
"properties": {
"place": {
"type": "keyword"
},
"name": {
"type": "text"
}
}
}
If you want to have a nested keyword property in Spring Data Elasticsearch, you have to define it on the entity with the corresponding annotation.
Please note: the @Id property is not mapped explicitly but will be dynamically mapped when the first document is indexed.
The mapping in the first case and the part in the second where a String is mapped as
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
}
is the default value that Elasticsearch uses when a document is indexed with a text field that was not mapped before - see the docs about dynamic mapping.
So your second example shows the mapping of an index that was created by Spring Data Elasticsearch and where some documents have been indexed.
The first one would be created by Elasticsearch if some other application creates the index and writes data into the index. It could also be that the index was created outside your application, and on application startup no mapping would then be written, because the index already exists. So you should review the way your indices are created.

Failed to find geo_point field location Springframework Data Elasticsearch

The query fails with "failed to find geo_point field [location]". The location field should be mapped as geo_point, but the mapping contains lat and lon instead.
Model Class:
import org.springframework.data.elasticsearch.annotations.GeoPointField;
import org.springframework.data.elasticsearch.core.geo.GeoPoint;
class EsMapping extends TestIndex{
@GeoPointField
private GeoPoint location;
@Field(type = FieldType.Double)
private double latitude;
@Field(type = FieldType.Double)
private double longitude;
.
.
}
TestIndex.java
class TestIndex{
@Field(type = FieldType.Text)
private String name;
.
.
}
Test Controller
@Autowired
private ElasticsearchOperations elasticsearchOperations;
IndexOperations esMappingIndex =
elasticsearchOperations.indexOps(EsMapping.class);
esMappingIndex.delete();
esMappingIndex.create();
esMappingIndex.putMapping(esMappingIndex.createMapping());
esMappingIndex.refresh();
Mapping: (not expected)
http://localhost:9200/testindex/_mapping
"location": {
"properties": {
"lat": {
"type": "float"
},
"lon": {
"type": "float"
}
}
},
"latitude": {
"type": "float"
},
"longitude": {
"type": "float"
},
Error:
{"error":{"root_cause":[{"type":"query_shard_exception","reason":"failed to find geo_point field [location] ","index_uuid":"QLCjshecRNqDMXtkrtbF9g","index":"testindex"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"testindex","node":"jCKzz88BQXeL3wzyX6c8lQ","reason":{"type":"query_shard_exception","reason":"failed to find geo_point field [location]","index_uuid":"QLCjshecRNqDMXtkrtbF9g","index":"testindex"}}]},"status":400}
Expected Mapping
"location": {
"type": "geo_point"
},
"latitude": {
"type": "double"
},
"longitude": {
"type": "double"
},
When I define the entity properties as above, with an index name of geopoint-test, and execute the mapping-creation code (which is also executed on application startup), I get the following mapping:
{
"geopoint-test": {
"mappings": {
"properties": {
"location": {
"type": "geo_point"
}
}
}
}
}
There are two differences to the mapping you show:
the location property is mapped correctly
the latitude and longitude properties are not written to the mapping, as they are not annotated with @Field and so are left for automatic mapping.
The mapping you show is the one that will be automatically created.
Which version of Spring Data Elasticsearch are you using?
Edit 12.12.2020:
Using Spring Data Elasticsearch 4.0.2 and adding the two @Field annotations to the latitude and longitude properties, both the initial setup of an index and the code to explicitly write the mapping produce this mapping:
{
"properties": {
"location": {
"type": "geo_point"
},
"latitude": {
"type": "double"
},
"longitude": {
"type": "double"
}
}
}

How to rename a field in Elasticsearch?

I have an index in Elasticsearch with the following field mapping:
{
"version_data": {
"properties": {
"title": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
},
"updated_at": {
"type": "date"
},
"updated_by": {
"type": "keyword"
}
}
}
}
I have already created some documents in it and now want to rename the version_data field to _version_data.
Is there any way in Elasticsearch to rename a field within the mapping and in the documents?
The closest thing is the alias data type.
In your mapping you could link it from the old to the new name like this:
PUT test/_mapping
{
"properties": {
"_version_data": {
"type": "alias",
"path": "version_data"
}
}
}
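BTW, I would generally avoid leading underscores, since those are normally used for internal fields like _id.
Also note that an alias target must be a concrete field, not an object, so for an object field such as version_data you would need one alias per leaf field (for example with "path": "version_data.title").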

Elastic search common mapping type and run aggregation based on type of data

We have an Elasticsearch index with the following mapping (showing only the part of the mapping relevant to this question):
"instFields": {
"properties": {
"_index": {
"type": "object"
},
"fieldValue": {
"fields": {
"raw": {
"index": "not_analyzed",
"type": "string"
}
},
"type": "string"
},
"sourceFieldId": {
"type": "integer"
}
},
"type": "nested"
}
As you can see, the fieldValue type is string. In the original data in the database, fieldValue is stored in a JSON-typed column (in PostgreSQL). When this data is stored, fieldValue can be a valid JsNumber, JsString, or JsBoolean (any valid JsValue). Since the field must have one definite type in Elasticsearch, we convert fieldValue to a string while pushing data into Elasticsearch.
The following is sample data from Elasticsearch:
"instFields": [
{
"sourceFieldId": 1233,
"fieldValue": "Demo Logistics LLC"
},
{
"sourceFieldId": 1236,
"fieldValue": "169451"
}
]
This is where it gets interesting: we now want to run various metrics aggregations on fieldValue, for example, if sourceFieldId = 1236, run avg on fieldValue. The problem is that fieldValue had to be stored as a string in Elasticsearch because it is a JsValue-typed field in the application. What is the best way to create the mapping in Elasticsearch so that fieldValue is stored with an appropriate type rather than as a string, so that metrics aggregations can be run on values that are really of type long (though encoded as strings in Elasticsearch)?
One way to achieve this is to create separate fields in Elasticsearch for every possible JsValue type (JsNumber, JsBoolean, JsString, etc.). While indexing, the application can then determine whether the value is a JsString, JsNumber, JsBoolean, and so on.
On the application side I can decode the proper type of the fieldValue being indexed:
value match {
case JsString(s) => // value is a string
case JsNumber(n) => // value is a number
case JsBoolean(b) => // value is a boolean
}
Now modify the mapping in Elasticsearch and add more fields, each with the proper type, as shown below:
"instFields": {
"properties": {
"_index": {
"type": "object"
},
"fieldBoolean": {
"type": "boolean"
},
"fieldDate": {
"fields": {
"raw": {
"format": "dateOptionalTime",
"type": "date"
}
},
"format": "dateOptionalTime",
"type": "date"
},
"fieldDouble": {
"fields": {
"raw": {
"type": "double"
}
},
"type": "double"
},
"fieldLong": {
"fields": {
"raw": {
"type": "long"
}
},
"type": "long"
},
"fieldString": {
"fields": {
"raw": {
"index": "not_analyzed",
"type": "string"
}
},
"type": "string"
},
"fieldValue": {
"fields": {
"raw": {
"index": "not_analyzed",
"type": "string"
}
},
"type": "string"
}
Now, at the time of indexing:
value match {
case JsString(s) => // populate fieldString
case JsNumber(n) => // populate fieldDouble (there is also fieldLong)
case JsBoolean(b) => // populate fieldBoolean
}
This way a boolean value is stored in fieldBoolean, a number is stored in fieldLong or fieldDouble, and so on. Running metrics aggregations then becomes routine: they go against fieldLong or fieldDouble (depending on the query use case). Notice that fieldValue is still in the mapping and the index as before. The application will continue to convert the value to a string and store it in fieldValue, so queries that do not care about types can query only fieldValue.
It sounds like you should have two separate fields, one for the case when the value is a string and one for when it is a number.
Depending on how you are indexing this data, that can be easy or hard. However, it is a bit strange that you have a field that could be either a string or a number.
Regardless, Elasticsearch is not going to be able to do both in a single field.
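A minimal sketch of the separate-fields idea, written in Java rather than the question's Scala. Plain Java types (Boolean, Number, String) stand in for JsBoolean, JsNumber and JsString, which is an assumption, and the class and method names are hypothetical. The field names fieldValue, fieldBoolean, fieldDouble and fieldString are taken from the mapping in the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: routes a value to a typed field next to the string-valued fieldValue.
final class TypedFieldRouter {

    // Builds one instFields entry; the typed field that gets populated depends on the runtime type.
    static Map<String, Object> toInstField(int sourceFieldId, Object value) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("sourceFieldId", sourceFieldId);
        // Always keep the string form so type-agnostic queries keep working.
        doc.put("fieldValue", String.valueOf(value));

        if (value instanceof Boolean) {
            doc.put("fieldBoolean", value);
        } else if (value instanceof Number) {
            // Numbers go to fieldDouble here; fieldLong could be filled for integral values.
            doc.put("fieldDouble", ((Number) value).doubleValue());
        } else if (value instanceof String) {
            doc.put("fieldString", value);
        }
        return doc;
    }

    public static void main(String[] args) {
        System.out.println(toInstField(1233, "Demo Logistics LLC"));
        System.out.println(toInstField(1236, 169451));
    }
}
```

With documents shaped like this, a metrics aggregation such as avg can target fieldDouble, while existing queries continue to use fieldValue.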

How to set IndexOption = docs

I need to get result below with NEST (Elastic Search .NET client)
"detailVal": {
"name": "detailVal",
"type": "multi_field",
"fields": {
"detailVal": {
"type": "string"
},
"untouched": { // <== FOCUS 2
"type": "string",
"index": "not_analyzed",
"omit_norms": true,
"include_in_all": false,
"index_options": "docs" // <== FOCUS 1
}
}
}
This is what I have so far:
[ElasticProperty(OmitNorms = true, Index = FieldIndexOption.not_analyzed, IncludeInAll = false, AddSortField = true)]
public string DetailVal { get; set; }
which gets me
"detailVal": {
"name": "detailVal",
"type": "multi_field",
"fields": {
"detailVal": {
"type": "string",
"index": "not_analyzed",
"omit_norms": true,
"include_in_all": false
},
"sort": { // <== FOCUS 2
"type": "string",
"index": "not_analyzed"
}
}
}
So, any idea how to:
add "index_options": "docs" (I found IndexOptions.docs, but it is not valid as an attribute)
rename sort to untouched
Attribute-based mapping only gets you so far. It is good enough if you only need to change names and set simple properties.
The recommended approach is to use client.MapFluent().
For an example of how to set index_options, see
https://github.com/Mpdreamz/NEST/blob/master/src/Nest.Tests.Unit/Core/Map/FluentMappingFullExampleTests.cs#L129
To see how you can create your own multi_field mapping, see line 208:
https://github.com/Mpdreamz/NEST/blob/master/src/Nest.Tests.Unit/Core/Map/FluentMappingFullExampleTests.cs#L208
You can even combine both approaches:
client.MapFluent<MyType>(m=>m
.MapFromAttributes()
//Map what you can't with attributes here
);
client.Map() and client.MapFromAttributes() will most likely be removed at some point.

Resources