How do I get keyword mapping working in NEST?

I'm on NEST v6.3.1 and Elasticsearch v6.4.2, and I can't get my field to be indexed as a keyword.
I've tried the attribute approach:
[Keyword]
public string Suburb { get; set; }
and the fluent approach:
client.CreateIndex(indexName, i => i
    .Mappings(ms => ms
        .Map<Listing>(m => m
            .Properties(ps => ps
                .Keyword(k => k
                    .Name(n => n.Suburb)))
            .AutoMap())
        .Map<Agent>(m => m
            .AutoMap())
        .Map<BuildingDetails>(m => m
            .AutoMap())
        .Map<LandDetails>(m => m
            .AutoMap())
    )
);
Both result in the same thing:
{
  "listings": {
    "mappings": {
      "listing": {
        "properties": {
          "suburb": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}
i.e. it doesn't match what I'm seeing here:
https://www.elastic.co/guide/en/elasticsearch/client/net-api/current/attribute-mapping.html
The same thing happens when I try to use [GeoPoint]. The type should be geo_point, but it's mapped as an object with float lat/lon fields:
"latLong": {
"properties": {
"lat": {
"type": "float"
},
"lon": {
"type": "float"
}
}
}
So I'm missing something, I'm just not sure what.
Any help?
Thanks

The index likely already exists, and an existing field mapping cannot be changed. Check .IsValid on the response from the create index call and, if it is invalid, take a look at the error and reason. You likely need to delete the index and create it again.
Also note that multiple type mappings in one index are not allowed in Elasticsearch 6.x, and the create index call will fail. Either create separate indices for the different types or, if the types have the same field structure and you wish to index/analyze them in the same way, consider introducing an additional discriminator field.
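Putting those together, a minimal sketch (reusing indexName, Listing and Suburb from the question; the exists check and delete are assumptions about your setup):

if (client.IndexExists(indexName).Exists)
    client.DeleteIndex(indexName); // an existing mapping can't be changed in place

// One type mapping per index in Elasticsearch 6.x.
var createResponse = client.CreateIndex(indexName, i => i
    .Mappings(ms => ms
        .Map<Listing>(m => m
            .AutoMap()
            .Properties(ps => ps
                .Keyword(k => k
                    .Name(n => n.Suburb))))));

// DebugInformation includes the server's error and reason when invalid.
if (!createResponse.IsValid)
    Console.WriteLine(createResponse.DebugInformation);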

Related

Elasticsearch how to specify Array of keyword with index false in mapping?

I am trying to map an array of "keyword" fields in an Elasticsearch mapping with index: "false". According to the ES docs there is no "array" type, so I was thinking of using the mapping below:
PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "arr_field": {
          "type": "keyword", "index": false
        }
      }
    }
  }
}
Is this the correct way or not?
Correct, there is no specific data type for arrays. If you want a field that stores an array of integers, all you need to do is define the field as type integer and, while indexing, always make sure that the value for that field is an array, even if it holds a single value.
For example:
PUT test
{
  "mappings": {
    "_doc": {
      "properties": {
        "intArray": {
          "type": "integer"
        }
      }
    }
  }
}

PUT test/_doc/1
{
  "intArray": [10, 12, 50]
}

PUT test/_doc/2
{
  "intArray": [7]
}
The same goes for the keyword data type, so what you are doing is right. All you need to take care of is that, while indexing a document, the value for arr_field is always an array.
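For reference, the same mapping can be sketched in NEST (a minimal sketch assuming NEST 6.x and a hypothetical Doc POCO, neither of which appears in the question):

// Hypothetical POCO; a C# array property maps to its element type in Elasticsearch.
public class Doc
{
    public string[] ArrField { get; set; }
}

client.CreateIndex("my_index", c => c
    .Mappings(m => m
        .Map<Doc>(mm => mm
            .Properties(p => p
                .Keyword(k => k
                    .Name(n => n.ArrField)
                    .Index(false)))))); // "index": false, i.e. not searchable

// When indexing, keep the value an array even for a single element.
client.Index(new Doc { ArrField = new[] { "one" } }, i => i.Index("my_index"));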

Elastic nest accessing sub property of text type

I have created a text property name and a sub-field words_count on name, and I want to run a range query on words_count. How can I access it in C# using NEST?
"mappings": {
"person": {
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
},
"words_count": {
"type": "token_count",
"analyzer": "standard"
},
"length": {
"type": "token_count",
"analyzer": "character_analyzer"
}
}
}
}
}
}
I have a length for name, but it comes from the C# string length. I want to access the words_count sub-field of name created in Elasticsearch.
C# code:
Func<QueryContainerDescriptor<MyType>, QueryContainer> query = m => m
    .Range(r => r
        .Field(f => f.name.words_count)
        .Relation(RangeRelation.Within)
        .GreaterThanOrEquals(10)
        .LessThanOrEquals(14));
I want a replacement for f.name.words_count in NEST. Do I need to create a class for name that has such a property?
You don't need to create a POCO property to map to a multi-field (also often referred to as fields or sub-fields).
Multi-fields exist so that a single input can be indexed in multiple different ways, which is very common for search use cases, for example, indexing a street address with multiple different types of analysis.
You can use the .Suffix(...) extension method to reference a multi-field:
Func<QueryContainerDescriptor<MyType>, QueryContainer> query = m => m
    .Range(r => r
        .Field(f => f.name.Suffix("words_count"))
        .Relation(RangeRelation.Within)
        .GreaterThanOrEquals(10)
        .LessThanOrEquals(14)
    );

ElasticSearch 5 Sort by Keyword Field Case Insensitive

We are using ElasticSearch 5.
I have a field city using a custom analyzer and the following mapping.
Analyzer
"analysis": {
  "analyzer": {
    "lowercase_analyzer": {
      "filter": [
        "standard",
        "lowercase",
        "trim"
      ],
      "type": "custom",
      "tokenizer": "keyword"
    }
  }
}
Mapping
"city": {
  "type": "text",
  "analyzer": "lowercase_analyzer"
}
I am doing this so that I can do a case-insensitive sort on the city field. Here is an example query that I am trying to run:
{
  "query": {
    "term": {
      "email": {
        "value": "some_email@test.com"
      }
    }
  },
  "sort": [
    {
      "city": {
        "order": "desc"
      }
    }
  ]
}
Here is the error I am getting:
"Fielddata is disabled on text fields by default. Set fielddata=true
on [city] in order to load fielddata in memory by uninverting the
inverted index. Note that this can however use significant memory."
I don't want to turn on fielddata and incur a performance hit in Elasticsearch. I would like to have a keyword field that is not case sensitive, so that I can perform more meaningful aggregations and sorts on it. Is there no way to do this?
Yes, there is a way to do this, using multi_fields.
In Elasticsearch 5.0 onwards, string field types were split into two separate types: text field types that are analyzed and can be used for search, and keyword field types that are not analyzed and are suited to sorting, aggregations and exact value matches.
With dynamic mapping in Elasticsearch 5.0 (i.e. letting Elasticsearch infer the type that a document property should be mapped to), a JSON string property is mapped to a text field type, with a sub-field "keyword" that is mapped as a keyword field type with the setting ignore_above: 256.
With NEST 5.x automapping, a string property on your POCO will be automapped in the same way that Elasticsearch dynamic mapping maps it, as above. For example, given the following document
public class Document
{
    public string Property { get; set; }
}
automapping it
var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
var defaultIndex = "default-index";
var connectionSettings = new ConnectionSettings(pool)
    .DefaultIndex(defaultIndex);

var client = new ElasticClient(connectionSettings);

client.CreateIndex(defaultIndex, c => c
    .Mappings(m => m
        .Map<Document>(mm => mm
            .AutoMap()
        )
    )
);
produces
{
  "mappings": {
    "document": {
      "properties": {
        "property": {
          "fields": {
            "keyword": {
              "ignore_above": 256,
              "type": "keyword"
            }
          },
          "type": "text"
        }
      }
    }
  }
}
You can now use property for sorting with Field(f => f.Property.Suffix("keyword")). Take a look at Field Inference for more examples.
keyword field types have doc_values enabled by default, which means that a columnar data structure is built at index time and this is what provides efficient sorting and aggregations.
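For example, a sort on the keyword sub-field might look like this (a minimal sketch using the Document POCO and client from above; the match_all query is just a placeholder):

// Sort on the "keyword" sub-field, which is backed by doc_values rather than fielddata.
var searchResponse = client.Search<Document>(s => s
    .Query(q => q.MatchAll())
    .Sort(so => so
        .Descending(f => f.Property.Suffix("keyword"))));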
To add a custom analyzer at index creation time, we can automap as before, but then provide overrides with .Properties() for the fields whose mapping we want to control:
client.CreateIndex(defaultIndex, c => c
    .Settings(s => s
        .Analysis(a => a
            .Analyzers(aa => aa
                .Custom("lowercase_analyzer", ca => ca
                    .Tokenizer("keyword")
                    .Filters(
                        "standard",
                        "lowercase",
                        "trim"
                    )
                )
            )
        )
    )
    .Mappings(m => m
        .Map<Document>(mm => mm
            .AutoMap()
            .Properties(p => p
                .Text(t => t
                    .Name(n => n.Property)
                    .Analyzer("lowercase_analyzer")
                    .Fields(f => f
                        .Keyword(k => k
                            .Name("keyword")
                            .IgnoreAbove(256)
                        )
                    )
                )
            )
        )
    )
);
which produces
{
  "settings": {
    "analysis": {
      "analyzer": {
        "lowercase_analyzer": {
          "type": "custom",
          "filter": [
            "standard",
            "lowercase",
            "trim"
          ],
          "tokenizer": "keyword"
        }
      }
    }
  },
  "mappings": {
    "document": {
      "properties": {
        "property": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          },
          "analyzer": "lowercase_analyzer"
        }
      }
    }
  }
}

How to specify or target a field from a specific document type in queries or filters in Elasticsearch?

Given:
Documents of two different types, let's say 'product' and 'category', are indexed to the same Elasticsearch index.
Both document types have a field 'tags'.
Problem:
I want to build a query that returns results of both types, but the documents of type 'product' are allowed to have tags 'X' and 'Y', and the documents of type 'category' are only allowed to have tag 'Z'. How can I achieve this? It appears I can't use product.tags and category.tags, since then ES will look for a product/category field on the documents, which is not what I intend.
Note:
While for the example above there might be some kind of workaround, I'm looking for a general way to target or specify fields of a specific document type when writing queries. I basically want to 'namespace' the field names used in my query so only documents of the type I want to work with are considered.
I think field aliasing would be the best answer for you, but it's not possible. Instead you can use "copy_to", although it probably increases the index size:
DELETE /test

PUT /test
{
  "mappings": {
    "product": {
      "properties": {
        "tags": { "type": "string", "copy_to": "ptags" },
        "ptags": { "type": "string" }
      }
    },
    "category": {
      "properties": {
        "tags": { "type": "string", "copy_to": "ctags" },
        "ctags": { "type": "string" }
      }
    }
  }
}
PUT /test/product/1
{ "tags":"X" }
PUT /test/product/2
{ "tags":"Y" }
PUT /test/category/1
{ "tags":"Z" }
And you can query one of the fields, or several of them:
GET /test/product,category/_search
{
  "query": {
    "term": {
      "ptags": {
        "value": "x"
      }
    }
  }
}

GET /test/product,category/_search
{
  "query": {
    "multi_match": {
      "query": "x",
      "fields": [ "ctags", "ptags" ]
    }
  }
}
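For reference, the copy_to mapping can also be expressed in NEST. A minimal sketch for a single type (the Product POCO is hypothetical, and the "string" type from the answer above is written here as text, its 5.x+ equivalent):

// Hypothetical POCO for the product type.
public class Product
{
    public string Tags { get; set; }
}

client.CreateIndex("test", c => c
    .Mappings(m => m
        .Map<Product>(mm => mm
            .Properties(p => p
                .Text(t => t
                    .Name(n => n.Tags)
                    .CopyTo(ct => ct.Field("ptags"))) // tag values are also indexed into ptags
                .Text(t => t.Name("ptags"))))));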

ElasticSearch - issue with sub term aggregation with array fields

I have the following two documents:
{
  "title": "The Avengers",
  "year": 2012,
  "casting": [
    {
      "name": "Robert Downey Jr.",
      "category": "Actor"
    },
    {
      "name": "Chris Evans",
      "category": "Actor"
    }
  ]
}
and:
{
  "title": "The Judge",
  "year": 2014,
  "casting": [
    {
      "name": "Robert Downey Jr.",
      "category": "Producer"
    },
    {
      "name": "Robert Duvall",
      "category": "Actor"
    }
  ]
}
I would like to perform aggregations based on two fields: casting.name and casting.category.
I tried a TermsAggregation based on the casting.name field, with a sub-aggregation, which is another TermsAggregation based on the casting.category field.
The problem is that for the "Chris Evans" entry, Elasticsearch sets buckets for ALL categories (Actor, Producer), whereas it should set only one bucket (Actor).
It seems that there is a Cartesian product between all casting.category occurrences and all casting.name occurrences.
It behaves like this with array fields (casting), whereas I don't have the problem with simple fields (such as title or year).
I also tried to use nested aggregations, but maybe not properly, and Elasticsearch throws an error telling me that casting.category is not a nested field.
Any idea here?
Elasticsearch will flatten the nested objects, so internally you will get:
{
  "title": "The Judge",
  "year": 2014,
  "casting.name": ["Robert Downey Jr.", "Robert Duvall"],
  "casting.category": ["Producer", "Actor"]
}
If you want to keep the relationship, you'll need to use either nested objects or a parent/child relationship.
To do a nested mapping, you'd need to do something like this:
"mappings": {
"movies": {
"properties": {
"title" : { "type": "string" },
"year" : { "type": "integer" },
"casting": {
"type": "nested",
"properties": {
"name": { "type": "string" },
"category": { "type": "string" }
}
}
}
}
}
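With the nested mapping in place, the two-level terms aggregation can be sketched in NEST like this (the Movie and CastMember POCOs are hypothetical, and the .Suffix("keyword") calls assume keyword sub-fields exist on your version; adjust to your actual mapping):

// Hypothetical POCOs mirroring the mapping above
// (requires using System.Collections.Generic and System.Linq).
public class Movie
{
    public string Title { get; set; }
    public int Year { get; set; }
    public List<CastMember> Casting { get; set; }
}

public class CastMember
{
    public string Name { get; set; }
    public string Category { get; set; }
}

// A nested aggregation keeps each name paired with its own categories,
// avoiding the Cartesian product seen with flattened object arrays.
var response = client.Search<Movie>(s => s
    .Size(0)
    .Aggregations(a => a
        .Nested("casting", n => n
            .Path(p => p.Casting)
            .Aggregations(na => na
                .Terms("names", t => t
                    .Field(f => f.Casting.First().Name.Suffix("keyword"))
                    .Aggregations(ta => ta
                        .Terms("categories", tt => tt
                            .Field(f => f.Casting.First().Category.Suffix("keyword")))))))));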
