ElasticSearch Sorting using NEST 7

I'm using Elasticsearch 7.7 and NEST 7.7 and I'm trying to use the sort function, but I'm getting this error:
Type: illegal_argument_exception Reason: "Text fields are not optimized for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [name] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
The "caused by" entry of the error repeats the same illegal_argument_exception message.
This is my code:
public SearchResult Search(string searchQuery, int storeId, int pageNumber = 1, int pageSize = 10, IList<SearchFilter> requestFilter = null, decimal? priceFrom = null, decimal? priceTo = null, string sortBy = null)
{
    var queryContainer = new QueryContainer();
    var multiMatch = new QueryStringQuery
    {
        Fields = Infer.Field<ElasticIndexGroupProduct>(p => p.Name)
            .And(Infer.Field<ElasticIndexGroupProduct>(p => p.CategoryName))
            .And(Infer.Field<ElasticIndexGroupProduct>(p => p.VendorName))
            .And(Infer.Field<ElasticIndexGroupProduct>(p => p.AssociatedProducts.Select(x => x.Name)))
            .And(Infer.Field<ElasticIndexGroupProduct>(p => p.AssociatedProducts.Select(x => x.CategoryName)))
            .And(Infer.Field<ElasticIndexGroupProduct>(p => p.AssociatedProducts.Select(x => x.ManufacturerName)))
            .And(Infer.Field<ElasticIndexGroupProduct>(p => p.AssociatedProducts.Select(x => x.ShortDescription))),
        Boost = 1.1,
        Name = "named_query",
        Query = searchQuery,
        DefaultOperator = Operator.Or,
        Analyzer = "standard",
        QuoteAnalyzer = "keyword",
        AllowLeadingWildcard = true,
        MaximumDeterminizedStates = 2,
        Escape = true,
        FuzzyPrefixLength = 2,
        FuzzyMaxExpansions = 3,
        FuzzyRewrite = MultiTermQueryRewrite.ConstantScore,
        Rewrite = MultiTermQueryRewrite.ConstantScore,
        Fuzziness = Fuzziness.Auto,
        TieBreaker = 1,
        AnalyzeWildcard = true,
        MinimumShouldMatch = 2,
        QuoteFieldSuffix = "'",
        Lenient = true,
        AutoGenerateSynonymsPhraseQuery = false
    };
    queryContainer &= multiMatch;
    //sorting
    var sorts = new List<ISort>();
    switch (sortBy)
    {
        case "z-a":
            sorts.Add(new FieldSort { Field = Infer.Field<ElasticIndexGroupProduct>(p => p.Name), Order = SortOrder.Descending });
            sorts.Add(new FieldSort { Field = Infer.Field<ElasticIndexGroupProduct>(p => p.AssociatedProducts.Select(x => x.Name)), Order = SortOrder.Descending });
            break;
        default:
            sorts.Add(new FieldSort { Field = Infer.Field<ElasticIndexGroupProduct>(p => p.Name), Order = SortOrder.Ascending });
            sorts.Add(new FieldSort { Field = Infer.Field<ElasticIndexGroupProduct>(p => p.AssociatedProducts.Select(x => x.Name)), Order = SortOrder.Ascending });
            break;
    }
    var searchRequest = new SearchRequest<ElasticIndexGroupProduct>()
    {
        Profile = true,
        Query = queryContainer,
        From = (pageNumber - 1) * pageSize,
        Size = pageSize,
        Version = true,
        Sort = sorts
    };
    var searchResponse = _client.Search<ElasticIndexGroupProduct>(searchRequest);
    return GenerateSearchResult(searchQuery, searchResponse);
}

Found a solution: set fielddata=true on the text field named in the error message (Name in the class below) so that fielddata is loaded into memory by uninverting the inverted index.
[ElasticsearchType(RelationName = "searchproduct")]
public class ElasticIndexGroupProduct
{
    [Text(Fielddata = true)]
    public string Name { get; set; }
}

If you want to be able to perform full text search on Name as well as sort and aggregate on it, then you probably want to index it as a multi-field: a text field for full text search with a keyword sub-field for sorting and aggregations.
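As a rough sketch of the multi-field approach with NEST 7, the snippet below maps Name as text with a keyword sub-field and then sorts on that sub-field instead of enabling fielddata. The index name "products", the localhost URL, the sub-field name "keyword" and the search term are assumptions made for illustration, and the class is cut down to the single Name property.

```csharp
using System;
using System.Collections.Generic;
using Nest;

public class ElasticIndexGroupProduct
{
    public string Name { get; set; }
}

public static class MultiFieldSortSketch
{
    public static void Main()
    {
        // Assumption: a local single-node cluster and a hypothetical "products" index.
        var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
            .DefaultIndex("products");
        var client = new ElasticClient(settings);

        // Map Name as text (full text search) with a keyword sub-field (sorting, aggregations).
        client.Indices.Create("products", c => c
            .Map<ElasticIndexGroupProduct>(m => m
                .Properties(p => p
                    .Text(t => t
                        .Name(n => n.Name)
                        .Fields(f => f
                            .Keyword(k => k.Name("keyword"))
                        )
                    )
                )
            )
        );

        // Query the text field, but sort on the "name.keyword" sub-field.
        var searchRequest = new SearchRequest<ElasticIndexGroupProduct>
        {
            Query = new MatchQuery
            {
                Field = Infer.Field<ElasticIndexGroupProduct>(p => p.Name),
                Query = "shoes" // hypothetical search term
            },
            Sort = new List<ISort>
            {
                new FieldSort
                {
                    Field = Infer.Field<ElasticIndexGroupProduct>(p => p.Name.Suffix("keyword")),
                    Order = SortOrder.Ascending
                }
            }
        };
        var searchResponse = client.Search<ElasticIndexGroupProduct>(searchRequest);
        Console.WriteLine($"valid: {searchResponse.IsValid}");
    }
}
```

A mapping change like this only applies to a newly created index, so existing documents would need to be reindexed. Any other text field used for sorting, such as the Name of the associated products in the question, would presumably need the same kind of sub-field.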

Related

NEST API Default value for GeoShapes fields

We are using the following filter:
filters.Add(fq => fq
    .Term(t => t
        .Field(f => f.LocalityId)
        .Value(locationParams[2])) || fq
    .GeoShape(g => g
        .Field("locationShape")
        .Relation(GeoShapeRelation.Within)
        .IndexedShape(f => f
            .Id(searchCriteria.spLocationId)
            .Index(indexName)
            .Path("geometry")
        )
    )
);
However, if the geometry field is missing, Elasticsearch throws an exception.
Is there any way to avoid this, for example by using a default (null value) in the mapping?

It is not possible to avoid the exception in this case. Elasticsearch assumes that the parameters that the user provides to a pre-indexed shape are valid.
Ideally, the values supplied to the indexed shape should be constrained in a manner that prevents an end user from supplying invalid values. If that is unfeasible, you could run a bool query with filter clauses of exists query and ids query on the indexName index before adding the indexed shape geoshape query filter to the search request.
For example
private static void Main()
{
    var documentsIndex = "documents";
    var shapesIndex = "shapes";
    var host = "localhost";
    var pool = new SingleNodeConnectionPool(new Uri($"http://{host}:9200"));
    var settings = new ConnectionSettings(pool)
        .DefaultMappingFor<Document>(m => m.IndexName(documentsIndex))
        .DefaultMappingFor<Shape>(m => m.IndexName(shapesIndex));
    var client = new ElasticClient(settings);
    if (client.Indices.Exists(documentsIndex).Exists)
        client.Indices.Delete(documentsIndex);
    client.Indices.Create(documentsIndex, c => c
        .Map<Document>(m => m
            .AutoMap()
        )
    );
    if (client.Indices.Exists(shapesIndex).Exists)
        client.Indices.Delete(shapesIndex);
    client.Indices.Create(shapesIndex, c => c
        .Map<Shape>(m => m
            .AutoMap()
        )
    );
    client.Bulk(b => b
        .IndexMany(new [] {
            new Document
            {
                Id = 1,
                LocalityId = 1,
                LocationShape = GeoWKTReader.Read("POLYGON ((30 20, 20 15, 20 25, 30 20))")
            },
            new Document
            {
                Id = 2,
                LocalityId = 2
            },
        })
        .IndexMany(new []
        {
            new Shape
            {
                Id = 1,
                Geometry = GeoWKTReader.Read("POLYGON ((20 35, 10 30, 10 10, 30 5, 45 20, 20 35))")
            }
        })
        .Refresh(Refresh.WaitFor)
    );
    var shapeId = 1;
    var searchResponse = client.Search<Shape>(s => s
        .Size(0)
        .Query(q => +q
            .Ids(i => i.Values(shapeId)) && +q
            .Exists(e => e.Field("geometry"))
        )
    );
    Func<QueryContainerDescriptor<Document>, QueryContainer> geoShapeQuery = q => q;
    if (searchResponse.Total == 1)
        geoShapeQuery = q => +q
            .GeoShape(g => g
                .Field("locationShape")
                .Relation(GeoShapeRelation.Within)
                .IndexedShape(f => f
                    .Id(shapeId)
                    .Index(shapesIndex)
                    .Path("geometry")
                )
            );
    client.Search<Document>(s => s
        .Query(q => +q
            .Term(t => t
                .Field(f => f.LocalityId)
                .Value(2)
            ) || geoShapeQuery(q)
        )
    );
}
public class Document
{
    public int Id { get; set; }
    public int LocalityId { get; set; }
    public IGeoShape LocationShape { get; set; }
}
public class Shape
{
    public int Id { get; set; }
    public IGeoShape Geometry { get; set; }
}
If var shapeId = 1; is changed to var shapeId = 2; then the geoshape query is not added to the filter clauses when searching on the documents index.

NEST aggs collection is read only

I'm using the NEST client with the following syntax:
_server.Search<Document>(s => s.Index(_config.SearchIndex)
    .Query(q => q.MatchAll(p => p))
    .Aggregations(a => a
        .Terms("type", st => st
            .Field(p => p.Type)
        )));
However, I keep getting the following exception:
A first chance exception of type 'System.NotSupportedException' occurred in mscorlib.dll
Additional information: Collection is read-only.
It only seems to occur when I'm using aggregations. The Type field has the following mapping:
[Keyword(Name = "Type")]
public string Type { get; set; }

I would double check the versions of NEST and Elasticsearch.Net that you are using. I just tried the following example with Elasticsearch 5.1.2 and NEST 5.0.1 and don't see the issue
void Main()
{
    var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
    var defaultIndex = "default-index";
    var connectionSettings = new ConnectionSettings(pool)
        .DefaultIndex(defaultIndex);
    var client = new ElasticClient(connectionSettings);
    if (client.IndexExists(defaultIndex).Exists)
        client.DeleteIndex(defaultIndex);
    client.CreateIndex(defaultIndex, c => c
        .Mappings(m => m
            .Map<Document>(mm => mm.AutoMap())
        )
    );
    var documents = Enumerable.Range(1, 100).Select(i =>
        new Document
        {
            Type = $"Type {i % 5}"
        }
    );
    client.IndexMany(documents);
    client.Refresh(defaultIndex);
    var searchResponse = client.Search<Document>(s => s
        .Index(defaultIndex)
        .Query(q => q.MatchAll(p => p))
        .Aggregations(a => a
            .Terms("type", st => st
                .Field(p => p.Type)
            )
        )
    );
    foreach (var bucket in searchResponse.Aggs.Terms("type").Buckets)
        Console.WriteLine($"key: {bucket.Key}, count {bucket.DocCount}");
}
public class Document
{
    [Keyword(Name = "Type")]
    public string Type { get; set; }
}
this outputs
key: Type 0, count 20
key: Type 1, count 20
key: Type 2, count 20
key: Type 3, count 20
key: Type 4, count 20

Nest MultiMatch Field Boost

I'm trying to boost some fields over others in a multiMatch search.
Looking at the docs I see you can create a Field with boost by doing this
var titleField = Infer.Field<Page>(p => p.Title, 2);
I haven't been able to figure out how that translates to Fields though.
Something like this isn't right
var bodyField = Infer.Field<Page>(p => p.Body);
var titleField = Infer.Field<Page>(p => p.Title, 2);
var metaDescriptionField = Infer.Field<Page>(p => p.MetaDescription, 1.5);
var metaKeywordsField = Infer.Field<Page>(p => p.Keywords, 2);
MultiMatchQuery multiMatchQuery = new MultiMatchQuery()
{
Fields = Infer.Fields<Page>(bodyField, titleField, metaDescriptionField, metaKeywordsField),
Query = search.Term
};
Do I need to use string names for the fields, like
var titleFieldString = "Title^2";
and pass those into Infer.Fields?

You can use the strongly typed Infer.Field<T>(); there is an implicit conversion from Field to Fields, and additional fields can be added with .And(). Here's an example
void Main()
{
    var client = new ElasticClient();
    Fields bodyField = Infer.Field<Page>(p => p.Body);
    var titleField = Infer.Field<Page>(p => p.Title, 2);
    var metaDescriptionField = Infer.Field<Page>(p => p.MetaDescription, 1.5);
    var metaKeywordsField = Infer.Field<Page>(p => p.Keywords, 2);
    var searchRequest = new SearchRequest<Page>()
    {
        Query = new MultiMatchQuery()
        {
            Fields = bodyField
                .And(titleField)
                .And(metaDescriptionField)
                .And(metaKeywordsField),
            Query = "multi match search term"
        }
    };
    client.Search<Page>(searchRequest);
}
public class Page
{
    public string Body { get; set; }
    public string Title { get; set; }
    public string MetaDescription { get; set; }
    public string Keywords { get; set; }
}
this yields
{
  "query": {
    "multi_match": {
      "query": "multi match search term",
      "fields": [
        "body",
        "title^2",
        "metaDescription^1.5",
        "keywords^2"
      ]
    }
  }
}
You can also pass an array of Field which also implicitly converts to Fields
var searchRequest = new SearchRequest<Page>()
{
    Query = new MultiMatchQuery()
    {
        Fields = new[] {
            bodyField,
            titleField,
            metaDescriptionField,
            metaKeywordsField
        },
        Query = "multi match search term"
    }
};
As well as pass an array of strings
var searchRequest = new SearchRequest<Page>()
{
    Query = new MultiMatchQuery()
    {
        Fields = new[] {
            "body",
            "title^2",
            "metaDescription^1.5",
            "keywords^2"
        },
        Query = "multi match search term"
    }
};

Elasticsearch NEST client creating multi-field fields with completion

I am trying to create some completion suggesters on some of my fields. My document class looks like this:
[ElasticType(Name = "rawfiles", IdProperty = "guid")]
public class RAW
{
    [ElasticProperty(OmitNorms = true, Index = FieldIndexOption.NotAnalyzed, Type = FieldType.String, Store = true)]
    public string guid { get; set; }
    [ElasticProperty(OmitNorms = true, Index = FieldIndexOption.Analyzed, Type = FieldType.String, Store = true, IndexAnalyzer = "def_analyzer", SearchAnalyzer = "def_analyzer_search", AddSortField = true)]
    public string filename { get; set; }
    [ElasticProperty(OmitNorms = true, Index = FieldIndexOption.Analyzed, Type = FieldType.String, Store = true, IndexAnalyzer = "def_analyzer", SearchAnalyzer = "def_analyzer_search")]
    public List<string> tags { get { return new List<string>(); } }
}
And here is how I am trying to create the completion fields
public bool CreateMapping(ElasticClient client, string indexName)
{
    IIndicesResponse result = null;
    try
    {
        result = client.Map<RAW>(
            c => c.Index(indexName)
                .MapFromAttributes()
                .AllField(f => f.Enabled(false))
                .SourceField(s => s.Enabled())
                .Properties(p => p
                    .Completion(s => s.Name(n => n.tags.Suffix("comp"))
                        .IndexAnalyzer("standard")
                        .SearchAnalyzer("standard")
                        .MaxInputLength(20)
                        .Payloads()
                        .PreservePositionIncrements()
                        .PreserveSeparators())
                    .Completion(s2 => s2.Name(n => n.filename.Suffix("comp"))
                        .IndexAnalyzer("standard")
                        .SearchAnalyzer("standard")
                        .MaxInputLength(20)
                        .Payloads()
                        .PreservePositionIncrements()
                        .PreserveSeparators())
                )
        );
    }
    catch (Exception)
    {
    }
    return result != null && result.Acknowledged;
}
My problem is that this only creates a single completion field named "comp". I was under the impression that it would create two completion fields, one named filename.comp and the other named tags.comp.
I then tried the answer on this SO question, but that made matters worse, as my two fields were then mapped as completion fields only.
Just to be clear, I want to create a multi-field that has data, sort and completion fields, much like the one in this example.

This is how you can reproduce the auto-complete example from the article you linked.
My simple class (we are going to implement auto-complete on the Name property):
public class Document
{
public int Id { get; set; }
public string Name { get; set; }
}
To create a multi-field mapping in NEST, we have to define the mapping like this:
var indicesOperationResponse = client.CreateIndex(descriptor => descriptor
    .Index(indexName)
    .AddMapping<Document>(m => m
        .Properties(p => p.MultiField(mf => mf
            .Name(n => n.Name)
            .Fields(f => f
                .String(s => s.Name(n => n.Name).Index(FieldIndexOption.Analyzed))
                .String(s => s.Name(n => n.Name.Suffix("sortable")).Index(FieldIndexOption.NotAnalyzed))
                .String(s => s.Name(n => n.Name.Suffix("autocomplete")).IndexAnalyzer("shingle_analyzer"))))))
    .Analysis(a => a
        .Analyzers(b => b.Add("shingle_analyzer", new CustomAnalyzer
        {
            Tokenizer = "standard",
            Filter = new List<string> {"lowercase", "shingle_filter"}
        }))
        .TokenFilters(b => b.Add("shingle_filter", new ShingleTokenFilter
        {
            MinShingleSize = 2,
            MaxShingleSize = 5
        }))));
Let's index some documents:
client.Index(new Document {Id = 1, Name = "Tremors"});
client.Index(new Document { Id = 2, Name = "Tremors 2: Aftershocks" });
client.Index(new Document { Id = 3, Name = "Tremors 3: Back to Perfection" });
client.Index(new Document { Id = 4, Name = "Tremors 4: The Legend Begins" });
client.Index(new Document { Id = 5, Name = "True Blood" });
client.Index(new Document { Id = 6, Name = "Tron" });
client.Index(new Document { Id = 7, Name = "True Grit" });
client.Index(new Document { Id = 8, Name = "Land Before Time" });
client.Index(new Document { Id = 9, Name = "The Shining" });
client.Index(new Document { Id = 10, Name = "Good Burger" });
client.Refresh();
Now we are ready to write a prefix query :)
var searchResponse = client.Search<Document>(s => s
    .Query(q => q
        .Prefix("name.autocomplete", "tr"))
    .SortAscending(sort => sort.Name.Suffix("sortable")));
This query will get us
Tremors 2: Aftershocks
Tremors 3: Back to Perfection
Tremors 4: The Legend Begins
Tron
True Blood
True Grit
Hope this helps.
The NEST team recently prepared a great tutorial about NEST and Elasticsearch. It has a part about suggestions that should be really useful for you.

Using MinimumShouldMatch with terms query in elasticsearch

I am writing a query in NEST for Elasticsearch that matches against a list of countries. It currently matches whenever any of the countries in the list is present in ESCountryDescription (a list of countries). I only want to match when all of the countries in CountryList match ESCountryDescription. I believe that I need to use MinimumShouldMatch as in this example: http://www.elastic.co/guide/en/elasticsearch/reference/0.90/query-dsl-terms-query.html
a.Terms(t => t.ESCountryDescription, CountryList)
But I cannot find a way of adding MinimumShouldMatch into my query above.

You can apply the MinimumShouldMatch parameter in TermsDescriptor. Here is an example:
var lookingFor = new List<string> { "netherlands", "poland" };
var searchResponse = client.Search<IndexElement>(s => s
    .Query(q => q
        .TermsDescriptor(t => t
            .OnField(f => f.Countries)
            .MinimumShouldMatch("100%")
            .Terms(lookingFor))));
or
var lookingFor = new List<string> { "netherlands", "poland" };
var searchResponse = client.Search<IndexElement>(s => s
    .Query(q => q
        .TermsDescriptor(t => t
            .OnField(f => f.Countries)
            .MinimumShouldMatch(lookingFor.Count)
            .Terms(lookingFor))));
And this is the whole example
class Program
{
    public class IndexElement
    {
        public int Id { get; set; }
        [ElasticProperty(Index = FieldIndexOption.NotAnalyzed)]
        public List<string> Countries { get; set; }
    }
    static void Main(string[] args)
    {
        var indexName = "sampleindex";
        var uri = new Uri("http://localhost:9200");
        var settings = new ConnectionSettings(uri).SetDefaultIndex(indexName).EnableTrace(true);
        var client = new ElasticClient(settings);
        client.DeleteIndex(indexName);
        client.CreateIndex(
            descriptor =>
                descriptor.Index(indexName)
                    .AddMapping<IndexElement>(
                        m => m.MapFromAttributes()));
        client.Index(new IndexElement {Id = 1, Countries = new List<string> {"poland", "germany", "france"}});
        client.Index(new IndexElement {Id = 2, Countries = new List<string> {"poland", "france"}});
        client.Index(new IndexElement {Id = 3, Countries = new List<string> {"netherlands"}});
        client.Refresh();
        var lookingFor = new List<string> { "germany" };
        var searchResponse = client.Search<IndexElement>(s => s
            .Query(q => q
                .TermsDescriptor(t => t.OnField(f => f.Countries).MinimumShouldMatch("100%").Terms(lookingFor))));
    }
}
Regarding your problem:
For terms: "netherlands" you will get the document with Id 3
For terms: "poland" and "france" you will get the documents with Id 1 and 2
For terms: "germany" you will get the document with Id 1
For terms: "poland", "france" and "germany" you will get the document with Id 1
I hope this is what you were after.
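To make the "all terms must match" behaviour concrete, here is a small in-memory sketch using LINQ rather than Elasticsearch. It only illustrates the semantics that a 100% minimum-should-match is expected to have on the three sample documents; the MatchAll helper and the AllTermsSketch class are hypothetical names and not part of NEST.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class IndexElement
{
    public int Id { get; set; }
    public List<string> Countries { get; set; }
}

public static class AllTermsSketch
{
    // Hypothetical helper: ids of the documents whose Countries contain every requested term.
    public static List<int> MatchAll(IEnumerable<IndexElement> docs, IEnumerable<string> lookingFor)
    {
        var wanted = lookingFor.ToList();
        return docs
            .Where(d => wanted.All(term => d.Countries.Contains(term)))
            .Select(d => d.Id)
            .ToList();
    }

    // The same three documents that the full example indexes.
    public static List<IndexElement> SampleDocs() => new List<IndexElement>
    {
        new IndexElement { Id = 1, Countries = new List<string> { "poland", "germany", "france" } },
        new IndexElement { Id = 2, Countries = new List<string> { "poland", "france" } },
        new IndexElement { Id = 3, Countries = new List<string> { "netherlands" } }
    };

    public static void Main()
    {
        var docs = SampleDocs();
        Console.WriteLine(string.Join(",", MatchAll(docs, new[] { "netherlands" })));                 // 3
        Console.WriteLine(string.Join(",", MatchAll(docs, new[] { "poland", "france" })));            // 1,2
        Console.WriteLine(string.Join(",", MatchAll(docs, new[] { "germany" })));                     // 1
        Console.WriteLine(string.Join(",", MatchAll(docs, new[] { "poland", "france", "germany" }))); // 1
    }
}
```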

Instead of
.Query(q => q
    .Terms(t => t.ESCountryDescription, CountryList))
you can use
.Query(q => q
    .TermsDescriptor(td => td
        .OnField(t => t.ESCountryDescription)
        .MinimumShouldMatch(x)
        .Terms(CountryList)))
See the unit tests in the elasticsearch-net GitHub repository.
