Conditional Elasticsearch Filter for Fuzzy Search in NEST - elasticsearch

I have a working query against my index, but I am having difficulty getting a conditional filter to work.
The working query is as follows:
var firstSearchResponse = client.Search<IndexedUser>(s => s
    .Query(q =>
        (q
            .Match(m => m
                .Field(f => f.FirstName)
                .Query(term)
                .Fuzziness(Fuzziness.EditDistance(2))
            )
        ||
        q.Match(m => m
            .Field(f => f.LastName)
            .Query(term)
            .Fuzziness(Fuzziness.EditDistance(2))
        ))
        &&
        (
            q.Match(m => m
                .Field(f => f.IsSearchable)
                .Query("true")
            )
        )
    )
);
Now I would like to add filtering: if a user passes in a city, a state, or both, the query should filter on each supplied field and return only the matching results.
For example, if I pass in a city of New York, only results in New York should be returned.
How can I do this?
I tried using PostFilter, but it doesn't seem to work when I pass in a value:
var filters = new List<Func<QueryContainerDescriptor<IndexedUser>, QueryContainer>>();
// If city was provided, then search on that
if (!String.IsNullOrEmpty(city))
    filters.Add(fq => fq.Terms(t => t.Field(f => f.City).Terms(city)));
// If state was provided, then search on that
if (!String.IsNullOrEmpty(state))
    filters.Add(fq => fq.Terms(t => t.Field(f => f.State).Terms(state)));
and then at the end of my query, adding this:
var firstSearchResponse = client.Search<IndexedUser>(s => s
    .Query(q =>
        (q
            .Match(m => m
                .Field(f => f.FirstName)
                .Query(term)
                .Fuzziness(Fuzziness.EditDistance(2))
            )
        ||
        q.Match(m => m
            .Field(f => f.LastName)
            .Query(term)
            .Fuzziness(Fuzziness.EditDistance(2))
        ))
        &&
        (
            q.Match(m => m
                .Field(f => f.IsSearchable)
                .Query("true")
            )
        )
    ).PostFilter(p => p.Bool(q => q.Filter(filters))));
If I add the PostFilter, the query works unless I actually pass in a value for city or state.
Am I misusing PostFilter, or is there a logic flaw above?
Thank you so much for any help/guidance!
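One possible cause, assuming City and State are mapped as analyzed text fields: a terms query is not analyzed, so the value New York is compared verbatim against the indexed tokens (new and york with the standard analyzer) and never matches. The following is a minimal sketch of an alternative, not a confirmed fix. It keeps the conditional filters list from the question, but builds each entry as a match query and places the list in the filter clause of a single bool query instead of PostFilter. The UserSearch class, its Search method, and the IndexedUser shape are hypothetical, and the calls assume NEST 2.x or later.

```csharp
using System;
using System.Collections.Generic;
using Nest;

// Hypothetical document type; only the properties used in the question.
public class IndexedUser
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string City { get; set; }
    public string State { get; set; }
    public bool IsSearchable { get; set; }
}

public static class UserSearch
{
    public static ISearchResponse<IndexedUser> Search(
        IElasticClient client, string term, string city, string state)
    {
        // Filters do not affect scoring; IsSearchable is always required.
        var filters = new List<Func<QueryContainerDescriptor<IndexedUser>, QueryContainer>>
        {
            fq => fq.Term(t => t.Field(f => f.IsSearchable).Value(true))
        };

        // Assumption: City and State are analyzed text fields, so a match
        // query (which analyzes its input) is used instead of terms.
        if (!string.IsNullOrEmpty(city))
            filters.Add(fq => fq.Match(m => m.Field(f => f.City).Query(city).Operator(Operator.And)));
        if (!string.IsNullOrEmpty(state))
            filters.Add(fq => fq.Match(m => m.Field(f => f.State).Query(state).Operator(Operator.And)));

        return client.Search<IndexedUser>(s => s
            .Query(q => q
                .Bool(b => b
                    .Should(
                        sh => sh.Match(m => m.Field(f => f.FirstName).Query(term).Fuzziness(Fuzziness.EditDistance(2))),
                        sh => sh.Match(m => m.Field(f => f.LastName).Query(term).Fuzziness(Fuzziness.EditDistance(2))))
                    // At least one of the two name clauses must match.
                    .MinimumShouldMatch(1)
                    .Filter(filters))));
    }
}
```

If the fields are mapped as keyword (not analyzed) instead, an exact term query on the stored value would be the closer fit; checking the index mapping first settles which case applies.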

Related

How to construct Aggregation (Count) of records using NEST?

I have a requirement to perform an aggregation (count) of records using the NEST wrapper, firing the DSL query from inside NEST.
Since I don't know how to construct it properly, I have done the same using a LINQ approach:
// Variable renamed to agencySearchResponse so that it matches the usages below.
ISearchResponse<AgencyDetailReportModel> agencySearchResponse = ConnectionToESClient().Search<AgencyDetailReportModel>
    (s => s
        .Index("accountsdata")
        .From(0)
        .Size(15000)
        .Query(q =>
            q.MatchAll()
        )
    );
var allocatedAgencies = agencySearchResponse.Documents.Where(w => !string.IsNullOrEmpty(w.agencyid)).Count();
var unAllocatedAgencies = agencySearchResponse.Documents.Where(w => string.IsNullOrEmpty(w.agencyid)).Count();
How can I construct the DSL query inside NEST?
So you need the allocatedAgencies count and the unAllocatedAgencies count, right? We can achieve this with a simple count query rather than aggregations.
var searchResponse = await highLevelClient.CountAsync<accountsdata>(s => s
    .Index("accountsdata")
    .Query(q => q
        .ConstantScore(c => c
            .Filter(f => f
                .Bool(b => b
                    .MustNot(m => m
                        .Exists(e => e.Field("agencyid"))))))));
The query above returns the unAllocatedAgencies count. The query for the allocatedAgencies count is below.
var searchResponse = await highLevelClient.CountAsync<accountsdata>(s => s
    .Index("accountsdata")
    .Query(q => q
        .ConstantScore(c => c
            .Filter(f => f
                .Bool(b => b
                    .Must(m => m
                        .Exists(e => e.Field("agencyid"))))))));
Let me know if you face any issues; this will most likely work for the problem described above. Thanks
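If a single request with aggregations is preferred, as the question originally asked, both counts can be read from one search. This is a minimal sketch, assuming NEST 6.x or 7.x (where the response exposes Aggregations; earlier versions used Aggs) and the same exists-based definition of "allocated" as the count queries above, which does not treat empty strings as missing. The AgencyCounts class, its Get method, and the model shape are hypothetical.

```csharp
using Nest;

// Hypothetical model; only the field used by the counts.
public class AgencyDetailReportModel
{
    public string agencyid { get; set; }
}

public static class AgencyCounts
{
    // Returns the allocated and unallocated document counts from one search request.
    public static (long Allocated, long Unallocated) Get(IElasticClient client)
    {
        var response = client.Search<AgencyDetailReportModel>(s => s
            .Index("accountsdata")
            .Size(0) // only the aggregation results are needed, not the hits
            .Aggregations(a => a
                // Documents that have a value for agencyid.
                .Filter("allocated", f => f
                    .Filter(q => q.Exists(e => e.Field("agencyid"))))
                // Documents with no value for agencyid.
                .Missing("unallocated", m => m.Field("agencyid"))));

        return (response.Aggregations.Filter("allocated").DocCount,
                response.Aggregations.Missing("unallocated").DocCount);
    }
}
```

Compared with the LINQ approach, this avoids pulling 15000 documents to the client just to count them.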

How can I find the total hits for an Elastic NEST query?

In my application I have a query which limits the number of hits returned to 50, as follows:
var response = await client.SearchAsync<Episode>(s => s
    .Source(sf => sf
        .Includes(i => i
            .Fields(
                f => f.Title,
                f => f.PublishDate,
                f => f.PodcastTitle
            )
        )
        .Excludes(e => e
            .Fields(f => f.Description)
        )
    )
    .From(request.Skip)
    .Size(50)
    .Query(q => q
        .Term(t => t.Title, request.Search) || q
        .Match(mq => mq.Field(f => f.Description).Query(request.Search))));
I am interested in the total number of hits for the query (i.e. not limited to the size), so that I can deal with pagination on the front-end. Does anyone know how I can do this?
You are looking for the Total property on the search response object.
So in your particular case that will be response.Total.
For those working with queries that match more than 10,000 documents: Elasticsearch 7.0 and later counts total hits only up to 10,000 by default. To get the exact total, include .TrackTotalHits(true) in your query:
var resp = client.Search<yourmodel>(s => s
    .Index(yourindexname)
    .TrackTotalHits(true)
    .Query(q => q.MatchAll()));
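Once the total is known, the page count for the front-end follows from it and the page size. A minimal sketch in plain C#, where the Paging class and PageCount method are hypothetical names and totalHits stands in for response.Total:

```csharp
using System;

public static class Paging
{
    // Number of pages needed to show totalHits results at pageSize per page.
    public static long PageCount(long totalHits, int pageSize)
    {
        if (pageSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(pageSize));
        if (totalHits <= 0)
            return 0;
        // Integer ceiling division; avoids floating-point rounding.
        return (totalHits + pageSize - 1) / pageSize;
    }
}
```

For the query in the question this would be called as Paging.PageCount(response.Total, 50).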

Using which field matched in a multimatch query in a function score

I have a multi-match query which I am using across 5 fields. I am also using a function score to combine various factors into the score. I would like to add a factor so that results which matched on one particular field are boosted (adding a large number so that matches on this field always have the highest score).
I know that I can use highlighting to find out which fields were matched, but how can I access that information in the function score script?
Here's what I have so far (using NEST, but that shouldn't make a difference):
var searchResponse = client.Search<TopicCollection.Topic>(s => s
    .Query(q => q
        .FunctionScore(fs => fs
            .Name("function_score_query")
            .Query(q1 => q1
                .MultiMatch(c => c
                    .Fields(f => f
                        .Field(p => p.field1)
                        .Field(p => p.field2) //...etc
                    )
                    .Query(searchTerm)
                )
            )
            .Functions(fun => fun
                .ScriptScore(ss => ss.Script(sc => sc
                    .Inline(
                        //TODO: add 1000 to normalised _score if match is in field1
                    )))
            ).BoostMode(FunctionBoostMode.Replace)
        )
    ).Highlight(h => h
        .Fields(p => p.AllField())
    )
);

Querying NEST 2.0.0-rc1 Aggregation Buckets

Previously in NEST (for Elasticsearch 1.x), after an aggregation query, I had some code that went through and grouped up aggregations by going through all the buckets, similar to this:
var r = (from SingleBucket items1 in result.Aggregation.Values
select (Bucket) agg1
into items1Aggs
from col1 in items1Aggs.Items.Cast<KeyItem>()
....
select new x {}).ToList();
But it seems that SingleBucket now has to be SingleBucketAggregate, and the Bucket class used before is now an internal BucketAggregateData class that is no longer accessible. Is there any way around this?
Aggregation query being used:
var result = _client.Search<MetaStoreEntry>(s => s
.Aggregations(a => a
.Filter("fullGroupBy", k => fad
.Aggregations(e => e
.Terms("col1", t => t
.Field(f => f.col1)
.Aggregations(b => b
.Terms("col2", u => u
.Field(f => f.col2)
...
).Size(int.MaxValue).CollectMode(TermsAggregationCollectMode.DepthFirst)
) ...
);

How do I set similarity in NEST for Elasticsearch on a per-field basis

I have not been able to programmatically set the similarity on a field in Elasticsearch using NEST.
Here's an example of how I set up my index. It's within the multifield mapping that I'd like to set the similarity, so I can experiment with things like BM25 similarity
(see the props > multifield section below):
var createInd = client.CreateIndex("myindex", i =>
{
i
.Analysis(a => a.Analyzers(an => an
.Add("nameAnalyzer", nameAnalyzer)
)
.AddMapping<SearchData>(m => m
.MapFromAttributes()
.Properties(props =>
{
props
.MultiField(mf => mf
//title
.Name(s => s.Title)
.Fields(f => f
.String(s => s.Name(o => o.Title).Analyzer("nameAnalyzer"))
.String(s => s.Name(o => o.Title.Suffix("raw")).Index(FieldIndexOption.not_analyzed))
)
);
...
It was just recently made possible with this commit to set the similarity on a string field. You can now do this:
.String(s => s.Name(o => o.Title).Similarity("my_similarity"))
This assumes you have already added the similarity to your index. NEST is lacking a bit of flexibility at the moment for actually configuring similarities. Right now you have to use the CustomSimilaritySettings class. For example:
var bm25 = new CustomSimilaritySettings("my_similarity", "BM25");
bm25.SimilarityParameters.Add("k1", "2.0");
bm25.SimilarityParameters.Add("b", "0.75");
var settings = new IndexSettings();
settings.Similarity = new SimilaritySettings();
settings.Similarity.CustomSimilarities.Add(bm25);
client.CreateIndex("myindex", c => c.InitializeUsing(settings));
It would be nice to be able to do this via the fluent API when creating an index. I am considering sending a PR for this before the 1.0RC release.
