Dynamic field list for MultiMatch - Nest - elasticsearch

We have a requirement to have a search for a document type with a variable/dynamic number of fields being queried against. For one search/type it might be Name and Status. For another, the Description field. The fields to be searched against will be chosen by the user at run time.
Doing this statically appears easy. For example, the following searches the Name and Description fields (assume that rootQuery is a valid SearchDescriptor ready for the query):
rootQuery.Query(q => q.MultiMatch(mm => mm.Query(filter.Value.ToString()).Fields(f => f.Field(ff => ff.Name).Field(ff => ff.Description))));
However, we don't want to have a library of static queries to handle the potential permutations if possible. We'd rather do something dynamic like:
foreach (var field in fieldsFromUser) // the string list of fields chosen by the user
{
    rootQuery.Query(q => q.MultiMatch(mm => mm.Query(filter.Value.ToString()).Fields(f => f.Field(field))));
}
Is this possible? If so, how?

You can pass the string list of fields directly to .Fields(...):
var searchResponse = client.Search<Document>(s => s
.Query(q => q
.MultiMatch(mm => mm
.Query("query")
.Fields(new string[] { "field1", "field2", "field3" })
)
)
);
which yields
{
"query": {
"multi_match": {
"fields": ["field1", "field2", "field3"],
"query": "query"
}
}
}
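As a language-neutral illustration of the same idea, here is a minimal sketch that builds the request body above from a list of field names supplied at run time. It is written in plain Python so it runs without a client library; the helper name multi_match_body and its clean-up rules (trimming, dropping blanks and duplicates) are assumptions made for this example, not part of NEST or Elasticsearch.

```python
import json

def multi_match_body(query_text, fields):
    """Build a multi_match request body for a runtime-chosen list of field names."""
    # Trim each name, then drop blanks and duplicates while keeping the user's order.
    cleaned = list(dict.fromkeys(f.strip() for f in fields if f and f.strip()))
    if not cleaned:
        raise ValueError("at least one field name is required")
    return {"query": {"multi_match": {"query": query_text, "fields": cleaned}}}

body = multi_match_body("query", ["field1", "field2", "field3"])
print(json.dumps(body, indent=2))
```

The point is that the field list is just data: whatever the user selects can be validated and handed over as a single array, so no per-permutation query is needed.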

Related

Trying to filter some Elasticsearch results where the field might not exist

I have some data and I'm trying to add an extra filter that will exclude any results where foo.IsMarried == true.
Now, there's heaps of documents that don't have this field. If the field doesn't exist, then I'm assuming that the value is foo.IsMarried = false .. so those documents will be included in the result set.
Can anyone provide any clues, please?
I'm also using the .NET 'NEST' nuget client library - so I'll be really appreciative if the answer could be targeting that, but just happy with any answer, really.
Generally, within Elasticsearch, a missing boolean field does not mean that its value is false. It could simply be that no value was stored for it.
But, based on the assumption you are making in this case - we can check if the field foo.isMarried is explicitly false OR it does not exist in the document itself.
The query presented by Rahul in the other answer does the job. However since you wanted a NEST version of the same, the query can be constructed using the below snippet of code.
// Notice the use of not exists here. If you do not want to check for the 'false' value,
// you can omit the first term filter here. 'T' is the type to which you are mapping your index.
// You should pass the field based on the structure of 'T'.
private static QueryContainer BuildNotExistsQuery()
{
    // Return the bool query so the caller can pass it to .Query(...).
    return new QueryContainerDescriptor<T>().Bool(
        b => b.Should(
            s => s.Term(t => t.Field(f => f.foo.IsMarried).Value(false)),
            s => !s.Exists(ne => ne.Field(f => f.foo.IsMarried))
        )
    );
}
You can trigger the search through the NEST client within your project as shown below.
var result = client.Search<T>(s => s
    .From(0)
    .Size(20)
    .Query(q => BuildNotExistsQuery())
    // other methods that you want to chain go here
);
You can use a should query with the following conditions:
IsMarried = false
must not exists IsMarried
POST test/person/
{"name": "p1", "IsMarried": false}
POST test/person/
{"name": "p2", "IsMarried": true}
POST test/person/
{"name": "p3"}
Raw DSL query
POST test/person/_search
{
"query": {
"bool": {
"should": [
{
"match": {
"IsMarried": false
}
},
{
"bool": {
"must_not": {
"exists": {
"field": "IsMarried"
}
}
}
}
]
}
}
}
I hope you can convert this raw DSL query to NEST!
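To make the intended behaviour concrete, here is a small sketch that checks the three sample documents above against the two should clauses locally. The function keep is a hypothetical stand-in for what the query is meant to select, not Elasticsearch itself; it treats a null value like a missing field, which is how the exists query behaves.

```python
def keep(doc, field="IsMarried"):
    """Local stand-in for the two should clauses: explicit false, or no value at all."""
    value = doc.get(field)
    return value is None or value is False

docs = [
    {"name": "p1", "IsMarried": False},
    {"name": "p2", "IsMarried": True},
    {"name": "p3"},
]

kept = [d["name"] for d in docs if keep(d)]
print(kept)  # ['p1', 'p3']
```

Only p2, the document with an explicit true, is left out, which is the result the question asks for.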

Return five following documents from an id with Elasticsearch and NEST

I think I have blinded myself staring at an error over and over again and could really use some input. I have a time-series set of documents. Now I want to find the five documents following a specific id. I start by fetching that single document. Then fetching the following five documents without this id:
var documents = client.Search<Document>(s => s
.Query(q => q
.ConstantScore(cs => cs
.Filter(f => f
.Bool(b => b
.Must(must => must
.DateRange(dr => dr.Field(field => field.Time).GreaterThanOrEquals(startDoc.Time)))
.MustNot(mustNot => mustNot
.Term(term => term.Id, startDoc.Id))
))))
.Take(5)
.Sort(sort => sort.Ascending(asc => asc.Time))).Documents;
My problem is that while 5 documents are returned and sorted correctly, the start document is in the returned data. I'm trying to filter it out with the must_not filter, but it doesn't seem to be working. I'm pretty sure I have done this in other places, so it might be a small issue that I simply cannot see :)
Here's the query generated by NEST:
{
"query":{
"constant_score":{
"filter":{
"bool":{
"must":[
{
"range":{
"time":{
"gte":"2020-08-31T10:47:12.2472849Z"
}
}
}
],
"must_not":[
{
"term":{
"id":{
"value":"982DBC1BE9A24F0E"
}
}
}
]
}
}
}
},
"size":5,
"sort":[
{
"time":{
"order":"asc"
}
}
]
}
This could be happening because the id field might be an analyzed field. Analyzed fields are tokenized, so a term filter on the original value may not match what was actually indexed. Using the non-analyzed version of the field for exact matching (you mentioned in the comments that you have one) in your filter will fix the difference you are seeing.
More about analyzed vs non-analyzed fields here
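The effect can be illustrated with a rough sketch. The analyzer below is a deliberately simplified stand-in for the default analyzer (split on non-alphanumeric characters, then lowercase); both function names are hypothetical and exist only for this example. A term lookup compares the value exactly as given against the indexed tokens.

```python
import re

def simplified_analyze(text):
    """Rough stand-in for the default analyzer: split on non-alphanumerics, lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]

def term_matches(indexed_tokens, term_value):
    """A term lookup uses the value as-is, with no analysis applied to it."""
    return term_value in indexed_tokens

doc_id = "982DBC1BE9A24F0E"
analyzed_tokens = simplified_analyze(doc_id)  # ['982dbc1be9a24f0e']
not_analyzed_tokens = [doc_id]                # stored unchanged

print(term_matches(analyzed_tokens, doc_id))      # False
print(term_matches(not_analyzed_tokens, doc_id))  # True
```

Under these assumptions the upper-case id never matches the lower-cased token, so the must_not clause excludes nothing and the start document stays in the results.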

NEST 2.0 with Elasticsearch for GeoDistance always returns all records

I have the below code using C# .NET 4.5 and NEST 2.0 via nuget. This query always returns my type 'trackpointes' with the total number of documents with this distance search code. I have 2,790 documents and the count return is just that. Even for 1 centimeter as the distance unit it returns all 2,790 documents. My type of 'trackpointes' has a location field, type of geo_point, geohash true, and geohash_precision of 9.
I am just trying to filter results based on distance without any other search terms and for my 2,790 records it returns them all regardless of the unit of measurement. So I have to be missing something (hopefully small). Any help is appreciated. The NEST examples I can find are a year or two old and that syntax does not seem to work any more.
double distance = 4.0;
var geoResult = client.Search<TrackPointES>(s => s.From(0).Size(10000).Type("trackpointes")
.Query(query => query
.Bool( b => b.Filter(filter => filter
.GeoDistance(geo => geo
.Distance(distance, Nest.DistanceUnit.Kilometers).Location(35, -82)))
)
)
);
If I use POSTMAN to connect to my instance of ES and POST a search w/ the below JSON, I get a return of 143 total documents out of 2,790. So I know the data is right as that is a realistic return.
{
"query" : {
"filtered" : {
"filter" : {
"geo_distance" : {
"distance" : "4km",
"location" : {
"top_left": {
"lat" : 35,
"lon" : -82
}
}
}
}
}
}
}
It looks like you didn't specify the field in your query. Try this one:
var geoResult = client.Search<Document>(s => s.From(0).Size(10000)
.Query(query => query
.Bool(b => b.Filter(filter => filter
.GeoDistance(geo => geo
.Field(f => f.Location) //<- this
.Distance(distance, Nest.DistanceUnit.Kilometers).Location(35, -82)))
)
)
);
I forgot to specify the field to search for the location. :( But I am posting here just in case someone else has the same issue and to shame myself into trying harder...
.Field(p => p.location) was the difference in the query.
var geoResult = client.Search<TrackPointES>(s => s.From(0).Size(10000).Type("trackpointes")
.Query(query => query
.Bool( b => b.Filter(filter => filter
.GeoDistance(geo => geo.Field(p => p.location).DistanceType(Nest.GeoDistanceType.SloppyArc)
.Distance(distance, Nest.DistanceUnit.Kilometers).Location(35, -82)))
)
)
);
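For comparison with the raw DSL, here is a minimal sketch of how the field name ends up in a geo_distance filter body: it is the key that carries the point. The helper geo_distance_filter is hypothetical, and refusing an empty field name is a choice made for this example so the omission is caught early.

```python
def geo_distance_filter(field, lat, lon, distance_km):
    """Build a geo_distance filter body keyed by the geo_point field name."""
    if not field:
        raise ValueError("geo_distance needs the name of the geo_point field")
    return {
        "geo_distance": {
            "distance": f"{distance_km:g}km",  # 4.0 is rendered as "4km"
            field: {"lat": lat, "lon": lon},
        }
    }

geo_filter = geo_distance_filter("location", 35, -82, 4.0)
print(geo_filter)
```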

ElasticSearch NEST Query

I'm trying to mimic a query that I wrote in Sense (chrome plugin) using NEST in C#. I can't figure out what the difference between the two queries is. The Sense query returns records while the nest query does not. The queries are as follows:
var searchResults = client.Search<File>(s => s.Query(q => q.Term(p => p.fileContents, "int")));
and
{
"query": {
"term": {
"fileContents": {
"value": "int"
}
}
}
}
What is the difference between these two queries? Why would one return records and the other not?
You can find out what query NEST uses with the following code:
var json = System.Text.Encoding.UTF8.GetString(searchResults.RequestInformation.Request);
Then you can compare the output.
I prefer this slightly simpler version, which I usually just type in .NET Immediate window:
searchResults.ConnectionStatus;
Besides being shorter, it also gives the url, which can be quite helpful.
? searchResults.ConnectionStatus;
{StatusCode: 200,
Method: POST,
Url: http://localhost:9200/_all/filecontent/_search,
Request: {
"query": {
"term": {
"fileContents": {
"value": "int"
}
}
}
}
Try this:
var searchResults2 = client.Search<File>(s => s
.Query(q => q
.Term(p => p.Field(r => r.fileContents).Value("int")
)
));
Followup:
RequestInformation is not available in newer versions of NEST.
I'd suggest breaking your code down into steps (don't build queries directly inside the client.Search() method).
client.Search() takes Func<SearchDescriptor<T>, ISearchRequest> as input (parameter).
My answer from a similar post:
SearchDescriptor<T> sd = new SearchDescriptor<T>()
.From(0).Size(100)
.Query(q => q
.Bool(t => t
.Must(u => u
.Bool(v => v
.Should(
...
)
)
)
)
);
And got the serialized JSON like this:
{
"from": 0,
"size": 100,
"query": {
"bool": {
"must": [
{
"bool": {
"should": [
...
]
}
}
]
}
}
}
It was annoying, NEST library should have something that spits out the JSON from request. However this worked for me:
using (MemoryStream mStream = new MemoryStream()) {
client.Serializer.Serialize(sd, mStream);
Console.WriteLine(Encoding.UTF8.GetString(mStream.ToArray())); // UTF-8 keeps non-ASCII characters in the JSON intact
}
NEST library version: 2.0.0.0.
Newer version may have an easier method to get this (Hopefully).
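Once the request JSON has been captured by any of the approaches above, comparing it with the query typed into Sense can be done mechanically. The sketch below is one way to do that in plain Python; canonical and same_request are hypothetical helper names. Keys are sorted so that ordering and whitespace differences do not count as mismatches.

```python
import json

def canonical(body):
    """Accept a JSON string or a parsed dict and return a key-sorted, compact string."""
    parsed = json.loads(body) if isinstance(body, str) else body
    return json.dumps(parsed, sort_keys=True, separators=(",", ":"))

def same_request(a, b):
    """True when both bodies describe the same JSON document."""
    return canonical(a) == canonical(b)

sense_query = '{"query": {"term": {"fileContents": {"value": "int"}}}}'
nest_query = {"query": {"term": {"fileContents": {"value": "int"}}}}

print(same_request(sense_query, nest_query))  # True
```

If the two bodies turn out to be identical, the request URL (index and type) is the next thing worth comparing.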

elasticsearch nest support of filters in functionscore function

I am currently trying to implement a "function_score" query in NEST, with functions that are only applied when a filter matches.
It doesn't look like FunctionScoreFunctionsDescriptor supports adding a filter yet. Is this functionality going to be added any time soon?
Here's a super basic example of what I'd like to be able to implement:
Runs an ES query, with basic scores
Goes through a list of functions, and adds to it the first score where the filter matches
"function_score": {
"query": {...}, // base ES query
"functions": [
{
"filter": {...},
"script_score": {"script": "25"}
},
{
"filter": {...},
"script_score": {"script": "15"}
}
],
"score_mode": "first", // take the first script_score where the filter matches
"boost_mode": "sum" // and add this to the base ES query score
}
I am currently using Elasticsearch v1.1.0, and NEST v1.0.0-beta1 prerelease.
Thanks!
It's already implemented:
_client.Search<ElasticsearchProject>(s =>
s.Query(q=>q
.FunctionScore(fs=>fs.Functions(
f=>f
.ScriptScore(ss=>ss.Script("25"))
.Filter(ff=>ff.Term(t=>t.Country, "A")),
f=> f
.ScriptScore(ss=>ss.Script("15"))
.Filter(ff=>ff.Term("a","b")))
.ScoreMode(FunctionScoreMode.first)
.BoostMode(FunctionBoostMode.sum))));
Udi's answer didn't work for me. It seems that in the newer version (v2.3, C#) there is no Filter() method on the ScoreFunctionsDescriptor class.
But I found a solution. You can provide an array of IScoreFunction. To do that you can use new FunctionScoreFunction() or use my helper class:
class CustomFunctionScore<T> : FunctionScoreFunction
where T: class
{
public CustomFunctionScore(Func<QueryContainerDescriptor<T>, QueryContainer> selector, double? weight = null)
{
this.Filter = selector.Invoke(new QueryContainerDescriptor<T>());
this.Weight = weight;
}
}
With this class, filter can be applied this way (this is just an example):
SearchDescriptor<BlobPost> searchDescriptor = new SearchDescriptor<BlobPost>()
.Query(qr => qr
.FunctionScore(fs => fs
.Query(q => q.Bool(b => b.Should(s => s.Match(a => a.Field(f => f.FirstName).Query("john")))))
.ScoreMode(FunctionScoreMode.Max)
.BoostMode(FunctionBoostMode.Sum)
.Functions(
new[]
{
new CustomFunctionScore<BlobPost>(q => q.Match(a => a.Field(f => f.Id).Query("my_id")), 10),
new CustomFunctionScore<BlobPost>(q => q.Match(a => a.Field(f => f.FirstName).Query("john")), 10),
}
)
)
);
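For reference, the combination asked about in the original question (score_mode "first" with boost_mode "sum") can be sketched as plain arithmetic. This is an illustration of the intended scoring only, not Elasticsearch code: first_match_sum is a hypothetical helper, the filters are ordinary Python predicates, and the case where no filter matches is deliberately not modelled.

```python
def first_match_sum(base_score, doc, functions):
    """score_mode=first, boost_mode=sum: add the first matching function's score."""
    for accepts, score in functions:
        if accepts(doc):
            return base_score + score
    return None  # no filter matched: left out of this sketch

functions = [
    (lambda d: d.get("country") == "A", 25),
    (lambda d: d.get("a") == "b", 15),
]

print(first_match_sum(1.5, {"country": "A", "a": "b"}, functions))  # 26.5
print(first_match_sum(1.5, {"a": "b"}, functions))                  # 16.5
```

The first document matches both filters, but only the first function in the list contributes, which is what distinguishes "first" from modes that combine every matching function.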

Resources