Elasticsearch NEST - Phrase search

What methods should I use so that my query returns hits containing at least 2 keywords from an input phrase?
For example, if the input is "hello friend", I want the results to contain documents where both "hello" and "friend" appear somewhere in the text. If the input is "hello good friend", I want results where at least 2 of the 3 keywords appear in the text, or at least the results with the best combinations should be on top.
If I use code like the one below, I get results containing "hello" or "friend", but not necessarily both.
var searchResults = client.Search<Thread>(s => s
.Type("threads")
.From(0)
.Size(100)
.Query(q => q
.Match(qs => qs
.OnField(p => p.Posttext)
.Query("hello friend")
)
)
.Highlight(h => h
.OnFields(
f => f.OnField("posttext").PreTags("<b>").PostTags("</b>").FragmentSize(150)
)
)
);
I can get the results I want with code like the following, but it is not flexible because the phrase can contain an arbitrary number of words.
var searchResults = client.Search<Thread>(s => s
.Type("threads")
.From(0)
.Size(100)
.Query(q => q
.Match(qs => qs
.OnField(p => p.Posttext)
.Query("hello")
)
&&
q.Match(qs => qs
.OnField(p => p.Posttext)
.Query("friend")
)
)
.Highlight(h => h
.OnFields(
f => f.OnField("posttext").PreTags("<b>").PostTags("</b>").FragmentSize(150)
)
)
);
I think I am missing something. Please help.
Thanks in advance.

You need to use a phrase query.
Within the match query, you need to specify the type as phrase.
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-query.html#query-dsl-match-query-phrase
If you go through the article above, I think you will find a direction for your question.
PS: My experience is with Elasticsearch for JavaScript, not NEST.

I found that adding .Operator(Operator.And) to the Match query works in my situation, but I need to investigate phrase search further.
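For the "2 of 3 keywords" case in the question, the match query's minimum_should_match parameter is one option: Operator.And requires every term, while minimum_should_match sets how many terms must match. Below is a minimal sketch that builds the raw request body in plain C#, without NEST, so the shape is visible. The class and method names are hypothetical, and the field name posttext is taken from the question. NEST's Match descriptor has a corresponding option, but its exact signature depends on the client version, so check it against the version you use.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class MatchQueryBody
{
    // Hypothetical helper: builds the JSON body of a match query in which
    // at least `minimumShouldMatch` of the analyzed terms have to match.
    public static string Build(string field, string text, int minimumShouldMatch)
    {
        var body = new Dictionary<string, object>
        {
            ["query"] = new Dictionary<string, object>
            {
                ["match"] = new Dictionary<string, object>
                {
                    [field] = new Dictionary<string, object>
                    {
                        ["query"] = text,
                        ["minimum_should_match"] = minimumShouldMatch
                    }
                }
            }
        };
        return JsonSerializer.Serialize(body);
    }
}
```

Calling MatchQueryBody.Build("posttext", "hello good friend", 2) produces a match query in which any two of the three terms are enough; documents that match all three typically score higher.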

Related

ElasticSearch - Search middle of words over multiple fields

I'm trying to retrieve documents that contain a search phrase, not necessarily at the start of a word, across multiple document fields.
For example, "ell" should match a document field containing "hello", and this should work on two fields.
I initially went with MultiMatch due to this SO answer. Here was my implementation:
QueryContainer &= Query<VeganItemEstablishmentSearchDto>.MultiMatch(c => c
.Fields(f => f.Field(p => p.VeganItem.Name).Field(v => v.VeganItem.CompanyName))
.Query(query)
.MaxExpansions(2)
.Slop(2)
.Name("named_query")
);
But I found that it would only match "hello" if my search phrase began at the start of the word; for example, "ello" would not match.
So I then changed to QueryString due to this SO answer. My implementation was:
QueryContainer &= Query<VeganItemEstablishmentSearchDto>.QueryString(c => c
.Fields(f => f.Field(p => p.VeganItem.Name).Field(v => v.VeganItem.CompanyName))
.Query(query)
.FuzzyMaxExpansions(2)
.Name("named_query")
);
But I found that was even worse. It didn't search multiple fields, only p.VeganItem.Name and still "ello" was not matching "hello".
How do I use Nest to search for a term that can be in the middle of a word and over multiple document fields?
Wildcard queries are expensive. If you want to control how many characters in the middle of a word can be searched, you can use the n-gram tokenizer instead; that would be less expensive and give you more customisation and flexibility.
I've also written a blog post on implementing the autocomplete and its various trade-offs with performance and functional requirements.
You will need to use a wildcard query for this scenario. For more information, see the Elasticsearch wildcard query documentation and the NEST wildcard query documentation.
To run a wildcard query in NEST, you can do something like this:
new QueryContainer[]
{
    // One wildcard clause per field to search
    Query<VeganItemEstablishmentSearchDto>.Wildcard(w => w
        .Field(v => v.VeganItem.CompanyName)
        .Value(query)),
    Query<VeganItemEstablishmentSearchDto>.Wildcard(w => w
        .Field(p => p.VeganItem.Name)
        .Value(query))
}
You should add an asterisk (*) at the beginning and end of your query.
Please keep in mind that wildcard queries are expensive; you might want to achieve this with a different analyzer in your mapping instead.
QueryString from this SO answer is what worked for me for multiple fields and the middle of a word. I have not tried Amit's answer yet. I will in the future. This is the quick solution for a beginner:
QueryContainer &= Query<VeganItemEstablishmentSearchDto>
.QueryString(c => c
.Name("named_query")
.Boost(1.1)
.Fields(f => f.Field(p => p.VeganItem.Name).Field(v => v.VeganItem.CompanyName))
.Query($"*{query}*")
.Rewrite(MultiTermQueryRewrite.TopTermsBoost(10))
);
This also works:
QueryContainer = QueryContainer | Query<VeganItemEstablishmentSearchDto>
.MatchPhrase(c => c
.Boost(1.1)
.Field(f => f.VeganItem.Name)
.Query(query)
.Slop(1)
);
QueryContainer = QueryContainer | Query<VeganItemEstablishmentSearchDto>
.MatchPhrase(c => c
.Boost(1.1)
.Field(f => f.VeganItem.CompanyName)
.Query(query)
.Slop(1)
);
var terms = query.ToLower().Split(' ');
foreach (var term in terms)
{
QueryContainer = QueryContainer | Query<VeganItemEstablishmentSearchDto>
.Wildcard(c => c
.Value($"*{term}*")
.Field(f => f.VeganItem.CompanyName)
.Rewrite(MultiTermQueryRewrite.TopTermsBoost(10))
);
QueryContainer = QueryContainer | Query<VeganItemEstablishmentSearchDto>
.Wildcard(c => c
.Value($"*{term}*")
.Field(f => f.VeganItem.Name)
.Rewrite(MultiTermQueryRewrite.TopTermsBoost(10))
);
}
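One caveat with the loop above: query.Split(' ') yields empty strings when the input contains consecutive spaces, and an empty term becomes the pattern "**", which matches every value. Any * or ? typed by the user is also interpreted as a wildcard. Here is a minimal sketch of a helper that handles both cases in plain C#. The class and method names are hypothetical, and the backslash escaping assumes Lucene's wildcard syntax, where \ is the escape character.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class WildcardPatterns
{
    // Prefixes the characters that the wildcard syntax treats specially
    // (*, ? and \) with a backslash so they are matched literally.
    public static string Escape(string term)
    {
        var sb = new StringBuilder(term.Length);
        foreach (var ch in term)
        {
            if (ch == '*' || ch == '?' || ch == '\\')
            {
                sb.Append('\\');
            }
            sb.Append(ch);
        }
        return sb.ToString();
    }

    // Lowercases the input, splits it on spaces, drops empty terms,
    // and wraps each remaining term as *term*.
    public static List<string> ContainsPatterns(string query)
    {
        return query
            .ToLowerInvariant()
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(term => "*" + Escape(term) + "*")
            .ToList();
    }
}
```

Each returned pattern can then be passed as the Value of a Wildcard clause in place of $"*{term}*".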

Query one field with multiple values in elasticsearch nest

I have a combination of two queries with Elasticsearch and NEST. The first is a full-text search for a specific term; the second filters on another field, the file path. It needs to work for many file paths, and each path may be partial or full. I can query one file path, but I couldn't manage to do it for many file paths. Any suggestions?
Search<SearchResults>(s => s
.Query(q => q
.Match(m => m.Field(f => f.Description).Query("Search_term"))
&& q
.Prefix(t => t.Field(f => f.FilePath).Value("file_Path"))
)
);
To search for more than one path, you can use a bool query in Elasticsearch and put the path clauses under Should, which behaves like a logical OR. Your code should look like this:
Search<SearchResults>(s => s
    .Query(q => q
        .Bool(b => b
            .Should(
                bs => bs.Wildcard(p => p.FilePath, "*file_Pathfile_Path*"),
                bs => bs.Wildcard(p => p.FilePath, "*file_Pathfile_Path*")
                // ... one Wildcard clause per additional path
            )
        )
        && q.Match(m => m.Field(f => f.Description).Query("Search_term"))
    )
);
You should also use a wildcard query to get results for paths that may be partial or full. For more information, check the official Elasticsearch documentation on the wildcard and bool queries below:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html
https://www.elastic.co/guide/en/elasticsearch/client/net-api/current/bool-queries.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-wildcard-query.html
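To make the shape of such a query concrete, here is a minimal sketch that builds a comparable raw request body for any number of path patterns, in plain C# without NEST. The class and method names are hypothetical, and the JSON field names passed in are assumptions that must match your mapping. minimum_should_match is set to 1 so that at least one path clause has to match alongside the full-text clause.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

public static class PathFilterQuery
{
    // Hypothetical helper: one required match clause plus one wildcard
    // clause per path pattern, of which at least one has to match.
    public static string Build(
        string textField, string searchTerm,
        string pathField, IEnumerable<string> pathPatterns)
    {
        var should = pathPatterns
            .Select(pattern => (object)new Dictionary<string, object>
            {
                ["wildcard"] = new Dictionary<string, object>
                {
                    [pathField] = new Dictionary<string, object> { ["value"] = pattern }
                }
            })
            .ToList();

        var body = new Dictionary<string, object>
        {
            ["query"] = new Dictionary<string, object>
            {
                ["bool"] = new Dictionary<string, object>
                {
                    ["must"] = new List<object>
                    {
                        new Dictionary<string, object>
                        {
                            ["match"] = new Dictionary<string, object> { [textField] = searchTerm }
                        }
                    },
                    ["should"] = should,
                    ["minimum_should_match"] = 1
                }
            }
        };
        return JsonSerializer.Serialize(body);
    }
}
```

For example, PathFilterQuery.Build("description", "Search_term", "filePath", new[] { "*docs*", "*reports*" }) yields a bool query with one must clause and two should clauses.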

Using which field matched in a multimatch query in a function score

I have a MultiMatch query that I am using across 5 fields. I am also using a function score to combine various factors into the score. I would like to add a factor so that the score of results that matched on one particular field is increased (by adding a large number, so that matches on this field always have the highest score).
I know that I can use highlighting to find out which fields were matched, but how can I access that information in the function score script?
Here's what I have so far (using NEST, but that shouldn't make a difference).
var searchResponse = client.Search<TopicCollection.Topic>(s => s
    .Query(q => q
        .FunctionScore(fs => fs
            .Name("function_score_query")
            .Query(q1 => q1
                .MultiMatch(c => c
                    .Fields(f => f
                        .Field(p => p.field1)
                        .Field(p => p.field2) //...etc
                    )
                    .Query(searchTerm)
                )
            )
            .Functions(fun => fun
                .ScriptScore(ss => ss.Script(sc => sc
                    .Inline(
                        //TODO: add 1000 to normalised _score if match is in field1
                    )))
            )
            .BoostMode(FunctionBoostMode.Replace)
        )
    )
    .Highlight(h => h
        .Fields(p => p.AllField())
    )
);

elasticsearch : Avoid repetitive scoring when using ngram analyzer

Suppose I search for "hello" and there are documents containing "hello" and "hello hello". I want the "hello" document to score higher.
I am using an ngram index and search analyzer (because I really need this for other scenarios), so "hello hello" gets matched twice and therefore shows up as the top result. Is there any way I can avoid this? I have already tried the term query, match phrase query, and multi match queries; all of them score "hello hello" higher.
I solved this by adding a duplicate unanalyzed (keyword) column to the document and using a bool clause to boost the term query.
var res = client.Search<MyClass>(s => s
.Query(q => q
.Bool(
b1 => b1.Should(
s1 =>s1
.Term(m=>m
.Field(f => f._DUPLICATE_COLUMN)
.Value("hello")
.Boost(1)
),
s1=>s1.Match(m => m
.Field(f => f.MY_COLUMN)
.Query("hello")
.Analyzer("myNgramSearchAnalyzer")
)
)
.MinimumShouldMatch(1)
)
)
);

How to : ElasticSearch .NET and NEST 5.X Multimatch with wildcard

I have searched through a lot of samples on the Internet, but I still could not find any example of a wildcard search over more than one field. Can anyone help me with an example? I'm very new to Elasticsearch. Below is what I'm trying to do with a wildcard query, but it only works for one field.
How can I combine the Wildcard query below with MultiMatch in C#?
var result = client.Search<Metadata>(x => x
.Index("indexname")
.Type("Metadata")
.MatchAll()
.Query(q => q
.Wildcard(c => c
.Name("Query")
.Boost(1.1)
.Field(p => p.Title)
.Value("input*")
.Rewrite(MultiTermQueryRewrite.TopTermsBoost(10))
)
)
);
How can I add multi-field support like the following, as in MultiMatch?
.Fields(f => f
.Fields(f1 => f1.Title, f2 => f2.Keywords)
)
