I'm new to Elasticsearch. I'm trying to search for keywords with the match_phrase query.
I don't want to require every term to match, so I added minimum_should_match to the query, but match_phrase does not seem to support it.
What I need:
The slop for each term can be 0 or larger.
Matching terms must appear in their specified order.
Not every term has to match, and I can control how many must match with a parameter such as minimum_should_match.
Does anybody have experience with this?
Thanks in advance.
I would like to know whether the match_phrase_prefix and slop search options can be used with an Elasticsearch percolate query.
Yes, you can, because the percolator supports storing Elasticsearch Query DSL queries.
You can read my blog post on the percolate query here, which gives a basic understanding with a real-life example.
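As a rough illustration of the answer above, here is a minimal sketch of the three request bodies involved: a mapping with a percolator field, a stored match_phrase_prefix query with slop, and a percolate search. The field names ("query", "message") and the helper names are assumptions made up for this example, and the bodies are only built as Python dicts, not sent to a cluster.

```python
def percolator_mapping(text_field):
    """Index mapping: one percolator field plus the text field the stored queries refer to."""
    return {
        "mappings": {
            "properties": {
                "query": {"type": "percolator"},  # holds the stored query
                text_field: {"type": "text"},     # field the stored query targets
            }
        }
    }


def stored_query(text_field, phrase, slop=0):
    """Document to index: a match_phrase_prefix query with a slop value."""
    return {
        "query": {
            "match_phrase_prefix": {
                text_field: {"query": phrase, "slop": slop}
            }
        }
    }


def percolate_request(text_field, text):
    """Search body: find stored queries that match the given document."""
    return {
        "query": {
            "percolate": {
                "field": "query",
                "document": {text_field: text},
            }
        }
    }


# Hypothetical usage with made-up values
mapping = percolator_mapping("message")
doc = stored_query("message", "quick brown f", slop=1)
search = percolate_request("message", "the quick brown fox")
```

The mapping goes to index creation, the stored query is indexed like any other document, and the percolate body is sent to the search endpoint of that index.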
Currently I'm able to query AWS Elasticsearch to get the matching docs for a single term, using the query below.
Now I have a requirement to query for multiple terms and get their matching docs.
Is there any way to query multiple terms in a single query so that we get the matching docs separately for each term? That would save us a lot of time.
It looks like you need a terms query, or multi_match if you are querying text fields.
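To make the two suggested options concrete, here is a small sketch that builds both request bodies as Python dicts. The field names ("tag", "title", "body") and the helper names are assumptions for illustration only, since the original query was not shown.

```python
def terms_query(field, values):
    """Exact-value lookup: matches docs whose field contains any of the values."""
    return {"query": {"terms": {field: list(values)}}}


def multi_match_query(text, fields):
    """Analyzed full-text search of one query string across several text fields."""
    return {"query": {"multi_match": {"query": text, "fields": list(fields)}}}


# Hypothetical usage with made-up fields and values
by_terms = terms_query("tag", ["python", "search"])
by_text = multi_match_query("python search", ["title", "body"])
```

Both bodies return a single combined hit list, so neither groups the hits per term on its own.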
I've been using a lot of match queries in my project. I have just come across the term query in Elasticsearch. It seems the term query is faster when the query value is an exact keyword.
Now I have two questions:
Should I refactor my code (there is a lot of it) and use term instead of match?
How much better is the performance of term compared with match?
Using term in my query:
main_query["query"]["bool"]["must"].append({"term":{object[..]:object[...]}})
Using match in my query:
main_query["query"]["bool"]["must"].append({"match":{object[..]:object[...]}})
Elastic discourages using term queries on text fields for obvious reasons (analysis!), but if you know you are querying a keyword field (not analyzed!), definitely go for term/terms queries instead of match. The match query does a lot more work besides analyzing the input, and it will eventually end up executing a term query anyway because it notices that the queried field is a keyword field.
As far as I know, the match query is meant for fields mapped as "text" that use an analyzer. The indexed value is broken into tokens, and at query time the query text goes through an analyzer as well, so the match is made token by token.
A term query does an exact match: it does not go through any analyzer and looks up the exact term in the inverted index.
Because it skips analysis, I believe term is faster.
I use term queries to search keyword-like values such as categories and tags, where using an analyzer doesn't make sense.
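The rule of thumb from these answers can be sketched as a small helper that picks the clause from the field's mapping type. The helper name, the field names, and the idea of passing the mapping type in by hand are assumptions for this example. It follows the same main_query structure as the question.

```python
def build_clause(field, value, field_type):
    """Return a term clause for keyword fields and a match clause for text fields."""
    if field_type == "keyword":
        # Not analyzed: look up the exact value in the inverted index.
        return {"term": {field: value}}
    if field_type == "text":
        # Analyzed: let match run the query text through the field's analyzer.
        return {"match": {field: value}}
    raise ValueError(f"unsupported field type: {field_type}")


# Hypothetical usage, mirroring the bool/must structure from the question
main_query = {"query": {"bool": {"must": []}}}
main_query["query"]["bool"]["must"].append(build_clause("category", "books", "keyword"))
main_query["query"]["bool"]["must"].append(build_clause("description", "used paperback", "text"))
```

Routing every clause through one helper like this would also keep a refactor small, since the term-or-match decision lives in one place.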
I was wondering if it is possible in Elasticsearch to exclude queries where the query is a single term? I am trying to use "minimum_should_match" as 2, which works well when the query has 2 or more terms. However, if the number of terms in the query is 1, ES will still return results. It seems that ES is using the logic of "well you asked for a minimum of matching two terms, yet there is only one term to match; we'll lower the minimum to 1". Is there a way to turn this functionality off, or otherwise do what I am looking for?
For those wondering why this can't be done at the API level, I am using a query analyzer that excludes stop words. So a query like "a ipad" would end up being 1 term, while the API would see 2. The API could do stopword filtering but that seems to be a waste of resources.
Before running the query, you can first analyze the input with your custom analyzer.
You can use the Analyze API for this (be sure to set the analyzer property to the name of your custom analyzer).
The result is a list of analyzed tokens. If your analyzer removes stopwords, it would return only "ipad" for "a ipad".
So if the Analyze API returns only one token, you don't need to query Elasticsearch at all, because you don't want any results when the number of tokens is less than 2 (if I understood you correctly).
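The check described above could look roughly like the sketch below. It only builds the Analyze API request body and inspects a response dict, without making any network call. The analyzer name "my_stop_analyzer", the helper names, and the sample response are hand-written assumptions for illustration, not real cluster output.

```python
def analyze_request(analyzer, text):
    """Request body for the Analyze API (_analyze endpoint)."""
    return {"analyzer": analyzer, "text": text}


def token_count(analyze_response):
    """Count the tokens listed under the "tokens" key of an Analyze API response."""
    return len(analyze_response.get("tokens", []))


def should_search(analyze_response, minimum=2):
    """Only run the real search when enough tokens survive analysis."""
    return token_count(analyze_response) >= minimum


# Hypothetical request for the input "a ipad"
body = analyze_request("my_stop_analyzer", "a ipad")

# Hand-written sample of what a stopword-removing analyzer might return
sample_response = {
    "tokens": [
        {"token": "ipad", "start_offset": 2, "end_offset": 6, "type": "<ALPHANUM>", "position": 1}
    ]
}

run_search = should_search(sample_response)  # False: only one token is left
```

This costs one extra round trip per query, but the application never has to duplicate the stopword logic.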
Is there a way to do faceted searches using the Elasticsearch Search API while preserving case (as opposed to having the results converted to lowercase)?
Thanks in advance, Chuck
Assuming you are using the "terms" facet, the facet entries are exactly the terms in the index. Briefly, analysis is the process of converting a field value into a sequence of terms, and lowercasing is a step in the default analyzer; that's why you're seeing lowercased terms. So you will want to change your analysis configuration (and perhaps introduce a multi_field if you want to run several different analyzers.)
There's a great explanation in Lucene in Action (2nd Ed.); it's applicable to Elasticsearch, too.
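As a hedged sketch of the analysis change suggested above: in recent Elasticsearch versions the same idea is usually expressed with a "fields" sub-field of type keyword rather than the older multi_field type, and with a terms aggregation rather than the terms facet. The field name "tag", the sub-field name "raw", and the helper names are assumptions for this example.

```python
def case_preserving_mapping(field, raw_name="raw"):
    """Map a field as analyzed text plus an unanalyzed keyword sub-field."""
    return {
        "mappings": {
            "properties": {
                field: {
                    "type": "text",  # analyzed, lowercased by the default analyzer
                    "fields": {raw_name: {"type": "keyword"}},  # stored as-is, case kept
                }
            }
        }
    }


def terms_aggregation(field, raw_name="raw", name="values"):
    """Count distinct values using the unanalyzed sub-field so case is preserved."""
    return {
        "size": 0,  # return only the buckets, no hits
        "aggs": {name: {"terms": {"field": f"{field}.{raw_name}"}}},
    }


# Hypothetical usage with a made-up field
mapping = case_preserving_mapping("tag")
agg_body = terms_aggregation("tag")
```

Full-text search still runs against the analyzed "tag" field, while the counts come from "tag.raw" and keep the original casing.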