Filter Then Sort Results By Query in ElasticSearch

Is there a way in ElasticSearch to run a boolean filter and then, without refining the search any further, sort the results based on a multi-field query?
E.g.: get all items with status_id = 1 (the filter), then order those documents using the keywords "red car" (documents whose name and description contain those keywords come first, documents without them come last).

You can use a bool query. The documentation describes the should clause as follows:
The clause (query) should appear in the matching document. In a boolean query with no must clauses, one or more should clauses must match a document. The minimum number of should clauses to match can be set using the minimum_should_match parameter.
In our case there is a must clause, so the should clause is optional: it does not decide which documents are returned and is used only to compute the score. Because the must clause is an exact match on a number, it adds the same score to every matching document and does not affect their relative order:
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "status_id": 1
          }
        }
      ],
      "should": [
        {
          "multi_match": {
            "query": "red car",
            "fields": [
              "subject",
              "message"
            ]
          }
        }
      ]
    }
  }
}
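A variant worth considering, sketched here in Python as a plain dict builder: putting the status_id condition in a filter clause instead of must makes it explicit that the condition only restricts the result set, so the score comes from the should clause alone. The function name and its parameters are illustrative and not part of any Elasticsearch client; the field names are the ones used in the query above.

```python
import json


def build_filter_then_rank_query(status_id, keywords, fields):
    """Build a bool query where filter restricts the hits and should only ranks them."""
    return {
        "query": {
            "bool": {
                # filter clauses run in filter context and add nothing to _score
                "filter": [{"term": {"status_id": status_id}}],
                # next to a filter, should clauses are optional and only affect ranking
                "should": [
                    {"multi_match": {"query": keywords, "fields": list(fields)}}
                ],
            }
        }
    }


body = build_filter_then_rank_query(1, "red car", ["subject", "message"])
print(json.dumps(body, indent=2))
```

With this shape, documents that pass the filter but contain none of the keywords are still returned, with a score of 0, after the documents that do match the keywords.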

Related

multi_match vs should match vs must query_string in ElasticSearch

I tried the following types of queries in ElasticSearch and am wondering which one is the most suitable (most accurate and most efficient). Basically, one person can have multiple sets of names (an array). Each name is split into firstname, surname and middlename; some people have only a firstname and a surname. The input parameter is the fullname (firstname, surname and middlename combined in one string). Fuzzy logic is added. One difference I notice is the score.
This is the score of the first result returned.
first query: 17.41911
second query: 24.332222
third query: 21.200104
Does this mean that the second query is the most accurate one for this requirement?
GET /person/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "multi_match": {
            "query": "David Bill Gonzalo~",
            "fields": [
              "nameDetails.name.nameValue.firstName",
              "nameDetails.name.nameValue.surname",
              "nameDetails.name.nameValue.middleName"
            ]
          }
        }
      ]
    }
  }
}
GET /person/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "nameDetails.name.nameValue.firstName": "David Bill Gonzalo~"
          }
        },
        {
          "match": {
            "nameDetails.name.nameValue.surname": "David Bill Gonzalo~"
          }
        },
        {
          "match": {
            "nameDetails.name.nameValue.middleName": "David Bill Gonzalo~"
          }
        }
      ]
    }
  }
}
GET /person/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "fields": [
              "nameDetails.name.nameValue.firstName",
              "nameDetails.name.nameValue.surname",
              "nameDetails.name.nameValue.middleName"
            ],
            "query": "David Bill Gonzalo~"
          }
        }
      ]
    }
  }
}
First Query:
The multi_match query lets you run a query on multiple fields. It is an extension of the match query.
In the first query you have not specified a type parameter, so the default type, best_fields, is used. This finds all documents that match the query in any field, but the _score is taken only from the best-matching field.
To learn more about the types of multi_match queries, refer to this part of the documentation.
Second Query:
This is a bool query with three should clauses. The scores of all matching should clauses are added together to produce the final score.
Third Query:
In the third query, query_string runs against multiple fields.
Again, no type parameter is specified, so best_fields is used by default. This finds all documents that match the query, but the _score is taken only from the best-matching field.
Since you are querying multiple fields with the same query text, i.e. "David Bill Gonzalo~", in my opinion you should use a multi_match query. multi_match also supports further options, such as boosting one or more fields or setting the type parameter.
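The difference between the two ways of combining per-field scores can be sketched in a few lines of Python. This is a simplified model, not Elasticsearch's implementation: the per-field scores are made-up numbers, and the function names are illustrative. It does show why the second query tends to report a larger _score for the same document, and why a larger number does not make that query more accurate: scores produced by different query types are not on a comparable scale.

```python
def best_fields_score(field_scores, tie_breaker=0.0):
    """dis_max-style combination: the best field plus tie_breaker times the rest."""
    matching = [s for s in field_scores if s > 0]
    if not matching:
        return 0.0
    best = max(matching)
    return best + tie_breaker * (sum(matching) - best)


def bool_should_score(clause_scores):
    """bool/should-style combination: scores of all matching clauses are summed."""
    return sum(s for s in clause_scores if s > 0)


# Hypothetical per-field scores for one document (firstName, surname, middleName).
scores = [2.0, 1.5, 0.5]

print(best_fields_score(scores))       # 2.0, only the best field counts
print(best_fields_score(scores, 0.5))  # 3.0, the other fields count for half
print(bool_should_score(scores))       # 4.0, every matching clause counts fully
```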

Filter results from Elasticsearch if only a specific field matches

I'm using the following query for searching across multiple fields:
{
  "query": {
    "multi_match": {
      "query": "italian sports car",
      "fields": ["car_name", "car_brand", "car_description", "car_country"],
      "type": "most_fields"
    }
  }
}
In this example, I'm looking for sports cars made in Italy (hence the car_country field). However, this will return all the cars made in Italy even if they are not sports cars. I want car_country to be just an auxiliary search field, so I don't want hits when the only matched field is car_country. Is this possible? I know I can set a lower score for that field, but I want hits with only this matching field to be completely ignored.
There are different ways to handle this, depending on the scoring and other behaviour you need from your results. For instance:
Use a bool query with two parts:
must - queries that have to match for a document to be in the result set
should - queries that should match (and affect scoring) but do not decide whether a document is in the result set
Put the multi_match query, without the car_country field, in the must clause, and a match query on the car_country field in the should clause.
{
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "italian sports car",
            "fields": [
              "car_name",
              "car_brand",
              "car_description"
            ],
            "type": "most_fields"
          }
        }
      ],
      "should": [
        {
          "match": {
            "car_country": {
              "query": "italian sports car"
            }
          }
        }
      ]
    }
  }
}
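If the query is assembled in application code, the split between required and auxiliary fields can be made explicit. The following Python sketch builds the same request body as a plain dict; the function name and its parameters are hypothetical, and the guard simply refuses an auxiliary field that is also listed as required.

```python
import json


def build_auxiliary_field_query(text, required_fields, auxiliary_field):
    """must decides which documents are returned; should only adjusts their score."""
    if auxiliary_field in required_fields:
        raise ValueError("the auxiliary field must not be one of the required fields")
    return {
        "query": {
            "bool": {
                "must": [
                    {
                        "multi_match": {
                            "query": text,
                            "fields": list(required_fields),
                            "type": "most_fields",
                        }
                    }
                ],
                # a hit on the auxiliary field alone never brings a document in
                "should": [{"match": {auxiliary_field: {"query": text}}}],
            }
        }
    }


body = build_auxiliary_field_query(
    "italian sports car",
    ["car_name", "car_brand", "car_description"],
    "car_country",
)
print(json.dumps(body, indent=2))
```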

Elasticsearch: how to write bool query that will contain multiple conditions on the same token?

I have a field with a tokenizer that splits on dots.
On search, the value aaa.bbb is split into two terms, aaa and bbb.
My question is how to write a bool query that applies multiple conditions to the same term.
For example, I want to get all docs where the field contains a term that matches a fuzzy search for gmail, but that same term must not contain gamil.
Here are some examples of what I want to achieve:
bmail // MATCH: it matches the fuzzy search and is not gamil
gamil.bmail // MATCH: the term bmail matches the fuzzy search and is not gamil
gamil // NO MATCH: it matches the fuzzy search but equals gamil
NOTE: the following query does NOT appear to work: it looks as if a document is considered a hit when one term matches one condition and a different term matches the other.
{
  ...
  "body": {
    "query": {
      "bool": {
        "must": [
          {
            "fuzzy": {
              "my_field": {
                "value": "gmail",
                "fuzziness": 1,
                "max_expansions": 2100000000
              }
            }
          },
          {
            "bool": {
              "must_not": [
                {
                  "query_string": {
                    "default_field": "my_field",
                    "query": "*gamil*",
                    "analyzer": "keyword"
                  }
                }
              ]
            }
          }
        ]
      }
    }
  },
}
I ended up using highlighting: I execute the fuzzy (or any other) query and then programmatically filter the results using the returned highlight object.
Span queries might also be a good option if you don't need regular expressions, or if you can make sure you don't exceed the boolean query limit.
(see more details in the provided link)
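A minimal sketch of that post-filtering step in Python, assuming the search request asked for highlighting on my_field and left the default <em> tags in place. The hits below are hand-written to mirror the three examples from the question and are not real Elasticsearch output; the helper names are illustrative.

```python
import re

# Matches each term that the highlighter wrapped in the default <em> tags.
EM_TERM = re.compile(r"<em>(.*?)</em>")


def highlighted_terms(hit, field):
    """Return the highlighted terms of one field of a hit, across all fragments."""
    fragments = hit.get("highlight", {}).get(field, [])
    return [term for fragment in fragments for term in EM_TERM.findall(fragment)]


def keep_hit(hit, field, excluded):
    """Keep a hit only if some highlighted term does not contain the excluded text."""
    return any(excluded not in term.lower() for term in highlighted_terms(hit, field))


# Hand-written hits shaped like a search response with highlighting enabled.
hits = [
    {"_id": "1", "highlight": {"my_field": ["<em>bmail</em>"]}},
    {"_id": "2", "highlight": {"my_field": ["<em>gamil</em>.<em>bmail</em>"]}},
    {"_id": "3", "highlight": {"my_field": ["<em>gamil</em>"]}},
]

kept = [hit["_id"] for hit in hits if keep_hit(hit, "my_field", "gamil")]
print(kept)  # ['1', '2']
```

Because both conditions are checked against the same highlighted term, a document is no longer accepted just because two different terms each satisfy one condition.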

Elasticsearch Per-Field Boosts, Wildcard and Explicit Field Matches Conflict

Some queries in Elasticsearch offer wildcard matching for fields with boosts, for example, the simple_query_string query.
In our use-case we would like to boost specific fields while giving all other fields a score of 0. We thought this could be achieved with the following example query pattern:
GET /kibana_sample_data_ecommerce/_search
{
  "profile": "true",
  "query": {
    "simple_query_string": {
      "query": "Eddie",
      "fields": [
        "*^0",
        "customer_full_name^100"
      ]
    }
  }
}
It appears, though, that if several field definitions match the same field name via wildcards, their boosts are multiplied. The above example yields a score of 0 for all documents, even if they contain the token "Eddie" in the customer_full_name field.
The following example demonstrates that the boosts are multiplied:
GET /kibana_sample_data_ecommerce/_search
{
  "profile": "true",
  "query": {
    "simple_query_string": {
      "query": "Eddie",
      "fields": [
        "*^0.1",
        "customer_full_name^100"
      ]
    }
  }
}
It leads to the expression (customer_full_name:eddie)^10.0 in the profile explanation of the query.
Does that mean that it is not possible to achieve our desired outcome with field boosts? The desired outcome is: all matches in a specific field have their score multiplied by 100, while all documents with matches in other fields are still returned but have a score of 0.
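The observed behaviour can be modelled with a few lines of Python. This is only a model of what the profile output suggests, not Elasticsearch's actual field-resolution code: every pattern that matches a field contributes its boost as a factor. The function name is illustrative, and fnmatchcase stands in for the wildcard matching.

```python
from fnmatch import fnmatchcase


def effective_boost(field_name, field_patterns):
    """Multiply the boosts of every (pattern, boost) pair that matches field_name."""
    boost = 1.0
    for pattern, pattern_boost in field_patterns:
        if fnmatchcase(field_name, pattern):
            boost *= pattern_boost
    return boost


# "*^0.1" and "customer_full_name^100" both match customer_full_name.
print(effective_boost("customer_full_name", [("*", 0.1), ("customer_full_name", 100)]))  # 10.0

# With "*^0" the product is 0, whatever the explicit boost is.
print(effective_boost("customer_full_name", [("*", 0), ("customer_full_name", 100)]))  # 0.0
```

Under this model, a wildcard boost of 0 zeroes out every field it matches, including the explicitly boosted one, which is consistent with the score of 0 seen in the first example.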

Elasticsearch Boolean query with Constant score wrapper

When using elasticsearch-7, I'm confused by the syntax of ES compound queries.
Although I have read the ES documentation repeatedly, I can only find the standard syntax of the Boolean and Constant score queries described separately.
From the documentation I understand what 'query context' and 'filter context' are. But when these two query types are combined in a single query, I don't know what that means.
Let's look at an example:
GET /classes_test/_search
{
  "size": "21",
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            {
              "match": {
                "class_name": "29386556"
              }
            }
          ],
          "should": [
            {
              "term": {
                "master": "7033560"
              }
            },
            {
              "term": {
                "assistant": "7033560"
              }
            },
            {
              "term": {
                "students": "7033560"
              }
            }
          ],
          "minimum_should_match": 1,
          "must_not": [
            {
              "term": {
                "class_id": 0
              }
            }
          ],
          "filter": [
            {
              "term": {
                "class_status": "1"
              }
            }
          ]
        }
      }
    }
  }
}
This query executes and responds fine. Each item in the response has a '_score' value of 1.0.
So, does this mean that the sub bool query as a whole is in a filter context, even though it has 'must' and 'should' clauses?
I also found that a boolean query can have a constant score sub query.
Why does ES allow this syntax without explaining it further?
If you use a constant_score query, you'll never get scores other than 1.0, unless you specify a boost parameter, in which case the score will equal the boost.
If you need scoring, you obviously need to ditch constant_score.
In your case, the match query on class_name cannot yield any score other than 1 or 0, since it is basically a yes/no filter, not a match based on full-text search.
To sum up, your whole query executes in a filter context (hence a score of 0 or 1), since you don't rely on full-text search. You get scoring whenever you use full-text search, not merely because you use a match query. In your case you can merge all must constraints into filter; it won't make any difference, since you only have filters (yes/no matches) and no full-text search.
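The merge suggested above can be sketched in Python as a small transformation on the body of the bool query. The helper names are illustrative. Note the assumption behind it: this is only score-neutral when the must clauses are yes/no conditions, or when the whole bool query already sits inside constant_score as it does here; for full-text must clauses outside constant_score, moving them to filter would remove their contribution to _score.

```python
import copy


def as_list(clauses):
    """Normalise a bool clause, which may be a single object or a list of them."""
    if clauses is None:
        return []
    return list(clauses) if isinstance(clauses, list) else [clauses]


def merge_must_into_filter(bool_body):
    """Return a copy of a bool query body with its must clauses moved into filter."""
    merged = copy.deepcopy(bool_body)
    must = as_list(merged.pop("must", None))
    merged["filter"] = must + as_list(merged.get("filter"))
    return merged


# The body of the bool query from the question.
original = {
    "must": [{"match": {"class_name": "29386556"}}],
    "should": [
        {"term": {"master": "7033560"}},
        {"term": {"assistant": "7033560"}},
        {"term": {"students": "7033560"}},
    ],
    "minimum_should_match": 1,
    "must_not": [{"term": {"class_id": 0}}],
    "filter": [{"term": {"class_status": "1"}}],
}

merged = merge_must_into_filter(original)
print(merged["filter"])
```

The should clauses and minimum_should_match are left untouched, so at least one of master, assistant or students still has to match.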

Resources