Elasticsearch match_phrase doesn't perform the same as multi_match with type phrase?

I'm having some trouble turning a match_phrase query into a multi_match query for multiple fields. My original query:
{
"from" : 0,
"size" : 50,
"query" : {
"filtered" : {
"query" : {
"match_phrase" : {
"metadata.description" : "Search Terms"
}
},
"filter" : {
"bool" : {
"must" : [ {
"terms" : {
"collectionId" : [ "1", "2" ]
}
} ]
}
}
}
}
}
Returns results correctly, but when I rewrite the match_phrase piece as a multi_match to run against multiple fields:
{
"from" : 0,
"size" : 50,
"query" : {
"filtered" : {
"query" : {
"multi_match" : {
"query" : "Search Terms",
"fields" : [ "metadata.description", "metadata.title" ],
"type" : "phrase"
}
},
"filter" : {
"bool" : {
"must" : [ {
"terms" : {
"collectionId" : [ "1", "2" ]
}
} ]
}
}
}
}
}
I am not getting any results. Is there anything obvious I am doing wrong here?
EDIT:
It must be something to do with the filter, as
{
"from" : 0,
"size" : 50,
"query" : {
"match_phrase" : {
"metadata.description" : "Search Terms"
}
}
}
and
{
"from" : 0,
"size" : 50,
"query" : {
"multi_match" : {
"query" : "Search Terms",
"fields" : [ "metadata.description", "metadata.title" ],
"type" : "phrase"
}
}
}
both perform as expected.

I am not sure why, exactly, but not using a filtered query, and applying the filter at the top level
{
"from" : 0,
"size" : 50,
"query" : {
"multi_match" : {
"query" : "Search Terms",
"fields" : [ "metadata.description", "metadata.title" ],
"type" : "phrase"
}
},
"filter" : {
"bool" : {
"must" : [ {
"terms" : {
"collectionId" : [ "1", "2" ]
}
} ]
}
}
}
resolves the problem.
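For readers on newer versions: the filtered query was deprecated in Elasticsearch 2.0 and removed in 5.0, and the top-level filter has been called post_filter since 1.0. A minimal sketch of the same search written as a bool query with a filter clause, assuming the field names from the question and a version that supports bool filter clauses (2.0 or later). This is not a diagnosis of why the filtered form returned nothing, only the form the same intent usually takes today:

```json
{
  "from": 0,
  "size": 50,
  "query": {
    "bool": {
      "must": {
        "multi_match": {
          "query": "Search Terms",
          "fields": ["metadata.description", "metadata.title"],
          "type": "phrase"
        }
      },
      "filter": {
        "terms": { "collectionId": ["1", "2"] }
      }
    }
  }
}
```

The filter clause restricts the result set without contributing to the score, which is what the filter part of the filtered query did.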

Related

ElasticSearch : constant_score query vs function_score query

I recently upgraded Elasticsearch from version 5.3 to version 5.6. This is my query:
"query" : {
"constant_score" : {
"query" : {
"bool" : {
"must" : {
"terms" : {
"customerId" : [ "ASERFE", "7004567457" ]
}
},
"must_not" : {
"terms" : {
"useCase" : [ "PAY", "COLLECT" ]
}
}
},
"bool" : {
"must" : {
"match" : {
"cardProductGroupName" : {
"query" : "Pre-fill Test birthday Present",
"type" : "phrase"
}
}
}
}
}
}
}
Executing the query above gave me the following error:
{"root_cause":[{"type":"parsing_exception","reason":"[constant_score] query does not support [query]","line":1,"col":37}],"type":"parsing_exception","reason":"[constant_score] query does not support [query]","line":1,"col":37}
So, I searched for the solution and found this function_score query. On executing the query mentioned below I am getting the same results that I would have got with constant_score.
"query" : {
"function_score" : {
"query" : {
"bool" : {
"must" : {
"terms" : {
"customerId" : [ "ASERFE", "7004567457" ]
}
},
"must_not" : {
"terms" : {
"useCase" : [ "PAY", "COLLECT" ]
}
}
},
"bool" : {
"must" : {
"match" : {
"groupName" : {
"query" : "Pre-fill Test birthday Present",
"type" : "phrase"
}
}
}
}
},
"functions" : [ {
"script_score" : {
"script" : "1"
}
} ],
"boost_mode" : "replace"
}
}
So my question is: does this imply that function_score with script : "1" gives the same result as constant_score?
It will indeed give the same results, though performance may be worse if the script is still run for every matching document.
On the other hand, constant_score still exists in 5.6; you just have to use filter (plus an optional boost) instead of query.
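A minimal sketch of the filter+boost form, assuming the field names from the question. The two bool objects from the original query are merged into one, because a JSON object cannot carry the same key twice, and the phrase match is written as match_phrase; both of those are my rewrites, not part of the original query:

```json
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            { "terms": { "customerId": ["ASERFE", "7004567457"] } },
            { "match_phrase": { "cardProductGroupName": "Pre-fill Test birthday Present" } }
          ],
          "must_not": [
            { "terms": { "useCase": ["PAY", "COLLECT"] } }
          ]
        }
      },
      "boost": 1.0
    }
  }
}
```

Every matching document gets a score equal to boost, with no script involved.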

Why my Elasticsearch query retrieves all indexed documents

I'm having trouble understanding the behavior of the following Elasticsearch (ES 6.4) query:
{
"query" : {
"bool" : {
"should" : [
{
"match" : {
"title" : {
"query" : "example",
"operator" : "AND",
"boost" : 2
}
}
},
{
"multi_match" : {
"type" : "best_fields",
"query" : "example",
"operator" : "AND",
"fields" : [
"author", "content", "tags"
],
"boost" : 1
}
}
],
"must" : [
{
"range" : {
"dateCreate" : {
"gte" : "2000-01-01T00:00:00+0200",
"lte" : "2019-02-12T23:59:59+0200"
}
}
},
{
"term" : {
"client" : {
"value" : "test",
"boost" : 1
}
}
}
]
}
},
"size" : 10,
"from" : 0,
"sort" : [
{
"_score" : {
"order" : "desc"
}
}
]
}
The query is executed successfully but retrieves about 400,000 documents which is the total count of my index. It means that all documents are in the result set. But why? Is this really the correct behavior of the multi_match query?
When I was still using the query_string query, I only got the actual matching documents. That's why I'm a bit surprised.
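You're missing minimum_should_match. In a bool query that also contains must (or filter) clauses, the should clauses are optional by default: they only contribute to the score, so every document that matches the must clauses is returned. Set minimum_should_match so that at least one should clause has to match: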
"bool" : {
"minimum_should_match": 1, <--- add this
"should" : [
...
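Put together with the query from the question, the request would look roughly like this (a sketch that reuses the field names and values from the question; the sort on _score is left out because it is the default ordering):

```json
{
  "query": {
    "bool": {
      "minimum_should_match": 1,
      "should": [
        { "match": { "title": { "query": "example", "operator": "AND", "boost": 2 } } },
        {
          "multi_match": {
            "type": "best_fields",
            "query": "example",
            "operator": "AND",
            "fields": ["author", "content", "tags"]
          }
        }
      ],
      "must": [
        {
          "range": {
            "dateCreate": {
              "gte": "2000-01-01T00:00:00+0200",
              "lte": "2019-02-12T23:59:59+0200"
            }
          }
        },
        { "term": { "client": { "value": "test" } } }
      ]
    }
  },
  "size": 10,
  "from": 0
}
```

With minimum_should_match set to 1, a document has to satisfy both must clauses and at least one of the two should clauses.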

multiple match must fields not working in elastic search

The query below returns a result when I search for an existing record, which is fine, but if I change the name field from 'John' to 'John1' the record is still returned.
{
"query" : {
"bool" : {
"must" : [
{ "match" : {"employeeId" : "1234"}},
{ "match" : {"name" : "John"}}
]
}
}
}
I tried an alternative query as well, but it still returns the record. Which query is better in terms of performance? Both return results if I change the name from 'John' to 'John1'.
{
"filter": {
"bool" : {
"must" : {
"term" : {
"employeeId" : "1234"
}
}
}
},
"query": {
"match" : {
"name" : {
"query" : "John",
"type" : "phrase"
}
}
}
}
This is because you are using match. If you want an exact search, you need to use a term filter.
Note that this assumes the name field is mapped as analyzed, so the indexed term is the lowercased john:
{
"query" :{
"filtered" : {
"filter" : {
"bool" : {
"must" : [
{ "term" : {"employeeId" : "1234"}},
{ "term" : {"name" : "john"}}
]
}
}
}
}
}

elasticsearch nested functionScoreQuery cannot access parent properties

I have a type in elasticsearch that looks like this:
"hotel" : {
"field" : 1,
"rooms" : [
{
"type" : "single",
"magicScore" : 1
},
{
"type" : "double",
"magicScore" : 2
}
]
}
where rooms is of type nested. I sort using a nested functionScoreQuery:
{
"query" : {
"filtered" : {
"query" : {
"nested" : {
"query" : {
"function_score" : {
"filter" : {
"match_all" : { }
},
"functions" : [ {
"script_score" : {
"script" : "return doc['hotel.field'].value"
}
} ]
}
},
"path" : "rooms",
"score_mode" : "max"
}
}
}
}
The problem is that hotel.field always returns 0. Is there a way to access the parent field inside a nested query? I know I can always pack the field inside the nested document, but that's a hack, not a solution. Would using a dismax query help me? https://discuss.elastic.co/t/nested-value-on-function-score/29935
The query I am actually using looks something like this:
{
"query" : {
"bool" : {
"must" : {
"nested" : {
"query" : {
"function_score" : {
"query" : {
"not" : {
"query" : {
"terms" : {
"rooms.type" : [ "single", "triple" ]
}
}
}
},
"functions" : [ {
"script_score" : {
"script" : {
"inline" : "return doc['rooms.magicScore'].value;",
"lang" : "groovy",
"params" : {
"ratings" : {
"sample" : 0.5
},
"variable" : [ 0.0, 0.0, 0.0, 0.0, -0.5, -2.5]
}
}
}
} ],
"score_mode" : "max"
}
},
"path" : "rooms"
}
},
"filter" : {
"bool" : {
"filter" : [ {
"bool" : {
"should" : [ {
"term" : {
"cityId" : "166"
}
}, {
"term" : {
"cityId" : "165"
}
} ]
}
}, {
"nested" : {
"query" : {
"not" : {
"query" : {
"terms" : {
"rooms.type" : [ "single", "triple" ]
}
}
}
},
"path" : "rooms"
}
} ]
}
}
}
}
}
What I am trying to achieve is to access for example the cityId inside the function_score query which is nested.
The question is why you are accessing parent values in a nested query. Once you are in the nested context, you cannot access parent fields or fields from other nested documents.
From the documentation:
The nested clause “steps down” into the nested comments field. It no longer has access to fields in the root document, nor fields in any other nested document.
So, rewrite your queries so that the nested part touches the fields in that nested field and anything else is accessed outside the nested part.
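A minimal sketch of that split, assuming the field names from the question and a version where bool accepts a filter clause (2.0 or later). The nested part only reads rooms.* fields, while cityId is handled at the root level. The not query is written as bool/must_not and the Groovy script is swapped for field_value_factor; both are my substitutions, not the asker's original scoring logic:

```json
{
  "query": {
    "bool": {
      "must": {
        "nested": {
          "path": "rooms",
          "score_mode": "max",
          "query": {
            "function_score": {
              "query": {
                "bool": {
                  "must_not": {
                    "terms": { "rooms.type": ["single", "triple"] }
                  }
                }
              },
              "functions": [
                { "field_value_factor": { "field": "rooms.magicScore" } }
              ],
              "boost_mode": "replace"
            }
          }
        }
      },
      "filter": {
        "terms": { "cityId": ["165", "166"] }
      }
    }
  }
}
```

If the score really has to combine a root field with a nested field, the root value still needs to be copied into each nested document at index time, since nothing inside the nested clause can read it.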

How to exclude a filter from a facet?

I have come from a Solr background and am trying to find the equivalent of "tagging" and "excluding" in Elasticsearch.
In the following example, how can I exclude the price filter from the calculation of the prices facet? In other words, the prices facet should take into account all of the filters except for price.
{
"query" : {
"filtered" : {
"query" : {
"match_all" : {}
},
"filter" : {
"and" : [
{
"term" : {
"colour" : "Red"
}
},
{
"term" : {
"feature" : "Square"
}
},
{
"term" : {
"feature" : "Shiny"
}
},
{
"range" : {
"price" : {
"from" : "10",
"to" : "20"
}
}
}
]
}
}
},
"facets" : {
"colours" : {
"terms" : {
"field" : "colour"
}
},
"features" : {
"terms" : {
"field" : "feature"
}
},
"prices" : {
"statistical" : {
"field" : "price"
}
}
}
}
You can apply the price filter as a top-level filter on the request and add it as a facet_filter to every facet except prices:
{
"query" : {
"filtered" : {
"query" : {
"match_all" : {}
},
"filter" : {
"and" : [
{
"term" : {
"colour" : "Red"
}
},
{
"term" : {
"feature" : "Square"
}
},
{
"term" : {
"feature" : "Shiny"
}
}
]
}
}
},
"facets" : {
"colours" : {
"terms" : {
"field" : "colour"
},
"facet_filter" : {
"range" : { "price" : { "from" : "10", "to" : "20" } }
}
},
"features" : {
"terms" : {
"field" : "feature"
},
"facet_filter" : {
"range" : { "price" : { "from" : "10", "to" : "20" } }
}
},
"prices" : {
"statistical" : {
"field" : "price"
}
}
},
"filter": {
"range" : { "price" : { "from" : "10", "to" : "20" } }
}
}
By the way, an important change since ES 1.0.0: the top-level filter was renamed to post_filter (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/_search_requests.html#_search_requests). Using filtered queries is still preferred, as described here: http://elasticsearch-users.115913.n3.nabble.com/Filters-vs-Queries-td3219558.html
There is also a global option for facets that avoids filtering by the query filter (elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets.html#_scope).
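Facets were later replaced by aggregations (facets are gone as of Elasticsearch 2.0), but the same tag-and-exclude pattern carries over. A rough sketch under that assumption: the price range moves to post_filter, and every aggregation except prices is wrapped in a filter aggregation that repeats it. The sub-aggregation name values is a hypothetical label, and gte/lte stand in for the inclusive from/to of the original range:

```json
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "colour": "Red" } },
        { "term": { "feature": "Square" } },
        { "term": { "feature": "Shiny" } }
      ]
    }
  },
  "aggs": {
    "colours": {
      "filter": { "range": { "price": { "gte": 10, "lte": 20 } } },
      "aggs": { "values": { "terms": { "field": "colour" } } }
    },
    "features": {
      "filter": { "range": { "price": { "gte": 10, "lte": 20 } } },
      "aggs": { "values": { "terms": { "field": "feature" } } }
    },
    "prices": {
      "stats": { "field": "price" }
    }
  },
  "post_filter": {
    "range": { "price": { "gte": 10, "lte": 20 } }
  }
}
```

The post_filter narrows the hits only after the aggregations have been computed, so prices sees every document that passes the colour and feature filters, while colours and features also respect the price range.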
