Query with multiple values on a property with one value in Elasticsearch - elasticsearch

I am trying to build on the query below. The index I am searching also has a field "entity" that holds an id, so some records will have "entity" : 16, others "entity" : 156, and so on, depending on the id of the entity. I need to extend this query so that I can pass in an array or list of values, such as {:term => {:entity => [1, 16, 100]}}, and get back the records whose entity value is one of these integers. I haven't had any luck so far. Can someone help me?
{
"query" : {
"bool" : {
"must" : [
{
"term" : {"user_type" : "alpha"}
},
{
"term" :{"area" : "16"}
}
],
"must_not" : [],
"should" : []
}
},
"filter": {
"or" : [{
"and" : [
{ "term" : { "area" : "16" } },
{ "term" : { "date" : "05072013" } }
]
}, {
"and" : [
{ "term" : { "area" : "16" } },
{ "term" : { "date" : "blank" } }
]
}
]
},
"from" : 0,
"size" : 100
}

Use "terms" (plural) instead of "term". It accepts an array of values and matches documents whose field equals any one of them.
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-terms-filter.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-terms-query.html
{ "terms" : { "entity" : [ 123, 1234, ... ] }}
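As a rough sketch of how this could fit into the query from the question, the Python helper below builds the request body as a plain dict and appends a `terms` clause on `entity` to the existing `must` list. The helper name `build_query` is hypothetical, the other clauses are copied from the question, and the top-level filter is left out for brevity.

```python
import json

def build_query(entity_ids, size=100):
    """Build a search body matching documents whose `entity` is any of entity_ids."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"term": {"user_type": "alpha"}},
                    {"term": {"area": "16"}},
                    # `terms` (plural) takes a list; a document matches if its
                    # `entity` equals any one of the listed values.
                    {"terms": {"entity": list(entity_ids)}},
                ]
            }
        },
        "from": 0,
        "size": size,
    }

body = build_query([1, 16, 100])
print(json.dumps(body, indent=2))
```

The resulting dict can be serialized and sent as the search request body with whatever client is already in use.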

Related

Elasticsearch one field match different values

I want to run a query in which one field has to match several different values. In SQL it would look like this:
select * from location where address like '%California%' and address like '%145%';
I tried using a must array containing several phrase match conditions, but it doesn't work!
{
"from" : 0,
"size" : 10,
"query" : {
"bool" : {
"must" : {
"bool" : {
"must" : [ {
"match" : {
"address" : {
"query" : "California",
"type" : "phrase"
}
}
}, {
"match" : {
"address" : {
"query" : "145",
"type" : "phrase"
}
}
} ]
}
}
}
},
"sort" : [ {
"pageRankScore" : {
"order" : "desc",
"unmapped_type" : "double"
}
} ]
}
That's my code. It only matches '145' and never matches 'California'.
My question is: how do I do a fuzzy match on several values in one field?
Please help, thanks a lot!

Why my Elasticsearch query retrieves all indexed documents

I'm having trouble understanding the behavior of the following Elasticsearch (ES 6.4) query:
{
"query" : {
"bool" : {
"should" : [
{
"match" : {
"title" : {
"query" : "example",
"operator" : "AND",
"boost" : 2
}
}
},
{
"multi_match" : {
"type" : "best_fields",
"query" : "example",
"operator" : "AND",
"fields" : [
"author", "content", "tags"
],
"boost" : 1
}
}
],
"must" : [
{
"range" : {
"dateCreate" : {
"gte" : "2000-01-01T00:00:00+0200",
"lte" : "2019-02-12T23:59:59+0200"
}
}
},
{
"term" : {
"client" : {
"value" : "test",
"boost" : 1
}
}
}
]
}
},
"size" : 10,
"from" : 0,
"sort" : [
{
"_score" : {
"order" : "desc"
}
}
]
}
The query is executed successfully but retrieves about 400,000 documents which is the total count of my index. It means that all documents are in the result set. But why? Is this really the correct behavior of the multi_match query?
When I was still using the query_string query, I only got the actual matching documents. That's why I'm a bit surprised.
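You're missing minimum_should_match. Because the bool query contains must clauses, the should clauses are optional by default and only influence scoring, so every document that satisfies the must clauses is returned: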
"bool" : {
"minimum_should_match": 1, <--- add this
"should" : [
...
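The effect of `minimum_should_match` can be illustrated with a toy model in Python. This is not Elasticsearch code: clauses are plain predicates on dicts, `filter` and `must_not` are ignored, and the function name `bool_matches` and the sample documents are invented for the illustration.

```python
def bool_matches(doc, must, should, minimum_should_match=None):
    """Toy model of bool-query matching; clauses are predicates on a dict."""
    if minimum_should_match is None:
        # Default: 0 when `must` clauses exist, otherwise 1 if there are
        # any `should` clauses.
        minimum_should_match = 0 if must else (1 if should else 0)
    if not all(clause(doc) for clause in must):
        return False
    matched = sum(1 for clause in should if clause(doc))
    return matched >= minimum_should_match

docs = [
    {"client": "test", "title": "an example"},
    {"client": "test", "title": "unrelated"},
]
must = [lambda d: d["client"] == "test"]
should = [lambda d: "example" in d["title"]]

all_hits = [d for d in docs if bool_matches(d, must, should)]
filtered = [d for d in docs if bool_matches(d, must, should, minimum_should_match=1)]
print(len(all_hits), len(filtered))  # 2 1
```

With the default, both documents pass because only the `must` predicate decides membership. Setting the value to 1 drops the document that matches no `should` clause.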

Can't create the correct filter for my elasticsearch query

I'm having trouble setting up the correct filter. My query looks like this:
{
"query" : {
"bool" : {
"must" : [
{
"query_string" : {
"query" : "example~",
"analyzer" : "standard",
"default_operator" : "OR",
"fuzziness" : "AUTO"
}
},
{
"term" : {
"client" : {
"value" : "MyClient",
"boost" : 1
}
}
},
{
"range" : {
"dateCreate" : {
"gte" : "2016-01-01T00:00:00+0200",
"lte" : "2016-12-31T23:59:59+0200"
}
}
},
{
"match" : {
"lang" : "php OR java"
}
}
]
}
},
"size" : 10,
"from" : 0,
"sort" : [
{
"_score" : {
"order" : "desc"
}
}
]
}
The "lang" field is of type text.
I expect to get all documents that match the given query string, and then I want to keep only the documents that have "PHP" or "Java" in their lang field. The lang field only ever contains either "PHP" or "Java", never both, so I thought about using exact matching, but I couldn't get it to work.
The result is actually a list of two documents but with total_count=2510.
One of my documents that doesn't match:
{
"id" : "d3295f18-a033-4934-941a-21a8bef901e8",
"client" : "MyClient",
"lang" : "PHP",
"author" : null,
"dateCreate" : "2016-03-31T00:00:00+0200",
"title" : "Sample document",
"content" : "This is a short text describing the deocument."
}
Yes, the client field is also of type text.
Either the client field must be of type keyword for the term query to work, or you need to change the query on client from term to match:
{
"match" : {
"client" : {
"query" : "MyClient",
"boost" : 1
}
}
}
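The reason behind this can be sketched with a heavily simplified Python model. The function `standard_analyze` below is only a rough stand-in for the standard analyzer (split on non-word characters, lowercase), and `term_query` and `match_query` are illustrative names, not Elasticsearch APIs.

```python
import re

def standard_analyze(text):
    """Rough stand-in for the standard analyzer: tokenize and lowercase."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]

def term_query(indexed_tokens, value):
    # `term` does not analyze the search value; it must equal an indexed
    # token exactly.
    return value in indexed_tokens

def match_query(indexed_tokens, value):
    # `match` analyzes the search text first, then looks for any of the
    # resulting tokens (OR is the default operator).
    return any(tok in indexed_tokens for tok in standard_analyze(value))

indexed = standard_analyze("MyClient")   # what a text field stores: ["myclient"]
print(term_query(indexed, "MyClient"))   # False
print(match_query(indexed, "MyClient"))  # True
```

Under the same simplified model, a match on lang with "php OR java" is analyzed into the tokens php, or, and java, so it would match a document whose lang is "PHP". That points to the term clause on client as the one excluding the sample document.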

elasticsearch nested functionScoreQuery cannot access parent properties

I have a type in elasticsearch that looks like this:
"hotel" : {
"field" : 1,
"rooms" : [
{
"type" : "single",
"magicScore" : 1
},
{
"type" : "double",
"magicScore" : 2
}
]
}
where rooms is of type nested. I sort using a nested functionScoreQuery:
{
"query" : {
"filtered" : {
"query" : {
"nested" : {
"query" : {
"function_score" : {
"filter" : {
"match_all" : { }
},
"functions" : [ {
"script_score" : {
"script" : "return doc['hotel.field'].value"
}
} ]
}
},
"path" : "rooms",
"score_mode" : "max"
}
}
}
}
}
The problem is that hotel.field always returns 0. Is there a way to access the parent field inside a nested query? I know I can always pack the field inside the nested document, but that's a hack, not a solution. Would using a dismax query help me? https://discuss.elastic.co/t/nested-value-on-function-score/29935
The query I am actually using looks something like this:
{
"query" : {
"bool" : {
"must" : {
"nested" : {
"query" : {
"function_score" : {
"query" : {
"not" : {
"query" : {
"terms" : {
"rooms.type" : [ "single", "triple" ]
}
}
}
},
"functions" : [ {
"script_score" : {
"script" : {
"inline" : "return doc['rooms.magicScore'].value;",
"lang" : "groovy",
"params" : {
"ratings" : {
"sample" : 0.5
},
"variable" : [ 0.0, 0.0, 0.0, 0.0, -0.5, -2.5]
}
}
}
} ],
"score_mode" : "max"
}
},
"path" : "rooms"
}
},
"filter" : {
"bool" : {
"filter" : [ {
"bool" : {
"should" : [ {
"term" : {
"cityId" : "166"
}
}, {
"term" : {
"cityId" : "165"
}
} ]
}
}, {
"nested" : {
"query" : {
"not" : {
"query" : {
"terms" : {
"rooms.type" : [ "single", "triple" ]
}
}
}
},
"path" : "rooms"
}
} ]
}
}
}
}
}
What I am trying to achieve is to access for example the cityId inside the function_score query which is nested.
The question is why you are accessing parent values in a nested query. Once you are in the nested context, you cannot access fields of the parent document or fields of other nested documents.
From the documentation:
The nested clause “steps down” into the nested comments field. It no longer has access to fields in the root document, nor fields in any other nested document.
So, rewrite your queries so that the nested part touches the fields in that nested field and anything else is accessed outside the nested part.
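One possible shape for such a rewrite is sketched below in Python as a plain dict. The helper name `build_hotel_query` is hypothetical, the script is copied from the question, the room-type exclusion is left out for brevity, and the two cityId term clauses are folded into a single `terms` filter, which is an assumption about what the asker wants.

```python
import json

def build_hotel_query(city_ids):
    """Root-level fields (cityId) stay outside the nested clause;
    only rooms.* fields are used inside it."""
    return {
        "query": {
            "bool": {
                "must": {
                    "nested": {
                        "path": "rooms",
                        "score_mode": "max",
                        "query": {
                            "function_score": {
                                "functions": [{
                                    "script_score": {
                                        "script": {
                                            # Touches only a nested field.
                                            "inline": "return doc['rooms.magicScore'].value;",
                                            "lang": "groovy",
                                        }
                                    }
                                }]
                            }
                        },
                    }
                },
                # Evaluated against the root document, outside the nested scope.
                "filter": {"terms": {"cityId": list(city_ids)}},
            }
        }
    }

hotel_query = build_hotel_query(["165", "166"])
print(json.dumps(hotel_query, indent=2))
```

If the score really has to combine a root field with a nested one, this layout alone does not achieve it; it only shows where each kind of field can legally be referenced.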

Elasticsearch bool filter for multiple conditions on same element of array

I'm trying to create a query/filter that matches a document only if a number of conditions are met on the same item of an array.
Let's say this is the document:
{
"arr" : [
{ "f1" : "a", "f2" : true },
{ "f1" : "b", "f2" : false }
]
}
I want to be able to retrieve documents that have N conditions matching on the same element. For example: arr.f1 == "a" AND arr.f2 == true should match the document but arr.f1 == "b" AND arr.f2 == true should not.
I'm trying nested bool filters (I have other filters apart from this one) but it doesn't work. It is something along the lines of:
"bool" : {
"must": [
{ some other filter },
{"bool": {"must" : [
{"term" : {"arr.f1" : "a"}},
{"term" : {"arr.f2" : true}}
] }}
]
}
Any idea how to do that?
thanks
edit:
I changed the mapping and now a nested query works as per Val's response. I'm now not able to do an "exists" filter on the nested field:
A simple { "filter" : {"exists" : { "field" : "arr" } } } search returns no hits. How do I do that?
edit: It looks like I need to do a nested exists filter to check that a field inside the nested object exists.
something like:
"filter" : {
"nested" : {"path" : "arr", "filter" : {"exists" : { "field" : "f1" } }}
}
edit:
argh - now highlight doesn't work anymore:
"highlight" : {
"fields" : [
{"arr.f1" : {}}
]
}
Worked around that by adding include_in_parent : true and querying both the nested field and the root object. It's just awful. If anyone has a better idea, they're more than welcome!
{
"query" : {
"bool" : {
"must": [
{"term" : { "arr.f1" : "a" }},
{ "nested" : { "path" : "arr",
"query" : { "bool" : { "must" : [
{"term" : { "arr.f1" : "a" }},
{"term" : { "arr.f2" : true }}
] } }
}}
]
}
},
"highlight" : {
"fields" : [
{"arr.f1" : {}}
]
}
}
In case you're wondering: it's legacy stuff. I can't reindex right now (that would be the obvious solution) and I need a quick & dirty workaround.
You need to set the type of your arr field as nested like this:
{
"your_type": {
"properties": {
"arr": {
"type": "nested",
"properties": {
"f1": {"type":"string"},
"f2": {"type":"boolean"}
}
}
}
}
}
Then you need to use a nested query:
{
"nested" : {
"path" : "arr",
"query" : {
"bool" : {
"must" : [
{
"term" : {"arr.f1" : "a"}
},
{
"term" : {"arr.f2" : true}
}
]
}
}
}
}
Your exists filter needs to specify the full field path:
"filter" : {
"nested" : {"path" : "arr", "filter" : {"exists" : { "field" : "arr.f1" } }}
}
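As background on why the nested mapping matters: on a regular object field, an array of objects is flattened into independent per-field value lists, so the link between f1 and f2 of the same element is lost. The toy Python model below (not Elasticsearch code; both function names are invented for the illustration) contrasts the two behaviours on the document from the question.

```python
doc = {"arr": [{"f1": "a", "f2": True}, {"f1": "b", "f2": False}]}

def flattened_match(doc, f1, f2):
    """Object-type behaviour: each field becomes an independent bag of values."""
    f1_values = {item["f1"] for item in doc["arr"]}
    f2_values = {item["f2"] for item in doc["arr"]}
    return f1 in f1_values and f2 in f2_values

def nested_match(doc, f1, f2):
    """Nested behaviour: both conditions must hold on the same array element."""
    return any(item["f1"] == f1 and item["f2"] == f2 for item in doc["arr"])

print(flattened_match(doc, "b", True))  # True (unwanted cross-element match)
print(nested_match(doc, "b", True))     # False
print(nested_match(doc, "a", True))     # True
```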
