Elasticsearch must_not filter does not work with a large number of values - elasticsearch

I have the following query, which includes some filters:
{
"from": 0,
"query": {
"function_score": {
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"term": {
"idpais": [
115
]
}
},
{
"term": {
"tipo": [
1
]
}
}
],
"must_not": [
{
"term": {
"idregistro": [
5912471,
3433876,
9814443,
11703069,
6333176,
8288242,
9924922,
6677850,
11852501,
12530205,
4703469,
12776479,
12287659,
11823679,
12456304,
12777457,
10977614,
...
]
}
}
]
}
},
"query": {
"bool": {
"should": [
{
"match_phrase": {
"area": "Coordinator"
}
},
{
"match_phrase": {
"company": {
"boost": 5,
"query": "IBM"
}
}
},
{
"match_phrase": {
"topic": "IT and internet stuff"
}
},
{
"match_phrase": {
"institution": {
"boost": 5,
"query": "University of my city"
}
}
}
]
}
}
}
},
"script_score": {
"params": {
"idpais": 115,
"idprovincia": 0,
"relationships": []
},
"script_id": "ScoreUsuarios"
}
}
},
"size": 24,
"sort": [
{
"_script": {
"order": "desc",
"script_id": "SortUsuarios",
"type": "number"
}
}
]
}
The must_not filter has a large number of values to exclude (around 200), but it looks like Elasticsearch ignores them and still includes those documents in the result set. If I set only a few values (10 to 20), Elasticsearch applies the must_not filter.
Is there a limit on the number of values in a filter? Is there a way to exclude a large number of results from the query?

A term query takes a single value; to pass a list of values, use a terms query instead. Use it as shown below in your must and must_not filters.
{
"query": {
"terms": {
"field_name": [
"VALUE1",
"VALUE2"
]
}
}
}
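If the list of IDs to exclude is built in application code, a small helper can produce the must_not clause. This is only a minimal sketch in plain Python: the function name build_exclusion_filter is made up for illustration, and the three IDs are just the first few from the question.

```python
import json


def build_exclusion_filter(field, excluded_values):
    """Build a bool filter whose must_not holds ONE terms clause for all values.

    Hypothetical helper, not part of any Elasticsearch client library.
    """
    return {"bool": {"must_not": [{"terms": {field: list(excluded_values)}}]}}


# First few IDs from the question; the real list has around 200 entries.
excluded_ids = [5912471, 3433876, 9814443]
exclusion_filter = build_exclusion_filter("idregistro", excluded_ids)

# Print the JSON that would go in place of the original "bool" filter.
print(json.dumps(exclusion_filter, indent=2))
```

The point is that the whole list goes into one terms clause rather than into a term clause.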

Elasticsearch multiple fields wildcard bool query

I am currently using a bool query that searches the "Name" field for a combination of both input words or for either input word. How can I search on multiple fields using wildcards?
POST inventory_dev/_search
{"from":0,"query":{"bool":{"must":[{"bool":{"should":[{"term":{"Name":{"value":"dove"}}},{"term":{"Name":{"value":"3.75oz"}}},{"bool":{"must":[{"wildcard":{"Name":{"value":"*dove*"}}},{"wildcard":{"Name":{"value":"*3.75oz*"}}}]}}]}}]}},"size":10,"sort":[{"_score":{"order":"desc"}}]}
You can use query_string in place of the wildcard query to search on multiple fields:
{
"from": 0,
"query": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"term": {
"Name": {
"value": "dove"
}
}
},
{
"term": {
"Name": {
"value": "3.75oz"
}
}
},
{
"bool": {
"must": [
{
"query_string": {
"query": "*dove*",
"fields": [
"field1",
"Name"
]
}
},
{
"query_string": {
"query": "*3.75oz*",
"fields": [
"field1",
"Name"
]
}
}
]
}
}
]
}
}
]
}
},
"size": 10,
"sort": [
{
"_score": {
"order": "desc"
}
}
]
}
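If the input words vary, the inner query_string clauses can be generated instead of written by hand. A rough sketch in plain Python, assuming the same "field1" and "Name" fields as in the answer above; the helper name multi_field_wildcards is made up for illustration.

```python
import json


def multi_field_wildcards(words, fields):
    """Return one query_string clause per word, each wrapped in wildcards.

    Hypothetical helper; it only builds dictionaries and sends nothing
    to Elasticsearch.
    """
    return [
        {"query_string": {"query": f"*{word}*", "fields": list(fields)}}
        for word in words
    ]


clauses = multi_field_wildcards(["dove", "3.75oz"], ["field1", "Name"])

# All words must match, so the clauses go under "must" of the inner bool.
inner_bool = {"bool": {"must": clauses}}
print(json.dumps(inner_bool, indent=2))
```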

Combining missing and term query in nested document in Elasticsearch

I have these 3 documents, where fields is of type nested:
{
"fields": [
{"field_id": 23, "value": "John Doe"},
{"field_id": 92, "value": null}
]
}
{
"fields": [
{"field_id": 23, "value": "Ada Lovelace"}
]
}
{
"fields": [
{"field_id": 23, "value": "Jack Daniels"},
{"field_id": 92, "value": "jack#example.com"}
]
}
I need to search for documents where:
(`field_id` = `92` AND `value` is `null`) OR (`field_id` `92` is missing.)
Combining a terms and missing query leads to only the document with the null value being returned:
...
"nested": {
"path": "fields",
"filter": {
"bool": {
"must": [
{
"missing": {
"field": "fields.value"
}
},
{
"terms": {
"fields.field_id": [92]
}
}
]
}
}
}
...
How can I do this?
You already have a query for the first condition; call it A. For the second condition, write a nested query that matches fields.field_id: 92; call it B. Your requirement, however, is that fields.field_id: 92 should not exist, so wrap B in must_not to get B'.
What is required is A OR B'.
So the final query will be:
{
"query": {
"bool": {
"should": [
{
"nested": {
"path": "fields",
"query": {
"bool": {
"must": [
{
"term": {
"fields.field_id": 92
}
}
],
"must_not": [
{
"exists": {
"field": "fields.value"
}
}
]
}
}
}
},
{
"bool": {
"must_not": [
{
"nested": {
"path": "fields",
"query": {
"term": {
"fields.field_id": 92
}
}
}
}
]
}
}
]
}
}
}
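To sanity-check the A OR B' logic without a cluster, it can be emulated in plain Python against the three documents from the question. This is only a sketch of the boolean logic, not of how Elasticsearch evaluates nested queries internally, and the function names are made up for illustration.

```python
# The three sample documents from the question (null becomes None).
docs = [
    {"fields": [{"field_id": 23, "value": "John Doe"},
                {"field_id": 92, "value": None}]},
    {"fields": [{"field_id": 23, "value": "Ada Lovelace"}]},
    {"fields": [{"field_id": 23, "value": "Jack Daniels"},
                {"field_id": 92, "value": "jack#example.com"}]},
]


def condition_a(doc):
    """A: some nested object has field_id 92 and no value."""
    return any(f["field_id"] == 92 and f.get("value") is None
               for f in doc["fields"])


def condition_b_negated(doc):
    """B': no nested object has field_id 92 at all."""
    return not any(f["field_id"] == 92 for f in doc["fields"])


# Indexes of the documents that satisfy A OR B'.
matches = [i for i, doc in enumerate(docs)
           if condition_a(doc) or condition_b_negated(doc)]
print(matches)  # [0, 1]
```

The first document matches through A, the second through B', and the third matches neither.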

Using multiple Should queries

I want to get docs that are similar to multiple "groups", but separately. Each group has its own rules (terms).
When I try to use more than one should clause inside a "bool", I get items that are a mix of the terms of both should clauses.
I want to use one query in total and not, for example, msearch.
Can someone please help me with that?
{
"explain": true,
"query": {
"filtered": {
"filter": {
"bool": {
"must_not": [
{
"term": {
"p_id": "123"
}
},
{
"term": {
"p_id": "124"
}
}
]
}
},
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"term": {
"cat": "1"
}
},
{
"term": {
"cat": "2"
}
},
{
"term": {
"keys": "a"
}
},
{
"term": {
"keys": "b"
}
}
]
}
},
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"term": {
"cat": "6"
}
},
{
"term": {
"cat": "7"
}
},
{
"term": {
"keys": "r"
}
},
{
"term": {
"keys": "u"
}
}
]
}
}
]
}
}
}
},
"from": 0,
"size": 3
}
You can try using a terms aggregation on multiple fields with scripting, and add a top hits aggregation as a sub-aggregation. Be warned that this will be pretty slow. Add this after the query/filter and adjust the size parameter as needed:
"aggs": {
"Cat_and_Keys": {
"terms": {
"script": "doc['cat'].values + doc['keys'].values"
},
"aggs":{ "separate_docs": {"top_hits":{"size":1 }} }
}
}

How to query multiple parameters in a nested field in elasticsearch

I'm trying to search for a keyword and then add nested queries for amenities, which is a nested field holding an array of objects.
With the query below I am able to search when I match only one amenity id, but when I have more than one it doesn't return anything.
Does anyone have an idea what is wrong with my query?
{
"sort": [
{
"_score": {
"order": "desc"
}
},
{
"_geo_distance": {
"geolocation": [
100,
10
],
"order": "asc",
"unit": "m",
"mode": "min",
"distance_type": "sloppy_arc"
}
}
],
"query": {
"bool": {
"must": [
{
"multi_match": {
"fields": [
"name^2",
"city",
"state",
"zip"
],
"fuzziness": 5,
"query": "complete"
}
},
{
"nested": {
"path": "amenities",
"query": {
"bool": {
"must": [
{
"term": {
"amenities.id": "1"
}
},
{
"term": {
"amenities.id": "2"
}
}
]
}
}
}
}
]
}
}
}
When you do:
"must": [
{
"term": {
"amenities.id": "1"
}
},
{
"term": {
"amenities.id": "2"
}
}]
What you're actually asking for is any document where "amenities.id"="1" and "amenities.id"="2" hold at the same time, which cannot match unless "amenities.id" is a list of values.
What you probably want to ask for is any document where "amenities.id"="1" or "amenities.id"="2".
To do that, use should instead of must:
"should": [
{
"term": {
"amenities.id": "1"
}
},
{
"term": {
"amenities.id": "2"
}
}]
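For comparison, here is a rough Python sketch that builds both variants. nested_any expresses the "or" case from this answer as a single terms clause. nested_all is an assumption about what the question may have intended (a document that has both amenities): it emits one nested clause per id, so each id can be matched by a different nested object. Both helper names are made up for illustration.

```python
import json


def nested_any(path, field, values):
    """One nested clause: some nested object has ANY of the values."""
    return {"nested": {"path": path, "query": {"terms": {field: list(values)}}}}


def nested_all(path, field, values):
    """One nested clause PER value, meant to be placed under the outer must."""
    return [
        {"nested": {"path": path, "query": {"term": {field: value}}}}
        for value in values
    ]


any_clause = nested_any("amenities", "amenities.id", ["1", "2"])
all_clauses = nested_all("amenities", "amenities.id", ["1", "2"])

# "Has amenity 1 AND amenity 2": every nested clause must match.
print(json.dumps({"bool": {"must": all_clauses}}, indent=2))
```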

Elasticsearch Query for getting field with 'AND' relation

I have an Elasticsearch document as below.
I want a search query that satisfies this condition:
how do I get those OPERATIONS and CATEGORY values that have both AREA=Mumbai and AREA=Chennai?
So the output should be CATEGORY: Consulting1, OPERATIONS: Regulatory Operations
Use a terms query:
{
"query": {
"terms": {
"AREA": [
"Mumbai",
"Chennai"
]
}
}
}
Maybe this works:
{
"query": {
"bool": {
"must": [
{"term": { "AREA" : "Mumbai" }},
{"term": { "AREA" : "Chennai" }}
]
}
}
}
Try this and let me know:
{
"size": 0,
"query": {
"bool": {
"should": [
{
"term": {
"AREA": "mumbai"
}
},
{
"term": {
"AREA": "chennai"
}
}
]
}
},
"aggs": {
"unique_operations": {
"terms": {
"field": "OPERATIONS",
"size": 10
},
"aggs": {
"count_areas": {
"cardinality": {
"field": "AREA"
}
},
"top": {
"top_hits": {
"size": 2,
"_source": {
"include": ["CATEGORY"]
}
}
},
"areas_bucket_filter": {
"bucket_selector": {
"buckets_path": {
"areasCount": "count_areas"
},
"script": "areasCount == 2"
}
}
}
}
}
}
LATER EDIT: added top_hits aggregation to get back sample documents covering the request for the categories.
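What the aggregation above is meant to do can be emulated in plain Python: group by OPERATIONS, count the distinct AREA values per group, and keep only the groups that cover both areas. The sample documents here are made up for illustration (the original ones were not included in the question); only "Consulting1", "Regulatory Operations" and the two areas come from the post.

```python
from collections import defaultdict

# Hypothetical sample documents, invented for this sketch.
docs = [
    {"OPERATIONS": "Regulatory Operations", "CATEGORY": "Consulting1", "AREA": "mumbai"},
    {"OPERATIONS": "Regulatory Operations", "CATEGORY": "Consulting1", "AREA": "chennai"},
    {"OPERATIONS": "Other Operations", "CATEGORY": "Consulting2", "AREA": "mumbai"},
]
wanted_areas = {"mumbai", "chennai"}

# Equivalent of the should query plus the terms aggregation on OPERATIONS.
areas_by_operation = defaultdict(set)
for doc in docs:
    if doc["AREA"] in wanted_areas:
        areas_by_operation[doc["OPERATIONS"]].add(doc["AREA"])

# Equivalent of cardinality + bucket_selector: keep buckets with both areas.
selected = [op for op, areas in areas_by_operation.items()
            if len(areas) == len(wanted_areas)]
print(selected)  # ['Regulatory Operations']
```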
Please try this one.
{
"query": {
"bool": {
"should": [
{
"query_string": {
"default_field": "AREA",
"query": "mumbai"
}
},
{
"query_string": {
"default_field": "AREA",
"query": "chennai"
}
}
]
}
}
}