Elasticsearch match against filter only

We have a multi-tenant index and need to perform queries against the index for a single tenant only. Basically, for all documents that match the filter, return any documents that match the following query, but do not include documents that only match the filter.
For example, say we have a list of documents like so:
{ _id: 1, account_id: 1, name: "Foo" }
{ _id: 2, account_id: 2, name: "Bar" }
{ _id: 3, account_id: 2, name: "Foo" }
I thought this query would work but it doesn't:
{
"bool": {
"filter": { "term": { "account_id": 2 } },
"should": [
{ "match": { "name": "Foo" } }
]
}
}
It returns both documents matching account_id: 2:
{ _id: 3, account_id: 2, name: "Foo", score: 1.111 }
{ _id: 2, account_id: 2, name: "Bar", score: 0.0 }
What I really want is it just to return document _id: 3, which is basically "Of all documents where account_id is equal to 2, return only the ones whose names match Foo".
How can I accomplish this with ES 6.2? The caveat is that the number of should and must match conditions are not always known and I really want to avoid using minimum_should_match.

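Try this instead: simply replace should with must. Because the bool query also has a filter clause, its should clauses are optional and only affect scoring, whereas a must clause has to match: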
{
"bool": {
"filter": { "term": { "account_id": 2 } },
"must": [
{ "match": { "name": "Foo" } }
]
}
}
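To see why the should version returns both documents, here is a minimal in-memory sketch of the bool semantics involved. It is not Elasticsearch code: DOCS, clause_matches, matches_bool and search are names invented for illustration, and term/match are reduced to case-insensitive equality checks. The one rule it models is that should clauses become optional once the bool query also has a must or filter clause.

```python
# Toy model of bool-query matching; every name here is illustrative only.
DOCS = [
    {"_id": 1, "account_id": 1, "name": "Foo"},
    {"_id": 2, "account_id": 2, "name": "Bar"},
    {"_id": 3, "account_id": 2, "name": "Foo"},
]


def as_list(clauses):
    """Accept either a single clause or a list of clauses."""
    return clauses if isinstance(clauses, list) else [clauses]


def clause_matches(doc, clause):
    """Evaluate one term/match clause as a case-insensitive equality check."""
    (_kind, body), = clause.items()  # "term" vs "match" is ignored in this toy model
    (field, value), = body.items()
    return str(doc.get(field)).lower() == str(value).lower()


def matches_bool(doc, bool_query):
    """Return True if doc satisfies the filter, must and should clauses."""
    filters = as_list(bool_query.get("filter", []))
    musts = as_list(bool_query.get("must", []))
    shoulds = as_list(bool_query.get("should", []))
    if not all(clause_matches(doc, c) for c in filters + musts):
        return False
    # should clauses are required only when there is no must/filter clause
    required = 0 if (filters or musts) else (1 if shoulds else 0)
    return sum(clause_matches(doc, c) for c in shoulds) >= required


def search(bool_query):
    """Return the ids of all documents that match the bool query."""
    return [d["_id"] for d in DOCS if matches_bool(d, bool_query)]


tenant = {"term": {"account_id": 2}}
name_foo = {"match": {"name": "Foo"}}

should_ids = search({"filter": tenant, "should": [name_foo]})
must_ids = search({"filter": tenant, "must": [name_foo]})
print(should_ids, must_ids)
```

With should, the name clause only influences the score, so both tenant documents come back; with must, only the document that also matches the name survives.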

Related

Elasticsearch OR condition with Multiple Criteria

I need to query specific results in elasticsearch, using a set of global filters, and an extended OR clause, where each case in the OR clause consists of a specific tuple of conditions. In words, it would be something like this: 'Fetch records which match these global filters: category = 'foo' and name = 'bar'. From that set, fetch records where the keywords (x, y, z) match any of the following: (1, 2, 3), (4, 5, 6), or (7, 8, 9).'
For example, if I have these items:
Item 1:
category: foo, name: bar,
x: 1, y: 2, z: 3
Item 2:
category: foo, name: baz,
x: 1, y: 2, z: 3
Item 3:
category: foo, name: bar,
x: 4, y: 5, z: 6
Item 4:
category: foo, name: bar,
x: 10, y: 11, z: 12
The search should not return Item 2 (because it fails the global condition that name = 'bar'), or Item 4 (because it has (x, y, z) = (10, 11, 12), which was not one of my specified/allowed tuples). It should return the other items, which match both the global conditions and fall within the list of specified/allowed tuples for the values of x, y, z.
I know I could issue one simple query per item to do this; but I assume this would be very inefficient, since I need to specify on the order of 10K tuples or more, each time.
Apologies if this was already answered; one of the existing answers may already be adaptable for this, but I am too new to elasticsearch to recognize how to do it.
Environment: elasticsearch 7.10.1 in Python 3.8.
Here you go,
{
"query": {
"bool": {
"must": [
{
"term": {
"category": "foo"
}
},
{
"term": {
"name": "bar"
}
},
{
"bool": {
"should": [
{
"bool": {
"must": [
{
"term": {
"x": 1
}
},
{
"term": {
"y": 2
}
},
{
"term": {
"z": 3
}
}
]
}
},
{
"bool": {
"must": [
{
"term": {
"x": 4
}
},
{
"term": {
"y": 5
}
},
{
"term": {
"z": 6
}
}
]
}
},
{
"bool": {
"must": [
{
"term": {
"x": 7
}
},
{
"term": {
"y": 8
}
},
{
"term": {
"z": 9
}
}
]
}
}
]
}
}
]
}
}
}
All you need is a bool query with a combination of must and should clauses.
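Since the question mentions Python and a large number of tuples, the same query shape could be generated rather than written by hand. The following is only a sketch: build_tuple_query and its parameters are hypothetical names, and it produces nothing more than the request body shown above as a plain dict.

```python
import json


def build_tuple_query(global_terms, field_names, allowed_tuples):
    """Build a bool query: every global term AND any one of the allowed tuples."""
    # One nested bool/must per tuple, e.g. x=1 AND y=2 AND z=3.
    tuple_clauses = [
        {"bool": {"must": [{"term": {f: v}} for f, v in zip(field_names, values)]}}
        for values in allowed_tuples
    ]
    # Global conditions first, then the OR over all tuples.
    must = [{"term": {field: value}} for field, value in global_terms.items()]
    must.append({"bool": {"should": tuple_clauses}})
    return {"query": {"bool": {"must": must}}}


body = build_tuple_query(
    {"category": "foo", "name": "bar"},
    ("x", "y", "z"),
    [(1, 2, 3), (4, 5, 6), (7, 8, 9)],
)
print(json.dumps(body, indent=2))
```

With tuple counts in the range mentioned in the question, the request may run into the cluster's limit on the number of bool clauses, so it is worth checking that setting and possibly sending the tuples in batches.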

Return unique results in elasticsearch

I have a use case in which I have data like
{
name: "John",
parentid: "1234",
filter: {a: '1', b: '3', c: '4'}
},
{
name: "Tim",
parentid: "2222",
filter: {a: '2', b: '1', c: '4'}
},
{
name: "Mary",
parentid: "1234",
filter: {a: '1', b: '3', c: '5'}
},
{
name: "Tom",
parentid: "2222",
filter: {a: '1', b: '3', c: '1'}
}
expected results:
bucket:[{
key: "2222",
hits: [{
name: "Tom" ...
},
{
name: "Tim" ...
}]
},
{
key: "1234",
hits: [{
name: "John" ...
},
{
name: "Mary" ...
}]
}]
I want to return unique documents by parentid. I could use a top hits aggregation, but I don't know how to paginate the buckets. A parentid is more likely to be different than the same, so my bucket array would be large, and I want to show all of the buckets by paginating them.
There is no direct way of doing this. But you can follow these steps to get desired result.
Step 1. You need to know all parentid values. You can get them with a simple terms aggregation on the parentid field, which returns only the list of parentid values, not the documents matching them. The result is a much smaller array than the one you are currently expecting.
{
"aggs": {
"parentids": {
"terms": {
"field": "parentid",
"size": 0
}
}
}
}
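size: 0 is required to return all results. Note that this form is only accepted by older Elasticsearch versions; from 5.0 onward a terms aggregation needs an explicit positive size.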
OR
If you already know list of all parentid then you can directly move to step 2.
Step 2. Fetch related documents by filtering documents by parentid and here you can apply pagination.
{
"from": 0,
"size": 20,
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"term": {
"parentid": "2222"
}
}
}
}
}
from and size are used for pagination, so you can loop through each parentid in the list and fetch all related documents.
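The loop in step 2 could be sketched as follows. This only builds request bodies and sends nothing; page_bodies and the doc_counts values are hypothetical, and the sketch uses a bool query with a filter clause in place of the filtered query above, since newer Elasticsearch versions no longer accept filtered.

```python
def page_bodies(parent_id, total_hits, page_size=20):
    """Yield one search body per page for a single parentid."""
    for offset in range(0, total_hits, page_size):
        yield {
            "from": offset,
            "size": page_size,
            "query": {"bool": {"filter": {"term": {"parentid": parent_id}}}},
        }


# Hypothetical counts, e.g. taken from the doc_count of each step 1 bucket.
doc_counts = {"2222": 45, "1234": 10}

bodies = [b for pid, n in doc_counts.items() for b in page_bodies(pid, n)]
offsets = [
    (b["query"]["bool"]["filter"]["term"]["parentid"], b["from"]) for b in bodies
]
print(offsets)
```

Each yielded body corresponds to one search request, so a parentid with 45 documents and a page size of 20 produces three requests.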
If you are just looking for all names grouped by parent id, you can use below query:
{
"query": {
"match_all": {}
},
"aggs": {
"parent": {
"terms": {
"field": "parentid",
"size": 0
},
"aggs": {
"NAME": {
"terms": {
"field": "name",
"size": 0
}
}
}
}
},
"size": 0
}
If you want the entire document grouped by parentid, it will be a 2 step process as explained by Sumit above, and you can use pagination there.
Aggregation doesn't give you access to all documents/document-ids in the agg result, so this will have to be a 2 step process.
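For the names-grouped-by-parent case, the nested terms aggregation response can be flattened into a simple mapping. The sketch below assumes the aggregation names parent and NAME from the query above; names_by_parent is a hypothetical helper and sample_response is hand-written to mimic the response shape, not real output.

```python
def names_by_parent(response):
    """Map each parentid bucket key to the list of name bucket keys under it."""
    return {
        bucket["key"]: [sub["key"] for sub in bucket["NAME"]["buckets"]]
        for bucket in response["aggregations"]["parent"]["buckets"]
    }


# Hand-written sample shaped like a nested terms aggregation response.
sample_response = {
    "aggregations": {
        "parent": {
            "buckets": [
                {"key": "1234", "doc_count": 2,
                 "NAME": {"buckets": [{"key": "John", "doc_count": 1},
                                      {"key": "Mary", "doc_count": 1}]}},
                {"key": "2222", "doc_count": 2,
                 "NAME": {"buckets": [{"key": "Tim", "doc_count": 1},
                                      {"key": "Tom", "doc_count": 1}]}},
            ]
        }
    }
}

grouped = names_by_parent(sample_response)
print(grouped)
```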

query_string query doesn't take word order into consideration

I have the following documents:
{ _id: 1, name: "ello worl" }
{ _id: 2, name: "world hello" }
{ _id: 3, name: "hello world" }
When I execute the following query:
{
"query": {
"query_string": {
"query": "*ello* *worl*"
}
}
}
Documents are returned in the same order as listed above, but I was expecting the order 1, 3, 2.
My question is: Why doesn't the third document have higher score than the second one?
P.S. Using wildcards is mandatory.
It turns out that Elasticsearch cannot rank documents by relevance for wildcard queries out of the box. The workaround I found is a bool query with several should clauses, each of which runs a progressively looser wildcard search with a lower boost:
{
"query": {
"bool": {
"should": [
{ "query_string": { "query": "ello worl", "boost": 4 } },
{ "query_string": { "query": "ello* worl*", "boost": 3, "analyze_wildcard": true } },
{ "query_string": { "query": "*ello *worl", "boost": 2, "analyze_wildcard": true } },
{ "query_string": { "query": "*ello* *worl*", "analyze_wildcard": true } }
]
}
}
}
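The four clauses follow a regular pattern, so they could be generated for any list of search words. This is a sketch of that idea only: boosted_wildcard_query is a hypothetical helper, and the boost values simply mirror the ones above, with the last clause given an explicit boost of 1, which is the default.

```python
def boosted_wildcard_query(words):
    """Build should clauses ordered from most specific (exact) to least (infix)."""
    patterns = [
        ("{}", 4),    # exact terms
        ("{}*", 3),   # prefix match
        ("*{}", 2),   # suffix match
        ("*{}*", 1),  # infix match (boost 1 is the default)
    ]
    should = []
    for template, boost in patterns:
        clause = {
            "query": " ".join(template.format(w) for w in words),
            "boost": boost,
        }
        if "*" in template:
            clause["analyze_wildcard"] = True
        should.append({"query_string": clause})
    return {"query": {"bool": {"should": should}}}


body = boosted_wildcard_query(["ello", "worl"])
for clause in body["query"]["bool"]["should"]:
    print(clause["query_string"])
```

A document that matches several of the clauses accumulates their scores, which is what pushes exact and prefix matches above pure infix matches.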

Elastic search returning wrong results

I am running a query against elastic search but the results returned are wrong. The idea is that I can check against a range of fields with individual queries. But when I pass the following query, items which don't have the included lineup are returned.
{
"query": {
"bool": {
"must": [
{ "match": { "lineup.name": { "query": "The 1975" } } }
]
}
}
}
The documents are events that look like this:
{
title: 'Glastonbury',
country: 'UK',
lineup: [
{
name: 'The 1975',
genre: 'Indie',
headliner: false
}
]
},
{
title: 'Reading',
country: 'UK',
lineup: [
{
name: 'The Strokes',
genre: 'Indie',
headliner: true
}
]
}
In my case both of these events are returned.
The mapping can be seen here:
https://jsonblob.com/567e8f10e4b01190df45bb29
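You need a phrase match here. A match query analyzes "The 1975" into the terms The and 1975 and looks for either one, so it finds The in The Strokes and returns that event as well. The query below uses the match query's phrase type; in Elasticsearch versions where that option is no longer available, the dedicated match_phrase query does the same job.
Try: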
{
"query": {
"bool": {
"must": [
{
"match": {
"lineup.name": {
"query": "The 1975",
"type": "phrase"
}
}
}
]
}
}
}
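The difference between the two behaviours can be imitated in a few lines. This is a rough in-memory sketch, not how Elasticsearch is implemented: tokens is a crude stand-in for an analyzer (lowercase, split on whitespace), and match_any and match_phrase are hypothetical helpers.

```python
def tokens(text):
    """Very rough stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match_any(query, field_value):
    """Like a match query with the default OR operator: any shared token matches."""
    return bool(set(tokens(query)) & set(tokens(field_value)))


def match_phrase(query, field_value):
    """Like a phrase match: the query tokens must appear adjacent and in order."""
    q, f = tokens(query), tokens(field_value)
    return any(f[i:i + len(q)] == q for i in range(len(f) - len(q) + 1))


for name in ("The 1975", "The Strokes"):
    print(name, match_any("The 1975", name), match_phrase("The 1975", name))
```

In this model "The Strokes" passes match_any because it shares the token "the" with the query, but it fails match_phrase because "the 1975" never appears as a contiguous sequence.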

Elastic Search: matching documents whose array contains this field

I have a document similar to this:
{
name: "bob",
contains: ["a", "b", "c"]
},
{
name: "mary",
contains: ["a", "b"]
},
{
name: "Jason",
contains: ["b"]
}
I want to make a query to find all of the people who contain "a" (bob and mary). How can I write the query?
EDIT:
Current query:
{
"query": {
"bool": {
"must": [
{ "match": { "exists": "yes" } },
{ "term": { "contains": "a" } }
],
"must_not": [
{ "match": { "status": "removed" } }
]
}
}
}
A term filter/query on the contains field, such as {term: {contains: "a"}} will get you what you need. Assuming that you want to just match any document which satisfies that criteria, the full query would look something like:
{
"query": {
"filtered": {
"filter": {
"term": {
"contains": "a"
}
}
}
}
}
This works because each element of an array is indexed as a separate value, so a term query on the field matches any document whose array contains that value.
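The idea that each array element is indexed on its own can be pictured with a toy inverted index. This is only an illustration in plain Python, with build_inverted_index as a hypothetical helper; it says nothing about how Elasticsearch stores data internally beyond that one point.

```python
from collections import defaultdict

people = [
    {"name": "bob", "contains": ["a", "b", "c"]},
    {"name": "mary", "contains": ["a", "b"]},
    {"name": "Jason", "contains": ["b"]},
]


def build_inverted_index(docs, field):
    """Map every individual value of `field` to the names of docs holding it."""
    index = defaultdict(set)
    for doc in docs:
        values = doc[field]
        if not isinstance(values, list):
            values = [values]  # a single value behaves like a one-element array
        for value in values:
            index[value].add(doc["name"])
    return index


index = build_inverted_index(people, "contains")
print(sorted(index["a"]))
```

Looking up "a" in this index plays the role of the term query: it returns every document whose array held that value, regardless of what else the array contained.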
