Finding docs that do not contain a given user in all nested fields - elasticsearch

I have the following mapping for a nested field called ratings:
"ratings" : {
"type" : "nested",
"properties" : {
"rating" : {
"type" : "double"
},
"user_id" : {
"type" : "long"
}
}
}
I'm trying to find all records where a given user_id does not appear in any of the nested ratings.
Here's what I have, but it fails when a record has multiple nested docs: the record is returned as soon as any one of them has a user_id other than 1.
{
"nested": {
"path": "ratings",
"query": {
"bool": {
"must_not": [
{ "term": { "ratings.user_id": 1 } }
]
}
}
}
}

If I understand you correctly, you want to find documents for which NONE of the nested documents have a specific user_id. In that case, this query seems to do what you want (assuming you want docs that have not been rated by user 2):
POST /test_index/_search
{
"query": {
"constant_score": {
"filter": {
"not": {
"filter": {
"nested": {
"path": "ratings",
"filter": {
"term": {
"ratings.user_id": 2
}
}
}
}
}
}
}
}
}
Here's the code I used to test it:
http://sense.qbox.io/gist/afd319e64403a7f995cbf1e9f40e5c5948729193
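To see why the position of the negation matters, here is a small toy model in plain Python. It is only a sketch of the matching logic, not Elasticsearch code: the sample data and the function names nested_must_not and not_nested are made up for illustration.

```python
# Toy model of nested-query semantics; this is NOT Elasticsearch code.
# Each key is a parent document, each list entry is one nested rating.
docs = {
    "a": [{"user_id": 1, "rating": 4.0}, {"user_id": 2, "rating": 3.0}],
    "b": [{"user_id": 2, "rating": 5.0}],
    "c": [{"user_id": 1, "rating": 2.0}],
}


def nested_must_not(ratings, user_id):
    """must_not INSIDE nested: the parent matches if ANY nested doc is not by user_id."""
    return any(r["user_id"] != user_id for r in ratings)


def not_nested(ratings, user_id):
    """Negation OUTSIDE nested: the parent matches if NO nested doc is by user_id."""
    return not any(r["user_id"] == user_id for r in ratings)


inside = sorted(k for k, v in docs.items() if nested_must_not(v, 1))
outside = sorted(k for k, v in docs.items() if not_nested(v, 1))

print(inside)   # ['a', 'b']  -> "a" is returned even though user 1 rated it
print(outside)  # ['b']       -> only the doc that user 1 never rated
```

In query terms, the second function corresponds to negating the whole nested clause, as in the answer above, rather than putting must_not inside it.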

Related

Elasticsearch: "must" query on nested fields

How to do a "must" "match" query on multiple fields under the same nesting? Here's a reproducible ES index where the "user" field is defined as "nested" type.
PUT my_index
{
"mappings": {
"properties": {
"user": {
"type": "nested",
"properties": {
"firstname": {"type": "text"}
}
}
}
}
}
And here are 2 documents:
PUT my_index/_doc/1
{
"user" : [
{
"firstname" : "John"
},
{
"firstname" : "Alice"
}
]
}
PUT my_index/_doc/2
{
"user" : [
{
"firstname" : "Alice"
}
]
}
For this index, how can I query for documents where "John" AND "Alice" both exist? With the index defined above, I expect to get Document 1 but not Document 2. So far, I've tried the following code, but it's returning no hits:
GET my_index/_search
{
"query": {
"nested": {
"path": "user",
"query": {
"bool": {
"must": [
{"match": {"user.firstname": "John"}},
{"match": {"user.firstname": "Alice"}}
]
}
}
}
}
}
The query below does what you need.
POST my_index/_search
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "user",
"query": {
"match": {
"user.firstname": "alice"
}
}
}
},
{
"nested": {
"path": "user",
"query": {
"match": {
"user.firstname": "john"
}
}
}
}
]
}
}
}
Notice that I've used two nested queries inside a single must clause. That is because, in your documents, the "Alice" and "John" entries are separate nested objects, and each nested object is treated as its own document.
Your original query would work if a single nested object held both names, as in the document below:
POST my_index/_doc/3
{
"user" : [
{
"firstname" : ["Alice", "John"]
}
]
}
Try reading the nested datatype and nested query documentation to understand how they work. The nested query documentation states:
The nested query searches nested field objects as if they were indexed as separate documents.
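The quoted behaviour can be modelled with a few lines of Python. This is a rough sketch under simplifying assumptions: lowercasing stands in for the text analyzer, and the function names single_nested_must and must_of_nested are hypothetical, not part of any API.

```python
# Toy model: each nested object is matched on its own, like a separate document.
doc1 = [{"firstname": "John"}, {"firstname": "Alice"}]
doc2 = [{"firstname": "Alice"}]
doc3 = [{"firstname": ["Alice", "John"]}]


def names(obj):
    """Return the lowercased firstname values of one nested object."""
    value = obj["firstname"]
    values = value if isinstance(value, list) else [value]
    return {v.lower() for v in values}  # crude stand-in for the text analyzer


def single_nested_must(users, wanted):
    """One nested query with bool/must: ONE nested object must satisfy every clause."""
    return any(all(w.lower() in names(u) for w in wanted) for u in users)


def must_of_nested(users, wanted):
    """bool/must of several nested queries: each clause may hit a different object."""
    return all(any(w.lower() in names(u) for u in users) for w in wanted)


wanted = ["John", "Alice"]
print(single_nested_must(doc1, wanted))  # False: no single object holds both names
print(must_of_nested(doc1, wanted))      # True
print(must_of_nested(doc2, wanted))      # False: John is missing
print(single_nested_must(doc3, wanted))  # True: one object holds both names
```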
Hope that helps!

Disable hits and use exclusively inner_hits

Beginner here.
I have designed an architecture for my profile and photo application. In this application a user can search for members by member attributes and photo attributes, and only the photos that matched the query should be returned.
The problem is that one user might have thousands of photos, and every search returns hits containing the full profile objects (with all the nested photos).
How can I make Elasticsearch return only the inner_hits values?
Here is my query:
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "photo",
"query": {
"bool": {
"must": [
{
"match": {
"photo.make": "BMW"
}
},
{
"match": {
"photo.model": "111"
}
}
]
}
},
"inner_hits" : {"size": 1}
}
}
]
}
}
}
Duplicate of: Elasticsearch: Return only nested inner_hits
Quoting:
You should be able to achieve this by disabling the source field at the top level with "_source" : false
POST /networkcollection/branch_routers/_search/
{
"_source" : false,
"query": {
"nested": {
"path": "queries",
"query": {
"bool": {
"must": [
{ "match":
{ "queries.dateQuery": "20160101T200000.000Z" }
}
]
}
},
"inner_hits" : {}
}
}
}

Elasticsearch: Return only nested inner_hits

I have the following query:
GET /networkcollection/branch_routers/_search/
{
"query": {
"nested": {
"path": "queries",
"query": {
"bool": {
"must": [
{ "match":
{ "queries.dateQuery": "20160101T200000.000Z" }
}
]
}
},
"inner_hits" : {}
}
}
}
This returns both the "hits" object (the entire document) and the "inner_hits" object (nested inside hits).
Is there a way to return only the matched "queries" element(s) that appear in the "inner_hits" results, without getting the whole document?
You should be able to achieve this by disabling the source field at the top level with "_source" : false
POST /networkcollection/branch_routers/_search/
{
"_source" : false,
"query": {
"nested": {
"path": "queries",
"query": {
"bool": {
"must": [
{ "match":
{ "queries.dateQuery": "20160101T200000.000Z" }
}
]
}
},
"inner_hits" : {}
}
}
}
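Once the top-level _source is disabled, the matched nested objects still have to be pulled out of each hit's inner_hits section on the client side. A minimal sketch in Python, assuming the response has already been parsed into a dict and that the inner hits are keyed by the nested path ("queries"), which is the default name. The sample_response below is a hypothetical, heavily trimmed response, not real output.

```python
def inner_hit_sources(response, name):
    """Collect the _source of every inner hit called `name` across all top-level hits."""
    sources = []
    for hit in response["hits"]["hits"]:
        inner = hit.get("inner_hits", {}).get(name, {})
        for nested_hit in inner.get("hits", {}).get("hits", []):
            sources.append(nested_hit["_source"])
    return sources


# Hypothetical, trimmed response shape used only for illustration.
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "1",
                "inner_hits": {
                    "queries": {
                        "hits": {
                            "hits": [
                                {"_source": {"dateQuery": "20160101T200000.000Z"}}
                            ]
                        }
                    }
                },
            }
        ]
    }
}

print(inner_hit_sources(sample_response, "queries"))
# [{'dateQuery': '20160101T200000.000Z'}]
```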

Elastic Search Nested Object mapping and Query for search

I am trying to use Elasticsearch and I am stuck trying to query a nested object.
Basically, my object has the following format:
{
"name" : "Some Name",
"field2": [
{
"prop1": "val1",
"prop2": "val2"
},
{
"prop1": "val3",
"prop2": "val4"
}
]
}
The mapping I used for the nested field is the following:
PUT /someval/posts/_mapping
{
"posts": {
"properties": {
"field2": {
"type": "nested"
}
}
}
}
Say I now insert documents at /field/posts/1, /field/posts/2, and so on. I have K values for field2.prop1, and I want a query that returns the posts sorted by how many of those K values match field2.prop1. What would be the appropriate query for that?
I also tried a simple filter, but even that doesn't seem to work:
GET /someval/posts/_search
{
"query": {
"filtered": {
"query": {
"match_all": {}
}
},
"filter" : {
"nested" : {
"path" : "field2",
"filter" : {
"bool" : {
"must" : [
{
"term" : {"field2.prop1" : "val1"}
}
]
}
},
"_cache" : true
}
}
}
}
The above query should match at least the first post, but it returns no match. Can anyone help clarify what's wrong here?
There was a problem in your JSON structure: you used a filtered query, but the filter object was placed outside filtered, at a different level than its query. It has to sit inside filtered, next to query.
Compare with the corrected version:
POST /someval/posts/_search
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"nested": {
"path": "field2",
"filter": {
"bool": {
"must": [
{
"term": {
"field2.prop1": "val1"
}
}
]
}
},
"_cache": true
}
}
}
}
}
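For readers on Elasticsearch 5.0 or later, where the filtered query is no longer available, the same filter is usually written as a bool query with a filter clause, and the nested clause takes a query instead of a filter. Here is a minimal sketch in Python that only builds the request body. The helper name nested_term_filter is made up for illustration and is not tied to any client library.

```python
import json


def nested_term_filter(path, field, value):
    """Build a search body that filters on one exact value inside a nested field."""
    return {
        "query": {
            "bool": {
                "filter": {
                    "nested": {
                        "path": path,
                        # The field name must include the nested path prefix.
                        "query": {"term": {f"{path}.{field}": value}},
                    }
                }
            }
        }
    }


body = nested_term_filter("field2", "prop1", "val1")
print(json.dumps(body, indent=2))
```

The important part is the same as in the corrected query above: the filter lives inside the enclosing query object, not next to it.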

Term, nested documents and must_not query incompatible in ElasticSearch?

I'm having trouble combining term and must_not queries on nested documents.
A Sense example can be found here: http://sense.qbox.io/gist/be436a1ffa01e4630a964f48b2d5b3a1ef5fa176
Here is my mapping:
{
"mappings": {
"docs" : {
"properties": {
"tags" : {
"type": "nested",
"properties" : {
"type": {
"type": "string",
"index": "not_analyzed"
}
}
},
"label" : {
"type": "string"
}
}
}
}
}
The index contains two documents:
{
"tags" : [
{"type" : "POST"},
{"type" : "DELETE"}
],
"label" : "item 1"
},
{
"tags" : [
{"type" : "POST"}
],
"label" : "item 2"
}
When I query this index like this:
{
"query": {
"nested": {
"path": "tags",
"query": {
"bool": {
"must": {
"term": {
"tags.type": "DELETE"
}
}
}
}
}
}
}
I get one hit (which is correct).
When I want to get the documents that DON'T CONTAIN the tag "DELETE", with this query:
{
"query": {
"nested": {
"path": "tags",
"query": {
"bool": {
"must_not": {
"term": {
"tags.type": "delete"
}
}
}
}
}
}
}
I get 2 hits (which is incorrect).
This issue seems very close to this one (Elasticsearch array must and must_not), but it's not the same.
Can you give me some clues to resolve this issue?
Thank you
Your original query is evaluated against each nested object individually. The must_not excludes the nested objects whose type matches, but if the document has any other nested object left, that object satisfies the query and the parent document is returned. This is because each nested object is indexed as a separate hidden document.
Original code:
{
"query": {
"nested": {
"path": "tags",
"query": {
"bool": {
"must_not": {
"term": {
"tags.type": "delete"
}
}
}
}
}
}
}
The solution is quite simple: move the bool query outside the nested query. Now every document that has a nested object with the "DELETE" type is discarded, which is just what you wanted.
The solution:
{
"query": {
"bool": {
"must_not": {
"nested": {
"path": "tags",
"query": {
"term": {
"tags.type": "DELETE"
}
}
}
}
}
}
}
NOTE: Your strings are "not_analyzed" and you searched for "delete" instead of "DELETE". A term query on a not_analyzed field only matches the exact stored value. If you want case-insensitive search, make your strings analyzed.
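The case-sensitivity point can be sketched in a few lines of Python. This is a toy model only: lowercasing and splitting on whitespace is a crude stand-in for an analyzer, and the function names are invented for the example.

```python
def index_not_analyzed(value):
    """not_analyzed: the whole value is stored as one exact token."""
    return [value]


def index_analyzed(value):
    """Analyzed: crude stand-in that lowercases and splits on whitespace."""
    return value.lower().split()


def term_query(tokens, term):
    """A term query is not analyzed: it looks for the exact token."""
    return term in tokens


print(term_query(index_not_analyzed("DELETE"), "delete"))  # False
print(term_query(index_not_analyzed("DELETE"), "DELETE"))  # True
print(term_query(index_analyzed("DELETE"), "delete"))      # True
print(term_query(index_analyzed("DELETE"), "DELETE"))      # False
```

The last line shows the flip side: in this model, once the field is analyzed, an upper-case term no longer matches, because the indexed token is lower-case.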
This should fix your problem: http://sense.qbox.io/gist/f4694f542bc76c29624b5b5c9b3ecdee36f7e3ea
The two most important things:
1. Set include_in_root on "tags". This tells ES to also index the tag types on the root document as "tags.type": ["DELETE", "POST"], so you can access a flattened array of those values on the root doc. This means you no longer need a nested query (see #2).
2. Drop the nested query.
{
"mappings": {
"docs" : {
"properties": {
"tags" : {
"type": "nested",
"properties" : {
"type": {
"type": "string",
"index": "not_analyzed"
}
},
"include_in_root": true
},
"label" : {
"type": "string"
}
}
}
}
}
{
"query": {
"bool": {
"must_not": {
"term": {
"tags.type": "DELETE"
}
}
}
}
}
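As a rough picture of what the flattening gives you, here is a toy model in Python. It only imitates the idea of copying nested values onto the root document. The helper names flatten_into_root and must_not_term are hypothetical, and the real indexing is done by Elasticsearch, not by code like this.

```python
def flatten_into_root(doc, path):
    """Copy the values of every nested object under `path` into flat root-level arrays."""
    flat = {}
    for obj in doc.get(path, []):
        for key, value in obj.items():
            flat.setdefault(f"{path}.{key}", []).append(value)
    return flat


def must_not_term(root_fields, field, value):
    """Plain (non-nested) must_not term: true if the value is absent from the array."""
    return value not in root_fields.get(field, [])


item1 = {"label": "item 1", "tags": [{"type": "POST"}, {"type": "DELETE"}]}
item2 = {"label": "item 2", "tags": [{"type": "POST"}]}

root1 = flatten_into_root(item1, "tags")
root2 = flatten_into_root(item2, "tags")

print(root1)                                        # {'tags.type': ['POST', 'DELETE']}
print(must_not_term(root1, "tags.type", "DELETE"))  # False: item 1 is excluded
print(must_not_term(root2, "tags.type", "DELETE"))  # True: item 2 is returned
```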

Resources