Filter by length of nested array - elasticsearch

Here is my mapping:
{"field_name": {
"dynamic": "strict",
"properties": {...},
"type": "nested"
}}
I am trying to return only documents that have at least one field_name entry.
I tried:
{"query": {
"bool": {
"filter": [ { "script" : {
"script" : {
"inline": "doc['field_name'].length >= 1",
"lang": "painless"
} } } ]
}
} }
But Elasticsearch rejects this with the error: No field found for [field_name] in mapping with types [type_name].
I also tried wrapping the previous query in a nested query, but that didn't work either:
{ "nested": {
"path": "field_name",
"query": {
"bool": {
"filter": [ {
"script": {
"script": {
"inline": "doc['field_name'].length >= 1",
"lang": "painless"
}
}
} ]
}
}
} }
This gave the same error as above.
Any ideas?

If every nested object has the same field, you can use an exists query to check that the object is present, use score_mode sum to count the matching objects, and then use a script score to apply the condition you want, as in the code below:
{
"query": {
"function_score": {
"query": {
"nested": {
"path": "field_name",
"query": {
"exists": {
"field": "field_name.same_field"
}
},
"score_mode": "sum"
}
},
"functions": [
{
"script_score": {
"script": {
"source": "_score >= 1 ? 1 : 0"
}
}
}
],
"boost_mode": "replace"
}
},
"min_score": 1
}
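The counting trick above can be parameterized. Below is a minimal Python sketch that builds the same request body for an arbitrary minimum count. The function name nested_min_count_query and the min_count script parameter are hypothetical, introduced only for illustration, and the sketch assumes each matching nested object contributes a score of 1 to the sum, as the answer above relies on.

```python
import json


def nested_min_count_query(path, field, min_count=1):
    """Build a search body that keeps documents having at least
    `min_count` nested objects under `path` that contain `field`.

    Assumption: each nested object matched by `exists` scores 1, so
    score_mode "sum" yields the number of matching nested objects.
    """
    return {
        "query": {
            "function_score": {
                "query": {
                    "nested": {
                        "path": path,
                        "query": {"exists": {"field": f"{path}.{field}"}},
                        "score_mode": "sum",
                    }
                },
                "functions": [
                    {
                        "script_score": {
                            "script": {
                                # 1 when enough nested objects matched, else 0
                                "source": "_score >= params.min_count ? 1 : 0",
                                "params": {"min_count": min_count},
                            }
                        }
                    }
                ],
                # Replace the summed score with the 0/1 result of the script
                "boost_mode": "replace",
            }
        },
        # Drop every document whose final score is 0
        "min_score": 1,
    }


if __name__ == "__main__":
    print(json.dumps(nested_min_count_query("field_name", "same_field", 2), indent=2))
```

With min_count=1 this produces the same filter as the query above; larger values would express "at least N nested objects".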

What I ended up doing was adding a field, my_array_length, when constructing the document. That way I can simply filter on the value of this field.
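A rough Python sketch of that approach, assuming the document is built as a plain dict before being indexed. The helper names with_array_length and min_length_filter are hypothetical, and the sketch assumes my_array_length is declared as a numeric field in the index mapping.

```python
def with_array_length(doc, array_field="field_name", length_field="my_array_length"):
    """Return a copy of `doc` with the nested array's length stored
    in a top-level field, so it can be filtered on directly."""
    enriched = dict(doc)
    # A missing or null array counts as length 0
    enriched[length_field] = len(doc.get(array_field) or [])
    return enriched


def min_length_filter(min_length=1, length_field="my_array_length"):
    """Build a search body matching documents whose stored length
    is at least `min_length`."""
    return {
        "query": {
            "bool": {
                "filter": [{"range": {length_field: {"gte": min_length}}}]
            }
        }
    }


if __name__ == "__main__":
    print(with_array_length({"field_name": [{"a": 1}, {"a": 2}]}))
    print(min_length_filter(1))
```

The trade-off is that the length has to be kept in sync whenever the nested array is updated, but the query itself needs no script or nested clause.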

A simple approach is to use an exists clause for each of the fields:
{
"query": {
"filtered": {
"filter": {
"bool": {
"should": [
{
"exists": {
"field": "field_name.dynamic"
}
},
{
"exists": {
"field": "field_name.properties"
}
},
{
"exists": {
"field": "field_name.type"
}
}
],
"minimum_should_match": 1
}
}
}
}
}
Define a should clause with minimum_should_match so that only the relevant documents are returned.
See exists, bool-query

Related

How do I get the size of a 'nested' type array through a Painless script in Elasticsearch version 6.7?

I am using Elasticsearch version 6.7. I have the following mapping:
{
"customers": {
"mappings": {
"customer": {
"properties": {
"name": {
"type": "keyword"
},
"permissions": {
"type": "nested",
"properties": {
"entityId": {
"type": "keyword"
},
"entityType": {
"type": "keyword"
},
"permission": {
"type": "keyword"
},
"permissionLevel": {
"type": "keyword"
},
"userId": {
"type": "keyword"
}
}
}
}
}
}
}
}
I want to run a query that shows all customers who have > 0 permissions. I have tried the following:
{
"query": {
"bool": {
"filter": {
"script": {
"script": {
"lang": "painless",
"source": "params._source != null && params._source.permissions != null && params._source.permissions.size() > 0"
}
}
}
}
}
}
But this returns no hits because params._source is null: according to this Stack Overflow post, Painless does not have access to the _source document. How can I write a Painless script that gives me all customers who have > 0 permissions?
Solution 1: Using Script with must query
POST <your_index_name>/_search
{
"query": {
"bool": {
"must": [
{
"script": {
"script": {
"lang": "painless",
"inline": """
ArrayList st = params._source.permissions;
return st != null && st.size() > 0;
"""
}
}
}
]
}
}
}
Solution 2: Using Exists Query on nested fields
You could simply use an Exists query, like the one below, to get customers who have > 0 permissions.
Query:
POST <your_index_name>/_search
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "permissions",
"query": {
"bool": {
"should": [
{
"exists":{
"field": "permissions.permission"
}
},
{
"exists":{
"field": "permissions.entityId"
}
},
{
"exists":{
"field": "permissions.entityType"
}
},
{
"exists":{
"field": "permissions.permissionLevel"
}
}
]
}
}
}
}]
}
}
}
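Writing one exists clause per sub-field by hand gets repetitive, so the query above could also be generated. This is a minimal Python sketch under that assumption. The function name nested_any_exists_query is hypothetical, and minimum_should_match is set explicitly here even though the original query relies on the default.

```python
import json


def nested_any_exists_query(path, subfields):
    """Build a search body matching documents that have at least one
    nested object under `path` with any of `subfields` present."""
    should = [{"exists": {"field": f"{path}.{name}"}} for name in subfields]
    return {
        "query": {
            "bool": {
                "must": [
                    {
                        "nested": {
                            "path": path,
                            "query": {
                                "bool": {
                                    "should": should,
                                    # Made explicit so the intent does not
                                    # depend on default behavior
                                    "minimum_should_match": 1,
                                }
                            },
                        }
                    }
                ]
            }
        }
    }


if __name__ == "__main__":
    fields = ["permission", "entityId", "entityType", "permissionLevel"]
    print(json.dumps(nested_any_exists_query("permissions", fields), indent=2))
```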
Solution 3: Create definitive structure but add empty values to the fields
Another alternative is to ensure that every document has the fields. Basically:
1. Ensure that every document has the permissions nested document.
2. For customers who have no permissions, set the field permissions.permission to 0.
3. Construct a query that retrieves the documents accordingly.
Below would be a sample document for a user who doesn't have permissions:
POST mycustomers/customer/1
{
"name": "john doe",
"permissions": [
{
"entityId" : "null",
"entityType": "null",
"permissionLevel": 0,
"permission": 0
}
]
}
The query in that case would be as simple as this:
POST <your_index_name>/_search
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "permissions",
"query": {
"range": {
"permissions.permission": {
"gte": 1
}
}
}
}
}
]
}
}
}
Hope this helps!

How to create a bool filter on multiple fields in Elasticsearch?

I have the following query in Elasticsearch:
{
"script_fields": {
"travel_time": {
"script": {
"inline": "doc['DateTo'].value - doc['DateFrom'].value"
}
}
},
"stored_fields": [
"_source"
],
"query": {
"bool": {
"filter": {
"exists": {
"field": "DateTo"
}
}
}
}
}
How can I add DateFrom into exists filter?
You can add multiple exists criteria:
"query": {
"bool": {
"filter": [
{
"exists": {
"field": "DateFrom"
}
},
{
"exists": {
"field": "DateTo"
}
},
{
"script": {
"script": {
"inline": "doc['DateTo'].value - doc['DateFrom'].value > 0"
}
}
}
]
}
}

elastic exists query for nested documents

I have a nested documents as:
"someField": "hello",
"users": [
{
"name": "John",
"surname": "Doe",
"age": 2
}
]
According to https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-exists-query.html, the above should match:
GET /_search
{
"query": {
"exists" : { "field" : "users" }
}
}
whereas the following should not:
"someField": "hello",
"users": []
Unfortunately, neither of them matches. Any ideas?
The example mentioned on the Elasticsearch blog refers to string and array-of-string types, not to nested types.
The following query should work for you:
{
"query": {
"nested": {
"path": "users",
"query": {
"bool": {
"must": [
{
"exists": {
"field": "users"
}
}
]
}
}
}
}
}
Also, you can refer to this issue for more info, which discusses this usage pattern.
This works for me
GET /type/_search?pretty=true
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "outcome",
"query": {
"exists": {
"field": "outcome.outcomeName"
}
}
}
}
]
}
}
}
With the following index mapping:
{
"index_name": {
"mappings": {
"object_name": {
"dynamic": "strict",
"properties": {
"nested_field_name": {
"type": "nested",
"properties": {
"some_property": {
"type": "keyword"
}
}
}
}
}
}
}
}
I needed to use this query:
GET /index_name/_search
{
"query": {
"nested": {
"path": "nested_field_name",
"query": {
"bool": {
"must": [
{
"exists": {
"field": "nested_field_name.some_property"
}
}
]
}
}
}
}
}
Elasticsearch version 5.4.3
The answer from user3775217 has worked for me but I needed to tweak it to work as expected for must_not. Essentially the bool/must needed to be wrapped around the nested portion of the query:
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "users",
"query": {
"exists": {
"field": "users"
}
}
}
}
]
}
}
}
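Since the point of wrapping the nested clause in bool is to be able to negate it, here is a small Python sketch that builds either variant. The function name nested_presence_query and the present flag are hypothetical, and the sketch assumes the exists check targets a sub-field of the nested object (for example users.name), as in the other answers.

```python
import json


def nested_presence_query(path, field, present=True):
    """Build a search body for documents that do (present=True) or do
    not (present=False) have a nested object under `path` with `field`."""
    nested = {
        "nested": {
            "path": path,
            "query": {"exists": {"field": f"{path}.{field}"}},
        }
    }
    # The bool clause wraps the nested query, so must_not negates the
    # whole "has a matching nested object" condition.
    occur = "must" if present else "must_not"
    return {"query": {"bool": {occur: [nested]}}}


if __name__ == "__main__":
    print(json.dumps(nested_presence_query("users", "name", present=False), indent=2))
```

Putting must_not outside the nested clause is what selects documents with no matching nested objects at all. Placing it inside would instead be evaluated against each nested object separately.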

Elasticsearch filtered query with script for term frequency

I'm using the attachment plugin: https://github.com/elastic/elasticsearch-mapper-attachments
I'm able to find documents with a specific word in 1 or more fields but unable to filter documents with a lower term frequency than searched for.
This works:
POST /crm/employee/_search
{
"query": {"filtered": {
"query": {"match": {
"employee.cv.content": "transitie"
}},
"filter": {
"bool": {
"should": [
{"terms": {
"employee.listEmployeeType.id": [
2
]
}}
]
}
}
}},
"highlight": {"fields": {"employee.cv.content" : {}}}
}
After a long search, I've found the following:
"script": {
"script": "crm['employee.cv.content'][lookup].tf() > occurrence",
"params": {
"lookup": "transitie",
"occurrence": 1
}
},
Unfortunately, I'm unable to implement it. I hope I've explained the issue well enough for someone to give me a push in the right direction!
{
"query": {
"filtered": {
"query": {
"match": {
"employee.cv.content": "transitie"
}
},
"filter": {
"bool": {
"should": [
{
"terms": {
"employee.listEmployeeType.id": [
2
]
}
}
],
"must": [
{
"script": {
"script": "_index['employee.cv.content'][lookup].tf() > occurrence",
"params": {
"lookup": "transitie",
"occurrence": 1
}
}
}
]
}
}
}
},
"highlight": {
"fields": {
"employee.cv.content": {}
}
}
}

How to add 2 values in elasticsearch script?

I am trying to create a rows_processed field by adding two fields, src_s_rows and tgt_s_rows, but somehow it is not working: it always gives me 0. Even when I use "script": "(doc['src_s_rows'].value)" instead of "script": "(doc['src_s_rows'].value+doc['tgt_s_rows'].value)", it still gives me 0.
What am I missing? Please help.
GET run_hist/task_hist/_search
{
"fields": [
"THROUGHPUT_ROWS_PER_SEC",
"start_time",
"end_time",
"src_s_rows",
"tgt_s_rows"
],
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"term": {
"_id": "249885850"
}
}
]
}
}
}
},
"filter": {
"script": {
"script": "(doc['end_time'].value-doc['start_time'].value)>minutes*1",
"params": {
"minutes": 60000
}
}
},
"script_fields": {
"total_time_taken": {
"script": "(doc['end_time'].value-doc['start_time'].value)/1000"
},
"rows_processed": {
"script": "(doc['src_s_rows'].value+doc['tgt_s_rows'].value)"
}
},
"size": 10000
}
Screenshot given below
Use _source.src_s_rows.value in place of doc['src_s_rows'].value.
Try this:
"script": "(_source.src_s_rows.value+_source.tgt_s_rows.value)"

Resources