Elasticsearch: can't remove a field inside a nested field

I have the following mapping:
{
"candidate-index" : {
"mappings" : {
"properties" : {
"provider_candidates" : {
"type" : "nested",
"properties" : {
"foo" : {
"type" : "object"
},
"group_key" : {
"type" : "keyword"
}
}
}
}
}
}
I want to delete the foo field:
POST /candidate-index/_update_by_query
{
"script" : "ctx._source.remove(\"provider_candidates.foo\")",
"query": {
"nested": {
"path": "provider_candidates",
"query": {
"bool": {
"must": [
{
"exists": {
"field": "provider_candidates.foo"
}
}
]
}
}
}
}
}
It doesn't work. It doesn't generate an error, but the field is not removed.
I know the query part is correct, because if I turn it into a _search it correctly finds the documents.
I also tried:
POST /candidate-index/_update_by_query
{
"script" : "ctx._source.provider_candidates.remove(\"foo\")",
"query": {
"nested": {
"path": "provider_candidates",
"query": {
"bool": {
"must": [
{
"exists": {
"field": "provider_candidates.foo"
}
}
]
}
}
}
}
}
It returns:
{
"error" : {
"root_cause" : [
{
"type" : "script_exception",
"reason" : "runtime error",
"script_stack" : [
"ctx._source.provider_candidates.remove(\"foo\")",
" ^---- HERE"
],
"script" : "ctx._source.provider_candidates.remove(\"foo\")",
"lang" : "painless"
}
],
"type" : "script_exception",
"reason" : "runtime error",
"script_stack" : [
"ctx._source.provider_candidates.remove(\"foo\")",
" ^---- HERE"
],
"script" : "ctx._source.provider_candidates.remove(\"foo\")",
"lang" : "painless",
"caused_by" : {
"type" : "wrong_method_type_exception",
"reason" : "cannot convert MethodHandle(List,int)Object to (Object,String)Object"
}
},
"status" : 400
}

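provider_candidates is an array of objects, so neither attempt reaches foo. The first script tries to remove a top-level key literally named "provider_candidates.foo", which does not exist, so nothing changes. The second calls remove on the list itself, which expects an integer index rather than a field name, hence the wrong_method_type_exception. You need to loop over the provider_candidates array and remove the field from each element:
POST /candidate-index/_update_by_query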
{
"script" : "for (int i = 0; i < ctx._source.provider_candidates.length; ++i) { ctx._source.provider_candidates[i].remove(\"foo\") }",
"query": {
"nested": {
"path": "provider_candidates",
"query": {
"bool": {
"must": [
{
"exists": {
"field": "provider_candidates.foo"
}
}
]
}
}
}
}
}
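To see why the dotted-key attempt is a silent no-op while the loop works, it can help to model _source as plain data. The following Python sketch is only an analogy, not Elasticsearch code: dicts and lists stand in for the Painless map and list, the sample document is made up, and the helper names are hypothetical.

```python
import copy

# Hypothetical document shaped like the mapping above: a list of objects.
source = {
    "provider_candidates": [
        {"foo": {"a": 1}, "group_key": "g1"},
        {"group_key": "g2"},
    ]
}

def remove_dotted_key(src):
    """Analogue of ctx._source.remove("provider_candidates.foo")."""
    out = copy.deepcopy(src)
    # No top-level key has this literal name, so nothing is removed.
    out.pop("provider_candidates.foo", None)
    return out

def remove_from_each(src, field):
    """Analogue of the loop: remove the field from every list element."""
    out = copy.deepcopy(src)
    for candidate in out["provider_candidates"]:
        candidate.pop(field, None)
    return out

unchanged = remove_dotted_key(source)
cleaned = remove_from_each(source, "foo")
print(unchanged == source)
print(cleaned)
```

Only the per-element removal changes the document; the dotted key never matches anything at the top level.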

Related

How to ignore number_format_exception error in an Elasticsearch query

Hi, how can I ignore the datatype error in the query below? It throws an error when a string value is provided for a field that has a numeric (long) datatype. I am aware of the lenient parameter, but it does not work with the term query.
GET employee/_search
{
"query": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"terms": {
"employee_id": [
"abcdef"
]
}
},
{
"terms": {
"employee_name": [
"abcdef"
]
}
}
]
}
}
]
}
}
}
Error Message
"caused_by": {
"type": "number_format_exception",
"reason": "For input string: \"abcdef\""
}
Elasticsearch details
"version" : {
"number" : "7.1.1",
"build_flavor" : "oss",
"build_type" : "tar",
"build_hash" : "Unknown",
"build_date" : "2020-11-03T08:48:42.499923Z",
"build_snapshot" : false,
"lucene_version" : "8.0.0",
"minimum_wire_compatibility_version" : "6.8.0",
"minimum_index_compatibility_version" : "6.0.0-beta1"
}
So the expected behaviour is that the datatype error is ignored and the rest of the query runs and returns results, since the clause is inside a should condition; if it were inside a must condition, the query should return no results.
Mapping of the index:
{
"employee" : {
"mappings" : {
"dynamic" : "true",
"properties" : {
"employee_id" : {
"type" : "long"
}
}
}
}
}
You can use the query_string query instead:
{
"query": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"query_string": {
"query": "employee_id:abcdef"
}
},
{
"terms": {
"employee_name": [
"abcdef"
]
}
}
]
}
}
]
}
}
}
Your original query was a terms query which is equivalent to a logical OR. As such, you can adapt the query string to be:
"employee_id:(abcdef OR xyz OR 123)"
where the value type won't play a role.
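A minimal Python sketch of assembling that query string from a list of values. The helper name query_string_or is hypothetical, and the sketch assumes the values contain no characters that are reserved in query_string syntax (those would need escaping first).

```python
import json

def query_string_or(field, values):
    """Build a query_string clause that matches any of the values (logical OR)."""
    joined = " OR ".join(str(v) for v in values)
    return {"query_string": {"query": f"{field}:({joined})"}}

clause = query_string_or("employee_id", ["abcdef", "xyz", 123])
print(json.dumps(clause))
```

The resulting clause can be dropped into the should array in place of the terms clause on employee_id.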

Elasticsearch scripting array size

Can anyone help me construct the query below? I get the error below when running it. The ES version is 7.9.0.
In my model there is a field "repliedBy", which is an array field. Its value is always initialized with an empty array, but on some entities it holds one or more objects. I need to write a query that returns only the items with an empty array.
GET myTable/_search
{
"query": {
"bool": {
"must": [
{
"script": {
"script": {
"source": "doc['repliedBy'].size() == params.val",
"params": {
"val": 0
}
}
}
},
{
"range": {
"receivedDate": {
"gte": "2020-09-15T07:51:21.000Z",
"lte": "2020-12-01T07:51:21.000Z"
}
}
}
]
}
}
}
Error:
"error" : {
"root_cause" : [
{
"type" : "script_exception",
"reason" : "runtime error",
"script_stack" : [
"org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:90)",
"org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:41)",
"doc['repliedBy'].size() == params.val",
" ^---- HERE"
],
"script" : "doc['repliedBy'].size() == params.val",
"lang" : "painless",
"position" : {
"offset" : 4,
"start" : 0,
"end" : 37
}
}
],
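This is a job for a bool/must_not/exists query combination. An empty array has no indexed values, so exists does not match it, and must_not/exists returns exactly those documents (this assumes every object in repliedBy carries an id field):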
{
"query": {
"bool": {
"must_not": [
{
"exists": {
"field": "repliedBy.id"
}
}
],
"filter": [
{
"range": {
"receivedDate": {
"gte": "2020-09-15T07:51:21.000Z",
"lte": "2020-12-01T07:51:21.000Z"
}
}
}
]
}
}
}

Boosting an Elasticsearch result by 'age' if applicable

I want to search multiple indices in Elasticsearch (news items in search_news and documents in search_documents), and whenever an index has a publicationDate field (news items only), I want to boost newer news items. I am using Elasticsearch 6.8.
I found the script_scoring example in https://dzone.com/articles/23-useful-elasticsearch-example-queries (the last one), but it throws errors. Based on the documentation I came up with:
GET /search_*/_search
{
"query": {
"function_score": {
"query": {
"bool": {
"must": {
"query_string": {
"query": "Lorem Ipsum"
}
},
"must_not": {
"exists": {
"field": "some_exlusion_field"
}
}
}
},
"script_score": {
"script": {
"params" : {
"threshold": "2019-04-04"
},
"source": "publishDate = doc['publishDate'].value; if (publishDate > Date.parse('yyyy-MM-dd', threshold).getTime()) { return log(2.5) } return log(1);"
}
}
}
}
}
This results in the error:
{
"error": {
"root_cause": [
{
"type": "script_exception",
"reason": "compile error",
"script_stack": [
"publishDate = doc['publis ...",
"^---- HERE"
],
"script": "publishDate = doc['publishDate'].value; if (publishDate > Date.parse('yyyy-MM-dd', threshold).getTime()) { return log(2.5) } return log(1);",
"lang": "painless"
}
}
I managed to simplify the source to:
"source": "if (doc['publishDate'] > '2019-04-04') { return 5 } return 1;"
But with no success:
"failures" : [
{
"shard" : 0,
"index" : "search_document_page",
"node" : "c0iLpxiJRqmgwS0KY8OybA",
"reason" : {
"type" : "script_exception",
"reason" : "runtime error",
"script_stack" : [
"org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:81)",
"org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:39)",
"if (doc['publishDate'] > '2019-04-04') { ",
" ^---- HERE"
],
"script" : "if (doc['publishDate'] > '2019-04-04') { return 5 } return 1;",
"lang" : "painless",
"caused_by" : {
"type" : "illegal_argument_exception",
"reason" : "No field found for [publishDate] in mapping with types []"
}
}
},
{
"shard" : 0,
"index" : "search_news",
"node" : "c0iLpxiJRqmgwS0KY8OybA",
"reason" : {
"type" : "script_exception",
"reason" : "runtime error",
"script_stack" : [
"if (doc['publishDate'] > '2019-04-04') { ",
" ^---- HERE"
],
"script" : "if (doc['publishDate'] > '2019-04-04') { return 5 } return 1;",
"lang" : "painless",
"caused_by" : {
"type" : "class_cast_exception",
"reason" : "Cannot apply [>] operation to types [org.elasticsearch.index.fielddata.ScriptDocValues.Dates] and [java.lang.String]."
}
}
}
]
}
}
Any suggestions for checking the existence of the field in doc, and for comparing the date properly?
For the existence check, use doc.containsKey:
if (!doc.containsKey('publishDate')) {
return 1;
}
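For the date comparison, compare epoch milliseconds and return the higher score when publishDate is newer than the threshold. One way that should work, assuming the java.time classes are available to Painless on your version:
long threshold = LocalDate.parse(params.threshold).atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
if (doc['publishDate'].value.getMillis() > threshold) {
return 5;
} else {
return 1;
}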
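The intended scoring rule (boost newer items, and fall back to a neutral score when the field is absent) can be sketched outside Elasticsearch. This Python version is only an illustration of the logic: the function names and the sample documents are hypothetical, and dates are plain yyyy-MM-dd strings interpreted as midnight UTC.

```python
from datetime import date, datetime, timezone

def to_epoch_millis(day):
    """Convert a 'yyyy-MM-dd' string to epoch milliseconds at 00:00 UTC."""
    d = date.fromisoformat(day)
    moment = datetime(d.year, d.month, d.day, tzinfo=timezone.utc)
    return int(moment.timestamp() * 1000)

def age_boost(doc, threshold, boosted=5, default=1):
    """Return `boosted` for documents published after the threshold."""
    if "publishDate" not in doc:
        # Indices without the field get the neutral score.
        return default
    if to_epoch_millis(doc["publishDate"]) > to_epoch_millis(threshold):
        return boosted
    return default

newer = age_boost({"publishDate": "2019-05-01"}, "2019-04-04")
older = age_boost({"publishDate": "2019-01-01"}, "2019-04-04")
missing = age_boost({"title": "no date"}, "2019-04-04")
print(newer, older, missing)
```

Comparing numbers on both sides avoids the class_cast_exception from the question, which came from comparing a date doc value with a string.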

Display field value of data type token_count

I have the following mapping:
"fullName" : {
"type" : "text",
"norms" : false,
"similarity" : "boolean",
"fields" : {
"raw" : {
"type" : "keyword"
},
"terms" : {
"type" : "token_count",
"analyzer" : "standard"
}
}
}
I want to display the value of the terms field. When I run the following, I get fullName but not the terms value:
GET /_search
{"_source": ["fullName","fullName.terms"],
"query": {
"bool" : {
"must" : {
"script" : {
"script" : {
"source": "doc['fullName.terms'].value != 3",
"lang": "painless"
}
}
}
}
}
}
How can I get it?
You need to configure the token count sub-field to be stored (see the token_count documentation).
Modify your mapping:
"terms" : {
"type" : "token_count",
"analyzer" : "standard",
"store": true
}
Then, to retrieve the value, you need to explicitly ask for the stored field in your query (see the stored_fields documentation):
GET /_search
{
"_source": [
"fullName"
],
"stored_fields": [
"fullName.terms"
],
"query": {
"bool": {
"must": {
"script": {
"script": {
"source": "doc['fullName.terms'].value != 3",
"lang": "painless"
}
}
}
}
}
}

Elasticsearch: searching an array field inside a nested type

I am trying to filter my results using a nested filter, but I am getting incorrect results.
Here is my mapping info:
{
"stock" : {
"mappings" : {
"clip" : {
"properties" : {
"description" : {
"type" : "string"
},
"keywords" : {
"type" : "nested",
"properties" : {
"category" : {
"type" : "string"
},
"tags" : {
"type" : "string",
"index_name" : "tag"
}
}
},
"tags" : {
"type" : "string",
"index_name" : "tag"
},
"title" : {
"type" : "string"
}
}
}
}
}
}
Clip document data:
{
"_index" : "stock",
"_type" : "clip",
"_id" : "AUnsTOBBpafrKleQN284",
"_score" : 1.0,
"_source":{
"title": "journey to forest",
"description": "this clip contain information about the animals",
"tags": ["birls", "wild", "animals", "roar", "forest"],
"keywords": [
{
"tags": ["spring","summer","autumn"],
"category": "Weather"
},
{
"tags": ["Cloudy","Stormy"],
"category": "Season"
},
{
"tags": ["Exterior","Interior"],
"category": "Setting"
}
]
}
}
I am trying to filter on tags inside the nested field 'keywords'.
Here is my query:
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"nested": {
"path": "keywords",
"filter": {
"bool": {
"must": [
{
"terms": { "tags": ["autumn", "summer"] }
}
]
}
}
}
}
}
}
}
I am getting no results. Why?
What's wrong with my query or schema? Please help.
The query above is syntactically valid, but it references the wrong field. Inside a nested filter you need to give the full path to tags starting from the root, i.e. keywords.tags, in the terms filter:
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"nested": {
"path": "keywords",
"filter": {
"bool": {
"must": [
{
"terms": { "keywords.tags": ["autumn", "summer"] }
}
]
}
}
}
}
}
}
}
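The filtered query used above was removed in Elasticsearch 5.0; on newer versions the same filter is usually written with bool/filter and a nested query. A minimal Python sketch that builds such a request body follows. The helper name nested_terms_filter is hypothetical, and it assumes the nested field is mapped so that the given terms match the indexed values.

```python
import json

def nested_terms_filter(path, field, values):
    """Build a bool/filter body with a nested terms clause on path.field."""
    full_field = f"{path}.{field}"  # nested clauses need the full path
    return {
        "query": {
            "bool": {
                "filter": {
                    "nested": {
                        "path": path,
                        "query": {"terms": {full_field: values}},
                    }
                }
            }
        }
    }

body = nested_terms_filter("keywords", "tags", ["autumn", "summer"])
print(json.dumps(body, indent=2))
```

The field name is built from the path so the clause always uses keywords.tags rather than the bare tags, which is the mistake in the original query.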
