ElasticSearch 2.X finding a numeric value exists in the array fails - elasticsearch

I have the mapping of one field as follows in ES 2.3:
"move_in_ts": {
"type": "integer"
},
"move_out_ts": {
"type": "integer"
}
Sample document stores data as follows:
"move_in_ts": [
1475280000,
1475539200,
1475712000,
1475884800,
1477008000,
1477785600
]
I have a script in my DSL query (trying to find an integer in that array):
"script": "if(doc['move_in_ts'] && doc['move_in_ts'].values.contains('1475280000')){return 200;}"
and also tried this:
"script": "if(doc['move_in_ts'] && doc['move_in_ts'].contains('1475280000')){return 200;}"
and also tried this:
"script": "if(doc['move_in_ts'] && doc['move_in_ts'].contains(1475280000)){return 200;}"
and also tried this:
"script": "if(doc['move_in_ts'] && doc['move_in_ts'].values.contains(1475280000)){return 200;}"
but in all above cases, I get the following error:
"reason": {
"type": "null_pointer_exception",
"reason": null
}
It might be possible that this field doesn't exist at all in a few documents (I cannot use a filter in my use case; I need to have it in the script only).
What am I doing wrong, or how can I get it to work?

I am not able to reproduce the issue (I am also using ES 2.3). You will also need toLong() to get the right results, or the query will return zero hits. I created a sample index like this:
PUT books
{
"mappings": {
"book":{
"properties": {
"move_in_ts":{
"type": "integer"
}
}
}
}
}
I indexed a few docs:
POST books/book
{
"move_in_ts" : null
}
POST books/book
{
"move_in_ts" : [4,null]
}
POST books/book
{
"move_in_ts" : []
}
POST books/book
{
"some_other_field" : "some value"
}
POST books/book
{
"move_in_ts" : [
1475280000,
1475539200,
1475712000,
1475884800,
1477008000,
1477785600
]
}
Then the below query gives the right result:
GET books/book/_search
{
"query": {
"bool": {
"filter": {
"script": {
"script": "if(doc['move_in_ts'] && doc['move_in_ts'].values.contains(1475280000.toLong())){return 200;}"
}
}
}
}
}
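For illustration, the null-safe membership check the script expresses can be sketched in plain Python (the `script_match` helper below is hypothetical, not an Elasticsearch API; in the Groovy script the literal additionally has to be widened with toLong() so it compares equal to the stored long doc values):

```python
# Hypothetical helper mirroring the Groovy script's intent; not an ES API.
def script_match(doc, field, target):
    """Return 200 if `field` exists on `doc` and its array contains `target`."""
    values = doc.get(field) or []  # treat a missing or null field as empty
    return 200 if target in values else None

print(script_match({"move_in_ts": [1475280000, 1475539200]}, "move_in_ts", 1475280000))  # → 200
print(script_match({"other": 1}, "move_in_ts", 1475280000))  # → None
```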

Related

ElasticSearch 6.7 painless, how to access nested document

My Painless script stopped working when I updated from ES 5.5 to 6.7.
In 5.5, if I want to get a nested document [transFilter], I do this:
params['_source']['carFilter']
and it works very well. But when I use version 6.7, params['_source']['carFilter'] doesn't work: all of params['_source'] is null.
My mappings:
"carFilter": {
"type": "nested",
"properties": {
"time": {
"type": "long"
}
}
}
My data example:
"carFilter" : [
{
"time" : 20200120
},
{
"time" : 20200121
}
]
And my query script example:
{
"query" : {
"bool" : {
"must" : [
{
"script" : {
"script" : {
"inline" : "if(params['_source']!=null){ if(params['_source']['carFilter']!=null){ for(def item:params['_source']['carFilter']){ if (item.time>1) { return true; } } } } return false;",
"lang" : "painless",
"params" : {
"rentTime" : 1000
}
}
}
}
]
}
}
}
There is not even an error, but in fact execution never gets past the line
if(params['_source']!=null){
so the script falls through to return false.
The simple Painless above is just to illustrate the problem; a relatively real one is attached below.
double carPrice=0.00;
if(!params['_source'].empty){
  def days=params['_source']['everyDayPrice'];
  if(params['_source']['everyDayPrice']!=null){
    int size=days.length;
    if(size>0){
      for(int i=0;i<size;i++){
        String day = days[i]['day'];
        Double price = days[i]['price'];
        if(price!=null&&params.get(day)!=null){carPrice=carPrice+params.get(day)*price;}
      }
    }
  }
}
return carPrice/params.total
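What that script computes can be sketched outside Painless. A minimal Python illustration, assuming `params` carries a per-day weight under each day key plus a `total` divisor, as in the snippet (the function name is made up):

```python
def average_car_price(source, params):
    """Illustrative re-implementation of the Painless snippet above:
    sum price * per-day weight (looked up in params by day key),
    then divide by params['total']."""
    car_price = 0.0
    days = source.get("everyDayPrice") or []
    for entry in days:
        day, price = entry.get("day"), entry.get("price")
        if price is not None and params.get(day) is not None:
            car_price += params[day] * price
    return car_price / params["total"]

source = {"everyDayPrice": [{"day": "d1", "price": 100.0}, {"day": "d2", "price": 200.0}]}
print(average_car_price(source, {"d1": 1, "d2": 2, "total": 3}))  # → 500/3 ≈ 166.67
```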
Looking at your query, you would want to filter the documents having carFilter.time > 1, so why not use a simple Nested Query:
POST <your_index_name>/_search
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "carFilter",
"query": {
"range": {
"carFilter.time": {
"gte": 1
}
}
}
}
}
]
}
}
}
Note that I've made use of Range Query to evaluate the time based on what you are looking for.
I'd suggest you go through this answer if the above doesn't help.
Let me know if you have any queries.
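A minimal sketch of what the nested range query selects, written as plain Python over the raw document (hypothetical helper, for illustration only):

```python
def matches_nested_range(doc, path, field, gte):
    """Illustration of the nested range query above: the doc matches
    if any object under `path` has `field` >= gte."""
    return any(obj.get(field, float("-inf")) >= gte for obj in doc.get(path, []))

print(matches_nested_range({"carFilter": [{"time": 20200120}, {"time": 20200121}]}, "carFilter", "time", 1))  # → True
print(matches_nested_range({}, "carFilter", "time", 1))  # → False
```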

Elasticsearch 5.4: Use normal and nested fields in same Painless script query?

I have a mapping like this:
{
printings: {
type: "nested",
properties: {
prop2: {type: "number"}
}
},
prop1: {type: "number"}
}
I then want to build a Painless query like this:
"script": {
"lang": "painless",
"inline": "doc['prop1'] > (3 * doc['printings.prop2'])"
}
However testing this in Sense doesn't work. If I replace the nested prop2 with a simple number then it does work. Is there a way to access both root and nested props in a single scripted query?
You can try the below query:
{
"query": {
"script": {
"script": {
"lang": "painless",
"inline": "params['_source']['prop1'] > (2 * params['_source']['printings']['prop2'])"
}
}
}
}
But please keep in mind that accessing _source is very slow; read more about it here.
Unfortunately, you cannot access the nested context from the root, and you cannot access the root context from nested, because nested documents are separate documents, even though they are stored close to the parent. But you can solve it with a different mapping using the copy_to field feature. Here is a mapping:
{
"mappings": {
"sample": {
"properties": {
"printings": {
"type": "nested",
"properties": {
"prop2": {
"type": "integer",
"copy_to": "child_prop2"
}
}
},
"prop1": {
"type": "integer"
},
"child_prop2": {
"type": "integer"
}
}
}
}
}
In this case the values from the nested documents will be copied to the parent. You don't have to explicitly fill this new field; here is an example of bulk indexing:
POST http://localhost:9200/_bulk HTTP/1.1
{"index":{"_index":"my_index","_type":"sample","_id":null}}
{"printings":[{"prop2":1},{"prop2":4}],"prop1":2}
{"index":{"_index":"my_index","_type":"sample","_id":null}}
{"printings":[{"prop2":0},{"prop2":1}],"prop1":2}
{"index":{"_index":"my_index","_type":"sample","_id":null}}
{"printings":[{"prop2":1},{"prop2":0}],"prop1":2}
After that you can use this query
{
"query": {
"script": {
"script": {
"inline": "doc['prop1'].value > (3 * doc['child_prop2'].value)",
"lang": "painless"
}
}
}
}
The first document won't match. The second one will match by the first subdocument. The third one will match by the second subdocument.
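The copy_to mechanics and the match/no-match outcomes above can be simulated in a few lines of Python (both helpers are illustrative, not ES code; numeric doc values are stored sorted, so `.value` on a multi-valued field reads the smallest value, modeled here with `sorted`):

```python
def index_with_copy_to(doc):
    """Simulate the copy_to mapping: gather nested printings.prop2 values
    into a flat child_prop2 field on the parent document."""
    doc = dict(doc)
    doc["child_prop2"] = sorted(p["prop2"] for p in doc.get("printings", []))
    return doc

def script_matches(doc):
    """Mirror doc['prop1'].value > 3 * doc['child_prop2'].value, where
    .value on a multi-valued field reads the first (smallest) value."""
    values = doc["child_prop2"]
    return bool(values) and doc["prop1"] > 3 * values[0]

docs = [
    {"printings": [{"prop2": 1}, {"prop2": 4}], "prop1": 2},
    {"printings": [{"prop2": 0}, {"prop2": 1}], "prop1": 2},
    {"printings": [{"prop2": 1}, {"prop2": 0}], "prop1": 2},
]
print([script_matches(index_with_copy_to(d)) for d in docs])  # → [False, True, True]
```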

Elasticsearch Mapping - Rename existing field

Is there any way I can rename an element in an existing elasticsearch mapping without having to add a new element?
If so, what's the best way to do it in order to avoid breaking the existing mapping?
e.g. from fieldCamelcase to fieldCamelCase
{
"myType": {
"properties": {
"timestamp": {
"type": "date",
"format": "date_optional_time"
},
"fieldCamelcase": {
"type": "string",
"index": "not_analyzed"
},
"field_test": {
"type": "double"
}
}
}
}
You could do this by creating an Ingest pipeline that contains a Rename Processor, in combination with the Reindex API.
PUT _ingest/pipeline/my_rename_pipeline
{
"description" : "describe pipeline",
"processors" : [
{
"rename": {
"field": "fieldCamelcase",
"target_field": "fieldCamelCase"
}
}
]
}
POST _reindex
{
"source": {
"index": "source"
},
"dest": {
"index": "dest",
"pipeline": "my_rename_pipeline"
}
}
Note that you need to be running Elasticsearch 5.x in order to use ingest. If you're running < 5.x then you'll have to go with what @Val mentioned in his comment :)
Updating a field name in ES (version > 5, where missing has been removed) using the _update_by_query API:
Example:
POST http://localhost:9200/INDEX_NAME/_update_by_query
{
"query": {
"bool": {
"must_not": {
"exists": {
"field": "NEW_FIELD_NAME"
}
}
}
},
"script" : {
"inline": "ctx._source.NEW_FIELD_NAME = ctx._source.OLD_FIELD_NAME; ctx._source.remove(\"OLD_FIELD_NAME\");"
}
}
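The effect of that script on each document's _source can be sketched in Python (illustrative helper; the `new not in source` check mirrors the must_not/exists guard in the query):

```python
def rename_field(source, old, new):
    """Apply the update-by-query script to one document's _source:
    copy the old field to the new name, then remove the old key.
    Documents that already have the new field are left untouched."""
    if new not in source and old in source:
        source[new] = source[old]
        source.pop(old)
    return source

print(rename_field({"fieldCamelcase": "x", "field_test": 1.0}, "fieldCamelcase", "fieldCamelCase"))
# → {'field_test': 1.0, 'fieldCamelCase': 'x'}
```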
First of all, you must understand how Elasticsearch and Lucene store data: in immutable segments (you can easily read about this on the Internet).
So any solution will remove/create documents and change the mapping, or create a new index and thus a new mapping as well.
The easiest way is to use the update by query API: https://www.elastic.co/guide/en/elasticsearch/reference/2.4/docs-update-by-query.html
POST /XXXX/_update_by_query
{
"query": {
"missing": {
"field": "fieldCamelCase"
}
},
"script" : {
"inline": "ctx._source.fieldCamelCase = ctx._source.fieldCamelcase; ctx._source.remove(\"fieldCamelcase\");"
}
}
Starting with ES 6.4 you can use "Field Aliases", which allow the functionality you're looking for with close to 0 work or resources.
Do note that aliases can only be used for searching - not for indexing new documents.

Elasticsearch query on array index

How do I query/filter by index of an array in elasticsearch?
I have a document like this:-
PUT /edi832/record/1
{
"LIN": [ "UP", "123456789" ]
}
I want to search if LIN[0] is "UP" and LIN[1] exists.
Thanks.
This might look like a hack, but it will work for sure.
First we apply the token count type along with a multi field to capture the number of tokens as a field.
So the mapping will look like this -
{
"record" : {
"properties" : {
"LIN" : {
"type" : "string",
"fields" : {
"word_count": {
"type" : "token_count",
"store" : "yes",
"analyzer" : "standard"
}
}
}
}
}
}
LINK - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#token_count
So to check if the second field exists, it's as easy as checking if this field's value is greater than or equal to 2.
Next we can use the token filter to check if the token "up" exists in position 0.
We can use the scripted filter to check this.
Hence a query like below should work -
{
"query": {
"filtered": {
"query": {
"range": {
"LIN.word_count": {
"gte": 2
}
}
},
"filter": {
"script": {
"script": "for(pos : _index['LIN'].get('up',_POSITIONS)){ if(pos.position == 0) { return true}};return false;"
}
}
}
}
}
Advanced scripting - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-advanced-scripting.html
Script filters - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-script-filter.html
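A rough Python simulation of the two checks above, assuming the standard analyzer roughly splits on whitespace and lowercases tokens (both helpers are made up for illustration):

```python
def word_count(value):
    """Rough stand-in for the token_count sub-field under a
    whitespace-ish standard analyzer."""
    return len(value.split())

def lin_matches(lin_values):
    """Match if the LIN array yields >= 2 tokens and token 'up' sits at
    position 0 (array values are concatenated into one field at index time)."""
    tokens = " ".join(lin_values).lower().split()
    return len(tokens) >= 2 and tokens[0] == "up"

print(lin_matches(["UP", "123456789"]))  # → True
print(lin_matches(["123456789"]))        # → False
```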

elasticsearch filtering by the size of a field that is an array

How can I filter documents that have a field which is an array and has more than N elements?
How can I filter documents that have a field which is an empty array?
Is facets the solution? If so, how?
I would have a look at the script filter. The following filter should return only the documents that have at least 10 elements in the fieldname field, which is an array. Keep in mind that this could be expensive depending on how many documents you have in your index.
"filter" : {
"script" : {
"script" : "doc['fieldname'].values.length > 10"
}
}
Regarding the second question: do you really have an empty array there? Or is it just an array field with no value? You can use the missing filter to get documents which have no value for a specific field:
"filter" : {
"missing" : { "field" : "user" }
}
Otherwise I guess you need to use scripting again, similarly to what I suggested above, just with a different length as input. If the length is constant I'd put it in the params section so that the script will be cached by elasticsearch and reused, since it's always the same:
"filter" : {
"script" : {
"script" : "doc['fieldname'].values.length > params.param1",
"params" : {
"param1" : 10
}
}
}
javanna's answer is correct for Elasticsearch 1.3.x and earlier; since 1.4 the default scripting language has changed to Groovy (it was MVEL).
To answer OP's question.
On Elasticsearch 1.3.x and earlier, use this code:
"filter" : {
"script" : {
"script" : "doc['fieldname'].values.length > 10"
}
}
On Elasticsearch 1.4.x and later, use this code:
"filter" : {
"script" : {
"script" : "doc['fieldname'].values.size() > 10"
}
}
Additionally, on Elasticsearch 1.4.3 and later, you will need to enable the dynamic scripting as it has been disabled by default, because of security issue. See: https://www.elastic.co/guide/en/elasticsearch/reference/1.4/modules-scripting.html
Still posting here for anyone stuck in the same situation as me.
Let's say your data looks like this:
{
"_source": {
"fieldName" : [
{
"f1": "value 11",
"f2": "value 21"
},
{
"f1": "value 12",
"f2": "value 22"
}
]
}
}
Then to filter fieldName with length > 1 for example:
"query": {
"bool" : {
"must" : {
"script" : {
"script" : {
"inline": "doc['fieldName.f1'].values.length > 1",
"lang": "painless"
}
}
}
}
}
The script syntax is as ES 5.4 documentation https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-script-query.html.
Imho the correct way of filtering arrays by size using scripting is:
"filter" : {
"script" : {
"script" : "_source.fieldName.size() > 1"
}
}
If I do it as @javanna suggests, it throws the exception groovy.lang.MissingPropertyException: No such property: length for class: java.lang.String
If you have an array of objects that aren't mapped as nested, keep in mind that Elastic will flatten them into:
attachments: [{size: 123}, {size: 456}] --> attachments.size: [123, 456]
So you want to reference your field as doc['attachments.size'].length, not doc['attachments'].length, which is very counter-intuitive.
Same for doc.containsKey('attachments.size').
The .values part is deprecated and no longer needed.
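The flattening behavior described above can be reproduced with a small Python sketch (illustrative only): every leaf path collects all values found at that path across the array.

```python
def flatten(obj, prefix=""):
    """Sketch of how non-nested object arrays are flattened: each leaf
    path maps to the list of every value found at that path."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            for path, leaves in flatten(value, prefix + key + ".").items():
                flat.setdefault(path, []).extend(leaves)
    elif isinstance(obj, list):
        for item in obj:
            for path, leaves in flatten(item, prefix).items():
                flat.setdefault(path, []).extend(leaves)
    else:
        flat[prefix.rstrip(".")] = [obj]
    return flat

print(flatten({"attachments": [{"size": 123}, {"size": 456}]}))
# → {'attachments.size': [123, 456]}
```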
Based on this:
https://code.google.com/p/guava-libraries/source/browse/guava/src/com/google/common/collect/RegularImmutableList.java?r=707f3a276d4ea8e9d53621d137febb00cd2128da
And on lisak's answer here.
There is a size() function which returns the length of the list:
"filter" : {
"script" : {
"script" : "doc['fieldname'].values.size() > 10"
}
}
The easiest way to do this is to "denormalize" your data so that you have a property that contains the count and a boolean for whether it exists. Then you can just search on those properties.
For example:
{
"id": 31939,
"hasAttachments": true,
"attachmentCount": 2,
"attachments": [
{
"type": "Attachment",
"name": "txt.txt",
"mimeType": "text/plain"
},
{
"type": "Inline",
"name": "jpg.jpg",
"mimeType": "image/jpeg"
}
]
}
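A sketch of that index-time enrichment step in Python (the `denormalize` helper is hypothetical; you would run it over each document before indexing):

```python
def denormalize(doc):
    """Add the count/flag fields at index time, as suggested above,
    so queries can filter on them directly."""
    attachments = doc.get("attachments", [])
    doc["attachmentCount"] = len(attachments)
    doc["hasAttachments"] = len(attachments) > 0
    return doc

doc = denormalize({"id": 31939, "attachments": [{"type": "Attachment"}, {"type": "Inline"}]})
print(doc["hasAttachments"], doc["attachmentCount"])  # → True 2
```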
@javanna gave the correct answer for when you need to find documents containing a field whose size/length is larger than zero. I only wanted to add that if your field is a text field and you want to find documents which contain some text in that field, you can't use the same query. You will need to do something like this:
GET index/_search
{
"query": {
"bool": {
"must": [
{
"range": {
"FIELD_NAME": {
"gt": 0
}
}
}
]
}
}
}
This is not an exact answer to this question because an answer already exists, but it is a solution for a similar problem which I had, so maybe somebody will find it useful.
a suggestion about the second question:
How can I filter documents that have a field which is an empty array?
{
"query": {
"bool": {
"must_not": {
"exists": {
"field": "fieldname"
}
}
}
}
}
will return docs with empty fieldname: [] arrays. must (rather than must_not) will return the opposite.
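This works because, for the exists query, an empty array indexes no values and is therefore indistinguishable from a missing field. A small Python sketch of that semantics (illustrative helper):

```python
def field_exists(doc, field):
    """Sketch of exists-query semantics: a field 'exists' only if it
    indexes at least one non-null value; [] behaves like a missing field."""
    values = doc.get(field)
    if values is None:
        return False
    if isinstance(values, list):
        return any(v is not None for v in values)
    return True

print(field_exists({"fieldname": []}, "fieldname"))      # → False
print(field_exists({"fieldname": [1, 2]}, "fieldname"))  # → True
print(field_exists({}, "fieldname"))                     # → False
```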
Here is what worked for me:
GET index/_search
{
"query": {
"bool": {
"filter" : {
"script" : {
"script" : "doc['FieldName'].length > 10"
}
}
}
}
}
For version 7+:
"filter": {
"script": {
"script": {
"source": "doc['fieldName.keyword'].length > 10",
"lang": "painless"
}
}
}
Ref. https://medium.com/@felipegirotti/elasticsearch-filter-field-array-more-than-zero-8d52d067d3a0
