Elasticsearch range not working for double indexed as long

I have mapped a field as long, but the input data is decimal (100.123).
I've tried every range search I can think of and none of them work. I've verified that the data is in the proper index, and I can find the documents if I search with missing/exists.
Range query:
"range": {
"nr_val": {
"from": 123,
"to": 1234
}
}
Is Elasticsearch just ignoring the values, treating them as strings in a range search?
So in my situation, what can I do to make a range search from:100, to:200 work for 100.123 other than a full dump and re-import? Are there any conversion options available?
Update with detailed specs
{
"state": "open",
"settings": {
"index": {
"creation_date": "1447858537098",
"number_of_shards": "5",
"uuid": "iiPzQXasQadvnDF1da8oMw",
"version": {
"created": "1070299"
},
"number_of_replicas": "1"
}
},
"mappings": {
"mongo_doc": {
"properties": {
"parent": {
"type": "string"
},
"data.current.specs.nr._nrm_val": {
"type": "double"
},
"data.current.specs.nr_b._nrm_val": {
"type": "double"
},
"data": {
"properties": {
"current": {
"properties": {
"specs": {
"properties": {
"nr": {
"properties": {
"_nrm_val": {
"type": "double"
}
}
},
"nr_b": {
"properties": {
"_nrm_val": {
"type": "long"
}
}
}
}
}
}
}
}
}
}
}
},
"aliases": []
}
Seems that the mapping is not quite right... I switched to the ['data']['properties']['current']['properties'](...) notation.

In your case that field should have been mapped as double, not long. The indexed value for 100.123 is 100, so you lose the decimals.
At this point, other than re-indexing (which is the ideal fix), scripted filtering will probably do it:
{
"query": {
"filtered": {
"filter": {
"script": {
"script": "_source['nr'].value >= param1 && _source['nr'].value <= param2",
"params": {
"param1": 100,
"param2": 200
}
}
}
}
}
}
But it will be expensive because of the _source loading.
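For completeness, a sketch of the ideal fix (the new index name below is illustrative): re-create the index with the field mapped as double and re-import, after which a plain range filter works with no scripting:
PUT /my_index_v2
{
  "mappings": {
    "mongo_doc": {
      "properties": {
        "data": {
          "properties": {
            "current": {
              "properties": {
                "specs": {
                  "properties": {
                    "nr_b": {
                      "properties": {
                        "_nrm_val": { "type": "double" }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
POST /my_index_v2/mongo_doc/_search
{
  "query": {
    "filtered": {
      "filter": {
        "range": {
          "data.current.specs.nr_b._nrm_val": { "from": 100, "to": 200 }
        }
      }
    }
  }
}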

Related

Elasticsearch remove a field from an object of an array in a dynamically generated index

I'm trying to delete fields from the objects of an array in Elasticsearch. The index has been dynamically generated.
This is the mapping:
{
"mapping": {
"_doc": {
"properties": {
"age": {
"type": "long"
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"result": {
"properties": {
"resultid": {
"type": "long"
},
"resultname": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
},
"timestamp": {
"type": "date"
}
}
}
}
}
}
this is a document:
{
"result": [
{
"resultid": 69,
"resultname": "SFO"
},
{
"resultid": 151,
"resultname": "NYC"
}
],
"age": 54,
"name": "Jorge",
"timestamp": "2020-04-02T16:07:47.292000"
}
My goal is to remove all the resultid fields inside result in all the documents of the index. After the update, a document should look like this:
{
"result": [
{
"resultname": "SFO"
},
{
"resultname": "NYC"
}
],
"age": 54,
"name": "Jorge",
"timestamp": "2020-04-02T16:07:47.292000"
}
I tried the approaches from the following Stack Overflow posts, but with no luck:
Remove elements/objects From Array in ElasticSearch Followed by Matching Query
remove objects from array that satisfying the condition in elastic search with javascript api
Delete nested array in elasticsearch
Removing objects from nested fields in ElasticSearch
Hopefully someone can help me find a solution.
You should reindex your index into a new one with the _reindex API and use a script to remove the fields:
POST _reindex
{
"source": {
"index": "my-index"
},
"dest": {
"index": "my-index-reindex"
},
"script": {
"source": """
for (int i=0;i<ctx._source.result.length;i++) {
ctx._source.result[i].remove("resultid")
}
"""
}
}
After that you can delete your first index:
DELETE my-index
And reindex it back:
POST _reindex
{
"source": {
"index": "my-index-reindex"
},
"dest": {
"index": "my-index"
}
}
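A possible shortcut here (a sketch, not part of the original answer): instead of reindexing back, delete the old index and add an alias with the old name to the reindexed copy, which saves the second _reindex pass as long as nothing requires my-index to be a concrete index:
POST _aliases
{
  "actions": [
    { "remove_index": { "index": "my-index" } },
    { "add": { "index": "my-index-reindex", "alias": "my-index" } }
  ]
}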
I combined the answer from Luc E with some of my own knowledge in order to reach a solution without reindexing.
POST INDEXNAME/TYPE/_update_by_query?wait_for_completion=false&conflicts=proceed
{
"script": {
"source": "for (int i=0;i<ctx._source.result.length;i++) { ctx._source.result[i].remove(\"resultid\")}"
},
"query": {
"bool": {
"must": [
{
"exists": {
"field": "result.id"
}
}
]
}
}
}
Thanks again Luc!
If your array has more than one copy of the element you want to remove, use this:
ctx._source.some_array.removeIf(tag -> tag == params['c'])
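For example, wrapped in a full request against the result array from this question (the params value 69 is only illustrative):
POST INDEXNAME/_update_by_query?conflicts=proceed
{
  "script": {
    "source": "ctx._source.result.removeIf(r -> r.resultid == params.c)",
    "params": { "c": 69 }
  }
}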

Terms aggregation with nested wildcard path

Given the following nested object of nested objects
{
[...]
"nested_parent":{
"nested_child_1":{
"classifier":"one"
},
"nested_child_2":{
"classifier":"two"
},
"nested_child_3":{
"classifier":"two"
},
"nested_child_4":{
"classifier":"five"
},
"nested_child_5":{
"classifier":"six"
}
[...]
}
I want to aggregate on the wildcard-ish field nested_parent.*.classifier, along the lines of
{
"size": 0,
"aggs": {
"termsAgg": {
"nested": {
"path": "nested_parent.*"
},
"aggs": {
"termsAgg": {
"terms": {
"size": 1000,
"field": "nested_parent.*.classifier"
}
}
}
}
}
}
which does not seem to work, possibly because the path and field are not defined clearly enough.
How can I aggregate on nested objects with dynamically created nested mappings which share most of their properties, including the classifier on which I intend to terms-aggregate?
Tl;dr:
A bit late to the party.
I would suggest a different approach, as I don't see a possible solution using wildcards.
My solution involves using copy_to to create a field that you can then access in an aggregation.
Solution
The idea is to create a field that stores the values of all your classifiers, which you can then aggregate on.
PUT /54198251/
{
"mappings": {
"properties": {
"classifiers": {
"type": "keyword"
},
"parent": {
"type": "nested",
"properties": {
"child": {
"type": "nested",
"properties": {
"classifier": {
"type": "keyword",
"copy_to": "classifiers"
}
}
},
"child2": {
"type": "nested",
"properties": {
"classifier": {
"type": "keyword",
"copy_to": "classifiers"
}
}
}
}
}
}
}
}
POST /54198251/_doc
{
"parent": {
"child": {
"classifier": "c1"
},
"child2": {
"classifier": "c2"
}
}
}
GET /54198251/_search
{
"aggs": {
"classifiers": {
"terms": {
"field": "classifiers",
"size": 10
}
}
}
}
Will give you:
"aggregations": {
"classifiers": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "c1",
"doc_count": 1
},
{
"key": "c2",
"doc_count": 1
}
]
}
}

Nested query in ElasticSearch - two levels

I have the following mapping:
"c_index": {
"aliases": {},
"mappings": {
"an": {
"properties": {
"id": {
"type": "string"
},
"sm": {
"type": "nested",
"properties": {
"cr": {
"type": "nested",
"properties": {
"c": {
"type": "string"
},
"e": {
"type": "long"
},
"id": {
"type": "string"
},
"s": {
"type": "long"
}
}
},
"id": {
"type": "string"
}
}
}
}
}
}
And I need a query that gives me all the cr's when:
an.id == x and sm.id == y
I tried with:
{"query":{"bool":{"should":[{"terms": {"_id": ["x"]}},
{"nested":{"path": "sm","query":{
"match": {"sm.id":"y"}}}}]}}}
But it runs very slowly and gives more info than I need.
What's the most efficient way to do that? Thank you!
You don't need a nested query here. Also, use filter instead of should if you want to find documents matching all the queries. (The exception would be if you wanted a query to affect the score, like a match query, which is not the case here; then you could use should plus the minimum_should_match option.)
{
"query": {
"bool": {
"filter": [
{ "term": { "_id": "x" } },
{ "term": { "sm.id": "y" } }
]
}
}
}
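If the "gives more info than I need" part refers to the response body rather than the matching logic, one option (a sketch, not from the answer above) is to keep the nested query and add inner_hits, since sm is mapped as nested; with _source disabled, the response carries only the matching sm objects:
{
  "_source": false,
  "query": {
    "bool": {
      "filter": [
        { "term": { "_id": "x" } },
        {
          "nested": {
            "path": "sm",
            "query": { "term": { "sm.id": "y" } },
            "inner_hits": {}
          }
        }
      ]
    }
  }
}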

In Elasticsearch, how to move data from one field into another field

I have an index with mappings that look like this:
"mappings": {
"default": {
"_all": {
"enabled": false
},
"properties": {
"Foo": {
"properties": {
"Bar": {
"type": "keyword"
}
}
}
}
}
I am trying to change the mapping to introduce a sub-field of Bar, called Code, whilst migrating the string currently in Bar into Bar.Code. Here is the new mapping:
"mappings": {
"default": {
"_all": {
"enabled": false
},
"properties": {
"Foo": {
"properties": {
"Bar": {
"properties": {
"Code": {
"type": "keyword"
}
}
}
}
}
}
}
In order to do this, I think I need to do a _reindex and specify a pipeline. Is that correct? If so, how does my pipeline access the original data?
I have tried variations on the following code, but without success:
PUT _ingest/pipeline/transformFooBar
{
"processors": [
{
"set": {
"field": "Bar.Code",
"value": "{{_source.Bar}}"
}
}
]
}
POST _reindex
{
"source": {
"index": "foo_v1"
},
"dest": {
"index": "foo_v2",
"pipeline": "transformFooBar"
}
}
Ah, I almost had the syntax right. The _source prefix is not required:
// Create a pipeline with a SET processor
PUT _ingest/pipeline/transformFooBar
{
"processors": [
{
"set": {
"field": "Bar.Code",
"value": "{{Bar}}"
}
}
]
}
// Reindex using the above pipeline
POST _reindex
{
"source": {
"index": "foo_v1"
},
"dest": {
"index": "foo_v2",
"pipeline": "transformFooBar"
}
}
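For what it's worth, the same move can be sketched without an ingest pipeline, using a _reindex script instead (field paths follow the mapping in the question and assume Foo.Bar is present in every document):
POST _reindex
{
  "source": { "index": "foo_v1" },
  "dest": { "index": "foo_v2" },
  "script": {
    "source": "ctx._source.Foo.Bar = ['Code': ctx._source.Foo.Bar]"
  }
}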

Elasticsearch Aggregation - Unable to perform aggregation to object

I have a mapping with an inner object as follows:
{
"mappings": {
"_all": {
"enabled": false
},
"properties": {
"foo": {
"name": {
"type": "string",
"index": "not_analyzed"
},
"address": {
"type": "object",
"properties": {
"address": {
"type": "string"
},
"city": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
}
When I try the following aggregation it does not return any data:
POST data:*/foo/_search?search_type=count
{
"query": {
"match_all": {}
},
"aggs": {
"unique": {
"cardinality": {
"field": "address.city"
}
}
}
}
When I try the field city or address.city, the aggregation returns zero, but if I use foo.address.city I get the correct response from Elasticsearch. This also affects Kibana behavior.
Any ideas why this is happening? I saw there is a mapping refactoring that might affect this. I'm using Elasticsearch version 1.7.1.
To add to this: if I use the relative path in a search query as follows, it works normally:
"query": {
"filtered": {
"filter": {
"term": {
"address.city": "london"
}
}
}
}
Seems it's this same issue.
This happens when the type name and a field name are the same.
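In other words, since the type and the top-level field are both named foo, the workaround on 1.x is the one the question already stumbled on: qualify the field fully in the aggregation:
POST data:*/foo/_search?search_type=count
{
  "query": { "match_all": {} },
  "aggs": {
    "unique": {
      "cardinality": { "field": "foo.address.city" }
    }
  }
}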
