How does Elasticsearch store a float value in a keyword field?

I have created this index:
{
"users" : {
"mappings" : {
"properties" : {
"user_id" : {
"type" : "keyword"
}
}
}
}
}
and I added this entry:
PUT users/_doc/1
{
"user_id": 4.0000
}
When I query like this, I always get this entry back:
GET /_search
{
"query": {
"term": {
"user_id": {
"value": 4.0, // not exact same value as I put in
"boost": 1.0
}
}
}
}
I wonder why this happens.

Related

ES query to match all elements in array

I have documents with a nested array that I want to filter with the query below.
I want ES to return only those documents in which every item has changes = 0.
If a document has even a single item in the list with changes = 1, it should be discarded.
Is there any way I can achieve this starting from the query I have already written? Or should I use a script instead?
DOCUMENTS:
{
"id": "abc",
"_source" : {
"trips" : [
{
"type" : "home",
"changes" : 0
},
{
"type" : "home",
"changes" : 1
}
]
}
},
{
"id": "def",
"_source" : {
"trips" : [
{
"type" : "home",
"changes" : 0
},
{
"type" : "home",
"changes" : 0
}
]
}
}
QUERY:
GET trips_solutions/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"id": {
"value": "abc"
}
}
},
{
"nested": {
"path": "trips",
"query": {
"range": {
"trips.changes": {
"gt": -1,
"lt": 1
}
}
}
}
}
]
}
}
}
EXPECTED RESULT:
{
"id": "def",
"_source" : {
"trips" : [
{
"type" : "home",
"changes" : 0
},
{
"type" : "home",
"changes" : 0
}
]
}
}
Elasticsearch version: 7.6.2
I have already read these answers, but they didn't help me:
https://discuss.elastic.co/t/how-to-match-all-item-in-nested-array/163873
ElasticSearch: How to query exact nested array
First off, if you filter by id: abc, you obviously won't be able to get id: def back.
Second, due to the nature of nested fields which are treated as separate subdocuments, you cannot query for all trips that have the changes equal to 0 -- the connection between the individual trips is lost and they "don't know about each other".
What you can do is return only the trips that matched your nested query using inner_hits:
GET trips_solutions/_search
{
"_source": false,
"query": {
"bool": {
"must": [
{
"nested": {
"inner_hits": {},
"path": "trips",
"query": {
"term": {
"trips.changes": {
"value": 0
}
}
}
}
}
]
}
}
}
The easiest solution, then, is to dynamically save this nested info on the parent object, as discussed here, and use a range/term query on the resulting array.
EDIT:
Here's how you do it using copy_to onto the doc's top level:
PUT trips_solutions
{
"mappings": {
"properties": {
"trips_changes": {
"type": "integer"
},
"trips": {
"type": "nested",
"properties": {
"changes": {
"type": "integer",
"copy_to": "trips_changes"
}
}
}
}
}
}
trips_changes will be an array of numbers -- I presume they're integers but more types are available.
Then syncing a few docs:
POST trips_solutions/_doc
{"trips":[{"type":"home","changes":0},{"type":"home","changes":1}]}
POST trips_solutions/_doc
{"trips":[{"type":"home","changes":0},{"type":"home","changes":0}]}
And finally querying:
GET trips_solutions/_search
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "trips",
"query": {
"term": {
"trips.changes": {
"value": 0
}
}
}
}
},
{
"script": {
"script": {
"source": "doc.trips_changes.stream().filter(val -> val != 0).count() == 0"
}
}
}
]
}
}
}
Note that we first filter normally using the nested term query to narrow down our search context (scripts are slow, so this is useful). We then check whether the accumulated top-level changes contain any non-zero values and reject the documents that do.
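The combined effect of the two clauses can be modelled in plain Python. This is only a sketch of the logic, not of how Elasticsearch executes it: the helper names are made up for illustration, the ids abc and def come from the question, and real copy_to affects what is indexed rather than the stored _source.

```python
def copy_changes_to_parent(doc):
    """Model copy_to: gather every trips[].changes value on the top-level document."""
    return {**doc, "trips_changes": [trip["changes"] for trip in doc["trips"]]}


def all_trips_unchanged(doc):
    """Model the two query clauses applied to one document."""
    # Nested term query: at least one trip has changes == 0.
    has_zero = any(trip["changes"] == 0 for trip in doc["trips"])
    # Script query: the accumulated array contains no non-zero value.
    no_nonzero = all(value == 0 for value in doc["trips_changes"])
    return has_zero and no_nonzero


docs = [
    {"id": "abc", "trips": [{"type": "home", "changes": 0}, {"type": "home", "changes": 1}]},
    {"id": "def", "trips": [{"type": "home", "changes": 0}, {"type": "home", "changes": 0}]},
]

indexed = [copy_changes_to_parent(doc) for doc in docs]
matching_ids = [doc["id"] for doc in indexed if all_trips_unchanged(doc)]
print(matching_ids)
```

Only the document whose trips all have changes equal to 0 survives both checks.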

Trying to update a nested geoip location field in elasticsearch

Here is what I've tried:
POST orders/_update_by_query
{
"script" : "ctx._source.geoip += newElement",
"params": {
"newElement": {
"location" : "[40.730610, -73.935242]"
}
},
"query": {
"term": {
"CITY": {
"value": "nyc"
}
}
}
}
The above throws error Unknown key for a START_OBJECT in [params].
Second Attempt:
POST orders/_update_by_query
{
"script":{
"source":
"for (item in ctx._source.geoip){item.location = '[40.730610, -73.935242]'}",
"lang":"painless"
},
"query": {
"term": {
"CITY": {
"value": "nyc"
}
}
}
}
The above throws a null pointer exception and points to the period at source.geoip
I also tried changing the value of location to just test, but I received the same errors.
Here is my mapping:
{
"orders" : {
"mappings" : {
"properties" : {
"geoip" : {
"dynamic" : "true",
"properties" : {
"location" : {
"type" : "geo_point"
}
}
}
}
}
}
}
I am using ES v7.2 and Kibana v7.2
A couple of issues in the 1st approach:
params need to be defined within the script object, not below it
newElement needs to be accessed using params.newElement
you cannot append += params.newElement to a nonexistent ctx._source.geoip
you cannot append an object to a single-value field -- you can just assign it
location is of the geo_point type, so use either [-73.935242, 40.730610] ([lon, lat]) or "40.730610,-73.935242" ("lat,lon"), but not a mixture of both
Working command:
POST orders/_update_by_query
{
"script": {
"source": "ctx._source.geoip = params.newElement",
"params": {
"newElement": {
"location": [
-73.935242,
40.73061
]
}
}
},
"query": {
"term": {
"CITY": {
"value": "nyc"
}
}
}
}
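If you build the request from code, a small helper can keep the coordinate order straight. The sketch below is an illustration only: the function names are hypothetical, and it relies on the geo_point conventions that the array form is [lon, lat] while the string form is "lat,lon". It produces the request body as a dict and does not send it anywhere.

```python
import json


def geo_point_as_array(lat, lon):
    """Array form of a geo_point: longitude first, then latitude."""
    return [lon, lat]


def geo_point_as_string(lat, lon):
    """String form of a geo_point: latitude first, then longitude."""
    return f"{lat},{lon}"


def build_update_body(city, lat, lon):
    """Build an _update_by_query body with params inside the script object."""
    return {
        "script": {
            "source": "ctx._source.geoip = params.newElement",
            "lang": "painless",
            "params": {"newElement": {"location": geo_point_as_array(lat, lon)}},
        },
        "query": {"term": {"CITY": {"value": city}}},
    }


# Coordinates from the question, read as latitude 40.730610, longitude -73.935242.
body = build_update_body("nyc", lat=40.730610, lon=-73.935242)
print(json.dumps(body, indent=2))
```

Taking lat and lon as named arguments and converting in one place makes it harder to mix the two orders by accident.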

ElasticSearch nested query score

I have an index:
PUT my_index2
{
"mappings": {
"my_type": {
"properties": {
"user": {
"type": "nested"
}
}
}
}
}
I have two documents:
POST my_index2/my_type/
{
"user": [
{
"name": "Alice Don"
},
{
"name": "Smith"
}
]
}
POST my_index2/my_type/
{
"user": [
{
"name": "Alice David"
}
]
}
When I search it:
GET my_index2/_search
{
"query": {
"nested" : {
"path" : "user",
"query" : {
"bool" : {
"should" : [
{ "match" : {"user.name" : "Alice"} }
]
}
}
}
}
}
Although both documents have one "Alice", the score of the first one is higher. How is that possible?
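The matching "name" in your first document is shorter, so a larger share of it matches the query. Keep in mind, though, that relevance scoring counts terms rather than characters and is computed per shard, so if the two documents sit on different shards, their differing term statistics can also produce different scores.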
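To see which factors can move the score, here is a rough sketch of a single-term BM25 score with the usual defaults (k1 = 1.2, b = 0.75). It assumes the index uses BM25 similarity, which is the default since Elasticsearch 5.0, and it is a simplification of what Lucene actually computes. The shard layout in the second half is a hypothetical, since the question does not say where the documents live.

```python
import math

K1 = 1.2   # BM25 term-frequency saturation parameter (common default)
B = 0.75   # BM25 length-normalization parameter (common default)


def bm25_term_score(freq, field_len, avg_field_len, doc_count, doc_freq):
    """Simplified BM25 score of one query term in one field."""
    idf = math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))
    tf_norm = (freq * (K1 + 1)) / (freq + K1 * (1 - B + B * field_len / avg_field_len))
    return idf * tf_norm


# Field length is measured in terms, so both matching names have length 2.
len_don = len("Alice Don".split())
len_david = len("Alice David".split())

# All three nested user objects on one shard: identical statistics, identical scores.
same_a = bm25_term_score(1, len_don, 5 / 3, doc_count=3, doc_freq=2)
same_b = bm25_term_score(1, len_david, 5 / 3, doc_count=3, doc_freq=2)

# Hypothetical split: shard A holds "Alice Don" and "Smith", shard B holds "Alice David".
shard_a = bm25_term_score(1, len_don, 1.5, doc_count=2, doc_freq=1)
shard_b = bm25_term_score(1, len_david, 2.0, doc_count=1, doc_freq=1)

print(same_a == same_b, shard_a > shard_b)
```

Under these assumptions the character length of the name plays no role, while per-shard document counts do. Running the search with search_type=dfs_query_then_fetch is one way to check whether shard statistics explain the difference.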

Elastic GeoHash Query - Aggregation Filter

I am trying to query an elastic index where the result of the query is a list of the geohashes with only one matching document.
I can get a simple list of all geo hashes and their document counts using the following:
{
"size" : 0,
"aggregations" : {
"boundingbox" : {
"filter" : {
"geo_bounding_box" : {
"location" : {
"top_left" : "34.5, -118.9",
"bottom_right" : "33.3, -116."
}
}
},
"aggregations":{
"grid" : {
"geohash_grid" : {
"field": "location",
"precision": 4
}
}
}
}
}
}
However, I can't work out the correct syntax to filter the results. My closest attempts are below.
This fails with 503 org.elasticsearch.search.aggregations.bucket.filter.InternalFilter cannot be cast to org.elasticsearch.search.aggregations.InternalMultiBucketAggregation
"aggregations":{
"grid" : {
"geohash_grid" : {
"field": "location",
"precision": 4
}
},
"grid_bucket_filter" : {
"bucket_selector" : {
"buckets_path" :{
"docCount" : "grid" //Also tried `"docCount" : "doc_count"`
},
"script" : "params.docCount == 1"
}
}
}
This fails with 400 No aggregation found for path [doc_count]
"aggregations":{
"grid" : {
"geohash_grid" : {
"field": "location",
"precision": 4
}
},
"grid_bucket_filter" : {
"bucket_selector" : {
"buckets_path" :{
"docCount" : "doc_count"
},
"script" : "params.docCount > 1"
}
}
}
How can I filter based on the doc_count in a geohash grid?
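You need to do it like this: the bucket_selector pipeline must be specified as a sub-aggregation of the geohash_grid aggregation. You also need to use _count instead of doc_count (see here). The script below keeps buckets with more than one document; use params.docCount == 1 to keep only geohashes with exactly one: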
{
"aggregations": {
"grid": {
"geohash_grid": {
"field": "location",
"precision": 4
},
"aggs": {
"grid_bucket_filter": {
"bucket_selector": {
"buckets_path": {
"docCount": "_count"
},
"script": "params.docCount > 1"
}
}
}
}
}
}
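Conceptually, bucket_selector runs its script once per bucket of the parent aggregation and drops the buckets for which the script returns false. A plain-Python model of that behaviour, using made-up geohash buckets purely for illustration, might look like this:

```python
def select_buckets(buckets, predicate):
    """Keep only the buckets whose document count satisfies the predicate."""
    return [bucket for bucket in buckets if predicate(bucket["doc_count"])]


# Hypothetical geohash_grid output; keys and counts are invented for this sketch.
buckets = [
    {"key": "9q5c", "doc_count": 3},
    {"key": "9q5f", "doc_count": 1},
    {"key": "9mu9", "doc_count": 7},
]

# Equivalent of "params.docCount > 1"
more_than_one = select_buckets(buckets, lambda count: count > 1)
# Equivalent of "params.docCount == 1"
exactly_one = select_buckets(buckets, lambda count: count == 1)

print([b["key"] for b in more_than_one])
print([b["key"] for b in exactly_one])
```

This is also why the selector has to sit inside the geohash_grid aggregation: it needs a multi-bucket parent whose buckets it can inspect one at a time.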

ElasticSearch how to setup geo_point

I'm trying to set up a geo_point object on ES 1.0.0 and run a simple proof-of-concept query against it, but the query fails to return any hits. Here are my setup steps:
1) Create the mapping:
PUT jay/geotest/_mapping
{
"geotest" : {
"properties" : {
"name" : {
"type" : "string"
},
"pin" : {
"type": "geo_point"
}
}
}
}
2) Verify the mapping:
GET jay/geotest/_mapping
3) Add a piece of data:
PUT jay/geotest/1
{
"name": "test1",
"pin": {
"lat": 0,
"lon": 0
}
}
4) Query for that data:
GET jay/geotest/_search?search_type=count
{
"filtered" : {
"filter" : {
"geo_distance" : {
"distance" : "100km",
"pin" : {
"lat" : 0,
"lon" : 0
}
}
}
}
}
My expected result is that I will get one hit returned but instead nothing is returned by the query.
Thanks in advance!
I think you're missing the "query" part of the request.
POST jay/geotest/_search
{
"query": {
"filtered": {
"filter": {
"geo_distance": {
"distance": "100km",
"pin": {
"lat": 0,
"lon": 0
}
}
}
}
}
}
I've just tested your steps, and making that change returns the document.
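If request bodies are assembled in code, the same mistake can be guarded against with a tiny helper. The function below is a hypothetical sketch: it assumes that a body without a top-level "query" key is a bare query clause and wraps it, and it does nothing else to validate the request.

```python
def ensure_query_wrapper(body):
    """Return a search body whose query clause sits under the top-level "query" key."""
    if "query" in body:
        return body
    # Assumption: a body lacking "query" is a bare query clause.
    return {"query": body}


# The body from the question, which starts directly with "filtered".
original = {
    "filtered": {
        "filter": {
            "geo_distance": {
                "distance": "100km",
                "pin": {"lat": 0, "lon": 0},
            }
        }
    }
}

fixed = ensure_query_wrapper(original)
print(list(fixed.keys()))
```

Calling the helper on an already wrapped body returns it unchanged, so it is safe to apply more than once.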
