Elasticsearch - Is it possible to create histograms without having the field indexed

I came across the following passage in the Elasticsearch documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/current/tune-for-disk-usage.html
For instance if you have a numeric field called foo that you need to run histograms on but that you never need to filter on, you can safely disable indexing on this field in your mappings:
PUT index
{
"mappings": {
"properties": {
"foo": {
"type": "integer",
"index": false
}
}
}
}
Does this mean that aggregations such as histograms can be created even though the field is NOT indexed?

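Yes, that's correct. Setting "index": false only turns off the index structures used for searching; aggregations read doc values, which stay enabled by default for numeric fields. It's easy to test: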
Create the index:
PUT index
{
"mappings": {
"properties": {
"foo": {
"type": "integer",
"index": false
}
}
}
}
Index a sample document:
PUT index/_doc/1
{
"foo": 23
}
Run a histogram aggregation:
POST index/_search
{
"aggs": {
"histo": {
"histogram": {
"field": "foo",
"interval": 10
}
}
}
}
Results:
"aggregations" : {
"histo" : {
"buckets" : [
{
"key" : 20.0,
"doc_count" : 1
}
]
}
}
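For reference, the bucket key in the result can be reproduced by hand. The sketch below is plain Python, not Elasticsearch code; it assumes the rounding formula given in the histogram aggregation documentation, and the function name is made up for illustration.

```python
import math


def histogram_bucket_key(value, interval, offset=0.0):
    """Return the key of the histogram bucket that `value` falls into.

    Mirrors the documented formula:
    floor((value - offset) / interval) * interval + offset
    """
    return math.floor((value - offset) / interval) * interval + offset


# The sample document has foo = 23 and the aggregation uses interval = 10.
print(histogram_bucket_key(23, 10))
```

With foo = 23 and an interval of 10 this gives 20.0, which is the key of the single bucket returned above.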

Related

ES query to match all elements in array

I have this document with a nested array that I want to filter with this query.
I want ES to return only the documents where all items have changes = 0.
If a document has even a single item in the list with changes = 1, it should be discarded.
Is there any way I can achieve this starting from the query I have already written? Or should I use a script instead?
DOCUMENTS:
{
"id": "abc",
"_source" : {
"trips" : [
{
"type" : "home",
"changes" : 0
},
{
"type" : "home",
"changes" : 1
}
]
}
},
{
"id": "def",
"_source" : {
"trips" : [
{
"type" : "home",
"changes" : 0
},
{
"type" : "home",
"changes" : 0
}
]
}
}
QUERY:
GET trips_solutions/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"id": {
"value": "abc"
}
}
},
{
"nested": {
"path": "trips",
"query": {
"range": {
"trips.changes": {
"gt": -1,
"lt": 1
}
}
}
}
}
]
}
}
}
EXPECTED RESULT:
{
"id": "def",
"_source" : {
"trips" : [
{
"type" : "home",
"changes" : 0
},
{
"type" : "home",
"changes" : 0
}
]
}
}
Elasticsearch version: 7.6.2
I already read these answers, but they didn't help me:
https://discuss.elastic.co/t/how-to-match-all-item-in-nested-array/163873
ElasticSearch: How to query exact nested array
First off, if you filter by id: abc, you obviously won't be able to get id: def back.
Second, due to the nature of nested fields which are treated as separate subdocuments, you cannot query for all trips that have the changes equal to 0 -- the connection between the individual trips is lost and they "don't know about each other".
What you can do is return only the trips that matched your nested query using inner_hits:
GET trips_solutions/_search
{
"_source": false,
"query": {
"bool": {
"must": [
{
"nested": {
"inner_hits": {},
"path": "trips",
"query": {
"term": {
"trips.changes": {
"value": 0
}
}
}
}
}
]
}
}
}
The easiest solution, then, is to dynamically save this nested info on the parent object, as discussed here, and use a range/term query on the resulting array.
EDIT:
Here's how you do it using copy_to onto the doc's top level:
PUT trips_solutions
{
"mappings": {
"properties": {
"trips_changes": {
"type": "integer"
},
"trips": {
"type": "nested",
"properties": {
"changes": {
"type": "integer",
"copy_to": "trips_changes"
}
}
}
}
}
}
trips_changes will be an array of numbers. I presume they're integers, but other numeric types are available.
Then index a few docs:
POST trips_solutions/_doc
{"trips":[{"type":"home","changes":0},{"type":"home","changes":1}]}
POST trips_solutions/_doc
{"trips":[{"type":"home","changes":0},{"type":"home","changes":0}]}
And finally querying:
GET trips_solutions/_search
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "trips",
"query": {
"term": {
"trips.changes": {
"value": 0
}
}
}
}
},
{
"script": {
"script": {
"source": "doc.trips_changes.stream().filter(val -> val != 0).count() == 0"
}
}
}
]
}
}
}
Note that we first filter using the nested term query to narrow down the search context (scripts are slow, so this is useful). The script then checks whether the accumulated top-level trips_changes contains any non-zero value and rejects the documents where it does.
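To make the two-step logic concrete, here is a rough Python simulation of what the nested term query and the script filter above decide for each document. It does not talk to Elasticsearch; the function name and the in-memory docs dictionary are assumptions made for illustration, using the two sample documents from the question.

```python
def all_trips_unchanged(doc):
    """Emulate the combined query: nested term filter plus script filter."""
    # copy_to gathers every trips.changes value into the top-level trips_changes.
    trips_changes = [trip["changes"] for trip in doc.get("trips", [])]
    # Step 1 (nested term query): at least one trip has changes == 0.
    has_zero = any(value == 0 for value in trips_changes)
    # Step 2 (script filter): no value in trips_changes is non-zero.
    no_nonzero = all(value == 0 for value in trips_changes)
    return has_zero and no_nonzero


docs = {
    "abc": {"trips": [{"type": "home", "changes": 0}, {"type": "home", "changes": 1}]},
    "def": {"trips": [{"type": "home", "changes": 0}, {"type": "home", "changes": 0}]},
}

matching = [doc_id for doc_id, doc in docs.items() if all_trips_unchanged(doc)]
print(matching)
```

Only "def" survives both steps, which is the expected result from the question.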

Elastic Query for SQL like query

I want to search in Elasticsearch the way we do with a SQL query, e.g. (age = 25 AND name = 'xyz').
This is for a single field and a single value.
Yes, it's very much possible; just use the ES mapping and query below:
Mapping
{
"mappings": {
"properties": {
"name": {
"type": "text"
},
"age" :{
"type" : "integer"
}
}
},
"settings": {
"index": {
"number_of_shards": "1",
"number_of_replicas": "1"
}
}
}
Index a doc
{
"name": "xyz",
"age" : 25
}
Query
{
"query": {
"bool": {
"must": [
{
"match": {
"name": "xyz"
}
},
{
"match": {
"age": 25
}
}
]
}
}
}
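If the conditions come from application code, the same bool/must body can be assembled programmatically. This is only a sketch: the helper name and_query is hypothetical, and it just builds the request body as a Python dictionary without sending it anywhere.

```python
import json


def and_query(conditions):
    """Build a bool/must body where every field must match its value."""
    must = [{"match": {field: value}} for field, value in conditions.items()]
    return {"query": {"bool": {"must": must}}}


# Equivalent of the SQL condition: age = 25 AND name = 'xyz'
body = and_query({"name": "xyz", "age": 25})
print(json.dumps(body, indent=2))
```

Each entry in must acts like one AND-ed condition, so adding another field to the dictionary adds another clause.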
In addition to the accepted answer, you can use Elasticsearch SQL:
POST _sql?format=txt
{
"query":"SELECT age, name FROM collection WHERE age=25 AND name ='xyz'"
}
See also
https://www.elastic.co/what-is/elasticsearch-sql

Elastic Search Query Where value exists between 2 field values

I'm trying to figure out how to construct a query against Elasticsearch where the query value lies within the range defined by two field values.
Let's say I have a template
{
"template": "addresses",
"mappings": {
"addresses": {
"properties": {
"street_number_1": { "type": "integer" },
"street_number_2": { "type": "integer" },
... //other unimportant fields
}
}
}
}
Based on the above definition, if I have an address of 100-120 High Street, where street_number_1 is 100 and street_number_2 is 120, and I search for 112 High Street, this record should be returned because 112 is between 100 and 120. What kind of Elasticsearch function/query would allow me to do this?
You have two options. With your current mapping, you can use two range queries like this:
{
"query": {
"bool": {
"filter": [
{
"range": {
"street_number_1": {
"lte": 112
}
}
},
{
"range": {
"street_number_2": {
"gte": 112
}
}
}
]
}
}
}
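The two range filters above amount to a simple containment check. As a sanity check of the logic, here is the same condition in plain Python; the function name is made up, and the sample document is the 100-120 High Street address from the question.

```python
def address_matches(doc, number):
    """True when `number` lies between the two street number fields, inclusive."""
    # street_number_1 <= number  corresponds to  range street_number_1 lte number
    # street_number_2 >= number  corresponds to  range street_number_2 gte number
    return doc["street_number_1"] <= number <= doc["street_number_2"]


address = {"street_number_1": 100, "street_number_2": 120}

print(address_matches(address, 112))  # inside the range
print(address_matches(address, 121))  # outside the range
```

Both bounds are inclusive because the filters use lte and gte, so 100 and 120 themselves also match.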
The second option involves changing your mapping to use an integer range for the street number. Define your street number mapping like this:
PUT addresses
{
"mappings": {
"_doc": {
"properties": {
"street_number": {
"type": "integer_range"
}
}
}
}
}
Then index your address document like this:
PUT addresses/_doc/1
{
"street_number" : {
"gte" : 100,
"lte" : 120
}
}
And finally query it like this:
POST addresses/_search
{
"query" : {
"term" : {
"street_number" : {
"value": 112
}
}
}
}

Range Query on a score returned by match Query in Elastic Search

Suppose I have a set of documents like this:
{
"Name":"Random String 1",
"Type":"Keyword",
"City":"Lousiana",
"Quantity":"10"
}
Now I want to implement a full-text search using an n-gram analyzer on the fields Name and City.
After that, I want to keep only the results whose
"_score" :<Query Score Returned by ES>
is greater than 1.2 (maybe via a range query or an aggregation).
Then I want to apply a terms aggregation on the property "Type" and return the top results in each bucket using the "top_hits" aggregation.
How can I do so?
I've been able to implement everything apart from the range query on the score returned by the search query.
If you want to score the documents organically, you can use min_score in the search request to drop matched documents whose score is below the threshold.
For the n-gram analyzer I used a whitespace tokenizer with a lowercase filter and an n-gram filter.
Mappings
PUT index1
{
"settings": {
"analysis": {
"analyzer": {
"edge_n_gram_analyzer": {
"tokenizer": "whitespace",
"filter" : ["lowercase", "ednge_gram_filter"]
}
},
"filter": {
"ednge_gram_filter" : {
"type" : "ngram",
"min_gram" : 2,
"max_gram": 10
}
}
}
},
"mappings": {
"document_type" : {
"properties": {
"Name" : {
"type": "text",
"analyzer": "edge_n_gram_analyzer"
},
"City" : {
"type": "text",
"analyzer": "edge_n_gram_analyzer"
},
"Type" : {
"type": "keyword"
}
}
}
}
}
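To get a feel for what the analyzer above stores in the index, here is a rough Python approximation of the chain: whitespace tokenizer, lowercase filter, then n-grams of length 2 to 10. It is an illustration only, the function names are made up, and the real filter may order or position the tokens differently.

```python
def ngrams(token, min_gram=2, max_gram=10):
    """All substrings of `token` whose length is between min_gram and max_gram."""
    return [
        token[start:start + size]
        for start in range(len(token))
        for size in range(min_gram, max_gram + 1)
        if start + size <= len(token)
    ]


def analyze(text, min_gram=2, max_gram=10):
    """Split on whitespace, lowercase each token, then emit its n-grams."""
    grams = []
    for token in text.split():
        grams.extend(ngrams(token.lower(), min_gram, max_gram))
    return grams


tokens = analyze("Random String 1")
print(tokens)
```

The lowercased gram "string" is among the indexed tokens, which is why a term query for "string" can match the Name field, while the single-character token "1" is too short to produce any gram.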
Index Document
POST index1/document_type
{
"Name":"Random String 1",
"Type":"Keyword",
"City":"Lousiana",
"Quantity":"10"
}
Query
POST index1/_search
{
"min_score": 1.2,
"size": 0,
"query": {
"bool": {
"should": [
{
"term": {
"Name": {
"value": "string"
}
}
},
{
"term": {
"City": {
"value": "string"
}
}
}
]
}
},
"aggs": {
"type_terms": {
"terms": {
"field": "Type",
"size": 10
},
"aggs": {
"type_term_top_hits": {
"top_hits": {
"size": 10
}
}
}
}
}
}
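The overall flow of the request above (drop low-scoring hits, bucket by Type, keep the best hits per bucket) can be sketched in plain Python. The scores and the extra documents below are invented for illustration, the function name is hypothetical, and this is only an approximation of how min_score, terms and top_hits interact.

```python
from collections import defaultdict


def top_hits_per_type(hits, min_score=1.2, size=10):
    """Keep hits scoring at least min_score, group by Type, best hits first."""
    buckets = defaultdict(list)
    for hit in hits:
        if hit["_score"] >= min_score:  # hits below min_score are excluded
            buckets[hit["Type"]].append(hit)
    return {
        type_value: sorted(group, key=lambda h: h["_score"], reverse=True)[:size]
        for type_value, group in buckets.items()
    }


# Made-up hits with made-up scores.
hits = [
    {"Name": "Random String 1", "Type": "Keyword", "_score": 2.1},
    {"Name": "Random String 2", "Type": "Keyword", "_score": 0.8},
    {"Name": "Other String", "Type": "Phrase", "_score": 1.5},
]

buckets = top_hits_per_type(hits)
print(buckets)
```

Here the hit scoring 0.8 is dropped before bucketing, so the "Keyword" bucket contains a single document.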
Hope this helps

Querying Nested JSON based on 1 term value

I have indexed JSON in the format below
JSON:
{"work":[{"organization":"abc", "end":"present"},{"organization":"edf", "end":"old"}]}
{"work":[{"organization":"edf", "end":"present"},{"organization":"abc", "end":"old"}]}
I want to query records where organization is "abc" and end is "present",
but the query below is not working:
work.0.organization: "abc" AND work.0.end:"present"
No records are matched.
If I use the query below
work.organization: "abc" AND work.end:"present"
both records are matched, whereas I only want the first one.
The only matched record should be the one below:
{"work":[{"organization":"abc", "end":"present"},{"organization":"edf", "end":"old"}]}
You have to use the nested type. First map work as a nested type in Elasticsearch using the following mapping:
PUT index_name_3
{
"mappings": {
"document_type" : {
"properties": {
"work" : {
"type": "nested",
"properties": {
"organization" : {
"type" : "text"
},
"end" : {
"type" : "text"
}
}
}
}
}
}
}
Then use the following query to do a nested match, with inner_hits to return the matching entries:
{
"query": {
"nested": {
"path": "work",
"inner_hits": {},
"query": {
"bool": {
"must": [{
"term": {
"work.organization": {
"value": "abc"
}
}
},
{
"term": {
"work.end": {
"value": "present"
}
}
}
]
}
}
}
}
}
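The difference between the original behaviour and the nested query can be illustrated without Elasticsearch. In the sketch below (function names are made up), the first function mimics how a plain object array is flattened into independent value lists, and the second mimics the nested type, where both conditions must hold on the same array entry.

```python
def matches_flattened(doc, organization, end):
    """Non-nested arrays: each field is matched against all values independently."""
    organizations = {entry["organization"] for entry in doc["work"]}
    ends = {entry["end"] for entry in doc["work"]}
    return organization in organizations and end in ends


def matches_nested(doc, organization, end):
    """Nested type: both conditions must hold within a single work entry."""
    return any(
        entry["organization"] == organization and entry["end"] == end
        for entry in doc["work"]
    )


docs = [
    {"work": [{"organization": "abc", "end": "present"}, {"organization": "edf", "end": "old"}]},
    {"work": [{"organization": "edf", "end": "present"}, {"organization": "abc", "end": "old"}]},
]

flattened_hits = [i for i, doc in enumerate(docs) if matches_flattened(doc, "abc", "present")]
nested_hits = [i for i, doc in enumerate(docs) if matches_nested(doc, "abc", "present")]
print(flattened_hits, nested_hits)
```

The flattened check matches both documents, as in the question, while the nested check matches only the first one.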

Resources