What does total value shows inside the _search query result in elasticsearch?

What does total value shows inside the _search query result in elasticsearch? - elasticsearch

When we call the elasticsearch, say as follows:
POST https:////_search with body:
{
"from": 0,
"size": 1,
"query": {
"bool": {
"must": [
{
"range": {
"createdAt": {
"gt": "2019-11-11T10:00:00"
}
}
}
]
}
},
"sort": [
{
"createdAt" : {
"order" : "desc"
}
}
]
}
I see that I get only 1 result as pagination is set to 1 but total inside hits in response shows 2. This is the response I get:
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": null,
"hits": [
{
"_index": “<index-name>”,
"_type": "_doc",
"_id": "5113c843-dff3-499f-a12e-44c7ac103bcf_0",
"_score": null,
"_source": {
"oId": "5113c843-dff3-499f-a12e-44c7ac103bcf",
"oItemId": 0,
"createdAt": "2019-11-13T11:00:00"
},
"sort": [
1573642800000
]
}
]
}
}
Doesn’t total doesn’t capture the pagination part? And it only cares about the query report? It should show the total count of items matching the query irrespective of the pagination set, right?

Yes, You are right that total doesn't capture the pagination part and just cares about the query report ie. whatever the total no of the document matches for a given query.
To be precise, it is as explained in official ES docs .
total (Object) Metadata about the number of returned documents.
Returned parameters include:
value: Total number of returned documents. relation: Indicates whether
the number of documents returned. Returned values are:
eq: Accurate gte: Lower bound, including returned documents
It means its the total no of returned documents, but as pagination is set to 1 in your example, inner hits have just 1 document.You can cross-check this understanding easily by creating a sample example as below:
Create a sample index with just 1 text field:
URL:- http://localhost:9200/{your-index-name}/ --> PUT method
{
"mappings": {
"properties": {
"name": {
"type": "text"
}
}
},
"settings": {
"index": {
"number_of_shards": "1",
"number_of_replicas": "1"
}
}
}
Once the above index is created index below 4 documents:
URL:- http://localhost:9200/{your-index-name}/_doc/{1,2,like..} --> POST method
{
"name": "foo 1"
}
{
"name": "foo bar"
}
{
"name": "foo"
}
{
"name": "foo 2"
}
Now when you hit below search query without pagination:
{
"query": {
"bool": {
"must": [
{
"match": {
"name": "foo"
}
}
]
}
}
}
It gives below response:
{
"took": 9,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 4, --> Note 4 here
"relation": "eq"
},
"max_score": 0.12199639,
"hits": [
{
"_index": "59638303",
"_type": "_doc",
"_id": "1",
"_score": 0.12199639,
"_source": {
"name": "foo"
}
},
{
"_index": "59638303",
"_type": "_doc",
"_id": "3",
"_score": 0.12199639,
"_source": {
"name": "foo"
}
},
{
"_index": "59638303",
"_type": "_doc",
"_id": "2",
"_score": 0.09271725,
"_source": {
"name": "foo bar"
}
},
{
"_index": "59638303",
"_type": "_doc",
"_id": "4",
"_score": 0.09271725,
"_source": {
"name": "foo 1"
}
}
]
}
}
But when you hit a search query with pagination:
{
"from": 0,
"size": 1,--> note size 1
"query": {
"bool": {
"must": [
{
"match": {
"name": "foo"
}
}
]
}
}
}
it gives below response
{
"took": 23,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 4, --> this is still 4
"relation": "eq"
},
"max_score": 0.12199639,
"hits": [
{
"_index": "59638303",
"_type": "_doc",
"_id": "1",
"_score": 0.12199639,
"_source": {
"name": "foo"
}
}
]
}
}
Now in the above query, you can change the size and check only inner-hits array gets change but the outer hits object which contains total always remains same as 4, this confirms your understanding is correct.

Related

Finding numbers with decimal zeros in elasticsearch

I'm displaying numbers with decimal zeros like this: 25785 --> 25'785.00
I want to copy & paste this displayed number in the search field and find my actual number.
When I do it my query looks like this "query": "(25785.00 OR 25785.00*)", but the indexed number is 25785 and it doesn't get found.
Can I index this field differently so it'll also find the numbers with the decimal zeros?
Mapping:
"my-money" : {
"type" : "text",
"fields" : {
"raw" : {
"type" : "double"
}
}
},

You can use matchphrase query. Details can be found here
Mappings:
PUT /mstest
{
"mappings": {
"test": {
"properties": {
"money": {
"type": "text",
"fields": {
"raw": {
"type": "double"
}
}
}
}
}
}
}
Existing data:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 1,
"hits": [
{
"_index": "mstest",
"_type": "test",
"_id": "AXlhj0RUNamWTgl090_3",
"_score": 1,
"_source": {
"money": 257851111
}
},
{
"_index": "mstest",
"_type": "test",
"_id": "AXlhjR3f7ALnT2aUN_qN",
"_score": 1,
"_source": {
"money": 25785
}
}
]
}
}
Search query for number '25785':
GET mstest/test/_search
{
"query": {
"match_phrase": {
"money.raw": "25785.00"
}
}
}
Output:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "mstest",
"_type": "test",
"_id": "AXlhjR3f7ALnT2aUN_qN",
"_score": 1,
"_source": {
"money": 25785
}
}
]
}
}
See if this unblocks you.

mysql field="value" in elasticsearch

I want to display only the items that contain the word itself when "google" searches
How can I only search for items that have only the word "google"?
Request body
(Request created in postman)
{
"query": {
"bool": {
"must": [
{
"match": {
"body": "google"
}
}
]
}
}
}
Response body
(Request created in postman)
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 0.6587735,
"hits": [
{
"_index": "s_t",
"_type": "_doc",
"_id": "3",
"_score": 0.6587735,
"_source": {
"body": "google"
}
},
{
"_index": "s_t",
"_type": "_doc",
"_id": "4",
"_score": 0.5155619,
"_source": {
"body": "google map"
}
},
{
"_index": "s_t",
"_type": "_doc",
"_id": "5",
"_score": 0.5155619,
"_source": {
"body": "google-map"
}
}
]
}
}
I need this output
(Request created in postman)
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 0.69381464,
"hits": [
{
"_index": "s_t",
"_type": "_doc",
"_id": "3",
"_score": 0.69381464,
"_source": {
"body": "google"
}
}
]
}
}
In mysql with this query I reach my goal.
Similar query in mysql:
select * from s_t where body='google'

well i assume you automap or use a text in your mappings.
specify .keyword in your query. Note this is case sensitive.
{
"query": {
"bool": {
"must": [
{
"match": {
"body.keyword": "google"
}
}
]
}
}
}
If you only want to query your body field using exact match. You need to reindex it using keyword. Take a look at: Exact match in elastic search query

Incorrect output result of "aggs" query

I have a query that searches the number of entries in a given datetime window (i.e. between 2017-02-17T15:00:00.000 and 2017-02-17T16:00:00.000). When I execute this query, I get the incorrect result (it's better said that the result is unexpected):
POST /myindex/_search
{
"size": 0,
"aggs": {
"range": {
"date_range": {
"field": "Datetime",
"ranges": [
{ "to": "2017-02-17T16:00:00||-1H/H" },
{ "from": "2017-02-17T16:00:00||/H" }
]
}
}
}
}
This is the output:
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 11,
"max_score": 0,
"hits": []
},
"aggregations": {
"range": {
"buckets": [
{
"key": "*-2017-02-17T15:00:00.000Z",
"to": 1487343600000,
"to_as_string": "2017-02-17T15:00:00.000Z",
"doc_count": 0
},
{
"key": "2017-02-17T16:00:00.000Z-*",
"from": 1487347200000,
"from_as_string": "2017-02-17T16:00:00.000Z",
"doc_count": 0
}
]
}
}
}
In myindex I have two entries with the following values of Datetime:
2017-02-17T15:15:00.000Z
2017-02-17T15:02:00.000Z
So, the result should be equal to 2.
I don't understand how to interpret the current output. Which fields defines the number of entries?
UPDATE:
data structure:
PUT /myindex
{
"mappings": {
"intensity": {
"_all": {
"enabled": false
},
"properties": {
"Country_Id": {
"type":"keyword"
},
"Datetime": {
"type":"date"
}
}
}
}
}
sample data:
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 5,
"max_score": 1,
"hits": [
{
"_index": "myindex",
"_type": "intensity",
"_id": "4",
"_score": 1,
"_source": {
"Country_Id": "1",
"Datetime": "2017-02-18T15:01:00.000Z"
}
},
{
"_index": "myindex",
"_type": "intensity",
"_id": "6",
"_score": 1,
"_source": {
"Country_Id": "1",
"Datetime": "2017-03-16T16:15:00.000Z"
}
},
{
"_index": "myindex",
"_type": "intensity",
"_id": "1",
"_score": 1,
"_source": {
"Country_Id": "1",
"Datetime": "2017-02-17T15:15:00.000Z"
}
},
{
"_index": "myindex",
"_type": "intensity",
"_id": "7",
"_score": 1,
"_source": {
"Country_Id": "1",
"Datetime": "2017-03-16T16:18:00.000Z"
}
},
{
"_index": "myindex",
"_type": "intensity",
"_id": "3",
"_score": 1,
"_source": {
"Country_Id": "1",
"Datetime": "2017-02-17T15:02:00.000Z"
}
}
]
}
}
The answer that I get:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 11,
"max_score": 0,
"hits": []
},
"aggregations": {
"range": {
"buckets": [
{
"key": "2017-02-17T15:00:00.000Z-2017-02-17T16:00:00.000Z",
"from": 1487343600000,
"from_as_string": "2017-02-17T15:00:00.000Z",
"to": 1487347200000,
"to_as_string": "2017-02-17T16:00:00.000Z",
"doc_count": 0
}
]
}
}
}

Your ranges are wrong, do it like this instead
POST /myindex/_search
{
"size": 0,
"aggs": {
"range": {
"date_range": {
"field": "Datetime",
"ranges": [
{
"from": "2017-02-17T16:00:00Z||-1H/H",
"to": "2017-02-17T16:00:00Z||/H"
}
]
}
}
}
}

Elasticsearch: do exact searches where the query contains special characters like '#'

Get the results of only those documents which contain '#test' and ignore the documents that contain just 'test' in elasticsearch

People may gripe at you about this question, so I'll note that it was in response to my comment on this post.
You're probably going to want to read up on analysis in Elasticsearch, as well as match queries versus term queries.
Anyway, the convention here is to use a .raw sub-field on a string field. That way, if you want to do searches involving analysis, you can use the base field, but if you want to search for exact (un-analyzed) values, you can use the sub-field.
So here is a simple mapping that accomplishes this:
PUT /test_index
{
"mappings": {
"doc": {
"properties": {
"post_text": {
"type": "string",
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
}
Now if I add these two documents:
PUT /test_index/doc/1
{
"post_text": "#test"
}
PUT /test_index/doc/2
{
"post_text": "test"
}
A "match" query against the base field will return both:
POST /test_index/_search
{
"query": {
"match": {
"post_text": "#test"
}
}
}
...
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0.5945348,
"hits": [
{
"_index": "test_index",
"_type": "doc",
"_id": "1",
"_score": 0.5945348,
"_source": {
"post_text": "#test"
}
},
{
"_index": "test_index",
"_type": "doc",
"_id": "2",
"_score": 0.5945348,
"_source": {
"post_text": "test"
}
}
]
}
}
But the "term" query below will only return the one:
POST /test_index/_search
{
"query": {
"term": {
"post_text.raw": "#test"
}
}
}
...
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "test_index",
"_type": "doc",
"_id": "1",
"_score": 1,
"_source": {
"post_text": "#test"
}
}
]
}
}
Here is the code I used to test it:
http://sense.qbox.io/gist/2f0fbb38e2b7608019b5b21ebe05557982212ac7

how to index for specific fields of documents using elasticsearch

My requirement is to store specific fields of document to index in elasticsearch.
Example:
My document is
{
"name":"stev",
"age":26,
"salary":25000
}
This is my document but i don't want indexing total document.I want store only name field.
I created one index emp and write mapping like below
"person" : {
"_all" : {"enabled" : false},
"properties" : {
"name" : {
"type" : "string", "store" : "yes"
}
}
}
When see the index document
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 2,
"successful": 2,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 1,
"hits": [
{
"_index": "test",
"_type": "test",
"_id": "AU1_p0xAq8r9iH00jFB_",
"_score": 1,
"_source": { }
}
,
{
"_index": "test",
"_type": "test",
"_id": "AU1_lMDCq8r9iH00jFB-",
"_score": 1,
"_source": { }
}
]
}
}
name fields is not generated,Why?
any one help to me

It's hard to tell what you're doing wrong from what you posted, but I can give you an example that works.
Elasticsearch will, by default, index whatever source documents you give it. Every time it sees a new document field, it will create a mapping field with sensible defaults, and it will index them by default as well. If you want to exclude fields, you can set "index": "no" and "store": "no" in the mapping for each field you want to exclude. If you want that behavior to be the default for every field, you can use the "_default_" property for specifying that fields not be stored (though I couldn't get it to work for not indexing).
You probably also will want to disable "_source", and use the "fields" parameter in your search queries.
Here is an example. The index definition looks like this:
PUT /test_index
{
"mappings": {
"person": {
"_all": {
"enabled": false
},
"_source": {
"enabled": false
},
"properties": {
"name": {
"type": "string",
"index": "analyzed",
"store": "yes"
},
"age": {
"type": "integer",
"index": "no",
"store": "no"
},
"salary": {
"type": "integer",
"index": "no",
"store": "no"
}
}
}
}
}
Then I can add a few documents with the bulk api:
POST /test_index/person/_bulk
{"index":{"_id":1}}
{"name":"stev","age":26,"salary":25000}
{"index":{"_id":2}}
{"name":"bob","age":30,"salary":28000}
{"index":{"_id":3}}
{"name":"joe","age":27,"salary":35000}
Since I disabled "_source", a simple query will return only ids:
POST /test_index/_search
...
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1,
"hits": [
{
"_index": "test_index",
"_type": "person",
"_id": "1",
"_score": 1
},
{
"_index": "test_index",
"_type": "person",
"_id": "2",
"_score": 1
},
{
"_index": "test_index",
"_type": "person",
"_id": "3",
"_score": 1
}
]
}
}
But if I specify that I want the "name" field, I'll get it:
POST /test_index/_search
{
"fields": [
"name"
]
}
...
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1,
"hits": [
{
"_index": "test_index",
"_type": "person",
"_id": "1",
"_score": 1,
"fields": {
"name": [
"stev"
]
}
},
{
"_index": "test_index",
"_type": "person",
"_id": "2",
"_score": 1,
"fields": {
"name": [
"bob"
]
}
},
{
"_index": "test_index",
"_type": "person",
"_id": "3",
"_score": 1,
"fields": {
"name": [
"joe"
]
}
}
]
}
}
You can prove to yourself that the other fields were not stored by running:
POST /test_index/_search
{
"fields": [
"name", "age", "salary"
]
}
which will return the same result. I can also prove that the "age" field wasn't indexed by running this query, which would return a document if "age" had been indexed:
POST /test_index/_search
{
"fields": [
"name", "age"
],
"query": {
"term": {
"age": {
"value": 27
}
}
}
}
...
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
Here is a bunch of code I used for playing around with this. I wanted to use a _default mapping and/or field to handle this without having to specify the settings for each field. I was able to make it work in terms of not storing data, but each field was still indexed.
http://sense.qbox.io/gist/d84967923d6c0757dba5f44240f47257ba2fbe50

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

What does total value shows inside the _search query result in elasticsearch? - elasticsearch

Related

Finding numbers with decimal zeros in elasticsearch

mysql field="value" in elasticsearch

Incorrect output result of "aggs" query

Elasticsearch: do exact searches where the query contains special characters like '#'

how to index for specific fields of documents using elasticsearch

Categories

Resources