Elasticsearch sort by inner field - elasticsearch

I have documents in which one of the fields looks like the following:
"ingredients": [{
"unit": "MG",
"value": 123,
"key": "abc"
}]
I would like to sort the records by the ascending value of a specific ingredient. That is, if I have 2 records that both use the ingredient with key "abc", one with value 1 and one with value 2, the one with ingredient value 1 should appear first.
Each of those records may have more than one ingredient.
Thank you in advance!

The search query to sort will be:
{
"sort":{
"ingredients.value":{
"order":"asc"}
}}
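Note that when a record has several ingredients, sorting on ingredients.value alone orders by the lowest value among all of them, not by one particular ingredient. A minimal sketch of restricting the sort to a single key, assuming that ingredients is mapped as a nested type, that ingredients.key is a keyword field, and that the Elasticsearch version in use accepts the nested option inside sort. The helper name build_ingredient_sort is hypothetical:

```python
import json


def build_ingredient_sort(key, order="asc"):
    """Build a search body that sorts by the value of one ingredient key.

    Assumes `ingredients` is a nested field and `ingredients.key` is a keyword.
    """
    return {
        "sort": [
            {
                "ingredients.value": {
                    "order": order,
                    "nested": {
                        "path": "ingredients",
                        # Only nested objects with this key contribute a sort value.
                        "filter": {"term": {"ingredients.key": key}},
                    },
                }
            }
        ]
    }


if __name__ == "__main__":
    print(json.dumps(build_ingredient_sort("abc"), indent=2))
```

The printed JSON can be sent as the body of a _search request.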

Related

Generic way to get prev/next search results by id in Elasticsearch

Say I have a million (many) documents in my index. I execute a search query sorting the items by some key X.
Now I have a very long list of results: [..., id1, id2, id3, ...]
Question: how do I get id1 and id3 if I know id2 but don't want to execute the whole search/don't want to get all ids?
I'm looking for a generic solution that works for any search query: given an id that is certain to exist in the results of a query, how do I get prev/next by that id? The query should NOT have prior knowledge of anything other than the id whose prev/next are searched for. (In other words, if ordered by title and searched for prev/next of id X, the title of X is not known at query time, only X's id.)
It is of course possible to execute multiple search queries and achieve the same end result by getting id2 and then playing with ordering to get ids 1 and 3.
EDIT:
I think Luc E's answer isn't what I'm looking for. In that scenario, knowledge of the original object's title is required to query for prev/next. I'm looking for a solution where only the id is known at query time.
Example data looks like this:
[...
{id: 32, title: 'AAA'},
{id: 12, title: 'BBB'},
{id: 99, title: 'CCC'},
{id: 3, title: 'DDD'},
{id: 1001, title: 'EEE'},
...]
What I know: id 99. What I don't know: what is title of id 99.
What I want: ids of the prev/next items sorted by title field (=3 and 12).
To put it yet another way: I have id 99 but not the title in my hand. I want a query that gives me ids 3 and 12 (they are prev/next sorted by title).
What you want to do is called deep scrolling, and there are only two ways to do it:
scroll
search_after
The easiest way is search_after, but you will need to make two requests:
one request for id3
another one for id1
So, in this example I am looking for the neighbours of id2, which is 128. I sort the documents by the title field, and I have fetched the title of id2 beforehand; it is title_of_128.
To perform the search_after, I have to add _id as a secondary sort condition.
Here is my query:
POST test/_search
{
"size": 2,
"search_after": ["title_of_128","128"],
"sort": [
{
"title": {
"order": "asc"
}
},
{
"_id": {
"order": "asc"
}
}
]
}
Because search_after returns only documents that sort after the given values, id2 itself is not returned: the first hit of this query is id3.
Now I reverse the direction of the sort in order to retrieve id1:
POST test/_search
{
"size": 2,
"search_after": ["title_of_128","128"],
"sort": [
{
"title": {
"order": "desc"
}
},
{
"_id": {
"order": "desc"
}
}
]
}
The first hit of this query is id1 (search_after does not return id2 itself).
Note that sorting on _id is deprecated; the best practice is to copy the _id into another field if you want to use it with search_after.
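The two requests above differ only in sort direction, so they can be generated from one helper. This is a rough sketch, not a definitive implementation: the function name neighbor_queries and the tiebreaker parameter are hypothetical, and it assumes search_after returns only hits that sort after the supplied values, so a size of 1 is enough for each direction.

```python
def neighbor_queries(title, doc_id, tiebreaker="_id"):
    """Return the 'next' and 'prev' search bodies for a known (title, id) pair.

    `tiebreaker` is the field used as the secondary sort; pass the name of a
    field holding a copy of the id if sorting on _id is not desired.
    """
    def body(order):
        return {
            "size": 1,
            "search_after": [title, doc_id],
            # One object per sort key keeps the key order explicit.
            "sort": [
                {"title": {"order": order}},
                {tiebreaker: {"order": order}},
            ],
        }

    return {"next": body("asc"), "prev": body("desc")}


if __name__ == "__main__":
    queries = neighbor_queries("title_of_128", "128")
    print(queries["next"])
    print(queries["prev"])
```

The title still has to be looked up first (for example with a get-by-id request), which is why this approach needs more than one round trip.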

Elasticsearch: how to know which field the results are sorted by?

In Elasticsearch, is there any way to check which field the results are sorted by? I want something like inner-hits for sort clause.
Imagine that your documents have this kind of form:
{"numerals" : [ // nested
{"key": "point", "value": 30},
{"key": "points", "value": 200},
{"key": "score", "value": 20},
{"key": "scores", "value": 40}
]
}
and you sort the results by:
{"numerals.value": {
"nested_path": "numerals",
"nested_filter": {
"match": {
"numerals.key": "score"}}}}
Now I have no idea how to find out which field the results are actually sorted by: it's probably scores for this document, but perhaps score for the others? There are 2 problems: 1. You cannot use inner-hits or highlight for the nested fields. 2. Even if you could, it wouldn't solve the issue when there are multiple matching candidates.
The question is about sorting by fields that are inside nested objects.
So this is what the documentation at
https://www.elastic.co/guide/en/elasticsearch/guide/current/nested-sorting.html
and
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-sort.html#_nested_sorting_example
says:
Elasticsearch first restricts the nested documents with the "nested_filter" query and then sorts in the same way as for multi-valued fields:
exactly as if only the filtered nested documents existed as inner objects, that is, as if the root document had a single multi-valued field containing all the values that belong to the filtered nested objects
(in your example only one value will remain: 20).
If you want to be sure about the sort order, add a "mode" parameter:
"min", "max", "sum", "avg" or "median"
If you do not specify the "mode" parameter, then according to the corresponding issue the minimum value is picked for "asc" order and the maximum value for "desc" order:
By default when sorting on a multi-valued field the lowest or highest
value will be picked from the field values depending on the sort
order.
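The selection rule described above can be illustrated with a small pure-Python simulation. This is not Elasticsearch code; the function name pick_sort_value and the sample documents are made up for illustration, and the default (min for "asc", max for "desc") follows the quoted passage.

```python
from statistics import median


def pick_sort_value(values, order="asc", mode=None):
    """Reduce the values of a multi-valued field to the one value used for sorting."""
    if mode is None:
        # Default described above: lowest value for asc, highest for desc.
        mode = "min" if order == "asc" else "max"
    pickers = {
        "min": min,
        "max": max,
        "sum": sum,
        "avg": lambda v: sum(v) / len(v),
        "median": median,
    }
    return pickers[mode](values)


# Hypothetical documents: each maps to the values left after the nested filter.
docs = {"doc_a": [20, 40], "doc_b": [30]}
ranked = sorted(docs, key=lambda d: pick_sort_value(docs[d], "asc"))

if __name__ == "__main__":
    print(ranked)
```

With ascending order, doc_a is ranked by 20 and doc_b by 30; passing an explicit mode such as "max" would rank doc_a by 40 instead.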

Rethinkdb - Get first item of an array

My data structure
{
"group": "fruits",
"items": ["apple", "orange", "banana"]
}
I need to pull the first item from the "items" array without knowing its value. Is it possible?
I think I figured out the answer. I can use "nth(0)" to get item at index 0.

Grouping non null fields together in Kibana

Given the following three User entries in an Elasticsearch index:
"user": [
{
"userId": "100",
"hobby": "chess"
}
]
"user": [
{
"userId": "200",
"hobby": "music"
}
]
"user": [
{
"userId": "300",
"hobby": ""
}
]
I want to create a vertical bar chart to compare the number of users who have a hobby as opposed to those who do not. Individual hobbies should not be shown separately, but grouped together.
If split along the Y axis, one block would take up two thirds of the height (the two users with hobbies) and one block one third of the height (the one user with no hobbies).
How could one achieve this grouping in Kibana?
Thanks
You'll need to choose Split Bars and then the Filters aggregation. Once you have that selected, you should see Query 1 with * in it. Change the * to hobby:*. Next, hit Add Filter and put in NOT hobby:*
The filters aggregation lets you bucket things pretty much any way you can search for things.
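Outside Kibana, roughly the same bucketing can be expressed as a filters aggregation in a search request. The sketch below only builds the request body; the aggregation name hobby_split and the bucket names has_hobby and no_hobby are hypothetical, and the field is written as hobby to match the queries above (adjust it if the field is actually stored under user.hobby).

```python
import json


def hobby_split_request():
    """Build a search body with one bucket per query string, as in the Kibana filters."""
    return {
        "size": 0,  # only the bucket counts are needed, not the hits
        "aggs": {
            "hobby_split": {
                "filters": {
                    "filters": {
                        "has_hobby": {"query_string": {"query": "hobby:*"}},
                        "no_hobby": {"query_string": {"query": "NOT hobby:*"}},
                    }
                }
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(hobby_split_request(), indent=2))
```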

Require a number of matches against text in ElasticSearch

I'm trying to create a filter against Elasticsearch that requires more than one match before the result is returned. For example, in the following text:
If you're uneasy at the idea of riding in a vehicle that drives itself, just wait till you see Google's new car. It has no gas pedal, no brake and no steering wheel. Google has been demonstrating its driverless technology for several years by retrofitting Toyotas, Lexuses and other cars with cameras and sensors. But now, for the first time, the company has unveiled a prototype of its own: a cute little car that looks like a cross between a VW Beetle and a golf cart.
If I set the minimum number of matches to 2 and searched for Google, I would expect this result because Google appears in the text twice. However, searching on Toyota with the same number of expected matches should not result in this article.
How do I construct this filter?
Probably not exactly what you are looking for, but you could add explain to your query and then filter on the client side by number of term matches. From the docs, query would look like this:
GET /_search?explain
{
"query" : { "match" : { "tweet" : "honeymoon" }}
}
Results would look like this:
"_explanation": {
"description": "weight(tweet:honeymoon in 0) [PerFieldSimilarity], result of:",
"value": 0.076713204,
"details": [
{
"description": "fieldWeight in 0, product of:",
"value": 0.076713204,
"details": [
{
"description": "tf(freq=1.0), with freq of:",
"value": 1,
"details": [
{
"description": "termFreq=1.0",
"value": 1
}
]
},
{
"description": "idf(docFreq=1, maxDocs=1)",
"value": 0.30685282
},
{
"description": "fieldNorm(doc=0)",
"value": 0.25
}
]
}
]
}
You could then filter on the description field for term frequency and look for a value > 1.
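A minimal sketch of that client-side check, assuming the explanation has the shape shown above (a "termFreq=..." description with the frequency in "value"). The explanation format differs between Elasticsearch versions, so the prefix match is an assumption, and the function name max_term_freq is hypothetical.

```python
def max_term_freq(explanation):
    """Walk an _explanation tree and return the largest termFreq value found."""
    best = 0.0
    if explanation.get("description", "").startswith("termFreq="):
        best = float(explanation.get("value", 0))
    for child in explanation.get("details", []):
        best = max(best, max_term_freq(child))
    return best


# Trimmed copy of the explanation shown above.
sample = {
    "description": "weight(tweet:honeymoon in 0) [PerFieldSimilarity], result of:",
    "value": 0.076713204,
    "details": [
        {
            "description": "fieldWeight in 0, product of:",
            "value": 0.076713204,
            "details": [
                {
                    "description": "tf(freq=1.0), with freq of:",
                    "value": 1,
                    "details": [{"description": "termFreq=1.0", "value": 1}],
                }
            ],
        }
    ],
}

if __name__ == "__main__":
    # The term occurs once here, so this hit would be dropped for a minimum of 2.
    print(max_term_freq(sample) >= 2)
```

Each hit's _explanation would be passed through this function, keeping only hits whose frequency reaches the required minimum.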
I believe you may be able to do this directly (no client side filtering) by using scripting, as you can get reference to term frequency:
Term statistics:
Term statistics for a field can be accessed with a subscript operator like this: _index['FIELD']['TERM']. This will never return null, even if the term or field does not exist. If you do not need the term frequency, call _index['FIELD'].get('TERM', 0) to avoid unnecessary initialization of the frequencies. The flag will only have an effect if you set the index_options to docs (see mapping documentation).
_index['FIELD']['TERM'].df()
df of term TERM in field FIELD. Will be returned, even if the term is not present in the current document.
_index['FIELD']['TERM'].ttf()
The sum of term frequencies of term TERM in field FIELD over all documents. Will be returned, even if the term is not present in the current document.
_index['FIELD']['TERM'].tf()
tf of term TERM in field FIELD. Will be 0 if the term is not present in the current document.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-scripting.html
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-advanced-scripting.html
However, I've not done this, and there are the usual concerns about both security and performance when using server-side scripting.
