Elasticsearch result - elasticsearch

I am writing Elasticsearch queries for my app. I need to search across several indices and group the results by index (for example, show 3 items from each index), as shown below.
I tried nested, aggregation, and join queries, but none of them gave me what I need. I need the result to be returned like this:
{
index1: [
{item1},
{item2},
],
index2: [
{item3},
{item4},
{item5},
]
}
Does anybody know what I should do?

You can run a multi-index search and then aggregate and sort on the _index metadata field.
Your query should look like this:
GET index_1,index_2/_search
{
"query": {
"terms": {
"_index": ["index_1", "index_2"]
}
},
"aggs": {
"indices": {
"terms": {
"field": "_index",
"size": 10
}
}
},
"sort": [
{
"_index": {
"order": "asc"
}
}
],
"script_fields": {
"index_name": {
"script": {
"lang": "painless",
"source": "doc['_index']"
}
}
}
}
For more information you can check ES official documentation here.
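To get a fixed number of items per index, as in the desired output from the question, one option is a top_hits sub-aggregation under the terms aggregation on _index. The sketch below is in Python and works on plain dictionaries only. The aggregation names by_index and top_items, the function group_hits_by_index, and the sample response are assumptions made for illustration and are not part of the answer above.

```python
from typing import Any, Dict, List

# Request body: one bucket per index, each holding up to 3 top documents.
# "by_index" and "top_items" are hypothetical aggregation names.
REQUEST_BODY: Dict[str, Any] = {
    "size": 0,
    "aggs": {
        "by_index": {
            "terms": {"field": "_index", "size": 10},
            "aggs": {"top_items": {"top_hits": {"size": 3}}},
        }
    },
}


def group_hits_by_index(response: Dict[str, Any]) -> Dict[str, List[Dict[str, Any]]]:
    """Reshape a search response into {index_name: [source documents]}."""
    grouped: Dict[str, List[Dict[str, Any]]] = {}
    for bucket in response["aggregations"]["by_index"]["buckets"]:
        hits = bucket["top_items"]["hits"]["hits"]
        grouped[bucket["key"]] = [hit["_source"] for hit in hits]
    return grouped


# Hand-written sample shaped like an Elasticsearch response (not real output).
sample_response = {
    "aggregations": {
        "by_index": {
            "buckets": [
                {
                    "key": "index1",
                    "doc_count": 2,
                    "top_items": {"hits": {"hits": [
                        {"_index": "index1", "_source": {"name": "item1"}},
                        {"_index": "index1", "_source": {"name": "item2"}},
                    ]}},
                },
                {
                    "key": "index2",
                    "doc_count": 1,
                    "top_items": {"hits": {"hits": [
                        {"_index": "index2", "_source": {"name": "item3"}},
                    ]}},
                },
            ]
        }
    }
}

grouped = group_hits_by_index(sample_response)
```

REQUEST_BODY would be sent to the same multi-index endpoint (index_1,index_2/_search) with whatever client the app already uses.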

Related

Deduplicate and perform composite aggregation on deduced result

I have an index in Elasticsearch that contains daily transaction data. Each document has mainly the following fields:
TxnId, Status, TxnType, userId
Two documents can have the same TxnId.
I'm looking for a query that aggregates over Status and TxnType for unique TxnIds. Basically, I'm looking for something like: select unique txnIds from user_table group by status,txnType.
I have one ES query that deduplicates on TxnId and another that performs a composite aggregation on status and txnType. I want to do both in a single query.
I tried the collapse feature, and I also tried the cardinality and dedup features, but the query does not give the correct output:
{
"size": 0,
"query": {
"bool": {
"filter": [
{
"term": {
"streamSource": 3
}
}
]
}
},
"collapse": {
"field": "txnId"
},
"aggs": {
"buckets": {
"composite": {
"size": 30,
"sources": [
{
"status": {
"terms": {
"field": "status"
}
}
},
{
"txnType": {
"terms": {
"field": "txnType"
}
}
}
]
}
}
}
}
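A note on the query above: collapse only affects which hits are returned, and aggregations are still computed over all matching documents, which is likely why the buckets are not deduplicated. One possible alternative is a cardinality sub-aggregation on txnId inside each composite bucket, which returns an approximate count of distinct txnIds per (status, txnType) pair. The Python sketch below only builds and reads plain dictionaries. The names build_unique_txn_query, unique_counts, and unique_txns are hypothetical, and a txnId whose documents carry different statuses would be counted once in each bucket it appears in.

```python
from typing import Any, Dict, List, Optional, Tuple


def build_unique_txn_query(
    stream_source: int,
    page_size: int = 30,
    after_key: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Composite aggregation on (status, txnType) with a distinct txnId count."""
    composite: Dict[str, Any] = {
        "size": page_size,
        "sources": [
            {"status": {"terms": {"field": "status"}}},
            {"txnType": {"terms": {"field": "txnType"}}},
        ],
    }
    if after_key is not None:
        # Continue from the after_key returned with the previous page.
        composite["after"] = after_key
    return {
        "size": 0,
        "query": {"bool": {"filter": [{"term": {"streamSource": stream_source}}]}},
        "aggs": {
            "buckets": {
                "composite": composite,
                # "unique_txns" is a hypothetical name for the sub-aggregation.
                "aggs": {"unique_txns": {"cardinality": {"field": "txnId"}}},
            }
        },
    }


def unique_counts(response: Dict[str, Any]) -> List[Tuple[Any, Any, int]]:
    """Return (status, txnType, approximate distinct txnId count) rows."""
    rows = []
    for bucket in response["aggregations"]["buckets"]["buckets"]:
        key = bucket["key"]
        rows.append((key["status"], key["txnType"], bucket["unique_txns"]["value"]))
    return rows
```

The cardinality aggregation is approximate, so if exact counts matter this sketch may not be sufficient.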

ElasticSearch return aggregations random order

I've got the following ElasticSearch-query, to get 10 documents from each "category" grouped on "cat.id":
"aggs": {
"test": {
"terms": {
"size": 10,
"field": "cat.id"
},
"aggs": {
"top_test_hits": {
"top_hits": {
"_source": {
"includes": [
"id"
]
},
"size": 10
}
}
}
}
}
This is working fine. However, I cannot find a way to take 10 random results from each bucket; the results are always the same. I tried all kinds of approaches that are intended for documents, but none of them seem to work.
As was already suggested in this answer, you can try using random sort in the top_hits aggregation, using a _script like this:
{
"aggs": {
"test": {
"terms": {
"size": 10,
"field": "cat.id"
},
"aggs": {
"top_test_hits": {
"top_hits": {
"_source": {
"includes": [
"id"
]
},
"size": 10,
"sort": {
"_script": {
"type": "number",
"script": {
"lang": "painless",
"source": "(System.currentTimeMillis() + doc['_id'].value).hashCode()"
},
"order": "asc"
}
}
}
}
}
}
}
}
Random sorting was broadly covered in this question.
Hope that helps!
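Because the script above depends on the current time, the order should differ from request to request. If a repeatable shuffle is useful (for example, when paging), the seed could be passed as a script parameter instead. A minimal Python sketch, assuming the same cat.id and id fields as in the question: random_top_hits_aggs and the seed parameter are hypothetical names, and whether doc['_id'] is readable from scripts depends on the cluster version and settings.

```python
import time
from typing import Any, Dict


def random_top_hits_aggs(seed: str, per_bucket: int = 10) -> Dict[str, Any]:
    """Build the aggregation with a seeded pseudo-random sort per bucket."""
    sort_script = {
        "_script": {
            "type": "number",
            "script": {
                "lang": "painless",
                # The seed is a string, so "+" is string concatenation here.
                "source": "(params.seed + doc['_id'].value).hashCode()",
                "params": {"seed": seed},
            },
            "order": "asc",
        }
    }
    return {
        "aggs": {
            "test": {
                "terms": {"size": 10, "field": "cat.id"},
                "aggs": {
                    "top_test_hits": {
                        "top_hits": {
                            "_source": {"includes": ["id"]},
                            "size": per_bucket,
                            "sort": sort_script,
                        }
                    }
                },
            }
        }
    }


# A fresh seed per request is meant to give a different shuffle each time;
# reusing a seed is meant to reproduce the same order.
body = random_top_hits_aggs(seed=str(time.time_ns()))
```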

Aggregate over multiple fields without subaggregation

I have documents in my ElasticSearch which have two fields. I want to build an aggregate over the combination of these, kind of like in SQL GROUP BY field_A, field_B and get a row per existing combination. I read everywhere that I should use subaggregation for this.
{
"aggs": {
"sales_by_article": {
"terms": {
"field": "catalogs.article_grouping",
"size": 1000000,
"order": {
"total_amount": "desc"
}
},
"aggs": {
"total_amount": {
"sum": {
"script": "Math.round(doc['amount.value'].value*100)/100.0"
}
},
"sales_by_submodel": {
"terms": {
"field": "catalogs.submodel_grouping",
"size": 1000,
"order": {
"total_amount": "desc"
}
},
"aggs": {
"total_amount": {
"sum": {
"script": "Math.round(doc['amount.value'].value*100)/100.0"
}
}
}
}
}
}
},
"size": 0
}
With the following simplified result:
{
"aggregations": {
"sales_by_article": {
"buckets": [
{
"key": "19114",
"total_amount": {
"value": 426794.25
},
"sales_by_submodel": {
"buckets": [
{
"key": "12",
"total_amount": {
"value": 51512.200000000004
}
},
...
]
}
},
...
]
}
}
}
However, the problem with this is that the ordering is not what I want. In this particular case, it first orders the articles based on total_amount per article, and then within an article it orders the submodels based on total_amount per submodel. However, what I want to achieve is to only have the deepest level and get an aggregation for the combination of article and submodel, ordered by the total_amount of this combination. This is the result I would like:
{
"aggregations": {
"sales_by_article_and_submodel": {
"buckets": [
{
"key": "1911412",
"total_amount": {
"value": 51512.200000000004
}
},
...
]
}
}
}
It's discussed in the docs a bit here: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_multi_field_terms_aggregation
Basically you can use a script to create a term which is derived from each document (using as many fields as you want) at query run time, but it will be slow. If you are doing it for ad hoc analysis, it'll work fine. If you need to serve these requests at some high rate, then you probably want to make a field in your model that is a combination of the two fields you're interested in, so the index is populated for you already.
Example query using the script approach:
GET agreements/agreement/_search?size=0
{
"aggs" : {
"myAggregationName" : {
"terms" : {
"script" : {
"source": "doc['owningVendorCode'].value + '|' + doc['region'].value",
"lang": "painless"
}
}
}
}
}
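Applied to the fields from the question, the script approach might look like the following Python sketch, which only builds plain dictionaries. The function names and the '|' separator are illustrative, and the separator is assumed not to occur in either field value. The sum here reads the amount field directly and drops the rounding script from the question.

```python
from typing import Any, Dict, Tuple


def build_combined_terms_query(
    field_a: str, field_b: str, amount_field: str, size: int = 1000
) -> Dict[str, Any]:
    """One bucket per (field_a, field_b) combination, ordered by summed amount."""
    source = f"doc['{field_a}'].value + '|' + doc['{field_b}'].value"
    return {
        "size": 0,
        "aggs": {
            "sales_by_article_and_submodel": {
                "terms": {
                    "script": {"source": source, "lang": "painless"},
                    "size": size,
                    # Order the combined buckets by the sub-aggregation below.
                    "order": {"total_amount": "desc"},
                },
                "aggs": {"total_amount": {"sum": {"field": amount_field}}},
            }
        },
    }


def split_combined_key(key: str) -> Tuple[str, str]:
    """Recover the two original values from a combined bucket key."""
    article, submodel = key.split("|", 1)
    return article, submodel


query = build_combined_terms_query(
    "catalogs.article_grouping", "catalogs.submodel_grouping", "amount.value"
)
```

Keeping a separator in the key (rather than plain concatenation such as "1911412") makes it possible to split the bucket key back into its two parts on the client.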
I have learned I should use composite aggregates for this.

How to sort query result with hit count

Hi, I've indexed some info into Elasticsearch like
{"info":"002345 Groot 7AP"}
and set up a query template
GET _search?size=5
{"query": {
"match_phrase_prefix": {
"info": "%s"
}
}
}
so I can search info by any term.
The default order is "_score": "desc".
Now I want the query results sorted by hit count, so that frequently used infos show up first.
I read some of the aggregation API docs on elastic.co, but I don't know how to write the query body.
Thanks.
Try this and see if it works:
{
"aggs": {
"top_tags": {
"terms": {
"field": "type",
"size": 3
},
"aggs": {
"top_sales_hits": {
"top_hits": {
"sort": [
{
"_score": {
"order": "desc"
}
}
],
"size": 1
}
}
}
}
}
}

how to get the top 1 document of each type, from a search on index(having multiple types)?

We have an index named "machines" with the types "auto, bike, car, flight" in Elasticsearch.
I want to get similar brands from my search on the index, from every type.
How do I query to get the top 1 document of each type, from a search on an index (having multiple types) via the Elasticsearch REST API?
Try this, using the top_hits aggregation:
GET /machines/_search?search_type=count
{
"query": {
"match_all": {} //your query here
},
"aggs": {
"top-types": {
"terms": {
"field": "_type"
},
"aggs": {
"top_docs": {
"top_hits": {
"sort": [
{
"_score": {
"order": "desc"
}
}
],
"size": 1
}
}
}
}
}
}
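As far as I know, search_type=count is no longer available on newer Elasticsearch versions, where "size": 0 in the request body serves the same purpose, and mapping types have been removed, so the grouping field may need to be something other than _type. A minimal Python sketch of the same request body with those two points made configurable: top_doc_per_group_body and its parameters are hypothetical names.

```python
from typing import Any, Dict


def top_doc_per_group_body(
    query: Dict[str, Any],
    group_field: str = "_type",
    docs_per_group: int = 1,
) -> Dict[str, Any]:
    """Build a request returning the best-scoring document(s) per group."""
    return {
        "size": 0,  # no top-level hits, only the aggregation results
        "query": query,
        "aggs": {
            "top-types": {
                "terms": {"field": group_field},
                "aggs": {
                    "top_docs": {
                        "top_hits": {
                            "sort": [{"_score": {"order": "desc"}}],
                            "size": docs_per_group,
                        }
                    }
                },
            }
        },
    }


request_body = top_doc_per_group_body({"match_all": {}})
```

On a cluster without mapping types, group_field would be an ordinary keyword field that stores the machine type; which field that is depends on the mapping.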
