How to get 3 random search results in an Elasticsearch query - elasticsearch

I have an Elasticsearch query that returns records whose publishedDate falls within a given range:
{
query : {
bool: {
filter: [
],
must: {
range: {
publishedDate: {
gte: "2018-11-01",
lte: "2019-03-30"
}
}
}
}
},
from: 0,
size: 3
}
I need to show 3 random results every time I send this query.
The Elasticsearch documentation mentions that I can send a seed to get random results.
After following the documentation, I updated my query to:
{
"query" : {
"bool": {
"filter": [
],
"must": {
"range": {
"publishedDate": {
"gte": "2018-11-01",
"lte": "2019-03-30"
}
}
}
},
"function_score": {
"functions": [
{
"random_score": {
"seed": "123123123"
}
}
]
}
},
"from": 0,
"size": 3
}
But it is not working (Elasticsearch says the query is malformed). Can anyone suggest how to correct this query so that it returns 3 random search results?

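If you just need random results returned, restructure the query so that function_score is the only clause under "query" and the range query sits inside it. The "query" object accepts a single top-level clause, so placing bool and function_score side by side is what makes the request malformed: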
{
"query": {
"function_score": {
"query": {
"range": {
"publishedDate": {
"gte": "2018-11-01",
"lte": "2019-03-30"
}
}
},
"boost": "5",
"random_score": {},
"boost_mode": "multiply"
}
},
"from": 0,
"size": 3
}
Modified from the Elasticsearch documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html
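For readers who build the request from application code, the same restructuring can be sketched in Python. This is a minimal sketch, not the documented approach: the helper name `random_sample_query` is hypothetical, `boost_mode` is set to `replace` instead of the `multiply` used above, and the optional `seed`/`field` pair assumes a recent Elasticsearch version in which a seeded `random_score` is expected to name a field such as `_seq_no`.

```python
import json


def random_sample_query(gte, lte, size=3, seed=None):
    """Build a search body that returns `size` randomly ordered hits.

    Hypothetical helper: the range clause is wrapped in function_score,
    because the top-level "query" object accepts only one query clause.
    """
    random_score = {}
    if seed is not None:
        # Assumption: recent versions expect a field alongside the seed.
        random_score = {"seed": seed, "field": "_seq_no"}
    return {
        "query": {
            "function_score": {
                "query": {
                    "range": {"publishedDate": {"gte": gte, "lte": lte}}
                },
                "random_score": random_score,
                # Use the random value as the final score.
                "boost_mode": "replace",
            }
        },
        "from": 0,
        "size": size,
    }


body = random_sample_query("2018-11-01", "2019-03-30")
print(json.dumps(body, indent=2))
```

Leaving `seed` unset should give a different ordering on each request, which is what the question asks for; passing a seed is only useful when the same "random" page has to be reproduced.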

Related

Query to get random n items from top 100 items in Elastic Search

I need to write an Elasticsearch query that returns 12 random items from the top 100 sorted items.
I tried something like this, but I only get the top 12 items rather than 12 random ones.
The query I used:
GET product/_search
{
"sort": [
{
"DateAdded": {
"order": "desc"
}
}
],
"query": {
"function_score": {
"query": {
"bool": {
"must": [
{
"term": {
"definitionName": {
"value": "ABC"
}
}
},
{
"range": {
"price": {
"gt": 0
}
}
}
]
}
},
"functions": [
{
"random_score": {
"seed": 314159265359
}
}
]
}
},
"size": 12
}
Can anybody point out where I am going wrong? (I am a beginner at writing Elasticsearch queries.)
Thanks in advance.
EDIT: this does not work; window_size only recalculates the score on the top X results.
Also:
you need to set "track_scores" to true at the top level.
The correct syntax is:
"rescore": {
"window_size": 10,
"query": {
"score_mode": "max", // or whatever mode you need
"rescore_query": {
"bool": {
"should": [
{
//your query here - you can use a function or a script score too
}
]
}
},
"query_weight": 0.7,
"rescore_query_weight": 1.2
}
}
OK, I understand better now.
You do indeed have to sort by date (top 100) and rescore with a random function (read https://www.elastic.co/guide/en/elasticsearch/reference/7.x/search-request-body.html#request-body-search-post-filter).
It should be something like:
{
"sort": [
{
"DateAdded": {
"order": "desc"
}
}
],
"query": {
"bool": {
"must": [
{
"term": {
"definitionName": {
"value": "ABC"
}
}
},
{
"range": {
"price": {
"gt": 0
}
}
}
]
}
},
"size": 100,
"rescore": {
"window_size": 12,
"query": {
"rescore_query": {
"random_score": {
"seed": 314159265359
}
}
}
}
}
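Because the rescore approach above is reported not to work, one alternative is to do the random pick outside Elasticsearch: fetch the top 100 hits sorted by date, then sample 12 of them in the client. The sketch below shows this swapped-in technique and is not part of the original answer; `top_n_body` and `pick_random_hits` are hypothetical helpers, and the response is assumed to have the usual `hits.hits` layout.

```python
import random


def top_n_body(n=100):
    """Search body for the n newest matching products (no random scoring)."""
    return {
        "sort": [{"DateAdded": {"order": "desc"}}],
        "query": {
            "bool": {
                "must": [
                    {"term": {"definitionName": {"value": "ABC"}}},
                    {"range": {"price": {"gt": 0}}},
                ]
            }
        },
        "size": n,
    }


def pick_random_hits(response, k=12, seed=None):
    """Return k hits chosen at random from a search response."""
    hits = response["hits"]["hits"]
    rng = random.Random(seed)  # pass a seed for a repeatable selection
    # Never ask for more items than were returned.
    return rng.sample(hits, min(k, len(hits)))


# Stand-in for a real response holding 100 hits.
fake_response = {"hits": {"hits": [{"_id": str(i)} for i in range(100)]}}
chosen = pick_random_hits(fake_response, k=12, seed=1)
print(len(chosen))
```

Transferring 100 hits to pick 12 costs some bandwidth, but it keeps the date sort and the random choice as two separate, easily tested steps.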

Aggregation not taking place on the basis of the size parameter passed in ES query

My ES query looks like this. I am trying to get the average rating for the records at positions 0 to 9, but ES is taking the average of all the records.
GET review/analytics/_search
{
"_source": "r_id",
"from": 0,
"size": 9,
"query": {
"bool": {
"filter": [
{
"terms": {
"b_id": [
236611
]
}
},
{
"range": {
"r_date": {
"gte": "1970-01-01 05:30:00",
"lte": "2019-08-13 17:13:17",
"format": "yyyy-MM-dd HH:mm:ss"
}
}
},
{
"terms": {
"s_type": [
"aggregation",
"organic",
"survey"
]
}
},
{
"bool": {
"must_not": [
{
"terms": {
"s_id": [
392
]
}
}
]
}
},
{
"term": {
"status": 2
}
},
{
"bool": {
"must_not": [
{
"terms": {
"ba_id": []
}
}
]
}
}
]
}
},
"sort": [
{
"featured": {
"order": "desc"
}
},
{
"r_date": {
"order": "desc"
}
}
],
"aggs": {
"avg_rating": {
"filter": {
"bool": {
"must_not": [
{
"term": {
"rtng": 0
}
}
]
}
},
"aggs": {
"rtng": {
"avg": {
"field": "rtng"
}
}
}
},
"avg_rating1": {
"filter": {
"bool": {
"must_not": [
{
"term": {
"rtng": 0
}
}
]
}
},
"aggs": {
"rtng": {
"avg": {
"field": "rtng"
}
}
}
}
}
}
The query result shows a doc_count of 43, whereas I want it to be 9 so that I can calculate the average correctly. I have specified the size above. The hits seem to be returned correctly, but the aggregation result is not what I expect.
from and size have no impact on the aggregations. They only define how many documents will be returned in the hits.hits array.
Aggregations always run on the whole document set selected by whatever query is in your query section.
If you know the IDs of the "first" nine documents, you can add a terms query in your query so that only those 9 documents are selected and so that the average rating is only computed on those 9 documents.
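The two-step idea can be sketched as follows: take the IDs from the first page of hits, then send a second request whose query is a terms query on those IDs. This is only an illustration under assumptions: `restrict_to_page` is a hypothetical helper, and it supposes that `r_id` is an indexed field that uniquely identifies each review.

```python
def restrict_to_page(first_page, aggs, id_field="r_id"):
    """Build a follow-up body whose aggregations only see one page of hits."""
    ids = [hit["_source"][id_field] for hit in first_page["hits"]["hits"]]
    return {
        "size": 0,  # only the aggregation result is needed
        "query": {"terms": {id_field: ids}},
        "aggs": aggs,
    }


# Stand-in for the first response, which returned `_source: r_id`.
first_page = {"hits": {"hits": [
    {"_source": {"r_id": 11}},
    {"_source": {"r_id": 12}},
    {"_source": {"r_id": 13}},
]}}
avg_aggs = {"rtng": {"avg": {"field": "rtng"}}}
body = restrict_to_page(first_page, avg_aggs)
print(body["query"])
```

The `aggs` argument can be the full `avg_rating` block from the question, so the zero-rating filter still applies to the selected documents.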

Elastic search find difference in a field using range query

I have to find out how many KWH were consumed between two given times. For now, I run 2 queries to find the last and the first record within the time range, using asc and desc sorting, and subtract one from the other to get the KWH value for that period. Is there any other way to get the KWH value without 2 queries?
Range query:
{
"query": {
"bool": {
"must": [
{
"range": {
"createdtime": {
"gte": "1566757800000",
"lte": "1566844199000",
"boost": 2.0
}
}
},
{
"match": {
"meter_id": 101
}
}
]
}
},
"size" : 1,
"from": 0,
"sort": { "createdtime" : {"order" : "desc"} }
}
The other query is almost the same, except that the order is asc.
Each of the two queries returns one record, and I subtract one value from the other in the result set to find the difference.
You could run one query only and use top_hits aggregation to extract the "first" and "last" value, but it won't calculate the difference. You'd still have to do it outside Elasticsearch.
{
"size": 0,
"query": {
"bool": {
"must": [
{
"range": {
"createdtime": {
"gte": "1566757800000",
"lte": "1566844199000",
"boost": 2.0
}
}
},
{
"match": {
"meter_id": 101
}
}
]
}
},
"aggs": {
"range": {
"filter": {
"range": {
"createdtime": {
"gte": "1566757800000",
"lte": "1566844199000"
}
}
},
"aggs": {
"min": {
"top_hits": {
"sort": [{"createdtime": {"order": "asc"}}],
"_source": {"includes": [ "kwh_value" ]},
"size" : 1
}
},
"max": {
"top_hits": {
"sort": [{"createdtime": {"order": "desc"}}],
"_source": {"includes": [ "kwh_value" ]},
"size" : 1
}
}
}
}
}
}
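Since the subtraction still has to happen outside Elasticsearch, a small client-side helper can read both top_hits results from the single response. The sketch below assumes the aggregation names used in the query above (`range`, `min`, `max`) and the `kwh_value` field; `kwh_consumed` is a hypothetical helper name.

```python
def kwh_consumed(response, field="kwh_value"):
    """Difference between the last and the first reading in the range."""
    bucket = response["aggregations"]["range"]
    first = bucket["min"]["hits"]["hits"]
    last = bucket["max"]["hits"]["hits"]
    if not first or not last:
        return None  # no readings in the requested range
    return last[0]["_source"][field] - first[0]["_source"][field]


# Stand-in for the part of a response that the helper reads.
sample = {"aggregations": {"range": {
    "doc_count": 2,
    "min": {"hits": {"hits": [{"_source": {"kwh_value": 1200.5}}]}},
    "max": {"hits": {"hits": [{"_source": {"kwh_value": 1350.0}}]}},
}}}
print(kwh_consumed(sample))  # 149.5
```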

Need aggregation on document inner array object - ElasticSearch

I am trying to run an aggregation over the following document:
{
"pid": 900000,
"mid": 9000,
"cid": 90,
"bid": 1000,
"gmv": 1000000,
"vol": 200,
"data": [
{
"date": "25-11-2018",
"gmv": 100000,
"vol": 20
},
{
"date": "24-11-2018",
"gmv": 100000,
"vol": 20
},
{
"date": "23-11-2018",
"gmv": 100000,
"vol": 20
}
]
}
The analysis that needs to be done here is:
Filter on mid and/or cid across all documents.
Filter data.date to the last 7 days and sum data.vol over that range for each pid.
Sort the documents by the sum obtained in the previous step, in descending order.
Group these results by pid.
This means we are trying to get the top products by total volume (quantity sold) within a date range for a specific cid/mid.
Here PID refers to the product ID, MID to the merchant ID, and CID to the category ID.
First, you need to change your mapping so that the query can run on nested fields:
change the type of the 'data' field to 'nested'.
Then you can use a range query in the filter, along with a terms filter on mid/cid, to filter the data. Once you have the correct data set, you can aggregate on pid with a sub-aggregation that sums vol.
Here is the query:
{
"query": {
"bool": {
"filter": [
{
"bool": {
"must": [
{
"range": {
"data.date": {
"gte": "25-11-2018",
"lte": "28-11-2018"
}
}
},
{
"terms": {
"mid": [
"9000"
]
}
}
]
}
}
]
}
},
"aggs": {
"AGG_PID": {
"terms": {
"field": "pid",
"size": 0,
"order": {
"TOTAL_SUM": "desc"
},
"min_doc_count": 1
},
"aggs": {
"TOTAL_SUM": {
"sum": {
"field": "data.vol"
}
}
}
}
}
}
You can modify the query accordingly. Hope this helps.
Below is a nested aggregation query that sorts the "pid" buckets by their summed "vol". You can add any number of filters in the query part.
{
"size": 0,
"query": {
"bool": {
"must": [
{
"term": {
"mid": "2"
}
}
]
}
},
"aggs": {
"top_products_sorted_by_order_volume": {
"terms": {
"field": "pid",
"order": {
"nested_data_object>order_volume_by_range>order_volume_sum": "desc"
}
},
"aggs": {
"nested_data_object": {
"nested": {
"path": "data"
},
"aggs": {
"order_volume_by_range": {
"filter": {
"range": {
"data.date": {
"gte": "2018-11-26",
"lte": "2018-11-27"
}
}
},
"aggs": {
"order_volume_sum": {
"sum": {
"field": "data.vol"
}
}
}
}
}
}
}
}
}
}
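To check what the aggregation should return on a small sample, the steps from the question (filter by mid, sum data.vol inside the date range per pid, sort descending) can be reproduced in plain Python. This is a reference sketch only and does not talk to Elasticsearch; `top_products` is a hypothetical helper, and dates are assumed to use the dd-MM-yyyy format of the sample document.

```python
from collections import defaultdict
from datetime import date


def top_products(docs, start, end, mid=None):
    """Sum data.vol per pid for entries dated within [start, end]."""
    totals = defaultdict(int)
    for doc in docs:
        if mid is not None and doc["mid"] != mid:
            continue  # same role as the term filter on mid
        for entry in doc["data"]:
            day, month, year = map(int, entry["date"].split("-"))
            if start <= date(year, month, day) <= end:
                totals[doc["pid"]] += entry["vol"]
    # Highest total volume first, like ordering the terms buckets by the sum.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


docs = [
    {"pid": 900000, "mid": 9000, "data": [
        {"date": "25-11-2018", "vol": 20},
        {"date": "24-11-2018", "vol": 20},
        {"date": "23-11-2018", "vol": 20},
    ]},
    {"pid": 900001, "mid": 9000, "data": [
        {"date": "25-11-2018", "vol": 50},
    ]},
]
print(top_products(docs, date(2018, 11, 24), date(2018, 11, 25), mid=9000))
# [(900001, 50), (900000, 40)]
```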

How do I limit an ElasticSearch API count by date?

I'm trying to count the number of query matches over a given time range, hitting the URL /{index}/_count with the body indicated below.
I'm new to Query DSL, so it's quite possible I'm overlooking something obvious. However, the straightforward application of a count to an existing query doesn't work. I don't see anything in the docs that indicates a count query should receive special treatment.
I've tried adding a range and aggregations to the query, but I keep getting the following error or some variant:
indices:data/read/count[s]]]; nested:
QueryParsingException[[graylog2_NN] request does not support [{label}]]
Limit query by timestamp:
{
"query": {
"term": { "level":3 },
"range": {
"timestamp": {
"from": "2015-06-16 15:10:09.322",
"to": "2015-06-16 16:10:09.322",
"include_lower": true,
"include_upper": true
}
}
}
}
Use an aggregation:
{
"query": {
"term": { "level":3 }
},
"aggs": {
"range": {
"date_range": {
field: "_timestamp",
"ranges": {
{ "to": "now-1d" },
{ "from": "now-2d" },
}
}
}
}
}
I've also tried plugging in the query exported from the UI (bug icon on an individual stream display), no joy there either (one hour's worth of matches):
{
"from": 0,
"size": 100,
"query": {
"match_all": {}
},
"post_filter": {
"bool": {
"must": [
{
"range": {
"timestamp": {
"from": "2015-06-16 15:10:09.322",
"to": "2015-06-16 16:10:09.322",
"include_lower": true,
"include_upper": true
}
}
},
{
"query": {
"query_string": {
"query": "streams:5568c9dbe4b0b31b781bf105"
}
}
}
]
}
},
"sort": [
{
"timestamp": {
"order": "desc"
}
}
],
"highlight": {
"require_field_match": false,
"fields": {
"*": {
"fragment_size": 0,
"number_of_fragments": 0
}
}
}
}
I've found a query that both matches and lines up pretty closely with numbers I get from the UI ("Search in the last 1 day"):
{
"query": {
"filtered": {
"query": {
"term": { "level":3 }
},
"filter": {
"range": { "timestamp": { "gte": "now-1d" } }
}
}
}
}
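Try the following query, which uses a bool query. Both clauses go under must, so a document has to match the level and fall inside the time range to be counted. I use a different timestamp format, which is the default in Elasticsearch. Try that format first; if that does not work, modify the timestamp format to match yours.
{
"query": {
"bool" : {
"must" : [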
{
"term": { "level":3 }
},
{
"range": {
"timestamp": {
"from": "2015-06-16T15:10:09",
"to": "2015-06-16T16:10:09"
}
}
}
]
}
}
}
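When the count is issued from a script, the body can be built with a small helper so that the level and the time window are parameters. This is a sketch under assumptions: `count_body` is a hypothetical name, both conditions are placed under `must` so that a document has to satisfy both, and the timestamps are assumed to be in a format the index accepts.

```python
import json


def count_body(level, start, end):
    """Body for /{index}/_count: level AND timestamp range must both match."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"term": {"level": level}},
                    {"range": {"timestamp": {"gte": start, "lte": end}}},
                ]
            }
        }
    }


body = count_body(3, "2015-06-16T15:10:09", "2015-06-16T16:10:09")
print(json.dumps(body, indent=2))
```

The body contains only a "query" key, which matters here because the error in the question is raised when the count endpoint receives other top-level keys such as aggregations.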

Resources