Specify size for each subquery in Elasticsearch - elasticsearch

I have a query that works like a UNION operation in SQL. What I need is to specify the size of the result set for each index. For example, I want to get 10 records from the first index and 15 records from the second index.
My query:
{
"query": {
"bool": {
"should": [
{
"bool": {
"must": [{
"match_phrase_prefix": {"userName": "ar" }
}]
}
},
{
"bool": {
"must": [{
"match_phrase_prefix": { "groupName": "ar" }
}]
}
}
]
}
}
}
Url to send query:
http://website.com:9200/user_data,group_data/_search
If you have any thoughts, I'd be very grateful.
Thank you

I don't think you can do that with a simple query.
But you can do it with the Top Hits aggregation, which lets you group result sets by certain fields via a bucket aggregation. Your case should look like:
{
"query": {
"bool": {
"should": [
{
"bool": {
"must": [{
"match_phrase_prefix": {"userName": "ar" }
}]
}
},
{
"bool": {
"must": [{
"match_phrase_prefix": { "groupName": "ar" }
}]
}
}
]
}
}, # Your query stays the same
"size": 0, # This returns nothing in the "hits" field, so you can focus on the "aggregations" field.
"aggs": {
"10_usernames": {
"top_hits": {
"_source": {
"includes": [ "userName" ]
},
"size" : 10
}
},
"15_groupnames": {
"top_hits": {
"_source": {
"includes": [ "groupName" ]
},
"size" : 15
}
}
}
}
You'll see your results within the "aggregations" field.
Hope this is helpful! :D
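One caveat with the request above: both top_hits aggregations run over the same merged set of hits, so nothing ties "10_usernames" to the first index. A minimal sketch of one way to get per-index limits in a single request, assuming the index names user_data and group_data from the question, is to wrap each top_hits in a filter aggregation on the _index metadata field. The helper name per_index_top_hits and the sub-aggregation name "docs" are made up for illustration.

```python
import json

def per_index_top_hits(query, limits):
    """Build a search body returning up to limits[index] hits per index.

    `limits` maps an index name to the number of hits wanted from it.
    """
    aggs = {}
    for index_name, size in limits.items():
        aggs[index_name] = {
            # Bucket only the documents that live in this index.
            "filter": {"term": {"_index": index_name}},
            # "docs" is an arbitrary name for the nested top_hits aggregation.
            "aggs": {"docs": {"top_hits": {"size": size}}},
        }
    # size 0 suppresses the regular "hits" list; results are under "aggregations".
    return {"query": query, "size": 0, "aggs": aggs}

query = {"bool": {"should": [
    {"match_phrase_prefix": {"userName": "ar"}},
    {"match_phrase_prefix": {"groupName": "ar"}},
]}}
body = per_index_top_hits(query, {"user_data": 10, "group_data": 15})
print(json.dumps(body, indent=2))
```

The printed body would be sent to user_data,group_data/_search as before; each bucket's hits would then sit under aggregations.<index name>.docs.hits.hits.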

OK, thanks for the help.
Eventually I chose another approach. I use the Multi Search API, which lets you execute several requests at once. My query is:
POST http://website.com:9200/_msearch
{"index": "user_data"}
{"size":10,"query":{"bool":{"must":[{"match_phrase_prefix":{"userName":"##USER_TEXT##"}}]}}}
{"index": "group_data"}
{"size":15,"query":{"bool":{"must":[{"match_phrase_prefix":{"groupName":"##USER_TEXT##"}}]}}}
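The _msearch body is newline-delimited JSON: a header line, then a body line, for each search, with a newline after the last line. A small sketch of building that payload, using the index and field names from this thread (the helper names build_msearch_body and prefix_search are made up):

```python
import json

def build_msearch_body(searches):
    """Serialize (header, body) pairs into the newline-delimited _msearch format."""
    lines = []
    for header, body in searches:
        lines.append(json.dumps(header))
        lines.append(json.dumps(body))
    # The format requires a newline after the last line as well.
    return "\n".join(lines) + "\n"

def prefix_search(field, text, size):
    """One sub-search: a match_phrase_prefix query with its own size."""
    return {"size": size, "query": {"match_phrase_prefix": {field: text}}}

payload = build_msearch_body([
    ({"index": "user_data"}, prefix_search("userName", "ar", 10)),
    ({"index": "group_data"}, prefix_search("groupName", "ar", 15)),
])
print(payload)
```

The payload would be POSTed to /_msearch with the Content-Type application/x-ndjson; the results come back in a "responses" array, in the same order as the searches.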

Related

How do I make an ES aggregation work on the output of a function_score query?

I have a function_score query and I want to apply a top_hits aggregation to its output. I use the function_score query to filter parents based on child properties; then I want to run some queries over those parents via the aggregation and sort them accordingly.
Query:
POST test/_search?size=0
{
"query": {
"function_score": {
"query": {
"match_all" : {}
},
"functions": [
{
"filter": {
"bool": {
"must": [
{
"has_child": {
"type": "track",
"query": {
"exists": {
"field": "url"
}
}
}
}
]
}
},
"weight": 5
}
]
}
},
"aggs": {
"top_tags": {
"filter": {
"bool": {
"must": [
{
"exists": {
"field": "common_info.CD.Handle"
}
},
{
"match_phrase" : {
"common_info.CD.device" : "D/C"
}
}
]
}
},
"aggs": {
"test_agg_on_doc": {
"top_hits": {
"sort" : [
],
"_source": {
"includes": [
"common_info.CD.Handle"
]
},
"size": 1
}
}
}
}
}
}
When I run this query, the function_score query does not seem to be taken into account: the "aggs" work on all documents, but I want them to run only on the documents filtered by the function_score query. Any help would be highly appreciated. Thanks.
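A possible explanation, offered as a sketch rather than a confirmed fix: the "filter" inside a function only decides which documents receive that function's weight. It does not remove anything from the result set, and the inner match_all still matches every document, so the aggregations see them all. One way to restrict the set is to put the has_child clause into the inner query. The helper name parents_with_tracks is made up for illustration.

```python
def parents_with_tracks(aggs):
    """Build a body whose aggregations only see parents that have a track with a url."""
    has_child = {
        "has_child": {
            "type": "track",
            "query": {"exists": {"field": "url"}},
        }
    }
    return {
        "size": 0,
        "query": {
            "function_score": {
                # The inner query decides which documents match at all.
                "query": {"bool": {"filter": [has_child]}},
                # Functions only adjust the score of documents that already match.
                "functions": [{"filter": has_child, "weight": 5}],
            }
        },
        # Aggregations run over the documents matched by the query above.
        "aggs": aggs,
    }

body = parents_with_tracks({
    "test_agg_on_doc": {
        "top_hits": {"_source": {"includes": ["common_info.CD.Handle"]}, "size": 1}
    }
})
```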

ElasticSearch multimatch substring search

I have to combine two filters to match the requirements:
- the r.status field contains one of a specific list of values
- at least one of several text fields contains the value.
The resulting query (generated with NEST, but that doesn't matter) looks like:
{
"query": {
"bool": {
"filter": [
{
"bool": {
"must": [
{
"term": {
"isActive": {
"value": true
}
}
},
{
"nested": {
"query": {
"bool": {
"must": [
{
"terms": {
"r.status": [
"VALUE_1",
"VALUE_2",
"VALUE_3"
]
}
},
{
"bool": {
"should": [
{
"match": {
"r.g.firstName": {
"type": "phrase",
"query": "SUBSTRING_VALUE"
}
}
},
{
"match": {
"r.g.lastName": {
"type": "phrase",
"query": "SUBSTRING_VALUE"
}
}
}
]
}
}
]
}
},
"path": "r"
}
}
]
}
}
]
}
}
}
I also tried a multi_match query:
{
"query": {
"bool": {
"filter": [
{
"bool": {
"must": [
{
"term": {
"isActive": {
"value": true
}
}
},
{
"nested": {
"query": {
"bool": {
"must": [
{
"terms": {
"r.status": [
"VALUE_1",
"VALUE_2",
"VALUE_3"
]
}
},
{
"multi_match": {
"query": "SUBSTRING_VALUE",
"fields": [
"r.g.firstName",
"r.g.lastName"
]
}
}
]
}
},
"path": "r"
}
}
]
}
}
]
}
}
}
firstName and lastName are configured in the index mappings as text:
"firstName": {
"type": "text"
},
"lastName": {
"type": "text"
}
Elasticsearch offers a lot of full-text search options: multi_match, phrase, wildcards, etc. But all of them fail in my case when looking for a substring in my text fields. (The terms query and the isActive one work well; I checked by running only those.)
What other options do I have, or where did I make a mistake?
UPD: Combined wildcards worked for me, but such a query looks ugly. I'm looking for a more elegant solution.
The Elasticsearch way is to use an ngram tokenizer.
An ngram analyzer splits your terms with a sliding window. For example, the input "Hello World" will generate the following terms:
Hel
Hell
Hello
ell
ello
...
Wor
Worl
World
orl
...
You can configure the minimum and maximum size of the sliding window (in the example the minimum size is 3). Once the sub-terms are generated, you can use a match query on the subfield.
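The sliding window can be imitated in plain Python to see which terms would end up in the index. This is only a local sketch, not the tokenizer itself. It assumes words are split on whitespace first, a maximum gram size of 5, and no lowercasing; the function names are made up.

```python
def ngrams(token, min_gram=3, max_gram=5):
    """Return every substring of `token` with a length from min_gram to max_gram."""
    grams = []
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(token):
                grams.append(token[start:start + size])
    return grams

def analyze(text, min_gram=3, max_gram=5):
    """Split on whitespace, then emit the ngrams of each word."""
    terms = []
    for word in text.split():
        terms.extend(ngrams(word, min_gram, max_gram))
    return terms

terms = analyze("Hello World")
print(terms)
# A substring search for "orl" succeeds because "orl" is itself an indexed term.
print("orl" in terms)
```

Because every substring within the configured size range becomes its own term, a plain match query can find it without wildcards. The price is a larger index.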
Another point: it is odd to use must within a filter. If you are interested in the score, use must; otherwise use filter. Read this article for a good understanding.

elasticsearch multi field query is not working as expected

I've been facing some issues with a multi-field Elasticsearch query. I am trying to fetch all documents whose func_name field matches either of two hard-coded strings. Even though my index has documents with both function names, the query result always contains only one func_name. So far I have tried the following queries.
1) The following returns only one function match, even though the index has documents with the other function as well:
GET /_search
{
"query": {
"multi_match": {
"query": "FEM_DS_GetTunerStatusInfo MDM_TunerStatusPrint",
"operator": "OR",
"fields": [
"func_name"
]
}
}
}
2) The following intermittently gives me both functions:
GET /_search
{
"query": {
"match": {
"func_name": {
"query": "MDM_TunerStatusPrint FEM_DS_GetTunerStatusInfo",
"operator": "or"
}
}
}
}
3) The following returns only one function match, even though the index has documents with the other function as well:
{
"query": {
"bool": {
"should": [
{ "match": { "func_name": "FEM_DS_GetTunerStatusInfo" }},
{ "match": { "func_name": "MDM_TunerStatusPrint" }}
]
}
}
}
Any help is much appreciated.
Thanks for your reply. Let's assume that I have the following kinds of documents in my Elasticsearch. I want my search to return the first two documents, since they match my func_name.
{
"_index": "diag-178999",
"_source": {
"severity": "MIL",
"t_id": "03468500",
"p_id": "000007c6",
"func_name": "MDM_TunerStatusPrint",
"timestamp": "2017-06-01T02:04:51.000Z"
}
},
{
"_index": "diag-344563",
"_source": {
"t_id": "03468500",
"p_id": "000007c6",
"func_name": "FEM_DS_GetTunerStatusInfo",
"timestamp": "2017-07-20T02:04:51.000Z"
}
},
{
"_index": "diag-101010",
"_source": {
"severity": "MIL",
"t_id": "03468500",
"p_id": "000007c6",
"func_name": "some_func",
"timestamp": "2017-09-15T02:04:51.000Z"
}
}
The two best ways to query your ES are to filter by terms on a particular field, or to use an aggregation, which lets you rename the field, apply multiple rules, and give the response a more understandable format.
See: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html. The general aggregations page is also very useful:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations.html
In your case, you should do:
{
"from": 0, "size": 2,
"query": {
"bool": {
"filter": {
"terms": {
"func_name": [
"FEM_DS_GetTunerStatusInfo",
"MDM_TunerStatusPrint"
]
}
}
}
}
}
OR
{
"aggs": {
"aggregationName": {
"terms": {
"field": "func_name",
"include": [
"FEM_DS_GetTunerStatusInfo",
"MDM_TunerStatusPrint"
]
}
}
}
}
The aggregation at the end is just there to show how to do the same thing as the query filter. Let me know if it works :)
Best regards
As I understand it, you should use a filtered query to match any document with one of the func_name values mentioned above:
{
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"terms": {
"func_name": [
"FEM_DS_GetTunerStatusInfo",
"MDM_TunerStatusPrint"
]
}
}
]
}
}
}
}
}
See:
Filtered Query, Terms Query
UPDATE for ES 5.0, where the filtered query was removed:
{
"query": {
"bool": {
"must": [
{
"terms": {
"func_name": [
"FEM_DS_GetTunerStatusInfo",
"MDM_TunerStatusPrint"
]
}
}
]
}
}
}
See: this answer

elasticsearch must query combine OR?

I have been trying to use a must query with bool but I am failing to get the results.
In pseudo-SQL:
SELECT * FROM info WHERE (ulevel= '1.3.10' or ulevel= '1.3.6') AND (#timestamp between '2017-06-05T07:00:00.000Z' and '2017-06-05T07:20:00.000Z')
Here is what I have:
"query": {
"bool": {
"must": [
{
"query_string": {
"default_field": "_all",
"query": "*"
},
"range": {
"#timestamp": {
"from": "2017-06-05T07:00:00.000Z",
"to": "2017-06-05T07:20:00.000Z"
}
},
"bool": {
"should": [
{"term": { "ulevel": "1.3.10"}},
{"term": { "ulevel": "1.3.6"}}
]
}
}
]
}
}
Does anyone have a solution?
Thank you so much.
You can use a terms query for the first part and a range query for the second part:
GET _search
{
"query": {
"bool": {
"must": [
{
"terms": {
"ulevel": [
"1.3.10",
"1.3.6"
]
}
},
{
"range": {
"#timestamp": {
"gte": "2017-06-05T07:00:00.000Z",
"lte": "2017-06-05T07:20:00.000Z"
}
}
}
]
}
},
"from": 0,
"size": 20
}
Some notes:
The terms query matches documents whose field contains any of the provided terms; the terms are not analyzed.
You can also use date-specific expressions with the range query. Please check the range query page https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html#ranges-on-dates for more information.
Update:
Added from and size in response to the question in the comments.
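The key difference from the question's attempt is that each clause is its own object inside the must array, rather than several query types packed into one object. A small sketch that builds the same body from parameters (the function name levels_in_window and its parameter names are made up; the field names come from the question):

```python
import json

def levels_in_window(levels, start, end, time_field="#timestamp", offset=0, limit=20):
    """(ulevel IN levels) AND (time_field BETWEEN start AND end), with paging."""
    return {
        "query": {
            "bool": {
                # Every entry in "must" has to match, which gives the AND.
                "must": [
                    # terms matches any of the listed values, which gives the OR.
                    {"terms": {"ulevel": list(levels)}},
                    # gte/lte make both ends of the range inclusive.
                    {"range": {time_field: {"gte": start, "lte": end}}},
                ]
            }
        },
        "from": offset,
        "size": limit,
    }

body = levels_in_window(
    ["1.3.10", "1.3.6"],
    "2017-06-05T07:00:00.000Z",
    "2017-06-05T07:20:00.000Z",
)
print(json.dumps(body, indent=2))
```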

Select distinct values of bool query elastic search

I have a query that gets me some user post data from an Elasticsearch index. I am happy with that query, but I need it to return rows with unique usernames. Currently it displays relevant posts by users, but it may show the same user twice.
{
"query": {
"bool": {
"should": [
{ "match_phrase": { "gtitle": {"query": "voice","boost": 1}}},
{ "match_phrase": { "gdesc": {"query": "voice","boost": 1}}},
{ "match": { "city": {"query": "voice","boost": 2}}},
{ "match": { "gtags": {"query": "voice","boost": 1} }}
],"must_not": [
{ "term": { "profilepicture": ""}}
],"minimum_should_match" : 1
}
}
}
I have read about aggregations but didn't understand much (I also tried to use aggs, but that didn't work either). Any help is appreciated.
You need a terms aggregation to get all unique users, and then a top hits aggregation to get only one result for each user. This is how it looks:
{
"query": {
"bool": {
"should": [
{
"match_phrase": {
"gtitle": {
"query": "voice",
"boost": 1
}
}
},
{
"match_phrase": {
"gdesc": {
"query": "voice",
"boost": 1
}
}
},
{
"match": {
"city": {
"query": "voice",
"boost": 2
}
}
},
{
"match": {
"gtags": {
"query": "voice",
"boost": 1
}
}
}
],
"must_not": [
{
"term": {
"profilepicture": ""
}
}
],
"minimum_should_match": 1
}
},
"aggs": {
"unique_user": {
"terms": {
"field": "userid",
"size": 100
},
"aggs": {
"only_one_post": {
"top_hits": {
"size": 1
}
}
}
}
},
"size": 0
}
Here the size inside the unique_user aggregation is 100; you can increase it if you have more unique users (the default is 10). The outermost size is zero so that only aggregation results are returned. One important thing to remember is that your user IDs have to be unique, i.e. ABC and abc will be considered different users; you might have to make your userid field not_analyzed to be sure about that. More on that.
Hope this helps!!
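With "size": 0 the posts no longer arrive in the top-level "hits" list; each one sits inside its user's bucket. A sketch of flattening that back into one row per user, using the aggregation names from the request above. The sample response is hand-written for illustration, not real output, and the function name one_post_per_user is made up.

```python
def one_post_per_user(response):
    """Collect the single top hit from each unique_user bucket."""
    posts = []
    for bucket in response["aggregations"]["unique_user"]["buckets"]:
        hits = bucket["only_one_post"]["hits"]["hits"]
        if hits:
            posts.append({"userid": bucket["key"], "post": hits[0]["_source"]})
    return posts

# Hypothetical, trimmed-down response with two users.
sample = {"aggregations": {"unique_user": {"buckets": [
    {"key": "u1", "doc_count": 2, "only_one_post": {"hits": {"hits": [
        {"_source": {"gtitle": "voice lessons"}}]}}},
    {"key": "u2", "doc_count": 1, "only_one_post": {"hits": {"hits": [
        {"_source": {"gtitle": "voice actor"}}]}}},
]}}}

for row in one_post_per_user(sample):
    print(row["userid"], row["post"]["gtitle"])
```

Note that terms buckets are ordered by document count by default, so the rows come back ordered by how many posts each user has rather than by relevance.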
