Elasticsearch query from multiple indexes with pagination per index - elasticsearch

I have the following query running over multiple indexes. I want to have pagination per index.
I don't want to lose results just because one index has more results than the other.
GET /research-one,research-two/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match_phrase": {
            "urls.value": "https://www.stackoverflow.com"
          }
        },
        {
          "query_string": {
            "default_field": "urls.value",
            "query": "https://stackexchange.com/*"
          }
        }
      ]
    }
  },
  "size": 20,
  "from": 0
}
Let's say that in this case, research-one has 10000 results and research-two has 2 results.
I don't know ahead of time which one has more results.
Thanks
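One possible approach, offered as a sketch rather than a tested answer: send one search per index through the multi search API, so that each index gets its own from and size. The Python below only builds the newline-delimited request body for _msearch; the helper name build_per_index_msearch and the page numbers are made up for illustration, and sending the body is left to whatever HTTP client you use.

```python
import json


def build_per_index_msearch(query, pages, size=20):
    """Build an NDJSON _msearch body with one search per index.

    `pages` maps an index name to the zero-based page wanted for that
    index, so every index keeps its own `from` offset.
    """
    lines = []
    for index, page in pages.items():
        # Header line: which index this search targets.
        lines.append(json.dumps({"index": index}))
        # Body line: the shared query with a per-index offset.
        lines.append(json.dumps({"query": query, "from": page * size, "size": size}))
    # The multi search API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


query = {
    "bool": {
        "should": [
            {"match_phrase": {"urls.value": "https://www.stackoverflow.com"}},
            {
                "query_string": {
                    "default_field": "urls.value",
                    "query": "https://stackexchange.com/*",
                }
            },
        ]
    }
}

# Hypothetical state: page 3 of research-one, still page 0 of research-two.
payload = build_per_index_msearch(query, {"research-one": 3, "research-two": 0})
print(payload)
```

Each index then returns its own page of hits in its own entry of the response, so a small index is not crowded out by a large one.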

Related

How can I combine multiple queries in Kibana DevTools to get one result for each?

I'm trying to write a query in Kibana DevTools that would give me one match for each query.
Let's say I've got fields (field1, field2) that I want to match with a specific value.
I want to display 1 result for each field if any logs were found for any of these fields. The code that I'm using right now that displays 1 log for 1 search is below:
(I'm looking for log that was created in the last 30 minutes and using sort to get the last one)
GET default*/_search
{
  "size": 1,
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "field1:somevalue*"
          }
        },
        {
          "range": {"@timestamp": {"gte": "now-30m"}}
        }
      ]
    }
  },
  "sort": {
    "@timestamp": {
      "order": "desc"
    }
  }
}
How can I modify it to display 1 log for each field (field1, field2)? It should look something like this:
GET default*/_search
{
  "size": 1,
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "field1:somevalue*"
          }
        },
        {
          "range": {"@timestamp": {"gte": "now-30m"}}
        }
      ]
    }
  },
  "sort": {
    "@timestamp": {
      "order": "desc"
    }
  }
}
AND
{
  "size": 1,
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "field2:somevalue*"
          }
        },
        {
          "range": {"@timestamp": {"gte": "now-30m"}}
        }
      ]
    }
  },
  "sort": {
    "@timestamp": {
      "order": "desc"
    }
  }
}
The one above obviously doesn't work but wanted to visualize what I mean.
Any help appreciated
Val, here are some screenshots of the error I get when trying your solution: Image 1, Image 2
I suggest using the multi search API:
GET default*/_msearch
{}
{"size":1,"query":{"bool":{"must":[{"query_string":{"query":"field1:somevalue*"}},{"range":{"@timestamp":{"gte":"now-30m"}}}]}},"sort":{"@timestamp":{"order":"desc"}}}
{}
{"size":1,"query":{"bool":{"must":[{"query_string":{"query":"field2:somevalue*"}},{"range":{"@timestamp":{"gte":"now-30m"}}}]}},"sort":{"@timestamp":{"order":"desc"}}}
In the response, you'd get exactly what you expect, i.e. the response for each query.
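To make "the response for each query" concrete, here is a small Python sketch of reading the result. It relies on the multi search response carrying a responses array in the same order as the searches in the request. The function name latest_hit_per_field and the sample response are invented for illustration and are not real output.

```python
def latest_hit_per_field(fields, msearch_response):
    """Pair each field with the first hit of its search, or None if nothing matched."""
    result = {}
    # `responses` comes back in the same order as the searches were sent.
    for field, response in zip(fields, msearch_response["responses"]):
        # A failed search carries an "error" key and no "hits"; treat it as empty.
        hits = response.get("hits", {}).get("hits", [])
        result[field] = hits[0]["_source"] if hits else None
    return result


# Hand-written sample shaped like an _msearch response (illustrative only).
sample_response = {
    "responses": [
        {"hits": {"hits": [{"_source": {"field1": "somevalue-a"}}]}},
        {"hits": {"hits": []}},
    ]
}

latest = latest_hit_per_field(["field1", "field2"], sample_response)
print(latest)
```

Because each search has "size": 1 and sorts by timestamp descending, the first hit of each entry is the newest log for that field.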

Query on multiple range of document

I want to search within a known subset of documents rather than across all documents. I already know the IDs of those documents from a separate process. For example, I want to match some sentences against the 'pLabel' field, but only among the documents whose IDs I know. My attempt is below, but it returns many documents I did not expect.
For example, given documents in groups such as eid1, eid2, etc., I want the query to return only the matching documents that belong to those groups (eid1, eid2, eid3, ...). The query is shown below.
How do I fix the query to get the right search result?
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "default_field": "pLabel",
            "query": "search words here"
          }
        }
      ],
      "must_not": [],
      "should": [
        {
          "term": {
            "eid": "eid1"
          }
        },
        {
          "term": {
            "eid": "eid2"
          }
        }
      ]
    }
  },
  "size": 0,
  "_source": [
    "eid"
  ],
  "aggs": {
    "eids": {
      "terms": {
        "field": "eid",
        "size": 1000
      }
    }
  }
}
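You need to move the condition on the document IDs from the should clause into the must clause.
Right now the query returns any document that matches the query_string clause; because a must clause is present, the should clauses are optional and only make documents with matching IDs score higher.
Also, you should use a terms query instead of several term queries: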
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "default_field": "pLabel",
            "query": "search words here"
          }
        },
        {
          "terms": {
            "eid": ["eid1", "eid2"]
          }
        }
      ]
    }
  },
  "size": 0,
  "_source": [
    "eid"
  ],
  "aggs": {
    "eids": {
      "terms": {
        "field": "eid",
        "size": 1000
      }
    }
  }
}
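To illustrate why the original query returned too much, here is a toy Python model of how a bool query selects documents. It is only a simplified sketch: it ignores scoring entirely, and bool_matches, the sample documents, and the substring check standing in for query_string are all made up for illustration.

```python
def bool_matches(doc, must=(), should=(), minimum_should_match=None):
    """Toy model of bool query matching (scoring is ignored)."""
    if minimum_should_match is None:
        # When a must clause is present, should clauses are optional by default.
        minimum_should_match = 0 if (must or not should) else 1
    if not all(clause(doc) for clause in must):
        return False
    matched_should = sum(1 for clause in should if clause(doc))
    return matched_should >= minimum_should_match


def label_matches(doc):
    # Stand-in for the query_string clause on pLabel.
    return "search words" in doc["pLabel"]


def eid_is(value):
    return lambda doc: doc["eid"] == value


def eid_in(values):
    return lambda doc: doc["eid"] in values


docs = [
    {"eid": "eid1", "pLabel": "search words here"},
    {"eid": "eid3", "pLabel": "search words here"},
    {"eid": "eid2", "pLabel": "something else"},
]

# Original shape: IDs in should, next to a must clause.
original = [d["eid"] for d in docs
            if bool_matches(d, must=[label_matches],
                            should=[eid_is("eid1"), eid_is("eid2")])]

# Fixed shape: the ID condition is itself a must clause.
fixed = [d["eid"] for d in docs
         if bool_matches(d, must=[label_matches, eid_in({"eid1", "eid2"})])]

print(original)  # ['eid1', 'eid3']
print(fixed)     # ['eid1']
```

In the original shape eid3 slips through because nothing requires a should clause to match; in the fixed shape the ID list is required.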

Elasticsearch search in documents with certain values for a field

I have an index with following document structure with 5 fields. I have written a search query as follows :
{
  "query": {
    "query_string": {
      "fields": [
        "field1.keyword",
        "field2.keyword",
        "field3.keyword"
      ],
      "query": "*abc*"
    }
  },
  "from": 0,
  "size": 1000
}
This works fine, but as a new requirement I have to search only in documents where field4 has one of a given set of values, say (1, 2, 3), and omit the rest of the documents.
I can also obtain the list of field4 values to be omitted, since they are stored in the DB with a skip status.
Please suggest a solution. Thanks in advance.
I suggest using a filter query inside a bool query to match the docs that meet the condition.
{
  "query": {
    "bool": {
      "must": {
        "query_string": {
          "fields": [
            "field1.keyword",
            "field2.keyword",
            "field3.keyword"
          ],
          "query": "*abc*"
        }
      },
      "filter": {
        "terms": {
          "field4.keyword": [1, 2, 3]
        }
      }
    }
  }
}
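Since the question also mentions having a list of field4 values to skip, the same bool query can take a must_not clause instead of (or alongside) the filter. Below is a minimal Python sketch that builds either form of the request body. The function name build_field4_search and the skip values 4 and 5 are assumptions for illustration, and the field name defaults to the field4.keyword used above.

```python
def build_field4_search(text, search_fields, include=None, exclude=None,
                        field="field4.keyword", size=1000):
    """Build a search body that restricts matches by allowed or skipped values."""
    bool_query = {
        "must": {"query_string": {"fields": list(search_fields), "query": text}}
    }
    if include:
        # Keep only documents whose field has one of these values (not scored).
        bool_query["filter"] = {"terms": {field: list(include)}}
    if exclude:
        # Drop documents whose field has one of these values.
        bool_query["must_not"] = {"terms": {field: list(exclude)}}
    return {"query": {"bool": bool_query}, "from": 0, "size": size}


fields = ["field1.keyword", "field2.keyword", "field3.keyword"]

# Allow-list form, equivalent to the answer above.
allow_body = build_field4_search("*abc*", fields, include=[1, 2, 3])

# Skip-list form, with hypothetical values marked as skipped in the DB.
skip_body = build_field4_search("*abc*", fields, exclude=[4, 5])

print(allow_body)
print(skip_body)
```

Which form is better depends on which list is shorter and easier to keep up to date.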

Can I use query_string to filter result before scoring for performance for Elasticsearch 1.5?

I'm writing an API on top of Elasticsearch 1.5 that lets people filter documents with query_string before scoring. This is my query:
"query": {
  "filtered": {
    "filter": {
      "query": {
        "bool": {
          "must_not": [{
            "query_string": {
              "query": "url:/.*valid_url.*/"
            }
          }],
          "must": {
            "terms": {
              "brand": ["the brand"]
            }
          }
        }
      }
    },
    "query": {
      "match_all": {}
    }
  }
}
However, to use query_string I have to nest it under query inside filtered.filter. I'm not sure whether that still counts as filtering documents before scoring, which is what I want for performance.
I have about 6 million documents. Does the query above filter all 6 million documents down to those where brand equals "the brand" and url does not match the regular expression /.*valid_url.*/? And is whatever is left after the filter then scored, in this case by match_all?

elasticsearch default_field vs fields different results

Here are two queries.
First:
{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "27444.2",
          "default_field": "text"
        }
      }
    }
  },
  "from": 0,
  "size": 50
}
Second:
{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "27444.2"
        }
      }
    }
  },
  "fields": ["text"],
  "from": 0,
  "size": 50
}
The only difference between them is that in the first I use default_field to specify the field to search, and in the second I specify it through the fields param. The field name is the same.
I expect both variants to produce the same results, but that's not the case. The first variant doesn't return any results, and the second returns a result. So what am I doing wrong here? Where is the catch?
elasticsearch 1.4.2
The way you have given the fields param is wrong.
In the second case, fields sits at the top level of the request, where it restricts which fields are returned for each hit instead of the entire _source. It does not control which fields are searched.
The following is what you are looking for:
{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "27444.2",
          "fields": ["text"]
        }
      }
    }
  },
  "from": 0,
  "size": 50
}
The two queries are not the same.
The first searches only the field 'text'. The second searches all fields and returns only the 'text' field in the response.
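The difference comes down to where fields sits in the request body. As a rough Python sketch, the helper below inspects a request body and reports which fields query_string searches and which fields come back. The names find_query_string and describe_fields are invented for illustration, and the fallback to _all assumes the index default field was left at its 1.x default.

```python
def find_query_string(node):
    """Return the first query_string clause found anywhere under `node`."""
    if isinstance(node, dict):
        if "query_string" in node:
            return node["query_string"]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_query_string(child)
        if found is not None:
            return found
    return None


def describe_fields(body):
    """Report which fields a request searches and which it returns."""
    qs = find_query_string(body.get("query", {})) or {}
    if "fields" in qs:
        searched = list(qs["fields"])
    elif "default_field" in qs:
        searched = [qs["default_field"]]
    else:
        searched = ["_all"]  # assumption: 1.x default when nothing is specified
    # Top-level `fields` only limits what is returned for each hit.
    returned = body.get("fields", ["_source"])
    return {"searched": searched, "returned": returned}


first = {"query": {"filtered": {"query": {"query_string": {
    "query": "27444.2", "default_field": "text"}}}}, "from": 0, "size": 50}
second = {"query": {"filtered": {"query": {"query_string": {
    "query": "27444.2"}}}}, "fields": ["text"], "from": 0, "size": 50}

print(describe_fields(first))   # searches text, returns _source
print(describe_fields(second))  # searches _all, returns text
```

So the second request matched through some other field while showing only text in the output, which is why the two results differ.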
