Elasticsearch - unify search results from different indexes

I want to run different search queries against different indexes and unify the results.
I know there is a multi-target syntax, which lets me run one specific query over multiple indexes.
What I want is a different query for each index, followed by something like a SQL UNION.
Is there a way to achieve that?

You can use the _index metadata field. It lets you restrict each clause of a bool query to one index, so a single search request can run a different query against each index.
Below is a working example with index data, search query and search result.
Index Data
POST /index1/_doc/1
{
"name":"foo"
}
POST /index2/_doc/1
{
"name":"bar"
}
Search Query:
{
"query": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"match": {
"name": "foo"
}
},
{
"term": {
"_index": "index1"
}
}
]
}
},
{
"bool": {
"must": [
{
"match": {
"name": "bar"
}
},
{
"term": {
"_index": "index2"
}
}
]
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "index1",
"_type": "_doc",
"_id": "1",
"_score": 1.287682,
"_source": {
"name": "foo"
}
},
{
"_index": "index2",
"_type": "_doc",
"_id": "1",
"_score": 1.287682,
"_source": {
"name": "bar"
}
}
]
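The same pattern can be generated for any number of indexes. The following is a minimal sketch, not part of the original answer: build_union_query is a hypothetical helper name, and it places the _index term in filter rather than must so that it does not contribute to the score. It only builds the request body; sending it to a cluster is left out.

```python
def build_union_query(per_index_queries):
    """Build one bool/should query that runs a different query per index.

    per_index_queries maps an index name to the query that should be
    applied to documents of that index only (hypothetical helper).
    """
    should = []
    for index_name, query in per_index_queries.items():
        should.append({
            "bool": {
                # The per-index query decides the score.
                "must": [query],
                # The _index term only restricts the clause to one index.
                "filter": [{"term": {"_index": index_name}}],
            }
        })
    # At least one of the per-index clauses has to match.
    return {"query": {"bool": {"should": should, "minimum_should_match": 1}}}


union_body = build_union_query({
    "index1": {"match": {"name": "foo"}},
    "index2": {"match": {"name": "bar"}},
})
```

The resulting union_body could then be posted to a search endpoint that targets both indexes, for example index1,index2/_search.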


Elasticsearch query filter combination issue

I'm trying to understand why the Elasticsearch query below does not work.
EDIT:
The fields mentioned in the query are from different indices. For example, the filter uses the classification field, which is in a different index from the fields mentioned in the query string.
The filter is expected to work as follows: when the user searches specifically on the classification field (i.e. secret or protected), the matching values are displayed. If the user searches on any other field from a different index, for example firstname or person, no filter should be applied, because firstname and person are not part of the filter.
{
"query": {
"bool": {
"filter": {
"terms": {
"classification": [
"secret",
"protected"
]
}
},
"must": {
"query_string": {
"query": "*john*",
"fields": [
"classification",
"firstname",
"releasability",
"person"
]
}
}
}
}
}
The expected result is that documents with john in the person field are returned. This works when the filter is removed from the query above:
{
"query": {
"query_string": {
"query": "*john*",
"fields": [
"classification",
"firstname",
"releasability",
"person"
]
}
}
}
The purpose of the filter is only to filter records when the said fields contain the values mentioned; otherwise it should work for all values.
Why does it return only results for the classification values and nothing for john?
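In a bool query, filter and must clauses both have to match, so the original query only returns documents that contain john and also have a classification of secret or protected. Wrapping each condition in its own bool clause under should makes them alternatives. Below is a working example with sample index data and search query.
To learn more about the bool query, refer to the official documentation.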
Index Data:
Index data in my_index index
{
"name":"John",
"title":"b"
}
{
"name":"Johns",
"title":"a"
}
Index data in my_index1 index
{
"classification":"protected"
}
{
"classification":"secret"
}
Search Query :
POST http://localhost:9200/_search
{
"query": {
"bool": {
"should": [
{
"bool": {
"filter": [
{
"terms": {
"classification": [
"secret",
"protected"
]
}
}
]
}
},
{
"bool": {
"must": [
{
"query_string": {
"query": "*john*",
"fields": [
"name",
"title"
]
}
}
]
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "my_index",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"name": "John",
"title": "b"
}
},
{
"_index": "my_index",
"_type": "_doc",
"_id": "2",
"_score": 1.0,
"_source": {
"name": "Johns",
"title": "a"
}
},
{
"_index": "my_index1",
"_type": "_doc",
"_id": "1",
"_score": 0.0,
"_source": {
"classification": "secret"
}
},
{
"_index": "my_index1",
"_type": "_doc",
"_id": "2",
"_score": 0.0,
"_source": {
"classification": "protected"
}
}
]
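The difference between the two queries can be illustrated without a cluster. The sketch below is plain Python standing in for Elasticsearch, using the sample documents from this answer; the helper names are hypothetical and the substring check is only a rough stand-in for the *john* wildcard.

```python
# Sample documents from both indexes, flattened into one list.
docs = [
    {"name": "John", "title": "b"},
    {"name": "Johns", "title": "a"},
    {"classification": "protected"},
    {"classification": "secret"},
]


def in_classification(doc):
    # Stand-in for the terms filter on "classification".
    return doc.get("classification") in {"secret", "protected"}


def mentions_john(doc):
    # Stand-in for the *john* query_string on "name" and "title".
    return any("john" in str(doc.get(f, "")).lower() for f in ("name", "title"))


# filter + must: both conditions are required (logical AND).
and_hits = [d for d in docs if in_classification(d) and mentions_john(d)]

# should + should: either condition is enough (logical OR).
or_hits = [d for d in docs if in_classification(d) or mentions_john(d)]

print(len(and_hits), len(or_hits))
```

No document has both a classification and a name, so the AND form finds nothing, while the OR form returns all four documents, as in the search result above.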

Elasticsearch associating exact match terms

I have a search index of filenames containing over 100,000 entries that share about 500 unique variations of the main filename field. I have recently made some modifications to certain filename values that are being generated from my data. I was wondering if there is a way to link certain queries to return an exact match. In the following query:
"query": {
"bool": {
"must": [
{
"match": {
"filename": "foo-bar"
}
}
]
}
}
How can I modify the index and associate the results so that the above query also matches foo-bar-baz, but not foo-bar-foo or any other variation?
Thanks in advance for your help.
You can use a term query instead of a match query. It is well suited to keyword fields:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-term-query.html
Adding a working example with index data and search query. (Using the default mapping)
Index Data:
{
"fileName": "foo-bar"
}
{
"fileName": "foo-bar-baz"
}
{
"fileName": "foo-bar-foo"
}
Search Query:
{
"query": {
"bool": {
"should": [
{
"match": {
"fileName.keyword": "foo-bar"
}
},
{
"match": {
"fileName.keyword": "foo-bar-baz"
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "test",
"_type": "_doc",
"_id": "1",
"_score": 0.9808291,
"_source": {
"fileName": "foo-bar"
}
},
{
"_index": "test",
"_type": "_doc",
"_id": "2",
"_score": 0.9808291,
"_source": {
"fileName": "foo-bar-baz"
}
}
]
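Since the first answer suggests a term query, the same lookup can also be written as a single terms query on the keyword subfield. This is a minimal sketch that only builds the request body; exact_filename_query is a hypothetical helper, and it assumes the default dynamic mapping created the fileName.keyword subfield, as in the example above.

```python
def exact_filename_query(values, field="fileName.keyword"):
    """Build a terms query matching documents whose field equals any value.

    Assumes `field` is a keyword (not analyzed) field, so values are
    compared as whole strings (hypothetical helper).
    """
    return {"query": {"terms": {field: list(values)}}}


filename_body = exact_filename_query(["foo-bar", "foo-bar-baz"])
```

Because the comparison is exact, foo-bar-foo is not matched unless it is added to the list explicitly.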

elasticSearch: bool query with multiple values on one field

This works:
GET /bitbucket$$pull-request-activity/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"prid": "12343"
}
},
{
"match": {
"repoSlug": "com.xxx.vserver"
}
}
]
}
}
}
But I would like to capture multiple prids in one call.
This does not work however:
GET /bitbucket$$pull-request-activity/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"prid": "[12343, 11234, 13421]"
}
},
{
"match": {
"repoSlug": "com.xxx.vserver"
}
}
]
}
}
}
any hints?
Because you are using must in your bool query, the clauses are combined with a logical AND, so every document that matches on the prid field must also match "repoSlug": "com.xxx.vserver".
If none of the documents match "repoSlug": "com.xxx.vserver", no results are returned.
And if only 2 documents match both clauses, only those 2 are returned in the search result, not all the documents.
Adding a working example with sample docs and search query
Index Sample Data :
{
"id":"1",
"message":"hello"
}
{
"id":"2",
"message":"hello"
}
{
"id":"3",
"message":"hello-bye"
}
Search Query:
{
"query": {
"bool": {
"must": [
{
"match": {
"id": "[1, 2, 3]"
}
},
{
"match": {
"message": "hello"
}
}
]
}
}
}
Search Result :
"hits": [
{
"_index": "foo14",
"_type": "_doc",
"_id": "1",
"_score": 1.5924306,
"_source": {
"id": "1",
"message": "hello"
}
},
{
"_index": "foo14",
"_type": "_doc",
"_id": "3",
"_score": 1.4903541,
"_source": {
"id": "3",
"message": "hello-bye"
}
},
{
"_index": "foo14",
"_type": "_doc",
"_id": "2",
"_score": 1.081605,
"_source": {
"id": "2",
"message": "hello"
}
}
]
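An alternative way to pass several ids is a terms query, which takes a real list instead of a bracketed string. The sketch below is an assumption rather than part of the answer above: prs_in_repo_query is a hypothetical helper, and it presumes each prid is indexed as a single term (for example a keyword or numeric field).

```python
def prs_in_repo_query(prids, repo_slug):
    """Build a bool query for several pull-request ids within one repo.

    The terms clause matches any of the ids; the match clause on
    repoSlug still has to hold for every hit (hypothetical helper).
    """
    return {
        "query": {
            "bool": {
                "must": [{"match": {"repoSlug": repo_slug}}],
                "filter": [{"terms": {"prid": [str(p) for p in prids]}}],
            }
        }
    }


pr_body = prs_in_repo_query([12343, 11234, 13421], "com.xxx.vserver")
```

The ids sit in filter because they only narrow the result set and do not need to influence the score.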

Elastic search ---: MUST_NOT query not working

I have a query to which I want to add a must_not clause that discards all records that have blank data for some field. I tried many approaches but none worked. When I issue the same query (shown below) with other specific fields, it works fine.
This query should return all records whose "registrationType1" field is not empty/blank.
query:
{
"size": 20,
"_source": [
"registrationType1"
],
"query": {
"bool": {
"must_not": [
{
"term": {
"registrationType1": ""
}
}
]
}
}
}
The results below still contain "registrationType1" with empty values.
results:
"_source": {
"registrationType1": ""}}
, {
"_index": "oh_animal",
"_type": "animals",
"_id": "3842002",
"_score": 1,
"_source": {
"registrationType1": "A&R"}}
, {
"_index": "oh_animal",
"_type": "animals",
"_id": "3842033",
"_score": 1,
"_source": {
"registrationType1": "AMHA"}}
, {
"_index": "oh_animal",
"_type": "animals",
"_id": "3842213",
"_score": 1,
"_source": {
"registrationType1": "AMHA"}}
, {
"_index": "oh_animal",
"_type": "animals",
"_id": "3842963",
"_score": 1,
"_source": {
"registrationType1": ""}}
, {
"_index": "oh_animal",
"_type": "animals",
"_id": "3869063",
"_score": 1,
"_source": {
"registrationType1": ""}}
Please find below the mapping for the field above:
"registrationType1": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
}
You need to use the keyword subfield in order to do this:
{
"size": 20,
"_source": [
"registrationType1"
],
"query": {
"bool": {
"must_not": [
{
"term": {
"registrationType1.keyword": "" <-- change this
}
}
]
}
}
}
If a text field holds an empty string, there is nothing to analyze, so no term is indexed for it and a term query for "" cannot match those documents.
Similarly, if you replace must_not with must, the query returns no results.
Given your mapping, what you can do is run the must_not against the keyword subfield. Keyword fields are not analyzed, so the query returns the results you expect.
Query
POST myemptyindex/_search
{
"query": {
"bool": {
"must_not": [
{
"term": {
"registrationType1.keyword": ""
}
}
]
}
}
}
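Why the text field behaves differently from its keyword subfield can be illustrated in plain Python. This is only a rough sketch: analyze_text is a hypothetical stand-in for a text analyzer (lowercasing and splitting on non-word characters), not the actual Elasticsearch implementation.

```python
import re


def analyze_text(value):
    # Rough stand-in for a text analyzer: lowercase word tokens only.
    return [token.lower() for token in re.findall(r"\w+", value)]


def analyze_keyword(value):
    # A keyword field keeps the whole value as one term, even when empty.
    return [value]


empty_text_terms = analyze_text("")        # no terms to match against
empty_keyword_terms = analyze_keyword("")  # one term: the empty string
sample_text_terms = analyze_text("A&R")    # split into separate tokens

print(empty_text_terms, empty_keyword_terms, sample_text_terms)
```

With no terms produced for the empty text value, a term query for "" has nothing to hit, whereas the keyword subfield still carries the empty string as a term that must_not can exclude.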
Hope this helps!
I am using Elasticsearch version 7.2.
I replicated your data, ingested it into my index, and tried querying with and without .keyword.
I get the desired result when using ".keyword" in the field name: it does not return the docs that have registrationType1="".
Note: the query does not work without ".keyword".
I have added my sample code below; have a look and see if it helps.
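from elasticsearch import Elasticsearch

es = Elasticsearch()

# Elasticsearch 7.x mappings are typeless, so "properties" sits directly
# under "mappings"; the multi-field key is "fields" (plural).
# ignore=400 suppresses HTTP 400 responses, such as the one returned
# when the index already exists.
es.indices.create(index="test", ignore=400, body={
    "mappings": {
        "properties": {
            "registrationType1": {
                "type": "text",
                "fields": {
                    "keyword": {
                        "type": "keyword"
                    }
                }
            }
        }
    }
})

data = {
    "registrationType1": ""
}
# refresh=True makes the document visible to the search that follows.
es.index(index="test", body=data, id=1, refresh=True)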
search = es.search(index="test", body={
"size": 20,
"_source": [
"registrationType1"
],
"query": {
"bool": {
"must_not": [
{
"term": {
"registrationType1.keyword": ""
}
}
]
}
}
})
print(search)
Executing the above should not return any results, because the only document indexed has an empty value for the field.
There was an issue with the mappings themselves. I deleted the index and re-indexed it with new mappings, and it is working now.

Query for : How many elements of an array are matching in a document attribute in ElasticSearch

I have many documents with an attribute that is an array of values, like these:
{
"_index": "myindex",
"_type": "mytype",
"_id": "myid1",
"_source": {
"tags": [
"devid",
"batman",
"obama"
]
}
},
{
"_index": "myindex",
"_type": "mytype",
"_id": "myid2",
"_source": {
"tags": [
"devid",
"superman"
]
}
}
I have an array of elements like: ["devid", "batman", "pippo"]
I want to get all the documents matching at least one element of the array, sorted by how many elements are matched.
For example, I expect that myid1 will have a higher score than myid2.
How can I do this?
At the moment I'm "stuck" here:
{
"query": {
"function_score": {
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"terms": {
"tags": ["devid", "batman", "pippo"]
}
}
}
}
}
}
}
It only filters by terms and assigns a score of 1 to both.
I'm new to Elasticsearch, so any hint is welcome!
Using terms as a query instead of as a filter results in documents with more matching terms getting a higher score.
Example :
{
"query": {
"terms": {
"tags": [
"devid",
"batman",
"pippo"
]
}
}
}
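In more recent Elasticsearch versions, as far as I know, a terms query gives every matching document the same score, so the ranking described above may not appear. One way to make the score equal the number of matched tags is a bool query with one constant-scoring should clause per tag. The sketch below only builds the request body and is an assumption, not the original answer's method; count_matches_query and expected_score are hypothetical helpers.

```python
def count_matches_query(field, values):
    """Build a bool/should query that adds 1 to the score per matched value."""
    should = [
        # constant_score ignores term frequency, so each hit is worth 1.0.
        {"constant_score": {"filter": {"term": {field: value}}, "boost": 1.0}}
        for value in values
    ]
    # Require at least one match so unrelated documents are excluded.
    return {"query": {"bool": {"should": should, "minimum_should_match": 1}}}


def expected_score(doc_tags, values):
    # Plain-Python estimate of the score: the number of wanted values present.
    return float(sum(1 for value in values if value in doc_tags))


wanted = ["devid", "batman", "pippo"]
tags_body = count_matches_query("tags", wanted)
score_myid1 = expected_score(["devid", "batman", "obama"], wanted)
score_myid2 = expected_score(["devid", "superman"], wanted)
```

With the sample documents from the question, myid1 matches two of the wanted tags and myid2 matches one, so myid1 would be ranked first.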
