multiple search conditions in one query in es and distinguish the items according to the conditions - elasticsearch

For one use case I need to put multiple search conditions into a single query to reduce the number of queries we send.
However, I need to tell which condition each returned item matched.
Currently I do this with a function score query: each condition is assigned a different score, and I differentiate the results based on those scores.
However, the performance is not that good. We also now need the doc count for each condition.
Is there a better way to do this? I'm thinking of using an aggregation, but I'm not sure whether it can do this.
Thanks!
update:
curl -X GET 'localhost:9200/locations/_search?fields=_id&from=0&size=1000&pretty' -d '{
"query":{
"bool":{
"should":[
{
"filtered":{
"filter":{
"bool":{
"must":[{"term":{"city":"new york"}},{"term":{"state":"ny"}}]
}
}
}
},
{
"filtered":{
"filter":{
"bool":{
"must":[{"term":{"city":"los angeles"}},{"term":{"state":"ca"}}]
}
}
}
}
]
}
}}'

To answer the first part of your question, named queries are the best option.
For example:
{
"query": {
"bool": {
"should": [
{
"match": {
"field1": {
"query": "qbox",
"_name": "firstQuery"
}
}
},
{
"match": {
"field2": {
"query": "hosted Elasticsearch",
"_name": "secondQuery"
}
}
}
]
}
}
}
This returns an additional field called matched_queries on each hit, listing the named queries that the document matched.
You can find more info on named queries here
But this information can't be used for aggregation.
So you need to handle the second part of your question separately.
A filter aggregation for each query type would be the ideal solution here.
For example:
{
"query": {
"bool": {
"should": [
{
"match": {
"text": {
"query": "qbox",
"_name": "firstQuery"
}
}
},
{
"match": {
"source": {
"query": "elasticsearch",
"_name": "secondQuery"
}
}
}
]
}
},
"aggs": {
"firstQuery": {
"filter": {
"term": {
"text": "qbox"
}
}
},
"secondQuery": {
"filter": {
"term": {
"source": "elasticsearch"
}
}
}
}
}
You can find more on filter aggregation here
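As a rough sketch of how the two pieces could fit together on the client side, the Python below builds one request body from a list of conditions, giving every condition both a named should clause and a filter aggregation with the same name, and then reads the per-condition counts out of a response. The helper names (build_body, doc_counts), the condition labels, and the sample response are made up for illustration; the response is hand-written in the shape of the "aggregations" section, not real output.

```python
def build_body(conditions):
    """Build one search body from {label: {field: exact_value, ...}}."""
    should, aggs = [], {}
    for name, fields in conditions.items():
        terms = [{"term": {field: value}} for field, value in fields.items()]
        # Named clause: a hit that satisfies it lists `name` in matched_queries.
        should.append({"bool": {"filter": terms, "_name": name}})
        # Filter aggregation over the same clauses yields that condition's doc count.
        aggs[name] = {"filter": {"bool": {"filter": terms}}}
    return {"query": {"bool": {"should": should}}, "aggs": aggs}


def doc_counts(response):
    """Map each aggregation name to its doc_count."""
    return {name: agg["doc_count"] for name, agg in response["aggregations"].items()}


# Hypothetical labels; the fields and values come from the question's update.
conditions = {
    "new_york": {"city": "new york", "state": "ny"},
    "los_angeles": {"city": "los angeles", "state": "ca"},
}
body = build_body(conditions)

# Hand-written stand-in for the "aggregations" part of a search response.
sample_response = {
    "aggregations": {
        "new_york": {"doc_count": 12},
        "los_angeles": {"doc_count": 7},
    }
}
print(doc_counts(sample_response))
```

Generating the clause and the aggregation from the same condition keeps the two from drifting apart, which is easy to get wrong when they are written by hand.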

Related

What is the purpose of must nested in filter in Elasticsearch?

What's the difference between the following ES filter queries?
1. filter context for multiple query conditions:
{
"query": {
"bool": {
"filter": [
{ "term": { "status": "published" }},
{ "range": { "publish_date": { "gte": "2015-01-01" }}}
]
}
}
}
2. must in filter context:
{
"query": {
"bool": {
"filter": [
{
"bool": {
"must": [
{ "term": { "status": "published" }},
{ "range": { "publish_date": { "gte": "2015-01-01" }}}
]
}
}
]
}
}
}
The first query is for scenarios where you just want to combine filters on different fields with an AND operator. Clauses listed under filter this way are all required, so they are executed as an AND operation.
The second query, in your scenario, does exactly the same as the first (no difference, just two ways of doing the same thing). The reason the nested form is also allowed is to cover more complex filters that combine AND and OR in different ways.
Note that in Elasticsearch AND is represented by must clauses while OR is represented by should clauses.
Let's say I want all documents having
sales from department 101, or
sales from department 101B along with price >= 150.
You would probably end up writing the query as below:
POST sometestindex/_search
{
"query":{
"bool":{
"filter":[
{
"bool":{
"should":[
{
"term":{
"dept.keyword":"101"
}
},
{
"bool":{
"must":[
{
"term":{
"dept.keyword":"101B"
}
},
{
"range":{
"price":{
"gte":150
}
}
}
]
}
}
],
"minimum_should_match": 1
}
}
]
}
}
}
In short, for your scenario the first query is just a shorthand for the second. If you have more complex filter logic, you need to use a bool query inside your filter, as in your second query and in the example above.
Hope that clarifies!
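To make the AND/OR reading concrete, here is a small in-memory sketch in Python that evaluates term, range, and bool clauses against plain dicts. It is not how Elasticsearch executes queries; it only mimics the matching rules described above, treats a field name such as dept.keyword as a literal dict key, and supports only integer minimum_should_match. The function names and sample documents are made up for illustration.

```python
import operator

RANGE_OPS = {"gt": operator.gt, "gte": operator.ge,
             "lt": operator.lt, "lte": operator.le}


def as_list(clauses):
    """A bool section may hold a single clause or a list of clauses."""
    if clauses is None:
        return []
    return clauses if isinstance(clauses, list) else [clauses]


def matches(clause, doc):
    """Return True if the toy document satisfies the query clause."""
    (kind, body), = clause.items()
    if kind == "term":
        (field, value), = body.items()
        return doc.get(field) == value
    if kind == "range":
        (field, bounds), = body.items()
        value = doc.get(field)
        if value is None:
            return False
        return all(RANGE_OPS[op](value, limit) for op, limit in bounds.items())
    if kind == "bool":
        required = as_list(body.get("must")) + as_list(body.get("filter"))
        optional = as_list(body.get("should"))
        # With no must/filter clauses, at least one should clause has to match.
        default_min = 1 if optional and not required else 0
        needed = body.get("minimum_should_match", default_min)
        return (
            all(matches(c, doc) for c in required)
            and not any(matches(c, doc) for c in as_list(body.get("must_not")))
            and sum(matches(c, doc) for c in optional) >= needed
        )
    raise ValueError(f"unsupported clause: {kind}")


flat = {"bool": {"filter": [
    {"term": {"status": "published"}},
    {"range": {"publish_date": {"gte": "2015-01-01"}}},
]}}
nested = {"bool": {"filter": [{"bool": {"must": flat["bool"]["filter"]}}]}}
docs = [
    {"status": "published", "publish_date": "2016-05-01"},
    {"status": "published", "publish_date": "2014-05-01"},
    {"status": "draft", "publish_date": "2016-05-01"},
]
same = all(matches(flat, d) == matches(nested, d) for d in docs)

dept_filter = {"bool": {"filter": [{"bool": {
    "should": [
        {"term": {"dept.keyword": "101"}},
        {"bool": {"must": [
            {"term": {"dept.keyword": "101B"}},
            {"range": {"price": {"gte": 150}}},
        ]}},
    ],
    "minimum_should_match": 1,
}}]}}
sales = [
    {"dept.keyword": "101", "price": 10},
    {"dept.keyword": "101B", "price": 200},
    {"dept.keyword": "101B", "price": 100},
]
dept_results = [matches(dept_filter, s) for s in sales]
print(same, dept_results)
```

The flat and nested forms agree on every sample document, while the department filter accepts the first two sales and rejects the third.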

Elasticsearch: How to combine regex query with filter

I have a search that in some situations needs to use a regex query:
GET my-index/_search
{
"query": {
"regexp":{
"name":".*something.*"
}
}
}
And sometimes needs to be filtered, like so:
GET /my-index/_search
{
"query":{
"bool":{
"filter":[
{
"term":{
"createdByEmail.keyword":"me.email@example.com"
}
}
]
}
}
}
I want to combine these two so that it only shows me results where the name matches the regex AND the createdByEmail matches the email address I'm sending in.
You can add the first query inside a must clause of the second, as below:
{
"query": {
"bool": {
"must": [
{
"regexp": {
"name": ".*something.*"
}
}
],
"filter": [
{
"term": {
"createdByEmail.keyword": "me.email@example.com"
}
}
]
}
}
}
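Since the question says the regex and the filter are each needed only some of the time, one possible way to assemble the body in application code is sketched below in Python. The function name build_search and the sample email address are hypothetical; the field names are the ones used in the question.

```python
def build_search(name_regex=None, created_by=None):
    """Build a search body with an optional regexp clause and an optional term filter."""
    bool_query = {}
    if name_regex is not None:
        # must: the name has to match the regular expression.
        bool_query["must"] = [{"regexp": {"name": name_regex}}]
    if created_by is not None:
        # filter: exact match on the keyword sub-field, no effect on scoring.
        bool_query["filter"] = [{"term": {"createdByEmail.keyword": created_by}}]
    if not bool_query:
        # Neither condition supplied: return everything.
        return {"query": {"match_all": {}}}
    return {"query": {"bool": bool_query}}


both = build_search(".*something.*", "user@example.com")
regex_only = build_search(name_regex=".*something.*")
print(both)
```

Only the clauses that were actually requested end up in the bool query, so the same function covers the regex-only, filter-only, and combined cases.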

elasticsearch multi field query is not working as expected

I've been facing some issues with a multi-field Elasticsearch query. I am trying to fetch all documents whose func_name field matches either of two hard-coded strings. My index has documents with both function names, but the query result always contains only one func_name. So far I have tried the following queries.
1) The following returns only one function match, even though documents with the other function exist:
GET /_search
{
"query": {
"multi_match": {
"query": "FEM_DS_GetTunerStatusInfo MDM_TunerStatusPrint",
"operator": "OR",
"fields": [
"func_name"
]
}
}
}
2) The following intermittently gives me both functions:
GET /_search
{
"query": {
"match": {
"func_name": {
"query": "MDM_TunerStatusPrint FEM_DS_GetTunerStatusInfo",
"operator": "or"
}
}
}
}
3) The following returns only one function match, even though documents with the other function exist:
{
"query": {
"bool": {
"should": [
{ "match": { "func_name": "FEM_DS_GetTunerStatusInfo" }},
{ "match": { "func_name": "MDM_TunerStatusPrint" }}
]
}
}
}
Any help is much appreciated.
Thanks for your reply. Let's assume that I have the following kinds of documents in my Elasticsearch index. I want my search to return the first two documents, as they match my func_name.
{
"_index": "diag-178999",
"_source": {
"severity": "MIL",
"t_id": "03468500",
"p_id": "000007c6",
"func_name": "MDM_TunerStatusPrint",
"timestamp": "2017-06-01T02:04:51.000Z"
}
},
{
"_index": "diag-344563",
"_source": {
"t_id": "03468500",
"p_id": "000007c6",
"func_name": "FEM_DS_GetTunerStatusInfo",
"timestamp": "2017-07-20T02:04:51.000Z"
}
},
{
"_index": "diag-101010",
"_source": {
"severity": "MIL",
"t_id": "03468500",
"p_id": "000007c6",
"func_name": "some_func",
"timestamp": "2017-09-15T02:04:51.000Z"
}
}
The two best ways to query ES here are to filter by terms on a particular field, or to use an aggregation so that you can name the result, apply multiple rules, and get a more understandable response format.
See: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html and the general aggregations page, which is very useful:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations.html
In your case, you should do:
{
"from": 0, "size": 2,
"query": {
"bool": {
"filter": {
"terms": {
"func_name": ["FEM_DS_GetTunerStatusInfo", "MDM_TunerStatusPrint"]
}
}
}
}
}
OR
{
"aggs": {
"aggregationName": {
"terms": {
"field": "func_name",
"include": ["FEM_DS_GetTunerStatusInfo", "MDM_TunerStatusPrint"]
}
}
}
}
The aggregation at the end is just there to show how to group on the same func_name values that the query filter selects. Let me know if it's working :)
Best regards
As I understand it, you should use a filtered query to match any document with one of the func_name values mentioned above:
{
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"terms": {
"func_name": [
"FEM_DS_GetTunerStatusInfo",
"MDM_TunerStatusPrint"
]
}
}
]
}
}
}
}
}
See:
Filtered Query, Terms Query
UPDATE in ES 5.0:
{
"query": {
"bool": {
"must": [
{
"terms": {
"func_name": [
"FEM_DS_GetTunerStatusInfo",
"MDM_TunerStatusPrint"
]
}
}
]
}
}
}
See: this answer
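One hedged way to check whether a response really contains both function names is to count the returned hits per func_name on the client. The Python sketch below assumes the usual hits.hits layout with _source included, as in the sample documents from the question; the helper name and the trimmed sample response are made up for illustration. Keep in mind that it only sees the page of hits that was returned, which is 10 by default unless size is set.

```python
from collections import Counter


def hits_per_func_name(response):
    """Count returned hits by their _source.func_name value."""
    return Counter(hit["_source"].get("func_name") for hit in response["hits"]["hits"])


# Hand-written response fragment based on the documents in the question.
sample_response = {"hits": {"hits": [
    {"_index": "diag-178999", "_source": {"func_name": "MDM_TunerStatusPrint"}},
    {"_index": "diag-344563", "_source": {"func_name": "FEM_DS_GetTunerStatusInfo"}},
]}}

counts = hits_per_func_name(sample_response)
wanted = ["FEM_DS_GetTunerStatusInfo", "MDM_TunerStatusPrint"]
# A Counter reports 0 for names that never appeared in the hits.
missing = [name for name in wanted if counts[name] == 0]
print(counts, missing)
```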

How to know which keywords matched in Elasticsearch

Say that I query:
POST /story/story/_search
{
"query":{
"bool":{
"should":[
{
"match":{
"termVariations":{
"query":"not driving",
"type":"boolean",
"operator":"AND"
}
}
},
{
"match":{
"termVariations":{
"query":"driving",
"type":"boolean",
"operator":"AND"
}
}
}
]
}
}
}
This query returned 3 documents, matched by one analyzer or another.
How do I tell which should clause was matched? Can Elasticsearch return the matched phrase along with the result?
Thanks!
The best option here would be named queries.
You can name each query, and the names of the queries that matched are returned with each document.
{
"query": {
"bool": {
"should": [
{
"match": {
"name.first": {
"query": "qbox",
"_name": "first"
}
}
},
{
"match": {
"name.last": {
"query": "search",
"_name": "last"
}
}
}
]
}
}
}
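On the client side, the per-hit matched_queries list can be turned into a lookup from query name to document ids. The Python below is a minimal sketch of that, assuming the response has the usual hits.hits layout; the function name and the sample response are hand-written for illustration and are not real output.

```python
from collections import defaultdict


def hits_by_query_name(response):
    """Group hit ids under each named query listed in matched_queries."""
    groups = defaultdict(list)
    for hit in response["hits"]["hits"]:
        # Hits without the field simply contribute to no group.
        for name in hit.get("matched_queries", []):
            groups[name].append(hit["_id"])
    return dict(groups)


# Hand-written response fragment using the names "first" and "last" from above.
sample_response = {"hits": {"hits": [
    {"_id": "1", "matched_queries": ["first"]},
    {"_id": "2", "matched_queries": ["first", "last"]},
    {"_id": "3", "matched_queries": ["last"]},
]}}

groups = hits_by_query_name(sample_response)
print(groups)
```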
Thanks @keety! highlight was exactly what I was looking for!! :-)

Elastic search filtered query, query part being ignored?

I'm building up the following search in code. The idea is that it filters down the set of matches and then runs the query against that set, so I can add score based on certain fields. For some reason the filter part works, but whatever I put in the query (e.g. below I query sdfsdfsdf, which does not exist in my index) it still returns anything matching the filter.
Is the syntax wrong?
{
"query":{
"filtered":{
"query":{
"bool":{
"must":{
"match":{
"sdfsdfsdf":{
"query":"4",
"boost":2.0
}
}
}
},
"filter":{
"bool":{
"must":[
{
"terms":{
"_id":[
"55f93ead5df34f1900abc20b",
"55f8ab0226ec4bb216d7c938",
"55dc4e949dcf833308c63d6b"
]
}
},
{
"range":{
"published_date":{
"lte":"now"
}
}
}
],
"must_not":{
"terms":{
"_id":[
"55f0a799acccc28204a5058c"
]
}
}
}
}
}
}
}
}
Your filter is not at the right level. It should not be inside query but at the same level as query like this:
{
"query": {
"filtered": {
"query": { <--- query and filter at the same level
"bool": {
"must": {
"match": {
"sdfsdfsdf": {
"query": "4",
"boost": 2
}
}
}
}
},
"filter": { <--- query and filter at the same level
"bool": {
"must": [
{
"terms": {
"_id": [
"55f93ead5df34f1900abc20b",
"55f8ab0226ec4bb216d7c938",
"55dc4e949dcf833308c63d6b"
]
}
},
{
"range": {
"published_date": {
"lte": "now"
}
}
}
],
"must_not": {
"terms": {
"_id": [
"55f0a799acccc28204a5058c"
]
}
}
}
}
}
}
}
You need to replace sdfsdfsdf with an existing field name in your type, e.g. title; otherwise I think it will fall back to a match_all query.
"match":{
"title":{
"query": "some text here",
"boost":2.0
}
}
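Note that the filtered query used in this thread was deprecated in Elasticsearch 2.0 and removed in 5.0. On newer versions the usual replacement is a bool query with the scoring part under must and the non-scoring part under filter. The Python below is a small sketch of that rewrite for a top-level filtered query; the function name filtered_to_bool is made up for illustration, and nested filtered queries are not handled.

```python
def filtered_to_bool(query):
    """Rewrite a top-level {"filtered": ...} query as a bool query."""
    if "filtered" not in query:
        return query  # nothing to convert
    filtered = query["filtered"]
    bool_query = {}
    if "query" in filtered:
        # The scoring part moves to must.
        bool_query["must"] = filtered["query"]
    if "filter" in filtered:
        # The non-scoring part moves to filter.
        bool_query["filter"] = filtered["filter"]
    return {"bool": bool_query}


old_query = {"filtered": {
    "query": {"match": {"title": {"query": "some text here", "boost": 2.0}}},
    "filter": {"range": {"published_date": {"lte": "now"}}},
}}
new_query = filtered_to_bool(old_query)
print({"query": new_query})
```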
