Elasticsearch query to match on one field but filter results based on another field - elasticsearch

Looking for help formulating a query for the following use case. I have a text field and an integer field that holds an ID. I need to search for matching text, but the response should contain only results that match one of a given set of IDs. For example, I have two fields: a product ID, which is text, and an owner ID, which is an integer. Owners must be allowed to view only the products they own. In addition, one owner can have multiple IDs.
Sample records in Elasticsearch:
{
"product": "MKL89ADH12",
"ownerId" : 98765
},
{
"product": "POIUD780H",
"ownerId" : 12345
},
{
"product": "UJK87TG89",
"ownerId" : 98765
},
{
"product": "897596YHJ",
"ownerId" : 98765
},
{
"product": "LKGGN764HH",
"ownerId" : 784512
}
If 98765 and 12345 belong to the same owner, that owner should be able to view only the first 4 products, and no results should be returned if they search for LKGGN764HH.
I tried the following query, but it returns no results.
{
"size": 24,
"query": {
"bool": {
"must":[{
"match" : {
"product": {
"query": "MKL89ADH12"
}
}
},
{
"match" : {
"product": {
"query": "LKGGN764HH"
}
}
},
{
"terms": {
"ownerId": [98765, 12345],
"boost": 1.0
}
}
]
}
}
}
I am expecting the response to contain MKL89ADH12 because I am filtering by the ownerId. Looking for help formulating the right query for my use case.

You need to filter on "ownerId": [98765, 12345] and then use a "should" clause with "minimum_should_match": 1 to return documents that match any of the product texts.
{
"size": 24,
"query": {
"bool": {
"filter": [
{
"terms": {
"ownerId": [
98765,
12345
],
"boost": 1
}
}
],
"should": [
{
"match": {
"product": {
"query": "MKL89ADH12"
}
}
},
{
"match": {
"product": {
"query": "LKGGN764HH"
}
}
}
],
"minimum_should_match": 1
}
}
}
The above query translates to
select * from index
where ownerid in (98765, 12345)
AND (product IN ("MKL89ADH12", "LKGGN764HH"))
while yours works like
select * from index
where ownerid in (98765, 12345)
AND product = "MKL89ADH12"
AND product = "LKGGN764HH"
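The difference between the two queries can be checked without a cluster. The following is a minimal Python sketch that emulates the two bool queries on the sample records from the question. It is not Elasticsearch code: the function names are hypothetical, and a match query is simplified to exact string equality, which ignores analysis.

```python
# Sample records from the question.
RECORDS = [
    {"product": "MKL89ADH12", "ownerId": 98765},
    {"product": "POIUD780H", "ownerId": 12345},
    {"product": "UJK87TG89", "ownerId": 98765},
    {"product": "897596YHJ", "ownerId": 98765},
    {"product": "LKGGN764HH", "ownerId": 784512},
]


def search_filter_should(records, owner_ids, products):
    """Emulate bool.filter (terms on ownerId) plus bool.should
    with minimum_should_match: 1."""
    return [
        r for r in records
        if r["ownerId"] in owner_ids                   # filter: every hit must pass
        and any(r["product"] == p for p in products)   # should: at least one matches
    ]


def search_all_must(records, owner_ids, products):
    """Emulate the original query, where every clause sits in bool.must."""
    return [
        r for r in records
        if r["ownerId"] in owner_ids
        and all(r["product"] == p for p in products)   # must: all clauses at once
    ]


owner_ids = {98765, 12345}
wanted = ["MKL89ADH12", "LKGGN764HH"]

fixed_hits = search_filter_should(RECORDS, owner_ids, wanted)
original_hits = search_all_must(RECORDS, owner_ids, wanted)

print([r["product"] for r in fixed_hits])
print(original_hits)
```

In this emulation the filter plus should form returns only MKL89ADH12, because LKGGN764HH belongs to a different owner, while the all-must form returns nothing, because no single product can equal both values.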

Related

How can we sort records by a specific value of a field in Elasticsearch

We want to sort records by a specific value of a field. For example:
We have data with country code, name, and other details, and we want records with country code 'US' at the top, followed by records with country code 'AR'.
So if we search for obama, all obama records from US come first, then the obama records from AR. Within each country we also want to sort the records by a rating score.
I am trying a filter query with boost but not getting the expected data: the filter only restricts which records are returned, whereas we want to order the records by the boost for each specific value of the country field.
{
"query": {
"bool": {
"must": [
{
"bool": {
"must": [
{
"match_phrase_prefix": {
"name": {
"query": "obama"
}
}
}
],
"boost": 2.0
}
}
],
"filter": {
"bool": {
"should": [
{
"term": {
"countryCode": {
"value": "US",
"boost": 4
}
}
},
{
"term": {
"countryCode": {
"value": "AR",
"boost": 3
}
}
},
{
"term": {
"countryCode": {
"value": "ES",
"boost": 2
}
}
}
]
}
}
}
},
"size": 50,
"sort": [
{
"rating": {
"order": "desc"
}
},
{
"_score": {
"order": "desc"
}
}
]
}
Expectation:
All records that belong to country US should appear at the top, sorted by rating
All records that belong to country AR should appear after the US records, sorted by rating
All records that belong to country ES should appear after the AR records, sorted by rating
Expected example:
[
{name:"obama a", countryCode:us, rating:5}
{name:"obama b", countryCode:us, rating:4}
{name:"obama ac", countryCode:ar, rating:3}
{name:"obama ess", countryCode:es, rating:3.5}
]
If you want to tune the score without dropping documents, you can use should.
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html
must
The clause (query) must appear in matching documents and will
contribute to the score.
filter
The clause (query) must appear in matching documents. However unlike
must the score of the query will be ignored. Filter clauses are
executed in filter context, meaning that scoring is ignored and
clauses are considered for caching.
should
The clause (query) should appear in the matching document.
must_not
The clause (query) must not appear in the matching documents. Clauses
are executed in filter context meaning that scoring is ignored and
clauses are considered for caching. Because scoring is ignored, a
score of 0 for all documents is returned.
Here is an example:
POST test_stackoverflow_us/_bulk?refresh=true&pretty
{ "index": {}}
{"name":"obama a", "countryCode":"us", "rating":5}
{ "index": {}}
{"name":"obama b", "countryCode":"us", "rating":4}
{ "index": {}}
{"name":"obama ac", "countryCode":"ar", "rating":3}
{ "index": {}}
{"name":"obama ess", "countryCode":"es", "rating":3.5}
GET test_stackoverflow_us/_search
{
"query": {
"bool": {
"must": [
{
"bool": {
"must": [
{
"match_phrase_prefix": {
"name": {
"query": "obama"
}
}
}
],
"boost": 2
}
}
],
"should": [
{
"term": {
"countryCode": {
"value": "us",
"boost": 4
}
}
},
{
"term": {
"countryCode": {
"value": "ar",
"boost": 3
}
}
},
{
"term": {
"countryCode": {
"value": "es",
"boost": 2
}
}
}
]
}
},
"size": 50,
"sort": [
{
"rating": {
"order": "desc"
}
},
{
"_score": {
"order": "desc"
}
}
]
}
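Elasticsearch applies sort criteria in the order they are listed, so with rating ahead of _score the boost only breaks ties between equal ratings. The following Python sketch is an emulation, not a call to Elasticsearch, and contrasts the two orderings on the four sample documents. COUNTRY_BOOST is a hypothetical stand-in for the score the should clauses would add. A real _score also includes the text-match contribution, which this sketch ignores, so ordering within a country is not guaranteed to follow rating alone.

```python
# The four sample documents indexed above.
DOCS = [
    {"name": "obama a", "countryCode": "us", "rating": 5},
    {"name": "obama b", "countryCode": "us", "rating": 4},
    {"name": "obama ac", "countryCode": "ar", "rating": 3},
    {"name": "obama ess", "countryCode": "es", "rating": 3.5},
]

# Hypothetical stand-in for the score added by the boosted should clauses.
COUNTRY_BOOST = {"us": 4, "ar": 3, "es": 2}


def country_score(doc):
    """Return the boost for the document's country, or 0 if it has none."""
    return COUNTRY_BOOST.get(doc["countryCode"], 0)


# Sort as written in the query: rating first, score as tie-breaker.
rating_first = sorted(DOCS, key=lambda d: (-d["rating"], -country_score(d)))

# Sort with the criteria swapped: score first, rating as tie-breaker.
country_first = sorted(DOCS, key=lambda d: (-country_score(d), -d["rating"]))

print([d["name"] for d in rating_first])
print([d["name"] for d in country_first])
```

Only the second ordering reproduces the expected example from the question, where "obama ac" (ar, rating 3) comes before "obama ess" (es, rating 3.5). If that order is what is wanted, listing _score before rating in the sort array is the corresponding change, subject to the caveat above. Since term queries are not analyzed, the boosted values also need to match the indexed tokens exactly, including case.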

Match query on Elasticsearch with multiple OR conditions

I have three fields: status, type, and search. I want to find the data where status equals NEW or IN PROGRESS, type equals abc or xyz, and search contains a given text (partial match).
My call looks like this:
{
"query": {
"bool" : {
"must" : [{
"match": {
"status": {
"query": "abc",
}
}
}, {
"match": {
"type": {
"query": "NEW",
}
}
},{
"query_string": {
"query": "*abc*", /* for partial search */
"fields": ["title", "name"]
}
}]
}
}
}
Nest your bool queries. I think what you are missing is this:
"bool": { "should": [
{ "match": { "status": "abc" } },
{ "match": { "status": "xyz" } }
]}
This query MUST match at least one of its should clauses, because only should clauses are given.
EDIT to explain the differences:
{
"query": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"match": {
"status": "abc"
}
},
{
"match": {
"status": "xyz"
}
}
]
}
},
{
"terms": {
"type": [
"NEW",
"IN_PROGRESS"
]
}
},
{
"query_string": {
"query": "*abc*",
"fields": [
"title",
"name"
]
}
}
]
}
}
}
So you have a bool query at the top. Each of the three inner queries must be true.
The first is a nested bool query, which is true if status matches either abc or xyz.
The second is true if type matches exactly NEW or IN_PROGRESS. Note the difference here: the first one would also match ABC or aBc or potentially "abc XYZ", depending on your analyzer. You might want terms for both.
The third is what you had before.
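When the lists of allowed values vary, the same nested structure can be generated rather than written by hand. This is a small Python sketch that builds the request body as a plain dict. The function name and parameters are hypothetical, and sending the body to a cluster is left out.

```python
import json


def build_query(statuses, types, text, fields):
    """Build a bool query: (status is any of statuses) AND
    (type is exactly one of types) AND (fields contain text)."""
    # Nested bool with only should clauses: at least one must match.
    status_clause = {
        "bool": {"should": [{"match": {"status": s}} for s in statuses]}
    }
    # terms matches the exact, unanalyzed values.
    type_clause = {"terms": {"type": list(types)}}
    # Wildcards on both sides give the partial match from the question.
    text_clause = {
        "query_string": {"query": f"*{text}*", "fields": list(fields)}
    }
    return {"query": {"bool": {"must": [status_clause, type_clause, text_clause]}}}


query = build_query(
    statuses=["abc", "xyz"],
    types=["NEW", "IN_PROGRESS"],
    text="abc",
    fields=["title", "name"],
)
print(json.dumps(query, indent=2))
```

The printed JSON has the same shape as the query above, so it can be compared against it line by line.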

Elasticsearch - Aggregations on part of bool query

Say I have this bool query:
"bool" : {
"should" : [
{ "term" : { "FirstName" : "Sandra" } },
{ "term" : { "LastName" : "Jones" } }
],
"minimum_should_match" : 1
}
meaning I want to match all the people with first name Sandra OR last name Jones.
Now, is there any way that I can perform an aggregation on only the documents that matched the first term?
For example, I want to get all of the unique values of "Prizes" that anybody named Sandra has. Normally I'd just do:
"query": {
"match": {
"FirstName": "Sandra"
}
},
"aggs": {
"Prizes": {
"terms": {
"field": "Prizes"
}
}
}
Is there any way to combine the two so I only have to perform a single query which returns all of the people with first name Sandra or last name Jones, AND an aggregation only on the people with first name Sandra?
Thanks a lot!
Use post_filter.
Please refer to the following query. post_filter makes sure that your bool should clause doesn't affect your aggregation scope.
Aggregations are scoped by the main query, but they are unaffected by post_filter.
{
"from": 0,
"size": 20,
"aggs": {
"filtered_lastname": {
"filter": {
"query": {
"match": {
"FirstName": "sandra"
}
}
},
"aggs": {
"prizes": {
"terms": {
"field": "Prizes",
"size": 10
}
}
}
}
},
"post_filter": {
"bool": {
"should": [{
"term": {
"FirstName": "Sandra"
}
}, {
"term": {
"LastName": "Jones"
}
}],
"minimum_should_match": 1
}
}
}
Running a filter inside the aggs before aggregating on prizes can help you achieve your desired use case.
Thanks
Hope this helps
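To make the scoping concrete, here is a rough Python emulation of the request above. It is not Elasticsearch code: the PEOPLE records and helper names are made up for illustration, and term and match are both simplified to exact equality.

```python
from collections import Counter

# Hypothetical sample data; not from the question.
PEOPLE = [
    {"FirstName": "Sandra", "LastName": "Smith", "Prizes": ["Gold"]},
    {"FirstName": "Sandra", "LastName": "Jones", "Prizes": ["Silver", "Gold"]},
    {"FirstName": "Tom", "LastName": "Jones", "Prizes": ["Bronze"]},
    {"FirstName": "Ann", "LastName": "Lee", "Prizes": ["Gold"]},
]


def is_sandra(person):
    return person["FirstName"] == "Sandra"


def is_jones(person):
    return person["LastName"] == "Jones"


# The request has no main query, so the aggregation scope is every document.
agg_scope = PEOPLE

# Filter aggregation: keep only Sandra, then count each distinct prize.
prize_buckets = Counter(
    prize for p in agg_scope if is_sandra(p) for prize in p["Prizes"]
)

# post_filter: applied to the hits only, after aggregations are computed.
hits = [p for p in PEOPLE if is_sandra(p) or is_jones(p)]

print(dict(prize_buckets))
print([(p["FirstName"], p["LastName"]) for p in hits])
```

In this emulation Tom Jones appears in the hits, but his Bronze prize does not appear in the buckets, which is the separation the question asks for.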

How to get distinct values after a query in Elasticsearch

I use Elasticsearch like this:
{
"query": {
"match_phrase": {
"title": "my title"
}
},
"aggs": {
"unique_title": {
"cardinality": {
"field": "title"
}
}
}
}
I just want the equivalent of this SQL:
select distinct title from table where title like '%my title%'
The result gives me multiple identical results; "cardinality" didn't work with "query".
If you don't understand me, please forgive my poor English ^_^
Cardinality aggregation calculates the count of distinct values for a field.
Hence the equivalent sql query for the elasticsearch query you wrote would look like:
select count(distinct title) from table where title like '%my title%'
What you need to use is the Terms aggregation for getting the distinct titles.
{
"query": {
"match_phrase": {
"title": "my title"
}
},
"aggs": {
"unique_title": {
"terms": {
"field": "title"
}
}
}
}
And you need to look into the "aggregations" section of the search response to get the distinct values in the "buckets" array.
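Reading the buckets on the client side could look like the sketch below. The response dict is a hypothetical, trimmed example of the shape a terms aggregation returns (aggregations, then the aggregation name, then buckets with key and doc_count). The titles and counts are invented, and distinct_values is an illustrative helper.

```python
# Hypothetical, trimmed search response; only the fields used below are shown.
response = {
    "aggregations": {
        "unique_title": {
            "buckets": [
                {"key": "my title", "doc_count": 2},
                {"key": "my title 2", "doc_count": 1},
            ]
        }
    },
}


def distinct_values(resp, agg_name):
    """Return the bucket keys of a terms aggregation, one per distinct value."""
    return [bucket["key"] for bucket in resp["aggregations"][agg_name]["buckets"]]


titles = distinct_values(response, "unique_title")
print(titles)        # the distinct values, like SELECT DISTINCT
print(len(titles))   # a count like the one a cardinality aggregation reports
```

A terms aggregation returns only the top buckets (10 by default), so its "size" parameter may need raising when there are many distinct titles.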
You can use the query below to get the expected result:
GET my_index/my_type/_search
{
"from": 0,
"size": 200,
"query": {
"filtered": {
"filter": {
"bool": {
"must": {
"query": {
"wildcard": {
"title": "*my title*"
}
}
}
}
}
}
},
"_source": {
"includes": [
"title"
],
"excludes": []
}
}

elasticsearch inner join

I have an index with several fields. Some documents contain valid "category" data and also "url" data (an analyzed field), but do not contain respsize.
On the other hand, documents that contain "respsize" data (greater than 0) also contain "url" data but do not contain "category" data.
I think you get the point: I need a join or intersection, that is, a query that returns all documents containing respsize and category that share the same url.
Here is what I did so far (the url field is analyzed, the rest are not_analyzed).
Here are the documents that have category:
And other documents have respsize; I need to combine them based on url.
I need a DSL query that returns records that have the same url token (in this scenario it will be www.domainname.com) with category and respsize merged.
I simply want the field "category":"27" from the first image to appear in the second image as well, along with all the other fields.
Here is my query, but it does not work:
GET webproxylog/accesslog/_search
{
"query": {
"filtered": {
"filter" : {
"and" : {
"filters": [
{
"not": {
"filter": {
"terms": {
"category": [
"-",
"-1",
"0"
]
},
"term": {
"respsize": "0"
}
}
},
"term": {
"category": "www.hurriyet.com.tr"
}
}
],
"_cache" : true
}
}
}
},
"sort": [
{
"respsize": {
"order": "desc"
}
}
]
}
You can try the query below. It requires the url field to be the one you specify (i.e. must), and then at least one of the next two clauses (i.e. should) must be true: either category is not one of the given terms, or respsize is greater than 0.
{
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"term": {
"url": "www.hurriyet.com.tr"
}
}
],
"should": [
{
"not": {
"terms": {
"category": [
"-",
"-1",
"0"
]
}
}
},
{
"range": {
"respsize": {
"gt": 0
}
}
}
]
}
}
}
}
}
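The logic described for this filter can be written out as a plain predicate. The following Python sketch assumes the semantics stated in the answer (the must clause plus at least one should clause) and treats the term filter on url as exact equality. The sample documents and the function name are hypothetical.

```python
EXCLUDED_CATEGORIES = {"-", "-1", "0"}


def passes_filter(doc, url):
    """Emulate: url must match AND (category not excluded OR respsize > 0)."""
    url_ok = doc.get("url") == url
    # A "not terms" clause also matches documents with no category field at all.
    category_ok = doc.get("category") not in EXCLUDED_CATEGORIES
    respsize_ok = doc.get("respsize", 0) > 0
    return url_ok and (category_ok or respsize_ok)


# Hypothetical documents for illustration only.
URL = "www.hurriyet.com.tr"
docs = [
    {"url": URL, "category": "27"},                      # kept: valid category
    {"url": URL, "category": "-", "respsize": 300},      # kept: respsize > 0
    {"url": URL, "category": "0", "respsize": 0},        # dropped: both fail
    {"url": "www.example.com", "category": "27"},        # dropped: other url
    {"url": URL},                                        # kept: category missing
]

kept = [passes_filter(d, URL) for d in docs]
print(kept)
```

The last document shows a side effect worth checking against real data: a document with no category field satisfies the "not terms" clause on its own, whatever its respsize.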
