Elasticsearch: Conditionally filter query on fields if they exist in multi-index query

I have a general search query that spans multiple indices. Some of the indices have a field called is_published, some have a field called date_review, and some have both.
I'm struggling to write a query that searches across fields and filters on the fields mentioned above, but only if they exist. I have managed to achieve what I want on the individual fields using missing and/or exists, but that excludes the other variants.
In English, I want to keep documents in the result where:
is_published is true OR the field does not exist
date_review is in the future OR the field does not exist
So, if a document has is_published and it's false, remove it. If a document has date_review in the past, remove it. If it has is_published == false and date_review is in the future, remove it.
I hope this makes sense?
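The rule above can be stated as a small predicate. The following Python sketch only illustrates the intended logic and is not an Elasticsearch query; the function name `keep_document`, the sample documents, and the fixed "today" date are assumptions made for the example.

```python
from datetime import date


def keep_document(doc: dict, today: date) -> bool:
    """Return True if the document should stay in the results."""
    # is_published must be true, unless the field is absent.
    published_ok = "is_published" not in doc or doc["is_published"] is True
    # date_review must be in the future, unless the field is absent.
    review_ok = (
        "date_review" not in doc
        or date.fromisoformat(doc["date_review"]) > today
    )
    return published_ok and review_ok


today = date(2016, 6, 1)  # hypothetical "now" for the example
docs = [
    {"label": "a", "is_published": True},
    {"label": "b", "date_review": "2017-01-01"},
    {"label": "c", "is_published": False, "date_review": "2017-01-01"},
    {"label": "d", "date_review": "2015-01-01"},
]
kept = [d["label"] for d in docs if keep_document(d, today)]
print(kept)  # ['a', 'b']
```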
For the purpose of answering, assume the documents might look like this:
// Has `is_published` flag
{
"label": "My document",
"body": "Lorem ipsum doler et sum.",
"is_published": true
}
// Has `date_review` flag
{
"label": "My document",
"body": "Lorem ipsum doler et sum.",
"date_review": "2017-01-01"
}
// Has both `is_published` and `date_review` flags
{
"label": "My document",
"body": "Lorem ipsum doler et sum.",
"is_published": true,
"date_review": "2017-01-01"
}
At the moment, my [unfiltered] query looks like this:
{
"index": "index-1,index-2,index-3",
"type": "item",
"body": {
"query": {
"filtered": {
"query": {
"multi_match": {
"query": "my search phrase",
"type": "phrase_prefix",
"fuzziness": null,
"fields": [
"label^3",
"body"
]
}
},
"filter": []
}
}
}
}
Very grateful for any pointers.
Thanks.

You can try a query like this one:
{
"query": {
"filtered": {
"query": {
"multi_match": {
"query": "my search phrase",
"type": "phrase_prefix",
"fuzziness": null,
"fields": [
"label^3",
"body"
]
}
},
"filter": {
"bool": {
"must": [
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"missing": {
"field": "is_published"
}
},
{
"term": {
"is_published": true
}
}
]
}
},
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"missing": {
"field": "date_review"
}
},
{
"range": {
"date_review": {
"gt": "now"
}
}
}
]
}
}
]
}
}
}
}
}
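Note that the filtered query and the missing filter were removed in Elasticsearch 5.0. On newer versions the same logic can be expressed with a bool query, using must_not plus exists in place of missing. The Python sketch below builds such a request body as a plain dict. The helper names `exists_or` and `build_query` are hypothetical, and the `fuzziness` setting from the original query is left out on the assumption that it is not needed.

```python
import json


def exists_or(field: str, condition: dict) -> dict:
    """Match documents where `field` is absent OR `condition` holds."""
    return {
        "bool": {
            "minimum_should_match": 1,
            "should": [
                # Replacement for the removed `missing` filter.
                {"bool": {"must_not": {"exists": {"field": field}}}},
                condition,
            ],
        }
    }


def build_query(phrase: str) -> dict:
    """Build the search body: scored multi_match plus non-scoring filters."""
    return {
        "query": {
            "bool": {
                "must": {
                    "multi_match": {
                        "query": phrase,
                        "type": "phrase_prefix",
                        "fields": ["label^3", "body"],
                    }
                },
                "filter": [
                    exists_or("is_published", {"term": {"is_published": True}}),
                    exists_or("date_review", {"range": {"date_review": {"gt": "now"}}}),
                ],
            }
        }
    }


body = build_query("my search phrase")
print(json.dumps(body, indent=2))
```

Both conditions sit in the filter list, so a document has to pass each "absent or valid" check while only the multi_match contributes to the score.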

Related

Elastic Search : Search keyword results of a specific category

I'm trying to build a query that searches for people by name within a specific country. If I provide John and USA as input, I should only get people named John (by the property: name) from the USA (by the property: country); results from other countries shouldn't appear.
What I have tried :
"query": {
"bool": {
"should": [
{
"multi_match": {
"query": "John",
"fields": ["username", "first_name", "last_name"],
"fuzziness": "AUTO",
"minimum_should_match": "50%"
}
}
],
"filter": [
{
"match": {
"country": "USA"
}
},
{
"match": {
"is_citizen": true
}
}
]
}
}
With the above query, the problem I'm seeing is that the results also include people who don't have the name John but are from the USA.
Expectation : To filter results of given keyword specific to given country.
Instead of should, you need to use a must clause for your name query.
The query below should give you the expected results. Refer to the official bool query documentation to understand the difference, with examples.
"query": {
"bool": {
"must": [ // note `must` here
{
"multi_match": {
"query": "John",
"fields": ["username", "first_name", "last_name"],
"fuzziness": "AUTO",
"minimum_should_match": "50%"
}
}
],
"filter": [
{
"match": {
"country": "USA"
}
},
{
"match": {
"is_citizen": true
}
}
]
}
}
You are using a should clause, which is why it is not working: when a bool query also contains a filter (or must) clause, the should clauses are optional and only affect scoring. Use must instead of should and it will resolve your issue.
You can use "type":"phrase_prefix" to match Jo with John.
Change your query as shown below and it will work:
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "John",
"fields": ["username", "first_name", "last_name"],
"type":"phrase_prefix",
"minimum_should_match": "50%"
}
}
],
"filter": [
{
"match": {
"country": "USA"
}
},
{
"match": {
"is_citizen": true
}
}
]
}
}
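The difference between should and must in this situation can be illustrated with a toy model. The Python sketch below is a simplified imitation of bool matching, not Elasticsearch itself: it ignores scoring and any explicit minimum_should_match setting, and the function `bool_matches` and the sample people are made up for the example.

```python
def bool_matches(doc, must=(), should=(), filters=()):
    """Toy model of bool query matching; clauses are predicates on a dict."""
    required = list(must) + list(filters)
    if not all(clause(doc) for clause in required):
        return False
    if required or not should:
        # With must/filter present, should clauses are optional.
        return True
    # With only should clauses, at least one has to match.
    return any(clause(doc) for clause in should)


def name_is_john(doc):
    return doc["first_name"] == "John"


def from_usa(doc):
    return doc["country"] == "USA"


people = [
    {"first_name": "John", "country": "USA"},
    {"first_name": "Mary", "country": "USA"},
    {"first_name": "John", "country": "Canada"},
]

with_should = [p["first_name"] for p in people
               if bool_matches(p, should=[name_is_john], filters=[from_usa])]
with_must = [p["first_name"] for p in people
             if bool_matches(p, must=[name_is_john], filters=[from_usa])]

print(with_should)  # ['John', 'Mary']
print(with_must)    # ['John']
```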

Multi match query with terms lookup searching multiple indices elasticsearch 6.x

All,
I am working on building a NEST 6.x query that takes a search term and looks in different fields in different indices.
This is what I have so far, but it is not returning the results I am expecting.
Please see the details below.
Indices used
dev-sample-search
user-agents-search
The search should work as follows:
The value in the query field (27921093) is searched against the fields agentNumber, customerName, fileNumber and documentid (these are all analyzed fields).
The search should limit the documents to the agentNumbers that the user sampleuser#gmail.com has access to (sample data for user-agents-search is added below).
agentNumber, customerName, fileNumber, documentid and status are part of the index dev-sample-search.
The status field is defined as a keyword.
The fields in the user-agents-search index are all keywords.
Sample user-agents-search index data:
{
"id": "sampleuser#gmail.com",
"user": "sampleuser#gmail.com",
"agentNumber": [
"123.456.789",
"1011.12.13.14"
]
}
Sample dev-sample-search index data:
{
"agentNumber": "123.456.789",
"customerName": "Bank of america",
"fileNumber":"test_file_1123",
"documentid":"1234456789"
}
GET dev-sample-search/_search
{
"from": 0,
"size": 10,
"query": {
"bool": {
"must": [
{
"multi_match": {
"type": "best_fields",
"query": "27921093",
"operator": "and",
"fields": [
"agentNumber",
"customerName",
"fileNumber",
"documentid^10"
]
}
}
],
"filter": [
{
"bool": {
"must": [
{
"terms": {
"agentNumber": {
"index": "user-agents-search",
"type": "_doc",
"user": "sampleuser#gmail.com",
"path": "agentNumber"
}
}
},
{
"bool": {
"must_not": [
{
"term": {
"status": {
"value": "pending"
}
}
},
{
"term": {
"status": {
"value": "cancelled"
}
}
},
{
"term": {
"status": {
"value": "app cancelled"
}
}
}
],
"should": [
{
"term": {
"status": {
"value": "active"
}
}
},
{
"term": {
"status": {
"value": "terminated"
}
}
}
]
}
}
]
}
}
]
}
}
}
I see a couple of things that you may want to look at:
In the terms lookup query, "user": "sampleuser#gmail.com", should be "id": "sampleuser#gmail.com",.
If at least one of the should clauses inside the filter must match, set "minimum_should_match" : 1 on the bool query containing the should clause.
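Applied to the filter part of the query, the two suggestions might look like the following Python sketch, which builds the clause as a plain dict. The helper name `build_agent_filter` is hypothetical, and the sketch assumes that the document ID in user-agents-search really is the user's email address, as the sample data suggests.

```python
import json


def build_agent_filter(user_id: str) -> dict:
    """Build the filter clause: terms lookup plus status restrictions."""
    # Terms lookup: fetch the agentNumber values from the user's document.
    lookup = {
        "terms": {
            "agentNumber": {
                "index": "user-agents-search",
                "type": "_doc",
                "id": user_id,  # `id`, not `user`
                "path": "agentNumber",
            }
        }
    }
    excluded = ("pending", "cancelled", "app cancelled")
    allowed = ("active", "terminated")
    status = {
        "bool": {
            "must_not": [{"term": {"status": {"value": s}}} for s in excluded],
            "should": [{"term": {"status": {"value": s}}} for s in allowed],
            # Require at least one of the allowed statuses to match.
            "minimum_should_match": 1,
        }
    }
    return {"bool": {"must": [lookup, status]}}


agent_filter = build_agent_filter("sampleuser#gmail.com")
print(json.dumps(agent_filter, indent=2))
```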

ElasticSearch Query: How to do inline "calculation" between fields, and then use it as boost variable?

I have a Books index with documents like this:
{
"title": "To Kill a Mockingbird",
"summary": "To Kill a Mockingbird takes place in Alabama during the Depression..",
"type": "book",
"views": 36
},
{
"title": "The Genius of Birds",
"summary": "The Genius Of Birds shines a new light on a genuinely underrated kind..",
"type": "book",
"views": 10
},
{
"title": "Handbook of Bird Biology",
"summary": "The Handbook of Bird Biology is an essential reference for birdwatchers..",
"type": "book",
"views": 27
}
In ElasticSearch v5.1, below is my current simple query, which works on its own:
{
"query": {
"bool": {
"filter": {
"bool": {
"must": [
{
"term": {
"type": "book"
}
}
]
}
},
"must": {
"multi_match": {
"query": "the bird",
"fields": [
"title",
"summary"
]
}
}
}
}
}
(Searching for the words the bird from the fields: title, summary where the type must be book)
This gives me a simple result based on the title and summary fields, but I need to modify it a little further.
Is it possible to modify the Query to look something like:
..
"must": {
"multi_match": {
"query": "the bird",
"fields": [
"title^(0.1*views)",
"summary"
]
}
}
..
I don't know what this is called in ES, but basically I want to boost a field (title) by another field (views).
Or in the simplest form, something like:
field1^(field2)
Thanks to Aarchit Saxena for the hint in the comment section. Now I know it is called field_value_factor, and by exploring further from there, I have finally managed to get the query I needed.
The original query (above) has now become this:
{
"query": {
"function_score": {
"query": {
"bool": {
"filter": {
"bool": {
"must": [
{
"term": {
"type": "book"
}
}
]
}
},
"must": {
"multi_match": {
"query": "the bird",
"fields": [
"title",
"summary"
]
}
}
}
},
"functions": [
{
"field_value_factor": {
"field": "views",
"factor": 1,
"modifier": "none",
"missing": 1
}
}
],
"boost": 1,
"boost_mode": "multiply"
}
}
}
Thank you.
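To see what this does to the ranking, here is a rough Python model of how field_value_factor combines with the query score when boost_mode is multiply. It is a sketch of the arithmetic only, not of Elasticsearch internals; the function names and the query scores used below are invented for the example. Setting factor to 0.1 would correspond to the 0.1*views idea from the question.

```python
import math


def field_value_factor(value, factor=1.0, modifier="none", missing=1.0):
    """Compute modifier(factor * value), using `missing` if value is None."""
    x = factor * (missing if value is None else value)
    if modifier == "none":
        return x
    if modifier == "sqrt":
        return math.sqrt(x)
    if modifier == "log1p":
        # Common (base 10) logarithm of 1 + x.
        return math.log10(1 + x)
    raise ValueError(f"unsupported modifier: {modifier}")


def final_score(query_score, views, boost=1.0, **kwargs):
    """boost_mode "multiply": query score times the function score."""
    return boost * query_score * field_value_factor(views, **kwargs)


# Hypothetical query score of 0.5 for each book.
score_mockingbird = final_score(0.5, 36)   # 0.5 * 36
score_genius = final_score(0.5, 10)        # 0.5 * 10
score_no_views = final_score(0.5, None)    # falls back to missing = 1

print(score_mockingbird, score_genius, score_no_views)
```

Note that this multiplies the whole query score by views, rather than boosting only the title field as in the field1^(field2) idea.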

ElasticSearch combine must-match with multi-match

I have been trying to combine MUST-MATCH with MULTI-MATCH but couldn't get it to work. Basically I want these MUST conditions:
"must": [{ "match": { "city": $city } },
{ "match": { "is_displayed": 1 } },
{ "match": { "status": "active" } }]
and I want these matches:
"multi_match": {
"query": $query,
"type": $selectedType,
"fields": fieldArray,
}
where $query is the text box value, $selectedType is one of the multi_match query types, and fieldArray is the list of fields to search. For example, when the text box value is "hello world" and fieldArray is ['title', 'cuisine'], "hello" and/or "world" must match in at least one of the specified fields. Any insight and advice is appreciated.
I think adding another clause to the must block should do it.
{
"query": {
"bool": {
"must": [
{
"match": {
"city": "$city"
}
},
{
"match": {
"is_displayed": 1
}
},
{
"match": {
"status": "active"
}
},
{
"query_string": {
"fields": fieldArray,
"query": "*$query*"
}
}
]
}
}
}
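The answer above uses a wildcard query_string. If you would rather keep the multi_match from the question, the same idea applies: put it in the must array next to the match clauses. The Python sketch below builds such a body as a dict; the function name `build_search` and the example values are hypothetical.

```python
import json


def build_search(city, query, selected_type, fields):
    """Combine the fixed match conditions with a multi_match clause."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"city": city}},
                    {"match": {"is_displayed": 1}},
                    {"match": {"status": "active"}},
                    {
                        "multi_match": {
                            "query": query,
                            "type": selected_type,
                            "fields": list(fields),
                        }
                    },
                ]
            }
        }
    }


# Example values, assumed for illustration only.
search_body = build_search("London", "hello world", "best_fields", ["title", "cuisine"])
print(json.dumps(search_body, indent=2))
```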

exact match query in elasticsearch

I'm trying to run an exact match query in ES.
In MySQL my query would be:
SELECT * WHERE `content_state`='active' AND `author`='bob' AND `title` != 'Beer';
I looked at the ES docs here:
https://www.elastic.co/guide/en/elasticsearch/guide/current/_finding_exact_values.html
and came up with this:
{
"from" : '.$offset.', "size" : '.$limit.',
"filter": {
"and": [
{
"and": [
{
"term": {
"content_state": "active"
}
},
{
"term": {
"author": "bob"
}
},
{
"not": {
"filter": {
"term": {
"title": "Beer"
}
}
}
}
]
}
]
}
}
But my results still come back with title = Beer; the query doesn't seem to exclude titles equal to Beer.
Did I do something wrong?
I'm pretty new to ES.
I figured it out. I used this instead:
{
"from" : '.$offset.', "size" : '.$limit.',
"query": {
"bool": {
"must": [
{
"query_string": {
"default_field": "content_state",
"query": "active"
}
},
{
"query_string": {
"default_field": "author",
"query": "bob"
}
}
],
"must_not": [
{
"query_string": {
"default_field": "title",
"query": "Beer"
}
}
]
}
}
}
The query string query is a convenient way to express relationships between search criteria. Have a quick look at the query string query syntax to understand it in detail.
{
"query": {
"query_string": {
"query": "(content_state:active AND author:bob) AND NOT (title:Beer)"
}
}
}
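Filters are meant to work on exact values. With the default analyzed string mapping, the title Beer is indexed as the lowercased token beer, so a term filter for "Beer" does not match it and nothing gets excluded. If you had defined your mapping so that title was a non-analyzed field, your previous attempt (with filters) would have worked as well: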
{
"mappings": {
"test": {
"_all": {
"enabled": false
},
"properties": {
"content_state": {
"type": "string"
},
"author": {
"type": "string"
},
"title": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
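Why the mapping matters can be shown with a very rough model of indexing. The Python sketch below is a simplification: `toy_analyze` only splits on word characters and lowercases, whereas the real standard analyzer does more, and both function names are made up for the example.

```python
import re


def toy_analyze(text: str) -> list:
    """Crude stand-in for an analyzer: split into words and lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def term_matches(indexed_terms: list, term: str) -> bool:
    """A term filter compares the given term to indexed terms exactly."""
    return term in indexed_terms


title = "Beer"
analyzed_terms = toy_analyze(title)  # what an analyzed field would index
not_analyzed_terms = [title]         # a not_analyzed field keeps the value as-is

print(term_matches(analyzed_terms, "Beer"))      # False
print(term_matches(not_analyzed_terms, "Beer"))  # True
```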
