ElasticSearch: Find record with multiple conditions on a list of sub-elements

I'm saving documents like this to ElasticSearch:
[
{
"text": "Sam works for Google.",
"entities": [
{
"text": "Sam",
"type": "PERSON"
},
{
"text": "Google",
"type": "ORGANIZATION"
}
]
}
]
It's essentially a sentence and entities that appear in that sentence. Now, I want to find any document that has entities of type "PERSON" AND "ORGANIZATION". I tried a boolean must query:
{
"bool": {
"must": [
{
"match": {
"entities.type": "PERSON"
}
},
{
"match": {
"entities.type": "ORGANIZATION"
}
}
]
}
}
... but that seems to look for entities that are of both types, which obviously returns nothing. How should I formulate my query?
Thanks!

You should use the query below; your original request is missing the top-level "query" key around the bool clause.
{
"query": {
"bool": {
"must": [
{
"match": {
"entities.type": "PERSON"
}
},
{
"match": {
"entities.type": "ORGANIZATION"
}
}
]
}
}
}
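For context on why both must clauses can match the same document: when entities is mapped as a regular object field (not nested), Elasticsearch flattens the array, so entities.type holds the values from every sub-object. The Python sketch below only imitates that flattening on a plain dict. The helper names flatten_field and matches_all are made up for illustration and are not part of any Elasticsearch client.

```python
def flatten_field(doc, path):
    """Collect every value found at a dotted path, descending into lists."""
    values = [doc]
    for key in path.split("."):
        next_values = []
        for value in values:
            items = value if isinstance(value, list) else [value]
            for item in items:
                if isinstance(item, dict) and key in item:
                    next_values.append(item[key])
        values = next_values
    # Unpack any list that sits at the leaf itself.
    flat = []
    for value in values:
        flat.extend(value if isinstance(value, list) else [value])
    return flat


def matches_all(doc, path, wanted):
    """True when every wanted value appears somewhere at the path (like bool/must)."""
    present = {str(v).lower() for v in flatten_field(doc, path)}
    return all(w.lower() in present for w in wanted)


doc = {
    "text": "Sam works for Google.",
    "entities": [
        {"text": "Sam", "type": "PERSON"},
        {"text": "Google", "type": "ORGANIZATION"},
    ],
}

# The flattened field contains both types, so both must clauses are satisfied.
print(flatten_field(doc, "entities.type"))
print(matches_all(doc, "entities.type", ["PERSON", "ORGANIZATION"]))
```

If entities were mapped as nested instead, each sub-object would be matched on its own and this flattening would not apply.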

Related

Elasticsearch: "must" query on nested fields

How to do a "must" "match" query on multiple fields under the same nesting? Here's a reproducible ES index where the "user" field is defined as "nested" type.
PUT my_index
{
"mappings": {
"properties": {
"user": {
"type": "nested",
"properties": {
"firstname": {"type": "text"}
}
}
}
}
}
And here are 2 documents:
PUT my_index/_doc/1
{
"user" : [
{
"firstname" : "John"
},
{
"firstname" : "Alice"
}
]
}
PUT my_index/_doc/2
{
"user" : [
{
"firstname" : "Alice"
}
]
}
For this index, how can I query for documents where "John" AND "Alice" both exist? With the index defined above, I expect to get Document 1 but not Document 2. So far, I've tried the following code, but it's returning no hits:
GET my_index/_search
{
"query": {
"nested": {
"path": "user",
"query": {
"bool": {
"must": [
{"match": {"user.firstname": "John"}},
{"match": {"user.firstname": "Alice"}}
]
}
}
}
}
}
The query below is what is required.
POST my_index/_search
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "user",
"query": {
"match": {
"user.firstname": "alice"
}
}
}
},
{
"nested": {
"path": "user",
"query": {
"match": {
"user.firstname": "john"
}
}
}
}
]
}
}
}
Notice that I've used two nested queries inside a single must clause. That is because, in your documents, the alice and john entries are separate nested objects, and each nested object is treated as a separate document.
Your original query would work if your document were structured like this:
POST my_index/_doc/3
{
"user" : [
{
"firstname" : ["Alice", "John"]
}
]
}
Try reading the documentation on the nested datatype and on the nested query to understand more about how they work. The nested query documentation includes this note:
The nested query searches nested field objects as if they were indexed as separate documents.
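To make the "separate documents" point concrete, here is a rough Python simulation of the two query shapes against the three example documents. It is a sketch, not Elasticsearch itself: the functions tokens, nested_match and nested_must are hypothetical helpers, and the lowercased whitespace split is only a stand-in for real text analysis.

```python
def tokens(value):
    """Lowercased word tokens, a rough stand-in for analyzing a text field."""
    values = value if isinstance(value, list) else [value]
    return {word.lower() for v in values for word in str(v).split()}


def nested_match(doc, path, field, text):
    """Imitates one nested query: true if ANY single nested object matches."""
    return any(text.lower() in tokens(obj.get(field, [])) for obj in doc.get(path, []))


def nested_must(doc, path, field, texts):
    """Imitates one nested query wrapping bool/must: ONE object must satisfy every clause."""
    return any(
        all(t.lower() in tokens(obj.get(field, [])) for t in texts)
        for obj in doc.get(path, [])
    )


doc1 = {"user": [{"firstname": "John"}, {"firstname": "Alice"}]}
doc2 = {"user": [{"firstname": "Alice"}]}
doc3 = {"user": [{"firstname": ["Alice", "John"]}]}

for name, doc in [("doc1", doc1), ("doc2", doc2), ("doc3", doc3)]:
    single = nested_must(doc, "user", "firstname", ["John", "Alice"])
    double = all(nested_match(doc, "user", "firstname", t) for t in ["John", "Alice"])
    print(name, "| one nested query with must:", single, "| two nested queries:", double)
```

Document 1 only matches the two-nested-queries form, because no single nested object contains both names. Document 3 matches both forms, since its one nested object holds both values.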
Hope that helps!

ElasticSearch combine must-match with multi-match

I have been trying to combine MUST-MATCH with MULTI-MATCH but couldn't get it to work. Basically I want these MUST conditions:
"must": [{ "match": { "city": $city } },
{ "match": { "is_displayed": 1 } },
{ "match": { "status": "active" } }]
and I want these matches:
"multi_match": {
"query": $query,
"type": $selectedType,
"fields": fieldArray
}
where $query is the text box value, $selectedType is one of the multi-match query types, and fieldArray is the list of fields to search. For example, when the text box value is "hello world" and fieldArray is ['title', 'cuisine'], "hello" and/or "world" must match in any or all of the specified fields. Any insight and advice is appreciated.
I think adding another clause to the must block should do it.
{
"query": {
"bool": {
"must": [
{
"match": {
"city": "$city"
}
},
{
"match": {
"is_displayed": 1
}
},
{
"match": {
"status": "active"
}
},
{
"query_string": {
"fields": fieldArray,
"query": "*$query*"
}
}
]
}
}
}
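If you would rather keep the multi_match from the question instead of switching to query_string, the same idea applies: it is just a fourth clause in must. Below is a minimal Python sketch that assembles the request body as a dict. The function name build_search_body and the sample values ("Jakarta", "best_fields") are assumptions for illustration only.

```python
import json


def build_search_body(city, query_text, match_type, fields):
    """Combine the three fixed match clauses with one multi_match clause in bool/must."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"city": city}},
                    {"match": {"is_displayed": 1}},
                    {"match": {"status": "active"}},
                    {
                        "multi_match": {
                            "query": query_text,
                            "type": match_type,
                            "fields": fields,
                        }
                    },
                ]
            }
        }
    }


# Sample values are placeholders; substitute your own form inputs.
body = build_search_body("Jakarta", "hello world", "best_fields", ["title", "cuisine"])
print(json.dumps(body, indent=2))
```

Building the body as a dict and serializing it avoids hand-written JSON mistakes such as a missing brace or a trailing comma.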

match query on elastic search with multiple or conditions

I have three fields: status, type and search. I want to find data where status equals NEW or IN PROGRESS, type equals abc or xyz, and search contains a given value (partial match).
My call looks like this:
{
"query": {
"bool" : {
"must" : [{
"match": {
"status": {
"query": "abc"
}
}
}, {
"match": {
"type": {
"query": "NEW"
}
}
},{
"query_string": {
"query": "*abc*",
"fields": ["title", "name"]
}
}]
}
}
}
Nest your bool queries. I think what you are missing is this:
"bool": { "should": [
{ "match": { "status": "abc" } },
{ "match": { "status": "xyz" } }
]}
Because only should clauses are given, this query must match at least one of them.
EDIT to explain the differences:
{
"query": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"match": {
"status": "abc"
}
},
{
"match": {
"status": "xyz"
}
}
]
}
},
{
"terms": {
"type": [
"NEW",
"IN_PROGRESS"
]
}
},
{
"query_string": {
"query": "*abc*",
"fields": [
"title",
"name"
]
}
}
]
}
}
}
So you have a bool query at the top, and each of the 3 inner queries must be true.
The first is a nested bool query, which is true if status matches either abc or xyz.
The second is true if type matches exactly NEW or IN_PROGRESS. Note the difference here: the first one would also match ABC or aBc, or potentially "abc XYZ", depending on your analyzer. You might want terms for both.
The third is what you had before.
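When the lists of allowed values come from user input, the nested bool/should can be generated rather than written by hand. This is a small Python sketch under that assumption. The helpers any_of and build_query are hypothetical names, and minimum_should_match is set to 1 explicitly so the "at least one" intent stays visible.

```python
import json


def any_of(field, values):
    """bool/should over match clauses: at least one of the values must match."""
    return {
        "bool": {
            "should": [{"match": {field: value}} for value in values],
            "minimum_should_match": 1,
        }
    }


def build_query(statuses, types, partial, fields):
    """Top-level bool/must combining an OR group, a terms clause and a wildcard search."""
    return {
        "query": {
            "bool": {
                "must": [
                    any_of("status", statuses),
                    {"terms": {"type": types}},
                    {"query_string": {"query": f"*{partial}*", "fields": fields}},
                ]
            }
        }
    }


body = build_query(["abc", "xyz"], ["NEW", "IN_PROGRESS"], "abc", ["title", "name"])
print(json.dumps(body, indent=2))
```

The generated body has the same shape as the query above, with each OR group nested as a single must clause.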

I want my query to treat the content of two columns as one

I have a set of news articles. These have both tags and articleTags.
Our API has an endpoint that returns articles that match all tags.
E.g. searching for an article that contains both sport and fail:
"bool": {
"must": [
[
{
"term": {
"tags": "sport"
}
},
{
"term": {
"tags": "fail"
}
},
{
"term": {
"articleTags": "sport"
}
},
{
"term": {
"articleTags": "fail"
}
}
]
]
}
This worked when we only had tags, but when we introduced articleTags then it obviously didn't work as expected.
Is there a way we could make Elasticsearch treat tags and articleTags as one namespace so I could do a query like this?
"bool": {
"must": [
[
{
"term": {
"mergedTags": "sport"
}
},
{
"term": {
"mergedTags": "fail"
}
}
]
]
}
I feel a multi_match query would be the best solution here.
There is a type of multi_match query called cross_fields.
The documentation describes it as follows:
Treats fields with the same analyzer as though they were one big field. Looks for each word in any field.
My suggestion involves using copy_to to create that "merged" field:
"tags": {
"type": "string",
"copy_to": "mergedTags"
},
"articleTags": {
"type": "string",
"copy_to": "mergedTags"
},
"mergedTags": {
"type": "string"
}
And the updated query is as simple as:
"query": {
"bool": {
"must": [
[
{
"term": {
"mergedTags": "sport"
}
},
{
"term": {
"mergedTags": "fail"
}
}
]
]
}
}
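To see why the merged field helps, here is a rough Python imitation of what copy_to makes searchable. It is only a sketch: index_with_copy_to and term_must are hypothetical helpers, the sample article is invented, and real copy_to affects the index only (the stored _source is unchanged).

```python
def index_with_copy_to(doc, copy_rules):
    """Return the searchable values per field, copying sources into their targets."""
    indexed = {field: list(values) for field, values in doc.items()}
    for source, target in copy_rules.items():
        indexed.setdefault(target, []).extend(doc.get(source, []))
    return indexed


def term_must(indexed, clauses):
    """bool/must of term clauses: every (field, value) pair has to be present."""
    return all(value in indexed.get(field, []) for field, value in clauses)


# An invented article whose two tags live in different fields.
article = {"tags": ["sport"], "articleTags": ["fail"]}
indexed = index_with_copy_to(
    article, {"tags": "mergedTags", "articleTags": "mergedTags"}
)

original_query = [
    ("tags", "sport"), ("tags", "fail"),
    ("articleTags", "sport"), ("articleTags", "fail"),
]
merged_query = [("mergedTags", "sport"), ("mergedTags", "fail")]

print(indexed["mergedTags"])
print(term_must(indexed, original_query))
print(term_must(indexed, merged_query))
```

The four-clause query fails because neither field holds both tags on its own, while the two-clause query against mergedTags succeeds.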

Elastic Search : Match Query not working in Nested Bool Filters

I am able to get data for the following Elasticsearch query:
{
"query": {
"filtered": {
"query": [],
"filter": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"term": {
"gender": "malE"
}
},
{
"term": {
"sentiment": "positive"
}
}
]
}
}
]
}
}
}
}
}
However, if I query using "match", I get an error message with a 400 status response:
{
"query": {
"filtered": {
"query": [],
"filter": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"match": {
"gender": "malE"
}
},
{
"term": {
"sentiment": "positive"
}
}
]
}
}
]
}
}
}
}
}
Is the match query not supported in nested bool filters?
Since the term query looks for the exact term in the field's inverted index, and I want to query the gender field case-insensitively, which approach should I try?
Settings of the index:
{
"settings": {
"index": {
"analysis": {
"analyzer": {
"analyzer_keyword": {
"tokenizer": "keyword",
"filter": "lowercase"
}
}
}
}
}
}
Mapping for field Gender:
{"type":"string","analyzer":"analyzer_keyword"}
The reason you're getting an error 400 is because there is no match filter, only match queries, even though there are both term queries and term filters.
Your query can be as simple as this, i.e. no need for a filtered query, simply put your term and match queries into a bool/should:
{
"query": {
"bool": {
"should": [
{
"match": {
"gender": "male"
}
},
{
"term": {
"sentiment": "positive"
}
}
]
}
}
}
This answer is for ElasticSearch 7.x. As I understand from the question, you would like to use a match query for the gender field and a term query for the sentiment field. The mappings for these fields should look like this:
"sentiment": {
"type": "keyword"
},
"gender": {
"type": "text"
}
The corresponding search API would be:
"query": {
"bool": {
"must": [
{
"terms": {
"sentiment": [
"very positive", "positive"
]
}
},
{
"match": {
"gender": "malE"
}
}
]
}
}
This search returns all documents where gender is "Male", "MALE", "mALe", and so on. Even if the indexed value was "mALe", a match query for "gender": "malE" still retrieves it, because a match query runs the query text through the field's analyzer before searching, and the default (standard) analyzer for text fields lowercases it. The same analysis is applied to the field value at index time. That said, it is not hard for an API client to pass a lowercase value to the match query in the first place.
As for the sentiment field, since it is a keyword field, you can search for exact values that contain spaces, such as very positive.
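The difference between term and match on an analyzed field can be sketched in a few lines of Python. This is a simulation, not Elasticsearch code: analyzer_keyword mimics the keyword tokenizer plus lowercase filter from the question's settings, and term_query and match_query are hypothetical stand-ins for the two query types.

```python
def analyzer_keyword(text):
    """Keyword tokenizer + lowercase filter: the whole input becomes one lowercased token."""
    return [text.lower()]


def term_query(indexed_tokens, value):
    """term: the query value is compared as-is, with no analysis."""
    return value in indexed_tokens


def match_query(indexed_tokens, text, analyzer):
    """match: the query text is analyzed first, then its tokens are looked up."""
    return any(token in indexed_tokens for token in analyzer(text))


# The field value is analyzed at index time, so "Male" is stored as "male".
indexed = analyzer_keyword("Male")

print(term_query(indexed, "malE"))                     # mixed case, not analyzed
print(match_query(indexed, "malE", analyzer_keyword))  # analyzed to "male" first
print(term_query(indexed, "male"))                     # already lowercase
```

So a term query only finds the document when the client sends the exact indexed token, while a match query normalizes the input in the same way the field was normalized.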
