ElasticSearch filter by multiple keywords - elasticsearch

In ElasticSearch 6.8 I have indexed many documents that contain a collection of tags. The tags are mapped as keyword.
"tags": {
"type": "keyword"
},
When doing
"query" : {
"bool" : {
"must" : { "match" : { "name" : "beach" } },
"filter" : {
"terms" : { "tags" : ["games", "cars"] }
}
}
}
I get documents that contain at least one of those tags. But I want to filter out all documents that do not contain ALL the given tags.
I tried
"query" : {
"bool" : {
"must" : { "match" : { "name" : "beach" } },
"filter" : {
"terms" : {
"tags" : ["games", "cars"],
"minimum_should_match": 2
}
}
}
}
But it throws an error: "[terms] query does not support [minimum_should_match]"
What would be the correct way of filtering out documents that do not contain both of those tags? Note that the real query may contain other "should" clauses as well.
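One approach that is commonly used for this, offered here as a sketch rather than as an answer taken from the thread: every clause in a bool "filter" array has to match, so putting one "term" clause per tag into that array gives AND semantics without needing minimum_should_match. The helper name all_tags_query below is hypothetical and only builds the request body.

```python
import json

def all_tags_query(name, tags):
    """Build a bool query that requires every tag in `tags` (hypothetical helper)."""
    return {
        "query": {
            "bool": {
                "must": {"match": {"name": name}},
                # All clauses in a bool "filter" array must match, so one
                # term clause per tag means "has games AND has cars".
                "filter": [{"term": {"tags": tag}} for tag in tags],
            }
        }
    }

body = all_tags_query("beach", ["games", "cars"])
print(json.dumps(body, indent=2))
```

Any additional "should" clauses can be added as another key of the same "bool" object; the per-tag filter clauses are unaffected by them.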

Related

Elasticsearch one field match different values

I want to run a query in which one field must match several different values. In SQL it would look like:
select * from location where address like '%California%' and address like '%145%';
I tried using a must array containing several phrase match conditions, but it doesn't work!
{
"from" : 0,
"size" : 10,
"query" : {
"bool" : {
"must" : {
"bool" : {
"must" : [ {
"match" : {
"address" : {
"query" : "California",
"type" : "phrase"
}
}
}, {
"match" : {
"address" : {
"query" : "145",
"type" : "phrase"
}
}
} ]
}
}
}
},
"sort" : [ {
"pageRankScore" : {
"order" : "desc",
"unmapped_type" : "double"
}
} ]
}
That's my code; it only matches '145' and never matches 'California'.
My question is: with several values, how do I do a fuzzy match on one field?
Help me, thanks a lot!

Multiple filter aggregations vs Returning all buckets

I'm interested in having sub-aggregations, but only for specific keyword values.
"aggregations" : {
"Keyword" : {
"terms" : {
"field" : "keyword"
},
"aggregations" : {
"Concept" : {
"terms" : {
"field" : "concept"
}
}
}
}
}
The aggregation above returns only the top 10 buckets, which do not necessarily contain the values I'm interested in.
I see two main ways of solving my issue:
1. Returning all the buckets and then selecting the ones I'm interested in.
2. Adding filter aggregations for all the values I'm interested in. So if I'm interested in 10 keyword values, I will perform 10 filter aggregations.
What is the best solution in terms of performance?
"So if I'm interested in 10 keyword values, I will perform 10 filter aggregations."
This isn't necessarily true.
You can create a single filter that eliminates unwanted keywords and you can do this upfront at the query stage:
{
"size" : 0,
"query" : {
"bool" : {
"filter" : [
{
"terms" : {
"keyword" : [ "abc", "def", "ghi" ]
}
}
]
}
},
"aggs" : {
"Keyword" : {
"terms" : {
"field" : "keyword"
},
"aggs" : {
"Concept" : {
"terms" : {
"field" : "concept"
}
}
}
}
}
}
The query stage would filter it down to just abc, def, and ghi. Then the aggregation would work as you expect, but only against documents with those values.
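As a sketch of how that request could be assembled for an arbitrary list of keywords: the helper name keyword_concept_aggs is hypothetical, and the "size" on the outer terms aggregation is an addition that is not in the answer above. It is there because a terms aggregation returns only 10 buckets by default, and it assumes each document carries a single keyword value.

```python
import json

def keyword_concept_aggs(keywords):
    """Build the filter-then-aggregate request body (hypothetical helper)."""
    keywords = list(keywords)
    return {
        "size": 0,  # no hits needed, only aggregation results
        "query": {
            # One terms filter keeps only documents with a wanted keyword.
            "bool": {"filter": [{"terms": {"keyword": keywords}}]}
        },
        "aggs": {
            "Keyword": {
                # Assumption: ask for one bucket per requested keyword so that
                # lists longer than the default of 10 are not cut off.
                "terms": {"field": "keyword", "size": len(keywords)},
                "aggs": {"Concept": {"terms": {"field": "concept"}}},
            }
        },
    }

print(json.dumps(keyword_concept_aggs(["abc", "def", "ghi"]), indent=2))
```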

Elasticsearch filter when term mix letters and numbers

I'm running the following query to search for some items:
{
"filtered" : {
"query" : {
"match" : {
"name_db" : {
"query" : "Human",
"type" : "boolean"
}
}
},
"filter" : {
"terms" : {
"cat" : [ "B8E" ],
"execution" : "bool"
}
}
}
}
See that "cat" field? When it's something like "B8E" there are no results (even though there should be), while when it's something like "320" the results are correct. What could be wrong? Why would mixing letters and numbers be a problem?
Thanks in advance.
PS: I'm new to elasticsearch
I'm pretty sure your field cat is an analyzed string and hence is being indexed in lowercase (which makes no difference for numbers). If you try this query instead, you'll get results:
{
"filtered" : {
"query" : {
"match" : {
"name_db" : {
"query" : "Human",
"type" : "boolean"
}
}
},
"filter" : {
"terms" : {
"cat" : [ "b8e" ], <--- search in lowercase
"execution" : "bool"
}
}
}
}
UPDATE
If you want to index the cat field in its original uppercase form so that you can search it using uppercase (e.g. "B8E"), you need to change its mapping to not_analyzed, like this:
"cat": {
"type": "string",
"index": "not_analyzed"
}
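The effect can be illustrated with a toy model in Python. This is only a simulation under simplifying assumptions (an analyzed field is modelled as whitespace splitting plus lowercasing, which is far less than a real analyzer does), and the function names are hypothetical. It shows why a terms filter, which compares the search term as-is, misses "B8E" on an analyzed field but still finds "320".

```python
def index_terms(value, analyzed):
    """Toy stand-in for indexing one field value.

    Assumption: an analyzed field stores lowercased tokens, while a
    not_analyzed field stores the original value verbatim.
    """
    if analyzed:
        return {token.lower() for token in value.split()}
    return {value}

def term_filter_matches(stored_terms, term):
    # A term/terms filter does not analyze its input: exact comparison only.
    return term in stored_terms

analyzed_cat = index_terms("B8E", analyzed=True)
verbatim_cat = index_terms("B8E", analyzed=False)

print(term_filter_matches(analyzed_cat, "B8E"))  # False: index holds "b8e"
print(term_filter_matches(analyzed_cat, "b8e"))  # True
print(term_filter_matches(verbatim_cat, "B8E"))  # True: stored as-is
print(term_filter_matches(index_terms("320", analyzed=True), "320"))  # True
```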

multiple match must fields not working in elastic search

The query below returns a result when I give an existing record, which is fine, but if I change the name field from 'John' to 'John1' the record is still returned.
{
"query" : {
"bool" : {
"must" : [
{ "match" : {"employeeId" : "1234"}},
{ "match" : {"name" : "John"}}
]
}
}
}
I tried an alternative query as well, but it still returns a result. Which query is better in terms of performance? Both return results if I change the name from 'John' to 'John1'.
{
"filter": {
"bool" : {
"must" : {
"term" : {
"employeeId" : "1234"
}
}
}
},
"query": {
"match" : {
"name" : {
"query" : "John",
"type" : "phrase"
}
}
}
}
This is because you are using match. If you want an exact search, you need to use a term filter.
Note that we assume the mapping of the name field is analyzed (which is why the term is lowercase):
{
"query" :{
"filtered" : {
"filter" : {
"bool" : {
"must" : [
{ "term" : {"employeeId" : "1234"}},
{ "term" : {"name" : "john"}}
]
}
}
}
}
}

How can I aggregate filtered nested documents in ElasticSearch?

Suppose I have an index with nested documents that look like this:
{
"id" : 1234,
"cars" : [{
"id" : 987,
"name" : "Volkswagen"
}, {
"id": 988,
"name" : "Tesla"
}
]
}
I now want to get a count aggregation of "car" documents that match certain criteria, e.g. that match a search query. My initial attempt was the following query:
{
"query" : {
"nested" : {
"path" : "cars",
"query" : {
"query_string" : {
"fields" : ["cars.name"],
"query" : "Tes*"
}
}
}
},
"aggregations" : {
"cars" :{
"nested" : {
"path" : "cars"
},
"aggs" : {
"cars" : {
"terms" : {
"field" : "cars.id"
}
}
}
}
}
}
I was hoping to get an aggregation result with only the ids of cars whose name begins with "Tes". However, the aggregation instead uses all cars that are in a top-level document that also contains a matching nested document. That is, in the above example "Volkswagen" would also be counted because the top-level document also contains a car that does match.
How can I get an aggregation of just the matching nested documents?
In the meantime I've figured it out: to achieve this, a filter aggregation should be added around the terms aggregation, like so:
"aggregations" : {
"cars" :{
"nested" : {
"path" : "cars"
},
"aggs" : {
"cars-filter" : {
"filter" : {
"query" : {
"query_string" : {
"fields" : ["cars.name"],
"query" : "Tes*"
}
}
},
"aggs" : {
"cars" : {
"terms" : {
"field" : "cars.id"
}
}
}
}
}
}
}
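The same aggregation can be built programmatically. The sketch below mirrors the structure shown above, including the "query" wrapper inside "filter" exactly as it appears in this thread; the helper name nested_filtered_terms_agg and its parameters are hypothetical.

```python
import json

def nested_filtered_terms_agg(path, name_field, pattern, id_field):
    """Build nested -> filter -> terms aggregations (hypothetical helper)."""
    name_query = {"query_string": {"fields": [name_field], "query": pattern}}
    return {
        "cars": {
            # Step into the nested documents first.
            "nested": {"path": path},
            "aggs": {
                "cars-filter": {
                    # Keep only the nested documents matching the pattern;
                    # the "query" wrapper follows the syntax used in this thread.
                    "filter": {"query": name_query},
                    # Count the surviving nested documents per id.
                    "aggs": {"cars": {"terms": {"field": id_field}}},
                }
            },
        }
    }

aggs = nested_filtered_terms_agg("cars", "cars.name", "Tes*", "cars.id")
print(json.dumps({"aggregations": aggs}, indent=2))
```

The terms buckets would then be read from the response under cars, then cars-filter, then cars.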
