Elasticsearch Multiple Prefix Keywords - elasticsearch

I need to use the prefix filter, but allow multiple different prefixes, i.e.
{"prefix": {"myColumn": ["This", "orThis", "orEvenThis"]}}
This does not work. And if I add each as a separate prefix, it obviously doesn't work either.
Help is appreciated.
Update
I tried should but without any luck:
$this->dsl['body']['query']['bool']['should'] = [
["prefix" => ["myColumn" => "This"]],
["prefix" => ["myColumn" => "orThis"]]
];
When I add those two constraints, I get ALL responses (as though the filter is not working). But if I use must with either of those clauses, then I do get a response back with the correct prefix.

Based on your comments, it sounds like it may just be an issue with the syntax. With all ES queries (just like SQL ones), I suggest starting simple and submitting them to ES as raw DSL outside of code (although in your case this wasn't easily doable). The request itself is pretty straightforward:
{
"query" : {
"bool" : {
"must" : [ ... ],
"filter" : [
{
"bool" : {
"should" : [
{
"prefix" : {
"myColumn" : "This"
}
},
{
"prefix" : {
"myColumn" : "orThis"
}
},
{
"prefix" : {
"myColumn" : "orEvenThis"
}
}
]
}
}
]
}
}
}
I added it as a filter because the optional nature of your prefixing is not improving relevancy: it's literally asking that one of them must match. In such cases where the question is "does this match? yes / no", then you should use a filter (with the added bonus that that's cacheable!). If you're asking "does this match, and which matches better?" then you want a query (because that's relevancy / scoring).
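As a minimal sketch of how the request above could be assembled from a list of prefixes, the following Python helper builds the same structure as a plain dict. The function name `prefix_filter_query` and its parameters are hypothetical and not part of any Elasticsearch client; the resulting dict would be sent as the search body.

```python
import json


def prefix_filter_query(field, prefixes, must=None):
    """Build a bool query whose filter requires at least one prefix to match.

    `field`, `prefixes` and `must` are illustrative parameters; `must` holds
    whatever scoring clauses the search already has (it may be empty).
    """
    # One prefix clause per allowed prefix; should acts as an OR between them.
    should = [{"prefix": {field: p}} for p in prefixes]
    bool_query = {"filter": [{"bool": {"should": should}}]}
    if must:
        bool_query["must"] = list(must)
    return {"query": {"bool": bool_query}}


body = prefix_filter_query("myColumn", ["This", "orThis", "orEvenThis"])
print(json.dumps(body, indent=2))
```

Because the prefixes sit in their own nested bool inside filter, they do not interact with any should clauses the outer query may already have.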
Note: The initial issue appears to have been that the question did not mention the bool / must, so the suggestion was to just use a bool / should.
{
"bool" : {
"should" : [
{
"prefix" : {
"myColumn" : "This"
}
},
{
"prefix" : {
"myColumn" : "orThis"
}
},
{
"prefix" : {
"myColumn" : "orEvenThis"
}
}
]
}
}
behaves differently than
{
"bool" : {
"must" : [ ... ],
"should" : [
{
"prefix" : {
"myColumn" : "This"
}
},
{
"prefix" : {
"myColumn" : "orThis"
}
},
{
"prefix" : {
"myColumn" : "orEvenThis"
}
}
]
}
}
because the must changes whether should is required. Without must (or filter), should behaves like a boolean OR: at least one of its clauses has to match. With must, however, should becomes completely optional and only improves relevancy (score). To get the boolean OR behavior back alongside must, you must add minimum_should_match to the bool compound query.
{
"bool" : {
"must" : [ ... ],
"should" : [
{
"prefix" : {
"myColumn" : "This"
}
},
{
"prefix" : {
"myColumn" : "orThis"
}
},
{
"prefix" : {
"myColumn" : "orEvenThis"
}
}
],
"minimum_should_match" : 1
}
}
Notice that it's a component of the bool query, and not of either should or must!
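A simplified in-memory model can make the difference concrete. The `matches` function below is a hypothetical illustration, not how Elasticsearch is implemented: it compares raw strings, ignores analysis and scoring, supports only `prefix` and `bool`, and expects every clause group to be a list. The default it encodes (should is required only when the bool has no must or filter) follows the behavior described above.

```python
def matches(doc, query):
    """Return True if `doc` (a dict of field -> string) matches `query`."""
    if "prefix" in query:
        field, prefix = next(iter(query["prefix"].items()))
        return str(doc.get(field, "")).startswith(prefix)
    if "bool" in query:
        b = query["bool"]
        must = b.get("must", [])
        filt = b.get("filter", [])
        should = b.get("should", [])
        must_not = b.get("must_not", [])
        # should is required by default only when there is no must/filter.
        default_msm = 1 if should and not (must or filt) else 0
        msm = b.get("minimum_should_match", default_msm)
        should_hits = sum(matches(doc, q) for q in should)
        return (
            all(matches(doc, q) for q in must + filt)
            and not any(matches(doc, q) for q in must_not)
            and should_hits >= msm
        )
    raise ValueError("unsupported query type in this sketch")


should = [{"prefix": {"myColumn": "This"}}, {"prefix": {"myColumn": "orThis"}}]
must = [{"prefix": {"other": "x"}}]  # stand-in for the real must clauses
doc = {"myColumn": "nothing", "other": "xyz"}

print(matches(doc, {"bool": {"should": should}}))                # False
print(matches(doc, {"bool": {"must": must, "should": should}}))  # True
print(matches(doc, {"bool": {"must": must, "should": should,
                             "minimum_should_match": 1}}))       # False
```

The second call shows the situation from the question: once a must is present, a document that matches none of the prefixes still comes back.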

Related

Query documents with access control filter

Each document in my Elasticsearch index has two access control lists containing user ids. One is an allow list, the other is a deny list. I am trying to add a filter to a given query that considers these ACLs. I thought I could use a bool query with a must clause for the given query, a filter clause for the allow list, and a must_not clause for the deny list. What I have so far (example for user 1):
{
"bool" : {
"must" : {
[given query]
},
"filter" : [ {
"match" : {
"acl.allow" : {
"query" : "/user/1",
"type" : "boolean"
}
}
}],
"must_not" : [ {
"match" : {
"acl.deny" : {
"query" : "/user/1",
"type" : "boolean"
}
}
}]
}
}
Unfortunately, this query does not return the desired result. It returns objects that have not listed user 1 in their allow list (a behavior I don't understand). Also, it (obviously) ignores objects with empty access control lists (which should be visible to anyone). Any suggestions to fix that?
I figured it out. First of all, using match isn't really a good solution for that kind of query because of its analyzer. Using term, though, left me puzzled as to why I did not get any results. Term queries look up the exact value, so they only return results here if the corresponding field is set to not_analyzed. Thus I changed my mapping:
"acl": {
"properties": {
"allow": {
"type": "string",
"index": "not_analyzed"
},
"deny": {
"type": "string",
"index": "not_analyzed"
}
}
}
My second problem (treating objects with empty ACLs as visible to anyone) was solved using exists nested in must_not nested in bool. This is recommended as a substitute for the deprecated missing query. My final query looks like this and passed all ACL-related tests I could think of.
{
"bool" : {
"must" : {
[given query]
},
"filter" : {
"bool" : {
"should" : [ {
"terms" : {
"acl.allow" : [ "/user/1" ]
}
}, {
"bool" : {
"must_not" : {
"exists" : {
"field" : "acl.allow"
}
}
}
} ]
}
},
"must_not" : {
"terms" : {
"acl.deny" : [ "/user/1" ]
}
}
}
}
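To reuse the final query for arbitrary users, it could be generated by a small helper. This is only a sketch: the function name `acl_filtered_query` is hypothetical, and `query` stands for the given query from the text, passed in as a dict.

```python
def acl_filtered_query(user, query):
    """Wrap `query` so that only documents visible to `user` are returned."""
    # Visible if the user is on the allow list OR the allow list is absent.
    allowed_or_open = {
        "bool": {
            "should": [
                {"terms": {"acl.allow": [user]}},
                {"bool": {"must_not": {"exists": {"field": "acl.allow"}}}},
            ]
        }
    }
    return {
        "bool": {
            "must": query,
            "filter": allowed_or_open,
            # The deny list always wins, whatever the allow list says.
            "must_not": {"terms": {"acl.deny": [user]}},
        }
    }


wrapped = acl_filtered_query("/user/1", {"match_all": {}})
```

The `match_all` in the usage line is just a placeholder for whatever query is being restricted.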

How to query on multiple fields in elasticsearch?

I have tried the multiple field query and it works fine. But I would like to know what other options are generally used to query multiple fields in Elasticsearch.
Structured queries with multiple terms, for finding exact values, the same as SQL
https://www.elastic.co/guide/en/elasticsearch/guide/current/_finding_multiple_exact_values.html
"bool" : {
"must" : [
{ "term" : { "tags" : "search" } },
{ "term" : { "tag_count" : 1 } }
]
}
For example, consider the following SQL query:
SELECT product
FROM products
WHERE (price = 20 OR productID = "XHDK-A-1293-#fJ3")
AND (price != 30)
In these situations, you will need the bool filter. This is a compound filter that accepts other filters as arguments, combining them in various Boolean combinations.
The Query DSL would be,
GET /my_store/products/_search
{
"query" : {
"filtered" : {
"filter" : {
"bool" : {
"should" : [
{ "term" : {"price" : 20}},
{ "term" : {"productID" : "XHDK-A-1293-#fJ3"}}
],
"must_not" : {
"term" : {"price" : 30}
}
}
}
}
}
}
Follow the below link for documentation
https://www.elastic.co/guide/en/elasticsearch/guide/current/combining-filters.html
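Note that the `filtered` query used above was deprecated in Elasticsearch 2.0 and removed in 5.0; on later versions the same conditions are usually written as a `bool` query with a `filter` clause. The Python sketch below builds such a body as a plain dict. The function name `sql_like_filter` is hypothetical, and the explicit `minimum_should_match` is added only to make the OR requirement visible.

```python
import json


def sql_like_filter():
    """Body for: WHERE (price = 20 OR productID = '...') AND price != 30."""
    return {
        "query": {
            "bool": {
                "filter": {
                    "bool": {
                        # OR part of the WHERE clause
                        "should": [
                            {"term": {"price": 20}},
                            {"term": {"productID": "XHDK-A-1293-#fJ3"}},
                        ],
                        "minimum_should_match": 1,
                        # AND price != 30
                        "must_not": {"term": {"price": 30}},
                    }
                }
            }
        }
    }


print(json.dumps(sql_like_filter(), indent=2))
```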

Elasticsearch: can this complex filtered query be simplified?

I have the following filtered query:
{
"query": {
"filtered" : {
"query": {
....
},
"filter" : {
"bool" : {
"should" : [
{
"bool" : {
"must" : [
{ "term1" : { "name1" : "value1" } },
{ "term2" : { "name2" : "value2" } }
]
}
},
{
"bool" : {
"must" : [
......
]
}
},
{
"bool" : {
"must" : [
......
]
}
},
{
"bool" : {
"must" : [
.......
]
}
}
]
}
}
}
}
}
Is there any room for me to improve the complex filters without harming performance? How?
I feel the filter part can be simplified, but not sure. Also not sure about any performance impact.
Thanks!
UPDATE
Under each must clause, there is a GROUP of clauses. See the first must clause for an example.
UPDATE 2
According to Duc's input and Mario's updated answer, I am very happy that what I have so far is the right way. No changes are needed.
I choose Mario's response as the answer because it confirms that mine is right.
You don't need a bool with only one must clause. You can put your conditions directly within the outer should.
{
"query": {
"filtered" : {
"query": {
....
},
"filter" : {
"bool" : {
"should" : [
{
.......
},
{
.......
},
{
.......
},
{
.......
}
]
}
}
}
}
}
UPDATE: since I see from your update that you have several clauses within your nested must statements, my suggestion is no longer valid. Your query filter is ok like it is (except that term1 and term2 aren't valid, but that's not the point of the question)
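The rule from this answer (a bool that wraps exactly one must clause can be replaced by that clause, while a group of clauses has to stay wrapped) can be sketched as a small Python helper. The names `unwrap_single_must` and `simplify_should` are hypothetical, and the clauses are plain dicts as they would appear in the request body.

```python
def unwrap_single_must(clause):
    """Replace {"bool": {"must": [x]}} with x; leave anything else untouched."""
    bool_body = clause.get("bool")
    # Only unwrap when must is the sole key, so no other conditions are lost.
    if isinstance(bool_body, dict) and set(bool_body) == {"must"}:
        must = bool_body["must"]
        if isinstance(must, dict):
            return must
        if isinstance(must, list) and len(must) == 1:
            return must[0]
    return clause


def simplify_should(should_clauses):
    """Apply the unwrapping to every entry of an outer should list."""
    return [unwrap_single_must(c) for c in should_clauses]


single = {"bool": {"must": [{"term": {"name1": "value1"}}]}}
group = {"bool": {"must": [{"term": {"name1": "value1"}},
                           {"term": {"name2": "value2"}}]}}
print(simplify_should([single, group]))
```

The grouped entry comes back unchanged, which matches the conclusion of the update: with several clauses under each must, the original filter is already as simple as it can be.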

Elastic(search): How to structure nested queries correctly?

I'm currently quite confused about the structuring of queries in Elastic. Let me explain what I mean with the following template that works fine for me:
{
"template" : {
"query" : {
"filtered" : {
"query" : {
"bool" : {
"must" : [
{ "match" : {
"user" : "{{param_user}}"
} },
{ "match" : {
"session" : "{{param_session}}"
} },
{ "range" : {
"date" : {
"gte" : "{{param_from}}",
"lte" : "{{param_to}}"
}
} }
]
}
}
}
}
}
}
Ok, so I want to get the entries of a specific session of a user in a certain time period. Now if you take a look at this link http://www.elastic.co/guide/en/elasticsearch/guide/current/combining-filters.html you can find the following query:
{
"query" : {
"filtered" : {
"filter" : {
"bool" : {
"should" : [
{ "term" : {"price" : 20}},
{ "term" : {"productID" : "XHDK-A-1293-#fJ3"}}
],
"must_not" : {
"term" : {"price" : 30}
}
}
}
}
}
}
In this example the "filter" keyword comes right after "filtered". However, if I exchange my second "query" with a "filter" as in the example, my template won't work anymore. This is really counterintuitive and I spent a lot of time figuring this out. A̶l̶s̶o̶ ̶I̶ ̶d̶o̶n̶'̶t̶ ̶u̶n̶d̶e̶r̶s̶t̶a̶n̶d̶ ̶w̶h̶y̶ ̶w̶e̶ ̶n̶e̶e̶d̶ ̶t̶o̶ ̶p̶u̶t̶ ̶e̶v̶e̶r̶y̶ ̶f̶i̶l̶t̶e̶r̶ ̶i̶n̶ ̶s̶e̶p̶a̶r̶a̶t̶e̶ ̶̶{̶ ̶}̶̶ ̶e̶v̶e̶n̶ ̶t̶h̶o̶u̶g̶h̶ ̶t̶h̶e̶y̶ ̶a̶r̶e̶ ̶a̶l̶r̶e̶a̶d̶y̶ ̶s̶e̶p̶a̶r̶a̶t̶e̶d̶ ̶b̶y̶ ̶t̶h̶e̶ ̶a̶r̶r̶a̶y̶ ̶s̶y̶n̶t̶a̶x̶.̶
Another issue I had was that I assumed that, to match several fields, I could just type something like:
{
"query" : {
"match" : {
"user" : "{{param_user}}",
"session" : "{{param_session}}"
}
}
}
but it seemed that I have to use a bool query which I didn't know of, so I searched for 'elastic multi match' but got something completely different.
My question: where can I find out how to structure a query properly (something like a PEG)? The documentation only gives basic examples but doesn't state what we can actually do and how.
Best regards,
Jan
Edit: Ok I just found by accident that I cannot exchange "query" with "filter" as "match" is a query and not a filter. But then again what about "range"? It seems to be a query as well as a filter... Is there a summary of keywords specifying in which context they can be used?
Is there a summary of keywords specifying in which context they can be used?
I wouldn't consider those keywords. It's just that there are queries and filters with the same names (though not all of them).
Here is everything you need. For example, there is both a range query and a range filter. All you need is to understand the difference between filters and queries.
For example, if you want to move the range section from the query to the filter, you can do it as shown in the code below (not tested). Since your code already contains a filtered query, you can just create a filter section right after the query section.
{
"template": {
"query": {
"filtered": {
"query": {
"bool": {
"must": [
{
"match": {
"user": "{{param_user}}"
}
},
{
"match": {
"session": "{{param_session}}"
}
}
]
}
},
"filter": {
"range": {
"date": {
"gte": "{{param_from}}",
"lte": "{{param_to}}"
}
}
}
}
}
}
}
Just remember that you can filter only not analyzed fields.
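The same split can be sketched in Python: since a match clause takes a single field, each field gets its own match under bool / must, while the date range goes into the filter part. This mirrors the filtered syntax used in this answer; the function name `session_query` and its parameters are hypothetical stand-ins for the template parameters.

```python
import json


def session_query(user, session, date_from, date_to):
    """Build a filtered query: scored matches plus a non-scoring date range."""
    # One match clause per field; a single match cannot hold several fields.
    must = [
        {"match": {field: value}}
        for field, value in (("user", user), ("session", session))
    ]
    return {
        "query": {
            "filtered": {
                "query": {"bool": {"must": must}},
                "filter": {
                    "range": {"date": {"gte": date_from, "lte": date_to}}
                },
            }
        }
    }


print(json.dumps(session_query("jan", "abc123", "2015-01-01", "2015-01-31"),
                 indent=2))
```

The argument values in the usage line are made up purely for illustration.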

How do I have to write a Search Query in ElasticSearch?

I use the Grails ElasticSearch Plugin and want to use the following query:
"bool" : {
"must" : {
"term" : { "user" : "kimchy" }
},
"must_not" : {
"range" : {
"age" : { "from" : 10, "to" : 20 }
}
},
"should" : [
{
"term" : { "tag" : "wow" }
},
{
"term" : { "tag" : "elasticsearch" }
}
],
"minimum_should_match" : 1,
"boost" : 1.0
}
Using the groovy api from the Grails plugin I would write something like:
def res = userAgentIdentService.search() {
"bool" {
"must" {
term("user" : "kimchy" )
}
"must_not" {
"range" {
age("from" : 10, "to" : 20 }
}
}
"should" : [
{
term( "tag" : "wow" )
}
{
term("tag" : "elasticsearch" )
}
]
"minimum_should_match" = 1
"boost" = 1.0
}
}
My query is not working!
Where do I have to define minimum_should_match and how do I have to define it?
How do I have to write the "should" : [ ... ] square brackets notation in the grails / groovy manner?
I think you're missing a couple of JSON levels in your search request. I don't think you can use the query without specifying that it is a query (it could be a filter as well, or even something else). Have a look at this example from the Groovy API reference:
def search = node.client.search {
indices "test"
types "type1"
source {
query {
terms(test: ["value1", "value2"])
}
}
}
