Elasticsearch: check if an array contains a value

I want to check whether an array field of type long contains certain values.
The only way I found is to use a script (see "ElasticSearch Scripting: check if array contains a value"),
but it is still not working for me.
Query:
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"script": {
"script": "doc['Commodity'].values.contains(param1)",
"params": {
"param1": 50
}
}
}
}
}
}
But I get 0 hits, even though I have matching records such as:
{
"_index" : "aaa",
"_type" : "logs",
"_id" : "2zzlXEOgRtujWiCGtX6s9Q",
"_score" : 1,
"_source" : {
"Commodity" : [
50
],
"Type" : 1,
"SourceId" : "fsd",
"Id" : 123
}
}

Try this instead of that script:
{
"query": {
"filtered": {
"filter": {
"terms": {
"Commodity": [
55,
150
],
"execution": "and"
}
}
}
}
}

For those of you using a recent version of Elasticsearch (7.1.1 at the time of writing), please note that
"filtered" and "execution" are deprecated (the filtered query was removed in 5.0), so @Andrei Stefan's answer may not help anymore.
You can go through the discussion below for alternative approaches:
https://discuss.elastic.co/t/is-there-an-alternative-solution-to-terms-execution-and-on-es-2-x/41089
Following the answer written by nik9000 in that discussion, I just replaced "term" with "terms" (in PHP) and it started working with array inputs; AND was applied across the "terms" clauses that I used.
EDIT: On request, here is a sample query written in PHP.
'body' => [
'query' => [
'bool' => [
'filter' => [
// each terms clause matches when the field holds any value from its array
['terms' => ['key1' => $array1]],
['terms' => ['key2' => $array2]],
['terms' => ['key3' => $array3]],
['terms' => ['key4' => $array4]],
]
]
]
]
key1 to key4 are keys present in my Elasticsearch data, and each one is searched for in its respective array. AND is applied between the ['terms' => ['key' => $array]] lines.
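As a rough sketch of the difference between "any of" and "all of" semantics (written in Python purely for illustration; the helper names contains_any_query and contains_all_query are made up here, and the code only builds request bodies without contacting a cluster): a terms clause matches when the field holds at least one of the listed values, so requiring every value means one term clause per value inside bool/filter.

```python
import json


def contains_any_query(field, values):
    """Match documents whose array field holds at least one of the values."""
    return {"query": {"bool": {"filter": [{"terms": {field: list(values)}}]}}}


def contains_all_query(field, values):
    """Match documents whose array field holds every one of the values.

    Clauses inside bool/filter are ANDed, so one term clause per value is used.
    """
    return {"query": {"bool": {"filter": [{"term": {field: v}} for v in values]}}}


if __name__ == "__main__":
    # "Commodity" and the values 55 and 150 are taken from the question above.
    print(json.dumps(contains_any_query("Commodity", [55, 150]), indent=2))
    print(json.dumps(contains_all_query("Commodity", [55, 150]), indent=2))
```

For the original question (does Commodity contain 50?), contains_any_query("Commodity", [50]) should be enough, since a term or terms clause on an array field matches when any element equals the value.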

For those of you who are using ES 6.x, this might help.
Here I am checking whether the user (rennish.joseph@gmail.com) has any of the orders in a given array:
GET user-orders/_search
{
"query": {
"bool": {
"filter": [
{
"terms":{
"orders":["123456","45678910"]
}
},
{
"term":{
"user":"rennish.joseph@gmail.com"
}
}
]
}
}
}

How to get the best matching document in Elasticsearch?

I have an index where I store all the places used in my documents. I want to use this index to see if the user mentioned one of the places in the text query I receive.
Unfortunately, I have two documents whose name is similar enough to trick Elasticsearch scoring: Stockholm and Stockholm-Arlanda.
My test phrase is intyg stockholm and this is the query I use to get the best matching document.
{
"size": 1,
"query": {
"bool": {
"should": [
{
"match": {
"name": "intyg stockholm"
}
}
],
"must": [
{
"term": {
"type": {
"value": "4"
}
}
},
{
"terms": {
"name": [
"intyg",
"stockholm"
]
}
},
{
"exists": {
"field": "data.coordinates"
}
}
]
}
}
}
As you can see, I use a terms query to find the candidate documents, and a match query in the should part of the root bool query so that scoring puts the document I want (Stockholm) on top.
This worked locally (where I run ES in a container) but broke when I started testing on a cluster hosted in AWS (with the exact same dataset). I found an explanation of what happens, and adding the search type argument does fix the issue.
Since that workaround is best not used in production, I'm looking for other ways to get the expected result.
Here are the two documents:
// Stockholm
{
"type" : 4,
"name" : "Stockholm",
"id" : "42",
"searchableNames" : [
"Stockholm"
],
"uniqueId" : "Place:42",
"data" : {
"coordinates" : "59.32932349999999,18.0685808"
}
}
// Stockholm-Arlanda
{
"type" : 4,
"name" : "Stockholm-Arlanda",
"id" : "1832",
"searchableNames" : [
"Stockholm-Arlanda"
],
"uniqueId" : "Place:1832",
"data" : {
"coordinates" : "59.6497622,17.9237807"
}
}

Elasticsearch search with nested properties

I have an instance that stores articles with various properties. Some articles may have no properties at all. There are countless properties and assigned values, all in random order.
The problem is that my search does not work the way I would like. The properties are taken into account, but the order of the properties seems to matter. An entry in the instance may have many properties while the search query asks for only one or two of them, and the queried values may deviate slightly from the stored ones.
The goal is to find entries that are as similar as possible, regardless of the order of the properties.
Can anyone help me with this?
Elastic instance info:
{
"_index" : "articles",
"_type" : "_doc",
"_id" : "fYjaQXkBBdCju4scstN_",
"_score" : 1.0,
"_source" : {
"position" : "400.000",
"beschreibung" : "asc",
"menge" : 24.0,
"einheit" : "St",
"properties" : [
{
"desc" : "Farbe",
"val" : "rot"
},
{
"desc" : "Material",
"val" : "Holz"
},
{
"desc" : "Länge",
"val" : "20 cm"
},
{
"desc" : "Breite",
"val" : "100 km"
}
]
}
}
The nested part of my current query:
[nested] => Array
(
[path] => properties
[query] => Array
(
[0] => Array
(
[0] => Array
(
[bool] => Array
(
[should] => Array
(
[0] => Array
(
[match] => Array
(
[properties.desc] => Farbe
)
)
[1] => Array
(
[match] => Array
(
[properties.val] => rot
)
)
)
)
)
)
[1] => Array
(
[0] => Array
(
[bool] => Array
(
[should] => Array
(
[0] => Array
(
[match] => Array
(
[properties.desc] => Länge
)
)
[1] => Array
(
[match] => Array
(
[properties.val] => 22 cm
)
)
)
)
)
)
[2] => Array
(
[0] => Array
(
[bool] => Array
(
[should] => Array
(
[0] => Array
(
[match] => Array
(
[properties.desc] => Material
)
)
[1] => Array
(
[match] => Array
(
[properties.val] => Holz
)
)
)
)
)
)
)
)
There are two problems in your query that lead to strange results.
The first problem: you're using a match query on a text field with a value that contains multiple terms. So when you run
"match": {
"properties.val": "22 cm"
}
Elasticsearch searches for "22" OR "cm" in the properties.val field. I assume you want to match the whole phrase, so you could use match_phrase here, for example. Alternatively, you could put the unit into its own field. Another option is to use the operator parameter:
"match": {
"properties.val": {
"query": "20 cm",
"operator": "and"
}
}
Be aware, though, that this does not look for the exact phrase. For example, "20 30 cm" would also match, but that may be acceptable in your case.
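To make the three behaviours concrete, here is a small Python simulation. It is only a sketch: the tokens function is a deliberately simplified stand-in (lowercase plus whitespace split) for whatever analyzer the field really uses, and the function names are made up for this illustration.

```python
def tokens(text):
    """Simplified stand-in for an analyzer: lowercase, split on whitespace."""
    return text.lower().split()


def match_or(field_value, query):
    """Default match behaviour: any one query term is enough."""
    doc = tokens(field_value)
    return any(t in doc for t in tokens(query))


def match_and(field_value, query):
    """match with operator=and: every query term must appear, in any position."""
    doc = tokens(field_value)
    return all(t in doc for t in tokens(query))


def match_phrase(field_value, query):
    """match_phrase: the query terms must appear adjacent and in order."""
    doc, q = tokens(field_value), tokens(query)
    return any(doc[i:i + len(q)] == q for i in range(len(doc) - len(q) + 1))


if __name__ == "__main__":
    for value in ["20 cm", "22 cm", "20 30 cm", "100 km"]:
        print(value,
              match_or(value, "20 cm"),
              match_and(value, "20 cm"),
              match_phrase(value, "20 cm"))
```

In this simplified model, "22 cm" is accepted by the default match for "20 cm" because the shared term "cm" is enough, while "20 30 cm" passes the and-operator variant but not the phrase variant.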
The second problem: you're using the should clause at the property level. You're basically asking for documents with properties that "should have Farbe in their description and rot in their value". Since a bool query with only should clauses needs just one of them to match, that matches all of the following examples:
"properties" : [
{
"desc" : "Farbe",
"val" : "blau"
}
]
"properties" : [
{
"desc" : "Material",
"val" : "rot"
}
]
"properties" : [
{
"desc" : "Farbe",
"val" : "blau"
},
{
"desc" : "Material",
"val" : "rot"
}
]
So you need a bool query (with must or filter clauses) for each property you want to match, and an outer bool query with one should clause per property. Your query from the question could then look like this:
{
"query": {
"bool": {
"should": [
{
"nested": {
"path": "properties",
"query": {
"bool": {
"filter": [
{
"match": {
"properties.desc": "Farbe"
}
}
],
"must": [
{
"match_phrase": {
"properties.val": "rot"
}
}
]
}
}
}
},
{
"nested": {
"path": "properties",
"query": {
"bool": {
"filter": [
{
"match": {
"properties.desc": "Länge"
}
}
],
"must": [
{
"match_phrase": {
"properties.val": "22 cm"
}
}
]
}
}
}
},
{
"nested": {
"path": "properties",
"query": {
"bool": {
"filter": [
{
"match": {
"properties.desc": "Material"
}
}
],
"must": [
{
"match_phrase": {
"properties.val": "Holz"
}
}
]
}
}
}
}
]
}
}
}
Please check whether this gives the desired results.
You could still tweak the query, for example by using minimum_should_match or by defining the score contributed by each matched property.
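Since the query repeats the same nested clause for every property, it can be generated. The following Python sketch builds the same structure from a dictionary of wanted properties; the function names property_clause and similar_articles_query are invented for this example, and the minimum_should_match value is just an illustrative choice.

```python
import json


def property_clause(desc, val):
    """One nested clause: desc and val must match inside the same nested object."""
    return {
        "nested": {
            "path": "properties",
            "query": {
                "bool": {
                    "filter": [{"match": {"properties.desc": desc}}],
                    "must": [{"match_phrase": {"properties.val": val}}],
                }
            },
        }
    }


def similar_articles_query(wanted, minimum_should_match=1):
    """Combine one nested clause per wanted property under a top-level should."""
    return {
        "query": {
            "bool": {
                "should": [property_clause(d, v) for d, v in wanted.items()],
                "minimum_should_match": minimum_should_match,
            }
        }
    }


if __name__ == "__main__":
    body = similar_articles_query(
        {"Farbe": "rot", "Länge": "22 cm", "Material": "Holz"},
        minimum_should_match=2,  # illustrative: at least two properties must match
    )
    print(json.dumps(body, indent=2, ensure_ascii=False))
```

Because each property gets its own nested clause, the order of the properties in the stored document does not matter, and every additional matching property adds to the score.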

How to get the count of non-null values for a specific field in Elasticsearch

I have an Elasticsearch index and I need the total count of records in which one of the fields ("actual_start") is not null. How can I do this?
I have written this query and I want to append the count of non-null actual_start values to its result:
$params = [
'index' => "appointment_request",
'body' => [
'from' => $request['from'],
'size' => $request['size'],
"query" => [
"bool"=>[
"must" => [
[
"term" => [
"doctor_id" => [
"value" => $request['doctor_id']
]
]
],
[
"match" => [
"book_date" => $request['book_date']
]
],
]
],
]
]
];
Take a look at Exists Query
Try this:
GET <your-index>/_count
{
"query": {
"exists": {
"field": "actual_start"
}
}
}
You can then read the count value, which gives the total number of records in which actual_start is not null.
You can also replace _count with _search; the total value under hits then gives you the same count (in case you also want the hits).
If you want the opposite (all records in which actual_start is null):
GET <your-index>/_count
{
"query": {
"bool": {
"must_not": [
{
"exists": {
"field": "actual_start"
}
}
]
}
}
}
UPDATE
If I understand you correctly, you want to combine your current query with the exists query.
Example:
GET <your-index>/_search
{
"query": {
"bool": {
"must": [
{
<put-your-query-here>
},
{
"exists": {
"field": "actual_start"
}
}
]
}
}
}
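If the hits should stay unfiltered and the non-null count should only be appended to the response, a different option from the bool/must combination is a filter aggregation with an exists query inside it. The Python sketch below only builds the request body and reads the count from a response dictionary; the function names, the aggregation name non_null_count, the sample doctor_id and book_date values, and the fake response are all made up for illustration.

```python
import json


def search_with_non_null_count(query, field, offset=0, size=10):
    """Keep the original query and add a filter aggregation counting hits where `field` exists."""
    return {
        "from": offset,
        "size": size,
        "query": query,
        "aggs": {"non_null_count": {"filter": {"exists": {"field": field}}}},
    }


def read_non_null_count(response):
    """Pull the aggregation's doc_count out of a search response dictionary."""
    return response["aggregations"]["non_null_count"]["doc_count"]


if __name__ == "__main__":
    # Same shape as the query in the question; 42 and the date are placeholder values.
    original_query = {
        "bool": {
            "must": [
                {"term": {"doctor_id": {"value": 42}}},
                {"match": {"book_date": "2021-05-01"}},
            ]
        }
    }
    print(json.dumps(search_with_non_null_count(original_query, "actual_start"), indent=2))

    # Hypothetical response fragment, only to show where the count would be read from.
    fake_response = {"aggregations": {"non_null_count": {"doc_count": 7}}}
    print(read_non_null_count(fake_response))
```

The aggregation runs over the documents matched by the query, so the count is limited to the same doctor and date while the returned hits are not reduced.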

Elasticsearch query on a nested field with condition

Elasticsearch v7.0
Hello and good day!
I'm trying to create a query with a condition: if a nested field has only 1 element, get that first element; if it has 2 or more elements, get the element that matches a nested field condition.
Scenario:
I have an index named socialmedia with a nested field named cms, which holds a sentiment for that document.
An example document with the cms field looks like this:
"_id" : 1,
"cms" : [
{
"cli_id" : 0,
"cmx_sentiment" : "Negative"
}
]
This cms field contains "cli_id" : 0 by default for its first element (meaning it is visible to all clients/users), but sooner or later it looks like this:
"_id": 1,
"cms" : [
{
"cli_id" : 0,
"cmx_sentiment" : "Negative"
},
{
"cli_id" : 1,
"cmx_sentiment" : "Positive"
},
{
"cli_id" : 2,
"cmx_sentiment" : "Neutral"
}
]
The 2nd and 3rd elements show that the clients with cli_id equal to 1 and 2 have set a sentiment for that document.
Now, I want to formulate a query such that, if the logged-in client has no sentiment yet for a specific document, it fetches the cmx_sentiment that has "cli_id" : 0.
BUT, if the logged-in client has a sentiment for a document fetched according to his filters, the query fetches the cmx_sentiment with the matching cli_id of the logged-in client.
For example:
the client with a cli_id of 2 will get the cmx_sentiment **Neutral** for the document above
the client with a cli_id of 5 will get the cmx_sentiment **Negative**, because he hasn't given a sentiment to the document
PSEUDO CODE:
If a document has a sentiment set by the client, get the cmx_sentiment of the element whose cli_id == the client's ID
If a document is fresh or the client HAS NOT yet labeled a sentiment on that document, get the cmx_sentiment of the element that has cli_id == 0
I need a query that implements the condition in the pseudo code above.
Here's my sample query:
"aggs" => [
"CMS" => [
"nested" => [
"path" => "cms",
],
"aggs" => [
"FILTER" => [
"filter" => [
"bool" => [
"should" => [
[
"match" => [
"cms.cli_id" => 0
]
],
[
"bool" => [
"must" => [
[
// I'm planning to add a bool clause here that tests whether cli_id equals the logged-in client's ID
]
]
]
]
]
]
],
"aggs"=> [
"TONALITY"=> [
"terms"=> [
"field" => "cms.cmx_sentiment"
],
]
]
]
]
]
]
Is my query correct?
The problem with the query I have provided is that it SUMS all the elements instead of picking only one.
The query above produces this scenario:
The client with cli_id 2 logs in
Both the Neutral and the Negative cmx_sentiment are retrieved, instead of Neutral alone
After the discussion with the OP, I'm rewriting this answer.
To get the desired result, consider the following when building the query and the aggregation:
Query:
This contains any filter applied by the logged-in user. For the example I'm using match_all, since every document has at least one nested doc in the cms field, i.e. the one for cli_id: 0.
Aggregation:
Here we have to split the aggregation into two:
default_only
sentiment_only
default_only
In this aggregation we count the documents that don't have a nested document for cli_id: <logged in client id>, i.e. only those docs which have a nested doc for cli_id: 0.
To do this we follow the steps below:
default_only: Use a filter aggregation to get the documents that do not have a nested document for cli_id: <logged in client id>, i.e. must_not => cli_id: <logged in client id>.
default_nested: Add a nested sub-aggregation, since sentiment is a field of the nested documents.
sentiment_for_cli_id: Add a filter sub-aggregation to default_nested to keep only the sentiment of the default client, i.e. cli_id: 0.
default: Add this terms sub-aggregation to sentiment_for_cli_id to get the counts per sentiment. Note that this is a count of nested docs; because there is only ever one nested doc per cli_id, it happens to equal the count of parent docs, but it is not the same thing.
the_doc_count: Add this reverse_nested aggregation as a sub-aggregation of default to step out of the nested docs and get the count of parent docs.
sentiment_only
This aggregation gives the count per sentiment for documents where cli_id: <logged in client id> is present. We follow the same approach as for default_only, with these tweaks:
sentiment_only: must => cli_id: <logged in client id>
sentiment_nested: same reason as above
sentiment_for_cli_id: same, but instead of the default client we filter for cli_id: <logged in client id>
sentiment: same as default
the_doc_count: same as above
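The steps above can be sketched as a small Python builder that takes the logged-in client's ID as a parameter. The function name sentiment_aggs and its inner helpers are made up for this illustration; the aggregation names and fields are the ones described in the steps.

```python
import json


def sentiment_aggs(client_id, default_client_id=0):
    """Build the default_only and sentiment_only aggregations for one client."""

    def has_client(cid):
        # Parent-level test: does the document have a nested cms entry for cid?
        return {"nested": {"path": "cms", "query": {"term": {"cms.cli_id": cid}}}}

    def per_sentiment(cid, terms_name):
        # Step into cms, keep only cid's entry, bucket by sentiment,
        # then step back out to count parent documents.
        return {
            "nested": {"path": "cms"},
            "aggs": {
                "sentiment_for_cli_id": {
                    "filter": {"term": {"cms.cli_id": cid}},
                    "aggs": {
                        terms_name: {
                            "terms": {"field": "cms.cmx_sentiment"},
                            "aggs": {"the_doc_count": {"reverse_nested": {}}},
                        }
                    },
                }
            },
        }

    return {
        "default_only": {
            "filter": {"bool": {"must_not": [has_client(client_id)]}},
            "aggs": {"default_nested": per_sentiment(default_client_id, "default")},
        },
        "sentiment_only": {
            "filter": {"bool": {"must": [has_client(client_id)]}},
            "aggs": {"sentiment_nested": per_sentiment(client_id, "sentiment")},
        },
    }


if __name__ == "__main__":
    body = {"query": {"match_all": {}}, "aggs": sentiment_aggs(client_id=2)}
    print(json.dumps(body, indent=2))
```

Only client_id changes per request; the default client ID stays 0 as described in the question.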
Example:
PUT socialmedia/_bulk
{"index":{"_id": 1}}
{"cms":[{"cli_id":0,"cmx_sentiment":"Positive"}]}
{"index":{"_id": 2}}
{"cms":[{"cli_id":0,"cmx_sentiment":"Positive"},{"cli_id":2,"cmx_sentiment":"Neutral"}]}
{"index":{"_id": 3}}
{"cms":[{"cli_id":0,"cmx_sentiment":"Positive"},{"cli_id":2,"cmx_sentiment":"Negative"}]}
{"index":{"_id": 4}}
{"cms":[{"cli_id":0,"cmx_sentiment":"Positive"},{"cli_id":2,"cmx_sentiment":"Neutral"}]}
Query:
GET socialmedia/_search
{
"query": {
"match_all": {}
},
"aggs": {
"default_only": {
"filter": {
"bool": {
"must_not": [
{
"nested": {
"path": "cms",
"query": {
"term": {
"cms.cli_id": 2
}
}
}
}
]
}
},
"aggs": {
"default_nested": {
"nested": {
"path": "cms"
},
"aggs": {
"sentiment_for_cli_id": {
"filter": {
"term": {
"cms.cli_id": 0
}
},
"aggs": {
"default": {
"terms": {
"field": "cms.cmx_sentiment"
},
"aggs": {
"the_doc_count": {
"reverse_nested": {}
}
}
}
}
}
}
}
}
},
"sentiment_only": {
"filter": {
"bool": {
"must": [
{
"nested": {
"path": "cms",
"query": {
"term": {
"cms.cli_id": 2
}
}
}
}
]
}
},
"aggs": {
"sentiment_nested": {
"nested": {
"path": "cms"
},
"aggs": {
"sentiment_for_cli_id": {
"filter": {
"term": {
"cms.cli_id": 2
}
},
"aggs": {
"sentiment": {
"terms": {
"field": "cms.cmx_sentiment"
},
"aggs": {
"the_doc_count": {
"reverse_nested": {}
}
}
}
}
}
}
}
}
}
}
}
Agg Output:
"aggregations" : {
"default_only" : {
"doc_count" : 1,
"default_nested" : {
"doc_count" : 1,
"sentiment_for_cli_id" : {
"doc_count" : 1,
"default" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "Positive",
"doc_count" : 1,
"the_doc_count" : {
"doc_count" : 1
}
}
]
}
}
}
},
"sentiment_only" : {
"doc_count" : 3,
"sentiment_nested" : {
"doc_count" : 6,
"sentiment_for_cli_id" : {
"doc_count" : 3,
"sentiment" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "Neutral",
"doc_count" : 2,
"the_doc_count" : {
"doc_count" : 2
}
},
{
"key" : "Negative",
"doc_count" : 1,
"the_doc_count" : {
"doc_count" : 1
}
}
]
}
}
}
}
}
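To end up with a single tonality count per sentiment, the two bucket lists still have to be added together on the client side. A minimal Python sketch, assuming the response has exactly the shape of the output above (the function name merge_sentiment_counts is made up, and the sample dictionary is a trimmed copy of that output):

```python
def merge_sentiment_counts(aggregations):
    """Add up parent-document counts from the default and client-specific buckets."""
    default_buckets = (aggregations["default_only"]["default_nested"]
                       ["sentiment_for_cli_id"]["default"]["buckets"])
    client_buckets = (aggregations["sentiment_only"]["sentiment_nested"]
                      ["sentiment_for_cli_id"]["sentiment"]["buckets"])
    counts = {}
    for bucket in default_buckets + client_buckets:
        key = bucket["key"]
        # the_doc_count is the reverse_nested count, i.e. parent documents.
        counts[key] = counts.get(key, 0) + bucket["the_doc_count"]["doc_count"]
    return counts


# Trimmed copy of the aggregation output shown above.
sample = {
    "default_only": {"default_nested": {"sentiment_for_cli_id": {"default": {"buckets": [
        {"key": "Positive", "doc_count": 1, "the_doc_count": {"doc_count": 1}},
    ]}}}},
    "sentiment_only": {"sentiment_nested": {"sentiment_for_cli_id": {"sentiment": {"buckets": [
        {"key": "Neutral", "doc_count": 2, "the_doc_count": {"doc_count": 2}},
        {"key": "Negative", "doc_count": 1, "the_doc_count": {"doc_count": 1}},
    ]}}}},
}

if __name__ == "__main__":
    print(merge_sentiment_counts(sample))
```

For the four example documents and client 2 this yields one Positive (document 1, default sentiment), two Neutral and one Negative, so every document is counted exactly once.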

How to use ElasticSearch Query params (DSL query) for multiple types?

I have been working with ElasticSearch for the last few months, but I still find it difficult when I have to write a complicated query.
I want to run a query that searches multiple "types", where each type is searched with its own "filters", but the search results need to be combined.
For example:
I need to search the "user type" documents that are my friends and, at the same time, the "object type" documents that I like, according to the keyword provided.
OR
A query that has both an "AND" and a "NOT" clause
Example query:
$options['query'] = array(
'query' => array(
'filtered' => array(
'query' => array(
'query_string' => array(
'default_field' => 'name',
'query' => $this->search_term . '*',
),
),
'filter' => array(
'and' => array(
array(
'term' => array(
'access_id' => 2,
),
),
),
'not' => array(
array(
'term' => array(
'follower' => 32,
),
),
array(
'term' => array(
'fan' => 36,
),
),
),
),
),
),
);
This query is meant to find users with access_id = 2 that do not have the follower with id 32 and the fan with id 36,
but it is not working.
Edit: Modified query
{
"query": {
"filtered": {
"filter": {
"and": [
{
"not": {
"filter": {
"and": [
{
"query": {
"query_string": {
"default_field": "fan",
"query": "*510*"
}
}
},
{
"query": {
"query_string": {
"default_field": "follower",
"query": "*510*"
}
}
}
]
}
}
},
{
"term": {
"access_id": 2
}
}
]
},
"query": {
"field": {
"name": "xyz*"
}
}
}
}
}
Now, after running this query, I am getting two results: one with follower: "34,518" & fan: "510", and a second with fan: "34". Shouldn't only the second one be in the result?
Any ideas?
You may want to look at the slides of a presentation that I gave this month, which explains the basics of how the query DSL works:
Terms of endearment - the ElasticSearch Query DSL explained
The problem with your query is that your filters are nested incorrectly. The and and not filters are at the same level, but the not filter should be under and:
curl -XGET 'http://127.0.0.1:9200/_all/_search?pretty=1' -d '
{
"query" : {
"filtered" : {
"filter" : {
"and" : [
{
"not" : {
"filter" : {
"and" : [
{
"term" : {
"fan" : 36
}
},
{
"term" : {
"follower" : 32
}
}
]
}
}
},
{
"term" : {
"access_id" : 2
}
}
]
},
"query" : {
"field" : {
"name" : "keywords to search"
}
}
}
}
}
'
I just tried it with a bool query:
{
"query": {
"bool": {
"must": [
{
"term": {
"access_id": 2
}
},
{
"wildcard": {
"name": "xyz*"
}
}
],
"must_not": [
{
"wildcard": {
"follower": "*510*"
}
},
{
"wildcard": {
"fan": "*510*"
}
}
]
}
}
}
It gives the correct result,
but I'm not sure whether it should be used like this.
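As a sanity check of the must / must_not logic, here is a small local simulation in Python. It is only a sketch under loud assumptions: the evaluator handles nothing but term and wildcard clauses, it uses fnmatch against the whole stored string (a real wildcard query runs against indexed terms), and the name values of the two sample documents are invented; only the follower and fan values come from the results described earlier.

```python
from fnmatch import fnmatchcase


def build_query(name_pattern, access_id, excluded_pattern):
    """Same structure as the bool query above, with the values as parameters."""
    return {"query": {"bool": {
        "must": [
            {"term": {"access_id": access_id}},
            {"wildcard": {"name": name_pattern}},
        ],
        "must_not": [
            {"wildcard": {"follower": excluded_pattern}},
            {"wildcard": {"fan": excluded_pattern}},
        ],
    }}}


def clause_matches(clause, doc):
    """Evaluate a single term or wildcard clause against a plain dict."""
    kind, body = next(iter(clause.items()))
    field, expected = next(iter(body.items()))
    value = doc.get(field)
    if value is None:
        return False  # a missing field never matches
    if kind == "term":
        return value == expected
    if kind == "wildcard":
        return fnmatchcase(str(value), expected)
    raise ValueError(f"unsupported clause: {kind}")


def bool_matches(query, doc):
    """All must clauses have to match and no must_not clause may match."""
    bool_q = query["query"]["bool"]
    return (all(clause_matches(c, doc) for c in bool_q.get("must", []))
            and not any(clause_matches(c, doc) for c in bool_q.get("must_not", [])))


if __name__ == "__main__":
    query = build_query("xyz*", 2, "*510*")
    docs = [
        {"name": "xyz one", "access_id": 2, "follower": "34,518", "fan": "510"},
        {"name": "xyz two", "access_id": 2, "fan": "34"},
    ]
    for doc in docs:
        print(doc["name"], bool_matches(query, doc))
```

In this model a document is dropped as soon as either must_not clause matches, which is why the first sample document (fan "510") is excluded and the second is kept.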
