How to check how many documents do not exist out of a list in Elasticsearch

What query retrieves the number of documents from a list that were not found?
This is my query:
$params['body']['query']['bool']['filter']['terms']['field'] = (list);
I want to retrieve the entries of the list that were not found.
If my list is (A, B, C), I just need to know that C is not indexed. I don't need A, B, D, E, F, or any of the remaining documents in the index.

You can use a must_not clause to negate your query, as shown below:
GET my-index/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must_not": {
            "terms": {
              "field": [
                "value-1", "value-2"
              ]
            }
          }
        }
      }
    }
  }
}

Combining must_not with a terms aggregation shows which other values of that field occur in the index:
{
  "_source": false,
  "query": {
    "bool": {
      "must_not": [
        {"term": {"aFieldName": "aFieldValue"}}
      ]
    }
  },
  "aggregations": {
    "byLocation": {
      "terms": {
        "field": "aFieldName"
      }
    }
  }
}
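Neither query above returns the list entries themselves, so one way to learn which entries are missing is to ask the index which of them it does contain and take the difference on the client. The following is a minimal Python sketch under that assumption. The helper names `build_presence_query` and `missing_values` and the aggregation name `present` are hypothetical, and the sketch assumes the field is a keyword-type field so that exact values can be passed to `include`.

```python
from typing import Any, Dict, Iterable, List


def build_presence_query(field: str, wanted: Iterable[str]) -> Dict[str, Any]:
    """Build a search body that reports which of the wanted values are indexed."""
    values = list(dict.fromkeys(wanted))  # de-duplicate, keep order
    return {
        "size": 0,  # no hits needed, only the aggregation
        "query": {"bool": {"filter": [{"terms": {field: values}}]}},
        "aggs": {
            "present": {
                "terms": {
                    "field": field,
                    # Only bucket the values we asked about, and make room for all of them.
                    "include": values,
                    "size": max(len(values), 1),
                }
            }
        },
    }


def missing_values(wanted: Iterable[str], response: Dict[str, Any]) -> List[str]:
    """Return the wanted values that have no bucket in the search response."""
    buckets = response["aggregations"]["present"]["buckets"]
    found = {bucket["key"] for bucket in buckets}
    return [value for value in dict.fromkeys(wanted) if value not in found]
```

The body from `build_presence_query` can be sent with any client. For the list (A, B, C) with only A and B indexed, `missing_values` returns `['C']`.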

Related

Combining filter and must in elasticsearch

What is the difference between adding a query filter inside a must and having a query filter and a must separately?
I need to apply a filter query to a search, and both of these queries work the same for me. I would like to know whether there are any differences.
Case 1:
"query": {
  "bool": {
    "must": [
      {
        "term": {
          "field": {
            "value": "VALUE"
          }
        }
      },
      {
        "bool": {
          "filter": [
            {
              "script": {
                "script": {
                  "source": """
                    return true;
                  """
                }
              }
            }
          ]
        }
      }
    ]
  }
}
Case 2:
"query": {
  "bool": {
    "must": [
      {
        "term": {
          "field": {
            "value": "VALUE"
          }
        }
      }
    ],
    "filter": [
      {
        "script": {
          "script": {
            "source": """
              return true;
            """
          }
        }
      }
    ]
  }
}
In my opinion they do not differ, but I need references. Greetings.
Both queries work exactly the same.
Refer to the documentation on the boolean query to learn more about this structure:
must: The clause (query) must appear in matching documents and will
contribute to the score.
filter: The clause (query) must appear in matching documents. However
unlike must the score of the query will be ignored. Filter clauses are
executed in filter context, meaning that scoring is ignored and
clauses are considered for caching.
Structure of your first query, where multiple bool queries are combined:
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {}
        },
        {
          "bool": {
            "filter": {
              "script": {}
            }
          }
        }
      ]
    }
  }
}
Structure of your second query, which uses a single bool query:
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {}
        }
      ],
      "filter": [
        {
          "script": {}
        }
      ]
    }
  }
}
As you can see, in both search queries a document matches only when both the term query and the script query conditions are satisfied.
They work exactly the same; the second is the preferred syntax because it is less nested and easier to read.
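To make the equivalence concrete, the rewrite from the nested form to the flat form can be sketched in Python. This is only an illustration: `hoist_filter_only_bools` is a hypothetical helper, not part of any Elasticsearch client, and it only handles the case discussed here, where a `must` entry is a `bool` query that contains nothing but `filter` clauses (such a clause restricts matches but adds nothing to the score).

```python
import copy
from typing import Any, Dict, List


def _as_list(clauses: Any) -> List[Any]:
    """Normalise a bool clause, which may be absent, a single object, or a list."""
    if clauses is None:
        return []
    return list(clauses) if isinstance(clauses, list) else [clauses]


def hoist_filter_only_bools(bool_query: Dict[str, Any]) -> Dict[str, Any]:
    """Move filter-only bool queries found in `must` up into the outer `filter`."""
    body = copy.deepcopy(bool_query["bool"])
    kept_must: List[Any] = []
    hoisted = _as_list(body.get("filter"))
    for clause in _as_list(body.get("must")):
        is_single_bool = isinstance(clause, dict) and list(clause) == ["bool"]
        inner = clause["bool"] if is_single_bool else None
        if isinstance(inner, dict) and set(inner) == {"filter"}:
            hoisted.extend(_as_list(inner["filter"]))  # same matching, no score
        else:
            kept_must.append(clause)
    if kept_must:
        body["must"] = kept_must
    else:
        body.pop("must", None)
    if hoisted:
        body["filter"] = hoisted
    return {"bool": body}
```

Applied to the `bool` query of Case 1, this produces the `bool` query of Case 2.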

Get unique data from a field using ElasticSearch query DSL in Kibana

I already have various queries that collect data and show it in the Kibana dashboard.
Now I would like to get unique values from my result data. How can I write the query DSL for that?
Basically, I would like to get the unique values of the field contextMap.connectionid. Is there a way to achieve that using something similar to this example?
{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "app",
            "query": {
              "bool": {
                "must": [
                  {
                    "match": {
                      "app.key": "contextMap.connectionid"
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
}
You can get the distinct values, with a document count for each, using a terms aggregation.
Your search query then becomes:
{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "app",
            "query": {
              "bool": {
                "must": [
                  {
                    "match": {
                      "app.key": "contextMap.connectionid"
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "uniqueconnectionId": {
      "terms": {
        "field": "contextMap.connectionid.keyword"
      }
    }
  }
}
For more on getting the distinct values of a field, see https://discuss.elastic.co/t/get-distinct-values-from-a-field-in-elasticsearch/99783
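A terms aggregation returns only the top 10 buckets unless a larger `size` is given, and it lists values rather than counting them. As a rough Python sketch, the aggregation part could be built and read as below. The function names and the second aggregation name `uniqueconnectionIdCount` are made up for illustration; the `cardinality` aggregation used for the count is approximate for high-cardinality fields.

```python
from typing import Any, Dict, List, Tuple


def build_distinct_aggs(field: str, max_values: int = 10) -> Dict[str, Any]:
    """Aggregations that list distinct values and estimate how many there are."""
    return {
        # One bucket per distinct value, up to max_values buckets.
        "uniqueconnectionId": {"terms": {"field": field, "size": max_values}},
        # Approximate number of distinct values in the matching documents.
        "uniqueconnectionIdCount": {"cardinality": {"field": field}},
    }


def distinct_from_response(response: Dict[str, Any]) -> Tuple[List[Any], int]:
    """Extract the bucket keys and the distinct count from a search response."""
    aggs = response["aggregations"]
    keys = [bucket["key"] for bucket in aggs["uniqueconnectionId"]["buckets"]]
    return keys, aggs["uniqueconnectionIdCount"]["value"]
```

The dictionary from `build_distinct_aggs("contextMap.connectionid.keyword", 100)` would go under `"aggs"` in the search body, next to the existing `"query"`.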

Query on multiple range of document

I want to search within a specific set of documents rather than the whole index. I know the IDs of those documents, which I obtain through a separate process. For example, I want to match some sentences against the field 'pLabel', but only among the documents whose IDs I already know. My attempt is below, but it returns a bunch of documents that I did not expect.
For example, given the documents in the groups eid1, eid2, eid3, and so on, I want the query to return only the matching documents from those groups. The query is shown below.
How do I fix the query to get the right search result?
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "default_field": "pLabel",
            "query": "search words here"
          }
        }
      ],
      "must_not": [],
      "should": [
        {
          "term": {
            "eid": "eid1"
          }
        },
        {
          "term": {
            "eid": "eid2"
          }
        }
      ]
    }
  },
  "size": 0,
  "_source": [
    "eid"
  ],
  "aggs": {
    "eids": {
      "terms": {
        "field": "eid",
        "size": 1000
      }
    }
  }
}
You need to move the clauses on the document IDs from should into must.
Right now the query can return any document that matches the query_string clause; the should clauses only make it prefer documents that match the IDs.
Also, you should use a terms query:
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "default_field": "pLabel",
            "query": "search words here"
          }
        },
        {
          "terms": {
            "eid": ["eid1", "eid2"]
          }
        }
      ]
    }
  },
  "size": 0,
  "_source": [
    "eid"
  ],
  "aggs": {
    "eids": {
      "terms": {
        "field": "eid",
        "size": 1000
      }
    }
  }
}
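Since the ID restriction does not need to influence scoring, a variation is to put it in filter context instead of must. The Python sketch below wraps any existing query this way. `restrict_to_ids` is a hypothetical helper name, and the field name `eid` is taken from the question.

```python
from typing import Any, Dict, Iterable


def restrict_to_ids(query: Dict[str, Any], field: str, ids: Iterable[str]) -> Dict[str, Any]:
    """Wrap a query so that it only matches documents whose field is in ids."""
    return {
        "bool": {
            "must": [query],  # scored as before
            "filter": [{"terms": {field: list(ids)}}],  # required, not scored
        }
    }


text_query = {
    "query_string": {"default_field": "pLabel", "query": "search words here"}
}
search_body = {
    "query": restrict_to_ids(text_query, "eid", ["eid1", "eid2"]),
    "size": 0,
    "aggs": {"eids": {"terms": {"field": "eid", "size": 1000}}},
}
```

Here `search_body` has the same shape as the corrected request, with the terms clause moved from must to filter.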

To find the distinct fields in an elastic search query

I need the values of only one field and there are duplicate values in it.
POST _search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "report": {
              "value": "some_value"
            }
          }
        }
      ]
    }
  },
  "fields": [
    "field_name"
  ]
}
I need only the distinct values of field_name.
You could add a terms aggregation to your query, which returns one bucket per distinct value (optionally with a top_hits sub-aggregation if you also need a sample document for each value):
"aggs": {
  "values": {
    "terms": {
      "field": "your_field"
    }
  }
}
This SO could be helpful as well.
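When the field has more distinct values than fit comfortably in a single terms aggregation, a composite aggregation can page through them using the `after_key` it returns. The Python sketch below assumes a caller-supplied `search` function (hypothetical) that takes a request body and returns the parsed response as a dictionary. The names `distinct` and `value` are arbitrary labels chosen for this example.

```python
from typing import Any, Callable, Dict, Iterator


def iter_distinct_values(
    search: Callable[[Dict[str, Any]], Dict[str, Any]],
    field: str,
    query: Dict[str, Any],
    page_size: int = 1000,
) -> Iterator[Any]:
    """Yield every distinct value of field among documents matching query."""
    after = None
    while True:
        composite: Dict[str, Any] = {
            "size": page_size,
            "sources": [{"value": {"terms": {"field": field}}}],
        }
        if after is not None:
            composite["after"] = after  # resume after the last key already seen
        body = {"size": 0, "query": query, "aggs": {"distinct": {"composite": composite}}}
        result = search(body)["aggregations"]["distinct"]
        for bucket in result["buckets"]:
            yield bucket["key"]["value"]
        after = result.get("after_key")
        if after is None or not result["buckets"]:
            return  # no further pages
```

With the query from the question, this would be called as `iter_distinct_values(search, "field_name", {"term": {"report": {"value": "some_value"}}})`.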

How to pass list of values for a particular field in Elastic Search Query

I have a query to search for a provider_id from the Elastic Search Cluster. I am using the below query to get results for a single provider_id but need help in figuring out how can I pass a list of providers.
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "message.provider_id": {
              "query": 943523,
              "type": "phrase"
            }
          }
        }
      ]
    }
  }
}
Suppose I want to search for provider_ids = [913523, 923523, 923523, 933523, 953523] then how should I modify the query?
You could index the provider_id as not_analyzed and then use a terms query:
POST /test_index/_search
{
  "query": {
    "terms": {
      "message.provider_id": [
        "913523", "923523", "923523", "933523", "953523"
      ]
    }
  }
}
or as a bool query with a filter if you are not going to need the score:
POST /test_index/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "terms": {
            "message.provider_id": [
              "913523", "923523", "923523", "933523", "953523"
            ]
          }
        }
      ]
    }
  }
}
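If the list of providers comes from application code, the request body can be generated rather than written by hand. The following Python sketch shows one way to do that. `build_provider_filter` is a hypothetical helper, and converting the IDs to strings is an assumption that follows the string values used in the answer above.

```python
from typing import Any, Dict, Iterable


def build_provider_filter(provider_ids: Iterable[Any]) -> Dict[str, Any]:
    """Build a non-scoring terms filter for a list of provider IDs."""
    # Drop duplicates while keeping the original order.
    unique_ids = list(dict.fromkeys(str(pid) for pid in provider_ids))
    return {
        "query": {
            "bool": {
                "filter": [{"terms": {"message.provider_id": unique_ids}}]
            }
        }
    }


body = build_provider_filter([913523, 923523, 923523, 933523, 953523])
```

The duplicate 923523 from the question is sent only once; a repeated value in a terms query does not change which documents match.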
