Query on multiple range of document - elasticsearch

I want to search only within a known set of documents, not the whole index. I already know the ids of those documents from a separate process. For example, I want to match some sentences against the 'pLabel' field, but only among the documents whose ids (eid1, eid2, eid3, ...) I already have. My attempt is below, but it returns many documents I did not expect.
How do I fix the query to get the right search results?
{
"query": {
"bool": {
"must": [
{
"query_string": {
"default_field": "pLabel" ,
"query": "search words here"
}
}
] ,
"must_not": [] ,
"should": [
{
"term": {
"eid": "eid1"
}
} ,
{
"term": {
"eid": "eid2"
}
}
]
}
} ,
"size": 0 ,
"_source": [
"eid"
] ,
"aggs": {
"eids": {
"terms": {
"field": "eid" ,
"size": 1000
}
}
}
}

You need to move the document-ID condition out of the should clause and into the must clause.
As written, the query returns any document that matches the query_string clause; the should clauses only make documents with matching IDs score higher.
Also, you should use a terms query instead of several term queries:
{
"query": {
"bool": {
"must": [
{
"query_string": {
"default_field": "pLabel",
"query": "search words here"
}
},
{
"terms": {
"eid": ["eid1", "eid2"]
}
}
]
}
},
"size": 0,
"_source": [
"eid"
],
"aggs": {
"eids": {
"terms": {
"field": "eid",
"size": 1000
}
}
}
}
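
A minimal sketch, in Python, of building the same request body programmatically. It assumes the ID field is named eid as in the question; the helper name build_restricted_query is hypothetical. Note one deliberate variation: the ID restriction is placed in filter rather than must. Both require a match, but filter does not contribute to the score.

```python
import json


def build_restricted_query(text, eids):
    """Build a search body matching `text` in pLabel, restricted to `eids`."""
    return {
        "query": {
            "bool": {
                # Scored full-text clause: every hit must match it.
                "must": [
                    {"query_string": {"default_field": "pLabel", "query": text}}
                ],
                # Non-scoring restriction: eid must be one of the given values.
                "filter": [{"terms": {"eid": list(eids)}}],
            }
        },
        "size": 0,
        "aggs": {"eids": {"terms": {"field": "eid", "size": 1000}}},
    }


body = build_restricted_query("search words here", ["eid1", "eid2"])
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a _search request.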

Related

Elastic Search query with should returning 10.000 results but nothing matches

So I have an index of about 60GB data and basically I want to make a query to retrieve 1 specific product based off its reference number.
here is my query:
GET myindex/_search
{
"_source": [
"product.ref",
"product.urls.*",
"product.i18ns.*.title",
"product_sale_elements.quantity",
"product_sale_elements.prices.*.price",
"product_sale_elements.listen_price.*",
"product.images.image_url",
"product.image_count",
"product.images.visible",
"product.images.position"
],
"size": "6",
"from": "0",
"query": {
"function_score": {
"functions": [
{
"field_value_factor": {
"field": "product.sales_count",
"missing": 0,
"modifier": "log1p"
}
},
{
"field_value_factor": {
"field": "product.image_count",
"missing": 0,
"modifier": "log1p"
}
},
{
"field_value_factor": {
"field": "featureCount",
"missing": 0,
"modifier": "log1p"
}
}
],
"query": {
"bool": {
"filter": [
{
"term": {
"product.is_visible": true
}
}
],
"should": [
{
"query_string": {
"default_field": "product.ref",
"query": "13141000",
"boost": 2
}
}
]
}
}
}
},
"aggs": {
"by_categories": {
"terms": {
"field": "categories.i18ns.de_DE.title.raw",
"size": 100
}
}
}
}
My question is: why does this query return 10k results when I only want the single product with that reference number?
If I do:
GET my-index/_search
{
"query": {
"match": {
"product.ref": "13141000"
}
}
}
it matches correctly. How is should different from a normal match query?
If you have must or filter clauses, as you do, then any document that matches them does not also have to match your should clause, since should is then considered "optional".
You can either move the query_string from your should clause to filter, or set minimum_should_match to 1 like this:
...
"should": [
{
"query_string": {
"default_field": "product.ref",
"query": "13141000",
"boost": 2
}
}
],
"minimum_should_match" : 1,
...
Must - The condition must match.
Should - If the condition matches, it improves the score in a non-filter context (when minimum_should_match is not declared explicitly).
Must is similar to filter but also contributes to scoring; filter does not contribute to the score.
You can put this clause inside a new must clause:
{
"query_string": {
"default_field": "product.ref",
"query": "13141000",
"boost": 2
}
}
Boost will not affect scoring if you put the clause above inside the filter clause.
Read more in the Elasticsearch documentation on bool queries.
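
The default behavior of should described above can be summarized in a small Python sketch. It does not talk to Elasticsearch; the helper name effective_minimum_should_match is hypothetical and only mirrors the documented default: 1 when a bool query has should clauses but no must or filter, otherwise 0.

```python
def effective_minimum_should_match(bool_query):
    """Return the minimum_should_match that applies to a bool query dict."""
    if "minimum_should_match" in bool_query:
        return bool_query["minimum_should_match"]
    has_should = bool(bool_query.get("should"))
    has_required = bool(bool_query.get("must")) or bool(bool_query.get("filter"))
    # should clauses become mandatory only when nothing else is required.
    return 1 if has_should and not has_required else 0


ref_clause = {"query_string": {"default_field": "product.ref", "query": "13141000"}}
visible = {"term": {"product.is_visible": True}}

# The bool query from the question: filter + should, so should is optional.
question_bool = {"filter": [visible], "should": [ref_clause]}
# The fix: same clauses, with minimum_should_match set explicitly.
fixed_bool = {**question_bool, "minimum_should_match": 1}

print(effective_minimum_should_match(question_bool))  # 0
print(effective_minimum_should_match(fixed_bool))     # 1
```

With a value of 0, every visible product passes the filter and is returned, which explains the 10k hits.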

Elastic Search Query to get results for multiple keywords(i.e Country name)

The key string will look like "india,singapore" (without the quotes).
How can I split the key string and search for each keyword?
The expected result is the documents whose country matches india or singapore.
So far I have tried:
{
"_source": "country_name",
"query": {
"bool": {
"must": [
{
"term": {
"country_name.keyword": "india,singapore"
}
}
],
"must_not": [],
"should": []
}
},
"from": 0,
"size": 10,
"sort": [],
"aggs": {}
}
But this only returns documents that match the exact key string "india,singapore".
You can use a terms query in place of the term query, like below:
{
"_source": "country_name",
"query": {
"bool": {
"must": [
{
"terms": {
"country_name.keyword": ["india","singapore"]
}
}
]
}
},
"from": 0,
"size": 10
}
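
The splitting step itself has to happen on the client before the query is built. A minimal Python sketch, assuming the key string is comma-separated as in the question; the helper name build_country_query is hypothetical.

```python
def build_country_query(key_string):
    """Split a comma-separated key string and build a terms query from it."""
    # Strip whitespace and drop empty pieces (e.g. from a trailing comma).
    countries = [part.strip() for part in key_string.split(",") if part.strip()]
    return {
        "_source": "country_name",
        "query": {
            "bool": {
                "must": [{"terms": {"country_name.keyword": countries}}]
            }
        },
        "from": 0,
        "size": 10,
    }


body = build_country_query("india,singapore")
print(body["query"]["bool"]["must"][0]["terms"]["country_name.keyword"])
```

Because country_name.keyword is matched exactly, the values are passed through unchanged apart from trimming whitespace.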

How to check how many documents do not exist out of a list in elasticsearch

What query will tell me how many documents from a list are not found in the index?
This is my query:
$params['body']['query']["bool"]["filter"]["terms"]["field"] = (list);
I want to find which entries of the list were not found.
If my list is (A, B, C), I just need to know that C is not indexed. I don't need A, B, D, E, F or any of the remaining documents in the index.
You can use a must_not clause to negate your query, as shown below:
GET my-index/_search
{
"query": {
"constant_score": {
"filter": {
"bool": {
"must_not": {
"terms": {
"field": [
"value-1", "value-2"
]
}
}
}
}
}
}
}
Combining must_not with an aggregation gives more detail about the field values you are not expecting:
{
"_source":false,
"query": {
"bool": {
"must_not": [
{"term": {"aFieldName": "aFieldValue"}}
]
}
},
"aggregations": {
"byLocation": {
"terms": {
"field": "aFieldName"
}
}
}
}
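
To learn which entries of the list are absent (C in the question's example), another option is to keep the original terms filter, add a terms aggregation on the same field, and subtract the returned bucket keys from the list on the client. A minimal Python sketch of that last step follows; the aggregation name "found" and the sample response are made up for illustration, and only the usual aggregations/buckets/key response shape is assumed.

```python
def missing_values(requested, response, agg_name="found"):
    """Return the requested values that have no bucket in the aggregation."""
    buckets = response["aggregations"][agg_name]["buckets"]
    found = {bucket["key"] for bucket in buckets}
    # Keep the order of the requested list.
    return [value for value in requested if value not in found]


# Hypothetical response to a request that filters on ["A", "B", "C"] and has
# a terms aggregation named "found" on the same field.
sample_response = {
    "aggregations": {
        "found": {
            "buckets": [
                {"key": "A", "doc_count": 3},
                {"key": "B", "doc_count": 1},
            ]
        }
    }
}

print(missing_values(["A", "B", "C"], sample_response))  # ['C']
```

A terms aggregation returns only 10 buckets by default, so its size should be set to at least the length of the list.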

Aggregation with fuzzy filter

Is it possible in Elasticsearch to have an aggregation with a filter/query that includes fuzzy matching?
At the moment I have documents which contain a nested object[]. What I want to achieve:
- select from each document the 0..n nested objects which match a filter
- from this array of nested objects take the distinct ones
- sort them by _score
- take the top 5 or X
- use the terms for autocomplete/suggestions (it should behave more like a "like" search than strict autocomplete)
So far I have tried different aggregation types, such as significant_terms and top_hits, but not in a combination that gives the desired result.
Problems:
significant_terms doesn't return a value until it determines that a term is significant (maybe I did not use a good analyzer)
top_hits returns any nested object from the selected document and also contains duplicates
Here is an example of my query:
GET customerinsights/_search
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "CustomerInsightTargets",
"query": {
"bool": {
"must": [
{
"match": {
"CustomerInsightTargets.CustomerInsightValue": {
"query": "2017",
"operator": "AND",
"fuzziness": 2
}
}
}
]
}
}
}
}
]
}
} ,
"aggs": {
"root": {
"nested": {
"path": "CustomerInsightTargets"
},
"aggs": {
"top_tags": {
"terms": {
"field": "CustomerInsightTargets.CustomerInsightSource.keyword"
},
"aggs": {
"top_tag_hits": {
"top_hits": {
"sort": [
{
"_score": {
"order": "desc"
}
}
],
"size": 5,
"_source": "CustomerInsightTargets"
}
}
}
}
}
}
},
"size": 0,
"_source": "CustomerInsightTargets"
}

To find the distinct fields in an elastic search query

I need the values of only one field and there are duplicate values in it.
POST _search
{
"query": {
"bool": {
"must": [
{"term": {
"report": {
"value": "some_value"
}
}}
]
}
},
"fields": [
"field_name"
]
}
I need only the distinct values of field_name.
You could add a terms aggregation to your query, which returns one bucket per distinct value of the field (a top_hits sub-aggregation can be applied on top if you also need a document for each value):
"aggs": {
"values": {
"terms": {
"field": "your_field"
}
}
}
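
A minimal Python sketch of how the aggregation fits together with the original query and how the distinct values can be read back. The helper names, the "values" aggregation name and the sample response are hypothetical; only the usual aggregations/buckets/key response shape is assumed.

```python
def build_distinct_request(report_value, field, max_values=100):
    """Build a request returning distinct values of `field` for one report."""
    return {
        "size": 0,  # no hits needed, only the aggregation result
        "query": {"bool": {"must": [{"term": {"report": {"value": report_value}}}]}},
        "aggs": {"values": {"terms": {"field": field, "size": max_values}}},
    }


def distinct_values(response, agg_name="values"):
    """Each bucket key is one distinct value of the aggregated field."""
    return [b["key"] for b in response["aggregations"][agg_name]["buckets"]]


request = build_distinct_request("some_value", "field_name")

# Hypothetical response, shaped like a terms aggregation result.
sample_response = {
    "aggregations": {
        "values": {
            "buckets": [
                {"key": "x", "doc_count": 4},
                {"key": "y", "doc_count": 2},
            ]
        }
    }
}

print(distinct_values(sample_response))  # ['x', 'y']
```

The size of the terms aggregation caps how many distinct values come back, so max_values should be chosen to cover the expected cardinality.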
