Improving performance of Elasticsearch exists query - elasticsearch

I have the following query, which finds records that do not contain any of the following fields: timestamp_login, timestamp_logout, timestamp_signup, and groups them by user_city.
{
"query": {
"bool": {
"must": [],
"must_not": [
{
"exists": {
"field": "timestamp_login"
}
},
{
"exists": {
"field": "timestamp_logout"
}
},
{
"exists": {
"field": "timestamp_signup"
}
}
]
}
},
"aggs": {
"group_by_item": {
"terms": {
"script": "doc['user_city.keyword'].value?.toLowerCase()",
"size": 10,
"order": {
"_count": "desc"
}
}
},
"distinct_terms": {
"cardinality": {
"script": "doc['user_city.keyword'].value?.toLowerCase()"
}
}
},
"size": 0
}
However, the query often times out. Is there a more efficient way to pull records where a list of fields are missing? Also, I'm running ES 5.6.
Thanks for your help!

Related

Performing a text search and filtering on nested terms in elasticsearch

I'm trying to perform a search that, for example, looks for the word coyote in the description, but only returns items that are red and green and are in the Cartoon category. I think I understand that you can't have match and terms in the same query (the query below doesn't work for this reason), and also that you shouldn't use terms to search on a text field. Can anyone point me in the right direction?
Here's my query:
GET /searchproducts/_search
{
"query": {
"match": {
"description": {
"query": "coyote"
}
},
"bool": {
"should": [{
"terms": {
"colours.name": ["red", "green"]
}
},
{
"terms": {
"categories.name": ["Cartoon"]
}
}
]
}
},
"aggs": {
"colours": {
"terms": {
"field": "colour.name.value",
"size": 100
}
},
"categories": {
"terms": {
"field": "categories.id",
"size": 100
}
}
}
}
You can use a bool query to combine multiple queries. Try out this query:
{
"query": {
"bool": {
"should": [
{
"match": {
"description": {
"query": "coyote"
}
}
},
{
"bool": {
"should": [
{
"terms": {
"colours.name": [
"red",
"green"
]
}
},
{
"terms": {
"categories.name": [
"Cartoon"
]
}
}
]
}
}
]
}
},
"aggs": {
"colours": {
"terms": {
"field": "colour.name.value",
"size": 100
}
},
"categories": {
"terms": {
"field": "categories.id",
"size": 100
}
}
}
}
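Note that should clauses are OR-ed together. If the text match is mandatory and the colour and category terms are only meant to narrow the results, a variant with must and filter may fit better. Here is a minimal sketch in Python that only builds the request body. The helper name build_product_search is hypothetical, and it assumes that colours.name and categories.name are not-analyzed (keyword) fields, that they are not mapped as a nested type, and that your version supports filter inside bool (Elasticsearch 2.0 or later):

```python
import json


def build_product_search(text, colours, categories):
    """Build a search body where `text` must match and the terms only filter.

    Hypothetical helper: field names are taken from the question above.
    """
    return {
        "query": {
            "bool": {
                # Scored full-text clause: the document has to match it.
                "must": [{"match": {"description": {"query": text}}}],
                # Unscored yes/no clauses: every one of them has to match.
                "filter": [
                    {"terms": {"colours.name": colours}},
                    {"terms": {"categories.name": categories}},
                ],
            }
        }
    }


body = build_product_search("coyote", ["red", "green"], ["Cartoon"])
print(json.dumps(body, indent=2))
```

Keep in mind that terms does not analyze its input, so "Cartoon" only matches if the indexed value is exactly "Cartoon".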

Need aggregation of only the query results

I need to do an aggregation over only the limited results I get from the query, but it is not working: the aggregation also includes results outside the size limit of the query. Here is the query I am running:
{
"size": 500,
"query": {
"bool": {
"must": [
{
"term": {
"tags.keyword": "possiblePurchase"
}
},
{
"term": {
"clientName": "Ci"
}
},
{
"range": {
"firstSeenDate": {
"gte": "now-30d"
}
}
}
],
"must_not": [
{
"term": {
"tags.keyword": "skipPurchase"
}
}
]
}
},
"sort": [
{
"firstSeenDate": {
"order": "desc"
}
}
],
"aggs": {
"byClient": {
"terms": {
"field": "clientName",
"size": 25
},
"aggs": {
"byTarget": {
"terms": {
"field": "targetName",
"size": 6
},
"aggs": {
"byId": {
"terms": {
"field": "id",
"size": 5
}
}
}
}
}
}
}
}
I need the aggregations to only consider the first 500 results of the query, sorted by the field I am requesting on the query. I am completely lost. Thanks for the help
The scope of an aggregation is the full set of documents that match your query; the size parameter only controls how many hits are fetched and returned.
If you want to restrict the scope of the aggregation to the first n hits of a query, I would suggest using the sampler aggregation in combination with your query.
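As a rough sketch of that suggestion, the Python snippet below wraps the existing aggregations of a request body in a sampler aggregation. The helper name wrap_aggs_in_sampler and the aggregation name "sample" are made up for illustration, and the query is shortened. Two caveats: shard_size applies per shard, and the sampler keeps the top-scoring documents, so it does not follow the firstSeenDate sort and may not pick exactly the 500 hits you get back.

```python
import copy
import json


def wrap_aggs_in_sampler(body, shard_size):
    """Return a copy of `body` whose aggregations run inside a sampler aggregation."""
    wrapped = copy.deepcopy(body)
    wrapped["aggs"] = {
        "sample": {  # arbitrary aggregation name, chosen for this sketch
            "sampler": {"shard_size": shard_size},  # max docs sampled per shard
            "aggs": wrapped.get("aggs", {}),  # original aggregations, now nested
        }
    }
    return wrapped


# Shortened version of the request from the question.
body = {
    "size": 500,
    "query": {"bool": {"must": [{"term": {"tags.keyword": "possiblePurchase"}}]}},
    "aggs": {"byClient": {"terms": {"field": "clientName", "size": 25}}},
}
sampled = wrap_aggs_in_sampler(body, shard_size=500)
print(json.dumps(sampled, indent=2))
```

The buckets then appear under aggregations.sample.byClient in the response instead of aggregations.byClient.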

ElasticSearch query with prefix for aggregation

I am trying to add a prefix condition for my ES query in a "must" clause.
My current query looks something like this:
body = {
"query": {
"bool": {
"must":
{ "term": { "article_lang": 0 }}
,
"filter": {
"range": {
"created_time": {
"gte": "now-3h"
}
}
}
}
},
"aggs": {
"articles": {
"terms": {
"field": "article_id.keyword",
"order": {
"score": "desc"
},
"size": 1000
},
"aggs": {
"score": {
"sum": {
"field": "score"
}
}
}
}
}
}
I need to add a mandatory condition to my query to filter articles whose id starts with "article-".
So far, I have tried this:
{
"query": {
"bool": {
"should": [
{ "term": { "article_lang": 0 }},
{ "prefix": { "article_id": {"value": "article-"} }}
],
"filter": {
"range": {
"created_time": {
"gte": "now-3h"
}
}
}
}
},
"aggs": {
"articles": {
"terms": {
"field": "article_id.keyword",
"order": {
"score": "desc"
},
"size": 1000
},
"aggs": {
"score": {
"sum": {
"field": "score"
}
}
}
}
}
}
I am fairly new to ES, and from the documentation online I know that "should" is used for "OR" conditions and "must" for "AND". This returns some data, but as per the condition it consists of documents with either article_lang=0 or an id starting with article-. When I use "must", it doesn't return anything.
I am certain that there are articles with id starting with this prefix because currently, we are iterating through this result to filter out such articles. What am I missing here?
In your prefix query, you need to use the article_id.keyword field, not article_id. Also, you should prefer filter over must, since you're simply doing yes/no matching (i.e. filtering). All three conditions can go into a single filter array:
{
"query": {
"bool": {
"filter": [
{
"term": {
"article_lang": 0
}
},
{
"prefix": {
"article_id.keyword": {
"value": "article-"
}
}
},
{
"range": {
"created_time": {
"gte": "now-3h"
}
}
}
]
}
},
"aggs": {
"articles": {
"terms": {
"field": "article_id.keyword",
"order": {
"score": "desc"
},
"size": 1000
},
"aggs": {
"score": {
"sum": {
"field": "score"
}
}
}
}
}
}

elasticsearch aggregation with filter from query

I'm new to Elasticsearch, so forgive me if my question is a common one. I use Elasticsearch v2.2. With the following query
{
"query": {
"bool": {
"must": {
"multi_match": {
"query": "nokia",
"fields": [
"*.right",
"*.correct_keyboard_layout"
],
"fuzziness": "AUTO"
}
},
"filter": [
{
"terms": {
"brands": ["Nokia"]
}
},
{
"terms": {
"models_id": ["2432", "5234"]
}
},
{
"terms": {
"stores": ["999"]
}
}
]
}
},
"aggs": {
"filtered": {
"aggs": {
"models_id": {
"terms": {
"field": "models_id",
"size": 0
}
},
"category_id": {
"terms": {
"field": "category_id",
"size": 0
}
}
}
}
}
}
the aggregation result ignores the filter from the request: it is computed over all the records that match the query "nokia" and lists all models, whereas I only need the results for the filtered models. However, this page
https://www.elastic.co/guide/en/elasticsearch/guide/current/_filtering_queries_and_aggregations.html
says that a filter placed in the query should also apply to the aggregations, and I do not understand why it does not work for me.
What am I doing wrong?

elasticsearch facets OR filter

I have a problem with my elasticsearch DSL, in that when using facet navigation, when I apply my facet filter, the next set of results don't include any further facets, even though I've asked for them.
When I do the initial search, I get the results I want back:
{
"sort": {
"_score": {},
"salesQuantity": {
"order": "asc"
}
},
"query": {
"filtered": {
"query": {
"match": {
"categoryTree": "D01"
}
},
"filter": {
"term": {
"publicwebEnabled": true,
"parentID": 0
}
}
}
},
"facets": {
"delivery_locations": {
"terms": {
"field": "delivery_locations",
"all_terms": true
}
},
"categories": {
"terms": {
"field": "categoryTree",
"all_terms": true
}
},
"collectable": {
"terms": {
"field": "collectable",
"all_terms": true
}
}
},
"from": 0,
"size": 12}
When I then apply a filter like so, the results I get back do not include the facets:
{
"sort": {
"_score": {},
"salesQuantity": {
"order": "asc"
}
},
"query": {
"filtered": {
"query": {
"match": {
"categoryTree": "D01"
}
},
"filter": {
"term": {
"publicwebEnabled": true,
"parentID": 0
},
"or": [
{
"range": {
"Retail_Price": {
"to": "49.99",
"from": "0"
}
}
}
]
}
}
},
"facets": {
"delivery_locations": {
"terms": {
"field": "delivery_locations",
"all_terms": true
}
},
"categories": {
"terms": {
"field": "categoryTree",
"all_terms": true
}
},
"collectable": {
"terms": {
"field": "collectable",
"all_terms": true
}
}
},
"from": 0,
"size": 12}
Note: I'm adding the OR filter above because users may choose multiple price ranges to filter on.
Am I doing something wrong?
I want the new facets returned as altering the prices would obviously alter the facet counts of the other facets...
Add the original term filter inside the or filter, or add a bool filter to wrap your whole filter inside a boolean expression. I don't think you can add the two filters just by comma-separating them like that.
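To illustrate the second option, here is a minimal Python sketch that builds such a wrapping bool filter. The helper name build_filter is hypothetical. It assumes the two term conditions are always required and the price ranges are alternatives. It also splits the term filter into one clause per field, which I believe is the safer form.

```python
import json


def build_filter(price_ranges):
    """Build one bool filter: both terms required, price ranges OR-ed together.

    `price_ranges` is a list of (from, to) string pairs; hypothetical helper.
    """
    must = [
        {"term": {"publicwebEnabled": True}},
        {"term": {"parentID": 0}},
    ]
    if price_ranges:
        # A bool with only should clauses matches if at least one of them matches.
        must.append({
            "bool": {
                "should": [
                    {"range": {"Retail_Price": {"from": low, "to": high}}}
                    for low, high in price_ranges
                ]
            }
        })
    return {"bool": {"must": must}}


price_filter = build_filter([("0", "49.99")])
print(json.dumps(price_filter, indent=2))
```

The resulting object would take the place of the whole "filter" value inside the filtered query. With no price range selected, only the two term clauses remain.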
