Sort based on the service time of stores - elasticsearch

My project has stores with working hours, which I index in Elasticsearch. There are two scenarios in my product:
When the client requests the stores that are open now, I use the following range filter:
"bool": {
"must": [
{ "range": { "startTime": { "lte": "now" } } },
{ "range": { "endTime": { "gte": "now" } } }
]
}
Let's call the result Online stores.
When the client requests all stores, I have to return every document, sorted so that online stores come first and the other stores follow.
I can do that with two queries, one for online stores and another for offline stores, but I want to do it in a single query. Any idea?

You can achieve this by using should as an "optional" clause:
If the bool query is in a query context and has a must or filter
clause then a document will match the bool query even if none of the
should queries match. In this case these clauses are only used to
influence the score.
The bool query takes a more-matches-is-better approach, so the score
from each matching must or should clause will be added together to
provide the final _score for each document.
The query might look like this:
POST my-should/doc/_search
{
"query": {
"bool": {
"must": {
"match_all": {}
},
"should": {
"bool": {
"must": [
{
"range": {
"startTime": {
"lte": "2018-06-24T16:39:59"
}
}
},
{
"range": {
"endTime": {
"gte": "2018-06-22T16:39:59"
}
}
}
],
"_name": "Online"
}
}
}
}
}
The must part of this bool query defines which documents match, and the should part boosts those that also match the additional criteria.
Note that here we used Named Queries to highlight that the "Online" part of the query was matched to a document. The response could look like this:
"hits": [
{
"_index": "my-should",
"_type": "doc",
"_id": "BKgZLWQBERN2JBe1CQ5t",
"_score": 3,
"_source": {
"startTime": "2018-06-23T16:39:59",
"endTime": "2018-06-23T16:39:59"
},
"matched_queries": [
"Online"
]
},
{
"_index": "my-should",
"_type": "doc",
"_id": "BagaLWQBERN2JBe12A7y",
"_score": 1,
"_source": {
"startTime": "2018-06-20T16:39:59",
"endTime": "2018-06-21T16:39:59"
}
}
]
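For the original question, both range bounds would be the current time rather than fixed dates. A minimal Python sketch of building that request body follows. The function name `build_online_first_query` is hypothetical, the field names `startTime` and `endTime` come from the question, and the sketch assumes both fields are mapped as dates so that the date-math value `now` is accepted.

```python
import json


def build_online_first_query(now="now"):
    """Build a search body that matches every store but scores open stores higher.

    Hypothetical helper: `startTime` and `endTime` are the field names from the
    question and are assumed to be mapped as date fields.
    """
    online = {
        "bool": {
            "must": [
                {"range": {"startTime": {"lte": now}}},
                {"range": {"endTime": {"gte": now}}},
            ],
            "_name": "Online",  # named query, reported under matched_queries
        }
    }
    return {
        "query": {
            "bool": {
                "must": {"match_all": {}},  # every store matches
                "should": online,           # open stores get extra score
            }
        }
    }


body = build_online_first_query()
print(json.dumps(body, indent=2))
```

Hits are sorted by _score in descending order by default, so with this body the open stores should be listed first without a second query.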
Hope that helps!

Related

combine terms and bool query in elasticsearch

I would like to do a search in an elasticsearch index but only for a list of ids. I can select the ids with a terms query
{
"query": {
"terms": {
"_id": list_of_ids
}
}
}
Now I want to search in the resulting list, which can be done with a query like this
{
"query": {
"bool": {
"must": {}
}
}
}
My question is how can I combine those two queries?
One solution I found is to add the ids to the bool query as individual term clauses, like this
{
"query": {
"bool": {
"must": {},
"should": [
{ "term": { "_id": id1 } },
{ "term": { "_id": id2 } }
]
}
}
}
This works fine. However, if the list of ids is very large, it can lead to errors:
elasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'failed to create query:
Is there a more compact way to write such a query? I think the error above is caused by my query being too long, since I added thousands of term clauses. There must be a way to just provide an array, like in the terms query?
Solved it:
{
"query": {
"bool": {
"must": {},
"filter": {
"terms": {
"_id": list_of_ids
}
}
}
}
}
sorry I am a bit of a newbie to elasticsearch...
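When the id list comes from application code, the same body can be assembled programmatically. The following is a minimal Python sketch, not a definitive implementation: the helper name `search_within_ids` is hypothetical, and the `match` clause on `name` in the usage line is only an illustrative stand-in for whatever goes into `must`.

```python
import json


def search_within_ids(must_clause, ids):
    """Restrict an arbitrary scoring query to a given list of document ids.

    Hypothetical helper: all ids go into a single `terms` clause in filter
    context instead of one `term` clause per id.
    """
    return {
        "query": {
            "bool": {
                "must": must_clause,                      # scored part of the search
                "filter": {"terms": {"_id": list(ids)}},  # unscored id restriction
            }
        }
    }


# Illustrative usage; the match clause is an assumption, not from the question.
body = search_within_ids({"match": {"name": "bread"}}, ["1", "2", "4"])
print(json.dumps(body, indent=2))
```

Because the ids sit in one array, the request grows by one array element per id rather than one full clause per id.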
You can also use the IDs query, which returns documents based on their IDs.
Here is a working example with index data, a search query, and the search result.
Index Data:
{
"name":"buiscuit",
"cost":"55",
"discount":"20"
}
{
"name":"multi grain bread",
"cost":"55",
"discount":"20"
}
Search Query:
{
"query": {
"bool": {
"must": {
"match": {
"name": "bread"
}
},
"filter": {
"ids": {
"values": [
"1",
"2",
"4"
]
}
}
}
}
}
Search Result:
"hits": [
{
"_index": "65431114",
"_type": "_doc",
"_id": "1",
"_score": 0.5754429,
"_source": {
"name": "multi grain bread",
"cost": "55",
"discount": "20"
}
}
]

What if I use a query in filter clauses in Elasticsearch?

What if I use a query in filter clauses in Elasticsearch? Will ES calculate a score?
For example,
case 1:
{
"query": {
"bool": {
"filter": {
"bool":{
"should":{
}
}
}
}
}
}
case 2:
{
"query": {
"bool": {
"should": {
"bool":{
"filter":{
}
}
}
}
}
}
Will ES calculate scores in these two cases?
The filter clause (query) must appear in matching documents. However, unlike
must the score of the query will be ignored. Filter clauses are
executed in filter context, meaning that scoring is ignored and
clauses are considered for caching.
Refer to the Elasticsearch documentation on bool queries to learn more about this.
Here is a working example with index data, search queries, and search results.
Index data:
{
"name": "milk",
"cost": 40
}
{
"name": "bread",
"cost": 55
}
Search Query 1:
In this, the inner bool query is wrapped in the outer filter clause, so the scoring of the should clause is ignored
{
"query": {
"bool": {
"filter": {
"bool": {
"should": {
"match": {
"name": "bread"
}
}
}
}
}
}
}
Search Result 1:
"hits": [
{
"_index": "64505740",
"_type": "_doc",
"_id": "1",
"_score": 0.0,
"_source": {
"name": "bread",
"cost": 55
}
}
]
Search Query 2:
Here the inner bool query contains only a filter clause, so it contributes nothing to the score, even though it sits inside the outer should clause
{
"query": {
"bool": {
"should": {
"bool": {
"filter": {
"term": {
"name": "bread"
}
}
}
}
}
}
}
Search Result 2:
"hits": [
{
"_index": "64505740",
"_type": "_doc",
"_id": "1",
"_score": 0.0,
"_source": {
"name": "bread",
"cost": 55
}
}
]
So both of your search queries will return a 0.0 score, meaning that the scoring is ignored due to the filter clause
In Elasticsearch, queries under the filter section are not involved in score calculation. So in both of your queries, any logic placed inside filter is not scored. Logic placed in must or should is scored, unless it is nested inside a filter. Note that must_not, like filter, runs in filter context and does not contribute to the score.
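The rule in the answers above can be modelled in a few lines of Python. This is only a simplified sketch of how filter context propagates through nested bool queries, not Elasticsearch's implementation. The function name `scoring_leaves` is made up, and the sketch assumes every query object has exactly one top-level key (its query type).

```python
def scoring_leaves(query, in_filter_context=False):
    """Return the leaf queries that would contribute to _score.

    Simplified model: anything under `filter` or `must_not` is in filter
    context, and so is everything nested beneath it.
    """
    (query_type, body), = query.items()  # assumes exactly one key per query
    if query_type != "bool":
        return [] if in_filter_context else [query]
    leaves = []
    for occur, clauses in body.items():
        if occur not in ("must", "should", "filter", "must_not"):
            continue  # skip options such as minimum_should_match
        if isinstance(clauses, dict):
            clauses = [clauses]  # a single clause may be given without a list
        filtered = in_filter_context or occur in ("filter", "must_not")
        for clause in clauses:
            leaves.extend(scoring_leaves(clause, filtered))
    return leaves


# Case 1: should nested inside filter, so nothing is scored.
case_1 = {"bool": {"filter": {"bool": {"should": {"match": {"name": "bread"}}}}}}
# Case 2: filter nested inside should, so again nothing is scored.
case_2 = {"bool": {"should": {"bool": {"filter": {"term": {"name": "bread"}}}}}}

print(scoring_leaves(case_1))
print(scoring_leaves(case_2))
```

Both calls return an empty list, which lines up with the 0.0 scores in the search results shown earlier.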

How to add fuzziness to search as you type field in Elasticsearch?

I've been trying to add some fuzziness to my search as you type field type on Elasticsearch, but never got the needed query. Anyone have any idea to implement this?
Fuzzy Query returns documents that contain terms similar to the search term, as measured by a Levenshtein edit distance.
The fuzziness parameter can be specified as:
AUTO -- It generates an edit distance based on the length of the term.
For lengths:
0..2 -- must match exactly
3..5 -- one edit allowed
Greater than 5 -- two edits allowed
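The AUTO thresholds listed above can be written out as a small Python function. This is a sketch of the rule as stated here, not Elasticsearch's code. The names `auto_fuzziness` and `levenshtein` are hypothetical, and the distance function is plain Levenshtein, so unlike a fuzzy query with transpositions enabled it counts a swap of two adjacent characters as two edits.

```python
def auto_fuzziness(term):
    """Maximum number of edits allowed for `term` under fuzziness AUTO."""
    length = len(term)
    if length <= 2:
        return 0  # must match exactly
    if length <= 5:
        return 1  # one edit allowed
    return 2      # two edits allowed


def levenshtein(a, b):
    """Plain Levenshtein distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


for candidate in ("prodct", "product"):
    print(candidate, levenshtein("prodc", candidate), auto_fuzziness("prodc"))
```

Under this model the five-letter term "prodc" gets one edit with AUTO, which reaches "prodct" (distance 1) but not "product" (distance 2); an explicit fuzziness of 2 reaches both.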
Here is a working example with index data, a search query, and the search result.
Index Data:
{
"title":"product"
}
{
"title":"prodct"
}
Search Query:
{
"query": {
"fuzzy": {
"title": {
"value": "prodc",
"fuzziness":2,
"transpositions":true,
"boost": 5
}
}
}
}
Search Result:
"hits": [
{
"_index": "test",
"_type": "_doc",
"_id": "1",
"_score": 2.0794415,
"_source": {
"title": "product"
}
},
{
"_index": "test",
"_type": "_doc",
"_id": "2",
"_score": 2.0794415,
"_source": {
"title": "prodct"
}
}
]
Refer to these blog posts for a detailed explanation of the fuzzy query:
https://www.elastic.co/blog/found-fuzzy-search
https://qbox.io/blog/elasticsearch-optimization-fuzziness-performance
Update 1:
Refer to the official ES documentation:
The fuzziness , prefix_length , max_expansions , rewrite , and
fuzzy_transpositions parameters are supported for the terms that are
used to construct term queries, but do not have an effect on the
prefix query constructed from the final term.
There are open issues and discussion threads reporting that fuzziness does not work with a bool_prefix multi_match (search-as-you-type):
https://github.com/elastic/elasticsearch/issues/56229
https://discuss.elastic.co/t/fuzziness-not-work-with-bool-prefix-multi-match-search-as-you-type/229602/3
I know this question was asked long ago, but I think this worked for me.
Since Elasticsearch allows a single field to be indexed in multiple ways (multi-fields), my mapping looks like this:
PUT products
{
"mappings": {
"properties": {
"title": {
"type": "text",
"fields": {
"product_type": {
"type": "search_as_you_type"
}
}
}
}
}
}
After adding some data to the index, I queried it like this:
GET products/_search
{
"query": {
"bool": {
"should": [
{
"multi_match": {
"query": "prodc",
"type": "bool_prefix",
"fields": [
"title.product_type",
"title.product_type._2gram",
"title.product_type._3gram"
]
}
},
{
"multi_match": {
"query": "prodc",
"fuzziness": 2
}
}
]
}
}
}
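The combined query above can be generated for any user input. Below is a minimal Python sketch under stated assumptions: the helper name `fuzzy_search_as_you_type` is hypothetical, and the default field `title.product_type` comes from the mapping in this answer. Like the query above, the fuzzy multi_match is left without an explicit `fields` list.

```python
import json


def fuzzy_search_as_you_type(text, field="title.product_type", fuzziness=2):
    """Combine a bool_prefix multi_match with a fuzzy multi_match.

    Hypothetical helper: `field` is assumed to be a search_as_you_type
    field, which provides the `._2gram` and `._3gram` subfields.
    """
    prefix_clause = {
        "multi_match": {
            "query": text,
            "type": "bool_prefix",
            "fields": [field, field + "._2gram", field + "._3gram"],
        }
    }
    # Second clause tolerates typos in the text typed so far.
    fuzzy_clause = {"multi_match": {"query": text, "fuzziness": fuzziness}}
    return {"query": {"bool": {"should": [prefix_clause, fuzzy_clause]}}}


print(json.dumps(fuzzy_search_as_you_type("prodc"), indent=2))
```

Since both clauses sit under should, a document can match through either the prefix path or the fuzzy path, and documents matching both are scored higher.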

Why does Elasticsearch calculate a score for term queries?

I want to make a simple query based on knowing a unique field value using a term query. For instance:
{
"query": {
"term": {
"products.product_id": {
"value": "Ubsdf-234kjasdf"
}
}
}
}
Regarding term queries, Elasticsearch documentation states:
Returns documents that contain an exact term in a provided field.
You can use the term query to find documents based on a precise value such as a price, a product ID, or a username.
On the other hand, documentation also suggests that the _score is calculated for queries where relevancy matters (and is not the case for filter context which involves exact match).
I find it a bit confusing. Why does Elasticsearch calculate _score for term queries, which are supposed to be concerned with exact match and not relevancy?
Term queries are not analyzed: the search term skips the analysis phase and is matched exactly against the indexed terms. Their score is still calculated when they are used in query context.
When you use term queries in filter context, you are filtering on them rather than searching, so no score is calculated for them.
More info on query and filter context is available in the official ES docs.
Both cases, a term query in query context and in filter context, are shown in the example below.
Term query in query context
{
"query": {
"bool": {
"must": [
{
"term": {
"title": "c"
}
}
]
}
},
"size": 10
}
And result with a score
"hits": [
{
"_index": "cpp",
"_type": "_doc",
"_id": "4",
"_score": 0.2876821, --> notice score is calculated
"_source": {
"title": "c"
}
}
]
Term query in filter context
{
"query": {
"bool": {
"filter": [ --> previous clause replaced by `filter`
{
"term": {
"title": "c"
}
}
]
}
},
"size": 10
}
And search result with filter context
"hits": [
{
"_index": "cpp",
"_type": "_doc",
"_id": "4",
"_score": 0.0, --> notice score is 0.
"_source": {
"title": "c"
}
}
]
Filter context means that you need to wrap your term query inside a bool/filter query, like this:
{
"query": {
"bool": {
"filter": {
"term": {
"products.product_id": {
"value": "Ubsdf-234kjasdf"
}
}
}
}
}
}
The above query will not compute scores.
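To make the two variants easy to compare, here is a small Python sketch that builds either form for the same exact-match lookup. The helper name `exact_match_body` and its `scored` flag are hypothetical; the field and value are the ones from the question.

```python
import json


def exact_match_body(field, value, scored=False):
    """Build a term lookup in query context (scored) or filter context (unscored)."""
    term = {"term": {field: {"value": value}}}
    if scored:
        return {"query": term}  # query context: _score is computed
    return {"query": {"bool": {"filter": term}}}  # filter context: no scoring


print(json.dumps(exact_match_body("products.product_id", "Ubsdf-234kjasdf"), indent=2))
```

For a lookup by a unique id, the filter-context form is usually the better fit, since the score carries no useful information there.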

Query for : How many elements of an array are matching in a document attribute in ElasticSearch

I've many documents having an attribute that is an array of values like these:
{
"_index": "myindex",
"_type": "mytype",
"_id": "myid1",
"_source": {
"tags": [
"devid",
"batman",
"obama"
]
}
},
{
"_index": "myindex",
"_type": "mytype",
"_id": "myid2",
"_source": {
"tags": [
"devid",
"superman"
]
}
}
I have an array of elements like: ["devid", "batman", "pippo"]
I want to get all the documents matching at least one element of the array, sorted by how many elements are matched.
For example, I expect that myid1 will have a higher score than myid2.
How can I do this?
At the moment I'm "stuck" here:
{
"query": {
"function_score": {
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"terms": {
"tags": ["devid", "batman", "pippo"]
}
}
}
}
}
}
}
It only filters by terms and gives both documents a score of 1.
I'm new to Elasticsearch, so any hint is welcome!
Using the terms query instead of a filter gives documents with more matching terms a higher score.
Example :
{
"query": {
"terms": {
"tags": [
"devid",
"batman",
"pippo"
]
}
}
}
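One caveat: as far as I know, from Elasticsearch 2.0 onward the terms query gives every matching document the same constant score, so the query above may no longer rank by the number of matches on recent versions. A possible alternative, swapped in here and not part of the original answer, is one should clause per value, each wrapped in constant_score so that every matching value adds exactly 1. The Python sketch below builds such a body; `count_matches_query` and `expected_score` are hypothetical names, and `expected_score` is only a local model of the intended ranking, not a call to Elasticsearch.

```python
import json


def count_matches_query(field, values):
    """One constant_score should clause per value: each match adds 1.0."""
    should = [
        {"constant_score": {"filter": {"term": {field: value}}, "boost": 1.0}}
        for value in values
    ]
    # minimum_should_match keeps out documents that match no value at all.
    return {"query": {"bool": {"should": should, "minimum_should_match": 1}}}


def expected_score(doc_values, values):
    """Local model of the intended score: number of distinct matching values."""
    return float(len(set(doc_values) & set(values)))


wanted = ["devid", "batman", "pippo"]
print(json.dumps(count_matches_query("tags", wanted), indent=2))
print(expected_score(["devid", "batman", "obama"], wanted))  # myid1
print(expected_score(["devid", "superman"], wanted))         # myid2
```

Under this model myid1 matches two of the wanted tags and myid2 matches one, so myid1 would rank first. The sketch assumes the tags are indexed such that a term query on `tags` matches the stored values exactly.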
