Query for how many elements of an array match a document attribute in Elasticsearch

I have many documents with an attribute that is an array of values, like these:
{
"_index": "myindex",
"_type": "mytype",
"_id": "myid1",
"_source": {
"tags": [
"devid",
"batman",
"obama"
]
}
},
{
"_index": "myindex",
"_type": "mytype",
"_id": "myid2",
"_source": {
"tags": [
"devid",
"superman"
]
}
}
I have an array of elements like: ["devid", "batman", "pippo"]
I want to get all the documents that match at least one element of the array, sorted by how many elements they match.
For example, I expect myid1 to have a higher score than myid2.
How can I do this?
At the moment I'm stuck here:
{
"query": {
"function_score": {
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"terms": {
"tags": ["devid", "batman", "pippo"]
}
}
}
}
}
}
}
It only filters by terms and gives both documents a score of 1.
I'm new to Elasticsearch, so any hint is welcome!

Using a terms query instead of a filter makes documents that match more terms score higher.
Example:
{
"query": {
"terms": {
"tags": [
"devid",
"batman",
"pippo"
]
}
}
}
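One caveat, stated with some hedging: newer Elasticsearch versions score a terms query as a constant, so every hit may come back with the same score. A minimal Python sketch for that case builds one term clause per tag inside bool.should, each wrapped in constant_score so that a match should add exactly 1 to the score. The helper names build_match_count_query and count_matches are hypothetical, and count_matches only mimics the expected ordering locally; it does not call Elasticsearch.

```python
import json


def build_match_count_query(field, values):
    """Build a bool query whose score should equal the number of matched values."""
    should = [
        # constant_score makes each matching clause contribute exactly its boost.
        {"constant_score": {"filter": {"term": {field: value}}, "boost": 1.0}}
        for value in values
    ]
    return {"query": {"bool": {"should": should, "minimum_should_match": 1}}}


def count_matches(doc_tags, values):
    """Local stand-in for the score: how many wanted values the document holds."""
    return len(set(doc_tags) & set(values))


if __name__ == "__main__":
    wanted = ["devid", "batman", "pippo"]
    print(json.dumps(build_match_count_query("tags", wanted), indent=2))

    # Sample documents copied from the question.
    docs = {"myid1": ["devid", "batman", "obama"], "myid2": ["devid", "superman"]}
    ranked = sorted(docs, key=lambda d: count_matches(docs[d], wanted), reverse=True)
    print(ranked)
```

The printed body can be sent as the search request; the local ranking is only there to check the intended order against the two sample documents.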

Related

Elasticsearch Bool query with minimum_should_match set to zero not honored

I add 3 documents:
POST test/_doc
{"value": 1}
POST test/_doc
{"value": 2}
POST test/_doc
{"value": 3}
Then I run the following query. I expect it to return all 3 documents, with the documents that match the should clause ranked higher:
GET /test/_search
{
"query": {
"bool": {
"minimum_should_match": 0,
"should": [
{
"range": {
"value": {
"gte": 2
}
}
}
]
}
}
}
Instead I get only 2 documents (values 2 and 3). "minimum_should_match": 0 has no effect until I add a filter or must clause to the bool query, like below:
GET /test/_search
{
"query": {
"bool": {
"filter": [ { "match_all": { } } ],
"should": [
{
"range": {
"value": {
"gte": 2
}
}
}
]
}
}
}
What I want:
Whether the must or filter clause of the bool query is empty or filled, the should clause must not filter out any documents; it should only participate in ranking. How can I achieve that? Thanks.
It's a little odd that minimum_should_match: 0 does not work with the should clause on its own. This may be explained by the following passage from the documentation:
No matter what number the calculation arrives at, a value greater than
the number of optional clauses, or a value less than 1 will never be
used. (ie: no matter how low or how high the result of the calculation
result is, the minimum number of required matches will never be lower
than 1 or greater than the number of clauses.
There are two ways to get all the documents in the result while using the should clause only for ranking:
1. Use a must or filter clause with a match_all query, which you already figured out, as shown in the question above.
2. Use should clauses with the boost parameter, written so that together they cover every document.
Search Query:
{
"query": {
"bool": {
"should": [
{
"range": {
"value": {
"gte": 2,
"boost": 2.0
}
}
},
{
"range": {
"value": {
"lt": 2,
"boost": 1.0
}
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "68040640",
"_type": "_doc",
"_id": "2",
"_score": 2.0,
"_source": {
"value": 2
}
},
{
"_index": "68040640",
"_type": "_doc",
"_id": "3",
"_score": 2.0,
"_source": {
"value": 3
}
},
{
"_index": "68040640",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"value": 1
}
}
]
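The first option (a match_all filter plus optional should clauses) can be sketched in Python as a small builder. This is a sketch under assumptions: rank_only_should is a hypothetical helper name, and the returned dict is simply the request body you would hand to whatever client you use. Because a filter clause is present, the should clauses are optional and are expected only to add to the score.

```python
import json


def rank_only_should(should_clauses, base_filter=None):
    """Build a bool query where should clauses affect ranking but never exclude hits."""
    if base_filter is None:
        # match_all selects every document; in filter context it adds nothing to the score.
        base_filter = {"match_all": {}}
    return {
        "query": {
            "bool": {
                "filter": [base_filter],
                "should": list(should_clauses),
            }
        }
    }


if __name__ == "__main__":
    body = rank_only_should([{"range": {"value": {"gte": 2}}}])
    print(json.dumps(body, indent=2))
```

Passing a real filter as base_filter keeps the same ranking-only behaviour for the should clauses while narrowing the result set.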

Elasticsearch - unify search results from different indexes

I want to perform a search query on different indexes with different search queries and unify the results.
I know there is a multi-target syntax, which allows me to perform specific query over multiple indexes.
What I want is a different query for each index, and then something like a SQL UNION of the results.
Is there a way to achieve that?
You can use the _index metadata field. It lets you query multiple indexes with a different query for each one.
Here is a working example with index data, search query, and search result.
Index Data:
POST /index1/_doc/1
{
"name":"foo"
}
POST /index2/_doc/1
{
"name":"bar"
}
Search Query:
{
"query": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"match": {
"name": "foo"
}
},
{
"term": {
"_index": "index1"
}
}
]
}
},
{
"bool": {
"must": [
{
"match": {
"name": "bar"
}
},
{
"term": {
"_index": "index2"
}
}
]
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "index1",
"_type": "_doc",
"_id": "1",
"_score": 1.287682,
"_source": {
"name": "foo"
}
},
{
"_index": "index2",
"_type": "_doc",
"_id": "1",
"_score": 1.287682,
"_source": {
"name": "bar"
}
}
]
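When the per-index queries are generated by a program, the same body can be assembled from a mapping of index name to query. The Python sketch below mirrors the structure of the search query above; union_query and search_target are hypothetical helper names, and nothing here talks to a cluster.

```python
import json


def union_query(per_index_queries):
    """Combine {index name: query} pairs into one bool/should search body."""
    should = []
    for index_name, query in per_index_queries.items():
        # Each branch only matches documents that live in its own index.
        branch = {"bool": {"must": [query, {"term": {"_index": index_name}}]}}
        should.append(branch)
    return {"query": {"bool": {"should": should}}}


def search_target(per_index_queries):
    """Comma-separated index list for the multi-target search path."""
    return ",".join(per_index_queries)


if __name__ == "__main__":
    queries = {
        "index1": {"match": {"name": "foo"}},
        "index2": {"match": {"name": "bar"}},
    }
    print("GET /%s/_search" % search_target(queries))
    print(json.dumps(union_query(queries), indent=2))
```

The request is sent once against all listed indexes, and each branch of the should clause restricts itself to one of them.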

combine terms and bool query in elasticsearch

I would like to search an Elasticsearch index, but only within a list of ids. I can select the ids with a terms query:
{
"query": {
"terms": {
"_id": list_of_ids
}
}
}
Now I want to search within the resulting list, which can be done with a query like this:
{
"query": {
"bool": {
"must": {}
}
}
}
My question is how can I combine those two queries?
One solution I found is to add the ids as term clauses in the bool query, like this:
{
"query": {
"bool": {
"must": {},
"should": [
{
"term": {
"_id": id1
}
},
{
"term": {
"_id": id2
}
}
]
}
}
}
This works fine. However, if the list of ids is very large, it can lead to errors:
elasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'failed to create query:
I am wondering whether there is a more compact way to write such a query. I think the error above is caused by my query simply being too long, since I added thousands of term clauses. There must be a way to just provide an array, as in the terms query?
Solved it:
{
"query": {
"bool": {
"must": {},
"filter": {
"terms": {
"_id": list_of_ids
}
}
}
}
}
Sorry, I am a bit of a newbie to Elasticsearch...
You can also use the IDs query, which returns documents based on their IDs.
Adding a working example with index data, search query, and search result.
Index Data:
{
"name":"buiscuit",
"cost":"55",
"discount":"20"
}
{
"name":"multi grain bread",
"cost":"55",
"discount":"20"
}
Search Query:
{
"query": {
"bool": {
"must": {
"match": {
"name": "bread"
}
},
"filter": {
"ids": {
"values": [
"1",
"2",
"4"
]
}
}
}
}
}
Search Result:
"hits": [
{
"_index": "65431114",
"_type": "_doc",
"_id": "1",
"_score": 0.5754429,
"_source": {
"name": "multi grain bread",
"cost": "55",
"discount": "20"
}
}
]
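Since the error in the question comes from the Python client, here is a minimal Python sketch of building this body from a list of ids. The helper name search_within_ids is hypothetical, and the returned dict is only the request body; how it is sent depends on the client version you use.

```python
import json


def search_within_ids(query, ids):
    """Run `query` only against the documents whose _id is in `ids`."""
    return {
        "query": {
            "bool": {
                "must": query,
                # Filter context: restricts the hits without changing the score.
                "filter": {"ids": {"values": [str(doc_id) for doc_id in ids]}},
            }
        }
    }


if __name__ == "__main__":
    body = search_within_ids({"match": {"name": "bread"}}, [1, 2, 4])
    print(json.dumps(body, indent=2))
```

The ids are converted to strings because _id values are strings, so a list of integers from another system can be passed directly.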

Elasticsearch associating exact match terms

I have a search index of filenames containing over 100,000 entries that share about 500 unique variations of the main filename field. I have recently made some modifications to certain filename values that are being generated from my data. I was wondering if there is a way to link certain queries to return an exact match. In the following query:
{
"query": {
"bool": {
"must": [
{
"match": {
"filename": "foo-bar"
}
}
]
}
}
}
How would it be possible to modify the index and associate the results so that the above query also matches foo-bar-baz, but not foo-bar-foo or any other variation?
Thanks in advance for your help.
You can use a term query instead of a match query. It is well suited to keyword fields:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-term-query.html
Here is a working example with index data and search query, using the default mapping.
Index Data:
{
"fileName": "foo-bar"
}
{
"fileName": "foo-bar-baz"
}
{
"fileName": "foo-bar-foo"
}
Search Query:
{
"query": {
"bool": {
"should": [
{
"match": {
"fileName.keyword": "foo-bar"
}
},
{
"match": {
"fileName.keyword": "foo-bar-baz"
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "test",
"_type": "_doc",
"_id": "1",
"_score": 0.9808291,
"_source": {
"fileName": "foo-bar"
}
},
{
"_index": "test",
"_type": "_doc",
"_id": "2",
"_score": 0.9808291,
"_source": {
"fileName": "foo-bar-baz"
}
}
]
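If the list of associated filenames lives in application code, one option is a single terms query on the keyword sub-field instead of one clause per name. The Python sketch below assumes the default dynamic mapping from the example above, where fileName gets a fileName.keyword sub-field; exact_filename_query and exact_hits are hypothetical helper names, and exact_hits only reproduces the exact-match selection locally.

```python
import json


def exact_filename_query(names, field="fileName.keyword"):
    """Match documents whose keyword value equals any of the given names exactly."""
    return {"query": {"terms": {field: list(names)}}}


def exact_hits(filenames, names):
    """Local stand-in: keep only the filenames that are exactly in the allowed set."""
    wanted = set(names)
    return [name for name in filenames if name in wanted]


if __name__ == "__main__":
    allowed = ["foo-bar", "foo-bar-baz"]
    print(json.dumps(exact_filename_query(allowed), indent=2))
    print(exact_hits(["foo-bar", "foo-bar-baz", "foo-bar-foo"], allowed))
```

With this shape, associating a new variation with foo-bar means adding one name to the list rather than changing the index.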

Boosting results based on selected types in elasticsearch

I have different types indexed in Elasticsearch.
If I want to boost my results for some selected types, what should I do?
I could use a type filter in a boosting query, but the type filter allows only one type. I need results to be boosted on the basis of multiple types.
Example:
I have Person, Event, and Location data indexed in Elasticsearch, where Person, Location, and Event are my types.
I am searching for the keyword 'London' in all types, but I want Person and Event records to be boosted above Location.
How could I achieve this?
One way to get the desired behaviour is to wrap your query inside a bool query and use the should clause to boost certain documents.
Small example:
POST test/person
{
"title": "london elise moore"
}
POST test/event
{
"title" : "london is a great city"
}
Without boost:
GET test/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"title": "london"
}
}
]
}
}
}
With the following response:
"hits": {
"total": 2,
"max_score": 0.2972674,
"hits": [
{
"_index": "test",
"_type": "person",
"_id": "AVVx621GYvUb9aQn6r5X",
"_score": 0.2972674,
"_source": {
"title": "london elise moore"
}
},
{
"_index": "test",
"_type": "event",
"_id": "AVVx63LrYvUb9aQn6r5Y",
"_score": 0.26010898,
"_source": {
"title": "london is a great city"
}
}
]
}
And now with the added should clause:
GET test/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"title": "london"
}
}
],
"should": [
{
"term": {
"_type": {
"value": "event",
"boost": 2
}
}
}
]
}
}
}
Which gives back the following response:
"hits": {
"total": 2,
"max_score": 1.0326607,
"hits": [
{
"_index": "test",
"_type": "event",
"_id": "AVVx63LrYvUb9aQn6r5Y",
"_score": 1.0326607,
"_source": {
"title": "london is a great city"
}
},
{
"_index": "test",
"_type": "person",
"_id": "AVVx621GYvUb9aQn6r5X",
"_score": 0.04235228,
"_source": {
"title": "london elise moore"
}
}
]
}
You could even leave out the extra boost in the should clause, because a matching should clause already raises the score :)
Hope this helps!
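The example above boosts a single type, while the question asks for several. A minimal Python sketch of the same should-clause idea for multiple types follows. The helper name boost_types_query is hypothetical. The _type field matches the example above, but later Elasticsearch versions removed mapping types, so type_field is left as a parameter on the assumption that you may store the type in a keyword field of your own.

```python
import json


def boost_types_query(text_query, type_boosts, type_field="_type"):
    """Require `text_query` and add one boosted should clause per preferred type."""
    should = [
        {"term": {type_field: {"value": type_name, "boost": boost}}}
        for type_name, boost in type_boosts.items()
    ]
    return {"query": {"bool": {"must": [text_query], "should": should}}}


if __name__ == "__main__":
    body = boost_types_query(
        {"match": {"title": "london"}},
        {"person": 2, "event": 2},
    )
    print(json.dumps(body, indent=2))
```

Types that are not listed, such as location in the question, still match through the must clause and simply receive no extra score.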
I see two ways of doing that, but both use scripts.
1. Using sorting
POST c1_1/_search
{
"from": 0,
"size": 10,
"sort": [
{
"_script": {
"order": "desc",
"type": "number",
"script": "double boost = 1; if(doc['_type'].value == 'Person') { boost *= 2 }; if(doc['_type'].value == 'Event') { boost *= 3}; return _score * boost; ",
"params": {}
}
},
{
"_score": {}
}
],
"query": {
"bool": {
"should": [
{
"query_string": {
"query": "*",
"default_operator": "and"
}
}
],
"minimum_should_match": "1"
}
}
}
2. Using function score
POST c1_1/_search
{
"from": 0,
"size": 10,
"query": {
"function_score": {
"query": {
"bool": {
"should": [
{
"query_string": {
"query": "*",
"default_operator": "and"
}
}
],
"minimum_should_match": "1"
}
},
"script_score": {
"script": "_score * (doc['_type'].value == 'Person' || doc['_type'].value == 'Event'? 2 : 1)"
}
}
}
}
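A script-free variant of the function score option is also possible. To be clear, this is a different technique from the script_score above: it uses a function_score function with a filter and a weight. The Python sketch below is an illustration only; boost_types_function_score and boosted_score are hypothetical names, type_field defaults to the _type field used above, and boosted_score just mirrors the intended arithmetic locally.

```python
import json


def boost_types_function_score(base_query, boosted_types, weight=2, type_field="_type"):
    """Multiply the score of documents whose type is in `boosted_types` by `weight`."""
    return {
        "query": {
            "function_score": {
                "query": base_query,
                "functions": [
                    # The weight applies only to documents that pass this filter.
                    {"filter": {"terms": {type_field: list(boosted_types)}}, "weight": weight}
                ],
                "boost_mode": "multiply",
            }
        }
    }


def boosted_score(score, doc_type, boosted_types, weight=2):
    """Local mirror of the intended arithmetic, for checking the ordering offline."""
    return score * weight if doc_type in boosted_types else score


if __name__ == "__main__":
    body = boost_types_function_score({"match_all": {}}, ["Person", "Event"])
    print(json.dumps(body, indent=2))
    print(boosted_score(0.5, "Person", ["Person", "Event"]))
    print(boosted_score(0.5, "Location", ["Person", "Event"]))
```

Documents of other types are expected to keep their original query score, since no function applies to them.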
