Elasticsearch prioritize specific _ids but don't filter? - elasticsearch

I'm trying to sort an Elasticsearch query so that documents with specific _ids appear first. The _ids should not filter the results; they should only be prioritized.
Here's an example of what I've tried as an attempt:
{"query":{"constant_score":{"filter":{"terms":{"_id":[2,3,4]}},"boost":2}}}
The above would be combined with other queries. However, the query returns only the exact matches and not the rest of the results.
Any ideas on how to prioritize the documents with these ids without filtering the entire query?

Try this (instead of the match_all there, you can use a query that actually filters the results):
{
"query": {
"function_score": {
"query": {
"match_all": {}
},
"functions": [
{
"filter": {
"terms": {
"_id": [
2,
3,
4
]
}
},
"weight": 2
}
]
}
}
}
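The same request body can be built programmatically. The following is a minimal Python sketch, not a definitive implementation: the helper name prioritize_ids and the title field are hypothetical, and the ids are converted to strings on the assumption that _id values are stored as strings.

```python
def prioritize_ids(base_query, ids, weight=2):
    """Wrap base_query so that documents whose _id is in ids get a higher
    score, while every other document matching base_query is still returned."""
    return {
        "query": {
            "function_score": {
                # The base query decides WHICH documents match.
                "query": base_query,
                # The function only changes the score of the listed ids.
                "functions": [
                    {
                        "filter": {"terms": {"_id": [str(i) for i in ids]}},
                        "weight": weight,
                    }
                ],
            }
        }
    }


# Hypothetical base query on a made-up "title" field.
body = prioritize_ids({"match": {"title": "elasticsearch"}}, [2, 3, 4])
```

The resulting dict can be passed as the body of a search request with whichever client you use.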

If you need the results returned in an exact order, use a script-based sort:
"sort": [
{
"_script": {
"script": "doc['id'] != null ? sortOrder.indexOf(doc['id'].value.toInteger()) : 0",
"type": "number",
"params": {
"sortOrder": [
2,3,4
]
},
"order": "desc"
}
},
"_score"
]
P.S. As @Val mentioned, this will not work with _id, so you would need to store the id in a separate field.
If you only need to move documents to the top, look at function_score.
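If storing a separate id field is not an option, the exact ordering can also be applied on the client after the hits come back. This is a different technique from the script sort above, and it only reorders the page of hits already retrieved. A minimal Python sketch, where pin_to_front and the sample hits are hypothetical:

```python
def pin_to_front(hits, pinned_ids):
    """Return hits with pinned_ids first (in the given order),
    followed by the remaining hits in their original order."""
    rank = {str(doc_id): pos for pos, doc_id in enumerate(pinned_ids)}
    # sorted() is stable, so unpinned hits keep the order Elasticsearch returned.
    return sorted(hits, key=lambda hit: rank.get(hit["_id"], len(rank)))


# Hand-written sample hits, shaped like entries of response["hits"]["hits"].
hits = [{"_id": "7"}, {"_id": "4"}, {"_id": "2"}, {"_id": "9"}]
ordered = pin_to_front(hits, [2, 3, 4])
print([hit["_id"] for hit in ordered])  # ['2', '4', '7', '9']
```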

Related

elasticsearch how do i query (search) in single document?

Assume the index is named index and document 1's id is "1".
How can I query within a single document?
Something like this:
GET index/_search
{
"query": {
"id": "1",
"terms": ["is this text in document 1?"]
}
}
or
GET index/_doc/1/_search
{
...
}
As far as I have found,
GET test/_doc/_search
{
"query": {
"terms" : {
"_id" : ["1"]
}
}
}
This retrieves the document with id "1", but I cannot run any further queries against it.
The reason I want to query inside a single document is that my app uses a live-news view.
Once news is retrieved from the server, I want to search it in Elasticsearch for keyword highlighting and spam filtering.
You have to compose your query with a Boolean query.
The best approach is to put the id query under filter, because it will then have no effect on scoring. You can then add queries under must, must_not and should as needed:
GET index/_search
{
"from": 0,
"size": 10,
"query": {
"bool": {
"must": [
{
"term": {
"field": "value"
}
}
],
"must_not": [],
"should": [],
"filter": [
{
"terms": {"_id": ["1"]}
}
]
}
}
}
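For the highlighting use case from the question, the same bool structure can be combined with a highlight section. A minimal Python sketch under stated assumptions: search_within_document is a hypothetical helper, body is a made-up field name, and the ids query is used in place of terms on _id.

```python
def search_within_document(doc_id, field, text):
    """Build a request that matches text in field, restricted to one document."""
    return {
        "query": {
            "bool": {
                # The scored part: does the text occur in the field?
                "must": [{"match": {field: text}}],
                # The unscored part: only look at this one document.
                "filter": [{"ids": {"values": [doc_id]}}],
            }
        },
        # Ask for highlighted fragments of the matched field.
        "highlight": {"fields": {field: {}}},
    }


request_body = search_within_document("1", "body", "is this text in document 1?")
```

If the response contains no hits, the text was not found in that document.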

How to sort elasticsearch results based on number of collapsed items?

I'm using a query with collapse to gather documents under a certain person, but I want to sort the results by the number of documents in which the search found a match. This is my query:
GET documents/_search
{
"_source": {
"includes": [
"text"
]
},
"query": {
"query_string": {
"fields": [
"text"
],
"query": "some text"
}
},
"collapse": {
"field": "person_id",
"inner_hits": {
"name": "top_mathing_docs",
"_source": {
"includes": [
"doc_year",
"text"
]
}
}
}
}
Any suggestions?
Thanks
If I understand correctly, you want to sort the parent documents by the number of inner_hits, i.e. the number of matching documents per person_id.
That means the _score of the parent documents in the result doesn't matter.
The only way I've found to do this is the Top Hits Aggregation, following its field collapse example. Below is what your query would look like.
Aggregation query (field collapse example):
POST <your_index_name>/_search
{
"size":0,
"query": {
"query_string": {
"fields": [
"text"
],
"query": "some text"
}
},
"aggs": {
"top_person_ids": {
"terms": {
"field": "person_id"
},
"aggs": {
"top_tags_hits": {
"top_hits": {
"size": 10
}
}
}
}
}
}
Note that I'm assuming person_id is of type keyword or a numeric type.
Also, if you look at the query closely, I've set "size": 0, which means only the aggregation results are returned. The terms aggregation orders its buckets by doc_count in descending order by default, so the persons with the most matching documents come first.
Another note: the above aggregation has nothing to do with the field collapse feature of the search request that you posted in the question. It's just that with this aggregation, your result can be formatted in a similar way.
Let me know if this helps!
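To read the result of that aggregation, walk the buckets in the order they are returned. A minimal Python sketch: persons_by_match_count is a hypothetical helper, and the sample response is hand-written to show the shape, not real output.

```python
def persons_by_match_count(response, agg_name="top_person_ids",
                           hits_name="top_tags_hits"):
    """Return (person_id, match_count, documents) tuples in bucket order."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [
        (
            bucket["key"],
            bucket["doc_count"],
            [hit["_source"] for hit in bucket[hits_name]["hits"]["hits"]],
        )
        for bucket in buckets
    ]


# Hand-written sample in the shape of a terms + top_hits aggregation response.
sample = {
    "aggregations": {
        "top_person_ids": {
            "buckets": [
                {"key": 17, "doc_count": 3,
                 "top_tags_hits": {"hits": {"hits": [{"_source": {"text": "a"}}]}}},
                {"key": 42, "doc_count": 1,
                 "top_tags_hits": {"hits": {"hits": [{"_source": {"text": "b"}}]}}},
            ]
        }
    }
}
rows = persons_by_match_count(sample)
```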

Filtering across multiple indices using ElasticSearch

Is it possible to write a conditional filter on an Elasticsearch multi-index query?
I am looking at the script filter, but I can't find anywhere in the documentation whether the document's index is available as a variable I can check.
My existing query looks like this. Note that the script filter doesn't work, but I assume this is where I need to put my condition.
{
"index": "tweets,articles,animals,buildings",
"type": "item",
"body": {
"query": {
"multi_match": {
"query": "cat",
"type": "phrase_prefix",
"fields": [
"label",
"body"
]
}
},
"filter": {
"script": {
"script": "if (_index == \"animals\") {return true;} else {return false;}"
}
},
"from": 0,
"size": 8
}
}
Obviously I'd like to do more in this filter than just exclude items from a certain index, this is simply an example.
You should be able to solve this by combining several indices queries, each of which applies a query only to documents from the listed indices.
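An alternative that does not need a script is to query the _index metadata field with term clauses inside a bool query. This is a swapped-in technique, not the indices query named above. A minimal Python sketch: per_index_filter is a hypothetical helper and the species field is made up for illustration.

```python
def per_index_filter(conditions):
    """conditions maps an index name to the extra clause that documents
    from that index must satisfy. A document passes if it satisfies the
    clause registered for its own index."""
    return {
        "bool": {
            "should": [
                {"bool": {"filter": [{"term": {"_index": index}}, clause]}}
                for index, clause in conditions.items()
            ],
            "minimum_should_match": 1,
        }
    }


index_filter = per_index_filter({
    # Hypothetical condition: only cats from the animals index...
    "animals": {"term": {"species": "cat"}},
    # ...but everything from the tweets index.
    "tweets": {"match_all": {}},
})
request_body = {
    "query": {
        "bool": {
            "must": [{"multi_match": {"query": "cat", "type": "phrase_prefix",
                                      "fields": ["label", "body"]}}],
            "filter": [index_filter],
        }
    },
    "from": 0,
    "size": 8,
}
```

Indices that have no entry in conditions are excluded by this filter, so list every index you want results from.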

ElasticSearch more_like_this with restricted result set

I want to run a more_like_this query, but only get the top results within a specific set of documents, so I would provide the IDs of these documents. Is there any way to do this? Docs indicate no.
One way would be to use a filtered query with an ids filter that specifies the set of documents you want the more_like_this query to work on.
Example:
{
"query": {
"filtered": {
"query": {
"more_like_this": {
"fields": [
"ticker.whitespace"
],
"like_text": "WFC",
"min_term_freq": 1,
"max_query_terms": 12
}
},
"filter": {
"ids": {
"values": [
"7667"
]
}
}
}
}
}
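On newer Elasticsearch versions where the filtered query is no longer available, the same idea can be expressed with a bool query. A minimal Python sketch under that assumption: more_like_this_within is a hypothetical helper, and like is used in place of like_text.

```python
def more_like_this_within(ids, fields, like):
    """Run more_like_this, but only score documents whose id is in ids."""
    return {
        "query": {
            "bool": {
                "must": [
                    {
                        "more_like_this": {
                            "fields": fields,
                            "like": like,
                            "min_term_freq": 1,
                            "max_query_terms": 12,
                        }
                    }
                ],
                # The ids clause restricts the candidate set without scoring.
                "filter": [{"ids": {"values": ids}}],
            }
        }
    }


request_body = more_like_this_within(["7667"], ["ticker.whitespace"], "WFC")
```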

Filter elasticsearch results to contain only unique documents based on one field value

All my documents have a uid field with an ID that links the document to a user. There are multiple documents with the same uid.
I want to perform a search over all the documents returning only the highest scoring document per unique uid.
The query selecting the relevant documents is a simple multi_match query.
You need a top_hits aggregation.
And for your specific case:
{
"query": {
"multi_match": {
...
}
},
"aggs": {
"top-uids": {
"terms": {
"field": "uid"
},
"aggs": {
"top_uids_hits": {
"top_hits": {
"sort": [
{
"_score": {
"order": "desc"
}
}
],
"size": 1
}
}
}
}
}
}
The query above performs your multi_match query and aggregates the results by uid. For each uid bucket it returns only one result, the top document after all documents in the bucket are sorted by _score in descending order.
Elasticsearch 5.3 added support for field collapsing. You should be able to do something like:
GET /_search
{
"query": {
"multi_match" : {
"query": "this is a test",
"fields": [ "subject", "message", "uid" ]
}
},
"collapse" : {
"field" : "uid"
},
"size": 20,
"from": 100
}
The benefit of using field collapsing instead of a top hits aggregation is that you can use pagination with field collapsing.
