I'm using the following query on Elasticsearch 2.3.3:
es_query = {
    "fields": ["title", "content"],
    "query": {
        "multi_match": {
            "query": "potato tomato",
            "type": "best_fields",
            "fields": ["title_cuis", "content_cuis"]
        }
    }
}
I would like the results to be scored so that the first document returned is the one with the most occurrences of the words "tomato" and "potato". That doesn't seem to happen. How can I modify the query to get this without re-indexing?
You're using best_fields, which takes the highest score obtained from matching title_cuis or content_cuis separately, rather than combining the two fields.
Take a look at cross_fields.
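To make the difference concrete, here is a rough Python sketch of how best_fields combines per-field scores, assuming the dis_max behaviour described in the multi_match documentation. The function name and the scores are made up for illustration; tie_breaker is an actual multi_match parameter.

```python
def best_fields_score(field_scores, tie_breaker=0.0):
    """Combine per-field scores the way a dis_max query does (simplified).

    The best field's score is kept in full; every other field only
    contributes its score multiplied by tie_breaker.
    """
    if not field_scores:
        return 0.0
    best = max(field_scores)
    others = sum(field_scores) - best
    return best + tie_breaker * others


# Hypothetical scores for title_cuis and content_cuis on one document.
scores = [2.0, 1.0]
only_best = best_fields_score(scores)                   # tie_breaker 0.0
partly_combined = best_fields_score(scores, 0.5)
fully_summed = best_fields_score(scores, 1.0)
```

With the default tie_breaker of 0.0 only the best field counts, so matches in the other field do not lift the document.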
What I need is for Elasticsearch to search multiple fields and return results by field priority.
For example, for the search string obil hon, Elasticsearch should search the fields [title, description, modelCaption] and return documents where Mobile Phone is found in the title field first, then those that match in the other fields.
Query I use:
{
    "from": 0,
    "query": {
        "bool": {
            "must": [
                {
                    "query_string": {
                        "default_operator": "or",
                        "fields": [
                            "title^5",
                            "description",
                            "modelCaption",
                            "productFeatureValues.featureValue",
                            "productFeatureValues.featureCaption"
                        ],
                        "query": "*obil* *hone*"
                    }
                }
            ]
        }
    },
    "size": 16
}
Any suggestions?
Thanks!
You can simply use the multi_match query to search multiple fields. It supports boosting a particular field, such as title in your case, and different operators, such as OR in your case.
Sample ES query for your use case:
{
    "query": {
        "multi_match": {
            "query": "mobile phones",
            "fields": ["title^5", "description", "modelCaption", "productFeatureValues.featureVal"],
            "fuzziness": "AUTO"
        }
    }
}
Here the title field is boosted by a factor of 5, so a match on mobile phones in the title field is scored higher.
Also note that your query string uses wildcards, which are very costly, so it's better to avoid them if you can.
EDIT: Based on the OP's comments, I added the fuzziness parameter, set to AUTO, to the query for better results.
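For reference, fuzziness AUTO picks the allowed edit distance from the length of each term. A small Python sketch of that rule, assuming the default AUTO thresholds of 3 and 6 characters (the function name is hypothetical, not part of any client library):

```python
def auto_fuzziness(term, low=3, high=6):
    """Return the edit distance AUTO would allow for a term (default AUTO:3,6).

    Terms shorter than `low` must match exactly, terms shorter than `high`
    may differ by one edit, and longer terms may differ by two edits.
    """
    length = len(term)
    if length < low:
        return 0
    if length < high:
        return 1
    return 2


# Allowed edits for a few sample terms.
allowed = {term: auto_fuzziness(term) for term in ["tv", "phone", "mobile"]}
```

Short terms get little or no tolerance, so very short fragments may still fail to match.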
I'm wondering why Elasticsearch doesn't give me any results for the following multi_match query:
GET /stag/_search
{
    "query": {
        "multi_match": {
            "type": "phrase_prefix",
            "query": "ferran ma",
            "fields": ["fullName", "fullName.folded"]
        }
    }
}
But it gives me results for:
GET /stag/_search
{
    "query": {
        "multi_match": {
            "type": "phrase_prefix",
            "query": "ferran may",
            "fields": ["fullName", "fullName.folded"]
        }
    }
}
I thought there might be a minimum character length per word, but then I saw that the following query:
GET /stag/_search
{
    "query": {
        "multi_match": {
            "type": "phrase_prefix",
            "query": "ignasi t",
            "fields": ["fullName", "fullName.folded"]
        }
    }
}
gives me results. So I have no idea what's going on.
It seems the problem is explained in the Elasticsearch documentation:
The match_phrase_prefix query is a poor-man’s autocomplete. It is very
easy to use, which lets you get started quickly with
search-as-you-type but its results, which usually are good enough, can
sometimes be confusing.
Consider the query string quick brown f. This query works by creating
a phrase query out of quick and brown (i.e. the term quick must exist
and must be followed by the term brown). Then it looks at the sorted
term dictionary to find the first 50 terms that begin with f, and adds
these terms to the phrase query.
The problem is that the first 50 terms may not include the term fox so
the phrase quick brown fox will not be found. This usually isn’t a
problem as the user will continue to type more letters until the word
they are looking for appears.
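The behaviour described in that quote can be imitated with a small Python sketch. This is only a simulation under simplifying assumptions (a plain sorted list stands in for the term dictionary, and the helper name is made up); max_expansions is the actual parameter that controls the limit of 50.

```python
def expand_prefix(term_dictionary, prefix, max_expansions=50):
    """Return the first `max_expansions` terms, in sorted order, that start with `prefix`."""
    matches = [term for term in sorted(term_dictionary) if term.startswith(prefix)]
    return matches[:max_expansions]


# Toy term dictionary; a tiny limit is used so the cut-off is visible.
terms = ["fox", "fable", "frog", "face", "fact"]

short_prefix = expand_prefix(terms, "f", max_expansions=3)
longer_prefix = expand_prefix(terms, "fo", max_expansions=3)

fox_found_with_f = "fox" in short_prefix    # cut off before reaching "fox"
fox_found_with_fo = "fox" in longer_prefix  # a longer prefix narrows the candidates
```

This would explain why "ferran ma" can return nothing while "ferran may" works: more terms may start with "ma" than fit in the expansion limit. Raising max_expansions in the query is one option to try, at some cost in performance.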
In Elasticsearch, how do I sort documents based on finding a phrase in the following order of fields?
Search Phrase: Miami
Fields: Title, Content, Topics
A document where the phrase is found in Title, Content, and Topics should appear before documents where the phrase is found only in Content.
Maybe there is a way to say:
if phrase found in Title then weight 2
if phrase found in Content then weight 1.5
if phrase found in Topics then weight 1
and the sum of these weights would be combined with _score.
My current query looks like this:
{
    "index": "abc",
    "type": "mydocuments",
    "body": {
        "query": {
            "multi_match": {
                "query": "miami",
                "type": "phrase",
                "fields": [
                    "title",
                    "content",
                    "topics",
                    "destinations"
                ]
            }
        }
    }
}
You can boost fields with the caret (^) notation so that matches in them score higher than matches in other fields:
{
    "index": "abc",
    "type": "mydocuments",
    "body": {
        "query": {
            "multi_match": {
                "query": "miami",
                "type": "phrase",
                "fields": [
                    "title^10",
                    "content^3",
                    "topics",
                    "destinations"
                ]
            }
        }
    }
}
Here I have applied a weight of 10 to title and a weight of 3 to content. Documents are returned in decreasing _score order, so you need to boost the fields that you consider more important. The boost values are up to you and may take a little trial and error before documents come back in your preferred order.
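If the weights live in application code, the fields list can be generated rather than typed by hand. A minimal Python sketch, assuming the weights from the question (2, 1.5 and 1); the helper name boosted_fields is hypothetical:

```python
def boosted_fields(weights):
    """Turn {"field": boost} into multi_match field strings such as "title^2".

    A boost of 1 is the default, so those fields are emitted without a caret.
    """
    fields = []
    for name, boost in weights.items():
        fields.append(name if boost == 1 else f"{name}^{boost}")
    return fields


# Weights taken from the question; tune them against real results.
weights = {"title": 2, "content": 1.5, "topics": 1}

query = {
    "multi_match": {
        "query": "miami",
        "type": "phrase",
        "fields": boosted_fields(weights),
    }
}
```

Note that a boost multiplies the score of a field's match; it is not a fixed amount added to _score.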
I want to find an exact phrase (for instance, "the quick brown fox") across multiple fields in a document.
Right now, I'm using something like this:
{
    "query": {
        "filtered": {
            "query": {
                "multi_match": {
                    "fields": [
                        "subject",
                        "comments"
                    ],
                    "query": "the quick brown fox"
                }
            },
            "filter": {
                "and": [
                    {
                        "term": {
                            "priority": "high"
                        }
                    }
                    ...more ands
                ]
            }
        }
    }
}
The question is: how can I do this correctly? Right now I get the best match first, which tends to be the entire phrase, but I also get a lot of near matches.
If you are using an Elasticsearch cluster with version >= 1.1.0, you can set the type of your multi_match query to phrase:
...
"query": {
    "multi_match": {
        "fields": [
            "subject",
            "comments"
        ],
        "query": "the quick brown fox",
        "type": "phrase"
    }
...
It will replace the match query generated for each field with a match_phrase query, which returns only the documents containing the full phrase (you can find details in the documentation).
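The difference between the two behaviours can be illustrated with a simplified Python sketch. It assumes whitespace tokenization and lowercasing only, and ignores analyzers, slop and scoring; both function names are made up.

```python
def contains_any_term(text, query):
    """Rough stand-in for a default match query: any query term may match."""
    tokens = set(text.lower().split())
    return any(term in tokens for term in query.lower().split())


def contains_phrase(text, query):
    """Rough stand-in for match_phrase: all terms, adjacent and in order."""
    tokens = text.lower().split()
    wanted = query.lower().split()
    if not wanted:
        return False
    size = len(wanted)
    return any(tokens[i:i + size] == wanted for i in range(len(tokens) - size + 1))


phrase = "the quick brown fox"
exact_doc = "Yesterday the quick brown fox jumped"
near_doc = "the quick red fox"

near_matches_terms = contains_any_term(near_doc, phrase)   # near match slips through
near_matches_phrase = contains_phrase(near_doc, phrase)    # rejected as a phrase
exact_matches_phrase = contains_phrase(exact_doc, phrase)
```

The near matches in the question are documents like near_doc: they share terms with the query but not the whole phrase.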
How are you analyzing the subject/comments fields? If you want an exact match, you'll need to use the keyword tokenizer at both index and search time.
I have an Elasticsearch index that I am trying to search for matches on multiple fields (title and description). If a particular term shows up in the title, I want to boost the score to twice the original score. If it is in the description, the score should stay as it is. I am a bit confused by the Elasticsearch documentation. Can anyone help me adjust the following query to reflect this logic?
{
    "query": {
        "query_string": {
            "query": "string",
            "fields": ["title", "description"]
        }
    }
}
You just need to add ^2 to get a boost on the field you want:
{
    "query": {
        "query_string": {
            "query": "string",
            "fields": ["title^2", "description"]
        }
    }
}