Elasticsearch doesn't return results on a multi match query

I'm wondering why Elasticsearch doesn't give me any results for the following Multi Match Query:
GET /stag/_search
{
"query": {
"multi_match": {
"type": "phrase_prefix",
"query": "ferran ma",
"fields": [ "fullName", "fullName.folded" ]
}
}
}
But it gives me results on:
GET /stag/_search
{
"query": {
"multi_match": {
"type": "phrase_prefix",
"query": "ferran may",
"fields": [ "fullName", "fullName.folded" ]
}
}
}
I thought that maybe there is a minimum character length per word, but then I saw that the following query:
GET /stag/_search
{
"query": {
"multi_match": {
"type": "phrase_prefix",
"query": "ignasi t",
"fields": [ "fullName", "fullName.folded" ]
}
}
}
is giving me results. So I have no idea what's going on.

It seems the problem is explained by this passage on the match_phrase_prefix query:
The match_phrase_prefix query is a poor-man’s autocomplete. It is very
easy to use, which lets you get started quickly with
search-as-you-type but its results, which usually are good enough, can
sometimes be confusing.
Consider the query string quick brown f. This query works by creating
a phrase query out of quick and brown (i.e. the term quick must exist
and must be followed by the term brown). Then it looks at the sorted
term dictionary to find the first 50 terms that begin with f, and adds
these terms to the phrase query.
The problem is that the first 50 terms may not include the term fox so
the phrase quick brown fox will not be found. This usually isn’t a
problem as the user will continue to type more letters until the word
they are looking for appears.
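The behaviour described above can be illustrated with a toy simulation. This is only a sketch of the idea, not how Lucene implements it: the term dictionary, the filler terms and the surname "mayol" are all made up for the example, and expand_prefix is a hypothetical helper.

```python
from bisect import bisect_left


def expand_prefix(sorted_terms, prefix, max_expansions=50):
    """Return the first `max_expansions` terms, in sorted order, that start with `prefix`."""
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # sorted input: no later term can share the prefix
        matches.append(term)
        if len(matches) == max_expansions:
            break  # expansion limit reached; remaining terms are ignored
    return matches


# Hypothetical term dictionary: 60 filler terms "ma00".."ma59" sort before "mayol".
terms = sorted([f"ma{i:02d}" for i in range(60)] + ["mayol", "torres"])

short_prefix = expand_prefix(terms, "ma")    # only the first 50 "ma*" terms
longer_prefix = expand_prefix(terms, "may")  # narrower prefix, fewer candidates

print("mayol" in short_prefix)   # False: cut off by the 50-term limit
print("mayol" in longer_prefix)  # True
```

If this is what is happening with "ferran ma", raising max_expansions on the multi_match query (its default is 50) may help, at the cost of a more expensive query.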

Related

Elastic Search: query_string query does not match exact phrase in full text search

I am using Elasticsearch 6.2.3. We use the query_string query for full-text search. At present, if we search for lazy brown fox, it matches any file that contains all of the words lazy, brown and fox, but it does not look for the exact phrase 'lazy brown fox', even though the default slop is zero.
Here is an example:
{
"query": {
"query_string": {
"fields": [],
"query": "lazy AND brown AND fox"
}
}
}
I have looked at the match_phrase query, but the issue is that we have to specify the field name(s) in a match_phrase query, whereas query_string works with an empty array in the fields option.
Please suggest how to get exact phrase match results using the query_string full-text query.
Instead of the match_phrase query, to run a phrase query on multiple fields take a look at multi_match, which supports the phrase type:
POST phrase_index/_search
{
"query": {
"multi_match": {
"query": "this is where it should work",
"fields": [],
"type": "phrase"
}
}
}
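If you would rather stay with query_string, its query syntax treats text wrapped in double quotes as a phrase. A minimal sketch, assuming the request body is built in Python; phrase_query_string is a hypothetical helper, and omitting fields here means the index's default field setting applies.

```python
import json


def phrase_query_string(user_input):
    """Wrap the user's text in double quotes so query_string parses it as a phrase."""
    # Escape backslashes and embedded quotes so the input cannot end the phrase early.
    escaped = user_input.replace("\\", "\\\\").replace('"', '\\"')
    return {"query": {"query_string": {"query": f'"{escaped}"'}}}


body = phrase_query_string("lazy brown fox")
print(json.dumps(body))
```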

ElasticSearch Multi-match and scoring

I'm using the following query on Elastic Search 2.3.3
es_query = {
"fields": ["title", "content"],
"query":
{
"multi_match" : {
"query": "potato tomato",
"type": "best_fields",
"fields": [ "title_cuis", "content_cuis" ]
}
}
}
I would like the results to be scored so that the first document returned is the one that contains the highest occurrence of the words "tomato" and "potato", but this doesn't seem to happen and I was wondering how I can modify the query to get that without re-indexing.
You're using best_fields, which scores each document by the highest score from either title_cuis or content_cuis alone, rather than combining the two.
Take a look at cross_fields.
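To make the best_fields behaviour concrete, here is a small sketch of how the per-field scores are combined, including the optional tie_breaker parameter of multi_match. The numeric field scores are invented for illustration, and best_fields_score is a hypothetical helper, not an Elasticsearch API.

```python
def best_fields_score(field_scores, tie_breaker=0.0):
    """Best field's score plus tie_breaker times the scores of the other matching fields."""
    if not field_scores:
        return 0.0
    best = max(field_scores)
    others = sum(field_scores) - best
    return best + tie_breaker * others


# Hypothetical scores for title_cuis and content_cuis on one document.
scores = [2.0, 1.0]

print(best_fields_score(scores))                   # only the best field counts
print(best_fields_score(scores, tie_breaker=0.5))  # the other field contributes half
```

With the default tie_breaker of 0.0, a match in the second field adds nothing, which is why a document that matches in both fields does not necessarily rank first.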

Elasticsearch: multi_match phrase_prefix query with multiple search terms

I have a database with entries like
title: This is my awesome title
abstract: A more detailed descriptions of what [...]
I would like to build an Elasticsearch query that matches the above document with, e.g.,
awe detai
In words: A multi_match phrase_prefix query with multiple search terms. (This is intended to be used as a search-as-you-type feature.)
I see how you can combine multi_match and phrase_prefix, but it's unclear to me how to do this for multiple search terms.
Any hints?
Well, there are a few ways to do that. First, index a sample document:
POST stack/autocomplete/1
{
"title": "This is my awesome title",
"abstract": "A more detailed descriptions of what"
}
Then you can search using query_string with a wildcard, but the problem here is that you need to append an asterisk to each term of the query:
POST stack/autocomplete/_search
{
"query": {
"query_string": {
"fields": [
"title",
"abstract"
],
"query": "awe* detai*"
}
}
}
If you want to match on the user's query as typed, you can use phrase_prefix like this:
POST stack/autocomplete/_search
{
"query": {
"multi_match": {
"fields": [
"title",
"abstract"
],
"query": "awesome tit",
"type": "phrase_prefix"
}
}
}
One more option to consider would be to use an nGram analyzer with query_string, so that you do not need to rewrite the user's query as "awe* detai*".
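Why n-grams remove the need for wildcards can be sketched in plain Python. Note that this uses edge n-grams (prefixes only) rather than full n-grams, and it is a toy model of index-time analysis, not Elasticsearch code; all function names and the gram sizes are assumptions.

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """All prefixes of `token` between min_gram and max_gram characters long."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_terms(text):
    """Toy index-time analysis: lowercase, split on whitespace, emit edge n-grams."""
    terms = set()
    for token in text.lower().split():
        terms.update(edge_ngrams(token))
    return terms


def matches(query, text):
    """Toy search-time analysis: lowercase and split only, then require every word."""
    indexed = index_terms(text)
    return all(word in indexed for word in query.lower().split())


doc = "This is my awesome title A more detailed descriptions of what"
print(matches("awe detai", doc))  # True: both prefixes were indexed as terms
print(matches("some tit", doc))   # False: "some" is not a prefix of any token
```

The important design point is that the query itself is not n-grammed: the prefixes are stored at index time, so the user's partial words can be looked up as ordinary terms.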

Finding an exact phrase in multiple fields with Elasticsearch

I want to find an exact phrase (for instance, "the quick brown fox") across multiple fields in a document.
Right now, I'm using something like this:
{
"query": {
"filtered": {
"query": {
"multi_match": {
"fields": [
"subject",
"comments"
],
"query": "the quick brown fox"
}
},
"filter": {
"and": [
{
"term": {
"priority": "high"
}
}
...more ands
]
}
}
}
}
The question is, how can I do this correctly? Right now I'm getting the best match first, which tends to be the entire phrase, but I'm getting a load of near matches too.
If you are using an Elasticsearch cluster with version >= 1.1.0, you could set the type of your multi_match query to phrase:
...
"query": {
"multi_match": {
"fields": [
"subject",
"comments"
],
"query": "the quick brown fox",
"type": "phrase"
}
...
It will replace the match query generated for each field with a match_phrase one, which will return only the documents containing the full phrase (you can find details in the documentation).
How are you analyzing the subject/comments fields? If you want an exact match, you'll need to use the keyword tokenizer at both index and search time.
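As a rough illustration of that rewrite, the phrase type behaves like best_fields with a match_phrase query per field, which corresponds to a dis_max over those queries. The sketch below only builds the equivalent request body as a Python dict; phrase_multi_match_as_dis_max is a hypothetical helper.

```python
import json


def phrase_multi_match_as_dis_max(query, fields):
    """Build the dis_max of per-field match_phrase queries that a phrase multi_match resembles."""
    return {
        "dis_max": {
            "queries": [{"match_phrase": {field: query}} for field in fields]
        }
    }


expanded = phrase_multi_match_as_dis_max("the quick brown fox", ["subject", "comments"])
print(json.dumps(expanded, indent=2))
```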

Elasticsearch rescore all results ignoring base score

I'm trying to rescore my results with the following query:
POST /archive/item/_search
{
"query": {
"multi_match": {
"fields": ["title", "description"],
"query": "1 złoty",
"operator": "and"
}
},
"rescore": {
"window_size": 50,
"query": {
"rescore_query": {
"multi_match": {
"type": "phrase",
"fields": ["title", "description"],
"query": "1 złoty",
"slop": 10
}
},
"query_weight": 0,
"rescore_query_weight": 1
}
}
}
I'm doing this because I want to score by proximity mainly.
Also, I want to ignore source field length impact on the score.
Am I doing this right? If not, what's the best practice here?
And the second question: why is window_size needed anyway?
I don't want top results only.
The main query acts like a filter, so all the results it returns are relevant.
I guess something like "window_size": "all" would be perfect, but I couldn't find anything in the docs.
To answer your second question, the reason it's needed is because it's designed to be for top results only. Basically it's a cost issue - the assumption is that the secondary algorithm is more expensive so it was only designed to be run on the top results. There's more discussion about this here:
https://github.com/elasticsearch/elasticsearch/issues/2640
and here:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-rescore.html
Personally I think the "all" option is a great idea, maybe you should open an issue on github?
If you want to score all results returned by some other filter by proximity match, this should do:
{
"query": {
"filtered" : {
"query" : {
"multi_match": {
"type": "phrase",
"fields": ["title", "description"],
"query": "1 złoty",
"slop": 10
}
},
"filter" : {
"query": {
"multi_match": {
"fields": ["title", "description"],
"query": "1 złoty",
"operator": "and"
}
}
}
}
}
}
According to this, the filter is run before the query, so performance shouldn't be bad either. What's more, you don't score twice, because filters don't calculate scores. Another advantage is that filters can be cached, which should speed things up significantly.
Keep in mind that I did short tests only, mostly focusing on syntax, not results. You might want to double-check it.
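Note that the filtered query was deprecated in Elasticsearch 2.0 and removed in 5.0. On newer versions the same shape can be expressed with a bool query, where must carries the scoring clause and filter the non-scoring one. A minimal sketch that only builds the request body; proximity_scored_search is a hypothetical helper, and like the query above it has not been checked against real results.

```python
import json


def proximity_scored_search(text, fields, slop=10):
    """bool query: the phrase clause scores by proximity, the filter clause only restricts."""
    return {
        "query": {
            "bool": {
                # Scored clause: phrase match with slop, so closer terms rank higher.
                "must": {
                    "multi_match": {
                        "type": "phrase",
                        "fields": fields,
                        "query": text,
                        "slop": slop,
                    }
                },
                # Non-scoring clause: every term must be present.
                "filter": {
                    "multi_match": {
                        "fields": fields,
                        "query": text,
                        "operator": "and",
                    }
                },
            }
        }
    }


request_body = proximity_scored_search("1 złoty", ["title", "description"])
print(json.dumps(request_body, ensure_ascii=False, indent=2))
```

As with the filtered version, documents must satisfy both clauses, so hits where the terms are more than slop positions apart are dropped rather than ranked last.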
