Elasticsearch query: prefer exact match over partial match on multiple fields

I am doing a free-text search on documents with multiple fields. When I perform a search, I want documents that have a perfect match on any of the labels to get a higher score. Is there any way I can do this from the query?
For example, the documents have two fields called label-a and label-b, and I perform the following multi-match query:
{
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "apple",
"type": "most_fields",
"fields": [
"label-a",
"label-b"
]
}
}
]
}
}
}
I get the following results (only the relevant part):
"hits": [
{
"_index": "salad",
"_type": "fruit",
"_id": "4",
"_score": 0.581694,
"_source": {
"label-a": "apple pie and pizza",
"label-b": "pineapple with apple juice"
}
},
{
"_index": "salad",
"_type": "fruit",
"_id": "2",
"_score": 0.1519148,
"_source": {
"label-a": "grape",
"label-b": "apple"
}
},
{
"_index": "salad",
"_type": "fruit",
"_id": "1",
"_score": 0.038978107,
"_source": {
"label-a": "apple apple apple apple apple apple apple apple apple apple apple apple",
"label-b": "raspberry"
}
},
{
"_index": "salad",
"_type": "fruit",
"_id": "3",
"_score": 0.02250402,
"_source": {
"label-a": "apple pie and pizza",
"label-b": "raspberry"
}
}
]
I want the second document, the one with the value grape for label-a and the value apple for label-b, to have the highest score, because I am searching for the value apple and one of the labels has exactly that value. This should work regardless of which label the exact term appears in.

You are getting these results because Elasticsearch uses a TF/IDF model for scoring. Try additionally mapping "label-a" and "label-b" as not-analyzed (raw) fields in your index. Then rewrite your query to something like this:
{
"query": {
"bool": {
"should": {
"match": {
"label-a.raw": {
"query": "apple",
"boost": 2
}
}
},
"must": [
{
"multi_match": {
"query": "apple",
"type": "most_fields",
"fields": [
"label-a",
"label-b"
]
}
}
]
}
}
}
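The should clause will boost documents with an exact match, so you will probably get them in first place. Note that this example only boosts label-a.raw; to cover both labels, add an equivalent should clause for label-b.raw. Try playing with the boost number, and please check the query before running it. This is just an idea of what you can do.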
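Since the question asks for the boost to apply regardless of which label holds the exact value, the idea above can be extended to one should clause per field. The following is only a sketch: the helper name `build_exact_boost_query` is made up for illustration, and it assumes each label has a not-analyzed subfield named `<field>.raw`, which has to exist in your mapping. It builds the request body as a plain Python dict and does not talk to a cluster.

```python
import json


def build_exact_boost_query(text, fields, boost=2.0, raw_suffix=".raw"):
    """Full-text match on `fields`, boosted when a raw subfield equals `text`.

    Assumption: every field has a not-analyzed subfield `<field><raw_suffix>`.
    """
    # One optional clause per field, so an exact hit on any label raises the score.
    exact_clauses = [
        {"match": {field + raw_suffix: {"query": text, "boost": boost}}}
        for field in fields
    ]
    return {
        "query": {
            "bool": {
                # Required: the original analyzed multi_match.
                "must": [
                    {
                        "multi_match": {
                            "query": text,
                            "type": "most_fields",
                            "fields": list(fields),
                        }
                    }
                ],
                # Optional: these only add to the score, because a must clause is present.
                "should": exact_clauses,
            }
        }
    }


if __name__ == "__main__":
    body = build_exact_boost_query("apple", ["label-a", "label-b"])
    print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the search request body. The boost value of 2 is the same starting point as in the answer and will likely need tuning against real data.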

Related

Elasticsearch - unify search results from different indexes

I want to perform a search query on different indexes with different search queries and unify the results.
I know there is a multi-target syntax, which allows me to perform specific query over multiple indexes.
What I want is different query for each index and then perform something like UNION (SQL).
Is there a way to achieve that?
You can use the _index metadata field. This lets you query multiple indexes with a different query for each one.
Adding a working example with index data, search query and search result.
Index Data
POST /index1/_doc/1
{
"name":"foo"
}
POST /index2/_doc/1
{
"name":"bar"
}
Search Query:
{
"query": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"match": {
"name": "foo"
}
},
{
"term": {
"_index": "index1"
}
}
]
}
},
{
"bool": {
"must": [
{
"match": {
"name": "bar"
}
},
{
"term": {
"_index": "index2"
}
}
]
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "index1",
"_type": "_doc",
"_id": "1",
"_score": 1.287682,
"_source": {
"name": "foo"
}
},
{
"_index": "index2",
"_type": "_doc",
"_id": "1",
"_score": 1.287682,
"_source": {
"name": "bar"
}
}
]
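The pattern above (one bool branch per index, each pairing a query with a term on `_index`) can be generated for any number of indexes. This is a minimal sketch under the assumption that you hold the per-index queries in a dict; the function name `build_union_query` is hypothetical, and the code only builds the request body.

```python
import json


def build_union_query(queries_by_index):
    """Combine one query per index into a single bool/should request body."""
    branches = []
    for index_name, query in queries_by_index.items():
        branches.append(
            {
                "bool": {
                    "must": [
                        query,
                        # Restrict this branch to documents from one index.
                        {"term": {"_index": index_name}},
                    ]
                }
            }
        )
    # With only should clauses, a document must match at least one branch.
    return {"query": {"bool": {"should": branches}}}


if __name__ == "__main__":
    body = build_union_query(
        {
            "index1": {"match": {"name": "foo"}},
            "index2": {"match": {"name": "bar"}},
        }
    )
    print(json.dumps(body, indent=2))
```

For the two sample indexes this produces the same structure as the search query shown in the answer.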

Elastic Search 1.4 phrase query with OR operator with hyphen (-) in search string

I have an issue with a phrase query in Elasticsearch 1.4. I am creating the index below with the following data.
curl -XPUT localhost:9200/test
curl -XPOST localhost:9200/test/doc/1 -d '{"field1" : "abc-xyz"}'
curl -XPOST localhost:9200/test/doc/2 -d '{"field1" : "bcd-gyz"}'
So by default field1 is analyzed by Elasticsearch with the default analyzer.
I am running the phrase query below, but it's not returning any results.
{
"query": {
"filtered": {
"filter": {
"bool": {
"should": [
{
"query": {
"multi_match": {
"query": "abc\\-xyz OR bcd\\-gyz",
"type": "phrase",
"fields": [
"field1"
]
}
}
}
]
}
}
}
}
}
So the Elasticsearch phrase query is not working with the OR operator. Any idea why it's not working? Is it a limitation of Elasticsearch because of the special character hyphen (-) in the text?
Based on the comment, I am adding an answer that uses query_string, which works with OR between multiple phrases. It didn't work with multiple multi_match phrases, hence the query_string query.
This uses the same indexed documents as in my previous answer, but with the search query below.
{
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "\"abc-xyz\" OR \"bcd-gyz\"",
"fields": [
"title"
]
}
}
]
}
}
}
Search results
"hits": [
{
"_index": "phrasemulti",
"_type": "doc",
"_id": "1",
"_score": 0.05626005,
"_source": {
"title": "bcd-gyz"
}
},
{
"_index": "phrasemulti",
"_type": "doc",
"_id": "2",
"_score": 0.05626005,
"_source": {
"title": "abc-xyz"
}
}
]
If you remove a few characters from a phrase, that phrase no longer matches, and if you change the operator to AND, the sample data returns no search results, which is expected.
{
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "\"abc-xyz\" OR \"bcd-gz\"",
"fields": [
"title"
]
}
}
]
}
}
}
This returns only one search result, as the phrase bcd-gz does not exist in the sample data.
"hits": [
{
"_index": "phrasemulti",
"_type": "doc",
"_id": "2",
"_score": 0.05626005,
"_source": {
"title": "abc-xyz"
}
}
]
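The query_string examples above wrap each phrase in double quotes and join them with OR. A small helper can build that string from a list of phrases. This is only a sketch: `build_phrase_or_query` is a hypothetical name, the bool/must wrapper from the answer is left out for brevity, and the escaping of embedded quotes and backslashes is my own precaution rather than something the answer covers.

```python
import json


def build_phrase_or_query(phrases, fields):
    """Build a query_string body that ORs together quoted phrases."""
    quoted = []
    for phrase in phrases:
        # Escape backslashes and double quotes so each phrase stays one quoted unit.
        escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
        quoted.append('"' + escaped + '"')
    return {
        "query": {
            "query_string": {
                "query": " OR ".join(quoted),
                "fields": list(fields),
            }
        }
    }


if __name__ == "__main__":
    body = build_phrase_or_query(["abc-xyz", "bcd-gyz"], ["title"])
    print(json.dumps(body, indent=2))
```

Because the hyphen sits inside a quoted phrase, it is passed through unescaped, as in the answer's working queries.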
The query below works fine for me. Note that it passes only one phrase and does not escape the hyphen.
{
"query": {
"filtered": {
"filter": {
"bool": {
"should": [
{
"query": {
"multi_match": {
"query": "abc-xyz",
"type": "phrase",
"fields": [
"title"
]
}
}
}
]
}
}
}
}
}
Search results with explain param
"hits": [
{
"_shard": 3,
"_node": "1h3iipehS2abfclj51Vtsg",
"_index": "phrasemulti",
"_type": "doc",
"_id": "2",
"_score": 1.0,
"_source": {
"title": "abc-xyz"
},
"_explanation": {
"value": 1.0,
"description": "ConstantScore(BooleanFilter(QueryWrapperFilter(title:\"abc xyz\"))), product of:",
"details": [
{
"value": 1.0,
"description": "boost"
},
{
"value": 1.0,
"description": "queryNorm"
}
]
}
}
]
I verified that it returns results according to the phrase, as the query abc-xy doesn't return any result.

Elasticsearch query filter combination issue

I'm trying to understand why the Elasticsearch query below does not work.
EDIT:
The fields mentioned in the query are from different indices. For example, the filter uses the classification field, which is in a different index from the other fields mentioned in the query string.
The expectation of the filter is that when the user searches specifically on the classification field, i.e. secret or protected, then those values are displayed. Otherwise, if the user searches on any other field from a different index, for example firstname or person, then no filter should be applied, as firstname and person are not part of the filter.
{
"query": {
"bool": {
"filter": {
"terms": {
"classification": [
"secret",
"protected"
]
}
},
"must": {
"query_string": {
"query": "*john*",
"fields": [
"classification",
"firstname",
"releasability",
"person"
]
}
}
}
}
}
The expected result is that john in the person field is returned. This works when no filter is applied, as in the query below:
{
"query": {
"query_string": {
"query": "*john*",
"fields": [
"classification",
"firstname",
"releasability",
"person"
]
}
}
}
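The purpose of the filter is only to filter records when the said fields contain the values mentioned; otherwise it should work for all values.
Why does it not produce results for john, and only produce results for the classification values?
In a bool query, the filter and must clauses all have to match the same document, so documents that contain john but have no classification field are excluded. Putting the two conditions in separate should clauses makes either one sufficient. Adding a working example with sample index data and search query.
To learn more about the bool query, refer to the official documentation.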
Index Data:
Index data in my_index index
{
"name":"John",
"title":"b"
}
{
"name":"Johns",
"title":"a"
}
Index data in my_index1 index
{
"classification":"protected"
}
{
"classification":"secret"
}
Search Query :
POST http://localhost:9200/_search
{
"query": {
"bool": {
"should": [
{
"bool": {
"filter": [
{
"terms": {
"classification": [
"secret",
"protected"
]
}
}
]
}
},
{
"bool": {
"must": [
{
"query_string": {
"query": "*john*",
"fields": [
"name",
"title"
]
}
}
]
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "my_index",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"name": "John",
"title": "b"
}
},
{
"_index": "my_index",
"_type": "_doc",
"_id": "2",
"_score": 1.0,
"_source": {
"name": "Johns",
"title": "a"
}
},
{
"_index": "my_index1",
"_type": "_doc",
"_id": "1",
"_score": 0.0,
"_source": {
"classification": "secret"
}
},
{
"_index": "my_index1",
"_type": "_doc",
"_id": "2",
"_score": 0.0,
"_source": {
"classification": "protected"
}
}
]

Elasticsearch associating exact match terms

I have a search index of filenames containing over 100,000 entries that share about 500 unique variations of the main filename field. I have recently made some modifications to certain filename values that are being generated from my data. I was wondering if there is a way to link certain queries to return an exact match. In the following query:
"query": {
"bool": {
"must": [
{
"match": {
"filename": "foo-bar"
}
}
]
}
}
How would it be possible to modify the index and associate the results so that the above query also matches foo-bar-baz, but not foo-bar-foo or any other variation?
Thanks in advance for your help.
You can use a term query instead of a match query. Perfect to use on a keyword:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-term-query.html
Adding a working example with index data and search query. (Using the default mapping)
Index Data:
{
"fileName": "foo-bar"
}
{
"fileName": "foo-bar-baz"
}
{
"fileName": "foo-bar-foo"
}
Search Query:
{
"query": {
"bool": {
"should": [
{
"match": {
"fileName.keyword": "foo-bar"
}
},
{
"match": {
"fileName.keyword": "foo-bar-baz"
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "test",
"_type": "_doc",
"_id": "1",
"_score": 0.9808291,
"_source": {
"fileName": "foo-bar"
}
},
{
"_index": "test",
"_type": "_doc",
"_id": "2",
"_score": 0.9808291,
"_source": {
"fileName": "foo-bar-baz"
}
}
]
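One way to read the answer above is to keep the association between a search value and its allowed exact filenames on the application side, and send all of them in a single terms query against the keyword subfield. The sketch below assumes the default dynamic mapping (which provides `fileName.keyword`, as the example above relies on); the lookup table `ASSOCIATED_FILENAMES` and the function name are hypothetical.

```python
import json

# Hypothetical association table: search value -> exact filenames that should match.
ASSOCIATED_FILENAMES = {
    "foo-bar": ["foo-bar", "foo-bar-baz"],
}


def build_exact_filename_query(filename, field="fileName.keyword"):
    """Build a terms query matching only the exact filenames tied to `filename`."""
    # Fall back to the value itself when no association is registered.
    exact_values = list(ASSOCIATED_FILENAMES.get(filename, [filename]))
    return {"query": {"terms": {field: exact_values}}}


if __name__ == "__main__":
    print(json.dumps(build_exact_filename_query("foo-bar"), indent=2))
```

Because terms compares whole keyword values, foo-bar-foo is not matched unless it is added to the table.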

match query returns exact values only in elasticsearch

I have the following documents:
{
"_index": "testrest",
"_type": "testrest",
"_id": "sadfasdfw1",
"_score": 1,
"_source": {
"s_item_name": "Create",
"request_id": "35",
"confidence": "0.5"
}
},
{
"_index": "testrest",
"_type": "testrest",
"_id": "asdfds",
"_score": 1,
"_source": {
"s_item_name": "Update",
"request_id": "35",
"confidence": "0.3333"
}
},
I am trying to get results for a request_id of 35, matched on their confidence values.
For example, if the input is only 0. then both results should be displayed.
If the input is 0.5, only the first document should be returned, and if it is 0.3, only the second.
Here's what I tried:
{
"query": {
"bool": {
"must": [
{ "match": { "confidence_score": "0.33" }}
],
"filter": {
"term": {
"request_id": "35"
}
}
}
}
}
This gives 0 results, since it requires exact values only, like 0.5 or 0.3333.
I thought match would work for this instead of term.
How do I make the query behave like the LIKE operator in SQL?
For LIKE-style matching, I suggest you have a look at the wildcard, prefix or match_phrase query types in Elasticsearch. Alternatively, if you are using a recent version, you can write SQL statements using an ES plugin.
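Of the options mentioned, the prefix query is the closest to `LIKE '0.3%'`. The sketch below combines it with the request_id filter from the question. It assumes the confidence value is indexed as a string, as it appears in the sample documents; a numeric mapping would not support prefix matching this way. The field name is a parameter because the sample documents use `confidence` while the attempted query uses `confidence_score`, and the function name is hypothetical.

```python
import json


def build_prefix_query(prefix, request_id, field="confidence"):
    """Match documents whose `field` starts with `prefix`, limited to one request_id."""
    return {
        "query": {
            "bool": {
                # LIKE 'prefix%' equivalent on a string-typed field.
                "must": [{"prefix": {field: prefix}}],
                # Non-scoring restriction to a single request.
                "filter": {"term": {"request_id": request_id}},
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_prefix_query("0.3", "35"), indent=2))
```

With the sample data, the intent is that a prefix of 0. covers both documents, 0.5 only the first and 0.3 only the second; this has not been run against a cluster here.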

Resources