What is the difference between a match query and a bool must match query in Elasticsearch?

What is the difference between a plain match query and a bool must match query in ES?
First, using only the match query:
{
"query":{
"match":{
"address":"mill"
}
}
}
Second, using a compound query:
{
"query": {
"bool": {
"must": [
{ "match": { "address": "mill" } }
]
}
}
}
Can you tell me everything?
What is the difference between them?

When a bool query has only a single match inside its must clause, there is no difference between the two. The bool query becomes useful when you want to combine multiple (boolean) criteria; see the official ES documentation for more information. It supports the following clauses:
must
must_not
filter
should
Let me show by taking a small example from your question.
Index mapping with just address and first_name
{
"mappings": {
"properties": {
"address": {
"type": "text"
},
"first_name" :{
"type" : "text"
}
}
}
}
Index 3 docs, all with the same address mill but different first_name values
{
"address" : "mill",
"first_name" : "Johnson"
}
{
"address" : "mill",
"first_name" : "Parker"
}
{
"address" : "mill",
"first_name" : "opster"
}
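Search query to return all documents whose address matches mill but whose first_name must not match parker. Note that must_not is a sibling of must inside bool, not an entry in the must array:
{
  "query": {
    "bool": {
      "must": [
        { "match": { "address": "mill" } }
      ],
      "must_not": [
        { "match": { "first_name": "parker" } }
      ]
    }
  }
}
The result contains only 2 documents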
"hits": [
{
"_index": "so-60620921-bool",
"_type": "_doc",
"_id": "2",
"_score": 0.13353139,
"_source": {
"address": "mill",
"first_name": "opster"
}
},
{
"_index": "so-60620921-bool",
"_type": "_doc",
"_id": "3",
"_score": 0.13353139,
"_source": {
"address": "mill",
"first_name": "Johnson"
}
}
]
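To make the must / must_not semantics concrete without a running cluster, here is a small Python sketch that emulates them on the three sample documents. It is only an illustration, not how Elasticsearch evaluates queries internally: `tokens`, `matches` and `bool_query` are made-up helper names, and `tokens` is a crude stand-in for the standard analyzer.

```python
import re

# The three sample documents indexed above.
DOCS = [
    {"address": "mill", "first_name": "Johnson"},
    {"address": "mill", "first_name": "Parker"},
    {"address": "mill", "first_name": "opster"},
]


def tokens(text):
    """Crude stand-in for the standard analyzer: lowercase word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def matches(doc, field, value):
    """Emulate a match query: true if any query token occurs in the field."""
    return bool(tokens(value) & tokens(doc.get(field, "")))


def bool_query(docs, must=(), must_not=()):
    """Keep docs that satisfy every must clause and none of the must_not clauses."""
    return [
        d for d in docs
        if all(matches(d, f, v) for f, v in must)
        and not any(matches(d, f, v) for f, v in must_not)
    ]


hits = bool_query(
    DOCS,
    must=[("address", "mill")],
    must_not=[("first_name", "parker")],
)
print([d["first_name"] for d in hits])
```

With only the must clause, all three documents come back, which is the same result the plain match query gives; the must_not clause is what removes Parker.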
Based on the OP's comments: read about query context and filter context in the ES documentation to understand the performance aspects in detail.

As written in your question, they will perform the same action.
The match query is a very straightforward full-text condition.
The bool query allows you to combine multiple queries on multiple fields through its must (an AND equivalent), filter (like must, but without scoring), should (an OR equivalent) and must_not (a NOT equivalent) clauses. Inside those clauses you can use any query, for example exists, which checks that a certain field is present in the documents.
Taking your example again: since the bool query only has a single must clause with a match condition, it returns exactly the same documents as the plain match query, namely all documents whose address field contains the value mill.
Hope this is helpful! :)

Related

How to add fuzziness to search as you type field in Elasticsearch?

I've been trying to add some fuzziness to my search_as_you_type field on Elasticsearch, but never got the needed query. Does anyone have an idea how to implement this?
A fuzzy query returns documents that contain terms similar to the search term, as measured by Levenshtein edit distance.
The fuzziness parameter can be specified as:
AUTO -- generates an edit distance based on the length of the term.
For lengths:
0..2 -- must match exactly
3..5 -- one edit allowed
Greater than 5 -- two edits allowed
Adding a working example with index data and a search query.
Index Data:
{
"title":"product"
}
{
"title":"prodct"
}
Search Query:
{
"query": {
"fuzzy": {
"title": {
"value": "prodc",
"fuzziness":2,
"transpositions":true,
"boost": 5
}
}
}
}
Search Result:
"hits": [
{
"_index": "test",
"_type": "_doc",
"_id": "1",
"_score": 2.0794415,
"_source": {
"title": "product"
}
},
{
"_index": "test",
"_type": "_doc",
"_id": "2",
"_score": 2.0794415,
"_source": {
"title": "prodct"
}
}
]
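To see why both indexed titles fall within the configured fuzziness of 2, the edit distances can be checked by hand. The sketch below is a plain Levenshtein implementation (insertions, deletions and substitutions only). It does not count a transposition as a single edit the way the transpositions option does, which makes no difference for these particular terms; the function name is just for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or keep) the character
            ))
        prev = curr
    return prev[-1]


# Search term from the query versus the two indexed titles.
for indexed in ("product", "prodct"):
    print(indexed, levenshtein("prodc", indexed))
```

"product" is two edits away from "prodc" and "prodct" is one edit away, so both are returned with fuzziness set to 2.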
Refer to these blog posts for a detailed explanation of fuzzy queries:
https://www.elastic.co/blog/found-fuzzy-search
https://qbox.io/blog/elasticsearch-optimization-fuzziness-performance
Update 1:
Refer to the official ES documentation:
The fuzziness, prefix_length, max_expansions, rewrite, and
fuzzy_transpositions parameters are supported for the terms that are
used to construct term queries, but do not have an effect on the
prefix query constructed from the final term.
There is also an open issue and a discuss thread reporting that fuzziness does not work with a bool_prefix multi_match (search-as-you-type):
https://github.com/elastic/elasticsearch/issues/56229
https://discuss.elastic.co/t/fuzziness-not-work-with-bool-prefix-multi-match-search-as-you-type/229602/3
I know this question was asked long ago, but this worked for me.
Since Elasticsearch allows a single field to be indexed in several ways through multi-fields, my mapping looks like this:
PUT products
{
"mappings": {
"properties": {
"title": {
"type": "text",
"fields": {
"product_type": {
"type": "search_as_you_type"
}
}
}
}
}
}
After adding some data to the index, I queried it like this:
GET products/_search
{
"query": {
"bool": {
"should": [
{
"multi_match": {
"query": "prodc",
"type": "bool_prefix",
"fields": [
"title.product_type",
"title.product_type._2gram",
"title.product_type._3gram"
]
}
},
{
"multi_match": {
"query": "prodc",
"fuzziness": 2
}
}
]
}
}
}

Elasticsearch - pass fuzziness parameter in query_string

I have a fuzzy query with customized AUTO:10,20 fuzziness value.
{
"query": {
"match": {
"name": {
"query": "nike",
"fuzziness": "AUTO:10,20"
}
}
}
}
How to convert it to a query_string query? I tried nike~AUTO:10,20 but it is not working.
It's possible with query_string as well. Let me show it using the same example the OP provided: both the match query from the OP and the query_string query fetch the same document with the same score.
According to the ES docs, Elasticsearch supports the AUTO:10,20 format, which is shown in my example as well.
Index mapping
{
"mappings": {
"properties": {
"name": {
"type": "text"
}
}
}
}
Index a sample doc
{
"name" : "nike"
}
Search query using match with fuzziness
{
"query": {
"match": {
"name": {
"query": "nike",
"fuzziness": "AUTO:10,20"
}
}
}
}
And result
"hits": [
{
"_index": "so-query",
"_type": "_doc",
"_id": "1",
"_score": 0.9808292,
"_source": {
"name": "nike"
}
}
]
Query_string with fuzziness
{
"query": {
"query_string": {
"fields": ["name"],
"query": "nike",
"fuzziness": "AUTO:10,20"
}
}
}
And result
"hits": [
{
"_index": "so-query",
"_type": "_doc",
"_id": "1",
"_score": 0.9808292,
"_source": {
"name": "nike"
}
}
]
Lucene syntax only allows you to specify "fuzziness" with the tilde symbol "~", optionally followed by 0, 1 or 2 to indicate the edit distance.
Elasticsearch Query DSL supports a configurable special value for AUTO which then is used to build the proper Lucene query.
You would need to implement that logic on your application side, by evaluating the desired edit distance based on the length of your search term and then use <searchTerm>~<editDistance> in your query_string-query.
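A minimal sketch of that application-side logic in Python, assuming the AUTO:low,high rule works as documented (terms shorter than low must match exactly, terms shorter than high allow one edit, longer terms allow two). The helper names are made up, the input is split on whitespace rather than by the real analyzer, and reserved query_string characters are not escaped.

```python
def auto_edit_distance(term: str, low: int = 3, high: int = 6) -> int:
    """Edit distance that AUTO:low,high would pick for a term of this length."""
    if low < 0 or high < low:
        raise ValueError("expected 0 <= low <= high")
    length = len(term)
    if length < low:
        return 0  # short terms must match exactly
    if length < high:
        return 1  # medium terms allow one edit
    return 2      # long terms allow two edits


def fuzzy_query_string(text: str, field: str, low: int = 3, high: int = 6) -> dict:
    """Build a query_string request body using <term>~<editDistance> per term."""
    # Assumption: whitespace splitting is good enough for the input at hand.
    terms = [f"{t}~{auto_edit_distance(t, low, high)}" for t in text.split()]
    return {
        "query": {
            "query_string": {
                "fields": [field],
                "query": " ".join(terms),
            }
        }
    }


body = fuzzy_query_string("nike", "name", low=10, high=20)
print(body["query"]["query_string"]["query"])
```

For the term nike with AUTO:10,20 this produces nike~0, because a four-character term is below the lower threshold of 10 and therefore has to match exactly.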

Use Elasticsearch percolate with specific type of field name

I'm building a subscription system for notifications using the percolator field type of Elasticsearch 7.x. The problem is that I can't make a percolate query work with certain types of fields.
This is an example of the indexed data. As you can see, I have a query indexed to be able to perform a percolate query. The difference I would like to mention is the name of the field in the query which can be state or created_by.full_name.raw
{
"_index": "widgets_2020",
"_type": "widget",
"_score": 1.0,
"_source": {
"created_at": "2020-01-09T21:58:14.123Z",
"query": {
"bool": {
"must": [],
"filter": [
{
"terms": {
"created_by.full_name.raw": [
"Ivan Ledner"
]
}
}
]
}
}
}
},
{
"_index": "widgets_2020",
"_type": "widget",
"_score": 1.0,
"_source": {
"created_at": "2020-01-09T22:02:24.133Z",
"query": {
"bool": {
"must": [],
"filter": [
{
"terms": {
"state": [
"done"
]
}
}
]
}
}
}
}
When I do something like this, Elasticsearch returns the documents I expect.
widgets_2020/_search
{
"query" : {
"percolate" : {
"field" : "query",
"document" : {
"state": ["created"]
}
}
}
}
But when I search with this, it returns nothing.
widgets_2020/_search
{
"query" : {
"percolate" : {
"field" : "query",
"document" : {
"created_by.full_name.raw": ["Ivan Ledner"]
}
}
}
}
Is there a different way of dealing with these types of names? Thanks in advance!
The problem was that I had enabled the map_unmapped_fields_as_text option, which, as its name says, mapped all my unmapped fields as text. I solved this by mapping all the attributes manually, and the percolator started to work as expected.
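For readers who want to see what such a manual mapping could look like, here is a rough sketch that builds the request bodies as Python dictionaries. The field types are assumptions, since the question does not show the real mapping: state is guessed to be a keyword, and created_by.full_name is guessed to be a text field with a keyword sub-field named raw. The function names are hypothetical.

```python
import json


def widgets_mapping() -> dict:
    """Explicit mapping sketch; field types are assumed, not taken from the question."""
    return {
        "mappings": {
            "properties": {
                "query": {"type": "percolator"},  # stores the registered queries
                "created_at": {"type": "date"},
                "state": {"type": "keyword"},
                "created_by": {
                    "properties": {
                        "full_name": {
                            "type": "text",
                            # keyword sub-field addressed as created_by.full_name.raw
                            "fields": {"raw": {"type": "keyword"}},
                        }
                    }
                },
            }
        }
    }


def percolate_request(document: dict) -> dict:
    """Body for a percolate search against the 'query' field."""
    return {"query": {"percolate": {"field": "query", "document": document}}}


print(json.dumps(widgets_mapping(), indent=2))
print(json.dumps(percolate_request({"created_by": {"full_name": "Ivan Ledner"}}), indent=2))
```

With a mapping like this, the raw sub-field is filled from the value of full_name, so the percolated document in this sketch supplies full_name as part of the created_by object.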

Elasticsearch exact match of specific field(s)

I'm trying to filter my elasticsearch index by specific fields, the "country" field to be exact. However, I keep getting loads of other results (other countries) back that are not exact.
Please could someone point me in the right direction.
I've tried the following searches:
GET http://127.0.0.1:9200/decision/council/_search
{
"query": {
"filtered": {
"filter": {
"term": {
"country": "Algeria"
}
}
}
}
}
Here is an example document:
{
"_index": "decision",
"_id": "54290140ec882c6dac5ae9dd",
"_score": 1,
"_type": "council",
"_source": {
"document": "DEV DOCUMENT",
"id": "54290140ec882c6dac5ae9dd",
"date_updated": 1396448966,
"pdf_file": null,
"reported": true,
"date_submitted": 1375894031,
"doc_file": null,
"country": "Algeria"
}
}
You can use the match_phrase query instead
POST http://127.0.0.1:9200/decision/council/_search
{
"query" : {
"match_phrase" : { "country" : "Algeria"}
}
}
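A side note for newer clusters: the filtered query used in the question was deprecated in Elasticsearch 2.0 and removed in 5.0, where a bool query with a filter clause takes its place. The Python sketch below builds such a request body. It assumes a keyword sub-field called country.keyword exists, which is what default dynamic mapping creates for strings from 5.0 onwards but may not hold for this index; the function name is made up.

```python
import json


def exact_country_query(country: str, field: str = "country.keyword") -> dict:
    """Build a bool/filter term query; `field` must be a keyword (non-analyzed) field."""
    return {
        "query": {
            "bool": {
                # filter context: exact match, no scoring
                "filter": [{"term": {field: country}}]
            }
        }
    }


print(json.dumps(exact_country_query("Algeria"), indent=2))
```

Because a term query compares the value verbatim, it only behaves as an exact match when the target field is not analyzed, which is why the keyword sub-field is used here.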

Elastic Search in a complex document

I have the document shown below. I would like to search it, but I could not work out how. Please help. How can I search in complex documents like this in Elasticsearch?
My Document
{
"_index": "vehicles",
"_type": "car",
"_id": "e16bd474-fa8e-4858-ab6c-3bbb3d0aa603",
"_version": 1,
"found": true,
"_source": {
"Type": {
"Name": "Mustang"
}
}
}
My Search Query
GET _search
{
"query":{
"filtered": {
"filter": {
"term": {
"Name": "Mustang"
}
}
}
},
"from":0,
"size":10
}
The Standard Analyzer is being applied to your Name field, so the term Mustang is being stored in the index as mustang. Change your query to use "Name": "mustang" and you should get a match.
If you only want the doc with "Name" : "Mustang" you can use
"query" : {
"bool" : {
"must" : {
"term" : {
"Name" : "Mustang"
}
}
}
}
There are two issues:
You are using a term filter, which searches for the exact token Mustang in the index; however, the standard analyzer is being used, so the value is actually indexed as mustang.
You are searching in the wrong field. You should address the inner field with dotted notation, e.g. Type.Name
This query should work as expected:
{"query":{ "filtered": { "filter": {
"term": { "Type.Name": "mustang" }
}}}}
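Both issues can be reproduced locally with a small Python sketch. It only approximates what happens in the index: `analyze` is a rough stand-in for the standard analyzer, and `get_path` and `term_matches` are hypothetical helpers, not Elasticsearch APIs.

```python
import re

# The _source of the document from the question.
SOURCE = {"Type": {"Name": "Mustang"}}


def analyze(text):
    """Rough approximation of the standard analyzer: split into words, lowercase."""
    return [t.lower() for t in re.findall(r"\w+", text)]


def get_path(doc, path):
    """Resolve a dotted path such as 'Type.Name'; return None if it is absent."""
    current = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def term_matches(doc, path, term):
    """Emulate a term query: the term is compared verbatim with the indexed tokens."""
    value = get_path(doc, path)
    return isinstance(value, str) and term in analyze(value)


print(term_matches(SOURCE, "Name", "Mustang"))       # wrong field
print(term_matches(SOURCE, "Type.Name", "Mustang"))  # right field, wrong case
print(term_matches(SOURCE, "Type.Name", "mustang"))  # right field, lowercased term
```

Only the last call matches, which mirrors the corrected query: the full path Type.Name combined with the lowercased token mustang.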
