Elasticsearch - pass fuzziness parameter in query_string

I have a fuzzy query with a customized AUTO:10,20 fuzziness value:
{
"query": {
"match": {
"name": {
"query": "nike",
"fuzziness": "AUTO:10,20"
}
}
}
}
How to convert it to a query_string query? I tried nike~AUTO:10,20 but it is not working.

It's possible with query_string as well. Let me show it using the same example the OP provided: both the match query from the OP and the query_string query fetch the same document with the same score.
According to the ES docs, Elasticsearch supports the AUTO:10,20 format, which is shown in my example as well.
Index mapping
{
"mappings": {
"properties": {
"name": {
"type": "text"
}
}
}
}
Index some doc
{
"name" : "nike"
}
Search query using match with fuzziness
{
"query": {
"match": {
"name": {
"query": "nike",
"fuzziness": "AUTO:10,20"
}
}
}
}
And result
"hits": [
{
"_index": "so-query",
"_type": "_doc",
"_id": "1",
"_score": 0.9808292,
"_source": {
"name": "nike"
}
}
]
Query_string with fuzziness
{
"query": {
"query_string": {
"fields": ["name"],
"query": "nike",
"fuzziness": "AUTO:10,20"
}
}
}
And result
"hits": [
{
"_index": "so-query",
"_type": "_doc",
"_id": "1",
"_score": 0.9808292,
"_source": {
"name": "nike"
}
}
]

Lucene syntax only allows you to specify "fuzziness" with the tilde symbol "~", optionally followed by 0, 1 or 2 to indicate the edit distance.
Elasticsearch Query DSL supports a configurable special value for AUTO which then is used to build the proper Lucene query.
You would need to implement that logic on your application side, by evaluating the desired edit distance based on the length of your search term and then use <searchTerm>~<editDistance> in your query_string-query.
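The application-side logic described above could be sketched as follows. This is only a minimal illustration, not an Elasticsearch API: the function names are hypothetical, terms are assumed to be split on whitespace and to contain no query_string reserved characters, and the thresholds follow the AUTO:low,high rule (shorter than low means exact match, low up to high-1 allows one edit, high or longer allows two).

```python
def auto_edit_distance(term: str, low: int = 3, high: int = 6) -> int:
    """Mimic the AUTO:low,high rule: 0, 1 or 2 edits depending on term length."""
    length = len(term)
    if length < low:
        return 0
    if length < high:
        return 1
    return 2


def fuzzy_query_string(text: str, low: int = 3, high: int = 6) -> str:
    """Append ~<editDistance> to every whitespace-separated term that allows edits."""
    parts = []
    for term in text.split():
        distance = auto_edit_distance(term, low, high)
        # Terms that must match exactly are passed through without a tilde.
        parts.append(f"{term}~{distance}" if distance else term)
    return " ".join(parts)


# "nike" has 4 characters, which is below low=10, so no fuzziness is applied.
print(fuzzy_query_string("nike", low=10, high=20))
# A 13-character term falls between low and high, so one edit is allowed.
print(fuzzy_query_string("elasticsearch", low=10, high=20))
```

The resulting string would then be used as the "query" value of the query_string query.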

Related

Query string with default_operator as AND works differently in ES 2.4 and ES 6.8

The query_string seems to be returning the documents that match all the terms across fields specified in the fields parameter in ES 2.4 whereas in ES 6.8 documents that match all the terms per field are returned.
Steps to reproduce:
Insert the following documents into both ES 2.4 and ES 6.8 clusters:
PUT yields_test/il4/1
{
"security_name":"high term1",
"doc_type":"YIELDS"
}
PUT yields_test/il4/2
{
"security_name":"high term2",
"doc_type":"YIELDS"
}
PUT yields_test/il4/3
{
"security_name":"low term2",
"doc_type":"YIELDS"
}
PUT yields_test/il4/4
{
"security_name":"low term1",
"doc_type":"YIELDS"
}
PUT yields_test/il4/5
{
"security_name":"high yield",
"doc_type":"YIELDS"
}
PUT yields_test/il4/6
{
"security_name":"high yields",
"doc_type":"YIELDS"
}
PUT yields_test/il4/7
{
"security_name":"high term3",
"doc_type":"YIELD"
}
And try to search with the following query_string query in both of them:
GET yields_test/_search
{
"query": {
"query_string": {
"fields": [
"security_name",
"doc_type"
],
"query": "high yield",
"default_operator": "AND"
}
}
}
ES 2.4 returns the following documents:
[
{
"_index": "yields_test",
"_type": "il4",
"_id": "7",
"_score": 0.5098911,
"_source": {
"security_name": "high term3",
"doc_type": "YIELD"
}
},
{
"_index": "yields_test",
"_type": "il4",
"_id": "5",
"_score": 0.08322528,
"_source": {
"security_name": "high yield",
"doc_type": "YIELDS"
}
}
]
whereas ES 6.8 returns the following:
[
{
"_index": "yields_test",
"_type": "il4",
"_id": "5",
"_score": 0.5753642,
"_source": {
"security_name": "high yield",
"doc_type": "YIELDS"
}
}
]
I was able to find what changed between the versions by using the profile field in the request body.
The Lucene query which is created is different for both.
Lucene query generated by ES 2.4:
+(security_name:high | doc_type:high) +(security_name:yield | doc_type:yield)
By ES 6.8:
((+security_name:high +security_name:yield) | (+doc_type:high +doc_type:yield))
My other question about how to return the same documents in ES 6.8 still remains.
You could use a bool query:
{
"query" : {
"bool" : {
"must" : [
{ "query_string": {
"fields": [ "security_name", "doc_type" ],
"query": "high"
}},
{ "query_string": {
"fields": [ "security_name", "doc_type" ],
"query": "yield"
}}
]
}
}
}
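If the search text comes from user input, a bool query with one query_string clause per term can be generated rather than written by hand. The helper below is a hypothetical sketch: it assumes the terms are separated by whitespace and contain no query_string operators, and it only builds the request body without sending it anywhere.

```python
import json


def per_term_bool_query(text, fields):
    """Build a bool/must query with one query_string clause per term."""
    clauses = [
        {"query_string": {"fields": list(fields), "query": term}}
        for term in text.split()
    ]
    return {"query": {"bool": {"must": clauses}}}


body = per_term_bool_query("high yield", ["security_name", "doc_type"])
print(json.dumps(body, indent=2))
```

Each term may then match in any of the listed fields, while all terms are still required, which corresponds to the Lucene query ES 2.4 generated.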

Elastic Multimatch string with dash (or other symbol)

I am trying to match dashes (and other symbols) in my elastic query.
It is a fuzzy search on all the fields using the whitespace analyzer.
My query:
function_score: {
query: {
multi_match: {
query: string,
analyzer: "whitespace",
fuzziness: 1
}
}
}
However, this has unexpected results with dash characters. E.g. Central-Park doesn't work with this.
Dashes only work well when I use a phrase match and strip out the double quotes. But then there is no fuzziness.
Does anyone know how I can get the fuzzy search to work normally with dashes, please?
Adding a working example with index mapping, index data, search query, and search result
Index Mapping:
{
"mappings": {
"properties": {
"place": {
"type": "text",
"analyzer":"whitespace"
}
}
}
}
Index Data:
{
"place": "Cwntral-Park"
}
{
"place": "Central-Park"
}
{
"place": "Central-Area"
}
Search Query:
{
"query": {
"bool": {
"should": {
"match": {
"place": {
"query": "Central-Park",
"fuzziness": 1
}
}
}
}
}
}
Search Result:
"hits": [
{
"_index": "65605120",
"_type": "_doc",
"_id": "1",
"_score": 0.9808291,
"_source": {
"place": "Central-Park"
}
},
{
"_index": "65605120",
"_type": "_doc",
"_id": "3",
"_score": 0.8990934,
"_source": {
"place": "Cwntral-Park"
}
}
]
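To see why Cwntral-Park is returned while Central-Area is not, the edit distances can be checked by hand. The sketch below is an illustration only: it uses str.split() as a stand-in for the whitespace analyzer (which keeps Central-Park as a single token) and a plain Levenshtein distance, whereas Elasticsearch by default also counts a transposition of two adjacent characters as a single edit.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


query_token = "Central-Park".split()[0]  # whitespace splitting keeps the dash
for place in ["Cwntral-Park", "Central-Park", "Central-Area"]:
    for token in place.split():
        distance = levenshtein(query_token, token)
        verdict = "within fuzziness 1" if distance <= 1 else "too far"
        print(token, distance, verdict)
```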

Which field matched the search query?

I want to find out which field matched the search query.
This can be any query; I am not writing a specific query.
For example:
I search for the phrase dilo abinin, or any word, and find the document below:
{
"name":"dilo abinin",
"surname": "sürücü"
}
I want to get the name key.
You can use highlighting to see which field matched your query.
Index API
{
"name":"dilo abinin",
"surname": "sürücü"
}
Search Query:
{
"query": {
"query_string": {
"query": "dilo abinin"
}
},
"highlight": {
"fields": {
"*": {}
}
}
}
Search Result:
"hits": [
{
"_index": "65325154",
"_type": "_doc",
"_id": "1",
"_score": 0.5753642,
"_source": {
"name": "dilo abinin",
"surname": "sürücü"
},
"highlight": {
"name": [ // note this
"<em>dilo</em> <em>abinin</em>"
],
"name.keyword": [
"<em>dilo abinin</em>"
]
}
}
]
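On the client side, the matching field names are simply the keys of each hit's highlight section. A minimal sketch, assuming the hit has already been parsed into a Python dict shaped like the result above (the helper name is hypothetical):

```python
def matched_fields(hit: dict) -> list:
    """Return the names of the fields listed in a hit's highlight section."""
    return sorted(hit.get("highlight", {}))


hit = {
    "_id": "1",
    "_source": {"name": "dilo abinin", "surname": "sürücü"},
    "highlight": {
        "name": ["<em>dilo</em> <em>abinin</em>"],
        "name.keyword": ["<em>dilo abinin</em>"],
    },
}
print(matched_fields(hit))
```

A hit without a highlight section yields an empty list.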

How to add fuzziness to search as you type field in Elasticsearch?

I've been trying to add some fuzziness to my search_as_you_type field in Elasticsearch, but never got the query I needed. Does anyone have any idea how to implement this?
Fuzzy Query returns documents that contain terms similar to the search term, as measured by a Levenshtein edit distance.
The fuzziness parameter can be specified as:
AUTO -- It generates an edit distance based on the length of the term.
For lengths:
0..2 -- must match exactly
3..5 -- one edit allowed
Greater than 5 -- two edits allowed
Adding working example with index data and search query.
Index Data:
{
"title":"product"
}
{
"title":"prodct"
}
Search Query:
{
"query": {
"fuzzy": {
"title": {
"value": "prodc",
"fuzziness":2,
"transpositions":true,
"boost": 5
}
}
}
}
Search Result:
"hits": [
{
"_index": "test",
"_type": "_doc",
"_id": "1",
"_score": 2.0794415,
"_source": {
"title": "product"
}
},
{
"_index": "test",
"_type": "_doc",
"_id": "2",
"_score": 2.0794415,
"_source": {
"title": "produt"
}
}
]
Refer to these blogs for a detailed explanation of the fuzzy query:
https://www.elastic.co/blog/found-fuzzy-search
https://qbox.io/blog/elasticsearch-optimization-fuzziness-performance
Update 1:
Refer to this official ES documentation:
The fuzziness, prefix_length, max_expansions, rewrite, and
fuzzy_transpositions parameters are supported for the terms that are
used to construct term queries, but do not have an effect on the
prefix query constructed from the final term.
There are some open issues and discussion threads stating that fuzziness does not work with bool_prefix multi_match (search-as-you-type):
https://github.com/elastic/elasticsearch/issues/56229
https://discuss.elastic.co/t/fuzziness-not-work-with-bool-prefix-multi-match-search-as-you-type/229602/3
I know this question was asked long ago, but this worked for me.
Since Elasticsearch allows a single field to be declared with multiple data types, my mapping is like below.
PUT products
{
"mappings": {
"properties": {
"title": {
"type": "text",
"fields": {
"product_type": {
"type": "search_as_you_type"
}
}
}
}
}
}
After adding some data to the index, I queried it like this.
GET products/_search
{
"query": {
"bool": {
"should": [
{
"multi_match": {
"query": "prodc",
"type": "bool_prefix",
"fields": [
"title.product_type",
"title.product_type._2gram",
"title.product_type._3gram"
]
}
},
{
"multi_match": {
"query": "prodc",
"fuzziness": 2
}
}
]
}
}
}

How to change the order of search results on Elastic Search?

I am getting results from the following Elasticsearch query:
"query": {
"bool": {
"should": [
{"match_phrase_prefix": {"title": keyword}},
{"match_phrase_prefix": {"second_title": keyword}}
]
}
}
The result is good, but I want to change the order so that the results with a matching title come first.
Any help would be appreciated!!!
I was able to reproduce the issue with sample data. My solution uses a query-time boost, as index-time boost has been deprecated since ES 5.
Also, I created the sample data in such a manner that, without the boost, both documents would have the same score, so there is no guarantee that the one matching in title comes first in the search results. This should help you understand it better.
1. Index Mapping
{
"mappings": {
"properties": {
"title": {
"type": "text"
},
"second_title" :{
"type" :"text"
}
}
}
}
2. Index Sample docs
a)
{
"title": "opster",
"second_title" : "Dimitry"
}
b)
{
"title": "Dimitry",
"second_title" : "opster"
}
Search query
{
"query": {
"bool": {
"should": [
{
"match_phrase_prefix": {
"title": {
"query" : "dimitry",
"boost" : 2.0 <-- Notice the boost in `title` field
}
}
},
{
"match_phrase_prefix": {
"second_title": {
"query" : "dimitry"
}
}
}
]
}
}
}
Output
"hits": [
{
"_index": "60454337",
"_type": "_doc",
"_id": "1",
"_score": 1.3862944,
"_source": {
"title": "Dimitry", <-- Dimitry in title field has double score
"second_title": "opster"
}
},
{
"_index": "60454337",
"_type": "_doc",
"_id": "2",
"_score": 0.6931472,
"_source": {
"title": "opster",
"second_title": "Dimitry"
}
}
]
Let me know if you have any doubt understanding it.

Resources