ElasticSearch advanced query_string query - elasticsearch

What I need is, elastic should search in multiple fields and return data by field priority.
For example: For the search string obil hon, elastic should search in fields[title, description, modelCaption] and return data when at first it finds Mobile Phone in Title field, then in other fields.
Query I use:
{
"from": 0,
"query": {
"bool": {
"must": [
{
"query_string": {
"default_operator": "or",
"fields": [
"title^5",
"description",
"modelCaption",
"productFeatureValues.featureValue",
"productFeatureValues.featureCaption"
],
"query": "*obil* *hone*"
}
}
]
}
},
"size": 16
}
Any suggestions?
Thanks!

You can simply use the multi-match query to query multiple fields and it supports boosting a particular field like a title in your case and different operators like OR in your case.
Sample ES query for your use case:
{
"query": {
"multi_match" : {
"query" : "mobile phones",
"fields" : [ "title^5", "description","modelCaption","productFeatureValues.featureVal"],
"fuzziness" : "AUTO" --> Adding fuzziness to query
}
}
}
Here title filed is boosted by factor 5, hence if mobile phones match in title field then it would be scored higher.
Also please note, you are using wild-card in your query string which is very costly so it's better to avoid them if you can.
EDIT: Based on OP comments, included fuzziness parameter AUTO in query for better results

Related

ANDing search keywords for elastic Search

How can we configure elastic search so that it only returns results which matches all the words in the search query. The documents indexed have data having multiple fields and so the words of search query may match different fields of data but all the words must get matched in the result ?
you can query string query feature to search for results
sample search query
GET /_search
{
"query": {
"query_string": {
"query": "(content:this OR name:this) AND (content:that OR name:that)"
}
}
}
In this query content and name is the field name, this is the search criteria
you can build search query similar to that.
I think you're looking for a multi_match query together with and operator. This is the link to docs: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-multi-match-query.html and it seems that cross_fieldsis query type you're looking for. I'd read more on that page, but this is probably what you are looking for:
GET /_search
{
"query": {
"multi_match" : {
"query": "Will Smith",
"type": "cross_fields",
"fields": [ "first_name", "last_name" ],
"operator": "and"
}
}
}

How to boost individual documents

I have a pretty complex query and now I want to boost some documents that fulfill some criteria. I have the following simplified document structure and I try to give some documents a boost based on the id, genre, tag.
{
"id": 123,
"genres": ["ACTION", "DRAMA"],
"tags": ["For kids", "Romantic", "Nature"]
}
What I want to do is for example
id: 123 boost: 5
genres: ACTION boost: 3
tags: Romantic boost: 0.2
and boost all documents that are contained in my query and fit the criteria but I don't want to filter them out. So query clause boosting is not of any help I guess.
Edit: To make if easier to understand what I want to achieve (not sure if it is possible with elasticsearch, no is also a valid answer).
I want to search with a query and get a result set. In this set I want to boost some documents. But I don't want to enlarge the result set or filter it. The boost should be independent from the query.
For example I search for a specific tag and want to boost all documents with category 'ACTION' in the result set. But I don't want all documents with category 'ACTION' in the result set and also I don't want only documents with the specific tag AND category 'ACTION'.
I think you need to have Dynamic boosting during query time.
The first matches the id title with boost and second one matches the 'genders' ACTION.
{
"query": {
"bool": {
"should": [
{
"match": {
"title": {
"query": "id",
"boost": 5
}
}
},
{
"match": {
"content": "Action"
}
}
]
}
}
}
If you want to have multi_match match based on your query:
{
"multi_match" : {
"query": "some query terms here",
"fields": [ "id^5", "genders^3", "tags^0.2" ]
}
}
Note: the ^5 means boost for the title.
Edit:
Maybe you are asking for different types of multi_match queries (at least for ES 5.x) from the ES reference guide:
best_fields
(default) Finds documents which match any field, but uses
the _score from the best field. See best_fields.
most_fields
Finds documents which match any field and combines the _score from
each field. See most_fields.
cross_fields
Treats fields with the same analyzer as though they were one big
field. Looks for each word in any field. See cross_fields.
phrase
Runs a match_phrase query on each field and combines the _score from
each field. See phrase and phrase_prefix.
phrase_prefix
Runs a match_phrase_prefix query on each field and combines the _score
from each field. See phrase and phrase_prefix.
More at: ES 5.4 ElasticSearch reference
I found a solution and it was pretty simple. I use a boosting query. I now just nest the different boosting criteria with and my original query is now the base query.
https://www.elastic.co/guide/en/elasticsearch/reference/2.3/query-dsl-boosting-query.html
For example:
{
"query": {
"boosting": {
"positive": {
"boosting": {
"positive": {
"match": {
"director": "Spielberg"
}
},
"negative": {
"term": {
"genres": "DRAMA"
}
},
"negative_boost": 1.3
}
},
"negative": {
"term": {
"tags": "Romantic"
}
},
"negative_boost": 1.2
}
}
}

Elasticsearch: multi_match phrase_prefix query with multiple search terms

I have a database with entries like
title: This is my awesome title
abstract: A more detailed descriptions of what [...]
I would like to build an Elasticsearch query that matches the above document with, e.g.,
awe detai
In words: A multi_match phrase_prefix query with multiple search terms. (This is intended to be used as a search-as-you-type feature.)
I see how you can combine multi_match and phrase_prefix, but it's unclear to me how to do this for multiple search terms.
Any hints?
Well there is few ways to do that
POST stack/autocomplete/1
{
"title": "This is my awesome title",
"abstract": "A more detailed descriptions of what"
}
Then you can search using query string with star but problem here is that you need to append asterix to query
POST stack/autocomplete/_search
{
"query": {
"query_string": {
"fields": [
"title",
"abstract"
],
"query": "awe* detai*"
}
}
}
If you want to match on user query then you can use like that
POST stack/autocomplete/_search
{
"query": {
"multi_match": {
"fields": [
"title",
"abstract"
],
"query": "awesome tit",
"type": "phrase_prefix"
}
}
}
One more option to consider would be to use nGram with query string so you will not need to modify user query "awe* detai*"

Elasticsearch query with fields giving subset of results

I am new to Elasticsearch. This is the how my document look like :
_source :
{
"name": "this is my title",
"address" : "1300 S Belmont Road"
"ID= : 54000"
}
When i run this query :
Query 1 :
"query": {
"filtered": {
"query": {
"query_string": {
"query": "*Belmont*",
"fields": ["name^5", "address^4","ID^3"]
}
},
"filter": {...}
}
}
I get 51 results
Query 2:
But this one gives 123 results :
"query": {
"filtered": {
"query": {
"query_string": {
"query": "*Belmont*",
}
},
"filter": {...}
}
}
Why is it that the queries give different results even thogh I am Running the query on all the fields in Query 1
Mappings :
Address and Name are both string and "not_analyzed"
This is because the way _all field works. Your first query is looking for *Belmont* in specified fields with specific analyzer honored. It is internally converted to bool query and matched with each field individually.
Since address is not_analyzed, 1300 S Belmont Road will be stored as it is but _all field will have space delimited words with standard analyzer applied like 1300, s , belmont etc. From the Doc
The _all field is a special catch-all field which concatenates the
values of all of the other fields into one big string, using space as
a delimiter, which is then analyzed and indexed, but not stored.
so your second query operates on _all field and gives you more results.
Also your first query wont match "address" : "1300 S Belmont Road" as by default it will be lowercased while using wildcard so it will search for belmont and wont find it. You can change this behavior with lowercase_expanded_terms which is true by default. Try this
"query": {
"filtered": {
"query": {
"query_string": {
"query": "*Belmont*",
"fields": ["name^5", "address^4","ID^3"],
"lowercase_expanded_terms" : false
}
},
"filter": {...}
}
}
You might get more results depending on how you have stored names and address.
Hope this helps!

Elasticsearch case-insensitive query_string query with wildcards

In my ES mapping I have an 'uri' field which is currently set to not_analysed and I'm not allowed to change the mapping.I wanted to search for uri parts with a query_string query like this (this ES query is autogenerated, that is why it is a bit complicated but let's just focus on the query_string part)
{
"sort": [{"updated": {"order": "desc"}}],
"query": {
"bool": {
"must":[{
"query_string": {
"query":"*w3\\.org\\/2014\\/01\\/a*",
"lowercase_expanded_terms": true,
"default_field": "uri"
}
}],
"minimum_number_should_match": 1
}
}, "size": 50}
Now it is usually working, but I've the following url stored (fictional url): http://w3.org/2014/01/Abc.html and this query does not bring it back because of the A-a difference. Setting the expanded terms to false also not solves this. What should I do for this query to be case insensitive?
Thanks for the help in advance.
From the docs, it seems like you need a new analyzer that first transforms to lowercase and then can run the search. Have you tried that?
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/sorting-collations.html
As I read it, your pattern, lowercase_expanded_terms, only applies to expansions, not to regular words
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html
lowercase_expanded_terms
Whether terms of wildcard, prefix, fuzzy, and range queries are to be automatically lower-cased or not (since they are not analyzed). Default it true
Try to use match query instead of query string.
{
"sort": [
{
"updated": {
"order": "desc"
}
}
],
"query": {
"bool": {
"must": [
{
"match": {
"uri": "*w3\\.org\\/2014\\/01\\/a*"
}
}
]
}
},
"size": 50
}
Query string queries are not analyzed and but match queries are analyzed.

Resources