Different results with the same keywords in an Elasticsearch query

I have a question about how the search logic works. I have an Elasticsearch 5.4 instance and run a query_string query. The default operator is OR; no other settings are defined.
Now I search for
dog house
and get 10,500 results. Then I search for
house dog
and get only 6,200 results. That seems odd to me.
This is my query:
{
"query" : {
"bool" : {
"must" : [
{
"query_string" : {
"query" : "dog house~",
"default_operator" : "OR",
"fuzziness" : "AUTO"
}
},
{
"term" : {
"client" : {
"value" : "MyClient",
"boost" : 1
}
}
},
{
"range" : {
"dateCreate" : {
"gte" : "2000-01-01T00:00:00+0200",
"lte" : "2000-12-31T23:59:59+0200"
}
}
}
]
}
},
"size" : 2,
"from" : 0,
"sort" : [
{
"_score" : {
"order" : "desc"
}
}
],
"collapse" : {
"field" : "title.keyword"
}
}

It is because of the ~ operator, which applies fuzzy matching only to the term it is attached to. If the ~ always ends up on the last word, as in the dog house~ you posted, then the reversed search fuzzes dog instead of house, and the two searches match different sets of documents. If you remove the ~ and run the query again, the number of results should be the same for dog house and house dog.
The number of results should also be the same for dog house~ and house~ dog, because in both cases the ~ is on house; it changes only when you move the ~ from house to dog.
If you want fuzzy matching on both words, put a ~ on each of them, for example house~ dog~. The results will then be the same regardless of the word order.
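To make the point about which word the ~ binds to concrete, here is a small Python sketch. It is not how Elasticsearch parses query_string syntax; it assumes a plain whitespace-separated query with no phrases, boolean operators, or explicit distances such as house~1, and the helper names fuzzy_terms and fuzz_all are made up for this illustration.

```python
def fuzzy_terms(query):
    """Return the terms that carry a trailing ~ (simplified: whitespace-split only)."""
    return {term.rstrip("~") for term in query.split() if term.endswith("~")}


def fuzz_all(query):
    """Append ~ to every term that does not already end with one."""
    return " ".join(t if t.endswith("~") else t + "~" for t in query.split())


print(fuzzy_terms("dog house~"))  # {'house'}
print(fuzzy_terms("house dog~"))  # {'dog'}
print(fuzz_all("dog house"))      # dog~ house~
```

With fuzz_all applied before the query is sent, both word orders end up with the same set of fuzzy terms, which is why the result counts no longer depend on the order.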

Related

Applying increasingly slow query filters depending on the number of matches

Is there a way to build an ES query so that it skips the slower parts, such as wildcard searches or additional fields, if the number of results from the earlier conditions already reaches the requested query size?
I assume this would mean ignoring the totalHits value.
I have tried playing with the boost settings but, as expected, ES still evaluates all the clauses.
{
"size" : 5,
"query": {
"bool": {
"should" : [
{ "term" : { "search.autocomplete" : { "value" : "120", "boost" : 20 } }},
{ "term" : { "search.autocomplete_inverse" : { "value" : "120", "boost" : 15 } }},
{ "match" : { "search.keyword" : { "query" : "120", "boost" : 10 } }},
{ "wildcard" : { "brand.search" : { "value" : "*120*", "boost": 5}}},
{ "wildcard" : { "category.search" : { "value" : "*120*", "boost": 0}}}
]
}
}
}
I am looking for a way so that, if the first condition matches 5 or more docs, ES doesn't spend more time trying to find further matches.
A different approach would be to execute multiple queries in my application until I reach the desired amount of results, but it doesn't feel right...
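For reference, the application-side approach mentioned above could look roughly like the following Python sketch. The run_query callable is hypothetical: it stands in for whatever client call executes a single query and returns a list of hits with an _id. The fake implementation exists only to make the example runnable.

```python
def staged_search(stages, run_query, size):
    """Run stages from cheapest to slowest and stop once `size` hits are collected."""
    hits, seen = [], set()
    for query in stages:
        if len(hits) >= size:
            break  # enough results: skip the remaining, slower stages
        for hit in run_query(query, size):
            if hit["_id"] not in seen:  # a doc may match several stages
                seen.add(hit["_id"])
                hits.append(hit)
    return hits[:size]


# Stages in the same order as the should clauses above.
stages = [
    {"term": {"search.autocomplete": {"value": "120"}}},
    {"term": {"search.autocomplete_inverse": {"value": "120"}}},
    {"match": {"search.keyword": {"query": "120"}}},
    {"wildcard": {"brand.search": {"value": "*120*"}}},
    {"wildcard": {"category.search": {"value": "*120*"}}},
]

calls = []


def fake_run_query(query, size):
    """Hypothetical stand-in for a real search call; only the first stage matches."""
    calls.append(query)
    if "search.autocomplete" in query.get("term", {}):
        return [{"_id": str(i)} for i in range(size)]
    return []


results = staged_search(stages, fake_run_query, size=5)
print(len(results), len(calls))  # 5 1
```

Note that in this sketch the boosts are gone: results are ordered by stage rather than by a combined score, which may or may not be acceptable for an autocomplete use case.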

How to query on multiple fields in elasticsearch?

I have tried the multiple field query and it works fine. But I would like to know what other options are generally used to query multiple fields in Elasticsearch.
You can use structured queries with multiple terms to find exact values, much as you would in SQL:
https://www.elastic.co/guide/en/elasticsearch/guide/current/_finding_multiple_exact_values.html
"bool" : {
"must" : [
{ "term" : { "tags" : "search" } },
{ "term" : { "tag_count" : 1 } }
]
}
For example, consider the following SQL query:
SELECT product
FROM products
WHERE (price = 20 OR productID = "XHDK-A-1293-#fJ3")
AND (price != 30)
In these situations, you will need the bool filter. This is a compound filter that accepts other filters as arguments, combining them in various Boolean combinations.
The equivalent Query DSL would be:
GET /my_store/products/_search
{
"query" : {
"filtered" : {
"filter" : {
"bool" : {
"should" : [
{ "term" : {"price" : 20}},
{ "term" : {"productID" : "XHDK-A-1293-#fJ3"}}
],
"must_not" : {
"term" : {"price" : 30}
}
}
}
}
}
}
See the following link for documentation:
https://www.elastic.co/guide/en/elasticsearch/guide/current/combining-filters.html

Do query results impact elasticsearch phrase suggestions?

I'd like to know whether Elasticsearch uses query results to populate phrase suggestions for the direct generator.
Or does it simply pick tokens from the given index?
My queries are based on some permission sets.
For instance, this would be my query:
{
"size" : 0,
"query" : {
"filtered" : {
"query" : {
"match_all" : {}
},
"filter" : {
"bool" : {
"must" : [{
"terms" : {
"Permissions" : ["permission1", "permission2", "permission3"
]
}
}
]
}
}
}
},
"suggest" : {
"DidYouMean" : {
"text" : "{{SearchPhrase}}",
"phrase" : {
"field" : "_all",
"analyzer" : "simple",
"size" : 1,
"real_word_error_likelihood" : 0.96,
"max_errors" : 5,
"gram_size" : 3,
"direct_generator" : [{
"field" : "_all",
"suggest_mode" : "popular",
"min_word_length" : 3
}
]
}
}
}
}
How would I ensure that the direct generator creates suggestions without violating my permissions clause?
Is this even possible?
The term suggester and the phrase suggester generate their suggestions from the indexed tokens. The query does not affect the suggest results: the suggesters work directly on the inverted index and take the tokens from there. So their scope is global, never limited to the query.

Elastic Search NEST - How to have multiple levels of filters in search

I would like to have multiple levels of filters to derive a result set using the NEST API in Elasticsearch. Is it possible to query the results of another filter? If yes, can I do that at multiple levels?
My requirement is that a user can select and deselect options for various fields.
Example: There are 1000 documents in total in my index 'people'. There may be 3 ListBoxes: 1) City, 2) Favourite Food, 3) Favourite Colour. If the user selects a city, that narrows the set down to 600 documents. Out of those 600 documents I would like to filter by favourite food, which may leave some 300 documents. I would then like to filter further by favourite colour to retrieve 50 documents out of the previously derived 300.
You don't need to query within filters to achieve what you want. Just use filtered queries (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-filtered-query.html) and provide several filters. In your case I assume you would do something like this for your first query:
{
"filtered" : {
"query" : {
"match_all" : { }
},
"filter" : {
"and" : [
{
"term" : {
"city" : "some city"
}
}
]
}
}
}
You would then return the results from that and display them. You'd then let the user select the next filter and do the following:
{
"filtered" : {
"query" : {
"match_all" : { }
},
"filter" : {
"and" : [
{
"term" : {
"city" : "some city"
}
},
{
"term" : {
"food" : "some food"
}
}
]
}
}
}
You'd then rinse and repeat for the third filter parameter:
{
"filtered" : {
"query" : {
"match_all" : { }
},
"filter" : {
"and" : [
{
"term" : {
"city" : "some city"
}
},
{
"term" : {
"food" : "some food"
}
},
{
"term" : {
"colour" : "some colour"
}
}
]
}
}
}
I haven't tested this, but the principle is sound and will work.
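The three request bodies above differ only in how many term filters they carry, so they can be generated from whatever the user has currently selected. Below is a minimal Python sketch of that idea (it is not NEST code). The function name build_filtered_query and the shape of the selections dict are assumptions made for this example, and falling back to a plain match_all when nothing is selected is a choice made here, not something stated in the answer.

```python
import json


def build_filtered_query(selections):
    """Build the request body shown above from the currently selected options.

    `selections` maps a field name to the chosen value. Deselected fields are
    either absent or set to None, so their filter is simply left out.
    """
    term_filters = [
        {"term": {field: value}}
        for field, value in selections.items()
        if value is not None
    ]
    if not term_filters:
        # Nothing selected: assumed fallback that matches every document.
        return {"match_all": {}}
    return {
        "filtered": {
            "query": {"match_all": {}},
            "filter": {"and": term_filters},
        }
    }


first = build_filtered_query({"city": "some city"})
third = build_filtered_query(
    {"city": "some city", "food": "some food", "colour": "some colour"}
)
print(json.dumps(third, indent=2))
```

Every time the user changes a ListBox, the whole query is rebuilt and sent again; there is no need to keep the previous result set around.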

How do I have to write a Search Query in ElasticSearch?

I use the Grails ElasticSearch Plugin and want to use the following query:
"bool" : {
"must" : {
"term" : { "user" : "kimchy" }
},
"must_not" : {
"range" : {
"age" : { "from" : 10, "to" : 20 }
}
},
"should" : [
{
"term" : { "tag" : "wow" }
},
{
"term" : { "tag" : "elasticsearch" }
}
],
"minimum_should_match" : 1,
"boost" : 1.0
}
Using the groovy api from the Grails plugin I would write something like:
def res = userAgentIdentService.search() {
"bool" {
"must" {
term("user" : "kimchy" )
}
"must_not" {
"range" {
age("from" : 10, "to" : 20 }
}
}
"should" : [
{
term( "tag" : "wow" )
}
{
term("tag" : "elasticsearch" )
}
]
"minimum_should_match" = 1
"boost" = 1.0
}
}
My query is not working!
Where do I have to define minimum_should_match and how do I have to define it?
How do I have to write the "should" : [ ... ] square brackets notation in the grails / groovy manner?
I think you're missing a couple of JSON levels in your search request. I don't think you can use the query without specifying that it is a query (it could just as well be a filter, or something else). Have a look at this example from the Groovy API reference:
def search = node.client.search {
indices "test"
types "type1"
source {
query {
terms(test: ["value1", "value2"])
}
}
}
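To show the missing level in terms of the request body rather than the Groovy DSL, here is a small Python sketch that wraps the bool clauses from the question in a top-level query key. It only illustrates the JSON nesting, with minimum_should_match and boost kept inside the bool object as in the question's JSON. It does not show the Grails plugin syntax, and the helper name as_search_body is made up for this example.

```python
import json

# The bool clauses exactly as given in the question's JSON.
bool_clauses = {
    "must": {"term": {"user": "kimchy"}},
    "must_not": {"range": {"age": {"from": 10, "to": 20}}},
    "should": [
        {"term": {"tag": "wow"}},
        {"term": {"tag": "elasticsearch"}},
    ],
    "minimum_should_match": 1,
    "boost": 1.0,
}


def as_search_body(clauses):
    """Wrap bool clauses in the query level that the search request expects."""
    return {"query": {"bool": clauses}}


body = as_search_body(bool_clauses)
print(json.dumps(body, indent=2))
```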
