How to query like this in Elasticsearch

I have documents like this:
{
'body': '',
'date': '',
'agency_id': ''
}
I want to retrieve documents that satisfy these conditions:
body contains:
all of ['word1', 'word2 word3', 'word4']
or all of ['word5 word6', 'word7']
or all of ['word8 word9', 'word10']
and agency_id is in ['id1', 'id2', 'id3']
Could you please tell me how to write this query?

To achieve this, use a bool query with two must clauses: one for body and one for agency_id. For body, put your three conditions in should clauses and set minimum_should_match to 1. It should look something like this:
{
"size" : 10,
"query" : {
"function_score" : {
"query" : {
"bool" : {
"must" : [ {
"bool" : {
"should" : [ {
"query_string" : {
"query" : "word1 word2 word3 word4",
"fields" : [ "body" ],
"default_operator" : "and"
}
}, {
"query_string" : {
"query" : "word5 word6 word7",
"fields" : [ "body" ],
"default_operator" : "and"
}
}, {
"query_string" : {
"query" : "word8 word9 word10",
"fields" : [ "body" ],
"default_operator" : "and"
}
} ],
"minimum_should_match" : "1"
}
}, {
"terms" : { "agency_id" : [ "id1", "id2", "id3" ] }
} ]
}
}
}
}
}
Just make sure the body field is analyzed so that each word becomes its own token. If you don't need any special features, the standard analyzer is enough.
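
The same structure can also be generated programmatically. The following is a minimal Python sketch, not part of the original answer: build_query and its parameters are hypothetical names, and it assumes that multi-word entries such as 'word2 word3' are meant to match as phrases, so it wraps them in double quotes for query_string. It leaves out the function_score wrapper, which is not needed for matching.

```python
import json

def build_query(term_groups, agency_ids, size=10):
    """Build: (all of group 1 OR all of group 2 ...) AND agency_id in agency_ids."""
    def group_clause(terms):
        # Assumption: multi-word terms are phrases, so quote them for query_string.
        parts = ['"%s"' % t if " " in t else t for t in terms]
        return {
            "query_string": {
                "query": " ".join(parts),
                "fields": ["body"],
                "default_operator": "and",
            }
        }

    return {
        "size": size,
        "query": {
            "bool": {
                # Both entries of the must array have to match.
                "must": [
                    {
                        "bool": {
                            "should": [group_clause(g) for g in term_groups],
                            "minimum_should_match": 1,
                        }
                    },
                    {"terms": {"agency_id": list(agency_ids)}},
                ]
            }
        },
    }

query = build_query(
    [["word1", "word2 word3", "word4"], ["word5 word6", "word7"], ["word8 word9", "word10"]],
    ["id1", "id2", "id3"],
)
print(json.dumps(query, indent=2))
```

The terms clause does exact matching, so this sketch assumes agency_id is indexed as a keyword (not analyzed) field.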

Related

Elasticsearch one field match different values

I want to run a query in which one field has to match several different values. In SQL it would look like this:
select * from location where address like '%California%' and address like '%145%';
I tried a must array containing several phrase match conditions, but it doesn't work!
{
"from" : 0,
"size" : 10,
"query" : {
"bool" : {
"must" : {
"bool" : {
"must" : [ {
"match" : {
"address" : {
"query" : "California",
"type" : "phrase"
}
}
}, {
"match" : {
"address" : {
"query" : "145",
"type" : "phrase"
}
}
} ]
}
}
}
},
"sort" : [ {
"pageRankScore" : {
"order" : "desc",
"unmapped_type" : "double"
}
} ]
}
That's my query. It only matches '145' and never matches 'California'.
My question is: how do I do a fuzzy match against several values in one field?
Help me, thanks a lot!

Why does my Elasticsearch query retrieve all indexed documents?

I'm having trouble understanding the behavior of the following Elasticsearch (ES 6.4) query:
{
"query" : {
"bool" : {
"should" : [
{
"match" : {
"title" : {
"query" : "example",
"operator" : "AND",
"boost" : 2
}
}
},
{
"multi_match" : {
"type" : "best_fields",
"query" : "example",
"operator" : "AND",
"fields" : [
"author", "content", "tags"
],
"boost" : 1
}
}
],
"must" : [
{
"range" : {
"dateCreate" : {
"gte" : "2000-01-01T00:00:00+0200",
"lte" : "2019-02-12T23:59:59+0200"
}
}
},
{
"term" : {
"client" : {
"value" : "test",
"boost" : 1
}
}
}
]
}
},
"size" : 10,
"from" : 0,
"sort" : [
{
"_score" : {
"order" : "desc"
}
}
]
}
The query is executed successfully but retrieves about 400,000 documents which is the total count of my index. It means that all documents are in the result set. But why? Is this really the correct behavior of the multi_match query?
When I was still using the query_string query, I only got the actual matching documents. That's why I'm a bit surprised.
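You're missing minimum_should_match. Because the bool query also has must clauses, the should clauses are optional by default and only affect scoring, so every document that passes the must clauses is returned: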
"bool" : {
"minimum_should_match": 1, <--- add this
"should" : [
...
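
To make the rule concrete, here is a small Python sketch. The helper names are hypothetical. It encodes the documented default for a bool query in query context: if there are should clauses and no must or filter clauses, at least one should clause has to match; otherwise the default is 0 and the should clauses only influence the score.

```python
def default_minimum_should_match(bool_query):
    """Default number of should clauses that have to match (query context)."""
    has_required = bool(bool_query.get("must")) or bool(bool_query.get("filter"))
    has_should = bool(bool_query.get("should"))
    return 1 if has_should and not has_required else 0

def require_one_should(bool_query):
    """Return a copy of the bool query with minimum_should_match set to 1."""
    fixed = dict(bool_query)
    fixed["minimum_should_match"] = 1
    return fixed

# Shortened version of the bool query from the question.
original = {
    "should": [
        {"match": {"title": {"query": "example", "operator": "AND", "boost": 2}}},
        {"multi_match": {"query": "example", "operator": "AND",
                         "fields": ["author", "content", "tags"]}},
    ],
    "must": [
        {"term": {"client": {"value": "test"}}},
    ],
}

print(default_minimum_should_match(original))                 # 0: should is optional
print(require_one_should(original)["minimum_should_match"])   # 1: one should must match
```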

Can't create the correct filter for my elasticsearch query

I'm having trouble setting up the correct filter. My query looks like this:
{
"query" : {
"bool" : {
"must" : [
{
"query_string" : {
"query" : "example~",
"analyzer" : "standard",
"default_operator" : "OR",
"fuzziness" : "AUTO"
}
},
{
"term" : {
"client" : {
"value" : "MyClient",
"boost" : 1
}
}
},
{
"range" : {
"dateCreate" : {
"gte" : "2016-01-01T00:00:00+0200",
"lte" : "2016-12-31T23:59:59+0200"
}
}
},
{
"match" : {
"lang" : "php OR java"
}
}
]
}
},
"size" : 10,
"from" : 0,
"sort" : [
{
"_score" : {
"order" : "desc"
}
}
]
}
The "lang" field is of type text.
My expectation is to get all documents matching the given query string and then select only the documents that have "PHP" or "Java" in their lang field. The lang field only ever contains either "PHP" or "Java", never both, so I thought about using exact matching, but I can't get it to work.
The result is actually a list of two documents, but with total_count=2510.
One of my documents that doesn't match:
{
"id" : "d3295f18-a033-4934-941a-21a8bef901e8",
"client" : "MyClient",
"lang" : "PHP",
"author" : null,
"dateCreate" : "2016-03-31T00:00:00+0200",
"title" : "Sample document",
"content" : "This is a short text describing the deocument."
}
Yes, the client field is also of type text.
The client field must be of type keyword for the term query to work. Otherwise, change the client clause from term to match:
{
"match" : {
"client" : {
"query" : "MyClient",
"boost" : 1
}
}
}
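
Why the term query misses can be illustrated without a cluster. The Python sketch below is a rough stand-in for analysis, not Elasticsearch code: analyze is a hypothetical function that only approximates the standard analyzer by splitting the text into word tokens and lowercasing them. A text field stores the analyzed tokens, a term query looks up the value exactly as given, and a match query analyzes the query text first.

```python
import re

def analyze(text):
    """Crude approximation of the standard analyzer: word tokens, lowercased."""
    return [t.lower() for t in re.findall(r"\w+", text)]

def term_matches(indexed_tokens, value):
    # term: the value is not analyzed, it has to equal an indexed token exactly.
    return value in indexed_tokens

def match_matches(indexed_tokens, query_text):
    # match (default operator OR): analyze the query, any resulting token may match.
    return any(tok in indexed_tokens for tok in analyze(query_text))

indexed = analyze("MyClient")  # what a text field would store: ["myclient"]

print(term_matches(indexed, "MyClient"))   # False: the index only holds "myclient"
print(term_matches(indexed, "myclient"))   # True
print(match_matches(indexed, "MyClient"))  # True: the query text is lowercased too
```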

multiple match must fields not working in elastic search

The query below returns the record when I pass the values of an existing record, which is fine. But if I change the name from 'John' to 'John1', the record is still returned.
{
"query" : {
"bool" : {
"must" : [
{ "match" : {"employeeId" : "1234"}},
{ "match" : {"name" : "John"}}
]
}
}
}
I tried an alternative query as well, but it still returns the record. Which query is better in terms of performance? Both return results if I change the name from 'John' to 'John1'.
{
"filter": {
"bool" : {
"must" : {
"term" : {
"employeeId" : "1234"
}
}
}
},
"query": {
"match" : {
"name" : {
"query" : "John",
"type" : "phrase"
}
}
}
}
This is because you are using match. If you want an exact search, you need to use a term filter.
Note that we assume the name field is analyzed in the mapping, which is why the term value is lowercase:
{
"query" :{
"filtered" : {
"filter" : {
"bool" : {
"must" : [
{ "term" : {"employeeId" : "1234"}},
{ "term" : {"name" : "john"}}
]
}
}
}
}
}

Elasticsearch match_phrase doesn't perform the same as multi_match with type phrase?

I'm having some trouble turning a match_phrase query into a multi_match query for multiple fields. My original query:
{
"from" : 0,
"size" : 50,
"query" : {
"filtered" : {
"query" : {
"match_phrase" : {
"metadata.description" : "Search Terms"
}
},
"filter" : {
"bool" : {
"must" : [ {
"terms" : {
"collectionId" : [ "1", "2" ]
}
} ]
}
}
}
}
}
This returns results correctly, but when I rewrite the match_phrase piece as a multi_match to run against multiple fields:
{
"from" : 0,
"size" : 50,
"query" : {
"filtered" : {
"query" : {
"multi_match" : {
"query" : "Search Terms",
"fields" : [ "metadata.description", "metadata.title" ],
"type" : "phrase"
}
},
"filter" : {
"bool" : {
"must" : [ {
"terms" : {
"collectionId" : [ "1", "2" ]
}
} ]
}
}
}
}
}
I am not getting any results. Is there anything obvious I am doing wrong here?
EDIT:
It must be something to do with the filter, as
{
"from" : 0,
"size" : 50,
"query" : {
"match_phrase" : {
"metadata.description" : "Search Terms"
}
}
}
and
{
"from" : 0,
"size" : 50,
"query" : {
"multi_match" : {
"query" : "Search Terms",
"fields" : [ "metadata.description", "metadata.title" ],
"type" : "phrase"
}
}
}
both perform as expected.
I am not sure exactly why, but dropping the filtered query and applying the filter at the top level
{
"from" : 0,
"size" : 50,
"query" : {
"multi_match" : {
"query" : "Search Terms",
"fields" : [ "metadata.description", "metadata.title" ],
"type" : "phrase"
}
},
"filter" : {
"bool" : {
"must" : [ {
"terms" : {
"collectionId" : [ "1", "2" ]
}
} ]
}
}
}
resolves the problem.
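
On newer versions this request needs a different shape: the top-level filter was renamed to post_filter in Elasticsearch 1.0, and the filtered query was removed in 5.0. A common replacement is a bool query that carries the filter in its filter clause. The following Python sketch builds such a request body; phrase_search is a hypothetical helper name, and the field names are the ones from the question.

```python
import json

def phrase_search(text, fields, collection_ids, size=50):
    """Phrase multi_match over several fields, restricted by a non-scoring filter."""
    return {
        "from": 0,
        "size": size,
        "query": {
            "bool": {
                # Scored part: the phrase has to occur in at least one field.
                "must": [
                    {"multi_match": {"query": text, "fields": list(fields), "type": "phrase"}}
                ],
                # Non-scored part: restrict hits to the given collections.
                "filter": [
                    {"terms": {"collectionId": list(collection_ids)}}
                ],
            }
        },
    }

body = phrase_search(
    "Search Terms",
    ["metadata.description", "metadata.title"],
    ["1", "2"],
)
print(json.dumps(body, indent=2))
```

Unlike a top-level post_filter, a filter inside the bool query also restricts any aggregations in the request, which is usually what is wanted when the filter is part of the search itself.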
