Handling Optional field search in Elasticsearch - elasticsearch

I'm using ES 5.5 and sending a query DSL request through the JavaScript API like this:
client.search({
index: 'demo',
type: 'sample',
body: {
"query": {
"bool": {
"must": [
{
"match": {
"CityName": {
query: req.params.city,
slop: 100
}
}
},
{
"match": {
"StateName": {
query: req.params.state,
slop: 100
}
}
},
{
"match": {
"Code": {
query: req.params.code,
slop: 100
}
}
}
]
}
}
}
})
This query works fine when the user gives all three values. But in my case these three parameters are not mandatory. The user can give one value or more than one, and the given fields must match the documents. Searching with only one or two values doesn't return anything.

You need to replace the must with should, so that a document is returned when at least one of the clauses matches. Refer to the bool query documentation for more details.
client.search({
index: 'demo',
type: 'sample',
body: {
"query": {
"bool": {
"should": [ // was "must"
{
"match": {
"CityName": {
query: req.params.city,
slop: 100
}
}
},
{
"match": {
"StateName": {
query: req.params.state,
slop: 100
}
}
},
{
"match": {
"Code": {
query: req.params.code,
slop: 100
}
}
}
]
}
}
}
})
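If the supplied values must all match (rather than any of them), another option is to keep must and only add a clause for each parameter that was actually sent. A minimal sketch of that idea follows; buildOptionalQuery is a hypothetical helper name, the 'Austin' value is made up, and the slop option from the question is left out for brevity.

```javascript
// Hypothetical helper (not part of the Elasticsearch client): builds a bool
// query containing one match clause per parameter that is actually present.
function buildOptionalQuery(params) {
  // Request parameter name -> document field, as used in the question.
  const fieldByParam = { city: 'CityName', state: 'StateName', code: 'Code' };

  const must = Object.keys(fieldByParam)
    .filter((name) => params[name] != null && params[name] !== '')
    .map((name) => ({ match: { [fieldByParam[name]]: { query: params[name] } } }));

  // With no parameters at all, fall back to matching every document.
  return must.length > 0 ? { bool: { must } } : { match_all: {} };
}

// Example with only the city supplied (made-up value).
const exampleBody = { query: buildOptionalQuery({ city: 'Austin' }) };
console.log(JSON.stringify(exampleBody, null, 2));
```

The resulting object would be passed as body to client.search in place of the hard-coded query.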

Related

Elastic search multiple AND & OR operator in a query

I have a following mapping applied to my index :
PUT /testing
PUT /testing/_mapping?pretty
{
"properties": {
"empID": {
"type":"long"
},
"state":{
"type":"text"
},
"Balance":{
"type":"long"
},
"loanid":{
"type":"long"
},
"rating":{
"type":"text"
},
"category":{
"type":"text"
}
}
}
Sample documents added in the index
POST testing/_doc?pretty
{
"empID":1,
"state":"NY",
"Balance":55,
"loanid":89,
"rating":"A",
"category":"PRO"
}
POST /testing/_doc?pretty
{
"empID":1,
"state":"TX",
"Balance":56,
"loanid":65,
"rating":"B",
"category":"TRIAL"
}
POST /testing/_doc?pretty
{
"empID":2,
"state":"TX",
"Balance":34,
"loanid":76,
"rating":"C",
"category":"PAID"
}
POST /testing/_doc?pretty
{
"empID":3,
"state":"TX",
"Balance":72,
"loanid":23,
"rating":"D",
"category":"FREE"
}
POST /testing/_doc?pretty
{
"dealID":3,
"state":"NY",
"Balance":23,
"loanid":67,
"rating":"E",
"category":"FREE"
}
POST /testing/_doc?pretty
{
"empID":2,
"state":"NY",
"Balance":23,
"loanid":98,
"rating":"F",
"category":"PRE"
}
POST /testing/_doc?pretty
{
"empID":2,
"state":"TX",
"Balance":19,
"loanid":100,
"rating":"D",
"category":"PAID"
}
I am trying to create an ES query that is equivalent to this SQL query:
select * from table_name
where empID =1 or state = 'NY'
and balance >=20 or loanid in (23, 67, 89) or rating = 'D'
and category!='FREE' or empID = 2 ;
versus the ES query:
GET testing/_search?pretty
{
"query": {
"bool": {
"should": [
{
"match": {
"state": {
"query": "NY"
}
}
},
{
"term": {
"empID": 1
}
},
{
"bool": {
"must": [
{
"range": {
"Balance": {
"gte": 20
}
}
},
{
"bool": {
"should": [
{
"terms": {
"loanid": [
23,
67,
89
]
}
},
{
"match": {
"rating": {
"query": "D"
}
}
},
{
"bool": {
"must_not": [
{
"match": {
"category": {
"query": "FREE"
}
}
}
]
}
}
]
}
}
]
}
}
]
}
}
}
I am only getting 6 documents back, whereas the SQL query gives 7 documents back. Could you confirm whether this is how multiple AND and OR conditions work in ES, and help me resolve the issue?
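One way to see where the seventh document goes is to check both groupings against the sample documents in plain JavaScript. This is only a sketch: it assumes standard SQL precedence (AND binds tighter than OR), uses simple equality as a stand-in for match/term, and the regrouped query at the end is an untested suggestion, not a confirmed fix.

```javascript
// Sample documents copied from the question.
const docs = [
  { empID: 1, state: 'NY', Balance: 55, loanid: 89, rating: 'A', category: 'PRO' },
  { empID: 1, state: 'TX', Balance: 56, loanid: 65, rating: 'B', category: 'TRIAL' },
  { empID: 2, state: 'TX', Balance: 34, loanid: 76, rating: 'C', category: 'PAID' },
  { empID: 3, state: 'TX', Balance: 72, loanid: 23, rating: 'D', category: 'FREE' },
  { dealID: 3, state: 'NY', Balance: 23, loanid: 67, rating: 'E', category: 'FREE' },
  { empID: 2, state: 'NY', Balance: 23, loanid: 98, rating: 'F', category: 'PRE' },
  { empID: 2, state: 'TX', Balance: 19, loanid: 100, rating: 'D', category: 'PAID' },
];
const loanIds = [23, 67, 89];

// The SQL WHERE clause with AND grouped before OR.
const matchesSql = (d) =>
  d.empID === 1 ||
  (d.state === 'NY' && d.Balance >= 20) ||
  loanIds.includes(d.loanid) ||
  (d.rating === 'D' && d.category !== 'FREE') ||
  d.empID === 2;

// The grouping expressed by the posted ES query (note: no empID = 2 clause).
const matchesPostedQuery = (d) =>
  d.state === 'NY' ||
  d.empID === 1 ||
  (d.Balance >= 20 &&
    (loanIds.includes(d.loanid) || d.rating === 'D' || d.category !== 'FREE'));

console.log(docs.filter(matchesSql).length);         // 7
console.log(docs.filter(matchesPostedQuery).length); // 6

// A bool query regrouped to mirror the SQL precedence (sketch only).
const regrouped = {
  bool: {
    should: [
      { term: { empID: 1 } },
      { bool: { must: [{ match: { state: 'NY' } }, { range: { Balance: { gte: 20 } } }] } },
      { terms: { loanid: loanIds } },
      { bool: { must: [{ match: { rating: 'D' } }], must_not: [{ match: { category: 'FREE' } }] } },
      { term: { empID: 2 } },
    ],
  },
};
```

The document that drops out is the last one (empID 2, TX, Balance 19): the posted query has no clause for empID = 2 and puts Balance >= 20 in front of the whole remaining group.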

ElasticSearch 6.8 doesn't order by exact matches first

I've been researching this issue for some days and couldn't make it work. I followed steps like this and this, but with no success.
So basically, I have the following data on ElasticSearch:
{ title: "Black Dust" },
{ title: "Dust In The Wind" },
{ title: "Gold Dust Woman" },
{ title: "Another One Bites The Dust" }
The problem is that I want to search for the word "Dust" and have the results ordered like this:
{ title: "Dust In The Wind" },
{ title: "Black Dust" },
{ title: "Gold Dust Woman" },
{ title: "Another One Bites The Dust" }
where titles that start with "Dust" must appear at the top of the results.
Posting the mappings and query is probably better than explaining the issue further.
settings: {
analysis: {
normalizer: {
lowercase: {
type: 'custom',
filter: ['lowercase']
}
}
}
},
mappings: {
_doc: {
properties: {
title: {
type: 'text',
analyzer: 'standard',
fields: {
raw: {
type: 'keyword',
normalizer: 'lowercase'
},
fuzzy: {
type: 'text',
},
},
}
}
}
}
and my query is:
"query": {
"bool": {
"must": {
"query_string": {
"fields": [
"title"
],
"default_operator": "AND",
"query": "dust"
}
},
"should": {
"prefix": {
"title.raw": "dust"
}
}
}
}
Can anyone please help me in this?
Thank you!
SOLUTION!
I figured it out and solved it by performing the following query:
"query": {
"bool": {
"must": {
"bool": {
"should": [
{
"prefix": {
"title.raw": {
"value": "dust",
"boost": 1000000
}
}
},
{
"match": {
"title": {
"query": "dust",
"boost": 50000
}
}
},
{
"match": {
"title": {
"query": "dust",
"boost": 10,
"fuzziness": 1
}
}
}
]
}
}
}
}
However, while writing tests, I found a little issue.
So, I'm generating a random uuid and adding to database the following:
{ title: `${uuid} A` }
{ title: `${uuid} W` }
{ title: `${uuid} Z` }
{ title: `A ${uuid}` }
{ title: `z ${uuid}` }
{ title: `Z ${uuid}` }
When I perform the query above looking for the uuid, I get:
uuid Z
uuid A
uuid W
Z uuid
I achieved my first goal, which was having the uuid in first position, but why is Z before A (first and second result)?
When everything else fails you can use a trivial substring position sort like so:
{
"query": {
"bool": {
"must": {
...
},
"should": {
...
}
}
},
"sort": [
{
"_script": {
"script": "return doc['title.raw'].value.indexOf('dust')",
"type": "number",
"order": "asc" <--
}
}
]
}
I've set the order to asc because the lower the substring index, the higher the 'score'.
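To see what ordering a position-based sort produces, here is a rough JavaScript emulation of the script's sort key, run on the titles from the question. It is a sketch only: sortKey is a made-up name, 'No Match Here' is a made-up title added to show the not-found case, and lowercasing stands in for the lowercase normalizer on title.raw.

```javascript
// Titles from the question, plus one made-up title that does not contain "dust".
const titles = [
  'Black Dust',
  'Dust In The Wind',
  'Gold Dust Woman',
  'Another One Bites The Dust',
  'No Match Here',
];

// Emulates the sort key: position of the substring in the lowercased title.
// A missing substring (-1) is pushed to the end instead of sorting first.
function sortKey(title, needle) {
  const pos = title.toLowerCase().indexOf(needle);
  return pos === -1 ? Number.MAX_SAFE_INTEGER : pos;
}

const sorted = [...titles].sort((a, b) => sortKey(a, 'dust') - sortKey(b, 'dust'));
console.log(sorted);
// [ 'Dust In The Wind', 'Gold Dust Woman', 'Black Dust',
//   'Another One Bites The Dust', 'No Match Here' ]
```

Note that "Gold Dust Woman" (position 5) lands before "Black Dust" (position 6), because only the character offset counts here.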
EDIT
We have to account for index == -1 (substring not found), so replace the script above with:
"script": "def pos = doc['title.raw'].value.indexOf('dust'); return pos == -1 ? Integer.MAX_VALUE : pos"

Full-text search through complex structure Elasticsearch

I have the following issue in case of a full-text search in Elasticsearch. I would like to search for all indexed attributes. However, one of my Project attributes is a very complex array of hashes/objects:
[
{
"title": "Group 1 title",
"name": "Group 1 name",
"id": "group_1_id",
"items": [
{
"pos": "1",
"title": "Position 1 title"
},
{
"pos": "1.1",
"title": "Position 1.1 title",
"description": "<p>description</p>",
"extra_description": {
"rotation": "2 years",
"amount": "1.947m²"
},
"inputs": {
"unit_price": true,
"total_net": true
},
"additional_inputs": [
{
"name": "additonal_input_name",
"label": "Additional input label:",
"placeholder": "Additional input placeholder",
"description": "Additional input description",
"type": "text"
}
]
}
]
}
]
My mappings look like this:
{:title=>{:type=>"text", :analyzer=>"english"},
:description=>{:type=>"text", :analyzer=>"english"},
:location=>{:type=>"keyword"},
:company=>{:type=>"keyword"},
:created_at=>{:type=>"date"},
:due_date=>{:type=>"date"},
:specification=>
{:type=>:nested,
:properties=>
{:id=>{:type=>"keyword"},
:title=>{:type=>"text"},
:items=>
{:type=>:nested,
:properties=>
{:pos=>{:type=>"keyword"},
:title=>{:type=>"text"},
:description=>{:type=>"text", :analyzer=>"english"},
:extra_description=>{:type=>:nested, :properties=>{:rotation=>{:type=>"keyword"}, :amount=>{:type=>"keyword"}}},
:additional_inputs=>
{:type=>:nested,
:properties=>
{:label=>{:type=>"keyword"},
:placeholder=>{:type=>"text"},
:description=>{:type=>"text"},
:type=>{:type=>"keyword"},
:name=>{:type=>"keyword"}
}
}
}
}
}
}
}
The question is how to search through it properly. For non-nested attributes it works like a charm, but when, for instance, I search by title in the specification, no result is returned. I tried both:
query:
{ nested:
{
multi_match: {
query: keyword,
fields: ['title', 'description', 'company', 'location', 'specification']
}
}
}
Or
{
nested: {
path: 'specification',
query: {
multi_match: {
query: keyword
}
}
}
}
Without any result.
Edit:
It's with elasticsearch-ruby for Ruby.
I am trying to query by: MODEL_NAME.all.search(query: with_specification("Group 1 title")) where with_specification is:
def with_specification(keyword)
{
bool: {
should: [
{
nested: {
path: 'specification',
query: {
bool: {
should: [
{
match: {
'specification.title': keyword,
}
},
{
multi_match: {
query: keyword,
fields: [
'specification.title',
'specification.id'
]
}
},
{
nested: {
path: 'specification.items',
query: {
match: {
'specification.items.title': keyword,
}
}
}
}
]
}
}
}
}
]
}
}
end
Queries on multi-level nested documents must follow a certain schema.
You cannot multi-match on nested and non-nested fields at the same time, and you cannot query nested fields under different paths in one go.
You can wrap your queries in a bool-should, but keep the two rules above in mind:
GET your_index/_search
{
"query": {
"bool": {
"should": [
{
"nested": {
"path": "specification",
"query": {
"bool": {
"should": [
{
"match": {
"specification.title": "TEXT" <-- standalone match
}
},
{
"multi_match": { <-- multi-match but 1st level path
"query": "TEXT",
"fields": [
"specification.title",
"specification.id"
]
}
},
{
"nested": {
"path": "specification.items", <-- 2nd level path
"query": {
"match": {
"specification.items.title": "TEXT"
}
}
}
}
]
}
}
}
}
]
}
}
}
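If many such clauses are needed, the per-path wrapping can be generated instead of written by hand. The following is a rough sketch that builds the same shape as the query above as a plain object; nestedMatch and specificationQuery are hypothetical helper names, not part of any client library.

```javascript
// Hypothetical helper: one nested clause per path, matching a single field
// that lives directly under that path.
function nestedMatch(path, field, text) {
  return { nested: { path, query: { match: { [`${path}.${field}`]: text } } } };
}

// Builds a query of the same shape as the one above: first-level fields are
// queried inside the 'specification' path, and second-level fields get their
// own nested clause with the full 'specification.items' path.
function specificationQuery(text) {
  return {
    bool: {
      should: [
        {
          nested: {
            path: 'specification',
            query: {
              bool: {
                should: [
                  {
                    multi_match: {
                      query: text,
                      fields: ['specification.title', 'specification.id'],
                    },
                  },
                  nestedMatch('specification.items', 'title', text),
                ],
              },
            },
          },
        },
      ],
    },
  };
}

console.log(JSON.stringify(specificationQuery('Group 1 title'), null, 2));
```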

ElasticSearch: Complex filter by nested document

I have following document structure:
{
product_name: "Product1",
product_id: 1,
...,
articles: [
{
article_name: 'Article 101',
id: 101,
some_param: 10,
clients: []
},
{
article_name: 'Article 102',
id: 102,
some_param: 11,
clients: [
{
client_id: 10001,
client_name: "some client 1001"
}
...
]
}
]
},
{
product_name: "Product2",
product_id: 2,
...,
articles: [
{
article_name: 'Article 101',
id: 101,
some_param: 10,
clients: []
},
{
article_name: 'Article 102',
id: 102,
some_param: 10,
clients: [
{
client_id: 10001,
client_name: "some client 1001"
}
...
]
}
]
}
I need to get documents (product) ONLY if some of its articles match 2 conditions (single article should match both conditions): articles.some_param = 10 AND articles.clients.client_id = 10001
So I need to get only product with id 2.
I'm using this query now, which is incorrect (and I know why), because it fetches both documents:
{
"query": {
"bool": {
"filter": [
{
"term": {
"articles.clients.id": 10001
}
},
{
"terms": {
"articles.some_param": 10
}
}
]
}
}
}
How can I write a query that returns only products that have at least one article matching both conditions: articles.some_param = 10 AND articles.clients.client_id = 10001
e.g., to get Product with ID 2 only?
Something like this:
{
"query": {
"nested": {
"path": "articles",
"query": {
"bool": {
"must": [
{
"term": {
"articles.some_param": {
"value": 10
}
}
},
{
"nested": {
"path": "articles.clients",
"query": {
"term": {
"articles.clients.client_id":{
"value": 10001
}
}
}
}
}
]
}
}
}
}
}
UPDATE:
Try wrapping the second query in a bool.
{
"query": {
"nested": {
"path": "articles",
"query": {
"bool": {
"must": [
{
"term": {
"articles.some_param": {
"value": 10
}
}
},
{
"bool":{
"must" : [
{
"nested": {
"path": "articles.clients",
"query": {
"term": {
"articles.clients.client_id":{
"value": 10001
}
}
}
}
}
]
}
}
]
}
}
}
}
}
p.s. I could be mistaken about the path in the second nested query, I just couldn't check it. So you can play around with the path in the second query.
p.p.s. A filter may not be the query you need here. It does not calculate scores.
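To illustrate why the flat filter fetches both products while a nested query should fetch only product 2, here is a small JavaScript simulation on the sample data (elided fields dropped). It is a sketch of the matching semantics only, assuming articles and clients are mapped as nested; the function names are made up.

```javascript
// Sample products from the question, reduced to the fields that matter.
const products = [
  {
    product_id: 1,
    articles: [
      { id: 101, some_param: 10, clients: [] },
      { id: 102, some_param: 11, clients: [{ client_id: 10001 }] },
    ],
  },
  {
    product_id: 2,
    articles: [
      { id: 101, some_param: 10, clients: [] },
      { id: 102, some_param: 10, clients: [{ client_id: 10001 }] },
    ],
  },
];

// Flat (non-nested) semantics: each condition may be satisfied by a
// different article of the same product.
const flatMatch = (p) =>
  p.articles.some((a) => a.some_param === 10) &&
  p.articles.some((a) => a.clients.some((c) => c.client_id === 10001));

// Nested semantics: one and the same article must satisfy both conditions.
const nestedMatchBoth = (p) =>
  p.articles.some(
    (a) => a.some_param === 10 && a.clients.some((c) => c.client_id === 10001)
  );

console.log(products.filter(flatMatch).map((p) => p.product_id));       // [ 1, 2 ]
console.log(products.filter(nestedMatchBoth).map((p) => p.product_id)); // [ 2 ]
```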

How to Boost a field based on condition in ElasticSearch

I have a query structure like this:
{
"sort": {},
"query": {
"bool": {
"should": [
{
"match_phrase": {
"user_categories": "Grant Writing"
}
},
{
"match_phrase": {
"user_agencies": "Census"
}
},
{
"match_phrase": {
"user_agencies": "MDA"
}
},
{
"match_phrase": {
"user_agencies": "OSD"
}
}
]
}
},
"size": 500,
"from": 0
}
Suppose this will return a list of 10 users.
What I need is for the users having Agency: 'Census' to come first in the search results (that is, boost the results having Census as agency). How can we do this?
The following will do it. I converted some of the match_phrase queries to match queries, as they contain only single terms.
{
"sort": {},
"query": {
"bool": {
"should": [
{
"match_phrase": {
"user_categories": "Grant Writing"
}
},
{
"match": {
"user_agencies": {
"query": "Census",
"boost": 3
}
}
},
{
"match": {
"user_agencies": {
"query": "MDA"
}
}
},
{
"match": {
"user_agencies": {
"query": "OSD"
}
}
}
]
}
},
"size": 500,
"from": 0
}
You should boost at query time, and give a big boost to documents with "Census" in the agency field. If the boost is high enough, a document matching "Census" will always be on top, regardless of the values for the other fields.
{
"sort": {},
"query": {
"bool": {
"should": [
{
"match_phrase": {
"user_categories": "Grant Writing"
}
},
{
"match_phrase": {
"user_agencies": {
"query": "Census",
"boost": 10
}
}
},
{
"match_phrase": {
"user_agencies": "MDA"
}
},
{
"match_phrase": {
"user_agencies": "OSD"
}
}
]
}
},
"size": 500,
"from": 0
}
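When the agency list comes from user input, the should clauses can be generated rather than hard-coded. A minimal sketch in JavaScript, assuming the same field name as above; agencyClauses is a hypothetical helper and the boost value of 10 is just the one used in the answer.

```javascript
// Hypothetical helper: one match_phrase clause per agency, with a boost
// applied only to the agency that should rank first.
function agencyClauses(agencies, boostedAgency, boost) {
  return agencies.map((agency) => {
    const clause = { query: agency };
    if (agency === boostedAgency) {
      clause.boost = boost; // query-time boost for this clause only
    }
    return { match_phrase: { user_agencies: clause } };
  });
}

const boostedQuery = {
  bool: {
    should: [
      { match_phrase: { user_categories: 'Grant Writing' } },
      ...agencyClauses(['Census', 'MDA', 'OSD'], 'Census', 10),
    ],
  },
};

console.log(JSON.stringify({ query: boostedQuery, size: 500, from: 0 }, null, 2));
```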

Resources