Elasticsearch: combine text match and array contains

We have documents in elastic saved in the following structure:
{
...
name: "name",
ancestors: ["id1", "id2", "id3"],
...
}
I want to create a search query that searches for name: "some name" AND ancestors contains "id1".
I've tried many queries, but none of them works or returns the desired result. If it helps, this combined query should only ever return one entry.
Some of the queries I've tried are the following:
filtered: {
query: {
query_string: {
query: "name:name"
},
term: {
ancestors: "id1"
}
}
}
__
match: {
name: "name",
ancestors: "id1"
},
defaultOperator: 'AND'
__
bool: {
must: { term: { name: "name" }},
filter: {
term: { ancestors: "id1" }
}
}
The mappings are the following:
{
"data": {
"mappings": {
"entry": {
"properties": {
"ancestors": {
"type": "string"
},
"id": {
"type": "string"
},
"name": {
"type": "string"
}
}
}
}
}
}
We haven't changed the default mappings, which is why ancestors is of type string, but I don't think this makes any difference.

Try this query:
{
"query": {
"bool": {
"must": [
{
"query_string": {
"default_field": "name",
"query": "stripe AND 1"
}
},
{
"match": {
"ancestors": "id1"
}
}
]
}
}
}
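Another shape that may be worth trying is a bool query with an analyzed match on name in must and an exact term on ancestors in filter. The sketch below only builds the request body in Python so the structure is easy to inspect. It assumes an Elasticsearch version where bool accepts a filter clause (2.0 or later) and that each id was indexed as a single lowercase token, which the default analyzed string mapping would produce for a value like "id1" but not necessarily for ids containing hyphens or uppercase letters. The helper name build_name_and_ancestor_query is made up for this example.

```python
import json


def build_name_and_ancestor_query(name, ancestor_id):
    """Hypothetical helper: match on name AND exact term on ancestors."""
    return {
        "query": {
            "bool": {
                # match analyzes the input; operator "and" requires every word
                "must": [
                    {"match": {"name": {"query": name, "operator": "and"}}}
                ],
                # term is not analyzed: it is compared as-is with the indexed
                # tokens, and an array field matches if any element matches
                "filter": [{"term": {"ancestors": ancestor_id}}],
            }
        }
    }


body = build_name_and_ancestor_query("some name", "id1")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a _search request. If the ids are not single lowercase tokens, mapping ancestors as a not-analyzed field would be the safer route.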

Related

ElasticSearch 6.8 doesn't order by exact matches first

I've been searching for a solution to this issue for some days and couldn't make it work. I followed steps like this and this, but with no success.
So basically, I have the following data on ElasticSearch:
{ title: "Black Dust" },
{ title: "Dust In The Wind" },
{ title: "Gold Dust Woman" },
{ title: "Another One Bites The Dust" }
and the problem is that I want to search for the word "Dust" and have the results ordered like this:
{ title: "Dust In The Wind" },
{ title: "Black Dust" },
{ title: "Gold Dust Woman" },
{ title: "Another One Bites The Dust" }
where titles starting with "Dust" appear at the top of the results.
Posting the mappings and query is probably clearer than continuing to explain the issue.
settings: {
analysis: {
normalizer: {
lowercase: {
type: 'custom',
filter: ['lowercase']
}
}
}
},
mappings: {
_doc: {
properties: {
title: {
type: 'text',
analyzer: 'standard',
fields: {
raw: {
type: 'keyword',
normalizer: 'lowercase'
},
fuzzy: {
type: 'text',
},
},
}
}
}
}
and my query is:
"query": {
"bool": {
"must": {
"query_string": {
"fields": [
"title"
],
"default_operator": "AND",
"query": "dust"
}
},
"should": {
"prefix": {
"title.raw": "dust"
}
}
}
}
Can anyone please help me with this?
Thank you!
SOLUTION!
I figured it out and solved it with the following query:
"query": {
"bool": {
"must": {
"bool": {
"should": [
{
"prefix": {
"title.raw": {
"value": "dust",
"boost": 1000000
}
}
},
{
"match": {
"title": {
"query": "dust",
"boost": 50000
}
}
},
{
"match": {
"title": {
"query": "dust",
"boost": 10,
"fuzziness": 1
}
}
}
]
}
}
}
}
However, while writing tests, I found a little issue.
So, I'm generating a random uuid and adding the following to the database:
{ title: `${uuid} A` }
{ title: `${uuid} W` }
{ title: `${uuid} Z` }
{ title: `A ${uuid}` }
{ title: `z ${uuid}` }
{ title: `Z ${uuid}` }
When I perform the query above looking for the uuid, I get:
uuid Z
uuid A
uuid W
Z uuid
I achieved my first goal, which was having the uuid in first position, but why is Z before A (first and second result)?
When everything else fails you can use a trivial substring position sort like so:
{
"query": {
"bool": {
"must": {
...
},
"should": {
...
}
}
},
"sort": [
{
"_script": {
"script": "return doc['title.raw'].value.indexOf('dust')",
"type": "number",
"order": "asc" <--
}
}
]
}
I've set the order to asc because the lower the substring index, the higher the 'score'.
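To see what such a sort does to the sample titles, here is a small Python simulation of the same idea. It is only a sketch of the ordering logic, not of Elasticsearch itself: substring_position is a made-up name, lowercasing stands in for the lowercase normalizer on title.raw, the extra "Golden Years" title is invented to show a non-matching document, and the secondary key on the title is an assumption that corresponds to adding a second sort entry such as title.raw ascending to break ties.

```python
import sys


def substring_position(title, needle):
    """Index of needle in the lowercased title, or a huge value if absent."""
    pos = title.lower().find(needle.lower())
    return sys.maxsize if pos == -1 else pos


titles = [
    "Black Dust",
    "Dust In The Wind",
    "Gold Dust Woman",
    "Another One Bites The Dust",
    "Golden Years",  # invented title that does not contain the needle
]

# Primary key: position of the substring (ascending).
# Secondary key: the title itself, so equal positions have a defined order.
ranked = sorted(
    titles,
    key=lambda t: (substring_position(t, "dust"), t.lower()),
)

for title in ranked:
    print(title)
```

With these inputs, "Dust In The Wind" (position 0) comes first, followed by "Gold Dust Woman" (5), "Black Dust" (6) and "Another One Bites The Dust" (22), with the non-matching title last. Without a secondary key, documents that share the same position or score have no guaranteed relative order.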
EDIT
We've gotta account for index == -1 so replace the script above with:
"script": "def pos = doc['title.raw'].value.indexOf('dust'); return pos == -1 ? Integer.MAX_VALUE : pos"

Full-text search through complex structure Elasticsearch

I have the following issue with full-text search in Elasticsearch. I would like to search across all indexed attributes. However, one of my Project attributes is a very complex array of hashes/objects:
[
{
"title": "Group 1 title",
"name": "Group 1 name",
"id": "group_1_id",
"items": [
{
"pos": "1",
"title": "Position 1 title"
},
{
"pos": "1.1",
"title": "Position 1.1 title",
"description": "<p>description</p>",
"extra_description": {
"rotation": "2 years",
"amount": "1.947m²"
},
"inputs": {
"unit_price": true,
"total_net": true
},
"additional_inputs": [
{
"name": "additonal_input_name",
"label": "Additional input label:",
"placeholder": "Additional input placeholder",
"description": "Additional input description",
"type": "text"
}
]
}
]
}
]
My mappings look like this:
{:title=>{:type=>"text", :analyzer=>"english"},
:description=>{:type=>"text", :analyzer=>"english"},
:location=>{:type=>"keyword"},
:company=>{:type=>"keyword"},
:created_at=>{:type=>"date"},
:due_date=>{:type=>"date"},
:specification=>
{:type=>:nested,
:properties=>
{:id=>{:type=>"keyword"},
:title=>{:type=>"text"},
:items=>
{:type=>:nested,
:properties=>
{:pos=>{:type=>"keyword"},
:title=>{:type=>"text"},
:description=>{:type=>"text", :analyzer=>"english"},
:extra_description=>{:type=>:nested, :properties=>{:rotation=>{:type=>"keyword"}, :amount=>{:type=>"keyword"}}},
:additional_inputs=>
{:type=>:nested,
:properties=>
{:label=>{:type=>"keyword"},
:placeholder=>{:type=>"text"},
:description=>{:type=>"text"},
:type=>{:type=>"keyword"},
:name=>{:type=>"keyword"}
}
}
}
}
}
}
}
The question is how to search through it properly. For non-nested attributes it works like a charm, but when I search, for instance, by title in the specification, no result is returned. I tried both:
query:
{ nested:
{
multi_match: {
query: keyword,
fields: ['title', 'description', 'company', 'location', 'specification']
}
}
}
Or
{
nested: {
path: 'specification',
query: {
multi_match: {
query: keyword
}
}
}
}
Without any result.
Edit:
It's with elasticsearch-ruby for Ruby.
I am trying to query by: MODEL_NAME.all.search(query: with_specification("Group 1 title")) where with_specification is:
def with_specification(keyword)
{
bool: {
should: [
{
nested: {
path: 'specification',
query: {
bool: {
should: [
{
match: {
'specification.title': keyword,
}
},
{
multi_match: {
query: keyword,
fields: [
'specification.title',
'specification.id'
]
}
},
{
nested: {
path: 'specification.items',
query: {
match: {
'specification.items.title': keyword,
}
}
}
}
]
}
}
}
}
]
}
}
end
Querying on multi-level nested documents must follow a certain schema.
You cannot multi-match on nested & non-nested fields at the same time and/or query on nested fields under different paths.
You can wrap your queries in a bool-should but keep the 2 rules above in mind:
GET your_index/_search
{
"query": {
"bool": {
"should": [
{
"nested": {
"path": "specification",
"query": {
"bool": {
"should": [
{
"match": {
"specification.title": "TEXT" <-- standalone match
}
},
{
"multi_match": { <-- multi-match but 1st level path
"query": "TEXT",
"fields": [
"specification.title",
"specification.id"
]
}
},
{
"nested": {
"path": "specification.items", <-- 2nd level path
"query": {
"match": {
"specification.items.title": "TEXT"
}
}
}
}
]
}
}
}
}
]
}
}
}
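The same structure can be generated rather than written by hand. The Python sketch below is one way to do it, under the assumption that each nested level gets its own wrapper and that fields inside a wrapper are always written with their full path. The helper names nested and specification_query are invented for this example.

```python
import json


def nested(path, query):
    """Wrap a query so it runs against the nested documents under path."""
    return {"nested": {"path": path, "query": query}}


def specification_query(text):
    """Hypothetical builder for the two-level specification search."""
    first_level = [
        # multi_match restricted to fields that live directly under
        # the first-level path
        {
            "multi_match": {
                "query": text,
                "fields": ["specification.title", "specification.id"],
            }
        },
        # second-level path gets its own nested wrapper
        nested(
            "specification.items",
            {"match": {"specification.items.title": text}},
        ),
    ]
    return {
        "query": {
            "bool": {
                "should": [
                    nested("specification", {"bool": {"should": first_level}})
                ]
            }
        }
    }


body = specification_query("Group 1 title")
print(json.dumps(body, indent=2))
```

Non-nested fields such as title or company would go into the outer should list as ordinary clauses, next to the nested wrapper rather than inside it.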

Elasticsearch with nested objects query

I have an index with a nested mapping.
I want to perform a query that returns the following: all the documents where each word in the search term appears in one or more of the nested documents.
Here is the index:
properties: {
column_values_index_as_objects: {
type: "nested",
properties: {
value: {
ignore_above: 256,
type: 'keyword',
fields: {
word_middle: {
analyzer: "searchkick_word_middle_index",
type: "text"
},
analyzed: {
term_vector: "with_positions_offsets",
type: "text"
}
}
}
}
}
}
Here is the latest query I try:
nested: {
path: "column_values_index_as_objects",
query: {
bool: {
must: [
{
match: {
"column_values_index_as_objects.value.analyzed": {
query: search_term,
boost: 10 * boost_factor,
operator: "or",
analyzer: "searchkick_search"
}
}
}
For example, if I search for the words 'food and water', I want each word to appear in at least one nested document.
The current search returns the document even if only one of the words exists.
Thanks for the help!
Update:
The solution Cristoph suggested works. Now I have the following problem.
Here is my index:
properties: {
name: {
type: "text"
},
column_values_index_as_objects: {
type: "nested",
properties: {
value: {
ignore_above: 256,
type: 'keyword',
fields: {
word_middle: {
analyzer: "searchkick_word_middle_index",
type: "text"
},
analyzed: {
term_vector: "with_positions_offsets",
type: "text"
}
}
}
}
}
}
The query I want to perform is this: if I search for 'my name is guy', it should return all the documents where all the words are found, whether in the nested documents or in the name field.
For example, I could have a document with the value 'guy' in the name field and the other words in the nested documents.
To do this, I usually split the search term into words and generate a request like this (foo:bar is another criterion on another field):
{
"bool": {
"must": [
{
"nested": {
"path": "column_values_index_as_objects",
"query": {
"match": {
"column_values_index_as_objects.value.analyzed": {
"query": "food",
"boost": "10 * boost_factor",
"analyzer": "searchkick_search"
}
}
}
}
},
{
"nested": {
"path": "column_values_index_as_objects",
"query": {
"match": {
"column_values_index_as_objects.value.analyzed": {
"query": "and",
"boost": "10 * boost_factor",
"analyzer": "searchkick_search"
}
}
}
}
},
{
"nested": {
"path": "column_values_index_as_objects",
"query": {
"match": {
"column_values_index_as_objects.value.analyzed": {
"query": "water",
"boost": "10 * boost_factor",
"analyzer": "searchkick_search"
}
}
}
}
},
{
"query": {
"term": {
"foo": "bar"
}
}
}
]
}
}
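Generating that request is mostly a loop over the words. The Python sketch below shows one possible way, leaving out the boost and analyzer settings for brevity. The function names are made up, and the extra_field option is an assumption about how the follow-up case (a word may be in the name field or in a nested document) could be handled: each word becomes a should pair of which at least one side must match.

```python
import json

NESTED_PATH = "column_values_index_as_objects"
NESTED_FIELD = NESTED_PATH + ".value.analyzed"


def nested_term_clause(term):
    """One nested query that requires a single word in some nested doc."""
    return {
        "nested": {
            "path": NESTED_PATH,
            "query": {"match": {NESTED_FIELD: {"query": term}}},
        }
    }


def all_terms_query(search_term, extra_field=None):
    """Require every word; optionally accept a top-level field as well."""
    clauses = []
    for term in search_term.split():
        clause = nested_term_clause(term)
        if extra_field is not None:
            # The word may match the top-level field or a nested document
            clause = {
                "bool": {
                    "should": [{"match": {extra_field: term}}, clause],
                    "minimum_should_match": 1,
                }
            }
        clauses.append(clause)
    return {"bool": {"must": clauses}}


nested_only = all_terms_query("food and water")
with_name = all_terms_query("my name is guy", extra_field="name")
print(json.dumps(with_name, indent=2))
```

Because every word gets its own nested clause, different words are free to match different nested documents of the same parent.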

Querying Nested JSON based on 1 term value

I have indexed JSON like below format
JSON:
{"work":[{"organization":"abc", "end":"present"},{"organization":"edf", "end":"old"}]}
{"work":[{"organization":"edf", "end":"present"},{"organization":"abc", "end":"old"}]}
I want to query records where organization is "abc" and end is "present", but the query below is not working:
work.0.organization: "abc" AND work.0.end:"present"
No records are matched.
If I use the query below instead:
work.organization: "abc" AND work.end:"present"
both records are matched, whereas I only want the first one.
The only matched record should be:
{"work":[{"organization":"abc", "end":"present"},{"organization":"edf", "end":"old"}]}
You have to use the nested type. First, map work as a nested type using the following mapping:
PUT index_name_3
{
"mappings": {
"document_type" : {
"properties": {
"work" : {
"type": "nested",
"properties": {
"organization" : {
"type" : "text"
},
"end" : {
"type" : "text"
}
}
}
}
}
}
}
Then use the following nested query, with inner_hits to see which nested objects matched:
{
"query": {
"nested": {
"path": "work",
"inner_hits": {},
"query": {
"bool": {
"must": [{
"term": {
"work.organization": {
"value": "abc"
}
}
},
{
"term": {
"work.end": {
"value": "present"
}
}
}
]
}
}
}
}
}
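The reason the plain query matches both records is that arrays of objects, when not mapped as nested, are indexed as one multi-valued field per key, so the link between an organization and its end value is lost. The Python sketch below imitates that behaviour on the two sample documents. It is a simplified model for illustration only, and the function names are invented.

```python
docs = [
    {"work": [{"organization": "abc", "end": "present"},
              {"organization": "edf", "end": "old"}]},
    {"work": [{"organization": "edf", "end": "present"},
              {"organization": "abc", "end": "old"}]},
]


def flatten(doc):
    """Model of the default object mapping: one value set per field."""
    flat = {}
    for job in doc["work"]:
        for key, value in job.items():
            flat.setdefault("work." + key, set()).add(value)
    return flat


def matches_flattened(doc, organization, end):
    # Each condition is checked against all values of the field
    flat = flatten(doc)
    return organization in flat["work.organization"] and end in flat["work.end"]


def matches_nested(doc, organization, end):
    # Both conditions must hold inside the same object
    return any(
        job["organization"] == organization and job["end"] == end
        for job in doc["work"]
    )


flat_hits = [matches_flattened(d, "abc", "present") for d in docs]
nested_hits = [matches_nested(d, "abc", "present") for d in docs]
print(flat_hits, nested_hits)
```

In this model the flattened check accepts both documents, while the per-object check accepts only the first, which is the behaviour the nested mapping and nested query provide.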

Elasticsearch how to use multi_match with wildcard

I have a User object with properties Name and Surname. I want to search these fields using one query, and I found multi_match in the documentation, but I don't know how to properly use that with a wildcard. Is it possible?
I tried with a multi_match query but it didn't work:
{
"query": {
"multi_match": {
"query": "*mar*",
"fields": [
"user.name",
"user.surname"
]
}
}
}
Alternatively you could use a query_string query with wildcards.
"query": {
"query_string": {
"query": "*mar*",
"fields": ["user.name", "user.surname"]
}
}
This will be slower than using an nGram filter at index-time (see my other answer), but if you are looking for a quick and dirty solution...
Also I am not sure about your mapping, but if you are using user.name instead of name your mapping needs to look like this:
"your_type_name_here": {
"properties": {
"user": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"surname": {
"type": "string"
}
}
}
}
}
Such a query worked for me:
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"should": [
{"query": {"wildcard": {"user.name": {"value": "*mar*"}}}},
{"query": {"wildcard": {"user.surname": {"value": "*mar*"}}}}
]
}
}
}
}
}
Similar to what you are doing, except that in my case there could be different masks for different fields.
I just did this now:
GET _search
{
"query": {
"bool": {
"must": [
{
"range": {
"theDate": {
"gte": "2014-01-01",
"lte": "2014-12-31"
}
}
},
{
"match" : {
"Country": "USA"
}
}
],
"should": [
{
"wildcard" : { "Id_A" : "0*" }
},
{
"wildcard" : { "Id_B" : "0*" }
}
],
"minimum_should_match": 1
}
}
}
Similar to the suggestion above, but this is simpler and worked for me:
{
"query": {
"bool": {
"must":
[
{
"wildcard" : { "processname.keyword" : "*system*" }
},
{
"wildcard" : { "username" : "*admin*" }
},
{
"wildcard" : { "device_name" : "*10*" }
}
]
}
}
}
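For the original question (one fragment, several fields, any field may match), the wildcard clauses can be generated per field and combined with should. The Python sketch below builds such a body. It assumes the fields are analyzed with lowercasing, which is why the fragment is lowercased before being wrapped in asterisks, since wildcard patterns are not analyzed. It does not escape * or ? inside the fragment, and the name wildcard_any_field is invented for this example.

```python
import json


def wildcard_any_field(fragment, fields):
    """Hypothetical helper: match *fragment* in at least one of the fields."""
    pattern = "*" + fragment.lower() + "*"
    return {
        "query": {
            "bool": {
                # one wildcard clause per field
                "should": [
                    {"wildcard": {field: {"value": pattern}}}
                    for field in fields
                ],
                # at least one of the fields has to match
                "minimum_should_match": 1,
            }
        }
    }


body = wildcard_any_field("Mar", ["user.name", "user.surname"])
print(json.dumps(body, indent=2))
```

Patterns with a leading asterisk are the expensive kind, so this is best kept for small indices or occasional queries.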
I would not use wildcards; they will not scale well. You are asking a lot of the search engine at query time. You can use the nGram filter instead, to do the processing at index time rather than at search time.
See this discussion on the nGram filter.
After indexing the name and surname correctly (change your mapping; there are examples in the above link), you can use multi_match without wildcards and get the expected results.
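To illustrate why n-grams make the wildcard unnecessary, here is a toy Python model of the idea: every name is broken into fixed-size character grams at index time, and a search fragment is broken up the same way and looked up directly. This is only a sketch of the principle, not of how Elasticsearch stores its index. The sample names are invented, and the gram size of 3 stands in for the min_gram and max_gram settings of the ngram token filter.

```python
def ngrams(token, min_gram=3, max_gram=3):
    """All character n-grams of the lowercased token."""
    token = token.lower()
    return {
        token[i:i + n]
        for n in range(min_gram, max_gram + 1)
        for i in range(len(token) - n + 1)
    }


# "Index time": map every gram to the names that contain it
names = ["Marcus", "Tamara", "Omar", "Peter"]
index = {}
for name in names:
    for gram in ngrams(name):
        index.setdefault(gram, set()).add(name)


def search(fragment):
    """'Search time': look up the fragment's grams, require all of them."""
    grams = ngrams(fragment)
    if not grams:
        return set()  # fragment shorter than min_gram
    hits = [index.get(gram, set()) for gram in grams]
    return set.intersection(*hits)


print(sorted(search("mar")))
print(sorted(search("mara")))
```

Here searching for "mar" finds Marcus, Omar and Tamara through plain lookups, and "mara" narrows that down to Tamara. Requiring all grams is an approximation of substring matching, since the grams could in principle occur in a name without being adjacent.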
description: {
type: 'keyword',
normalizer: 'useLowercase',
},
product: {
type: 'object',
properties: {
name: {
type: 'keyword',
normalizer: 'useLowercase',
},
},
},
activity: {
type: 'object',
properties: {
name: {
type: 'keyword',
normalizer: 'useLowercase',
},
},
},
query:
query: {
bool: {
must: [
{
bool: {
should: [
{
wildcard: {
description: {
value: `*${value ? value : ''}*`,
boost: 1.0,
rewrite: 'constant_score',
},
},
},
{
wildcard: {
'product.name': {
value: `*${value ? value : ''}*`,
boost: 1.0,
rewrite: 'constant_score',
},
},
},
{
wildcard: {
'activity.name': {
value: `*${value ? value : ''}*`,
boost: 1.0,
rewrite: 'constant_score',
},
},
},
],
},
},
{
match: {
recordStatus: RecordStatus.Active,
},
},
{
bool: {
must_not: [
{
term: {
'user.id': req.currentUser?.id,
},
},
],
},
},
{
bool: {
should: tags
? tags.map((name: string) => {
return {
nested: {
path: 'tags',
query: {
match: {
'tags.name': name,
},
},
},
};
})
: [],
},
},
],
filter: {
bool: {
must_not: {
terms: {
id: existingIds ? existingIds : [],
},
},
},
},
},
},
sort: [
{
updatedAt: {
order: 'desc',
},
},
],

Resources