Querying nested fields with analyzer results in error - elasticsearch

I tried to use a synonym analyzer for my already working Elasticsearch type. Here's the mapping of my serviceEntity:
{
"serviceentity" : {
"properties":{
"ServiceLangProps" : {
"type" : "nested",
"properties" : {
"NAME" : {"type" : "string", "search_analyzer": "synonym"},
"LONG_TEXT" : {"type" : "string", "search_analyzer": "synonym"},
"DESCRIPTION" : {"type" : "string", "search_analyzer": "synonym"},
"MATERIAL" : {"type" : "string", "search_analyzer": "synonym"},
"LANGUAGE_ID" : {"type" : "string", "include_in_all": false}
}
},
"LinkProps" : {
"type" : "nested",
"properties" : {
"TITLE" : {"type" : "string", "search_analyzer": "synonym"},
"LINK" : {"type" : "string"},
"LANGUAGE_ID" : {"type" : "string", "include_in_all": false}
}
},
"MediaProps" : {
"type" : "nested",
"properties" : {
"TITLE" : {"type" : "string", "search_analyzer": "synonym"},
"FILENAME" : {"type" : "string"},
"LANGUAGE_ID" : {"type" : "string", "include_in_all": false}
}
}
}
}
}
And these are my settings:
{
"analysis": {
"filter": {
"synonym": {
"ignore_case": "true",
"type": "synonym",
"synonyms": [
"lorep, spaceship",
"ipsum, planet"
]
}
},
"analyzer": {
"synonym": {
"filter": [
"lowercase",
"synonym"
],
"tokenizer": "whitespace"
}
}
}
}
When I try to search for anything, I get this error:
Caused by: org.elasticsearch.index.query.QueryParsingException: [nested] nested object under path [ServiceLangProps] is not of nested type
And I don't understand why. If I don't add any analyzer to my settings, everything works fine.
I'm using the Java API to communicate with the Elasticsearch instance, so my code for the multi match query looks something like this:
MultiMatchQueryBuilder multiMatchBuilder = QueryBuilders.multiMatchQuery(fulltextSearchString, QUERY_FIELDS).analyzer("synonym");
The query string created by the Java API looks like this:
{
"query" : {
"bool" : {
"must" : {
"bool" : {
"should" : [ {
"nested" : {
"query" : {
"bool" : {
"must" : [ {
"match" : {
"ServiceLangProps.LANGUAGE_ID" : {
"query" : "DE",
"type" : "boolean"
}
}
}, {
"multi_match" : {
"query" : "lorem",
"fields" : [ "ServiceLangProps.NAME", "ServiceLangProps.DESCRIPTION", "ServiceLangProps.MATERIALKURZTEXT", "ServiceLangProps.DESCRIPTION_RICHTEXT" ],
"analyzer" : "synonym"
}
} ]
}
},
"path" : "ServiceLangProps"
}
}, {
"nested" : {
"query" : {
"bool" : {
"must" : [ {
"match" : {
"LinkProps.LANGUAGE_ID" : {
"query" : "DE",
"type" : "boolean"
}
}
}, {
"match" : {
"LinkProps.TITLE" : {
"query" : "lorem",
"type" : "boolean"
}
}
} ]
}
},
"path" : "LinkProps"
}
}, {
"nested" : {
"query" : {
"bool" : {
"must" : [ {
"match" : {
"MediaProps.LANGUAGE_ID" : {
"query" : "DE",
"type" : "boolean"
}
}
}, {
"match" : {
"MediaProps.TITLE" : {
"query" : "lorem",
"type" : "boolean"
}
}
} ]
}
},
"path" : "MediaProps"
}
} ]
}
},
"filter" : {
"bool" : { }
}
}
}
}
If I try it on the LinkProps or MediaProps, I get the same error for the respective nested object.
Edit: I'm using version 2.4.6 of elasticsearch
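Since the error says the path is not of nested type, one thing worth checking is the mapping the index actually holds (for example, the output of GET /<index>/_mapping). Below is a minimal Python sketch, assuming the mapping has already been fetched and parsed into a dict. The helper name is_nested_path and the sample dicts are illustrative and not part of the original post.

```python
def is_nested_path(type_mapping, path):
    """Return True if `path` (dot-separated) is mapped with "type": "nested"."""
    node = type_mapping
    for part in path.split("."):
        node = node.get("properties", {}).get(part)
        if node is None:
            return False  # the path does not exist in the mapping at all
    return node.get("type") == "nested"


# Mapping as declared in the question (trimmed to one field).
declared = {
    "properties": {
        "LinkProps": {
            "type": "nested",
            "properties": {"TITLE": {"type": "string"}},
        }
    }
}

# What dynamic mapping produces when no explicit mapping was applied:
# a plain object, without "type": "nested".
dynamic = {
    "properties": {
        "LinkProps": {"properties": {"TITLE": {"type": "string"}}}
    }
}

print(is_nested_path(declared, "LinkProps"))  # True
print(is_nested_path(dynamic, "LinkProps"))   # False
```

If the check comes back False for the live index, the explicit mapping was probably not applied when the index was created with the new settings. That is only a hypothesis here; the thread does not confirm it.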

It would be helpful to see the query string as well, and to know which version of ES is being used.
I couldn't see a synonyms_path in your settings, and the fact that you are using nested types could also cause that error.
You have probably seen this already, but in case you haven't:
https://www.elastic.co/guide/en/elasticsearch/reference/5.5/analysis-synonym-tokenfilter.html

I created a minimal example of what I'm trying to do.
My mapping looks like this:
{
"serviceentity" : {
"properties":{
"LinkProps" : {
"type" : "nested",
"properties" : {
"TITLE" : {"type" : "string", "search_analyzer": "synonym"},
"LINK" : {"type" : "string"},
"LANGUAGE_ID" : {"type" : "string", "include_in_all": false}
}
}
}
}
}
And my settings for the synonym analyzer in Java code:
XContentBuilder builder = jsonBuilder()
.startObject()
.startObject("analysis")
.startObject("filter")
.startObject("synonym") // The name of the token filter
.field("type", "synonym") // The type (derivate)
.field("ignore_case", "true")
.array("synonyms", synonyms) // The synonym list
.endObject()
.endObject()
.startObject("analyzer")
.startObject("synonym")
.field("tokenizer", "whitespace")
.array("filter", "lowercase", "synonym")
.endObject()
.endObject()
.endObject()
.endObject();
The metadata which the ElasticSearch Head Chrome plugin spits out looks like this:
{
"analysis": {
"filter": {
"synonym": {
"ignore_case": "true",
"type": "synonym",
"synonyms": [
"Test, foo",
"Title, bar"
]
}
},
"analyzer": {
"synonym": {
"filter": [
"lowercase",
"synonym"
],
"tokenizer": "whitespace"
}
}
}
}
When I now use a search query to look for "Test", I get the same error as mentioned in my first post. Here's the query:
{
"query": {
"bool": {
"must": {
"nested": {
"path": "LinkProps",
"query": {
"multi_match": {
"query": "Test",
"fields": [
"LinkProps.TITLE",
"LinkProps.LINK"
],
"analyzer": "synonym"
}
}
}
}
}
}
}
which leads to this error:
{
"error": {
"root_cause": [
{
"type": "query_parsing_exception",
"reason": "[nested] nested object under path [LinkProps] is not of nested type",
"index": "minimal",
"line": 1,
"col": 44
}
],
"type": "search_phase_execution_exception",
"reason": "all shards failed",
"phase": "query",
"grouped": true,
"failed_shards": [
{
"shard": 0,
"index": "minimal",
"node": "6AhE4RCIQwywl49h0Q2-yw",
"reason": {
"type": "query_parsing_exception",
"reason": "[nested] nested object under path [LinkProps] is not of nested type",
"index": "minimal",
"line": 1,
"col": 44
}
}
]
},
"status": 400
}
When I check the analyzer with
GET http://localhost:9200/minimal/_analyze?text=foo&analyzer=synonym&pretty=true
I get the correct answer
{
"tokens": [
{
"token": "foo",
"start_offset": 0,
"end_offset": 3,
"type": "word",
"position": 0
},
{
"token": "test",
"start_offset": 0,
"end_offset": 3,
"type": "SYNONYM",
"position": 0
}
]
}
So the analyzer seems to be set up correctly. Did I mess up the mappings? I guess the problem is not caused by the nested objects, or is it?
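One thing worth ruling out is that the index was created with the analysis settings but without the nested mapping. The sketch below builds a single create-index body that carries both, in the shape Elasticsearch 2.x accepts for PUT /<index>. It is an assumption that the two were sent separately in the original setup; the function name create_index_body is illustrative.

```python
import json


def create_index_body(synonyms):
    """Build one request body holding both analysis settings and the mapping."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "synonym": {
                        "type": "synonym",
                        "ignore_case": "true",
                        "synonyms": list(synonyms),
                    }
                },
                "analyzer": {
                    "synonym": {
                        "tokenizer": "whitespace",
                        "filter": ["lowercase", "synonym"],
                    }
                },
            }
        },
        "mappings": {
            "serviceentity": {
                "properties": {
                    "LinkProps": {
                        "type": "nested",
                        "properties": {
                            "TITLE": {"type": "string", "search_analyzer": "synonym"},
                            "LINK": {"type": "string"},
                            "LANGUAGE_ID": {"type": "string", "include_in_all": False},
                        },
                    }
                }
            }
        },
    }


body = create_index_body(["Test, foo", "Title, bar"])
print(json.dumps(body, indent=2))
```

Sending settings and mappings in the same request means the analyzer already exists when the mapping that references it is validated, and the nested type is in place before the first document is indexed.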

I just tried this
{
"query": {
"bool": {
"must": {
"query": {
"multi_match": {
"query": "foo",
"fields": [
"LinkProps.TITLE",
"LinkProps.LINK"
],
"analyzer": "synonym"
}
}
}
}
}
}
As you can see, I removed the "nested" wrapper
"nested": {
"path": "LinkProps",
...
}
which now at least returns some results (I'm not sure yet whether these are the correct results). I'm trying to apply this to the original project and will keep you posted on whether it works there too.
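To compare the two variants side by side, here is a small Python sketch that builds the same multi_match query with and without the nested wrapper. The helper names are hypothetical. Note that a query without the wrapper normally finds nothing inside a field that is truly indexed as nested, so getting hits from the flat form suggests that LinkProps is stored as a plain object in that index. That is an inference, not something the thread confirms.

```python
import json


def multi_match(text, fields, analyzer):
    """Plain multi_match clause, as used in both variants."""
    return {"multi_match": {"query": text, "fields": fields, "analyzer": analyzer}}


def search_body(text, fields, analyzer, nested_path=None):
    """Wrap the clause in a nested query only when a path is given."""
    inner = multi_match(text, fields, analyzer)
    if nested_path is not None:
        inner = {"nested": {"path": nested_path, "query": inner}}
    return {"query": {"bool": {"must": inner}}}


fields = ["LinkProps.TITLE", "LinkProps.LINK"]
with_nested = search_body("foo", fields, "synonym", nested_path="LinkProps")
without_nested = search_body("foo", fields, "synonym")

print(json.dumps(with_nested, indent=2))
print(json.dumps(without_nested, indent=2))
```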

Related

Update Multi-field Completion Suggester Weighting

Given the following mapping:
{
"index-name-a" : {
"mappings" : {
"properties" : {
"id" : {
"type" : "integer"
},
"title" : {
"type" : "keyword",
"fields" : {
"suggest" : {
"type" : "completion",
"analyzer" : "standard",
"preserve_separators" : true,
"preserve_position_increments" : true,
"max_input_length" : 50
}
}
}
}
}
}
}
How do I update the weightings of individual documents? I have tried the following:
PUT /index-name-a/_doc/1
{
"title.suggest" : {
"weight" : 30
}
}
Which errors with:
Could not dynamically add mapping for field [title.suggest]. Existing mapping for [title] must be of type object but found [keyword]
Which makes sense, since the property is of type keyword and not an object. But I can't find the correct way to do it. The closest thing I could find in the docs doesn't seem to work if the field is a multi-field.
Workaround
Indeed, I could reproduce this behaviour on Elasticsearch v8.1.
PUT /72342475-2/
{
"mappings": {
"properties": {
"id": {
"type": "integer"
},
"title": {
"type": "completion",
"analyzer": "standard",
"preserve_separators": true,
"preserve_position_increments": true,
"max_input_length": 50,
"fields": {
"suggest": {
"type": "keyword"
}
}
}
}
}
}
PUT 72342475-2/_doc/1?refresh
{
"title": ["nevermind", "smell like teen spirit"],
"title.weight": 22
}
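For a field whose main type is completion (as in the workaround mapping, where title itself is the completion field), the completion suggester documentation describes an object form with input and weight. The Python sketch below builds such a document body. It assumes that mapping, and the helper name suggestion_doc is illustrative.

```python
import json


def suggestion_doc(inputs, weight):
    """Document body using the object form of a completion field."""
    return {"title": {"input": list(inputs), "weight": weight}}


doc = suggestion_doc(["nevermind", "smell like teen spirit"], 22)
print(json.dumps(doc))
```

Changing the weight of an existing document then amounts to indexing the document again under the same id with a new weight value.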

Elasticsearch multi_match + nested search

I am trying to execute a multi_match + nested search in ElasticSearch 6.4. I have the following mappings:
"name" : {
"type" : "text"
},
"status" : {
"type" : "short"
},
"user" : {
"type" : "nested",
"properties" : {
"first_name" : {
"type" : "text"
},
"last_name" : {
"type" : "text"
},
"pk" : {
"type" : "integer"
},
"profile" : {
"type" : "nested",
"properties" : {
"location" : {
"type" : "nested",
"properties" : {
"name" : {
"type" : "text",
"analyzer" : "html_strip"
}
}
}
}
}
}
},
And this is the html_strip analyzer:
"html_strip" : {
"filter" : [
"lowercase",
"stop",
"snowball"
],
"char_filter" : [
"html_strip"
],
"type" : "custom",
"tokenizer" : "standard"
}
And my current query is this one:
"query": {
"bool": {
"must": {
"multi_match": {
"query": 'Paris',
"fields": ['name', 'user.profile.location.name']
},
},
"filter": {
"term": {
"status": 1
}
}
}
}
Obviously, searching for "Paris" in user.profile.location.name doesn't work. I tried to adapt my code to follow this answer https://stackoverflow.com/a/48836012/12007123 but without any success.
What I am basically trying to achieve, is to be able to search for a value in multiple fields, this may or may not be nested.
I was also checking this discussion https://discuss.elastic.co/t/multi-match-query-string-with-nested-and-non-nested-fields/118652/5 but everything I tried wasn't successful.
If I just search for name, the search is working fine.
Any tips on how I can achieve this the right way would be much appreciated.
EDIT:
While I didn't get an answer to my initial question, I followed Nikolay's (@nikolay-vasiliev) comment and changed the mappings to Object instead of Nested.
At least now I am able to search in user.profile.location.name. This is what the new mapping for user looks like:
"user" : {
"properties" : {
"first_name" : {
"type" : "text"
},
"last_name" : {
"type" : "text"
},
"pk" : {
"type" : "integer"
},
"profile" : {
"properties" : {
"location" : {
"properties" : {
"name" : {
"type" : "text",
"analyzer" : "html_strip"
}
}
}
}
}
}
},
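For the original mapping, where user, profile and location are all nested, one way to search a top-level field and a deeply nested field together is a bool query with two should clauses, wrapping the nested field in one nested query per level. The Python sketch below only builds the request body. The helper names are hypothetical, and it has not been run against the asker's index.

```python
import json


def wrap_nested(paths, leaf_query):
    """Wrap leaf_query in nested queries; `paths` is ordered outermost first."""
    query = leaf_query
    for path in reversed(paths):  # start with the innermost level
        query = {"nested": {"path": path, "query": query}}
    return query


def search_body(text):
    """Match `text` in name OR in the nested location name, filtered by status."""
    return {
        "query": {
            "bool": {
                "should": [
                    {"match": {"name": text}},
                    wrap_nested(
                        ["user", "user.profile", "user.profile.location"],
                        {"match": {"user.profile.location.name": text}},
                    ),
                ],
                "minimum_should_match": 1,
                "filter": {"term": {"status": 1}},
            }
        }
    }


body = search_body("Paris")
print(json.dumps(body, indent=2))
```

The minimum_should_match setting is there because a should clause next to a filter is otherwise optional, which would let every document with status 1 through.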

Query Synonyms ElasticSearch

I created a new index with a synonym filter in this way:
PUT /test_index2
{
"settings": {
"index" : {
"analysis" : {
"filter" : {
"synonym" : {
"type" : "synonym",
"synonyms" : [
"mezzo,centro"
]
}
}
}
}
}
}
When I try this query:
{
"query":
{
"multi_match":
{
"query": "centro",
"fields": ["content"],
"analyzer": "synonym"
}
}
}
Kibana gives me this error:
[multi_match] analyzer [synonym] not found
I'm not very experienced with Elasticsearch. Could you help me?
You need to create a custom analyzer that leverages your synonym filter:
{
"settings": {
"index" : {
"analysis" : {
"analyzer": { <-- add this
"synonym_analyzer": {
"type": "custom",
"filter": ["synonym"],
"tokenizer": "keyword"
}
},
"filter" : {
"synonym" : {
"type" : "synonym",
"synonyms" : [
"mezzo,centro"
]
}
}
}
}
}
}
And then you can use it in your query:
{
"query":
{
"multi_match":
{
"query": "centro",
"fields": ["content"],
"analyzer": "synonym_analyzer" <-- change this
}
}
}
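The error comes from the difference between a token filter and an analyzer: the original settings define a filter named synonym but no analyzer with that name. A minimal Python sketch that lists the custom analyzers declared in an index settings body makes this visible. The helper name defined_analyzers is illustrative, and built-in analyzers such as standard are not listed because they are not part of the settings.

```python
def defined_analyzers(index_body):
    """Return the names of custom analyzers declared under settings.index.analysis."""
    analysis = index_body.get("settings", {}).get("index", {}).get("analysis", {})
    return set(analysis.get("analyzer", {}))


synonym_filter = {"synonym": {"type": "synonym", "synonyms": ["mezzo,centro"]}}

# Settings from the question: only a filter is defined.
original = {"settings": {"index": {"analysis": {"filter": synonym_filter}}}}

# Settings from the answer: an analyzer that uses the filter is added.
fixed = {
    "settings": {
        "index": {
            "analysis": {
                "analyzer": {
                    "synonym_analyzer": {
                        "type": "custom",
                        "filter": ["synonym"],
                        "tokenizer": "keyword",
                    }
                },
                "filter": synonym_filter,
            }
        }
    }
}

print(defined_analyzers(original))  # set()
print(defined_analyzers(fixed))     # {'synonym_analyzer'}
```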

Can't get Elasticsearch filter to work

I'm using Elasticsearch 2 with Rails 4 and the elasticsearch-model gem.
Everything is fine and even geo-point distance is working. However, I can't work out for the life of me how to make a simple boolean filter work. I have a simple boolean 'exclude_from_search_results' that (when true) should cause the record to be filtered from the results.
Here's my query in rails controller (without the filter):
@response = Firm.search(
query: {
bool: {
should: [
{ multi_match: {
query: params[:search],
fields: ['name^10', 'address_1', 'address_2', 'address_3', 'address_4', 'address_5', 'address_6'],
operator: 'or'
}
}
]
}
},
aggs: {types: {terms: {field: 'firm_type'}}}
)
I've added this both within the bool or query section, or outside it, but I either get NO documents, or all documents. (9000 should match)
Example:
@response = Firm.search(
query: {
bool: {
should: [
{ multi_match: {
query: params[:search],
fields: ['name^10', 'address_1', 'address_2', 'address_3', 'address_4', 'address_5', 'address_6'],
operator: 'or'
}
}
],
filter: {
term: {"exclude_from_search_results": "false"}
}
}
},
aggs: {types: {terms: {field: 'firm_type'}}}
)
I've also tried putting the filter clause in different places but either get error or no results. What am I doing wrong?? Probably missing something simple...
Here's my mapping:
"mappings" : {
"firm" : {
"dynamic" : "false",
"properties" : {
"address_1" : {
"type" : "string",
"index_options" : "offsets",
"analyzer" : "english"
},
"address_2" : {
"type" : "string",
"index_options" : "offsets",
"analyzer" : "english"
},
"address_3" : {
"type" : "string",
"index_options" : "offsets",
"analyzer" : "english"
},
"address_4" : {
"type" : "string",
"index_options" : "offsets",
"analyzer" : "english"
},
"address_5" : {
"type" : "string",
"index_options" : "offsets",
"analyzer" : "english"
},
"address_6" : {
"type" : "string",
"index_options" : "offsets",
"analyzer" : "english"
},
"exlude_from_search_results" : {
"type" : "boolean"
},
"firm_type" : {
"type" : "string",
"index" : "not_analyzed"
},
"location" : {
"type" : "geo_point"
},
"name" : {
"type" : "string",
"index_options" : "offsets",
"analyzer" : "english"
}
Any pointers greatly appreciated...
Your current query is doing an OR between your filter and the multi_match query. That's the reason you get all documents.
I suppose you want an AND between the filter and the multi_match query.
If that is the case, the following query works for me:
{
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "address1",
"fields": [
"name^10",
"address1",
"address2",
"address3",
"address4",
"address5",
"address6"
],
"operator": "or"
}
},
{
"term": {
"exclude_from_search_results": {
"value": "false"
}
}
}
]
}
},
"aggs": {
"types": {
"terms": {
"field": "name"
}
}
}
}
Hope this helps, thanks.
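An alternative placement, sketched in Python below, keeps the multi_match in must and puts the term clause in the bool query's filter section, so the flag restricts the results without affecting scoring. This is a variation on the answer above, not the answerer's exact query, and the function name firm_search_body is hypothetical. Note also that the mapping in the question spells the field exlude_from_search_results. If that spelling is really in the index, the field name passed to the filter has to match it, which is why it is a parameter here.

```python
import json


def firm_search_body(text, flag_field="exclude_from_search_results"):
    """Require the text match and filter out firms flagged for exclusion."""
    return {
        "query": {
            "bool": {
                "must": [
                    {
                        "multi_match": {
                            "query": text,
                            "fields": ["name^10"] + ["address_%d" % i for i in range(1, 7)],
                            "operator": "or",
                        }
                    }
                ],
                "filter": {"term": {flag_field: False}},
            }
        },
        "aggs": {"types": {"terms": {"field": "firm_type"}}},
    }


body = firm_search_body("smith")
print(json.dumps(body, indent=2))
```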

Elasticsearch postings highlighter failing for some search strings

I have a search that works well with most search strings, but fails spectacularly on others. Experimenting, it appears to fail when at least one word in the query doesn't match (like this made up search phrase), with the error:
{
"error": "SearchPhaseExecutionException[Failed to execute phase [query_fetch], all shards failed; shardFailures {[w3zfoix_Qi-xwpVGbCbQWw][ia_test][0]: ElasticsearchIllegalArgumentException[the field [content] should be indexed with positions and offsets in the postings list to be used with postings highlighter]}]",
"status": 400
}
The simplest search which gives this error is the one below:
POST /myindex/_search
{
"from" : 0,
"size" : 25,
"query": {
"filtered" : {
"query" : {
"multi_match" : {
"type" : "most_fields",
"fields": ["title", "content", "content.english"],
"query": "Box Fexye"
}
}
}
},
"highlight" : {
"fields" : {
"content" : {
"type" : "postings"
}
}
}
}
My query is more complicated than this, and I need to use the "postings" highlighter to pull out the best matching sentence from a document.
Indexing of the relevant fields looks like:
"properties" : {
"title" : {
"type" : "string",
"fields": {
"shingles": {
"type": "string",
"analyzer": "my_shingle_analyzer"
}
}
},
"content" : {
"type" : "string",
"analyzer" : "standard",
"fields": {
"english": {
"type": "string",
"analyzer": "my_english"
},
"shingles": {
"type": "string",
"analyzer": "my_shingle_analyzer"
}
},
"index_options" : "offsets",
"term_vector" : "with_positions_offsets"
}
}
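Because the error message is about how a field was indexed, it can help to list the index_options declared for every field and sub-field in the mapping. The Python sketch below walks a properties dict like the one above. The helper name index_options_by_field is illustrative, it only reports what the mapping declares, and whether the sub-fields are what triggers the failure is not established in this thread.

```python
def index_options_by_field(properties, prefix=""):
    """Map each field and multi-field name to its declared index_options (or None)."""
    result = {}
    for name, spec in properties.items():
        full_name = prefix + name
        result[full_name] = spec.get("index_options")
        # Multi-fields live under "fields" and are addressed as parent.child.
        result.update(index_options_by_field(spec.get("fields", {}), full_name + "."))
    return result


properties = {
    "title": {
        "type": "string",
        "fields": {"shingles": {"type": "string", "analyzer": "my_shingle_analyzer"}},
    },
    "content": {
        "type": "string",
        "analyzer": "standard",
        "fields": {
            "english": {"type": "string", "analyzer": "my_english"},
            "shingles": {"type": "string", "analyzer": "my_shingle_analyzer"},
        },
        "index_options": "offsets",
        "term_vector": "with_positions_offsets",
    },
}

for field, options in sorted(index_options_by_field(properties).items()):
    print(field, options)
```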

Resources