Elasticsearch completion suggester context for nested fields - elasticsearch

I'm working on a simple search app with a completion feature.
I need to somehow secure those suggestions, so I figured the simplest way would be to add a context to the completion suggester. My problem is that I don't know how to use a suggester context in nested fields.
This is what my mapping looks like, very simple: just three fields, one of them nested.
curl -XPUT 'http://localhost:9200/cr/_mapping/agreement_index' -d '{
"agreement_index": {
"properties": {
"agreement_name": {
"type": "string",
"fields": {
"suggest": {
"type": "completion",
"analyzer": "simple",
"payloads": false,
"preserve_separators": true,
"preserve_position_increments": true,
"max_input_length": 50,
"context": {
"permitted": {
"type": "category",
"path": "permitted",
"default": []
}
}
}
}
},
"permitted": {
"type": "integer"
},
"team": {
"type": "nested",
"dynamic": "false",
"properties": {
"email": {
"type": "string",
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed"
},
"suggest": {
"type": "completion",
"analyzer": "simple",
"payloads": false,
"preserve_separators": true,
"preserve_position_increments": true,
"max_input_length": 50,
"context": {
"permitted": {
"type": "category",
"path": "permitted",
"default": []
}
}
}
}
},
"name": {
"type": "string",
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed"
},
"suggest": {
"type": "completion",
"analyzer": "simple",
"payloads": false,
"preserve_separators": true,
"preserve_position_increments": true,
"max_input_length": 50,
"context": {
"permitted": {
"type": "category",
"path": "permitted",
"default": []
}
}
}
}
},
"permitted": {
"type": "integer"
}
}
}
}
}
}'
When indexing documents like this:
curl -XPUT 'http://localhost:9200/cr/agreement_index/1' -d '{
"agreement_name": "QWERTY",
"team": [{
"name": "Tomasz Sobkowiak",
"permitted": ["2"],
"email": "tsobkowiak#fake.com"
}],
"permitted": ["2"]
}'
I get the error below:
{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"one or more prefixes needed"}],"type":"illegal_argument_exception","reason":"one or more prefixes needed"},"status":400}
After removing the context from the completion suggesters in the nested fields, everything works fine.
So my question is: how can I use suggester contexts in nested fields with a path pointing to a field in the outer document? Is something like that even possible?

The problem is in your mapping: default cannot be left empty. You need to assign at least one default value in the mapping for the context suggester.
"context": {
"permitted": {
"type": "category",
"path": "permitted",
"default": [] // <-- defaults can not be empty, provide at least one default integer value
}
}
The value of the default field is used whenever no specific value is
provided for the given context. Note that a context is defined by at
least one value.
Also, in the document you are trying to index, you are using strings in permitted, whereas it is mapped as integer:
"permitted": ["2"] // <-- change this to "permitted":[2]

ElasticSearch 7.6.3 Java High Level Rest Client: Auto Suggest across multiple fields - How to implement

We have an index with the following fields, and there is a requirement to give the user auto-suggest by searching the data across all text and keyword fields in the index:
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 1,
"analysis": {
"filter": {
"autocomplete_filter": {
"type": "edge_ngram",
"min_gram": 1,
"max_gram": 20
}
}
}
},
"mappings": {
"properties": {
"id": {
"type": "text"
},
"title": {
"type": "text"
},
"date": {
"type": "date",
"format": "yyyy-MM-dd'T'HH:mm:ss.SSSZ"
},
"subject": {
"type": "text"
},
"title_suggest": {
"type": "completion",
"analyzer": "simple",
"preserve_separators": true,
"preserve_position_increments": true,
"max_input_length": 50
},
"subject_suggest": {
"type": "completion",
"analyzer": "simple",
"preserve_separators": true,
"preserve_position_increments": true,
"max_input_length": 50
},
"fieldOr": {
"type": "text"
},
"fieldsTa": {
"type": "text"
},
"notes": {
"type": "text"
},
"fileDocs": {
"type": "nested",
"properties": {
"fileName": {
"type": "text",
"analyzer": "autocomplete",
"search_analyzer": "standard"
},
"fileContent": {
"type": "text",
"analyzer": "autocomplete",
"search_analyzer": "standard"
},
"docType": {
"type": "keyword"
},
"opinionId": {
"type": "integer"
}
}
},
"fileMeta": {
"type": "nested",
"properties": {
"url": {
"type": "text"
},
"name": {
"type": "text"
}
}
}
}
}
}
I have tried the completion suggester, but it works with only one field. I created two fields with a *_suggest suffix in the index and tried to create the suggestion using completionSuggestion:
SuggestBuilders.completionSuggestion("my_index_suggest").text(input);
But it supports one field only. I am using ES 7.6.3 with the Java High Level Rest Client, and it works for one field. What changes do I need to make to support suggestions across multiple fields? Is this possible via a JSON search? If yes, I can create the JSON using XContentBuilder and do the auto-suggest.
Use copy_to to copy all your desired fields to one field, and perform your suggestion on top of it.
Here is the example from the documentation for copy_to:
PUT my_index
{
"mappings": {
"properties": {
"first_name": {
"type": "text",
"copy_to": "full_name"
},
"last_name": {
"type": "text",
"copy_to": "full_name"
},
"full_name": {
"type": "text"
}
}
}
}
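Applied to the index in the question, a sketch might look like the following. The suggest_all field name is made up here, and note that the question's settings define the edge_ngram filter but would also need an analyzer wired to it:

```sh
PUT my_index
{
  "settings": {
    "analysis": {
      "filter": {
        "autocomplete_filter": { "type": "edge_ngram", "min_gram": 1, "max_gram": 20 }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "autocomplete_filter"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title":   { "type": "text", "copy_to": "suggest_all" },
      "subject": { "type": "text", "copy_to": "suggest_all" },
      "suggest_all": {
        "type": "text",
        "analyzer": "autocomplete",
        "search_analyzer": "standard"
      }
    }
  }
}
```

A plain match query on suggest_all then behaves like an as-you-type search over both source fields.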
For illustration purposes I am using my own index mapping, which has just two fields, name and address. I will make autocomplete queries using prefix queries on both fields; you can include more fields similarly.
Index mapping
{
"employee": {
"mappings": {
"properties": {
"address": {
"type": "text"
},
"name": {
"type": "text"
}
}
}
}
}
Search query using Rest high-level client
public SearchResponse autosuggestSearch() throws IOException {
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
BoolQueryBuilder qb = QueryBuilders.boolQuery();
PrefixQueryBuilder namePQBuilder = QueryBuilders.prefixQuery("name", "usa");
PrefixQueryBuilder addressPQBuilder = QueryBuilders.prefixQuery("address", "usa");
qb.should(namePQBuilder);
qb.should(addressPQBuilder); //Similarly add more fields prefix queries.
sourceBuilder.query(qb);
SearchRequest searchRequest = new SearchRequest("employee").source(sourceBuilder);
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
System.out.println("Search JSON query \n" + searchRequest.source().toString()); //Generated ES search JSON.
return searchResponse;
}
For this example, the generated search JSON is:
{
"query": {
"bool": {
"should": [
{
"prefix": {
"address": {
"value": "usa",
"boost": 1.0
}
}
},
{
"prefix": {
"address": {
"value": "usa",
"boost": 1.0
}
}
}
],
"adjust_pure_negative": true,
"boost": 1.0
}
}
}
}
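As an aside on the completion-suggester route the question started with: a single _search request can carry several named suggesters, one per *_suggest field, so it is possible via a JSON search; the results just come back grouped per suggester rather than merged. A sketch against ES 7.x:

```sh
POST my_index/_search
{
  "suggest": {
    "title-suggest":   { "prefix": "some input", "completion": { "field": "title_suggest" } },
    "subject-suggest": { "prefix": "some input", "completion": { "field": "subject_suggest" } }
  }
}
```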

Elasticsearch 5.2 Context Suggester Type Definition - Error

I am having problems defining a context suggester in Elasticsearch 5.2.
This is how I try to do it:
curl -XPUT 'localhost:9200/world/port/_mapping' -d '{
"port": {
"properties": {
"name": {
"type": "string"
},
"suggest": {
"type": "completion",
"analyzer": "simple",
"payloads": true,
"preserve_separators": true,
"preserve_position_increments": true,
"max_input_length": 50,
"contexts": {
"type": {
"name": "port_type",
"type": "category",
"path": "name"
}
}
}
}
}
}'
I played around with the parameters, but it always ends up with this error:
{
"error":
{
"root_cause":
[{"type":"parse_exception","reason":"missing [name] in context mapping"}],
"type":"parse_exception","reason":"missing [name] in context mapping"
},
"status":400
}
I tried to solve it by googling, but without success.
What is the [name] the message is referring to?
Can you help me?
A couple of points:
contexts should be a JSON array
its elements should be plain JSON dictionaries
By the way, "payloads": true is one more setting that can be eliminated as unnecessary; the completion suggester in 5.x no longer supports payloads.
After digging into it a little, I made it work with the following command:
curl -XPUT 'localhost:9200/world' -d '
{
"mappings" : {
"port": {
"properties": {
"name": {
"type": "string"
},
"suggest": {
"type": "completion",
"analyzer": "simple",
"preserve_separators": true,
"preserve_position_increments": true,
"max_input_length": 50,
"contexts": [
{
"name": "port_type",
"type": "category",
"path": "name"
}
]
}
}
}
}
}'
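With "path": "name" in the context mapping, the category values are read from the name field at index time, so documents do not need to supply them explicitly. A sketch of indexing and querying (in 5.x, suggest moved into the _search body):

```sh
curl -XPUT 'localhost:9200/world/port/1' -d '{
  "name": "rotterdam",
  "suggest": ["Rotterdam", "Port of Rotterdam"]
}'

curl -XPOST 'localhost:9200/world/_search' -d '{
  "suggest": {
    "port-suggest": {
      "prefix": "rot",
      "completion": {
        "field": "suggest",
        "contexts": { "port_type": ["rotterdam"] }
      }
    }
  }
}'
```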

Elasticsearch.js analyzer error using custom analyzer

Using the latest version of elasticsearch.js, I am trying to create a custom path analyzer when indexing and creating the mapping for some posts.
The goal is to create keywords out of each segment of the path. As a start, however, I am simply trying to get the analyzer working.
Here is the elasticsearch.js create_mapped_index.js; you can see the custom analyzer near the top of the file:
var client = require('./connection.js');
client.indices.create({
index: "wcm-posts",
body: {
"settings": {
"analysis": {
"analyzer": {
"wcm_path_analyzer": {
"tokenizer": "wcm_path_tokenizer",
"type": "custom"
}
},
"tokenizer": {
"wcm_path_tokenizer": {
"type": "pattern",
"pattern": "/"
}
}
}
},
"mappings": {
"post": {
"properties": {
"id": { "type": "string", "index": "not_analyzed" },
"titles": {
"type": "object",
"properties": {
"main": { "type": "string" },
"subtitle": { "type": "string" },
"alternate": { "type": "string" },
"concise": { "type": "string" },
"seo": { "type": "string" }
}
},
"tags": {
"properties": {
"id": { "type": "string", "index": "not_analyzed" },
"name": { "type": "string", "index": "not_analyzed" },
"slug": { "type": "string" }
},
},
"main_taxonomies": {
"properties": {
"id": { "type": "string", "index": "not_analyzed" },
"name": { "type": "string", "index": "not_analyzed" },
"slug": { "type": "string", "index": "not_analyzed" },
"path": { "type": "string", "index": "wcm_path_analyzer" }
},
},
"categories": {
"properties": {
"id": { "type": "string", "index": "not_analyzed" },
"name": { "type": "string", "index": "not_analyzed" },
"slug": { "type": "string", "index": "not_analyzed" },
"path": { "type": "string", "index": "wcm_path_analyzer" }
},
},
"content_elements": {
"dynamic": "true",
"type": "nested",
"properties": {
"content": { "type": "string" }
}
}
}
}
}
}
}, function (err, resp, respcode) {
console.log(err, resp, respcode);
});
If the index property on those fields is set to "not_analyzed", or the index property is omitted, then index creation, mapping, and insertion of posts all work.
As soon as I try to use the custom analyzer on the main_taxonomies and categories path fields, as shown in the JSON above, I get this error:
response: '{"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"wrong value for index [wcm_path_analyzer] for field [path]"}],"type":"mapper_parsing_exception","reason":"Failed to parse mapping [post]: wrong value for index [wcm_path_analyzer] for field [path]","caused_by":{"type":"mapper_parsing_exception","reason":"wrong value for index [wcm_path_analyzer] for field [path]"}},"status":400}',
toString: [Function],
toJSON: [Function] } { error:
{ root_cause: [ [Object] ],
type: 'mapper_parsing_exception',
reason: 'Failed to parse mapping [post]: wrong value for index [wcm_path_analyzer] for field [path]',
caused_by:
{ type: 'mapper_parsing_exception',
reason: 'wrong value for index [wcm_path_analyzer] for field [path]' } },
status: 400 } 400
Here is an example of the two objects that need the custom analyzer on the path field. I pulled this example after inserting 15 posts into the Elasticsearch index without the custom analyzer:
"main_taxonomies": [
{
"id": "123",
"type": "category",
"name": "News",
"slug": "news",
"path": "/News/"
}
],
"categories": [
{
"id": "157",
"name": "Local News",
"slug": "local-news",
"path": "/News/Local News/",
"main": true
},
Up to this point I had googled similar questions, and most said that people were failing to put the analyzers in settings or to add the parameters to the body. I believe I have done this correctly.
I have also reviewed the elasticsearch.js documentation and tried to create a:
client.indices.putSettings({})
But for this to be used, the index needs to exist with the mappings, or it throws a 'no indices found' error.
I am not sure where to go from here; your suggestions are appreciated.
So the final analyzer is:
var client = require('./connection.js');
client.indices.create({
index: "wcm-posts",
body: {
"settings": {
"analysis": {
"analyzer": {
"wcm_path_analyzer": {
"type" : "pattern",
"lowercase": true,
"pattern": "/"
}
}
}
},
"mappings": {
"post": {
"properties": {
"id": { "type": "string", "index": "not_analyzed" },
"client_id": { "type": "string", "index": "not_analyzed" },
"license_id": { "type": "string", "index": "not_analyzed" },
"origin_id": { "type": "string" },
...
...
"origin_slug": { "type": "string" },
"main_taxonomies_path": { "type": "string", "analyzer": "wcm_path_analyzer", "search_analyzer": "standard" },
"categories_paths": { "type": "string", "analyzer": "wcm_path_analyzer", "search_analyzer": "standard" },
"search_tags": { "type": "string" },
// See the custom analyzer set here --------------------------^
I did determine that, at least for the path or pattern analyzers, complex nested fields or objects cannot be used; flattening the fields to "type": "string" was the only way to get this to work.
I ended up not needing a custom tokenizer, as the pattern analyzer is full-featured and already includes a tokenizer.
I chose the pattern analyzer because it breaks on the pattern, leaving individual terms, whereas the path_hierarchy tokenizer segments the path in different ways but does not create individual terms (I hope I'm correct in saying this; I base it on the documentation).
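A quick way to sanity-check what a custom analyzer emits is the _analyze API. A sketch against the index above (on 2.x the text to analyze can be sent as the raw request body, as shown; 5.x and later expect a JSON body with "analyzer" and "text" fields instead):

```sh
curl -XGET 'localhost:9200/wcm-posts/_analyze?analyzer=wcm_path_analyzer' -d '/News/Local News/'
# With pattern "/" plus the pattern analyzer's default lowercasing,
# the tokens should come back along the lines of: "news", "local news"
```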
Hope this helps someone else!
Steve
So I got it working... I think either the JSON objects were too complex, or it was the change of adding the analyzer to the field mappings that did the trick.
First I flattened the objects out to:
"main_taxonomies_path": "/News/",
"categories_paths": [ "/News/Local/", "/Business/Local/" ],
"search_tags": [ "montreal-3","laval-4" ],
Then I updated the analyzer to:
"settings": {
"analysis": {
"analyzer": {
"wcm_path_analyzer": {
"tokenizer": "wcm_path_tokenizer",
"type": "custom"
}
},
"tokenizer": {
"wcm_path_tokenizer": {
"type": "pattern",
"pattern": "/",
"replacement": ","
}
}
}
},
Notice that the analyzer 'type' is set to custom.
Then, when mapping these flattened fields:
"main_taxonomies_path": { "type": "string", "analyzer": "wcm_path_analyzer" },
"categories_paths": { "type": "string", "analyzer": "wcm_path_analyzer" },
"search_tags": { "type": "string" },
which, when searched, yields for these fields:
"main_taxonomies_path": "/News/",
"categories_paths": [ "/News/Local News/", "/Business/Local Business/" ],
"search_tags": [ "montreal-2", "laval-3" ],
So the custom analyzer does what it was set to do in this situation.
I'm not sure whether I could apply type object to main_taxonomies_path and categories_paths, so I will play around with this and see.
I will be refining the pattern searches to format the results differently, but I am happy to have this working.
For completeness, I will post my final custom pattern analyzer, mapping, and results once I've completed this.
Regards,
Steve

ElasticSearch term query vs query_string?

When I query my index with query_string, I get results, but when I query using a term query, I don't get any results:
{
"query": {
"bool": {
"must": [],
"must_not": [],
"should": [
{
"query_string": {
"default_field": "Printer.Name",
"query": "HL-2230"
}
}
]
}
},
"from": 0,
"size": 10,
"sort": [],
"aggs": {}
}
I know that term is not analyzed and query_string is analyzed, but Name is already indexed as "HL-2230", so why doesn't it match with a term query? I also tried searching with "hl-2230" and still didn't get any results.
EDIT: the mapping looks like the one below. Printer is a child of Product; not sure if this makes a difference.
{
"state": "open",
"settings": {
"index": {
"creation_date": "1453816191454",
"number_of_shards": "5",
"number_of_replicas": "1",
"version": {
"created": "1070199"
},
"uuid": "TfMJ4M0wQDedYSQuBz5BjQ"
}
},
"mappings": {
"Product": {
"properties": {
"index": "not_analyzed",
"store": true,
"type": "string"
},
"ProductName": {
"type": "nested",
"properties": {
"Name": {
"store": true,
"type": "string"
}
}
},
"ProductCode": {
"type": "string"
},
"Number": {
"index": "not_analyzed",
"store": true,
"type": "string"
},
"id": {
"index": "no",
"store": true,
"type": "integer"
},
"ShortDescription": {
"store": true,
"type": "string"
},
"Printer": {
"_routing": {
"required": true
},
"_parent": {
"type": "Product"
},
"properties": {
"properties": {
"RelativeUrl": {
"index": "no",
"store": true,
"type": "string"
}
}
},
"PrinterId": {
"index": "no",
"store": true,
"type": "integer"
},
"Name": {
"store": true,
"type": "string"
}
}
},
"aliases": []
}
}
As per the mapping provided by you above:
"Name": {
"store": true,
"type": "string"
}
Name is analyzed, so HL-2230 is split into two tokens, hl and 2230 (the standard analyzer also lowercases). That's why the term query is not working while query_string is: a term query searches for the exact term HL-2230, which is not in the index.
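Two common fixes, sketched against the mapping above (the index name my_index is a placeholder): either use a match query, which analyzes the input the same way the field was analyzed, or reindex with a not_analyzed raw subfield and run the term query against that:

```sh
# Option 1: match query, the input goes through the same analysis as the field
curl -XPOST 'localhost:9200/my_index/Printer/_search' -d '{
  "query": { "match": { "Name": "HL-2230" } }
}'

# Option 2: add a raw subfield (requires reindexing), then use term
#   "Name": { "type": "string", "store": true, "fields": {
#       "raw": { "type": "string", "index": "not_analyzed" } } }
curl -XPOST 'localhost:9200/my_index/Printer/_search' -d '{
  "query": { "term": { "Name.raw": "HL-2230" } }
}'
```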

copy_to field referring parent field in elasticsearch

Here is my mapping for the Product document type. I have "copy_to": "product_all" in the mapping.
I expect the value of label in brand to be copied into product_all in product. Is this the correct way of referring to a field in the outer object from an inner object (i.e., product_all in type product from brand)? The mapping below doesn't work: I don't get results back for a query on the product_all field with the value of label in brand. Am I missing something?
{
"product": {
"_timestamp": {
"enabled": true,
"store": true
},
"_all": {
"enabled": false
},
"dynamic": "strict",
"properties": {
"brand": {
"properties": {
"id": {
"type": "long"
},
"label": {
"type": "multi_field",
"fields": {
"label": {
"type": "string",
"index_analyzer": "productAnalyzer",
"search_analyzer": "productAnalyzer"
},
"raw": {
"type": "string",
"index": "not_analyzed"
}
},
"copy_to": "product_all"
}
}
},
"product_all": {
"type": "string",
"index_analyzer": "productAnalyzer",
"search_analyzer": "productAnalyzer"
}
}
}
}
I moved "copy_to": "product_all" inside label.label of the multi_field, and now it works:
"label": {
"type": "multi_field",
"fields": {
"label": {
"type": "string",
"copy_to": "product_all"
"index_analyzer": "productAnalyzer",
"search_analyzer": "productAnalyzer"
},
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
}
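To verify the fix, a hypothetical search against the copied field (the index name my_index is a placeholder):

```sh
curl -XPOST 'localhost:9200/my_index/product/_search' -d '{
  "query": { "match": { "product_all": "some brand label" } }
}'
```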
