Elasticsearch: Search in keywords ignoring case and accents (via aggregation)

I can search for specific keywords on indexes like this:
GET */_search/?
{
"query": {
"match_all": {}
},
"size": 0,
"aggs": {
"TECH.keyword": {
"terms": {
"field": "TECH.keyword",
"include": ".*mine.*",
"order": {
"_count": "desc"
},
"size": 20
}
}
}
}
Using this query, I get all entries that have "mine" in their TECH.keyword field, ordered by "_count": "desc". That part works.
The actual problem is that the index can contain mine, Mine, MINE, or even miné in TECH.keyword, and I would like to return them all.
Is there a way to search in keywords while ignoring case and accents?
The current mapping is:
"TECH": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},

You should be able to accomplish this with a normalizer. You can't use an analyzer on keyword fields, but a normalizer lets you apply the lowercase and asciifolding filters to the keyword value before it is indexed.
https://www.elastic.co/guide/en/elasticsearch/reference/6.4/normalizer.html
PUT index
{
"settings": {
"analysis": {
"normalizer": {
"my_normalizer": {
"type": "custom",
"char_filter": [],
"filter": ["lowercase", "asciifolding"]
}
}
}
},
"mappings": {
"_doc": {
"properties": {
"foo": {
"type": "keyword",
"normalizer": "my_normalizer"
}
}
}
}
}
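To see why the normalizer makes the four spellings from the question fall into a single bucket, here is a rough Python sketch of what lowercase plus asciifolding does to a keyword value. It only approximates Elasticsearch's asciifolding filter (it strips combining marks after Unicode decomposition), and the helper name fold_keyword is hypothetical, not part of any Elasticsearch API.

```python
import unicodedata
from collections import Counter


def fold_keyword(value: str) -> str:
    """Approximate a lowercase + asciifolding normalizer (hypothetical helper)."""
    lowered = value.lower()
    # Decompose accented characters into base letter + combining mark,
    # then drop the combining marks. The real asciifolding filter covers more cases.
    decomposed = unicodedata.normalize("NFKD", lowered)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Sample values from the question; after folding they share one term.
values = ["mine", "Mine", "MINE", "miné"]
buckets = Counter(fold_keyword(v) for v in values)
print(buckets)
```

Since the include pattern of a terms aggregation is matched against the indexed terms, it would presumably need to be written in lowercase and without accents (for example .*mine.*) once the normalizer is in place, and documents indexed before the mapping change would need to be reindexed.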

Related

Elasticsearch query with terms on multiple fields and a must_not match on multiple fields

My requirements are:
I need to use multiple fields in must_not match conditions
I need to use multiple fields in a terms query
When I run the query below, I get the following error for must_not:
"reason": "[match] query doesn't support multiple fields, found [app_rating] and [satisfaction_rating]"
For the terms on multiple fields I also get an error:
"reason": "Expected [START_OBJECT] under [should], but got a [START_ARRAY] in [MyBuckets]",
How can I correct the query?
{
"size":0,
"_source":["comments.keyword"],
"query":{
"bool": {
"must": [
{"match":{"source.keyword": "ONA"}}
],
"must_not":[
{"match":{"app_rating":"0","satisfaction_rating":"0","usability_rating": "0"}}
]
}
},
"aggs": {
"MyBuckets": {
"should":[{
"terms": {
"fields": ["comments.keyword"]
}
},
{
"terms":{
"fields": ["app_rating"]
}
},
{
"terms":{
"fields": ["satisfaction_rating"]
}
},
{
"terms":{
"fields": ["usability_rating"]
}
}
],
"order":{
"_count": "desc"
},
"size": "10"
}
}
}
Below are the sample mapping details:
{
"mapping": {
"properties": {
"Id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"app_rating": {
"type": "long"
},
"comments": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"conversation_rating": {
"type": "long"
},
"id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"satisfaction_rating": {
"type": "long"
},
"source": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"timestamp": {
"type": "long"
},
"usability_rating": {
"type": "long"
}
}
}
}
You can't use multiple fields in a single match clause, so you have to add one match clause per field inside must_not. You are also trying to run terms aggregations on several fields, but the syntax is not correct: "should" is not valid under "aggs", and each aggregation has to be a named object, which is what causes the exception.
Can you provide your index mapping and sample docs so that I can provide a working example?
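Going by the mapping posted in the question, a corrected request body might look like the following sketch. It builds the body as a Python dict so the structure is easy to check. The aggregation names (by_app_rating and so on) are made up for illustration, and the sketch assumes the intent is to exclude any document where at least one rating is 0.

```python
import json

rating_fields = ["app_rating", "satisfaction_rating", "usability_rating"]
agg_fields = ["comments.keyword"] + rating_fields

body = {
    "size": 0,
    "query": {
        "bool": {
            "must": [{"match": {"source.keyword": "ONA"}}],
            # One match clause per field: a single match cannot name several fields.
            "must_not": [{"match": {field: 0}} for field in rating_fields],
        }
    },
    # One named terms aggregation per field; "should" is not valid under "aggs".
    "aggs": {
        f"by_{field.replace('.', '_')}": {
            "terms": {"field": field, "size": 10, "order": {"_count": "desc"}}
        }
        for field in agg_fields
    },
}

print(json.dumps(body, indent=2))
```

If the goal is instead to exclude only documents where all three ratings are 0, the three match clauses could be wrapped in a single bool query with must inside must_not.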

Elasticsearch search for characters with diacritical marks

I have two Vietnamese values, "mắt biếc" and "mật mã", in an index called books.
In the books index, I use asciifolding to transform "mắt biếc" into "mat biec" and "mật mã" into "mat ma".
I need to query these values for the term "mắt". But both get the same score, and I want "mắt biếc" to score higher than "mật mã".
How can I do that in Elasticsearch?
You can use a function_score query.
Try this (based on version 7.x):
GET my_index/_search
{
"query": {
"function_score": {
"query": {
"match": {
"title": "mắt"
}
},
"functions": [
{
"filter": {
"term": {
"title.keyword": {
"value": "mắt biếc"
}
}
},
"weight": 30
}
],
"max_boost": 30,
"score_mode": "max",
"boost_mode": "multiply"
}
}
}
Mappings example
PUT my_index
{
"settings": {
"analysis": {
"analyzer": {
"product_analyzer": {
"tokenizer": "standard",
"filter": [
"asciifolding"
]
}
}
}
},
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "product_analyzer",
"fields": {
"keyword": {
"type": "keyword"
}
}
},
"price": {
"type": "keyword"
},
"author": {
"type": "keyword"
},
"publisher": {
"type": "keyword"
}
}
}
}
To use title.keyword on an existing index, you have to update your mappings first.
Update query:
POST my_index/_mapping
{
"properties": {
"title": {
"type": "text",
"analyzer": "product_analyzer",
"fields": {
"keyword": {
"type": "keyword"
}
}
}
}
}
Then update all documents so that existing ones pick up the new sub-field:
POST my_index/_update_by_query?conflicts=proceed
Hope this helps
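As a rough illustration of how the settings in the query above combine, the following Python sketch mimics score_mode "max" and boost_mode "multiply" with a max_boost cap. It assumes that a document matching no function filter gets a neutral factor of 1. The base score of 1.5 is made up, and the helper function_score is hypothetical.

```python
def function_score(query_score, matched_weights, max_boost=30.0):
    """Sketch of score_mode=max combined with boost_mode=multiply."""
    # Assumption: with no matching function, the factor is neutral (1.0).
    factor = max(matched_weights) if matched_weights else 1.0
    # max_boost caps the function factor before it is combined with the query score.
    factor = min(factor, max_boost)
    return query_score * factor


base = 1.5  # made-up relevance score; both titles fold to "mat" and tie
mat_biec = function_score(base, [30.0])  # title.keyword matches the term filter
mat_ma = function_score(base, [])        # no function filter matches
print(mat_biec, mat_ma)
```

Under these assumptions the exact-title document ends up well above the other one even though their text scores are identical.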

Boost score based on integer value - Elasticsearch

I'm not very experienced with ElasticSearch and would like to know how to boost a search based on a certain integer value.
This is an example of a document:
{
"_index": "links",
"_type": "db1",
"_id": "mV32vWcBZsblNn1WqTcN",
"_score": 8.115617,
"_source": {
"url": "example.com",
"title": "Example website",
"description": "This is an example website, used for various of examples around the world",
"likes": 9,
"popularity": 543,
"tags": [
{
"name": "example",
"votes": 5
},
{
"name": "test",
"votes": 2
},
{
"name": "testing",
"votes": 1
}
]
}
}
Now in this particular search, the focus is on the tags and I would like to know how to boost the _score and multiply it by the integer in the votes under tags.
If this is not possible (or very hard to achieve), I would simply like to know how to boost the _score by the votes (not under tags)
For example, add 0.1 to the _score for each vote.
This is the current search query I'm using (for searching tags only):
{
"query": {
"nested": {
"path": "tags",
"query": {
"bool":{
"should":{
"match":{
"tags.name":"example,testing,something else"
}
}
}
}
}
}
}
I couldn't find much online, and hope someone can help me out.
How do I boost the _score with an integer value?
Update
For more info, here is the mapping:
{
"links": {
"mappings": {
"db1": {
"properties": {
"url": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"title": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"description": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"likes": {
"type": "long"
},
"popularity": {
"type": "long"
},
"tags": {
"type": "nested",
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"votes": {
"type": "long"
}
}
}
}
}
}
}
}
Update 2
Changed the tags.likes/tags.dislikes to tags.votes, and added a nested property to the tags
This took a long time to figure out. I have learnt so much on my way there.
Here is the final result:
{
"query": {
"nested": {
"path": "tags",
"query": {
"function_score": {
"query": {
"bool": {
"should": [
{
"match": {
"tags.name": "example"
}
},
{
"match": {
"tags.name": "testing"
}
},
{
"match": {
"tags.name": "test"
}
}
]
}
},
"functions": [
{
"field_value_factor": {
"field": "tags.votes"
}
}
],
"boost_mode": "multiply"
}
}
}
}
}
Using an array in should helped a lot, and I was glad I could combine it with function_score.
You are looking for the function score query: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html
and field value factor: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html#function-field-value-factor
Snippet from the documentation:
GET /_search
{
"query": {
"function_score": {
"field_value_factor": {
"field": "tags.dislikes",
"factor": 1.2,
"modifier": "sqrt",
"missing": 1
}
}
}
}
Alternatively, use script_score, since your tags field is nested (I'm not sure whether field_value_factor works with nested structures).
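To make the arithmetic concrete, here is a small Python sketch of what field_value_factor computes for the parameters in the documentation snippet above (factor 1.2, modifier sqrt, missing 1). Only the "sqrt" and "none" modifiers are covered, and the query score of 2.0 is made up.

```python
import math


def field_value_factor(value, factor=1.2, modifier="sqrt", missing=1):
    """Sketch of field_value_factor: modifier(factor * field value)."""
    v = missing if value is None else value  # "missing" replaces an absent field
    scaled = factor * v
    if modifier == "sqrt":
        return math.sqrt(scaled)
    if modifier == "none":
        return scaled
    raise ValueError(f"modifier not covered by this sketch: {modifier}")


query_score = 2.0  # made-up relevance score
for votes in (5, 2, 1, None):
    # boost_mode "multiply": the final score is query score times function value
    print(votes, query_score * field_value_factor(votes))
```

With only "field" set, as in the final query of the question, the factor defaults to 1 and no modifier is applied, so the score is simply multiplied by the number of votes.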

ElasticSearch - Sort does not work

I'm trying to run a search and sort the results, but I'm getting an error and I don't know why.
EDIT - I'll provide my full mappings.
"myindex": {
"mappings": {
"mytype": {
"dynamic_templates": [
{
// Dynamic templates here!
}
],
"properties": {
"fieldid": {
"type": "keyword",
"store": true
},
"fields": {
"properties": {
"myfield": {
"type": "text",
"fields": {
"sort": {
"type": "keyword",
"ignore_above": 256
}
},
"analyzer": "myanalyzer"
}
}
},
"isDirty": {
"type": "boolean"
}
}
}
}
}
}
When I perform a search with sorting, like this:
POST /index/_search
{
"sort": [
{ "myfield.sort" : {"order" : "asc"}}
]
}
I get the following error:
{
"error": {
"root_cause": [
{
"type": "query_shard_exception",
"reason": "No mapping found for [myfield.sort] in order to sort on",
"index_uuid": "VxyKnppiRJCrrnXfaGAEfA",
"index": "index"
}
]
},
"status": 400
}
I'm following the Elasticsearch documentation:
DOCUMENTATION
I also checked this link:
DOCUMENTATION
Can someone help me?
Hmm, it might be that your mapping isn't set up properly. I followed along using the following:
PUT /your-index/
{
"settings": {
"number_of_replicas": "1",
"number_of_shards": "3",
"analysis": {
"analyzer": {
"ID": {
"type": "custom",
"tokenizer": "keyword",
"filter": ["lowercase"]
}
}
}
},
"mappings": {
"thingy": {
"properties": {
"myfield": {
"type": "text",
"analyzer": "ID",
"fields": {
"sort": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
To double-check that the index actually has the myfield.sort field, look at:
GET /your-index/thingy/_mapping
Then index a document (no need to specify the sub-fields; Elasticsearch will populate them for you):
POST /your-index/thingy/
{
"myfield": "some-value"
}
Now I can search with the following:
POST /your-index/thingy/_search
{
"sort": [
{ "myfield.sort": { "order": "asc" } }
]
}
So be sure to check:
Naming and typos (you never know)
Your mapping (does it have the "myfield.sort" field?)
Are you searching in the correct index?
Hopefully this helps.
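Going by the mapping posted in the question, myfield sits inside an object field named "fields", so its sort sub-field would presumably be addressed as fields.myfield.sort rather than myfield.sort. The search is also sent to /index while the mapping belongs to myindex, although that may just be anonymization. The Python sketch below walks a trimmed copy of that mapping and lists the full dotted path of every field. The helper field_paths is hypothetical.

```python
def field_paths(properties, prefix=""):
    """Yield the full dotted path of every field and multi-field in a mapping."""
    for name, spec in properties.items():
        path = prefix + name
        if "type" in spec:
            yield path
        # Object fields nest further properties; multi-fields live under "fields".
        yield from field_paths(spec.get("properties", {}), path + ".")
        yield from field_paths(spec.get("fields", {}), path + ".")


# Trimmed copy of the "properties" block from the question's mapping.
properties = {
    "fieldid": {"type": "keyword", "store": True},
    "fields": {
        "properties": {
            "myfield": {
                "type": "text",
                "fields": {"sort": {"type": "keyword", "ignore_above": 256}},
                "analyzer": "myanalyzer",
            }
        }
    },
    "isDirty": {"type": "boolean"},
}

paths = list(field_paths(properties))
print(paths)
```

Comparing the printed paths with the name used in the sort clause is a quick way to spot this kind of mismatch.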

Elasticsearch layered ordering

I would like to be able to return typeahead items in a certain order. For example, search for Para should return:
Paracetamol
Parafin
LIQUID PARAFFIN
ISOMETHEPTENE WITH PARACETAMOL
1) The suggestions that begin with the search term para should be ordered at the top and in alphabetical order
2) The rest of the items should appear below and also in alphabetical order
Is this possible with Elasticsearch?
Update
What if I wanted the output to be like this:
Paracetamol
Parafin
Amber Paraffin
ISOMETHEPTENE WITH PARACETAMOL
LIQUID PARAFFIN
So all the terms that start with the prefix are at the top, and everything else follows in alphabetical order.
This is my suggestion (also, you need to enable scripting):
PUT /test
{
"settings": {
"analysis": {
"analyzer": {
"autocomplete": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"standard",
"lowercase",
"ngram"
]
},
"search_ngram": {
"type": "custom",
"tokenizer": "keyword",
"filter": "lowercase"
}
},
"filter": {
"ngram": {
"type": "ngram",
"min_gram": 2,
"max_gram": 15
}
}
}
},
"mappings": {
"test": {
"properties": {
"text": {
"type": "string",
"index_analyzer": "autocomplete",
"search_analyzer": "search_ngram",
"index_options": "positions",
"fields": {
"not_analyzed_sorting": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
}
POST test/test/_bulk
{"index":{"_id":1}}
{"text":"Paracetamol"}
{"index":{"_id":2}}
{"text":"Paracetamol xxx yyy zzz"}
{"index":{"_id":3}}
{"text":"Parafin"}
{"index":{"_id":4}}
{"text":"LIQUID PARAFFIN"}
{"index":{"_id":5}}
{"text":"ISOMETHEPTENE WITH PARACETAMOL"}
GET /test/test/_search
{
"query": {
"match": {
"text": "Para"
}
},
"sort": [
{
"_script": {
"type": "number",
"script": "termInfo=_index[field_to_search].get(term_to_search.toLowerCase(),_POSITIONS);if (termInfo) {for(pos in termInfo){return pos.position}};return 0;",
"params": {
"term_to_search": "Para",
"field_to_search": "text"
},
"order": "asc"
}
},
{
"text.not_analyzed_sorting": {
"order": "asc"
}
}
]
}
UPDATE
For your updated question (though I would have preferred a separate post), use the following query:
{
"query": {
"match": {
"text": "Para"
}
},
"sort": [
{
"_script": {
"type": "number",
"script": "termInfo=_index[field_to_search].get(term_to_search.toLowerCase(),_POSITIONS);if (termInfo) {for(pos in termInfo){if (pos.position==0) return pos.position; else return java.lang.Integer.MAX_VALUE}};return java.lang.Integer.MAX_VALUE;",
"params": {
"term_to_search": "Para",
"field_to_search": "text"
},
"order": "asc"
}
},
{
"text.not_analyzed_sorting": {
"order": "asc"
}
}
]
}
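The ordering asked for in the updated question can be stated as a two-part sort key: first whether the text starts with the search term, then the text itself. The Python sketch below only illustrates that target ordering on the client side with the sample values from the question. It is not a replacement for the script sort, and the helper layered_order is hypothetical.

```python
def layered_order(items, prefix):
    """Prefix matches first, everything else after; each layer alphabetical."""
    p = prefix.lower()

    def key(text):
        lowered = text.lower()
        # Layer 0: starts with the search term; layer 1: everything else.
        return (0 if lowered.startswith(p) else 1, lowered)

    return sorted(items, key=key)


items = [
    "LIQUID PARAFFIN",
    "Parafin",
    "ISOMETHEPTENE WITH PARACETAMOL",
    "Amber Paraffin",
    "Paracetamol",
]
for text in layered_order(items, "Para"):
    print(text)
```

In the query, the script sort plays the role of the first key component and the text.not_analyzed_sorting sort plays the role of the second.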
