elasticsearch: can I define synonyms with boost?

Let's say A, B, and C are synonyms. I want to define B as "closer" to A than C is,
so that when I search for the keyword A, A comes first in the results, B comes second, and C comes last.
Any help?

There is no search-time mechanism (as of yet) to differentiate between matches on a synonym and matches on the original term. This is because, when a field is indexed, its synonyms are placed into the inverted index alongside the original term, leaving all words equal.
This is not to say, however, that you cannot do some magic at index time to glean the information you want.
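To see why the matches end up indistinguishable, here is a toy sketch in Python of an inverted index with index-time synonym expansion. The `SYNONYMS` map and the `analyze` / `build_index` helpers are made up for illustration and are not Elasticsearch APIs; real analysis also tracks positions and frequencies, and splits text differently.

```python
from collections import defaultdict

# Hypothetical synonym map, standing in for a real synonyms file.
SYNONYMS = {"elated": ["gleeful", "joyful"]}

def analyze(text, expand_synonyms):
    """Lowercase, split on whitespace, optionally add synonyms of each token."""
    tokens = []
    for token in text.lower().split():
        tokens.append(token)
        if expand_synonyms:
            tokens.extend(SYNONYMS.get(token, []))
    return tokens

def build_index(docs, expand_synonyms):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text, expand_synonyms):
            index[token].add(doc_id)
    return dict(index)

docs = {1: "elated"}
with_synonyms = build_index(docs, expand_synonyms=True)
without_synonyms = build_index(docs, expand_synonyms=False)

# The original term and its synonyms share the same posting list,
# so a search cannot tell which one was in the source text.
print(with_synonyms)
print(without_synonyms)
```

In the expanded index, "elated" and "gleeful" point at exactly the same documents, which is why a second field without synonyms is needed to recover the distinction.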
Create an index with two analyzers: one with a synonym filter, and one without.
PUT /synonym_test/
{
  "settings": {
    "analysis": {
      "analyzer": {
        "no_synonyms": {
          "tokenizer": "lowercase"
        },
        "synonyms": {
          "tokenizer": "lowercase",
          "filter": ["synonym"]
        }
      },
      "filter": {
        "synonym": {
          "type": "synonym",
          "format": "wordnet",
          "synonyms_path": "prolog/wn_s.pl"
        }
      }
    }
  }
}
Use a multi-field mapping so that the field of interest is indexed twice:
PUT /synonym_test/mytype/_mapping
{
  "properties": {
    "mood": {
      "type": "multi_field",
      "fields": {
        "syn": { "type": "string", "analyzer": "synonyms" },
        "no_syn": { "type": "string", "analyzer": "no_synonyms" }
      }
    }
  }
}
Index a test document:
POST /synonym_test/mytype/1
{
  "mood": "elated"
}
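At search time, query both fields. A synonym can only match mood.syn, while the original term also matches mood.no_syn and therefore scores higher. Note that the boost of 3 below sits on the mood.syn clause; move it to the mood.no_syn clause to favour exact matches more strongly.
GET /synonym_test/mytype/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "mood.syn": { "query": "gleeful", "boost": 3 } } },
        { "match": { "mood.no_syn": "gleeful" } }
      ]
    }
  }
}
This results in "_score": 0.2696457.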
Searching for the original term returns a better score:
GET /synonym_test/mytype/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "mood.syn": { "query": "elated", "boost": 3 } } },
        { "match": { "mood.no_syn": "elated" } }
      ]
    }
  }
}
This results in "_score": 0.6558018.
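If you build these requests from application code, a small helper keeps the two clauses in sync. This is only a minimal sketch in Python: the function name and its parameters are my own, and the `.syn` / `.no_syn` suffixes assume the multi-field mapping above. It builds the request body and leaves sending it to whatever client you use.

```python
import json

def synonym_aware_query(field, text, syn_boost=1.0, exact_boost=1.0):
    """Build a bool/should query over the synonym and non-synonym sub-fields."""
    return {
        "query": {
            "bool": {
                "should": [
                    # Matches the original term and any of its synonyms.
                    {"match": {f"{field}.syn": {"query": text, "boost": syn_boost}}},
                    # Matches only the term as it appeared in the source text.
                    {"match": {f"{field}.no_syn": {"query": text, "boost": exact_boost}}},
                ]
            }
        }
    }

body = synonym_aware_query("mood", "gleeful", syn_boost=1.0, exact_boost=3.0)
print(json.dumps(body, indent=2))
```

Putting the larger boost on the `no_syn` clause, as in the usage line, should widen the gap between exact matches and synonym matches.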

Related

Elasticsearch - use a field match to boost only and not to fetch the document

I have a query phrase that needs to match in any of the fields name, summary or description, or to match exactly on the name field.
Now I have one more field, brand. A match in this field should be used only to boost results, meaning that if there is a match only in the brand field, the doc should not be in the result set.
To handle the case without brand, I have the query below:
{
  "query": {
    "bool": {
      "minimum_should_match": 1,
      "should": [
        {
          "multi_match": {
            "query": "Cadbury chocklate milk",
            "fields": ["name", "summary", "description"]
          }
        },
        {
          "term": {
            "name_keyword": {
              "value": "Cadbury chocklate milk"
            }
          }
        }
      ]
    }
  }
}
This works fine for me.
How do I fetch the data using the same query but boost docs that have brand:cadbury, without increasing the recall set (that is, without returning docs that match only on brand:cadbury)?
Thanks!
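Nesting a bool query inside must should work for you: the must clause decides which documents match, while the top-level should clause on brand only adds to the score of documents that already matched.
multi_match has multiple types; for phrase matching you have to use type: phrase.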
{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "should": [
              {
                "multi_match": {
                  "type": "phrase",
                  "query": "Cadbury chocklate milk",
                  "fields": ["name", "summary", "description"]
                }
              },
              {
                "term": {
                  "name_keyword": {
                    "value": "Cadbury chocklate milk"
                  }
                }
              }
            ]
          }
        }
      ],
      "should": {
        "term": {
          "brand": {
            "value": "cadbury"
          }
        }
      }
    }
  }
}
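The same shape can be generated for any set of matching and boost-only clauses. Below is a hedged sketch in Python; `boost_only_query` and its parameter names are hypothetical, and the explicit `minimum_should_match` on the inner bool just spells out that at least one matching clause is required.

```python
import json

def boost_only_query(matching_clauses, boosting_clauses):
    """Require one of matching_clauses; let boosting_clauses only affect the score."""
    return {
        "query": {
            "bool": {
                # A document must satisfy at least one matching clause to be returned.
                "must": [
                    {"bool": {"should": matching_clauses, "minimum_should_match": 1}}
                ],
                # Next to a must clause, these are optional and only add to the score.
                "should": boosting_clauses,
            }
        }
    }

phrase = "Cadbury chocklate milk"
body = boost_only_query(
    matching_clauses=[
        {"multi_match": {"type": "phrase", "query": phrase,
                         "fields": ["name", "summary", "description"]}},
        {"term": {"name_keyword": {"value": phrase}}},
    ],
    boosting_clauses=[{"term": {"brand": {"value": "cadbury"}}}],
)
print(json.dumps(body, indent=2))
```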

Can only use wildcard queries on keyword, text and wildcard fields - not on [id] which is of type [long]

Elasticsearch version 7.13.1
GET test/_mapping
{
"test" : {
"mappings" : {
"properties" : {
"id" : {
"type" : "long"
},
"name" : {
"type" : "text"
}
}
}
}
}
POST test/_doc/101
{
"id":101,
"name":"hello"
}
POST test/_doc/102
{
"id":102,
"name":"hi"
}
Wildcard Search pattern
GET test/_search
{
"query": {
"query_string": {
"query": "*101* *hello*",
"default_operator": "AND",
"fields": [
"id",
"name"
]
}
}
}
Error is : "reason" : "Can only use wildcard queries on keyword, text and wildcard fields - not on [id] which is of type [long]",
It was working fine in version 7.6.0.
What changed in the latest ES, and how can this issue be resolved?
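It's not directly possible to run wildcard queries on numeric data types. It is better to index those integers as strings.
Since the type of an existing field cannot be changed in place, create a new index with a mapping like the one below (using your own index and field names) and reindex your data: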
PUT /my-index
{
"mappings": {
"properties": {
"code": {
"type": "text"
}
}
}
}
Otherwise, if you want to perform a partial search, you can use the edge n-gram tokenizer.
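To give an idea of what edge n-grams are, here is a simplified model in Python. It is not the tokenizer's actual implementation (the real one can also split the input into words first); the `min_gram` and `max_gram` names mirror the tokenizer's settings, and the default values used here are arbitrary.

```python
def edge_ngrams(text, min_gram=1, max_gram=10):
    """Return the prefixes of text whose length is between min_gram and max_gram."""
    longest = min(len(text), max_gram)
    return [text[:n] for n in range(min_gram, longest + 1)]

# Every prefix becomes its own indexed term.
print(edge_ngrams("101"))
print(edge_ngrams("hello", min_gram=2, max_gram=4))
```

Because every prefix is indexed as a term, an ordinary match on "10" can find the value 101 without any wildcard. Note that this covers prefixes only, not a pattern with a leading wildcard such as `*101*`.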

ElasticSearch filtering for a tag in array

I've got a bunch of events that are tagged for their audience:
{ id = 123, audiences = ["Public", "Lecture"], ... }
I'm trying to write an ElasticSearch query with filtering, so that the search only returns events that have an exact entry of "Public" in that audiences array (and doesn't return events that are "Not Public").
How do I do that?
This is what I have so far, but it's returning zero results, even though I definitely have "Public" events:
curl -XGET 'http://localhost:9200/events/event/_search' -d '
{
"query" : {
"filtered" : {
"filter" : {
"term" : {
"audiences": "Public"
}
},
"query" : {
"match" : {
"title" : "[searchterm]"
}
}
}
}
}'
You could use this mapping for your content type:
{
"your_index": {
"mappings": {
"your_type": {
"properties": {
"audiences": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
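not_analyzed: Index this field, so it is searchable, but index the value exactly as specified. Do not analyze it.
With this mapping, the term filter has to match the stored value exactly ("Public"). If you keep the field analyzed instead, use the lowercase term value ("public") in the search query, because the default analyzer lowercases tokens at index time while the term filter does not analyze its input.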
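The difference between the two kinds of field can be illustrated with a rough simulation in Python. The helpers below are stand-ins I made up: `analyzed_terms` only approximates the default analyzer (lowercase, split on whitespace), and `term_matches` models a term filter as an exact lookup of its unanalyzed input.

```python
def analyzed_terms(value):
    """Rough stand-in for the default analyzer: lowercase and split on whitespace."""
    return value.lower().split()

def not_analyzed_terms(value):
    """A not_analyzed field indexes the whole value as one unchanged term."""
    return [value]

def term_matches(term, values, to_terms):
    """Compare the term, unanalyzed, against the terms indexed for the array."""
    indexed = {t for v in values for t in to_terms(v)}
    return term in indexed

public_event = ["Public", "Lecture"]
private_event = ["Not Public"]

# Analyzed field: "Public" was indexed as "public", so the original filter misses.
print(term_matches("Public", public_event, analyzed_terms))
# Lowercasing the term finds it, but also hits "Not Public".
print(term_matches("public", public_event, analyzed_terms))
print(term_matches("public", private_event, analyzed_terms))
# not_analyzed field: only the exact value matches.
print(term_matches("Public", public_event, not_analyzed_terms))
print(term_matches("Public", private_event, not_analyzed_terms))
```

Under these assumptions, only the not_analyzed variant separates "Public" from "Not Public", which is what the question asks for.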

How to search multiple fields using * by using Common Terms

I have the following mapping:
"mappings": {
"mydoctype": {
....
"properties": {
"title": {
"properties": {
"en": {
...
},
"zh_CN": {
...
},
"zh_TW": {
...
}
...
}
},
...
}
}
}
I would like to perform Common Terms on the title.* fields, but the following query does not return any results or error message.
"common" : {
"title.*" : {
"query" : "sleep",
"cutoff_frequency" : 0.001
}
}
However, if I change the above "title.*" to "title.en", then I am able to get returned results.
How can I do the "title.*" search with Common Terms? Or can I?
If you really want to use a common terms query, just know that it only works on a single field, i.e. not several and not wildcarded ones.
Otherwise, you can use a multi_match query with the cutoff_frequency like in your other question.
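A request body for that multi_match variant could be assembled as in the Python sketch below. The helper name is hypothetical, and whether `cutoff_frequency` is still accepted depends on your Elasticsearch version, so treat this as an assumption to check against your cluster.

```python
import json

def multi_match_with_cutoff(text, fields, cutoff_frequency):
    """Build a multi_match query body; fields may contain wildcards like title.*"""
    return {
        "query": {
            "multi_match": {
                "query": text,
                "fields": fields,
                "cutoff_frequency": cutoff_frequency,
            }
        }
    }

body = multi_match_with_cutoff("sleep", ["title.*"], 0.001)
print(json.dumps(body, indent=2))
```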
Does this work for you? I use it to search for a wildcarded value (with *) across several fields:
{
"query" : {
"dis_max" : {
"tie_breaker" : 0,
"boost" : 1,
"queries" : [
{"wildcard" : {"title.en" : "sic*"}},
{ "wildcard" : { "title.zh_CN" : "sic*"}},
{ "wildcard" : { "title.zh_TW" : "*sic*" }}
]
}
}
}
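dis_max lets you run multiple queries and merges their results: a document is returned if it matches any of them and is scored by its best-matching query (with tie_breaker set to 0, the other matches add nothing).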
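How dis_max combines the scores of its sub-queries can be written out in a few lines of Python. This is a sketch of the documented scoring rule, not Elasticsearch code; `dis_max_score` is a made-up name, and `None` stands for a sub-query that did not match.

```python
def dis_max_score(clause_scores, tie_breaker=0.0):
    """Best clause score plus tie_breaker times the scores of the other matches."""
    matching = [s for s in clause_scores if s is not None]
    if not matching:
        return None  # the document matched no sub-query and is not returned
    best = max(matching)
    return best + tie_breaker * (sum(matching) - best)

# Three sub-queries matched with different scores.
print(dis_max_score([2.0, 1.0, 0.5]))
print(dis_max_score([2.0, 1.0, 0.5], tie_breaker=0.5))
# Only the second sub-query matched.
print(dis_max_score([None, 1.0, None]))
```

With tie_breaker at 0, as in the query above, only the best-matching field counts; raising it lets the additional matching fields contribute a little.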

Elasticsearch query on array index

How do I query/filter by index of an array in elasticsearch?
I have a document like this:-
PUT /edi832/record/1
{
"LIN": [ "UP", "123456789" ]
}
I want to search if LIN[0] is "UP" and LIN[1] exists.
Thanks.
This might look like a hack, but it should work.
First, we use the token_count type in a multi-field to capture the number of tokens as a separate field.
The mapping will look like this:
{
"record" : {
"properties" : {
"LIN" : {
"type" : "string",
"fields" : {
"word_count": {
"type" : "token_count",
"store" : "yes",
"analyzer" : "standard"
}
}
}
}
}
}
LINK - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#token_count
So checking whether the second element exists is as easy as checking that this field's value is greater than or equal to 2.
Next, we need to check whether the token "up" exists at position 0.
We can use a script filter to check this.
Hence, a query like the one below should work:
{
"query": {
"filtered": {
"query": {
"range": {
"LIN.word_count": {
"gte": 2
}
}
},
"filter": {
"script": {
"script": "for(pos : _index['LIN'].get('up',_POSITIONS)){ if(pos.position == 0) { return true}};return false;"
}
}
}
}
}
Advanced scripting - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-advanced-scripting.html
Script filters - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-script-filter.html
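For reference, the two conditions can be restated in plain Python. This is only a model of the logic, not what Elasticsearch executes: it assumes every array element analyzes to a single lowercase token and that positions are counted from 0 across the array, and the function names are invented.

```python
def tokens_of(values):
    """Lowercase each array element and split it on whitespace, keeping order."""
    return [tok for value in values for tok in value.lower().split()]

def first_is_up_and_second_exists(lin):
    tokens = tokens_of(lin)
    # Mirrors the range query on LIN.word_count (gte 2).
    has_two_tokens = len(tokens) >= 2
    # Mirrors the script filter: token "up" at position 0.
    up_at_position_zero = bool(tokens) and tokens[0] == "up"
    return has_two_tokens and up_at_position_zero

print(first_is_up_and_second_exists(["UP", "123456789"]))
print(first_is_up_and_second_exists(["UP"]))
print(first_is_up_and_second_exists(["123456789", "UP"]))
```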

Resources