How to give different weights to exact, phonetic and fuzzy queries?

Note: I checked out this answer, but could not solve the problem.
So currently I am using the following query:
{
"_source": [
"title",
"bench",
"id_",
"court",
"date"
],
"size": 15,
"from": 0,
"query": {
"bool": {
"must": {
"multi_match": {
"query": "knife",
"fields": [
"title",
"body"
],
"operator": "and"
}
},
"should": {
"multi_match": {
"query": "knife",
"fields": [
"title",
"body"
],
"fuzziness" : 1,
"operator": "and"
}
}
}
},
"highlight": {
"pre_tags": [
"<tag1>"
],
"post_tags": [
"</tag1>"
],
"fields": {
"content": {}
},
"fragment_size": 30
}
}
What I want to achieve is to give different weights to exact, phonetic and fuzzy queries, in the order exact > fuzzy > phonetic. How do I achieve this?
This is my mapping (my_analyzer is a Metaphone analyzer):
{
"courts_2": {
"mappings": {
"properties": {
"author": {
"type": "text",
"analyzer": "my_analyzer"
},
"bench": {
"type": "text",
"analyzer": "my_analyzer"
},
"citation": {
"type": "text"
},
"content": {
"type": "text",
"fields": {
"standard": {
"type": "text"
}
},
"analyzer": "my_analyzer"
},
"court": {
"type": "text"
},
"date": {
"type": "text"
},
"id_": {
"type": "text"
},
"title": {
"type": "text",
"fields": {
"standard": {
"type": "text"
}
},
"analyzer": "my_analyzer"
},
"verdict": {
"type": "text"
}
}
}
}
}

You could index the phonetic version of each field in a separate sub-field, as follows:
"mappings": {
"properties": {
"title": {
"type": "text",
"fields": {
"phonetic": {
"type": "text",
"analyzer": "my_analyzer"
}
}}}}
Then you can use a function_score query to get the order exact > fuzzy > phonetic:
{
"_source": [
"title",
"bench",
"id_",
"court",
"date"
],
"size": 15,
"from": 0,
"query": {
"bool": {
"should": [
{
"function_score": {
"query": {
"multi_match": {
"query": "knife",
"fields": [
"title",
"body"
],
"operator": "and"
}
},
"boost": 3
}
},
{
"function_score": {
"query": {
"multi_match": {
"query": "knife",
"fields": [
"title",
"body"
],
"fuzziness": 1,
"operator": "and"
}
},
"boost": 2
}
},
{
"function_score": {
"query": {
"multi_match": {
"query": "knife",
"fields": [
"title.phonetic",
"body.phonetic"
],
"operator": "and"
}
},
"boost": 1
}
}
]
}
}
}
Hope this helps!
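For reference, the same weighting can probably be expressed without function_score, since multi_match accepts a boost parameter directly. The following is only a sketch in Python that builds the request body as a dict. The helper name weighted_query and the default boost values are made up for illustration, and it assumes the mapping has been changed as suggested above, so that title and body hold the standard-analyzed text and title.phonetic and body.phonetic hold the Metaphone version.

```python
import json


def weighted_query(text, exact_boost=3.0, fuzzy_boost=2.0, phonetic_boost=1.0):
    """Build a bool/should query whose three clauses carry different boosts.

    Hypothetical helper: field names follow the answer above and are
    assumptions about the final mapping.
    """
    def clause(fields, boost, **extra):
        body = {"query": text, "fields": fields, "operator": "and", "boost": boost}
        body.update(extra)
        return {"multi_match": body}

    return {
        "bool": {
            "should": [
                # Exact (standard-analyzed) match gets the highest boost.
                clause(["title", "body"], exact_boost),
                # Fuzzy match: same fields, edit distance of 1 allowed.
                clause(["title", "body"], fuzzy_boost, fuzziness=1),
                # Phonetic match on the Metaphone sub-fields.
                clause(["title.phonetic", "body.phonetic"], phonetic_boost),
            ],
            # At least one of the three clauses has to match.
            "minimum_should_match": 1,
        }
    }


query = weighted_query("knife")
print(json.dumps({"query": query}, indent=2))
```

Note that the scores of all matching should clauses are added together, so a document that matches exactly will usually also collect the fuzzy and phonetic scores. The boosts bias the ranking towards exact > fuzzy > phonetic, but they do not strictly guarantee that order.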

Related

Elasticsearch Querying Double Nested Object, Match Multiple Rows in Query Within Parent

My data model is related to patient records. At the highest level is the Patient, then their information such as Lab Panels and the individual rows of the results of the panel. So it looks like this: {Patient:{Labs:[{Results:[{}]}]}}
I am able to successfully create the two nested objects Labs nested in Patient and Results nested in Labs, populate it, and query it. What I am unable to successfully do is create a query that constrains the results to a single Lab, and then match by more than one row in the Results object.
An example is attached, where I only want labs that are "Lipid Panel" and the results are HDL <= 46 and LDL >= 140.
Any suggestions?
Example Index
PUT localhost:9200/testpipeline
{
"aliases": {},
"mappings": {
"dynamic": "false",
"properties": {
"ageAtFirstEncounter": {
"type": "float"
},
"dateOfBirth": {
"type": "date"
},
"gender": {
"type": "keyword"
},
"id": {
"type": "float"
},
"labs": {
"type": "nested",
"properties": {
"ageOnDateOfService": {
"type": "float"
},
"date": {
"type": "date"
},
"encounterId": {
"type": "keyword"
},
"id": {
"type": "keyword"
},
"isEdVisit": {
"type": "boolean"
},
"labPanelName": {
"type": "keyword"
},
"labPanelNameId": {
"type": "float"
},
"labPanelSourceName": {
"type": "text",
"store": true
},
"personId": {
"type": "keyword"
},
"processingLogId": {
"type": "float"
},
"results": {
"type": "nested",
"properties": {
"dataType": {
"type": "keyword"
},
"id": {
"type": "float"
},
"labTestName": {
"type": "keyword"
},
"labTestNameId": {
"type": "float"
},
"resultAsNumber": {
"type": "float"
},
"resultAsText": {
"type": "keyword"
},
"sourceName": {
"type": "text",
"store": true
},
"unit": {
"type": "keyword"
}
}
}
}
},
"personId": {
"type": "keyword"
},
"processingLogId": {
"type": "float"
},
"race": {
"type": "keyword"
}
}
}
}
Example Document
PUT localhost:9200/testpipeline/_doc/274746
{
"id": 274746,
"personId": "10005786.000000",
"processingLogId": 51,
"gender": "Female",
"dateOfBirth": "1945-01-01T00:00:00",
"ageAtFirstEncounter": 76,
"labs": [
{
"isEdVisit": false,
"labPanelSourceName": "Lipid Panel",
"dataType": "LAB",
"ageOnDateOfService": 76.9041,
"results": [
{
"unit": "mg/dL",
"labTestNameId": 160,
"labTestName": "HDL",
"sourceName": "HDL",
"resultAsNumber": 46.0,
"resultAsText": "46",
"id": 2150284
},
{
"unit": "mg/dL",
"labTestNameId": 158,
"labTestName": "LDL",
"sourceName": "LDL",
"resultAsNumber": 144.0,
"resultAsText": "144.00",
"id": 2150286
}
],
"id": "9ab9ba84-580b-f2d2-4d32-25658ea5f1bf",
"sourceId": 2150278,
"personId": "10003783.000000",
"encounterId": "39617217.000000",
"processingLogId": 51,
"date": "2021-11-08T00:00:00"
}
],
"lastModified": "2022-03-24T10:21:29.8682784-05:00"
}
Example Query
POST localhost:9200/testpipeline/_search
{
"fields": [
"personId",
"processingLogId",
"id",
"gender",
"ageAtFirstDOS",
"dateOfBirth"
],
"from": 0,
"query": {
"bool": {
"should": [
{
"constant_score": {
"boost": 200,
"filter": {
"bool": {
"_name": "CriteriaFilterId:2068,CriteriaId:1,CriteriaClassId:1,Points:200,T5:False,SoftScore:200",
"should": [
{
"bool": {
"must": [
{
"nested": {
"path": "labs",
"inner_hits": {
"size": 3,
"name": "labs,CriteriaFilterId:2068,CriteriaId:1,CriteriaClassId:1,Points:200,T5:False,guid:8b41f346-2861-4099-b3c0-fcd6393c367b"
},
"query": {
"bool": {
"must": [
{
"bool": {
"must": [
{
"match_phrase": {
"labs.labPanelSourceName": {
"_name": "CriteriaFilterId:2068,Pipeline.Labs.LabPanelSourceName,es_match_phrase=>'Lipid Panel' found in text",
"query": "Lipid Panel",
"slop": 100
}
}
},
{
"nested": {
"path": "labs.results",
"inner_hits": {
"size": 3,
"name": "labs.results,CriteriaFilterId:2068,CriteriaId:1,CriteriaClassId:1,Points:200,T5:False,guid:3564e83f-958b-4fe8-848e-f9edb5d7f3b2"
},
"query": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"bool": {
"must": [
{
"range": {
"labs.results.resultAsNumber": {
"lte": 46
}
}
},
{
"term": {
"labs.results.labTestNameId": {
"value": 160
}
}
}
]
}
},
{
"bool": {
"must": [
{
"range": {
"labs.results.resultAsNumber": {
"gte": 140.0
}
}
},
{
"term": {
"labs.results.labTestNameId": {
"value": 158
}
}
}
]
}
}
],
"minimum_should_match": 2
}
}
]
}
}
}
}
]
}
}
]
}
}
}
}
]
}
}
]
}
}
}
}
],
"minimum_should_match": 1,
"filter": [
]
}
},
"size": 10,
"sort": [
{
"_score": {
"order": "desc"
}
},
{
"processingLogId": {
"order": "asc"
}
},
{
"personId": {
"order": "asc"
}
}
],
"_source": false
}
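One likely reason the query above returns nothing: a nested query is evaluated against one nested object at a time, so a single labs.results row can never be both the HDL row and the LDL row, and "minimum_should_match": 2 inside one labs.results clause cannot be satisfied. A possible restructuring is to keep one nested clause on labs and put two separate nested clauses on labs.results inside it, one per row. The sketch below builds that structure in Python; the helper names result_row and lipid_panel_query are hypothetical, and inner_hits and _name are left out for brevity.

```python
import json


def result_row(test_name_id, bound):
    """One nested clause that has to be satisfied by a single labs.results row."""
    return {
        "nested": {
            "path": "labs.results",
            "query": {
                "bool": {
                    "filter": [
                        {"term": {"labs.results.labTestNameId": test_name_id}},
                        {"range": {"labs.results.resultAsNumber": bound}},
                    ]
                }
            },
        }
    }


def lipid_panel_query():
    """All three conditions sit under the same labs clause, so they apply to one lab."""
    return {
        "nested": {
            "path": "labs",
            "query": {
                "bool": {
                    "must": [
                        {"match_phrase": {"labs.labPanelSourceName": "Lipid Panel"}},
                        result_row(160, {"lte": 46}),   # HDL <= 46
                        result_row(158, {"gte": 140}),  # LDL >= 140
                    ]
                }
            },
        }
    }


body = {"query": lipid_panel_query()}
print(json.dumps(body, indent=2))
```

If inner_hits are added back to both labs.results clauses, each one would need its own name, because the default name is derived from the path and the two clauses share it.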

Elasticsearch: Full text search

I'm trying to build an Elasticsearch full-text search query with the following text "Gold Cartier watches" on multiple fields.
I have to follow this rule: first find all "Gold" documents; from the retrieved "Gold" documents, find all "Cartier" documents; and from those, find all "watches" documents.
This is my multi_match query:
{
"query": {
"multi_match": {
"query": "Fred or rose",
"fields": [
"name",
"status",
"categories.name",
"brand.name",
"reference"
]
}
}
}
Here is my mapping:
{
"product": {
"mappings": {
"product": {
"dynamic_date_formats": [],
"properties": {
"available": {
"type": "text"
},
"brand": {
"properties": {
"available": {
"type": "text"
},
"name": {
"type": "keyword"
},
"shopProductBrands": {
"properties": {
"available": {
"type": "text"
},
"priority": {
"type": "integer"
},
"slug": {
"type": "keyword"
}
}
},
"slug": {
"type": "keyword"
}
}
},
"categories": {
"type": "nested",
"properties": {
"available": {
"type": "text"
},
"brand": {
"properties": {
"available": {
"type": "text"
},
"name": {
"type": "keyword"
},
"slug": {
"type": "keyword"
}
}
},
"name": {
"type": "keyword"
},
"parent": {
"type": "keyword"
},
"slug": {
"type": "keyword"
}
}
},
"createdAt": {
"type": "date",
"format": "date_time_no_millis"
},
"longDescription": {
"type": "text",
"analyzer": "french_search"
},
"name": {
"type": "text",
"boost": 15,
"fields": {
"raw": {
"type": "keyword"
}
},
"analyzer": "french_search"
},
"purchasePrice": {
"type": "double"
},
"rawPrice": {
"type": "double"
},
"reference": {
"type": "keyword",
"boost": 10
},
"shortDescription": {
"type": "text",
"boost": 3,
"analyzer": "french_search"
},
"slug": {
"type": "keyword"
},
"status": {
"type": "text"
},
"updatedAt": {
"type": "date",
"format": "date_time_no_millis"
}
}
}
}
}
}
My search retrieves all "Gold", "Cartier" and "watches" documents combined.
How can I build a query that follows my rule?
Thanks

I'm not sure that there's an easy solution. I think the closest you can get is to use cross_fields with "operator": "and" and only search fields that have the same analyzer. Can you add "french_search" versions of each of these fields?
cross_fields analyzes the query string into individual terms, then
looks for each term in any of the fields, as though they were one big
field.
However:
The cross_field type can only work in term-centric mode on fields that
have the same analyzer. ... If there are multiple groups, they are
combined with a bool query.
So this query:
{
"query": {
"multi_match": {
"type": "cross_fields",
"query": "gold Cartier watches",
"fields": [
"name",
"status",
"categories.name",
"brand.name",
"reference"
]
}
}
}
Will become something like this:
"query": {
"bool": {
"should": [
{
"multi_match": {
"query": "gold Cartier watches",
"fields": ["name"]
}
},
{
"multi_match": {
"query": "gold Cartier watches",
"fields": ["status"]
}
},
{
"multi_match": {
"query": "gold Cartier watches",
"fields": [
"categories.name",
"brand.name",
"reference"
]
}
}
]
}
}
That query is too loose, but adding "operator": "and" or "minimum_should_match": "100%" would be too strict.
It's not pretty or efficient, but you could do application-side term parsing and build a boolean query. Something like this:
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "gold",
"fields": [
"name",
"status",
...
"reference"
]
}
},
{
"multi_match": {
"query": "Cartier",
"fields": [
"name",
"status",
...
"reference"
]
}
}
...
]
}
}
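The application-side term parsing could look roughly like the Python sketch below. It splits the query string on whitespace, which is an assumption and not what the french_search analyzer would do, and builds one multi_match clause per term under bool/must. The function name all_terms_query is made up for illustration.

```python
import json

# Field list taken from the question.
FIELDS = ["name", "status", "categories.name", "brand.name", "reference"]


def all_terms_query(text, fields):
    """Require every whitespace-separated term to match in at least one field."""
    terms = text.split()
    return {
        "bool": {
            "must": [
                {"multi_match": {"query": term, "fields": fields}}
                for term in terms
            ]
        }
    }


body = {"query": all_terms_query("gold Cartier watches", FIELDS)}
print(json.dumps(body, indent=2))
```

Each term may match in a different field, but a document is returned only if all terms match somewhere.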

You can use the boolean operators of the query_string query:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#_boolean_operators
The preferred operators are + (this term must be present) and - (this term must not be present). All other terms are optional. For example, this query:
quick brown +fox -news
states that:
fox must be present
news must not be present
quick and brown are optional — their presence increases the relevance
The familiar boolean operators AND, OR and NOT (also written &&, || and !) are also supported but beware that they do not honor the usual precedence rules, so parentheses should be used whenever multiple operators are used together. For instance, the previous query could be rewritten as:
((quick AND fox) OR (brown AND fox) OR fox) AND NOT news
You can also use boosting to weight results for a specific term: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#_boosting
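Applied to the question, a minimal sketch would prefix every term with + so that all of them are required. The Python below only builds the request body. The function name required_terms_query is hypothetical, and it assumes the user input contains no characters that are reserved in the query_string syntax (those would have to be escaped first).

```python
import json

# Field list taken from the question.
FIELDS = ["name", "status", "categories.name", "brand.name", "reference"]


def required_terms_query(text, fields):
    """Turn 'gold Cartier watches' into '+gold +Cartier +watches'."""
    required = " ".join("+" + term for term in text.split())
    return {"query_string": {"query": required, "fields": fields}}


body = {"query": required_terms_query("gold Cartier watches", FIELDS)}
print(json.dumps(body, indent=2))
```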

Elasticsearch not returns all fields for each hit row

I have a problem with my Elasticsearch index. I'm trying to get some fields for each row, but Elasticsearch does not return all of them when I search. If I get a document by id, it returns all fields.
In my query I'm using the _source parameter, but it doesn't work: the query returns only some of the fields from _source.
Are there any restrictions on this, such as on the number or size of _source fields?
Elasticsearch version 7.1
My mapping:
"video": {
"properties": {
"title": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 128
},
"basic_edge_ngram_analyzer": {
"type": "text",
"analyzer": "basic_edge_ngram_analyzer"
},
"basic_edge_ngram_analyzer_no_digit": {
"type": "text",
"analyzer": "basic_edge_ngram_analyzer_no_digit"
},
"basic_ngram_analyzer": {
"type": "text",
"analyzer": "basic_ngram_analyzer"
},
"basic_ngram_analyzer_no_digit": {
"type": "text",
"analyzer": "basic_ngram_analyzer_no_digit"
},
"numeric_analyzer": {
"type": "text",
"analyzer": "numeric_analyzer"
},
"translit_analyzer": {
"type": "text",
"analyzer": "translit_analyzer"
},
"translit_double_metaphone_analyzer": {
"type": "text",
"analyzer": "translit_double_metaphone_analyzer"
}
}
},
"inverse_title": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 128
},
"basic_edge_ngram_analyzer": {
"type": "text",
"analyzer": "basic_edge_ngram_analyzer"
},
"basic_edge_ngram_analyzer_no_digit": {
"type": "text",
"analyzer": "basic_edge_ngram_analyzer_no_digit"
},
"basic_ngram_analyzer": {
"type": "text",
"analyzer": "basic_ngram_analyzer"
},
"basic_ngram_analyzer_no_digit": {
"type": "text",
"analyzer": "basic_ngram_analyzer_no_digit"
},
"numeric_analyzer": {
"type": "text",
"analyzer": "numeric_analyzer"
},
"translit_analyzer": {
"type": "text",
"analyzer": "translit_analyzer"
},
"translit_double_metaphone_analyzer": {
"type": "text",
"analyzer": "translit_double_metaphone_analyzer"
}
}
},
"thumbnail_url": {
"type": "keyword",
"store": "true"
},
"is_classic": {
"type": "boolean",
"store": "true"
},
"is_club": {
"type": "boolean",
"store": "true"
},
"product_id": {
"type": "integer",
"store": "true"
},
"duration": {
"type": "integer",
"store": "true"
},
"feed_name": {
"type": "keyword",
"store": "true"
},
"feed_url": {
"type": "keyword",
"store": "true"
},
"created_ts": {
"type": "date",
"store": "true"
},
"hot_until": {"type": "date", "format": "date_hour_minute_second_fraction"},
"description": {
"type": "keyword"
},
"mi_tv_id": {"type": "integer"},
"total_views": {"type": "long"},
"month_views": {"type": "long"},
"week_views": {"type": "long"},
"day_views": {"type": "long"},
"blocked_countries": {"type": "keyword"},
"linked_persons": {
"type": "nested",
"properties": {
"id": {"type": "integer"},
"name": {"type": "keyword"}
}
},
"linked_tags": {
"type": "nested",
"properties": {
"id": {"type": "integer"},
"name": {"type": "keyword"}
}
},
"linked_hashtags":{
"type": "nested",
"properties": {
"id": {"type": "integer"},
"name": {"type": "keyword"}
}
}
}
}
My query:
GET /video_idx/_search
{
"aggs": {
"mi_tv_id": {
"terms": {
"field": "mi_tv_id",
"size": 10
}
},
"linked_hashtags_id": {
"aggs": {
"linked_hashtags_id": {
"terms": {
"field": "linked_hashtags.id",
"size": 10
}
}
},
"nested": {
"path": "linked_hashtags"
}
},
"author_id": {
"terms": {
"field": "author_id",
"size": 10
}
},
"linked_tags_id": {
"aggs": {
"linked_tags_id": {
"terms": {
"field": "linked_tags.id",
"size": 10
}
}
},
"nested": {
"path": "linked_tags"
}
},
"linked_persons_id": {
"aggs": {
"linked_persons_id": {
"terms": {
"field": "linked_persons.id",
"size": 10
}
}
},
"nested": {
"path": "linked_persons"
}
}
},
"highlight": {
"fields": {
"inverse_title": {
"pre_tags": ["<b>"],
"type": "plain",
"post_tags": ["</b>"]
},
"title": {
"pre_tags": ["<b>"],
"type": "plain",
"post_tags": ["</b>"]
}
}
},
"from": 0,
"size": 20,
"_source": {
"includes":[ "mi_tv_id", "author_id", "hot_until", "id", "linked_persons", "linked_hashtags", "linked_tags", "total_views", "thumbnail_url", "feed_name", "feed_url", "duration", "is_club", "is_classic", "product_id", "created_ts", "title", "inverse_title", "description"]
},
"query": {
"function_score": {
"script_score": {
"script": "\n double total = _score;\n \n if (doc['total_views'].size() > 0) {total = total * Math.log(10 + 0.000087 * doc['total_views'].value)}\n if (doc['month_views'].size() > 0) {total = total * Math.log(10 + 0.00025 * doc['month_views'].value)}\n if (doc['week_views'].size() > 0) {total = total * Math.log(10 + 0.00077 * doc['week_views'].value)}\n if (doc['day_views'].size() > 0) {total = total * Math.log(10 + 0.0025 * doc['day_views'].value)}\n if (doc['hot_until'].size() > 0) {total = 1.5 * total}\n \n if (doc['mi_tv_id'].size() > 0) {total = total * 1.5}\n \n return total \n "
},
"query": {
"bool": {
"minimum_should_match": "20%",
"should": [{
"multi_match": {
"fields": ["title.basic_ngram_analyzer", "inverse_title.basic_ngram_analyzer"],
"operator": "and",
"tie_breaker": 1.0,
"minimum_should_match": "65%",
"type": "cross_fields",
"boost": 5.5,
"query": "\u0434\u043e\u043c 2"
}
}, {
"multi_match": {
"fields": ["title.keyword", "inverse_title.keyword"],
"operator": "and",
"tie_breaker": 1.0,
"minimum_should_match": "100%",
"type": "cross_fields",
"boost": 12.5,
"query": "\u0434\u043e\u043c 2"
}
}, {
"multi_match": {
"fields": ["title.translit_analyzer", "inverse_title.translit_analyzer"],
"operator": "and",
"tie_breaker": 1.0,
"minimum_should_match": "65%",
"type": "cross_fields",
"boost": 3,
"query": "\u0434\u043e\u043c 2"
}
}, {
"multi_match": {
"fields": ["title.numeric_analyzer", "inverse_title.numeric_analyzer"],
"operator": "and",
"tie_breaker": 1.0,
"minimum_should_match": "100%",
"type": "cross_fields",
"boost": 6,
"query": "\u0434\u043e\u043c 2"
}
}, {
"multi_match": {
"fields": ["title.basic_ngram_analyzer_no_digit", "inverse_title.basic_ngram_analyzer_no_digit"],
"operator": "and",
"tie_breaker": 1.0,
"minimum_should_match": "65%",
"type": "cross_fields",
"boost": 5.5,
"query": "\u0434\u043e\u043c 2"
}
}, {
"multi_match": {
"fields": ["title.basic_edge_ngram_analyzer", "inverse_title.basic_edge_ngram_analyzer"],
"operator": "and",
"tie_breaker": 1.0,
"minimum_should_match": "65%",
"type": "cross_fields",
"boost": 5.5,
"query": "\u0434\u043e\u043c 2"
}
}, {
"multi_match": {
"fields": ["title.translit_double_metaphone_analyzer", "inverse_title.translit_double_metaphone_analyzer"],
"operator": "and",
"tie_breaker": 1.0,
"minimum_should_match": "65%",
"type": "cross_fields",
"boost": 1,
"query": "\u0434\u043e\u043c 2"
}
}, {
"multi_match": {
"fields": ["description"],
"operator": "and",
"tie_breaker": 1.0,
"minimum_should_match": "100%",
"type": "cross_fields",
"boost": 1.0,
"query": "\u0434\u043e\u043c 2"
}
}],
"must_not": [{
"terms": {
"blocked_countries": ["RU"]
}
}]
}
}
}
}
}

You need to add all the stored fields to the stored_fields parameter in your query:
"_source": {
"includes":[ "mi_tv_id", "author_id", "hot_until", "id", "linked_persons", "linked_hashtags", "linked_tags", "total_views", "thumbnail_url", "feed_name", "feed_url", "duration", "is_club", "is_classic", "product_id", "created_ts", "title", "inverse_title", "description"]
},
"stored_fields": ["feed_name", "feed_url", "duration", "is_club", ...],

Ranking a partial field match to a query above a complete match on a different field

I'm implementing a name search where the possible fields are first_name, middle_initial, and last_name. Queries are usually last name first, e.g. "Smith, A" when looking for "Smith, Ashley" instead of "A Smith".
My results are scoring undesirably (Angela and Alex should rank above Roger and Ted):
"Smith, Roger A"
"Smith, Ted A"
"Smith, Angela D"
"Smith, Alex N"
I've tried a lot of things on both the indexing and the querying side, and I have to include a fair amount of fuzziness (spelling and phonetic). The cross_fields query type plus some fuzziness via an n-gram analyzer has met most of my needs except for this. Edit: the list above is ordered by _score, so I can't sort by other things.
Querying example, where I was trying to see if indexing the first & middle name together made a difference:
GET /_search
{
"query": {
"bool": {
"should": [
{
"multi_match": {
"query": "smith, a",
"type": "cross_fields",
"fields": [
"first_name_middle_initial^5",
"last_name^10"
]
}
},
{
"multi_match": {
"query": "smith, a",
"type": "cross_fields",
"fields": [
"first_name_middle_initial.phonetic^2",
"last_name.phonetic^5"
]
}
},
{
"multi_match": {
"query": "smith, a",
"type": "cross_fields",
"fields": [
"first_name_middle_initial.analyzed^2",
"last_name.analyzed^10"
]
}
},
{
"bool": {
"should": [
{
"match": {
"last_name.word_start": {
"query": "smith, a",
"boost": 10,
"operator": "and",
"analyzer": "searchkick_word_search"
}
}
},
{
"match": {
"last_name.word_start": {
"query": "smith, a",
"boost": 5,
"operator": "and",
"analyzer": "searchkick_word_search",
"fuzziness": 1,
"prefix_length": 0,
"max_expansions": 3,
"fuzzy_transpositions": true
}
}
}
]
}
},
{
"bool": {
"should": [
{
"match": {
"first_name_middle_initial.word_start": {
"query": "smith, a",
"boost": 10,
"operator": "and",
"analyzer": "searchkick_word_search"
}
}
}
]
}
}
]
}
}
}
I've also fiddled with the boost, trying to drown out whatever is matching on the middle initial, and even not including the middle initial in my query or the fields I'm referencing in the query (e.g. first_name only) for this at all. I can't ignore middle initial entirely in case it's the differentiating field.

Well, one of my problems may have been an out of date index. Otherwise the keys seemed to be using an ngram analyzer as one of my cross_fields matches and making sure the middle_initial was considered completely independently (sort of as a tiebreaker). Putting it in a bool subquery was intentional -- I don't want it and the other subqueries in that clause to be considered with the same weight as the cross_fields matches, as per this section of the elasticsearch guide.
Here's what ultimately solved my problem:
Index mapping:
{
<snip>
"first_name": {
"type": "text",
"fields": {
"phonetic": {
"type": "text",
"analyzer": "dbl_metaphone"
},
"word_start": {
"type": "text",
"analyzer": "searchkick_word_start_index" // includes "lowercase", "asciifolding", "searchkick_edge_ngram" (ngram from the start of the word)
}
}
},
<snip>
"last_name": {
"type": "text",
"fields": {
"phonetic": {
"type": "text",
"analyzer": "dbl_metaphone"
},
"word_start": {
"type": "text",
"analyzer": "searchkick_word_start_index"
}
}
},
"middle_initial": {
"type": "keyword",
"fields": {
"analyzed": {
"type": "text",
"analyzer": "searchkick_index" // includes lowercase, asciifolding, shingles, stemmer
}
},
"ignore_above": 30000
},
<snip>
}
}
}
Query:
{
"query": {
"bool": {
"should": [
{
"multi_match": {
"query": "smith, s",
"type": "cross_fields",
"fields": [
"first_name^2",
"last_name^3"
],
"tie_breaker": 0.3
}
},
{
"multi_match": {
"query": "smith, s",
"type": "cross_fields",
"fields": [
"first_name.phonetic",
"last_name.phonetic"
],
"tie_breaker": 0.3
}
},
{
"multi_match": {
"query": "smith, s",
"type": "cross_fields",
"fields": [
"first_name.word_start",
"last_name.word_start^2"
],
"tie_breaker": 0.3
}
},
{
"bool": {
"should": [
<snip subquery for another field>
{
"match": {
"middle_initial.analyzed": {
"query": "s",
"operator": "and"
}
}
}
]
}
}
]
}
}
}

function_score query in elasticsearch won't change score

I have an index with the following document structure: Company > Jobs (nested).
A company has a name and jobs have an address. By default I search jobs by address. Along with this, I'm trying to boost certain companies by their name using a function_score query, but my query doesn't seem to boost anything or change any scores.
{
"query": {
"filtered": {
"filter": {},
"query": {
"function_score": {
"query": {
"nested": {
"path": "active_jobs",
"score_mode": "max",
"query": {
"multi_match": {
"query": "United States",
"type": "cross_fields",
"fields": [
"active_jobs.address.city",
"active_jobs.address.country",
"active_jobs.address.state"
]
}
},
"inner_hits": {
"size": 1000
}
}
},
"functions": [
{
"filter": {
"term": {
"name": "Amazon"
}
},
"weight": 100
}
]
}
}
}
},
"size": 30,
"from": 0
}
[Update 1]
Here is the mapping for active_jobs property:
"active_jobs": {
"type": "nested",
"properties": {
"active": {
"type": "boolean"
},
"address": {
"properties": {
"city": {
"type": "string"
},
"country": {
"type": "string"
},
"state": {
"type": "string"
},
"state_code": {
"type": "string"
}
}
},
"id": {
"type": "long"
},
"title": {
"type": "string"
},
"updated_at": {
"type": "date",
"format": "dateOptionalTime"
}
}
}
