Elasticsearch: score as a percentage, with multi_match (fuzziness enabled) and filter in the same query, based on the multi_match part only

Given the Elasticsearch query below, is it possible to get the score as a percentage (0-100%), or to calculate it that way, based only on the multi_match part of the query, where fuzziness is enabled?
In other words, I would like the score to ignore the filter part.
Thanks in advance.
{
"index": "myindex",
"type": "mytype",
"body": {
"_source": [
"author_mt",
...
"title_t",
],
"from": 0,
"size": 100,
"query": {
"bool": {
"must": {
"multi_match": {
"query": "test",
"fields": [
"title*"
],
"fuzziness": "AUTO"
}
},
"filter": {
"bool": {
"must": [
{
"term": {
"genre_t_s": "test"
}
}
]
}
}
}
}
}
}

The Elasticsearch score is computed by a TF/IDF-style relevance algorithm (BM25 is the default similarity since Elasticsearch 5.0), so it is not bounded and can be greater than 1 (or 100%). See the following link on what relevance is:
https://www.elastic.co/guide/en/elasticsearch/guide/current/relevance-intro.html
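
Clauses placed under the bool query's filter run in filter context and do not contribute to _score, so the score of the query above already reflects only the multi_match part. Since the raw score has no fixed upper bound, one common workaround is to scale each hit against the best score in the same result set. The sketch below is only that: a relative percentage, not an absolute measure of match quality. The function name and the sample hits are hypothetical; the hits are assumed to be shaped like the entries of the search response's hits list.

```python
from typing import Any


def scores_as_percentages(hits: list[dict[str, Any]]) -> list[float]:
    """Express each hit's _score as a percentage of the best score in the set."""
    scores = [float(hit["_score"]) for hit in hits]
    top = max(scores, default=0.0)
    if top <= 0.0:
        # No hits, or every score is zero: nothing meaningful to scale by.
        return [0.0 for _ in scores]
    return [round(100.0 * score / top, 2) for score in scores]


# Hypothetical hits, for illustration only.
example_hits = [{"_score": 8.0}, {"_score": 4.0}, {"_score": 2.0}]
percentages = scores_as_percentages(example_hits)
```

Note that the top hit is always 100% with this approach, even when it is a weak fuzzy match, so the numbers are only comparable within a single response.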

Related

Elasticsearch: combine multi_match and match_phrase

I use ES 7 and want to search across multiple fields, but documents whose title field (titre) matches the query exactly must be shown first. So far I have tried:
{
"query": {
"bool": {
"must": {
"bool": {
"should": [
{
"match_phrase": {
"titre": {
"query": "test",
"boost": "20"
}
}
},
{
"multi_match": {
"fields": ["titre", "description^4", "subtitle^3"],
"query": "test",
"type": "most_fields"
}
}
]
}
}
}
}
}
It works, but I would like the match_phrase results to be ordered before the other results.
The idea is that when the user types the exact phrase of a title, that result appears before the others, which are based on multi_match.
Is this possible?

Is it possible to use fuzziness for only one field in a multi_match query?

I am using the following multi_match query in Elasticsearch and I am wondering whether I can use fuzziness only for the "friendly_name" field. I have tried different things, but none of them seems to work. I am also wondering whether it is possible to use an analyzer to get a result similar to what fuzziness gives:
"query": {
"multi_match": {
"query": "input query",
"fields": ["code_short", "code_word", "friendly_name"],
"minimum_should_match": "2"
}
},
"_source": ["code", "friendly_name"]
Any help would be appreciated. Thanks.
If you only need to query one field, you don't need multi_match:
"match": {
"name": {
"query": "your query",
"fuzziness": "1.5",
"prefix_length": 0,
"max_expansions": 100,
"minimum_should_match": "80%"
}
}
I don't believe that you can fully replace fuzziness, but there are two options to explore that might work for you: an ngram filter or a stemmer filter.
======
It wasn't entirely clear to me what you intended, but you can write your query this way:
"query": {
"bool": {
"should": [
{
"match": {
"friendly_name": {
"query": "text",
"fuzziness": "1.5",
"prefix_length": 0,
"max_expansions": 100
}
}
},
{
"match": {
"code_word": {
"query": "text"
}
}
},
{
"match": {
"code_short": {
"query": "text"
}
}
}
],
"minimum_should_match" : 2
}
}
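
The same idea can be generated programmatically. The following is a minimal sketch, assuming you build the request body as a Python dict before sending it: one match clause per field, with fuzziness set only on the chosen field. The helper name build_query, its parameters, and the default of one required should clause are illustrative assumptions, not part of any Elasticsearch client API.

```python
def build_query(text: str, fuzzy_field: str, exact_fields: list[str],
                fuzziness: str = "AUTO", minimum_should_match: int = 1) -> dict:
    """Build a bool/should query where only `fuzzy_field` uses fuzziness."""
    # The fuzzy clause goes first; the remaining fields get plain match clauses.
    should = [{"match": {fuzzy_field: {"query": text, "fuzziness": fuzziness}}}]
    should += [{"match": {field: {"query": text}}} for field in exact_fields]
    return {
        "query": {
            "bool": {
                "should": should,
                "minimum_should_match": minimum_should_match,
            }
        }
    }


body = build_query("input query", "friendly_name", ["code_word", "code_short"])
```

Keep in mind that minimum_should_match here counts matching clauses (fields), whereas in the original multi_match query it applied to the query terms, so the two settings are not interchangeable.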

How can we use an exists query in tandem with a search query?

I have a scenario in Elasticsearch where my indexed docs look like this:
{"id":1,"name":"xyz", "address": "xyz123"}
{"id":1,"name":"xyz", "address": "xyz123"}
{"id":1,"name":"xyz", "address": "xyz123", "note": "imp"}
The requirement is to run a term match query and score the results by relevance, which is straightforward. The additional aspect is that any doc in the search results that has a note field should be given higher relevance. How can we achieve this with a DSL query? Using exists we can check which docs contain a note, but how do we integrate that with a match query? I have tried a lot of ways but none worked.
With ES 5, you could boost your exists query to give a higher score to documents with a note field. For example,
{
"query": {
"bool": {
"must": {
"match": {
"name": {
"query": "your term"
}
}
},
"should": {
"exists": {
"field": "note",
"boost": 4
}
}
}
}
}
With ES 2, you could try a boosted filtered subset:
{
"query": {
"function_score": {
"query": {
"match": { "name": "your term" }
},
"functions": [
{
"filter": { "exists" : { "field" : "note" }},
"weight": 4
}
],
"score_mode": "sum"
}
}
}
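
To see what the function_score variant does to the ranking, here is a toy model of the arithmetic, not Elasticsearch code. It assumes the documented defaults as I understand them: matching function weights are combined according to score_mode ("sum" here), a document that matches no function gets a neutral function score of 1, and the result is multiplied with the query score (the default boost_mode). The function name and sample scores are hypothetical.

```python
def function_score_sum(query_score: float, has_note: bool, weight: float = 4.0) -> float:
    """Toy model of function_score with one filtered weight function."""
    # Only documents passing the exists filter pick up the weight.
    matching_weights = [weight] if has_note else []
    # Assumption: when no function matches, the function score is neutral (1).
    function_score = sum(matching_weights) if matching_weights else 1.0
    # Assumption: default boost_mode "multiply" combines it with the query score.
    return query_score * function_score


with_note = function_score_sum(1.5, has_note=True)
without_note = function_score_sum(1.5, has_note=False)
```

Under these assumptions, a document with a note field ends up with four times the score of an otherwise identical document without one.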
I believe you are looking for the boosting query:
https://www.elastic.co/guide/en/elasticsearch/reference/5.1/query-dsl-boosting-query.html
{
"query": {
"boosting": {
"positive": {
<put your original query here>
},
"negative": {
"exists": {
"field": "note"
}
},
"negative_boost": 4
}
}
}

How to query a field I have not defined a mapping for

I have a field, current_country, that I am adding to brands and that has not been defined in my Elasticsearch mapping.
I would like to run a filtered query on it. Since it is not defined, I suppose it is not analyzed and a term query should work.
This is the query I am running:
{
"index": "products",
"type": "brand",
"body": {
"from": 0,
"size": 100,
"sort": [
{
"n_name": "asc"
}
],
"query": {
"filtered": {
"query": {
"function_score": {
"filter": {
"bool": {
"must": [
{
"term": {
"current_country": "DK"
}
}
]
}
}
}
}
}
}
}
}
This returns no documents from the index.
I then ran the following query to check whether current_country exists:
{
"index": "products",
"type": "brand",
"body": {
"from": 0,
"size": 100,
"sort": [
{
"n_name": "asc"
}
],
"query": {
"filtered": {
"query": {
"function_score": {
"filter": {
"bool": {
"must": [
{
"exists": {
"field": "current_country"
}
}
]
}
}
}
}
}
}
}
}
This returns a total of 693 documents.
Here is an example document from the index, returned by the query above:
{
"_index": "products",
"_type": "brand",
"_id": "195da951241478LuxoLivingbrand",
"_score": null,
"_source": {
"categories": [
"Bordlamper og designer bordlamper der giver liv og lys"
],
"image": "http://www.fotoagent.dk/single_picture/11385/138/mega/and_tradition_flowerpot_bordlampe_lilla.jpg",
"top_price": 1695,
"low_price": 1695,
"n_name": "&Tradition",
"name": "&Tradition",
"current_country": "DK",
"current_currency": "DKK"
}
}
How can I query against current_country (preferably with a filtered query)?
If you do not define a mapping for a field, Elasticsearch tries to detect its type (string, date, or numeric). If it detects a string, it uses the default analyzer (the standard analyzer) to analyze the value at index time. Since the standard analyzer applies a lowercase token filter, your value is indexed as "dk". Term filters do not analyze their input, so "DK" won't match "dk".
This can be solved in several ways:
(hack) Lowercase the term in your filter. This won't work for phrases.
(better) Define a mapping for the field. Mappings can be added or changed dynamically without much effort.
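
The mismatch described above can be illustrated with a toy model. This is not the real standard analyzer, only a rough stand-in that splits on non-word characters and lowercases, which is enough to show why an unanalyzed term "DK" misses the indexed token "dk" and why the lowercase hack breaks down for multi-word values. All names here are hypothetical.

```python
import re


def toy_standard_analyze(text: str) -> list[str]:
    """Rough stand-in for the standard analyzer: tokenize, then lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def term_matches(term: str, indexed_value: str) -> bool:
    """A term filter compares the raw term against the indexed tokens."""
    return term in toy_standard_analyze(indexed_value)


upper_hit = term_matches("DK", "DK")                           # raw term vs token "dk"
lower_hit = term_matches("dk", "DK")                           # lowercased term
phrase_hit = term_matches("new zealand", "New Zealand")        # multi-word value
```

In this model only the lowercased single-word term matches; the multi-word value is indexed as two separate tokens, so no single term equals it, which is why an explicit mapping is the more robust fix.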

Elasticsearch boosting slowing down query

This is a very novice question, but I'm trying to understand how boosting certain elements in a document works.
I started with this query:
{
"from": 0,
"size": 6,
"fields": [
"_id"
],
"sort": {
"_score": "desc",
"vendor.name.stored": "asc",
"item_name.stored": "asc"
},
"query": {
"filtered": {
"query": {
"query_string": {
"fields": [
"_all"
],
"query": "Calprotectin",
"default_operator": "AND"
}
},
"filter": {
"and": [
{
"query": {
"query_string": {
"fields": [
"targeted_countries"
],
"query": "All US"
}
}
}
]
}
}
}
}
Then I needed to boost certain elements in the document more than others, so I did this:
{
"from": 0,
"size": 60,
"fields": [
"_id"
],
"sort": {
"_score": "desc",
"vendor.name.stored": "asc",
"item_name.stored": "asc"
},
"query": {
"filtered": {
"query": {
"query_string": {
"fields": [
"item_name^4",
"vendor^4",
"id_plus_name",
"category_name^3",
"targeted_countries",
"vendor_search_name^4",
"AdditionalProductInformation^0.5",
"AskAScientist^0.5",
"BuyNowURL^0.5",
"Concentration^0.5",
"ProductLine^0.5",
"Quantity^0.5",
"URL^0.5",
"Activity^1",
"Form^1",
"Immunogen^1",
"Isotype^1",
"Keywords^1",
"Matrix^1",
"MolecularWeight^1",
"PoreSize^1",
"Purity^1",
"References^1",
"RegulatoryStatus^1",
"Specifications/Features^1",
"Speed^1",
"Target/MoleculeDescriptor^1",
"Time^1",
"Description^2",
"Domain/Region/Terminus^2",
"Method^2",
"NCBIGeneAliases^2",
"Primary/Secondary^2",
"Source/ExpressionSystem^2",
"Target/MoleculeSynonym^2",
"Applications^3",
"Category^3",
"Conjugate/Tag/Label^3",
"Detection^3",
"GeneName^3",
"Host^3",
"ModificationType^3",
"Modifications^3",
"MoleculeName^3",
"Reactivity^3",
"Species^3",
"Target^3",
"Type^3",
"AccessionNumber^4",
"Brand/Trademark^4",
"CatalogNumber^4",
"Clone^4",
"entrezGeneID^4",
"GeneSymbol^4",
"OriginalItemName^4",
"Sequence^4",
"SwissProtID^4",
"option.AntibodyProducts^4",
"option.AntibodyRanges&Modifications^1",
"option.Applications^4",
"option.Conjugate^3",
"option.GeneID^4",
"option.HostSpecies^3",
"option.Isotype^3",
"option.Primary/Secondary^2",
"option.Reactivity^4",
"option.Search^1",
"option.TargetName^1",
"option.Type^4"
],
"query": "Calprotectin",
"default_operator": "AND"
}
},
"filter": {
"and": [
{
"query": {
"query_string": {
"fields": [
"targeted_countries"
],
"query": "All US"
}
}
}
]
}
}
}
}
The query slowed down considerably. Am I doing this correctly? Is there a way to speed it up? I'm currently in the process of moving the boosting to index time, but doing it in the query this way fits best with how my application runs. Any help is much appreciated.
Query-time boosting is used for assigning a larger weight to a term. If you want to permanently boost a field, use index-time boosting. If you don't want to apply this boosting all the time, it makes sense to create a separate mapping just for it, with store: "no" set.
