Passing dynamic value to script query in Elastic Search - elasticsearch

I have two mappings in my index. One of them stores some amounts in different currencies and the other stores the current conversion rates. Records in each look like this:
http://localhost:9200/transactions/amount
[
  {
    _index: "transactions",
    _type: "amount",
    _id: "AVA3fjawwMA2f8TzMTbM",
    _score: 1,
    _source: {
      balance: 1000,
      currency: "usd"
    }
  },
  {
    _index: "transactions",
    _type: "amount",
    _id: "AVA3flUWwMA2f8TzMTbN",
    _score: 1,
    _source: {
      balance: 2000,
      currency: "inr"
    }
  }
]
and
http://localhost:9200/transactions/conversions
{
  _index: "transactions",
  _type: "conversions",
  _id: "rates",
  _score: 1,
  _source: {
    "usd": 1,
    "inr": 62.6
  }
}
I want to query the data from amount, apply the current conversion rates from conversions, and get the result in a single query.
I tried using a scripted query and was able to convert the data based on passed params, like:
GET _search
{
  "query": {
    "match_all": {}
  },
  "script_fields": {
    "test1": {
      "script": "_source.balance * factor",
      "params": {
        "factor": 63.2
      }
    }
  }
}
However, in my case the passed params have to be fetched from the result of another query.
I want to visualize my data in Kibana in a common currency. Kibana supports scripted queries, but as far as I know each visualization in Kibana corresponds to a single Elasticsearch query, so I don't have the option of doing multiple queries.
I also tried exploring the possibility of using https://www.elastic.co/blog/terms-filter-lookup and adding some dynamic fields to each document in the result set. However, I don't think the terms filter allows that.
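Outside Kibana, the two-query approach is straightforward to do client-side: fetch the conversions document first, then inject the relevant rate as a script param. A minimal sketch of building such a request body (the helper and field names are illustrative; the index and script follow the question):

```python
def build_converted_query(rates, target_currency):
    """Build a script_fields search body that converts `balance` using a
    rate taken from the conversions document's _source (passed in as
    `rates`). In a real client you would first GET
    /transactions/conversions/rates and use its _source here."""
    factor = rates[target_currency]
    return {
        "query": {"match_all": {}},
        "script_fields": {
            "converted_balance": {
                "script": "_source.balance * factor",
                "params": {"factor": factor},
            }
        },
    }

rates = {"usd": 1, "inr": 62.6}
body = build_converted_query(rates, "inr")
```

This doesn't help inside a Kibana visualization, but it shows the baseline the single-query solutions below are trying to replicate.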

Assuming you're trying to always plot transactions in USD, you could try the approach described in the accepted answer here:
In essence:
Model your data parent-child, with each conversions document being the parent of all child transactions documents in the same foreign currency (and each conversions document having a standard field name, like "conversion_divisor": 62.6).
Include a has_parent query clause for all relevant currency conversions.
Use a function_score (script_score) query to access the foreign-currency divisor in each parent and generate a _score for each transaction by dividing the transaction amount by conversion_divisor.
Plot the _score in Kibana.
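A rough sketch of how the pieces fit together, written as a Python dict (this is the shape of the query, not a drop-in solution; the type and field names are assumptions, and the exact mechanics of reading the parent's divisor from the script depend on the linked answer and your Elasticsearch version):

```python
def build_usd_score_query():
    """Shape of the steps above: restrict to transactions that have a
    conversions parent, then score each hit as balance divided by the
    conversion_divisor. The script string is a placeholder in the inline
    style used elsewhere in this question."""
    return {
        "query": {
            "function_score": {
                "query": {
                    "has_parent": {
                        "parent_type": "conversions",
                        "query": {"match_all": {}},
                    }
                },
                "script_score": {
                    # Placeholder: divide each transaction by the parent's
                    # conversion_divisor so _score is the USD amount.
                    "script": "_source.balance / conversion_divisor"
                },
            }
        }
    }

q = build_usd_score_query()
```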

Related

Atlas Search Index partial match

I have a test collection with these two documents:
{ _id: ObjectId("636ce11889a00c51cac27779"), sku: 'kw-lids-0009' }
{ _id: ObjectId("636ce14b89a00c51cac2777a"), sku: 'kw-fs66-gre' }
I've created a search index with this definition:
{
  "analyzer": "lucene.standard",
  "searchAnalyzer": "lucene.standard",
  "mappings": {
    "dynamic": false,
    "fields": {
      "sku": {
        "type": "string"
      }
    }
  }
}
If I run this aggregation:
[{
  $search: {
    index: 'test',
    text: {
      query: 'kw-fs',
      path: 'sku'
    }
  }
}]
Why do I get 2 results? I only expected the one with sku: 'kw-fs66-gre' 😬
During indexing, the standard analyzer breaks the string "kw-lids-0009" into 3 tokens [kw][lids][0009], and similarly tokenizes "kw-fs66-gre" as [kw][fs66][gre]. When you query for "kw-fs", the same analyzer tokenizes the query as [kw][fs], and so Lucene matches on both documents, as both have the [kw] token in the index.
To get the behavior you're looking for, you should index the sku field as type autocomplete and use the autocomplete operator in your $search stage instead of text.
You're still getting 2 results because of the tokenization, i.e., you're still matching on [kw] in two documents. If you search for "fs66", you'll get a single match only. Results are scored based on relevance; they are not filtered. You can add {$project: {score: { $meta: "searchScore" }}} to your pipeline and see the difference in score between the matching documents.
If you are looking for exact matches only, you can look at using the keyword analyzer, or a custom analyzer that strips the dashes, so you deal with a single token per field instead of 3.
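A sketch of the two suggested changes together, written as Python dicts: the index definition maps sku as autocomplete instead of string, and the pipeline uses the autocomplete operator instead of text (index and field names mirror the question):

```python
# Index definition: map `sku` as autocomplete instead of string, so edge
# n-grams are indexed and prefix queries like "kw-fs" match as intended.
index_definition = {
    "mappings": {
        "dynamic": False,
        "fields": {
            "sku": {"type": "autocomplete"}
        }
    }
}

# Aggregation pipeline: the autocomplete operator replaces the text operator.
pipeline = [
    {
        "$search": {
            "index": "test",
            "autocomplete": {"query": "kw-fs", "path": "sku"}
        }
    }
]
```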

min_score excluding documents with higher scores

I have a trove of several million documents which I'm querying like this:
const query = {
  min_score: 1,
  query: {
    bool: {
      should: [
        {
          multi_match: {
            query: "David",
            fields: ["displayTitle^2", "synopsisList.text"],
            type: "phrase",
            slop: 2
          }
        },
        {
          nested: {
            path: "contributors",
            query: {
              multi_match: {
                query: "David",
                fields: [
                  "contributors.characterName",
                  "contributors.contributionBy.displayTitle"
                ],
                type: "phrase",
                slop: 2
              }
            },
            score_mode: "sum"
          }
        }
      ]
    }
  }
};
This query is giving sane looking results for a wide range of terms. However, it has a problem with "David" - and presumably others.
"David" crops up fairly regularly in the text. With the min_score option this query always returns 0 documents. When I remove min_score I get thousands of documents the best of which has a score of 22.749.
Does anyone know what I'm doing wrong? I guess min_score doesn't work the way I think it does.
Thanks
The problem I was trying to solve was that when I added some filter clauses to the above query, Elasticsearch would return all the documents that satisfied the filter, even those with a score of zero. That's how should works. I didn't realise that I could nest the should inside a must, which achieves the desired effect.
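The fix described above, sketched as a query body (the filter clause shown is a placeholder; the should clause is abbreviated from the original query):

```python
query = {
    "min_score": 1,
    "query": {
        "bool": {
            # Nesting the should inside a must means every hit has to match
            # at least one should clause, instead of matching on the filter
            # alone with a score of zero.
            "must": [
                {
                    "bool": {
                        "should": [
                            {
                                "multi_match": {
                                    "query": "David",
                                    "fields": ["displayTitle^2", "synopsisList.text"],
                                    "type": "phrase",
                                    "slop": 2
                                }
                            }
                        ]
                    }
                }
            ],
            # Placeholder for the filter clauses mentioned above.
            "filter": [
                {"term": {"someField": "someValue"}}
            ]
        }
    }
}
```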

How to exclude a large set of ids from elasticsearch result?

I have a lot of Products indexed in elasticsearch. I need to exclude a list of ids (that I am fetching from a SQL database) from a query in elasticsearch.
Suppose Products are stored as,
{
  "id": "1",
  "name": "shirt",
  "size": "xl"
}
We show a list of recommended products to a customer based on some algorithm using elasticsearch.
If a customer marks a product as 'Not Interested', we shouldn't show them that product again.
We keep such products in a separate SQL table with product_id, customer_id and status 'not_interested'.
Now while fetching recommendations for a customer on runtime, we get the list of 'not_interested' products from the SQL database, and send the array of product_ids in a not filter in elasticsearch to exclude them from recommendation.
But the problem arises, when the size of product_ids array becomes too large.
How should I store the product_id to customer_id mapping in elasticsearch, so as to filter out the 'not_interested' products at runtime using elasticsearch only?
Would it make sense to store them as nested objects or parent/child documents? Or is there some completely different way to store them, such that I can exclude some ids from the result efficiently?
You can exclude IDs (or any other literal strings) efficiently using a terms query.
Both Elasticsearch and Solr have this. It is very powerful and very efficient.
Elasticsearch has this with the ids query, which is in fact a terms query on the _uid field. Make sure you use this query in a must_not clause within a bool query. See: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-ids-query.html
In Solr you can use the terms query within a fq like fq=-{!terms f=id}doc334,doc125,doc777,doc321,doc253. Note the minus to indicate that it is a negation. See: http://yonik.com/solr-terms-query/
Use "ids" query:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-ids-query.html
{
  "query": {
    "ids": {
      "type": "my_type",
      "values": ["1", "4", "100"]
    }
  }
}
Wrapped inside a bool > must_not.
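Putting the two together, the complete exclusion query would look like this (built as a Python dict; type and ids mirror the snippet above):

```python
exclude_ids = ["1", "4", "100"]  # ids fetched from the SQL table

# ids query wrapped in bool > must_not: everything except the listed ids.
query = {
    "query": {
        "bool": {
            "must_not": [
                {"ids": {"type": "my_type", "values": exclude_ids}}
            ]
        }
    }
}
```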
Alternatively, add a terms query under the must_not section, like the following:
{
  "must_not": [
    {
      "terms": {
        "id": ["1", "3", "5"]
      }
    }
  ]
}

Elasticsearch Autocomplete on Specific field in Specific Document

I have Documents that contain many fields which are lists of values.
I would like to be able to autocomplete on one specific such field at a time, in one specific document, without data duplication (like Completion Suggesters).
For example, I would like to be able to autocomplete after 3 characters from the values in the category field of the document with id: '7'.
I tried to implement something based on this, but it doesn't seem to work on a list of values.
For filtering the suggestions by a field, you can add the fields to filter on in context.
"category":{
type: "completion",
payloads: false,
context: {
id: {
type: "category",
path: "id"
}
}
}
You can index the document as :
POST /myindex/myitem/1
{
  id: 123,
  category: {
    input: "my category",
    context: {
      id: 123
    }
  }
}
The minimum length check has to be applied on the client side. ES suggesters do not provide anything like that.
Now you can suggest on the category field with a filter on the id field.
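The suggest request itself would then filter by the id context, along these lines (built as a Python dict in the older context-suggester syntax matching the mapping above; the suggester name and input text are illustrative):

```python
# Suggest request: complete on `category`, restricted to the document
# whose id context is 123. The 3-character minimum is enforced by the
# client before sending this request, as noted above.
suggest_request = {
    "category_suggest": {
        "text": "my c",
        "completion": {
            "field": "category",
            "context": {"id": 123}
        }
    }
}
```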

How to define document ordering based on filter parameter

Hi Elasticsearch experts.
I have a problem which might be related to the fact that I am indexing relational DB data.
My scenario is the following:
I have two entities:
documents and meetings.
Documents and meetings are independent entities, although it is possible to assign documents to meetings in a given order.
We are using a join table for this in the DB.
meetings(id,name,date)
document(id,title,author)
meeting_document(doc_id,meeting_id,order)
In elasticsearch I am indexing the document ids as a nested property of the meeting.
meeting example:
{
  id: 25,
  name: "test",
  documents: [22, 12, 24, 55]
}
I will fetch the meeting, and after that I would like to send a request for the documents, filtering on document id and asking elasticsearch to return the list in the same order as the list of ids I passed to the filter.
What is the best way to implement this ?
Thanks
Nice question.
I've spent some time on this and come up with a solution. It might be a tricky one, but it works.
Let's have a look at my query.
I've used script_score for sorting by a user-defined list.
POST index/type/_search
{
  "query": {
    "function_score": {
      "functions": [
        {
          "script_score": {
            "script": "ar.size()-ar.indexOf(doc['docid'].value)",
            "params": {
              "ar": ["1", "2", "4", "3"]
            }
          }
        }
      ]
    }
  },
  "filter": {
    "terms": {
      "docid": ["1", "2", "4", "3"]
    }
  }
}
}
The thing you have to take care of is to send the same values in the filter and in params, as in the query above.
This returns hits with doc ids 1, 2, 4, 3.
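The scoring trick can be checked offline: for each hit the script computes the list size minus the id's position in the list, so earlier ids get higher scores and sorting by _score descending reproduces the requested order:

```python
ar = ["1", "2", "4", "3"]  # the user-defined order

def score(docid):
    # Mirrors the script: ar.size() - ar.indexOf(doc['docid'].value)
    return len(ar) - ar.index(docid)

# Hits come back from the terms filter in arbitrary order; sorting by
# score descending restores the order given in `ar`.
hits = sorted(["3", "1", "4", "2"], key=score, reverse=True)
# hits == ["1", "2", "4", "3"]
```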
You'll have to change the field name inside the script and in the filter, and you can use a term query inside the query object instead.
I've tested the code. Hope this helps!
Thanks