How can I prioritize documents in Elasticsearch query - elasticsearch

I have products in my index. Documents are basically structured like these:
{
"_id": "product",
"_source": {
...
"type": "product",
"id": 1,
"mainTaxon": {
"name": "T-SHIRT",
},
"attributes": [
{
"code": "name",
"name": "Name",
"value": [
"BANANA T-SHIRT"
],
"score": 50
},
]
}
},
{
"_id": "product",
"_source": {
...
"type": "product",
"id": 2,
"mainTaxon": {
"name": "JEANS",
},
"attributes": [
{
"code": "name",
"name": "Name",
"value": [
"BANANA JEANS"
],
"score": 50
},
]
}
}
}
When I search for 'BANANA' I would prioritize products with mainTaxon different from JEANS. So, every product with the mainTaxon name T_SHIRT or something else would be listed before products with mainTaxon JEANS.

You can use boosting query to prioritize documents
{
"query": {
"boosting": {
"positive": {
"match": {
"attributes.value": "banana"
}
},
"negative": {
"match": {
"mainTaxon.name": "JEANS"
}
},
"negative_boost": 0.5
}
}
}
Search Result will be
"hits": [
{
"_index": "67164768",
"_type": "_doc",
"_id": "1",
"_score": 0.5364054,
"_source": {
"type": "product",
"id": 1,
"mainTaxon": {
"name": "T-SHIRT"
},
"attributes": [
{
"code": "name",
"name": "Name",
"value": [
"BANANA T-SHIRT"
],
"score": 50
}
]
}
},
{
"_index": "67164768",
"_type": "_doc",
"_id": "2",
"_score": 0.32743764,
"_source": {
"type": "product",
"id": 2,
"mainTaxon": {
"name": "JEANS"
},
"attributes": [
{
"code": "name",
"name": "Name",
"value": [
"BANANA JEANS"
],
"score": 50
}
]
}
}
]

Related

ElasticSearch wildcard highlighting with hyphen

I am having trouble with wildcard query. When i have some hyphen - it does not highlight anything after it. I played with highlight settings but did not found any solution yet. Is it normal behavior?
I am making some index:
PUT testhighlight
PUT testhighlight/_mapping/_doc
{
"properties": {
"title": {
"type": "text",
"term_vector": "with_positions_offsets"
},
"content": {
"type": "text",
"term_vector": "with_positions_offsets"
}
}
}
Then i create documents:
PUT testhighlight/_doc/1
{
"title": "1",
"content": "test-input"
}
PUT testhighlight/_doc/2
{
"title": "2",
"content": "test input"
}
PUT testhighlight/_doc/3
{
"title": "3",
"content": "testinput"
}
Then i execute this search request:
GET testhighlight/_search
{
"query": {
"bool": {
"must": [
{
"query_string": {
"fields": [
"title",
"content"
],
"query": "test*"
}
}
]
}
},
"highlight": {
"fields": {
"content": {
"boundary_max_scan": 10,
"fragment_offset": 5,
"fragment_size": 250,
"type": "fvh",
"number_of_fragments": 5,
"order": "score",
"boundary_scanner": "word",
"post_tags": [
"</span>"
],
"pre_tags": [
"""<span class="highlight-search">"""
]
}
}
}
}
It returns these hits:
"hits": [
{
"_index": "testhighlight",
"_type": "_doc",
"_id": "2",
"_score": 1.0,
"_source": {
"title": "2",
"content": "test input"
},
"highlight": {
"content": [
"""<span class="highlight-search">test</span> input"""
]
}
},
{
"_index": "testhighlight",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"title": "1",
"content": "test-input"
},
"highlight": {
"content": [
"""<span class="highlight-search">test</span>-input"""
]
}
},
{
"_index": "testhighlight",
"_type": "_doc",
"_id": "3",
"_score": 1.0,
"_source": {
"title": "3",
"content": "testinput"
},
"highlight": {
"content": [
"""<span class="highlight-search">testinput</span>"""
]
}
}
]
It looks alright, but didn't highlighted the whole "test-input" in document with ID 1. Is there any way to do so?

Is it possible to get whole nested object from a document in Elasticsearch?

Imagine I have a document like this:
{
"_index": "bank-accounts",
"_type": "_doc",
"_id": "1",
"_version": 4,
"_seq_no": 3,
"_primary_term": 1,
"found": true,
"_source": {
"id": 1,
"balance": 140,
"transactions": [
{
"id": "42f52474-a49b-4707-86e4-e983efb4ab31",
"type": "Deposit",
"amount": 100
},
{
"id": "3f8396a3-d747-4a4c-8926-cdcedea6b5c3",
"type": "Deposit",
"amount": 50
},
{
"id": "5693585d-6356-4d1a-8d7b-cac5d0dab39f",
"type": "Withdraw",
"amount": 10
}
],
"accountCreatedAt": 1614029062764
}
}
I do want to return only the transactions array in a query.
How would I do this within Elasticsearch? Is this even possible? I've achieved a result using fields[ "transactions.*" ], but it returns each of the fields in separate arrays:
{
...
"hits": [
{
"_index": "bank-accounts",
"_type": "_doc",
"_id": "1",
"_score": 1,
"fields": {
"transactions.id": [
"42f52474-a49b-4707-86e4-e983efb4ab31",
"3f8396a3-d747-4a4c-8926-cdcedea6b5c3",
"5693585d-6356-4d1a-8d7b-cac5d0dab39f"
],
"transactions.amount": [
100,
50,
10
],
"transactions.type": [
"Deposit",
"Deposit",
"Withdraw"
],
...
}
}
]
}
}
I mean, I could very well be using this, but I want something more simple to handle. I expect to get something like this:
*I have to use the document id in my search
{
...
"hits": [
{
"_index": "bank-accounts",
"_type": "_doc",
"_id": "1",
"_score": 3,
"transactions": [
{
"id": "42f52474-a49b-4707-86e4-e983efb4ab31",
"type": "Deposit",
"amount": 100
},
{
"id": "3f8396a3-d747-4a4c-8926-cdcedea6b5c3",
"type": "Deposit",
"amount": 50
},
{
"id": "5693585d-6356-4d1a-8d7b-cac5d0dab39f",
"type": "Withdraw",
"amount": 10
},
....
]
}
]
}
}
Is this possible to achieve?
If you only want to return the transactions array (as you have not mentioned any query condition, on which you need to search), you can achieve that using source filtering.
Adding a working example
Index Mapping:
{
"mappings": {
"properties": {
"transactions": {
"type": "nested"
}
}
}
}
Index Data:
{
"id": 1,
"balance": 140,
"transactions": [
{
"id": "42f52474-a49b-4707-86e4-e983efb4ab31",
"type": "Deposit",
"amount": 100
},
{
"id": "3f8396a3-d747-4a4c-8926-cdcedea6b5c3",
"type": "Deposit",
"amount": 50
},
{
"id": "5693585d-6356-4d1a-8d7b-cac5d0dab39f",
"type": "Withdraw",
"amount": 10
}
],
"accountCreatedAt": 1614029062764
}
Search Query:
{
"_source": [
"transactions.*"
]
}
Search Result:
"hits": [
{
"_index": "66324257",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"transactions": [
{
"amount": 100,
"id": "42f52474-a49b-4707-86e4-e983efb4ab31",
"type": "Deposit"
},
{
"amount": 50,
"id": "3f8396a3-d747-4a4c-8926-cdcedea6b5c3",
"type": "Deposit"
},
{
"amount": 10,
"id": "5693585d-6356-4d1a-8d7b-cac5d0dab39f",
"type": "Withdraw"
}
]
}
}
]

Get only the matching values and corresponding fields from ElasticSearch

In elasticsearch, let's say I have documents like
{
"name": "John",
"department": "Biology",
"address": "445 Mount Eden Road"
},
{
"name": "Jane",
"department": "Chemistry",
"address": "32 Wilson Street"
},
{
"name": "Laura",
"department": "BioTechnology",
"address": "21 Greens Road"
},
{
"name": "Mark",
"department": "Physics",
"address": "Random UNESCO Bio-reserve"
}
There is a use-case where, if I type "bio" in a search bar, I should get the matching field-value(s) from elasticsearch along with the field name.
For this example,
Input: "bio"
Expected Output:
{
"field": "department",
"value": "Biology"
},
{
"field": "department",
"value": "BioTechnology"
},
{
"field": "address",
"value": "Random UNESCO Bio-reserve"
}
What type of query should I use? I can think of using NGram Tokenizer and then use match query. But, I am not sure how shall I get only the matching field value (not the entire document) and the corresponding field name as the output.
After reading further about Completion Suggesters and Context Suggesters, I could solve this problem in the following way:
1) Keep a separate "suggest" field for each record with type "completion" with context-mapping of type "category". The mapping I created looks like as follows:
{
"properties": {
"suggest": {
"type": "completion",
"contexts": [
{
"name": "field_type",
"type": "category",
"path": "cat"
}
]
},
"name": {
"type": "text"
},
"department": {
"type": "text"
},
"address": {
"type": "text"
}
}
}
2) Then I insert the records as shown below (adding search metadata to the "suggest" field with proper "context").
For example, to insert the first record, I execute the following:
POST: localhost:9200/test_index/test_type/1
{
"suggest": [
{
"input": ["john"],
"contexts": {
"field_type": ["name"]
}
},
{
"input": ["biology"],
"contexts": {
"field_type": ["department"]
}
},
{
"input": ["445 mount eden road"],
"contexts": {
"field_type": ["address"]
}
}
],
"name": "john",
"department": "biology",
"address": "445 mount eden road"
}
3) If we want to search terms occurring in the middle of a sentence (as the search-term "bio" occurs in middle of the address field in the 4th record, we can index the entry as follows:
POST: localhost:9200/test_index/test_type/4
{
"suggest": [
{
"input": ["mark"],
"contexts": {
"field_type": ["name"]
}
},
{
"input": ["physics"],
"contexts": {
"field_type": ["department"]
}
},
{
"input": ["random unesco bio-reserve", "bio-reserve"],
"contexts": {
"field_type": ["address"]
}
}
],
"name": "mark",
"department": "physics",
"address": "random unesco bio-reserve"
}
4) Then search for the keyword "bio" like this:
localhost:9200/test_index/test_type/_search
{
"_source": false,
"suggest": {
"suggestion" : {
"text" : "bio",
"completion" : {
"field" : "suggest",
"size": 10,
"contexts": {
"field_type": [ "name", "department", "address" ]
}
}
}
}
}
The response:
{
"hits": {
"total": 0,
"max_score": 0,
"hits": []
},
"suggest": {
"suggestion": [
{
"text": "bio",
"offset": 0,
"length": 3,
"options": [
{
"text": "bio-reserve",
"_index": "test_index",
"_type": "test_type",
"_id": "4",
"_score": 1,
"contexts": {
"field_type": [
"address"
]
}
},
{
"text": "biology",
"_index": "test_index",
"_type": "test_type",
"_id": "1",
"_score": 1,
"contexts": {
"field_type": [
"department"
]
}
},
{
"text": "biotechnology",
"_index": "test_index",
"_type": "test_type",
"_id": "3",
"_score": 1,
"contexts": {
"field_type": [
"department"
]
}
}
]
}
]
}
}
Can anyone please suggest any better approach?

Elasticsearch + Kibana + Alerting (X-Pack) For Energy Monitoring System

Can somebody help me with Alerting Via X-Pack for Energy monitoring system project? The main problem here is I can't collect the 'Value' data from the database, as I want to compare it later with the upper and the lower threshold.
So here is the index:
PUT /test-1
{
"mappings": {
"Test1": {
"properties": {
"Value": {
"type": "integer"
},
"date": {
"type": "date",
"format": "yyyy-MM-dd'T'HH:mm:ss.SSSZ"
},
"UpperThreshold": {
"type": "integer"
},
"LowerThreshold": {
"type": "integer"
}
}
}
}
}
Here is the example of the input:
POST /test-1/Test1
{
"Value": "500",
"date": "2017-06-13T16:20:00.000Z",
"UpperThreshold":"450",
"LowerThreshold": "380"
}
This is my alerting code
{
"trigger": {
"schedule": {
"interval": "10s"
}
},
"input": {
"search": {
"request": {
"search_type": "query_then_fetch",
"indices": [
"logs"
],
"types": [],
"body": {
"query": {
"match": {
"message": "error"
}
}
}
}
}
},
"condition": {
"compare": {
"ctx.payload.hits.total": {
"gt": 0
}
}
},
"actions": {
"send_email": {
"email": {
"profile": "standard",
"to": [
"<account#gmail.com>"
],
"subject": "Watcher Notification",
"body": {
"text": "{{ctx.payload.hits.total}} error logs found"
}
}
}
}
}
Here is the response I got from the alerting plugin
{
"watch_id": "Alerting-Test",
"state": "execution_not_needed",
"_status": {
"state": {
"active": true,
"timestamp": "2017-07-26T15:27:35.497Z"
},
"last_checked": "2017-07-26T15:27:38.625Z",
"actions": {
"logging": {
"ack": {
"timestamp": "2017-07-26T15:27:35.497Z",
"state": "awaits_successful_execution"
}
}
}
},
"trigger_event": {
"type": "schedule",
"triggered_time": "2017-07-26T15:27:38.625Z",
"schedule": {
"scheduled_time": "2017-07-26T15:27:38.175Z"
}
},
"input": {
"search": {
"request": {
"search_type": "query_then_fetch",
"indices": [
"test-1"
],
"types": [
"Test1"
],
"body": {
"query": {
"match_all": {}
}
}
}
}
},
"condition": {
"compare": {
"ctx.payload.hits.hits.0.Value": {
"gt": 450
}
}
},
"metadata": {
"name": "Alerting-Test"
},
"result": {
"execution_time": "2017-07-26T15:27:38.625Z",
"execution_duration": 0,
"input": {
"type": "search",
"status": "success",
"payload": {
"_shards": {
"total": 5,
"failed": 0,
"successful": 5
},
"hits": {
"hits": [
{
"_index": "test-1",
"_type": "Test1",
"_source": {
"date": "2017-07-22T12:00:00.000Z",
"LowerThreshold": "380",
"Value": "350",
"UpperThreshold": "450"
},
"_id": "AV1-1P3lArbJ1tbnct4e",
"_score": 1
},
{
"_index": "test-1",
"_type": "Test1",
"_source": {
"date": "2017-07-22T18:00:00.000Z",
"LowerThreshold": "380",
"Value": "4100",
"UpperThreshold": "450"
},
"_id": "AV1-1Sq0ArbJ1tbnct4v",
"_score": 1
},
{
"_index": "test-1",
"_type": "Test1",
"_source": {
"date": "2017-07-24T18:00:00.000Z",
"LowerThreshold": "380",
"Value": "450",
"UpperThreshold": "450"
},
"_id": "AV1-1eLJArbJ1tbnct6G",
"_score": 1
},
{
"_index": "test-1",
"_type": "Test1",
"_source": {
"date": "2017-07-23T00:00:00.000Z",
"LowerThreshold": "380",
"Value": "400",
"UpperThreshold": "450"
},
"_id": "AV1-1VUzArbJ1tbnct5A",
"_score": 1
},
{
"_index": "test-1",
"_type": "Test1",
"_source": {
"date": "2017-07-23T12:00:00.000Z",
"LowerThreshold": "380",
"Value": "390",
"UpperThreshold": "450"
},
"_id": "AV1-1X4FArbJ1tbnct5R",
"_score": 1
},
{
"_index": "test-1",
"_type": "Test1",
"_source": {
"date": "2017-07-23T18:00:00.000Z",
"LowerThreshold": "380",
"Value": "390",
"UpperThreshold": "450"
},
"_id": "AV1-1YySArbJ1tbnct5T",
"_score": 1
},
{
"_index": "test-1",
"_type": "Test1",
"_source": {
"date": "2017-07-26T00:00:00.000Z",
"LowerThreshold": "380",
"Value": "4700",
"UpperThreshold": "450"
},
"_id": "AV1-1mflArbJ1tbnct67",
"_score": 1
},
{
"_index": "test-1",
"_type": "Test1",
"_source": {
"date": "2017-07-26T06:00:00.000Z",
"LowerThreshold": "380",
"Value": "390",
"UpperThreshold": "450"
},
"_id": "AV1-1oluArbJ1tbnct7M",
"_score": 1
},
{
"_index": "test-1",
"_type": "Test1",
"_source": {
"date": "2017-07-21T12:00:00.000Z",
"LowerThreshold": "380",
"Value": "400",
"UpperThreshold": "450"
},
"_id": "AV1-1IrZArbJ1tbnct3r",
"_score": 1
},
{
"_index": "test-1",
"_type": "Test1",
"_source": {
"date": "2017-07-21T18:00:00.000Z",
"LowerThreshold": "380",
"Value": "440",
"UpperThreshold": "450"
},
"_id": "AV1-1LwzArbJ1tbnct38",
"_score": 1
}
],
"total": 20,
"max_score": 1
},
"took": 1,
"timed_out": false
},
"search": {
"request": {
"search_type": "query_then_fetch",
"indices": [
"test-1"
],
"types": [
"Test1"
],
"body": {
"query": {
"match_all": {}
}
}
}
}
},
"condition": {
"type": "compare",
"status": "success",
"met": false,
"compare": {
"resolved_values": {
**"ctx.payload.hits.hits.0.Value": null**
}
}
},
"actions": []
},
"messages": []
}
Really appreciate for your help!!

How to aggregate on nested objects in elasticsearch

I have the following mapping in ES:
"mappings": {
"products": {
"properties": {
"product": {
"type" : "nested",
"properties": {
"features": {
"type": "nested"
},
"sitedetails": {
"type": "nested"
}
}
}
}
}
}
and then 3 products like this:
"hits": [
{
"_index": "catalog",
"_type": "products",
"_id": "AVNE8F4mFYOWvB4rMqdO",
"_score": 1,
"_source": {
"product": {
"ean": "abc",
"features": {
"productType": "DVD player"
},
"color": "Black",
"manufacturer": "Sony",
"sitedetails": [
{
"name": "amazon.com",
"sku": "zzz",
"url": "http://www.amazon.com/dp/zzz"
}
],
"category": "Portable DVD Players"
}
}
},
{
"_index": "catalog",
"_type": "products",
"_id": "AVNE8XkXFYOWvB4rMqdQ",
"_score": 1,
"_source": {
"product": {
"ean": "def",
"features": {
"ProductType": "MP3 player"
},
"color": "Black",
"manufacturer": "LG",
"sitedetails": [
{
"name": "amazon.com",
"sku": "aaa",
"url": "http://www.amazon.com/dp/aaa"
}
],
"category": "MP3 Players"
}
}
},
{
"_index": "catalog",
"_type": "products",
"_id": "AVNIh-xVWwxj6Cz_r8AT",
"_score": 1,
"_source": {
"product": {
"ean": "abc",
"features": {
"productType": "DVD player"
},
"color": "White",
"manufacturer": "Sony",
"sitedetails": [
{
"name": "amazon.com",
"sku": "ggg",
"url": "http://www.amazon.com/dp/ggg"
}
],
"category": "Portable DVD Players"
}
}
}
]
I need to display on the UI side 2 filters, one for Manufacturer and one for website.
How can I aggregate on product.manufacturer and product.sitedetails.name?
tnx!
Figured it out:
GET /catalog/products/_search
{
"aggs": {
"byManufacturer": {
"nested": {
"path": "product"
},
"aggs": {
"byManufacturer": {
"terms": {
"field": "product.manufacturer"
}
}
}
},
"bySeller": {
"nested": {
"path": "product.sitedetails"
},
"aggs": {
"bySeller": {
"terms": {
"field": "product.sitedetails.name"
}
}
}
}
}
}

Resources