Item variants in Elasticsearch - elasticsearch

What is the best way to model item variants in Elasticsearch and retrieve only one item per variant group?
For example, let's say I have the following items:
[{
"sku": "abc-123",
"group": "abc",
"color": "red",
"price": 10
},
{
"sku": "def-123",
"group": "def",
"color": "red",
"price": 10
},
{
"sku": "abc-456",
"group": "abc",
"color": "black",
"price": 20
}
]
The first and last items are in the same group, so if I query for items below a price of 20 (for example), I want only one of them returned: the one with the best hit score.
Feel free to suggest a document design and queries accordingly.

If your mapping uses the nested datatype, you can retrieve them like this:
GET index/type/_search
{
"size": 2000,
"_source": false,
"query": {
"bool": {
"filter": {
"nested": {
"path": "childs",
"query": {
"bool": {
"filter": {
"term": {
"childs.group.keyword": "abc"
}
}
}
},
"inner_hits": {}
}
}
}
}
}
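For the stated goal of returning one item per variant group, field collapsing may be a closer fit than a nested mapping. The sketch below only builds the request body in Python. It assumes that group is available as a keyword field (named group.keyword here, which is what default dynamic mapping would typically produce) and that price is numeric. The helper name and its parameters are made up for illustration.

```python
import json


def build_collapsed_search(max_price, group_field="group.keyword", scoring_query=None):
    """Build a search body that returns at most one hit per variant group."""
    bool_query = {
        # Non-scoring restriction: price strictly below max_price.
        "filter": [{"range": {"price": {"lt": max_price}}}]
    }
    if scoring_query is not None:
        # Optional scoring clause, so the best-scoring variant represents its group.
        bool_query["must"] = [scoring_query]
    return {
        "query": {"bool": bool_query},
        # Field collapsing keeps only the top-sorted hit for each distinct value.
        "collapse": {"field": group_field},
    }


body = build_collapsed_search(20, scoring_query={"match": {"color": "red"}})
print(json.dumps(body, indent=2))
```

The printed body can be sent to the _search endpoint of the index. Each returned hit is then the top-sorted document of its group, by _score unless a sort is specified.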

Related

How can we sort records by a specific value of a field in Elasticsearch

We want to sort records by a specific value of a field, for example:
We have data with country code, name and other details, and we want records with country code 'US' shown at the top, followed by the results with country code 'AR'.
So if we search for obama, all obama records from US should come first, followed by the obama records from AR. Within each country, we also want to sort the records by a rating score.
I am trying a filter query with boost but not getting the expected data: the filter only returns the filtered records, whereas we want to sort the records based on a boost for specific values of the country field.
{
"query": {
"bool": {
"must": [
{
"bool": {
"must": [
{
"match_phrase_prefix": {
"name": {
"query": "obama"
}
}
}
],
"boost": 2.0
}
}
],
"filter": {
"bool": {
"should": [
{
"term": {
"countryCode": {
"value": "US",
"boost": 4
}
}
},
{
"term": {
"countryCode": {
"value": "AR",
"boost": 3
}
}
},
{
"term": {
"countryCode": {
"value": "ES",
"boost": 2
}
}
}
]
}
}
}
},
"size": 50,
"sort": [
{
"rating": {
"order": "desc"
}
},
{
"_score": {
"order": "desc"
}
}
]
}
Expectation:
All records belonging to country US should appear at the top, sorted by rating.
All records belonging to country AR should appear after the US records, sorted by rating.
All records belonging to country ES should appear after the AR records, sorted by rating.
Expected example:
[
{name:"obama a", countryCode:us, rating:5}
{name:"obama b", countryCode:us, rating:4}
{name:"obama ac", countryCode:ar, rating:3}
{name:"obama ess", countryCode:es, rating:3.5}
]
If you want to tune the score without dropping documents, you can use should instead of filter.
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html
must
The clause (query) must appear in matching documents and will
contribute to the score.
filter
The clause (query) must appear in matching documents. However unlike
must the score of the query will be ignored. Filter clauses are
executed in filter context, meaning that scoring is ignored and
clauses are considered for caching.
should
The clause (query) should appear in the matching document.
must_not
The clause (query) must not appear in the matching documents. Clauses
are executed in filter context meaning that scoring is ignored and
clauses are considered for caching. Because scoring is ignored, a
score of 0 for all documents is returned.
Here is an example:
POST test_stackoverflow_us/_bulk?refresh=true&pretty
{ "index": {}}
{"name":"obama a", "countryCode":"us", "rating":5}
{ "index": {}}
{"name":"obama b", "countryCode":"us", "rating":4}
{ "index": {}}
{"name":"obama ac", "countryCode":"ar", "rating":3}
{ "index": {}}
{"name":"obama ess", "countryCode":"es", "rating":3.5}
GET test_stackoverflow_us/_search
{
"query": {
"bool": {
"must": [
{
"bool": {
"must": [
{
"match_phrase_prefix": {
"name": {
"query": "obama"
}
}
}
],
"boost": 2
}
}
],
"should": [
{
"term": {
"countryCode": {
"value": "us",
"boost": 4
}
}
},
{
"term": {
"countryCode": {
"value": "ar",
"boost": 3
}
}
},
{
"term": {
"countryCode": {
"value": "es",
"boost": 2
}
}
}
]
}
},
"size": 50,
"sort": [
{
"rating": {
"order": "desc"
}
},
{
"_score": {
"order": "desc"
}
}
]
}
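Boosted term clauses still produce relevance scores that depend on index statistics, so boosts alone may not give strict country tiers. One possible variation, sketched below in Python, wraps each country clause in constant_score so that _score equals the tier boost, moves the name match into filter context, and sorts by _score first and rating second. The helper names are hypothetical, and the lowercase country codes assume the values are indexed as in the bulk example above.

```python
import json

# Assumed tiers: a higher boost means an earlier tier in the results.
COUNTRY_BOOSTS = {"us": 4, "ar": 3, "es": 2}


def build_country_tier_query(name_prefix, country_boosts=COUNTRY_BOOSTS):
    """Build a search body whose _score equals the country tier boost."""
    should = [
        {"constant_score": {"filter": {"term": {"countryCode": code}}, "boost": boost}}
        for code, boost in country_boosts.items()
    ]
    return {
        "query": {
            "bool": {
                # The name match only restricts results; it adds nothing to _score.
                "filter": [{"match_phrase_prefix": {"name": {"query": name_prefix}}}],
                "should": should,
            }
        },
        "size": 50,
        # Country tier first, rating as the order within each tier.
        "sort": [{"_score": {"order": "desc"}}, {"rating": {"order": "desc"}}],
    }


def rank_locally(docs, country_boosts=COUNTRY_BOOSTS):
    """Mimic the intended ordering in plain Python: tier first, then rating."""
    return sorted(
        docs,
        key=lambda d: (country_boosts.get(d["countryCode"], 0), d["rating"]),
        reverse=True,
    )


docs = [
    {"name": "obama ess", "countryCode": "es", "rating": 3.5},
    {"name": "obama b", "countryCode": "us", "rating": 4},
    {"name": "obama ac", "countryCode": "ar", "rating": 3},
    {"name": "obama a", "countryCode": "us", "rating": 5},
]
print([d["name"] for d in rank_locally(docs)])
print(json.dumps(build_country_tier_query("obama"), indent=2))
```

rank_locally is only a local stand-in to check the intended order against the expected example. The actual ordering is done by Elasticsearch using the sort in the request body.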

Is there a way to build an Elastic query with changing search values?

I want to use Elastic in PHP to process search requests from my website. For example, I have the search parameters
name
age
height
weight.
But it should not be necessary to always search on all parameters.
So it could be that only (name AND age) have values and (height AND weight) do not.
Is there a way to build one query with flexible/changing input values?
The query below would not work when there are no search values for (height AND weight).
{
"query": {
"bool": {
"should": [
{ "match": { "name.keyword": "Anna" } },
{ "match": { "age": "30" } },
{ "match": { "height": "180" } },
{ "match": { "weight": "70" } }
]
}
}
}
Search templates to the rescue:
POST _scripts/my-search-template
{
"script": {
"lang": "mustache",
"source": """
{
"query": {
"bool": {
"should": [
{{#name}}
{ "match": { "name.keyword": "{{name}}" } },
{{/name}}
{{#age}}
{ "match": { "age": "{{age}}" } },
{{/age}}
{{#height}}
{ "match": { "height": "{{height}}" } },
{{/height}}
{{#weight}}
{ "match": { "weight": "{{weight}}" } },
{{/weight}}
{ "match_none": { } }
]
}
}
}
"""
}
}
Note that since you don't know how many criteria you will have, the last condition is always false and is only there to make sure the JSON is valid (i.e. so that the last comma isn't left dangling).
You can then run your query like this:
POST my-index/_search/template
{
"id": "my-search-template",
"params": {
"name": "Anna",
"age": 30
}
}
You need to handle this in the application that constructs your Elasticsearch query. It is easy to do there, because you know which search parameter values you received from the UI: include a field in the query only if its value is not null.
Elasticsearch doesn't support if...else-like conditions in a query.
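As a rough illustration of building the query in the application, here is a minimal Python sketch (the question uses PHP, but the same logic carries over). The function name, the field mapping, and the match_all fallback for an empty search are assumptions made for this example, not part of the original answers.

```python
import json

# Maps request parameter names to the index fields used in the question.
FIELD_MAP = {
    "name": "name.keyword",
    "age": "age",
    "height": "height",
    "weight": "weight",
}


def build_person_query(params):
    """Create a bool/should query from the parameters that actually have values."""
    clauses = [
        {"match": {FIELD_MAP[key]: value}}
        for key, value in params.items()
        # Skip unknown parameters and parameters without a value.
        if key in FIELD_MAP and value not in (None, "")
    ]
    if not clauses:
        # Assumption: with no criteria at all, return every document.
        return {"query": {"match_all": {}}}
    return {"query": {"bool": {"should": clauses}}}


query = build_person_query({"name": "Anna", "age": 30, "height": None, "weight": ""})
print(json.dumps(query, indent=2))
```

Only name and age end up in the should list here, because height and weight carry no value.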
TL;DR
There are multiple ways to address your problem in Elasticsearch.
You could play with the minimum_should_match parameter.
You could use template queries with conditions.
You could write more complex bool queries that enumerate the possible matches.
You could also use scripts to program the logic you want.
Minimum should match
POST /_bulk
{"index":{"_index":"73121817"}}
{"name": "ana", "age": 1, "height": 180, "weight": 70}
{"index":{"_index":"73121817"}}
{"name": "jack", "height": 180, "weight": 70}
{"index":{"_index":"73121817"}}
{"name": "emma", "age": 1, "weight": 70}
{"index":{"_index":"73121817"}}
{"name": "william", "age": 1, "height": 180}
{"index":{"_index":"73121817"}}
{"name": "jenny", "weight": 70}
{"index":{"_index":"73121817"}}
{"name": "marco", "age": 1}
{"index":{"_index":"73121817"}}
{"name": "giulia", "height": 180}
{"index":{"_index":"73121817"}}
{"name": "paul"}
GET 73121817/_search
{
"query": {
"bool": {
"should": [
{ "match": { "name.keyword": "Anna" } },
{ "match": { "age": "30" } },
{ "match": { "height": "180" } },
{ "match": { "weight": "70" } }
],
"minimum_should_match": 2
}
}
}
With minimum_should_match set to 2, only two documents are returned: ana and jack.
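To see why exactly those two documents qualify, the clause counting can be mimicked locally. This is a simplified Python model rather than how Elasticsearch evaluates the query: each match clause is treated as an exact equality check, and the name.keyword clause is modeled as equality on name.

```python
# The eight documents from the bulk request above.
docs = [
    {"name": "ana", "age": 1, "height": 180, "weight": 70},
    {"name": "jack", "height": 180, "weight": 70},
    {"name": "emma", "age": 1, "weight": 70},
    {"name": "william", "age": 1, "height": 180},
    {"name": "jenny", "weight": 70},
    {"name": "marco", "age": 1},
    {"name": "giulia", "height": 180},
    {"name": "paul"},
]

# The four should clauses from the query, as field -> searched value.
clauses = {"name": "Anna", "age": 30, "height": 180, "weight": 70}


def matching_names(docs, clauses, minimum_should_match):
    """Return names of documents that satisfy at least N of the clauses."""
    hits = []
    for doc in docs:
        matched = sum(1 for field, value in clauses.items() if doc.get(field) == value)
        if matched >= minimum_should_match:
            hits.append(doc["name"])
    return hits


print(matching_names(docs, clauses, 2))  # ['ana', 'jack']
```

Both ana and jack match on height and weight only. Every other document satisfies at most one clause.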
Template queries
Val's answer covers this quite completely.
You could also refer to the documentation.
Complex queries
Refer to the SO post behind the link.
Scripted queries
GET 73121817/_search
{
"query": {
"bool": {
"filter": {
"script": {
"script": """
return (!doc["name.keyword"].empty && !doc["age"].empty);
"""
}
}
}
}
}

Group by terms and get count of nested array property?

I would like to get the count from a document series where an array item matches some value.
I have documents like these:
{
"Name": "jason",
"Todos": [{
"State": "COMPLETED",
"Timer": 10
},{
"State": "PENDING",
"Timer": 5
}]
}
{
"Name": "jason",
"Todos": [{
"State": "COMPLETED",
"Timer": 5
},{
"State": "PENDING",
"Timer": 2
}]
}
{
"Name": "martin",
"Todos": [{
"State": "COMPLETED",
"Timer": 15
},{
"State": "PENDING",
"Timer": 10
}]
}
I would like to count how many documents have at least one Todo with State COMPLETED, grouped by Name.
So from the above I would need to get:
jason: 2
martin: 1
Usually I do this with a terms aggregation on Name, and another sub-aggregation for the other items:
"aggs": {
"statistics": {
"terms": {
"field": "Name"
},
"aggs": {
"test": {
"filter": {
"bool": {
"must": [{
"match_phrase": {
"SomeProperty.keyword": {
"query": "THEVALUE"
}
}
}
]
}
},
But not sure how to do this here as I have items in an array.
Elasticsearch has no problem with arrays because in fact it flattens them by default:
Arrays of inner object fields do not work the way you may expect. Lucene has no concept of inner objects, so Elasticsearch flattens object hierarchies into a simple list of field names and values.
So a query like the one you posted will do. I would use term query for keyword datatype, though:
POST mytodos/_search
{
"size": 0,
"aggs": {
"by name": {
"terms": {
"field": "Name"
},
"aggs": {
"how many completed": {
"filter": {
"term": {
"Todos.State": "COMPLETED"
}
}
}
}
}
}
}
I am assuming your mapping looks something like this:
PUT mytodos/_mappings
{
"properties": {
"Name": {
"type": "keyword"
},
"Todos": {
"properties": {
"State": {
"type": "keyword"
},
"Timer": {
"type": "integer"
}
}
}
}
}
The example documents that you posted will be transformed internally into something like this:
{
"Name": "jason",
"Todos.State": ["COMPLETED", "PENDING"],
"Todos.Timer": [10, 5]
}
However, if you need to query on Todos.State and Todos.Timer together, for example to filter for "COMPLETED" items with Timer > 10 only, that will not be possible with this mapping, because Elasticsearch forgets the link between the fields of each object in the array.
In that case you would need to use something like the nested datatype for such arrays, and query them with the special nested query.
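The loss of association can be illustrated with a small local model. The Python sketch below is not Elasticsearch code: flatten imitates how an object array is stored as per-field value lists, and the two match functions contrast flattened semantics with per-item (nested-like) semantics. The sample document is hypothetical and chosen so that the two disagree.

```python
def flatten(doc):
    """Imitate object-array flattening: one list of values per dotted field name."""
    flat = {"Name": doc["Name"]}
    for todo in doc["Todos"]:
        for key, value in todo.items():
            flat.setdefault(f"Todos.{key}", []).append(value)
    return flat


def flattened_match(flat, state, min_timer):
    """Each condition is checked against the whole list, independently."""
    return state in flat["Todos.State"] and any(t > min_timer for t in flat["Todos.Timer"])


def nested_match(doc, state, min_timer):
    """Both conditions must hold for the same array item."""
    return any(t["State"] == state and t["Timer"] > min_timer for t in doc["Todos"])


# Hypothetical document: the COMPLETED item has a small timer, the PENDING one a large timer.
doc = {
    "Name": "jason",
    "Todos": [
        {"State": "COMPLETED", "Timer": 5},
        {"State": "PENDING", "Timer": 15},
    ],
}

flat = flatten(doc)
print(flat)
print(flattened_match(flat, "COMPLETED", 10))  # True: a false positive
print(nested_match(doc, "COMPLETED", 10))      # False: no single item satisfies both
```

The flattened check succeeds because COMPLETED and 15 both occur somewhere in the document, even though they belong to different Todos.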
Hope that helps!

Can one document come into two buckets?

In Elasticsearch, I have a list of documents, and each document contains a field type (possible values are 1, 2, 3, 4, 5). Now I want to create two buckets:
one containing the documents whose type field is 1, and
one containing all documents (including type 1).
Is this possible in Elasticsearch? If yes, how?
I searched the internet but did not find anything helpful.
Following is the document structure:
"_source": { "city": "Ahmadabad",
"pId": "A1332605",
"sellerType": 1,
"seller": "Dealer",
"makeId": 7,
"makeName": "ABC",
"modelId": 673,
"type": 1
},
"_source": { "city": "Surat",
"pId": "A265843",
"sellerType": 1,
"seller": "Dealer",
"makeId": 45,
"makeName": "XYZ",
"modelId": 520,
"type": 2
}
I copied this request from a visualization that Kibana generated; it should work just the same. I picked one of your integer fields; change it if you need something else.
{
"query": {
// your query
},
"size": 0,
"_source": {
"excludes": []
},
"aggs": {
"2": {
"filters": {
"filters": {
"filter_for_specific": {
"query_string": {
"query": "sellerType: 1",
"analyze_wildcard": true
}
},
"filter_for_existing": {
"query_string": {
"query": "sellerType: *",
"analyze_wildcard": true
}
}
}
}
}
}
}
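The same idea can be written against the type field from the question. The Python sketch below builds a filters aggregation with one term bucket and one match_all bucket, then counts the buckets locally to show that a document may land in both. The bucket names and helper functions are made up for this example.

```python
import json


def build_overlapping_buckets(field="type", value=1):
    """Build a filters aggregation whose buckets are allowed to overlap."""
    return {
        "size": 0,
        "aggs": {
            "buckets": {
                "filters": {
                    "filters": {
                        # Bucket 1: only documents where the field equals the value.
                        "only_value": {"term": {field: value}},
                        # Bucket 2: every document, including those in bucket 1.
                        "all_documents": {"match_all": {}},
                    }
                }
            }
        },
    }


def count_locally(docs, field="type", value=1):
    """Plain-Python stand-in for the doc_count each bucket would report."""
    return {
        "only_value": sum(1 for d in docs if d.get(field) == value),
        "all_documents": len(docs),
    }


# The two documents from the question, reduced to the relevant fields.
docs = [{"pId": "A1332605", "type": 1}, {"pId": "A265843", "type": 2}]
print(count_locally(docs))
print(json.dumps(build_overlapping_buckets(), indent=2))
```

Each filter in a filters aggregation is evaluated independently, so the type 1 document is counted in both buckets.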

Elasticsearch query fails to return results when querying a nested object

I have an object which looks something like this:
{
"id": 123,
"language_id": 1,
"label": "Pablo de la Pena",
"office": {
"count": 2,
"data": [
{
"id": 1234,
"is_office_lead": false,
"office": {
"id": 1,
"address_line_1": "123 Main Street",
"address_line_2": "London",
"address_line_3": "",
"address_line_4": "UK",
"address_postcode": "E1 2BC",
"city_id": 1
}
},
{
"id": 5678,
"is_office_lead": false,
"office": {
"id": 2,
"address_line_1": "77 High Road",
"address_line_2": "Edinburgh",
"address_line_3": "",
"address_line_4": "UK",
"address_postcode": "EH1 2DE",
"city_id": 2
}
}
]
},
"primary_office": {
"id": 1,
"address_line_1": "123 Main Street",
"address_line_2": "London",
"address_line_3": "",
"address_line_4": "UK",
"address_postcode": "E1 2BC",
"city_id": 1
}
}
My Elasticsearch mapping looks like this:
"mappings": {
"item": {
"properties": {
"office": {
"properties": {
"data": {
"type": "nested"
}
}
}
}
}
}
My Elasticsearch query looks something like this:
GET consultant/item/_search
{
"from": 0,
"size": 24,
"query": {
"bool": {
"must": [
{
"term": {
"language_id": 1
}
},
{
"term": {
"office.data.office.city_id": 1
}
}
]
}
}
}
This returns zero results, however, if I remove the second term and leave it only with the language_id clause, then it works as expected.
I'm sure this is down to a misunderstanding on my part of how the nested object is flattened, but I'm out of ideas. I've tried all kinds of permutations of the query and mappings.
Any guidance hugely appreciated. I am using Elasticsearch 6.1.1.
I'm not sure whether you need the entire record. This solution returns every record that has language_id: 1 and an office.data.office.city_id: 1 value.
GET consultant/item/_search
{
"from": 0,
"size": 100,
"query": {
"bool":{
"must": [
{
"term": {
"language_id": {
"value": 1
}
}
},
{
"nested": {
"path": "office.data",
"query": {
"match": {
"office.data.office.city_id": 1
}
}
}
}
]
}
}
}
I put three different records in my test index to guard against false hits: the matching one, one with a different language_id, and one with different office ids. Only the matching one was returned.
If you only need the office data, then that's a bit different but still solvable.
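One way this might look, sketched in Python as a request-body builder: the nested clause carries inner_hits, so each hit lists the office.data entries that matched, and the top-level _source is switched off. The function name is hypothetical, and whether this response shape suits the use case is an assumption.

```python
import json


def build_office_search(language_id, city_id):
    """Build a search body that returns only the matching nested office entries."""
    return {
        # Suppress the full parent document; matched offices arrive via inner_hits.
        "_source": False,
        "query": {
            "bool": {
                "filter": [{"term": {"language_id": language_id}}],
                "must": [
                    {
                        "nested": {
                            "path": "office.data",
                            "query": {"term": {"office.data.office.city_id": city_id}},
                            # Ask for the nested objects that satisfied the query.
                            "inner_hits": {},
                        }
                    }
                ],
            }
        },
    }


print(json.dumps(build_office_search(1, 1), indent=2))
```

In the response, the matching offices would then appear under inner_hits of each hit, keyed by the nested path.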

Resources