ElasticSearch | SimpleQueryString | FieldBoosting is not working as expected


Mapping:
{
"s_q_s" : {
"mappings" : {
"properties" : {
"f1" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"f2" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"f3" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}
Documents:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "s_q_s",
"_type" : "_doc",
"_id" : "yArwuXQBrACLjhbhLKPa",
"_score" : 1.0,
"_source" : {
"f1" : "major",
"f2" : "general",
"f3" : "ram"
}
},
{
"_index" : "s_q_s",
"_type" : "_doc",
"_id" : "yQrwuXQBrACLjhbhT6OJ",
"_score" : 1.0,
"_source" : {
"f1" : "general",
"f2" : "major",
"f3" : "ram"
}
},
{
"_index" : "s_q_s",
"_type" : "_doc",
"_id" : "ygrxuXQBrACLjhbhi6Op",
"_score" : 1.0,
"_source" : {
"f1" : "general",
"f2" : "major major major",
"f3" : "ram"
}
},
{
"_index" : "s_q_s",
"_type" : "_doc",
"_id" : "ywrxuXQBrACLjhbhuKME",
"_score" : 1.0,
"_source" : {
"f1" : "general",
"f2" : "major major",
"f3" : "ram"
}
}
]
}
}
Query:
GET s_q_s/_search
{
"query": {
"simple_query_string": {
"query": "major",
"fields": ["f1","f2^2"] // f2 is twice as important as f1
}
}
}
Result:
"hits" : [
{
"_index" : "s_q_s",
"_type" : "_doc",
"_id" : "yArwuXQBrACLjhbhLKPa",
"_score" : 1.0,
"_source" : {
"f1" : "major",
"f2" : "general",
"f3" : "ram"
}
},
{
"_index" : "s_q_s",
"_type" : "_doc",
"_id" : "yQrwuXQBrACLjhbhT6OJ",
"_score" : 1.0,
"_source" : {
"f1" : "general",
"f2" : "major",
"f3" : "ram"
}
},
{
"_index" : "s_q_s",
"_type" : "_doc",
"_id" : "ygrxuXQBrACLjhbhi6Op",
"_score" : 1.0,
"_source" : {
"f1" : "general",
"f2" : "major major major",
"f3" : "ram"
}
},
{
"_index" : "s_q_s",
"_type" : "_doc",
"_id" : "ywrxuXQBrACLjhbhuKME",
"_score" : 1.0,
"_source" : {
"f1" : "general",
"f2" : "major major",
"f3" : "ram"
}
}
]
Docs Excerpt:
You also can boost relevance scores for matches to particular fields using a caret (^) notation
Question:
Why is the document with "major" in f1 coming out on top rather than the ones with "major" in f2, when I have defined at query time that f2 is twice as important as f1?

According to documentation on boost, individual fields can be boosted automatically — count more towards
the relevance score — at query time, with the boost parameter
Search Query without boost:
{
"query": {
"simple_query_string": {
"fields": [
"f1",
"f2"
],
"query": "major"
}
}
}
Search Result:
"hits": [
{
"_index": "stof_64023501",
"_type": "_doc",
"_id": "1",
"_score": 1.2039728, <-- note this
"_source": {
"f1": "major",
"f2": "general",
"f3": "ram"
}
},
{
"_index": "stof_64023501",
"_type": "_doc",
"_id": "3",
"_score": 0.48608798, <-- note this
"_source": {
"f1": "general",
"f2": "major major major",
"f3": "ram"
}
}
]
Search Query with boost 2:
Matches on the f2 field now have twice the weight of those on the f1 field, but the score of the f1 match is still higher than that of the f2 match.
In the previous search query, the score of the document matching on f2 was 0.48608798. With a boost of 2 applied, it becomes 0.48608798 * 2 = 0.97217596.
That is still less than the score of the document matching on f1, since 0.97217596 < 1.2039728.
{
"query": {
"simple_query_string": {
"fields": [
"f1",
"f2^2"
],
"query": "major"
}
}
}
Search Result:
"hits": [
{
"_index": "stof_64023501",
"_type": "_doc",
"_id": "1",
"_score": 1.2039728, <-- note this
"_source": {
"f1": "major",
"f2": "general",
"f3": "ram"
}
},
{
"_index": "stof_64023501",
"_type": "_doc",
"_id": "3",
"_score": 0.97217596, <-- note this
"_source": {
"f1": "general",
"f2": "major major major",
"f3": "ram"
}
}
]
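The scores above can be reproduced by hand, which shows where the gap comes from. The sketch below assumes the default BM25 similarity (k1 = 1.2, b = 0.75), a single shard, and the formula that Elasticsearch 7.x prints in its explain output. The function name bm25 and its parameters are illustrative, not part of any Elasticsearch API. "major" occurs in only one of the four f1 values but in three of the four f2 values, so its inverse document frequency is much higher in f1, and a boost of 2 on f2 is not enough to make up the difference.

```python
import math

def bm25(tf, doc_len, avg_len, n_docs, doc_freq, boost=1.0, k1=1.2, b=0.75):
    """Score of one term in one field (BM25 with assumed default parameters)."""
    # Rare terms get a high idf, common terms a low one.
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Term frequency saturates and is normalised by field length.
    tf_norm = tf / (tf + k1 * (1 - b + b * doc_len / avg_len))
    return boost * (k1 + 1) * idf * tf_norm

# f1 values: major, general, general, general
# -> "major" is in 1 of 4 docs, average field length is 1 token
f1_doc1 = bm25(tf=1, doc_len=1, avg_len=1.0, n_docs=4, doc_freq=1)

# f2 values: general, major, major major major, major major
# -> "major" is in 3 of 4 docs, average field length is 7 / 4 tokens
f2_doc3 = bm25(tf=3, doc_len=3, avg_len=1.75, n_docs=4, doc_freq=3)
f2_doc3_boost2 = bm25(tf=3, doc_len=3, avg_len=1.75, n_docs=4, doc_freq=3, boost=2)

print(round(f1_doc1, 4), round(f2_doc3, 4), round(f2_doc3_boost2, 4))
```

The three values agree with the _score fields in the results above to several decimal places. The boost only multiplies the field score, so f2 needs a boost larger than 1.2039728 / 0.48608798 (roughly 2.5) before this document overtakes the f1 match.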
Search Query with boost 3:
With f2 boosted by 3, the f2 scores become high enough (0.48608798 * 3 = 1.4582639) to exceed 1.2039728, so the documents matching on f2 now rank first, as you require.
{
"query": {
"simple_query_string": {
"fields": [
"f1",
"f2^3"
],
"query": "major"
}
}
}
Search Result:
"hits": [
{
"_index": "stof_64023501",
"_type": "_doc",
"_id": "3",
"_score": 1.4582639,
"_source": {
"f1": "general",
"f2": "major major major",
"f3": "ram"
}
},
{
"_index": "stof_64023501",
"_type": "_doc",
"_id": "4",
"_score": 1.4144535,
"_source": {
"f1": "general",
"f2": "major major",
"f3": "ram"
}
},
{
"_index": "stof_64023501",
"_type": "_doc",
"_id": "2",
"_score": 1.2975104,
"_source": {
"f1": "general",
"f2": "major",
"f3": "ram"
}
},
{
"_index": "stof_64023501",
"_type": "_doc",
"_id": "1",
"_score": 1.2039728,
"_source": {
"f1": "major",
"f2": "general",
"f3": "ram"
}
}
]

Related

Search as per the relevance in Elastic Search

We are trying to apply a fuzzy search on zip codes using the following analyzer:
PUT test_index
{
"settings": {
"index": {
"max_ngram_diff": 40
},
"analysis": {
"analyzer": {
"autocomplete": {
"tokenizer": "whitespace",
"filter": [
"lowercase",
"autocomplete"
]
},
"autocomplete_search": {
"tokenizer": "whitespace",
"filter": [
"lowercase"
]
}
},
"filter": {
"autocomplete": {
"type": "ngram",
"min_gram": 2,
"max_gram": 40
}
}
}
},
"mappings": {
"properties": {
"zipcode": {
"type": "text",
"analyzer": "autocomplete",
"search_analyzer": "autocomplete_search"
}
}
}
}
Sample data in the index is as follows
PUT test_index/_doc/1
{ "zipcode": "01103" }
PUT test_index/_doc/2
{ "zipcode": "01104" }
PUT test_index/_doc/3
{ "zipcode": "11010" }
PUT test_index/_doc/4
{ "zipcode": "11016" }
PUT test_index/_doc/5
{ "zipcode": "11020" }
PUT test_index/_doc/6
{ "zipcode": "01107" }
PUT test_index/_doc/7
{ "zipcode": "11024" }
PUT test_index/_doc/8
{ "zipcode": "04110" }
Search query used on zipcode field is as follows :
GET test_index/_search
{
"query": {
"match": {
"zipcode": {
"query": "110",
"operator": "and"
}
}
}
}
We expected the results to be ordered by relevance, like this:
11010
11020
11024
11016
01103
01104
01107
But the results are actually returned in the order below. How can we boost the documents starting with 110 so that they appear first?
{
"_index" : "test_index",
"_type" : "_doc",
"_id" : "4",
"_score" : 0.45532414,
"_source" : {
"zipcode" : "11016"
}
},
{
"_index" : "test_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.45532414,
"_source" : {
"zipcode" : "01103"
}
},
{
"_index" : "test_index",
"_type" : "_doc",
"_id" : "3",
"_score" : 0.2885665,
"_source" : {
"zipcode" : "11010"
}
},
{
"_index" : "test_index",
"_type" : "_doc",
"_id" : "5",
"_score" : 0.2885665,
"_source" : {
"zipcode" : "11020"
}
},
{
"_index" : "test_index",
"_type" : "_doc",
"_id" : "7",
"_score" : 0.2885665,
"_source" : {
"zipcode" : "11024"
}
},
{
"_index" : "test_index",
"_type" : "_doc",
"_id" : "8",
"_score" : 0.2885665,
"_source" : {
"zipcode" : "04110"
}
},
{
"_index" : "test_index",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.2885665,
"_source" : {
"zipcode" : "01104"
}
},
{
"_index" : "test_index",
"_type" : "_doc",
"_id" : "6",
"_score" : 0.2885665,
"_source" : {
"zipcode" : "01107"
}
}
The following query gives the expected order, but I am not sure which one I should use for my use case:
GET test_index/_search
{
"query": {
"bool": {
"should": [
{
"match": {
"zipcode": "110"
}
},
{
"match_phrase_prefix": {
"zipcode": "110"
}
}
]
}
}
}
vs
GET test_index/_search
{
"query": {
"prefix" : { "zipcode" : "110" }
}
}
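It may help to see which tokens the autocomplete analyzer stores for each zip code. The sketch below imitates the ngram token filter (min_gram 2, max_gram 40) in plain Python; the helper ngrams is hypothetical and only approximates what Elasticsearch emits. Every sample zip code produces the token 110, which is why the match query returns all eight documents and cannot by itself tell a leading 110 from one in the middle.

```python
def ngrams(token, min_gram=2, max_gram=40):
    """All substrings of token whose length is between min_gram and max_gram."""
    grams = []
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            if start + size > len(token):
                break
            grams.append(token[start:start + size])
    return grams

# Sample zip codes from the question
zipcodes = ["01103", "01104", "11010", "11016", "11020", "01107", "11024", "04110"]

# Documents whose indexed tokens contain the search term "110"
matching = [z for z in zipcodes if "110" in ngrams(z)]
# Documents that actually start with "110"
leading = [z for z in zipcodes if z.startswith("110")]

print(ngrams("01103"))
print(len(matching), leading)
```

Only four of the eight matching zip codes start with 110, so the ordering has to come from an additional clause or field that looks at the beginning of the value.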

Add a should clause with a prefix query. It gives a higher score to documents that start with that prefix.
Query
{
"query": {
"bool": {
"should": [
{
"match": {
"zipcode": {
"query": "110",
"operator": "and"
}
}
},
{
"prefix": {
"zipcode": "110"
}
}
]
}
}
}
Result
"hits" : [
{
"_index" : "index24",
"_type" : "_doc",
"_id" : "3",
"_score" : 5.0441885,
"_source" : {
"zipcode" : "11010"
}
},
{
"_index" : "index24",
"_type" : "_doc",
"_id" : "4",
"_score" : 5.0441885,
"_source" : {
"zipcode" : "11016"
}
},
{
"_index" : "index24",
"_type" : "_doc",
"_id" : "5",
"_score" : 5.0441885,
"_source" : {
"zipcode" : "11020"
}
},
{
"_index" : "index24",
"_type" : "_doc",
"_id" : "7",
"_score" : 5.0441885,
"_source" : {
"zipcode" : "11024"
}
},
{
"_index" : "index24",
"_type" : "_doc",
"_id" : "1",
"_score" : 3.0168114,
"_source" : {
"zipcode" : "01103"
}
},
{
"_index" : "index24",
"_type" : "_doc",
"_id" : "2",
"_score" : 3.0168114,
"_source" : {
"zipcode" : "01104"
}
},
{
"_index" : "index24",
"_type" : "_doc",
"_id" : "6",
"_score" : 3.0168114,
"_source" : {
"zipcode" : "01107"
}
},
{
"_index" : "index24",
"_type" : "_doc",
"_id" : "8",
"_score" : 0.18093312,
"_source" : {
"zipcode" : "04110"
}
}
]

Why is there a difference in the search results when querying elasticsearch using a term query?

I have recently started learning Elasticsearch and I am seeing a difference in the search results of my queries. The mapping of the index named "products" is shown below (I am pasting the response from my Kibana console):
{
"products" : {
"mappings" : {
"properties" : {
"in_stock" : {
"type" : "long"
},
"name" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"price" : {
"type" : "long"
},
"tags" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}
The data in the index is as follows (again pasted from my Kibana console):
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 16,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "products",
"_type" : "_doc",
"_id" : "202",
"_score" : 1.0,
"_source" : {
"name" : "Vegetable Chopper",
"price" : 10,
"in_stock" : 250,
"tags" : [
"kitchen appliances",
"vegetable slicer",
"chopper"
]
}
},
{
"_index" : "products",
"_type" : "_doc",
"_id" : "203",
"_score" : 1.0,
"_source" : {
"name" : "Dish Washer",
"price" : 90,
"in_stock" : 60,
"tags" : [
"kitchen appliances",
"electrical",
"electric washer"
]
}
},
{
"_index" : "products",
"_type" : "_doc",
"_id" : "205",
"_score" : 1.0,
"_source" : {
"name" : "Microwave Oven",
"price" : 100,
"in_stock" : 50,
"tags" : [
"kitchen appliances",
"electricals",
"oven",
"oven toaster",
"microwave"
]
}
},
{
"_index" : "products",
"_type" : "_doc",
"_id" : "206",
"_score" : 1.0,
"_source" : {
"name" : "Mixer Grinder",
"price" : 55,
"in_stock" : 130,
"tags" : [
"kitchen appliances",
"electricals",
"mixer",
"grinder",
"juicer",
"food processor"
]
}
},
{
"_index" : "products",
"_type" : "_doc",
"_id" : "207",
"_score" : 1.0,
"_source" : {
"name" : "Fruit Juicer",
"price" : 40,
"in_stock" : 100,
"tags" : [
"kitchen appliances",
"electicals",
"juicer",
"mixer",
"electric juicer",
"juice maker"
]
}
},
{
"_index" : "products",
"_type" : "_doc",
"_id" : "208",
"_score" : 1.0,
"_source" : {
"name" : "Knife Set",
"price" : 15,
"in_stock" : 250,
"tags" : [
"kitchen knife",
"steel knives",
"cutlery"
]
}
},
{
"_index" : "products",
"_type" : "_doc",
"_id" : "209",
"_score" : 1.0,
"_source" : {
"name" : "Rice Maker",
"price" : 85,
"in_stock" : 60,
"tags" : [
"kitchen appliances",
"electricals",
"electric rice cooker",
"electric pressure cooker"
]
}
},
{
"_index" : "products",
"_type" : "_doc",
"_id" : "210",
"_score" : 1.0,
"_source" : {
"name" : "Induction Cooktop",
"price" : 30,
"in_stock" : 150,
"tags" : [
"kitchen appliances",
"electricals",
"hot plate heater",
"electric hot place",
"induction cooker",
"induction stove"
]
}
},
{
"_index" : "products",
"_type" : "_doc",
"_id" : "211",
"_score" : 1.0,
"_source" : {
"name" : "Coffee Maker",
"price" : 50,
"in_stock" : 100,
"tags" : [
"kitchen appliances",
"electricals"
]
}
},
{
"_index" : "products",
"_type" : "_doc",
"_id" : "212",
"_score" : 1.0,
"_source" : {
"name" : "Wine Glasses Set",
"price" : 50,
"in_stock" : 70,
"tags" : [
"kitchen and dining",
"glassware",
"stemware"
]
}
},
{
"_index" : "products",
"_type" : "_doc",
"_id" : "213",
"_score" : 1.0,
"_source" : {
"name" : "Dinner Set",
"price" : 100,
"in_stock" : 40,
"tags" : [
"kitchen and dining",
"crockery",
"full dinner set"
]
}
},
{
"_index" : "products",
"_type" : "_doc",
"_id" : "214",
"_score" : 1.0,
"_source" : {
"name" : "Whiskey Glasses Set",
"price" : 60,
"in_stock" : 50,
"tags" : [
"kitchen and dining",
"glassware",
"whiskey glasses",
"old fashioned glass",
"rocks glass",
"short tumbler"
]
}
},
{
"_index" : "products",
"_type" : "_doc",
"_id" : "215",
"_score" : 1.0,
"_source" : {
"name" : "Mug And Saucer Set",
"price" : 35,
"in_stock" : 60,
"tags" : [
"kitchen and dining",
"mug set",
"mugs and saucer",
"crockery set"
]
}
},
{
"_index" : "products",
"_type" : "_doc",
"_id" : "201",
"_score" : 1.0,
"_source" : {
"name" : "Milk Frother",
"price" : 25,
"in_stock" : 15,
"tags" : [
"kitchen appliances",
"electricals",
"milk"
]
}
},
{
"_index" : "products",
"_type" : "_doc",
"_id" : "200",
"_score" : 1.0,
"_source" : {
"name" : "Espresso Maker",
"price" : 180,
"in_stock" : 5,
"tags" : [
"kitchen appliances",
"electrical",
"coffee maker"
]
}
},
{
"_index" : "products",
"_type" : "_doc",
"_id" : "204",
"_score" : 1.0,
"_source" : {
"name" : "Pressure Fryer",
"price" : 120,
"in_stock" : 50,
"tags" : [
"air fryer",
"kitchen appliances",
"electric fryer",
"fryer",
"health fryer"
]
}
}
]
}
}
When I query the data with the query below, I match only six records:
Query - 1
GET /products/_search
{
"query": {"terms" : {"tags": ["kitchen appliances","electricals"]}}
}
The matched document IDs are (201,205,206,209,210,211)
When I execute the query below, I match 11 records:
Query-2
GET /products/_search
{
"query": {"terms" : {"tags.keyword": ["kitchen appliances","electricals"]}}
}
The document IDs matched by the second query are: (200,201,202,203,204,205,206,207,209,210,211)
Can someone explain the difference between the two queries, and why the results of Query-1 are a subset of those of Query-2 even though both queries run on the same field?

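It is better to use the match query on a text field.
The term query does not analyze the search term. It returns only the documents that contain exactly that term in the field.
The terms query also works on exact terms. It returns the documents that contain one or more of the given terms.
The tags field is of type text, so at index time the standard analyzer splits "kitchen appliances" into the tokens kitchen and appliances. No single token equals "kitchen appliances", so in Query 1 only the term "electricals" can match, which gives the 6 documents.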
QUERY 1:
{
"query": {
"terms": {
"tags": [
"kitchen appliances",
"electricals"
]
}
}
}
Search Result is
"hits": [
{
"_index": "67155973",
"_type": "_doc",
"_id": "3",
"_score": 1.0,
"_source": {
"name": "Microwave Oven",
"price": 100,
"in_stock": 50,
"tags": [
"kitchen appliances",
"electricals",
"oven",
"oven toaster",
"microwave"
]
}
},
{
"_index": "67155973",
"_type": "_doc",
"_id": "4",
"_score": 1.0,
"_source": {
"name": "Mixer Grinder",
"price": 55,
"in_stock": 130,
"tags": [
"kitchen appliances",
"electricals",
"mixer",
"grinder",
"juicer",
"food processor"
]
}
},
{
"_index": "67155973",
"_type": "_doc",
"_id": "7",
"_score": 1.0,
"_source": {
"name": "Rice Maker",
"price": 85,
"in_stock": 60,
"tags": [
"kitchen appliances",
"electricals",
"electric rice cooker",
"electric pressure cooker"
]
}
},
{
"_index": "67155973",
"_type": "_doc",
"_id": "8",
"_score": 1.0,
"_source": {
"name": "Induction Cooktop",
"price": 30,
"in_stock": 150,
"tags": [
"kitchen appliances",
"electricals",
"hot plate heater",
"electric hot place",
"induction cooker",
"induction stove"
]
}
},
{
"_index": "67155973",
"_type": "_doc",
"_id": "9",
"_score": 1.0,
"_source": {
"name": "Coffee Maker",
"price": 50,
"in_stock": 100,
"tags": [
"kitchen appliances",
"electricals"
]
}
},
{
"_index": "67155973",
"_type": "_doc",
"_id": "14",
"_score": 1.0,
"_source": {
"name": "Milk Frother",
"price": 25,
"in_stock": 15,
"tags": [
"kitchen appliances",
"electricals",
"milk"
]
}
}
]
As mentioned in the documentation
The term query does not analyze the search term. The term query only
searches for the exact term you provide. This means the term query may
return poor or no results when searching text fields.
QUERY 2:
{
"query": {
"terms": {
"tags.keyword": [
"kitchen appliances",
"electricals"
]
}
}
}
In the above query, you are using the tags.keyword field, which is of type keyword, so its values are indexed as they are, without being split by the standard analyzer. Here the query searches for the exact values "kitchen appliances" OR "electricals", and therefore returns 11 documents.
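The two result sets can be reproduced with a small simulation. The sketch below is plain Python, not an Elasticsearch client call: it approximates the standard analyzer by lowercasing and splitting on whitespace (an assumption that holds for these simple tags) and treats tags.keyword as the untouched tag strings. The names TAGS, text_tokens, terms_on_text and terms_on_keyword are made up for illustration.

```python
# Document id -> tags, copied from the index contents in the question
TAGS = {
    200: ["kitchen appliances", "electrical", "coffee maker"],
    201: ["kitchen appliances", "electricals", "milk"],
    202: ["kitchen appliances", "vegetable slicer", "chopper"],
    203: ["kitchen appliances", "electrical", "electric washer"],
    204: ["air fryer", "kitchen appliances", "electric fryer", "fryer", "health fryer"],
    205: ["kitchen appliances", "electricals", "oven", "oven toaster", "microwave"],
    206: ["kitchen appliances", "electricals", "mixer", "grinder", "juicer", "food processor"],
    207: ["kitchen appliances", "electicals", "juicer", "mixer", "electric juicer", "juice maker"],
    208: ["kitchen knife", "steel knives", "cutlery"],
    209: ["kitchen appliances", "electricals", "electric rice cooker", "electric pressure cooker"],
    210: ["kitchen appliances", "electricals", "hot plate heater", "electric hot place",
          "induction cooker", "induction stove"],
    211: ["kitchen appliances", "electricals"],
    212: ["kitchen and dining", "glassware", "stemware"],
    213: ["kitchen and dining", "crockery", "full dinner set"],
    214: ["kitchen and dining", "glassware", "whiskey glasses", "old fashioned glass",
          "rocks glass", "short tumbler"],
    215: ["kitchen and dining", "mug set", "mugs and saucer", "crockery set"],
}

def text_tokens(values):
    """Rough stand-in for the standard analyzer: lowercase, split on whitespace."""
    return {word for value in values for word in value.lower().split()}

def terms_on_text(terms):
    """terms query on the analyzed text field: a search term must equal a single token."""
    return sorted(i for i, tags in TAGS.items() if text_tokens(tags) & set(terms))

def terms_on_keyword(terms):
    """terms query on the keyword sub-field: a search term must equal a whole tag."""
    return sorted(i for i, tags in TAGS.items() if set(tags) & set(terms))

terms = ["kitchen appliances", "electricals"]
print(terms_on_text(terms))
print(terms_on_keyword(terms))
```

The first list has the six IDs from Query-1 and the second has the eleven IDs from Query-2. Documents such as 207 ("electicals") and 200 ("electrical") appear only in the second list, because they match through the whole tag "kitchen appliances".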

How can I search the specific value in the _source from elasticSearch inquired result?

I'm collecting logs with Elasticsearch, and I look up the results with a query.
When I run the following query
GET test/_search
{
"query": {
"match_all":{
}
}
}
the result is returned as follows:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 100,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "test",
"_id" : "1a2b3c4d5e6f",
"_score" : 1.0,
"_source" : {
"team" : "Marketing",
"number" : "3",
"name" : "Mark"
}
},
{
"_index" : "test",
"_id" : "1a2b3c4d5e66",
"_score" : 1.0,
"_source" : {
"team" : "HR",
"number" : "1",
"name" : "John"
}
},
........
But I want the result to look like the following (only a specific value, as with inner_hits):
{
"name": "Mark"
},
{
"name": "John"
},
So how can I query only a specific value, like inner_hits?
Thanks.

You could simply use the source filtering feature of ES. In your case, the query will look like this:
{
"_source": "name",
"query": {
"match_all": {}
}
}
And it returns search results like
"hits": [
{
"_index": "64214413",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"name": "Mark"
}
},
{
"_index": "64214413",
"_type": "_doc",
"_id": "2",
"_score": 1.0,
"_source": {
"name": "John"
}
}
]
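Source filtering trims _source, but each hit still carries _index, _id and _score. If only the bare objects are wanted, they can be picked out on the client once the response has been parsed. A minimal sketch in Python, assuming the response body is already a dict shaped like the result above (the sample response and the helper sources are illustrative, not part of any client library):

```python
# Parsed search response, reduced to the parts used here (sample values)
response = {
    "hits": {
        "hits": [
            {"_index": "test", "_id": "1a2b3c4d5e6f", "_score": 1.0,
             "_source": {"name": "Mark"}},
            {"_index": "test", "_id": "1a2b3c4d5e66", "_score": 1.0,
             "_source": {"name": "John"}},
        ]
    }
}

def sources(resp):
    """Return only the _source object of every hit, in result order."""
    return [hit["_source"] for hit in resp["hits"]["hits"]]

names = sources(response)
print(names)
```

This keeps the query unchanged and moves the reshaping into the application, which is usually the simplest option when the metadata fields are not needed.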

How to access the date_histogram key field in the child aggregation in elasticsearch?

I want to apply some filters to the buckets generated by a date_histogram, and the filter depends on the key of each date_histogram bucket.
Suppose I have the following data:
{
"entryTime":"",
"soldTime":""
}
The Elasticsearch query is something like this:
{
"aggs": {
"date": {
"date_histogram": {
"field": "entryTime",
"interval": "month",
"keyed": true
},
"aggs": {
"filter_try": {
"filter": {
"bool": {
"must": [
{
"range": {
"entryTime": {
"lte": 1588840533000
}
}
},
{
"bool": {
"should": [
{
"bool": {
"must": [
{
"exists": {
"field": "soldTime"
}
},
{
"range": {
"soldTime": {
"gt": 1588840533000
}
}
}
]
}
},
{
"bool": {
"must_not": [
{
"exists": {
"field": "soldTime"
}
}
]
}
}
]
}
}
]
}
}
}
}
}
}
}
In that bool query, I want to use the date generated for each bucket by the date_histogram aggregation in both range clauses, instead of the hardcoded epoch time.
Accessing it through a script would also be fine.
For further clarification, this is the boolean condition; I want to replace DATE in it with the date_histogram bucket key.
# (entryTime < DATE)
# AND
# (
# (soldTime != null AND soldTime > DATE)
# OR
# (soldTime == NULL)
# )
Consider the 10 documents below:
"hits" : [
{
"_index" : "vi_test",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"deaerId" : "4",
"entryTime" : "1577869200000",
"soldTime" : "1578646800000"
}
},
{
"_index" : "vi_test",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"deaerId" : "4",
"entryTime" : "1578214800000"
}
},
{
"_index" : "vi_test",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"deaerId" : "4",
"entryTime" : "1578560400000",
"soldTime" : "1579942800000"
}
},
{
"_index" : "vi_test",
"_type" : "_doc",
"_id" : "4",
"_score" : 1.0,
"_source" : {
"deaerId" : "4",
"entryTime" : "1579683600000",
"soldTime" : "1581325200000"
}
},
{
"_index" : "vi_test",
"_type" : "_doc",
"_id" : "5",
"_score" : 1.0,
"_source" : {
"deaerId" : "4",
"entryTime" : "1580893200000"
}
},
{
"_index" : "vi_test",
"_type" : "_doc",
"_id" : "6",
"_score" : 1.0,
"_source" : {
"deaerId" : "4",
"entryTime" : "1582189200000",
"soldTime" : "1582362000000"
}
},
{
"_index" : "vi_test",
"_type" : "_doc",
"_id" : "7",
"_score" : 1.0,
"_source" : {
"deaerId" : "4",
"entryTime" : "1582621200000",
"soldTime" : "1584349200000"
}
},
{
"_index" : "vi_test",
"_type" : "_doc",
"_id" : "8",
"_score" : 1.0,
"_source" : {
"deaerId" : "4",
"entryTime" : "1583053200000",
"soldTime" : "1583830800000"
}
},
{
"_index" : "vi_test",
"_type" : "_doc",
"_id" : "9",
"_score" : 1.0,
"_source" : {
"deaerId" : "4",
"entryTime" : "1584262800000"
}
},
{
"_index" : "vi_test",
"_type" : "_doc",
"_id" : "10",
"_score" : 1.0,
"_source" : {
"deaerId" : "4",
"entryTime" : "1585472400000"
}
}
]
The end of January 2020 in epoch milliseconds is 1580515199000.
If I apply it to the bool query above,
I get the following output:
"hits" : [
{
"_index" : "vi_test",
"_type" : "_doc",
"_id" : "4",
"_score" : 3.0,
"_source" : {
"deaerId" : "4",
"entryTime" : "1579683600000",
"soldTime" : "1581325200000"
}
},
{
"_index" : "vi_test",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"deaerId" : "4",
"entryTime" : "1578214800000"
}
}
]
Document 4 satisfies (soldTime != null AND soldTime > DATE) and document 2 satisfies (soldTime == null) from the OR part.
For the same bool query, if I use the end of February 2020 (1583020799000), I get the following hits:
"hits" : [
{
"_index" : "vi_test",
"_type" : "_doc",
"_id" : "7",
"_score" : 3.0,
"_source" : {
"deaerId" : "4",
"entryTime" : "1582621200000",
"soldTime" : "1584349200000"
}
},
{
"_index" : "vi_test",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"deaerId" : "4",
"entryTime" : "1578214800000"
}
},
{
"_index" : "vi_test",
"_type" : "_doc",
"_id" : "5",
"_score" : 1.0,
"_source" : {
"deaerId" : "4",
"entryTime" : "1580893200000"
}
}
]
ID 7: entered in Feb but sold in March, so it is in stock for Feb 2020
ID 2: entered in Jan and not sold yet, so it is in stock
ID 5: entered in Feb and not sold yet, so it is in stock
I need the same data for the end of each month of a whole year to plot the trend.
Thank you

I couldn't find a way to do this with normal queries, as the parent aggregation key is not available in a sub-aggregation. I have written a script that selects documents where soldTime is either null or does not fall in the same month as entryTime.
Query:
{
"query": {
"script": {
"script": """
ZonedDateTime entry;
ZonedDateTime sold;
if(doc['entryTime'].size()>0)
{
entry= doc['entryTime'].value;
}
if(doc['soldTime'].size()>0)
{
sold = doc['soldTime'].value;
}
if(sold==null || ( entry.getMonthValue()!==sold.getMonthValue()|| entry.getYear()!==sold.getYear()))
{
return true;
}
else { return false; }
"""
}
},
"size": 10,
"aggs": {
"monthly_trend": {
"date_histogram": {
"field": "entryTime",
"interval": "month"
},
"aggs": {
"docs": {
"top_hits": {
"size": 10
}
}
}
}
}
}
Result:
"hits" : [
{
"_index" : "index22",
"_type" : "_doc",
"_id" : "55Kv83EB8a54AbXfngYU",
"_score" : 1.0,
"_source" : {
"deaerId" : "4",
"entryTime" : "1578214800000"
}
}
]
},
"aggregations" : {
"monthly_trend" : {
"buckets" : [
{
"key_as_string" : "2020-01-01T00:00:00.000Z",
"key" : 1577836800000,
"doc_count" : 1,
"docs" : {
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "index22",
"_type" : "_doc",
"_id" : "55Kv83EB8a54AbXfngYU",
"_score" : 1.0,
"_source" : {
"deaerId" : "4",
"entryTime" : "1578214800000"
}
}
]
}
}
}
]
}
}
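The condition from the question can be checked against the ten sample documents outside Elasticsearch. The sketch below is plain Python with the epoch values copied from the question; the names DOCS, in_stock and MONTH_ENDS are illustrative. It evaluates (entryTime <= DATE) AND (soldTime missing OR soldTime > DATE) once per month end, using lte as in the question's range clause. It is a way to verify the expected per-month output, not a replacement for the script query above.

```python
# Document id -> (entryTime, soldTime) in epoch milliseconds; None means not sold yet
DOCS = {
    1: (1577869200000, 1578646800000),
    2: (1578214800000, None),
    3: (1578560400000, 1579942800000),
    4: (1579683600000, 1581325200000),
    5: (1580893200000, None),
    6: (1582189200000, 1582362000000),
    7: (1582621200000, 1584349200000),
    8: (1583053200000, 1583830800000),
    9: (1584262800000, None),
    10: (1585472400000, None),
}

def in_stock(date_ms):
    """IDs that entered on or before date_ms and were not sold by then."""
    return [
        doc_id
        for doc_id, (entry, sold) in DOCS.items()
        if entry <= date_ms and (sold is None or sold > date_ms)
    ]

# Month-end timestamps given in the question
MONTH_ENDS = {"2020-01": 1580515199000, "2020-02": 1583020799000}

trend = {month: in_stock(end) for month, end in MONTH_ENDS.items()}
print(trend)
```

For January this yields documents 2 and 4, and for February documents 2, 5 and 7, the same sets the question lists as the expected hits.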

Is there a way in ElasticSearch to get the shortest (closest) word at top?

I have these words in my index: "Kem, Kemi, Kemah, Kemer, Kemerburgaz, Kemang, Kembs, Kemnay, Kempley, Kempsey, Kemerovo".
When I search for "Kem" I want "Kemi" to come at the top because it is the closest word (Kem + i = Kemi). But it doesn't work the way I want.
Index:
{
"settings": {
"number_of_shards": 1,
"analysis": {
"filter": {
"autocomplete_filter": {
"type": "edge_ngram",
"min_gram": 2,
"max_gram": 15
}
},
"analyzer": {
"autocomplete": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"lowercase",
"autocomplete_filter"
]
}
}
}
},
"mappings": {
"_doc": {
"properties": {
"name": {
"fields": {
"keyword": {
"type": "keyword"
}
},
"type": "text",
"similarity": "classic",
"analyzer": "autocomplete",
"search_analyzer": "standard"
},
"id": {
"type": "keyword"
},
"slug": {
"type": "keyword"
},
"type": {
"type": "keyword"
}
}
}
}
}
Query:
{
"from" : 0, "size" : 10,
"query": {
"bool": {
"must": [
{
"match": {
"name": "Kem"
}
}
],
"should": [
{
"term": {
"name.keyword": {
"value": "Kem"
}
}
}
]
}
}
}
Result:
{
"took" : 6,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 143,
"max_score" : 20.795834,
"hits" : [
{
"_index" : "destinations",
"_type" : "_doc",
"_id" : "lPL8Y2YBqxTX_xwrZlGc",
"_score" : 20.795834,
"_source" : {
"id" : "c6317201",
"name" : "Kem",
"slug" : "yurtdisi/karelya-cumhuriyeti/kem"
}
},
{
"_index" : "destinations",
"_type" : "_doc",
"_id" : "se78Y2YBqxTX_xwrVFIU",
"_score" : 8.61574,
"_source" : {
"id" : "c121023",
"name" : "Kemah",
"slug" : "yurtdisi/houston-ve-civari/kemah"
}
},
{
"_index" : "destinations",
"_type" : "_doc",
"_id" : "ze78Y2YBqxTX_xwrVFo5",
"_score" : 8.61574,
"_source" : {
"id" : "c1783",
"name" : "Kemerovo",
"slug" : "yurtdisi/kemerovo-oblasti/kemerovo"
}
},
{
"_index" : "destinations",
"_type" : "_doc",
"_id" : "xe78Y2YBqxTX_xwrVFs9",
"_score" : 8.61574,
"_source" : {
"id" : "c1786",
"name" : "Kemi",
"slug" : "yurtdisi/rovaniemi/kemi"
}
},
{
"_index" : "destinations",
"_type" : "_doc",
"_id" : "Tu78Y2YBqxTX_xwrVG-X",
"_score" : 8.61574,
"_source" : {
"id" : "c1900",
"name" : "Kempsey",
"slug" : "yurtdisi/new-south-wales/kempsey"
}
},
{
"_index" : "destinations",
"_type" : "_doc",
"_id" : "Bu78Y2YBqxTX_xwrVILt",
"_score" : 8.61574,
"_source" : {
"id" : "c3000010982",
"name" : "Kempley",
"slug" : "yurtdisi/dymock/kempley"
}
},
{
"_index" : "destinations",
"_type" : "_doc",
"_id" : "B-78Y2YBqxTX_xwrVILt",
"_score" : 8.61574,
"_source" : {
"id" : "c3000010983",
"name" : "Kemnay",
"slug" : "yurtdisi/inverurie/kemnay"
}
},
{
"_index" : "destinations",
"_type" : "_doc",
"_id" : "CO78Y2YBqxTX_xwrVIb_",
"_score" : 8.61574,
"_source" : {
"id" : "c3000013079",
"name" : "Kemerburgaz",
"slug" : "eyup/kemerburgaz"
}
},
{
"_index" : "destinations",
"_type" : "_doc",
"_id" : "-fL8Y2YBqxTX_xwrZQxf",
"_score" : 8.61574,
"_source" : {
"id" : "c6190744",
"name" : "Kembs",
"slug" : "yurtdisi/haut-rhin-bolge/kembs"
}
},
{
"_index" : "destinations",
"_type" : "_doc",
"_id" : "xfL8Y2YBqxTX_xwrZSG-",
"_score" : 8.61574,
"_source" : {
"id" : "c6216986",
"name" : "Kemang",
"slug" : "yurtdisi/cakarta/kemang"
}
}
]
}
}
They all have the same score, I guess because every one of them contains "Kem". If I use "match" or "match_phrase", the outcome is the same.

In your example it seems that you want your results sorted by length. You can do that with a script.
POST your_index/_doc/_search
{
"from": 0,
"size": 10,
"query": {
"bool": {
"must": [
{
"match": {
"name": "Kem"
}
}
],
"should": [
{
"term": {
"name.keyword": {
"value": "Kem"
}
}
}
]
}
},
"sort": [
{
"_score": {"order": "desc"}
},
{
"_script": {
"script": "doc['name.keyword'].value.length()",
"type": "number",
"order": "asc"
}
},
{
"name.keyword": {"order": "asc"}
}
]
}
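The effect of the three sort clauses can be previewed on the hits from the question. The sketch below is plain Python that mimics the ordering (score descending, then name length ascending, then name ascending); the hits list is copied from the result in the question and sort_key is an illustrative helper, not an Elasticsearch API.

```python
# (name, _score) pairs taken from the search result in the question
hits = [
    ("Kem", 20.795834),
    ("Kemah", 8.61574),
    ("Kemerovo", 8.61574),
    ("Kemi", 8.61574),
    ("Kempsey", 8.61574),
    ("Kempley", 8.61574),
    ("Kemnay", 8.61574),
    ("Kemerburgaz", 8.61574),
    ("Kembs", 8.61574),
    ("Kemang", 8.61574),
]

def sort_key(hit):
    """Mirror the sort array: _score desc, name length asc, name asc."""
    name, score = hit
    return (-score, len(name), name)

ordered = [name for name, _ in sorted(hits, key=sort_key)]
print(ordered)
```

The exact match "Kem" stays first because of its higher score, and "Kemi" follows as the shortest of the equally scored names. The length only breaks ties, so it has an effect only where the scores are identical.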
