I need help with an Elasticsearch query

I need help with a query.
Here are my query and my sample results:
GET /product/_search
{
"query": {
"bool" : {
"must" : {
"multi_match" : {
"query": "Torsades",
"fields": [ "ean^10", "name^4", "brand" ]
}
}
}
}
}
[
{
"_index" : "product_2022-05-13-194440",
"_type" : "_doc",
"_id" : "1",
"_score" : 13.78764,
"_source" : {
"country" : 1,
"ean" : "3250391967858",
"name" : "Torsades Semi-complètes BIO - 500G",
"brand" : "Fiorini"
}
},
{
"_index" : "product_2022-05-13-194440",
"_type" : "_doc",
"_id" : "74",
"_score" : 13.78764,
"_source" : {
"country" : null,
"ean" : "3564700009826",
"name" : "Pâtes Torsades - Turini - 500 g",
"brand" : "Turini"
}
},
{
"_index" : "product_2022-05-13-194440",
"_type" : "_doc",
"_id" : "78",
"_score" : 11.964245,
"_source" : {
"country" : null,
"ean" : "3250391967858",
"name" : "Torsades Semi-complètes BIO - 500G - ITM BENCHMARK",
"brand" : "Fiorini"
}
}
]
I need a specific condition and I can't find a solution.
I want:
ALL products with country=1 AND (ALL products with country=null MINUS products whose ean also exists with country=1)
In my sample, I want to get 2 hits.
This one should be removed because its EAN already exists with country=1:
{
"_index" : "product_2022-05-13-194440",
"_type" : "_doc",
"_id" : "78",
"_score" : 11.964245,
"_source" : {
"country" : null,
"ean" : "3250391967858",
"name" : "Torsades Semi-complètes BIO - 500G - ITM BENCHMARK",
"brand" : "Fiorini"
}
}
Does anyone have a solution?
EDIT :
I want this result :
[
{
"_index" : "product_2022-05-13-194440",
"_type" : "_doc",
"_id" : "1",
"_score" : 13.78764,
"_source" : {
"country" : 1,
"ean" : "3250391967858",
"name" : "Torsades Semi-complètes BIO - 500G",
"brand" : "Fiorini"
}
},
{
"_index" : "product_2022-05-13-194440",
"_type" : "_doc",
"_id" : "74",
"_score" : 13.78764,
"_source" : {
"country" : null,
"ean" : "3564700009826",
"name" : "Pâtes Torsades - Turini - 500 g",
"brand" : "Turini"
}
}
]

Have you tried field collapsing?
GET test/_search
{
"query": {
"bool": {
"must": {
"multi_match": {
"query": "Torsades",
"fields": [
"ean^10",
"name^4",
"brand"
]
}
}
}
},
"collapse": {
"field": "ean.keyword"
}
}
Response:
"hits" : [
{
"_index" : "test",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.5611319,
"_source" : {
"country" : 1,
"ean" : "3250391967858",
"name" : "Torsades Semi-complètes BIO - 500G",
"brand" : "Fiorini"
},
"fields" : {
"ean.keyword" : [
"3250391967858"
]
}
},
{
"_index" : "test",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.5611319,
"_source" : {
"country" : null,
"ean" : "3564700009826",
"name" : "Pâtes Torsades - Turini - 500 g",
"brand" : "Turini"
},
"fields" : {
"ean.keyword" : [
"3564700009826"
]
}
}
]
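Note that collapse keeps whichever hit sorts first within each ean group, which is not necessarily the country=1 document. If the exact rule from the question is required, one option is to apply it to the returned hits on the client side. The following is a minimal sketch in Python, assuming the hits have the shape shown above and that the whole result set fits in one response; the function name filter_hits is hypothetical and the sample data is trimmed from the question.

```python
def filter_hits(hits):
    """Keep country=1 hits, plus country=null hits whose ean has no country=1 hit."""
    # EANs that already exist on a country=1 product
    country_eans = {
        h["_source"]["ean"] for h in hits if h["_source"].get("country") == 1
    }
    kept = []
    for h in hits:
        src = h["_source"]
        if src.get("country") == 1:
            kept.append(h)
        elif src.get("country") is None and src["ean"] not in country_eans:
            kept.append(h)
    return kept


# Trimmed copies of the three sample hits from the question
hits = [
    {"_id": "1", "_source": {"country": 1, "ean": "3250391967858"}},
    {"_id": "74", "_source": {"country": None, "ean": "3564700009826"}},
    {"_id": "78", "_source": {"country": None, "ean": "3250391967858"}},
]
print([h["_id"] for h in filter_hits(hits)])  # ['1', '74']
```

The original order of the hits is preserved, so the relevance ranking returned by Elasticsearch is unchanged.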

How do I extract the "message" value from elasticsearch? (DSL)

This is what I tried (my code).
-> I want to extract only the "message" values.
GET 0503instgram_csv/_search
{
"query" : {
"query_string": {
"default_field": "message",
"query": ?????????????
}
}
}
-> I want to process the data by saving all the returned "message" values as new field values.
I'd really appreciate your help.
@ESCoder
This is the result of the attempt you suggested:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 281,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "0503instgram_csv",
"_type" : "_doc",
"_id" : "zbKQMXkB98wUkKJOL8dT",
"_score" : 1.0,
"_source" : {
"message" : "\"lovablepoetree"
}
},
{
"_index" : "0503instgram_csv",
"_type" : "_doc",
"_id" : "zrKQMXkB98wUkKJOL8dU",
"_score" : 1.0,
"_source" : {
"message" : """내가 아는 사람 중에 최고 셀럽(#hanstar.kim)과 맥주 마심🍻셀럽과 술이라니....! 성공해따 나자신!!!!!! 그래놓고 사진은 나 혼자 찍어따^.^ 다음번엔 투샷을 찍어보쟈......🥰"""
}
},
{
"_index" : "0503instgram_csv",
"_type" : "_doc",
"_id" : "z7KQMXkB98wUkKJOL8dU",
"_score" : 1.0,
"_source" : {
"message" : """🔥🔥..열정을 응원 합니다. 도대현 드림❤️❤️❤️"""
}
},
{
"_index" : "0503instgram_csv",
"_type" : "_doc",
"_id" : "0LKQMXkB98wUkKJOL8dU",
"_score" : 1.0,
"_source" : {
"message" : "lovablepoetree"
}
},
{
"_index" : "0503instgram_csv",
"_type" : "_doc",
"_id" : "0bKQMXkB98wUkKJOL8dj",
"_score" : 1.0,
"_source" : {
"message" : """좋아요 69개
"""
}
},
{
"_index" : "0503instgram_csv",
"_type" : "_doc",
"_id" : "0rKQMXkB98wUkKJOL8dj",
"_score" : 1.0,
"_source" : {
"message" : """['paulchang1103,Paul Chang (장준성)🇰🇷,passionated_man,도대현,luv_____juju,쥬쥬,koonge01,영이,p.s.j___5959,또둔,panchitoyoon,Ye Suk Yoon,hyeriiing__,혤,sunny.gibbab,써니네식탁(sunny Gib-bab),_wjstn_ry_02,전수교(20),sungwoon_jinsik,윤성운🇰🇷,t_a_k_3014,🎀케이,팔로우']
"""
}
},
{
"_index" : "0503instgram_csv",
"_type" : "_doc",
"_id" : "07KQMXkB98wUkKJOL8dj",
"_score" : 1.0,
"_source" : {
"message" : "passionated_man"
}
},
{
"_index" : "0503instgram_csv",
"_type" : "_doc",
"_id" : "1LKQMXkB98wUkKJOL8dj",
"_score" : 1.0,
"_source" : {
"message" : "1일"
}
},
{
"_index" : "0503instgram_csv",
"_type" : "_doc",
"_id" : "1bKQMXkB98wUkKJOL8dj",
"_score" : 1.0,
"_source" : {
"message" : """🔥🔥..열정을 응원 합니다. 도대현 드림❤️❤️❤️"""
}
},
{
"_index" : "0503instgram_csv",
"_type" : "_doc",
"_id" : "1rKQMXkB98wUkKJOL8dj",
"_score" : 1.0,
"_source" : {
"message" : """1일답글 달기","lovablepoetree"""
}
}
]
}
}
You can use source filtering to extract only the message field values:
GET /_search
{
"_source": [
"message"
]
}
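Once the response contains only the message field, the values still have to be pulled out of the hits array. The following is a minimal sketch in Python, assuming the response has already been parsed into a dict with the shape shown in the question; the function name extract_messages and the shortened _id values are hypothetical.

```python
def extract_messages(response):
    """Return the "message" value of every hit that has one."""
    return [
        hit["_source"]["message"]
        for hit in response["hits"]["hits"]
        if "message" in hit.get("_source", {})
    ]


# Trimmed sample in the same shape as the search response above
response = {
    "hits": {
        "hits": [
            {"_id": "a", "_source": {"message": "lovablepoetree"}},
            {"_id": "b", "_source": {"message": "passionated_man"}},
            {"_id": "c", "_source": {}},  # a hit without the field is skipped
        ]
    }
}
messages = extract_messages(response)
print(messages)  # ['lovablepoetree', 'passionated_man']
```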
Update 1:
You can use the reindex API to store only the values of the message field in a different index:
POST /_reindex
{
"source": {
"index": "old-index",
"_source": ["message"]
},
"dest": {
"index": "new-index"
}
}

How to fire a single query to fetch the following count of users by month till date in Elasticsearch

How can I fire a single query to fetch the following counts of users by month, up to the current date?
Index: user (list of all users in a company)
users who joined this month (month to date, i.e. in Nov)
users who joined the previous month (say Oct)
...
users who joined in the second month (say Feb)
users who joined in the first month (say Jan)
Is there a quick way to fetch all of this with a single query? I would like a single response that contains all of the information.
If I understood your issue correctly, I suggest you use a date histogram aggregation together with a range query.
gte stands for greater than or equal to, while lte stands for less than or equal to.
I first restrict the search to the range I need, in this case the current month (Nov 2020) back through the eleven months before it. On that result, I run an aggregation with an interval of one month.
I am assuming that one document is created in the index for each user.
I indexed the following data in my index:
"hits" : [
{
"_index" : "month-count",
"_type" : "_doc",
"_id" : "UPv_l3UBrH4n7Et0xLpD",
"_score" : 1.0,
"_source" : {
"timestamp" : "2020-01-01"
}
},
{
"_index" : "month-count",
"_type" : "_doc",
"_id" : "qPv_l3UBrH4n7Et0-7r9",
"_score" : 1.0,
"_source" : {
"timestamp" : "2020-01-01"
}
},
{
"_index" : "month-count",
"_type" : "_doc",
"_id" : "r_sAmHUBrH4n7Et0e7s-",
"_score" : 1.0,
"_source" : {
"timestamp" : "2020-09-01"
}
},
{
"_index" : "month-count",
"_type" : "_doc",
"_id" : "s_sAmHUBrH4n7Et0gLsh",
"_score" : 1.0,
"_source" : {
"timestamp" : "2020-09-01"
}
},
{
"_index" : "month-count",
"_type" : "_doc",
"_id" : "TfsAmHUBrH4n7Et0nrwS",
"_score" : 1.0,
"_source" : {
"timestamp" : "2020-02-01"
}
},
{
"_index" : "month-count",
"_type" : "_doc",
"_id" : "-vsAmHUBrH4n7Et07Ly-",
"_score" : 1.0,
"_source" : {
"timestamp" : "2020-03-01"
}
},
{
"_index" : "month-count",
"_type" : "_doc",
"_id" : "_vsAmHUBrH4n7Et09LyD",
"_score" : 1.0,
"_source" : {
"timestamp" : "2020-03-01"
}
},
{
"_index" : "month-count",
"_type" : "_doc",
"_id" : "APsAmHUBrH4n7Et0972w",
"_score" : 1.0,
"_source" : {
"timestamp" : "2020-03-01"
}
}
]
The query which I use:
GET month-count/_search
{
"query": {
"range": {
"timestamp": {
"gte": "now-11M/M",
"lte": "now/M"
}
}
},
"aggs": {
"get_Month": {
"date_histogram": {
"field": "timestamp",
"calendar_interval": "month"
}
}
}
}
The response:
"hits" : [
{
"_index" : "month-count",
"_type" : "_doc",
"_id" : "UPv_l3UBrH4n7Et0xLpD",
"_score" : 1.0,
"_source" : {
"timestamp" : "2020-01-01"
}
},
{
"_index" : "month-count",
"_type" : "_doc",
"_id" : "qPv_l3UBrH4n7Et0-7r9",
"_score" : 1.0,
"_source" : {
"timestamp" : "2020-01-01"
}
},
{
"_index" : "month-count",
"_type" : "_doc",
"_id" : "r_sAmHUBrH4n7Et0e7s-",
"_score" : 1.0,
"_source" : {
"timestamp" : "2020-09-01"
}
},
{
"_index" : "month-count",
"_type" : "_doc",
"_id" : "s_sAmHUBrH4n7Et0gLsh",
"_score" : 1.0,
"_source" : {
"timestamp" : "2020-09-01"
}
},
{
"_index" : "month-count",
"_type" : "_doc",
"_id" : "TfsAmHUBrH4n7Et0nrwS",
"_score" : 1.0,
"_source" : {
"timestamp" : "2020-02-01"
}
},
{
"_index" : "month-count",
"_type" : "_doc",
"_id" : "-vsAmHUBrH4n7Et07Ly-",
"_score" : 1.0,
"_source" : {
"timestamp" : "2020-03-01"
}
},
{
"_index" : "month-count",
"_type" : "_doc",
"_id" : "_vsAmHUBrH4n7Et09LyD",
"_score" : 1.0,
"_source" : {
"timestamp" : "2020-03-01"
}
},
{
"_index" : "month-count",
"_type" : "_doc",
"_id" : "APsAmHUBrH4n7Et0972w",
"_score" : 1.0,
"_source" : {
"timestamp" : "2020-03-01"
}
}
]
},
"aggregations" : {
"get_Month" : {
"buckets" : [
{
"key_as_string" : "2020-01-01T00:00:00.000Z",
"key" : 1577836800000,
"doc_count" : 2
},
{
"key_as_string" : "2020-02-01T00:00:00.000Z",
"key" : 1580515200000,
"doc_count" : 1
},
{
"key_as_string" : "2020-03-01T00:00:00.000Z",
"key" : 1583020800000,
"doc_count" : 3
},
{
"key_as_string" : "2020-04-01T00:00:00.000Z",
"key" : 1585699200000,
"doc_count" : 0
},
{
"key_as_string" : "2020-05-01T00:00:00.000Z",
"key" : 1588291200000,
"doc_count" : 0
},
{
"key_as_string" : "2020-06-01T00:00:00.000Z",
"key" : 1590969600000,
"doc_count" : 0
},
{
"key_as_string" : "2020-07-01T00:00:00.000Z",
"key" : 1593561600000,
"doc_count" : 0
},
{
"key_as_string" : "2020-08-01T00:00:00.000Z",
"key" : 1596240000000,
"doc_count" : 0
},
{
"key_as_string" : "2020-09-01T00:00:00.000Z",
"key" : 1598918400000,
"doc_count" : 2
}
]
}
}
I think doc_count is what you need: it is the number of documents (users) in each monthly bucket.
Let me know if you need more help, I will be glad to help you.
Links:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html
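To turn the buckets into a simple month-to-count table, the aggregation part of the response can be reduced on the client. The following is a minimal sketch in Python, assuming the response has been parsed into a dict and that the aggregation is named get_Month as in the query above; the function name monthly_counts is hypothetical and the sample is trimmed to three buckets.

```python
def monthly_counts(response, agg_name="get_Month"):
    """Map "YYYY-MM" to the number of documents in that monthly bucket."""
    buckets = response["aggregations"][agg_name]["buckets"]
    # key_as_string looks like "2020-01-01T00:00:00.000Z"; keep only "YYYY-MM"
    return {b["key_as_string"][:7]: b["doc_count"] for b in buckets}


# First three buckets from the response above
response = {
    "aggregations": {
        "get_Month": {
            "buckets": [
                {"key_as_string": "2020-01-01T00:00:00.000Z", "key": 1577836800000, "doc_count": 2},
                {"key_as_string": "2020-02-01T00:00:00.000Z", "key": 1580515200000, "doc_count": 1},
                {"key_as_string": "2020-03-01T00:00:00.000Z", "key": 1583020800000, "doc_count": 3},
            ]
        }
    }
}
counts = monthly_counts(response)
print(counts)  # {'2020-01': 2, '2020-02': 1, '2020-03': 3}
```

If only the counts are needed, adding "size": 0 to the search body makes Elasticsearch return the aggregations without the individual hits.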

How to perform range searches on Height in elasticsearch

How should I represent Height and HeightRange in Elasticsearch so that it's easier to do range searches?
Height.java: int feet, int inches;
HeightRange.java: Height from, Height to
I want to search for users who fall in a certain range (say 5ft - 6ft).
If I understood your issue correctly, you may use a range query. I ran a local test in which I ingested the following data:
"hits" : [
{
"_index" : "height-index-array",
"_type" : "_doc",
"_id" : "OfCHdXUB1QlsTOLdRJgd",
"_score" : 1.0,
"_source" : {
"user" : "user1",
"height" : {
"feet" : 5,
"inch" : 8
}
}
},
{
"_index" : "height-index-array",
"_type" : "_doc",
"_id" : "CfCJdXUB1QlsTOLdEZxS",
"_score" : 1.0,
"_source" : {
"user" : "user2",
"height" : {
"feet" : 7,
"inch" : 9
}
}
},
{
"_index" : "height-index-array",
"_type" : "_doc",
"_id" : "CvCJdXUB1QlsTOLdEpx5",
"_score" : 1.0,
"_source" : {
"user" : "user3",
"height" : {
"feet" : 5,
"inch" : 6
}
}
},
{
"_index" : "height-index-array",
"_type" : "_doc",
"_id" : "C_CJdXUB1QlsTOLdE5yk",
"_score" : 1.0,
"_source" : {
"user" : "user4",
"height" : {
"feet" : 5,
"inch" : 8
}
}
},
{
"_index" : "height-index-array",
"_type" : "_doc",
"_id" : "T_CJdXUB1QlsTOLdFZwx",
"_score" : 1.0,
"_source" : {
"user" : "user5",
"height" : {
"feet" : 2,
"inch" : 3
}
}
}
]
The query I used to find heights between 5 and 6 feet:
"query": {
"range": {
"height.feet": {
"gte": 5,
"lte": 6
}
}
}
The gte is equivalent to greater than or equal to and the lte is equivalent to less than or equal to.
The results are:
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "height-index-array",
"_type" : "_doc",
"_id" : "OfCHdXUB1QlsTOLdRJgd",
"_score" : 1.0,
"_source" : {
"user" : "user1",
"height" : {
"feet" : 5,
"inch" : 8
}
}
},
{
"_index" : "height-index-array",
"_type" : "_doc",
"_id" : "CvCJdXUB1QlsTOLdEpx5",
"_score" : 1.0,
"_source" : {
"user" : "user3",
"height" : {
"feet" : 5,
"inch" : 6
}
}
},
{
"_index" : "height-index-array",
"_type" : "_doc",
"_id" : "C_CJdXUB1QlsTOLdE5yk",
"_score" : 1.0,
"_source" : {
"user" : "user4",
"height" : {
"feet" : 5,
"inch" : 8
}
}
}
]
}
Let me know if you have any issues, I will be glad to help :)
As per your request, if you need to combine both metrics, you may use a bool query:
"query": {
"bool": {
"must": [
{
"range": {
"height.feet": {
"gte": 5,
"lte": 6
}
}
},{
"range": {
"height.inch": {
"gte": 6,
"lte": 8
}
}
}
]
}
}
The response:
"hits" : [
{
"_index" : "height-index-array",
"_type" : "_doc",
"_id" : "OfCHdXUB1QlsTOLdRJgd",
"_score" : 2.0,
"_source" : {
"user" : "user1",
"height" : {
"feet" : 5,
"inch" : 8
}
}
},
{
"_index" : "height-index-array",
"_type" : "_doc",
"_id" : "CvCJdXUB1QlsTOLdEpx5",
"_score" : 2.0,
"_source" : {
"user" : "user3",
"height" : {
"feet" : 5,
"inch" : 6
}
}
},
{
"_index" : "height-index-array",
"_type" : "_doc",
"_id" : "C_CJdXUB1QlsTOLdE5yk",
"_score" : 2.0,
"_source" : {
"user" : "user4",
"height" : {
"feet" : 5,
"inch" : 8
}
}
}
]
Links:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html
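One caveat with two independent range clauses: feet and inches are checked separately, so a range such as 5 ft 6 in to 6 ft 2 in cannot be expressed this way (5 ft 9 in fails an inch range of 6 to 8 even though it lies inside the height range). A common workaround is to index one numeric value per height and run a single range query over it. The following is a minimal sketch in Python, assuming a hypothetical extra field height.total_inches that would have to be added to the documents at index time; the function names are hypothetical as well.

```python
INCHES_PER_FOOT = 12


def to_inches(feet, inches=0):
    """Convert a feet/inches pair to a single number of inches."""
    return feet * INCHES_PER_FOOT + inches


def height_range_query(from_height, to_height, field="height.total_inches"):
    """Build a range query body; heights are (feet, inches) tuples.

    The field name is an assumption, it is not part of the mapping above.
    """
    return {
        "query": {
            "range": {
                field: {
                    "gte": to_inches(*from_height),
                    "lte": to_inches(*to_height),
                }
            }
        }
    }


query = height_range_query((5, 0), (6, 0))
print(query["query"]["range"]["height.total_inches"])  # {'gte': 60, 'lte': 72}
```

With this representation, user1 at 5 ft 8 in becomes 68 and falls inside the 60 to 72 range, and a HeightRange maps directly to the gte and lte bounds.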

Elastic Search sort by boolean field

I want to sort my list by true value in a field called trusted.
I have found that the sort option does not support boolean sorting.
How can I achieve this?
If I understood your issue correctly: I ran a local test on ES version 7.8 and ingested the following three documents into my index:
{ "content": "This is a test", "trusted": true }
{ "content": "This is a new test", "trusted": true }
{ "content": "This is not a test", "trusted": false }
Here is the mapping of the index:
"mappings" : {
"properties" : {
"content" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"trusted" : {
"type" : "boolean"
}
}
}
Here is the query when "order" : "desc":
{
"sort": [
{
"trusted": {
"order": "desc"
}
}
]
}
The response:
"hits" : [
{
"_index" : "boolean-sorting",
"_type" : "_doc",
"_id" : "B-YleHQBsTCl1BZvrFdA",
"_score" : null,
"_source" : {
"content" : "This is a test",
"trusted" : true
},
"sort" : [
1
]
},
{
"_index" : "boolean-sorting",
"_type" : "_doc",
"_id" : "CeYleHQBsTCl1BZvtFdJ",
"_score" : null,
"_source" : {
"content" : "This is a new test",
"trusted" : true
},
"sort" : [
1
]
},
{
"_index" : "boolean-sorting",
"_type" : "_doc",
"_id" : "DOYleHQBsTCl1BZvvVfl",
"_score" : null,
"_source" : {
"content" : "This is not a test",
"trusted" : false
},
"sort" : [
0
]
}
]
When "order":"asc", the response is:
"hits" : [
{
"_index" : "boolean-sorting",
"_type" : "_doc",
"_id" : "DOYleHQBsTCl1BZvvVfl",
"_score" : null,
"_source" : {
"content" : "This is not a test",
"trusted" : false
},
"sort" : [
0
]
},
{
"_index" : "boolean-sorting",
"_type" : "_doc",
"_id" : "B-YleHQBsTCl1BZvrFdA",
"_score" : null,
"_source" : {
"content" : "This is a test",
"trusted" : true
},
"sort" : [
1
]
},
{
"_index" : "boolean-sorting",
"_type" : "_doc",
"_id" : "CeYleHQBsTCl1BZvtFdJ",
"_score" : null,
"_source" : {
"content" : "This is a new test",
"trusted" : true
},
"sort" : [
1
]
}
]
Links:
https://www.elastic.co/guide/en/elasticsearch/reference/current/sort-search-results.html
Please let me know if I misunderstood the question, I will be glad to help.
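If the trusted flag should only be the primary sort key, a second sort entry can break ties by relevance. The following is a minimal sketch in Python that builds such a request body, assuming the boolean mapping shown above; the function name trusted_first_body and the example match query are hypothetical.

```python
def trusted_first_body(query=None):
    """Build a search body that lists trusted documents first."""
    body = {
        "sort": [
            {"trusted": {"order": "desc"}},  # true (1) sorts before false (0)
            "_score",                        # tie-breaker: relevance, descending
        ]
    }
    if query is not None:
        body["query"] = query
    return body


body = trusted_first_body({"match": {"content": "test"}})
print(body["sort"])
```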

Elasticsearch suggestion scoring not working with fuzzy search

When I run the following Elasticsearch query to get data for autocomplete, the returned data is not relevant and scoring does not seem to work:
GET quick_search/_search
{
"suggest": {
"name-suggest": {
"text": "Clic",
"completion": {
"field": "Name",
"size": 25,
"skip_duplicates": true,
"fuzzy" : {
"fuzziness": 1,
"prefix_length": 1,
"min_length": 4,
"unicode_aware": true
}
}
}
}
}
The search text is "Clic", but the fuzzy search does not return the most relevant data. How can I boost my results so that values such as "CLIC7000", which are more relevant to my query than "CLI36", come first?
{
"took" : 706,
"timed_out" : false,
"_shards" : {
"total" : 15,
"successful" : 15,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 0,
"max_score" : 0.0,
"hits" : [ ]
},
"suggest" : {
"name-suggest" : [
{
"text" : "Clic",
"offset" : 0,
"length" : 4,
"options" : [
{
"text" : "CLI36",
"_index" : "quick_search",
"_type" : "quick_search",
"_id" : "330719",
"_score" : 3.0,
"_source" : {
"ID" : "330719",
"Name" : "CLI36"
}
},
{
"text" : "CLI361511B001",
"_index" : "quick_search",
"_type" : "quick_search",
"_id" : "330717",
"_score" : 3.0,
"_source" : {
"ID" : "330717",
"Name" : "CLI361511B001"
}
},
{
"text" : "CLI42C6385B001",
"_index" : "quick_search",
"_type" : "quick_search",
"_id" : "185340",
"_score" : 3.0,
"_source" : {
"ID" : "185340",
"Name" : "CLI42C6385B001"
}
},
{
"text" : "CLI42PM",
"_index" : "quick_search",
"_type" : "quick_search",
"_id" : "185345",
"_score" : 3.0,
"_source" : {
"ID" : "185345",
"Name" : "CLI42PM"
}
},
{
"text" : "CLI42PM6389B001",
"_index" : "quick_search",
"_type" : "quick_search",
"_id" : "185343",
"_score" : 3.0,
"_source" : {
"ID" : "185343",
"Name" : "CLI42PM6389B001"
}
},
{
"text" : "CLI441",
"_index" : "quick_search",
"_type" : "quick_search",
"_id" : "233554",
"_score" : 3.0,
"_source" : {
"ID" : "233554",
"Name" : "CLI441"
}
},
{
"text" : "CLI451BK",
"_index" : "quick_search",
"_type" : "quick_search",
"_id" : "185334",
"_score" : 3.0,
"_source" : {
"ID" : "185334",
"Name" : "CLI451BK"
}
},
{
"text" : "CLI451BK6523B001",
"_index" : "quick_search",
"_type" : "quick_search",
"_id" : "185332",
"_score" : 3.0,
"_source" : {
"ID" : "185332",
"Name" : "CLI451BK6523B001"
}
},
{
"text" : "CLI451C",
"_index" : "quick_search",
"_type" : "quick_search",
"_id" : "185331",
"_score" : 3.0,
"_source" : {
"ID" : "185331",
"Name" : "CLI451C"
}
}
]
}
]
}
}
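Every option in the response above carries the same _score of 3.0, so the score alone cannot separate them. One possible workaround, which is not a feature of the completion suggester, is to re-rank the returned options on the client by how many leading characters they share with the typed text. The following is a minimal sketch in Python, assuming the options have the shape shown above; the function names are hypothetical, and "CLIC7000" is the example value from the question rather than part of the response.

```python
def common_prefix_len(a, b):
    """Length of the shared prefix of a and b, ignoring case."""
    a, b = a.lower(), b.lower()
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def rerank(text, options):
    """Put options sharing the longest prefix with the typed text first.

    sorted() is stable, so ties keep the order Elasticsearch returned.
    """
    return sorted(
        options,
        key=lambda o: common_prefix_len(text, o["text"]),
        reverse=True,
    )


options = [{"text": "CLI36"}, {"text": "CLI441"}, {"text": "CLIC7000"}]
ranked = [o["text"] for o in rerank("Clic", options)]
print(ranked)  # ['CLIC7000', 'CLI36', 'CLI441']
```

This only reorders what the suggester already returned, so the size of the suggest request must be large enough for the better matches to be included in the first place.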
