Simple way to find which one are in the same company with me? - elasticsearch

There is index have field like below, it saves who in which company and which position is
{
"createtime" : 1562844632272,
"post" : "director",
"personId" : 30007346088,
"comId" : 20010774891
}
now want to find the partners of someone, that is which person is in the same company. Now my implementation is
first find the person's related companies(at most 500)
{
"query": { "term": { "personId": 30007346088 } },
"sort": [ { "createtime": "desc" } ],
"_source": ["comId"],
"size":500
}
then find these companies' related persons and exclude the current person and remove duplicate partner(similarly at most 500 partners)
{
"query": {
"bool": {
"must": [{ "terms": { "comId": [20010774891,...] } } ],
"must_not": [ {"term":{"personId":30007346088}} ]
}
},
"aggs" : {
"personId" : {
"terms" : {
"field" : "personId",
"size": 500
}
}
},
"size":0
}
Obviously it's a little complicated, if could exist some more simple way to implement it?

It can work if data is stored in below format.
A unique document for each person , with document id same as person id and company stored as array
POST indexperson/_doc/1
{
"createtime": 1562844632272,
"personId": 1,
"company": [
{
"id": 100,
"post": "director"
},
{
"id": 100,
"post": "director"
}
]
}
Data:
[
{
"_index" : "indexperson",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"createtime" : 1562844632272,
"personId" : 1,
"company" : [
{
"id" : 100,
"post" : "director"
},
{
"id" : 101,
"post" : "director"
}
]
}
},
{
"_index" : "indexperson",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"createtime" : 1562844632272,
"personId" : 2,
"company" : [
{
"id" : 101,
"post" : "director"
}
]
}
},
{
"_index" : "indexperson",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"createtime" : 1562844632272,
"personId" : 3,
"company" : [
{
"id" : 100,
"post" : "director"
}
]
}
},
{
"_index" : "indexperson",
"_type" : "_doc",
"_id" : "4",
"_score" : 1.0,
"_source" : {
"createtime" : 1562844632272,
"personId" : 4,
"company" : [
{
"id" : 104,
"post" : "director"
}
]
}
}
]
Query:
Use (terms look up)[https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-terms-query.html]. Terms look up takes doc id as parameter
GET indexperson/_search
{
"query": {
"bool": {
"must": [
{
"terms": {
"company.id": {
"index": "indexperson",
"id": "1", --> get all docs in indexperson which match with company id
"path": "company.id"
}
}
}
],
"must_not": [
{
"term": {
"personId": {
"value": 2
}
}
}
]
}
}
}
Result:
"hits" : [
{
"_index" : "indexperson",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"createtime" : 1562844632272,
"personId" : 1,
"company" : [
{
"id" : 100,
"post" : "director"
},
{
"id" : 101,
"post" : "director"
}
]
}
},
{
"_index" : "indexperson",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"createtime" : 1562844632272,
"personId" : 3,
"company" : [
{
"id" : 100,
"post" : "director"
}
]
}
}
]

Related

ElasticSearch - Filter Buckets

My elasticSearch query is like:
{
"size": 0,
"aggs": {
"group_by_id": {
"terms": {
"field": "Infos.InstanceInfo.ID.keyword",
"size": 1000
},
"aggs": {
"tops": {
"top_hits": {
"size": 100,
"sort": {
"Infos.InstanceInfo.StartTime": "asc"
}
}
}
}
}
}
}
It works fine, I have a result of this form:
aggregations
=========>group_by_id
==============>buckets
{key:id1}
===============>docs
{doc1.Status:"KO"}
{doc2.Status:"KO"}
{key:id2}
===============>docs
{doc1.Status:"KO"}
{doc2.Status:"OK"}
{key:id3}
===============>docs
{doc1.Status:"KO"}
{doc2.Status:"OK"}
I'm trying to add a filter, so when "OK" the result must be like this:
aggregations
=========>group_by_id
==============>buckets
{key:id2}
===============>docs
{doc1.Status:"KO"}
{doc2.Status:"OK"}
{key:id3}
===============>docs
{doc1.Status:"KO"}
{doc2.Status:"OK"}
and for "KO" :
aggregations
=========>group_by_id
==============>buckets
{key:id1}
===============>docs
{doc1.Status:"KO"}
{doc2.Status:"KO"}
Fields "Startime" & "Status" are at the same level "Infos.InstanceInfo.[...]"
Any idea?
EDIT
Sample docs:
{
"took" : 794,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 10000,
"relation" : "gte"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"group_by_id" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 143846,
"buckets" : [
{
"key" : "1000",
"doc_count" : 6,
"tops" : {
"hits" : {
"total" : {
"value" : 6,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "azerty",
"_type" : "_doc",
"_id" : "vHFvoXYBVWrYChNi7hB7",
"_score" : null,
"_source" : {
"Infos" : {
"InstanceInfo" : {
"ID" : "1000",
"StartTime" : "2020-12-27T00:43:56.011+01:00",
"status" : "KO"
}
}
},
"sort" : [
1609026236011
]
},
{
"_index" : "azerty",
"_type" : "_doc",
"_id" : "xHFvoXYBVWrYChNi7xAB",
"_score" : null,
"_source" : {
"Infos" : {
"InstanceInfo" : {
"ID" : "1000",
"StartTime" : "2020-12-27T00:43:56.145+01:00",
"status" : "OK"
}
}
},
"sort" : [
1609026236145
]
},
{
"_index" : "azerty",
"_type" : "_doc",
"_id" : "xXFvoXYBVWrYChNi7xAC",
"_score" : null,
"_source" : {
"Infos" : {
"InstanceInfo" : {
"ID" : "1000",
"StartTime" : "2020-12-27T00:43:56.147+01:00",
"status" : "OK"
}
}
},
"sort" : [
1609026236147
]
},
{
"_index" : "azerty",
"_type" : "_doc",
"_id" : "x3FvoXYBVWrYChNi7xAs",
"_score" : null,
"_source" : {
"Infos" : {
"InstanceInfo" : {
"ID" : "1000",
"StartTime" : "2020-12-27T00:43:56.188+01:00",
"status" : "OK"
}
}
},
"sort" : [
1609026236188
]
},
{
"_index" : "azerty",
"_type" : "_doc",
"_id" : "yHFvoXYBVWrYChNi7xAs",
"_score" : null,
"_source" : {
"Infos" : {
"InstanceInfo" : {
"ID" : "1000",
"StartTime" : "2020-12-27T00:43:56.19+01:00",
"status" : "OK"
}
}
},
"sort" : [
1609026236190
]
},
{
"_index" : "azerty",
"_type" : "_doc",
"_id" : "ynFvoXYBVWrYChNi7xBd",
"_score" : null,
"_source" : {
"Infos" : {
"InstanceInfo" : {
"ID" : "1000",
"StartTime" : "2020-12-27T00:43:56.236+01:00",
"status" : "OK"
}
}
},
"sort" : [
1609026236236
]
}
]
}
}
},
{
"key" : "2000",
"doc_count" : 2,
"tops" : {
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "azerty",
"_type" : "_doc",
"_id" : "7HL_onYBVWrYChNij4Is",
"_score" : null,
"_source" : {
"Infos" : {
"InstanceInfo" : {
"ID" : "2000",
"StartTime" : "2020-12-27T08:00:26.011+01:00",
"status" : "KO"
}
}
},
"sort" : [
1609052426011
]
},
{
"_index" : "azerty",
"_type" : "_doc",
"_id" : "9HL_onYBVWrYChNij4Kz",
"_score" : null,
"_source" : {
"Infos" : {
"InstanceInfo" : {
"ID" : "2000",
"StartTime" : "2020-12-27T08:00:26.146+01:00",
"status" : "KO"
}
}
},
"sort" : [
1609052426146
]
}
]
}
}
},
{
"key" : "3000",
"doc_count" : 6,
"tops" : {
"hits" : {
"total" : {
"value" : 6,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "azerty",
"_type" : "_doc",
"_id" : "7nNRpHYBVWrYChNiiruh",
"_score" : null,
"_source" : {
"Infos" : {
"InstanceInfo" : {
"ID" : "3000",
"StartTime" : "2020-12-27T14:09:36.015+01:00",
"status" : "KO"
}
}
},
"sort" : [
1609074576015
]
},
{
"_index" : "azerty",
"_type" : "_doc",
"_id" : "9nNRpHYBVWrYChNii7s5",
"_score" : null,
"_source" : {
"Infos" : {
"InstanceInfo" : {
"ID" : "3000",
"StartTime" : "2020-12-27T14:09:36.166+01:00",
"status" : "OK"
}
}
},
"sort" : [
1609074576166
]
},
{
"_index" : "azerty",
"_type" : "_doc",
"_id" : "93NRpHYBVWrYChNii7s5",
"_score" : null,
"_source" : {
"Infos" : {
"InstanceInfo" : {
"ID" : "3000",
"StartTime" : "2020-12-27T14:09:36.166+01:00",
"status" : "OK"
}
}
},
"sort" : [
1609074576166
]
},
{
"_index" : "azerty",
"_type" : "_doc",
"_id" : "-XNRpHYBVWrYChNii7ti",
"_score" : null,
"_source" : {
"Infos" : {
"InstanceInfo" : {
"ID" : "3000",
"StartTime" : "2020-12-27T14:09:36.209+01:00",
"status" : "OK"
}
}
},
"sort" : [
1609074576209
]
},
{
"_index" : "azerty",
"_type" : "_doc",
"_id" : "-nNRpHYBVWrYChNii7ts",
"_score" : null,
"_source" : {
"Infos" : {
"InstanceInfo" : {
"ID" : "3000",
"StartTime" : "2020-12-27T14:09:36.219+01:00",
"status" : "OK"
}
}
},
"sort" : [
1609074576219
]
},
{
"_index" : "azerty",
"_type" : "_doc",
"_id" : "_HNRpHYBVWrYChNii7ud",
"_score" : null,
"_source" : {
"Infos" : {
"InstanceInfo" : {
"ID" : "3000",
"StartTime" : "2020-12-27T14:09:36.269+01:00",
"status" : "OK"
}
}
},
"sort" : [
1609074576269
]
}
]
}
}
}
]
}
}
}
Assuming the status field is under Infos.InstanceInfo and it's of the keyword mapping, you can utilize the filter aggregation:
{
"size": 0,
"aggs": {
"status_KO_only": {
"filter": { <--
"term": {
"Infos.InstanceInfo.Status": "KO"
}
},
"aggs": {
"group_by_id": {
"terms": {
"field": "Infos.InstanceInfo.ID.keyword",
"size": 1000
},
"aggs": {
"tops": {
"top_hits": {
"size": 100,
"sort": {
"Infos.InstanceInfo.StartTime": "asc"
}
}
}
}
}
}
}
}
}
In this particular case you could've applied the same term query in the query part of the search request without having to use a filter aggregation.
If you want to get both OK and KO in the same request, you can copy/paste the whole status_KO_only aggregation, rename the 2nd one, and voila -- you now have both groups in one request. You can of course have as many differently named (top-level) filter aggs as you like.
Now, when you indeed need multiple filter aggs at once, there's a more elegant way that does not require copy-pasting -- enter the filters aggregation:
{
"size": 0,
"aggs": {
"by_statuses": {
"filters": { <--
"filters": {
"status_KO": {
"term": {
"Infos.InstanceInfo.Status": "KO"
}
},
"status_OK": {
"term": {
"Infos.InstanceInfo.Status": "OK"
}
}
}
},
"aggs": {
"group_by_id": {
"terms": {
"field": "Infos.InstanceInfo.ID.keyword",
"size": 1000
},
"aggs": {
"tops": {
"top_hits": {
"size": 100,
"sort": {
"Infos.InstanceInfo.StartTime": "asc"
}
}
}
}
}
}
}
}
}
Any of the child sub-aggregations will automatically be the buckets of the explicitly declared term filters.
I personally find the copy/paste approach more readable, esp. when constructing such requests dynamically (based on UI dropdowns and such.)

Elasticsearch Highlights document randomly

When i run a search
{
"query": {
"bool": {
"must": [
{
"match": {
"name": {
"query":"Rose wa",
"fuzziness": "AUTO"
}
}
}
]
}
},
"highlight": {
"fields": {
"name": {}
}
}
}
then following documents come in matches
"_index" : "product",
"_type" : "_doc",
"_id" : "52486",
"_score" : 19.770897,
"_source" : {",
"category_code" : "personalcare",
"name" : "Nivea Rosewater Face Wash (100 Ml)",
"category_name" : "Personal Care "
}
},
{
"_index" : "product",
"_type" : "_doc",
"_id" : "120830",
"_score" : 17.775726,
"_source" : {
"category_code" : "beverages",
"name" : "3 Roses 500G",
"category_name" : "Beverages"
},
"highlight" : {
"name" : [
"3 <em>Roses</em> 500G"
]
}
},
why some document have highlighted fields while other don't?
and how do i ensure that highlight is always present for matched documents

GET TOP HIT FROM A VALUE IF THIS IS 0 KIBANA

My first post, I spend the weekend looking for an answer without a good result
I will try to explain my issue, I have this Index
ST ID
0 1
1 1
0 2
1 2
0 2
1 3
0 3
For example, I need to show the last records from each ID when them are 0, for example, in this index I have to show only ID 1 and ID 2, becuase the last record has ST to 0 in ID 1 and 2
Could some try to help me with this issue?
BR
Mapping:
PUT index34
{
"mappings": {
"properties": {
"ST":{
"type": "integer"
},
"ID":{
"type": "integer"
},
"Date":{
"type": "date"
}
}
}
}
Data:
[
{
"_index" : "index34",
"_type" : "_doc",
"_id" : "LO7Z7W0B_-hMjUaqtwHw",
"_score" : 1.0,
"_source" : {
"ST" : 1,
"ID" : 1,
"Date" : "2019-10-21T12:00:00Z"
}
},
{
"_index" : "index34",
"_type" : "_doc",
"_id" : "Le7Z7W0B_-hMjUaq0QEz",
"_score" : 1.0,
"_source" : {
"ST" : 0,
"ID" : 1,
"Date" : "2019-10-21T12:01:00Z"
}
},
{
"_index" : "index34",
"_type" : "_doc",
"_id" : "Lu7a7W0B_-hMjUaqAwE0",
"_score" : 1.0,
"_source" : {
"ST" : 1,
"ID" : 2,
"Date" : "2019-10-21T12:02:00Z"
}
},
{
"_index" : "index34",
"_type" : "_doc",
"_id" : "L-7a7W0B_-hMjUaqGAEr",
"_score" : 1.0,
"_source" : {
"ST" : 0,
"ID" : 2,
"Date" : "2019-10-21T12:04:00Z"
}
},
{
"_index" : "index34",
"_type" : "_doc",
"_id" : "MO7a7W0B_-hMjUaqNAGA",
"_score" : 1.0,
"_source" : {
"ST" : 0,
"ID" : 3,
"Date" : "2019-10-21T12:04:00Z"
}
},
{
"_index" : "index34",
"_type" : "_doc",
"_id" : "Me7a7W0B_-hMjUaqTQFP",
"_score" : 1.0,
"_source" : {
"ST" : 1,
"ID" : 3,
"Date" : "2019-10-21T12:06:00Z"
}
}
]
Query: I am getting max date for all terms and then getting the max value when ST was zero. If these two match(which means 0 was latest document) then I am keeping tha bucket
GET index34/_search
{
"size": 0,
"aggs": {
"ID": {
"terms": {
"field": "ID",
"size": 10000
},
"aggs": {
"maxDate": {
"max": {
"field": "Date"
}
},
"pending_status": {
"filter": {
"term": {
"ST": 0
}
},
"aggs": {
"filtered_maxdate": {
"max": {
"field": "Date"
}
}
}
},
"buckets_latest_status_pending": {
"bucket_selector": {
"buckets_path": {
"filtereddate": "pending_status>filtered_maxdate",
"maxDate": "maxDate"
},
"script": "params.filtereddate==params.maxDate"
}
}
}
}
}
}
Response:
"aggregations" : {
"ID" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 1,
"doc_count" : 2,
"pending_status" : {
"doc_count" : 1,
"filtered_maxdate" : {
"value" : 1.57165926E12,
"value_as_string" : "2019-10-21T12:01:00.000Z"
}
},
"maxDate" : {
"value" : 1.57165926E12,
"value_as_string" : "2019-10-21T12:01:00.000Z"
}
},
{
"key" : 2,
"doc_count" : 2,
"pending_status" : {
"doc_count" : 1,
"filtered_maxdate" : {
"value" : 1.57165944E12,
"value_as_string" : "2019-10-21T12:04:00.000Z"
}
},
"maxDate" : {
"value" : 1.57165944E12,
"value_as_string" : "2019-10-21T12:04:00.000Z"
}
}
]
}

Search in nested object

I'm having trouble making a query on elasticsearch 7.3
I create an index as this:
PUT myindex
{
"mappings": {
"properties": {
"files": {
"type": "nested"
}
}
}
}
After I create three documents:
PUT myindex/_doc/1
{
"SHA256" : "94ee059335e587e501cc4bf90613e0814f00a7b08bc7c648fd865a2af6a22cc2",
"files" : [
{
"filename" : "firstfilename.exe",
"datetime" : 111111111
},
{
"filename" : "secondfilename.exe",
"datetime" : 111111144
}
]
}
PUT myindex/_doc/2
{
"SHA256" : "87ee059335e587e501cc4bf90613e0814f00a7b08bc7c648fd865a2af6a22c5a",
"files" : [
{
"filename" : "thirdfilename.exe",
"datetime" : 111111133
},
{
"filename" : "fourthfilename.exe",
"datetime" : 111111122
}
]
}
PUT myindex/_doc/3
{
"SHA256" : "565e049335e587e501cc4bf90613e0814f00a7b08bc7c648fd865a2af6a22c5a",
"files" : [
{
"filename" : "fifthfilename.exe",
"datetime" : 111111155
}
]
}
How can I get the last two files based on the datetime (ids: 1 and 3)?
I would SHA256 of the last two DATETIME ordered by DESC..
I did dozens of tests but none went well...
I don't write the code I tried because I'm really on the high seas ...
I would a result like this or similar:
{
"SHA256": [
"94ee05933....a2af6a22cc2",
"565e04933....a2af6a22c5a"
]
}
Query:
GET myindex/_search
{
"_source":"SHA256",
"sort": [
{
"files.datetime": {
"mode":"max",
"order": "desc",
"nested_path": "files"
}
}
],
"size": 2
}
Result:
"hits" : [
{
"_index" : "myindex",
"_type" : "_doc",
"_id" : "3",
"_score" : null,
"_source" : {
"SHA256" : "565e049335e587e501cc4bf90613e0814f00a7b08bc7c648fd865a2af6a22c5a"
},
"sort" : [
111111155
]
},
{
"_index" : "myindex",
"_type" : "_doc",
"_id" : "1",
"_score" : null,
"_source" : {
"SHA256" : "94ee059335e587e501cc4bf90613e0814f00a7b08bc7c648fd865a2af6a22cc2"
},
"sort" : [
111111144
]
}
]
In sort you will get the max date time value . So If you need to get file names too , you can add it in _source and use sort file to get appropriate file name.
A bit more complicated query this will give you exactly two values.
GET myindex/_search
{
"_source": "SHA256",
"query": {
"bool": {
"must": [
{
"nested": {
"path": "files",
"query": {
"match_all": {}
},
"inner_hits": {
"size":1,
"sort": [
{
"files.datetime": "desc"
}
]
}
}
}
]
}
},
"sort": [
{
"files.datetime": {
"mode": "max",
"order": "desc",
"nested_path": "files"
}
}
],
"size": 2
}
Result:
[
{
"_index" : "myindex",
"_type" : "_doc",
"_id" : "3",
"_score" : null,
"_source" : {
"SHA256" : "565e049335e587e501cc4bf90613e0814f00a7b08bc7c648fd865a2af6a22c5a"
},
"sort" : [
111111155
],
"inner_hits" : {
"files" : {
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "myindex",
"_type" : "_doc",
"_id" : "3",
"_nested" : {
"field" : "files",
"offset" : 0
},
"_score" : null,
"_source" : {
"filename" : "fifthfilename.exe",
"datetime" : 111111155
},
"sort" : [
111111155
]
}
]
}
}
}
},
{
"_index" : "myindex",
"_type" : "_doc",
"_id" : "1",
"_score" : null,
"_source" : {
"SHA256" : "94ee059335e587e501cc4bf90613e0814f00a7b08bc7c648fd865a2af6a22cc2"
},
"sort" : [
111111144
],
"inner_hits" : {
"files" : {
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "myindex",
"_type" : "_doc",
"_id" : "1",
"_nested" : {
"field" : "files",
"offset" : 1
},
"_score" : null,
"_source" : {
"filename" : "secondfilename.exe",
"datetime" : 111111144
},
"sort" : [
111111144
]
}
]
}
}
}
}
]

Fetching unique data in Elasticsearch

I have following data
ID: 1, fldname: pawan
ID: 1, fldname: pawan1
ID: 1, fldname: pawan2
ID: 2, fldname: pawan3
ID: 3, fldname: pawan4
ID: 4, fldname: pawan5
I am trying to get unique data based on ID field, similar to what we get in MySQL while firing group by queries like:
select * from table_name where fldname like 'pawan%' group by ID
This will return unique values. Same works in sphinx search when we use group by function.
Is there any way to get unique values in elasticsearch..?
Below is my sample mapping:
"mappings": {
"my_type": {
"properties": {
"docid": {
"type": "keyword"
},
"flgname": {
"type": "text"
}
}
}
}
I suggest that you slightly modify your mapping:
{
"record" : {
"dynamic" : "false",
"_all" : {
"enabled" : false
},
"properties" : {
"docid" : {
"type" : "long"
},
"flgname" : {
"type" : "text"
}
}
}
}
so that docid is a long
Then you could try fuzzy queries for filtering, together with aggregations, like this one here which retrieves the minimum, maximum, average and count of docid's:
{
"from" : 0,
"size" : 10,
"_source" : true,
"query" : {
"bool" : {
"must" : [ {
"match" : {
"flgname" : {
"query" : "pawan",
"operator" : "OR",
"fuzziness" : "1",
"prefix_length" : 1,
"max_expansions" : 50,
"fuzzy_transpositions" : true,
"lenient" : false,
"zero_terms_query" : "NONE",
"boost" : 1.0
}
}
} ]
}
},
"aggs" : {
"my_cardinality" : {
"cardinality" : {
"field" : "docid"
}
},
"my_avg" : {
"avg" : {
"field" : "docid"
}
},
"my_min" : {
"min" : {
"field" : "docid"
}
},
"my_max" : {
"max" : {
"field" : "docid"
}
}
}
}
By the way this is the result of the above query on the data you proposed:
{
"took" : 47,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"failed" : 0
},
"hits" : {
"total" : 6,
"max_score" : 0.9808292,
"hits" : [ {
"_index" : "stack_overflow1",
"_type" : "record",
"_id" : "40b5eac0-743b-4a6a-a06d-3ae4d56f4aca",
"_score" : 0.9808292,
"_source" : {
"docid" : "1",
"flgname" : "pawan"
}
}, {
"_index" : "stack_overflow1",
"_type" : "record",
"_id" : "27821c39-e722-4361-bc07-0dcd5181a1ad",
"_score" : 0.7846634,
"_source" : {
"docid" : "2",
"flgname" : "pawan3"
}
}, {
"_index" : "stack_overflow1",
"_type" : "record",
"_id" : "86fcd9c1-a688-4a6a-9c45-e91791a8b902",
"_score" : 0.7846634,
"_source" : {
"docid" : "4",
"flgname" : "pawan5"
}
}, {
"_index" : "stack_overflow1",
"_type" : "record",
"_id" : "fb00a3cc-f1b8-4073-8808-f2ddbc4979e2",
"_score" : 0.55451775,
"_source" : {
"docid" : "1",
"flgname" : "pawan1"
}
}, {
"_index" : "stack_overflow1",
"_type" : "record",
"_id" : "18e5e20d-17a7-4d59-b2f1-7bf325a4c4df",
"_score" : 0.55451775,
"_source" : {
"docid" : "3",
"flgname" : "pawan4"
}
}, {
"_index" : "stack_overflow1",
"_type" : "record",
"_id" : "fbf49af6-f574-4ad2-8686-cbbedc5e70c4",
"_score" : 0.23014566,
"_source" : {
"docid" : "1",
"flgname" : "pawan2"
}
} ]
},
"aggregations" : {
"my_cardinality" : {
"value" : 4
},
"my_max" : {
"value" : 4.0
},
"my_avg" : {
"value" : 2.0
},
"my_min" : {
"value" : 1.0
}
}
}
If you make flgname also a keyword, then you can use sub-aggregation to aggregate over docID and subaggregate over flgname. Result will be similar to the SQL query you mentioned.
Query would look like:
{ "size": 0,
"query": {
"regexp":{
"flgname": "pawa.*"
}
},
"aggs" : {
"docids": {
"terms": {"field": "docid"},
"aggs": { "flgnam": { "terms": {"field": "flgname"}}}}
}
}

Resources