Search by multiple values using NEST (Elasticsearch)

I have an index called "campaigns" with these records:
"hits" : [
{
"_index" : "campaigns",
"_id" : "cf08b05c-c8b5-45cb-bca8-17267c3613fb",
"_source" : {
"PublisherId" : 1,
"CurrentStatus" : "Pending"
}
},
{
"_index" : "campaigns",
"_id" : "39436cb3-483e-4fb4-92e4-4e06ecad27a1",
"_source" : {
"PublisherId" : 1,
"CurrentStatus" : "Approved"
}
},
{
"_index" : "campaigns",
"_id" : "21436cb1-583e-4fb4-92e4-4e06ecad23a2",
"_source" : {
"PublisherId" : 1,
"CurrentStatus" : "Rejected"
}
}
]
I want to get all campaigns with "PublisherId = 1" whose status is either "Approved" or "Rejected". Something like this:
var statuses = new[] {CampaignStatus.Approved,CampaignStatus.Rejected};
campaigns.Where(c=> c.PublisherId == 1 && statuses.Contains(c.CurrentStatus)).ToList();
How can I run this query using NEST?
Expected Result:
"hits" : [
{
"_index" : "campaigns",
"_id" : "39436cb3-483e-4fb4-92e4-4e06ecad27a1",
"_source" : {
"PublisherId" : 1,
"CurrentStatus" : "Approved"
}
},
{
"_index" : "campaigns",
"_id" : "21436cb1-583e-4fb4-92e4-4e06ecad23a2",
"_source" : {
"PublisherId" : 1,
"CurrentStatus" : "Rejected"
}
}
]

I don't know the NEST syntax, but since Elasticsearch is REST based, here is a working example query in JSON format that you can convert to NEST code.
Index mapping
{
"mappings": {
"properties": {
"PublisherId": {
"type": "integer"
},
"CurrentStatus": {
"type": "text"
}
}
}
}
Index all three sample documents and use the search query below:
{
"query": {
"bool": {
"must": {
"term": {
"PublisherId": 1
}
},
"should": [
{
"match": {
"CurrentStatus": "Rejected"
}
},
{
"match": {
"CurrentStatus": "Approved"
}
}
],
"minimum_should_match" : 1
}
}
}
Search Result
"hits": [
{
"_index": "stof_63968525",
"_type": "_doc",
"_id": "1",
"_score": 1.9808291,
"_source": {
"PublisherId": 1,
"CurrentStatus": "Approved"
}
},
{
"_index": "stof_63968525",
"_type": "_doc",
"_id": "3",
"_score": 1.9808291,
"_source": {
"PublisherId": 1,
"CurrentStatus": "Rejected"
}
}
]
Note the use of minimum_should_match, which forces at least one of the statuses Rejected and Approved to match. Refer to the bool query documentation in Elasticsearch to understand how the query is constructed.
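If the request body is built in code, the same bool query can be generated from a list of statuses. The following is a minimal Python sketch, not NEST code; the helper name build_status_query is hypothetical, and the field names are taken from the mapping above.

```python
import json


def build_status_query(publisher_id, statuses):
    """Build the bool query shown above for any list of statuses.

    Hypothetical helper: the name and signature are illustrative only.
    """
    return {
        "query": {
            "bool": {
                # The publisher id must match exactly.
                "must": {"term": {"PublisherId": publisher_id}},
                # One match clause per accepted status.
                "should": [
                    {"match": {"CurrentStatus": status}} for status in statuses
                ],
                # Require at least one of the should clauses to match.
                "minimum_should_match": 1,
            }
        }
    }


body = build_status_query(1, ["Rejected", "Approved"])
print(json.dumps(body, indent=2))
```

The printed JSON has the same shape as the search query above and can be sent as the body of a _search request.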

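Did you try this?
// Condition on the publisher (the AND part).
QueryContainer queryAnd = new TermQuery() { Field = "PublisherId", Value = 1 };
// Either status is acceptable (the OR part).
QueryContainer queryOr = new TermQuery() { Field = "CurrentStatus", Value = "Approved" };
queryOr |= new TermQuery() { Field = "CurrentStatus", Value = "Rejected" };
// Combine: PublisherId AND (Approved OR Rejected).
QueryContainer queryMain = queryAnd & queryOr;
// Note: term queries are not analyzed, so if CurrentStatus is mapped as text,
// target a keyword field instead.
ISearchResponse<campaigns> searchResponse = elasticClient.Search<campaigns>(s => s
.Query(q2 => q2
.Bool(b => b
.Should(queryMain)
)));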

How to query all content from a field in Elasticsearch

I'm querying data from Elasticsearch with Python. I can query a certain value in a field like this:
GET index/_search
{
"query": {
"match" : {
"somefieldname": "somevalue"
}
}
}
But how can I query all values inside the field somefieldname?
UPDATE:
Here's an example index:
{
"_index" : "indexname",
"_type" : "_doc",
"_id" : "lJlcO3wBhlKWxmXE9jrd",
"_score" : 0,
"_source": {
"field1": "abc",
"field2": "123",
"field3": "def"
}
},
{
"_index" : "indexname",
"_type" : "_doc",
"_id" : "lJlcO3wBhlKWxmXE9jrd",
"_score" : 0,
"_source": {
"field1": "fgh",
"field2": "654",
"field3": "kui"
}
},
{
"_index" : "indexname",
"_type" : "_doc",
"_id" : "lJlcO3wBhlKWxmXE9jrd",
"_score" : 0,
"_source": {
"field1": "567",
"field2": "gfr",
"field3": "234"
}
}
Now I want to query all content from field2 across all docs, so that my output is ["123", "654", "gfr"].
UPDATE:
Index mapping for the field:
{
"myindex" : {
"mappings" : {
"field2" : {
"full_name" : "field2",
"mapping" : {
"field2" : {
"type" : "keyword"
}
}
}
}
}
}
You can use a terms aggregation to get the unique values of field2:
{
"size": 0,
"aggs": {
"field2values": {
"terms": {
"field": "field2"
}
}
}
}
Search Result would be
"aggregations": {
"field2values": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "123",
"doc_count": 1
},
{
"key": "654",
"doc_count": 1
},
{
"key": "gfr",
"doc_count": 1
}
]
}
}
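Since the question mentions querying from Python, here is a minimal sketch of pulling the values out of that aggregation result. It works on a plain dict shaped like the response above, so no client library is assumed; the helper name bucket_keys is hypothetical.

```python
def bucket_keys(response):
    """Return the key of every bucket in the field2values terms aggregation."""
    buckets = response["aggregations"]["field2values"]["buckets"]
    return [bucket["key"] for bucket in buckets]


# Aggregation section copied from the search result above.
response = {
    "aggregations": {
        "field2values": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {"key": "123", "doc_count": 1},
                {"key": "654", "doc_count": 1},
                {"key": "gfr", "doc_count": 1},
            ],
        }
    }
}

values = bucket_keys(response)
print(values)  # ['123', '654', 'gfr']
```

Keep in mind that a terms aggregation returns only the top 10 buckets by default, so a larger "size" is needed when the field has more distinct values.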

Filter nested objects in ElasticSearch 6.8.1

I couldn't find any answer on how to do a simple thing in Elasticsearch 6.8: I need to filter nested objects.
Index
{
"settings": {
"index": {
"number_of_shards": "5",
"number_of_replicas": "1"
}
},
"mappings": {
"human": {
"properties": {
"cats": {
"type": "nested",
"properties": {
"name": {
"type": "text"
},
"breed": {
"type": "text"
},
"colors": {
"type": "integer"
}
}
},
"name": {
"type": "text"
}
}
}
}
}
Data
{
"name": "iridakos",
"cats": [
{
"colors": 1,
"name": "Irida",
"breed": "European Shorthair"
},
{
"colors": 2,
"name": "Phoebe",
"breed": "european"
},
{
"colors": 3,
"name": "Nino",
"breed": "Aegean"
}
]
}
Select the human with name="iridakos" and the cats whose breed contains 'European' (ignoring case).
Only two cats should be returned.
Many thanks for your help.
For nested datatypes, you need to use nested queries.
Elasticsearch always returns the entire document in the response. Note that with the nested datatype, every item in the list is treated as a separate document in itself.
Hence, if you want to know the exact nested hits in addition to getting the entire document back, you need to use the inner_hits feature.
The query below should help you.
POST <your_index_name>/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"name": "iridakos"
}
},
{
"nested": {
"path": "cats",
"query": {
"match": {
"cats.breed": "european"
}
},
"inner_hits": {}
}
}
]
}
}
}
Response:
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.74455214,
"hits" : [
{
"_index" : "my_cat_index",
"_type" : "_doc",
"_id" : "1", <--- The document that hit
"_score" : 0.74455214,
"_source" : {
"name" : "iridakos",
"cats" : [
{
"colors" : 1,
"name" : "Irida",
"breed" : "European Shorthair"
},
{
"colors" : 2,
"name" : "Phoebe",
"breed" : "european"
},
{
"colors" : 3,
"name" : "Nino",
"breed" : "Aegean"
}
]
},
"inner_hits" : { <---- Note this
"cats" : {
"hits" : {
"total" : {
"value" : 2, <---- Count of nested doc hits
"relation" : "eq"
},
"max_score" : 0.52354836,
"hits" : [
{
"_index" : "my_cat_index",
"_type" : "_doc",
"_id" : "1",
"_nested" : {
"field" : "cats",
"offset" : 1
},
"_score" : 0.52354836,
"_source" : { <---- First Nested Document
"breed" : "european"
}
},
{
"_index" : "my_cat_index",
"_type" : "_doc",
"_id" : "1",
"_nested" : {
"field" : "cats",
"offset" : 0
},
"_score" : 0.39019167,
"_source" : { <---- Second Document
"breed" : "European Shorthair"
}
}
]
}
}
}
}
]
}
}
Note how the inner_hits section appears in the response; that is where you find the exact nested hits.
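To read the matching cats out of such a response in client code, the inner_hits section can be walked like any nested dict. This is a minimal Python sketch over a trimmed copy of the response above; the helper name matching_cats is hypothetical and no Elasticsearch client is assumed.

```python
def matching_cats(response):
    """Collect (human name, nested offset, nested source) for every inner hit."""
    results = []
    for hit in response["hits"]["hits"]:
        # inner_hits is keyed by the nested path used in the query ("cats").
        inner_hits = hit["inner_hits"]["cats"]["hits"]["hits"]
        for nested in inner_hits:
            results.append(
                (hit["_source"]["name"], nested["_nested"]["offset"], nested["_source"])
            )
    return results


# Trimmed copy of the response shown above.
response = {
    "hits": {
        "hits": [
            {
                "_id": "1",
                "_source": {"name": "iridakos"},
                "inner_hits": {
                    "cats": {
                        "hits": {
                            "hits": [
                                {
                                    "_nested": {"field": "cats", "offset": 1},
                                    "_source": {"breed": "european"},
                                },
                                {
                                    "_nested": {"field": "cats", "offset": 0},
                                    "_source": {"breed": "European Shorthair"},
                                },
                            ]
                        }
                    }
                },
            }
        ]
    }
}

cats = matching_cats(response)
for name, offset, source in cats:
    print(name, offset, source["breed"])
```

The offset is the position of the matching cat inside the parent's cats array, which makes it possible to map each inner hit back to the full entry in _source.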
Hope this helps!
You could use something like this:
{
"query": {
"bool": {
"must": [
{ "match": { "name": "iridakos" }},
{ "match": { "cats.breed": "European" }}
]
}
}
}
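To search on a cat's breed, you can use dot notation. Note that this plain form only matches when cats is mapped as a regular object field; with the nested mapping shown in the question, the clause has to be wrapped in a nested query.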

elasticsearch groupby and filter by regex condition

It's a bit hard for me to define the question as I'm not very experienced with Elasticsearch. I'm focusing the question on my specific problem:
Assuming I have the following records:
{
id: 1
name: bla1_1.aaa
},
{
id: 1
name: bla1_2.bbb
},
{
id: 2
name: bla2_1.aaa
},
{
id: 2
name: bla2_2.aaa
}
What I want is to GET all the ids that have all of their names ending with aaa.
I was thinking of grouping by id and then applying a regex query such as *\.aaa that every name in the group must satisfy.
On this particular example I would get id: 2 back.
How do I do it?
Let me know if there's anything I need to add to clarify the question.
A regexp query can be used.
The pattern .* matches any character any number of times, including zero.
A terms aggregation will give you the unique ids and the number of docs under each of them.
Mapping :
PUT regex
{
"mappings": {
"properties": {
"id":{
"type":"integer"
},
"name":{
"type":"text",
"fields": {
"keyword":{
"type":"keyword"
}
}
}
}
}
}
Data:
"hits" : [
{
"_index" : "regex",
"_type" : "_doc",
"_id" : "olQXjW0BywGFQhV7k84P",
"_score" : 1.0,
"_source" : {
"id" : 1,
"name" : "bla1_1.aaa"
}
},
{
"_index" : "regex",
"_type" : "_doc",
"_id" : "o1QXjW0BywGFQhV7us6B",
"_score" : 1.0,
"_source" : {
"id" : 1,
"name" : "bla1_2.bbb"
}
},
{
"_index" : "regex",
"_type" : "_doc",
"_id" : "pFQXjW0BywGFQhV77c6J",
"_score" : 1.0,
"_source" : {
"id" : 2,
"name" : "bla2_1.aaa"
}
},
{
"_index" : "regex",
"_type" : "_doc",
"_id" : "pVQYjW0BywGFQhV7Dc6F",
"_score" : 1.0,
"_source" : {
"id" : 2,
"name" : "bla2_2.aaa"
}
}
]
Query:
GET regex/_search
{
"size":0,
"query": {
"regexp": {
"name.keyword": {
"value": ".*\\.aaa" ---> name ending with .aaa
}
}
},
"aggs": {
"unique_ids": {
"terms": {
"field": "id",
"size": 10
}
}
}
}
Result:
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"unique_ids" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 2, ---> 2 doc under id 2
"doc_count" : 2
},
{
"key" : 1, ----> 1 doc under id 1
"doc_count" : 1
}
]
}
}
Edit:
Use a bucket selector to keep only the buckets where the total number of docs for an id equals the number of docs matched by the regex:
GET regex/_search
{
"size": 0,
"aggs": {
"unique_ids": {
"terms": {
"field": "id",
"size": 10
},
"aggs": {
"totalCount": { ---> to get total count of id(all docs)
"value_count": {
"field": "id"
}
},
"filter_agg": {
"filter": {
"bool": {
"must": [
{
"regexp": {
"name.keyword": ".*\\.aaa"
}
}
]
}
},
"aggs": {
"finalCount": { -->total count of docs matching regex
"value_count": {
"field": "id"
}
}
}
},
"mybucket_selector": { ---> include buckets where totalcount==finalcount
"bucket_selector": {
"buckets_path": {
"FinalCount": "filter_agg>finalCount",
"TotalCount": "totalCount"
},
"script": "params.FinalCount==params.TotalCount"
}
}
}
}
}
}
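The counting logic behind the bucket selector can be checked locally without a cluster. The sketch below is plain Python, not an Elasticsearch API: it groups the sample records by id, counts all docs and the docs whose name matches the pattern, and keeps the ids where both counts are equal. The function name is hypothetical, and the pattern escapes the dot so that it matches a literal ".".

```python
import re
from collections import defaultdict

# Sample records from the question.
records = [
    {"id": 1, "name": "bla1_1.aaa"},
    {"id": 1, "name": "bla1_2.bbb"},
    {"id": 2, "name": "bla2_1.aaa"},
    {"id": 2, "name": "bla2_2.aaa"},
]


def ids_where_all_names_match(records, pattern):
    """Return the ids for which every record's name matches the pattern."""
    regex = re.compile(pattern)
    total_count = defaultdict(int)   # plays the role of totalCount
    final_count = defaultdict(int)   # plays the role of finalCount
    for record in records:
        total_count[record["id"]] += 1
        # Elasticsearch regexp patterns match the whole value, hence fullmatch.
        if regex.fullmatch(record["name"]):
            final_count[record["id"]] += 1
    # Same test as the bucket_selector script: FinalCount == TotalCount.
    return sorted(i for i in total_count if total_count[i] == final_count[i])


result = ids_where_all_names_match(records, r".*\.aaa")
print(result)  # [2]
```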

Simple way to find who is in the same company as me?

I have an index with fields like the ones below; it stores which person holds which position in which company:
{
"createtime" : 1562844632272,
"post" : "director",
"personId" : 30007346088,
"comId" : 20010774891
}
Now I want to find someone's partners, that is, the people who are in the same company. My current implementation is as follows.
First, find the person's related companies (at most 500):
{
"query": { "term": { "personId": 30007346088 } },
"sort": [ { "createtime": "desc" } ],
"_source": ["comId"],
"size":500
}
Then find the people related to these companies, exclude the current person, and remove duplicate partners (again at most 500 partners):
{
"query": {
"bool": {
"must": [{ "terms": { "comId": [20010774891,...] } } ],
"must_not": [ {"term":{"personId":30007346088}} ]
}
},
"aggs" : {
"personId" : {
"terms" : {
"field" : "personId",
"size": 500
}
}
},
"size":0
}
Obviously this is a little complicated. Is there a simpler way to implement it?
It can work if the data is stored in the format below:
one document per person, with the document id equal to the person id and the companies stored as an array.
POST indexperson/_doc/1
{
"createtime": 1562844632272,
"personId": 1,
"company": [
{
"id": 100,
"post": "director"
},
{
"id": 101,
"post": "director"
}
]
}
Data:
[
{
"_index" : "indexperson",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"createtime" : 1562844632272,
"personId" : 1,
"company" : [
{
"id" : 100,
"post" : "director"
},
{
"id" : 101,
"post" : "director"
}
]
}
},
{
"_index" : "indexperson",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"createtime" : 1562844632272,
"personId" : 2,
"company" : [
{
"id" : 101,
"post" : "director"
}
]
}
},
{
"_index" : "indexperson",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"createtime" : 1562844632272,
"personId" : 3,
"company" : [
{
"id" : 100,
"post" : "director"
}
]
}
},
{
"_index" : "indexperson",
"_type" : "_doc",
"_id" : "4",
"_score" : 1.0,
"_source" : {
"createtime" : 1562844632272,
"personId" : 4,
"company" : [
{
"id" : 104,
"post" : "director"
}
]
}
}
]
Query:
Use a terms lookup (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-terms-query.html). A terms lookup takes a document id as a parameter.
GET indexperson/_search
{
"query": {
"bool": {
"must": [
{
"terms": {
"company.id": {
"index": "indexperson",
"id": "1", --> get all docs in indexperson which match with company id
"path": "company.id"
}
}
}
],
"must_not": [
{
"term": {
"personId": {
"value": 2
}
}
}
]
}
}
}
Result:
"hits" : [
{
"_index" : "indexperson",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"createtime" : 1562844632272,
"personId" : 1,
"company" : [
{
"id" : 100,
"post" : "director"
},
{
"id" : 101,
"post" : "director"
}
]
}
},
{
"_index" : "indexperson",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"createtime" : 1562844632272,
"personId" : 3,
"company" : [
{
"id" : 100,
"post" : "director"
}
]
}
}
]
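To make clear what the terms lookup does with the sample data, here is a plain Python emulation, not an Elasticsearch call: it reads the company ids from the lookup document, keeps every document that shares at least one of them, and drops the person named in must_not. The function name and the trimmed data layout are illustrative assumptions.

```python
# Trimmed copy of the four sample documents, keyed by document id.
docs = {
    "1": {"personId": 1, "company": [{"id": 100}, {"id": 101}]},
    "2": {"personId": 2, "company": [{"id": 101}]},
    "3": {"personId": 3, "company": [{"id": 100}]},
    "4": {"personId": 4, "company": [{"id": 104}]},
}


def emulate_terms_lookup(docs, lookup_id, excluded_person_id):
    """Return ids of docs sharing a company with docs[lookup_id]."""
    # Step 1: fetch the terms from the lookup document (path "company.id").
    wanted = {company["id"] for company in docs[lookup_id]["company"]}
    hits = []
    for doc_id, doc in docs.items():
        company_ids = {company["id"] for company in doc["company"]}
        # Step 2: terms query (any shared id) plus the must_not clause.
        if company_ids & wanted and doc["personId"] != excluded_person_id:
            hits.append(doc_id)
    return hits


hits = emulate_terms_lookup(docs, "1", 2)
print(hits)  # ['1', '3']
```

With the same parameters as the query above (lookup document "1", personId 2 excluded), the emulation returns documents 1 and 3, matching the result shown.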

Elasticsearch unable to retrieve child documents

I recently migrated Elasticsearch from version 2.4 to 6.2.1, and my previous GET query no longer works. Below is the query I use to retrieve the child document based on its _id and _parent values. Do I have to change the implementation to retrieve the documents from ES?
{
"query": {
"bool": {
"must": [
{
"term": {
"_id": {
"value": "9:v0",
"boost": 1
}
}
},
{
"term": {
"_parent": {
"value": "v0",
"boost": 1
}
}
},
{
"terms": {
"assoc.domainId": [
"XX"
],
"boost": 1
}
},
{
"terms": {
"assoc.nodeId": [
"YY"
],
"boost": 1
}
}
],
"adjust_pure_negative": false,
"boost": 1
}
}
}
Parent document in ES:
{
"_index" : "test",
"_type" : "assocjoin",
"_id" : "v0",
"_score" : 1.0,
"_source" : {
"my_join_field" : {
"name" : "version"
},
"versionnumber" : "v0",
"versiondate" : "2018/03/29 13:25:02"
}
}
Child document in ES:
{
"_index" : "test",
"_type" : "versionjoin",
"_id" : "9:v0",
"_score" : 0.18232156,
"_routing" : "v0",
"_source" : {
"id" : 0,
"assocDTO" : {
"id" : 9,
"domainId" : "XX",
"nodeId" : "YY"
},
"biomarkers" : [
{
....
}
],
"contexts" : [
{
....
}
]
},
"my_join_field" : {
"name" : "assocversion",
"parent" : "v0"
}
}
