Empty String elastic search - elasticsearch

I'm using Elasticsearch 6.5.
I need to include an empty-string search in one of the criteria I'm passing:
primaryKey = 1, 2, 3
subKey = "" or subKey = "A", along with a bunch of other criteria.
I've been unable to get the record that has the empty subKey.
I've tried using must_not with exists, but it doesn't fetch the record in question.
The query below should return any records that have a primaryKey of 1, 2, or 3 and a subKey of 'A' or an empty string, filtered by the date provided. I get all the records except the one where the subKey is blank.
Here is what I've tried:
{
"size": 200, "from": 0,
"query": {
"bool": {
"must": [{
"bool": {
"should": [{ "terms": {"primaryKey": [1,2,3] }}]
}
},
{
"bool": {
"should": [
{"match": {"subKey": "A"}},
{
"bool" : {
"must_not": [{ "exists": { "field": "subKey"} }]
}
}
]
}
}],
"filter": [{"range": {"startdate": {"lte": "2018-11-01"}}}]
}
}
}
The subKey field is special: it's actually searched by letter. I don't think that affects anything, but here is the NEST code I have for that index.
new CreateIndexDescriptor("SpecialIndex").Settings(s => s
.Analysis(a => a
.Analyzers(aa => aa
.Custom("subKey_analyzer", ma => ma
.Tokenizer("subKey_tokenizer")
.Filters("lowercase")
)
)
.Tokenizers(ta => ta
.NGram("subKey_tokenizer", t => t
.MinGram(1)
.MaxGram(1)
.TokenChars(new TokenChar[] { TokenChar.Letter, TokenChar.Whitespace })
)
)
)
)
.Mappings(ms => ms
.Map<SpecialIndex>(m => m
.Properties(p => p
.Text(s => s
.Name(x => x.subKey)
.Analyzer("subKey_analyzer")
)
)
));
Any ideas on how to resolve this? Thank you very much!
NOTE: I've seen posts saying this can be done with a filter, using missing. But as you can see from the query, I need the query to do this, not the filter.
I've also tried the following instead of must_not with exists:
{
"term": { "subKey": { "value": "" }}
}
but it doesn't work either. I'm thinking I need another tokenizer to get this working.
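One way to see why the `term` query on `""` finds nothing: an analyzed text field only stores the tokens its analyzer emits, and an empty string yields no tokens at all. The sketch below is a rough Python model of the `subKey_analyzer` above (1-character n-grams over letters and whitespace, then lowercasing). It is not Lucene's implementation, and `one_gram_tokens` is a made-up helper name.

```python
from typing import List


def one_gram_tokens(text: str) -> List[str]:
    """Rough model of subKey_analyzer: keep only letters and whitespace,
    emit each kept character as its own 1-gram, then lowercase it."""
    return [ch.lower() for ch in text if ch.isalpha() or ch.isspace()]


print(one_gram_tokens("A"))    # ['a']
print(one_gram_tokens("AB1"))  # ['a', 'b']
print(one_gram_tokens(""))     # []
```

Because `""` produces no tokens, there is no indexed term in the analyzed field for a `term` query on `""` to match.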

OK, I managed to fix this by using multi-fields. This is what I did.
I changed the mappings to this:
.Mappings(ms => ms
.Map<SpecialIndex>(m => m
.Properties(p => p
.Text(s => s
.Name(x => x.subKey)
.Fields(ff => ff
.Text(tt => tt
.Name("subKey")
.Analyzer("subKey_analyzer")
)
.Keyword(k => k
.Name("keyword")
.IgnoreAbove(5)
)
)
)
)
));
Then I changed the bool piece of my query to this:
"bool": {
"should": [{
"match": {
"subKey.subKey": {
"query": "A"
}
}
},
{
"term": {
"subKey.keyword": {
"value": ""
}
}
}]
}
What I don't really like about this is that I think Elasticsearch is creating an additional field just to find empty strings in the same field. That really doesn't seem ideal.
If anyone has another suggestion, that would be great!
[UPDATE] The NEST implementation needs to use Suffix() to access the multi-fields.
.Bool(bb => bb
    .Should(
        bbs => bbs.Match(m => m.Field(f => f.subKey.Suffix("subKey")).Query(search.subKey)),
        bbs => bbs.Term(t => t.Verbatim().Field(f => f.subKey.Suffix("keyword")).Value(string.Empty))
    )
)
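For reference, here is a sketch of how the revised bool piece might slot back into the full request from the question. It builds the JSON body as a plain Python dict, with no Elasticsearch client involved. `build_query` and its parameters are hypothetical names. The single-clause `should` wrapper around the `primaryKey` terms is flattened into a direct `must` clause, and `minimum_should_match` is spelled out for clarity. Neither change appears in the original query.

```python
import json
from typing import Any, Dict, List


def build_query(primary_keys: List[int], sub_key: str, max_start_date: str) -> Dict[str, Any]:
    """Assemble the search body: primaryKey in the list, AND
    (subKey matches sub_key OR subKey is the empty string), filtered by date."""
    sub_key_clause = {
        "bool": {
            "should": [
                # analyzed sub-field: letter-by-letter matching
                {"match": {"subKey.subKey": {"query": sub_key}}},
                # keyword sub-field: exact match on the empty string
                {"term": {"subKey.keyword": {"value": ""}}},
            ],
            "minimum_should_match": 1,
        }
    }
    return {
        "size": 200,
        "from": 0,
        "query": {
            "bool": {
                "must": [{"terms": {"primaryKey": primary_keys}}, sub_key_clause],
                "filter": [{"range": {"startdate": {"lte": max_start_date}}}],
            }
        },
    }


print(json.dumps(build_query([1, 2, 3], "A", "2018-11-01"), indent=2))
```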

Related

Elasticsearch sorting by nested field in nested array

I'm using ElasticSearch 7.2.1 and NEST 7.2.1
My data structure is following
{
  "id": "some_id",
  "roles": [
    {
      "name": "role_one_name",
      "members": [
        {
          "id": "member_one_id",
          "name": "member_one_name"
        }
      ]
    },
    {
      "name": "role_two_name",
      "members": [
        {
          "id": "member_two_id",
          "name": "member_two_name"
        }
      ]
    }
  ]
}
The idea is that I need to implement sorting by a given role name (e.g. role_one_name).
Sorting should be performed on members.name (e.g. members[0].name). In my case the members array will always contain one element, but for some roles (omitted in the example) it contains more than one element, so I can't get rid of the nested array.
In my head I have an algorithm:
Get the needed role by name.
Specify the path to the first element in the members array.
Point to the name property to sort on.
I'm a newbie in the Elasticsearch world, and after a few days of trying I ended up with the following query (which does not work).
var sortFilters = new List<Func<FieldSortDescriptor<T>, FieldSortDescriptor<T>>>();
var sortFieldValue = "role_two_name";
...
sortFilters.Add(o => o.Nested(n => n
.Path(p => p.Roles)
.Filter(f => f
.Term(t => t
.Field(c => c.Roles.First().Name)
.Value(sortFieldValue)) && f
.Nested(n => n
.Path(p => p.Roles.First().Members)
.Query(q => q
.Term(t => t
.Field(f => f.Roles.First().Members.First().Name)))))));
What am I doing wrong?
With help of my colleagues I managed to solve it.
GET index_name/_search
{
"from": 0,
"size": 20,
"query": {
"match_all": {}
},
"sort": [{
"roles.members.name.keyword": {
"order": "asc",
"nested": {
"path": "roles",
"filter": {
"term": {
"roles.name.keyword": {
"value": "sortFieldValue"
}
}
},
"nested": {
"path": "roles.members"
}
}
}
}
]
}
or using NEST:
sortFilters.Add(o => o.Field(f => f.Roles.First().Members.First().Name.Suffix("keyword")));
sortFilters.Add(o => o.Nested(n => n
.Path(p => p.Roles)
.Filter(f => f
.Term(t => t
.Field(q => q.Roles.First().Name.Suffix("keyword"))
.Value(sortFieldValue)
)
)
.Nested(n => n
.Path(p => p.Roles.First().Members)
)
));
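Note that `"sortFieldValue"` in the JSON above stands in for the actual role name. As a small illustration, the sort clause can be generated from the role name like this. The sketch is plain Python that only builds the request body. `nested_member_sort` is a hypothetical helper, and the `.keyword` sub-fields are assumed to exist, as in the answer.

```python
from typing import Any, Dict


def nested_member_sort(role_name: str, order: str = "asc") -> Dict[str, Any]:
    """Sort on roles.members.name, considering only the role named role_name."""
    return {
        "roles.members.name.keyword": {
            "order": order,
            "nested": {
                "path": "roles",
                # keep only the role we want to sort by
                "filter": {"term": {"roles.name.keyword": {"value": role_name}}},
                # then descend into that role's members
                "nested": {"path": "roles.members"},
            },
        }
    }


search_body = {
    "from": 0,
    "size": 20,
    "query": {"match_all": {}},
    "sort": [nested_member_sort("role_two_name")],
}
```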

Converting JSON to Elastic NEST query doesn't work as intended

I'm trying to convert the following JSON to NEST, but it's not working as intended. It does match the field with the website, but it doesn't match the range, so I get some very old results.
When using Kibana to search, I send this request:
"query": {
"bool": {
"must": [],
"filter": [
{
"bool": {
"should": [
{
"match": {
"domain": "website.com"
}
}
],
"minimum_should_match": 1
}
},
{
"range": {
"@timestamp": {
"gte": "2020-08-03T12:37:07.821Z",
"lte": "2020-08-18T12:37:07.821Z",
"format": "strict_date_optional_time"
}
}
}
],
"should": [],
"must_not": []
}
},
And converted to NEST:
SearchDescriptor<ApacheRequest> Query(SearchDescriptor<ApacheRequest> qc)
{
var query = qc.Query(q =>
q.Bool(b =>
b.Filter(f =>
f.Bool(fb =>
fb.Should(sh =>
sh.Match(ma => ma
.Field(x => x.Domain)
.Query("website.com")
)
)
),
f => f.Range(r => r.GreaterThanOrEquals(timestamp))
)
)
);
return query;
}
As I said, it matches the domain, but not the range. I get results a month back, even though I've tested that my timestamp is correct.
What am I doing wrong?
Ah, I found the issue.. I'm not supposed to use .Range() but rather .DateRange(). Now my query looks like this:
SearchDescriptor<ApacheRequest> Query(SearchDescriptor<ApacheRequest> qc)
{
var query = qc.Query(q =>
q.Bool(b =>
b.Filter(f =>
f.Bool(fb =>
fb.Must(sh =>
sh.Match(ma => ma
.Field(x => x.Domain)
.Query("website.com")
)
)
),
f => f.DateRange(r =>
r.Field(fi => fi.Timestamp).GreaterThanOrEquals(from)
)
)
)
);
return query;
}

Convert elastic search query + aggregations to nest syntax

I need to turn the following query and aggregations, as written in Kibana, into C# NEST syntax.
The main issue is the filter on "harvest_date" (I need to restrict it to the last 3 months), but I'm also not sure the query itself follows best practice.
GET tdnetindex/_search
{
"size": 0,
"aggs": {
"TermsAggregation": {
"terms": {
"field": "database",
"size": 100
},
"aggs": {
"DateHistogramAggregation": {
"date_histogram": {
"field": "harvest_date",
"interval": "month"
}
}
}
}
},
"query": {
"bool": {
"filter": {
"range": {
"harvest_date": {
"gte": "now-3M/M"
}
}
}
}
}
}
What I have so far is:
var query = elasticClient.Search<ElasticResponse>(s => s
.Size(0)
.Aggregations(a1 => a1
.Terms("TermsAggregation", t => t
.Field(f => f.DataBase)
.Size(100)
.Aggregations(a2 => a2
.DateHistogram("DateHistogramAggregation", dh => dh
.Field(f => f.HarvestDate)
.Interval(DateInterval.Month)
)
)
)
)
.Query(q => q
.Bool(b => b
.Filter(f => f
.Range(r => r
.GreaterThanOrEquals(....);
)
)
)
)
)
You're almost there, just need to use .DateRange(r => r...) instead of .Range(r => r...).
For the DateMath expression, you can use the string "now-3M/M" directly, or translate to
DateMath.Now.Subtract("3M").RoundTo(DateMathTimeUnit.Month)
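To make the date math concrete: `now-3M/M` subtracts three months from the current time and then rounds to the month. For a `gte` bound, the rounding goes down to the first instant of that month. The sketch below reproduces that arithmetic in plain Python, purely as an illustration. `months_ago_month_start` is a hypothetical helper, and it assumes naive UTC datetimes.

```python
from datetime import datetime


def months_ago_month_start(now: datetime, months: int) -> datetime:
    """Model of 'now-<months>M/M' as a lower bound:
    go back `months` calendar months, then round down to the month start."""
    total = now.year * 12 + (now.month - 1) - months  # months since year 0
    year, month_index = divmod(total, 12)
    return datetime(year, month_index + 1, 1)


print(months_ago_month_start(datetime(2020, 8, 18, 12, 37, 7), 3))  # 2020-05-01 00:00:00
```

With this bound, a search run on 18 August would cover May through August, so the monthly histogram would show up to four buckets, not three.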

Append .keyword to fieldname in NEST elasticsearch query

Imagine I have my query as:
.Query(query =>
query.Bool(b => b.Must(m =>
m.Wildcard(w => w.Field(f => f.userName).Value(string.Format("*{0}*", searchModel.username).Suffix("keyword")))
)));
The output query (from DebugInformation) looks like this:
{
"query": {
"bool": {
"must": [{
"wildcard": {
"userName": "*alex*"
}
}
],
"must_not": [],
"should": []
}
}
}
However, this does not work. It needs ".keyword" to be appended to the end of userName. The query below works, but I cannot generate it through NEST:
{
"query": {
"bool": {
"must": [{
"wildcard": {
"userName.keyword": "*alex*"
}
}
],
"must_not": [],
"should": []
}
}
}
Any idea how to make NEST add ".keyword" to the end of the field name? (In fluent fashion, of course; otherwise w.Field("userName.keyword") works.)
The Suffix() call needs to be part of the member access expression passed to Field(), not applied to the value:
.Query(query => query
.Bool(b => b
.Must(m => m
.Wildcard(w => w
.Field(f => f.userName.Suffix("keyword"))
.Value(string.Format("*{0}*", searchModel.username)
)
)
)
));

Multi-term filter in ElasticSearch (NEST)

I am trying to query documents based on a given field having multiple possible values. For example, my documents have an "extension" property, which is the extension type of a file, like .docx, .xls, .pdf, etc. I want to be able to filter my "extension" property on any number of values, but cannot find the correct syntax to get this functionality. Here is my current query:
desc.Type("entity")
.Routing(serviceId)
.From(pageSize * pageOffset)
.Size(pageSize)
.Query(q => q
.Filtered(f => f
.Query(qq =>
qq.MultiMatch(m => m
.Query(query)
.OnFields(_searchFields)) ||
qq.Prefix(p1 => p1
.OnField("entityName")
.Value(query)) ||
qq.Prefix(p2 => p2
.OnField("friendlyUrl")
.Value(query))
)
.Filter(ff =>
ff.Term("serviceId", serviceId) &&
ff.Term("subscriptionId", subscriptionId) &&
ff.Term("subscriptionType", subscriptionType) &&
ff.Term("entityType", entityType)
)
)
);
P.S. It may be easier to think of it in the inverse, where I send up the file extensions I DON'T want and set up the query to get documents that DON'T have any of the extension values given.
After discussion, here is a raw JSON query that should work and can be translated to NEST quite easily:
POST /test/_search
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"must": [
{
"term": {
"serviceId": "VALUE"
}
},
{
"term": {
"subscriptionId": "VALUE"
}
},
{
"term": {
"subscriptionType": "VALUE"
}
},
{
"term": {
"entityType": "VALUE"
}
}
],
"must_not": [
{
"terms": {
"extension": [
"docx",
"doc"
]
}
}
]
}
}
}
}
}
What had to be done:
A bool query suits this best, because some clauses have to match while others have to be filtered out.
The must clause holds all the term clauses present in the OP's query.
The must_not clause holds all the extensions that need to be filtered out.
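The structure described above can be sketched as a small builder that takes the required terms and the extensions to exclude. This is plain Python that only produces the request body. `build_exclusion_query` and its parameters are hypothetical names, and the `filtered` wrapper simply mirrors the answer's JSON.

```python
from typing import Any, Dict, List


def build_exclusion_query(required: Dict[str, str], excluded_extensions: List[str]) -> Dict[str, Any]:
    """must: one term clause per required field; must_not: a single terms
    clause listing every extension that should be filtered out."""
    bool_filter: Dict[str, Any] = {
        "must": [{"term": {field: value}} for field, value in required.items()]
    }
    if excluded_extensions:  # skip must_not entirely when nothing is excluded
        bool_filter["must_not"] = [{"terms": {"extension": excluded_extensions}}]
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"bool": bool_filter},
            }
        }
    }


body = build_exclusion_query(
    {"serviceId": "VALUE", "entityType": "VALUE"},
    ["docx", "doc"],
)
```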
If you want to return items that match ".doc" OR ".xls" then you want a TERMS query. Here is a sample:
var searchResult = ElasticClient
.Search<SomeESType>(s => s
.Query(q => q
.Filtered(fq => fq
.Filter(f => f
.Terms(t => t.Field123, new List<string> {".doc", ".xls"})
)
)
)
)
