What's the most efficient way to filter out items whose member field doesn't contain the terms? - elasticsearch

What's the most efficient way to write the following query?
Get 5 items whose member field contains any of these values: ['item1', 'item2'].
should: [{ terms : {member: ['item1', 'item2'] } }]
If you find only 3 items, get 2 more where member is empty.
How do I finish this query?

You can use a should clause that combines a terms query with a must_not exists query. It fetches documents where the member field matches one of the input values, as well as documents where the field doesn't exist. Pass size to limit the result to the top 5 documents:
{
"size": 5,
"query": {
"bool": {
"should": [
{
"terms": {
"member": [
"1",
"2",
"3"
]
}
},
{
"bool": {
"must_not": [
{
"exists": {
"field": "member"
}
}
]
}
}
]
}
}
}
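As a rough sketch of how the request body above could be assembled from application code, the Python below builds the same should query from a list of values. The helper name build_member_query and its parameters are hypothetical and not part of any Elasticsearch client; the field name member and the size of 5 come from the question.

```python
import json


def build_member_query(values, field="member", size=5):
    """Build a search body: docs whose field matches any value, or lacks the field."""
    return {
        "size": size,
        "query": {
            "bool": {
                "should": [
                    # Clause 1: the field contains at least one of the values.
                    {"terms": {field: list(values)}},
                    # Clause 2: the field is absent from the document.
                    {"bool": {"must_not": [{"exists": {"field": field}}]}},
                ]
            }
        },
    }


body = build_member_query(["item1", "item2"])
print(json.dumps(body, indent=2))
```

The printed JSON can then be sent as the body of a _search request.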

Related

Elasticsearch return unique string from array field after a given filter

How would I get all the ids with a given prefix from the Elasticsearch records, with duplicates removed?
Records
PUT items/1
{ "ids" : [ "apple_A", "orange_B" ] }
PUT items/2
{ "ids" : [ "apple_A", "apple_B" ] }
PUT items/3
{ "ids" : [ "apple_C", "banana_A" ] }
What I need is to find all the unique ids for a given prefix. For example, if the input is apple, the output should be ["apple_A", "apple_B", "apple_C"].
What I have tried so far is a terms aggregation. With the following query I was able to filter for the documents that have ids with the given prefix, but the aggregation returns all the ids contained in those documents.
{
"aggregations": {
"filterIds": {
"filter": {
"bool": {
"filter": [
{
"prefix": {
"ids.keyword": {
"value": "apple"
}
}
}
]
}
},
"aggregations": {
"uniqueIds": {
"terms": {
"field": "ids.keyword"
}
}
}
}
}
}
It returns the aggregation list as ["apple_A", "orange_B", "apple_B", "apple_C", "banana_A"] if we give apple as the prefix input, i.e. all ids from every document that matches the filter.
Is there a way to get only the ids that match the prefix, rather than all the ids in each matching document's array?
You can limit the returned values using the include parameter:
POST items/_search
{
"size": 0,
"aggregations": {
"filterIds": {
"filter": {
"bool": {
"filter": [
{
"prefix": {
"ids.keyword": {
"value": "apple"
}
}
}
]
}
},
"aggregations": {
"uniqueIds": {
"terms": {
"field": "ids.keyword",
"include": "apple.*"
}
}
}
}
}
}
Do check this other thread which deals with using regex within include -- it's very similar to your use case.
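To see why the include pattern changes the result, here is a small in-memory Python sketch that mimics the two steps: the filter selects whole documents, and the terms aggregation then buckets every value of the field in those documents unless a pattern restricts it. It only imitates the behaviour on the three sample records and does not talk to Elasticsearch; the function name unique_ids is made up for illustration.

```python
import re

# The three sample records from the question.
docs = [
    {"ids": ["apple_A", "orange_B"]},
    {"ids": ["apple_A", "apple_B"]},
    {"ids": ["apple_C", "banana_A"]},
]


def unique_ids(docs, prefix, include=None):
    """Collect distinct ids from docs that have at least one id with the prefix."""
    # Step 1 (filter): keep whole documents where any id starts with the prefix.
    selected = [d for d in docs if any(i.startswith(prefix) for i in d["ids"])]
    # Step 2 (aggregation): bucket every id of the selected documents.
    buckets = {i for d in selected for i in d["ids"]}
    # Optional include pattern: keep only buckets that the whole pattern matches.
    if include is not None:
        buckets = {i for i in buckets if re.fullmatch(include, i)}
    return sorted(buckets)


without_include = unique_ids(docs, "apple")
with_include = unique_ids(docs, "apple", include="apple.*")
print(without_include)
print(with_include)
```

Without the pattern, orange_B and banana_A show up because they live in documents that passed the filter; with it, only the apple ids remain.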

Elasticsearch bool query with filter terms and string search returning inconsistent results

I'm running the following query against Elasticsearch that matches documents based on a string search and property terms match. When I pass a single term, I get the expected results, but when I add a second term, I don't get the same results. Ideas?
{
"_source": {
"includes": [
"docID"
]
},
"query": {
"bool": {
"must": [
{
"terms": {
"userID": [
1,
2,
71
]
}
},
{
"query_string": {
"query": "**test**",
"fields": [
"attachment.content"
]
}
}
]
}
}
}
If I pass only userID 1 and omit the others, I get the docIDs I expect (e.g. 1, 4, 8), but when I pass all three userIDs, several docIDs are missing from the results (e.g. I get 1, 6, 8, but not 4). Using Elasticsearch 6.5.
Hopefully someone understands better than I why this is!
Thanks in advance!
By default, Elasticsearch returns only the first 10 hits, so the missing documents may simply be on the next page. You can raise size to a larger number, for example:
{
"size": 30, // put size here
"_source": {
"includes": [
"docID"
]
},
"query": {
"bool": {
"must": [
{
"terms": {
"userID": [
1,
2,
71
]
}
},
{
"query_string": {
"query": "**test**",
"fields": [
"attachment.content"
]
}
}
]
}
}
}
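The effect of the default page size can be illustrated without a cluster. The sketch below is plain Python, not an Elasticsearch call: it pages through a made-up list of 12 matching docIDs the way from and size would, and shows that a hit ranked eleventh is invisible with the default size of 10. The docIDs and their order are invented for the example.

```python
def page(hits, size=10, from_=0):
    """Return the slice of hits that a request with this from/size would see."""
    return hits[from_:from_ + size]


# Hypothetical ranked docIDs matching the query; docID 4 happens to rank 11th.
ranked = [1, 6, 8, 12, 15, 19, 23, 27, 31, 35, 4, 40]

first_page = page(ranked)            # default size of 10
larger_page = page(ranked, size=30)  # size raised as in the answer

print(4 in first_page)
print(4 in larger_page)
```

With more user IDs in the terms clause, more documents match, so a document that was on the first page before can be pushed past the tenth position.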

Elastic search bool query

My objective is to find the 10 most recent documents whose message id is MSG-1013 and whose Severity field is info. Both conditions must be satisfied, and the text match should be exact. I have tried the search query below, but it does not give me the expected results. What am I doing wrong here?
{
"size": 10,
"query": {
"bool": {
"must": [
{
"match": { "messageId": "MSG-1013" }
},
{
"match": { "Severity": "Info" }
}
]
}
}
}
If I have understood you correctly, you want the 10 most recent documents whose "messageId" and "Severity" fields exactly match the given values. I assume you don't need a relevance score, because the ordering comes from the document timestamp or some other date field. For this purpose, you could use a bool filter in combination with a sort. Note that term compares against the exact indexed value, so this assumes both fields are indexed without analysis (for example, mapped as keyword).
{
"query": {
"bool": {
"filter": [
{ "term": { "messageId": "MSG-1013" } },
{ "term": { "Severity": "Info" } }
]
}
},
"sort" : [
{ "documentTimestamp" : {"order" : "desc"}}
],
"size": 10
}
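A minimal sketch of building this request from code, assuming the timestamp field is called documentTimestamp as in the answer (the original question does not name it). The helper recent_exact_matches is hypothetical and only produces the request body.

```python
import json


def recent_exact_matches(filters, sort_field, size=10):
    """Build a body that filters on exact values and sorts newest first."""
    return {
        "query": {
            "bool": {
                # Filter context: every clause must match, no relevance scoring.
                "filter": [{"term": {f: v}} for f, v in filters.items()]
            }
        },
        "sort": [{sort_field: {"order": "desc"}}],
        "size": size,
    }


body = recent_exact_matches(
    {"messageId": "MSG-1013", "Severity": "Info"},
    sort_field="documentTimestamp",  # assumed field name, taken from the answer
)
print(json.dumps(body, indent=2))
```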

elasticsearch inner join

I have an index with several fields. Documents that contain valid "category" data also contain "url" data (an analyzed field) but no "respsize".
On the other hand, documents that contain "respsize" data (greater than 0) also contain "url" data but no "category".
I think you get the point: I need a join or intersection, so that one query returns all documents containing respsize and category for the same url.
Here is what I have so far (the url field is analyzed, the rest are not_analyzed).
Here are the documents that have category:
And the other documents have respsize; I need to combine them based on url.
I need a DSL query that returns records sharing the same url token (in this scenario www.domainname.com), with category and respsize merged.
I simply want the field "category":"27" from the first image to appear in the second image's document too, along with all its other fields.
Here is my query, but it does not work:
GET webproxylog/accesslog/_search
{
"query": {
"filtered": {
"filter" : {
"and" : {
"filters": [
{
"not": {
"filter": {
"terms": {
"category": [
"-",
"-1",
"0"
]
},
"term": {
"respsize": "0"
}
}
},
"term": {
"category": "www.hurriyet.com.tr"
}
}
],
"_cache" : true
}
}
}
},
"sort": [
{
"respsize": {
"order": "desc"
}
}
]
}
You can try the query below. It requires the url field to match the value you specify (the must clause), and then at least one of the two should clauses must be true: either category is not one of the given terms, or respsize is greater than 0.
{
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"term": {
"url": "www.hurriyet.com.tr"
}
}
],
"should": [
{
"not": {
"terms": {
"category": [
"-",
"-1",
"0"
]
}
}
},
{
"range": {
"respsize": {
"gt": 0
}
}
}
]
}
}
}
}
}
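The boolean logic of this filter can be written out as a plain Python predicate, which may help when checking which documents it should return. This is only an in-memory imitation of the logic as the answer describes it: it treats url as an exact string (ignoring analysis), and the sample documents are invented.

```python
EXCLUDED_CATEGORIES = {"-", "-1", "0"}


def matches(doc, url):
    """Mimic the filter: url must match, and at least one should clause must hold."""
    if doc.get("url") != url:  # must clause
        return False
    # should 1: category is not one of the excluded values.
    # A document with no category also passes, since it has none of the terms.
    category_ok = doc.get("category") not in EXCLUDED_CATEGORIES
    # should 2: respsize is greater than 0.
    respsize_ok = doc.get("respsize", 0) > 0
    return category_ok or respsize_ok


# Invented sample documents.
docs = [
    {"url": "www.hurriyet.com.tr", "category": "27"},
    {"url": "www.hurriyet.com.tr", "category": "-", "respsize": 512},
    {"url": "www.hurriyet.com.tr", "category": "0", "respsize": 0},
    {"url": "www.example.com", "category": "27"},
]

results = [matches(d, "www.hurriyet.com.tr") for d in docs]
print(results)
```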

Exclude empty array fields - but include documents missing the field - in elasticsearch

I'm trying to run a query against elasticsearch that will find documents where one of the following conditions applies:
The document is missing the given field (tags) OR
The document has the value foo as an element of the tags array
The problem is that my current query will return documents that have a tags field where the value is an empty array. Presumably this is because elasticsearch is treating an empty array as the same thing as not having the field at all. Here's the full query I'm running that's returning the bad results:
{
"from": 0,
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"exists": {
"field": "_rankings.public"
}
},
{
"or": [
{
"missing": {
"existence": true,
"field": "tags",
"null_value": false
}
},
{
"terms": {
"execution": "or",
"tags": [
"foo"
]
}
}
]
}
]
}
},
"query": {
"match_all": {}
}
}
},
"size": 10000,
"sort": [
{
"_rankings.public": {
"ignore_unmapped": true,
"order": "asc"
}
}
]
}
I don't think you can achieve this easily out of the box, for the reason you already mentioned: Elasticsearch makes no distinction between an empty array and a field that has no values in it.
Your only option might be to use a "null_value" for the "tags" field and, if you have any control over the data that goes into your documents, to index a "[]" array as '["_your_null_value_of_choice_"]' instead. Then, in your query, change "null_value": false to true.
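One way to picture the sentinel idea: before indexing, replace an empty tags array with a placeholder value so that "empty" and "missing" become distinguishable. The Python below is a sketch of that preprocessing plus the selection logic the question asks for, done entirely in memory; the sentinel string "_none_" and both function names are placeholders, and the actual mapping and missing filter settings are not shown.

```python
SENTINEL = "_none_"  # placeholder; pick a value that can never be a real tag


def normalize_tags(doc):
    """Return a copy of doc where an empty tags array becomes [SENTINEL]."""
    out = dict(doc)
    if "tags" in out and out["tags"] == []:
        out["tags"] = [SENTINEL]
    return out


def wanted(doc, tag="foo"):
    """True if the tags field is missing, or contains the requested tag."""
    return "tags" not in doc or tag in doc["tags"]


raw = [
    {"id": 1},                          # field missing  -> wanted
    {"id": 2, "tags": []},              # empty array    -> not wanted
    {"id": 3, "tags": ["foo", "bar"]},  # contains foo   -> wanted
    {"id": 4, "tags": ["bar"]},         # other tag only -> not wanted
]

indexed = [normalize_tags(d) for d in raw]
selected_ids = [d["id"] for d in indexed if wanted(d)]
print(selected_ids)
```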
