Problems with eliminating null values from query results

I am fetching data from an Elasticsearch index.
I want to filter out documents that contain null or empty-string values in certain fields.
However, when I used either the "missing" or the "exists" filter, documents whose value is "" were not filtered out and still showed up in the results.
I then tried a wildcard query instead, but it returned no results for fields whose names contain multiple words (e.g. Alarm Description, Alarm ID, etc.).
I am working on elasticsearch-1.3.2.
My code with missing/exists:
{
"query" : {
"constant_score" : {
"filter" : {
"exists" : {
"field" : "myfield"
}
}
}
}
}
My code with wildcard:
{
"query": {
"bool": {
"must": [
{
"constant_score": {
"filter": {
"missing": {
"field": "trap_message.enterprise"
}
}
}
}
],
"must_not": [],
"should": []
}
},
"from": 0,
"size": 10,
"sort": [],
"facets": {}
}

I would combine the exists filter with a must_not clause. Here is a sample that searches the field "authResult.address.state":
GET index1/type1/_search
{
"size": 10,
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"must": [
{
"exists": {
"field": "authResult.address.state"
}
}
],
"must_not": [
{
"term": {
"authResult.address.state": ""
}
}
]
}
}
}
}
}

It worked when I switched from GET to POST

Related

Difference between term and match in Elasticsearch in a bool query

I have a simple document where the _source looks like:
{
"name" : "myProduct",
"label" : "isApiisApi",
"isApi" : 1,
"sold" : 0
}
I've been trying to create a multiple-condition query using bool. The only way I got it working was by using a match query:
{
"query": {
"bool": {
"must": [
{ "term": { "sold": 0 } },
{ "term": { "isApi": 1 } },
{ "match": { "name": "myProduct" } }
]
}
}
}
But why doesn't it work when I use the term query (as the final condition):
{
"query": {
"bool": {
"must": [
{ "term": { "sold": 0 } },
{ "term": { "isApi": 1 } },
{ "term": { "name": "myProduct" } }
]
}
}
}
TL;DR
On ingestion, Elasticsearch passes text fields through an analyzer.
By default this is the standard analyzer, which includes a token filter named lowercase.
So your text is indexed in lowercase.
But you are using a term query, which searches for an exact match against the indexed data.
In your case, myproduct != myProduct.
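The mismatch can be illustrated without a cluster. The sketch below is a rough stand-in, not the real standard analyzer: it assumes tokenization is just "split on non-alphanumerics and lowercase", and the function names are hypothetical.

```python
import re

def standard_like_analyze(text):
    """Rough stand-in for the standard analyzer: split on non-alphanumerics, lowercase."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]

def term_matches(indexed_tokens, term):
    """A term query compares the search term verbatim with the indexed tokens."""
    return term in indexed_tokens

def match_matches(indexed_tokens, query_text):
    """A match query analyzes the query text first, then looks up the resulting tokens."""
    return any(tok in indexed_tokens for tok in standard_like_analyze(query_text))

text_tokens = standard_like_analyze("myProduct")  # what the analyzed text field holds
keyword_tokens = ["myProduct"]                    # a keyword field keeps the value verbatim

print(term_matches(text_tokens, "myProduct"))     # False: "myProduct" != "myproduct"
print(match_matches(text_tokens, "myProduct"))    # True: the query text is lowercased too
print(term_matches(keyword_tokens, "myProduct"))  # True: exact value is preserved
```

In this toy model the term query only succeeds against the verbatim (keyword-style) value, which is the behaviour described above.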
To Reproduce
By default (with dynamic mapping), Elasticsearch indexes string-like data into two fields:
text
keyword
For an exact match you want to use the keyword version.
See below:
POST /72020272/_doc
{
"name" : "myProduct",
"label" : "isApiisApi",
"isApi" : 1,
"sold" : 0
}
GET /72020272/_mapping
GET /72020272/_search
{
"query": {
"bool": {
"must": [
{ "term": { "sold": 0 } },
{ "term": { "isApi": 1 } },
{ "term": { "name": "myProduct" } }
]
}
}
}
GET /72020272/_search
{
"query": {
"bool": {
"must": [
{ "term": { "sold": 0 } },
{ "term": { "isApi": 1 } },
{ "term": { "name.keyword": "myProduct" } }
]
}
}
}

Elasticsearch aggregation query with filters

I wrote an Elasticsearch query to get the aggregated doc count for the matching keyword "webserver1". Below is the query:
POST _search?filter_path=aggregations.*.buckets
{
"query": {
"bool": {
"must": [
{
"match": {
"hostname": "webserver1"
}
}
]
}
},
"aggs": {
"webserver1": {
"terms": {
"field": "webserver1"
}
}
}
}
Response:
{
"aggregations" : {
"webserver1" : {
"buckets" : [
{
"key" : "webserver1",
"doc_count" : 36715
}
]
}
}
}
Is there a way to keep only the wanted text and display it like the one below?
{
"webserver1" : 36715
}
I have checked multiple resources but I'm not able to find any filters/options to do it.
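One option that does not depend on any server-side feature is to reshape the response on the client. A minimal Python sketch, assuming the response has already been parsed into a dict shaped like the one above; flatten_terms_buckets is a hypothetical helper name, not part of any Elasticsearch client:

```python
def flatten_terms_buckets(response):
    """Turn {"aggregations": {name: {"buckets": [...]}}} into {bucket_key: doc_count}."""
    flat = {}
    for agg in response.get("aggregations", {}).values():
        for bucket in agg.get("buckets", []):
            flat[bucket["key"]] = bucket["doc_count"]
    return flat

# The response from the question, as a parsed dict.
response = {
    "aggregations": {
        "webserver1": {
            "buckets": [{"key": "webserver1", "doc_count": 36715}]
        }
    }
}

print(flatten_terms_buckets(response))  # {'webserver1': 36715}
```

If several aggregations return buckets with the same key, later ones overwrite earlier ones in this sketch.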

Get all docs which do not contain the key?

For example, I have two types of docs, such as:
{
"field2": "xx",
"field1": "x"
}
{
"field1": "x"
}
One has two fields (field1 and field2); the other has just one field (field1).
Now I want to query all docs that do not have the field2 field.
EDIT
dsl:
{
"query": {
"bool": {
"filter": [
{
"exists": {
"field": "LableToMember"
}
}
]
}
}
}
doc:
{
"LableToMember": [
{
"xxx": "xxx",
"id": "1"
}
],
"field2":"xxx"
}
LableToMember is a nested field. It seems the exists query can't be used on a nested field?
Note that in ES 5.x the missing query has been removed in favor of the exists one.
So if you want to be forward compatible, you should prefer using this:
POST /_search
{
"query": {
"bool": {
"must_not": {
"exists": {
"field": "field2"
}
}
}
}
}
UPDATE
If you want to retrieve all docs which don't have field2 or have field2 with a given value, you can do it like this:
POST /_search
{
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"bool": {
"must_not": {
"exists": {
"field": "field2"
}
}
}
},
{
"term": {
"field2": "somevalue"
}
}
]
}
}
}
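A minimal Python sketch of the same "missing or equal to a value" condition, for readers who build the request body in code. Both function names are hypothetical, and matches_locally is only a plain-Python reading of the condition for exact, single-valued fields; it does not model analysis or arrays.

```python
def missing_or_equals(field, value):
    """Build the bool/should query: field absent OR field == value."""
    return {
        "query": {
            "bool": {
                "minimum_should_match": 1,
                "should": [
                    {"bool": {"must_not": {"exists": {"field": field}}}},
                    {"term": {field: value}},
                ],
            }
        }
    }

def matches_locally(doc, field, value):
    """Same condition evaluated on a plain dict (missing and null both count as absent)."""
    return doc.get(field) is None or doc.get(field) == value

docs = [
    {"field1": "x"},
    {"field1": "x", "field2": "somevalue"},
    {"field1": "x", "field2": "other"},
]
kept = [d for d in docs if matches_locally(d, "field2", "somevalue")]
print(kept)  # the first two documents
```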
In short you want to query those documents which have field2 missing. You can use Missing Query like:
"filter" : {
"missing" : { "field" : "field2" }
}
Hope it helps

Elasticsearch match list against field

I have a list (or array, or whatever your language calls it), e.g. names: ["John","Bas","Peter"], and I want to query the name field for documents that match any of those names.
One way is with an or filter, e.g.:
{
"filtered" : {
"query" : {
"match_all": {}
},
"filter" : {
"or" : [
{
"term" : { "name" : "John" }
},
{
"term" : { "name" : "Bas" }
},
{
"term" : { "name" : "Peter" }
}
]
}
}
}
Any fancier way? Better if it's a query than a filter.
{
"query": {
"filtered" : {
"filter" : {
"terms": {
"name": ["John","Bas","Peter"]
}
}
}
}
}
Which Elasticsearch rewrites as if you had used this one:
{
"query": {
"filtered" : {
"filter" : {
"bool": {
"should": [
{
"term": {
"name": "John"
}
},
{
"term": {
"name": "Bas"
}
},
{
"term": {
"name": "Peter"
}
}
]
}
}
}
}
}
When using a boolean filter, most of the time it is better to use the bool filter than the "and" or "or" filters. The reason is explained on the Elasticsearch blog: http://www.elasticsearch.org/blog/all-about-elasticsearch-filter-bitsets/
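The equivalence between the terms form and the bool/should form described above can be written out as a small Python sketch. The helper names are hypothetical; this only mirrors the two JSON shapes shown in this answer and says nothing about how Elasticsearch executes them internally.

```python
def terms_filter(field, values):
    """Compact form: one terms clause carrying the whole list."""
    return {"terms": {field: list(values)}}

def as_bool_should(terms_clause):
    """Expanded form: one term clause per value inside bool/should."""
    field, values = next(iter(terms_clause["terms"].items()))
    return {"bool": {"should": [{"term": {field: v}} for v in values]}}

names = ["John", "Bas", "Peter"]
compact = terms_filter("name", names)
expanded = as_bool_should(compact)

print(compact)
print(expanded)
```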
When I tried the filtered query, I got "no [query] registered for [filtered]". Based on an answer here, it seems the filtered query was deprecated and then removed in ES 5.0, so I suggest using:
{
"query": {
"bool": {
"filter": {
"terms": {
"name": ["John","Bas","Peter"]
}
}
}
}
}
Example query: filter by a keyword and a list of values
{
"query": {
"bool": {
"must": [
{
"term": {
"fguid": "9bbfe844-44ad-4626-a6a5-ea4bad3a7bfb.pdf"
}
}
],
"filter": {
"terms": {
"page": [
"1",
"2",
"3"
]
}
}
}
}
}

Create Elasticsearch curl query for not null and not empty("")

How can I create an Elasticsearch curl query to get the field values that are not null and not empty ("")?
Here is the MySQL query:
select field1 from mytable where field1!=null and field1!="";
A null value and an empty string both result in no value being indexed, in which case you can use the exists filter
curl -XGET 'http://127.0.0.1:9200/test/test/_search?pretty=1' -d '
{
"query" : {
"constant_score" : {
"filter" : {
"exists" : {
"field" : "myfield"
}
}
}
}
}
'
Or in combination with (eg) a full text search on the title field:
curl -XGET 'http://127.0.0.1:9200/test/test/_search?pretty=1' -d '
{
"query" : {
"filtered" : {
"filter" : {
"exists" : {
"field" : "myfield"
}
},
"query" : {
"match" : {
"title" : "search keywords"
}
}
}
}
}
'
As @luqmaan pointed out in the comments, the documentation says that the exists filter doesn't filter out empty strings, as they are considered non-null values.
So, adding to @DrTech's answer, to effectively filter out both null and empty-string values, you should use something like this:
{
"query" : {
"constant_score" : {
"filter" : {
"bool": {
"must": {"exists": {"field": "<your_field_name_here>"}},
"must_not": {"term": {"<your_field_name_here>": ""}}
}
}
}
}
}
On Elasticsearch 5.6, I had to use the query below to filter out empty strings:
GET /_search
{
"query" : {
"regexp":{
"<your_field_name_here>": ".+"
}
}
}
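Why ".+" excludes empty strings can be checked with any regex engine: the pattern needs at least one character. The sketch below uses Python's re module as an approximation only; Elasticsearch uses Lucene's regular expression syntax, which matches against the whole indexed term, so re.fullmatch is assumed here to be the closest analogue. The function name is hypothetical.

```python
import re

def whole_term_regexp_match(pattern, term):
    """Approximate a regexp query: the pattern must match the entire term."""
    return re.fullmatch(pattern, term) is not None

for term in ["", "a", "Alarm ID"]:
    print(repr(term), whole_term_regexp_match(".+", term))
# '' False
# 'a' True
# 'Alarm ID' True
```

On an analyzed text field the terms being matched are the individual tokens rather than the original string, so "Alarm ID" above stands for a keyword-style value.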
Wrap a Missing Filter in the Must-Not section of a Bool Filter. It will only return documents where the field exists, and if you set the "null_value" property to true, values that are explicitly not null.
{
"query":{
"filtered":{
"query":{
"match_all":{}
},
"filter":{
"bool":{
"must":{},
"should":{},
"must_not":{
"missing":{
"field":"field1",
"existence":true,
"null_value":true
}
}
}
}
}
}
}
You can do that with bool query and combination of must and must_not like this:
GET index/_search
{
"query": {
"bool": {
"must": [
{"exists": {"field": "field1"}}
],
"must_not": [
{"term": {"field1": ""}}
]
}
}
}
I tested this with Elasticsearch 5.6.5 in Kibana.
The only solution here that worked for me in 5.6.5 was bigstone1998's regex answer. I'd prefer not to use a regex search, though, for performance reasons. I believe the other solutions don't work because a standard text field is analyzed and, as a result, has no empty-string token to negate against. The exists query won't help on its own either, since an empty string is considered non-null.
If you can't change the index the regex approach may be your only option, but if you can change the index then adding a keyword subfield will solve the problem.
In the mappings for the index:
"myfield": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
}
Then you can simply use the query:
{
"query": {
"bool": {
"must": {
"exists": {
"field": "myfield"
}
},
"must_not": {
"term": {
"myfield.keyword": ""
}
}
}
}
}
Note the .keyword in the must_not component.
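The reasoning in this answer can be sketched as a toy model in Python. It assumes, as the answer does, that an analyzed field produces no token for "" while a keyword sub-field indexes the whole value (including "") as a single term. The analyzer here is a crude stand-in and all names are hypothetical.

```python
import re

def analyzed_tokens(value):
    """Toy analyzer: lowercase word tokens; an empty string produces no tokens."""
    return [tok.lower() for tok in re.findall(r"\w+", value)]

def keyword_tokens(value):
    """Keyword-style indexing: the whole value is kept as one term, even ""."""
    return [value]

def excluded_by_must_not_empty(tokens):
    """True when must_not {"term": {field: ""}} would drop the document."""
    return "" in tokens

print(excluded_by_must_not_empty(analyzed_tokens("")))         # False: nothing to negate against
print(excluded_by_must_not_empty(keyword_tokens("")))          # True: "" is an indexed term
print(excluded_by_must_not_empty(keyword_tokens("Alarm ID")))  # False: non-empty value is kept
```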
You can use a not filter on top of missing:
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"not": {
"filter": {
"missing": {
"field": "searchField"
}
}
}
}
}
}
Here's the query example to check the existence of multiple fields:
{
"query": {
"bool": {
"filter": [
{
"exists": {
"field": "field_1"
}
},
{
"exists": {
"field": "field_2"
}
},
{
"exists": {
"field": "field_n"
}
}
]
}
}
}
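When the list of fields is long or only known at runtime, the same body can be generated. A minimal Python sketch; all_fields_exist is a hypothetical helper name and the field names are the placeholders from the example above.

```python
import json

def all_fields_exist(fields):
    """Build a bool/filter query with one exists clause per field (all must be present)."""
    return {
        "query": {
            "bool": {
                "filter": [{"exists": {"field": name}} for name in fields]
            }
        }
    }

body = all_fields_exist(["field_1", "field_2", "field_n"])
print(json.dumps(body, indent=2))
```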
You can use a bool query that combines must and must_not; it performs well and returns all records where the field is not null and not empty.
The must_not clause acts like a NOT, giving field != ""; the must clause with exists gives field != null.
Together they are effectively: where field1 != null and field1 != ""
GET IndexName/IndexType/_search
{
"query": {
"bool": {
"must": [{
"bool": {
"must_not": [{
"term": { "YourFieldName": ""}
}]
}
}, {
"bool": {
"must": [{
"exists" : { "field" : "YourFieldName" }
}]
}
}]
}
}
}
ElasticSearch Version:
"version": {
"number": "5.6.10",
"lucene_version": "6.6.1"
}
ES 7.x
{
"_source": "field",
"query": {
"bool": {
"must": [
{
"exists": {
"field":"field"
}
}
],
"must_not": [
{
"term": {
"field.keyword": {
"value": ""
}
}
}
]
}
}
}
We are using Elasticsearch version 1.6 and I used this query from a co-worker to cover not null and not empty for a field:
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"must": [
{
"exists": {
"field": "myfieldName"
}
},
{
"not": {
"filter": {
"term": {
"myfieldName": ""
}
}
}
}
]
}
}
}
}
}
You need to use a bool query with must/must_not and exists.
To get documents where place is null:
{
"query": {
"bool": {
"must_not": {
"exists": {
"field": "place"
}
}
}
}
}
To get documents where place is not null:
{
"query": {
"bool": {
"must": {
"exists": {
"field": "place"
}
}
}
}
}
Elasticsearch: get all records where the field is not empty.
const searchQuery = {
body: {
query: {
query_string: {
default_field: '*.*',
query: 'fieldName: ?*',
},
},
},
index: 'IndexName'
};
