Find empty strings in Elasticsearch

I'm trying to search for documents that have a specific value in a field.
{
"query": {
"bool": {
"must": [
{"field": {"advs.status": "warn"}}
]
}
}
}
That works fine. But when I try to find documents that have an empty string in that field, I get this error:
ParseException[Cannot parse '' ...
and then - long list of what was expected instead of empty string.
I tried this query:
{
"query": {
"bool": {
"must": [
{"term": {"advs.status": ""}}
]
}
}
}
It doesn't fail, but it finds nothing. It does work for non-empty strings. How am I supposed to do this?
My mapping for this type looks exactly like this:
{
"reports": {
"dynamic": "false",
"_ttl": {
"enabled": true,
"default": 7776000000
},
"properties": {
"#fields": {
"dynamic": "true",
"properties": {
"upstream_status": {
"type": "string"
}
}
},
"advs": {
"properties": {
"status": {
"type": "string",
"store": "yes"
}
}
},
"advs.status": {
"type": "string",
"store": "yes"
}
}
}
}

Or another way to do the same thing more efficiently is to use the exists filter:
"exists" : {
"field" : "advs.status"
}
Both are valid, but this one is better :)

You can try this temporary solution which works but isn't optimal - https://github.com/elastic/elasticsearch/issues/7515
PUT t/t/1
{
"textContent": ""
}
PUT t/t/2
{
"textContent": "foo"
}
GET t/t/_search
{
"query": {
"bool": {
"must": [
{
"exists": {
"field": "textContent"
}
}
],
"must_not": [
{
"wildcard": {
"textContent": "*"
}
}
]
}
}
}
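The workaround above can be wrapped in a small helper so the field name is not repeated by hand. This is a minimal sketch that only builds the request body; the function name `empty_string_query` is hypothetical, and sending the body to a cluster is left out.

```python
import json


def empty_string_query(field):
    """Build the bool query from the workaround: the field exists,
    but no indexed term matches the wildcard '*'."""
    return {
        "query": {
            "bool": {
                "must": [{"exists": {"field": field}}],
                "must_not": [{"wildcard": {field: "*"}}],
            }
        }
    }


# Print the body for the 'textContent' field used in the example above.
print(json.dumps(empty_string_query("textContent"), indent=2))
```

The printed JSON should be the same body as the GET t/t/_search request shown above.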

Try using must_not with missing in your bool:
"must_not":{
"missing":{
"field":"advs.status",
"existence":true,
"null_value":true
}
}

If you want to search for fields containing an empty string, either change your mapping to set this particular field to not_analyzed, or use a script filter:
"filter": {
"script": {
"script": "_source.advs.status.length() == 0"
}
}

I generally use a filter if the field is not analyzed. Here is a snippet:
{
"filtered": {
"filter": {
"term": {
"field": ""
}
}
}
}

The "missing" filter only works for null values or for fields that are not there at all. Matching the empty string was already answered here: https://stackoverflow.com/a/25562877/155708

Related

elasticsearch need to add a must to a bool should query

I have the following query that works as expected:
GET <index_name>/_search
{
"sort": [
{
"irFileCreateTime": {
"order": "desc"
}
}
],
"query": {
"bool": {
"should": [
{
"match": {
"fileId": 46704
}
},
{
"match": {
"fileId": 46706
}
},
{
"match": {
"fileId": 46719
}
}
]
}
}
}
The problem is that I need to further filter the data, but the field I need to filter on is a text field. I have tried many different ways of putting a must match into my query but everything is either malformed or filters out all hits when I know it should only filter out half. How can I add a must match "irStatus":"COMPLETE" to this query? Thanks in advance.
What you're after is a term query on, preferably, the keyword of irStatus. That is to say:
GET index/_search
{
"sort": [
{
"irFileCreateTime": {
"order": "desc"
}
}
],
"query": {
"bool": {
"must": [
{
"term": {
"irStatus.keyword": {
"value": "COMPLETE"
}
}
}
],
"should": [
{
"match": {
"fileId": 46704
}
},
{
"match": {
"fileId": 46706
}
},
{
"match": {
"fileId": 46719
}
}
]
}
}
}
Assuming your mapping looks something like this:
{
"mappings": {
"properties": {
"irFileCreateTime": {
"type": "date"
},
"fileId": {
"type": "integer"
},
"irStatus": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
}
}
}
}
The reason it's apparently failing on your end is that "COMPLETE" has been lowercased by the standard analyzer.
Alternatively, you could do:
{
"must":[
{
"query_string":{
"query":"irStatus:COMPLETE AND (fileId:(46704 OR 46706 OR 46719))"
}
}
]
}
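The lowercasing point can be illustrated without a cluster. The sketch below uses a deliberately simplified stand-in for the standard analyzer (split on non-word characters, then lowercase); the real analyzer does more, and all function names here are hypothetical.

```python
import re


def simple_analyze(text):
    """Rough stand-in for the standard analyzer: tokenize and lowercase."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]


def term_matches(indexed_tokens, value):
    """A term query compares the value verbatim against the indexed tokens."""
    return value in indexed_tokens


def match_matches(indexed_tokens, text):
    """A match query analyzes the query text first, then looks up the tokens."""
    return any(tok in indexed_tokens for tok in simple_analyze(text))


text_tokens = simple_analyze("COMPLETE")  # what the text field would index
keyword_tokens = ["COMPLETE"]             # a keyword field keeps the value verbatim

print(term_matches(text_tokens, "COMPLETE"))     # False: index holds 'complete'
print(term_matches(keyword_tokens, "COMPLETE"))  # True: exact value is stored
print(match_matches(text_tokens, "COMPLETE"))    # True: query text is lowercased too
```

This is why the term query is pointed at irStatus.keyword rather than irStatus.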

Term query on nested fields returns no result in Elasticsearch

I have a nested type field in my mapping. When I use a Term query on my nested field, Elasticsearch returns no results, whereas when I change the Term query to a Match query it works fine and Elasticsearch returns the expected result.
Here is my mapping; imagine I have only one nested field in my type mapping:
{
"homing.estatefiles": {
"mappings": {
"estatefile": {
"properties": {
"DynamicFields": {
"type": "nested",
"properties": {
"Name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"ValueBool": {
"type": "boolean"
},
"ValueDateTime": {
"type": "date"
},
"ValueInt": {
"type": "long"
}
}
}
}
}
}
}
}
And here is my term query (which returns no result)
{
"from": 50,
"size": 50,
"query": {
"bool": {
"filter": [
{
"nested": {
"query": {
"bool": {
"must": [
{
"term": {
"DynamicFields.Name":{"value":"HasParking"}
}
},
{
"term": {
"DynamicFields.ValueBool": {
"value": true
}
}
}
]
}
},
"path": "DynamicFields"
}
}
]
}
}
}
And here is my query which returns expected result (by changing Term query to Match query)
{
"from": 50,
"size": 50,
"query": {
"bool": {
"filter": [
{
"nested": {
"query": {
"bool": {
"must": [
{
"match": {
"DynamicFields.Name":"HasParking"
}
},
{
"term": {
"DynamicFields.ValueBool": {
"value": true
}
}
}
]
}
},
"path": "DynamicFields"
}
}
]
}
}
}
This happens because of how the analyzer handles capital letters.
When you use term, Elasticsearch looks for the exact value you gave, without analyzing it.
That sounds fine so far, but DynamicFields.Name is a text field, so its value went through an analyzer at index time, which changed what is stored in the index.
In your case it turned HasParking into hasparking.
The term query then tries to match HasParking against hasparking and of course fails. There is a good explanation in the documentation, in the "Why doesn't the term query match my document" section. The match query, by contrast, runs your search text through the same analyzer before matching, which is why it returns your result.
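Since the mapping in the question already defines a keyword sub-field under DynamicFields.Name, one option is to keep the term query and point it at that sub-field. The sketch below only builds the request body; the helper name `nested_term_query` is hypothetical, and it assumes the documents were indexed with that mapping in place.

```python
import json


def nested_term_query(path, name, flag, offset=0, size=50):
    """Build a nested filter that matches Name exactly via its keyword sub-field."""
    return {
        "from": offset,
        "size": size,
        "query": {
            "bool": {
                "filter": [
                    {
                        "nested": {
                            "path": path,
                            "query": {
                                "bool": {
                                    "must": [
                                        # The keyword sub-field is not analyzed, so case is preserved.
                                        {"term": {f"{path}.Name.keyword": {"value": name}}},
                                        {"term": {f"{path}.ValueBool": {"value": flag}}},
                                    ]
                                }
                            },
                        }
                    }
                ]
            }
        },
    }


print(json.dumps(nested_term_query("DynamicFields", "HasParking", True, offset=50), indent=2))
```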

exact match query in elasticsearch

I'm trying to run an exact match query in ES
in MYSQL my query would be:
SELECT * WHERE `content_state`='active' AND `author`='bob' AND `title` != 'Beer';
I looked at the ES docs here:
https://www.elastic.co/guide/en/elasticsearch/guide/current/_finding_exact_values.html
and came up with this:
{
"from" : '.$offset.', "size" : '.$limit.',
"filter": {
"and": [
{
"and": [
{
"term": {
"content_state": "active"
}
},
{
"term": {
"author": "bob"
}
},
{
"not": {
"filter": {
"term": {
"title": "Beer"
}
}
}
}
]
}
]
}
}
but my results still come back with title = Beer; it doesn't seem to be excluding the documents whose title is Beer.
Did I do something wrong?
I'm pretty new to ES.
I figured it out, I used this instead...
{
"from" : '.$offset.', "size" : '.$limit.',
"query": {
"bool": {
"must": [
{
"query_string": {
"default_field": "content_state",
"query": "active"
}
},
{
"query_string": {
"default_field": "author",
"query": "bob"
}
}
],
"must_not": [
{
"query_string": {
"default_field": "title",
"query": "Beer"
}
}
]
}
}
}
The query string query is a good way to express various relationships between search criteria. Have a quick look at the query string syntax to understand it in detail.
{
"query": {
"query_string": {
"query": "(content_state:active AND author:bob) AND NOT (title:Beer)"
}
}
}
Filters are supposed to work on exact values. If you had defined your mapping so that title was a non-analyzed field, your previous attempt (with filters) would have worked as well.
{
"mappings": {
"test": {
"_all": {
"enabled": false
},
"properties": {
"content_state": {
"type": "string"
},
"author": {
"type": "string"
},
"title": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
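With exact-value fields in place, the SQL condition from the question maps onto term clauses. The sketch below is one way to build such a body; it assumes Elasticsearch 2.x or later, where the bool query has a filter clause, and the helper name `exact_match_query` is hypothetical.

```python
import json


def exact_match_query(equals, not_equals):
    """Translate 'a = x AND b = y AND c != z' into term clauses.

    equals / not_equals are dicts of field name -> exact value.
    """
    return {
        "query": {
            "bool": {
                "filter": [{"term": {f: v}} for f, v in equals.items()],
                "must_not": [{"term": {f: v}} for f, v in not_equals.items()],
            }
        }
    }


body = exact_match_query(
    equals={"content_state": "active", "author": "bob"},
    not_equals={"title": "Beer"},
)
print(json.dumps(body, indent=2))
```

The must_not term on title only excludes "Beer" reliably if title is indexed verbatim, as in the mapping above.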

Find documents with empty string value on elasticsearch

I've been trying to filter with Elasticsearch for only those documents that contain an empty string in their body. So far I'm having no luck.
Before I go on, I should mention that I've already tried the many "solutions" spread around the Interwebz and StackOverflow.
So, below is the query that I'm trying to run, followed by its counterparts:
{
"query": {
"filtered":{
"filter": {
"bool": {
"must_not": [
{
"missing":{
"field":"_textContent"
}
}
]
}
}
}
}
}
I've also tried the following:
{
"query": {
"filtered":{
"filter": {
"bool": {
"must_not": [
{
"missing":{
"field":"_textContent",
"existence":true,
"null_value":true
}
}
]
}
}
}
}
}
And the following:
{
"query": {
"filtered":{
"filter": {
"missing": {"field": "_textContent"}
}
}
}
}
None of the above worked. I get an empty result set even though I know for sure that there are records that contain an empty string field.
If anyone can provide me with any help at all, I'll be very grateful.
Thanks!
If you are using the default analyzer (standard) there is nothing for it to analyze if it is an empty string. So you need to index the field verbatim (not analyzed). Here is an example:
Add a mapping that will index the field untokenized, if you need a tokenized copy of the field indexed as well you can use a Multi Field type.
PUT http://localhost:9200/test/_mapping/demo
{
"demo": {
"properties": {
"_content": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
Next, index a couple of documents.
POST http://localhost:9200/test/demo/1/
{
"_content": ""
}
POST http://localhost:9200/test/demo/2
{
"_content": "some content"
}
Execute a search:
POST http://localhost:9200/test/demo/_search
{
"query": {
"filtered": {
"filter": {
"term": {
"_content": ""
}
}
}
}
}
Returns the document with the empty string.
{
took: 2,
timed_out: false,
_shards: {
total: 5,
successful: 5,
failed: 0
},
hits: {
total: 1,
max_score: 0.30685282,
hits: [
{
_index: test,
_type: demo,
_id: 1,
_score: 0.30685282,
_source: {
_content: ""
}
}
]
}
}
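The point that the standard analyzer has nothing to analyze in an empty string can be shown with a toy model. The tokenizer below is a simplified stand-in rather than the real analyzer, and the function names are hypothetical.

```python
import re


def index_analyzed(value):
    """Toy analyzed field: tokenize on word characters and lowercase."""
    return [tok.lower() for tok in re.findall(r"\w+", value)]


def index_not_analyzed(value):
    """Toy not_analyzed field: the whole value is stored as a single term."""
    return [value]


def term_hit(terms, value):
    """A term filter matches only if the exact value is among the indexed terms."""
    return value in terms


print(index_analyzed(""))                     # no tokens at all
print(index_not_analyzed(""))                 # one term: the empty string
print(term_hit(index_analyzed(""), ""))       # nothing to match against
print(term_hit(index_not_analyzed(""), ""))   # the empty term is found
```

In this model the analyzed field produces an empty token list, so a term filter for "" has nothing to hit, while the verbatim field keeps "" as a term.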
Found solution here https://github.com/elastic/elasticsearch/issues/7515
It works without reindex.
PUT t/t/1
{
"textContent": ""
}
PUT t/t/2
{
"textContent": "foo"
}
GET t/t/_search
{
"query": {
"bool": {
"must": [
{
"exists": {
"field": "textContent"
}
}
],
"must_not": [
{
"wildcard": {
"textContent": "*"
}
}
]
}
}
}
Even with the default analyzer you can do this kind of search: use a script filter, which is slower but can handle the empty string:
curl -XPOST 'http://localhost:9200/test/demo/_search' -d '
{
"query": {
"filtered": {
"filter": {
"script": {
"script": "_source._content.length() == 0"
}
}
}
}
}'
It will return the document with an empty string as _content, without a special mapping.
As pointed out by @js_gandalf, this is deprecated for ES > 5.0. Instead you should use query -> bool -> filter -> script, as in https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html
For those of you using Elasticsearch 5.2 or above who are still stuck: the easiest way is to reindex your data correctly with the keyword type. After that, all my searches for empty values worked. Like this:
"query": {
"term": {"MY_FIELD_TO_SEARCH": ""}
}
Once I reindexed my database and reran the query, it worked =)
The problem was that my field was of type text and NOT keyword. I changed the mapping to keyword and reindexed:
curl -X PUT https://username:password@host.io:9200/mycoolindex
curl -X PUT https://user:pass@host.io:9200/mycoolindex/_mapping/mycooltype -d '{
"properties": {
"MY_FIELD_TO_SEARCH": {
"type": "keyword"
}
}
}'
curl -X POST https://username:password@host.io:9200/_reindex -d '{
"source": {
"index": "oldindex"
},
"dest": {
"index": "mycoolindex"
}
}'
I hope this helps someone who was as stuck as I was finding those empty values.
OR using lucene query string syntax
q=yourfield.keyword:""
See Elastic Search Reference https://www.elastic.co/guide/en/elasticsearch/reference/6.5/query-dsl-query-string-query.html#query-string-syntax
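When the query string goes into a URL, the colon and the double quotes need percent-encoding. A minimal sketch using only the standard library; the base URL and index name are placeholders, and the helper name `empty_string_search_url` is hypothetical.

```python
from urllib.parse import urlencode, parse_qs


def empty_string_search_url(base_url, index, field):
    """Build a _search URL whose q parameter is <field>:"" (properly encoded)."""
    query = f'{field}:""'
    return f"{base_url}/{index}/_search?" + urlencode({"q": query})


url = empty_string_search_url("http://localhost:9200", "my_index", "yourfield.keyword")
print(url)

# Decoding the query part gives back the original Lucene expression.
print(parse_qs(url.split("?", 1)[1])["q"][0])
```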
How to find the empty string in one field of your document depends heavily on the field's mapping, in other words, its index/analyzer setting.
If its index is not_analyzed, which means the token is just the empty string, you can simply use a term query to find it, as follows:
{"from": 0, "size": 100, "query":{"term": {"name":""}}}
Otherwise, if the index setting is analyzed, I believe most analyzers will treat the empty string as a null value, so you can use the missing filter to find the empty string.
{"filter": {"missing": {"existence": true, "field": "name", "null_value": true}}, "query": {"match_all": {}}}
here is the gist script you can reference: https://gist.github.com/hxuanji/35b982b86b3601cb5571
BTW, I checked the commands you provided, and it seems you DON'T want the empty-string documents.
All my commands above find exactly those documents, so just putting them into the must_not part of a bool query would be fine.
My ES is 1.0.1.
For ES 1.3.0, the gist I provided currently cannot find the empty string. It seems this has been reported: https://github.com/elasticsearch/elasticsearch/issues/7348 . Let's wait and see how it goes.
Anyway, that issue also provides another command to find it:
{ "query": {
"filtered": {
"filter": {
"not": {
"filter": {
"range": {
"name": {
}
}
}
}
}
} } }
name is the field name to find the empty-string. I've tested it on ES 1.3.2.
I'm using Elasticsearch 5.3 and was having trouble with some of the above answers.
The following body worked for me.
{
"query": {
"bool" : {
"must" : {
"script" : {
"script" : {
"inline": "doc['city'].empty",
"lang": "painless"
}
}
}
}
}
}
Note: you might need to enable the fielddata for text fields, it is disabled by default. Although I would read this: https://www.elastic.co/guide/en/elasticsearch/reference/current/fielddata.html before doing so.
To enable the fielddata for a field e.g. 'city' on index 'business' with type name 'record' you need:
PUT business/_mapping/record
{
"properties": {
"city": {
"type": "text",
"fielddata": true
}
}
}
If you don't want to or can't re-index there is another way. :-)
You can use the negation operator and a wildcard to match any non-blank string *
GET /my_index/_search?q=!(fieldToLookFor:*)
For nested fields use:
curl -XGET "http://localhost:9200/city/_search?pretty=true" -d '{
"query" : {
"nested" : {
"path" : "country",
"score_mode" : "avg",
"query" : {
"bool": {
"must_not": {
"exists": {
"field": "country.name"
}
}
}
}
}
}
}'
NOTE: path and field together determine what is searched. Change them as required for your mapping.
For regular fields:
curl -XGET 'http://localhost:9200/city/_search?pretty=true' -d'{
"query": {
"bool": {
"must_not": {
"exists": {
"field": "name"
}
}
}
}
}'
I didn't manage to search for empty strings in a text field. However it seems to work with a field of type keyword. So I suggest the following:
delete /test_idx
put test_idx
{
"mappings" : {
"testMapping": {
"properties" : {
"tag" : {"type":"text"},
"content" : {"type":"text",
"fields" : {
"x" : {"type" : "keyword"}
}
}
}
}
}
}
put /test_idx/testMapping/1
{
"tag": "null"
}
put /test_idx/testMapping/2
{
"tag": "empty",
"content": ""
}
GET /test_idx/testMapping/_search
{
"query" : {
"match" : {"content.x" : ""}
}
}
You need to trigger the keyword indexer by adding .content to your field name. Depending on how the original index was set up, the following "just works" for me using AWS ElasticSearch v6.x.
GET /my_idx/_search?q=my_field.content:""
I was trying to find the empty fields (in indexes with dynamic mapping) and set them to a default value, and the request below worked for me.
Note this is on Elasticsearch 7.x.
POST <index_name|pattern>/_update_by_query
{
"script": {
"lang": "painless",
"source": """
if (ctx._source.<field_name> == "") {
ctx._source.<field_name> = "0";
} else {
ctx.op = "noop";
}
"""
}
}
I also followed one of the responses from this thread and came up with the request below, which does the same thing:
POST index_pattern*/_update_by_query
{
"script": {
"source": "ctx._source.field_name='0'",
"lang": "painless"
},
"query": {
"bool": {
"must": [
{
"exists": {
"field": "field_name"
}
}
],
"must_not": [
{
"wildcard": {
"field_name": "*"
}
}
]
}
}
}
I was also trying to find the documents in the index that don't have the field at all and add it with a value.
One of the responses from this thread helped me come up with the request below:
POST index_pattern*/_update_by_query
{
"script": {
"source": "ctx._source.field_name='0'",
"lang": "painless"
},
"query": {
"bool": {
"must_not": [
{
"exists": {
"field": "field_name"
}
}
]
}
}
}
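The two _update_by_query bodies above differ only in their query part, so they can be generated from one helper. This is a sketch that builds the bodies without sending them; the function name `set_default_body` and its `only_missing` flag are hypothetical, and the script string is interpolated the same way as in the requests above.

```python
import json


def set_default_body(field, default, only_missing):
    """Build an _update_by_query body that writes `default` into `field`.

    only_missing=True  -> target documents that lack the field entirely
    only_missing=False -> target documents where the field exists but no term matches '*'
    """
    if only_missing:
        bool_query = {"must_not": [{"exists": {"field": field}}]}
    else:
        bool_query = {
            "must": [{"exists": {"field": field}}],
            "must_not": [{"wildcard": {field: "*"}}],
        }
    return {
        "script": {"source": f"ctx._source.{field}='{default}'", "lang": "painless"},
        "query": {"bool": bool_query},
    }


print(json.dumps(set_default_body("field_name", "0", only_missing=False), indent=2))
print(json.dumps(set_default_body("field_name", "0", only_missing=True), indent=2))
```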
Thanks to everyone who contributed to this thread; I was able to solve my problem.

Create Elasticsearch curl query for not null and not empty("")

How can I create an Elasticsearch curl query to get the field values that are not null and not empty ("")?
Here is the mysql query:
select field1 from mytable where field1!=null and field1!="";
A null value and an empty string both result in no value being indexed, in which case you can use the exists filter
curl -XGET 'http://127.0.0.1:9200/test/test/_search?pretty=1' -d '
{
"query" : {
"constant_score" : {
"filter" : {
"exists" : {
"field" : "myfield"
}
}
}
}
}
'
Or in combination with (eg) a full text search on the title field:
curl -XGET 'http://127.0.0.1:9200/test/test/_search?pretty=1' -d '
{
"query" : {
"filtered" : {
"filter" : {
"exists" : {
"field" : "myfield"
}
},
"query" : {
"match" : {
"title" : "search keywords"
}
}
}
}
}
'
As @luqmaan pointed out in the comments, the documentation says that the exists filter doesn't filter out empty strings, as they are considered non-null values.
So, adding to @DrTech's answer, to effectively filter out both null and empty string values, you should use something like this:
{
"query" : {
"constant_score" : {
"filter" : {
"bool": {
"must": {"exists": {"field": "<your_field_name_here>"}},
"must_not": {"term": {"<your_field_name_here>": ""}}
}
}
}
}
}
On elasticsearch 5.6, I have to use command below to filter out empty string:
GET /_search
{
"query" : {
"regexp":{
"<your_field_name_here>": ".+"
}
}
}
Wrap a Missing Filter in the Must-Not section of a Bool Filter. It will only return documents where the field exists, and if you set the "null_value" property to true, values that are explicitly not null.
{
"query":{
"filtered":{
"query":{
"match_all":{}
},
"filter":{
"bool":{
"must":{},
"should":{},
"must_not":{
"missing":{
"field":"field1",
"existence":true,
"null_value":true
}
}
}
}
}
}
}
You can do that with bool query and combination of must and must_not like this:
GET index/_search
{
"query": {
"bool": {
"must": [
{"exists": {"field": "field1"}}
],
"must_not": [
{"term": {"field1": ""}}
]
}
}
}
I tested this with Elasticsearch 5.6.5 in Kibana.
The only solution here that worked for me in 5.6.5 was bigstone1998's regex answer. I'd prefer not to use a regex search, though, for performance reasons. I believe the reason the other solutions don't work is that a standard field will be analyzed and as a result has no empty string token to negate against. The exists query won't help on its own either, since an empty string is considered non-null.
If you can't change the index the regex approach may be your only option, but if you can change the index then adding a keyword subfield will solve the problem.
In the mappings for the index:
"myfield": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
}
Then you can simply use the query:
{
"query": {
"bool": {
"must": {
"exists": {
"field": "myfield"
}
},
"must_not": {
"term": {
"myfield.keyword": ""
}
}
}
}
}
Note the .keyword in the must_not component.
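The same query can be generated for any field that has a keyword sub-field. A minimal sketch; the helper name `not_null_and_not_empty` and its `keyword_subfield` parameter are hypothetical, and the sub-field name defaults to the "keyword" used in the mapping above.

```python
import json


def not_null_and_not_empty(field, keyword_subfield="keyword"):
    """Build a bool query: the field exists and its exact value is not ''."""
    # Use the keyword sub-field for the exact comparison when one is given.
    exact_field = f"{field}.{keyword_subfield}" if keyword_subfield else field
    return {
        "query": {
            "bool": {
                "must": {"exists": {"field": field}},
                "must_not": {"term": {exact_field: ""}},
            }
        }
    }


print(json.dumps(not_null_and_not_empty("myfield"), indent=2))
```

Passing keyword_subfield=None targets the field itself, which fits a field that is already mapped as keyword.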
You can use a not filter on top of missing:
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"not": {
"filter": {
"missing": {
"field": "searchField"
}
}
}
}
}
}
Here's the query example to check the existence of multiple fields:
{
"query": {
"bool": {
"filter": [
{
"exists": {
"field": "field_1"
}
},
{
"exists": {
"field": "field_2"
}
},
{
"exists": {
"field": "field_n"
}
}
]
}
}
}
You can use a bool query combining must and must_not, which performs well and returns all records where the field is not null and not empty.
The must_not clause acts like "AND NOT", which gives field != "", and the must clause with exists gives field != null.
Together they are effectively: where field1 != null and field1 != ""
GET IndexName/IndexType/_search
{
"query": {
"bool": {
"must": [{
"bool": {
"must_not": [{
"term": { "YourFieldName": ""}
}]
}
}, {
"bool": {
"must": [{
"exists" : { "field" : "YourFieldName" }
}]
}
}]
}
}
}
ElasticSearch Version:
"version": {
"number": "5.6.10",
"lucene_version": "6.6.1"
}
ES 7.x
{
"_source": "field",
"query": {
"bool": {
"must": [
{
"exists": {
"field":"field"
}
}
],
"must_not": [
{
"term": {
"field.keyword": {
"value": ""
}
}
}
]
}
}
}
We are using Elasticsearch version 1.6 and I used this query from a co-worker to cover not null and not empty for a field:
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"must": [
{
"exists": {
"field": "myfieldName"
}
},
{
"not": {
"filter": {
"term": {
"myfieldName": ""
}
}
}
}
]
}
}
}
}
}
You need to use bool query with must/must_not and exists
To get where place is null
{
"query": {
"bool": {
"must_not": {
"exists": {
"field": "place"
}
}
}
}
}
To get where place is not null
{
"query": {
"bool": {
"must": {
"exists": {
"field": "place"
}
}
}
}
}
Elasticsearch: get all records where the field is not empty.
const searchQuery = {
body: {
query: {
query_string: {
default_field: '*.*',
query: 'fieldName: ?*',
},
},
},
index: 'IndexName'
};

Resources