Elasticsearch mixed number and string multi_match query failing - elasticsearch

I am trying to build a query where I can accept a string containing strings and numbers, and search for those values in fields in my index that contain double values and strings. For example:
Fields: Double doubleVal, String stringVal0, String stringVal1, String doNotSearchVal
Example search string: "person 10"
I am trying to get all documents containing "person" or "10" in any of the fields doubleVal, stringVal0 and stringVal1. This is my example query:
{
  "query": {
    "multi_match": {
      "query": "person 10",
      "fields": [
        "doubleVal^1.0",
        "stringVal0^1.0",
        "stringVal1^1.0"
      ],
      "type": "best_fields",
      "operator": "OR",
      "slop": 0,
      "prefix_length": 0,
      "max_expansions": 50,
      "zero_terms_query": "NONE",
      "auto_generate_synonyms_phrase_query": true,
      "fuzzy_transpositions": true,
      "boost": 1.0
    }
  }
}
(This query was generated by Spring Data Elastic)
When I run this query, I get this error: (I've removed any identifying information)
{
  "error": {
    "root_cause": [
      {
        "type": "query_shard_exception",
        "reason": "failed to create query: [query removed]",
        "index_uuid": "index_uuid",
        "index": "index_name"
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "index_name",
        "node": "node_value",
        "reason": {
          "type": "query_shard_exception",
          "reason": "failed to create query: [query removed]",
          "index_uuid": "index_uuid",
          "index": "index_name",
          "caused_by": {
            "type": "number_format_exception",
            "reason": "For input string: \"person 10\""
          }
        }
      }
    ]
  },
  "status": 400
}
I do not want to split apart the search string. If there is a way to rewrite the query so that it works in the expected way, I would like to do it that way.

You should try setting the parameter lenient to true; format-based errors, such as providing a text query value for a numeric field, will then be ignored.
You can achieve this in Spring Data by using the builder method:
.lenient(true)
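For reference, the same flag in the raw query DSL would look roughly like this (a sketch based on the query above, trimmed to the relevant parameters and not verified against your index):

```json
{
  "query": {
    "multi_match": {
      "query": "person 10",
      "fields": ["doubleVal", "stringVal0", "stringVal1"],
      "lenient": true
    }
  }
}
```

With "lenient": true, the number_format_exception raised while parsing "person 10" against doubleVal is ignored instead of failing every shard, and the match still runs against the string fields.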

Related

Elastic Search exception for a multi_match query of type phrase when using a combination of number and alphabet without a space

I am getting an exception for the query below:
"multi_match": {
  "query": "\"73a\"",
  "fields": [],
  "type": "phrase",
  "operator": "AND",
  "analyzer": "custom_analyzer",
  "slop": 0,
  "prefix_length": 0,
  "max_expansions": 50,
  "zero_terms_query": "NONE",
  "auto_generate_synonyms_phrase_query": true,
  "fuzzy_transpositions": true,
  "boost": 1.0
}
The exception I am getting:
"error" : {
  "root_cause" : [
    {
      "type" : "illegal_state_exception",
      "reason" : "field \"log_no.keyword\" was indexed without position data; cannot run SpanTermQuery (term=73)"
    },
    {
      "type" : "illegal_state_exception",
      "reason" : "field \"airplanes_data.keyword\" was indexed without position data; cannot run SpanTermQuery (term=73)"
    }
  ],
Note:
1) When I change the type from "phrase" to "best_fields", I do not get any error and I get proper results for "query": "\"73a\"".
2) Using type "phrase" with a space between the number and the letters, e.g. "query": "\"73 a\"", also returns results without an error.
My question is: why, with type "phrase", do I get an error when there is no space between the number and letter combination in the query, e.g. "query": "\"443abx\"" or "query": "\"73222aaa\""?
I am new to Elasticsearch. Any help is appreciated. Thanks :)

Open Search, exclude field from indexing in mapping

I have the following mapping:
{
  "properties": {
    "type": {
      "type": "keyword"
    },
    "body": {
      "type": "text"
    },
    "id": {
      "type": "keyword"
    },
    "date": {
      "type": "date"
    }
  }
}
The body field is going to be an email message; it's very long and I don't want to index it.
What is the proper way to exclude this field from indexing?
What I tried:
enabled: false - as I understand from the documentation, it applies only to object-type fields, but in my case it's not really an object, so I'm not sure whether I can use it.
index: false / 'no' - this breaks the search entirely and does not allow me to search at all. My query contains the query itself and aggregations with a filter. The filter contains a range:
date: { gte: someDay.getTime(), lte: 'now' }
P.S. someDay is a certain day in my case.
The error I get after applying index: false to the body field in the mapping is the following:
{
  "error": {
    "root_cause": [
      {
        "type": "number_format_exception",
        "reason": "For input string: \"now\""
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "test",
        "node": "eehPq21jQsmkotVOqQEMeA",
        "reason": {
          "type": "number_format_exception",
          "reason": "For input string: \"now\""
        }
      }
    ],
    "caused_by": {
      "type": "number_format_exception",
      "reason": "For input string: \"now\"",
      "caused_by": {
        "type": "number_format_exception",
        "reason": "For input string: \"now\""
      }
    }
  },
  "status": 400
}
I'm not sure how these cases are associated, as the error is about the date field while I'm adding the index property to the body field.
I'm using: "#opensearch-project/opensearch": "^1.0.2"
Please help me to understand:
how to exclude a field from indexing;
why applying index: false to the body field in the mapping breaks the code and I get an error associated with the date field.
You should just modify your mapping to this:
"body": {
  "type": "text",
  "index": false
}
And it should work.
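Applied to the full mapping from the question, that would look like this (a sketch reusing the question's field names; the body value is still stored in _source and returned in results, it just can't be searched):

```json
{
  "properties": {
    "type": { "type": "keyword" },
    "body": { "type": "text", "index": false },
    "id": { "type": "keyword" },
    "date": { "type": "date" }
  }
}
```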

Can I ignore "failed to find nested object under path" in ElasticSearch if I get results?

We have an index whose mapping includes nested fields. In our Java class these fields are lists of objects, and sometimes the lists can be empty (so in the JSON structure we get e.g. {... "some_nested_field": [], ...}).
When we run a query we do get results as expected, but also an error:
"failures": [
  {
    "shard": 0,
    "index": ".kibana",
    "node": "ZoEuUdkORpuBSNs7gqiv1Q",
    "reason": {
      "type": "query_shard_exception",
      "reason": """
        failed to create query: {
          "nested" : {
            "query" : {
              "bool" : {
                "must" : [
                  {
                    "match" : {
                      "foobar.name" : {
                        "query" : "brlo",
                        "operator" : "OR",
                        "prefix_length" : 0,
                        "max_expansions" : 50,
                        "fuzzy_transpositions" : true,
                        "lenient" : false,
                        "zero_terms_query" : "NONE",
                        "auto_generate_synonyms_phrase_query" : true,
                        "boost" : 1.0
                      }
                    }
                  }
                ],
                "adjust_pure_negative" : true,
                "boost" : 1.0
              }
            },
            "path" : "foobar",
            "ignore_unmapped" : false,
            "score_mode" : "avg",
            "boost" : 1.0
          }
        }
      """,
      "index_uuid": "xrFCunLNSv6AER_KwNMHSA",
      "index": ".kibana",
      "caused_by": {
        "type": "illegal_state_exception",
        "reason": "[nested] failed to find nested object under path [foobar]"
      }
    }
  }
Can I assume that this error is caused by records with empty lists, and ignore it? Or does this indicate an internal error and possibly missing results from my query? Is there a way to avoid this error?
UPDATE:
This is an example of the query we're executing:
GET /_search
{
  "query": {
    "nested": {
      "path": "mynested",
      "query": {
        "bool": {
          "should" : [
            { "match" : { "mynested.name": "foo" } },
            { "match" : { "mynested.description": "bar" } },
            { "match" : { "mynested.category": "baz" } }
          ],
          "minimum_should_match" : 1
        }
      }
    }
  }
}
The response from ES reports 10 successful shards and one failure:
{
  "took": 889,
  "timed_out": false,
  "_shards": {
    "total": 11,
    "successful": 10,
    "skipped": 0,
    "failed": 1,
    "failures": [...]
And we do get hits back:
"hits": {
  "total": 234450,
  "max_score": 11.092936,
  "hits": [ ...
Looks like you have Kibana installed. The error message says that it can't find a nested object under path foobar of index .kibana, which is the one that Kibana uses:
"index_uuid": "xrFCunLNSv6AER_KwNMHSA",
"index": ".kibana",
"caused_by": {
  "type": "illegal_state_exception",
  "reason": "[nested] failed to find nested object under path [foobar]"
}
When doing a simple GET /_search, all Elasticsearch indices are queried, including .kibana, which is probably not what you wanted.
To ignore this particular index you can use Multiple Indices search capability and do a query like:
GET /*,-.kibana/_search
Hope that helps!
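Another option worth knowing: the failure output above shows "ignore_unmapped" : false on the nested clause, and the nested query accepts an ignore_unmapped flag that makes indices without the mapped path simply return no hits instead of a shard failure. A sketch based on the query from the update (untested here):

```json
GET /_search
{
  "query": {
    "nested": {
      "path": "mynested",
      "ignore_unmapped": true,
      "query": {
        "bool": {
          "should": [
            { "match": { "mynested.name": "foo" } },
            { "match": { "mynested.description": "bar" } },
            { "match": { "mynested.category": "baz" } }
          ],
          "minimum_should_match": 1
        }
      }
    }
  }
}
```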

elasticsearch query text field with length of value more than 20

I would like to query the name field for values (text) longer than 20 characters using the following, but it is not working:
GET /groups/_search
{
  "query": {
    "bool" : {
      "must" : {
        "script" : {
          "script" : "_source.name.values.length() > 20"
        }
      }
    }
  }
}
The error message is:
{
  "error": {
    "root_cause": [
      {
        "type": "script_exception",
        "reason": "compile error",
        "script_stack": [
          "_source.name.values.lengt ...",
          "^---- HERE"
        ],
        "script": "_source.name.values.length() > 5",
        "lang": "painless"
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "groups",
        "node": "exBbDVGeToSDRzLLmOh8-g",
        "reason": {
          "type": "query_shard_exception",
          "reason": "failed to create query: {\n \"bool\" : {\n \"must\" : [\n {\n \"script\" : {\n \"script\" : {\n \"inline\" : \"_source.name.values.length() > 5\",\n \"lang\" : \"painless\"\n },\n \"boost\" : 1.0\n }\n }\n ],\n \"disable_coord\" : false,\n \"adjust_pure_negative\" : true,\n \"boost\" : 1.0\n }\n}",
          "index_uuid": "_VH1OfpdRhmd_UPV7uTNMg",
          "index": "groups",
          "caused_by": {
            "type": "script_exception",
            "reason": "compile error",
            "script_stack": [
              "_source.name.values.lengt ...",
              "^---- HERE"
            ],
            "script": "_source.name.values.length() > ",
            "lang": "painless",
            "caused_by": {
              "type": "illegal_argument_exception",
              "reason": "Variable [_source] is not defined."
            }
          }
        }
      }
    ]
  },
  "status": 400
}
No idea how I should fix it...
FYI: the ES version is 5.4.0.
I don't know whether the following issue is related:
Painless script_fields don't have access to a _source variable #20068
https://github.com/elastic/elasticsearch/issues/20068
The best and most optimal way to handle this is to also index another field with the length of the name field, let's call it nameLength. That way you shift the burden of computing the length of the name field at indexing time instead of having to do it (repeatedly) at query time.
So at indexing time if you have a name field like {"name": "A big brown fox"}, then you create a new field with the length of the name field, such as {"name": "A big brown fox", "nameLength": 15}.
At query time, you'll be able to use a simple and quick range query on the nameLength field:
GET /groups/_search
{
  "query": {
    "bool" : {
      "must" : {
        "range" : {
          "nameLength": {
            "gt": 20
          }
        }
      }
    }
  }
}
You can use:
params._source.name.length() > 20
In case this is a rare query, that's probably OK to do. Otherwise you should add a field for the name length and use the range query.
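As a full query, a sketch for ES 5.4 (note the "inline" key, which this version uses instead of the later "source"; doc['name.keyword'] assumes name has a keyword sub-field, which sidesteps the _source access problem from the linked GitHub issue entirely):

```json
GET /groups/_search
{
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": {
            "inline": "doc['name.keyword'].value.length() > 20",
            "lang": "painless"
          }
        }
      }
    }
  }
}
```

A script filter like this still runs against every candidate document, so the indexed nameLength field from the first answer remains the faster option for frequent queries.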

Using Wildcards in Field Names of multi_match Query to match the whole fields

Using Wildcards in Field Names says field names can be specified with wildcards, for example:
{
  "multi_match": {
    "query": "Quick brown fox",
    "fields": "*_title"
  }
}
Now I'd like to query across all fields in the index, so I tried to use * to match all fields. But I got an error, for example:
{
  "multi_match": {
    "query": "nanjing jianye",
    "fields": ["*"]
  }
}
error:
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Invalid format: \"nanjing jianye\""
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "mycustomer",
        "node": "YHKU1KllRJW-BiH9g5-McQ",
        "reason": {
          "type": "illegal_argument_exception",
          "reason": "Invalid format: \"nanjing jianye\""
        }
      }
    ]
  },
  "status": 400
}
How to match the whole fields using wildcards instead of explicitly specifying the fields?
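The "Invalid format" message suggests the wildcard pulled in a date (or similarly typed) field that cannot parse "nanjing jianye". One hedged workaround, untested here, is the same lenient flag discussed in the first answer above, which makes such per-field format errors non-fatal:

```json
{
  "multi_match": {
    "query": "nanjing jianye",
    "fields": ["*"],
    "lenient": true
  }
}
```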
