elasticsearch highlighting error, failed to highlight ... String index out of range

I cannot make head or tail of this error, and it happens so randomly that I don't even know where to start looking.
This is what the full error looks like:
Tire::Search::SearchRequestFailed: 500 :
{
"error": "SearchPhaseExecutionException[Failed to execute phase [query_fetch], total failure;
shardFailures {[7McitJnjQkqLkViqUpZUyw][content][4]:
FetchPhaseExecutionException[[content][4]:
query[+_all:account +_all:set +_all:up],from[0],size[20]:
Fetch Failed [Failed to highlight field [post_content]]];
nested: StringIndexOutOfBoundsException[String index out of range: -5]; }]",
"status": 500
}
A query like
"relationship learning"
will run fine, but running
"relationship centered learning"
will throw the error. In fact, adding any of the letters c, d, j, q, x, or z to "relationship learning" (for example "d relationship learning") will throw the error.
It's truly maddening.
I'm running elasticsearch 19.2 with Tire
I just want to know where to start looking; any ideas will help.
This is a more complete explanation of the problem I'm having, it's exactly the same

As @imotov said above, this is a bug in Lucene, and therefore in Elasticsearch: https://issues.apache.org/jira/browse/LUCENE-4899
You can work around it by not using the fast vector highlighter, or by setting fragment_size to a higher number so that the bug appears less often.
I doubt the failures will go away completely unless you set fragment_size to an impossibly high number. You could do that in theory, but then you would have to handle truncation on your own, which defeats the purpose of the highlighter in the first place.
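For illustration, here is a minimal sketch of a request body that applies both workarounds. It assumes your Elasticsearch version accepts the "type" option in the highlight section (older releases may not); the helper name build_highlight_query and the default values are made up for this example, and only the post_content field name comes from the error above.

```python
def build_highlight_query(text, field="post_content", fragment_size=500):
    """Build a search body that requests the plain highlighter for one field.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    return {
        # Require every term, like +_all:account +_all:set +_all:up.
        "query": {"query_string": {"query": text, "default_operator": "AND"}},
        "highlight": {
            "fields": {
                field: {
                    # "plain" avoids the fast vector highlighter (assumes the
                    # running version supports the "type" option).
                    "type": "plain",
                    # Larger fragments make the failure less likely to trigger.
                    "fragment_size": fragment_size,
                    "number_of_fragments": 3,
                }
            }
        },
    }


body = build_highlight_query("relationship centered learning")
```

The resulting dict can be passed as the search body by whichever client you use.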

Related

ElasticSearch: Result window is too large

My friend stored 65000 documents on the Elastic Search cloud and I would like to retrieve all of them (using python). However, when I run my current script, I get this error:
RequestError(400, 'search_phase_execution_exception', 'Result window is too large, from + size must be less than or equal to: [10000] but was [30000]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.')
My script:
from elasticsearch import Elasticsearch
es = Elasticsearch(cloud_id=cloud_id, http_auth=(username, password))
docs = es.search(body={"query": {"match_all": {}}, '_source': ["_id"], 'size': 65000})
What would be the easiest way to retrieve all those documents without being limited to 10000 docs? Thanks.
The limit exists so that the result set does not overwhelm your nodes. Results occupy memory on the Elasticsearch node, so the bigger the result set, the bigger the memory footprint and the impact on the nodes.
Depending on what you want to do with the retrieved documents:
try the scroll API (as suggested in your error message) if it's a batch job. Be mindful of the lifetime of the scroll context in that case.
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-body.html#request-body-search-scroll
or use Search After:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-body.html#request-body-search-search-after
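A rough sketch of the Search After approach in Python. It assumes es is an elasticsearch-py client whose search() accepts index= and body=, and that sort_field names a unique, sortable field in your mapping; both the function name and sort_field are placeholders, not something from the question.

```python
def iter_with_search_after(es, index, sort_field, page_size=1000):
    """Yield every hit by paging with search_after instead of from/size."""
    body = {
        "query": {"match_all": {}},
        "size": page_size,
        # search_after needs a deterministic sort; sort_field is assumed to be
        # a unique, sortable field (hypothetical name).
        "sort": [{sort_field: "asc"}],
    }
    while True:
        hits = es.search(index=index, body=body)["hits"]["hits"]
        if not hits:
            return
        for hit in hits:
            yield hit
        # Resume after the sort values of the last hit on this page.
        body["search_after"] = hits[-1]["sort"]
```

Each page stays at or under page_size, so from + size never grows past the 10000 limit.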
You should use the scroll API and fetch the results over several calls. The scroll API returns the results in batches of at most 10000 (the scroll stays available for the amount of time you indicate in the call), and you can then page through the results using a scroll_id.
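As a sketch of what that loop can look like in Python, assuming es is an elasticsearch-py client exposing search(), scroll() and clear_scroll(); the function name, page size and keep-alive value are illustrative choices rather than anything from the question.

```python
def scroll_all(es, index, page_size=1000, keep_alive="2m"):
    """Yield every hit in the index, one scroll page at a time."""
    resp = es.search(
        index=index,
        body={"query": {"match_all": {}}},
        scroll=keep_alive,  # how long the scroll context stays alive
        size=page_size,     # per-page size, kept below the 10000 window
    )
    scroll_id = resp["_scroll_id"]
    try:
        while resp["hits"]["hits"]:
            for hit in resp["hits"]["hits"]:
                yield hit
            resp = es.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = resp["_scroll_id"]
    finally:
        # Free the server-side scroll context once we are done.
        es.clear_scroll(scroll_id=scroll_id)
```

The client library also ships a ready-made helper, elasticsearch.helpers.scan, which wraps the same scroll calls if you would rather not write the loop yourself.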
The error message itself tells you how to solve the issue. Look carefully at this part of the error message:
This limit can be set by changing the [index.max_result_window] index level setting.
Please refer to the documentation on updating index-level settings for how to change that.
So for your setting it would look like this (note that 65000 equals the number of docs in your index):
PUT /<your-index-name>/_settings
{
"index" : {
"max_result_window" : 65000
}
}

Elasticsearch sometimes failing to execute fetch phase

Currently, all search queries that are passed into ES are truncated to 1000 characters. Regardless, every so often we get this error:
[2019-10-08T15:44:08,126][DEBUG][o.e.a.s.TransportSearchAction] [zir05M0] [135539946] Failed to execute fetch phase
org.elasticsearch.transport.RemoteTransportException: [zir05M0][__IP__][__PATH__[__PATH__]]
Caused by: java.lang.IllegalArgumentException: This builder doesn't allow terms that are larger than 1,000 characters, got <some really long string of text>
I'm a novice when it comes to Elasticsearch and really don't know where to begin in trying to debug this. My only guess is that a more_like_this query is using a field that's too long, but I can't confirm that. Any hints that could get me going down the right track would be greatly appreciated.

Error 429 [type=reduce_search_phase_exception]

I have docs in many languages and am following the "one index per language" pattern. That guide suggests searching across all indices with the
/blogs-*/post/_count
pattern. In my case I am getting a count, across the indices, of how many docs I have. I am running my code concurrently, so it makes many requests at the same time. If I search
/blogs-en/post/_count
or any other language then all is fine. However if I search
/blogs-*/post/_count
I soon encounter:
"Error 429 (Too Many Requests): [reduce] [type=reduce_search_phase_exception]"
Is there a workaround for this? The same number of requests is made regardless of whether I use
/blogs-en/post/_count or /blogs-*/post/_count.
I have always used the same number of workers in my code but re-arranging the indices to have one index per language suddenly broke my code.
EDIT: It is a brand new index without any documents when I start the program, and when I get the error I have about 5,000 documents, so it is not under any heavy load.
Edit: I am using the mapping found in the above-referenced link and running on a local machine with all the defaults of ES; in my case shards=5 and replicas=1. I am really just following the example from the link.
EDIT: The errors appear after as few as 13-20 requests, and I know ES can handle more than that. Searching /blogs-en/post/_count instead of /blogs-*/post/_count can easily handle thousands of requests with no errors.
Another Edit: I have removed all concurrency but can still only make 40-50 requests before I get the error.
I don't get an error for that request, and it returns the total number of documents.
Is your cluster under load?
Anyway, with a simple aggregation you can get the total document count in hits.total and the per-index document count in the count_per_index part of the result:
GET /blogs-*/post/_search
{
"size": 0,
"query": {
"match_all": {}
},
"aggs": {
"count_per_index": {
"terms": {
"field": "_index"
}
}
}
}
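If you are calling this from Python, reading the buckets could look roughly like the sketch below. It assumes es is an elasticsearch-py client whose search() accepts index= and body=; the helper name count_per_index is made up here, and the doc type from the URL above is left out for brevity.

```python
def count_per_index(es, index_pattern):
    """Return {index_name: doc_count} using a terms aggregation on _index."""
    body = {
        "size": 0,  # no hits needed, only the aggregation
        "query": {"match_all": {}},
        "aggs": {"count_per_index": {"terms": {"field": "_index"}}},
    }
    resp = es.search(index=index_pattern, body=body)
    buckets = resp["aggregations"]["count_per_index"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}
```

Note that a terms aggregation returns only the top 10 buckets by default, so with more than ten language indices you would need to raise its size option.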

Can I ignore these query exceptions in my ElasticSearch log?

I have a large number of indices in my ES instance, and I have noticed that the log files are growing rather large. The ElasticSearch Chef cookbook by default sets the log level to DEBUG and this has resulted in millions of error messages being written into the log. Please see this one as an example:
[2015-02-20 18:42:28,858][DEBUG][action.search.type ] [SEARCHNODE] [child_index][4], node[xxxx], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest#1a9a62ad] lastShard [true]
org.elasticsearch.search.SearchParseException: [ichild_index][4]: from[0],size[105]: Parse Failure [Failed to parse source [{"from":0,"size":105,"sort":{"lastmodified":{"order":"desc","missing":"_last"}},"query":{"indices":{"indices":["main_index"],"query":{"filtered":{"query":{"bool":{"must":[{"match_all":{}}]}},"filter":{"and":{"filters":[{"term":{"isclosed":false}},{"or":[{"and":[{"type":{"value":"type_name"}}]}]},{"term":{"planid":1454}},{"bool":{"should":[{"terms":{"roles":[173,935,934,937,930,938,936]}},{"missing":{"field":"roles"}}]}}]}}}},"no_match_query":"none"}},"fields":"[]"}]]
at org.elasticsearch.search.SearchService.parseSource(SearchService.java:660)
at org.elasticsearch.search.SearchService.createContext(SearchService.java:516)
at org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:488)
at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:257)
at org.elasticsearch.search.action.SearchServiceTransportAction$5.call(SearchServiceTransportAction.java:206)
at org.elasticsearch.search.action.SearchServiceTransportAction$5.call(SearchServiceTransportAction.java:203)
at org.elasticsearch.search.action.SearchServiceTransportAction$23.run(SearchServiceTransportAction.java:517)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.search.SearchParseException: [child_index][4]: from[0],size[105]: Parse Failure [No mapping found for [lastmodified] in order to sort on]
at org.elasticsearch.search.sort.SortParseElement.addSortField(SortParseElement.java:198)
at org.elasticsearch.search.sort.SortParseElement.addCompoundSortField(SortParseElement.java:172)
at org.elasticsearch.search.sort.SortParseElement.parse(SortParseElement.java:90)
at org.elasticsearch.search.SearchService.parseSource(SearchService.java:644)
The query in the error message contains this fragment:
... {"indices":{"indices":["main_index"] ...
However, the error actually originates from child_index. I'm not sure why my instance would even try to execute the query on child_index, since the query clearly excludes that index.
The query above is actually executed successfully. Results are returned correctly, and we don't log anything in the web application that indicates a problem. Presumably the query is at some point run against main_index as well and the results are correctly returned to the web app.
My instance is under a moderate workload and this file can comfortably grow to 5 GB in a given 12 hour period. I know that the solution to that problem is simple: decrease the log level to WARN and the errors will go away. However, I'm worried that we might have a hitherto undiagnosed problem with the instance that could bite us later.
Of all the errors to ignore, org.elasticsearch.search.SearchParseException is probably the one you should never ignore. As far as I can tell, it means that ES was unable to parse your search JSON the way it expects to.
I took a look at your JSON, and although it lints, your "fields" array is actually the string "fields": "[]", which could be what's causing the issue. Can you try without the quotes and see what happens?
This is only a theory, but it's possible ES fails to parse that section and so just ignores it (which in this case should give the same result as if it had been parsed).
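To make the difference concrete, here is a tiny Python illustration using only the standard json module. It is not taken from your application, it just shows how a quoted "[]" and a real empty array serialize differently.

```python
import json

# What the logged query contains: the string "[]", not an array.
as_string = json.dumps({"fields": "[]"})
# What was probably intended: an actual empty array.
as_list = json.dumps({"fields": []})

print(as_string)  # {"fields": "[]"}
print(as_list)    # {"fields": []}
```

If your client builds the body from a dict, check that the value behind "fields" is a list and not a pre-serialized string.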

How to Fix Read timed out in Elasticsearch

I used Elasticsearch-1.1.0 to index tweets.
The indexing process is okay.
Then I upgraded the version. Now I use Elasticsearch-1.3.2, and I get this message randomly:
Exception happened: Error raised when there was an exception while talking to ES.
ConnectionError(HTTPConnectionPool(host='127.0.0.1', port=8001): Read timed out. (read timeout=10)) caused by: ReadTimeoutError(HTTPConnectionPool(host='127.0.0.1', port=8001): Read timed out. (read timeout=10)).
Snapshot of the randomness:
Happened --33s-- Happened --27s-- Happened --22s-- Happened --10s-- Happened --39s-- Happened --25s-- Happened --36s-- Happened --38s-- Happened --19s-- Happened --09s-- Happened --33s-- Happened --16s-- Happened
--XXs-- = after XX seconds
Can someone point out how to fix the Read timed out problem?
Thank you very much.
It's hard to give a direct answer, since the error you're seeing might be associated with the client you are using. However, a solution might be one of the following:
1. Increase the default timeout globally when you create the ES client by passing the timeout parameter. Example in Python:
from elasticsearch import Elasticsearch
es = Elasticsearch(timeout=30)
2. Set the timeout per request made by the client. Taken from the Elasticsearch Python docs:
# only wait for 1 second, regardless of the client's default
es.cluster.health(wait_for_status='yellow', request_timeout=1)
Raising either timeout gives the cluster some extra time to respond.
Try this:
es = Elasticsearch(timeout=30, max_retries=10, retry_on_timeout=True)
It might not fully avoid ReadTimeoutError, but it minimizes them.
Read timeouts can also happen when the query itself is large. For example, in my case of a pretty large ES index (> 3M documents), a search for a query with 30 words took around 2 seconds, while a search for a query with 400 words took over 18 seconds. So for a sufficiently large query even timeout=30 won't save you. An easy solution is to crop the query to a size that can be answered within the timeout.
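Cropping can be as simple as the sketch below. The function name and the max_words default are arbitrary picks for illustration (30 is just the fast case from my timings above); you would tune the limit against your own index and timeout.

```python
def crop_query(text, max_words=30):
    """Keep only the first max_words whitespace-separated words of a query."""
    words = text.split()
    return " ".join(words[:max_words])


long_query = " ".join(["word"] * 400)
short_query = crop_query(long_query)  # now 30 words long
```

Pass the cropped string to your search call instead of the original query text.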
For what it's worth, I found that this seems to be related to a broken index state.
It's very difficult to reliably recreate this issue, but I've seen it several times; operations run as normal except certain ones which periodically seem to hang ES (specifically refreshing an index it seems).
Deleting an index (curl -XDELETE http://localhost:9200/foo) and reindexing from scratch fixed this for me.
I recommend periodically clearing and reindexing if you see this behaviour.
Increasing various timeout options may immediately resolve issues, but does not address the root cause.
Provided the Elasticsearch service is available and the indexes are healthy, try increasing the Java minimum and maximum heap sizes: see https://www.elastic.co/guide/en/elasticsearch/reference/current/jvm-options.html .
TL;DR: edit the -Xms1g and -Xmx1g lines in /etc/elasticsearch/jvm.options.
You should also check that everything is fine with the cluster itself. Some shards may be unavailable; here is a nice doc about possible reasons for unavailable shards: https://www.datadoghq.com/blog/elasticsearch-unassigned-shards/
