Optimize Elasticsearch Writes slowing down Reads

I have a Flink job that's bulk writing/upserting a few thousand docs per second into Elasticsearch. Every time I query, it takes ~10-20 seconds to get a response.
I have a second index that's exactly the same and equally full, on the same cluster, but writes to it are now turned down to 0. When I query that index, I get a response in milliseconds.
I.e. with writes off, queries take milliseconds; with writes on, queries take 10-20 seconds.
CPU utilization ~10%, JVM mem pressure ~70%. ES 7.8.
It would appear, then, that writes to the shards are somehow slowing the reads down. This is odd considering that with "profile": true I get query timings on the order of milliseconds, yet took (the total request time) is in the seconds I'm seeing.
My question is why might this be happening, and how can I optimize it?
(I did consider having dedicated read replica nodes, but ES doesn't support a read-replica node type: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-node.html#node-roles )
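For reference, a profiled search looks like the request below (host and index are placeholders). Note that "profile": true only measures query execution on each shard; it does not include time spent in the search queue, the fetch phase, or on the network, which is one way profile can report milliseconds while took reports seconds:
curl -X GET "host/index/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "profile": true,
  "query": { "match_all": { } }
}'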

Not sure what your original refresh_interval value was, but the default is 1 second, which you have now set explicitly, and it affects which documents are returned in search results.
A refresh writes buffered documents into segments, which is what search queries run against; without a refresh you will get stale results for existing docs, and new documents written after the last refresh will not show up in search results at all.
It shouldn't change the performance of the search queries themselves, though. In your case searches slow down while indexing is happening, so you need to debug further: check CPU, memory, and node stats (in particular the queue sizes of the search and write thread pools), for example as below.
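The cat thread pool API shows active threads, queued requests, and rejections per node (host is a placeholder):
curl -X GET "host/_cat/thread_pool/search,write?v&h=node_name,name,active,queue,rejected"
A growing search queue while indexing runs would confirm that searches are waiting behind write-induced load.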
By the way, replicas belong to shards, not nodes, and you can dynamically increase the number of replicas of an index to improve search performance:
PUT /my-index-000001/_settings
{
  "index" : {
    "number_of_replicas" : 2
  }
}

Setting refresh_interval to 1s seems to have fixed the issue. If anyone has an explanation as to why, I'd appreciate it.
curl -X PUT "host/index/_settings?pretty&human=true" -H 'Content-Type: application/json' -d'
{
  "index" : {
    "refresh_interval" : "1s"
  }
}'
Edit:
The refresh rate changes dynamically based on read load unless it's explicitly set.
By default, Elasticsearch periodically refreshes indices every second, but only on indices that have received one search request or more in the last 30 seconds. You can change this default interval using the index.refresh_interval setting.
https://www.elastic.co/guide/en/elasticsearch/reference/7.10/indices-refresh.html#refresh-api-desc
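As an aside, if you ever want to return to that adaptive default rather than a fixed 1s, setting the value to null removes the explicit setting (host and index are placeholders):
curl -X PUT "host/index/_settings" -H 'Content-Type: application/json' -d'
{
  "index" : {
    "refresh_interval" : null
  }
}'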

Related

elastic reindex is very slow

Elastic reindex is very slow. According to some articles, the refresh interval defaults to 1 second and should be changed to -1 for the reindex, then set back to 1s after the reindex completes. My question here is:
Is it OK to set the refresh interval to -1 while the reindex is already running and about 20% complete?
curl -XPUT 'localhost:9200/my_index/_settings' -H 'Content-Type: application/json' -d'
{
  "index" : {
    "refresh_interval" : "-1"
  }
}'
It won't hurt if you do that, no.
Also, if you are using Elasticsearch 5 then you really need to upgrade urgently; it's been EOL for a number of years now.
The refresh_interval controls how frequently Elasticsearch makes newly indexed data available for search. Refreshing is extra work, and that work grows as the interval is reduced. When reindexing, if you don't need to query the new index yet, set the interval high, or disable refresh entirely, to improve indexing performance.
https://www.elastic.co/guide/en/elasticsearch/reference/current/tune-for-indexing-speed.html
If you must read from the index while writing to it and must have the data available very quickly, indexing performance will be reduced.
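As a sketch, the usual sequence around a reindex (host and index name are placeholders) is: disable refresh on the destination, run the reindex, then restore the interval:
curl -X PUT 'localhost:9200/my_index/_settings' -H 'Content-Type: application/json' -d'
{ "index" : { "refresh_interval" : "-1" } }'

# ... run the _reindex and wait for it to complete ...

curl -X PUT 'localhost:9200/my_index/_settings' -H 'Content-Type: application/json' -d'
{ "index" : { "refresh_interval" : "1s" } }'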

When there are no writes, why does Elasticsearch perform indexing every 'n' seconds?

I have a basic question regarding Elasticsearch.
As per documentation : By default, Elasticsearch periodically refreshes indices every second, but only on indices that have received one search request or more in the last 30 seconds.
Reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-refresh.html#refresh-api-desc
Also as per documentation: When a document is stored, it is indexed and fully searchable in near real-time--within 1 second.
Reference : https://www.elastic.co/guide/en/elasticsearch/reference/7.14/documents-indices.html
So when a write happens, indexing happens. When no writes are happening and the documents are already indexed, why does Elasticsearch still index/refresh every second?
It's not indexing existing documents; that's already been done.
It's checking whether there are any in-memory indexing requests that need to be written to disk to make them searchable.
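You can also trigger that step manually with the refresh API, which makes all buffered operations on the index searchable immediately (index name is a placeholder):
curl -X POST 'localhost:9200/my-index/_refresh'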

Elasticsearch reindex store sizes vary greatly

I am running Elasticsearch 6.2.4. I have a program that will automatically create an index for me as well as the mappings necessary for my data. For this issue, I created an index called "landsat" but it needs to actually be named "landsat_8", so I chose to reindex. The original "landsat" index has 2 shards and 0 read replicas. The store size is ~13.4gb with ~6.6gb per shard and the index holds just over 515k documents.
I created a new index called "landsat_8" with 5 shards, 1 read replica, and started a reindex with no special options. On a very small Elastic Cloud cluster (4GB RAM), it finished in 8 minutes. It was interesting to see that the final store size was only 4.2gb, yet it still held all 515k documents.
After it was finished, I realized that I had failed to create my mappings before reindexing, so I blew it away and started over. I was shocked to find that after an hour, the _cat/indices endpoint showed that only 7.5gb of data and 154,800 documents had been reindexed. 4 hours later, the entire job seemed to have died at 13.1gb, but it showed only 254,000 documents had been reindexed.
On this small 4gb cluster, this reindex operation was maxing out CPU. I increased the cluster to the biggest one Elastic Cloud offered (64gb ram), 5 shards, 0 RR and started the job again. This time, I set the refresh_interval on the new index to -1 and changed the size for the reindex operation to 2000. Long story short, this job ended in somewhere between 1h10m and 1h19m. However, this time I ended up with a total store size of 25gb, where each shard held ~5gb.
I'm very confused as to why the reindex operation causes such wildly different results in store size and reindex performance. Why, when I don't explicitly define any mappings and let ES automatically create mappings, is the store size so much smaller? And why, when I use the exact same mappings as the original index, is the store so much bigger?
Any advice would be greatly appreciated. Thank you!
UPDATE 1:
Here are the only differences in mappings:
The left image is "landsat" and the right image is "landsat_8". There is a root level "type" field and a nested "properties.type" field in the original "landsat" index. I forgot one of my goals was to remove the field "properties.type" from the data during the reindex. I seem to have been successful in doing so, but at the same time, accidentally renamed the root-level "type" field mapping to "provider", thus "landsat_8" has an unused "provider" mapping and an auto-created "type" mapping.
So there are some problems here, but I wouldn't think this would nearly double my store size...
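One thing worth checking: store size depends heavily on how far segment merging has progressed and on deleted documents still retained in segments, so comparing the segment-level picture of both indices may explain the gap. As a sketch (index names match the question; host is a placeholder):
curl -X GET 'localhost:9200/_cat/segments/landsat,landsat_8?v'
# optionally compact the new index into fewer segments before comparing sizes
curl -X POST 'localhost:9200/landsat_8/_forcemerge?max_num_segments=1'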

Elasticsearch query results contain fewer documents than the actual data when the data in Elasticsearch updates every 30s

I am trying to use Grafana to display real-time data in a table. Grafana uses POST requests to obtain query results from Elasticsearch. Sometimes the number of returned docs is less than the actual number: after several refreshes the count is complete once, then incomplete again later. Why are responses missing data, and why isn't the display real-time? How can I obtain complete query results every 5s?
Elasticsearch is a near-real-time search solution, so your data only shows up in search results after it has been written from the in-memory indexing buffer into a Lucene segment. This happens when the buffer fills up or when the refresh interval of your index kicks in.
To keep it simple, you can configure the refresh interval of your index to get a "more" real-time behavior (see the refresh settings documentation):
PUT /<your_index>/_settings
{
  "index" : {
    "refresh_interval" : "1s"
  }
}
This will make changes to your index visible to search every second.
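If your writer can tolerate slower indexing, another option (available since Elasticsearch 5.0) is to make each write wait until it is visible to search by passing refresh=wait_for on the index request; index, id, and body below are placeholders:
curl -X PUT 'localhost:9200/<your_index>/_doc/1?refresh=wait_for' -H 'Content-Type: application/json' -d'
{ "field" : "value" }'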

How to get more than 10 thousand Documents at a time from Elasticsearch Using Jest client

Elasticsearch returns 10 records by default. We can set the size parameter to get more than 10 records, but there is a limit: with the Jest client for Elasticsearch we can only set a size of up to 10,000, and anything more than that throws an exception.
Please help me get more than 10 thousand records at once from Elasticsearch using the Jest client (Java).
That limit is there for a reason — quoting from the documentation:
The index.max_result_window which defaults to 10,000 is a safeguard, search requests take heap memory and time proportional to from + size.
Depending on your use-case, there are better alternatives:
Real-time / user facing: Use Search After (and avoid deep pagination) with good response times and reasonable heap usage.
Machine / batch processing (often to read all data): Scroll, which creates a search context and keeps it open for the specified amount of time. The result will also be stable as long as the context is open.
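A minimal search_after sketch (index and field names here are made up; adjust to your mapping): sort on a stable key, then feed the sort values of the last hit into the next request:
curl -X GET 'localhost:9200/my_index/_search' -H 'Content-Type: application/json' -d'
{
  "size": 10000,
  "sort": [ { "timestamp": "asc" }, { "id": "asc" } ],
  "search_after": [ 1633046400000, "doc-12345" ]
}'
The first page is the same request without "search_after"; each following page passes the sort values of the previous page's last hit.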
You can update the max_result_window setting of the index:
curl -XPUT "http://localhost:9200/my_index/_settings" -H 'Content-Type: application/json' -d'
{ "index" : { "max_result_window" : 500000 } }'
Note that this is a dynamic index setting: it is stored in the index metadata and survives node restarts. On recent Elasticsearch versions, index-level settings like this cannot be set in elasticsearch.yml; if you want the limit applied to indices created in the future, use an index template instead (see the sketch below). And keep the safeguard quoted above in mind: a larger window means more heap per request.
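A sketch of such a template, using the composable template API available on 7.8+ (template name and pattern are examples):
curl -X PUT 'localhost:9200/_index_template/big_result_window' -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["my_index*"],
  "template": { "settings": { "index.max_result_window": 500000 } }
}'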
