Initializing shards when updating non-dynamic settings of an Elasticsearch index - elasticsearch

Setup: Elasticsearch 1.4.7 (I know it's old, legacy issues).
I need to update non-dynamic settings of an index. We save some metadata about the index in its settings. Since these settings are non-dynamic, I need to close the index, update the settings, and then reopen the index, as explained in this SO answer.
es_index = 'my_index'
data = {'settings': {'version_feed': version_feed}}
self.get_connection().indices.close(index=es_index)
self.get_connection().indices.put_settings(index=es_index, body=data)
self.get_connection().indices.open(index=es_index)
The problem occurs when I try to access (read or update) the index after reopening it. I get the following exception:
TransportError: TransportError(503, u'SearchPhaseExecutionException[Failed to execute phase [init_scan], all shards failed]')
In the Head plugin I see that the cluster health is red, and the index's shard is yellow (not green) and in state INITIALIZING.
The current index has a single shard, while in the production environment it will be split into four shards, which might also be a problem. I'm not sure, as I haven't tested it in that environment.
Also, although upgrading isn't something I can do right now, out of mere interest: is this issue solved or handled differently in newer versions of Elasticsearch?
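One thing that may be worth trying, sketched here under assumptions: after reopening, the shard has to recover before it can serve requests, so waiting for the index to reach at least yellow health before reading or writing might avoid the 503. The helper name update_static_settings and the 5s timeout are illustrative, and client is assumed to be an elasticsearch-py client such as the one returned by self.get_connection().

```python
def update_static_settings(client, index, settings, timeout='5s'):
    """Close the index, apply non-dynamic settings, reopen it, then wait for recovery.

    Returns True if the index reached at least yellow health within `timeout`.
    """
    client.indices.close(index=index)
    try:
        client.indices.put_settings(index=index, body={'settings': settings})
    finally:
        # Reopen even if the update failed, so the index is not left closed.
        client.indices.open(index=index)
    # Ask the cluster to block until the primaries are assigned (yellow) or the
    # server-side timeout expires. The default is kept below the client's usual
    # 10 second request timeout; call again in a loop if recovery takes longer.
    health = client.cluster.health(index=index, wait_for_status='yellow',
                                   timeout=timeout)
    return not health.get('timed_out', False)
```

A caller would only start scanning or updating the index once this returns True, for example update_static_settings(self.get_connection(), 'my_index', {'version_feed': version_feed}).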

Related

Elasticsearch warning messages

I have ES running on my local development machine for my Rails app (Using Searchkick). I am getting these error messages:
299 Elasticsearch-6.8.8-2f4c224 "In a future major version, this
request will fail because this action would add [1] total shards, but
this cluster currently has [1972]/[1000] maximum shards open. Before
upgrading, reduce the number of shards in your cluster or adjust the
cluster setting [cluster.max_shards_per_node]."
My config file already has cluster.max_shards_per_node: 2000. Am I missing something here?
299 Elasticsearch-6.8.8-2f4c224 "[types removal] The parameter
include_type_name should be explicitly specified in create index
requests to prepare for 7.0. In 7.0 include_type_name will default to
'false', and requests are expected to omit the type name in mapping
definitions."
I have zero clue where to start looking on this one.
These flood my terminal when I run my re-indexing - looking to resolve it.
I think this is a dynamic cluster setting, so you should set it through the _cluster/settings API.
Beyond that, having this many shards on a single node is a problem in itself. Please read the following article:
https://www.elastic.co/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster
You can also use the shrink index API, which lets you shrink an existing index into a new index with fewer primary shards.
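As a rough sketch of the first suggestion, the limit can be applied through the cluster settings API instead of the config file. The function name raise_shard_limit is hypothetical, and client is assumed to be an elasticsearch-py client connected to the cluster.

```python
def raise_shard_limit(client, max_shards_per_node):
    """Set cluster.max_shards_per_node via the _cluster/settings API."""
    body = {
        # 'persistent' settings survive a full cluster restart;
        # 'transient' could be used instead for a temporary change.
        'persistent': {'cluster.max_shards_per_node': max_shards_per_node}
    }
    return client.cluster.put_settings(body=body)
```

Calling raise_shard_limit(client, 2000) would mirror the value from the question, though reducing the shard count is likely the better long-term fix.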

Elasticsearch reindex gets stuck

Context
We have two Elasticsearch clusters, with 6 and 3 nodes respectively. We use the 6-node cluster in production and the 3-node cluster for testing. (We have the same problem in both clusters.) All the nodes have the following characteristics:
Elasticsearch 7.4.2
1TB HDD disk
8 GB RAM
In our case, we need to reindex some of the indexes. Those indexes have billions of documents and a size between 50GB and 250GB.
Problem
Whenever we start reindexing, internally or from a remote source, the task starts working correctly, but at some point it stops reindexing for no apparent reason. We can't see anything in the logs. The task is not cancelled; it simply stops reindexing documents, as if it were stuck. We tried changing GC strategies (CMS and Shenandoah), but nothing changed.
Has anyone run into the same problem?
It's difficult to find the root cause of issues like this without debugging, and with as little information as you provided (the cluster and index configuration, index slow logs, Elasticsearch error logs, and Elasticsearch hot threads are missing, to name a few).
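To gather some of that information, one option is to poll the task and record when it stops making progress. The sketch below assumes each snapshot is the response of the tasks API for the reindex task (for instance from client.tasks.get(task_id=...) in elasticsearch-py), whose task.status object carries counters such as created and updated. The helper names and the three-sample threshold are illustrative.

```python
def docs_processed(task_response):
    """Sum the per-document counters reported in a reindex task status."""
    status = task_response['task']['status']
    counters = ('created', 'updated', 'deleted', 'noops', 'version_conflicts')
    return sum(status.get(name, 0) for name in counters)


def looks_stalled(snapshots, min_samples=3):
    """Return True if the last `min_samples` snapshots show no progress.

    `snapshots` is a list of task API responses, oldest first.
    """
    if len(snapshots) < min_samples:
        return False
    recent = snapshots[-min_samples:]
    if recent[-1].get('completed'):
        return False  # a finished task is not stalled
    counts = [docs_processed(snapshot) for snapshot in recent]
    return len(set(counts)) == 1
```

When looks_stalled returns True, that would be a reasonable moment to capture hot threads and the slow logs for the nodes involved.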

Elasticsearch indexes but does not store documents

I'm having trouble storing documents in a 3-node Elasticsearch cluster that was previously able to store documents. I use the Java API to send bulks of documents to Elasticsearch, which are accepted (no failure in the BulkResponse object), and Elasticsearch shows heavy indexing activity. However, the number of documents does not increase, and I assume that none of them are stored.
I've looked into Elasticsearch logs (of all three nodes) but I see no errors or warnings.
Note: I've had to restart two nodes previously but search/query is working perfectly. (the count in the image starts at ~17:00 as I've installed the Marvel plugin at this time)
What can I do to solve or debug the problem?
Sorry for this point-blank code blindness on my part! I forgot to advance the cursor when reading from MongoDB and therefore re-inserted the same 1000 documents into Elasticsearch thousands of times!
Lesson learned: if this problem occurs, check that you select the correct documents in your database and that these documents are not already stored in ES.
Side note on Marvel: it would be great if this could be indicated in some way, e.g. by a chart of "updated documents" (I rechecked and could not find one).
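One way to spot this situation without a chart is to look at the per-item status codes in the bulk response: in the REST response, an index action that creates a new document reports status 201, while one that overwrites an existing document reports 200. The sketch below assumes the bulk response is available as a dict in that REST shape; summarize_bulk is a hypothetical helper name.

```python
from collections import Counter


def summarize_bulk(bulk_response):
    """Count created vs. overwritten documents for the index actions in a bulk response."""
    counts = Counter()
    for item in bulk_response.get('items', []):
        # Each item is a single-key dict such as {'index': {..., 'status': 201}}.
        for action, result in item.items():
            if action != 'index':
                continue
            status = result.get('status')
            if status == 201:
                counts['created'] += 1
            elif status == 200:
                counts['updated'] += 1
            else:
                counts['other'] += 1
    return dict(counts)
```

A bulk that reports only updated documents while the index count stays flat suggests the same documents are being sent again.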

How to disable shard rebalancing in Elasticsearch while allowing new indices to be allocated?

I am using ElasticSearch version 1.0.1 and want to achieve two things at the same time -
1. Allow new indices to be created ( the primary and replica shards need to be allocated as per usual logic).
2. Prevent existing shards from being rebalanced on node failure.
What combination of settings will allow me to achieve the same? I tried the settings from the cluster module documented at http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-cluster.html. But I am unable to achieve both of them at the same time.
Thanks,

Update ElasticSearch Document while maintaining its external version the same?

I would like to update an ElasticSearch Document while maintaining the document's version the same. I'm using version_type=external as indicated in the versioning section of the index_ documentation. Updating a document with another of the same version is normally prevented as indicated in that section: "If the value provided is less than or equal to the stored document’s version number, a version conflict will occur and the index operation will fail."
I want to keep the version unaltered because I do not create a new version of my object (stored in my database) when someone adds new tags to it, but I would like the new tags to show up in my ElasticSearch index. Is this possible with ElasticSearch?
I tried deleting the document and then adding a new document with the same ID and version, but that still gives me the following exception:
VersionConflictEngineException[[myindex][2] [mytype][6]: version
conflict, current 1, provided 1]
Just for reference, I'm using PHP Elastica (with methods $type->deleteDocument($doc); and $type->addDocument($doc);) but this question should apply to ElasticSearch in general.
How long elasticsearch keeps information about deleted documents is controlled by the index.gc_deletes parameter, which defaults to 1m. So, theoretically, you can decrease this time to 0s, wait for a second, delete the document, index a new document with the same version, and set index.gc_deletes back to 1m. At the moment, however, that works only on master due to a bug. If you are using an older version of elasticsearch, you will not be able to change index.gc_deletes without closing the index first.
There is a good blog post on the elasticsearch.org web site that describes in detail how versions are handled by elasticsearch.
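The sequence described above could be sketched as follows. This is an illustration only and is subject to the bug and version caveats mentioned in the answer: it assumes an older elasticsearch-py client whose delete and index calls still accept doc_type, and the function name and the one-second wait are hypothetical choices.

```python
import time


def reindex_with_same_external_version(client, index, doc_type, doc_id,
                                       body, version, wait_seconds=1.0):
    """Delete a document and index it again with an unchanged external version."""
    # Stop keeping information about deleted documents.
    client.indices.put_settings(index=index, body={'index': {'gc_deletes': '0s'}})
    try:
        time.sleep(wait_seconds)  # "wait for a second", as the answer suggests
        client.delete(index=index, doc_type=doc_type, id=doc_id)
        return client.index(index=index, doc_type=doc_type, id=doc_id, body=body,
                            version=version, version_type='external')
    finally:
        # Restore the default retention for delete information.
        client.indices.put_settings(index=index, body={'index': {'gc_deletes': '1m'}})
```

The finally block restores index.gc_deletes even if the delete or index call raises a version conflict.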

Resources