I'm having an issue with my Elasticsearch replication of my Couchbase DB.
Certain documents in my DB are not being replicated correctly. Instead of being replicated as full Couchbase documents, they show up as entries with a _type of couchbaseCheckpoint that point to the documents. These checkpoints never seem to update or pick up the documents correctly, even after a refresh. This seems to occur at random: some documents replicate correctly and are stored as full couchbaseDocuments, while others stay as checkpoints. I'm not seeing any errors related to this in my logs.
versions:
couchbase 3.0.1
elasticsearch 1.3
couchbase elasticsearch plugin 2.0.0
{
"name":"xxx",
"age": "1"
}
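One way to quantify how many entries are stuck as checkpoints is to tally the indexed entries by `_type`. The following is a minimal sketch, assuming the hits have already been fetched from the Elasticsearch search API and parsed into Python dicts; the function name `tally_types` and the sample hits are hypothetical and only illustrate the shape of the data.

```python
from collections import Counter


def tally_types(hits):
    """Count search hits by their _type field."""
    return Counter(hit.get("_type", "<missing>") for hit in hits)


# Hypothetical hits, shaped like the entries of response["hits"]["hits"].
sample_hits = [
    {"_id": "doc1", "_type": "couchbaseDocument"},
    {"_id": "doc2", "_type": "couchbaseCheckpoint"},
    {"_id": "doc3", "_type": "couchbaseDocument"},
]

counts = tally_types(sample_hits)
print(dict(counts))
```

Comparing the couchbaseDocument count against the number of documents in the bucket gives a rough measure of how much is missing.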
Related
So I have Kibana set up with my data in it. About 3 indices.
Recently I deployed Elastic Enterprise Search and I'm testing out Elastic App Search, but I have no data in it.
My question therefore is, can I somehow migrate or sync my data inside Kibana into Elastic App Search?
Sorry, migration of Elasticsearch indices to Elastic App Search is not available as of now.
Although it looks like Kibana is holding the data, Elasticsearch is actually the datastore behind it. App Search is a layer on top of Elasticsearch that manages the indices, schema, documents, etc.
If you're directly ingesting data into Elasticsearch, at this moment it is not possible to automatically migrate to Elastic App Search.
I have a Couchbase cluster set up as the primary source for data. From this, a subset of data is synced to an Elasticsearch cluster via the Couchbase Transport Plugin for Elasticsearch (https://github.com/couchbaselabs/elasticsearch-transport-couchbase), which sets up an XDCR stream from Couchbase to Elasticsearch.
Due to some issues with the elasticsearch cluster all data needs to be synced again from couchbase to elasticsearch. I have tried recreating XDCR but that does not seem to help as it only copies a very small subset of documents. Is there a way by which this can be achieved?
Additional details
Couchbase version: 3.1.0
Number of couchbase documents: 50K+
Documents synced to elasticsearch: around 700 (expected 20K+)
If a document in couchbase is modified it is successfully synced to elasticsearch
The issue you're experiencing is likely in one of the following: XDCR, the Couchbase Transport Plugin for Elasticsearch, or Elasticsearch itself.
Start by checking for XDCR errors. You can find your XDCR logs using these instructions. Be aware that the Transport Plugin uses XDCR v1 and almost everything else in Couchbase uses v2.
Consult the advice in troubleshooting the Couchbase Transport Plugin for Elasticsearch. Instructions should work for you even though they are from the 4.0 docs.
Pay attention to how your documents are being mapped to Elasticsearch. You mention that you're expecting only a subset of documents to be synced to Elasticsearch, so it's possible that you have lost a setting or misconfigured something. You can enable logging and observe a small set of test data. At TRACE level, you should be able to see each document that is inspected.
If all of that fails, make sure the basics are working by indexing the beer sample dataset, following the directions in the Couchbase docs. ES is probably not the issue, but testing with a fresh ES instance will rule out problems on that side.
I have cross-datacenter replication (XDCR) set up to replicate data from a Couchbase bucket to an Elasticsearch index. Sometimes, however, Elasticsearch fails and needs to be restarted. During this time, any changes to the Couchbase bucket are not replicated. Is there a recommended way of dealing with this? Ideally, once Elasticsearch restarts, the data added to Couchbase in the interim would be replicated successfully.
We have a single-machine Elasticsearch server (8 shards, all hosted on the same machine). The index contains 7 million documents. We do not specify any custom routing when indexing the documents. We are using Elasticsearch version 1.2.
The problem is that we are unable to retrieve many of our documents using a direct GET by document ID. However, using search?_id: we are able to retrieve all of those documents.
We are also successful in retrieving a document by specifying routing parameter (with different values (1,2,3,...) ) with GET.
With previous version, i.e. Elastic Search 1.0.3, we did not have that problem.
Any suggestions for resolution?
Thanks in advance
Elasticsearch 1.2.0 introduced a routing bug that causes exactly this behavior:
There was a routing bug in Elasticsearch 1.2.0 that could have a number of bad side effects on the cluster. Possible side effects include:
documents that were indexed prior to the upgrade to 1.2.0 may not be accessible via get. A search would find these documents, but not a direct get of the document by ID.
documents that were updated after the upgrade to 1.2.0 may be duplicated, with one copy from pre-1.2.0 and a second copy updated since the upgrade to 1.2.0.
if a document is duplicated as above, and versioning is in use, the document added after the upgrade to 1.2.0 will have its version reset.
ES is advising everyone to upgrade to 1.2.1 immediately. No word yet on how to resolve what appears to be index corruption introduced by using 1.2.0 to insert or update. Full details here:
http://www.elasticsearch.org/blog/elasticsearch-1-2-1-released/
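To see why a search can find a document that a direct get misses, here is a toy model. It assumes the ID-to-shard mapping used when reading differs from the one used when the document was indexed, which is a simplification of the bug described above. Both hash functions are stand-ins chosen for illustration and are not Elasticsearch's real routing functions; all names in the sketch are hypothetical.

```python
import zlib

NUM_SHARDS = 8  # matches the 8 shards mentioned in the question


def shard_old(doc_id):
    # Stand-in hash, NOT Elasticsearch's real routing function.
    return zlib.crc32(doc_id.encode("utf-8")) % NUM_SHARDS


def shard_new(doc_id):
    # A deliberately different stand-in, modelling a changed mapping.
    return zlib.adler32(doc_id.encode("utf-8")) % NUM_SHARDS


def index_docs(doc_ids, route):
    """Place each document in the shard chosen by the routing function."""
    shards = [dict() for _ in range(NUM_SHARDS)]
    for doc_id in doc_ids:
        shards[route(doc_id)][doc_id] = {"_id": doc_id}
    return shards


def get(shards, doc_id, route):
    # A get consults only the single shard the routing function points to.
    return shards[route(doc_id)].get(doc_id)


def search_by_id(shards, doc_id):
    # A search fans out to every shard, so placement does not matter.
    for shard in shards:
        if doc_id in shard:
            return shard[doc_id]
    return None


doc_ids = ["doc-%d" % i for i in range(100)]
shards = index_docs(doc_ids, shard_old)  # indexed under the old mapping

missing = [d for d in doc_ids if get(shards, d, shard_new) is None]
found = [d for d in doc_ids if search_by_id(shards, d) is not None]
print(len(missing), "of", len(doc_ids), "missed by get;", len(found), "found by search")
```

In this model, passing an explicit routing value amounts to picking the shard by hand, which is consistent with the observation that trying different routing values eventually returns the document.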
I have a lot of data indexed in my Elasticsearch.
I deleted the Elasticsearch folder, extracted a fresh copy from the Elasticsearch zip, and started the Elasticsearch server.
To my surprise, after starting the new Elasticsearch server I again found all the old data, and this happens again and again.
Can anyone please help me? I don't want all the old data indexed in Elasticsearch.
Regards
Given the cluster health response, it's not a problem with multiple nodes running in the same cluster, as suggested by Igor. I'd suggest checking which Java processes are running. You may have an Elasticsearch process hanging around somewhere that keeps writing to that folder.
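The process check can be scripted. Below is a minimal sketch, assuming a Unix-like system where `ps -A -o pid,args` lists every process with its full command line. The helper names and the sample command line are hypothetical, and the match is a simple heuristic (a command mentioning both "java" and "elasticsearch"), not an official detection method.

```python
import subprocess


def find_elasticsearch_processes(ps_lines):
    """Return (pid, command) pairs for lines that look like an Elasticsearch JVM."""
    matches = []
    for line in ps_lines:
        parts = line.strip().split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip the header row and malformed lines
        pid, command = parts
        lowered = command.lower()
        if "java" in lowered and "elasticsearch" in lowered:
            matches.append((int(pid), command))
    return matches


def list_processes():
    # Ask ps for the PID and full command line of every process.
    output = subprocess.check_output(["ps", "-A", "-o", "pid,args"])
    return output.decode("utf-8", "replace").splitlines()


# Hypothetical ps output used for illustration only.
sample = [
    "  PID COMMAND",
    " 1234 /usr/bin/java -Xms256m -Des.path.home=/opt/elasticsearch org.elasticsearch.bootstrap.Elasticsearch",
    " 5678 /usr/bin/python app.py",
]
result = find_elasticsearch_processes(sample)
print(result)
```

On a real machine, `find_elasticsearch_processes(list_processes())` would list any leftover Elasticsearch JVMs, which can then be stopped before deleting the folder again.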