Changing the Elasticsearch node data location - elasticsearch

I am using Graylog 1.4 and Elasticsearch 2.3.
I would like to move the cluster indices from /var/lib/elasticsearch/graylog2/nodes/0/indices/graylog2_0/0/index/ to attached storage (I have SAN storage mounted at /data). Please suggest which configuration changes are needed, because /var/lib/elasticsearch/graylog2 has consumed almost all of the local disk.
Thanks.

You can change the location of the Elasticsearch indices on disk using the path.data configuration setting: https://www.elastic.co/guide/en/elasticsearch/reference/2.3/setup-configuration.html#paths

Related

How to move the data and log location in ElasticSearch

I have an ES cluster with 3 master nodes and 2 data nodes, and it is running properly. I want to move the data and log location of one of the data nodes from the local disk to an external disk.
My current YAML file contains:
path.data: /opt/elasticsearch/data
path.logs: /opt/logs/elasticsearch
I have now added 2 external disks to the server to store data and logs. What is the correct process to point the ES data and logs at the new disks?
The data on this node can be deleted, as this is a dev environment.
Could I just:
stop ES on this server,
delete the contents of the current data and log folders,
mount the new drives at the same mount points and restart the cluster?
Thanks
You could just change the settings in the YAML file and restart the Elasticsearch service; that should work for you. Changes to the YAML configuration are not reloaded automatically, so the restart is required.
Steps:
Change the paths in the YAML file
Restart the service
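A minimal sketch of the first step, assuming the flat path.data / path.logs form shown in the question. The helper name set_setting and the /mnt/disk1 and /mnt/disk2 mount points are made up for illustration; it only edits text and does not handle the nested path: / data: form. The service still has to be restarted afterwards.

```python
import re


def set_setting(config_text: str, key: str, value: str) -> str:
    """Return config_text with a flat `key: value` line set, or appended if absent."""
    # Matches an active or commented-out line for this key (flat dotted form only).
    pattern = re.compile(rf"^#?[ \t]*{re.escape(key)}[ \t]*:.*$", re.MULTILINE)
    new_line = f"{key}: {value}"
    if pattern.search(config_text):
        # A function replacement avoids backslash escapes in Windows-style paths.
        return pattern.sub(lambda _match: new_line, config_text, count=1)
    if config_text and not config_text.endswith("\n"):
        config_text += "\n"
    return config_text + new_line + "\n"


# Settings from the question; the new mount points are hypothetical.
current = (
    "path.data: /opt/elasticsearch/data\n"
    "path.logs: /opt/logs/elasticsearch\n"
)
updated = set_setting(current, "path.data", "/mnt/disk1/elasticsearch/data")
updated = set_setting(updated, "path.logs", "/mnt/disk2/elasticsearch/logs")
print(updated)
```

The new directories need to exist and be writable by the user that runs Elasticsearch before the restart.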

What's the easiest way of moving Elastic Search data between servers

I've got Elasticsearch v6.1.0 installed on a Windows machine and a CentOS 7 machine. The goal is to migrate data from the Windows machine to the CentOS 7 machine.
Since they both run the same ES version, I simply copied the "data" folder from machine A to machine B. When I checked the cluster health, its status was red and active_primary_shards was 0, so I reverted the change.
What other methods are there? Can the snapshot/restore method be used for this purpose? I thought it was for migrating between different versions.
So the question is: what's the best/easiest method for moving data between 2 servers with the same ES version?
Using snapshot/restore
You can certainly use snapshot/restore for this task, as long as you have a shared file system or a single-node cluster. The shared FS should meet the following requirement from the documentation:
"In order to register the shared file system repository it is necessary to mount the same shared filesystem to the same location on all master and data nodes."
So it's not a problem if you have a single-node cluster. In that case, just make a snapshot and copy it over to the other machine.
It might, though, be a challenging task if you have many nodes running.
Alternatively, you may use one of the supported repository plugins for S3, HDFS and other cloud storage.
The advantage of this approach is that the data and the indices are snapshotted entirely.
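For the single-node case, the calls could look roughly like this. It is only a sketch: the repository name, snapshot name and location are placeholders, and the location has to be listed under path.repo in elasticsearch.yml on both machines. The code just builds the requests so you can see their shape; nothing is sent.

```python
import json


def snapshot_requests(repo: str, location: str, snapshot: str) -> dict:
    """Build (method, path, body) tuples for a filesystem snapshot and restore."""
    # Registering the repository is needed on both the source and the target.
    register = (
        "PUT",
        f"/_snapshot/{repo}",
        {"type": "fs", "settings": {"location": location}},
    )
    # Source side: take the snapshot and wait until it has finished.
    create = ("PUT", f"/_snapshot/{repo}/{snapshot}?wait_for_completion=true", None)
    # Target side: restore after the snapshot files were copied to `location`.
    restore = ("POST", f"/_snapshot/{repo}/{snapshot}/_restore", None)
    return {"source": [register, create], "target": [register, restore]}


# Placeholder names, not taken from the question.
plan = snapshot_requests("my_backup", "/mnt/es_backup", "snapshot_1")
for side, requests in plan.items():
    print(side)
    for method, path, body in requests:
        print(" ", method, path, json.dumps(body) if body is not None else "")
```

Between the two halves, the contents of the repository location are copied from the source machine to the same location on the target machine.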
Using _reindex API
It might be easier to use _reindex API to transfer data from one ES cluster to another. There is a special Reindex from Remote mode that allows exactly this use case.
What reindex actually does is a scroll on the source index and a lot of bulk inserts to the target index (which can be remote).
There are a couple of issues you should take care of:
setting up the target index (no mapping, no settings will be set by reindex)
if some fields on the source index are excluded from _source then their contents won't be copied to the target index
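A reindex-from-remote request could be put together like this. Treat it as a sketch: the host names and index names are invented, the target index should already exist with the mapping and settings you want, and the source host has to be allowed via reindex.remote.whitelist in the target node's elasticsearch.yml. The request is only built here, not sent.

```python
import json
import urllib.request


def build_remote_reindex(target_url: str, source_host: str,
                         source_index: str, dest_index: str) -> urllib.request.Request:
    """Build a POST _reindex request that pulls an index from a remote cluster."""
    body = {
        "source": {
            "remote": {"host": source_host},  # cluster to read from
            "index": source_index,
        },
        "dest": {"index": dest_index},  # index on the cluster receiving the call
    }
    return urllib.request.Request(
        url=f"{target_url.rstrip('/')}/_reindex",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical hosts: the request goes to the target (CentOS) node,
# which then scrolls over the index on the source (Windows) node.
req = build_remote_reindex(
    "http://localhost:9200", "http://winhost:9200", "my_index", "my_index"
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually run it: urllib.request.urlopen(req)
```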
Summing up
For snapshot/restore
Pros:
all data and the indices are saved/restored as they are
2 calls to the ES API are needed
Cons:
if the cluster has more than 1 node, you need to set up a shared FS or use some cloud storage
For _reindex
Pros:
Works for cluster of any size
Data is copied directly (no intermediate storage required)
1 call to the ES API is needed
Cons:
Data excluded from _source will be lost
Here's also a similar SO question from some three years ago.
Hope that helps!

Elasticsearch Shard Location

I am trying to set up an Elasticsearch cluster and have a question that's bothering me. I am transitioning from MarkLogic to Elasticsearch and am used to storing data on a different disk from the one where the software (i.e. MarkLogic) is installed. I know how to do it in MarkLogic, but somehow cannot find anything about this for Elasticsearch. Can anyone point me to a document that can help me configure my shards on a different machine where Elasticsearch is not installed?
Thanks,
S.
You simply need to change the path.data setting in your elasticsearch.yml configuration file:
path:
  data:
    - /mnt/hda1
    - /mnt/hda2
    - /mnt/hda3
You can use a single location or several; ES will store your index data in whichever locations you list. Note that all data belonging to a given shard is always stored under the same path.

Change cluster name in elastic search

How do I rename the current cluster in the Elasticsearch config?
I want to rename the cluster without it going down, if possible.
Edit the elasticsearch.yml file. By default the ES cluster name is elasticsearch, and the cluster.name field in the yml file is commented out. So first uncomment it, then give it a name and restart ES.
If you have a multi-node cluster, you can try updating the cluster name in the config file and in the data directory name (if replicas are enabled) one node at a time, similar to a rolling upgrade of Elasticsearch.
If you have a single-node cluster, you can change the cluster name in the config file, but a restart is needed for the change to take effect.

Where does Elasticsearch store its data?

So I have this Elasticsearch installation. I insert data with Logstash and visualize it with Kibana.
Everything in the conf file is commented out, so it's using the default folders, which are relative to the Elasticsearch folder.
1/ I store data with Logstash
2/ I look at it with Kibana
3/ I close the instances of Elasticsearch, Kibana and Logstash
4/ I DELETE their folders
5/ I re-extract everything and reconfigure them
6/ I go into Kibana and the data is still there
How is this possible?
This command, however, does delete the data: curl -XDELETE 'http://127.0.0.1:9200/_all'
Thanks.
PS: I forgot to say that I'm on Windows
If you've installed ES on Linux, the default data folder is /var/lib/elasticsearch (CentOS) or /var/lib/elasticsearch/data (Ubuntu).
If you're on Windows or if you've simply extracted ES from the ZIP/TGZ file, then you should have a data sub-folder in the extraction folder.
Have a look at the Nodes Stats API and try:
http://127.0.0.1:9200/_nodes/stats/fs?pretty
On Windows 10 with ElasticSearch 7 it shows:
"path" : "C:\\ProgramData\\Elastic\\Elasticsearch\\data\\nodes\\0"
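If you'd rather pull the paths out programmatically, something like this should work. It assumes the response has the usual nodes -> <node id> -> fs -> data[] layout with a "path" in each entry; the node id and name in the sample are made up, and the fetch helper is not called here.

```python
import json
import urllib.request


def data_paths(nodes_stats: dict) -> dict:
    """Map each node name to the data paths reported by _nodes/stats/fs."""
    result = {}
    for node_id, node in nodes_stats.get("nodes", {}).items():
        name = node.get("name", node_id)  # fall back to the id if no name is present
        entries = node.get("fs", {}).get("data", [])
        result[name] = [entry["path"] for entry in entries if "path" in entry]
    return result


def fetch_nodes_fs_stats(base_url: str = "http://127.0.0.1:9200") -> dict:
    """Fetch _nodes/stats/fs from a running node."""
    with urllib.request.urlopen(f"{base_url}/_nodes/stats/fs") as response:
        return json.load(response)


# Trimmed sample shaped like the response; node id and name are invented.
sample = {
    "nodes": {
        "abc123": {
            "name": "node-1",
            "fs": {
                "data": [
                    {"path": "C:\\ProgramData\\Elastic\\Elasticsearch\\data\\nodes\\0"}
                ]
            },
        }
    }
}
print(data_paths(sample))
```

Against a live node you would call data_paths(fetch_nodes_fs_stats()) instead of using the sample.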
According to the documentation the data is stored in a folder called "data" in the elastic search root directory.
If you run the Windows MSI installer (at least for 5.5.x), the default location for data files is:
C:\ProgramData\Elastic\Elasticsearch\data
The config and logs directories are siblings of data.
Elasticsearch stores its data under the 'data' folder, as mentioned in the answers above.
Is there any other Elasticsearch instance running on your local network?
If yes, please check the cluster name. Instances that use the same cluster name on the same network can join the same cluster and share data.
Refer to this link for more info.
On CentOS:
/var/lib/elasticsearch
It should be in your extracted Elasticsearch folder, something like es/data.
