I have a cluster of 5 servers for elasticsearch, all with the same version of elasticsearch.
I need to move all data from servers 2, 3, 4, 5 to server 1.
How can I do it?
How can I know which server has data at all?
After change of _cluster/settings with:
PUT _cluster/settings
{
"persistent" : {
"cluster.routing.allocation.require._host" : "server1"
}
}
I get for: curl -GET http://localhost:9200/_cat/allocation?v
the following:
shards disk.indices disk.used disk.avail disk.total disk.percent host ip node
6 54.5gb 170.1gb 1.9tb 2.1tb 7 *.*.*.* *.*.*.* node-5
6 50.4gb 167.4gb 1.9tb 2.1tb 7 *.*.*.* *.*.*.* node-3
6 22.6gb 139.8gb 2tb 2.1tb 6 *.*.*.* *.*.*.* node-2
6 49.8gb 166.6gb 1.9tb 2.1tb 7 *.*.*.* *.*.*.* node-4
6 54.8gb 172.1gb 1.9tb 2.1tb 7 *.*.*.* *.*.*.* node-1
and for: GET _cluster/settings?include_defaults
the following:
#! Deprecation: [node.max_local_storage_nodes] setting was deprecated in Elasticsearch and will be removed in a future release!
{
"persistent" : {
"cluster" : {
"routing" : {
"allocation" : {
"require" : {
"_host" : "server1"
}
}
}
}
},
"transient" : { },
"defaults" : {
"cluster" : {
"max_voting_config_exclusions" : "10",
"auto_shrink_voting_configuration" : "true",
"election" : {
"duration" : "500ms",
"initial_timeout" : "100ms",
"max_timeout" : "10s",
"back_off_time" : "100ms",
"strategy" : "supports_voting_only"
},
"no_master_block" : "write",
"persistent_tasks" : {
"allocation" : {
"enable" : "all",
"recheck_interval" : "30s"
}
},
"blocks" : {
"read_only_allow_delete" : "false",
"read_only" : "false"
},
"remote" : {
"node" : {
"attr" : ""
},
"initial_connect_timeout" : "30s",
"connect" : "true",
"connections_per_cluster" : "3"
},
"follower_lag" : {
"timeout" : "90000ms"
},
"routing" : {
"use_adaptive_replica_selection" : "true",
"rebalance" : {
"enable" : "all"
},
"allocation" : {
"node_concurrent_incoming_recoveries" : "2",
"node_initial_primaries_recoveries" : "4",
"same_shard" : {
"host" : "false"
},
"total_shards_per_node" : "-1",
"shard_state" : {
"reroute" : {
"priority" : "NORMAL"
}
},
"type" : "balanced",
"disk" : {
"threshold_enabled" : "true",
"watermark" : {
"low" : "85%",
"flood_stage" : "95%",
"high" : "90%"
},
"include_relocations" : "true",
"reroute_interval" : "60s"
},
"awareness" : {
"attributes" : [ ]
},
"balance" : {
"index" : "0.55",
"threshold" : "1.0",
"shard" : "0.45"
},
"enable" : "all",
"node_concurrent_outgoing_recoveries" : "2",
"allow_rebalance" : "indices_all_active",
"cluster_concurrent_rebalance" : "2",
"node_concurrent_recoveries" : "2"
}
},
...
"nodes" : {
"reconnect_interval" : "10s"
},
"service" : {
"slow_master_task_logging_threshold" : "10s",
"slow_task_logging_threshold" : "30s"
},
...
"name" : "cluster01",
...
"max_shards_per_node" : "1000",
"initial_master_nodes" : [ ],
"info" : {
"update" : {
"interval" : "30s",
"timeout" : "15s"
}
}
},
...
You can use shard allocation filtering to move all your data to server 1.
Simply run this:
PUT _cluster/settings
{
"persistent" : {
"cluster.routing.allocation.require._name" : "node-1",
"cluster.routing.allocation.exclude._name" : "node-2,node-3,node-4,node-5"
}
}
Instead of _name you can also use _ip or _host depending on what is more practical for you.
After running this command, all primary shards will migrate to server1 (the replicas will be unassigned). You just need to make sure that server1 has enough storage space to store all the primary shards.
If you want to get rid of the unassigned replicas (and get back to green state), simply run this:
PUT _all/_settings
{
"index" : {
"number_of_replicas" : 0
}
}
Related
I have configured my ILM to rollover when the indice size be 20GB or after passing 30 days in the hot node
but my indice passed 20GB and still didn't pass to the cold node
and when I run: GET _cat/indices?v I get:
green open packetbeat-7.9.2-2020.10.22-000001 RRAnRZrrRZiihscJ3bymig 10 1 63833049 0 44.1gb 22gb
Could you tell me how to solve that please !
Knowing that in my packetbeat file configuration, I have just changed the number of shards:
setup.template.settings:
index.number_of_shards: 10
index.number_of_replicas: 1
when I run the command GET packetbeat-7.9.2-2020.10.22-000001/_settings I get this output:
{
"packetbeat-7.9.2-2020.10.22-000001" : {
"settings" : {
"index" : {
"lifecycle" : {
"name" : "packetbeat",
"rollover_alias" : "packetbeat-7.9.2"
},
"routing" : {
"allocation" : {
"include" : {
"_tier_preference" : "data_content"
}
}
},
"mapping" : {
"total_fields" : {
"limit" : "10000"
}
},
"refresh_interval" : "5s",
"number_of_shards" : "10",
"provided_name" : "<packetbeat-7.9.2-{now/d}-000001>",
"max_docvalue_fields_search" : "200",
"query" : {
"default_field" : [
"message",
"tags",
"agent.ephemeral_id",
"agent.id",
"agent.name",
"agent.type",
"agent.version",
"as.organization.name",
"client.address",
"client.as.organization.name",
and the output of the command GET /packetbeat-7.9.2-2020.10.22-000001/_ilm/explain is :
{
"indices" : {
"packetbeat-7.9.2-2020.10.22-000001" : {
"index" : "packetbeat-7.9.2-2020.10.22-000001",
"managed" : true,
"policy" : "packetbeat",
"lifecycle_date_millis" : 1603359683835,
"age" : "15.04d",
"phase" : "hot",
"phase_time_millis" : 1603359684332,
"action" : "rollover",
"action_time_millis" : 1603360173138,
"step" : "check-rollover-ready",
"step_time_millis" : 1603360173138,
"phase_execution" : {
"policy" : "packetbeat",
"phase_definition" : {
"min_age" : "0ms",
"actions" : {
"rollover" : {
"max_size" : "50gb",
"max_age" : "30d"
}
}
},
"version" : 1,
"modified_date_in_millis" : 1603359683339
}
}
}
}
It's weird that it's 50GB !!
Thanks for your help
So I found the solution of this problem.
After updating the policy, I removed the policy from the index using it, and then added it again to those index.
I want to automatic connect to hdfs ha when namenode switch active to standby, which uri should be ?
PUT _snapshot/my_hdfs_repository
{
"type": "hdfs",
"settings": {
"uri": "hdfs://namenode:8020/",
"path": "/user/elasticsearch/repositories"
}
}
till now, I manual change the uri when hdfs namenode switch
This is my setting with ha hdfs and kerberos enabled.
PUT /_snapshot/elastic_hdfs_repository
{
"type" : "hdfs",
"settings" : {
"dfs" : {
"http" : {
"policy" : "HTTPS_ONLY"
}
},
"path" : "/elasticsearch/repositories/elastic_hdfs_repository",
"conf" : {
"dfs" : {
"client" : {
"failover" : {
"proxy" : {
"provider" : {
"my-cluster-nameservice" : "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"
}
}
}
},
"ha" : {
"automatic-failover" : {
"enabled" : {
"my-cluster-nameservice" : "true"
}
},
"namenodes" : {
"my-cluster-nameservice" : "namenode1,namenode2"
}
},
"data" : {
"transfer" : {
"protection" : "privacy"
}
},
"namenode" : {
"rpc-address" : {
"my-cluster-nameservice" : {
"namenode1" : "nn1.domain.com:8020",
"namenode2" : "nn2.domain.com:8020"
}
}
},
"nameservices" : "my-cluster-nameservice"
},
"fs" : {
"defaultFS" : "hdfs://elastic_hdfs_repository",
"hdfs" : {
"impl" : "org.apache.hadoop.hdfs.DistributedFileSystem"
}
},
"hadoop.http.authentication.token.validity": 36000
},
"security" : {
"principal" : "elasticsearch/_HOST#DOMAIN.COM"
},
"uri" : "hdfs://my-cluster-nameservice"
}
}
I installed oracle-jdk8 and elasticsearch on a ec2 instance and created an ami out of it. In the next copy of the ec2 machine i just changed the node name in elasticsearch.yml
However both the nodes if run individually are running.[NOTE the node id is appearing as same] But if run simultaneously, the one started later is failing with following in the logs:
[2018-08-07T16:35:06,260][INFO ][o.e.d.z.ZenDiscovery ] [node-1]
failed to send join request to master
[{node-2}{uQHBhDuxTeWOgmZHsuaZmA}{akWOcJ47SZKpR_EpA2lpyg}{10.127.114.212}{10.127.114.212:9300}{aws_availability_zone=us-east-1c, ml.machine_memory=66718932992, ml.max_open_jobs=20,
xpack.installed=true, ml.enabled=true}], reason
[RemoteTransportException[[node-2][10.127.114.212:9300][internal:discovery/zen/join]];
nested: IllegalArgumentException[can't add node
{node-1}{uQHBhDuxTeWOgmZHsuaZmA}{Ba1r1GoMSZOMeIWVKtPD2Q}{10.127.114.194}{10.127.114.194:9300}{aws_availability_zone=us-east-1c, ml.machine_memory=66716696576, ml.max_open_jobs=20,
xpack.installed=true, ml.enabled=true}, found existing node
{node-2}{uQHBhDuxTeWOgmZHsuaZmA}{akWOcJ47SZKpR_EpA2lpyg}{10.127.114.212}{10.127.114.212:9300}{aws_availability_zone=us-east-1c, ml.machine_memory=66718932992, xpack.installed=true,
ml.max_open_jobs=20, ml.enabled=true} with the same id but is a
different node instance];
My elasticsearch.yml:
cluster.name: elk
node.name: node-1
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: true
network.host: 0.0.0.0
network.publish_host: _ec2:privateIp_
transport.publish_host: _ec2:privateIp_
discovery.zen.hosts_provider: ec2
discovery.ec2.tag.ElasticSearch: elk-tag
cloud.node.auto_attributes: true
cluster.routing.allocation.awareness.attributes: aws_availability_zone
Output from _nodes endpoint:
//----Output when node-1 is run individually/at first----
{
"_nodes" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"cluster_name" : "elk",
"nodes" : {
"uQHBhDuxTeWOgmZHsuaZmA" : {
"name" : "node-1",
"transport_address" : "10.127.114.194:9300",
"host" : "10.127.114.194",
"ip" : "10.127.114.194",
"version" : "6.3.2",
"build_flavor" : "default",
"build_type" : "rpm",
"build_hash" : "053779d",
"roles" : [
"master",
"data",
"ingest"
],
"attributes" : {
"aws_availability_zone" : "us-east-1c",
"ml.machine_memory" : "66716696576",
"xpack.installed" : "true",
"ml.max_open_jobs" : "20",
"ml.enabled" : "true"
},
"process" : {
"refresh_interval_in_millis" : 1000,
"id" : 3110,
"mlockall" : true
}
}
}
}
//----Output when node-2 is run individually/at first----
{
"_nodes" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"cluster_name" : "elk",
"nodes" : {
"uQHBhDuxTeWOgmZHsuaZmA" : {
"name" : "node-2",
"transport_address" : "10.127.114.212:9300",
"host" : "10.127.114.212",
"ip" : "10.127.114.212",
"version" : "6.3.2",
"build_flavor" : "default",
"build_type" : "rpm",
"build_hash" : "053779d",
"roles" : [
"master",
"data",
"ingest"
],
"attributes" : {
"aws_availability_zone" : "us-east-1c",
"ml.machine_memory" : "66718932992",
"xpack.installed" : "true",
"ml.max_open_jobs" : "20",
"ml.enabled" : "true"
},
"process" : {
"refresh_interval_in_millis" : 1000,
"id" : 4869,
"mlockall" : true
}
}
}
}
Solved this by deleting rm -rf /var/lib/elasticsearch/nodes/ in every instance and restarting elasticsearch.
We're using logstash to sync Elastic search and we've around 3 million documents. It takes 3 to 4 hours to sync. Currently all we get is, it is started and stopped. Is there any way to see how many records processed in logstash ?
If you're using Logstash 5 and higher, the Logstash Monitoring API can help you. You can see and monitor what's happening inside Logstash as it processes events. If you hit the Pipeline stats API you'll get the total number of processed events per stage and plugin (input/filter/output):
curl -XGET 'localhost:9600/_node/stats/pipelines?pretty'
You'll get this type of response in which you can clearly see at any time how many events have been processed:
{
"pipelines" : {
"test" : {
"events" : {
"duration_in_millis" : 365495,
"in" : 216485,
"filtered" : 216485,
"out" : 216485,
"queue_push_duration_in_millis" : 342466
},
"plugins" : {
"inputs" : [ {
"id" : "35131f351e2dc5ed13ee04265a8a5a1f95292165-1",
"events" : {
"out" : 216485,
"queue_push_duration_in_millis" : 342466
},
"name" : "beats"
} ],
"filters" : [ {
"id" : "35131f351e2dc5ed13ee04265a8a5a1f95292165-2",
"events" : {
"duration_in_millis" : 55969,
"in" : 216485,
"out" : 216485
},
"failures" : 216485,
"patterns_per_field" : {
"message" : 1
},
"name" : "grok"
}, {
"id" : "35131f351e2dc5ed13ee04265a8a5a1f95292165-3",
"events" : {
"duration_in_millis" : 3326,
"in" : 216485,
"out" : 216485
},
"name" : "geoip"
} ],
"outputs" : [ {
"id" : "35131f351e2dc5ed13ee04265a8a5a1f95292165-4",
"events" : {
"duration_in_millis" : 278557,
"in" : 216485,
"out" : 216485
},
"name" : "elasticsearch"
} ]
},
"reloads" : {
"last_error" : null,
"successes" : 0,
"last_success_timestamp" : null,
"last_failure_timestamp" : null,
"failures" : 0
},
"queue" : {
"type" : "memory"
}
}
}
I am attempting to upgrade our Elastic Search cluster from 1.6 to 2.3.4. The upgrade seems to work, and I can see shard allocation starting to happen within Kopf - but at some point the shard allocation appears to stop with many shards left unallocated, and no errors being reported in the logs. Typically I'm left with 1200 / 3800 shards unallocated.
We have a typical 3 node cluster and I am trialing this standalone on my local machine with all 3 nodes running on my local machine.
I have seen similar symptoms reported - see https://t37.net/how-to-fix-your-elasticsearch-cluster-stuck-in-initializing-shards-mode.html
. The solution here seemed to be to manually allocate the shards, which I've tried (and works) but I'm at a loss to explain the behaviour of elastic search here. I'd prefer not to go down this route, as I want my cluster to spin up automatically without intervention.
There is also https://github.com/elastic/elasticsearch/pull/14494 which seems to be resolved with the latest ES version, so shouldn't be a problem.
There are no errors in log files - I have upped the root level logging to 'DEBUG' in order to see what I can. What I can see is lines like the below for each unallocated shard (this from the master node logs):
[2016-07-26 09:18:04,859][DEBUG][gateway ] [germany] [index][4] found 0 allocations of [index][4], node[null], [P], v[0], s[UNASSIGNED], unassigned_info[[reason=CLUSTER_RECOVERED], at[2016-07-26T08:05:04.447Z]], highest version: [-1]
[2016-07-26 09:18:04,859][DEBUG][gateway ] [germany] [index][4]: not allocating, number_of_allocated_shards_found [0]
Config file (with comments removed):
cluster.name: elasticsearch-jm-2.3.4
node.name: germany
script.inline: true
script.indexed: true
If I query the cluster health after reallocation has stopped - I get the response below:
http://localhost:9200/_cluster/health?pretty
cluster_name : elasticsearch-jm-2.3.4
status : red
timed_out : False
number_of_nodes : 3
number_of_data_nodes : 3
active_primary_shards : 1289
active_shards : 2578
relocating_shards : 0
initializing_shards : 0
unassigned_shards : 1264
delayed_unassigned_shards : 0
number_of_pending_tasks : 0
number_of_in_flight_fetch : 0
task_max_waiting_in_queue_millis : 0
active_shards_percent_as_number : 67.10046850598647
Further querying for shards - filtered to one index with unallocated shards. As can be seen - shard 0 and 4 are unallocated whereas shard 1 2 and 3 have been allocated :
http://localhost:9200/_cat/shards
cs-payment-warn-2016.07.20 3 p STARTED 106 92.4kb 127.0.0.1 germany
cs-payment-warn-2016.07.20 3 r STARTED 106 92.4kb 127.0.0.1 switzerland
cs-payment-warn-2016.07.20 4 p UNASSIGNED
cs-payment-warn-2016.07.20 4 r UNASSIGNED
cs-payment-warn-2016.07.20 2 r STARTED 120 74.5kb 127.0.0.1 cyprus
cs-payment-warn-2016.07.20 2 p STARTED 120 74.5kb 127.0.0.1 germany
cs-payment-warn-2016.07.20 1 r STARTED 120 73.8kb 127.0.0.1 cyprus
cs-payment-warn-2016.07.20 1 p STARTED 120 73.8kb 127.0.0.1 germany
cs-payment-warn-2016.07.20 0 p UNASSIGNED
cs-payment-warn-2016.07.20 0 r UNASSIGNED
Manually rerouting an unassigned shard appears to work - (stripped back results set)
http://localhost:9200/_cluster/reroute
POST:
{
"dry_run": true,
"commands": [
{
"allocate": {
"index": "cs-payment-warn-2016.07.20",
"shard": 4,
"node": "switzerland" ,
"allow_primary": true
}
}
]
}
Response:
{
"acknowledged" : true,
"state" : {
"version" : 722,
"state_uuid" : "Vw2vPoCMQk2ZosjzviD4TQ",
"master_node" : "yhL7XXy-SKu_WAM-C33dzA",
"blocks" : {},
"nodes" : {},
"routing_table" : {
"indices" : {
"cs-payment-warn-2016.07.20" : {
"shards" : {
"3" : [{
"state" : "STARTED",
"primary" : true,
"node" : "yhL7XXy-SKu_WAM-C33dzA",
"relocating_node" : null,
"shard" : 3,
"index" : "cs-payment-warn-2016.07.20",
"version" : 22,
"allocation_id" : {
"id" : "x_Iq88hmTqiasrjW09hVuw"
}
}, {
"state" : "STARTED",
"primary" : false,
"node" : "1a8dgBscTUS3c7Pv4mN9CQ",
"relocating_node" : null,
"shard" : 3,
"index" : "cs-payment-warn-2016.07.20",
"version" : 22,
"allocation_id" : {
"id" : "DF-EUEy_SpeUElnZI6cgsQ"
}
}
],
"4" : [{
"state" : "INITIALIZING",
"primary" : true,
"node" : "1a8dgBscTUS3c7Pv4mN9CQ",
"relocating_node" : null,
"shard" : 4,
"index" : "cs-payment-warn-2016.07.20",
"version" : 1,
"allocation_id" : {
"id" : "1tw7C7YPQsWwm_O-8mYHRg"
},
"unassigned_info" : {
"reason" : "INDEX_CREATED",
"at" : "2016-07-26T14:20:15.395Z",
"details" : "force allocation from previous reason CLUSTER_RECOVERED, null"
}
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 4,
"index" : "cs-payment-warn-2016.07.20",
"version" : 1,
"unassigned_info" : {
"reason" : "CLUSTER_RECOVERED",
"at" : "2016-07-26T11:24:11.868Z"
}
}
],
"2" : [{
"state" : "STARTED",
"primary" : false,
"node" : "rlRQ2u0XQRqxWld-wSrOug",
"relocating_node" : null,
"shard" : 2,
"index" : "cs-payment-warn-2016.07.20",
"version" : 22,
"allocation_id" : {
"id" : "eQ-_vWNbRp27So0iGSitmA"
}
}, {
"state" : "STARTED",
"primary" : true,
"node" : "yhL7XXy-SKu_WAM-C33dzA",
"relocating_node" : null,
"shard" : 2,
"index" : "cs-payment-warn-2016.07.20",
"version" : 22,
"allocation_id" : {
"id" : "O1PU1_NVS8-uB2yBrG76MA"
}
}
],
"1" : [{
"state" : "STARTED",
"primary" : false,
"node" : "rlRQ2u0XQRqxWld-wSrOug",
"relocating_node" : null,
"shard" : 1,
"index" : "cs-payment-warn-2016.07.20",
"version" : 24,
"allocation_id" : {
"id" : "ZmxtOvorRVmndR15OJMkMA"
}
}, {
"state" : "STARTED",
"primary" : true,
"node" : "yhL7XXy-SKu_WAM-C33dzA",
"relocating_node" : null,
"shard" : 1,
"index" : "cs-payment-warn-2016.07.20",
"version" : 24,
"allocation_id" : {
"id" : "ZNgzePThQxS-iqhRSXzZCw"
}
}
],
"0" : [{
"state" : "UNASSIGNED",
"primary" : true,
"node" : null,
"relocating_node" : null,
"shard" : 0,
"index" : "cs-payment-warn-2016.07.20",
"version" : 0,
"unassigned_info" : {
"reason" : "CLUSTER_RECOVERED",
"at" : "2016-07-26T11:24:11.868Z"
}
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 0,
"index" : "cs-payment-warn-2016.07.20",
"version" : 0,
"unassigned_info" : {
"reason" : "CLUSTER_RECOVERED",
"at" : "2016-07-26T11:24:11.868Z"
}
}
]
}
}
},
"routing_nodes" : {
"unassigned" : [{
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 4,
"index" : "cs-payment-warn-2016.07.20",
"version" : 1,
"unassigned_info" : {
"reason" : "CLUSTER_RECOVERED",
"at" : "2016-07-26T11:24:11.868Z"
}
}, {
"state" : "UNASSIGNED",
"primary" : true,
"node" : null,
"relocating_node" : null,
"shard" : 0,
"index" : "cs-payment-warn-2016.07.20",
"version" : 0,
"unassigned_info" : {
"reason" : "CLUSTER_RECOVERED",
"at" : "2016-07-26T11:24:11.868Z"
}
}, {
"state" : "UNASSIGNED",
"primary" : false,
"node" : null,
"relocating_node" : null,
"shard" : 0,
"index" : "cs-payment-warn-2016.07.20",
"version" : 0,
"unassigned_info" : {
"reason" : "CLUSTER_RECOVERED",
"at" : "2016-07-26T11:24:11.868Z"
}
}
]
},
"nodes" : {
"rlRQ2u0XQRqxWld-wSrOug" : [{
"state" : "STARTED",
"primary" : false,
"node" : "rlRQ2u0XQRqxWld-wSrOug",
"relocating_node" : null,
"shard" : 2,
"index" : "cs-payment-warn-2016.07.20",
"version" : 22,
"allocation_id" : {
"id" : "eQ-_vWNbRp27So0iGSitmA"
}
}, {
"state" : "STARTED",
"primary" : false,
"node" : "rlRQ2u0XQRqxWld-wSrOug",
"relocating_node" : null,
"shard" : 1,
"index" : "cs-payment-warn-2016.07.20",
"version" : 24,
"allocation_id" : {
"id" : "ZmxtOvorRVmndR15OJMkMA"
}
}
]
}
}
}
}