I am using the _cat API of elasticsearch to get the various details of my elasticsearch cluster.
https://www.elastic.co/guide/en/elasticsearch/reference/current/cat.html
What I want is the ability to filter the response, which I can't find in the documentation. For example, the output of _cat/nodes?v includes node.role, which tells whether a node is a data, master, or ingest node, and I want a way to show only the master and data nodes in the response.
You can use GET /_cat/master instead of _cat/nodes?v to get the elected master node. Otherwise, you can use /_nodes/data:true to get only the data nodes, and likewise for the ingest and master-eligible nodes:
GET /_nodes/data:true
GET /_nodes/ingest:true
GET /_nodes/master:true
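If you would rather stay with _cat/nodes, the filtering can also be done on the client side. The following is a minimal sketch, not an Elasticsearch feature: it assumes you fetched the rows with something like GET _cat/nodes?format=json&h=name,node.role, and that the node.role column uses the single-letter abbreviations m (master-eligible), d (data) and i (ingest). The function name and the sample rows are hypothetical.

```python
# Role letters as they appear in the node.role column of _cat/nodes
# (assumption: m = master-eligible, d = data, i = ingest).
ROLE_LETTERS = {"master": "m", "data": "d", "ingest": "i"}


def filter_nodes_by_role(rows, wanted_roles, require_all=False):
    """Return the rows whose node.role contains the wanted role letters.

    rows         -- list of dicts as returned by _cat/nodes?format=json
    wanted_roles -- role names, e.g. ["master", "data"]
    require_all  -- if True a node must have every role, otherwise any one
    """
    letters = [ROLE_LETTERS[role] for role in wanted_roles]
    match = all if require_all else any
    return [
        row for row in rows
        if match(letter in row.get("node.role", "") for letter in letters)
    ]


# Hypothetical sample of what the JSON response could look like.
SAMPLE_ROWS = [
    {"name": "node-1", "node.role": "dim"},
    {"name": "node-2", "node.role": "d"},
    {"name": "node-3", "node.role": "i"},
    {"name": "node-4", "node.role": "-"},
]

if __name__ == "__main__":
    for row in filter_nodes_by_role(SAMPLE_ROWS, ["master", "data"]):
        print(row["name"], row["node.role"])
```

Newer versions add more letters to node.role (for example for the data tiers), so the mapping above may need extending for your cluster.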
On a multi-node Elasticsearch cluster (3 nodes), to which node can we send a curl call to fetch results (by running a query)?
If we can use any node's IP, what is the best practice? For example, if I am using node 1's URL out of node 1, node 2, and node 3, and node 1 goes down, I have to manually update the query URL to node 2 or node 3. Is there a way to have one centralized URL that handles this by itself?
Do I have to do it manually using Nginx or a load balancer, or is there something in Elasticsearch itself?
In ES, if you send a request to any node that is part of a valid ES cluster, it will route the request internally and return the result.
But you shouldn't use a node's IP directly to communicate with Elasticsearch, for obvious reasons, one of which you already mentioned. You can put a load balancer, Nginx, or DNS in front of your Elasticsearch cluster.
If you are accessing it programmatically, you don't need even that: when creating the Elasticsearch client, you can specify all the nodes' IPs, so that your requests still succeed when some nodes are down.
RestClientBuilder restClientBuilder = RestClient.builder(
    new HttpHost(esConfig.getHost(), esConfig.getPort()),
    new HttpHost(esConfig.getHost2(), esConfig.getPort2()));
As you can see, I created my Elasticsearch client (works with Elasticsearch 8.5) with two Elasticsearch hosts.
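To illustrate what passing several hosts buys you, here is a minimal sketch of the failover idea written by hand in Python. It is not how the official clients are implemented; it only shows the principle of trying the next node when one is unreachable. The function names and the example host URLs are hypothetical.

```python
import urllib.request


def http_get(url, timeout=5):
    """Plain HTTP GET that returns the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def get_with_failover(hosts, path, fetch=http_get):
    """Try each host in order and return the first successful response.

    hosts -- base URLs such as "http://node1:9200" (hypothetical names)
    path  -- request path such as "/_cat/health"
    fetch -- callable doing the actual GET; injectable for testing
    """
    last_error = None
    for host in hosts:
        try:
            return fetch(host.rstrip("/") + path)
        except OSError as err:
            # URLError, HTTPError and socket timeouts are all OSError
            # subclasses, so any of them moves us on to the next host.
            last_error = err
    raise ConnectionError(f"all hosts failed: {hosts}") from last_error


if __name__ == "__main__":
    nodes = ["http://node1:9200", "http://node2:9200", "http://node3:9200"]
    print(get_with_failover(nodes, "/_cat/health?v"))
```

A real client does more than this (for example, spreading requests across nodes and retrying failed nodes later), so prefer the client's own multi-host support where it is available.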
When you run either curl http://<node_ip>:9200/_cat/indices or GET _cat/indices (the latter in the Dev Tools console), you get a summary of all the indices present in your cluster, as well as some size and count statistics.
Is there a way to access that information via a query string query?
I mean, is there an internal ES index with all that information available, that I can query to get the same/similar information?
No, there isn't. That information is not kept in an index but in the cluster state, which is stored in a different location.
I have a 3-node Elasticsearch cluster. If more than one node goes down, I can easily check them manually. But as the number of nodes in the cluster grows, checking them manually will become difficult. So how can I get all the nodes of the cluster (specifically their names), even the ones that are down?
To get live/healthy nodes I hit the api endpoint:
curl -X GET "hostname/ip:port/_cat/nodes?v&pretty"
Is there an endpoint that gives me both the total nodes and the unhealthy/down nodes in an Elasticsearch cluster?
I was trying to list all the nodes using discovery.seed_hosts from the elasticsearch.yml config file, but I don't know how to do it or whether it is the right approach.
I don't think there is any API that reports offline nodes. Whether your entire cluster is down or a single node is down, Elasticsearch doesn't provide a way to check the health of a node that isn't running. You need to depend on an external script, code, or monitoring tool that pings all your nodes and prints their status.
You can write a custom script that calls the API below, which returns all the nodes currently available in the cluster. Once you have the response, you can extract the IP or hostname of each node; any node you expect that does not appear in the response can be considered down.
GET _cat/nodes?format=json&filter_path=ip,name
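The comparison step of such a script could look roughly like the sketch below. It assumes you maintain your own list of expected node names (Elasticsearch does not provide one for offline nodes) and that the rows come from the JSON response of the request above. The function name and the sample values are hypothetical.

```python
def find_missing_nodes(expected_names, cat_nodes_rows):
    """Return the expected node names that the cluster did not report.

    expected_names -- names you expect to be in the cluster (your own list)
    cat_nodes_rows -- list of dicts from _cat/nodes?format=json, each
                      carrying at least a "name" key
    """
    reported = {row["name"] for row in cat_nodes_rows}
    return sorted(set(expected_names) - reported)


# Hypothetical values for illustration only.
EXPECTED = ["node-1", "node-2", "node-3"]
REPORTED = [
    {"ip": "10.0.0.1", "name": "node-1"},
    {"ip": "10.0.0.3", "name": "node-3"},
]

if __name__ == "__main__":
    down = find_missing_nodes(EXPECTED, REPORTED)
    print("down nodes:", down)
```

If no node of the cluster can be reached at all, the request itself fails, and the script should treat every expected node as down.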
Another option is to enable cluster monitoring, which gives you the status of the entire cluster, but again it shows information about running nodes only.
Please check this answer for how Kibana shows offline nodes in Cluster Monitoring.
We want to upgrade our Elasticsearch version from 5.6 to 7.9 in our project.
I have to migrate our indexes and docs to the new version, but I can't use reindex, so I use the REST high-level client to connect to Elasticsearch 7 and plain HTTP requests for Elasticsearch 5.
For the migration, I fetch batches of docs from the old version with a match_all query and scroll, and index them into the new Elasticsearch with bulk requests.
Our old Elasticsearch cluster has 3 nodes. My question is: do I have to send the request to every node separately and process the docs, or, if I send the match_all search to one node, will it be handled by Elasticsearch (I read something about a coordinating node that handles requests, and that every node is implicitly a coordinating node), or do I have to send the request to a data node?
Adding more details to @saeednasehi's answer: it looks like you are confused about how Elasticsearch and its queries work internally, so please refer to my answer on how search queries work in Elasticsearch.
Apart from this, while it's true that you can get the data by connecting to any node, in your ES client (JHLRC or HTTP) you should list all the nodes' IPs so that the request (coordinating) load is distributed among all the data nodes. If you give just one node's IP, that node always acts as the coordinating node in the absence of a dedicated coordinating node (the default).
When you start an Elasticsearch cluster, you can treat the whole cluster as a single database. This means you can read from and write to the entire cluster by sending your request to any one of its nodes. You just need to send your request to a node and fetch your data.
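For the scroll-and-bulk migration described in the question, the step that turns one page of search hits into a bulk request body can be sketched as follows. This is only an illustration under assumptions: it takes the hits as returned under hits.hits by a search or scroll response (each with _id and _source), drops the old _type since Elasticsearch 7 bulk actions do not need it, and uses a hypothetical function name and target index name.

```python
import json


def hits_to_bulk_body(hits, target_index):
    """Build a newline-delimited bulk body from one page of search hits.

    hits         -- list of hit dicts, each with "_id" and "_source"
    target_index -- index name in the new cluster (hypothetical here)
    """
    lines = []
    for hit in hits:
        # Action line: index the document under its original id.
        action = {"index": {"_index": target_index, "_id": hit["_id"]}}
        lines.append(json.dumps(action))
        # Source line: the document itself.
        lines.append(json.dumps(hit["_source"]))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    page = [
        {"_id": "1", "_type": "doc", "_source": {"title": "first"}},
        {"_id": "2", "_type": "doc", "_source": {"title": "second"}},
    ]
    print(hits_to_bulk_body(page, "my-new-index"), end="")
```

Each scroll page can be converted this way and sent to the _bulk endpoint of the new cluster. Because any node can coordinate the search, the scroll requests themselves can go to a single node of the old cluster.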
I have implemented clustering using Elasticsearch. The ElasticHead UI displays the detected nodes.
However, I am not sure how it works. Could anyone please provide a link or direction that shows how clustering works with Elasticsearch?
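Early versions of Elasticsearch (before 2.0) used multicast by default to see whether other nodes with the same cluster name were present on the network; later versions use unicast discovery with a configured list of seed hosts.
If there is such a node, it connects to it.
It then shares its data with it, depending on the shard configuration.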
http://www.elasticsearch.org/guide/reference/modules/discovery/zen.html
Read the above to get the full idea.